Convert Sentinel-2 Data Using GDAL

The latest version of GDAL (2.1) has a driver to read Sentinel-2 data. Like HDF files, scenes are read as subdatasets. Running ‘gdalinfo’ on the zipped folder, or on the .xml file contained within the .SAFE directory, will display all the subdatasets as well as all the metadata (so quite a lot of information!).

As with HDF5, you can pass the subdataset names to gdalinfo to get more information, or to gdal_translate to extract them as separate datasets.

To make it easier to extract all the subdatasets I wrote a script, which can be downloaded from the repository.

The script will get a list of all subdatasets using GDAL:

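from osgeo import gdal
# Open the scene metadata file read-only
dataset = gdal.Open('S2/S2.xml', gdal.GA_ReadOnly)
# Each entry is a (name, description) tuple
subdatasets = dataset.GetSubDatasets()
# Close the dataset
dataset = None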

The ones to be extracted are for the 10, 20 and 60 m resolution band groups for each UTM zone (if the file crosses multiple zones).
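
The selection step could look something like the sketch below. This is not the script's actual code: it assumes subdataset names end in ‘:<resolution>:EPSG_<code>’, the form the GDAL Sentinel-2 driver reports for L1C scenes, and the function name ‘select_band_groups’ and the example names are hypothetical.

```python
def select_band_groups(subdatasets, resolutions=("10m", "20m", "60m")):
    """Return names of subdatasets holding the 10, 20 or 60 m band groups.

    subdatasets is a list of (name, description) tuples, as returned by
    GetSubDatasets().
    """
    selected = []
    for name, _description in subdatasets:
        # Split from the right so colons in the file path are left alone
        parts = name.rsplit(":", 2)
        if len(parts) == 3 and parts[1] in resolutions:
            selected.append(name)
    return selected


# Hypothetical names, for illustration only
example = [
    ("SENTINEL2_L1C:S2/S2.xml:10m:EPSG_32630", "10 m bands"),
    ("SENTINEL2_L1C:S2/S2.xml:20m:EPSG_32630", "20 m bands"),
    ("SENTINEL2_L1C:S2/S2.xml:PREVIEW:EPSG_32630", "RGB preview"),
]
print(select_band_groups(example))
```

Anything that is not one of the three band groups, such as a preview image, is skipped.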

For each subdataset it will give an output name, replacing the EPSG code with the UTM zone and ‘:’ with ‘_’.
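
As a rough illustration of the renaming, the sketch below turns a subdataset name into a file name stem. The exact layout of the output name is an assumption, as are the helper names ‘epsg_to_utm’ and ‘output_name’. It relies on the fact that EPSG codes 326xx are WGS 84 / UTM north zones and 327xx are south zones.

```python
import os
import re


def epsg_to_utm(epsg_code):
    """Convert a WGS 84 / UTM EPSG code to a label such as 'UTM30N'."""
    zone = epsg_code % 100
    if epsg_code // 100 == 326:
        hemisphere = "N"
    elif epsg_code // 100 == 327:
        hemisphere = "S"
    else:
        raise ValueError("Not a WGS 84 / UTM EPSG code: {}".format(epsg_code))
    return "UTM{:02d}{}".format(zone, hemisphere)


def output_name(subdataset_name):
    """Build an output file stem from a subdataset name (assumed layout)."""
    prefix, resolution, epsg = subdataset_name.rsplit(":", 2)
    # Drop the driver prefix (e.g. 'SENTINEL2_L1C:'), directories and extension
    path = prefix.split(":", 1)[1]
    stem = os.path.splitext(os.path.basename(path))[0]
    utm = epsg_to_utm(int(re.sub(r"\D", "", epsg)))
    return "_".join([stem, resolution, utm])


# Hypothetical subdataset name, for illustration only
print(output_name("SENTINEL2_L1C:S2/S2.xml:10m:EPSG_32630"))
```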

Then gdal_translate, called using subprocess, is used to create a new file for each subdataset. By default the output is in KEA format.
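
A minimal sketch of calling gdal_translate through subprocess is given below. It assumes gdal_translate is on the PATH; the function names and file names are hypothetical and this is not the script's own code. Only the standard ‘-of’ option of gdal_translate is used.

```python
import subprocess


def build_translate_command(subdataset_name, out_file, out_format="KEA"):
    """Return the gdal_translate command as an argument list."""
    return ["gdal_translate", "-of", out_format, subdataset_name, out_file]


def translate(subdataset_name, out_file, out_format="KEA"):
    """Print and run gdal_translate; raises CalledProcessError on failure."""
    command = build_translate_command(subdataset_name, out_file, out_format)
    print(" ".join(command))
    subprocess.check_call(command)


# Hypothetical names; only the command is built here, nothing is run
print(build_translate_command("SENTINEL2_L1C:S2/S2.xml:10m:EPSG_32630",
                              "S2_10m_UTM30N.kea"))
```

Passing the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces.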

To run the script, first install GDAL 2.1. The conda-forge channel has recent builds; to install them using conda:

conda create -n gdal2 -c conda-forge gdal
source activate gdal2

(If you are on Windows leave out ‘source’)

To extract all subdatasets from a zipped Sentinel 2 scene to the current directory you can then use: -o . \

The gdal_translate command used is printed to the screen.

The default output format is KEA; you can change it using the ‘--of’ flag. For example, to convert an unzipped scene to GeoTIFF: -o . --of GTiff \

To get the extension for all supported drivers, and some creation options, the ‘get_gdal_drivers’ module from arsf_dem_scripts is optionally used. You can just download this file and copy it into the same directory the script has been saved to. For Linux or OS X you can run:

# OS X
curl >

# Linux
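
If you would rather not download the extra module, GDAL's own driver metadata can supply the extension. The sketch below is not the ‘get_gdal_drivers’ module: the function name ‘driver_extension’ and the small fallback table are assumptions. It uses the ‘DMD_EXTENSION’ metadata item that GDAL drivers expose.

```python
# Fallback extensions, used when GDAL is unavailable or a driver is missing
FALLBACK_EXTENSIONS = {"KEA": "kea", "GTiff": "tif"}


def driver_extension(driver_name):
    """Return the usual file extension (without the dot) for a GDAL driver."""
    try:
        from osgeo import gdal
    except ImportError:
        return FALLBACK_EXTENSIONS.get(driver_name, "")
    driver = gdal.GetDriverByName(driver_name)
    # GetDriverByName returns None if the driver is not built in
    extension = driver.GetMetadataItem(gdal.DMD_EXTENSION) if driver else None
    return extension or FALLBACK_EXTENSIONS.get(driver_name, "")


print(driver_extension("GTiff"))
```

Some drivers do not define an extension, in which case an empty string is returned.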

7 thoughts on “Convert Sentinel-2 Data Using GDAL”

  1. Gwawr Jones

    Hi Dan,

    ESA changed their file naming convention on 6th of December 2016 but their metadata files have also changed, i.e. there is no band list (!!!). This means that gdal.Open throws back an error “ERROR 1: Cannot find Query_Options.Band_List” and therefore cannot get a list of subdatasets from the zip files. I was also using gdal to extract Sentinel-2 data for processing, so any ideas on a way round this would be much appreciated.


    1. danclewley Post author

      That’s quite annoying – I wasn’t aware of this (haven’t looked at Sentinel-2 data with GDAL for a bit). Hopefully the GDAL driver will be updated at some point. If I find a work around I’ll update the post – let me know if you find one first.

  2. iqbalhabibie

    I am trying to follow your tutorial but I can't extract using this command: -o . \

    I wonder what went wrong? I am new to using Python commands. Thank you.

  3. Munish

    Hello, thanks for sharing this page. I was able to convert Sentinel bands from JP2 to TIFF format. Values differ slightly between the JP2 and the TIFF when I open them in QGIS. However, I want to process the bands in Python, and when I open the same GeoTIFF in Python a completely different, much higher range of values appears. I believe some basic step (say normalization) is required to be able to use these GDAL-created GeoTIFFs in Python. Please suggest. (Sorry if the query sounds too basic.) Thanks in advance.

