Accessing satellite data from AWS with Python

This example notebook shows how to obtain Sentinel-2 imagery and additional data from AWS S3 storage buckets. The data on AWS is the same as the original Sentinel-2 data provided by ESA.

The sentinelhub package supports obtaining data by specifying either products or tiles. It can download data either into the same file structure as used on AWS or into the original .SAFE file structure introduced by ESA.

Before running any of the examples below, please check the Configuration paragraph for details about configuring AWS credentials and information about charges.

[1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline
[2]:
import numpy as np
import matplotlib.pyplot as plt

Note: matplotlib is not a dependency of sentinelhub; it is used in these examples only for visualization.

Searching for available data

The archive of Sentinel-2 data at AWS consists of two buckets, one containing L1C and the other containing L2A data. There are multiple ways to search the archive for specific tiles and products:

  • Manual search using aws_cli, e.g.:
aws s3 ls s3://sentinel-s2-l2a/tiles/33/U/WR/ --request-payer

(Note: The WebFeatureService search below requires a Sentinel Hub instance ID, configured as described in the Configuration paragraph.)

[3]:
from sentinelhub import WebFeatureService, BBox, CRS, DataSource

INSTANCE_ID = ''  # If you put the instance ID into the configuration file, you can leave this unchanged

search_bbox = BBox(bbox=[46.16, -16.15, 46.51, -15.58], crs=CRS.WGS84)
search_time_interval = ('2017-12-01T00:00:00', '2017-12-15T23:59:59')


wfs_iterator = WebFeatureService(search_bbox, search_time_interval,
                                 data_source=DataSource.SENTINEL2_L1C,
                                 maxcc=1.0, instance_id=INSTANCE_ID)

for tile_info in wfs_iterator:
    print(tile_info)
{'properties': {'path': 's3://sentinel-s2-l1c/tiles/38/L/PH/2017/12/15/0', 'time': '07:12:03', 'id': 'S2B_OPER_MSI_L1C_TL_MTI__20171215T085654_A004050_T38LPH_N02.06', 'cloudCoverPercentage': 28.27, 'mbr': '600000,8190220 709800,8300020', 'date': '2017-12-15', 'crs': 'EPSG:32738'}, 'type': 'Feature', 'geometry': {'type': 'MultiPolygon', 'crs': {'properties': {'name': 'urn:ogc:def:crs:EPSG::4326'}, 'type': 'name'}, 'coordinates': [[[[45.931783967, -15.374656929], [46.954538568, -15.368029755], [46.964123606, -16.360077552], [45.936356187, -16.367155102], [45.931783967, -15.374656929]]]]}}
{'properties': {'path': 's3://sentinel-s2-l1c/tiles/38/L/PH/2017/12/10/0', 'time': '07:12:10', 'id': 'S2A_OPER_MSI_L1C_TL_SGS__20171210T103113_A012887_T38LPH_N02.06', 'cloudCoverPercentage': 94.02, 'mbr': '600000,8190220 709800,8300020', 'date': '2017-12-10', 'crs': 'EPSG:32738'}, 'type': 'Feature', 'geometry': {'type': 'MultiPolygon', 'crs': {'properties': {'name': 'urn:ogc:def:crs:EPSG::4326'}, 'type': 'name'}, 'coordinates': [[[[45.931783967, -15.374656929], [46.954538568, -15.368029755], [46.964123606, -16.360077552], [45.936356187, -16.367155102], [45.931783967, -15.374656929]]]]}}
{'properties': {'path': 's3://sentinel-s2-l1c/tiles/38/L/PH/2017/12/5/0', 'time': '07:13:30', 'id': 'S2B_OPER_MSI_L1C_TL_SGS__20171205T102636_A003907_T38LPH_N02.06', 'cloudCoverPercentage': 91.74, 'mbr': '600000,8190220 709800,8300020', 'date': '2017-12-05', 'crs': 'EPSG:32738'}, 'type': 'Feature', 'geometry': {'type': 'MultiPolygon', 'crs': {'properties': {'name': 'urn:ogc:def:crs:EPSG::4326'}, 'type': 'name'}, 'coordinates': [[[[45.931783967, -15.374656929], [46.954538568, -15.368029755], [46.964123606, -16.360077552], [45.936356187, -16.367155102], [45.931783967, -15.374656929]]]]}}
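
Each item yielded by the iterator is a plain dictionary, so ordinary Python is enough to rank the results. The sketch below picks the least cloudy tile. The sample list is trimmed from the output above, and `least_cloudy` is a hypothetical helper written for this illustration, not part of sentinelhub.

```python
# Trimmed copies of the dictionaries printed above (only the fields used here).
tiles = [
    {'properties': {'path': 's3://sentinel-s2-l1c/tiles/38/L/PH/2017/12/15/0',
                    'date': '2017-12-15', 'cloudCoverPercentage': 28.27}},
    {'properties': {'path': 's3://sentinel-s2-l1c/tiles/38/L/PH/2017/12/10/0',
                    'date': '2017-12-10', 'cloudCoverPercentage': 94.02}},
    {'properties': {'path': 's3://sentinel-s2-l1c/tiles/38/L/PH/2017/12/5/0',
                    'date': '2017-12-05', 'cloudCoverPercentage': 91.74}},
]


def least_cloudy(tile_infos, max_cloud_percent=100.0):
    """Return the feature with the lowest cloud cover, or None if none qualifies.

    Hypothetical helper for illustration; it is not part of sentinelhub.
    """
    candidates = [info for info in tile_infos
                  if info['properties']['cloudCoverPercentage'] <= max_cloud_percent]
    if not candidates:
        return None
    return min(candidates, key=lambda info: info['properties']['cloudCoverPercentage'])


best = least_cloudy(tiles, max_cloud_percent=50.0)
print(best['properties']['date'], best['properties']['cloudCoverPercentage'])
```

Note that cloudCoverPercentage is reported in percent, whereas the maxcc argument of the request is given as a fraction (maxcc=1.0 above still returns tiles with more than 90 % cloud cover).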

From the obtained WFS iterator we can extract the information that uniquely defines each tile.

[4]:
wfs_iterator.get_tiles()
[4]:
[('38LPH', '2017-12-15', 0),
 ('38LPH', '2017-12-10', 0),
 ('38LPH', '2017-12-5', 0)]
[5]:
from sentinelhub import get_area_info

for tile_info in get_area_info(search_bbox, search_time_interval, maxcc=0.5):
    print(tile_info)
{'properties': {'services': {'download': {'mimeType': 'text/html', 'url': 'http://sentinel-s2-l1c.s3-website.eu-central-1.amazonaws.com#tiles/38/L/PH/2017/12/15/0/'}}, 'centroid': {'type': 'Point', 'coordinates': [23.482061803, -15.8675924285]}, 'spacecraft': 'S2B', 'license': {'grantedFlags': None, 'grantedOrganizationCountries': None, 'licenseId': 'unlicensed', 'hasToBeSigned': 'never', 'grantedCountries': None, 'description': {'shortName': 'No license'}, 'signatureQuota': -1, 'viewService': 'public'}, 'links': [{'href': 'http://opensearch.sentinel-hub.com/resto/collections/Sentinel2/985b7c0c-5d4a-5105-a37b-ef41f4092392.json?&lang=en', 'type': 'application/json', 'title': 'GeoJSON link for 985b7c0c-5d4a-5105-a37b-ef41f4092392', 'rel': 'self'}], 'productType': 'S2MSI1C', 'published': '2017-12-15T14:03:04.356331Z', 'sgsId': 3440459, 'snowCover': 0, 'collection': 'Sentinel2', 'processingLevel': '1C', 'description': None, 'completionDate': '2017-12-15T07:12:03Z', 'orbitNumber': 4050, 'sensorMode': None, 's3Path': 'tiles/38/L/PH/2017/12/15/0', 'updated': '2017-12-15T14:03:04.356331Z', 'parentIdentifier': None, 's3URI': 's3://sentinel-s2-l1c/tiles/38/L/PH/2017/12/15/0/', 'organisationName': None, 'quicklook': None, 'thumbnail': None, 'title': 'S2B_OPER_MSI_L1C_TL_MTI__20171215T085654_A004050_T38LPH_N02.06', 'instrument': 'MSI', 'productIdentifier': 'S2B_OPER_MSI_L1C_TL_MTI__20171215T085654_A004050_T38LPH_N02.06', 'startDate': '2017-12-15T07:12:03Z', 'cloudCover': 28.27, 'platform': 'Sentinel-2', 'keywords': [], 'resolution': 10}, 'type': 'Feature', 'geometry': {'type': 'MultiPolygon', 'coordinates': [[[[45.931783967, -15.374656929], [46.954538568, -15.368029755], [46.964123606, -16.360077552], [45.936356187, -16.367155102], [45.931783967, -15.374656929]]]]}, 'id': '985b7c0c-5d4a-5105-a37b-ef41f4092392'}

Download data

Once we have found correct tiles or products we can download them and explore the data.

Aws Tile

A Sentinel-2 tile can be uniquely defined either by its ESA tile ID (e.g. L1C_T01WCV_A012011_20171010T003615) or by its tile name (e.g. T38TML or 38TML), sensing time and AWS index. The AWS index is the last number in the tile's AWS path (e.g. https://roda.sentinel-hub.com/sentinel-s2-l1c/tiles/1/C/CV/2017/1/14/0/0).
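
To make the relation between an AWS path and the (tile name, time, AWS index) triple concrete, here is a minimal sketch that splits a path taken from the search output earlier in this notebook. `parse_tile_path` is a hypothetical helper, and the path layout it assumes is inferred from the paths shown in this notebook rather than from a formal specification.

```python
def parse_tile_path(path):
    """Split an AWS tile path into (tile_name, date, aws_index).

    Hypothetical helper for illustration; assumes the layout
    .../tiles/<UTM zone>/<latitude band>/<square>/<year>/<month>/<day>/<AWS index>
    """
    parts = path.rstrip('/').split('/')
    start = parts.index('tiles') + 1
    utm_zone, lat_band, square, year, month, day, aws_index = parts[start:start + 7]
    tile_name = '{}{}{}'.format(utm_zone, lat_band, square)
    # Month and day are kept exactly as they appear in the path (not zero padded).
    date = '{}-{}-{}'.format(year, month, day)
    return tile_name, date, int(aws_index)


print(parse_tile_path('s3://sentinel-s2-l1c/tiles/38/L/PH/2017/12/15/0'))
# → ('38LPH', '2017-12-15', 0)
```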

The package works with the second tile definition. To transform a tile ID into (tile_name, time, aws_index), do the following:

[6]:
from sentinelhub import AwsTile

tile_id = 'S2A_OPER_MSI_L1C_TL_MTI__20151219T100121_A002563_T38TML_N02.01'
tile_name, time, aws_index = AwsTile.tile_id_to_tile(tile_id)
tile_name, time, aws_index
[6]:
('38TML', '2015-12-19', 1)
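
The tile name is also visible in the tile ID itself. As a minimal sketch, the hypothetical helper below (not part of sentinelhub) reads it with a regular expression. It recovers only the tile name; the sensing date and AWS index still have to come from AwsTile.tile_id_to_tile as shown above.

```python
import re


def tile_name_from_id(tile_id):
    """Extract the tile name (without the leading 'T') from an ESA tile ID.

    Hypothetical helper for illustration. It only reads the tile name; the
    sensing date and AWS index are not recovered here.
    """
    match = re.search(r'_T(\d{2}[A-Z]{3})(?:_|$)', tile_id)
    if match is None:
        raise ValueError('No tile name found in {!r}'.format(tile_id))
    return match.group(1)


print(tile_name_from_id('S2A_OPER_MSI_L1C_TL_MTI__20151219T100121_A002563_T38TML_N02.01'))
print(tile_name_from_id('L1C_T01WCV_A012011_20171010T003615'))
```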

Now we are ready to download the data. Let’s download only bands B8A and B10, the metadata files tileInfo.json and preview.jp2, and the pre-calculated cloud mask qi/MSK_CLOUDS_B00. We will save everything into the folder ./AwsData.

[7]:
from sentinelhub import AwsTileRequest

bands = ['B8A', 'B10']
metafiles = ['tileInfo', 'preview', 'qi/MSK_CLOUDS_B00']
data_folder = './AwsData'

request = AwsTileRequest(tile=tile_name, time=time, aws_index=aws_index,
                         bands=bands, metafiles=metafiles, data_folder=data_folder,
                         data_source=DataSource.SENTINEL2_L1C)

request.save_data()  # This is where the download is triggered

Note that calling this method again won’t re-download the data unless we set the parameter redownload=True.

To obtain the downloaded data we can simply do:

[8]:
data_list = request.get_data()  # This will not redownload anything because data is already stored on disk

b8a, b10, tile_info, preview, cloud_mask = data_list
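
Positional unpacking becomes fragile once the lists of bands and metafiles change. A minimal sketch of a name-based lookup follows. It assumes, as the unpacking above suggests, that the returned list holds the bands first and then the metafiles, each in the order requested; the placeholder strings stand in for the real arrays and metadata so that the sketch runs without a download.

```python
bands = ['B8A', 'B10']
metafiles = ['tileInfo', 'preview', 'qi/MSK_CLOUDS_B00']

# Placeholder values standing in for what request.get_data() returns.
data_list = ['<B8A array>', '<B10 array>', '<tileInfo dict>', '<preview array>', '<cloud mask>']

# Assumption: data_list is ordered as bands followed by metafiles.
data_by_name = dict(zip(bands + metafiles, data_list))

print(data_by_name['qi/MSK_CLOUDS_B00'])
```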

Downloading and reading can also be done in a single call: request.get_data(save_data=True).

[9]:
_ = plt.imshow(preview)
[Figure: preview image of the tile]
[10]:
_ = plt.imshow(b8a)
[Figure: band B8A of the tile]

Aws Product

A Sentinel-2 product is uniquely defined by its ESA product ID. We can obtain data for the whole product:

[11]:
from sentinelhub import AwsProductRequest

product_id = 'S2A_MSIL1C_20171010T003621_N0205_R002_T01WCV_20171010T003615'

request = AwsProductRequest(product_id=product_id, data_folder=data_folder)

# Uncomment the following line to download the data:
# data_list = request.get_data(save_data=True)

If the bands parameter is not defined, all bands will be downloaded. If the metafiles parameter is not defined, no additional metadata files will be downloaded.

Data into .SAFE structure

The data can also be downloaded into the .SAFE structure by specifying safe_format=True. The following code will download the data from the example above again, because it will now be stored in a different folder structure.

[12]:
tile_request = AwsTileRequest(tile=tile_name, time=time, aws_index=aws_index,
                              bands=bands, metafiles=metafiles, data_folder=data_folder,
                              safe_format=True)

# Uncomment the following line to download the data:
# tile_request.save_data()
[13]:
product_id = 'S2A_OPER_PRD_MSIL1C_PDMC_20160121T043931_R069_V20160103T171947_20160103T171947'

product_request = AwsProductRequest(product_id=product_id, bands=['B01'],
                                    data_folder=data_folder, safe_format=True)

# Uncomment the following line to download the data:
# product_request.save_data()

Older products contain multiple tiles. If you would like to download only some of them, you can specify a list of tiles to download.

[14]:
product_request = AwsProductRequest(product_id=product_id, tile_list=['T14PNA', 'T13PHT'],
                                    data_folder=data_folder, safe_format=True)

# Uncomment the following line to download the data:
# product_request.save_data()
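
The two product IDs used in this notebook follow different naming schemes: the newer, compact one names a single tile, while the older one names no tile, which is consistent with older products containing multiple tiles. The sketch below tells them apart with a regular expression. The pattern and the field names are assumptions based on the compact ID shown in this notebook, and `parse_compact_product_id` is a hypothetical helper, not part of sentinelhub.

```python
import re

# Assumed layout of a compact product ID, e.g.
# S2A_MSIL1C_20171010T003621_N0205_R002_T01WCV_20171010T003615
COMPACT_ID = re.compile(
    r'^(?P<mission>S2[A-Z])_MSI(?P<level>L1C|L2A)_(?P<sensing_time>\d{8}T\d{6})'
    r'_N(?P<baseline>\d{4})_R(?P<relative_orbit>\d{3})'
    r'_T(?P<tile>\d{2}[A-Z]{3})_(?P<discriminator>\d{8}T\d{6})$'
)


def parse_compact_product_id(product_id):
    """Return the ID's fields as a dict, or None for IDs in another format."""
    match = COMPACT_ID.match(product_id)
    return match.groupdict() if match else None


new_style = parse_compact_product_id(
    'S2A_MSIL1C_20171010T003621_N0205_R002_T01WCV_20171010T003615')
old_style = parse_compact_product_id(
    'S2A_OPER_PRD_MSIL1C_PDMC_20160121T043931_R069_V20160103T171947_20160103T171947')

print(new_style['tile'])  # tile named in the compact ID
print(old_style)          # None: the older ID does not match the compact pattern
```

A result of None can serve as a hint that the product may span several tiles, in which case passing tile_list as shown above limits the download.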