Accessing satellite data from AWS
Warning
The functionalities of the .aws module have mostly been deprecated and are no longer actively maintained. Some recent changes (such as a switch in file extensions) render these utilities only partially usable.
The following code can help you patch your scripts in such cases. It interacts with S3 more directly and is easier to adjust.
For this patch, Sentinel Hub Config has to be set up as described in the Configuration paragraph.
[2]:
import os
from datetime import datetime
from sentinelhub import CRS, BBox, DataCollection, SentinelHubCatalog, SHConfig
from sentinelhub.aws import AwsDownloadClient
boto_params = {"RequestPayer": "requester"}
config = SHConfig()
s3_client = AwsDownloadClient.get_s3_client(config)
[3]:
search_bbox = BBox(bbox=(46.16, -16.15, 46.51, -5.58), crs=CRS.WGS84)
search_time_interval = (datetime(2022, 12, 11), datetime(2022, 12, 17))
data_collection = DataCollection.SENTINEL2_L1C # use DataCollection.SENTINEL2_L1C or DataCollection.SENTINEL2_L2A
[4]:
def get_s3_tile_paths(search_bbox, search_time_interval, data_collection, config):
    results = SentinelHubCatalog(config).search(collection=data_collection, bbox=search_bbox, time=search_time_interval)
    return [result["assets"]["data"]["href"] for result in results]
[5]:
get_s3_tile_paths(search_bbox, search_time_interval, data_collection, config)
[5]:
['s3://sentinel-s2-l1c/tiles/38/L/PH/2022/12/14/0/',
's3://sentinel-s2-l1c/tiles/38/L/PJ/2022/12/14/0/',
's3://sentinel-s2-l1c/tiles/38/L/PK/2022/12/14/0/',
's3://sentinel-s2-l1c/tiles/38/L/PL/2022/12/14/0/',
's3://sentinel-s2-l1c/tiles/38/L/PM/2022/12/14/0/',
's3://sentinel-s2-l1c/tiles/38/L/PN/2022/12/14/0/',
's3://sentinel-s2-l1c/tiles/38/L/PP/2022/12/14/0/',
's3://sentinel-s2-l1c/tiles/38/L/PM/2022/12/12/0/',
's3://sentinel-s2-l1c/tiles/38/L/PN/2022/12/12/0/',
's3://sentinel-s2-l1c/tiles/38/L/PP/2022/12/12/0/',
's3://sentinel-s2-l1c/tiles/38/L/PQ/2022/12/12/0/',
's3://sentinel-s2-l1c/tiles/38/L/PR/2022/12/12/0/']
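Each returned path encodes the tile and the acquisition it points to. The following is a minimal sketch, assuming the `tiles/<utm zone>/<latitude band>/<grid square>/<year>/<month>/<day>/<aws index>/` layout visible in the output above. The helper names `parse_tile_path` and `group_by_date` are hypothetical and not part of `sentinelhub`.

```python
from collections import defaultdict
from datetime import date


def parse_tile_path(s3_tile_path):
    """Split an S3 tile path into (bucket_name, tile_name, tile_date, aws_index)."""
    # "s3://<bucket>/<key>" splits into ["s3:", "", <bucket>, <key>]
    _, _, bucket_name, key = s3_tile_path.split("/", 3)
    # Assumed key layout: tiles/<zone>/<band>/<square>/<year>/<month>/<day>/<index>
    _, utm_zone, lat_band, square, year, month, day, aws_index = key.strip("/").split("/")
    tile_name = f"{int(utm_zone):02d}{lat_band}{square}"
    return bucket_name, tile_name, date(int(year), int(month), int(day)), int(aws_index)


def group_by_date(s3_tile_paths):
    """Map each date to the list of tile names found for it."""
    groups = defaultdict(list)
    for path in s3_tile_paths:
        _, tile_name, tile_date, _ = parse_tile_path(path)
        groups[tile_date].append(tile_name)
    return dict(groups)
```

Applied to the twelve paths above, `group_by_date` would split them into the tiles from 2022-12-14 and those from 2022-12-12, which makes it easier to pick a single acquisition.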
[6]:
def list_tile_objects(s3_tile_path):
    """Returns a list of all objects stored under `s3_tile_path` in the S3 bucket."""
    _, _, bucket_name, url_key = s3_tile_path.split("/", 3)
    return s3_client.list_objects_v2(Bucket=bucket_name, Prefix=url_key, **boto_params)["Contents"]
[7]:
list_of_objects = list_tile_objects("s3://sentinel-s2-l1c/tiles/38/L/PH/2022/12/14/0/")
list_of_objects[:3]
[7]:
[{'Key': 'tiles/38/L/PH/2022/12/14/0/B01.jp2',
'LastModified': datetime.datetime(2022, 12, 14, 10, 44, 30, tzinfo=tzutc()),
'ETag': '"0a2ba9a9f7a8e1a7c4c5102d76596b87"',
'Size': 3609242,
'StorageClass': 'INTELLIGENT_TIERING'},
{'Key': 'tiles/38/L/PH/2022/12/14/0/B02.jp2',
'LastModified': datetime.datetime(2022, 12, 14, 10, 44, 30, tzinfo=tzutc()),
'ETag': '"20c0b9d7f3bf09d088facfef23d9d23a"',
'Size': 100628227,
'StorageClass': 'INTELLIGENT_TIERING'},
{'Key': 'tiles/38/L/PH/2022/12/14/0/B03.jp2',
'LastModified': datetime.datetime(2022, 12, 14, 10, 44, 30, tzinfo=tzutc()),
'ETag': '"381a0db09b39ec7e2a57c96902eb3ab8"',
'Size': 103173386,
'StorageClass': 'INTELLIGENT_TIERING'}]
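Note that `list_objects_v2` returns at most 1,000 keys per call and omits `Contents` entirely when nothing matches the prefix. A single tile folder normally stays well below that limit, but a broader prefix may not. The following is a hedged sketch of a paginated variant. `list_all_objects` is a hypothetical helper, and it takes the client as an argument so that it works with any object exposing a compatible `list_objects_v2` method.

```python
def list_all_objects(s3_client, s3_path, **boto_params):
    """Collect every object under an `s3://bucket/prefix` path, following continuation tokens."""
    _, _, bucket_name, prefix = s3_path.split("/", 3)
    request = {"Bucket": bucket_name, "Prefix": prefix, **boto_params}
    objects = []
    while True:
        response = s3_client.list_objects_v2(**request)
        # "Contents" is absent when no key matches the prefix
        objects.extend(response.get("Contents", []))
        if not response.get("IsTruncated"):
            return objects
        # Ask for the next page, starting where this one stopped
        request["ContinuationToken"] = response["NextContinuationToken"]
```

With the client created earlier, a call could look like `list_all_objects(s3_client, "s3://sentinel-s2-l1c/tiles/38/L/PH/2022/12/14/0/", RequestPayer="requester")`.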
[7]:
def download(s3_tile_path, download_dir, objects_to_download=None):
    os.makedirs(download_dir, exist_ok=True)
    # The bucket name is the third "/"-separated part of "s3://<bucket>/<key>"
    _, _, bucket_name, _ = s3_tile_path.split("/", 3)
    for file in list_tile_objects(s3_tile_path):
        file_name = file["Key"].split("/")[-1]
        # Download everything when no selection is given, otherwise only the listed file names
        if not objects_to_download or file_name in objects_to_download:
            out_path = os.path.join(download_dir, file_name)
            s3_client.download_file(Bucket=bucket_name, Key=file["Key"], Filename=out_path, ExtraArgs=boto_params)
[8]:
files_to_download = ["B04.jp2", "B07.jp2"]
download("s3://sentinel-s2-l1c/tiles/38/L/PH/2022/12/14/0/", "local/file/path", files_to_download)
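Because the requests are made with `RequestPayer` set to `requester`, it can be worth checking how much data a selection covers before calling `download`. The following small sketch reuses the `Key` and `Size` fields shown in the listing output above. `summarize_selection` is a hypothetical helper that applies the same file-name filter as `download`.

```python
def summarize_selection(objects, objects_to_download=None):
    """Return (file_names, total_bytes) for the objects that `download` would fetch."""
    selected = [
        obj
        for obj in objects
        if not objects_to_download or obj["Key"].split("/")[-1] in objects_to_download
    ]
    file_names = [obj["Key"].split("/")[-1] for obj in selected]
    return file_names, sum(obj["Size"] for obj in selected)


# Entries copied from the listing output above (other fields omitted)
listing = [
    {"Key": "tiles/38/L/PH/2022/12/14/0/B01.jp2", "Size": 3609242},
    {"Key": "tiles/38/L/PH/2022/12/14/0/B02.jp2", "Size": 100628227},
    {"Key": "tiles/38/L/PH/2022/12/14/0/B03.jp2", "Size": 103173386},
]
names, total_bytes = summarize_selection(listing, ["B02.jp2", "B03.jp2"])
print(names, f"{total_bytes / 1e6:.1f} MB")  # ['B02.jp2', 'B03.jp2'] 203.8 MB
```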
This example notebook shows how to obtain Sentinel-2 imagery and additional data from AWS S3 storage buckets. The data at AWS is the same as the original S-2 data provided by ESA.
The sentinelhub package supports obtaining data by specifying products or by specifying tiles. It can download data either into the same file structure as on AWS or into the original .SAFE file structure introduced by ESA.
Before testing the examples below, make sure to install the package with the [AWS] extension dependencies and check the Configuration paragraph for details about configuring AWS credentials and information about charges.
[1]:
%matplotlib inline
import matplotlib.pyplot as plt
Note: matplotlib is not a dependency of sentinelhub and is used in these examples for visualizations.
Searching for available data
For this functionality, the Sentinel Hub instance ID has to be configured as described in the Configuration paragraph.
[2]:
from sentinelhub import CRS, BBox, DataCollection, SHConfig, WebFeatureService
config = SHConfig()
if config.instance_id == "":
    print("Warning! To use WFS functionality, please configure the `instance_id`.")
The archive of Sentinel-2 data at AWS consists of two buckets, one containing L1C and the other containing L2A data. There are multiple ways to search the archive for specific tiles and products:
Manual search using aws_cli, e.g.:
aws s3 ls s3://sentinel-s2-l2a/tiles/33/U/WR/ --request-payer
Manual search using the service available at https://roda.sentinel-hub.com//, which does not require authentication, e.g.:
https://roda.sentinel-hub.com/sentinel-s2-l1c/tiles/1/C/CV/2017/1/14/0/
Automatic search by a tile ID or by location and time interval using the Sentinel Hub Catalog API. More examples are available in this notebook.
Automatic search by location and time interval using the Sentinel Hub Web Feature Service (WFS):
[3]:
search_bbox = BBox(bbox=(46.16, -16.15, 46.51, -15.58), crs=CRS.WGS84)
search_time_interval = ("2017-12-01T00:00:00", "2017-12-15T23:59:59")
wfs_iterator = WebFeatureService(
    search_bbox, search_time_interval, data_collection=DataCollection.SENTINEL2_L1C, maxcc=1.0, config=config
)
for tile_info in wfs_iterator:
    print(tile_info)
{'type': 'Feature', 'geometry': {'type': 'MultiPolygon', 'crs': {'type': 'name', 'properties': {'name': 'urn:ogc:def:crs:EPSG::4326'}}, 'coordinates': [[[[45.93178396701427, -15.374656928849852], [46.95453856838988, -15.368029754563597], [46.96412360581364, -16.360077552492225], [45.93635618696065, -16.3671551019236], [45.93178396701427, -15.374656928849852]]]]}, 'properties': {'id': 'S2B_OPER_MSI_L1C_TL_MTI__20171215T085654_A004050_T38LPH_N02.06', 'date': '2017-12-15', 'time': '07:12:03', 'path': 's3://sentinel-s2-l1c/tiles/38/L/PH/2017/12/15/0', 'crs': 'EPSG:32738', 'mbr': '600000,8190220 709800,8300020', 'cloudCoverPercentage': 28.27}}
{'type': 'Feature', 'geometry': {'type': 'MultiPolygon', 'crs': {'type': 'name', 'properties': {'name': 'urn:ogc:def:crs:EPSG::4326'}}, 'coordinates': [[[[45.93178396701427, -15.374656928849852], [46.95453856838988, -15.368029754563597], [46.96412360581364, -16.360077552492225], [45.93635618696065, -16.3671551019236], [45.93178396701427, -15.374656928849852]]]]}, 'properties': {'id': 'S2A_OPER_MSI_L1C_TL_SGS__20171210T103113_A012887_T38LPH_N02.06', 'date': '2017-12-10', 'time': '07:12:10', 'path': 's3://sentinel-s2-l1c/tiles/38/L/PH/2017/12/10/0', 'crs': 'EPSG:32738', 'mbr': '600000,8190220 709800,8300020', 'cloudCoverPercentage': 94.02}}
{'type': 'Feature', 'geometry': {'type': 'MultiPolygon', 'crs': {'type': 'name', 'properties': {'name': 'urn:ogc:def:crs:EPSG::4326'}}, 'coordinates': [[[[45.93178396701427, -15.374656928849852], [46.95453856838988, -15.368029754563597], [46.96412360581364, -16.360077552492225], [45.93635618696065, -16.3671551019236], [45.93178396701427, -15.374656928849852]]]]}, 'properties': {'id': 'S2B_OPER_MSI_L1C_TL_SGS__20171205T102636_A003907_T38LPH_N02.06', 'date': '2017-12-05', 'time': '07:13:30', 'path': 's3://sentinel-s2-l1c/tiles/38/L/PH/2017/12/5/0', 'crs': 'EPSG:32738', 'mbr': '600000,8190220 709800,8300020', 'cloudCoverPercentage': 91.74}}
From the obtained WFS iterator we can extract the info that uniquely defines each tile.
[4]:
wfs_iterator.get_tiles()
[4]:
[('38LPH', '2017-12-15', 0),
('38LPH', '2017-12-10', 0),
('38LPH', '2017-12-5', 0)]
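Each feature printed by the WFS iterator above also carries a `cloudCoverPercentage` property, so the results can be ranked before anything is downloaded. The following is a minimal sketch on plain dictionaries. `least_cloudy` is a hypothetical helper, and the sample features keep only the properties needed here, copied from the output above.

```python
def least_cloudy(features, max_cloud_cover=100.0):
    """Return the features at or below `max_cloud_cover` percent, clearest first."""
    kept = [f for f in features if f["properties"]["cloudCoverPercentage"] <= max_cloud_cover]
    return sorted(kept, key=lambda f: f["properties"]["cloudCoverPercentage"])


# Trimmed copies of the three features printed above
features = [
    {"properties": {"date": "2017-12-15", "cloudCoverPercentage": 28.27}},
    {"properties": {"date": "2017-12-10", "cloudCoverPercentage": 94.02}},
    {"properties": {"date": "2017-12-05", "cloudCoverPercentage": 91.74}},
]
best = least_cloudy(features, max_cloud_cover=92.0)
print([f["properties"]["date"] for f in best])  # ['2017-12-15', '2017-12-05']
```

In a real session the list could be built with `list(wfs_iterator)` instead of the hand-written sample.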
Automatic search with functions from the sentinelhub.opensearch module (no authentication required):
[5]:
from sentinelhub import get_area_info
for tile_info in get_area_info(search_bbox, search_time_interval, maxcc=0.5):
    print(tile_info)
{'type': 'Feature', 'id': '985b7c0c-5d4a-5105-a37b-ef41f4092392', 'geometry': {'type': 'MultiPolygon', 'coordinates': [[[[45.931783967, -15.374656929], [46.954538568, -15.368029755], [46.964123606, -16.360077552], [45.936356187, -16.367155102], [45.931783967, -15.374656929]]]]}, 'properties': {'collection': 'Sentinel2', 'license': {'licenseId': 'unlicensed', 'hasToBeSigned': 'never', 'grantedCountries': None, 'grantedOrganizationCountries': None, 'grantedFlags': None, 'viewService': 'public', 'signatureQuota': -1, 'description': {'shortName': 'No license'}}, 'productIdentifier': 'S2B_OPER_MSI_L1C_TL_MTI__20171215T085654_A004050_T38LPH_N02.06', 'parentIdentifier': None, 'title': 'S2B_OPER_MSI_L1C_TL_MTI__20171215T085654_A004050_T38LPH_N02.06', 'description': None, 'organisationName': None, 'startDate': '2017-12-15T07:12:03Z', 'completionDate': '2017-12-15T07:12:03Z', 'productType': 'S2MSI1C', 'processingLevel': '1C', 'platform': 'Sentinel-2', 'instrument': 'MSI', 'resolution': 10, 'sensorMode': None, 'orbitNumber': 4050, 'quicklook': None, 'thumbnail': None, 'updated': '2017-12-15T14:03:04.356331Z', 'published': '2017-12-15T14:03:04.356331Z', 'snowCover': 0, 'cloudCover': 28.27, 'keywords': [], 'centroid': {'type': 'Point', 'coordinates': [23.482061803, -15.8675924285]}, 's3Path': 'tiles/38/L/PH/2017/12/15/0', 'spacecraft': 'S2B', 'sgsId': 3440459, 's3URI': 's3://sentinel-s2-l1c/tiles/38/L/PH/2017/12/15/0/', 'services': {'download': {'url': 'http://sentinel-s2-l1c.s3-website.eu-central-1.amazonaws.com#tiles/38/L/PH/2017/12/15/0/', 'mimeType': 'text/html'}}, 'links': [{'rel': 'self', 'type': 'application/json', 'title': 'GeoJSON link for 985b7c0c-5d4a-5105-a37b-ef41f4092392', 'href': 'http://opensearch.sentinel-hub.com/resto/collections/Sentinel2/985b7c0c-5d4a-5105-a37b-ef41f4092392.json?&lang=en'}]}}
Download data
Once we have found the correct tiles or products, we can download them and explore the data. Note that in order to do that, you have to provide AWS credentials in the config. See also the documentation.
Aws Tile
A Sentinel-2 tile can be uniquely defined either with an ESA tile ID (e.g. L1C_T01WCV_A012011_20171010T003615) or with a tile name (e.g. T38TML or 38TML), sensing time and AWS index. The AWS index is the last number in the tile's AWS path (e.g. https://roda.sentinel-hub.com/sentinel-s2-l1c/tiles/1/C/CV/2017/1/14/0/ → 0).
The package works with the second tile definition. To transform a tile ID to (tile_name, time, aws_index) do the following:
[6]:
from sentinelhub.aws import AwsTile
tile_id = "S2A_OPER_MSI_L1C_TL_MTI__20151219T100121_A002563_T38TML_N02.01"
tile_name, time, aws_index = AwsTile.tile_id_to_tile(tile_id)
tile_name, time, aws_index
[6]:
('38TML', '2015-12-19', 1)
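The (tile_name, time, aws_index) triple maps directly onto the folder layout in the bucket. The following sketch of that mapping assumes the `tiles/<utm zone>/<latitude band>/<grid square>/<year>/<month>/<day>/<aws index>/` layout seen in the paths earlier in this notebook, where numbers are not zero-padded. `tile_to_s3_path` is a hypothetical helper, not a `sentinelhub` function, and the default bucket is the L1C bucket used in the examples above.

```python
def tile_to_s3_path(tile_name, time, aws_index, bucket="sentinel-s2-l1c"):
    """Build the S3 folder path of a tile from (tile_name, time, aws_index)."""
    tile_name = tile_name.lstrip("T")  # accept both "T38TML" and "38TML"
    # The last three characters are the latitude band and the two-letter grid square
    utm_zone, lat_band, square = int(tile_name[:-3]), tile_name[-3], tile_name[-2:]
    # Converting to int drops leading zeros, matching paths such as ".../2017/12/5/0"
    year, month, day = (int(part) for part in time.split("-"))
    return f"s3://{bucket}/tiles/{utm_zone}/{lat_band}/{square}/{year}/{month}/{day}/{aws_index}/"


print(tile_to_s3_path("38TML", "2015-12-19", 1))
# s3://sentinel-s2-l1c/tiles/38/T/ML/2015/12/19/1/
```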
Now we are ready to download the data. Let’s download only bands B8A and B10, the metadata files tileInfo.json and preview.jp2, and the pre-calculated cloud mask qi/MSK_CLOUDS_B00. We will save everything into the folder ./AwsData.
[7]:
from sentinelhub.aws import AwsTileRequest
bands = ["B8A", "B10"]
metafiles = ["tileInfo", "preview", "qi/MSK_CLOUDS_B00"]
data_folder = "./AwsData"
request = AwsTileRequest(
    tile=tile_name,
    time=time,
    aws_index=aws_index,
    bands=bands,
    metafiles=metafiles,
    data_folder=data_folder,
    data_collection=DataCollection.SENTINEL2_L1C,
)
request.save_data()  # This is where the download is triggered
Note that upon calling this method again the data won’t be re-downloaded unless we set the parameter redownload=True.
To obtain downloaded data we can simply do:
[8]:
data_list = request.get_data() # This will not redownload anything because data is already stored on disk
b8a, b10, tile_info, preview, cloud_mask = data_list
Downloading and reading could also be done in a single call: request.get_data(save_data=True).
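The unpacking above suggests that `get_data()` returns its items in the order of `bands` followed by `metafiles`. Pairing names with items makes that assumption explicit and catches a length mismatch early. This is a sketch only: `label_data` is a hypothetical helper, and placeholder strings stand in for the downloaded arrays and metadata.

```python
def label_data(bands, metafiles, data_list):
    """Pair each requested band or metafile name with the item returned for it."""
    names = list(bands) + list(metafiles)
    if len(names) != len(data_list):
        raise ValueError(f"expected {len(names)} items, got {len(data_list)}")
    return dict(zip(names, data_list))


# Placeholder values; in the notebook this would be the list returned by request.get_data()
placeholder_data = ["<B8A array>", "<B10 array>", "<tileInfo dict>", "<preview array>", "<cloud mask>"]
labeled = label_data(["B8A", "B10"], ["tileInfo", "preview", "qi/MSK_CLOUDS_B00"], placeholder_data)
print(labeled["preview"])  # <preview array>
```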
[9]:
plt.imshow(preview);

[10]:
plt.imshow(b8a);

Aws Product
A Sentinel-2 product is uniquely defined by its ESA product ID. We can obtain data for the whole product:
[11]:
from sentinelhub.aws import AwsProductRequest
product_id = "S2A_MSIL1C_20171010T003621_N0205_R002_T01WCV_20171010T003615"
request = AwsProductRequest(product_id=product_id, data_folder=data_folder)
# Uncomment the following line to download the data:
# data_list = request.get_data(save_data=True)
If the bands parameter is not defined, all bands will be downloaded. If the metafiles parameter is not defined, no additional metadata files will be downloaded.
Data into .SAFE structure
The data can also be downloaded into the .SAFE structure by specifying safe_format=True. The following code will download the data from the example above again, because it will now be stored in a different folder structure.
[12]:
tile_request = AwsTileRequest(
    tile=tile_name,
    time=time,
    aws_index=aws_index,
    data_collection=DataCollection.SENTINEL2_L1C,
    bands=bands,
    metafiles=metafiles,
    data_folder=data_folder,
    safe_format=True,
)
# Uncomment the following line to download the data:
# tile_request.save_data()
[13]:
product_id = "S2A_OPER_PRD_MSIL1C_PDMC_20160121T043931_R069_V20160103T171947_20160103T171947"
product_request = AwsProductRequest(product_id=product_id, bands=["B01"], data_folder=data_folder, safe_format=True)
# Uncomment the following line to download the data:
# product_request.save_data()
Older products contain multiple tiles. If you would like to download only some of them, you can specify a list of tiles to download.
[14]:
product_request = AwsProductRequest(
    product_id=product_id, tile_list=["T14PNA", "T13PHT"], data_folder=data_folder, safe_format=True
)
# Uncomment the following line to download the data:
# product_request.save_data()
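The two product IDs used above follow different ESA naming conventions: the older one starts with `S2A_OPER_PRD_` and names no tile, while the newer compact one includes a tile field (`T01WCV`). The following hedged sketch tells them apart by pattern. The regular expression and the helper `tile_from_product_id` are assumptions for illustration and not how `sentinelhub` parses product IDs.

```python
import re

# Compact naming: mission, product level, sensing start, processing baseline,
# relative orbit, tile, product discriminator
COMPACT_ID = re.compile(
    r"^S2[A-Z]_MSIL(1C|2A)_\d{8}T\d{6}_N\d{4}_R\d{3}_T(?P<tile>\d{2}[A-Z]{3})_\d{8}T\d{6}$"
)


def tile_from_product_id(product_id):
    """Return the tile name of a compact product ID, or None for any other format."""
    match = COMPACT_ID.match(product_id)
    return match.group("tile") if match else None


print(tile_from_product_id("S2A_MSIL1C_20171010T003621_N0205_R002_T01WCV_20171010T003615"))  # 01WCV
```

A `None` result for an ID such as the older one above is a hint that the product may span several tiles, in which case passing `tile_list` can limit the download.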