Composites

Median composites

One of the key reasons for generating a composite is to replace pixels classified as clouds with realistic values from the available data. This results in an image that doesn’t contain any clouds. In the case of a median composite, each pixel is selected to have the median (or middle) value out of all possible values.

Care should be taken when using these composites for analysis, since the relationships between spectral bands are not preserved. These composites are also affected by the timespan they’re generated over. For example, the median pixel in a single season may be different to the median value for the whole year.

Mean composites

Mean composites involve taking the average value for each pixel, rather than the middle value as is done for a median composite. Unlike the median, the mean composite can contain pixel values that were not part of the original dataset.

Care should be taken when interpreting these images, as extreme values (such as unmasked cloud) can strongly affect the mean.

Minimum and maximum composites

These composites can be useful for identifying extreme behaviour in a collection of satellite images.

For example, comparing the maximum and minimum composites for a given band index could help identify areas that take on a wide range of values, which may indicate areas that have high variability over the time-line of the composite.

Nearest-time composites (not implemented yet !)

To get an image at a certain time, often there is missing data, due to clouds and other masking. We can fill in these gaps by using data from surrounding times.

Most-recent composite

Sometime we may want to determine what the landscape looks like by examining the most recent image. If we look at the last image for our dataset, we may see there is some missing data in the last image.

Before and after composites

Often it is useful to view images before and after an event, to see the change that has occured. To generate a composite on either side of an event, we can split the dataset along time.

Nearest time composite

Sometimes we just want the closest available data to a particular point in time. This composite will take values from before or after the specified time to find the nearest observation.

reducer

[39]:
import glob, os
import numpy as np
import pandas as pd
import geopandas as gpd
import rasterio
import rasterio.plot
import datetime
from pathlib import Path
from IPython.display import display

print('All libraries successfully imported!')
All libraries successfully imported!

Set directory

[40]:
computer_path = '/export/miro/ndeffense/LBRAT2104/'
grp_nb        = '99'

# Directory for all work files
work_path = f'{computer_path}STUDENTS/GROUP_{grp_nb}/TP/'

# Output directory
composite_path = f'{work_path}COMPOSITE/'

Path(composite_path).mkdir(parents=True, exist_ok=True)

Set parameters

Compositing method : * mean * median * maximum * minimum

[41]:
method = 'minimum'

start_period_str = '20200101'   # format : YYYYMMDD
end_period_str   = '20200601'   # format : YYYYMMDD

no_data_val = -10000

Select the folder with the image timeserie

[42]:
image_selected = 'B04'

if image_selected == 'NDVI':

    raster_file_list = sorted(glob.glob(f'{work_path}NDVI/*.tif'))

elif image_selected == 'B04':

    raster_file_list = sorted(glob.glob(f'{work_path}3_L2A_MASKED/*_B04_*.tif'))

Date

The date must be in each image filename. begin_date is the first position of the date

E.g. : “SITE_3_**2**0181108_NDVI.tif” –> begin_date = 8

[43]:
format     = '%Y%m%d' # format : YYYYMMDD
begin_date = 8

Set filenames

[44]:
composite_tif = f'{composite_path}{image_selected}_{method}_composite_{start_period_str}_{end_period_str}.tif'

Get dataframe with all images - date of acquisition - file path

[45]:
dict_list = []


for im in raster_file_list:

        date_str = os.path.basename(im)[begin_date-1:begin_date-1+8]

        date_im = datetime.datetime.strptime(date_str, format).date()

        im_name = os.path.basename(im)

        dict_list.append({'date': date_im,
                          'im_name':im_name,
                          'im_path': im})

im_date_path_df = pd.DataFrame.from_dict(dict_list).sort_values('date')

display(im_date_path_df)
date im_name im_path
0 2020-01-16 T31UFS_20200116T105309_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
1 2020-02-12 T31UFS_20200212T104049_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
2 2020-03-16 T31UFS_20200316T104709_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
3 2020-04-17 T31UFS_20200417T104021_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
4 2020-05-20 T31UFS_20200520T105031_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
5 2020-06-21 T31UFS_20200621T103629_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
6 2020-07-19 T31UFS_20200719T105031_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
7 2020-08-13 T31UFS_20200813T104629_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
8 2020-09-14 T31UFS_20200914T104031_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
9 2020-10-19 T31UFS_20201019T103959_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
10 2020-11-18 T31UFS_20201118T104329_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
11 2020-12-18 T31UFS_20201218T104349_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...

Select images inside a specific time period

[46]:
start_date = datetime.datetime.strptime(start_period_str, format).date()
end_date   = datetime.datetime.strptime(end_period_str, format).date()

mask = (im_date_path_df['date'] > start_date) & (im_date_path_df['date'] <= end_date)

im_date_path_df = im_date_path_df.loc[mask]

display(im_date_path_df)
date im_name im_path
0 2020-01-16 T31UFS_20200116T105309_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
1 2020-02-12 T31UFS_20200212T104049_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
2 2020-03-16 T31UFS_20200316T104709_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
3 2020-04-17 T31UFS_20200417T104021_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...
4 2020-05-20 T31UFS_20200520T105031_B04_10m_ROI_SCL.tif /export/miro/ndeffense/LBRAT2104/STUDENTS/GROU...

Open each image with ``rasterio`` and add it in list of array

[47]:
list_src_arr = []

for i, row in im_date_path_df.iterrows():
    # Get image path
    im_file    = row['im_path']

    # Open image with rasterio
    src = rasterio.open(im_file, "r")

    # Get image values in a python array (= matrix)
    im_arr = src.read(1)

    im_arr = np.where(im_arr == no_data_val, np.nan, im_arr)


    # Add pyhon array in a list
    list_src_arr.append(im_arr)

    # Close rasterio object
    src.close()


print(f'Shape of each image of the list : {im_arr.shape}')
print(f'Number of image used to generate the composite : {len(list_src_arr)}')
Shape of each image of the list : (570, 986)
Number of image used to generate the composite : 5

Stack arrays

Convert list of 2D matrices (each image of the timeserie) to a single 3D matrix

[48]:
im_arr_stack = np.dstack(list_src_arr)

print(im_arr_stack.shape)
print(f'There are {im_arr_stack.shape[2]} features')
print(f'The features type is : {im_arr_stack.dtype}')
(570, 986, 5)
There are 5 features
The features type is : float64

Build composite

[49]:
if method == 'median':
    composite_arr = np.nanmedian(im_arr_stack, axis=2)

elif method == 'mean':
    composite_arr = np.nanmean(im_arr_stack, axis=2)

elif method == 'std':
    composite_arr = np.nanstd(im_arr_stack, axis=2)

elif method == 'minimum':
    composite_arr = np.nanmin(im_arr_stack, axis=2)

elif method == 'maximum':
    composite_arr = np.nanmax(im_arr_stack, axis=2)

else:
    print(f'Method not available !')

print(composite_arr)
[[448. 437. 438. ... 340. 317. 424.]
 [423. 399. 332. ... 319. 361. 655.]
 [391. 457. 394. ... 360. 640. 882.]
 ...
 [168. 162. 168. ... 886. 808. 914.]
 [168. 178. 172. ... 822. 788. 890.]
 [  0.   0.   0. ...   0.   0.   0.]]
/tmp/ipykernel_58693/3509333673.py:11: RuntimeWarning: All-NaN slice encountered
  composite_arr = np.nanmin(im_arr_stack, axis=2)

If you are generating a composite of reflectance bands, you may want raplce 0 value by No Data

[50]:
if image_selected == 'B04':
    composite_arr = np.where(composite_arr == 0, no_data_val, composite_arr)

Write composite in GeoTiff

[51]:
img_temp_tif = raster_file_list[0]

# Open a GeoTIF file to get the metadata (=profile)
with rasterio.open(img_temp_tif) as src:
    profile = src.profile

print(profile)

# Write the composite array in a new GeoTIF file unsing metadata of an exisiting file
with rasterio.open(composite_tif, "w", **profile) as dst:
    dst.write(composite_arr, 1)

print(f'New composite file written in : {composite_tif}')
{'driver': 'GTiff', 'dtype': 'int16', 'nodata': -10000.0, 'width': 986, 'height': 570, 'count': 1, 'crs': CRS.from_epsg(32631), 'transform': Affine(10.0, 0.0, 627260.0,
       0.0, -10.0, 5596180.0), 'blockysize': 4, 'tiled': False, 'compress': 'lzw', 'interleave': 'band'}
New composite file written in : /export/miro/ndeffense/LBRAT2104/STUDENTS/GROUP_99/TP/COMPOSITE/B04_minimum_composite_20200101_20200601.tif