Lesson 1. Work with Landsat Remote Sensing Data in Python
Learn How to Work With Landsat Multispectral Remote Sensing Data in Python - Intermediate earth data science textbook course module
Welcome to the first lesson in the Learn How to Work With Landsat Multispectral Remote Sensing Data in Python module. Learn how to work with Landsat multi-band raster data stored in .tif format in Python using rioxarrayChapter Nine - Landsat Data in Open Source Python
In this chapter, you will learn how to work with Landsat multi-band raster data stored in .tif
format in Python using rioxarray.
Learning Objectives
After completing this chapter, you will be able to:
- Use
glob()
to create a subsetted list of file names within a specified directory on your computer. - Create a raster stack from a list of
.tif
files in Python. - Crop rasters to a desired extent in Python.
- Plot various band combinations using a numpy array in Python.
What You Need
You will need a computer with internet access to complete this lesson and the Cold Springs Fire data.
Download Cold Springs Fire Data (~150 MB)
Or use earthpy et.data.get_data('cold-springs-fire')
In the NAIP data chapter in this textbook, you learned how to import a multi-band image into Python using rioxarray. You then plotted the data as a composite, RGB (and CIR) image using imshow()
and calculated NDVI.
In that case, all bands of the data were stored in a single .tif
file. However, sometimes data are downloaded in individual bands rather than a single file.
In this chapter, you will learn how to work with Landsat data in Python. Each band in a landsat scene is often stored in an individual .tif
file. Thus you will need to grab the bands that you want to work with and then bring them into a xarrray
DataFrame.
About Landsat Data
At over 40 years, the Landsat series of satellites provides the longest temporal record of moderate resolution multispectral data of the Earth’s surface on a global basis. The Landsat record has remained remarkably unbroken, proving a unique resource to assist a broad range of specialists in managing the world’s food, water, forests, and other natural resources for a growing world population. It is a record unmatched in quality, detail, coverage, and value. Source: USGS
Landsat data are spectral and collected using a platform mounted on a satellite in space that orbits the earth. The spectral bands and associated spatial resolution of the first 9 bands in the Landsat 8 sensor are listed below.
Landsat 8 Bands
Band | Wavelength range (nanometers) | Spatial Resolution (m) | Spectral Width (nm) | Units | Data Type | Fill Value (no data) | Range | Valid Range | Scale Factor |
---|---|---|---|---|---|---|---|---|---|
Band 1 - Coastal aerosol | 430 - 450 | 30 | 2.0 | Reflectance | 16-bit signed integer (int16) | -9999 | -2000 to 16000 | 0 to 10000 | 0.0001 |
Band 2 - Blue | 450 - 510 | 30 | 6.0 | Reflectance | 16-bit signed integer (int16) | -9999 | -2000 to 16000 | 0 to 10000 | 0.0001 |
Band 3 - Green | 530 - 590 | 30 | 6.0 | Reflectance | 16-bit signed integer (int16) | -9999 | -2000 to 16000 | 0 to 10000 | 0.0001 |
Band 4 - Red | 640 - 670 | 30 | 0.03 | Reflectance | 16-bit signed integer (int16) | -9999 | -2000 to 16000 | 0 to 10000 | 0.0001 |
Band 5 - Near Infrared (NIR) | 850 - 880 | 30 | 3.0 | Reflectance | 16-bit signed integer (int16) | -9999 | -2000 to 16000 | 0 to 10000 | 0.0001 |
Band 6 - SWIR 1 | 1570 - 1650 | 30 | 8.0 | Reflectance | 16-bit signed integer (int16) | -9999 | -2000 to 16000 | 0 to 10000 | 0.0001 |
Band 7 - SWIR 2 | 2110 - 2290 | 30 | 18 | Reflectance | 16-bit signed integer (int16) | -9999 | -2000 to 16000 | 0 to 10000 | 0.0001 |
Review the Landsat 8 Surface Reflectance Product Guide for more details.
There are additional collected bands that are not distributed within the Landsat 8 Surface Reflectance Product such as the panchromatic band, which provides a finer resolution, gray scale image of the landscape, and the cirrus cloud band, which is used in the quality assessment process:
Band | Wavelength range (nanometers) | Spatial Resolution (m) | Spectral Width (nm) |
---|---|---|---|
Band 8 - Panchromatic | 500 - 680 | 15 | 18 |
Band 9 - Cirrus | 1360 - 1380 | 30 | 2.0 |
Understand Landsat Data
When working with landsat, it is important to understand both the metadata and the file naming convention. The metadata tell you how the data were processed, where the data are from and how they are structured.
The file names, tell you what sensor collected the data, the date the data were collected, and more.
Landsat File Naming Convention
Landsat and many other satellite remote sensing data is named in a way that tells you a about:
- When the data were collected and processed
- What sensor was used to collect the data
- What satellite was used to collect the data.
And more.
Here you will learn a few key components of the landsat 8 collection file name. The first scene that you work with below is named:
LC080340322016072301T1-SC20180214145802
First, we have LC08
- L: Landsat Sensor
- C: OLI / TIRS combined platform
08: Landsat 8 (not 7)
- 034032: The next 6 digits represent the path and row of the scene. This identifies the spatial coverage of the scene
Finally, you have a date. In your case as follows:
- 20160723: representing the year, month and day that the data were collected.
The second part of the file name above tells you more about when the data were last processed. You can read more about this naming convention using the link below.
Learn more about Landsat 8 file naming conventions.
As you work with these data, it is good to double check that you are working with the sensor (Landsat 8) and the time period that you intend. Having this information in the file name makes it easier to keep track of this as you process your data.
import os
from glob import glob
import matplotlib.pyplot as plt
import numpy as np
import geopandas as gpd
import xarray as xr
import rioxarray as rxr
import earthpy as et
import earthpy.spatial as es
import earthpy.plot as ep
# Download data and set working directory
data = et.data.get_data("cold-springs-fire")
os.chdir(os.path.join(et.io.HOME,
"earth-analytics",
"data"))
Downloading from https://ndownloader.figshare.com/files/10960109
Extracted output to /root/earth-analytics/data/cold-springs-fire/.
# Get list of all pre-cropped data and sort the data
# Create the path to your data
landsat_post_fire_path = os.path.join("cold-springs-fire",
"landsat_collect",
"LC080340322016072301T1-SC20180214145802",
"crop")
# Generate a list of tif files
post_fire_paths = glob(os.path.join(landsat_post_fire_path,
"*band*.tif"))
# Sort the data to ensure bands are in the correct order
post_fire_paths.sort()
post_fire_paths
['cold-springs-fire/landsat_collect/LC080340322016072301T1-SC20180214145802/crop/LC08_L1TP_034032_20160723_20180131_01_T1_sr_band1_crop.tif',
'cold-springs-fire/landsat_collect/LC080340322016072301T1-SC20180214145802/crop/LC08_L1TP_034032_20160723_20180131_01_T1_sr_band2_crop.tif',
'cold-springs-fire/landsat_collect/LC080340322016072301T1-SC20180214145802/crop/LC08_L1TP_034032_20160723_20180131_01_T1_sr_band3_crop.tif',
'cold-springs-fire/landsat_collect/LC080340322016072301T1-SC20180214145802/crop/LC08_L1TP_034032_20160723_20180131_01_T1_sr_band4_crop.tif',
'cold-springs-fire/landsat_collect/LC080340322016072301T1-SC20180214145802/crop/LC08_L1TP_034032_20160723_20180131_01_T1_sr_band5_crop.tif',
'cold-springs-fire/landsat_collect/LC080340322016072301T1-SC20180214145802/crop/LC08_L1TP_034032_20160723_20180131_01_T1_sr_band6_crop.tif',
'cold-springs-fire/landsat_collect/LC080340322016072301T1-SC20180214145802/crop/LC08_L1TP_034032_20160723_20180131_01_T1_sr_band7_crop.tif']
Next, open a single band from your Landsat scene. Below you use the .squeeze()
method to ensure that output xarray object only has 2 dimensions and not three.
# Open a single band without squeeze - notice the first dimension is 1
band_1 = rxr.open_rasterio(post_fire_paths[0], masked=True)
band_1.shape
(1, 177, 246)
# Open a single band using squeeze notice there are only 2 dimensions here
# when you use squeeze
band_1 = rxr.open_rasterio(post_fire_paths[0], masked=True).squeeze()
band_1.shape
(177, 246)
# Plot the data
f, ax=plt.subplots()
band_1.plot.imshow(ax=ax,
cmap="Greys_r")
ax.set_axis_off()
ax.set_title("Plot of Band 1")
plt.show()
Below is a function called open_clean_bands
that opens a single tif file and returns an xarray object. In the following lessons you will build this function out to process and clean your Landsat data.
def open_clean_bands(band_path):
"""A function that opens a Landsat band as an (rio)xarray object
Parameters
----------
band_path : list
A list of paths to the tif files that you wish to combine.
Returns
-------
An single xarray object with the Landsat band data.
"""
return rxr.open_rasterio(band_path, masked=True).squeeze()
The code below takes each band that you opened, and stacks it into a new single output array. NOTE: this approach is only efficient if you wish to process ALL of the bands in your data. Given the size of Landsat data, you likely will want to remove bands that you don’t need and if your study area is smaller than the entire image, you may also want to clip your data. You will learn how to clip and subset your data in the next lesson.
# Open all bands in a loop
all_bands = []
for i, aband in enumerate(post_fire_paths):
all_bands.append(open_clean_bands(aband))
# Assign a band number to the new xarray object
all_bands[i]["band"]=i+1
# OPTIONAL: Turn list of bands into a single xarray object
landsat_post_fire_xr = xr.concat(all_bands, dim="band")
landsat_post_fire_xr
<xarray.DataArray (band: 7, y: 177, x: 246)> array([[[ 446., 476., 487., ..., 162., 220., 260.], [ 393., 457., 488., ..., 200., 235., 296.], [ 364., 393., 388., ..., 246., 298., 347.], ..., [ 249., 283., 363., ..., 272., 268., 284.], [ 541., 474., 364., ..., 260., 269., 285.], [ 219., 177., 250., ..., 271., 271., 286.]], [[ 515., 547., 572., ..., 181., 233., 261.], [ 440., 519., 571., ..., 211., 251., 322.], [ 411., 460., 449., ..., 264., 326., 387.], ..., [ 387., 326., 427., ..., 288., 278., 301.], [ 554., 654., 433., ..., 276., 276., 293.], [ 291., 174., 291., ..., 292., 290., 304.]], [[ 782., 772., 843., ..., 335., 390., 411.], [ 684., 771., 836., ..., 363., 412., 511.], [ 656., 725., 706., ..., 425., 518., 599.], ..., ... ..., [1900., 1917., 2076., ..., 1722., 1891., 1890.], [1779., 1893., 1983., ..., 1645., 1847., 2090.], [1553., 1440., 1587., ..., 1562., 1689., 1964.]], [[2864., 2974., 3108., ..., 983., 1195., 1271.], [2527., 2827., 3008., ..., 1132., 1293., 1546.], [2141., 2427., 2433., ..., 1324., 1652., 1922.], ..., [1662., 1757., 1922., ..., 1463., 1472., 1519.], [1786., 1532., 1554., ..., 1374., 1423., 1450.], [1071., 943., 975., ..., 1524., 1461., 1518.]], [[1920., 1979., 2098., ..., 537., 660., 687.], [1505., 1863., 1975., ..., 651., 747., 924.], [1240., 1407., 1391., ..., 769., 1018., 1189.], ..., [1216., 1190., 1398., ..., 877., 890., 928.], [1517., 1184., 1078., ..., 846., 810., 820.], [ 660., 593., 623., ..., 984., 909., 880.]]]) Coordinates: * band (band) int64 1 2 3 4 5 6 7 * y (y) float64 4.428e+06 4.428e+06 ... 4.423e+06 4.423e+06 * x (x) float64 4.557e+05 4.557e+05 4.557e+05 ... 4.63e+05 4.63e+05 spatial_ref int64 0 Attributes: STATISTICS_MAXIMUM: 3483 STATISTICS_MEAN: 297.16466859584 STATISTICS_MINIMUM: -57 STATISTICS_STDDEV: 119.61507774931 scale_factor: 1.0 add_offset: 0.0 grid_mapping: spatial_ref
- band: 7
- y: 177
- x: 246
- 446.0 476.0 487.0 499.0 494.0 507.0 ... 565.0 787.0 984.0 909.0 880.0
array([[[ 446., 476., 487., ..., 162., 220., 260.], [ 393., 457., 488., ..., 200., 235., 296.], [ 364., 393., 388., ..., 246., 298., 347.], ..., [ 249., 283., 363., ..., 272., 268., 284.], [ 541., 474., 364., ..., 260., 269., 285.], [ 219., 177., 250., ..., 271., 271., 286.]], [[ 515., 547., 572., ..., 181., 233., 261.], [ 440., 519., 571., ..., 211., 251., 322.], [ 411., 460., 449., ..., 264., 326., 387.], ..., [ 387., 326., 427., ..., 288., 278., 301.], [ 554., 654., 433., ..., 276., 276., 293.], [ 291., 174., 291., ..., 292., 290., 304.]], [[ 782., 772., 843., ..., 335., 390., 411.], [ 684., 771., 836., ..., 363., 412., 511.], [ 656., 725., 706., ..., 425., 518., 599.], ..., ... ..., [1900., 1917., 2076., ..., 1722., 1891., 1890.], [1779., 1893., 1983., ..., 1645., 1847., 2090.], [1553., 1440., 1587., ..., 1562., 1689., 1964.]], [[2864., 2974., 3108., ..., 983., 1195., 1271.], [2527., 2827., 3008., ..., 1132., 1293., 1546.], [2141., 2427., 2433., ..., 1324., 1652., 1922.], ..., [1662., 1757., 1922., ..., 1463., 1472., 1519.], [1786., 1532., 1554., ..., 1374., 1423., 1450.], [1071., 943., 975., ..., 1524., 1461., 1518.]], [[1920., 1979., 2098., ..., 537., 660., 687.], [1505., 1863., 1975., ..., 651., 747., 924.], [1240., 1407., 1391., ..., 769., 1018., 1189.], ..., [1216., 1190., 1398., ..., 877., 890., 928.], [1517., 1184., 1078., ..., 846., 810., 820.], [ 660., 593., 623., ..., 984., 909., 880.]]])
- band(band)int641 2 3 4 5 6 7
array([1, 2, 3, 4, 5, 6, 7])
- y(y)float644.428e+06 4.428e+06 ... 4.423e+06
array([4428450., 4428420., 4428390., 4428360., 4428330., 4428300., 4428270., 4428240., 4428210., 4428180., 4428150., 4428120., 4428090., 4428060., 4428030., 4428000., 4427970., 4427940., 4427910., 4427880., 4427850., 4427820., 4427790., 4427760., 4427730., 4427700., 4427670., 4427640., 4427610., 4427580., 4427550., 4427520., 4427490., 4427460., 4427430., 4427400., 4427370., 4427340., 4427310., 4427280., 4427250., 4427220., 4427190., 4427160., 4427130., 4427100., 4427070., 4427040., 4427010., 4426980., 4426950., 4426920., 4426890., 4426860., 4426830., 4426800., 4426770., 4426740., 4426710., 4426680., 4426650., 4426620., 4426590., 4426560., 4426530., 4426500., 4426470., 4426440., 4426410., 4426380., 4426350., 4426320., 4426290., 4426260., 4426230., 4426200., 4426170., 4426140., 4426110., 4426080., 4426050., 4426020., 4425990., 4425960., 4425930., 4425900., 4425870., 4425840., 4425810., 4425780., 4425750., 4425720., 4425690., 4425660., 4425630., 4425600., 4425570., 4425540., 4425510., 4425480., 4425450., 4425420., 4425390., 4425360., 4425330., 4425300., 4425270., 4425240., 4425210., 4425180., 4425150., 4425120., 4425090., 4425060., 4425030., 4425000., 4424970., 4424940., 4424910., 4424880., 4424850., 4424820., 4424790., 4424760., 4424730., 4424700., 4424670., 4424640., 4424610., 4424580., 4424550., 4424520., 4424490., 4424460., 4424430., 4424400., 4424370., 4424340., 4424310., 4424280., 4424250., 4424220., 4424190., 4424160., 4424130., 4424100., 4424070., 4424040., 4424010., 4423980., 4423950., 4423920., 4423890., 4423860., 4423830., 4423800., 4423770., 4423740., 4423710., 4423680., 4423650., 4423620., 4423590., 4423560., 4423530., 4423500., 4423470., 4423440., 4423410., 4423380., 4423350., 4423320., 4423290., 4423260., 4423230., 4423200., 4423170.])
- x(x)float644.557e+05 4.557e+05 ... 4.63e+05
array([455670., 455700., 455730., ..., 462960., 462990., 463020.])
- spatial_ref()int640
- crs_wkt :
- PROJCS["WGS 84 / UTM zone 13N",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-105],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","32613"]]
- semi_major_axis :
- 6378137.0
- semi_minor_axis :
- 6356752.314245179
- inverse_flattening :
- 298.257223563
- reference_ellipsoid_name :
- WGS 84
- longitude_of_prime_meridian :
- 0.0
- prime_meridian_name :
- Greenwich
- geographic_crs_name :
- WGS 84
- horizontal_datum_name :
- World Geodetic System 1984
- projected_crs_name :
- WGS 84 / UTM zone 13N
- grid_mapping_name :
- transverse_mercator
- latitude_of_projection_origin :
- 0.0
- longitude_of_central_meridian :
- -105.0
- false_easting :
- 500000.0
- false_northing :
- 0.0
- scale_factor_at_central_meridian :
- 0.9996
- spatial_ref :
- PROJCS["WGS 84 / UTM zone 13N",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-105],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","32613"]]
- GeoTransform :
- 455655.0 30.0 0.0 4428465.0 0.0 -30.0
array(0)
- STATISTICS_MAXIMUM :
- 3483
- STATISTICS_MEAN :
- 297.16466859584
- STATISTICS_MINIMUM :
- -57
- STATISTICS_STDDEV :
- 119.61507774931
- scale_factor :
- 1.0
- add_offset :
- 0.0
- grid_mapping :
- spatial_ref
landsat_post_fire_xr.plot.imshow(col="band",
col_wrap=3,
cmap="Greys_r")
plt.show()
Plot RGB image
Just like you did with NAIP data, you can plot 3 band color composite images for Landsat using the earthpy ep.plot_rgb()
function. Refer to the landsat bands in the table at the top of this page to figure out the red, green and blue bands. Or read the ESRI landsat 8 band combinations post.
ep.plot_rgb()
requires:
the numpy array containing the bands that you wish to plot. You can access this by using
xarray_name.values
IMPORTANT: this array should be in xarray band order (bands first).The numeric location of bands that you wish to plot in the array.
ep.plot_rgb(landsat_post_fire_xr.values,
rgb=[3, 2, 1],
title="RGB Composite Image\n Post Fire Landsat Data")
plt.show()
Notice that the image above looks dark. You can stretch the image as you did with the NAIP data, too. Below you use the stretch argument built into the earthpy plot_rgb()
function. The str_clip
argument allows you to specify how much of the tails of the data that you want to clip off. The larger the number, the most the data will be stretched or brightened.
ep.plot_rgb(landsat_post_fire_xr.values,
rgb=[3, 2, 1],
title="Landsat RGB Image\n Linear Stretch Applied",
stretch=True,
str_clip=1)
plt.show()
# Adjust the amount of linear stretch to futher brighten the image
ep.plot_rgb(landsat_post_fire_xr.values,
rgb=[3, 2, 1],
title="Landsat RGB Image\n Linear Stretch Applied",
stretch=True,
str_clip=4)
plt.show()
Raster Pixel Histograms
You can create a histogram to view the distribution of pixel values in the rgb bands plotted above.
# Plot all band histograms using earthpy
band_titles = ["Band 1",
"Blue",
"Green",
"Red",
"NIR",
"Band 6",
"Band7"]
ep.hist(landsat_post_fire_xr.values,
title=band_titles)
plt.show()
Plot CIR
Now you’ve created a red, green blue color composite image. Remember red green and blue are colors that your eye can see.
Next, create a color infrared image (CIR) using landsat bands: 4,3,2.
ep.plot_rgb(landsat_post_fire_xr.values,
rgb=[4, 3, 2],
title="CIR Landsat Image Pre-Cold Springs Fire",
figsize=(10, 10))
plt.show()
Data Tip: Landsat 8 Pre Collections Data
If you are working with Landsat data downloaded pre USGS collections, your data may be formatted and named slightly differently than the example shown on this page. Below is an explanation of the legacy Landsat 8 naming convention.
File: LC08_L1TP_034032_20160707_20170221_01_T1_sr_band1_crop.tif
Sensor | Sensor | Satellite | WRS path | WRS row | ||||
---|---|---|---|---|---|---|---|---|
L | C | 8 | 034 | 032 | 2016 | 0707 | 20170221 | 01 |
Landsat | OLI & TIRS | Landsat 8 | path = 034 | row = 032 | year = 2016 | Month = 7, day = 7 | Processing Date: 2017-02-21 (feb 21 2017) | Archive (second version): 01 |
- L: Landsat
- X: Sensor
- C = OLI & TIRS O = OLI only T = IRS only E = ETM+ T = TM M = MSS
- S Satelite
- PPP
- RRR
- YYYY = Year
- DDD = Julian DAY of the year
- GSI - Ground station ID
- VV = Archive Version
Julian day
We won’t spend a lot of time on Julian days. The julian day used to be used in Landsat pre collections file naming. However recently they have switched to a normal year-month-date format
See this link that provide tables to help you convert julian days to actual date.
Leave a Comment