This vignette outlines a basic use scenario for smapr. We will acquire and process NASA (Soil Moisture Active-Passive) SMAP data, and generate some simple visualizations.
Multiple SMAP data products are provided by the NSIDC, and these products vary in the amount of processing. Currently, smapr primarily supports level 3 and level 4 data products, which represent global daily composite and global three hourly modeled data products, respectively. NSIDC provides documentation for all SMAP data products on their website, and we provide a summary of data products supported by smapr below.
Dataset id | Description | Resolution |
---|---|---|
SPL2SMAP_S | SMAP/Sentinel-1 Radiometer/Radar Soil Moisture | 3 km |
SPL3FTA | Radar Northern Hemisphere Daily Freeze/Thaw State | 3 km |
SPL3SMA | Radar Global Daily Soil Moisture | 3 km |
SPL3SMP | Radiometer Global Soil Moisture | 36 km |
SPL3SMAP | Radar/Radiometer Global Soil Moisture | 9 km |
SPL4SMAU | Surface/Rootzone Soil Moisture Analysis Update | 9 km |
SPL4SMGP | Surface/Rootzone Soil Moisture Geophysical Data | 9 km |
SPL4SMLM | Surface/Rootzone Soil Moisture Land Model Constants | 9 km |
SPL4CMDL | Carbon Net Ecosystem Exchange | 9 km |
This vignette uses the level 4 SPL4SMAU (Surface/Rootzone Soil Moisture Analysis Update) data product.
NASA requires a username and password from their Earthdata portal to access SMAP data. You can get these credentials here: https://earthdata.nasa.gov/
Once you have your credentials, you can use the set_smap_credentials
function to set them for use by the smapr package:
This function saves your credentials for later use unless you use the argument save = FALSE
.
To find out which SMAP data are available, we’ll use the find_smap
function, which takes a data set ID, date(s) to search, and a dataset version. The SPL4SMAU data product is on version 3 (see https://nsidc.org/data/SPL4SMAU).
This returns a data frame, where every row is one data file that is available on NASA’s servers.
str(available_data)
#> 'data.frame': 8 obs. of 3 variables:
#> $ name: chr "SMAP_L4_SM_aup_20180601T030000_Vv4030_001" "SMAP_L4_SM_aup_20180601T060000_Vv4030_001" "SMAP_L4_SM_aup_20180601T090000_Vv4030_001" "SMAP_L4_SM_aup_20180601T120000_Vv4030_001" ...
#> $ date: Date, format: "2018-06-01" "2018-06-01" ...
#> $ dir : chr "SPL4SMAU.004/2018.06.01/" "SPL4SMAU.004/2018.06.01/" "SPL4SMAU.004/2018.06.01/" "SPL4SMAU.004/2018.06.01/" ...
To download the data, we can use download_smap
. Note that this may take a while, depending on the number of files being downloaded, and the speed of your internet connection. Because we’re downloading multiple files, we will use the verbose = FALSE
argument to avoid printing excessive output to the console.
Each file corresponds to different times as indicated by the file names:
Each file that we downloaded is an HDF5 file with multiple datasets bundled together. To list all of the data in a file we can use list_smap
. By default, if we give list_smap
a data frame of local files, it will return a list of data frames. Because all of these data files are of the same data product, using list_smap
on one file (e.g., the first) will tell us what’s available in all of the files:
list_smap(local_files[1, ])
#> $SMAP_L4_SM_aup_20180601T030000_Vv4030_001
#> group name otype dclass dim
#> 0 / Analysis_Data H5I_GROUP
#> 1 / EASE2_global_projection H5I_DATASET STRING 1
#> 2 / Forecast_Data H5I_GROUP
#> 3 / Metadata H5I_GROUP
#> 4 / Observations_Data H5I_GROUP
#> 5 / cell_column H5I_DATASET INTEGER 3856 x 1624
#> 6 / cell_lat H5I_DATASET FLOAT 3856 x 1624
#> 7 / cell_lon H5I_DATASET FLOAT 3856 x 1624
#> 8 / cell_row H5I_DATASET INTEGER 3856 x 1624
#> 9 / time H5I_DATASET FLOAT 1
#> 10 / x H5I_DATASET FLOAT 3856
#> 11 / y H5I_DATASET FLOAT 1624
To dig deeper, we can use the all
argument to list_smap
:
list_smap(local_files[1, ], all = TRUE)
#> $SMAP_L4_SM_aup_20180601T030000_Vv4030_001
#> group name
#> 0 / Analysis_Data
#> 1 /Analysis_Data sm_profile_analysis
#> 2 /Analysis_Data sm_profile_analysis_ensstd
#> 3 /Analysis_Data sm_rootzone_analysis
#> 4 /Analysis_Data sm_rootzone_analysis_ensstd
#> 5 /Analysis_Data sm_surface_analysis
#> 6 /Analysis_Data sm_surface_analysis_ensstd
#> 7 /Analysis_Data soil_temp_layer1_analysis
#> 8 /Analysis_Data soil_temp_layer1_analysis_ensstd
#> 9 /Analysis_Data surface_temp_analysis
#> 10 /Analysis_Data surface_temp_analysis_ensstd
#> 11 / EASE2_global_projection
#> 12 / Forecast_Data
#> 13 /Forecast_Data sm_profile_forecast
#> 14 /Forecast_Data sm_rootzone_forecast
#> 15 /Forecast_Data sm_surface_forecast
#> 16 /Forecast_Data soil_temp_layer1_forecast
#> 17 /Forecast_Data surface_temp_forecast
#> 18 /Forecast_Data tb_h_forecast
#> 19 /Forecast_Data tb_h_forecast_ensstd
#> 20 /Forecast_Data tb_v_forecast
#> 21 /Forecast_Data tb_v_forecast_ensstd
#> 22 / Metadata
#> 23 /Metadata AcquisitionInformation
#> 24 /Metadata/AcquisitionInformation platform
#> 25 /Metadata/AcquisitionInformation platformDocument
#> 26 /Metadata/AcquisitionInformation radar
#> 27 /Metadata/AcquisitionInformation radarDocument
#> 28 /Metadata/AcquisitionInformation radiometer
#> 29 /Metadata/AcquisitionInformation radiometerDocument
#> 30 /Metadata CRID
#> 31 /Metadata/CRID AUP
#> 32 /Metadata/CRID Root
#> 33 /Metadata Config
#> 34 /Metadata DataQuality
#> 35 /Metadata/DataQuality TBH
#> 36 /Metadata/DataQuality/TBH CompletenessOmission
#> 37 /Metadata/DataQuality/TBH DomainConsistency
#> 38 /Metadata/DataQuality TBV
#> 39 /Metadata/DataQuality/TBV CompletenessOmission
#> 40 /Metadata/DataQuality/TBV DomainConsistency
#> 41 /Metadata DatasetIdentification
#> 42 /Metadata Extent
#> 43 /Metadata GridSpatialRepresentation
#> 44 /Metadata/GridSpatialRepresentation Latitude
#> 45 /Metadata/GridSpatialRepresentation Longitude
#> 46 /Metadata ProcessStep
#> 47 /Metadata SeriesIdentification
#> 48 /Metadata Source
#> 49 /Metadata/Source L1C_TB
#> 50 / Observations_Data
#> 51 /Observations_Data tb_h_obs
#> 52 /Observations_Data tb_h_obs_assim
#> 53 /Observations_Data tb_h_obs_errstd
#> 54 /Observations_Data tb_h_obs_time_sec
#> 55 /Observations_Data tb_h_orbit_flag
#> 56 /Observations_Data tb_h_resolution_flag
#> 57 /Observations_Data tb_v_obs
#> 58 /Observations_Data tb_v_obs_assim
#> 59 /Observations_Data tb_v_obs_errstd
#> 60 /Observations_Data tb_v_obs_time_sec
#> 61 /Observations_Data tb_v_orbit_flag
#> 62 /Observations_Data tb_v_resolution_flag
#> 63 / cell_column
#> 64 / cell_lat
#> 65 / cell_lon
#> 66 / cell_row
#> 67 / time
#> 68 / x
#> 69 / y
#> otype dclass dim
#> 0 H5I_GROUP
#> 1 H5I_DATASET FLOAT 3856 x 1624
#> 2 H5I_DATASET FLOAT 3856 x 1624
#> 3 H5I_DATASET FLOAT 3856 x 1624
#> 4 H5I_DATASET FLOAT 3856 x 1624
#> 5 H5I_DATASET FLOAT 3856 x 1624
#> 6 H5I_DATASET FLOAT 3856 x 1624
#> 7 H5I_DATASET FLOAT 3856 x 1624
#> 8 H5I_DATASET FLOAT 3856 x 1624
#> 9 H5I_DATASET FLOAT 3856 x 1624
#> 10 H5I_DATASET FLOAT 3856 x 1624
#> 11 H5I_DATASET STRING 1
#> 12 H5I_GROUP
#> 13 H5I_DATASET FLOAT 3856 x 1624
#> 14 H5I_DATASET FLOAT 3856 x 1624
#> 15 H5I_DATASET FLOAT 3856 x 1624
#> 16 H5I_DATASET FLOAT 3856 x 1624
#> 17 H5I_DATASET FLOAT 3856 x 1624
#> 18 H5I_DATASET FLOAT 3856 x 1624
#> 19 H5I_DATASET FLOAT 3856 x 1624
#> 20 H5I_DATASET FLOAT 3856 x 1624
#> 21 H5I_DATASET FLOAT 3856 x 1624
#> 22 H5I_GROUP
#> 23 H5I_GROUP
#> 24 H5I_GROUP
#> 25 H5I_GROUP
#> 26 H5I_GROUP
#> 27 H5I_GROUP
#> 28 H5I_GROUP
#> 29 H5I_GROUP
#> 30 H5I_GROUP
#> 31 H5I_GROUP
#> 32 H5I_GROUP
#> 33 H5I_GROUP
#> 34 H5I_GROUP
#> 35 H5I_GROUP
#> 36 H5I_GROUP
#> 37 H5I_GROUP
#> 38 H5I_GROUP
#> 39 H5I_GROUP
#> 40 H5I_GROUP
#> 41 H5I_GROUP
#> 42 H5I_GROUP
#> 43 H5I_GROUP
#> 44 H5I_GROUP
#> 45 H5I_GROUP
#> 46 H5I_GROUP
#> 47 H5I_GROUP
#> 48 H5I_GROUP
#> 49 H5I_GROUP
#> 50 H5I_GROUP
#> 51 H5I_DATASET FLOAT 3856 x 1624
#> 52 H5I_DATASET FLOAT 3856 x 1624
#> 53 H5I_DATASET FLOAT 3856 x 1624
#> 54 H5I_DATASET FLOAT 3856 x 1624
#> 55 H5I_DATASET INTEGER 3856 x 1624
#> 56 H5I_DATASET INTEGER 3856 x 1624
#> 57 H5I_DATASET FLOAT 3856 x 1624
#> 58 H5I_DATASET FLOAT 3856 x 1624
#> 59 H5I_DATASET FLOAT 3856 x 1624
#> 60 H5I_DATASET FLOAT 3856 x 1624
#> 61 H5I_DATASET INTEGER 3856 x 1624
#> 62 H5I_DATASET INTEGER 3856 x 1624
#> 63 H5I_DATASET INTEGER 3856 x 1624
#> 64 H5I_DATASET FLOAT 3856 x 1624
#> 65 H5I_DATASET FLOAT 3856 x 1624
#> 66 H5I_DATASET INTEGER 3856 x 1624
#> 67 H5I_DATASET FLOAT 1
#> 68 H5I_DATASET FLOAT 3856
#> 69 H5I_DATASET FLOAT 1624
Looking at this output, we can conclude that the file contains multiple arrays (notice the dim
column). These arrays correspond to things like estimated root zone soil moisture (/Analysis_Data/sm_rootzone_analysis
), estimated surface soil moisture (/Analysis_Data/sm_surface_analysis
), and estimated surface temperature (/Analysis_Data/surface_temp_analysis
). See https://nsidc.org/data/smap/spl4sm/data-fields#sm_surface_analysis for more detailed information on what these datasets represent and how they were generated.
The datasets that we are interested in are spatial grids. The smapr
package can extract these data into raster
objects with the extract_smap
function, which takes a dataset name as an argument. These names are paths that can be generated from the output of list_smap
. For example, if we want to get rootzone soil moisture, we can see a dataset with name sm_rootzone_analysis
in group /Analysis_Data
, so that the path to the dataset is /Analysis_Data/sm_rootzone_analysis
:
sm_raster <- extract_smap(local_files, '/Analysis_Data/sm_rootzone_analysis')
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
This will extract all of the data in the data frame local_files
, generating a RasterBrick with one layer per file:
sm_raster
#> class : RasterBrick
#> dimensions : 1624, 3856, 6262144, 8 (nrow, ncol, ncell, nlayers)
#> resolution : 9008.055, 9008.055 (x, y)
#> extent : -17367530, 17367530, -7314541, 7314541 (xmin, xmax, ymin, ymax)
#> coord. ref. : +proj=cea +lon_0=0 +lat_ts=30 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
#> data source : /home/max/.cache/smap/tmp.tif
#> names : SMAP_L4_S//Vv4030_001, SMAP_L4_S//Vv4030_001, SMAP_L4_S//Vv4030_001, SMAP_L4_S//Vv4030_001, SMAP_L4_S//Vv4030_001, SMAP_L4_S//Vv4030_001, SMAP_L4_S//Vv4030_001, SMAP_L4_S//Vv4030_001
#> min values : 0.01099210, 0.01104227, 0.01111004, 0.01108601, 0.01082948, 0.01094369, 0.01094055, 0.01095674
#> max values : 0.7999770, 0.7999318, 0.8000000, 0.7999870, 0.8000000, 0.8000000, 0.7999970, 0.7999533
We can visualize each layer in our RasterBrick:
If you want to crop the data to a study region, you can use the raster::crop
function. Let’s illustrate by focusing on the state of Colorado, which is approximately rectangular, so that we can define an extent
object using latitude and longitude values that roughly correspond to the state boundaries. The raster::extent()
function can be used with any Spatial*
or Raster*
object.
co_extent <- extent(c(-109, -102, 37, 41))
co_extent <- as(co_extent, "SpatialPolygons")
sp::proj4string(co_extent) <- "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
co_extent
#> class : SpatialPolygons
#> features : 1
#> extent : -109, -102, 37, 41 (xmin, xmax, ymin, ymax)
#> coord. ref. : +proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0
Now that we have a SpatialPolygons object, we can use this to crop the soil moisture raster to our study region. First, we need to ensure that the projections are the same:
Then, we could crop the soil moisture data to our polygon, which will make the extents match.
We might also have a polygon to use as a mask. For example, to mask out our Colorado polygon, we could do the following (operating on the first layer of the soil moisture raster for simplicity):
Notice that masking does not crop the raster, it simply sets all values outside of the polygon to NA. In some cases, an inverse mask is useful, where values inside the polygon are set to NA:
We may want to average soil moisture values across layers of a raster brick. This can be done with the raster::calc()
function:
Our SPL4SMAU data have estimated surface and rootzone soil moisture layers. If we want to compare these values, we can load the surface soil moisture data, compute the mean value over layers as we did for the rootzone soil moisture raster, and generate a scatterplot.
surface_raster <- extract_smap(local_files,
name = '/Analysis_Data/sm_surface_analysis')
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
#> Warning in H5Aread(A): Reading attribute data of type 'VLEN' not yet
#> implemented. Values replaced by NA's.
# compute mean
mean_surface_sm <- calc(surface_raster, fun = mean)
# compare values
plot(values(mean_sm), values(mean_surface_sm), col = 'dodgerblue', cex = .1,
xlab = 'Rootzone soil moisture', ylab = 'Surface soil moisture', bty = 'n')
abline(0, 1, lty = 2)