TerrAdapt:Cascadia Documentation
Training Data


For each class in our landcover taxonomy, we created between 500 and 5,000 observations, divided among the years 2008, 2011, 2013, 2016, 2019, and 2021, based on the consensus among the following landcover-related datasets:

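| Dataset | Citation | Data availability |
| --- | --- | --- |
| Dynamic World v1 | Brown et al. 2022 | 2015-present |
| LCMAP v1.3 | Xian et al. 2022 | 1985-2021 |
| US National Landcover Database | Homer et al. 2020 | |
| Rangeland Analysis Platform | Jones et al. 2018 | |
| JRC Global Surface Water | Pekel et al. 2016 | |
| USDA CroplandCROS | Boryan et al. 2011 | |
| Canada Annual Crop Inventory | Fisette et al. 2013 | |
| North American Land Change Monitoring System | | |
| ESA WorldCover | | |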

Common landcover classes received more observations than rare classes. Not all datasets were available for every year across the full extent of the region. In addition, the landcover datasets in the table above only partially conformed to our landcover taxonomy: some landcover classes were represented in most datasets, while others appeared in only a small subset. As a result, the number of datasets available to assess agreement varied by class, location, and year. In all cases, however, at least 2 datasets agreed on the class of each observation in our training dataset. The training dataset contained 78,500 samples in total.
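The agreement rule can be illustrated with a minimal sketch. The text only states that at least 2 datasets agreed on each retained observation; the exact consensus rule is not given, so the majority vote and tie rejection below are assumptions, as are the function name and the example class names.

```python
from collections import Counter
from typing import Optional, Sequence

MIN_AGREEING = 2  # from the text: at least 2 datasets agree on the class


def consensus_class(
    labels: Sequence[Optional[str]],
    min_agreeing: int = MIN_AGREEING,
) -> Optional[str]:
    """Return the agreed class for one location and year, or None.

    `labels` holds one entry per dataset. None means the dataset was not
    available there, or does not map the class. The majority rule and the
    tie rejection are assumptions made for this sketch.
    """
    available = [label for label in labels if label is not None]
    if not available:
        return None
    ranked = Counter(available).most_common()
    top_label, top_votes = ranked[0]
    if top_votes < min_agreeing:
        return None  # too few datasets agree
    if len(ranked) > 1 and ranked[1][1] == top_votes:
        return None  # ambiguous: two classes have the same support
    return top_label
```

For example, `consensus_class(["forest", "forest", None, "shrub"])` keeps the observation as `"forest"`, while a location covered by a single dataset is dropped.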

For each observation of a consensus landcover class, our Google Earth Engine (GEE) based workflow extracted the values of all covariates for the year matching the sampling year of the observation. The result was a table (a GEE FeatureCollection) with attributes for the consensus landcover class and all covariates.
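The year matching can be sketched in plain Python, without the GEE API. This is not the workflow's actual code: the per-year sampler functions, the dictionary keys, and the toy covariate names are hypothetical stand-ins for the real covariate layers.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)
Sampler = Callable[[Point], Dict[str, float]]

SAMPLE_YEARS = (2008, 2011, 2013, 2016, 2019, 2021)  # years listed in the text


def make_toy_sampler(year: int) -> Sampler:
    """Hypothetical stand-in for sampling that year's covariate layers."""
    def sample(point: Point) -> Dict[str, float]:
        return {"toy_year_of_imagery": float(year), "toy_elevation_m": 100.0}
    return sample


def build_training_table(
    observations: List[dict],
    samplers_by_year: Dict[int, Sampler],
) -> List[dict]:
    """Attach covariates from the observation's own sampling year."""
    rows = []
    for obs in observations:
        year = obs["year"]
        if year not in samplers_by_year:
            raise KeyError(f"no covariates available for year {year}")
        row = {"landcover": obs["landcover"], "year": year}
        row.update(samplers_by_year[year](obs["point"]))
        rows.append(row)
    return rows
```

Each output row combines the consensus class with the covariates for the same year, which mirrors the attribute table described above.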

All training data locations were assigned a 'fold' (a value from 1 to 10) that was used to reduce the effect of spatial autocorrelation on model validation. Folds were assigned from a 10 km x 10 km grid whose cells were randomly assigned fold values across the region. During model training, one fold of data was always withheld. Because folds were assigned spatially, the withheld data were spatially separated from the data in all other folds and were therefore expected to provide a more independent assessment than validation based on spatially intermixed folds.
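A minimal sketch of spatially blocked folds follows. It assumes point coordinates are already in a projected system with units of metres; the seed, the function names, and the seeded-per-cell randomization are illustrative choices, not the project's implementation.

```python
import math
import random
from typing import List, Tuple

CELL_SIZE_M = 10_000  # 10 km x 10 km grid cells, as described in the text
N_FOLDS = 10          # fold values run from 1 to 10


def cell_index(x_m: float, y_m: float, cell_size: float = CELL_SIZE_M) -> Tuple[int, int]:
    """Column and row of the grid cell that contains a projected point."""
    return (math.floor(x_m / cell_size), math.floor(y_m / cell_size))


def fold_for_cell(cell: Tuple[int, int], seed: int = 0, n_folds: int = N_FOLDS) -> int:
    """Random but repeatable fold for a cell (the seed is an assumption)."""
    rng = random.Random(f"{seed}:{cell[0]}:{cell[1]}")
    return rng.randint(1, n_folds)


def assign_fold(x_m: float, y_m: float, seed: int = 0) -> int:
    """All points that fall in the same grid cell share one fold."""
    return fold_for_cell(cell_index(x_m, y_m), seed)


def leave_one_fold_out(rows: List[dict], held_out: int) -> Tuple[List[dict], List[dict]]:
    """Split rows into training rows and the withheld fold."""
    train = [row for row in rows if row["fold"] != held_out]
    withheld = [row for row in rows if row["fold"] == held_out]
    return train, withheld
```

Because the fold is a property of the grid cell and not of the individual point, neighbouring observations inside a cell are never split between training and validation data.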

Dataset sources:
  • Dynamic World: Brown et al. 2022
  • LCMAP: Xian et al. 2022
  • US National Landcover Database: Homer et al. 2020
  • Rangeland Analysis Platform: Jones et al. 2018
  • JRC Global Surface Water: Pekel et al. 2016
  • USDA CroplandCROS: Boryan et al. 2011
  • Canada Annual Crop Inventory: Fisette et al. 2013
  • North American Land Change Monitoring System
  • ESA WorldCover