Publication Information

Title: Evaluating effectiveness of down-sampling for stratified designs and unbalanced prevalence in Random Forest models of tree species distributions in Nevada

Author: Freeman, Elizabeth A.; Moisen, Gretchen G.; Frescino, Tracy S.

Date: 2012

Source: Ecological Modelling. 233: 1-10.

Publication Series: Scientific Journal (JRNL)

Description: Random Forests is frequently used to model species distributions over large geographic areas. Complications arise when the data used to train the models have been collected in stratified designs with a different sampling intensity in each stratum. The modeling process is further complicated if some of the target species are relatively rare on the landscape, leading to an unbalanced number of presences and absences in the training data. We explored ways to accommodate both unequal sampling intensity across strata and unbalanced species prevalence in Random Forest models of tree and shrub species distributions in the state of Nevada. For unequal sampling intensity, we tested three modeling strategies: fitting models using all the data, down-sampling the intensified stratum, and building separate models for each stratum. We explored unbalanced species prevalence by investigating the effects of down-sampling the more prevalent response (presence or absence) and of optimizing the cutoff thresholds for declaring a species present. When modeling species presence with stratified data collected at different sampling intensities per stratum, we found that neither down-sampling the intensified stratum nor fitting individual strata models improved model performance. We also found that balancing the number of presences and absences in a training data set by down-sampling did not improve predictive models of species distributions and did not eliminate the need to optimize thresholds. We then applied our final choice of model to the full raster layers for Nevada to produce statewide species distribution maps.
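
The two prevalence-related techniques named in the description, down-sampling the more prevalent response and optimizing the presence cutoff, can be illustrated with a minimal sketch. This is not the authors' code. The function names (`down_sample`, `kappa`, `best_cutoff`) are hypothetical, and the use of Cohen's kappa as the optimization criterion is an assumption, since the description does not say which criterion was used. The predicted probabilities are assumed to come from an already fitted model.

```python
import random


def down_sample(labels, seed=0):
    """Return indices holding equal numbers of presences (1) and absences (0)."""
    rng = random.Random(seed)
    presences = [i for i, y in enumerate(labels) if y == 1]
    absences = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(presences), len(absences))
    # Keep every record of the rarer class; randomly thin the more prevalent one.
    return sorted(rng.sample(presences, n) + rng.sample(absences, n))


def kappa(labels, probs, cutoff):
    """Cohen's kappa when a species is declared present at probs >= cutoff."""
    predicted = [1 if p >= cutoff else 0 for p in probs]
    pairs = list(zip(labels, predicted))
    n = len(pairs)
    tp = sum(1 for y, yhat in pairs if y == 1 and yhat == 1)
    tn = sum(1 for y, yhat in pairs if y == 0 and yhat == 0)
    fp = sum(1 for y, yhat in pairs if y == 0 and yhat == 1)
    fn = sum(1 for y, yhat in pairs if y == 1 and yhat == 0)
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    if expected == 1.0:
        return 0.0
    return (observed - expected) / (1.0 - expected)


def best_cutoff(labels, probs, steps=100):
    """Scan cutoffs 1/steps .. (steps-1)/steps and return the kappa-maximizing one."""
    candidates = [k / steps for k in range(1, steps)]
    return max(candidates, key=lambda c: kappa(labels, probs, c))


# Toy data for a rare species: 2 presences among 10 plots (illustrative only).
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
probs = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.30, 0.35, 0.40]

balanced = down_sample(labels)
cutoff = best_cutoff(labels, probs)
```

In this toy case the default cutoff of 0.5 would declare the species absent everywhere, while the optimized cutoff falls well below 0.5. That is the kind of situation in which threshold optimization matters for a low-prevalence species.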

Keywords: random forests, species distributions, down sampling, species prevalence

Publication Notes:

  • This article was written and prepared by U.S. Government employees on official time, and is therefore in the public domain.

Citation:


Freeman, Elizabeth A.; Moisen, Gretchen G.; Frescino, Tracy S. 2012. Evaluating effectiveness of down-sampling for stratified designs and unbalanced prevalence in Random Forest models of tree species distributions in Nevada. Ecological Modelling. 233: 1-10.