USDA Forest Service

Research & Development Treesearch

Publication Information

Title: Sample sizes and model comparison metrics for species distribution models

Author: Hanberry, B.B.; He, H.S.; Dey, D.C.

Date: 2012

Source: Ecological Modelling. 227: 29-33.

Publication Series: Scientific Journal (JRNL)

Description: Species distribution models use small samples to produce continuous distribution maps. The question of how small a sample can be to produce an accurate model generally has been answered based on comparisons to maximum sample sizes of 200 observations or fewer. In addition, model comparisons often are made with the kappa statistic, which has become controversial. Therefore, we used sample sizes ranging from 30 to 2500 individuals to model 16 tree species or species groups in Minnesota's Laurentian Mixed Forest. We compared all smaller sample sizes to models for 2500 records and then 1000 records using Cohen's kappa, Pearson's r, Cronbach's alpha, and two intraclass correlation coefficients. We then began confirmation of our findings by repeating the process using a smaller extent in a different area, a portion of Missouri's Central Hardwoods. Although there are disadvantages to using the kappa statistic and intraclass correlation coefficients, due to conversion to categories and computation limitations, respectively, the model comparison metrics produced similar results. Comparison values depend on the maximum sample size, and at sample sizes of roughly 10-20% of the maximum sample size, values begin to decrease more rapidly. Models may not be very accurate below a sample size of 200 for our study areas, extents, and grains. Nonetheless, models based on small sample sizes still may provide information for rare species. We recommend using the full sample available for modeling, after using a partial sample for accuracy assessment. Future research is needed to confirm our findings for different areas, extents, grains, and species.
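
As an illustration of the kind of comparison the description mentions, the following is a minimal sketch of three of the named metrics (Pearson's r, Cohen's kappa, and Cronbach's alpha) applied to predictions from a full-sample model and a reduced-sample model. It is not the authors' code. The function names, the 0.5 threshold used to convert continuous predictions to categories for kappa, and the six sample values are all assumptions made for the example; the intraclass correlation coefficients are omitted.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length prediction vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cohens_kappa(a, b):
    """Cohen's kappa for two equal-length lists of category labels.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(a)
    labels = set(a) | set(b)
    p_obs = sum(1 for u, v in zip(a, b) if u == v) / n
    p_exp = sum((a.count(k) / n) * (b.count(k) / n) for k in labels)
    return (p_obs - p_exp) / (1 - p_exp)


def cronbach_alpha(*items):
    """Cronbach's alpha, treating each model's predictions as one item."""
    k = len(items)
    totals = [sum(vals) for vals in zip(*items)]
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))


def to_classes(probs, threshold=0.5):
    """Convert continuous predictions to presence (1) / absence (0).

    The 0.5 threshold is an assumption for this sketch.
    """
    return [1 if p >= threshold else 0 for p in probs]


# Hypothetical predictions at the same six locations (made-up values).
full_model = [0.9, 0.8, 0.7, 0.2, 0.1, 0.4]
small_model = [0.8, 0.9, 0.4, 0.3, 0.2, 0.6]

r = pearson_r(full_model, small_model)
kappa = cohens_kappa(to_classes(full_model), to_classes(small_model))
alpha = cronbach_alpha(full_model, small_model)
print(f"r={r:.3f}  kappa={kappa:.3f}  alpha={alpha:.3f}")
```

Kappa is the only one of the three that needs the predictions converted to categories first, which is the disadvantage the description attributes to it; the other two work directly on the continuous values.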

Keywords: Correlation, Forest inventory and analysis, Intraclass correlation coefficient, Kappa

Publication Notes:

  • We recommend that you also print this page and attach it to the printout of the article, to retain the full citation information.
  • This article was written and prepared by U.S. Government employees on official time, and is therefore in the public domain.
  • This publication may be available in hard copy. Check the Northern Research Station web site to request a printed copy of this publication.
  • Our on-line publications are scanned and captured using Adobe Acrobat. During the capture process some typographical errors may occur. Please contact Sharon Hobrla if you notice any errors that make this publication unusable.



Hanberry, B.B.; He, H.S.; Dey, D.C. 2012. Sample sizes and model comparison metrics for species distribution models. Ecological Modelling. 227: 29-33.

