USDA Forest Service
  
Research & Development Treesearch

 

Publication Information


Title: An empirical evaluation of two-stage species tree inference strategies using a multilocus dataset from North American pines

Author: DeGiorgio, Michael; Syring, John; Eckert, Andrew J.; Liston, Aaron; Cronn, Richard; Neale, David B.; Rosenberg, Noah A.

Date: 2014

Source: BMC Evolutionary Biology

Publication Series: Scientific Journal (JRNL)

Description: Background: As it becomes increasingly possible to obtain DNA sequences of orthologous genes from diverse sets of taxa, species trees are frequently being inferred from multilocus data. However, the behavior of many methods for performing this inference has remained largely unexplored. Some methods have been proven to be consistent given certain evolutionary models, whereas others rely on criteria that, although appropriate for many parameter values, have peculiar zones of the parameter space in which they fail to converge on the correct estimate as data sets increase in size.

Results: Here, using North American pines, we empirically evaluate the behavior of 24 strategies for species tree inference using three alternative outgroups (72 strategies total). The data consist of 120 individuals sampled in eight ingroup species from subsection Strobus and three outgroup species from subsection Gerardianae, spanning ∼47 kilobases of sequence at 121 loci. Each “strategy” for inferring species trees consists of three features: a species tree construction method, a gene tree inference method, and a choice of outgroup. We use multivariate analysis techniques such as principal components analysis and hierarchical clustering to identify tree characteristics that are robustly observed across strategies, as well as to identify groups of strategies that produce trees with similar features. We find that strategies that construct species trees using only topological information cluster together and that strategies that use additional non-topological information (e.g., branch lengths) also cluster together. Strategies that utilize more than one individual within a species to infer gene trees tend to produce estimates of species trees that contain clades present in trees estimated by other strategies. Strategies that use the minimize-deep-coalescences criterion to construct species trees tend to produce species tree estimates that contain clades that are not present in trees estimated by the Concatenation, RTC, SMRT, STAR, and STEAC methods, and that in general are more balanced than those inferred by these other strategies.

Conclusions: When constructing a species tree from a multilocus set of sequences, our observations provide a basis for interpreting differences in species tree estimates obtained via different approaches that have a two-stage structure in common, one step for gene tree estimation and a second step for species tree estimation. The methods explored here employ a number of distinct features of the data, and our analysis suggests that recovery of the same results from multiple methods that tend to differ in their patterns of inference can be a valuable tool for obtaining reliable estimates.
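Illustration (not part of the publication record): the strategy count in the abstract can be checked with a minimal sketch. It treats each strategy as a (species tree method, gene tree method, outgroup) triple, as the abstract defines it. The six species tree labels come from the methods named in the abstract, with "MDC" standing for the minimize-deep-coalescences criterion. The four gene tree labels and three outgroup labels are hypothetical placeholders, and the split of the 24 method combinations into 6 x 4 is an assumption, since the abstract does not give that breakdown.

```python
from itertools import product

# Species tree construction methods named in the abstract
# ("MDC" abbreviates the minimize-deep-coalescences criterion).
SPECIES_TREE_METHODS = ["Concatenation", "MDC", "RTC", "SMRT", "STAR", "STEAC"]

# ASSUMPTION: placeholder labels. The abstract does not name the gene tree
# inference methods, and four of them (so that 6 x 4 = 24) is a guess.
GENE_TREE_METHODS = ["gene_tree_A", "gene_tree_B", "gene_tree_C", "gene_tree_D"]

# ASSUMPTION: placeholder labels for the three alternative outgroups.
OUTGROUPS = ["outgroup_1", "outgroup_2", "outgroup_3"]


def enumerate_strategies(species_methods, gene_methods, outgroups):
    """Return every (species tree method, gene tree method, outgroup) triple."""
    return list(product(species_methods, gene_methods, outgroups))


strategies = enumerate_strategies(SPECIES_TREE_METHODS, GENE_TREE_METHODS, OUTGROUPS)

# Method combinations with the outgroup choice ignored.
method_pairs = {(species, gene) for species, gene, _ in strategies}

print(len(method_pairs), "method combinations;", len(strategies), "strategies in total")
```

Under these assumptions the sketch yields 24 method combinations and 72 strategies, matching the totals stated in the abstract.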

Publication Notes:

  • We recommend that you also print this page and attach it to the printout of the article, to retain the full citation information.
  • This article was written and prepared by U.S. Government employees on official time, and is therefore in the public domain.


Citation:


DeGiorgio, Michael; Syring, John; Eckert, Andrew J.; Liston, Aaron; Cronn, Richard; Neale, David B.; Rosenberg, Noah A. 2014. An empirical evaluation of two-stage species tree inference strategies using a multilocus dataset from North American pines. BMC Evolutionary Biology. 14: 67 p

 

