USDA Forest Service

Research & Development Treesearch

Publication Information

Title: Meeting the challenges of non-referenced genome assembly from short-read sequence data

Author: Parks, M.; Liston, A.; Cronn, R.

Date: 2010

Source: Acta Horticulturae. 859: 323-332

Publication Series: Scientific Journal (JRNL)

Description: Massively parallel sequencing technologies (MPST) offer unprecedented opportunities for novel sequencing projects. Although MPST offer tremendous sequencing capacity, they are typically most effective in resequencing projects (as opposed to the sequencing of novel genomes) because sequence is returned in relatively short reads. Nonetheless, there is great interest in applying MPST to genome sequencing in non-model organisms. We have developed a bioinformatics pipeline to assemble short-read sequence data into nearly complete chloroplast genomes using a combination of de novo and reference-guided assembly, while decreasing reliance on a reference genome. Initially, short-read sequences are assembled into larger contigs using de novo assembly. De novo contigs are then aligned to the corresponding reference genome of the most closely related taxon available and merged to form a consensus sequence. The consensus sequence and reference are in turn 'merged' such that aligned de novo sequence remains unaffected while missing sequence is filled in using the reference sequence. This chimeric reference is then utilized in reference-guided assembly to align the original short-read data, resulting in a draft plastome. Using two established Pinus reference plastomes, our method has been effective in the assembly of 33 chloroplast genomes within the genus Pinus, and results with four species representing other genera of Pinaceae suggest the method will be of general use in land plants, particularly once limitations of PCR-based chloroplast enrichment are overcome.
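
The 'merge' step in the description, in which aligned de novo sequence is kept and missing sequence is filled in from the reference, can be illustrated with a minimal Python sketch. This is not the authors' pipeline code. It assumes the consensus and reference are already aligned column by column to equal length, that positions not covered by de novo contigs are marked `N`, and that alignment gaps are written as `-`. The function name `build_chimeric_reference` and these conventions are hypothetical and chosen only for illustration.

```python
def build_chimeric_reference(consensus: str, reference: str, missing: str = "N") -> str:
    """Fill uncovered consensus positions with the aligned reference base.

    Assumptions (not taken from the source): both strings are rows of the
    same pairwise alignment, `missing` marks consensus positions with no
    de novo coverage, and '-' marks an alignment gap.
    """
    if len(consensus) != len(reference):
        raise ValueError("consensus and reference must be aligned to equal length")

    merged = []
    for cons_base, ref_base in zip(consensus, reference):
        # Keep de novo sequence untouched; borrow from the reference only
        # where the consensus has no information.
        merged.append(ref_base if cons_base == missing else cons_base)

    # Drop alignment gap characters so the result is a plain sequence.
    return "".join(merged).replace("-", "")


if __name__ == "__main__":
    consensus = "ACGTNNNNTTGA"   # de novo consensus with an uncovered stretch
    reference = "ACGAACGTTTGA"   # aligned reference from a related taxon
    print(build_chimeric_reference(consensus, reference))
```

In this toy example the de novo base that differs from the reference (position 4, `T` versus `A`) is preserved, while the four uncovered positions are taken from the reference. The resulting chimeric sequence would then serve as the target for reference-guided assembly of the original reads.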

Keywords: next-generation sequencing, massively parallel sequencing, Pinus, Illumina

Publication Notes:

  • We recommend that you also print this page and attach it to the printout of the article, to retain the full citation information.
  • This article was written and prepared by U.S. Government employees on official time, and is therefore in the public domain.



Parks, M.; Liston, A.; Cronn, R. 2010. Meeting the challenges of non-referenced genome assembly from short-read sequence data. Acta Horticulturae. 859: 323-332.

