New Strategies for De Novo Sequencing of Daunting Genomes

Award Information
Agency: Department of Health and Human Services
Branch: N/A
Contract: 1R43HG006022-01
Agency Tracking Number: HG006022
Amount: $136,540.00
Phase: Phase I
Program: SBIR
Awards Year: 2010
Solicitation Year: 2010
Solicitation Topic Code: NHGRI
Solicitation Number: PHS2010-2
Small Business Information
DUNS: 019710669
HUBZone Owned: N
Woman Owned: N
Socially and Economically Disadvantaged: N
Principal Investigator
Business Contact
Phone: (608) 831-9011
Research Institution
DESCRIPTION (provided by applicant): Next-generation sequencing (NGS) platforms are fundamentally altering genetic and genomic research by providing massive amounts of data in a low-cost, high-throughput format. The main drawback of existing technologies is the short sequence read lengths they produce. As a result, de novo assembly of daunting genomes is still impossible, and resequencing and assembly of human genomes remain a significant challenge when analyzing complex genomic regions. New tools are clearly needed to bridge the gap between massively parallel short-read sequencing technologies (35-500 bases) and the large scaffolds needed to assemble a genome (100,000 bases). The SBIR Phase I grant proposal "New Strategies for De Novo Sequencing of Daunting Genomes" proposes to develop a new front end to NGS. The technology to construct paired-end clone-free libraries from large, randomly sheared DNA fragments (50-300 kb) has not been developed. A high-efficiency universal protocol for making clone-free libraries will generate long physical scaffolds from the paired ends of 50, 100 and 300 kb inserts, enabling the accurate assembly of complex genomes, much like fosmid and BAC end sequences in conventional clone-based strategies. A new virtual BAC library construction technology will replace the conventional clone-based method. A clone-free 300 kb insert library will be constructed, and individual members will be completely sequenced using the new tools developed for the first time in this proposal. The production of numerous contiguous 300 kb regions of sequence from a chromosome will dramatically simplify the accurate assembly of complex genomic regions as well as complex genomes, much like the sequencing of entire BAC clones in conventional strategies. The development of these tools could reduce the computational cost of genome assembly by 2-3 orders of magnitude, produce more complete and accurate genomes, enable the de novo sequencing of daunting genomes, and make personal genome resequencing and metagenomics tractable.
PUBLIC HEALTH RELEVANCE: The practical result of this work will be the accurate assembly of complex regions of the human genome associated with disease, as well as the ability to assemble entire genomes using random sequencing strategies. DNA sequencing of individual human genomes can unlock the genetic basis of complex diseases and as such is important to our medical well-being. Metagenomic analysis of hundreds of unique organisms that cannot be cultivated can unlock new metabolic pathways for small-molecule drugs and other industry applications. True de novo sequencing of novel genomes of complex organisms can shed light on comparative genomics and the evolutionary history of life, and can improve our understanding of all forms of life on Earth, which together form a web that humans need for survival.
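Illustrative note (not part of the applicant's text): the abstract says that paired ends from inserts of known size act as long physical scaffolds. The Python sketch below shows the basic arithmetic behind that idea under simplifying assumptions. It estimates the gap between two contigs from read pairs that span it. The function name, the coordinate convention, and all numbers are hypothetical and chosen only for illustration; this is not the method proposed in the award.

```python
from statistics import median


def estimate_gap(insert_size, left_contig_len, pairs):
    """Estimate the gap (in bases) between two adjacent contigs.

    Assumptions (hypothetical, for illustration only):
      - every pair comes from a fragment of roughly `insert_size` bases;
      - each pair is (pos_left, pos_right), where pos_left is the 0-based
        start of the first read on the left contig and pos_right is the
        0-based exclusive end of its mate on the right contig.
    """
    estimates = []
    for pos_left, pos_right in pairs:
        # Bases of the fragment lying on the left contig.
        on_left = left_contig_len - pos_left
        # Bases of the fragment lying on the right contig.
        on_right = pos_right
        # Whatever is left of the fragment must fall in the gap.
        estimates.append(insert_size - on_left - on_right)
    # The median damps the effect of fragments that deviate from insert_size.
    return median(estimates)


if __name__ == "__main__":
    # Three invented pairs from a 100 kb insert library.
    pairs = [(30_000, 40_000), (50_000, 61_000), (20_000, 29_000)]
    print(estimate_gap(100_000, 80_000, pairs))  # prints 10000
```

A larger insert size lets a pair span a longer unassembled or repetitive region, which is why the abstract emphasizes 50-300 kb inserts over the short inserts of standard libraries.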

* Information listed above is at the time of submission. *
