BioHDF - Open Binary File Standards for Bioinformatics

Award Information

Agency:
Department of Health and Human Services
Branch:
N/A
Award ID:
75670
Program Year/Program:
2005 / STTR
Agency Tracking Number:
HG003792
Solicitation Year:
N/A
Solicitation Topic Code:
N/A
Solicitation Number:
N/A
Small Business Information
GEOSPIZA, INC.
BOX 344, 2442 NW MARKET ST, SEATTLE, WA 98107
Woman-Owned: No
Minority-Owned: No
HUBZone-Owned: No
 
Phase 1
Fiscal Year: 2005
Title: BioHDF - Open Binary File Standards for Bioinformatics
Agency: HHS
Contract: 1R41HG003792-01
Award Amount: $142,775.00
 

Abstract:

DESCRIPTION (provided by applicant): Geospiza Inc. and the National Center for Supercomputing Applications (NCSA) are creating a standards-based software framework around NCSA's Hierarchical Data Format (HDF5). The envisioned framework will integrate algorithms important in DNA and protein sequence analysis to create scalable, high-throughput software systems. These systems will be accessed through new graphical user interfaces (GUIs) that give researchers new views of their data for finishing sequencing projects in large-scale genome sequencing, microbial genome sequencing, viral epidemiology, polymorphism detection, phylogenetic analysis, multi-locus sequence typing, confirmatory sequencing, and EST analysis. In our vision, algorithms will either be integrated into the system to read and write HDF5 project files directly, or they will communicate with project files via filter programs that produce standardized XML-formatted data. Through this model, a scalable solution will support different applications of DNA sequencing, fulfilling the many needs and requirements expressed by the medical research community now and into the future. As the first step in this process, we will define requirements for editing and versioning data in DNA sequencing; research and propose data models for the computational phases of DNA sequencing and annotating DNA sequence data using existing standards; create a prototype application for DNA sequencing-based SNP discovery; and engage the bioinformatics community for BioHDF adoption. In the past ten years, the cost of sequencing DNA has dropped more than 1,000-fold, and the amount of raw sequence data entering our national repositories is doubling every 12 months. DNA sequencing is fundamental to biological research activities such as genomics, systems biology, and clinical medicine.
Proposals are being sought to decrease sequencing costs by two orders of magnitude through technology refinements, with the ultimate vision of developing technology to sequence human genome equivalents for $1,000 each. The amount of data that will be produced through these endeavors is unimaginable. However, the $1,000 genome will not advance medical research unless we integrate all phases of the DNA sequencing process and treat the creation, management, finishing, analysis, and sharing of the data as common goals.

Principal Investigator:

Todd M. Smith
2066334403
TODD@GEOSPIZA.COM

Business Contact:

Todd Smith
2066334403
TODD@GEOSPIZA.COM
Small Business Information at Submission:

Geospiza, Inc.
Geospiza, Inc. Box 344, 2442 Nw Market St Seattle, WA 98107

EIN/Tax ID: 911894564
DUNS: N/A
Number of Employees: N/A
Woman-Owned: No
Minority-Owned: No
HUBZone-Owned: No
Research Institution Information:
UNIVERSITY OF ILLINOIS
OFFICE OF SPONSORED PROGRAMS & RESEARCH ADMIN
CHAMPAIGN, IL 61820
RI Type: Nonprofit college or university