Commercial Software Using High-Throughput Computational Techniques to Improve Genome Analysis

Award Information
Agency: Department of Health and Human Services
Branch: National Institutes of Health
Contract: 4R44HG009474-02
Agency Tracking Number: R44HG009474
Amount: $224,417.00
Phase: Phase I
Program: SBIR
Solicitation Topic Code: 172
Solicitation Number: PA15-269
Timeline
Solicitation Year: 2015
Award Year: 2017
Award Start Date (Proposal Award Date): 2017-02-13
Award End Date (Contract End Date): 2017-07-31
Small Business Information
3151 VILLAGE CIR S
Ann Arbor, MI 48108-2243
United States
DUNS: 080055927
HUBZone Owned: No
Woman Owned: No
Socially and Economically Disadvantaged: No
Principal Investigator
 MARK KIEL
 (734) 223-2519
 kiel@genomenon.com
Business Contact
 MARK KIEL
Phone: (734) 223-2519
Email: kiel@genomenon.com
Research Institution
N/A
Abstract

Recent advances in DNA sequencing technology have not been matched by improved analytic techniques to
quickly and accurately interpret patient genome data to inform diagnosis, prognosis, and therapy-making
decisions in the clinic and to identify candidate biomarkers of disease in research laboratories. Development of
automated techniques to facilitate interpretation of this data will benefit patient care and improve public health
by promoting widespread use of cost-efficient sequencing clinically and by making it feasible to sequence a
broader range of patients, including those with complex disease, or to identify patients who have an elevated
risk of developing future disease. Our long-term goal is to commoditize sequence interpretation using
high-throughput computational techniques in the same way that next-generation DNA sequencing technology has
commoditized genome data production. The present project will result in commercial software that automates
genome sequence interpretation. Specifically, we will develop software that automatically collects and
organizes a comprehensive set of genetic information by systematically reading millions of scientific articles
and scanning dozens of genetic variant databases; software that uses this information to prioritize patient
data into clinical categories based on the likelihood of disease; and software that automatically identifies
candidate biomarkers of disease from multi-sample cohort data. To do this, we will use a variety of innovative
data processing techniques. First, we will systematically mutate the reference genome in silico to produce a
comprehensive database of every possible mutation at every position of every gene and use this data to query
every word of every article ever published or any publicly available database to identify disease-gene-variant
associations. We will compare the results from this automated process to results obtained using more
expensive and time-consuming manual methods and hypothesize that we can achieve concordance and
identify more variants and fold more references for each. These results will be organized into clinically
meaningful categories and presented in an interactive graphical interface that displays the evidence for each of
these associations. We will then use this information to drive prioritization of patient data based on similarities
to known disease-causing variants and the strength of evidence for their pathogenicity in order to increase
analytic sensitivity and specificity, thereby improving speed and reliability of sequencing in the clinic. Our
automated results will then be compared to conventional methods of data annotation and filtration for >
patient samples from diseases. Finally, we will use the same prioritization strategy to comprehensively
compare variant data between all patients within a disease cohort to automatically identify the variants most
likely to lead to disease and compare our automated results to conventional methods for > samples from
diseases. The growth in the $ B genome sequencing market is driven by improvements in informatics
techniques, and automated solutions such as proposed here have significant commercial potential.

The successful completion of the proposed project will contribute to the public health mission of
the NIH by promoting more widespread adoption of genome sequencing by making the
interpretation of this data more accurate and cost-effective in clinical and research laboratories.
The community of users that can benefit from this research includes geneticists, oncologists,
pathologists, researchers, and patients.
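
The in silico mutation step described in the abstract can be illustrated with a minimal sketch. This is not the awardee's software: the function names, the gene symbol `GENE1`, the three-base reference sequence, and the example sentence are all hypothetical, and the sketch covers only single-base substitutions written in a simplified `c.<position><ref>><alt>` form matched by plain substring search.

```python
from typing import Iterator, List

BASES = "ACGT"


def enumerate_substitutions(reference: str) -> Iterator[str]:
    """Yield every single-base substitution of a reference sequence.

    Each variant is written as 'c.<pos><ref>><alt>' with 1-based positions.
    (Hypothetical helper; real variant nomenclature covers far more cases.)
    """
    for pos, ref in enumerate(reference.upper(), start=1):
        for alt in BASES:
            if alt != ref:
                yield f"c.{pos}{ref}>{alt}"


def find_mentions(gene: str, reference: str, text: str) -> List[str]:
    """Return the enumerated variants that occur in `text` alongside `gene`.

    Assumption: a plain substring match stands in for whatever text mining
    the actual project performs over articles and databases.
    """
    if gene not in text:
        return []
    return [v for v in enumerate_substitutions(reference) if v in text]


# Hypothetical inputs, for illustration only.
sentence = "The GENE1 variant c.2T>C was observed in affected patients."
mentions = find_mentions("GENE1", "ATG", sentence)
```

Here every position of the toy reference yields three alternative bases, and only the variant strings that literally appear in the text next to the gene symbol are kept. A production system would also need to handle insertions, deletions, protein-level notation, and the many ways authors write the same variant.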

* Information listed above is at the time of submission. *