Scalable tools for the analysis of chemical compounds using graph-based querying

Award Information
Agency:
Department of Health and Human Services
Branch:
n/a
Amount:
$223,300.00
Award Year:
2007
Program:
SBIR
Phase:
Phase I
Contract:
1R43GM081328-01
Award Id:
85725
Agency Tracking Number:
GM081328
Solicitation Year:
n/a
Solicitation Topic Code:
n/a
Solicitation Number:
n/a
Small Business Information
ACELOT, INC., 5385 Hollister Avenue, #111, SANTA BARBARA, CA, 93111
Hubzone Owned:
N
Minority Owned:
N
Woman Owned:
N
Duns:
784692001
Principal Investigator:
WILLIAM LINDSTROM
Business Contact:
AMBUJ SINGH
singhambujk@gmail.com
Research Institute:
n/a
Abstract
DESCRIPTION (provided by applicant): The generation, manipulation, storage and retrieval of chemical structures and subsequent calculation of various properties, often related to their biological activity, have become extremely important for drug discovery. The resulting field of Cheminformatics has blossomed in recent years and has been a hotbed for the application of data mining and database principles to collections of chemical compounds. The wide adoption of these techniques has led to improved methods for representation of chemical structures, similarity-based retrieval of chemical compounds, diversity analysis, and substructure mining. The representation of chemical compounds as graphs captures the essential aspects of chemical structures in a natural way that can be communicated easily. Recent techniques for graph querying and mining have demonstrated great promise for scalability as well as an improved quality of results over traditional representation techniques such as fingerprints. These techniques include novel ways of graph matching, the organization of graphs in a hierarchical index structure, and the mining of a set of graphs to find statistically over-represented motifs. The proposed research will develop computational tools based on these ideas and investigate the feasibility of the techniques on diverse and large data sets. Graph-based techniques for similar compound retrieval, diversity analysis, and substructure mining will be compared to competing techniques based on other representations of chemical structures. Finally, a system that integrates chemical compound databases with biological databases will be developed. The resulting analysis methods are expected to make a significant impact on the complex, time-consuming, and expensive process of drug discovery. Graph-based representation of chemical compounds results in a more accurate realization of the chemical space.
The use of recent techniques in graph querying and mining will enable data analysis that can scale to millions of compounds. The developed system will also integrate information on chemical compounds with biological activity and protein interaction networks, thus enabling more efficient drug discovery.
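
As a rough illustration of the graph representation the abstract describes, the sketch below stores a molecule as a labeled graph (atoms as nodes, bonds as edges with a bond order) and answers a substructure query by plain backtracking. It is not the applicant's method: the class name MolGraph, the function has_substructure, and the example molecules are hypothetical, hydrogens are omitted, and the search uses no index, so it would not scale to the data set sizes the proposal targets.

```python
from typing import Dict, List, Tuple


class MolGraph:
    """Hypothetical labeled graph: atoms are nodes, bonds are weighted edges."""

    def __init__(self, atoms: List[str], bonds: List[Tuple[int, int, int]]):
        self.atoms = atoms
        # adj[a][b] holds the bond order between atoms a and b
        self.adj: Dict[int, Dict[int, int]] = {i: {} for i in range(len(atoms))}
        for a, b, order in bonds:
            self.adj[a][b] = order
            self.adj[b][a] = order


def has_substructure(mol: MolGraph, query: MolGraph) -> bool:
    """Return True if query occurs in mol with matching atom labels and bond orders."""
    mapping: Dict[int, int] = {}  # query atom index -> mol atom index
    used = set()                  # mol atoms already assigned

    def extend(q: int) -> bool:
        if q == len(query.atoms):
            return True  # every query atom has been placed
        for m in range(len(mol.atoms)):
            if m in used or mol.atoms[m] != query.atoms[q]:
                continue
            # Each bond from q to an already-placed query atom must exist
            # in mol between the corresponding atoms, with the same order.
            consistent = all(
                mol.adj[m].get(mapping[n]) == order
                for n, order in query.adj[q].items()
                if n in mapping
            )
            if consistent:
                mapping[q] = m
                used.add(m)
                if extend(q + 1):
                    return True
                del mapping[q]  # backtrack and try the next candidate
                used.remove(m)
        return False

    return extend(0)


# Heavy-atom skeletons only (hydrogens omitted for brevity).
ethanol = MolGraph(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
acetic_acid = MolGraph(["C", "C", "O", "O"], [(0, 1, 1), (1, 2, 2), (1, 3, 1)])
carbonyl = MolGraph(["C", "O"], [(0, 1, 2)])  # C=O query

print(has_substructure(acetic_acid, carbonyl))  # True
print(has_substructure(ethanol, carbonyl))      # False
```

A hierarchical index of the kind mentioned in the abstract would serve to prune most candidate molecules before an exact match like this one is attempted.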

* information listed above is at the time of submission.
