Flexible NLP system for MEDLINE information extraction

Award Information

Agency:
Department of Health and Human Services
Branch:
N/A
Award ID:
66515
Program Year/Program:
2003 / SBIR
Agency Tracking Number:
GM067276
Solicitation Year:
N/A
Solicitation Topic Code:
N/A
Solicitation Number:
N/A
Small Business Information
ARIADNE GENOMICS, INC.
9430 Key West Avenue, ROCKVILLE, MD 20850-3308
Woman-Owned: No
Minority-Owned: No
HUBZone-Owned: No
 
Phase 1
Fiscal Year: 2003
Title: Flexible NLP system for MEDLINE information extraction
Agency: HHS
Contract: 1R43GM067276-01A1
Award Amount: $100,000.00
 

Abstract:

DESCRIPTION (provided by applicant): This Small Business Innovation Research (SBIR) Phase I project focuses on developing a fully automatic system that extracts protein function information from MEDLINE abstracts and converts it into a conceptual graph. All existing protein function databases depend on human experts, who cannot keep up with the exponential growth of the protein function information freely available in MEDLINE. There is an urgent need for an automatic system capable of extracting protein function information from the literature. The proposed system will be based on advanced natural language processing (NLP) technologies, used as a fast and reliable way to extract protein function information from human-readable sources. To this end, we have developed and tested MedScan, a prototype of such a system that parses scientific abstracts and converts protein function information into a conceptual graph. It consists of a preprocessor module that selects candidate sentences from MEDLINE, an NLP module that uses a proprietary linguistic model to parse the selected sentences, and an information extraction module that uses an ontology we developed to extract and validate protein function information. The results of the MedScan evaluation indicate that it is a feasible candidate for the proposed task. In Phase II, the software system will be developed to help researchers quickly access, search, and navigate MEDLINE content, and to visualize and analyze large volumes of protein function data. We will also extend our approach to other areas, including pharmacogenomics and the extraction of clinically relevant information.
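
The three-module architecture described in the abstract (sentence preprocessor, parser, ontology-backed extractor) can be illustrated with a toy sketch. This is not MedScan's implementation: the abstract says its linguistic model is proprietary, so the parser below is replaced by a naive nearest-neighbour pattern match. The vocabularies `PROTEIN_NAMES` and `RELATION_VERBS`, the `Edge` type, and all function names are hypothetical, chosen only to show how the stages might hand data to one another.

```python
import re
from dataclasses import dataclass

# Hypothetical toy vocabularies; a real system would rely on curated
# lexicons and a full ontology rather than these hard-coded sets.
PROTEIN_NAMES = {"p53", "MDM2", "AKT1", "BAD"}
RELATION_VERBS = {
    "binds": "binding",
    "phosphorylates": "phosphorylation",
    "inhibits": "inhibition",
    "activates": "activation",
}


@dataclass(frozen=True)
class Edge:
    """One edge of the output graph: source --relation--> target."""
    source: str
    relation: str
    target: str


def tokenize(sentence):
    """Split a sentence into alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9]+", sentence)


def preprocess(abstract):
    """Stage 1: keep only sentences naming at least two distinct known proteins."""
    sentences = re.split(r"(?<=[.!?])\s+", abstract.strip())
    return [s for s in sentences if len(set(tokenize(s)) & PROTEIN_NAMES) >= 2]


def parse(sentence):
    """Stage 2 (stand-in for a real parser): for each relation verb, pair the
    nearest protein before it with the nearest protein after it."""
    tokens = tokenize(sentence)
    triples = []
    for i, tok in enumerate(tokens):
        verb = tok.lower()
        if verb not in RELATION_VERBS:
            continue
        left = next((t for t in reversed(tokens[:i]) if t in PROTEIN_NAMES), None)
        right = next((t for t in tokens[i + 1:] if t in PROTEIN_NAMES), None)
        if left and right:
            triples.append((left, verb, right))
    return triples


def extract(triples):
    """Stage 3: map verbs to relation types and drop self-relations."""
    return [Edge(s, RELATION_VERBS[v], o) for s, v, o in triples if s != o]


def build_graph(abstract):
    """Run the three stages in order and return the graph as a list of edges."""
    edges = []
    for sentence in preprocess(abstract):
        edges.extend(extract(parse(sentence)))
    return edges


text = "MDM2 binds p53 in the nucleus. The weather was cold. AKT1 phosphorylates BAD."
for edge in build_graph(text):
    print(edge)
```

Keeping the stages as separate functions mirrors the modular layout the abstract describes, so any one stage (for example, the naive parser) could be swapped out without touching the others.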

Principal Investigator:

Nikolai D. Daraselia
2404536296
NIKOLAI@ARIADNEGENOMICS.COM

Business Contact:

Ilya Mazo
2404536296
MAZOILYA@ARIADNEGENOMICS.COM
Small Business Information at Submission:

ARIADNE GENOMICS, INC.
9700 GREAT SENECA HWY, ROCKVILLE, MD 20850

EIN/Tax ID: 753033369
DUNS: N/A
Number of Employees: N/A
Woman-Owned: No
Minority-Owned: No
HUBZone-Owned: No