Intelligent Record Linkage Techniques Based on Information Retrieval, Natural Language Processing, and Machine Learning

Award Information
Agency:
Department of Defense
Amount:
$100,000.00
Program:
STTR
Contract:
F49620-01-C-0055
Solicitation Year:
N/A
Solicitation Number:
N/A
Branch:
Air Force
Award Year:
2001
Phase:
Phase I
Agency Tracking Number:
F013-0049
Solicitation Topic Code:
N/A
Small Business Information
SCIENTIFIC SYSTEMS COMPANY, INC.
500 West Cummings Park, Suite 3000, Woburn, MA, 01801
Hubzone Owned:
N
Woman Owned:
N
Socially and Economically Disadvantaged:
N
Duns:
859244204
Principal Investigator
 Sai-Ming Li
 Research Engineer
 (781) 933-5355
 eliot@ssci.com
Business Contact
 Raman Mehra
Title: President
Phone: (781) 933-5355
Email: rkm@ssci.com
Research Institution
 UNIV. OF MARYLAND, BC
 Elaine Young
 Office of Sponsored Programs, UMBC, 1000 Hilltop
Baltimore, MD, 21250
 (410) 455-1336
 Nonprofit college or university
Abstract
The sheer magnitude of information available online via the Internet has overwhelmed the ability of existing search tools to produce useful query responses. Current web-search techniques typically fail to correlate relevant documents that are identified in different ways, such as by synonyms and acronyms. The challenge is to find an approach that can obtain highly accurate matches even when those documents do not share any obvious attributes with the query, and with minimal information required from the user. The objective of this STTR project is to develop an information management system that rapidly and accurately links records of related information from web-based information sources. In Phase I we plan to identify, implement, and evaluate hybrid approaches for cross-record linkage, using a combination of machine learning, information retrieval, and natural language processing methodologies. This will involve integrating the pre-processed outputs of multiple record linkage approaches into a significantly higher-quality result. In particular, we will investigate the use of selected statistical, artificial intelligence, and neural network techniques for improving the record linkage performance of information management systems. University of Maryland (Baltimore County) will be the research institution partner for this effort, under the direction of Professor Charles Nicholas, an internationally recognized expert in information retrieval and knowledge management. Commercial applications of the proposed technology include all private sector companies and federal and state agencies that need to acquire and manage large amounts of information in the form of text documents in order to stay competitive or efficient. It will appeal to knowledge-intensive businesses, small and medium companies, individual consultants, universities, and federal research institutes as a cost-effective alternative to traditional database or web search and match engines.

* Information listed above is at the time of submission.