Intelligent Record Linkage Techniques Based on Information Retrieval, Natural Language Processing, and Machine Learning

Award Information
Agency: Department of Defense
Branch: Air Force
Contract: F49620-01-C-0055
Agency Tracking Number: F013-0049
Amount: $100,000.00
Phase: Phase I
Program: STTR
Awards Year: 2001
Solicitation Year: N/A
Solicitation Topic Code: N/A
Solicitation Number: N/A
Small Business Information
500 West Cummings Park, Suite 3000, Woburn, MA, 01801
DUNS: 859244204
HUBZone Owned: N
Woman Owned: N
Socially and Economically Disadvantaged: N
Principal Investigator
 Sai-Ming Li
 Research Engineer
 (781) 933-5355
Business Contact
 Raman Mehra
Title: President
Phone: (781) 933-5355
Research Institution
 Elaine Young
 Office of Sponsored Programs, UMBC, 1000 Hilltop
Baltimore, MD, 21250
 (410) 455-1336
 Nonprofit college or university
Abstract
The sheer magnitude of information available online via the Internet has overwhelmed the ability of existing search tools to produce useful query responses. Current web-search techniques typically fail to correlate relevant documents that are identified in different ways, such as by synonyms and acronyms. The challenge is to find an approach that can obtain highly accurate matches even when those documents do not share any obvious attributes with the query, and with minimal information required from the user. The objective of this STTR project is to develop an information management system that rapidly and accurately links records of related information from web-based information sources. In Phase I we plan to identify, implement, and evaluate hybrid approaches for cross-record linkage, using a combination of machine learning, information retrieval, and natural language processing methodologies. This will involve integrating the pre-processed outputs of multiple record linkage approaches into a significantly higher-quality result. In particular, we will investigate the use of selected statistical, artificial intelligence, and neural network techniques for improving the record linkage performance of information management systems. The University of Maryland, Baltimore County (UMBC) will be the research institution partner for this effort, under the direction of Professor Charles Nicholas, an internationally recognized expert in information retrieval and knowledge management. Commercial applications of the proposed technology include all private sector companies and federal and state agencies that need to acquire and manage large amounts of information in the form of text documents in order to stay competitive or efficient. It will appeal to knowledge-intensive businesses, small and medium companies, individual consultants, universities, and federal research institutes as a cost-effective alternative to traditional database or web search and match engines.
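
The idea of combining the outputs of several record linkage approaches into one result could be sketched roughly as below. This is only an illustration, not the project's method: the two similarity measures (token-set Jaccard and character-trigram cosine), the equal weights, the 0.5 threshold, and all function names are assumptions chosen for the example. It also does not attempt the synonym or acronym handling the abstract describes.

```python
import math
from collections import Counter


def tokens(text):
    """Split a record into lowercase word tokens."""
    return text.lower().split()


def jaccard(a, b):
    """Token-set overlap between two records (0.0 to 1.0)."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def char_ngrams(text, n=3):
    """Count character n-grams of the normalized record text."""
    s = " ".join(tokens(text))
    return Counter(s[i:i + n] for i in range(max(len(s) - n + 1, 0)))


def cosine_ngrams(a, b, n=3):
    """Cosine similarity between character n-gram count vectors."""
    ca, cb = char_ngrams(a, n), char_ngrams(b, n)
    dot = sum(ca[g] * cb[g] for g in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def combined_score(a, b, weights=(0.5, 0.5)):
    """Weighted average of the individual matcher scores (weights are assumed)."""
    scores = (jaccard(a, b), cosine_ngrams(a, b))
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def link(query, records, threshold=0.5):
    """Return (score, record) pairs at or above the threshold, best first."""
    scored = [(combined_score(query, r), r) for r in records]
    return sorted([(s, r) for s, r in scored if s >= threshold], reverse=True)


if __name__ == "__main__":
    candidates = ["Air Force Research Lab", "National Science Foundation"]
    for score, record in link("air force research laboratory", candidates):
        print(f"{score:.2f}  {record}")
```

A learned combiner (for example, a classifier trained on labeled matching and non-matching pairs) could replace the fixed weights; the weighted average is used here only because it is the simplest way to show several matcher outputs being merged.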

* Information listed above is as of the time of submission.
