Intelligent Record Linkage Techniques Based on Information Retrieval, Natural Language Processing, and Machine Learning

Award Information
Agency:
Department of Defense
Branch:
Air Force
Amount:
$100,000.00
Award Year:
2001
Program:
STTR
Phase:
Phase I
Contract:
F49620-01-C-0055
Award Id:
52605
Agency Tracking Number:
F013-0049
Solicitation Year:
n/a
Solicitation Topic Code:
n/a
Solicitation Number:
n/a
Small Business Information
500 West Cummings Park, Suite 3000, Woburn, MA, 01801
Hubzone Owned:
N
Minority Owned:
N
Woman Owned:
N
Duns:
859244204
Principal Investigator:
Sai-Ming Li
Research Engineer
(781) 933-5355
eliot@ssci.com
Business Contact:
Raman Mehra
President
(781) 933-5355
rkm@ssci.com
Research Institute:
UNIV. OF MARYLAND, BC
Elaine Young
Office of Sponsored Programs, UMBC, 1000 Hilltop
Baltimore, MD, 21250
(410) 455-1336
Nonprofit college or university
Abstract
The sheer magnitude of information available online via the Internet has overwhelmed the ability of existing search tools to produce useful query responses. Current web-search techniques typically fail to correlate relevant documents that are identified in different ways, such as by synonyms and acronyms. The challenge is to find an approach that can obtain highly accurate matches even when those documents do not share any obvious attributes with the query, while requiring minimal information from the user. The objective of this STTR project is to develop an information management system that rapidly and accurately links records of related information from web-based information sources. In Phase I we plan to identify, implement, and evaluate hybrid approaches for cross-record linkage, using a combination of machine learning, information retrieval, and natural language processing methodologies. This will involve integrating the pre-processed outputs of multiple record linkage approaches into a significantly higher-quality result. In particular, we will investigate the use of selected statistical, artificial intelligence, and neural network techniques for improving the record linkage performance of information management systems. The University of Maryland, Baltimore County will be the research institute partner for this effort, under the direction of Professor Charles Nicholas, an internationally recognized expert in information retrieval and knowledge management. Commercial applications of the proposed technology extend to all private sector companies and federal and state agencies that need to acquire and manage large amounts of information in the form of text documents in order to stay competitive or efficient. It will appeal to knowledge-intensive businesses, small and medium companies, individual consultants, universities, and federal research institutes as a cost-effective alternative to traditional database or web search and match engines.

* information listed above is at the time of submission.
