High-Precision Agile Active Learning for Domain-Customizable Information Extraction (HALCYON)

Award Information
Agency: Department of Defense
Branch: Air Force
Contract: FA8750-07-C-0148
Agency Tracking Number: F071-091-2540
Amount: $99,639.00
Phase: Phase I
Program: SBIR
Award Year: 2007
Solicitation Year: 2007
Solicitation Topic Code: AF071-091
Solicitation Number: 2007.1
Small Business Information
LANGUAGE COMPUTER CORP.
1701 North Collins Blvd., Suite 2000, Richardson, TX, 75080
DUNS: 127802234
HUBZone Owned: N
Woman Owned: N
Socially and Economically Disadvantaged: N
Principal Investigator
 Paul Aarseth
Title: Principal Investigator
Phone: (972) 231-0052
Email: paul.aarseth@languagecomputer.com
Business Contact
 Yolanda Guzman
Title: VP-Financial & Legal
Phone: (972) 231-0052
Email: yolanda@languagecomputer.com
Research Institution
N/A
Abstract
Air Force users today work in dynamic operational environments that require information extraction systems that can be rapidly and easily customized to new and challenging domains. To address operational demands for textual information, Language Computer Corporation (LCC) has developed a customizable information extraction system, CiceroCustom, which enables military and intelligence personnel to extract information quickly and efficiently from unstructured text sources, including OSINT and HUMINT. In this Phase I SBIR effort, called High-Precision Agile Active Learning for Domain-Customizable Information Extraction (HALCYON), LCC will extend CiceroCustom's customizable information extraction capability with a new framework for improving the quality and accuracy of the domain customizations that users perform. We plan to build an enhanced prototype that incorporates (1) an agile customization framework that leverages a novel paradigm for active learning, (2) a context-driven mechanism for customizing extractors to specific domains that accommodates diverse forms of user input, (3) a novel method for integrating domain-specific knowledge into an information extraction system, and (4) a robust textual reasoning capability that uses a state-of-the-art textual entailment system to reason about domain knowledge for extraction.

* Information listed above is current as of the time of submission.
