High Precision Event Extraction Using Predicate Arguments (HIPEPA)

Award Information
Agency: Department of Defense
Branch: Air Force
Contract: FA8750-05-C-0139
Agency Tracking Number: F051-090-2036
Amount: $99,968.00
Phase: Phase I
Program: SBIR
Awards Year: 2005
Solicitation Year: 2005
Solicitation Topic Code: AF05-090
Solicitation Number: 2005.1
Small Business Information
Language Computer Corporation
1701 North Collins Blvd., Suite 2000, Richardson, TX, 75080
DUNS: 098242246
HUBZone Owned: N
Woman Owned: N
Socially and Economically Disadvantaged: N
Principal Investigator
Name: Sanda Harabagiu
Title: Chief Technical Officer
Phone: (972) 231-0052
Business Contact
Name: Yolanda Guzman
Title: Senior In-House Counsel / Financial
Phone: (972) 231-0052
Email: yolanda@languagecomputer.com
Research Institution
Our goal is to facilitate the visualization of event information using a novel event extraction paradigm, realized as an event extraction framework, that achieves accuracy close to that of human analysts, simplifies customization to new domains, and supports the extraction of complex events. The event extraction framework replaces the pattern-based paradigm with predicate-argument structures that allow events to be extracted in any domain. Mappings to a new domain of interest can be learned in this paradigm using maximum entropy models. The proposed paradigm also takes advantage of several open-domain features, including (1) an open-domain semantic parser used to extract syntactic and semantic information (e.g., predicate-argument relations) from source documents and (2) discourse processing techniques such as coreference resolution of events produced by event normalization and event fusion.
This paradigm allows temporal and spatial normalization of events, so that time and space expressions are recognized and normalized even when they are not explicit references, e.g., "last summer", "four years" (duration), "every month" (set), and "a year after the earthquake" (event-anchored expression). Similarly, in the case of spatial expressions, we consider implicit references, e.g., "second house"; areas of the country, such as "The South"; sets, e.g., "every river"; and event-anchored expressions, e.g., "twenty miles north of Baghdad".
As technical leaders in the field of Natural Language Processing and its application to unstructured text understanding for the military and intelligence communities, Language Computer Corporation (LCC) is well suited to provide this capability. Our proposed work on this SBIR, called High Precision Event Extraction Using Predicate Arguments (HIPEPA), will provide a framework for event extraction that is accurate, domain relevant, and easily customized to the dynamic information needs of the intelligence analyst.
LCC will build on our experience and existing capability to provide a prototype pattern-free event extraction framework that normalizes spatial and temporal information for visualization and supports the detection and fusion of event data.
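To make the temporal categories named in the abstract concrete, the following is a minimal illustrative sketch, not LCC's implementation. It stores an event as a predicate with labeled arguments and assigns a category to an attached time expression. Everything in it is an assumption made for illustration: the `Event` class, the `classify_time_expression` function, the `ARG0`/`ARG1` role labels, the small word lists, and the "implicit reference" label for "last summer" (the abstract leaves that example unlabeled). The sketch also uses a few hand-written regular expressions for brevity, whereas the abstract describes a framework that avoids patterns and learns domain mappings with maximum entropy models.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical, deliberately tiny vocabularies; a real system would need far more.
_NUMBER = r"(?:\d+|a|an|one|two|three|four|five|six|seven|eight|nine|ten|twenty)"
_UNIT = r"(?:day|week|month|year|summer|winter|spring|fall)s?"

# Ordered rules: the first matching rule wins, so the most specific comes first.
# Category names follow the examples in the abstract; "implicit reference" is assumed.
_RULES = [
    ("event-anchored", re.compile(rf"^{_NUMBER} {_UNIT} (?:after|before) .+$")),
    ("set", re.compile(rf"^every {_UNIT}$")),
    ("duration", re.compile(rf"^{_NUMBER} {_UNIT}$")),
    ("implicit reference", re.compile(rf"^(?:last|next|this) {_UNIT}$")),
]


def classify_time_expression(text: str) -> Optional[str]:
    """Return a category label for a time expression, or None if no rule matches."""
    normalized = " ".join(text.lower().split())  # lowercase and collapse whitespace
    for label, pattern in _RULES:
        if pattern.match(normalized):
            return label
    return None


@dataclass
class Event:
    """A hypothetical event record built around a predicate and its arguments."""
    predicate: str
    arguments: Dict[str, str] = field(default_factory=dict)
    time_text: Optional[str] = None

    @property
    def time_type(self) -> Optional[str]:
        """Category of the attached time expression, if there is one."""
        if self.time_text is None:
            return None
        return classify_time_expression(self.time_text)


if __name__ == "__main__":
    event = Event(
        predicate="rebuild",
        arguments={"ARG0": "the city", "ARG1": "the bridge"},
        time_text="a year after the earthquake",
    )
    print(event.predicate, event.arguments, event.time_type)
```

Categorizing an expression is only the first step. Normalizing it to a concrete date or interval would also require a reference time and, for event-anchored expressions, the resolved time of the anchoring event.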

* Information listed above is at the time of submission. *
