High Precision Event Extraction Using Predicate Arguments (HIPEPA)

Award Information
Agency:
Department of Defense
Branch:
Air Force
Amount:
$99,968.00
Award Year:
2005
Program:
SBIR
Phase:
Phase I
Contract:
FA8750-05-C-0139
Award Id:
72903
Agency Tracking Number:
F051-090-2036
Solicitation Year:
n/a
Solicitation Topic Code:
n/a
Solicitation Number:
n/a
Small Business Information
1701 North Collins Blvd., Suite 2000, Richardson, TX, 75080
Hubzone Owned:
N
Minority Owned:
N
Woman Owned:
N
Duns:
098242246
Principal Investigator:
Sanda Harabagiu
Chief Technical Officer
(972) 231-0052
sanda@languagecomputer.com
Business Contact:
Yolanda Guzman
Senior In-House Counsel / Financial
(972) 231-0052
yolanda@languagecomputer.com
Research Institution:
n/a
Abstract
Our goal is to facilitate visualization of event information using a novel event extraction paradigm, expressed as an event extraction framework, that achieves accuracy close to that of human analysts and allows both simplified customization to new domains and extraction of complex events. The event extraction framework replaces the pattern-based paradigm with predicate-argument structures that allow extraction of events in any domain. Mappings to new domains of interest can be learned in this paradigm by making use of maximum entropy models. Furthermore, the paradigm takes advantage of several open-domain features, including (1) an open-domain semantic parser used to extract syntactic and semantic information (e.g., predicate-argument relations) from source documents and (2) discourse processing techniques such as coreference resolution of events produced by event normalization and event fusion. This paradigm allows temporal and spatial normalization of events, so that space and time expressions are recognized and normalized even when they are not explicit references, e.g., "last summer", "four years" (duration), "every month" (set), and "a year after the earthquake" (event-anchored expression). Similarly, in the case of spatial expressions, we consider implicit references, e.g., "second house", areas of the country such as "The South", sets such as "every river", and event-anchored expressions, e.g., "twenty miles north of Baghdad". As technical leaders in the field of Natural Language Processing and its application to unstructured text understanding for the military and intelligence communities, Language Computer Corporation (LCC) is well suited to provide this capability. Our proposed work on this SBIR, called High Precision Event Extraction Using Predicate Arguments (HIPEPA), will provide a framework for event extraction that is accurate, domain relevant, and easily customized to the dynamic information needs of the intelligence analyst.
LCC will build on our experience and existing capability to provide a prototype pattern-free event extraction framework that normalizes spatial and temporal information for visualization and supports the detection and fusion of event data.
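
The categories of non-explicit time expressions named in the abstract (duration, set, event-anchored) can be illustrated with a minimal sketch. This is not LCC's implementation: the abstract describes learned, pattern-free extraction, whereas the sketch below uses a few hand-written regular expressions only to show what the categories mean. The function name `classify_time_expression`, the rule table `RULES`, and the label "relative" (used here for expressions such as "last summer") are all hypothetical.

```python
import re

# Hypothetical rule table, checked in order: (label, pattern).
# Labels "duration", "set", and "event-anchored" come from the abstract;
# "relative" is an assumed name for expressions like "last summer".
NUMBER_WORDS = r"\d+|a|an|one|two|three|four|five|six|seven|eight|nine|ten"
UNITS = r"day|week|month|year"

RULES = [
    # Anchored to another event, e.g. "a year after the earthquake".
    ("event-anchored", re.compile(r"\b(after|before|since)\s+the\s+\w+", re.I)),
    # Recurring times, e.g. "every month".
    ("set", re.compile(r"^(every|each)\s+\w+", re.I)),
    # A bare quantity of time, e.g. "four years".
    ("duration", re.compile(rf"^({NUMBER_WORDS})\s+({UNITS})s?$", re.I)),
    # Interpreted relative to the document date, e.g. "last summer".
    ("relative", re.compile(r"^(last|next|this)\s+\w+", re.I)),
]


def classify_time_expression(text):
    """Return the first matching category label, or "unknown"."""
    cleaned = text.strip()
    for label, pattern in RULES:
        if pattern.search(cleaned):
            return label
    return "unknown"


if __name__ == "__main__":
    for phrase in ["last summer", "four years", "every month",
                   "a year after the earthquake"]:
        print(phrase, "->", classify_time_expression(phrase))
```

Rule order matters in this sketch: the event-anchored rule is checked first so that "a year after the earthquake" is not mistaken for a plain duration. Normalizing each expression to a concrete date or interval would be a further step that the sketch does not attempt.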

* information listed above is at the time of submission.
