Consolidating Entity Information from Heterogeneous Text Sources for Multi-INT Fusion
Title: Sr. Research Scientist
Phone: (716) 565-0401
Email: cornell@janyainc.com
Title: CEO
Phone: (716) 565-0401
Email: rohini@janyainc.com
In this project we propose to develop an end-to-end, high-performance system for cross-document entity consolidation. The problems tackled are person name disambiguation, personal alias detection, and location name disambiguation. We propose to improve alias detection performance by using the results of entity consolidation in a second pass. The system will also offer the flexibility to configure and use additional information from an external database, which can further improve performance. Other key objectives are updating a world model database with consolidated entity information and supporting stand-alone use of the technology so that it works seamlessly with other compatible information extraction (IE) engines.

BENEFIT: The technology developed in this effort will result in a state-of-the-art end-to-end system for cross-document entity consolidation that includes name disambiguation as well as alias detection. It also includes support for mapping new information to the correct entity in a world model database and for using the technology with other compatible IE engines. Business intelligence systems frequently make use of large knowledge bases containing information on companies, products, people, and projects. The ability to automatically correlate these knowledge bases with entity profiles (EPs) dynamically extracted from unstructured text, in order to perform change detection and automatic updates, would be of tremendous value. This effort can potentially combine in one place (i) the features of the consolidated entity, (ii) its attributes, (iii) its relations to or from other entities, and (iv) the events in which the entity is involved. Consolidation in particular yields an abundance of information about an entity, because information is accumulated from several documents.
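The two-pass idea in the abstract (consolidate entity profiles across documents, then use the consolidated entities to find aliases) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the proposed system: the `EntityProfile` and `ConsolidatedEntity` classes, the "same name plus at least one shared attribute" merge rule, the `min_shared` threshold, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EntityProfile:
    """Hypothetical per-document entity profile (EP)."""
    name: str
    doc_id: str
    attributes: set = field(default_factory=set)
    relations: set = field(default_factory=set)
    events: set = field(default_factory=set)


@dataclass
class ConsolidatedEntity:
    """Hypothetical cross-document entity built from one or more EPs."""
    names: set
    doc_ids: set
    attributes: set
    relations: set
    events: set


def consolidate(profiles):
    """First pass (assumed rule): merge an EP into an existing entity when
    the name matches case-insensitively and at least one attribute is shared."""
    entities = []
    for p in profiles:
        target = None
        for e in entities:
            same_name = p.name.lower() in {n.lower() for n in e.names}
            if same_name and (p.attributes & e.attributes):
                target = e
                break
        if target is None:
            entities.append(ConsolidatedEntity(
                {p.name}, {p.doc_id},
                set(p.attributes), set(p.relations), set(p.events)))
        else:
            target.names.add(p.name)
            target.doc_ids.add(p.doc_id)
            target.attributes |= p.attributes
            target.relations |= p.relations
            target.events |= p.events
    return entities


def alias_candidates(entities, min_shared=2):
    """Second pass (assumed rule): flag index pairs of consolidated entities
    that share at least `min_shared` attributes, relations, or events."""
    pairs = []
    for i, a in enumerate(entities):
        for j in range(i + 1, len(entities)):
            b = entities[j]
            shared = ((a.attributes & b.attributes)
                      | (a.relations & b.relations)
                      | (a.events & b.events))
            if len(shared) >= min_shared:
                pairs.append((i, j))
    return pairs


# Made-up sample data: three "John Smith" mentions and one possible alias.
profiles = [
    EntityProfile("John Smith", "doc1", {"occupation:engineer", "city:Buffalo"}),
    EntityProfile("John Smith", "doc2", {"city:Buffalo", "employer:Acme"}),
    EntityProfile("John Smith", "doc3", {"occupation:painter", "city:Paris"}),
    EntityProfile("Johnny S.", "doc4", {"employer:Acme", "occupation:engineer"}),
]

entities = consolidate(profiles)
aliases = alias_candidates(entities)
```

In this sample, neither doc1 nor doc2 alone shares two attributes with the "Johnny S." profile, but the entity consolidated from both does, which is the sense in which a second pass over consolidated entities can help alias detection. A real system would use far richer evidence and scoring than these set intersections.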
* Information listed above is at the time of submission. *