Disambiguation of Entity Association Statements

Award Information

Agency: Department of Defense
Branch: N/A
Award ID:
Program Year/Program: 2011 / SBIR
Agency Tracking Number: N102-176-1050
Solicitation Year: 2010
Solicitation Topic Code: N102-176
Solicitation Number: 2010.2

Small Business Information
SEMANDEX NETWORKS, Inc
5 Independence Way Suite 309 Princeton, NJ -
Woman-Owned: No
Minority-Owned: No
HUBZone-Owned: No
 
Phase 1
Fiscal Year: 2011
Title: Disambiguation of Entity Association Statements
Agency: DOD
Contract: N00014-10-M-0437
Award Amount: $69,997.00

Abstract:

Existing techniques for level 1 fusion of association statements in large RDF data stores have proven inadequate to the challenge of entity disambiguation across multiple intelligence sources. The resulting entity and association uncertainty leads to missed associations, redundant RDF statements that limit scalability, and significant limitations on higher-level reasoning algorithms. Numerous researchers have established the need for higher-level context in addressing this problem; while necessary, context alone is not sufficient. We propose to leverage an existing software-based characteristic matcher, operating in a context-aware framework, by using a simulated annealing algorithm to support level 1 fusion of a large RDF data store. We will develop a capability whose goal is to generate, from a large RDF data store, a single connected graph that contains no redundant entities and no missed connections. Specifically, the algorithm we propose will address (i) entity uncertainty, (ii) entity information from different knowledge bases that results in a contradiction, (iii) creation of statements regarding an entity in a knowledge base or common feature space that do not contradict existing statements on that entity, and (iv) deletion of an entity or entity statements without breaking other associations that may refer to that entity. The result will be an algorithm that integrates and connects new RDF-expressed sources while preserving the original data and semantics, allowing a large data corpus to be expressed as a single connected graph. The Feasibility Criterion for Phase I will be to measure and show clear progress in RDF statement disambiguation against a data store containing tens of thousands of statements.
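
For illustration only, the general idea of pairing a characteristic matcher with simulated annealing might be sketched as below. This is not the awardee's algorithm, which the abstract does not specify. The string-similarity matcher, the 0.7 threshold, the energy function, the cooling schedule, and all names (similarity, energy, anneal, fuse) are assumptions made for this sketch, and the sample triples are invented.

```python
import math
import random
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Hypothetical stand-in for a characteristic matcher: string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def energy(assign, entities, threshold=0.7):
    """Cost of a clustering: penalize merged pairs that look different
    and split pairs that look alike (threshold is an assumed parameter)."""
    cost = 0.0
    for a, b in combinations(entities, 2):
        s = similarity(a, b) - threshold
        same = assign[a] == assign[b]
        if same and s < 0:
            cost += -s          # merged, but the matcher says "different"
        elif not same and s > 0:
            cost += s           # split, but the matcher says "same"
    return cost


def anneal(entities, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Simulated annealing over cluster labels; returns the best assignment seen."""
    rng = random.Random(seed)
    assign = {e: i for i, e in enumerate(entities)}  # start fully split
    current = energy(assign, entities)
    best, best_cost = dict(assign), current
    t = t0
    for _ in range(steps):
        e = rng.choice(entities)
        old, new = assign[e], rng.randrange(len(entities))
        if new != old:
            assign[e] = new  # propose moving e to another (possibly empty) cluster
            delta = energy(assign, entities) - current
            # Always accept improvements; accept worse moves with prob. exp(-delta/t)
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current += delta
                if current < best_cost:
                    best, best_cost = dict(assign), current
            else:
                assign[e] = old  # reject: undo the move
        t *= cooling
    return best


def fuse(triples, assign):
    """Rewrite triples to one canonical name per cluster and drop duplicates."""
    canon = {}
    for e, c in assign.items():
        canon[c] = min(canon.get(c, e), e)  # canonical = smallest name in cluster

    def rename(x):
        return canon[assign[x]] if x in assign else x

    return sorted({(rename(s), p, rename(o)) for s, p, o in triples})


# Invented sample data: two spellings each of one person and one organization.
entities = ["John Smith", "Jon Smith", "Acme Corp", "ACME Corp."]
triples = [
    ("John Smith", "worksFor", "Acme Corp"),
    ("Jon Smith", "worksFor", "ACME Corp."),
    ("Jon Smith", "livesIn", "Princeton"),
]
assign = anneal(entities)
fused = fuse(triples, assign)
print(fused)
```

The sketch recomputes the full pairwise energy on every move, which is quadratic in the number of entities. A data store with tens of thousands of statements would need an incremental energy update and a matcher that uses context rather than names alone.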

Principal Investigator:

Dave Ihrie
Vice President, Governmen
(301) 233-4780
dihrie@semandex.net

Business Contact:

Adriana Reininger
Business&Contracts Mana
(609) 454-0657
asr@semandex.net

Small Business Information at Submission:

SEMANDEX NETWORKS, Inc
5 Independence Way Suite 309 Princeton, NJ -

EIN/Tax ID: 223732123
DUNS: N/A
Number of Employees:
Woman-Owned: No
Minority-Owned: No
HUBZone-Owned: No