Bootstrapping Background Knowledge to Arbitrate Data Integrity Issues Within Large Volumes of Data

Award Information
Agency: Department of Defense
Branch: Army
Contract: W15P7T-11-C-H257
Agency Tracking Number: A111-026-0857
Amount: $99,999.00
Phase: Phase I
Program: SBIR
Solicitation Topic Code: A11-026
Solicitation Number: 2011.1
Timeline
Solicitation Year: 2011
Award Year: 2011
Award Start Date (Proposal Award Date): 2011-04-28
Award End Date (Contract End Date): N/A
Small Business Information
951 Mariner's Island Blvd., STE 360
San Mateo, CA -
United States
DUNS: 608176715
HUBZone Owned: No
Woman Owned: No
Socially and Economically Disadvantaged: No
Principal Investigator
 Terrance Goan
 Principal Investigator
 (206) 545-1478
 goan@stottlerhenke.com
Business Contact
 Carolyn Maxwell
Title: Contracts Manager
Phone: (650) 931-2700
Email: maxwell@stottlerhenke.com
Abstract

As intelligence and sensor data acquisition technologies improve and expand, the difficulty of maintaining data integrity across vast amounts of data continues to plague researchers. Although this is generally considered a problem of computational scalability, we recognize that a much greater challenge lies in developing and maintaining the background knowledge needed to move beyond traditional data integrity checks and to identify and resolve more complex inconsistencies. With our proposed system, called Arbiter, we seek to exploit the hidden opportunity posed by very large data sources in three ways: (1) constructing pseudo-genomes for each entity instance to rapidly identify likely matches, leveraging lightweight ontology alignment heuristics to efficiently identify high-confidence alignment opportunities; (2) leveraging data redundancy to autonomously learn the background knowledge necessary to facilitate the detection of complex relational inconsistencies; and (3) validating entity instance matches with a wide range of heuristics in combination with the acquired background knowledge to resolve higher levels of uncertainty. Phase I prototyping will draw on existing software components, allowing rapid progress.
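
As a rough illustration of idea (2) only, the sketch below shows one way redundancy in a data set could be used to learn simple background knowledge and then flag records that contradict it. It is not the Arbiter design, which the abstract does not describe in detail. The function names, the record layout (a list of dictionaries), and the `min_support` and `min_agreement` thresholds are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def learn_dependencies(records, min_support=3, min_agreement=0.8):
    """Learn value-level rules of the form (field, value, other_field) -> expected value.

    Hypothetical helper: a rule is kept only when the same pairing is seen
    at least `min_support` times and the dominant pairing accounts for at
    least `min_agreement` of those observations.
    """
    fields = sorted({f for r in records for f in r})
    rules = {}
    for lhs in fields:
        for rhs in fields:
            if lhs == rhs:
                continue
            # For each value of `lhs`, count the values of `rhs` seen with it.
            counts = defaultdict(Counter)
            for r in records:
                if lhs in r and rhs in r:
                    counts[r[lhs]][r[rhs]] += 1
            for lhs_val, rhs_counts in counts.items():
                total = sum(rhs_counts.values())
                top_val, top_n = rhs_counts.most_common(1)[0]
                if total >= min_support and top_n / total >= min_agreement:
                    rules[(lhs, lhs_val, rhs)] = top_val
    return rules


def find_inconsistencies(records, rules):
    """Return (record index, field, observed value, expected value) for each rule violation."""
    flagged = []
    for i, r in enumerate(records):
        for (lhs, lhs_val, rhs), expected in rules.items():
            if r.get(lhs) == lhs_val and rhs in r and r[rhs] != expected:
                flagged.append((i, rhs, r[rhs], expected))
    return flagged
```

Given many records that agree on a pairing and a few that do not, `learn_dependencies` treats the majority pairing as learned background knowledge, and `find_inconsistencies` reports the dissenting records as candidates for review. A real system would need to weigh source reliability and handle values that legitimately change over time.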

* Information listed above is at the time of submission. *