
Building Semantic Knowledge of Large Data Sets through Collaborative Visual Approaches


TECHNOLOGY AREAS: Information Systems

OBJECTIVE: Provide a technology application that builds semantic knowledge through metadata tagging and collaborative visual approaches, and that handles multi-modal data feeds including text and visual data.

DESCRIPTION:  Today’s information systems generate massive amounts of data, and the Defense and Homeland Security communities have the difficult task of quickly identifying and understanding the crucial information in complex data systems [2,8].  Visual data mining and collaborative visual approaches can help handle the influx of information [6].  A corpus of data exists that, given sufficient data mining capabilities, could aid intelligence analysts in understanding information in near-real time [3].  Building a system that extracts information automatically from various multi-modal data sources, and efficiently customizing that system to a new domain by defining a set of features and extraction rules, are challenging tasks [1,7].  Many current folksonomies do not control for spelling variations or synonyms, and do not disambiguate homonyms.  Controlled vocabularies such as thesauri and classification schemes, together with tools such as spell check or “Text on 9 keys (T9)”, could vastly reduce errors in reported data, improve search quality, and enhance information discovery within a large database system [4].  Even if information can be mined from the tagged data, it is useless to users if it cannot be analyzed and employed in an operational setting [5].  Controlled tags should be easily accessible, and users should be able to select and change appropriate tags at any time.  An application with various user-defined displays and layouts would be most useful in building semantic knowledge [2].
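
As a rough illustration of how a controlled vocabulary might absorb spelling variations and synonyms in free-form tags, the following Python sketch maps raw user tags onto preferred terms.  The vocabulary contents, the `normalize_tag` function name, and the 0.8 similarity cutoff are all hypothetical choices made for illustration and are not part of this topic; homonym disambiguation is not addressed and would need additional context.

```python
import difflib

# Hypothetical controlled vocabulary: preferred term -> accepted synonyms.
CONTROLLED_VOCAB = {
    "vehicle": {"car", "automobile", "truck"},
    "aircraft": {"plane", "airplane", "helicopter"},
    "building": {"structure", "facility"},
}

# Reverse lookup from every known surface form to its preferred term.
TERM_LOOKUP = {
    synonym: preferred
    for preferred, synonyms in CONTROLLED_VOCAB.items()
    for synonym in synonyms
}
TERM_LOOKUP.update({preferred: preferred for preferred in CONTROLLED_VOCAB})


def normalize_tag(raw_tag, cutoff=0.8):
    """Map a free-form tag to a preferred vocabulary term, or None if unknown.

    Exact synonym matches are tried first; otherwise the closest known
    surface form (by difflib similarity ratio) is used when it meets
    the assumed cutoff.
    """
    tag = raw_tag.strip().lower()
    if tag in TERM_LOOKUP:
        return TERM_LOOKUP[tag]
    # Fall back to approximate matching to catch spelling variations.
    close = difflib.get_close_matches(tag, list(TERM_LOOKUP), n=1, cutoff=cutoff)
    if close:
        return TERM_LOOKUP[close[0]]
    return None
```

Under these assumptions, a tag such as "Automobile" resolves through the synonym table, a misspelling such as "airplan" resolves through approximate matching, and an unrecognized tag returns `None` so that a user can be prompted to pick or propose a term.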

Challenges for this topic include 1) identifying relevant multi-modal data sets that incorporate various forms of text and visual data, 2) determining an effective tagging technique that can be used to increase the accuracy of information systems, 3) developing a method to run the application with supporting tags in a pre-existing system, 4) demonstrating the ability to search and/or navigate the tagged database, 5) showing the application’s capability for visual analysis with various display options that can be utilized by distributed collaborative teams, and 6) demonstrating the usefulness of this application in the broad realms of military, government, and commercial settings.
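
To make the search and navigation challenge more concrete, the sketch below shows one simple way tags could be layered over items in an existing data set without modifying the items themselves.  It is a minimal in-memory illustration only; the `TagIndex` class, its method names, and the example item identifiers are hypothetical and do not describe any required design.

```python
from collections import defaultdict


class TagIndex:
    """In-memory inverted index from tags to item identifiers (illustrative)."""

    def __init__(self):
        self._items_by_tag = defaultdict(set)
        self._tags_by_item = defaultdict(set)

    @staticmethod
    def _key(tag):
        # Case- and whitespace-insensitive tag keys.
        return tag.strip().lower()

    def tag(self, item_id, *tags):
        """Attach one or more tags to an item (text document, image, etc.)."""
        for t in tags:
            key = self._key(t)
            self._items_by_tag[key].add(item_id)
            self._tags_by_item[item_id].add(key)

    def untag(self, item_id, tag):
        """Remove a tag from an item, so users can change tags at any time."""
        key = self._key(tag)
        self._items_by_tag.get(key, set()).discard(item_id)
        self._tags_by_item.get(item_id, set()).discard(key)

    def search(self, *tags):
        """Return the items that carry every one of the given tags."""
        if not tags:
            return set()
        sets = [self._items_by_tag.get(self._key(t), set()) for t in tags]
        return set.intersection(*sets)

    def related_tags(self, tag):
        """Return tags that co-occur with the given tag, for navigation."""
        key = self._key(tag)
        related = set()
        for item_id in self._items_by_tag.get(key, set()):
            related |= self._tags_by_item[item_id]
        related.discard(key)
        return related
```

For example, after tagging a hypothetical text report with "convoy" and "bridge" and a hypothetical image frame with "convoy" and "vehicle", `search("convoy")` would return both items regardless of modality, and `related_tags("convoy")` would suggest "bridge" and "vehicle" as navigation paths.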

PHASE I: Develop a research plan that establishes the proof of concept for an application that enables data tagging within a pre-existing large data set including text and visual data.  Describe how the increased tagging abilities will support advanced searching and integrated visual analysis across various display options.  Estimate the technical feasibility and value of the system and identify the essential technology issues that must be overcome to achieve success.  Prepare a comprehensive research and development proposal for Phase II that includes critical plans for testing and evaluation of the system and its components.

PHASE II: Based on the preliminary plans of Phase I, produce a prototype application that is capable of tagging various forms of data within an existing data set.  The prototype should lead to a demonstration of the capability.  Test the prototype in a large multi-sensor database including text and visual data to demonstrate the technical feasibility and merit of the product.  Demonstrate enhanced data search capabilities that return relevant and related information for analysis, with results that can be shown in a variety of display options.  Propose a verification and validation process.

PHASE III: Produce an application capable of deployment in an operational setting that can be utilized by distributed collaborative teams.  Test the system in an operational setting as a component of a larger pre-existing multi-sensor database.  The application should provide metrics for performance assessment.  The work should focus on the ability to transition the tagging and searching system into the realm of military applications, other Federal Agencies, and/or private sector markets.
