Discovering Valued Information in a Cloud Environment


TECHNOLOGY AREAS: Information Systems

OBJECTIVE: Design and implement a populated architecture that can quickly and efficiently discover actionable tactical information contained in or derived from large parallel data stores. A user-friendly interface should be able to translate a condition of interest into a parallelized map reduce job. The system should also be able to recognize redundancies in data held within large parallel data stores to prevent sparse evidence from being viewed as a well-supported inference. The mature system should provide a common data currency for exchange across a large parallel enterprise through use of data dictionaries, shared ontologies and maps to data sources. A request for unusual events must be understood in the context of many types of data and many different types of mission information requirements. All data and services should be discoverable by a tactical user equipped with only a PDA.

DESCRIPTION: In combating terrorism, the need exists to monitor at-risk individuals and groups. The data sources to achieve this goal can consist of military sensors and sources as well as open source literature. Key data types that may contain valued information include unstructured text, audio files, images, high resolution imagery, wide area airborne imagery and biometric data. Currently, there is no way to run specific searches against large distributed data stores in response to a tactical information need. Sensor data can be 1) stored in a large data archive for retrieval and extraction, 2) kept at an aggregation node (gateway), or 3) kept close to a smart sensor, with triggers provided for data distribution. The goal of the topic is to mature a set of map reduce tools that can address key warfighter questions quickly by fusing interim results obtained by running code in parallel across numerous multi-INT data stores. Map reduce jobs must be configured and activated through an easy-to-use, PDA-based tactical user interface that understands the language commonly used to express priority and specific information requirements. After a single knowledge product is produced from the combination of many work tasks applied across many data stores, that product must be delivered to the warfighter. The matured system must be able to produce answers in real time. For this topic, offerors may work with the following distributed data sources: warfighter observations containing time/location stamps and unstructured text; images with time/location stamps and unstructured text comments; and biometric reports containing time/location stamps. Phase 1 performers may work with synthetic data. The Hadoop framework should be utilized. The offeror should work towards a capability that allows a warfighter to ask for activity reports relevant to a location over a specified time and receive a summary derived from the content of distributed data stores.
The specific challenges of this topic include:
1) Maturing a set of related map reduce jobs that can act on distributed stored images/imagery, unstructured text and biometrics data to find data that relates to a priority or specific information requirement.
2) Development of a level one fusion engine based on FrameNet that can combine the output of a number of map reduce jobs run against a distributed data store to produce a single knowledge product.
3) Development of a user GUI that allows the warfighter to input a time/space bounded information requirement and be returned successfully mined information.
4) Development of semantic search map reduce jobs and a PDA triggered workflow manager.
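As an illustration of how a time/space bounded information requirement might be expressed as a map reduce job, the following is a minimal sketch in plain Python that simulates the map, shuffle and reduce steps in memory. It does not use Hadoop itself, and the `Requirement` class, the record fields (`type`, `lat`, `lon`, `time`, `text`) and the synthetic data are all assumptions made for illustration rather than anything specified by the topic.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Requirement:
    """Hypothetical time/space bounded information requirement."""
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float
    start: datetime
    end: datetime


def map_record(record, req):
    """Map step: emit (source type, text) for records inside the bounds."""
    inside_area = (req.lat_min <= record["lat"] <= req.lat_max
                   and req.lon_min <= record["lon"] <= req.lon_max)
    inside_time = req.start <= record["time"] <= req.end
    if inside_area and inside_time:
        yield record["type"], record["text"]


def reduce_group(key, values):
    """Reduce step: summarize all matches for one source type."""
    values = list(values)
    return {"type": key, "count": len(values), "samples": values[:3]}


def run_job(stores, req):
    """Map over every store, group by key, then reduce each group."""
    groups = defaultdict(list)
    for store in stores:  # in a real cluster each store would be a parallel task
        for record in store:
            for key, value in map_record(record, req):
                groups[key].append(value)
    return [reduce_group(k, v) for k, v in sorted(groups.items())]


# Synthetic data only: two small "stores" holding mixed report types.
REQ = Requirement(33.0, 34.0, 44.0, 45.0,
                  datetime(2010, 1, 1), datetime(2010, 1, 31))
STORES = [
    [{"type": "observation", "lat": 33.3, "lon": 44.4,
      "time": datetime(2010, 1, 5), "text": "vehicle seen near checkpoint"},
     {"type": "observation", "lat": 36.2, "lon": 43.1,
      "time": datetime(2010, 1, 6), "text": "outside the area"}],
    [{"type": "biometric", "lat": 33.5, "lon": 44.2,
      "time": datetime(2010, 1, 10), "text": "enrollment report"},
     {"type": "image", "lat": 33.4, "lon": 44.6,
      "time": datetime(2010, 2, 10), "text": "outside the time window"}],
]
SUMMARY = run_job(STORES, REQ)
```

Here `SUMMARY` holds one entry per source type that had at least one record inside the requested area and time window. A fusion step of the kind described in challenge 2 would take such per-type outputs as its input.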

Research in the areas of mathematics, statistics, computational data analysis and visualization, computational sciences and computer science is of interest. In addition to the application of research methods and approaches, it is important to evaluate the impact of these efforts on how data is collected, analyzed and assessed to meet a prescribed time for operational necessity and efficiency. It is of value to use open standards to reduce costs [4].

The OSD is interested in innovative R&D that involves technical risk.  Proposed work should have technical and scientific merit.  Creative solutions are encouraged. 

PHASE I: Complete a plan and detailed approach for populating an architecture that can address warfighter questions by simultaneously processing multi-INT distributed data stores. Identify the critical technology issues that must be overcome to achieve success. Technical work should focus on the reduction of key risk areas. For a constrained set of warfighter questions and distributed data stores, demonstrate through Phase 1 risk reduction work that a full implementation of the approach is technically tractable. Prepare a revised research plan for Phase 2 that addresses critical issues.

PHASE II: Produce a prototype system that is capable of distributed processing in a cloud environment. The prototype system should assemble information by automated means, provide performance metrics and offer visualization appropriate to the user’s device. Produce a prototype distributed processing service that can produce accurate answers to warfighter questions by simultaneously processing large, varied, distributed data sources. The prototype should enable a demonstration of the capability to be conducted using relevant data sources, some of which may be classified. The prototype should be capable of operating in a real time mode. Identify appropriate dependent variables for test performance and conduct trade-off studies. Address bias in data processing due to redundant sampling. The prototype should be relevant to both DoD and commercial use cases.
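One possible way to approach the redundant sampling concern is to collapse copies of the same report before counting how much evidence supports an inference. The sketch below is a simplified illustration in plain Python. The `fingerprint` rule (normalized text plus time and rounded location) is an assumed definition of "the same evidence", and the report fields and sample data are hypothetical and synthetic.

```python
import hashlib
import re


def fingerprint(report):
    """Hash normalized text, time and rounded location (an assumed notion of 'same evidence')."""
    text = re.sub(r"\W+", " ", report["text"].lower()).strip()
    key = f"{text}|{report['time']}|{report['lat']:.3f}|{report['lon']:.3f}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def support_counts(reports):
    """Return (raw count, independent count) after collapsing duplicate reports."""
    unique = {fingerprint(r) for r in reports}
    return len(reports), len(unique)


# Synthetic data: the second report is a reformatted copy of the first.
REPORTS = [
    {"text": "Vehicle seen, near checkpoint.", "time": "2010-01-05T10:00",
     "lat": 33.3001, "lon": 44.4002},
    {"text": "vehicle seen near checkpoint", "time": "2010-01-05T10:00",
     "lat": 33.3001, "lon": 44.4002},
    {"text": "Crowd gathering at market", "time": "2010-01-05T11:30",
     "lat": 33.3100, "lon": 44.4100},
]
RAW, INDEPENDENT = support_counts(REPORTS)
```

With this data the raw count is 3 but only 2 reports are independent, so an inference resting on the first two reports has less support than the raw count suggests. Exact-match hashing of this kind misses paraphrased or partially overlapping reports, which would need a fuzzier similarity measure.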

PHASE III: Produce a system capable of deployment in an operational setting. The work should focus on a specific user environment intended for product transition. Test the system in an operational setting in a stand-alone mode or as a component of a shared processing environment. The effort should work towards a transition to a program of record, military organization or commercial product. The system should adhere to open standards and use registered COI vocabulary and ontologies where feasible.
