
STTR Phase I: Real-Time Smart Data Lake

Award Information
Agency: National Science Foundation
Branch: N/A
Contract: 2135007
Agency Tracking Number: 2135007
Amount: $256,000.00
Phase: Phase I
Program: STTR
Solicitation Topic Code: CH
Solicitation Number: NSF 21-563
Solicitation Year: 2021
Award Year: 2022
Award Start Date (Proposal Award Date): 2022-02-15
Award End Date (Contract End Date): 2022-08-31
Small Business Information
2700 POST OAK BLVD FL 21 STE 152
United States
DUNS: 117939243
HUBZone Owned: No
Woman Owned: No
Socially and Economically Disadvantaged: No
Principal Investigator
 Donpaul Stephens
 (646) 872-2124
Business Contact
 Donpaul Stephens
Phone: (646) 872-2124
Research Institution
 University of Wisconsin-Madison
Madison, WI 53715
United States

 Nonprofit College or University

The broader impact of this Small Business Technology Transfer (STTR) Phase I project will be to fundamentally accelerate the pace of data science. New solutions that alleviate or bypass the bottlenecks inherent in existing data analytics technologies are required to unlock the value contained in “big data” and bring greater benefits to research, business, and society at large. The potential benefits include improved quality control for manufactured goods, reduced fraud in financial transactions, and enhanced customization of consumer services. Many common software applications, such as search engines, online shopping, e-commerce, medical applications, and social networks, are backed by data analytical processing services. This project will enable massive data sets to be pre-processed directly by a shared storage service, and then allow this capability to be used efficiently by client analytic applications. This project will accelerate the transformation of analytical data into useful insights while reducing network congestion, simplifying complex analytics systems, and lowering information technology costs.

This Small Business Technology Transfer (STTR) Phase I project examines the challenge of how to dramatically accelerate data-intensive computing problems by enabling large data objects to be processed directly in the storage layer and then used efficiently by client applications. The technology will be built on a software-defined data lake used for big data applications. Key technology will be added to enable JSON, one of the most widely used data interchange formats, to be processed in a distributed manner, in place, within this storage solution. The objective is to enable client analytic applications to retrieve only the subset of content they require rather than the complete object. The storage solution will be augmented to transform the data into a format that can be directly consumed by the clients. This will substantially increase efficiency within the storage itself and between analytics clients and the storage solution, while accelerating data processing by bypassing bottlenecks. The result will be a smart data lake that can reduce network traffic, improve data freshness, and enable real-time operation, while accelerating big data analytics by an order of magnitude or more.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
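
As a rough illustration of the idea in the abstract, where JSON is filtered inside the storage layer so that clients receive only the fields they ask for, the following minimal Python sketch may help. It is not the awardee's implementation: the function names `select_fields` and `storage_side_query`, the dotted-path syntax, and the sample sensor records are all hypothetical, and the "storage layer" here is simply an in-memory byte string holding a JSON array.

```python
import json


def select_fields(record, paths):
    """Return a new dict holding only the requested dotted paths of record.

    Hypothetical helper: paths such as "sensor.temp" are an assumed syntax.
    Paths that do not exist in the record are silently skipped.
    """
    out = {}
    for path in paths:
        keys = path.split(".")
        value = record
        for key in keys:
            if not isinstance(value, dict) or key not in value:
                break  # path is absent in this record
            value = value[key]
        else:
            # Rebuild the nesting in the output, then store the leaf value.
            target = out
            for key in keys[:-1]:
                target = target.setdefault(key, {})
            target[keys[-1]] = value
    return out


def storage_side_query(stored_object: bytes, paths) -> bytes:
    """Simulate the storage service: parse the stored JSON array in place
    and return only the projected subset, serialized for the client."""
    records = json.loads(stored_object)
    subset = [select_fields(r, paths) for r in records]
    return json.dumps(subset).encode("utf-8")


if __name__ == "__main__":
    # Made-up object with a large field the client does not need.
    stored = json.dumps([
        {"id": 1, "sensor": {"temp": 21.5, "humidity": 40}, "notes": "x" * 200},
        {"id": 2, "sensor": {"temp": 22.1, "humidity": 38}, "notes": "y" * 200},
    ]).encode("utf-8")

    reply = storage_side_query(stored, ["id", "sensor.temp"])
    print("bytes stored:", len(stored))
    print("bytes sent to client:", len(reply))
    print(json.loads(reply))
```

In this toy setup the reply is smaller than the stored object because the unused `notes` and `humidity` fields never cross the network, which is the kind of traffic reduction the abstract describes. A real system would presumably distribute this work across storage nodes and stream results instead of parsing whole objects in memory.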

* Information listed above is at the time of submission. *
