
GoBig: A Unified Interface to Big Data Systems

Award Information
Agency: Department of Energy
Branch: N/A
Contract: DE-SC0013252
Agency Tracking Number: 215779
Amount: $149,999.00
Phase: Phase I
Program: SBIR
Solicitation Topic Code: 01c
Solicitation Number: DE-FOA-0001164
Timeline
Solicitation Year: 2015
Award Year: 2015
Award Start Date (Proposal Award Date): 2015-02-17
Award End Date (Contract End Date): 2015-11-16
Small Business Information
28 Corporate Drive
Clifton Park, NY 12065-8688
United States
DUNS: 010926207
HUBZone Owned: No
Woman Owned: No
Socially and Economically Disadvantaged: No
Principal Investigator
Name: Jeffrey Baumes
Title: Dr.
Phone: (518) 371-3971
Email: jeff.baumes@kitware.com
Business Contact
Name: Vicki Rafferty
Title: Dr.
Phone: (518) 371-3971
Email: contracts@kitware.com
Research Institution
N/A
Abstract

Problem statement
A researcher dealing with big data today is met with a maze of languages,
programming environments, data storage and query systems, and compute engines.
Pursuing a new path in this space may take years and millions of dollars of investment,
only to discover that a new and more applicable big data paradigm has emerged. Costs
include learning programming languages, storage systems, and computing paradigms,
as well as significant hardware and administrative costs of setting up and maintaining
the needed environments for data storage, transfer, and computation.
How this problem is being addressed
GoBig unifies and simplifies big data tools in two important areas: a unified user interface
to big data software and hardware stacks, and streamlined, modular deployment
to various types of cloud and HPC systems. Data is managed through the extensible
Girder data framework, an open-source project started at Kitware, which provides a
unified interface to many distributed storage systems along with access control and
extensible plugins. Romanesco manages analyses and workflows that span
programming language boundaries. The results are then persisted in Girder to be
made available for further analysis or visualization. Rather than requiring separate
user endpoints for each big data toolchain, GoBig lets user management and
authorization for multiple systems be handled through its account credentials.
What is to be done in Phase I
To demonstrate the feasibility of the GoBig system in Phase I, we will show system
modularity by extending computation support in GoBig to Hadoop, HPC clusters
running MPI, a queueing system, and a distributed data system. We will also add Julia,
Java, and Scala to the analytic programming languages supported in GoBig, and
demonstrate the applicability of GoBig to a computational science domain. Our Phase I
work will also demonstrate ease of deployment including provisioning of arbitrary
systems and easy installation on cloud services such as OpenStack and Amazon Web
Services (AWS). All of this work will follow Kitware's proven practices for agile,
durable, and sustainable software.
Commercial applications and other benefits
Because GoBig is open-source and extensible, the community that will grow around the
aforementioned tools will foster agility and innovation while reducing maintenance cost
over time. The development model used for open-source projects has also been
proven to scale to thousands of developers while maintaining a high standard for
quality. We will encourage the participation of developers who can add abstractions for
more data storage and processing systems. GoBig's flexibility and ease of use will
ultimately impact a broad range of data analysts who require a low barrier to entry to
distributed compute services, including government, academia, and the business
community.
Key words
Analytics, Big Data, Software, Open Source
Summary for members of Congress
As the needs for big data storage and processing have escalated dramatically in recent
years, a powerful but unwieldy set of disparate tools has appeared that is difficult to
use. Our proposed platform, GoBig, addresses this by exposing multiple big data
storage and computation platforms through a convenient, unified interface.

* Information listed above is at the time of submission. *
