
A New Secondary Access Mechanism for Indexing Very Large Data Sets

Award Information
Agency: Department of Energy
Branch: N/A
Contract: DE-FG02-97ER82428
Agency Tracking Number: 37242
Amount: $74,000.00
Phase: Phase I
Program: SBIR
Solicitation Topic Code: N/A
Solicitation Number: N/A
Solicitation Year: N/A
Award Year: 1997
Award Start Date (Proposal Award Date): N/A
Award End Date (Contract End Date): N/A
Small Business Information
3131 S. Dixie Drive Suite 200
Dayton, OH 45439
United States
HUBZone Owned: No
Woman Owned: No
Socially and Economically Disadvantaged: No
Principal Investigator
 Dr. Shashikala Toke Das
 Research Scientist
 (937) 643-0797
Business Contact
 Mr. Venu Pasupuleti
Title: CEO
Phone: () -
Research Institution


A New Secondary Access Mechanism for Indexing Very Large Data Sets--MegaSoft Technologies Inc., 3131 S. Dixie Drive Suite 200, Dayton, OH 45439-2223;
Dr. Shashikala Toke Das, Principal Investigator
Mr. Venu Pasupuleti, Business Official
DOE Grant No. DE-FG02-97ER82428
Amount: $74,000

Nuclear physics experiments conducted at the Relativistic Heavy Ion Collider and the Thomas Jefferson National Accelerator Facility generate tens to hundreds of terabytes of raw data per experiment per year. Storing, retrieving, and processing these data from immensely complex physical events is a challenging problem for computer scientists. To address the efficient retrieval of such large data sets, this project uses an innovative technique based on a secondary access structure called a property map, which concisely stores the properties of data as relational or object-oriented databases. The Phase I objective is to demonstrate the feasibility and efficiency of this approach and its advantages over traditional multi-attribute indexing techniques. The project will determine the attributes of the datasets and the kinds of queries provided by scientists, and will perform tradeoff studies between this approach and related techniques. A cost model will be developed and simulation studies will be performed. In Phase II, actual queries provided by scientists will be modeled, and a robust implementation of the algorithm necessary to store actual data in a commercial, off-the-shelf database management system will be completed.
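
The abstract does not describe the internal layout of a property map, so the following is only a hypothetical illustration of the general idea of a compact secondary access structure, not the awardee's algorithm. It assumes the data are split into chunks and that the summary kept for each chunk is a (min, max) range per attribute; the names ChunkSummary, build_property_map, and candidate_chunks, and the attributes "energy" and "multiplicity", are invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ChunkSummary:
    """Hypothetical per-chunk summary: attribute name -> (min, max)."""
    chunk_id: int
    ranges: dict


def build_property_map(chunks):
    """Summarize each chunk of events by the value range of every attribute.

    chunks maps a chunk id to a list of events, each event a dict of
    attribute name -> numeric value. (Assumed layout, not from the source.)
    """
    summaries = []
    for chunk_id, events in chunks.items():
        ranges = {}
        for event in events:
            for attr, value in event.items():
                lo, hi = ranges.get(attr, (value, value))
                ranges[attr] = (min(lo, value), max(hi, value))
        summaries.append(ChunkSummary(chunk_id, ranges))
    return summaries


def candidate_chunks(property_map, query):
    """Return ids of chunks that might contain events matching the query.

    query maps attribute name -> (low, high), inclusive. A chunk is kept
    only if its summarized range overlaps the query range on every
    queried attribute; all other chunks are skipped without being read.
    """
    hits = []
    for summary in property_map:
        overlaps = True
        for attr, (q_lo, q_hi) in query.items():
            if attr not in summary.ranges:
                overlaps = False
                break
            lo, hi = summary.ranges[attr]
            if hi < q_lo or lo > q_hi:
                overlaps = False
                break
        if overlaps:
            hits.append(summary.chunk_id)
    return hits


# Made-up example data for illustration only.
chunks = {
    0: [{"energy": 1.0, "multiplicity": 10}, {"energy": 2.0, "multiplicity": 20}],
    1: [{"energy": 5.0, "multiplicity": 300}],
}
pmap = build_property_map(chunks)
print(candidate_chunks(pmap, {"energy": (4.0, 6.0)}))
```

In a sketch like this, the summary holds one small record per chunk rather than one entry per event per attribute, which is the kind of space saving a summarizing structure aims for. The surviving chunks would still need to be scanned to find the exact matching events.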

Commercial Applications and Other Benefits as described by the awardee: The property map technique developed in this project for summarizing and retrieving large datasets facilitates sharing of data among users, dramatically reduces the space required compared with traditional multi-attribute indexing techniques, is useful for efficient query processing, and is applicable in any commercial sector where data warehousing and data mining are employed. Using utilization statistics, the technique also facilitates monitoring the performance of user queries, which allows tuning based on feedback about the organization of data.

* Information listed above is at the time of submission. *
