Indexing large scientific data
Title: Chief Technology Officer
Phone: (404) 217-0457
Email: glenn@maplarge.com
Title: President
Phone: (404) 217-0457
Email: lynwood@maplarge.com
Hadoop-style systems have done an excellent job of providing scalable, long-term, disk-bound data storage and enjoy wide acceptance in both government and the private sector. However, Hadoop implementations suffer from performance limitations in whole-set aggregates and real-time interactivity, which we believe can be solved by optimizing for local memory operations. The key performance driver is memory locality. A well-written Hadoop process may achieve optimal memory throughput on an individual node, but the system as a whole generally does not achieve optimal memory locality and therefore frequently fails to meet performance requirements. We propose to create a multi-node data architecture that automatically optimizes for memory locality using a compressed, column-oriented design compatible with both CPU and GPU processing. The result will be a real-time streaming architecture capable of indexing and querying large volumes of heterogeneous scientific data stored on clusters of cloud computers.
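As an illustration only, the following minimal Python sketch shows one way a compressed, column-oriented layout can serve a whole-set aggregate while keeping work local to each node. It is not the proposed system: run-length encoding is assumed as the compression scheme, and the names RLEColumn, partial_sum, and cluster_mean are hypothetical. Each shard stands in for one node's in-memory column, and only small (sum, count) partials are combined across shards.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Iterable, List, Tuple


@dataclass
class RLEColumn:
    """Hypothetical column stored as (value, run_length) pairs.

    Run-length encoding is an assumed compression scheme chosen for
    illustration; the source does not name a specific one.
    """
    runs: List[Tuple[float, int]]

    @classmethod
    def from_values(cls, values: Iterable[float]) -> "RLEColumn":
        # Collapse consecutive equal values into a single run.
        return cls([(v, sum(1 for _ in group)) for v, group in groupby(values)])

    def partial_sum(self) -> Tuple[float, int]:
        # Aggregate directly on the compressed runs, without expanding them.
        total = sum(value * length for value, length in self.runs)
        count = sum(length for _, length in self.runs)
        return total, count


def cluster_mean(shards: List[RLEColumn]) -> float:
    # Each shard computes its partial locally; only the small
    # (sum, count) pairs are combined across shards.
    partials = [shard.partial_sum() for shard in shards]
    total = sum(t for t, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


# Two shards standing in for two nodes holding parts of one column.
shards = [
    RLEColumn.from_values([20.0, 20.0, 20.0, 21.0]),
    RLEColumn.from_values([21.0, 21.0, 19.0, 19.0]),
]
mean_value = cluster_mean(shards)
```

Because each partial is computed over a contiguous, compressed column, the scan touches far less memory than a row-by-row pass over uncompressed records, which is the memory-locality effect described above.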
* Information listed above is at the time of submission. *