Extreme-Speed Eigensolver Suite
Department of Energy
Small Business Information
13680 NW 14TH STREET, SUITE 5, Sunrise, FL, 33323-2845
Abstract
One of the most important linear algebra problems is that of finding the eigenvalues and eigenvectors of large-scale matrices. Methods for the solution of these problems, usually called eigensolvers, are fundamental to many applications in engineering and science, ranging from the most challenging computational chemistry problems in molecular design, to stability searches in structural design, to other modeling- and simulation-intense disciplines that are at the core of DOE's scientific priorities, such as weather prediction, nuclear energy, nuclear weapon certification, and wind turbine design, among many others. Scientists working in these areas are always striving for increased computational performance. Modern hybrid supercomputing architectures (made of arbitrary combinations of CPUs, GPUs, FPGAs, and possibly other types of cores) could provide the computational performance needed by large-scale eigensolvers, as has been done for linear equation solvers [Gonz09, Gonz10]. However, increasing eigensolver performance through hybrid computing architectures is, in general, harder than doing so for linear equation solvers. The reason: most eigenvalue software and numerical libraries scale inefficiently when targeted to a large number of processors because (1) communication becomes a primary burden, and (2) this burden is aggravated when transitioning to heterogeneous high-speed architectures [HPCw10]. No industrial-quality scalable eigensolver numerical library exists today that can efficiently exploit heterogeneous concurrency of specialized cores (e.g., any combination of GPUs, FPGAs, Cell processors, etc.) to deliver breakthrough speedups for eigenvalue-centric computations. Accelogic proposes to deploy the world's fastest eigensolver numerical library, equipped to exploit hybrid HPC systems made of arbitrary combinations of CPUs, GPUs, FPGAs, and possibly other types of cores integrated in the same supercomputing network.
We propose two major innovations to attack the above-mentioned hurdles: (1) implementing communication-avoiding eigensolver algorithms; and (2) introducing innovative approaches to guarantee scaling based on talent-aware heterogeneous concurrency. Talent-aware heterogeneous concurrency exploits the abilities that different computing cores have to execute particular numerical tasks. For instance, a GPU is extremely fast for streamlined vector processing, but an FPGA has higher memory bandwidth when operating in its internal memory regime. Talent-aware heterogeneous concurrency considers these cores' talents (i.e., capabilities) to optimally balance computing jobs, aiming at a nontrivial, non-homogeneous partition of the computing jobs that results in an overall reduction in time-to-solution. Accelogic has raised private-sector matching funds tied to this SBIR project for an additional $500,000; such an investment is a testament to the strong commercialization potential of the proposed technology.
Commercial Applications and Other Benefits: This technology will potentially have a large impact in industries such as structural design, chemistry, bioengineering, and computational physics, as well as in DOE's research programs in fusion energy, climate/weather modeling, nanoscale science, genomics, and the study of nuclear matter, among many others.
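As an illustration only, the non-homogeneous partition behind talent-aware heterogeneous concurrency can be sketched in a few lines of Python. This is not Accelogic's method: the device names, the throughput figures, and the functions `partition_work` and `time_to_solution` are all hypothetical, and the sketch assumes each device's "talent" for a task can be summarized by a single throughput number, with work split in proportion to it.

```python
from typing import Dict


def partition_work(total_items: int, throughput: Dict[str, float]) -> Dict[str, int]:
    """Split total_items across devices in proportion to their throughput.

    throughput maps a (hypothetical) device name to work items per second
    for one particular numerical task.
    """
    if total_items < 0:
        raise ValueError("total_items must be non-negative")
    if not throughput or any(rate <= 0 for rate in throughput.values()):
        raise ValueError("every device needs a positive throughput")

    total_rate = sum(throughput.values())
    # Ideal (fractional) share for each device.
    ideal = {dev: total_items * rate / total_rate for dev, rate in throughput.items()}
    # Round down, then hand leftover items to the largest fractional remainders.
    shares = {dev: int(x) for dev, x in ideal.items()}
    leftover = total_items - sum(shares.values())
    by_remainder = sorted(ideal, key=lambda dev: ideal[dev] - shares[dev], reverse=True)
    for dev in by_remainder[:leftover]:
        shares[dev] += 1
    return shares


def time_to_solution(shares: Dict[str, int], throughput: Dict[str, float]) -> float:
    """All devices run concurrently, so the slowest one sets the finish time."""
    return max(shares[dev] / throughput[dev] for dev in shares)


# Made-up throughputs (items per second) for a single task type.
rates = {"cpu": 1.0, "gpu": 6.0, "fpga": 3.0}

talent_aware = partition_work(1000, rates)
even_split = {"cpu": 334, "gpu": 333, "fpga": 333}

print(talent_aware, time_to_solution(talent_aware, rates))
print(even_split, time_to_solution(even_split, rates))
```

With these made-up numbers the proportional split lets all three devices finish together, while the even split leaves the faster devices idle waiting for the CPU. A real scheduler would also have to account for communication cost and for talents that differ from task to task, which this sketch ignores.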
* information listed above is at the time of submission.