HPCDB 2011 : 1st Workshop on High Performance Computing Meets Databases
Call For Papers
The high-performance computing (HPC) community is facing significant data challenges: the performance of simulations on evolving leadership-class computing architectures is increasingly dominated by the costs of data access, movement, transformation, and analysis on the HPC system. These problems are expected to only get worse as we move towards exascale computing. The database community has developed a collection of approaches that have allowed it to effectively meet similar data challenges for the commercial sector by emphasizing a rigorous data model, a simple but expressive query algebra, cost-based optimization, declarative query languages, and logical and physical data independence. This workshop is focused on bringing together the HPC and database communities to facilitate discussions that will lead both to a greater awareness of each other and, eventually, to solutions to the myriad data problems facing high-performance computing.
Workshop participants will discuss the following questions, among others:
What are the appropriate data models for HPC data? (arrays, structured and unstructured meshes, graphs, trees, relations?)
Which critical systems and subsystems found in HPC environments would benefit from features typically associated with databases? (e.g., filesystems equipped with indexing and query optimization; monitoring implemented as stream queries)
Can declarative query languages make HPC systems and datasets accessible to a new class of data-oriented scientists?
The hallmark of the database community is to “push the computation to the data,” insulating users and applications from details of data representation, scale, system architecture, and evaluation method, while affording runtime optimization opportunities unavailable to compile-time techniques. In what other HPC contexts might this general approach be applied?
Which science domains or specific HPC applications are particularly well-suited to this approach?
In this workshop, we invite 4-page position papers that define or clarify the data challenges facing HPC, explore the design space between the two communities, or describe work in progress bridging these communities. Applications and platforms of interest to the HPC community will drive the focus of the workshop. We seek new approaches to the problems in these areas as opposed to providing yet another forum for “big data” or “how fast can I write data to disk” discussions. In particular, we request that each position paper include a section considering how the work could be deployed in the context of leadership-class computing platforms OR address a specific application of interest in HPC (e.g., an application involving simulation, visualization, or another typical HPC area).
Topics of Interest
Data models and query algebras for HPC (arrays, meshes, graphs, images)
Query languages for parallel processing
HPC “data challenges”
Distributed data structures
Extending file-based systems with database features
DBMS on massively multicore platforms
Very-low footprint and main-memory DBMS architectures
In situ analysis via streaming and continuous queries
Column-stores for computational science
Databases and high-performance visualization
Simulations and linear algebra as queries
Batch query processing and multiple-query optimization
We invite position papers of no more than 4 pages, following ACM conference formatting guidelines. Submissions should be in PDF format. A collection of the best papers may be invited to a special issue of a journal to be determined.
Submit .doc or .pdf via email to email@example.com
Workshop information is also available on the Supercomputing 2011 website
Full papers: August 22
Bill Howe (University of Washington), Terence Critchlow (Pacific Northwest National Laboratory), Magda Balazinska (University of Washington), Kirsten Kleese-Van Dam (Pacific Northwest National Laboratory), Julio Lopez (Carnegie Mellon University), Jeff Gardner (University of Washington)