MODA 2020 : First International Workshop on Monitoring and Operational Data Analytics (MODA)
Call For Papers
1st ISC HPC International Workshop on “Monitoring and Operational Data Analytics” (MODA)
June 25, 2020
1. Workshop scope
The race to Exascale poses significant challenges for the collection and analysis of the vast amount of data that future HPC systems will produce, in terms of the increasing complexity of the machines, the scalability and intrusiveness of the adopted monitoring solution, and the interpretability and effective inference driven by the acquired data. The main scope of the 1st ISC-HPC International Workshop on Monitoring and Operational Data Analytics (MODA) is to provide insight into current trends in MODA, to identify potential gaps, and to offer an outlook on the future of the fields involved (high-performance computing, databases, and machine learning) and on possible solutions for upcoming Exascale systems. Contributions matching the scope of the workshop will be related to:
Currently envisioned solutions and practices for monitoring systems at data centers and HPC sites. Significant focus will be placed on operational data collection mechanisms i) covering different system levels, from building infrastructure sensor data to CPU-core performance metrics, and ii) targeting different end-users, from system administrators to application developers and computational scientists.
Effective strategies for analyzing and interpreting the collected operational data. Such strategies should particularly include (but are not limited to) different visualization approaches and machine learning-based techniques, potentially inferring knowledge of the system behavior and allowing for the realization of a proactive control loop.
This workshop is not targeting new solutions proposed in the context of application performance modeling and/or application performance analysis tools. Novel contributions in the area of compiler analysis, debugging, programming models and/or sustainability of scientific software are also considered out of the scope of the workshop.
MODA is becoming common practice at various international HPC sites. However, each site follows a different, insular approach that is rarely adopted in production environments and is mostly limited to the visualization of system and building infrastructure metrics for health check purposes. In this regard, we observe a gap between the collection of operational data and its meaningful and effective analysis and exploitation, which prevents the closing of the feedback loop between the monitored HPC system, its operation, and its end-users. Under these premises, the goals of the workshop can be summarized as follows:
Gather and share knowledge and establish a common ground within the international community with respect to best practices in monitoring and operational data analytics.
Discuss future strategies and alternatives for MODA, potentially improving existing solutions and envisioning a common baseline approach in HPC sites and data centers.
Establish a debate on the usefulness and applicability of AI techniques to collected operational data for optimizing the operation of production systems (e.g., for practices such as predictive maintenance, runtime optimization, optimal resource allocation and scheduling).
1.1 Topics of interest
The contributions submitted to MODA will ideally address:
State-of-the-practice methods, tools, and techniques in monitoring at various HPC sites
Solutions for monitoring and analysis of operational data that work very well on large- to extreme-scale systems with a large number of users
Solutions that have proven limitations in terms of efficiency of operational data collection in real-time or in terms of the quality of the collected data
Opportunities and challenges of using machine learning methods for efficient monitoring and analysis of operational data
Integration of monitoring and analysis practices into production system software (energy and resource management) and runtime systems (scheduling and resource allocation)
Explicit gaps between operational data collection, processing, effective analysis, and useful exploitation, together with new approaches to closing these gaps for the benefit of improving HPC center planning, operations, and research
Other monitoring and operational data analysis challenges and approaches (data storage, visualization, integration into system software, adoption)
1.2 Submission and publication
We will solicit original contributions in the form of papers (6-12 pages), which will be peer-reviewed by the program committee members. All accepted papers will be presented during the workshop. We aim at a minimum of 4 and a maximum of 8 accepted papers, for the 8 x 30-minute slots in the tentative workshop program (see Section 4).
We will publish the workshop papers together with the ISC 2020 proceedings, including an abstract of the keynote and invited talks, and a short white paper of the panel session.
High-quality contributions may be considered for a full-length submission to a special journal issue in collaboration with ParCo, CPE, or other journals.
Papers should be submitted through the online system at https://moda20.sciencesconf.org. To create an account, click on the downward arrow near the “Login” box at the top right side of the page.
Deadline: March 1, 2020 (AoE)
Notification: April 6, 2020
2.1 Workshop organizing committee
1. Florina Ciorba - University of Basel, Switzerland
2. Nicolas Lachiche - University of Strasbourg, France
3. Aurélien Cavelan - University of Basel, Switzerland
4. Daniele Tafani - Leibniz Supercomputing Centre, Germany
5. Utz-Uwe Haus - Cray/HPE EMEA Research Lab, Switzerland
2.2 Technical program committee
1. Andrea Bartolini - University of Bologna, Italy
2. Valeria Bartsch - Fraunhofer ITWM Kaiserslautern, Germany
3. Norm Bourassa - NERSC LBNL, USA
4. Jim Brandt - Sandia National Labs, USA
5. Rubén Cabezón - sciCORE, University of Basel, Switzerland
6. Carlo Cavazzoni - CINECA, Italy
7. Todd Gamblin - LLNL, USA
8. Victor Holanda - CSCS, Switzerland
9. Thomas Ilsche - Technische Universität Dresden, Germany
10. Jacques-Charles Lafoucriere - CEA, France
11. Erwin Laure - KTH, Sweden
12. Filippo Mantovani - BSC, Spain
13. Diana Moise - Cray/HPE, Switzerland
14. Ariel Oleksiak - Poznan Supercomputing Center, Poland
15. Melissa Romanus - NERSC LBNL, USA
16. Karthee Sivalingam - Cray/HPE, UK
17. Heiko Schuldt - University of Basel, Switzerland
18. Martin Schulz - TU Munich / Leibniz Supercomputing Centre, Garching, Germany
19. Keiji Yamamoto - RIKEN, Japan