FTXS 2015 : The 5th Fault Tolerance for HPC at eXtreme Scale (FTXS) Workshop


When: Jun 15, 2015
Where: Portland, OR
Submission Deadline: Feb 9, 2015
Notification Due: Mar 9, 2015
Categories: resilience, fault tolerance, HPC, supercomputing

Call For Papers

5th Workshop on Fault-Tolerance for HPC at eXtreme Scale (FTXS 2015)

In conjunction with
The 24th International ACM Symposium on
High Performance Distributed Computing (HPDC 2015)
Portland, Oregon, USA on June 15 – 19, 2015

Authors are invited to submit original papers on the research and practice of
fault-tolerance in extreme-scale high-performance computing (HPC). Resilience
and fault-tolerance remain major concerns for supercomputing, and advances in
this area are needed to allow applications to compute accurate (or within error
tolerance) answers in a timely and efficient manner in the presence of
degradations or failures of platform components (both hardware and software).

Topics include, but are not limited to:
* Failure data analysis and field studies
* Power, performance, resilience (PPR) assessments / tradeoffs
* Novel fault-tolerance techniques and implementations
* Emerging hardware and software technology for resilience
* Silent data corruption (SDC) detection / correction techniques
* Advances in reliability monitoring, analysis, and control of highly
complex systems
* Failure prediction, error preemption, and recovery techniques
* Fault-tolerant programming models
* Models for software and hardware reliability
* Metrics and standards for measuring, improving, and enforcing effective
* Scalable Byzantine fault-tolerance and security from single-fault and
fail-silent violations
* Atmospheric evaluations relevant to HPC systems (terrestrial neutrons,
temperature, voltage, etc.)
* Near-threshold-voltage implications and evaluations for reliability
* Benchmarks and experimental environments including fault injection
* Frameworks and APIs for fault-tolerance and fault management

See the FTXS 2015 and HPDC 2015 websites for more information.

AMD will sponsor the FTXS 2015 best paper award! The winner will be chosen by
the PC and the award presented at the workshop.

Submissions are solicited in the following categories:
* Regular papers presenting innovative ideas improving the state of the
art.
* Experience papers discussing the issues seen on existing extreme-scale
systems, including some form of analysis and evaluation.
* Extended abstracts proposing disruptive ideas in the field, including
some form of preliminary results.

Submissions shall be sent electronically, must conform to ACM conference
proceedings style and should not exceed eight (8) pages including all text,
appendices, and figures. Position papers should not exceed six (6) pages.

Important Dates

Submission of papers: February 9th, 2015
Author notification: March 9th, 2015
Camera-ready papers: April 2015
Workshop: June 15th, 2015

Workshop Chairs

Nathan DeBardeleben – Los Alamos National Laboratory
Franck Cappello – Argonne National Laboratory and UIUC
Robert Clay – Sandia National Laboratories

Program Committee

Leonardo Bautista Gomez – Argonne National Laboratory
Aurélien Bouteiller – University of Tennessee Knoxville
Greg Bronevetsky – Lawrence Livermore National Laboratory
John Daly – Department of Defense
Christian Engelmann – Oak Ridge National Laboratory
Kurt Ferreira – Sandia National Laboratories
Ana Gainaru – University of Illinois at Urbana-Champaign
Qiang Guan – Los Alamos National Laboratory
Saurabh Gupta – Oak Ridge National Laboratory
Saurabh Hukerikar – Information Sciences Institute/USC
Hideyuki Jitsumoto – Tokyo Institute of Technology
Zhiling Lan – Illinois Institute of Technology
Scott Levy – University of New Mexico
Naoya Maruyama – RIKEN AICS
Bogdan Nicolae – IBM Research – Ireland
Thomas Ropars – EPFL
Yves Robert – ENS Lyon
Anthony Skjellum – Auburn University
Vilas Sridharan – AMD, Inc.
Devesh Tiwari – Oak Ridge National Laboratory
Abhinav Vishnu – Pacific Northwest National Laboratory
