
BTSD 2022 : The 4th International Workshop on Big Data Tools, Methods, and Use Cases for Innovative Scientific Discovery (BTSD) 2022


When Dec 17, 2022 - Dec 17, 2022
Where Virtual
Abstract Registration Due Oct 1, 2022
Submission Deadline Oct 23, 2022
Notification Due Nov 1, 2022
Final Version Due Nov 20, 2022
Categories    machine learning   physics   material science   big data

Call For Papers

Program Chairs
Sangkeun (Matt) Lee

Jong Youl Choi

Anika Tabassum

Organizers’ Background
Sangkeun (Matt) Lee received his Ph.D. degree in computer science and engineering from Seoul National University in 2012. He is currently an R&D Associate in the Computer Science and Mathematics Division at Oak Ridge National Laboratory. He studies big data, data science, and machine learning, and has applied state-of-the-art data analysis technologies in many application domains. He has developed a range of data analytics software, including ORiGAMI, which won the 2016 DOE R&D 100 Award. He has contributed to many leading computer science conferences and journals such as ACM WWW, ACM RecSys, and Expert Systems with Applications. For the last few years, he has collaborated with scientists across various domains, including material science, nuclear science, and mechanical engineering, and has published papers in scientific journals such as Journal of Nuclear Materials, Acta Materialia, The Electricity Journal, and Advanced Theory and Simulations.

Jong Youl (Jong) Choi is a researcher in the Discrete Algorithms Group, Computer Science and Mathematics Division, Oak Ridge National Laboratory (ORNL), Oak Ridge, Tennessee, USA. He earned his Ph.D. degree in Computer Science at Indiana University Bloomington in 2012 and his MS degree in Computer Science from New York University in 2004. His research interests span data mining and machine learning algorithms, high-performance data-intensive computing, and parallel and distributed systems. More specifically, he focuses on developing data-centric machine learning algorithms for large-scale data management, in situ/in-transit data processing, and data management for code coupling. Jong Choi actively serves on committees and as a reviewer for conferences and journals such as ParaMo, CCPE, and CLUS.

Anika Tabassum is currently a postdoctoral researcher at Oak Ridge National Laboratory, where she contributes to deep learning for multi-scale and multimodal battery analytics and plasma simulation for fusion energy. Her research focuses on developing deep learning models for robust scientific computing; specifically, she works on knowledge-guided ML and scientific ML. She received her Ph.D. from the Department of Computer Science at Virginia Tech, where she applied knowledge-guided ML to multiple challenges in power system failures and clean energy. Her Ph.D. research was funded by an NSF Urban Computing fellowship. Apart from her primary research focus, she also worked on designing a COVID-19 forecasting model for the CDC challenge. She has published in multiple venues such as ACM SIGKDD, AAAI, CIKM, IEEE BigData, and IAAI, and in journals such as ACM TIST and those published by Elsevier. She completed her bachelor's degree in Computer Science and Engineering at the Bangladesh University of Engineering and Technology.

Introduction to Workshop
Advances in big data technology, artificial intelligence, and machine learning have produced many success stories in a wide range of areas, especially in industry. These success stories have motivated scientists in physics, chemistry, materials, medicine, and many other fields to explore new ways of using big data tools in their scientific work.

However, there are barriers to overcome. Most existing big data tools, systems, and methodologies have been developed without considering scientific purposes or scientists’ specific requirements. They were not originally designed for scientists who have little or no knowledge of programming or computer science. For computer scientists, on the other hand, understanding the domain problem is often very challenging because they lack sufficient background knowledge.

We expect that big data technologies can contribute substantially to scientific innovation. Many ongoing scientific projects around the world already aim to discover novel hypotheses, analyze big multidimensional data that cannot be handled manually, and reduce the time required for complex calculations by machine. This workshop intends to bring domain scientists and computer scientists together to explore and extend opportunities in the development of big data tools, systems, and methodologies for scientific discovery, to share success stories and lessons learned, and to discuss challenges that, if overcome, would enable successful collaboration across different domains, especially between domain scientists and computer/data scientists.

In this workshop, we discuss the following questions:

What makes big data tools for scientists different from the existing tools?

What specific needs and challenges do domain scientists face when they try to adopt big data tools?

How can computer scientists and domain scientists communicate to define a feasible problem together?

What are the barriers to using big data for scientific discovery, and how do these barriers differ across scientific domains?

Workshop History
The international workshop on Big Data Tools, Methods, and Use Cases for Innovative Scientific Discovery (BTSD) was first held in December 2019 in conjunction with the IEEE Big Data 2019 conference, organized by Matt Lee and Travis Johnston. A total of 12 papers were accepted. It was a great start toward building a strong scientific collaboration community. The second BTSD workshop was held in December 2020 as a virtual workshop in conjunction with IEEE Big Data 2020. A total of 11 papers were accepted and presented. The third BTSD workshop was held in December 2021 as a virtual workshop in conjunction with IEEE Big Data 2021. A total of 9 papers were accepted and presented. It was a great opportunity to communicate and to learn from experiences across many scientific domains.

Research Topics Included in the Workshop
Big data tools, systems, and methods related to, but not limited to:

Scientific data processing

Artificial intelligence/Deep neural networks/Machine learning

Text mining/Graph mining

Database/Query processing/Query Optimization

Parallel computation/High Performance Computing

Visualization/User Interface/HCI


High Performance Computing …

that facilitate innovation and discovery in a scientific domain, such as:



Material science

Mechanical engineering

Nuclear engineering

Biomedical science …

Use cases, success stories, and lessons learned in scientific discovery using big data tools, systems, and methods

Program Committee Members
Youngjae Kim, Sogang University, South Korea

Feng Bao, Florida State University, USA

Supriya Chinthavali, Oak Ridge National Laboratory, USA

Guimu Guo, Rowan University, USA

Ramakrishnan Kannan, Oak Ridge National Laboratory, USA

Seungha Shin, University of Tennessee, USA

Pei Zhang, Oak Ridge National Laboratory, USA

Ivy Peng, Lawrence Livermore National Laboratory, USA

Ralph Kube, Princeton Plasma Physics Laboratory, USA

Ohyung Kwon, Korea Institute of Industrial Technology, South Korea

Paper Submission
Please submit a short paper (minimum 4 pages, up to 6 pages, in IEEE 2-column format) or a full paper (minimum 8 pages, up to 10 pages, in IEEE 2-column format) through the online submission system.

Papers should be formatted according to the IEEE Computer Society Proceedings Manuscript Formatting Guidelines (see link to "formatting instructions" below).

Formatting Instructions

8.5" x 11" (DOC, PDF)

LaTeX Formatting Macros

Important Dates
Oct 1, 2022: Abstract submission

Oct 23, 2022 (final extension): Due date for full workshop paper submission

Nov 1, 2022: Notification of paper acceptance to authors

Nov 20, 2022: Camera-ready versions of accepted papers

Presentation Preparation
To be announced


Workshop Primary Contact
Sangkeun (Matt) Lee, Discrete Algorithms Group, Computer Science and Mathematics Division, Oak Ridge National Laboratory, TN, USA. Tel: +1 865 574 8858 Email:
