Accelerating Fusion Science with the Data Fusion Labeler (dFL): A Framework for Rapid and Reproducible Labeling of Experimental Data

POSTER

Abstract

The proliferation of machine learning (ML) applications in fusion energy science has created a critical need for tools that can efficiently generate large, high-quality labeled datasets. To address this need, we created the Data Fusion Labeler (dFL), an application for the rapid exploration and labeling of multimodal 1-D time-series data. Deployed on the Saga system at General Atomics in collaboration with Hewlett Packard Enterprise, dFL is accessible to members of the broader fusion community who have been approved for DIII-D access and abide by the DIII-D data usage agreement. A key feature of the tool is its interoperability with TokSearch, a new data portability system that lets users retrieve signals from multiple fusion devices, including DIII-D. As a demonstration of its capability, dFL was used to generate a labeled dataset of magnetic and plasma signals from DIII-D for training classifiers that differentiate between quiescent H-mode (QH), broadband turbulent QH (BBQH), and wide pedestal QH (WPQH) plasma regimes. The platform's ability to display data in multiple formats, including time series and spectrograms, was crucial for accurate feature identification. dFL made labeling five times faster than the previous process. The resulting dataset was used to train a classifier for exploring the underlying physics of these plasma regimes. dFL also promotes reproducible science by tracking data lineage through Hewlett Packard Enterprise's Common Metadata Framework, which ensures robust provenance for curated datasets and ML models.

*Work supported by the U.S. Department of Energy, Office of Science, Office of Fusion Energy Sciences, using the DIII-D National Fusion Facility, a DOE Office of Science user facility, under Award No. DE-FC02-04ER54698, along with Office of Fusion Energy Sciences Award No. DE-SC0024426.
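
As a rough illustration of the two views named in the abstract (time series and spectrogram) and of how interval labels can be attached to them, the following is a minimal sketch in Python with NumPy. It is not dFL's code or API: the function names, the synthetic test signal, and the placeholder labels "regime_A" and "regime_B" are all assumptions made for the example.

```python
import numpy as np

def spectrogram(signal, fs, nperseg=256, noverlap=128):
    """Hann-windowed short-time Fourier transform of a 1-D signal.

    Returns (freqs, times, power), where power has shape (n_freqs, n_frames).
    """
    step = nperseg - noverlap
    window = np.hanning(nperseg)
    n_frames = 1 + (len(signal) - nperseg) // step
    frames = np.stack([signal[i * step : i * step + nperseg] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    # Each frame is time-stamped at its center.
    times = (np.arange(n_frames) * step + nperseg / 2) / fs
    return freqs, times, power.T

def labels_per_frame(times, intervals, default="unlabeled"):
    """Map labeled intervals [(t_start, t_end, label), ...] onto frame times."""
    out = np.full(times.shape, default, dtype=object)
    for t0, t1, label in intervals:
        out[(times >= t0) & (times < t1)] = label
    return out

# Synthetic stand-in for a measured signal (hypothetical, not DIII-D data):
# a 500 Hz tone that switches to 2000 Hz halfway through, plus noise.
fs = 10_000.0
t = np.arange(10_000) / fs
rng = np.random.default_rng(0)
signal = np.where(t < 0.5,
                  np.sin(2 * np.pi * 500 * t),
                  np.sin(2 * np.pi * 2000 * t))
signal = signal + 0.1 * rng.standard_normal(t.size)

freqs, times, power = spectrogram(signal, fs)

# Intervals a person might mark after inspecting the spectrogram
# (placeholder label names).
intervals = [(0.0, 0.5, "regime_A"), (0.5, 1.0, "regime_B")]
frame_labels = labels_per_frame(times, intervals)

# Dominant frequency in each frame, one simple feature a classifier could use.
peak_freqs = freqs[power.argmax(axis=0)]
```

The change in frequency content at 0.5 s is hard to read off the raw trace but is obvious in `peak_freqs`, which is one reason a spectrogram view can make boundaries between intervals easier to place.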

Presenters

  • Mathew Waller
    • Sophelio

Authors

  • Mathew Waller
    • Sophelio
  • Craig Michoski
    • SapientAI LLC
  • Zeyu Li
    • General Atomics
  • Brian Sammuli
    • General Atomics
  • Raffi M Nazikian
    • General Atomics
  • David Orozco
    • General Atomics
  • Martin Foltin
    • Hewlett Packard Enterprise
  • Tapan Nakkina
    • Sophelio