Reviewing the Evolution of Research Trends in Low Temperature Plasma Science and Technology Over the Last Five Decades via Topic Modeling
ORAL
Abstract
We apply Non-Negative Matrix Factorization (NMF) for topic modeling to an extensive archive of Gaseous Electronics Conference (GEC) abstracts spanning 1970 to 2022. Data collection and filtering involved multiple steps, including converting the PDFs of all abstracts into text files and using a high-quality weak supervision model to prepare a structured dataset. To standardize the text, we applied tokenization, contraction expansion, stop-word removal, and lemmatization. We then merged n-grams and extracted features using term frequency-inverse document frequency (TF-IDF). The resulting corpus contains more than 20,000 unique abstracts. We conducted a stability analysis to determine the optimal number of topics. The terms associated with each topic were selected using a relevance metric together with domain knowledge, so that the chosen terms are not only frequent but also highly pertinent to their respective topics. By examining how the topics evolve over the period covered by the abstracts, we aim to gain a deeper understanding of the chronological progression of research areas in low-temperature plasma science and technology. Our goal is to identify interdependent topic trends and the overall knowledge structure of the field, and to link these trends with real-world developments. The dataset and insights are valuable for knowledge discovery and could aid future research in this field.
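As a rough illustration of the TF-IDF and NMF steps described in the abstract, the following is a minimal sketch in pure Python and is not the authors' implementation. The four toy documents, the small stop-word list, the smoothed IDF formula, the choice of multiplicative updates, and all function names and parameters (`tokenize`, `tfidf`, `nmf`, `k`, `iters`, `seed`) are assumptions made for illustration. Lemmatization, contraction expansion, n-gram merging, the stability analysis, and the relevance-based term selection are omitted.

```python
import math
import random
from collections import Counter

# Hypothetical, tiny stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "in", "and", "for", "with", "on", "to"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    words = "".join(c if c.isalpha() else " " for c in text.lower()).split()
    return [w for w in words if w not in STOP_WORDS]

def tfidf(docs):
    """Return (vocabulary, matrix) where each row is an L2-normalized TF-IDF vector."""
    tokens = [tokenize(d) for d in docs]
    vocab = sorted({w for t in tokens for w in t})
    df = Counter(w for t in tokens for w in set(t))  # document frequency
    n = len(docs)
    rows = []
    for t in tokens:
        tf = Counter(t)
        # Smoothed IDF (an assumed variant): ln((1 + n) / (1 + df)) + 1
        row = [tf[w] * (math.log((1 + n) / (1 + df[w])) + 1) for w in vocab]
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        rows.append([x / norm for x in row])
    return vocab, rows

def transpose(a):
    return [list(r) for r in zip(*a)]

def matmul(a, b):
    bt = list(zip(*b))  # columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor v (n x m) into w (n x k) and h (k x m) with non-negative entries,
    using multiplicative updates for the Frobenius-norm objective."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() for _ in range(k)] for _ in range(n)]
    h = [[rng.random() for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h

def frobenius_error(v, w, h):
    """Frobenius norm of v - w @ h."""
    wh = matmul(w, h)
    return math.sqrt(sum((a - b) ** 2
                         for rv, rw in zip(v, wh) for a, b in zip(rv, rw)))

# Invented toy documents, NOT taken from the GEC archive.
docs = [
    "electron density in argon discharge",
    "argon discharge electron temperature",
    "plasma etching of silicon wafer",
    "silicon wafer etching rate",
]
vocab, v = tfidf(docs)
w, h = nmf(v, k=2)

# Each row of h is a topic; print its three highest-weighted terms.
for topic in h:
    top = sorted(range(len(vocab)), key=lambda j: -topic[j])[:3]
    print([vocab[j] for j in top])
```

In this sketch, each row of `w` gives the topic weights of one document and each row of `h` gives the term weights of one topic. Tracking topic evolution over time would then amount to aggregating the rows of `w` by publication year, which is not shown here.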
Presenters
-
Bhaskar Chaudhury
Group in Computational Science and HPC, DA-IICT, Gandhinagar, India
Authors
-
Bhaskar Chaudhury
Group in Computational Science and HPC, DA-IICT, Gandhinagar, India
-
Divya Patel
Group in Computational Science and HPC, DA-IICT, Gandhinagar, India
-
Kunj Patel
Group in Computational Science and HPC, DA-IICT, Gandhinagar, India
-
Agam Shah
School of Computational Science & Engineering, College of Computing, Georgia Institute of Technology, Atlanta, USA