Resolving Multi-Level Discrepancies in Information Extraction: Foundation Models for Autonomous Interdisciplinary Literature Analysis in Physics

ORAL

Abstract

Recent advances in large language models (LLMs) and foundation models have unlocked new possibilities for automating literature analysis in physics, yet significant challenges remain in harmonizing information extracted both within individual documents and across multiple sources. Discrepancies arise from non-standard technical terminology, ambiguity in human language, and the limitations of traditional extraction workflows, which struggle with limited context windows, multi-hop reasoning, and catastrophic forgetting when processing extensive, interdisciplinary, domain-specific context.

This talk introduces a hybrid approach that combines semantic vector search, knowledge graphs, and LLM-based decision engines to resolve inconsistencies, quantify confidence levels, and improve entity normalization, drawing on paraconsistent logic and joint extraction strategies. We demonstrate how this methodology enables autonomous, high-fidelity interdisciplinary literature studies, facilitating the identification and synthesis of critical information for advanced physics research. Our results highlight the importance of robust multi-level discrepancy resolution for accelerating discovery and fostering cross-domain innovation in physics.
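
As a rough illustration of what cross-source discrepancy resolution with confidence scoring might look like, the following Python sketch normalizes entity mentions and then scores agreement between sources. It is not the authors' implementation. Everything in it is an assumption made for illustration: character-trigram similarity stands in for semantic vector search, a simple vote count stands in for the LLM-based decision engine, and the function names (trigram_vector, normalize, resolve) and both thresholds are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def trigram_vector(text):
    """Bag of character trigrams: a toy stand-in for a semantic embedding."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def normalize(mention, canonical_names, threshold=0.5):
    """Map a mention to its most similar canonical name, or keep it unchanged."""
    vec = trigram_vector(mention)
    best = max(canonical_names, key=lambda name: cosine(vec, trigram_vector(name)))
    return best if cosine(vec, trigram_vector(best)) >= threshold else mention


def resolve(claims, canonical_names, min_confidence=0.6):
    """Group (source, mention, property, value) claims and score their agreement."""
    votes = defaultdict(Counter)
    for _source, mention, prop, value in claims:
        votes[(normalize(mention, canonical_names), prop)][value] += 1

    results = {}
    for key, counter in votes.items():
        value, count = counter.most_common(1)[0]
        confidence = count / sum(counter.values())
        results[key] = {
            "value": value,
            "confidence": confidence,
            "status": "resolved" if confidence >= min_confidence else "conflict",
            # Minority values are kept rather than discarded, so that a later
            # step (human or model) can still inspect the disagreement.
            "alternatives": dict(counter),
        }
    return results
```

A hypothetical call would pass a list of tuples such as ("paper_a", "Gallium Arsenide", "band_gap_eV", 1.42) together with a list of canonical entity names. Retaining the minority values alongside the chosen one is a loose nod to the paraconsistent idea of tolerating contradictory claims instead of forcing an early choice.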

Presenters

  • Jian Yang

    • Westlake Corp.

Authors

  • Jian Yang

    • Westlake Corp.
  • Michael Dessauer

    • Westlake Corp.
  • Constantyn Chalitsios

    • Westlake Corp.