Meaningful Assessment in Times of Artificial Intelligence
ORAL · Invited
Abstract
In an era when generative AI can produce correct answers for almost all end-of-chapter problems in first-year physics, meaningful assessment must pivot from closed-format final solutions to the paths that students take to reach them. This talk argues for making reasoning, modeling choices, and quality of representation the primary targets of both formative and summative assessment. It will show how task design that elicits assumptions, multiple representations, estimates and units, error analyses, comparisons of alternative methods, and brief written justifications makes thinking visible, and how AI can be used productively as a coach in low-stakes settings while remaining constrained in high-stakes contexts. To keep workload manageable at scale without compromising trust, the talk presents human-in-the-loop grading workflows: confidence filters that pass only reliable AI decisions forward and defer uncertain cases to human graders, plus clustering of similar solution paths so that one carefully graded exemplar propagates to its peers. Psychometric checks provide evidence for reliability and fairness. The result is an assessment ecosystem that values sense-making over answer-matching, supports timely feedback, and remains auditable in the age of AI.
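As a rough illustration of the human-in-the-loop workflow described above, the following Python sketch shows one way a confidence filter and exemplar propagation could fit together. It is not the authors' implementation: the names AIGrade, route, and propagate, the confidence score in [0, 1], the 0.9 threshold, and the precomputed cluster labels are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class AIGrade:
    """One AI grading decision (hypothetical structure for illustration)."""
    student_id: str
    points: float
    confidence: float  # assumed to be a score in [0, 1]


def route(grades, threshold=0.9):
    """Split AI decisions into accepted ones and ones deferred to a human.

    The threshold value is an assumption; in practice it would have to be
    calibrated against human-graded samples.
    """
    accepted, deferred = [], []
    for grade in grades:
        if grade.confidence >= threshold:
            accepted.append(grade)
        else:
            deferred.append(grade)
    return accepted, deferred


def propagate(cluster_of, exemplar_points):
    """Copy the human-assigned points of each cluster's exemplar to its peers.

    cluster_of maps student_id -> cluster label (assumed precomputed);
    exemplar_points maps cluster label -> points given to the exemplar.
    Students in clusters without a graded exemplar are left out.
    """
    return {
        student_id: exemplar_points[cluster]
        for student_id, cluster in cluster_of.items()
        if cluster in exemplar_points
    }


grades = [
    AIGrade("s1", 4.0, 0.97),
    AIGrade("s2", 2.5, 0.55),
    AIGrade("s3", 3.0, 0.92),
]
accepted, deferred = route(grades)

# s2 is deferred; suppose it sits in cluster "B" whose exemplar earned 2.0.
clusters = {"s2": "B", "s4": "B", "s5": "C"}
propagated = propagate(clusters, {"B": 2.0})
```

In this sketch every deferred case still reaches a human grader, either directly or through a graded exemplar, which is the property that keeps the workflow auditable.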
Publications:
Gerd Kortemeyer, Julian Nöhl, and Daria Onishchuk, Grading assistance for a handwritten thermodynamics exam using artificial intelligence: An exploratory study, Phys. Rev. Phys. Educ. Res. 20, 20144‐1–20144‐24 (2024)
Gerd Kortemeyer and Julian Nöhl, Assessing confidence in AI-assisted grading of physics exams through psychometrics: An exploratory study, Phys. Rev. Phys. Educ. Res. 21, 010136‐1–010136‐24 (2025)
Jan Cvengros and Gerd Kortemeyer, Assisting the Grading of a Handwritten General Chemistry Exam with Artificial Intelligence, arXiv:2509.10591 (2025)
Gerd Kortemeyer, Alexander Caspar, and Daria Horica, Artificial-Intelligence Grading Assistance for Handwritten Components of a Calculus Exam, arXiv:2510.05162 (2025)