
Cracking The Exam with Artificial Intelligence: A Word-Based Analysis of LGS Science Questions
Chapter from the book:
Büyük, U. (ed.) 2025. Technology Integration in Education: Artificial Intelligence and Digitalization Perspectives.
Synopsis
This study aims to reveal the conceptual structure of the LGS (High School Entrance Exam) Science questions administered between 2018 and 2024 by analyzing them with word-based natural language processing (NLP) techniques. The dataset of 120 questions was processed with the Zemberek-NLP library: stop words were removed and the remaining words were categorized by part of speech. Techniques such as word frequency analysis, TF-IDF (Term Frequency–Inverse Document Frequency), concept clustering, and word cloud visualization were then applied.
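The word frequency and TF-IDF steps can be illustrated with a minimal sketch. It is not the study's pipeline: whitespace tokenization and a tiny English stop-word list stand in for the Zemberek-NLP processing of Turkish text, the three sample "questions" are hypothetical, and the TF-IDF variant (relative term frequency times the natural log of N/df) is an assumption, since the synopsis does not state which weighting was used.

```python
import math
from collections import Counter

# Hypothetical stand-ins: the study used Zemberek-NLP on Turkish questions.
STOP_WORDS = {"the", "of", "a", "is", "in", "and"}
SAMPLE_QUESTIONS = [
    "the dna of a pea seed",
    "liquid pressure in the experiment",
    "dna and the experiment",
]

def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]

def word_frequencies(docs):
    """Count each remaining word across the whole corpus."""
    totals = Counter()
    for doc in docs:
        totals.update(tokenize(doc))
    return totals

def tf_idf(docs):
    """Return one {word: score} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    counts = [Counter(tokenize(doc)) for doc in docs]
    n_docs = len(docs)
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())  # each word counted once per document
    scores = []
    for c in counts:
        length = sum(c.values())
        scores.append(
            {w: (n / length) * math.log(n_docs / doc_freq[w]) for w, n in c.items()}
        )
    return scores

frequencies = word_frequencies(SAMPLE_QUESTIONS)
scores = tf_idf(SAMPLE_QUESTIONS)
```

In this toy corpus "dna" is the most frequent word overall, yet within the first question "pea" scores higher on TF-IDF because it occurs in only one document. The two measures therefore answer different questions: how common a word is across the exam, and how distinctive it is for a given question.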
The results indicate that the LGS questions predominantly reflect 8th-grade learning outcomes, with a systematic emphasis on the “Living Things and Life” unit. Genetics-related concepts such as “DNA,” “seed,” and “pea” stood out in both frequency and TF-IDF scores. The “Matter and Industry,” “Pressure,” and “Physical Events” units were also consistently represented, with words like “identical,” “liquid,” and “experiment” appearing frequently. In contrast, units such as “Earth and the Universe” appeared only sparsely across the years.
The findings suggest that the LGS focuses strongly on specific concept clusters, prioritizes scientific process skills, and shows a noticeable imbalance in content distribution. This study provides a data-driven resource for teachers and content developers working on exam preparation, and serves as an example of how AI-supported text analysis can be used in educational contexts.