Advanced generative artificial intelligence (AI) techniques, such as large language models and large multimodal models, are transforming many aspects of educational assessment. The integration of AI into education has the potential to revolutionize not only test development and evaluation but also the way students learn. In recent years, successful adoptions of machine learning in this area have included natural language processing for automated scoring and collaborative filtering for predicting student responses. The rapid advances of large foundation models (e.g., ChatGPT, GPT-4, Llama, Gemini) demonstrate the potential of intelligent assessment with data-driven AI systems. These models could benefit test construct identification, automatic item generation, multimodal item design, automated scoring, and assessment administration. Meanwhile, new research challenges arise at the intersection of AI and educational assessment. For instance, the explainability and accountability of current large foundation models are still inadequate to convince stakeholders in the educational ecosystem, which limits the adoption of AI techniques in large-scale assessments. It also remains unclear whether large foundation models can assist with complex assessment tasks that involve creative thinking or higher-order reasoning. Tackling these research challenges will require collaborative efforts from researchers and practitioners in both AI and educational assessment.
This one-day workshop provides a forum for researchers from AI and educational assessment to review and discuss recent advances in applying large foundation models to educational assessment. The workshop includes keynote talks and peer-reviewed papers (oral and poster). Original high-quality contributions are solicited on the following topics:
Tsinghua University
University of Maryland
Stanford University
Duolingo
Cambium Assessment
The University of British Columbia
Location: MTG 19&20
Keynote Talk 1: The Practices and Lessons Learnt from AI-assisted Teaching in Tsinghua University, Hongning Wang
To promote deep integration of artificial intelligence (AI) technology with education, Tsinghua University launched a university-wide initiative named the 'AI-Enabled Teaching Pilot Program' in fall 2023. The program actively invests in the research and development of AI teaching assistants based on the latest generative AI technologies, tailored to the specific needs of individual courses. To date, the AI teaching assistants have been introduced into the daily teaching practice of more than 150 courses and have received very positive feedback from both students and instructors. This initiative provides a valuable experimental platform for AI and educational researchers to closely observe and analyze the profound impact of GenAI technology on education and teaching. It also lays the groundwork and provides experience for the subsequent larger-scale development of both AI-assisted educational systems and the courses themselves. In this presentation, we will showcase the primary functions of the intelligent teaching assistant systems and share our observations and reflections on the potential impacts of AI technology on the field of education.
Keynote Talk 2: AI-Enhanced Practices in Educational Assessment, Hong Jiao
Assessments of student learning outcomes are expected to be valid, reliable, accurate, and efficient. Technology has played a significant role in shaping and enhancing educational assessment practices. As the landscape of artificial intelligence (AI) evolves rapidly, AI technology is revolutionizing the test development process, particularly in terms of assessment design, item development, implementation, and psychometric analysis. This presentation will first demonstrate the transformative impact of AI technology on educational assessment by presenting successful use cases of AI in test development, such as automated scoring, cheating detection, and process data analysis. Next, the presentation will explore additional possibilities and opportunities that AI can bring to educational assessment practices, including the use of generative AI for item generation and item parameter prediction modeling. Finally, the presentation will address the challenges of applying AI in educational assessment. It will highlight potential bias and fairness issues, as well as the ethical considerations for the responsible use of AI in large-scale test development. The presentation will emphasize the importance of thoughtful implementation and continuous evaluation to ensure the validity, reliability, and fairness of AI-powered assessment systems.
Keynote Talk 3: TBA
Keynote Talk 4: TBA
Submission URL: Please submit your work via OpenReview.
Format: All submissions must be in PDF format and anonymized. Submissions are limited to nine content pages, including all figures and tables; unlimited additional pages containing references and supplementary materials are allowed. Reviewers may choose to read the supplementary materials but will not be required to.
Style file: You must format your submission using the NeurIPS 2024 LaTeX style file. The maximum file size for submissions is 50MB. Submissions that violate the NeurIPS style (e.g., by decreasing margins or font sizes) or page limits may be rejected without further review.
Double-blind reviewing: The reviewing process will be double blind at the level of reviewers (i.e., reviewers cannot see author identities). Authors are responsible for anonymizing their submissions. In particular, they should not include author names, author affiliations, or acknowledgements in their submissions and they should avoid providing any other identifying information (even in the supplementary material).
Important Dates:
Note: The workshop is non-archival, so you can have a concurrent submission of your work to both the workshop and other venues.
University of Virginia
CFA Institute
Apple Inc.
University of Iowa
University of Virginia
University of Virginia