The concept of Assessment Capacity draws attention to the foundational conditions that must be in place before anyone can engage meaningfully with the "big topics" of testing, such as fairness or validity. This insight comes from practical experience, particularly in the Global South, where key ideas and practices have not yet spread widely enough to establish the baseline capacity needed to examine these concepts critically and put them into practice.
The rise of artificial intelligence (AI) bears directly on the broader capacity of societies to govern AI use and to identify appropriate and responsible applications. That capacity includes a society's ability to understand what counts as responsible, ethical, or minimally acceptable practice, which in turn depends on local moral philosophies, regulatory traditions, and institutional histories.
Both culturally and infrastructurally, the development and deployment of AI, especially large language models (LLMs), are concentrated largely in the United States and a small number of other regions, as the concentration of data centers and computational infrastructure illustrates. LLM development is anchored in this specific milieu, and the technical and epistemic capacities required to shape and evaluate these systems remain similarly concentrated within relatively closed networks of experts.
In the context of educational assessment, AI applications are already diverse and expanding. These include, among others, item generation, feedback generation, early validity checks and expert reviews, test security, and the generation of synthetic data.
This raises a central question: What new capacities are required for assessment systems as AI becomes integrated into testing technologies? More specifically, which general AI-related capacities are most relevant—and most urgent—for assessment practice?