Who’s better? Adaptive comparative judgment of dance performances

Keywords: adaptive comparative judgment, assessment, dance



Introduction

Adaptive Comparative Judgment (ACJ) is a promising digital assessment method that measures performance or competencies through repeated comparisons of two items. Although ACJ is becoming a popular method in educational measurement, there are no published studies or use cases of ACJ in sport or in physical education teacher education (Bartholomew & Jones, 2022). To address this research gap, an exploratory, comparative study was conducted to investigate whether ACJ offers an advantage over traditional, criteria-oriented assessment (TA) in the evaluation of students' dance performances.
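To illustrate how repeated pairwise comparisons can be turned into a scale, the following minimal sketch fits a Bradley-Terry model to a list of (winner, loser) judgments. It is an assumption for illustration only: the abstract does not state which model or pair-selection algorithm Comproved uses, the adaptive selection of pairs is not shown, and the function name fit_bradley_terry and the toy judgments are hypothetical.

```python
import math

def fit_bradley_terry(n_items, judgments, iterations=500):
    """Estimate log-abilities from (winner, loser) index pairs.

    Uses the classic iterative (minorisation-maximisation) update for the
    Bradley-Terry model. Assumes every item wins and loses at least once.
    """
    wins = [0] * n_items
    losses = [0] * n_items
    pair_counts = {}
    for winner, loser in judgments:
        wins[winner] += 1
        losses[loser] += 1
        pair = (min(winner, loser), max(winner, loser))
        pair_counts[pair] = pair_counts.get(pair, 0) + 1
    if 0 in wins or 0 in losses:
        raise ValueError("each item needs at least one win and one loss")

    strength = [1.0] * n_items
    for _ in range(iterations):
        denom = [0.0] * n_items
        for (i, j), count in pair_counts.items():
            share = count / (strength[i] + strength[j])
            denom[i] += share
            denom[j] += share
        strength = [wins[i] / denom[i] for i in range(n_items)]
        # Fix the scale: force the mean log-ability to zero.
        log_mean = sum(math.log(s) for s in strength) / n_items
        strength = [s / math.exp(log_mean) for s in strength]
    return [math.log(s) for s in strength]

# Hypothetical judgments for three performances (0, 1, 2);
# each pair is compared three times.
toy_judgments = [
    (0, 1), (0, 1), (1, 0),
    (1, 2), (1, 2), (2, 1),
    (0, 2), (0, 2), (2, 0),
]
abilities = fit_bradley_terry(3, toy_judgments)
ranking = sorted(range(3), key=lambda i: abilities[i], reverse=True)
print(ranking)
```

In this toy data, performance 0 wins most often and performance 2 least often, so the estimated abilities order the performances accordingly.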


Methods

In four face-to-face examinations, the dance performances of 61 student teachers (82% female) were each assessed by two lecturers drawn from a group of five (n = 5; mean age = 50 years; 3 women, 2 men). Each lecturer scored independently on an 18-point scale in five evaluation categories (e.g., technical quality). In addition, the dance performances were videotaped, and the same five lecturers assessed them again using the ACJ tool Comproved. Interrater agreement and reliability of the traditional assessment were analyzed with the intraclass correlation coefficient (ICC; Sato, 2022). The reliability of the ACJ was analyzed with the scale separation reliability (SSR; Verhavert et al., 2019). Spearman’s rank correlation was used to test whether the rankings produced by the two assessment methods are related. To assess the validity of the assessment methods, a focus group interview was conducted with the lecturers involved in the study.
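The two reliability coefficients named above can be sketched as follows. Both functions rest on assumptions: the abstract does not say which ICC form was used, so a one-way random-effects ICC is shown as one plausible choice when each performance is scored by two of several lecturers, and the SSR is computed as the share of observed ability variance that is not error variance. The function names and all numbers are hypothetical.

```python
from statistics import mean, variance

def icc_oneway(scores):
    """One-way random-effects ICC for a list of rows (one row per performance).

    Returns (single-rater ICC, average-of-k-raters ICC).
    Assumes every row has the same number of ratings k >= 2.
    """
    n = len(scores)
    k = len(scores[0])
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    average = (ms_between - ms_within) / ms_between
    return single, average

def scale_separation_reliability(abilities, standard_errors):
    """SSR = (observed variance - mean squared standard error) / observed variance."""
    observed_var = variance(abilities)  # sample variance, an assumed choice
    error_var = mean(se ** 2 for se in standard_errors)
    return (observed_var - error_var) / observed_var

# Hypothetical scores: five performances, two lecturers each.
toy_scores = [[10, 11], [14, 14], [8, 9], [16, 15], [12, 12]]
icc_single, icc_average = icc_oneway(toy_scores)

# Hypothetical ACJ ability estimates with their standard errors.
ssr = scale_separation_reliability([-2, -1, 0, 1, 2], [0.5] * 5)
print(round(icc_single, 3), round(icc_average, 3), round(ssr, 3))
```

Both coefficients approach 1 as disagreement between raters (ICC) or estimation error (SSR) becomes small relative to the spread between performances.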

Results and Discussion

Both assessment methods show high reliability: very high for the TA (ICC = 0.974, 95% confidence interval (CI): 0.955-0.985, p < .001) and high for the ACJ (SSR = 0.83, Ability [-5.97, 4.99], Misfit [-1.68, 1.34]). The ICC of the TA is notably higher than comparable results in dance research (Sato, 2022), which raises doubts as to whether the lecturers really scored independently of each other in the face-to-face examinations. The ranked results of the two methods are very strongly correlated (Spearman’s rho: rs -.818, p < .001). However, detailed analyses reveal some differences: the answer to the question of who delivered the best dance performance depends on the assessment method. In addition, in the traditional assessment, many scores cluster at the value at which the dance examination is just passed (10 points). The results of the focus group interview are still being analyzed and will be presented at the conference.
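A strong rank correlation and a different "best" performance can coexist, as the following minimal sketch shows. It computes Spearman's rank correlation with average ranks for ties and then compares the top-ranked performance under each method. All scores are made up for illustration and are not the study's data; the function names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 (lowest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the (average) ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data for six performances.
ta_scores = [10, 10, 12, 15, 17, 16]               # traditional scores, tie at 10
acj_abilities = [-1.2, -0.8, 0.1, 0.9, 1.5, 2.0]   # ACJ ability estimates

rho = spearman_rho(ta_scores, acj_abilities)
best_ta = max(range(len(ta_scores)), key=lambda i: ta_scores[i])
best_acj = max(range(len(acj_abilities)), key=lambda i: acj_abilities[i])
print(round(rho, 3), best_ta, best_acj)
```

In this toy example the two rankings agree closely overall, yet the top position goes to a different performance under each method.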


References

Bartholomew, S. R., & Jones, M. D. (2022). A systematized review of research with adaptive comparative judgment (ACJ) in higher education. International Journal of Technology and Design Education, 32(2), 1159-1190. https://doi.org/10.1007/s10798-020-09642-6

Sato, N. (2022). Improving reliability and validity in hip-hop dance assessment: Judging standards that elevate the sport and competition. Frontiers in Psychology, 13, Article 934158. https://doi.org/10.3389/fpsyg.2022.934158

Verhavert, S., Bouwer, R., Donche, V., & De Maeyer, S. (2019). A meta-analysis on the reliability of comparative judgement. Assessment in Education: Principles, Policy & Practice, 26(5), 541-562. https://doi.org/10.1080/0969594X.2019.1602027

How to Cite
Jeisy, E. (2024). Who’s better? Adaptive comparative judgment of dance performances. Current Issues in Sport Science (CISS), 9(2), 051. https://doi.org/10.36950/2024.2ciss051