2466 - MIRROR, MIRROR, ON THE WALL: WHO DELIVERS FEEDBACK BEST OF ALL?

Session: D01S038 - Artificial Intelligence at Work 3
AUTHORS:
Gardner Aimee (University of Colorado School of Medicine ~ Aurora ~ United States of America) , Michalsen Kara (University of Colorado School of Medicine ~ Aurora ~ United States of America) , Carlson Clint (University of Colorado School of Medicine ~ Aurora ~ United States of America) , Czaja Angela (University of Colorado School of Medicine ~ Aurora ~ United States of America) , Kiger Michelle (University of Colorado School of Medicine ~ Aurora ~ United States of America) , Bilyeu Catherine (University of Manitoba ~ Thompson ~ Canada) , Lockspeiser Tai (University of Colorado School of Medicine ~ Aurora ~ United States of America)
Abstract text:
Introduction


We investigated the accuracy of feedback delivery scores generated by artificial intelligence (AI)-powered avatars in comparison to feedback delivery ratings provided by supervisors and peer observers.


Methods
Supervisors attended a two-hour professional development workshop focused on enhancing feedback delivery skills. After a 20-minute lecture, participants worked in groups of four and took turns providing feedback to the avatar while the remaining participants observed. After each scenario, all participants and the avatar independently rated the feedback delivery skills using the DOCS-FBS feedback delivery scale (1-3 scale; 1 = not done; 3 = successfully done). Scenario complexity was rated on a 9-item scale (1-4; 4 = most complex). Paired-samples t-tests with post-hoc Tukey tests were used to examine differences between groups.
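As a rough illustration of the paired-samples comparison described above, the sketch below computes the t statistic and degrees of freedom for two sets of ratings of the same feedback encounters. The function name `paired_t` and the scores are hypothetical and are not study data; the post-hoc Tukey step and the p-value lookup are not shown.

```python
import math
import statistics


def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for a paired-samples t-test."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Work on the per-pair differences, since each pair rates the same encounter.
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_d == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1


# Hypothetical mean DOCS-FBS scores (1-3 scale) for six feedback encounters.
# These values are invented for illustration only.
peer_scores = [3.0, 2.9, 2.8, 3.0, 2.9, 3.0]
self_scores = [2.6, 2.7, 2.5, 2.9, 2.4, 2.8]

t_stat, df = paired_t(peer_scores, self_scores)
print(f"t({df}) = {t_stat:.2f}")
```

A positive t here would indicate that the first set of ratings (peers, in this invented example) is higher on average than the second; the statistic would then be compared against a t distribution with `df` degrees of freedom.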


Results


Twenty-four supervisors with an average of 10 (±5) years in practice participated across six groups. A total of 120 feedback evaluations were collected across 24 unique avatar interactions. Scenario complexity ratings ranged from 1.89 to 3.11, with an average scenario complexity rating of 2.50 (±0.46). In each of the four scenarios, no differences emerged between self and avatar feedback delivery ratings. For the two scenarios with above-average complexity ratings, however, peer ratings were higher than both self ratings (scenario 3: 2.91±0.08 vs 2.47±0.42, p<0.05; scenario 4: 2.97±0.03 vs 2.80±0.16, p<0.01) and avatar ratings (scenario 4: 2.97±0.03 vs 2.74±0.22, p<0.05). No differences emerged between self and peer ratings or avatar and peer ratings for either of the below-average complexity scenarios.


Discussion


Feedback delivery ratings produced by the supervisors themselves, their peers, and the avatar platform were similar the majority (75%) of the time. With higher scenario complexity, however, peers provided higher feedback delivery ratings than the supervisors themselves or the avatar. Our study supports the value of using AI-powered avatars to simulate the feedback delivery process, giving supervisors an opportunity to rehearse and refine their feedback skills in a safe, convenient, repeatable setting.