This paper presents a study where a 3D motion-capture animated ‘virtual speaker’ is compared to a video of a real speaker with regards to how it facilitates children's speech comprehension of narratives in background multitalker babble noise. As secondary measures, children self-assess the listening- and attentional effort demanded by the task, and associates words describing positive or negative social traits to the speaker. The results show that the virtual speaker, despite being associated with more negative social traits, facilitates speech comprehension in babble noise compared to a voice-only presentation but that the effect requires some adaptation. We also found the virtual speaker to be at least as facilitating as the video. We interpret these results to suggest that audiovisual integration supports speech comprehension independently of children's social perception of the speaker, and discuss virtual speakers’ potential in research and pedagogical applications.