Ting-Hao ‘Kenneth’ Huang, associate professor in the College of IST, will present “What Roles Will Humans Play in the Future of Data Annotation” to the Department of Computer Science (CS) and the Center for Language and Speech Processing (CLSP) at Johns Hopkins University. The talk will take place on April 11 as part of the university’s CS/CLSP Spring 2025 Seminar Series.
Large language models (LLMs) have rapidly and impressively taken over many tasks once handled by human annotators in constructing text datasets. But does this mean we no longer need humans in the annotation loop? What roles should humans play in future data annotation pipelines?
“In this talk, I will present two recent studies that explore the evolving role of humans in the landscape of text data annotation,” Huang said. “First, we ask whether a well-designed and carefully executed traditional crowdsourcing pipeline can still outperform LLMs in labeling quality. Our study offers an in-depth and holistic comparison between human and LLM annotation performance.
“Second, we turn to a future where LLMs increasingly replace manual annotation labor. In this scenario, the human role shifts toward instructing the models, often through prompting. But how effective are humans at prompting LLMs for annotation tasks, especially when working without access to gold-standard labels? We investigate this growing practice, which we call ‘prompting in the dark,’ and assess its implications for the quality and reliability of LLM-generated annotations.”