
Summer Research Discussion Event — Series 2 🗓


Title: Assessing Demographic Bias in Named Entity Recognition
Authors: Shubhanshu Mishra, Sijun He, Luca Belli
Named Entity Recognition (NER) is often the first step towards automated Knowledge Base (KB) generation from raw text. In this work, we assess the bias in various NER systems for English across different demographic groups with synthetically generated corpora. Our analysis reveals that models perform better at identifying names from specific demographic groups across two datasets. We also find that debiased embeddings do not help in resolving this issue. Finally, we observe that character-based contextualized word representation models such as ELMo result in the least bias across demographics. Our work can shed light on potential biases in automated KB generation due to systematic exclusion of named entities belonging to certain demographics.
Paper link: (published version will be available soon)

Speaker bio:
Shubhanshu Mishra is a Machine Learning Researcher on the Content Understanding Research team at Twitter, Inc. He completed his Ph.D. at the iSchool, University of Illinois at Urbana-Champaign, where he was advised by Dr. Jana Diesner and Dr. Vetle I. Torvik. His thesis was titled "Information Extraction from Digital Social Trace Data with Applications to Social Media and Scholarly Communication Data". His research focuses on improving information extraction tasks and on analyzing the extracted information for social patterns. He completed his Integrated Bachelor's and Master's degree in Mathematics and Computing at the Indian Institute of Technology, Kharagpur in 2012. He was a fellow of Kishor Vaigyanik Protsahan Yojana (KVPY), a scholarship program funded by the Department of Science and Technology of the Government of India, from 2007 to 2012. More information about his work can be found at:

Scheduled Past Events