Gender bias in neural natural language processing

From MaRDI portal
Publication:2038005

DOI: 10.1007/978-3-030-62077-6_14
zbMATH Open: 1465.68260
arXiv: 1807.11714
OpenAlex: W3128232076
Wikidata: Q110953800
Scholia: Q110953800
MaRDI QID: Q2038005
FDO: Q2038005


Authors: Kaiji Lu, Piotr Mardziel, Fangjing Wu, Preetam Amancharla, Anupam Datta


Publication date: 8 July 2021

Abstract: We examine whether neural natural language processing (NLP) systems reflect historical biases in training data. We define a general benchmark to quantify gender bias in a variety of neural NLP tasks. Our empirical evaluation with state-of-the-art neural coreference resolution and textbook RNN-based language models trained on benchmark datasets finds significant gender bias in how models view occupations. We then mitigate bias with counterfactual data augmentation (CDA): a generic methodology for corpus augmentation via causal interventions that breaks associations between gendered and gender-neutral words. We empirically show that CDA effectively decreases gender bias while preserving accuracy. We also explore the space of mitigation strategies with CDA, a prior approach to word embedding debiasing (WED), and their compositions. We show that CDA outperforms WED, drastically so when word embeddings are trained. For pre-trained embeddings, the two methods can be effectively composed. We also find that, as training proceeds on the original data set with gradient descent, gender bias grows as the loss decreases, indicating that the optimization encourages bias; CDA mitigates this behavior.
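The corpus-augmentation idea in the abstract can be sketched in a few lines: for each sentence, generate a counterfactual copy in which gendered words are swapped with their opposite-gender counterparts, leaving gender-neutral words (such as occupations) untouched, then train on the union of the original and augmented corpora. The word-pair list below is a small illustrative subset, not the paper's actual intervention set; a real implementation must also disambiguate words like "her" (possessive vs. objective), which this sketch sidesteps by omitting such pairs.

```python
# Minimal sketch of counterfactual data augmentation (CDA).
# GENDER_PAIRS is a tiny illustrative subset of gendered word pairs,
# not the full set used in the paper.
GENDER_PAIRS = [("he", "she"), ("him", "her"),
                ("man", "woman"), ("men", "women"),
                ("father", "mother"), ("son", "daughter")]

# Bidirectional swap table built from the pairs above.
SWAP = {}
for a, b in GENDER_PAIRS:
    SWAP[a] = b
    SWAP[b] = a

def counterfactual(sentence: str) -> str:
    """Swap each gendered token; gender-neutral words (e.g. occupations
    like 'doctor') are left as-is, which breaks the statistical
    association between gendered and gender-neutral words."""
    return " ".join(SWAP.get(t.lower(), t) for t in sentence.split())

def augment(corpus):
    """CDA: union of the original corpus and its counterfactual copy."""
    return corpus + [counterfactual(s) for s in corpus]

corpus = ["he is a doctor", "she is a nurse"]
print(augment(corpus))
# ['he is a doctor', 'she is a nurse', 'she is a doctor', 'he is a nurse']
```

In the augmented corpus, "doctor" and "nurse" co-occur equally often with male and female pronouns, which is the intended causal intervention.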


Full work available at URL: https://arxiv.org/abs/1807.11714





Cited In (3)





This page was built for publication: Gender bias in neural natural language processing
