Studying names, pronouns and gender with large language models
gender phd nlp by Vagrant GautamAn introduction to how names and pronouns are related to gender, and how my research investigates that with large language models.
Pronouns and names
Vagrant told Vagrant's friend that Vagrant's landlord made Vagrant's landlord's famous onion soup, and that Vagrant's landlord's famous onion soup tasted great.
That was probably painful to read, right? That's why we have names and pronouns!
Vagrant told Vlad that xyr landlord made his famous onion soup, and that it tasted great.
We use names to refer to a specific individuals directly in conversation, and we use pronouns as shorthand to stand in for a noun or a noun phrase, including names. Especially when talking about people, this is useful so we don't have to repeat someone's job or their relation to us over and over again.
We even have conventions around when to use different types of referring expressions while referring to the same person. Consider the two examples below, which are identical except for a single highlighted word:
Tomorrow I'm going to catch up with a friend I haven't spoken to in a while, Xanda. She works at Harvey Mudd and I've always been a fan of her cool hair colours.
Tomorrow I'm going to catch up with a friend I haven't spoken to in a while, Xanda. She works at Harvey Mudd and I've always been a fan of Xanda's cool hair colours.
The first clearly sounds better, doesn't it? That's because I already switched to pronoun mode for Xanda when I told you she works at Harvey Mudd and so you expect that to continue, even if you didn't know that consciously. When I instead switch back to using Xanda's name in the second example, it's unexpected and can even be a bit confusing.
Pronouns, names and gender
If you have no linguistics background, you probably didn't consciously know that you had these intuitions about when to use a name and when to use a pronoun. Indeed, both names and pronouns do a lot of linguistic heavy lifting without us paying much attention to them. Consider the following sentence:
Rory ran into Jess and he told her about his work.
Who told whom about his work? You might have inferred that the 'he' user is probably a man, that 'Rory' is more likely to be a man's name than 'Jess' (often a nickname for Jessica) is, and therefore that the most likely scenario is that Rory told Jess about his work. All those inferences and connections happened in the timespan of you reading that sentence.
We make these inferences about social gender because names are often assigned or chosen in gender-associated ways, and because third person singular English pronouns[1] mark grammatical gender which is also often associated with a social gender. Our inferences about social gender via names and pronouns are actually very important to create and follow conversations about people in a way that disambiguates different individuals.
However, the heuristics we use to infer gender can sometimes trip us up, since neither names nor pronouns map one-to-one to gender. You will likely already know this if you know queer/trans/non-binary people.[2] More broadly, we're more likely to infer wrongly when we lack context about the people and the situation. For example, if you've watched Gilmore Girls like I just did, you might recognize the names in my previous example and realize that Rory (Lorelai Gilmore) is a girl and Jess (Jess Mariano) is a guy.
Pronouns, names, gender, and natural language processing
So, to summarize, names and pronouns help us keep track of people in conversations, without having to specify someone's relation to us over and over again, we have a lot of complex conventions in how we use them to create and follow discussions about individuals, and they frequently signal things about gender, which can sometimes trip us up.
As someone who works on natural language processing (NLP, or making computers do things with language), some questions that I'm interested in are how NLP systems deal with names, pronouns and gender. I find this interesting from a number of perspectives, including
- Fairness, i.e., wanting systems to treat people fairly and not cause harms, including misgendering and misnaming people
- Linguistics, i.e., how do systems create or understand coherent discourse involving the use of names, pronouns and other referring expressions (e.g., someone's occupation or relation to us)
- Meta-level research practices, i.e., when we want to measure things about "gender" or "race" and instead use pronouns and names as a stand-in for that, how much do we miss?
Three of my papers this year explore the answers to some of these questions. I hope to write a post about each of them for which this post will serve as a general introduction, but for now, here are the questions each paper tackles:
When do names say something about a person's gender, race, and other sociodemographic characteristics, and how should this inform our use of names to infer these characteristics?
Vagrant Gautam, Arjun Subramonian, Anne Lauscher, Os Keyes.
Workshop on Gender Bias in Natural Language Processing, 2024.
How well do large language models reuse pronouns for an individual when we're talking about two people who use different pronouns?
Vagrant Gautam, Eileen Bingert, Dawei Zhu, Anne Lauscher, Dietrich Klakow.
Transactions of ACL, 2024.
How reliably can large language models map different kinds of pronouns to the right people? How should we map pronouns to (social) gender?
Vagrant Gautam, Julius Steuer, Eileen Bingert, Ray Johns, Anne Lauscher, Dietrich Klakow.
Workshop on Computational Models of Reference, Anaphora and Coreference, 2024.
In English, it's just third person singular pronouns that mark grammatical gender. In German, nouns and adjectives carry this information too, and Italian even marks gender on verbs. ↩︎
I use xe/xem and they/them pronouns and I chose the name Vagrant, an English noun with no obvious gendered associations, to convey my not-a-girl-and-not-a-boy-ness. ↩︎
- Previous post: (My!) class on concepts in NLP
- Back to the archive