The researchers found some intriguing differences between how men and women respond to using ChatGPT. After using the chatbot for four weeks, female study participants were slightly less likely to socialize with people than their male counterparts who did the same. Meanwhile, participants who set ChatGPT’s voice mode to a gender that was not their own for their interactions reported significantly higher levels of loneliness and more emotional dependency on the chatbot at the end of the experiment. OpenAI currently has no plans to publish either study.
Chatbots powered by large language models are still a nascent technology, and it’s difficult to study how they affect us emotionally. A lot of existing research in the area—including some of the new work by OpenAI and MIT—relies upon self-reported data, which may not always be accurate or reliable. That said, this latest research does chime with what scientists so far have discovered about how emotionally compelling chatbot conversations can be. For example, in 2023 MIT Media Lab researchers found that chatbots tend to mirror the emotional sentiment of a user’s messages, suggesting a kind of feedback loop where the happier you act, the happier the AI seems, or on the flipside, if you act sadder, so does the AI.
OpenAI and the MIT Media Lab used a two-pronged method. First they collected and analyzed real-world data from close to 40 million interactions with ChatGPT. Then they asked the 4,076 users who’d had those interactions how they made them feel. Next, the Media Lab recruited almost 1,000 people to take part in a four-week trial. This was more in-depth, examining how participants interacted with ChatGPT for a minimum of five minutes each day. At the end of the experiment, participants completed a questionnaire to measure their perceptions of the chatbot, their subjective feelings of loneliness, their levels of social engagement, their emotional dependence on the bot, and their sense of whether their use of the bot was problematic. They found that participants who trusted and “bonded” with ChatGPT more were likelier than others to be lonely, and to rely on it more.
This work is an important first step toward greater insight into ChatGPT’s impact on us, which could help AI platforms enable safer and healthier interactions, says Jason Phang, an OpenAI policy researcher who worked on the project.
“A lot of what we’re doing here is preliminary, but we’re trying to start the conversation with the field about the kinds of things that we can start to measure, and to start thinking about what the long-term impact on users is,” he says.
Although the research is welcome, it’s still difficult to identify when a human is—and isn’t—engaging with technology on an emotional level, says Devlin. She says the study participants may have been experiencing emotions that weren’t recorded by the researchers.
“In terms of what the teams set out to measure, people might not necessarily have been using ChatGPT in an emotional way, but you can’t divorce being a human from your interactions [with technology],” she says. “We use these emotion classifiers that we have created to look for certain things—but what that actually means to someone’s life is really hard to extrapolate.”