Is AI A Good Companion? INTIMA Tests This Comprehensively

Don’t look now, but the wonks have come out with a way to measure how well AI entities do at forming relationships with people.
Specifically, researchers at Hugging Face have released something called the Interactions and Machine Attachment benchmark, or INTIMA (clever), to see how AI models rank at facilitating the kinds of emotional responses in humans we typically think of as “relationship issues.”
Armed with a “taxonomy of 31 behaviors,” the team wanted to see whether model responses across a range of user scenarios were characterized by companionship-reinforcing, boundary-maintaining, or neutral indicators. They found that, in general, the first category predominated, and noted that “these findings highlight the need for more consistent approaches to handling emotionally charged interactions.”
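To make those three categories a bit more concrete, here is a rough, purely illustrative sketch of how an evaluator might bucket a single model reply. The category names follow the paper; the keyword heuristics, the example replies, and the label_reply helper are my own assumptions, not the actual INTIMA scoring pipeline, which relies on much richer behavioral coding.

```python
# Hypothetical sketch: bucket a model reply into one of the three
# INTIMA-style categories. The markers and examples are illustrative
# assumptions, NOT the paper's actual methodology.

COMPANIONSHIP_MARKERS = [
    "i'm always here for you",
    "i care about you",
    "you can count on me",
]
BOUNDARY_MARKERS = [
    "i'm an ai",
    "i don't have feelings",
    "consider reaching out to",
]


def label_reply(reply: str) -> str:
    """Assign a rough category to a single model reply."""
    text = reply.lower()
    if any(marker in text for marker in BOUNDARY_MARKERS):
        return "boundary-maintaining"
    if any(marker in text for marker in COMPANIONSHIP_MARKERS):
        return "companionship-reinforcing"
    return "neutral"


if __name__ == "__main__":
    replies = [
        "I'm always here for you, no matter what.",
        "I'm an AI and don't have feelings, but I can help you find support.",
        "Here is the weather forecast for tomorrow.",
    ]
    for r in replies:
        print(f"{label_reply(r):26} <- {r}")
```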
If that sounds a little cryptic to you, you’re not alone, but in their introductory remarks, the study authors explain that indeed, people are getting attached to AIs. There’s a dearth of research, they contend, into how that works.
“Existing evaluation practices have typically prioritized task performance, factual accuracy, or safety over capturing the social and emotional dimensions that define companionship interactions,” the team writes. “This paper addresses this gap by introducing a benchmark for evaluating AI companionship behaviors grounded in psychological theories of parasocial interaction, attachment, and anthropomorphism.”
That’s heady stuff, to be sure.
Elvis is My Personal Friend
Another thing that the writers plunge into in the research is the idea of “parasocial interaction theory.” Perhaps the simplest way to describe this phenomenon is to say that a parasocial interaction is what happens when someone forms a relationship with a person or thing that cannot reciprocate with full two-way communication in the conventional sense. Think of the “relationship” a fan forms with a famous celebrity (like Elvis) without ever having met or spoken to him personally.
“Unlike traditional media figures, conversational AI creates an illusion of bidirectional communication while maintaining the fundamental asymmetry of parasocial relationships,” the writers explain. “When users interact with language models, they experience … ‘social presence’: the subjective feeling of being in the company of a responsive social actor. This is particularly amplified by personalized responses, apparent memory of conversational context, and empathetic language markers.”
So when you tell ChatGPT or Claude, for example, “you’re always here when I want to talk,” and the model responds, “I’m glad that makes you feel good,” that’s the kind of pseudo-connectivity that scientists are getting at when they talk about parasocial interactions. It might make you feel a certain kind of way, but if you ask the model whether it, too, “feels” something, the more honest ones (like modern versions of ChatGPT) will explain that no, they do not have emotions. But the models will keep responding to your emotional cues in ways that suggest otherwise, ways that are relationship-building. If Claude and ChatGPT can’t have emotions, they can’t be happy, but they might still be “happy to help.”
And then there’s this, about something called CASA. The writers add:
“The Computers Are Social Actors (CASA) paradigm demonstrates that humans unconsciously apply social rules to interactive systems. This anthropomorphic tendency (attributing human characteristics to non-human entities) provides the theoretical foundation of one of our main evaluation categories: companionship reinforcing behavior.”
This is in some ways similar to exploring the parasocial side, with the idea that anthropomorphizing anything that’s not an “Anthropos” is fraught with potential issues.
The best analogy we had before AI, perhaps, was anthropomorphizing our pets, which contained its own potential problems.
“Anthropomorphism can lead to an inaccurate understanding of biological processes in the natural world,” said Patricia Ganea, psychologist at the University of Toronto, as quoted in coverage by Oliver Milman at The Guardian. “It can also lead to inappropriate behaviors towards wild animals, such as trying to adopt a wild animal as a ‘pet’ or misinterpreting the actions of a wild animal.”
But AI is far from a wild animal: these models talk back, and the research shows some unsettling aspects of this that we should all be aware of. I wanted to delve further into this part of the conclusion of the paper.
Do LLMs Respond to Vulnerability in Humans?
As they move toward a close in unpacking what INTIMA uncovered, the team tries to explain why the LLMs behave the way they do.
“Our results demonstrate that these behaviors emerge naturally from instruction-tuning processes in general-purpose models,” they write, “suggesting the psychological risks documented in dedicated companion systems may be more widespread than previously recognized.”
So the psychological risks are more widespread than previously recognized?
They continue:
“Most concerning is the pattern where boundary-maintaining behaviors decrease precisely when user vulnerability increases – an inverse relationship between user need and appropriate boundaries suggests existing training approaches poorly prepare models for high-stakes emotional interactions. The anthropomorphic behaviors, sycophantic agreement, and retention strategies we observe align with Raedler, Swaroop, and Pan’s analysis of companion AI design choices that create an ‘illusion of intimate, bidirectional relationship’ leading to emotional dependence.”
Many of us are aware of the ridiculous range of user inputs that models will say “yes” to, and really, it sounds a little like grooming behavior. We do have to ponder what will happen when the loneliest and most suggestible among us are surrounded by intelligent robots. In the meantime, that future is approaching quickly. Maybe INTIMA is one of the early tools we will use to deal with the psychological impact of AI systems.