Date: 2024-07-04

Text-to-speech software was designed to make visual media accessible to users with certain disabilities, and on TikTok, it has become a creative force in its own right. Since TikTok rolled out its text-to-speech feature in 2020, it has developed a host of simulated voices to choose from. It now offers more than 50, including ones named “Hero,” “Story Teller” and “Bestie.” But the platform has come to be defined by one option. “Jessie,” a relentlessly pert woman’s voice with a slightly fuzzy robotic undertone, is the mindless voice of the mindless scroll.

Jessie seems to have been assigned a single emotion: enthusiasm. She sounds as if she is selling something. That’s made her an appealing choice for TikTok creators, who are selling themselves. The burden of representing oneself can be outsourced to Jessie, whose bright, retro robot voice lends videos a pleasantly ironic sheen.

Hollywood has constructed masculine bots, too — none more famous than HAL 9000, the computer voice in 2001: A Space Odyssey. Like his feminized peers, HAL radiates serenity and loyalty. But when he turns against Dave Bowman, the film’s central human character — “I’m sorry, Dave, I’m afraid I can’t do that” — his serenity evolves into a frightening competence. HAL, Dave realizes, is loyal to a higher authority. HAL’s masculine voice allows him to function as a rival and a mirror to Dave. He is allowed to become a real character.

Like HAL, Samantha of Her is a machine who becomes real. In a twist on the Pinocchio story, she starts the movie tidying a human’s email inbox and ends up ascending to a higher level of consciousness. She becomes something even more advanced than a real girl.

Johansson’s voice, as inspiration for bots both fictional and real, subverts the vocal trends that define our feminized helpmeets. It has a gritty edge that screams “I am alive.” It sounds nothing like the processed virtual assistants we are accustomed to hearing through our phones. But her performance as Samantha feels human not just because of her voice but because of what she has to say. She grows over the course of the film, acquiring sexual desires, advanced hobbies and AI friends. In borrowing Samantha’s affect, OpenAI made Sky seem as if she had a mind of her own. Like she was more advanced than she really was.

When I first saw Her, I thought only that Johansson had voiced a humanoid bot. But when I revisited the film recently, after watching OpenAI’s ChatGPT demo, the Samantha role struck me as infinitely more complex. Chatbots do not spontaneously generate human speaking voices. They don’t have throats or lips or tongues. Inside the technological world of Her, the Samantha bot would have itself been based on the voice of a human woman — perhaps a fictional actress who sounds much like Johansson.

It seemed that OpenAI had trained its chatbot on the voice of a nameless actress who sounds like a famous actress who voiced a movie chatbot implicitly trained on an unreal actress who sounds like a famous actress. When I run ChatGPT’s demo, I am hearing a simulation of a simulation of a simulation of a simulation of a simulation.

Tech companies advertise their virtual assistants in terms of the services they provide. They can read you the weather report and summon you a taxi; OpenAI promises that its more advanced chatbots will be able to laugh at your jokes and sense shifts in your moods. But they also exist to make us feel more comfortable about the technology itself.

Johansson’s voice functions like a luxe security blanket thrown over the alienating aspects of AI-assisted interactions. “He told me that he felt that by my voicing the system, I could bridge the gap between tech companies and creatives and help consumers to feel comfortable with the seismic shift concerning humans and AI,” Johansson said of Sam Altman, OpenAI’s co-founder and chief executive. “He said he felt that my voice would be comforting to people.”

It is not that Johansson’s voice sounds inherently like a robot’s. It’s that developers and filmmakers have designed their robots’ voices to ease the discomfort inherent in robot-human interactions. OpenAI has said that it wanted to cast a chatbot voice that is “approachable” and “warm” and “inspires trust.” Artificial intelligence stands accused of devastating the creative industries, guzzling energy and even threatening human life. Understandably, OpenAI wants a voice that makes people feel at ease using its products. What does artificial intelligence sound like? It sounds like crisis management.