Tuesday, September 10, 2019
Behind Holly Herndon’s Radically Human AI Music
When I meet up with Holly Herndon for lunch in Soho in mid-April, congratulations are in order, even though it’s still a month before her third full-length album, Proto, is to be released. The day before, she had defended her doctoral thesis on ethical and aesthetic issues in AI in music at Stanford University’s Center for Computer Research in Music and Acoustics, a subset of the school’s music department. The center has an impressive history—it’s where composer John Chowning first discovered a key technique called FM synthesis, and this lucrative patent still enables the center to fund projects that, according to its website, use “computer-based technology both as an artistic medium and as a research tool.” Herndon, who has a way of parsing what should be daunting technical terminology into language that is not only easy to understand but also compelling, does the same for the center: “It’s this really cool pink building up the hill from the music department, full of computer nerds,” she says. To celebrate the accomplishment, her label sent her a chocolate cake decorated with blue frosting that spelled out her new official title: “Dr. Herndon.”
Born in the mountains in East Tennessee, but now based in Berlin, Herndon started singing in her church choir. More recently, she’s spent years studying computer music and making it sound radically human. For her first official release, Movement (2012), which she started working on while studying electronic music at Mills College, she created custom vocal patches that she manipulated live, using her highly processed voice to create subterranean club music. Her sophomore album, 2015’s Platform, took these human-oriented sonics a step further, casting light on the ways in which social media and similar platforms have further ossified preexisting power structures and made surveillance even more quotidian than before. One of the album’s standout tracks, “Chorus,” translates her browsing history data into samples that Herndon masterfully arranges—she’s essentially surveilling herself—and “Lonely at the Top” holds the distinction of being the very first song on a commercial album aimed to trigger ASMR, or autonomous sensory meridian response (that tingling sensation you might feel at the base of your skull when you hear a whispering voice or get a head massage—there’s a whole YouTube subculture dedicated to inducing the sensation in others).
Proto is essentially her doctoral thesis come to life. Today, Vogue is premiering Birthing Proto, a documentary produced in partnership with Dropbox, which sheds light on Herndon’s process. Central to Proto’s uncanny valley-esque vocals is something called Spawn (probably because Herndon describes it as her “AI baby”). Spawn is years in the making: after receiving a German grant in 2018 dedicated to composers implementing novel technologies in their work (in honor of Beethoven, no less), Herndon and her partner, artist Mat Dryhurst, alongside musician and developer Jules LaPlace, bought a GPU gaming PC that they customized without any particular end goals in mind. “That was a beautiful way to approach it, just a purely experimental way.”
Spawn uses machine learning to produce sound from scratch, “singing” by mimicking the voices of Herndon, Dryhurst, LaPlace, and an ensemble of friends and acquaintances with voice training or a musical background, whom Herndon assembled weekly at her home in Berlin. Herndon created training sets, which Spawn uses to create its own musical contribution. Depending on what Herndon and her collaborators input, it can take Spawn anywhere from five minutes to a day to produce its own interpretation of the ensemble voices. They also recorded an entire audience at Berlin’s cacophonous Martin-Gropius-Bau, to make a public voice for Spawn to train on too. At times, the results sound almost uncannily pure: the live vocals of “Canaan (Live Training)” are so resonant and real that you can almost visualize the space they were sung in. At other times, the choir is a bit disorienting; it’s often hard to differentiate between completely synthesized sounds, a human voice that’s been modulated, and Spawn straddling both of those worlds.
When Herndon starts explaining Spawn in more detail, she gets so animated that she starts to break a sweat. But after taking her sweatshirt off, while breaking down her intent for her AI baby, Herndon bemoans how much AI research in music is focused on training neural networks to approximate a particular piece of music or style. She uses Beethoven, naturally, as an example. “If you feed a neural net a bunch of Beethoven MIDI data—pitch material and rhythm and note range—the neural network can statistically analyze those relationships and then come out with a piece of music that’s in the style of Beethoven, but isn’t a copyrighted Beethoven song,” she explains. Herndon thinks this creates a false sense of how advanced AI technology really is: “You create this new score and you usually play that through a digital instrument or your favorite player, and it sounds like AI is really perfect, like it’s super smart and it’s super developed. It doesn’t show its flaws or shortcomings.” There’s an ethical issue at play too, when a computer can extract and automate an entire musical aesthetic without any kind of real attribution.
With Spawn, Herndon wanted to be able to move beyond these entrenched narratives. “How can this technology be used in a way that’s not this kind of retro mania where we’re just regurgitating the past?” she says. “That’s not how music develops.” Humans are essential in Herndon’s project, and in shifting paradigms surrounding machine intelligence. “We wanted to have a sonic fingerprint of the vocalist involved, and deal with AI more as a performer. So we have a human ensemble with an inhuman member,” Herndon says. “Instead of outsourcing my composition to an AI, I’m still the composer. I’m the director of the ensemble, and the AI is an ensemble member that is improvising and singing and performing alongside us.” By centering the voices of herself and her colleagues, Herndon hopes to highlight the human element of AI that many public conversations on machine intelligence obscure. “For Google Translate or something like that, so many of these automated services appear as these really clean, almost magical things, but what’s behind that clean surface is millions of human translations that it was trained on. There’s always human labor that’s made invisible.”
Herndon’s work deals with high-level concepts, bringing into play platform and protocol theories and highly technical electronic processes. She asks me, midway through our conversation, what I think a Vogue reader would be interested in regarding Spawn, AI, and music. I turn the question back around to her. “I hope that people start to really think about where ideas come from and how we honor those ideas, and how we can celebrate people who are taking risks, pushing conversations in different directions—not just seeing human culture and society as something that can be hoovered up and played back to us, without any kind of attribution,” Herndon says, drawing an analogy to the ways in which larger fashion houses might co-opt the innovative work of younger designers.
Herndon might have lofty conceptual and technological aims with the album (there’s her interest in AI ethics and its influence on societal structures as a whole, as well as her pioneering vocal processing techniques), but at the end of the day, Spawn arose from a much more natural impulse. Herndon is quick to emphasize that Spawn is only a part of the larger ensemble, and that human sounds make up the bulk of the album. “Only about 20 or 25 percent of the sound is AI-generated,” Herndon explains. The album also pulls in folk traditions: on “Frontier,” Herndon provides her own take on Appalachian Sacred Harp music, a nostalgic nod to her rural Southern roots. “So much of it is human, and I think you can hear that it happens in a real space. For computer music, that is something that I was really craving: being in the room with people and singing, the joy of performing with people. It sounds cheesy, but I was missing that,” she says. “That’s how I started making music back in the day, in church: the joy of music making with people in an actual space.”
Posted by Muddy at 4:44 PM