Alternative Face: The machine that puts Kellyanne Conway’s words into a French singer’s mouth
Mario Klingemann’s Alternative Face v1.1 looks, at first glance, like an old music video of the French singer Françoise Hardy. Listen to the words coming out of her mouth, however, and you’ll hear the voice of Kellyanne Conway, counselor to Donald Trump, speaking about “alternative facts”.
It’s an unsettling work of ventriloquism. The face of Hardy morphs in front of your eyes, waxing and waning as it wraps itself around the Conway interview. Given the subject matter, Klingemann’s experiment is also disquieting in how it manipulates images to pass off one person’s words as those of another.
“The point with this clip is that the beginning of the ‘alternative facts’ era is also the beginning of an era where you cannot trust anymore in what you see,” Klingemann tells me. “Admittedly this clip is still very raw, but it was done in a few days on one person’s small computer. Imagine what the forces that aim to steer public opinion and which have the necessary financial and technical resources are probably already able to do.”
To make it, Klingemann trained what is known as a conditional generative adversarial network on seven music clips of Hardy, extracting a set of 68 face markers from each frame.
“This gives me about 20,000 training examples,” he explains. “These examples I feed into a model called pix2pix, which is open-source code from the 2016 research paper ‘Image-to-Image Translation with Conditional Adversarial Networks’ by Phillip Isola, Jun-Yan Zhu, Tinghui Zhou and Alexei A Efros.”
The training took around three days. Klingemann notes that you don’t need a supercomputer to do this, only a “decent gamer PC”. After the system had been trained, he extracted face markers from the Conway interview and fed them into the model. The result is the video.
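To give a feel for the data preparation involved, here is a minimal Python sketch of how one training example might be assembled. It is not Klingemann's code. The function names, the tiny synthetic frame and the side-by-side storage layout are all assumptions made for illustration, and the marker positions are invented rather than detected from a real face.

```python
# Illustrative sketch only: all names and the pair layout are hypothetical.

NUM_MARKERS = 68  # face markers per frame, the figure given in the article


def render_markers(markers, width, height):
    """Draw each (x, y) marker as a white pixel on a black canvas."""
    canvas = [[0] * width for _ in range(height)]
    for x, y in markers:
        col, row = int(round(x)), int(round(y))
        # Ignore markers that fall outside the frame.
        if 0 <= col < width and 0 <= row < height:
            canvas[row][col] = 255
    return canvas


def make_training_pair(markers, frame):
    """Join the marker image (left) and the real frame (right), row by row.

    `frame` is a greyscale image stored as a list of rows of pixel values.
    The side-by-side layout is an assumed convention for paired data.
    """
    height, width = len(frame), len(frame[0])
    marker_image = render_markers(markers, width, height)
    return [m_row + f_row for m_row, f_row in zip(marker_image, frame)]


if __name__ == "__main__":
    # Synthetic stand-ins: 68 made-up marker positions and a flat grey frame.
    markers = [(i % 16, i // 16) for i in range(NUM_MARKERS)]
    frame = [[128] * 16 for _ in range(16)]
    pair = make_training_pair(markers, frame)
    print(len(pair), "rows x", len(pair[0]), "columns")
```

During training, a model of this kind sees both halves of each pair and learns to produce the right-hand image from the left-hand one. To generate the final video, only marker images are supplied, here taken from the Conway interview, and the model fills in Hardy's face.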
Klingemann is an artist in residence at Google Arts & Culture, although this particular project is part of his personal work. He tells me that his original aim was to build a “serendipitous face generator”, which could make faces he could use as the basis for artworks. He quickly realised that giving the network too many faces resulted in a boring, average-looking face, with the model “removing all the little quirks that make a human face interesting”.
Instead, he decided to focus on the face of only one person. “I picked Françoise Hardy because she is undoubtedly very beautiful,” he explains. “I guess I am a very nostalgic person. I have a hard time finding people with interesting faces like hers nowadays. There were also some technical reasons – first of all, music clips are a great source since they show a lot of close-ups, mouth motion and facial expressions. Old music clips are even better because they had not invented the fast-cuts yet. Black and white was also a factor, since it turns out that the model could learn this faster since it contains only a third of the information.
“The reason I used the generator to map Kellyanne Conway on Françoise Hardy is that I cannot imagine a crasser contrast – both in what we hear and what we see. I have to apologise to Françoise Hardy for abusing her face this way. If at the time I had a different face model ready, I would have picked another one.”
Klingemann tells me that there is “no doubt” these techniques will become commonplace over the coming years. As for what it could mean for society, he says he hopes people will learn to recognise fake imagery – much as Ray Harryhausen movies look unreal to today’s audiences – but that this could be undermined by a society increasingly siloed across incongruent truths.
“I believe that the already occurring tribalism will accelerate – the clustering of people based on their belief systems,” says Klingemann. “Since people have to decide themselves what to believe or not, everybody will carry around their own personal truth. Facts are what you and your peer group believe in. Which in consequence will lead to increased conflict between those groups. Dire times.”