This is Nick Bilton. My name is Nick Bilton. I'm a special correspondent for Vanity Fair. And his beat, you could say, is trying to predict the future of technology. To look into the future, into this kind of crystal ball, and try to predict what the next 5, 10, 15 years would look like for the media industry. Do you have a good batting record? Like, did you call some big ones? You know, phones in our pockets that would be like supercomputers, that social media would drive news, not newspapers, and so on and things like that. So, you know, it's been pretty good. I reached out to you because I came across this article that you wrote, an article that sent shivers down my spine, and I'm not one to typically be given shivers by an article. So I guess, how did you stumble into all of this, and where does this start for you? So I was sitting around with some friends in my living room, and a friend of mine mentioned, oh, did you see this thing that Adobe put out recently?
We live in a time when more people than ever before believe that they can change the world. And that conversation led Nick to a video, a video online of the Adobe MAX 2016 conference. There are tons and tons of people in the audience. It's amazing. And up in front, it looks like the stage of an Apple product launch, but sort of beach themed. Why beach? I have absolutely no idea. That's a little TMI, but hey, you know. There are two hosts that are sitting in these, like, lifeguard chairs. Jordan Peele. Jordan Peele, as in Key and Peele? Jordan Peele, yes. And then the other host is this woman, Kim Chambers, who is a marathon swimmer and an Adobe employee. And then: Please welcome Zeyu Jin. Hello, everyone. You guys have been making weird stuff online with photo editing. And he says, Adobe is known for Photoshop. We're known for editing photos and doing magical things visually. Well, we're on to the next thing. Today, let's do something to human speech. The screen on a Mac
computer. Wow. I have obtained this piece of audio where there's Michael Key talking to Peele about his feeling after getting nominated. Keegan-Michael Key had been nominated for an Emmy, and he and Jordan Peele were talking about it. There's a pretty interesting joke here, so let's just hear it. I jumped up and kissed my dogs and my wife, in that order. Not a bad joke. Okay, so suppose you want to go to his wife. So in other words, what if Keegan-Michael Key was feeling like, that was a little bit rough on my wife, that was a little bit mean? You know, maybe he wanted to go and rewrite history and say that he kissed his wife before the dogs. Suppose he actually wants his wife to go before the dogs.
Okay, so what do we do? He clicks a button. The program automatically generates a transcript of the audio and projects it up on the screen behind him. You know, just text of what Keegan-Michael Key said. Okay, I'm zooming in a little bit, and then copy, paste. He just highlights the word wife and pastes it over in front of dogs. Okay, let's listen. Clicks play. And I kissed my wife and my dogs. So he was able to edit the audio by moving the text around in the text box? Yes, exactly. Okay, well, that's kind of cool, kind of impressive.
But then here's more. Here's the more. We can actually type something that's not here. So wait, what? I've heard that actually, on that day, Michael actually kissed our Jordan. So to recover the truth, let's do it. He goes back into that little word box. So let's remove the word my. Your secret's out. And also just type the word Jordan. He types it out: J-O-R-D-A-N. And just to be clear, Michael Key did not say Jordan anywhere in this clip. And here we go. And I kissed Jordan and my dogs. He just typed in a word that the guy never said, and it made the guy say the word he never said, as if he actually said it? Exactly. You're a witch!
Peele jumps out of the lifeguard chair, sort of stomping around the stage. You're a demon! Oh yeah, I have a magic. And the last magic I'm going to show you guys is, we can actually type small phrases. So let's say... Okay. He deletes the words my dogs and he types three times. And I kissed Jordan three times.
Wait. So you are saying that Keegan-Michael Key never said Jordan, never said three, he never said times, never ever said any of those words, and somehow, just from the typing in the bit, the guy is now saying them and we're hearing them in his voice? That's what just happened? That is exactly what the demo claims. It's essentially Photoshop for audio. Nick Bilton again. You can take as little as 20 minutes of someone's voice and type the words, and it creates, in that voice, that sentence. With just 20 minutes of the guy talking? Yes. But how? How in heaven do you do this?
Okay. And so you work here. What exactly do you do here? Sure, I'm the product manager for audio. This is Durin. I flew out to Seattle and tracked him down to ask him exactly that question. So essentially what it does is it does an analysis of the speech, and it creates models, and it basically... And he explained to me that this program, which they call VoCo, by the way, what it does is it takes 20 minutes, or actually 40 if you want the best results, of you talking, and it figures out all the phonetics of your speech, all of the sounds you make. It finds each little block of sound and speech that is in the recordings, chops them all up, and then when you go and type things in, it will recombine those into that new word. But what if it encounters a sound that I never made? The theory is, in 40 minutes of speech, which is the amount they recommend you feed in, you're going to probably say just about every sound in the English language. Oh, really? So, like, phonetically, I run through the gamut in 40 minutes? Yes. Like, what are you hoping people would use a product like VoCo for?
So for the video production tools, and for what Audition is used for a lot, it's dialogue editing. The whole idea, Durin said, is to help people that work in movies and TV. A lot of our customers record great audio on set, the actors and the dialogue and everything. And when they come back, sometimes there's a mistake or they make a change. Like, the actor on set said shoe, but what he was pointing at was obviously a boot. And right now they do what's called ADR. They'll bring the actor in, record some lines, and then try and drop that into the video. But you're not using the same microphones, you're not in the same location, the actor might be sick that day so his voice sounds different, and things like that. A lot of times you can really hear that stand out in productions if they don't get it just right. But with VoCo, you just delete the word shoe, type in boot, and there it is, using the same source medium, the same characteristics, and have it just sound seamless and natural. And so the hope is that it will make the lives of professional post-production editors easier the world over? That's our hope right now, yeah.
But that's not exactly... well, I mean, it's not what Nick Bilton thought when he saw this video. It could be Donald Trump's voice or Vladimir Putin's. I saw that and I thought, wow, imagine if audio clips start getting shared around the Internet as fake news, of a fake conversation between, you know, Putin and Paul Manafort about trying to get Trump into the White House or something like that. And right then I was like, whoa, this is scary stuff.
But we're just getting started. In the words of John Raymond Arnold, played by Samuel L. Jackson in the movie Jurassic Park, in his own voice... But things are about to get a lot crazier. So forget voices for a second, because now, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, it's face time. All right, we are at the Paul G. Allen Center at the University of Washington in Seattle. So I left Adobe and went across town to talk to the head of the GRAIL lab. Yeah.
Dr. Ira Kemelmacher-Shlizerman, a professor in the computer science department. [unintelligible] Okay, just to back up for a second. When Nick first saw the VoCo demonstration, he started to wonder, okay, how could this be used down the road? My original thesis was, oh, well, maybe what will happen is that you will be able to create 3D actors, just like they did in Star Wars, then join it with the VoCo stuff to create a fake Hillary Clinton and, you know, Donald Trump having a conversation, or making out, whatever it is you want to do. And that led him to investigate the type of work that Ira does. You're using these terms like facial reenactment and facial manipulation. Are those the right words? And then what the hell do these words mean? Yeah, so, I mean, it's a way of animating faces. And it started from the movies. The concept is these remotely controlled bodies. Think, like, the aptly named movie Avatar. Or [unintelligible] Toy Story.
And to make the characters come alive, you need the expressions of the actors playing them. If this is a movie, this means that you will bring that person to the studio. Then you cover their face with these sticky sensory marker things. And then they will spend hours, hours, hours capturing that person's little dynamics. Like: smile. No teeth. Surprised. Disturbed. Sad. Angry. Bloated. And from that they create a virtual character capable of emoting all those expressions. And to make that character believable, the animators sometimes have to model a bone structure and muscles. And as you can imagine, this can get very, very expensive. And so what people like Ira started to wonder was, like, can this be done on a budget? So she and others in the field started feeding videos of faces into computers and trained those computers to break down the face into a series of points.
250 by 250. That is 62,500 points on one human face. That's right. You can track the points. Okay, so once you can track how my face moves through a video clip by these 250 by 250 points, what can you then do with that information? Well, I can apply the points on the face on a different model of a different person. Now, this is where things get quite strange, because instead of being able to map all of your facial movements onto a computer-generated virtual character or person, what Ira and others in this field of facial reenactment have figured out how to do is to map your facial movement onto a real person, a pre-recorded real person. Well, what does that even mean? How's that work?
The best example of this is this piece of software that Nick showed us. This software that I found from these university students, called Face2Face. We present a novel real-time facial reenactment method that works with any commodity webcam. There's a video demo of this, and when you open it up, this very monotone voice comes in saying: Since our method only uses RGB data for both the source and target actor... And you're like, what the heck is this? And a screen pops up. Here we demonstrate our method in a live setup. On the right you've got this heavyset man, goatee, spiked hair. On the right, a source actor is captured with a standard webcam. He's arching his eyebrows, he's pursing his lips, he's opening his mouth widely. Sort of like if you're making funny faces for a two-year-old kind of thing? Yeah. And then: This input drives the animation of the face in the video shown on the monitor to the left. On the left you've got this Dell computer screen displaying a CNN clip of George Bush. This is a real clip of Bush back from 2013, and his face is there, looking right at the camera, occupies most of that screen. [unintelligible] And what you start to notice is, when the man with the goatee smiles, George Bush in the CNN
clip also smiles. When the man raises his eyebrows, George Bush raises his eyebrows. And you realize this man is controlling George Bush's face. So this is a guy in the present controlling a past George Bush, a real George Bush from an old video clip? Yeah. Okay, I pulled up a video for you here. Okay. And a little while back, when we were just learning about this, we happened to have a friend who writes for The New Yorker in the studio. So that is George Bush's face. What? Oh, God. God, that's terrifying. Okay, so, yeah, I cannot stop watching George Bush's face. Oh, they're doing it to Putin now. Holy God. So I just have a guy just sort of going... and then that's what Putin is doing? Yeah. Oh, now it's Trump.
You know, I mean, those videos online, my mouth was agape. Again, this is a form of puppetry where your face is the puppeteer, and the only thing is that George W. Bush is the puppet. So I sit in front of a camera, I smile, and the business is taken care of? And it's real time. This isn't like you have to render some software on your computer. Literally, you download a clip or you take a clip from cable news, and you turn on your webcam, and however long it takes you to do it, you're done. The same as shooting a video on your phone.
What is this for? So what are the applications of this? I want to be able to help develop telepresence. This is Ira again. Telepresence? Yeah. So for example, my mom lives in Israel. And I'm here, and
wouldn't it be cool if I could have some... it's kind of crazy, right? But if I could have some kind of hologram of her sitting on my couch here, and we can have a conversation. Going one step further, one of her colleagues, a guy by the name of Steve Seitz. I'm a professor at the University of Washington, and I also work part time at Google. He told me that they see this technology as like a building block that could one day be used to essentially, virtually, bring someone back from the dead. I just think this technology, combined with virtual reality and other innovations, could help me, you know, just be there in the room with Albert Einstein or Carl Sagan. You know, that's sort of the motivation. That's what they want to do? That's the motivation, to make ghosts? For them, yes.
And when I was talking to some folks who work in commercials, they're developing their own version of this, and the idea is that they're going to make a million or a billion dollars off of this. Because, say you bring, I don't know, Jennifer Aniston in to film some makeup commercial. And in the makeup commercial, in English, she says, so, come and buy this product, this is the best sort of whatever product around. Right? Now, you've got China, which is a booming market. You maybe want to market things to China, and you'd really like to be able to use Jennifer Aniston. Problem is, Jennifer Aniston doesn't speak Mandarin. So either you use the same video clip and you have someone come in and speak Mandarin over her, and the lips don't line up, or you have to hire a Mandarin-speaking actor to come in and do the part of Jennifer Aniston. With this technology, all you have to do is record Jennifer Aniston once. You can hire a Mandarin speaker, and the Mandarin speaker's voice will be coming out of Jennifer Aniston's mouth as if she had said it. And in front of the camera, her lips would be moving as if she were a perfect Mandarin speaker? Exactly, exactly. Wow. I think that is actually... that's amazing. Yeah.
I'm amazed and completely frightened by what you're telling me. And that's the whole point of what Nick was writing about that gave me shivers. That someday, if you join the video manipulation with the VoCo voice manipulation, I mean, you're the ultimate puppeteer. You can create anyone talking about anything that you want, in their own voice, and having any kind of emotion around it. And you have it right there for everyone to see in video. And all you need to do is take that and put it on Twitter or Facebook, and if it's shocking enough, minutes later it's everywhere.
Like, the timing of you guys making this thing, and then this explosion of fake news. How do you guys think about how it could be used for nefarious purposes? It's a good question. [unintelligible] Every technology is developed, and there is danger with every technology. It can create fake videos and so on. I don't want to call it fake videos. It is, like, to create video from audio. Right, but they are fake videos. Yeah, but the way that I think about it is that, like, scientists are doing their job in, like, inventing the technology and showing it off, and then we all need to, like, think about the next steps, obviously. I mean, people should work on that. And the answer is not clear. Maybe it's in education. Maybe every video should come out with some code, like, that this is authentic video or authentic text, and you don't believe anything else.
I mean, yeah, but, like, maybe it was the timing more than anything, but I saw this video and it really felt like, oh my god, like, America can't handle this right now. Like, we're in a moment where truth seems to be sort of an open... what is true has become an open discussion. And this seems to be adding fuel on the fire of sort of competing narratives, in a way that I find troubling. And I'm just curious that you don't. I think that...
I think that people, if people know that such technology exists, then they will be more skeptical. My guess, I don't know. But if people know that fake news exists, and they know that fake text exists, fake videos exist, fake photos exist, then everyone is more skeptical of what they read and see. But, like, there's a man in North Carolina, I think he's from North Carolina, who believed from a fake print article that Hillary Clinton was running a sex ring out of a pizza parlor in D.C. Which is, like, insane. This man believed it and showed up with a gun. And if people are at a moment where they are willing to believe stories as ludicrous as that, like, I don't expect them to wonder if this video is real or not.
So what are you asking? Well, I'm asking, are you afraid of the power of this, and if not, why? I'm just giving my own... I don't know. I just feel I'm answering your questions badly. I mean, I'm a technologist, I'm a computer scientist, so not really. Because I know that this technology is reversible. I mean, well, there is not... too much... Have you seen these videos? Otherwise I can... Okay. Yeah.
Because we were feeling worried, and more than that, surprised that the folks making these technologies weren't, we decided to check if we were totally off base and get in touch with one of the guys who's on the front lines. Can you describe what was going through your head when you were watching Bush's face? I can tell you exactly what I was thinking. I was thinking, how are we going to develop a forensic technique to detect this? This is Hany Farid. I am a professor of computer science at Dartmouth College. He's sort of like a Sherlock Holmes of digital misdeeds, which means he spends a lot of time sitting around looking at pictures and videos, trying to understand: where has this come from, has it been manipulated, and should we trust it? He's done work for all sorts of organizations: the AP,
the Times, who want to know if, say, a picture is fake or not. They often ask me... It just happened yesterday. Images came out of North Korea, and every time images come out of these regimes where there's a history of photo manipulation, there are real concerns about this. So I was asked to determine if they'd been manipulated in some way, and if so, how had they been manipulated. And how the heck would you do that? Every time you manipulate data, you're going to leave something behind. Let's say you do some funny business to a photo. You might create some noticeable distortion in the picture itself, but you also might distort the data. And we're in the business of basically finding those distortions in the data. For example, imagine he gets sent a photo. It's probably a JPEG, which now is 99 percent of the image formats that we see out there. It's what is called a lossy compression scheme, just a fancy way to say that when a photo is taken and stored as a JPEG, the camera, you know, just to save space, throws a little bit of the data away. So for example, if I went out to the Dartmouth green right now and took a picture of the grass, the camera isn't going