Learn, step by step, how to create your first realistic deepfake video in a few minutes.
There comes a point when every IT security person needs or wants to create their first deepfake video. They not only want to create it but to make it fairly believable, and, if they are lucky, to scare themselves, their friends, co-workers, and bosses. I get it. It is fun.
If you follow these instructions, it will take you longer to create the free accounts you need (a minute or two) than it does to create your first realistic-looking deepfake video.
There are literally hundreds of deepfake audio-, image-, and video-making sites and services, and more appear each day. The existing ones get easier and more feature-rich all the time. You can use any of these sites to create your first deepfake video.
But here’s the site I would start with: https://www.hedra.com/. It has a great, easy-to-use interface.
You will need to create and confirm your new free Hedra user account.
Then choose Create.
You will see a screen similar to this:
Upload (or copy and paste) the text of what you want the deepfake audio/video to be saying.
You can upload a voice, pick a voice from among the dozens of free existing ones, or clone (i.e., copy) your voice or someone else’s. The first time around I chose the premade voice of a British woman. You can type in a “prompt” to guide how you want the audio and video to be, but this is optional.
In the example below, I uploaded some text promoting my recent book, and uploaded a picture of actress Gillian Anderson.
Then I clicked on Generate video.
Note: Hedra, and most of the legitimate deepfake sites, don’t want you using copyrighted pictures or audio. You really must make sure you have the permission of the person whose image or voice you are using. I am using Gillian Anderson here purely as an example to represent a picture of anyone. You can use a real person or an AI-generated picture or person. Hedra and other sites will try their best to detect copyrighted images and celebrities and block you from using them.
That is it. Just wait a few minutes and your deepfake video is finished and is ready for download.
The first one I ever created was much better than I expected. It was not the actress’s real voice, but it was close enough that I bet most people would not notice, because she often plays British roles.
There are a ton of other deepfake video-making sites, but Hedra is among the easiest.
I wasn’t finished playing around, so I decided to do a “face swap,” replacing Gillian Anderson’s face with mine, since I could not legally use her real image without her permission.
I went to another site called Akool (www.akool.com) (see below) and created another free account.
Then I uploaded the same Gillian Anderson photo and a photo of me (see below).
And this is what I came out with.
I am not a great-looking long-haired hippy.
I was amazed at how well Akool blended my face with the rest of Gillian Anderson’s head and body. It was really realistic-looking on the first attempt. So, at this point, I had this modified hybrid picture/video of me over Gillian Anderson’s body speaking in the voice of a British woman.
If I wanted to, I could simply pop over to Hedra again, clone my voice, and have this hybrid picture and video say anything using my voice, but I wanted to try a different voice-cloning service.
I went over to Speechify (https://myvoice.speechify.com).
The voice-cloning service (shown below) allows you to import anyone’s audio along with text you want that voice to say.
Most voice-cloning services want a minute or more of audio to clone a voice, but some need as little as six seconds of audio to fake a voice with a reasonable amount of realism. Longer training clips definitely produce better results.
And just like that, I had audio of me purportedly saying something I had never said before.
This is pretty scary. If I can get a short recording of your voice, I can make you say anything. I can see a scammer calling to ask if you want to buy an extended auto warranty and, when you decline, asking a few more short questions designed to capture more of your voice. You don’t realize that your voice has now been stolen forevermore.
As a final step, I uploaded my hybrid picture and newly fake cloned voice of myself to Hedra again and created a new video of hybrid-Roger saying how much he liked his own book (see image below).
So, I ended up with this scary-looking version of myself saying something nice about my book.
Here are my key takeaways:
- It will take you longer to set up your free accounts than it will to create realistic deepfake videos
- Even first attempts at deepfake videos are good enough to fool most people
- The tools are good enough that anyone can do it in a few minutes
- If you spend more than a few minutes, your deepfake videos will probably be very good
- All an attacker needs to clone you or anyone else is your picture and an audio sample of your voice. That is it.
The World Has Changed
It is clear to me that the point in time where ANYONE can EASILY create realistic deepfake videos is here. And, yes, bad people are absolutely going to use these tools to fool unsuspecting users.
Just as you taught yourself, your family, your friends, and your co-workers not to trust every email, call, social media outreach, and SMS message, you must now do the same for video, images, and audio. You CANNOT trust that the sender of any digital communication is who they say they are. You can’t trust any image or video to be real. It is just a new reality. We didn’t use to have to worry about such a thing, and now we do. Life has changed. Adjust accordingly.
How To Detect Deepfakes
There are lots of services that claim to detect deepfake videos and audio. My co-worker James McQuiggan has tried a bunch and says none are close to perfect. All have too many false negatives and false positives. They can help, but you can’t really rely on them. Expect AI-deepfake detectors and AI deepfakes to play a perpetual cat-and-mouse game, like antivirus scanners and malware.
Perry Carpenter, KnowBe4’s Chief Human Risk Management Strategist and author of the recent best-selling AI book, FAIK: A Practical Guide to Living in a World of Deepfakes, Disinformation, and AI-Generated Deceptions, put it this way to my team at KnowBe4 (I’m summarizing):
Nearly every tool and service we use is going to be AI-enabled and assist us in some way. Our social media channels will help us create AI-assisted better versions of ourselves, with better text, audio, pictures, and video. Every audio, picture, and video tool is using, or is going to use, AI to make better output, which we will all happily use. They already are. Asking a deception-detecting tool whether something is AI-generated doesn’t make sense in a world where many, many legitimate things we all use are AI-assisted or AI-generated.
Note: You should buy Perry’s FAIK book.
Is that audio, picture, or video AI-generated? Yes! There you have it. I have already told you how any AI-detection tool will respond to nearly all future generated audio, video, and images.
The primary question you have to ask yourself is whether what you are being told or shown is trying to be maliciously deceptive in some way.
Whether the content is real or AI-generated is not as important as whether it is trying to maliciously deceive you. Focus on the content…not on whether an image looks a little fake or has blurred fingers.
So, lessen your “Is that AI?” radar and strengthen your “Is that BS?” radar.
AI-generated or not, if the message is unexpected and asks you to do something you have never done before (at least for that sender), you should probably confirm it using some other method before performing the requested action or reacting too emotionally. See the graphical representation of those points below.
If I had only one minute to teach everyone how best to detect malicious scam messages now and in the future, this is it: if the contact is unexpected and asks you to do something you have never done before (at least for that requestor), STOP and THINK before you react. It won’t work for every scam, but it works for the bulk of them.
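For the programmatically minded, the one-minute rule above can be sketched as a tiny decision helper. This is purely a conceptual illustration of the article’s heuristic; the function and parameter names are my own invention, not part of any real tool:

```python
def should_stop_and_verify(unexpected_contact: bool, novel_request: bool) -> bool:
    """The article's rule of thumb: if a contact is unexpected AND it asks
    you to do something you have never done before (for that requestor),
    STOP and verify through another channel before acting."""
    return unexpected_contact and novel_request


# A routine request from a known sender: proceed normally.
print(should_stop_and_verify(unexpected_contact=False, novel_request=False))

# An unexpected contact making a first-ever request (e.g., "wire money here"):
# stop and confirm out of band first.
print(should_stop_and_verify(unexpected_contact=True, novel_request=True))
```

The point of the sketch is that the rule is deliberately simple enough to teach in a minute: two yes/no questions, and only when both are "yes" do you escalate to out-of-band verification.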
Train yourself that way. Train your family that way. Train your employees that way. How well you teach this and how well your employees learn and practice this skill will likely determine if your organization is or isn’t successfully hacked in a given time period.
Closing
The era of easy deepfakes is here…has been here…and is just going to get easier and more common. But we humans are a resilient bunch. We are not just going to sit there and get scammed over and over again without reacting.
All our cyber defense tools will be AI-enabled and better able to protect us against AI-enabled (and real) scams. We just need to treat all audio, images, and video the way we treat emails and text messages today. Focus on the content of the message, because if I’m trying to scam you, the message or content will be malicious in some way, and that does not change just because it looks like me or hybrid me. I still have to ask you to send me your password, wire money somewhere, or do something that is harmful to your own interests.
If you want more help teaching your co-workers to spot deception, get Perry Carpenter’s book.