Is Anwar-Zahid audio a deepfake? Only digital forensics can tell, say experts

They say creating a bogus recording is an extremely tedious task, but not an impossible one

Published on 08 Apr 2021 5:00PM

A deepfake audio is a falsified audio by means of deep machine learning that is able to clone a person’s voice and make it indistinguishable from the original speaker. – Google Wavenet pic, April 8, 2021

by Amar Shah Mohsen

KUALA LUMPUR – Within hours of a purported conversation between PKR president Datuk Seri Anwar Ibrahim and Umno president Datuk Seri Ahmad Zahid Hamidi going viral yesterday, both leaders issued separate denials and pledged action on the matter.

While the authenticity of the audio recording has yet to be verified, the incident has raised the question of whether their alleged exchange could have been “deepfaked”, and how easy that would be to accomplish.

For those unfamiliar, deepfake audio is audio falsified by means of deep machine learning, which can clone a person’s voice and make it indistinguishable from the original speaker. In simpler terms, it is like Photoshop for voices.

A simple Google search shows that multiple applications and software tools for creating deepfakes are readily available.

While earlier speech synthesis – known as concatenative text-to-speech (TTS) – sounds less natural because it stitches together short fragments drawn from a huge database of speech, newer and more sophisticated parametric TTS technology can yield more natural-sounding speech, as the content and characteristics of the speech can be controlled.
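The stitching approach behind concatenative TTS can be illustrated with a toy sketch in Python. It assumes a hypothetical, hand-made database in which each short speech unit maps to a few made-up sample values. Real systems draw on hours of recorded speech and smooth the joins between fragments, which this sketch does not attempt.

```python
# Hypothetical unit database: each short speech fragment maps to a list of
# made-up waveform sample values (a real database would hold recorded audio).
UNIT_DATABASE = {
    "he": [0.0, 0.2, 0.4],
    "llo": [0.3, 0.1, -0.1, -0.3],
    "wor": [0.2, 0.5, 0.1],
    "ld": [-0.2, -0.4],
}


def synthesize(units, database):
    """Concatenate the stored fragments for each requested unit, in order."""
    waveform = []
    for unit in units:
        if unit not in database:
            raise KeyError(f"no recorded fragment for unit {unit!r}")
        waveform.extend(database[unit])
    return waveform


hello = synthesize(["he", "llo"], UNIT_DATABASE)
print(len(hello))  # 7 samples: 3 from "he" plus 4 from "llo"
```

Because the output is only ever a rearrangement of existing recordings, the speaker can say nothing the database does not already contain in fragment form.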

Applications like WaveNet – a deep generative model of raw audio waveforms developed by Google subsidiary DeepMind – have the capability to generate synthetic utterances, and even create fillers and made-up word-like sounds.

The software Descript allows users to create a realistic TTS model of themselves by uploading just a few minutes of audio and typing in the words they want to add to a recording. It is meant to save podcasters from having to re-record audio when a mistake is made in their initial recording.

Such is the advancement of technology that researchers from the University of Washington were able to create a synthetic Barack Obama, using neural networks to model the former United States president’s mouth shape so that it matches a given audio track.

Canadian company Lyrebird, for example, came up with software that can mimic anyone’s voice, and faked a conversation between Obama, Donald Trump and Hillary Clinton.

Difficult, but not impossible

Two cybertechnology experts told The Vibes that while it requires a lot of resources and time to deepfake audio – hours of voice recording data of an individual are needed to create a high-quality deepfake – it is not an impossible task.

SysArmy Sdn Bhd chief technical officer Alan Yau said that, depending on how closely a person wants the audio to mimic an individual, it could take weeks to produce a deepfake.

In an era of advanced technology, it is very hard to tell the difference between genuine audio and a fake, he added.

“Even if you ask a professional whether an audio recording is of a certain person, they can never give you a yes-or-no answer. They can only say it is likely or unlikely. No one can be 100% sure.

“What usually happens in trying to authenticate a recording is that you may need multiple views to make a conclusion. This is why, even in courts, they don’t depend on the input of only one forensics expert. You need to get at least the opinions of two or three others.”

Digital forensics specialist Fong Choong Fook says each voice has a unique pattern, so one can usually distinguish if a recording is a deepfake. – File pic, April 8, 2021

Digital forensics specialist Fong Choong Fook said producing deepfake audio is a tedious process that requires hours of machine learning on a person’s conversations and speeches to capture the pattern and style in which they speak.

To determine whether a recording is authentic, he said, samples of the person’s voice are recorded and digitalised for voice analysis, and then compared with the suspected “deepfake” recording.

“Everyone’s voice has a unique pattern, so we can usually distinguish if a recording is a deepfake.

“This is also the case when a court is trying to ascertain if an audio recording is genuine; a sample of the person’s voice will be recorded and digitalised.”
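The comparison Fong describes, a known voice sample set against a questioned recording, can be sketched in Python in a heavily simplified form. This is an illustration only, not the method forensic examiners use: it assumes pure test tones stand in for voices, reduces each signal to a coarse frequency profile, and scores how alike two profiles are. All function names here are hypothetical.

```python
import math


def magnitude_spectrum(samples, n_bins=32):
    """Naive discrete Fourier transform: magnitudes of the first n_bins bins."""
    n = len(samples)
    spectrum = []
    for k in range(n_bins):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = -sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        spectrum.append(math.hypot(re, im))
    return spectrum


def cosine_similarity(a, b):
    """Similarity of two profiles: near 1.0 means alike, near 0.0 means unalike."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tone(freq_bin, n=256, amplitude=1.0):
    """Stand-in for a voice sample: a pure tone (an assumption for illustration)."""
    return [amplitude * math.sin(2 * math.pi * freq_bin * i / n) for i in range(n)]


reference = tone(5)                       # known sample of the speaker
questioned_same = tone(5, amplitude=0.5)  # same "voice", recorded more quietly
questioned_other = tone(12)               # a different "voice"

score_same = cosine_similarity(
    magnitude_spectrum(reference), magnitude_spectrum(questioned_same)
)
score_other = cosine_similarity(
    magnitude_spectrum(reference), magnitude_spectrum(questioned_other)
)
print(f"same: {score_same:.3f}, other: {score_other:.3f}")
```

The result is a score rather than a verdict, which fits Yau’s point that an examiner can only say a match is likely or unlikely.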

In yesterday’s case, Fong said as a report has been lodged, the next step will be to hire an independent forensics expert to conduct digital forensics on the leaked audio, which has since gone viral.

In the alleged audio recording of the two party leaders, among other things, “Zahid” supposedly told “Anwar” that his winding-up speech at the Umno general assembly on March 28 was only “tactical”, in an apparent reference to his remarks that there will be “no Anwar, no DAP and no Bersatu”.

“Anwar” replied saying there is still work to be done with the Umno Supreme Council – probably referring to convincing it to work with PKR.

“Anwar” said it will be good if one of Umno’s ministers resigned from the Perikatan Nasional cabinet, but added that none has the courage to do so.

“Zahid” replied by calling Umno ministers “cowards”. – The Vibes, April 8, 2021
