
Klick Labs Unveils AI Detection for Audio Deepfakes

Klick Health has found a way to analyze voices at a level so granular that it can tell whether a speaker is a person or an AI-powered machine.


Klick Labs, the research division of Klick Health based in Toronto, has introduced a novel method to distinguish between human and AI-generated audio clips. This development comes amid a surge in deepfake content, which includes AI-produced video, audio, and images that mimic real individuals.

The proliferation of deepfakes has been accelerated by the advent of advanced AI chatbots and high-quality voice generators and replicators like those offered by ElevenLabs and Truecaller. High-profile figures such as Taylor Swift, President Joe Biden, and the Pope have all been targeted by these sophisticated forgeries. Europol has projected that by 2026, up to 90% of online content could be synthetically generated, a sentiment echoed by the Canadian Security Intelligence Service, which has labeled the situation a significant threat.

Recent voice cloning scams have underscored the urgency of developing reliable deepfake detection methods. In response, Meta has introduced mandatory labels for AI-generated content, and the Federal Communications Commission has ruled that deepfake voices in robocalls are illegal. Public policy and AI experts are particularly concerned about the potential increase in deepfake usage in the lead-up to the U.S. presidential election.

Technological Inspiration and Methodology

Yan Fossat, Senior Vice President of Digital Health Research and Development at Klick Labs, drew inspiration from science fiction to tackle this issue. Referencing films like “Terminator” and “Blade Runner,” Fossat and his team envisioned a tool akin to the Voight-Kampff machine, which measures physiological responses to determine authenticity. The team's findings were published in the open-access journal JMIR Biomedical Engineering.

In their Toronto lab, Fossat and his team began experimenting with voice analysis. They gathered audio samples from 49 individuals with diverse accents and backgrounds and generated synthetic clips using a deepfake generator. These clips were then scrutinized for vocal biomarkers—distinctive features in voices that reveal information about the speaker's health or physiology.

Klick Labs has identified 12,000 vocal biomarkers, but their current detection method relies on five specific markers: speech length, variation, micropauses, macropauses, and the proportion of time spent speaking versus pausing. Micropauses are brief pauses under half a second, while macropauses are longer. These pauses occur naturally in human speech as people breathe or search for words.
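The article does not describe Klick Labs' actual implementation, but the five pause-based markers it names can be illustrated with a short sketch. The example below assumes speech/pause segments have already been produced by some voice-activity detector (that step is not shown), and uses the article's half-second cutoff between micropauses and macropauses; the function name and feature labels are hypothetical.

```python
# Illustrative sketch of the five pause-based vocal biomarkers named in
# the article. Input: a list of (label, duration_seconds) segments, where
# label is "speech" or "pause", assumed to come from a voice-activity
# detector. The 0.5 s micropause/macropause cutoff follows the article.

def pause_features(segments):
    speech = [d for label, d in segments if label == "speech"]
    pauses = [d for label, d in segments if label == "pause"]
    micro = [d for d in pauses if d < 0.5]   # micropauses: under half a second
    macro = [d for d in pauses if d >= 0.5]  # macropauses: half a second or more
    total_speech = sum(speech)
    total = total_speech + sum(pauses)
    return {
        "speech_length_s": total_speech,                       # total time spent speaking
        "speech_variation_s": (max(speech) - min(speech)) if speech else 0.0,
        "n_micropauses": len(micro),
        "n_macropauses": len(macro),
        "speaking_ratio": total_speech / total if total else 0.0,  # speaking vs. pausing
    }

# Example: an utterance with one micropause (0.3 s) and one macropause (0.9 s)
segs = [("speech", 1.2), ("pause", 0.3), ("speech", 0.8),
        ("pause", 0.9), ("speech", 1.0)]
feats = pause_features(segs)
```

A real detector would feed features like these into a trained classifier; overly regular pause statistics (or an absence of natural micropauses) would then count as evidence of synthetic speech.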

Challenges and Future Prospects

Despite achieving an 80% success rate in identifying AI-generated audio, Fossat acknowledges the challenge of keeping pace with rapidly evolving AI technology. For instance, OpenAI's recent advancements in generating vocal deepfakes that simulate micro-breaths have made detection more complex. However, Fossat remains optimistic, noting that thousands of other biomarkers, such as heart rate, could be leveraged for future detection methods.

Klick Labs' research extends beyond deepfake detection. They are conducting 16 other studies on vocal biomarkers and diseases, including a study published in Mayo Clinic Proceedings: Digital Health, which demonstrated an AI model capable of detecting Type 2 diabetes with high accuracy using just 10 seconds of voice data. This research will continue in collaboration with Humber River Hospital in Toronto, potentially leading to phone-based diagnostic tools.

Markus Kasanmascheff
Markus is the founder of WinBuzzer and has been playing with Windows and technology for more than 25 years. He holds a Master's degree in International Economics and previously worked as Lead Windows Expert for Softonic.com.
