It was an average day in January when Jennifer DeStefano decided to answer a call from an unknown number. On the other end of the line, DeStefano's 15-year-old daughter was sobbing and begging for help.
"Mom, these bad men have me, help me, help me," her daughter pleaded.
As DeStefano began to panic, a new voice came onto the line: her daughter's supposed kidnapper, demanding a $1 million ransom.
DeStefano was frantically receiving ransom instructions from the kidnapper when a friend was able to get a call through to DeStefano's husband. Her daughter was safe and in bed.
The girl on the other end of the phone, the girl who sobbed and cried in the voice and with the unique inflection of DeStefano's daughter, was never real. It was a synthetic scam, generated by artificial intelligence.
As so-called generative AI becomes more powerful and more accessible, such instances of deepfake fraud are on the rise. In the course of DeStefano's testimony before the Senate Judiciary Committee in June, she related a second story involving a similarly fake, yet convincing, call placed to her own mother.
An April survey conducted by ID verification software maker Regula found that 37% of global businesses were hit with synthetic voice fraud last year; 25% of banks reported more than 100 synthetic fraud incidents in 2022 alone.
Against this backdrop of technology designed to deceive our eyes and our ears, Pindrop has designed and implemented deepfake fraud protection software. The company works with a number of corporations, notably banks and big retailers, leading the charge in voice authentication and security. The threats Pindrop encounters in this work have grown recently as deepfake fraud, boosted by generative AI, gets better and more prolific.
"We strongly believe with the explosion of AI, it's the threat of deepfakes that breaks all trust," Vijay Balasubramaniyan, the CEO and co-founder of Pindrop, told TheStreet in an interview.
Banks, he said, are struggling to determine whether a call is coming from a real customer or is the result of an AI-generated scheme. It is one more element of an environment in which people are hard-pressed to tell whether something seen online is real: deepfaked photos, videos and audio of politicians, world leaders and celebrities have likewise been on the rise.
Microsoft's VALL-E, released in January, can synthesize a person's voice given a three-second audio clip.
And to the average ear, such synthetic iterations are nearly indistinguishable from the real thing.
How the software works
The software, powered by machine learning algorithms, presents a two-pronged defense against potential fraudsters. The first prong determines whether a given voice is human or machine, and the second verifies that it is the right human.
To arm that first prong, Pindrop, whose data scientists also serve as linguistics experts, has focused on the evolution of human linguistics. Humans have a particular way of speaking that AI, lacking such biological components as vocal cords and a mouth, struggles to replicate.
Words containing the letters "S" and "F," for example, are often indistinguishable to machines from noise, Balasubramaniyan said. Machines, unable to determine the difference between some sounds and letters, often mess up in such areas.
Further, machines struggle to replicate the timing of human speech.
"When you say 'hello Paul,' my mouth is open when I say hello, and my mouth shuts down when it says Paul," Balasubramaniyan said. "The speed with which I can do that has physical limitations. These machines don't care about any of that. They're just caring about making sure your ear thinks it's a human on the other end."
These attacks, while often indistinguishable to a human, are detectable by Pindrop's network, which has been trained to scan for such anomalies. And with thousands of voice samples available in every second of a given recording (audio sampled at 16,000 hertz provides 16,000 samples of someone's voice every second), Pindrop is able to determine the likelihood that a call is synthetic.
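To put that sample count in concrete terms, here is a minimal Python sketch of the arithmetic. Only the 16,000-samples-per-second figure comes from the article. The function names are hypothetical, and the 25-millisecond analysis window with a 10-millisecond step is a common speech-processing convention assumed here for illustration, not a description of how Pindrop's system slices audio.

```python
SAMPLE_RATE_HZ = 16_000  # figure quoted in the article


def sample_count(duration_seconds: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of audio samples in a recording of the given length."""
    return int(round(duration_seconds * sample_rate_hz))


def frame_count(total_samples: int, frame_length: int, hop_length: int) -> int:
    """Number of full analysis windows that fit in the recording.

    Assumption: windows of frame_length samples, advanced by hop_length
    samples each step, with no padding at the end.
    """
    if total_samples < frame_length:
        return 0
    return 1 + (total_samples - frame_length) // hop_length


# Assumed window sizes: 25 ms and 10 ms at 16,000 samples per second.
FRAME_LENGTH = 400  # 0.025 s * 16,000
HOP_LENGTH = 160    # 0.010 s * 16,000

one_second = sample_count(1.0)
three_seconds = sample_count(3.0)
print(one_second, frame_count(one_second, FRAME_LENGTH, HOP_LENGTH))
print(three_seconds, frame_count(three_seconds, FRAME_LENGTH, HOP_LENGTH))
```

Under these assumptions, even a three-second clip, the length VALL-E is said to need, yields 48,000 individual samples for a detector to inspect.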
Pindrop has an additional adversarial system that is designed to find ways to beat its vocal authenticator, making the flagship defensive software "generations ahead of any known attack," according to the company.
When Microsoft's VALL-E first came out, Pindrop's system, without additional training, was 99% accurate in detecting the synthetic fraud coming from the new software.
Pindrop in use
The software, Balasubramaniyan said, acts as a sort of traffic light. An agent at a call center will see a real-time analysis of a call, designed to quickly determine if the voice on the other end is real, accurate or synthetic.
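The traffic-light idea can be illustrated with a short Python sketch. Everything in it is hypothetical: the article does not describe Pindrop's scoring, its thresholds or what each color means to an agent. The sketch simply assumes two scores between 0 and 1, one for how likely the voice is synthetic and one for how well it matches the enrolled customer, and maps them to a color.

```python
from enum import Enum


class Light(Enum):
    GREEN = "green"    # assumed meaning: human voice that matches the customer
    YELLOW = "yellow"  # assumed meaning: human voice, but identity not confirmed
    RED = "red"        # assumed meaning: voice is likely synthetic


def traffic_light(
    synthetic_likelihood: float,
    speaker_match: float,
    synthetic_threshold: float = 0.5,  # illustrative value, not from the article
    match_threshold: float = 0.8,      # illustrative value, not from the article
) -> Light:
    """Map two hypothetical scores to a traffic-light signal.

    The human-or-machine check runs first, mirroring the two-pronged
    defense described in the article; the identity check runs only if
    the voice is judged to be human.
    """
    if synthetic_likelihood >= synthetic_threshold:
        return Light.RED
    if speaker_match >= match_threshold:
        return Light.GREEN
    return Light.YELLOW


print(traffic_light(0.05, 0.95).value)
print(traffic_light(0.05, 0.40).value)
print(traffic_light(0.90, 0.95).value)
```

Checking for a synthetic voice first means a convincing clone of the right customer still comes back red, which is the case the software is meant to catch.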
And though Pindrop has the kind of software that could help those who find themselves in situations similar to DeStefano's terrible deepfake incident, the company is not yet at a stage to release such a product to an individual consumer.
Amit Gupta, Pindrop's vice president of product management, told TheStreet that to solve such a use case Pindrop's software would need to be on everybody's phone, analyzing every call, something many consumers might not grant access to. The company, Gupta said, would need to establish some sort of partnership with phone makers and carriers in order to provide such a service.
Gupta said that Pindrop is not currently exploring such a partnership, though he did add that Pindrop can and does apply its software to social media video verification for interested customers.
"We certainly take pride in the fact that we are making the world a little bit better every day," Gupta said. "Not probably a lot, but when we find fraudsters, when we protect even enterprises against fraud, it is the end user's account that is protected."