Fortune
Allie Garfinkle

Exclusive: Cartesia, voice AI startup, raises $64 million Series A

Brandon Yang, Karan Goel, Albert Gu, and Arjun Desai (Credit: Cartesia)

Since 1897, Karan Goel’s family has been running the same business. 

"My great-great-grandfather started a manufacturing business back in the day," said Goel. "He was a physics teacher, an educator, and decided he wanted to build lab equipment."

About a century later, Goel was growing up across the street from his family’s factory, which produces beakers and microscopes. His parents and grandparents all worked in the business, and they were "all pretty entrepreneurial."

"I think the best—and certainly biggest—times used to be shipping days, when all the stuff used to get put into trucks and shipped, because they were exporting to Europe and the U.S.," said Goel. "That was actually pretty fun, because you had all these boxes, then you put them all in trucks, very neatly stacked up."

From an early age, Goel was drawn to the optimization and precision that drive global goals: then as a fascination, now as a mission. Today, Goel is the CEO of Cartesia, a startup developing real-time generative AI models focused on voice AI. Cartesia emerged from Stanford's AI Lab in 2023, cofounded by Goel and fellow researchers Arjun Desai, Brandon Yang, and Albert Gu. Also on the founding team is Chris Ré, a Stanford professor, a prominent figure in AI, and a MacArthur "Genius."

"I feel like a lot of my decisions are driven by the people that I work with, not necessarily by the work they’re doing," said Goel. "I figure, if you’re around really smart, interesting people, you generally end up doing pretty amazing things. That was how I ended up working with [Chris Ré] for my PhD. That’s how I met my current cofounders, because all of them actually were his PhD students. We became friends and started working together. And when we graduated, we started this company."

Over the last two years, as the AI boom reached new highs and AI technology progressed, Cartesia has also grown. Now, Cartesia has raised a $64 million Series A led by Kleiner Perkins, Fortune has exclusively learned. Index Ventures, Lightspeed, A*, Factory, and Greycroft also participated in the round, along with others like Dell Technologies Capital and Samsung Ventures. The startup has now raised $91 million in total and counts Quora, Cresta, and Rasa among its customers. Cartesia’s latest audio model, Sonic, is now used by more than 10,000 customers.

Sonic has cut latency—the delay between a system’s input and output—from 90 to 45 milliseconds, Goel told Fortune. Cartesia’s AI models, he said, are designed to tackle core inefficiencies in modern AI, reducing computational costs while enabling the model to handle data on a large scale. Translation: The goal is for customers to be able to use AI for as much as possible, even and especially as they grow. Goel sees a wide range of use cases, from customer support to digital avatars, and is likely to only see more, as 2025 is expected to be a turning-point year for voice AI.

"Moving forward, voice is going to be such an important medium of communication," said Goel. "That's how you communicate with businesses. That's how you will communicate with computers. That's how you communicate with robots eventually."

Voices are nuanced. What makes a voice sound human? Do you want it to? How do you ensure the technology isn’t misused? Voice AI has seen technological breakthroughs recently, and it is both art and science. Cartesia’s name is drawn from René Descartes, the 17th-century mathematician and philosopher known for the principle "I think, therefore I am" and his polymathic contributions to analytic geometry.

"Cartesian coordinates are very core to mathematics," said Goel. "A lot of our work is very deeply mathematical."

At the same time, there’s an art to avoiding the uncanny valley—when an avatar or CGI creation looks or sounds human, but misses the mark in small (but massively unsettling) ways.

"You can have a video, but if the voice doesn't sound authentic and natural, the whole thing feels robotic," said Goel. "A single word can communicate a lot of meaning."

AI may seem worlds away from Goel’s great-great-grandfather’s factory, but they're actually built on similar foundations—precision, innovation, and the people who bring them to life. Whether stacking beaker-filled boxes in a truck or fine-tuning a voice model, tiny details—and the right colleagues—make a world of difference.

See you tomorrow,

Allie Garfinkle
X: @agarfinks
Email: alexandra.garfinkle@fortune.com
Submit a deal for the Term Sheet newsletter here.

Nina Ajemian curated the deals section of today’s newsletter. Subscribe here.
