- Voice AI startups raised over $398 million in VC funding in 2024, according to PitchBook data.
- The technology is expanding into enterprise uses such as customer service and assistants.
- BI spoke to investors about untapped opportunities in the nascent voice AI space.
Voice is rapidly becoming the new AI battleground.
From voice-enabled virtual assistants to speech synthesis tools, the technology has taken off in the past year.
While AI voice technology is not new, the tools have rapidly become more sophisticated, driving adoption from call centers to recruitment agencies.
Its use cases are wide-ranging, from real-time audio transcription to generating synthetic voices from text prompts.
Investors looking for the next opportunity in the highly competitive AI market have thrown their checkbooks behind startups. According to PitchBook data, startups developing voice artificial intelligence technology raised over $398 million in VC funding in 2024.
London-based PolyAI, which has developed voice assistants for call centers, secured $50 million in a funding round from Hedosophia. London and New York-based ElevenLabs, which has developed voice cloning technology, raised $100 million in January 2024 and is said to be raising another $200 million, Business Insider first reported.
“Recent breakthroughs in real-time speech-to-speech processing have unlocked new use cases, including virtual assistants, customer support and voice-based productivity,” said Sivesh Sukumar, an investor at VC firm Balderton. “Companies like ElevenLabs and OpenAI are at the forefront of this space, with ElevenLabs releasing a real-time API that is expected to drive further adoption.”
Voice AI is a relatively nascent space, so it doesn’t yet have an incumbent — but it’s driving investor excitement about the untapped opportunities in the sector, Sukumar added.
An expanding ecosystem
Startups are quickly identifying how to adapt voice technology to a multitude of enterprise and consumer needs. And with agentic AI a hot topic for CEOs, its overlap with voice technology could have high potential.
PlayAI, a startup developing an AI platform for text-to-speech models and AI voice agents, raised $21 million in seed funding in November.
“We’ve seen a massive increase in interest in building voice agents that a human can talk to just like another human,” said PlayAI co-founder Hammad Syed. “Voice AI is crossing the mainstream and will be a key interface in how people interact with technology. Investors definitely recognize this opportunity,” he added.
VCs scouring the ecosystem to make their next big bet are now looking at voice AI as a technology stack, said Steve Jang, founder and managing partner at Kindred Ventures, which also backed PlayAI. The firm’s investment thesis is to support “multi-layered startups with multiple use cases across consumer, enterprise and creative.”
“First, there are specialized and foundational models. Second, there are services and infrastructure tools, which provide access and integration with these models. And then, perhaps most importantly, there is the large vertical application space,” he told BI.
The sector is also attractive to investors because voice is an easy category to monetize. “You can price it by the bottom line, so it’s very easy to monetize,” said Jonathan Userovici, general partner at VC firm Headline. “That’s why you have so much revenue traction: it’s very easy to get a return on investment, especially if you’re replacing a human doing that job.”
Consumer appetite for voice AI has also grown. With more users preferring to receive information through audio formats such as podcasts, Sukumar highlighted the growing consumer demand for voice control and audio interfaces. He built PersuAIsion, a voice AI platform that allows users to practice real-world conversations — from job interviews to first dates — because he saw the space for voice to meet such consumer needs.
“If OpenAI can capture the consumer voice agent, they will be what Siri was meant to be,” he said. “I think there’s going to be a lot more interoperability with personal devices and there’s just going to be a better e-commerce consumer experience on that front.”
Frontier labs are catching up
Despite its growing popularity, voice AI doesn’t seem to have a settled leader yet. Part of the reason may be that frontier labs have largely stayed away from the space, perhaps out of concern that misuse of audio-generation capabilities could result in a backlash, according to Air Street Capital’s State of AI 2024 report.
“Despite collecting large amounts of audio and video data, frontier labs have been slow to release text-to-speech products,” said Nathan Benaich, founder and general partner of Air Street Capital. He pointed to OpenAI’s advanced voice mode, which was repeatedly pushed back, and Google’s NotebookLM, which “is relatively blocked.”
AI experts had sounded the alarm about a possible surge of “deepfakes” in a year marked by global elections, but it did not turn out that way.
“In all likelihood, the labs were keen to avoid the panic about deepfakes that often accompanies major elections. I think it’s inevitable that they’ll play more in this space, just because the potential commercial opportunity is so great,” Benaich said.
Big Tech may be slowly bucking the trend. Amazon’s plans to expand its voice assistant offerings through Alexa were delayed until 2025, and Apple recently enhanced Siri by adding ChatGPT capabilities.
However, Benaich noted that it will not be an easy task for any company to take the crown. “Replacing companies like ElevenLabs, which already enjoy widespread adoption and have been optimizing their tools for enterprise users for years now, could prove challenging,” he said.