Automation

ElevenLabs CEO says voice is becoming the default interface for agentic AI

ElevenLabs’ CEO argues voice-first, context-aware assistants will reshape how users control software across phones, wearables, and cars.

Feb 6, 2026
2 min read
By Marketing Team

Key Takeaways

  • ElevenLabs says voice is becoming a primary interface as speech models pair with LLM reasoning.
  • Agentic voice assistants will rely on persistent memory and integrations to reduce explicit prompting.
  • ElevenLabs is pursuing hybrid cloud plus on-device processing to support wearables and lower latency.
  • Privacy and data retention risks rise as voice interfaces become more persistent in everyday hardware.

Voice is quickly moving from a feature to a primary control layer for AI, and that shift matters for any business building customer-facing automation. ElevenLabs CEO Mati Staniszewski says the combination of high-fidelity speech generation with LLM reasoning is pushing interactions away from screens and toward always-available conversation.

Voice-first interfaces are expanding beyond phones and screens

At Web Summit in Doha (qatar.websummit.com), Staniszewski framed voice as the next major interface: less tapping, more talking. For marketers and e-commerce operators, this is a distribution change: the point where intent is captured moves from typed search boxes and app UIs to ambient, spoken commands in headphones, cars, and wearables.

Seth Pierrepont, a general partner at Iconiq Capital, added that keyboards are starting to feel outdated for many workflows, even if screens remain critical for gaming and entertainment. The implication is that product discovery, support, and post-purchase service may increasingly be mediated by voice agents rather than forms, menus, or live chat.

Agentic voice systems will rely more on memory, context, and hybrid compute

Staniszewski also pointed to more “agentic” behavior: systems that require less explicit prompting because they accumulate context over time and can take multi-step actions. In practice, that means persistent memory and tighter integrations (calendar, CRM, order management) so users can speak naturally without restating constraints every session.
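
To make the persistent-memory idea concrete, here is a minimal, self-contained sketch of that pattern. The planner is a trivial stub standing in for an LLM, and the tool names (calendar, orders) are hypothetical examples, not any ElevenLabs API:

```python
"""Sketch of an agentic assistant that persists context between sessions.
The planner below is a stub standing in for an LLM; tool names are
illustrative assumptions."""
import json
from pathlib import Path

MEMORY = Path("assistant_memory.json")

def load_memory() -> list:
    # Restore prior turns so the user need not restate constraints.
    return json.loads(MEMORY.read_text()) if MEMORY.exists() else []

def save_memory(turns: list) -> None:
    MEMORY.write_text(json.dumps(turns, indent=2))

# Hypothetical integrations the agent can call without re-prompting the user.
TOOLS = {
    "calendar": lambda: "next free slot: Tuesday 10:00",
    "orders":   lambda: "order #1234 ships Friday",
}

def plan(utterance: str, turns: list) -> str:
    """Stub planner: a real system would hand `turns` to an LLM here."""
    if "meeting" in utterance:
        return TOOLS["calendar"]()
    if "order" in utterance:
        return TOOLS["orders"]()
    return "Noted."  # context is stored either way

def handle(utterance: str) -> str:
    turns = load_memory()              # accumulated context across sessions
    turns.append({"user": utterance})
    reply = plan(utterance, turns)
    turns.append({"assistant": reply})
    save_memory(turns)                 # persistent memory reduces re-prompting
    return reply

if __name__ == "__main__":
    print(handle("When is my order arriving?"))
```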

On the infrastructure side, ElevenLabs is working toward a hybrid model that blends cloud and on-device processing. For businesses, on-device capabilities can reduce latency and keep some data local, while cloud components handle heavier generation and orchestration.
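
A routing policy along those lines might look like the sketch below. The thresholds and function names are illustrative assumptions, not ElevenLabs' actual architecture:

```python
"""Sketch of a hybrid routing policy: small, latency-sensitive or
privacy-sensitive requests stay on-device; heavier generation goes to
the cloud. All values here are assumed for illustration."""
from dataclasses import dataclass

@dataclass
class Request:
    text: str
    contains_pii: bool   # keep personal data local
    est_tokens: int      # proxy for compute cost

ON_DEVICE_TOKEN_BUDGET = 256  # assumed capacity of a small local model

def run_local(req: Request) -> str:
    return f"[on-device] {req.text[:40]}"

def run_cloud(req: Request) -> str:
    return f"[cloud] {req.text[:40]}"

def route(req: Request) -> str:
    # Local wins when the job is small or the data is sensitive;
    # cloud handles heavy generation and orchestration.
    if req.contains_pii or req.est_tokens <= ON_DEVICE_TOKEN_BUDGET:
        return run_local(req)
    return run_cloud(req)

if __name__ == "__main__":
    print(route(Request("what's on my calendar?", contains_pii=True, est_tokens=20)))
    print(route(Request("draft a long product page...", contains_pii=False, est_tokens=2000)))
```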

ElevenLabs is already partnering with Meta to bring voice into products like Instagram and Horizon Worlds, and Staniszewski signaled openness to future work on Ray-Ban smart glasses.

The tradeoff is privacy. As voice becomes “always on,” companies face heightened scrutiny over surveillance and retention of voice data, especially after allegations that voice assistants have captured more audio than users intended, as in prior claims involving Google (TechCrunch also notes a recent settlement).

For B2B teams, the immediate takeaway is operational: start designing voice flows, guardrails, and measurement now, because voice will increasingly be where customer intent is expressed first.
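
A guardrail for a voice flow can start as simply as gating risky actions behind explicit confirmation and logging every turn for measurement. The sketch below is a hypothetical illustration under those assumptions, not a reference implementation:

```python
"""Hypothetical guardrail for a voice flow: destructive intents require
spoken confirmation, and every turn is logged so containment and conversion
can be measured. Intent names and thresholds are illustrative assumptions."""
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("voice_flow")

RISKY_INTENTS = {"cancel_order", "delete_account"}
MIN_CONFIDENCE = 0.8  # below this, ask again rather than guess

def handle_turn(intent: str, confidence: float, confirmed: bool) -> str:
    log.info("turn intent=%s confidence=%.2f", intent, confidence)  # measurement
    if confidence < MIN_CONFIDENCE:
        return "Sorry, I didn't catch that. Could you rephrase?"
    if intent in RISKY_INTENTS and not confirmed:
        return "Just to confirm: do you want me to go ahead with that?"
    return f"Done: {intent.replace('_', ' ')}."

if __name__ == "__main__":
    print(handle_turn("cancel_order", confidence=0.92, confirmed=False))
    print(handle_turn("cancel_order", confidence=0.92, confirmed=True))
```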

Related Topics

ElevenLabs · voice AI · agentic AI · wearables · Meta