Automation

ElevenLabs CEO says voice is becoming the default interface for agentic AI

ElevenLabs’ CEO argues voice-first, context-aware assistants will reshape how users control software across phones, wearables, and cars.

Feb 6, 2026
2 min read
By Marketing Team

Key Takeaways

  • ElevenLabs says voice is becoming a primary interface as speech models pair with LLM reasoning.
  • Agentic voice assistants will rely on persistent memory and integrations to reduce explicit prompting.
  • ElevenLabs is pursuing hybrid cloud plus on-device processing to support wearables and lower latency.
  • Privacy and data retention risks rise as voice interfaces become more persistent in everyday hardware.

Voice is quickly moving from a feature to a primary control layer for AI, and that shift matters for any business building customer-facing automation. ElevenLabs CEO Mati Staniszewski says the combination of high-fidelity speech generation with LLM reasoning is pushing interactions away from screens and toward always-available conversation.

Voice-first interfaces are expanding beyond phones and screens

At Web Summit in Doha (qatar.websummit.com), Staniszewski framed voice as the next major interface: less tapping, more talking. For marketers and e-commerce operators, this is a distribution change: the point where intent is captured moves from typed search boxes and app UIs to ambient, spoken commands in headphones, cars, and wearables.

Seth Pierrepont, a general partner at Iconiq Capital, added that keyboards are starting to feel outdated for many workflows, even if screens remain critical for gaming and entertainment. The implication is that product discovery, support, and post-purchase service may increasingly be mediated by voice agents rather than forms, menus, or live chat.

Agentic voice systems will rely more on memory, context, and hybrid compute

Staniszewski also pointed to more “agentic” behavior: systems that require less explicit prompting because they accumulate context over time and can take multi-step actions. In practice, that means persistent memory and tighter integrations (calendar, CRM, order management) so users can speak naturally without restating constraints every session.
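
To make the persistent-memory idea concrete, here is a minimal, self-contained sketch of that pattern. The planner is a trivial stub standing in for an LLM, and the tool names (calendar, orders) are hypothetical examples, not any ElevenLabs API:

```python
"""Sketch of an agentic assistant that persists context between sessions.
The planner below is a stub standing in for an LLM; tool names are
illustrative assumptions."""
import json
from pathlib import Path

MEMORY = Path("assistant_memory.json")

def load_memory() -> list:
    # Restore prior turns so the user need not restate constraints.
    return json.loads(MEMORY.read_text()) if MEMORY.exists() else []

def save_memory(turns: list) -> None:
    MEMORY.write_text(json.dumps(turns, indent=2))

# Hypothetical integrations the agent can call without re-prompting the user.
TOOLS = {
    "calendar": lambda: "next free slot: Tuesday 10:00",
    "orders":   lambda: "order #1234 ships Friday",
}

def plan(utterance: str, turns: list) -> str:
    """Stub planner: a real system would hand `turns` to an LLM here."""
    if "meeting" in utterance:
        return TOOLS["calendar"]()
    if "order" in utterance:
        return TOOLS["orders"]()
    return "Noted."  # context is stored either way

def handle(utterance: str) -> str:
    turns = load_memory()              # accumulated context across sessions
    turns.append({"user": utterance})
    reply = plan(utterance, turns)
    turns.append({"assistant": reply})
    save_memory(turns)                 # persistent memory reduces re-prompting
    return reply

if __name__ == "__main__":
    print(handle("When is my order arriving?"))
```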

On the infrastructure side, ElevenLabs is working toward a hybrid model that blends cloud and on-device processing. For businesses, on-device capabilities can reduce latency and keep some data local, while cloud components handle heavier generation and orchestration.
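
A routing policy along those lines might look like the sketch below. The thresholds and function names are illustrative assumptions, not ElevenLabs' actual architecture:

```python
"""Sketch of a hybrid routing policy: small, latency-sensitive or
privacy-sensitive requests stay on-device; heavier generation goes to
the cloud. All values here are assumed for illustration."""
from dataclasses import dataclass

@dataclass
class Request:
    text: str
    contains_pii: bool   # keep personal data local
    est_tokens: int      # proxy for compute cost

ON_DEVICE_TOKEN_BUDGET = 256  # assumed capacity of a small local model

def run_local(req: Request) -> str:
    return f"[on-device] {req.text[:40]}"

def run_cloud(req: Request) -> str:
    return f"[cloud] {req.text[:40]}"

def route(req: Request) -> str:
    # Local wins when the job is small or the data is sensitive;
    # cloud handles heavy generation and orchestration.
    if req.contains_pii or req.est_tokens <= ON_DEVICE_TOKEN_BUDGET:
        return run_local(req)
    return run_cloud(req)

if __name__ == "__main__":
    print(route(Request("what's on my calendar?", contains_pii=True, est_tokens=20)))
    print(route(Request("draft a long product page...", contains_pii=False, est_tokens=2000)))
```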

ElevenLabs is already partnering with Meta to bring voice into products like Instagram and Horizon Worlds, and Staniszewski signaled openness to future work on Ray-Ban smart glasses.

The tradeoff is privacy. As voice becomes “always on,” companies face heightened scrutiny over surveillance and retention of voice data, especially after allegations that voice assistants have captured more audio than users intended, as in prior claims involving Google (TechCrunch also notes a recent settlement).

For B2B teams, the immediate takeaway is operational: start designing voice flows, guardrails, and measurement now, because voice will increasingly be where customer intent is expressed first.
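
A guardrail for a voice flow can start as simply as gating risky actions behind explicit confirmation and logging every turn for measurement. The sketch below is a hypothetical illustration under those assumptions, not a reference implementation:

```python
"""Hypothetical guardrail for a voice flow: destructive intents require
spoken confirmation, and every turn is logged so containment and conversion
can be measured. Intent names and thresholds are illustrative assumptions."""
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("voice_flow")

RISKY_INTENTS = {"cancel_order", "delete_account"}
MIN_CONFIDENCE = 0.8  # below this, ask again rather than guess

def handle_turn(intent: str, confidence: float, confirmed: bool) -> str:
    log.info("turn intent=%s confidence=%.2f", intent, confidence)  # measurement
    if confidence < MIN_CONFIDENCE:
        return "Sorry, I didn't catch that. Could you rephrase?"
    if intent in RISKY_INTENTS and not confirmed:
        return "Just to confirm: do you want me to go ahead with that?"
    return f"Done: {intent.replace('_', ' ')}."

if __name__ == "__main__":
    print(handle_turn("cancel_order", confidence=0.92, confirmed=False))
    print(handle_turn("cancel_order", confidence=0.92, confirmed=True))
```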

Related Topics

ElevenLabs · voice AI · agentic AI · wearables · Meta