Google and ElevenLabs Lead New Speech‑to‑Text Benchmark as Google Quantum‑Proofs HTTPS

Artificial Analysis unveiled version 2.0 of its AA‑WER speech‑to‑text benchmark on March 1, 2026, with The‑Decoder reporting ElevenLabs’ Scribe v2 topping the list at 2.3% word‑error rate, followed by Google’s Gemini 3 Pro at 2.9%.
Key Facts
- Key company: Google
- Also mentioned: ElevenLabs, Artificial Analysis, Cloudflare
ElevenLabs’ Scribe v2 topped Artificial Analysis’ AA‑WER v2.0 benchmark with a 2.3% word‑error rate, edging out Google’s Gemini 3 Pro at 2.9%, according to a report by The‑Decoder on March 1, 2026. Mistral’s Voxtral Small placed a close third at 3.0%, followed by Google’s Gemini 3 Flash at 3.1% and ElevenLabs’ earlier Scribe v1 at 3.2%. Notably, Google’s strong showing came without any dedicated transcription training; the results derive from Gemini’s broader multimodal architecture, The‑Decoder noted. OpenAI’s Whisper Large v3, the most widely used open‑source model, landed mid‑pack at 4.2%, with Alibaba’s Qwen3 ASR Flash (5.9%), Amazon’s Nova 2 Omni (6.0%) and Rev AI (6.1%) trailing behind. The benchmark also featured a separate AA‑AgentTalk test focused on voice‑assistant commands, where Scribe v2 recorded a 1.6% error rate and Gemini 3 Pro posted 1.7%, keeping them well ahead of AssemblyAI’s Universal‑3 Pro at 2.3%.
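For readers unfamiliar with the metric, word-error rate is conventionally the number of word substitutions, deletions and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. It assumes simple lowercasing and whitespace tokenization, the function name `word_error_rate` and the sample sentences are ours, and it does not reproduce whatever text normalization or aggregation Artificial Analysis applies in AA-WER.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: lowercase + whitespace tokenization; real benchmarks
    usually apply heavier text normalization before scoring.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions against an empty hypothesis prefix
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one dropped word and one wrong word out of five.
wer = word_error_rate("turn on the kitchen lights", "turn on kitchen light")
print(f"{wer:.1%}")
```

On this scale, a score of 2.3% corresponds to roughly 23 word-level errors per 1,000 reference words.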
The results underscore a shifting competitive landscape in speech‑to‑text technology, where specialized startups can rival the output of tech giants. ElevenLabs, a startup founded in 2022 that recently launched a stand‑alone voice‑generation app (TechCrunch), is best known for text‑to‑speech and voice cloning, and with Scribe v2 it now leads in transcription as well. The company’s progress follows its earlier announcement of multilingual text‑to‑speech capabilities (VentureBeat), suggesting a broader strategy to become a one‑stop shop for voice AI. Google, meanwhile, continues to double down on its Gemini line, using the same model family for both multimodal generation and speech transcription without bespoke fine‑tuning, an approach that The‑Decoder highlighted as evidence of the model’s versatility.
While speech‑to‑text advances, Google is also addressing a looming security challenge: quantum‑resistant HTTPS. According to a February 27, 2026 Ars Technica article, Google plans to embed quantum‑proof cryptographic data into Chrome’s TLS handshake without inflating certificate size beyond practical limits. As reported, the cryptographic material in classical X.509 certificates occupies roughly 64 bytes, while quantum‑resistant equivalents can swell to about 2.5 kilobytes, roughly 40 times larger. To avoid crippling handshake latency, Google and its partner Cloudflare are employing Merkle tree structures that fit the 2.5 kB payload into a 64‑byte space, effectively “squeezing” the data while preserving verification integrity. In a Merkle tree, a single root hash commits to many entries, and a short chain of sibling hashes is enough to prove that a given entry is included. Bas Westerbaan, principal research engineer at Cloudflare, warned that larger certificates would “slow the handshake and leave people behind,” emphasizing the need for a seamless transition that does not degrade user experience or break middle‑box devices.
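The general Merkle tree idea can be illustrated in a few lines of Python. This is a generic sketch, not the certificate format Google and Cloudflare are deploying: the entries are hypothetical, the helper names (`build_levels`, `inclusion_proof`, `verify_inclusion`) are ours, SHA-256 and the leaf/node prefixes are assumptions, and duplicating the last node on odd-sized levels is a simplification. It shows how one root hash commits to many entries and how a short list of sibling hashes proves that a single entry belongs to the tree.

```python
import hashlib


def leaf_hash(entry: bytes) -> bytes:
    # Distinct prefixes keep leaf hashes and interior hashes from colliding.
    return hashlib.sha256(b"\x00" + entry).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(entries: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, leaves first, root last."""
    if not entries:
        raise ValueError("need at least one entry")
    level = [leaf_hash(e) for e in entries]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # simplification: pad odd levels with a copy
            level = level + [level[-1]]
            levels[-1] = level
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def inclusion_proof(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Collect the sibling hash at each level on the path to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # sibling sits at the neighboring index
        index //= 2
    return proof


def verify_inclusion(entry: bytes, index: int, proof: list[bytes], root: bytes) -> bool:
    """Recompute the root from one entry plus its sibling hashes."""
    current = leaf_hash(entry)
    for sibling in proof:
        if index % 2 == 0:
            current = node_hash(current, sibling)
        else:
            current = node_hash(sibling, current)
        index //= 2
    return current == root


# Hypothetical entries standing in for certificates.
entries = [f"site-{i}.example".encode() for i in range(8)]
levels = build_levels(entries)
root = levels[-1][0]
proof = inclusion_proof(levels, 5)
print(len(proof), verify_inclusion(entries[5], 5, proof, root))
```

With eight entries the proof is three 32-byte hashes, and it grows only with the logarithm of the number of entries, which is why this kind of structure is attractive when handshake size matters.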
Taken together, the two developments reflect an industry push toward both higher accuracy and future‑proof security. ElevenLabs’ benchmark win shows that a specialized player can outscore incumbents on a core AI metric, while Google’s quantum‑proofing effort reflects the need to harden internet infrastructure before the threat materializes. Both moves matter to enterprise customers, who increasingly demand high‑accuracy transcription for compliance and analytics, and to security‑focused organizations that must safeguard data against the eventual rise of quantum computing. As the AA‑WER benchmark continues to evolve and Chrome rolls out Merkle‑tree‑based certificates, competition in both fields is likely to intensify around performance and resilience.
This article was created using AI technology and reviewed by the SectorHQ editorial team for accuracy and quality.