Alibaba unveils Qwen 3.5 series with Flash, 27B, 35B‑A3B and 122B‑A10B models
Alibaba’s Qwen team launched the Qwen 3.5 series in February 2026, unveiling four models—Flash, 27B, 35B‑A3B and 122B‑A10B—with native multimodal capabilities, 256K‑plus context windows and MoE efficiency that lets the 35B‑A3B model run with only 3B active parameters per token, according to a recent report.
Alibaba’s Qwen 3.5‑Flash is already being positioned as the go‑to hosted API for long‑horizon agents. The model ships with a default 1 million‑token context window and a built‑in tool‑calling stack, features that the Qwen team says are essential for “agentic AI” workloads such as codebase indexing or multimodal memory retrieval (Qwen 3.5 “Medium” series report). By offloading the heavy lifting to Alibaba Cloud, Flash promises sub‑second latency even when processing massive prompts, a claim that differentiates it from the dense‑only offerings of rivals like OpenAI’s GPT‑4o.
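The "tool-calling stack" referenced above generally follows a loop in which the model emits a structured call, the runtime executes it, and the result is fed back as a new message. The sketch below illustrates that pattern in generic terms; the tool name, message format, and dispatch logic are illustrative assumptions, not Alibaba Cloud's actual API.

```python
import json

# Generic sketch of a tool-calling loop (illustrative only; not the
# Qwen/Alibaba Cloud API). The model emits a JSON "tool call"; the
# runtime dispatches it and appends the result to the conversation.

TOOLS = {
    # Hypothetical stub tool standing in for e.g. codebase indexing.
    "search_codebase": lambda query: f"3 files match '{query}'",
}

def handle_model_turn(model_output, messages):
    """If the model asked for a tool, run it and append the result;
    otherwise treat the output as a plain assistant message."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        messages.append({"role": "assistant", "content": model_output})
        return messages
    result = TOOLS[call["tool"]](**call["arguments"])
    messages.append({"role": "tool", "name": call["tool"], "content": result})
    return messages

msgs = handle_model_turn(
    '{"tool": "search_codebase", "arguments": {"query": "MoE router"}}', []
)
print(msgs[-1]["content"])
```

In a hosted setting like Flash, this loop would run server-side, which is what allows the provider to keep latency low even with very large prompts.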
The 27‑billion‑parameter dense variant, Qwen 3.5‑27B, is marketed for on‑premise deployment. Unlike earlier Qwen releases that relied on separate vision adapters, the 27B model is trained from scratch with early‑fusion multimodal tokens, allowing it to ingest text, images and video in a single pass (Qwen 3.5 Model Series guide). Its 256K‑plus context window—extendable to over a million tokens—means developers can run long‑form generation or document‑level reasoning without chunking, a capability that the guide highlights as a “new generation of AI designed specifically for the agentic AI era.”
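To see what "without chunking" saves, consider the workaround smaller context windows force: splitting a long document into overlapping windows and processing each separately, losing cross-chunk context. A minimal sketch of that pattern (a hypothetical helper, not a Qwen utility):

```python
# Overlapping-window chunking, the workaround a 256K+ context window
# is meant to make unnecessary. Illustrative helper, not a Qwen API.

def chunk(tokens, window, overlap):
    """Split a token sequence into windows of size `window`,
    each sharing `overlap` tokens with the previous one."""
    assert 0 <= overlap < window, "overlap must be smaller than window"
    step = window - overlap
    chunks, i = [], 0
    while i < len(tokens):
        chunks.append(tokens[i:i + window])
        if i + window >= len(tokens):
            break
        i += step
    return chunks

doc = list(range(10))          # stand-in for a tokenized document
print(chunk(doc, window=4, overlap=1))
```

Each chunk must be prompted independently, so any reasoning that spans chunk boundaries has to be stitched back together by the application; a single 256K-token pass avoids that entirely.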
The most technically striking entry is Qwen 3.5‑35B‑A3B, a sparsely‑gated mixture‑of‑experts (MoE) architecture that activates roughly 3 billion parameters per token despite a total size of 35 billion. According to the “Medium” series report, the “A3B” suffix explicitly denotes the active‑parameter count, and the team asserts that this efficiency lets the model outperform its predecessor, the 235 billion‑parameter Qwen‑3‑235B‑A22B, on key evaluation benchmarks. The MoE routing overhead is reported to be modest enough for production use, and early adopters are being asked to compare its inference cost and VRAM footprint against dense 30‑40 billion‑parameter rivals.
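The "A3B" arithmetic can be made concrete with a toy sparse-MoE router: each token's gate picks a small top-k subset of experts, so only a fraction of the total parameters does work per token. The expert count and top-k below are illustrative assumptions, not published Qwen figures; only the 35B total and roughly 3B active are from the report.

```python
import math

# Toy sketch of sparse top-k MoE routing (illustrative; not Qwen's
# actual architecture). Only the chosen experts' parameters are
# "active" for a given token.

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def route_token(gate_logits, k):
    """Pick the top-k experts for one token, renormalizing their gate
    weights so the chosen experts' weights sum to 1."""
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_logits[i] for i in chosen])
    return list(zip(chosen, weights))

total_experts = 64            # assumption, not a published Qwen figure
active_per_token = 6          # assumption, chosen so the ratio lands near 3B/35B
params_per_expert = 35e9 / total_experts

active = route_token([0.1 * i for i in range(total_experts)], active_per_token)
active_params = active_per_token * params_per_expert
print(f"experts used: {[i for i, _ in active]}")
print(f"approx. active parameters: {active_params / 1e9:.1f}B of 35B")
```

This is why a 35B MoE can be compared against dense 3B-class models on inference cost while competing with far larger dense models on quality: compute scales with the active count, while capacity scales with the total.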
At the top of the stack sits Qwen 3.5‑122B‑A10B, a 122‑billion‑parameter model that retains the same 256K context window while scaling up reasoning capacity. The guide describes it as the “Long‑Context Giant,” intended for tasks that demand deep chain‑of‑thought or extensive multimodal synthesis. Although Alibaba has not released performance numbers, the model’s architecture mirrors the early‑fusion multimodal foundation of the rest of the series, suggesting it can handle video‑plus‑text prompts without the latency penalties typical of post‑hoc vision modules.
Collectively, the Qwen 3.5 lineup signals Alibaba’s bet that efficiency and context length will outweigh raw parameter counts in the next wave of AI applications. By offering a hosted, tool‑enabled Flash service, a locally runnable 27B dense model, a cost‑effective 35B MoE, and a heavyweight 122B reasoning engine, the Qwen team covers the full spectrum of enterprise needs—from rapid prototyping to large‑scale, long‑context inference. The series’ native multimodal training, 256K‑plus context windows, and MoE‑driven parameter efficiency are all documented in the February 2026 Qwen 3.5 guide, positioning Alibaba as a serious contender in the increasingly competitive landscape of agentic and multimodal AI.
Sources
No primary source found (coverage-based)
- Dev.to Machine Learning Tag
- Reddit - r/LocalLLaMA New
This article was created using AI technology and reviewed by the SectorHQ editorial team for accuracy and quality.