Mistral Medium 3.5 + Vibe: 128B dense open-weights model and async cloud coding agents at $1.5/$7.5 per million tokens
Mistral AI has introduced Mistral Medium 3.5 — a dense 128-billion-parameter model with 256k context, 77.6% on SWE-Bench Verified, and open weights under a modified MIT license. Alongside the model comes Vibe, an async cloud platform for coding agents launched from the CLI or Le Chat, plus a Le Chat Work mode preview for enterprise workflows. The model is priced at $1.5 input / $7.5 output per million tokens.
On April 29, 2026, Mistral AI announced Mistral Medium 3.5 alongside the new Vibe platform for asynchronous cloud coding agents and a preview of Le Chat Work mode. Mistral positions the combined offering as a direct response to Cursor, GitHub Copilot, and Anthropic Claude for Creative Work: a full-stack AI development tool for the enterprise.
What is Mistral Medium 3.5?
This is a dense 128-billion-parameter model with a 256k token context, described as a “first flagship merged model” that combines instruction-following, reasoning, and coding in a single architecture. Key metrics and capabilities:
- 77.6% on SWE-Bench Verified (fixing real GitHub bugs)
- 91.4 on τ³-Telecom (multi-tool calling in the telecom domain)
- Configurable reasoning effort per individual request
- Variable image size handling through a custom-trained vision encoder
- Reliable multi-tool calling and structured output (see the API sketch after this list)
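To make the multi-tool calling and structured output concrete, here is a minimal usage sketch against Mistral's chat completions API via the official mistralai Python SDK. The model identifier and the example tool are assumptions for illustration; the announcement mentions per-request reasoning effort but does not name the parameter, so it is only noted in a comment.

```python
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# Hypothetical model id -- the announcement does not state the exact API name.
MODEL = "mistral-medium-3.5"

# A single function tool, following Mistral's standard tool-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_ticket_status",  # illustrative tool, not from the announcement
        "description": "Look up the status of a support ticket by id.",
        "parameters": {
            "type": "object",
            "properties": {"ticket_id": {"type": "string"}},
            "required": ["ticket_id"],
        },
    },
}]

resp = client.chat.complete(
    model=MODEL,
    messages=[{"role": "user", "content": "What is the status of ticket TCK-4217?"}],
    tools=tools,
    tool_choice="auto",
    # The announcement describes configurable reasoning effort per request;
    # the exact parameter name is not specified, so it is omitted here.
)

# If the model decided to call the tool, the call arrives as structured JSON arguments.
choice = resp.choices[0]
if choice.message.tool_calls:
    call = choice.message.tool_calls[0]
    print(call.function.name, call.function.arguments)
else:
    print(choice.message.content)
```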
Deployment: The model is self-hostable on just 4 GPUs, which is significant for enterprises wanting on-premises setups. Weights are published under a modified MIT license, and the API is priced at $1.5 per million input tokens and $7.5 per million output tokens.
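For the self-hosting path, the natural setup is tensor parallelism across the four GPUs. The sketch below uses vLLM's Python API and assumes a hypothetical Hugging Face repo id for the open weights; whether the 128B weights plus a 256k-token KV cache fit on four GPUs depends on the hardware and on the precision or quantization chosen.

```python
# Minimal self-hosting sketch with vLLM, assuming the open weights are
# published on Hugging Face under a repo id like the one below (hypothetical).
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mistral-Medium-3.5",  # hypothetical repo id
    tensor_parallel_size=4,                # shard the 128B dense model across 4 GPUs
    # Depending on GPU memory, a quantized checkpoint may be required to fit
    # the 128B parameters alongside a long-context KV cache.
)

params = SamplingParams(max_tokens=512, temperature=0.2)
outputs = llm.generate(["Summarize the open issues in this repository."], params)
print(outputs[0].outputs[0].text)
```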
Vibe: cloud coding agents
Vibe remote agents are asynchronous cloud sessions running in parallel without consuming local resources. They are launched directly from the CLI or within Le Chat, execute long tasks in the background, and feature sandbox isolation for edits and installs. A particularly useful capability: session “teleportation” allows moving a session from the local CLI to the cloud while preserving its history.
Vibe integrates with GitHub, Linear, Jira, Sentry, Slack, and Teams — positioning it as a complete development tool, not just a coding assistant.
Le Chat Work mode
The third piece of the puzzle is Le Chat Work mode (preview), an agent-driven mode for the enterprise:
- Cross-tool workflows across email, messaging, and calendar
- Research synthesis from the web, internal documentation, and connected tools
- Inbox triage with automatic draft replies and issue creation
- Persistent sessions for multi-turn problem-solving
- Transparent action logging with approval gates for sensitive operations
Why does this matter?
With this announcement, Mistral has made three simultaneous moves: a new generation flagship model (Medium 3.5), a new agentic platform (Vibe), and a new enterprise client application (Le Chat Work mode). This places it in direct competition with GitHub Copilot Workspace, Cursor, and Anthropic Claude for Creative Work on the coding front, and with OpenAI Managed Agents on the enterprise workflow front.
Open weights combined with aggressive API pricing ($1.5/$7.5) signal that Mistral is targeting EU sovereignty and the independence of enterprise buyers who want to avoid lock-in to Microsoft/AWS ecosystems.
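A quick back-of-envelope calculation makes the pricing tangible; the token counts below are illustrative, not figures from the announcement.

```python
# Back-of-envelope cost at the announced rates: $1.5 / $7.5 per million tokens.
INPUT_PER_M = 1.5
OUTPUT_PER_M = 7.5

# Illustrative agent run: 200k tokens of repository context in, 20k tokens of patches out.
input_tokens, output_tokens = 200_000, 20_000

cost = input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M
print(f"${cost:.2f}")  # $0.45
```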
Frequently Asked Questions
- What is new in Mistral Medium 3.5?
- A dense 128B model with 256k context, described as the 'first flagship merged model' combining instruction-following, reasoning, and coding. Achieves 77.6% on SWE-Bench Verified and 91.4 on the τ³-Telecom benchmark. Self-hostable on just 4 GPUs.
- What is Vibe?
- A cloud platform for async coding agents that run in parallel without using local resources. Launched from the CLI or Le Chat, they execute long tasks in the background with sandbox isolation for edits and installs. Integrated with GitHub, Linear, Jira, Sentry, Slack, and Teams. Sessions can be 'teleported' from local CLI to the cloud while preserving history.
- What are the prices?
- API: $1.5 per million input tokens, $7.5 per million output tokens. Model weights are open under a modified MIT license for self-hosting. Le Chat Work mode is available in preview.
This article was generated using artificial intelligence from primary sources.