Typhoon 2.5

General
Qwen 3
30B
4B

Lightweight, high-performance models with improved Thai alignment and controllable thinking mode for efficient deployment.

Released: October 20, 2025
Context: up to 128K tokens
Input: Text
Output: Text
About this Model

Typhoon 2.5, our newest release, is a new generation of open-source models built for agentic AI, natural Thai fluency, and efficient inference. It is available in 4B and 30B (A3B) variants and delivers human-like responses and high throughput at low inference cost: over 3,000 tokens/sec on a single H100. Designed for real-world use, Typhoon 2.5 is open, fluent, and ready to act.

Key Features
State-of-the-Art Fluency
Responses are more coherent and read like natural human dialogue, especially in Thai, where we prioritize linguistic authenticity over content accuracy alone.
Enhanced Function Calling
Function calling is markedly more accurate and reliable, and it integrates into real-world workflows: automation platforms like n8n and LangChain, or custom orchestration pipelines.
High-Throughput & Cost-Effective
At 64 concurrent requests, a single H100 can process over 3,000 tokens per second, enabling inference costs as low as $0.10 per million tokens.
Two Optimized Variants
Pick the perfect balance of speed and scale: 4B for ultra-efficient inference on edge devices, or 30B (A3B) for production-grade power.
Proprietary Performance. Open Source Benefit.
Matches or exceeds GPT-4.1 mini and Claude Sonnet 4 in benchmarks while offering full control, transparency, and cost efficiency.
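
The throughput and cost figures under High-Throughput & Cost-Effective can be related with simple arithmetic. The sketch below is illustrative only: it assumes the GPU is fully utilized, that the 3,000 tokens/sec figure is the aggregate across all concurrent requests, and that GPU rental is the only cost. The function names and the hourly GPU price are hypothetical, and this page does not state how the $0.10 figure was actually derived.

```python
def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_second: float) -> float:
    """USD cost to process one million tokens at a sustained aggregate throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    seconds_per_million = 1_000_000 / tokens_per_second
    return gpu_hourly_usd * seconds_per_million / 3600


def break_even_gpu_price(target_usd_per_million: float, tokens_per_second: float) -> float:
    """Hourly GPU price (USD) at which the target cost per million tokens is met."""
    tokens_per_hour = tokens_per_second * 3600
    return target_usd_per_million * tokens_per_hour / 1_000_000


THROUGHPUT = 3_000   # tokens/sec on one H100 at 64 concurrent requests (from this page)
TARGET_COST = 0.10   # USD per million tokens (from this page)

# Assumption: the GPU price is whatever makes the two figures consistent.
gpu_price = break_even_gpu_price(TARGET_COST, THROUGHPUT)
print(f"Implied GPU price: ${gpu_price:.2f}/hour")
print(f"Cost check: ${cost_per_million_tokens(gpu_price, THROUGHPUT):.2f} per million tokens")
```

Under these assumptions, 3,000 tokens/sec is 10.8 million tokens per hour, so $0.10 per million tokens corresponds to a GPU price of about $1.08 per hour. Higher sustained throughput raises the hourly price at which the same cost target still holds.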
Release History
Version 1
October 20, 2025
Initial release