Typhoon 2.5

General

Qwen 3

30B

Typhoon 2.5 brings agentic intelligence, ultra efficiency, and natural Thai fluency to real-world workflows.

Released

October 20, 2025

Context

up to 128K tokens

Input

Text

Output

Text

About this Model

Our newest release, Typhoon 2.5, introduces a new generation of open-source models built for agentic AI, natural Thai fluency, and high efficiency. Available in 4B and 30B (A3B) variants, it delivers human-like responses, faster throughput (over 3,000 tokens/sec on a single H100), and record-low inference cost. Designed for real-world use, Typhoon 2.5 is open, fluent, and ready to act.

Key Features

State-of-the-Art Fluency

Responses are more coherent and indistinguishable from natural human dialogue, especially in Thai, where we prioritize linguistic authenticity and not just content accuracy.

Enhanced Function Calling

Function calling is dramatically more accurate and reliable. Typhoon 2.5 now integrates seamlessly into real-world workflows, making it ideal for automation platforms like n8n, LangChain, or custom orchestration pipelines.
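As an illustration of the function-calling step an orchestration pipeline runs, here is a minimal sketch. It assumes the widely used OpenAI-style tool schema and tool-call message shape; this page does not specify Typhoon 2.5's exact API. The `get_weather` tool, the `dispatch_tool_call` helper, and the sample tool call are all hypothetical.

```python
import json

# Hypothetical tool definition in the OpenAI-style JSON schema format (an assumption).
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]


def get_weather(city: str) -> dict:
    # Stub implementation; a real pipeline would query a weather service here.
    return {"city": city, "forecast": "sunny"}


# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}


def dispatch_tool_call(tool_call: dict) -> str:
    """Run the function named in a tool call and return its result as JSON."""
    name = tool_call["function"]["name"]
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    # In this message shape, arguments arrive as a JSON-encoded string.
    args = json.loads(tool_call["function"]["arguments"])
    return json.dumps(REGISTRY[name](**args), ensure_ascii=False)


# Sample tool call shaped like a model response (made up for this example).
sample_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Bangkok"}'},
}
print(dispatch_tool_call(sample_call))
```

In a real pipeline, the returned JSON string would be sent back to the model as a tool-result message so it can compose the final answer.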

High-Throughput & Cost-Effective

At 64 concurrent requests, a single H100 can process over 3,000 tokens per second — enabling inference costs as low as $0.10 per million tokens.
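The link between throughput and cost per token can be checked with simple arithmetic. The sketch below is a back-of-envelope calculation, not a pricing formula from the Typhoon team; the function names are hypothetical, and the GPU hourly price is derived from the quoted figures, not taken from this page.

```python
def cost_per_million_tokens(gpu_usd_per_hour: float, tokens_per_second: float) -> float:
    """Serving cost in USD per one million tokens at a sustained throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_usd_per_hour / tokens_per_hour * 1_000_000


def implied_gpu_usd_per_hour(usd_per_million: float, tokens_per_second: float) -> float:
    """Hourly GPU price that would yield the given cost per million tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return usd_per_million * tokens_per_hour / 1_000_000


THROUGHPUT = 3000.0   # tokens/sec on one H100, as quoted above
TARGET_COST = 0.10    # USD per million tokens, as quoted above

hourly = implied_gpu_usd_per_hour(TARGET_COST, THROUGHPUT)
print(f"Implied GPU price: ${hourly:.2f}/hour")
print(f"Round trip: ${cost_per_million_tokens(hourly, THROUGHPUT):.2f} per million tokens")
```

At 3,000 tokens/sec a GPU produces 10.8 million tokens per hour, so $0.10 per million tokens corresponds to an all-in GPU cost of about $1.08 per hour. A higher hourly price or lower sustained throughput raises the per-token cost proportionally.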

Two Optimized Variants

Pick the perfect balance of speed and scale: 4B for ultra-efficient inference on edge devices, or 30B (A3B) for production-grade power.
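A rough weight-memory estimate makes the trade-off concrete. This is a sketch under stated assumptions: nominal parameter counts of 4 billion and 30 billion, 16-bit weights (2 bytes per parameter), and no allowance for KV cache or activations. The helper name is hypothetical.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Nominal parameter counts taken from the variant names (an assumption).
for name, params in [("4B", 4e9), ("30B (A3B)", 30e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights at 16-bit precision")
```

Under these assumptions the 4B variant needs about 8 GB for weights and the 30B variant about 60 GB. If "A3B" follows the Qwen 3 mixture-of-experts naming (roughly 3 billion parameters active per token), the full 30B still has to reside in memory even though only a fraction is used for each token.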

Proprietary Performance. Open Source Benefit.

Typhoon 2.5 matches or exceeds GPT-4.1 mini and Claude Sonnet 4 in benchmarks, while offering full control, transparency, and cost efficiency.

Release History

Version 1

October 20, 2025

Initial release

Availability

Web Playground