We’re excited to introduce Typhoon Translate—a lightweight, open-source model specialized for Thai-English translation—now available on Hugging Face and Ollama.
Why Typhoon Translate
Translation is one of the most common and valuable uses of language models—whether you're reading an English article, drafting an email to international colleagues, or translating business documents. Our research confirms this trend: people rely on AI-powered translation in both everyday life and professional settings.
But while translation tools are everywhere, they’re not always the right fit—especially when it comes to quality, privacy, and local relevance.
- Many translations still sound awkward or miss key context—requiring tedious human cleanup.
- Cloud-based models often raise concerns when dealing with confidential or sensitive information.
At Typhoon, we saw a clear gap: Thai users need a translation model that’s accurate, secure, and runs locally—especially in a world where Thai is still underrepresented online.
So we built Typhoon Translate.
It’s a lightweight, high-performance Thai-to-English and English-to-Thai translation model designed to run on your own device—giving you full control over your data while delivering natural, human-like translations that rival the best.
The Challenge That Inspired Us
We've all seen funny or confusing translations—like strange Thai on restaurant signs or puzzling instruction manuals. But bad translations are more than just amusing; they can cause real problems.
Thai is classified as a medium-low-resource language. Only about 0.5% of online content is in Thai, while English makes up nearly 50%—a hundred times more. Unlocking this vast amount of English content requires efficient and accurate translation models.
Now, we have LLMs like GPT and Claude. These tools are much better at translating because they understand the small details and context of what's being said. While using LLMs for translation has become more common, running translations locally on personal devices is still rare due to quality limitations. There are also important use cases for local translation models, especially when dealing with private documents or sensitive information that shouldn't leave an organization's environment.
Small language models are improving every day, but their translation quality has not advanced significantly. Translation remains a challenge due to the long-tail nature of linguistic data.
Introducing Typhoon Translate: Bridging the Gap with Local LLM
Typhoon Translate is designed to solve translation challenges for Thai users, offering clear, natural-sounding translations from a model you can run on your own laptop.
Typhoon Translate provides high-quality English-to-Thai and Thai-to-English translations that feel natural and rival large proprietary models.
Our mission is simple: to help Thai people access information and opportunities by removing language barriers.
Key Highlights
🚀 Lightweight: 4B parameters—runs on a regular laptop, no powerful hardware needed.
🔒 Private & Secure: Translates directly on your device; your data stays with you.
🎯 Natural Translation: Delivers fluid, human-like translations on par with leading proprietary models.
📊 Proven Performance: Outperforms GPT-4o and Gemini 2.5 Flash on our AI-as-a-judge benchmark for Thai-English tasks.
🔧 Open Weight: Available now on Hugging Face and Ollama. Easy to try, fine-tune, or integrate into your own workflow.
Methodology
Translation isn’t a new problem—datasets like SCB-MT and OPUS have supported Thai-English translation for years. However, most were created before the rise of large language models (LLMs), at a time when scale mattered more than fidelity. In the LLM era, the focus has shifted decisively toward quality over quantity.
To meet this new standard, we curated a high-quality Thai-English dataset using a modern, multi-step pipeline:
- Data Sourcing: We collected diverse, publicly available English and Thai texts across multiple domains.
- LLM-Augmented Translation: Multiple LLMs—including Gemma-3 27B, Typhoon, QwQ, and others—were used to generate initial translations, following a two-stage labeling process:
  - Stage 1 emphasized throughput, generating large volumes of synthetic data.
  - Stage 2 applied human-in-the-loop validation, randomly sampling for fluency, accuracy, and tone to ensure quality control.
- Data Mixture & Selection: We filtered and blended sources based on quality signals, then trained multiple checkpoints to benchmark translation performance.
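To make the "quality signals" step above concrete, here is a minimal sketch of one signal commonly used when filtering parallel translation data: the source-to-target length ratio. This is an illustrative heuristic only—the actual Typhoon pipeline combines many signals, and the function and thresholds here are assumptions, not the released filtering code.

```python
# Illustrative quality signal for filtering parallel translation pairs:
# reject pairs whose character-length ratio is implausible. Real data
# pipelines combine many such signals; this is a simplified sketch.

def length_ratio_ok(src: str, tgt: str, lo: float = 0.3, hi: float = 3.0) -> bool:
    """Return True if the target/source character-length ratio is plausible."""
    if not src or not tgt:
        return False  # drop pairs with an empty side
    ratio = len(tgt) / len(src)
    return lo <= ratio <= hi

pairs = [
    ("Hello, how are you?", "สวัสดี สบายดีไหม"),  # plausible pair
    ("Hello", ""),                                 # empty target: dropped
    ("Hi", "A very long unrelated sentence that keeps going and going."),  # bad ratio
]

kept = [(s, t) for s, t in pairs if length_ratio_ok(s, t)]
```

In practice such a filter would sit alongside language identification, deduplication, and model-based quality scores before the data-mixture step.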
The best-performing checkpoint, based on both automatic metrics and human judgment, is what we release as Typhoon Translate.

Domain proportion of our training dataset
Evaluation Framework
To assess Typhoon Translate, we adopted a modern evaluation method using GPT-4o-mini as an “AI judge,” following the AlpacaEval 2.0 framework. This approach compares translations generated by Typhoon Translate against those from other systems—including GPT-4o-mini itself—across both English-to-Thai and Thai-to-English directions.
Instead of traditional metrics like BLEU—which rely on exact word matches—we used this AI-based approach because translation quality involves meaning, tone, and cultural nuance, not just matching words. GPT-4o-mini judged which translation was better based on accuracy, fluency, and context.
The score reflects how often each system's translations were preferred over GPT-4o-mini's own translations.
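The win-rate computation described above can be sketched as follows. The judge here is a stub standing in for a call to GPT-4o-mini (the toy heuristic and all names are illustrative, not the actual evaluation harness); the real judge would prompt the LLM to pick the more accurate, fluent translation of the same source text.

```python
# Minimal sketch of an AlpacaEval-style "LLM as a judge" win rate.
# The judge below is a stub; in the real setup it would be an API call
# to GPT-4o-mini comparing two candidate translations.

def judge(source: str, candidate: str, baseline: str) -> str:
    """Stub judge: returns 'candidate' or 'baseline'.
    Toy heuristic for this sketch: prefer the longer output."""
    return "candidate" if len(candidate) >= len(baseline) else "baseline"

def win_rate(examples: list[dict]) -> float:
    """Fraction of examples where the candidate beats the baseline."""
    wins = sum(
        judge(ex["source"], ex["candidate"], ex["baseline"]) == "candidate"
        for ex in examples
    )
    return wins / len(examples)

examples = [
    {"source": "สวัสดี", "candidate": "Hello there", "baseline": "Hi"},
    {"source": "ขอบคุณ", "candidate": "Thanks", "baseline": "Thank you very much"},
]
rate = win_rate(examples)  # 0.5 with this toy judge
```

A score above 50% means the evaluated system's translations were preferred over the baseline's more often than not.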
Evaluation Dataset
We compiled a balanced and diverse test set covering both directions, sourcing content from high-quality, publicly available datasets:
Thai-to-English Evaluation data (128 samples)
Balanced Thai text samples sourced from:
English-to-Thai Evaluation data (177 samples)
Balanced English samples from:
We benchmarked our model against Google Translate and state-of-the-art language models including GPT-4, Gemini, and Claude.
Evaluation Results: Typhoon Translate Outperforms Leading Models in Both Directions
Recap: We evaluated translation quality using the “LLM as a Judge” method, where GPT-4o-mini compares translations and chooses the better one based on accuracy, fluency, and context. The win rate (%) reflects how often each system’s translation was preferred over GPT-4o-mini’s own output.
English-to-Thai (EN→TH)

Typhoon Translate leads the pack with a 63.8% win rate—outperforming all other evaluated systems, including:
- Gemini 2.5 Flash Preview (61.6%)
- GPT-4.1-2025 (59.3%)
- Claude 3.7 (55.4%)
- GPT-4o-2024 (54.8%)
- Google Translate (44.1%)
- GPT-4.1-mini (41.2%)
Thai-to-English (TH→EN)

Typhoon Translate again takes the lead with a 67.2% win rate, significantly ahead of top-tier models:
- GPT-4o-2024 (62.5%)
- Gemini 2.5 Flash Preview (61.7%)
- GPT-4.1-2025 (60.2%)
- GPT-4.1-mini (55.5%)
- Google Translate (44.1%)
- Claude 3.7 (39.1%)
Translation Demos
English-to-Thai Translation Demos
Below are sample translations from Typhoon Translate for English-to-Thai tasks. Each example is shown alongside GPT-4o’s output, highlighting real-world use cases where translation quality and nuance matter.
1. Fiction
Goal: Preserve the elegance and tone of narrative writing to enhance the reader's experience.
Source: Fictional text generated via ChatGPT.
Input:
Result:

GPT-4o Output:
Typhoon Translate Output:
2. Professional Reports
Goal: Accurately translate professional and business content while preserving technical terminology.
Source: SCBX-AI-Outlook-2025_ENG_Final.pdf
Result:

3. Technical Reports
Goal: Deliver precise translations of technical content with correct use of domain-specific terms.
Source: Typhoon 2: A Family of Open Text and Multimodal Thai Large Language Models
Result:

The Thai-to-English translation demos are provided in the appendix.
Conclusion
These results confirm that smaller, specialized models can outperform larger general-purpose systems when trained with the right data and approach. Typhoon Translate delivers enterprise-grade translation—without the need for cloud APIs or massive compute.
It proves that local, lightweight AI can rival and even surpass the best, especially for Thai-English translation tasks where nuance matters.
Limitations & Future Work
Of course, translation is ultimately about quality and true communication. For important work, we still recommend having a human in the loop for quality assurance (as is the case with this very article). We also welcome feedback to help us improve the model further—feel free to reach out and share your thoughts via the Typhoon Discord.
Note that Typhoon Translate was trained with a context window of 8192 tokens. For the highest translation quality, we recommend keeping input texts within this length. This capacity covers a wide range of documents and common use cases, but translating texts significantly longer than 8192 tokens in a single pass may not yield the best results. We are continuously exploring improvements, including better handling of translation styles and longer contexts in future iterations of the model.
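One practical workaround for the context-window limit is to split a long document on paragraph boundaries and translate each chunk separately. The sketch below uses word count as a crude proxy for tokens; this is an assumption for illustration—the real budget should be measured with the model's own tokenizer and kept well under 8192 tokens to leave room for the translated output.

```python
# Rough sketch: greedily pack whole paragraphs into chunks that fit a
# word budget, so each chunk can be translated in a single pass.
# Word count is a crude stand-in for tokens; use the model's tokenizer
# for a real budget. A single paragraph larger than max_words will
# still form its own oversized chunk and should be split further.

def chunk_paragraphs(text: str, max_words: int = 3000) -> list[str]:
    """Split text into chunks of whole paragraphs, each <= max_words words."""
    chunks, current, count = [], [], 0
    for para in text.split("\n\n"):
        words = len(para.split())
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks

# Toy document: 10 paragraphs of 8 words each.
doc = "\n\n".join("paragraph number %d with a few extra words" % i for i in range(10))
chunks = chunk_paragraphs(doc, max_words=21)  # packs 2 paragraphs per chunk
```

Each chunk can then be sent to the model independently, and the translated chunks concatenated in order.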
Try Typhoon Translate Today
Get started with Typhoon Translate by running it locally on your machine. The model is available on:
New to Ollama? Check out our quick-start tutorial here.
What’s in the Hugging Face Collection?
We provide Typhoon Translate in multiple formats to support different platforms and use cases:
- Transformers
  - Standard PyTorch format, ideal for training, fine-tuning, or use with Hugging Face pipelines
- GGUF
  - Quantized versions for efficient CPU/GPU performance
  - Works with tools like llama.cpp, llamafile, and text-generation-webui
  - Great for lightweight, local deployment across platforms
- MLX
  - Optimized for Apple Silicon (M1/M2/M3)
  - Runs natively via Apple's MLX runtime
Appendix
English-to-Thai Translation Demos (Full Text)
Example 2: Professional Reports
Input:
GPT-4o:
Typhoon Translate:
Example 3: Technical Reports
Input:
GPT-4o:
Typhoon Translate:
Thai-to-English Translation Demos
Example 1: Fiction
The results from Typhoon Translate and GPT-4o are similar in this example.

Example 2: Professional Reports

Example 3: Technical Reports
