Today, our team at Typhoon is proud to share a milestone we’ve been working toward for more than a year: Typhoon Isan, a language technology suite featuring the first production-ready ASR model supporting the Isan dialect, developed with systematic transcription and spelling standards, alongside a collection of open research and linguistic datasets.
This launch represents more than just a new model. It reflects the kind of AI we want to build—technology that understands real people, real voices, and the rich linguistic identity of Thailand—while reinforcing our belief that open research is essential to strengthening Thailand’s NLP and AI ecosystem.
Why We Started Building Typhoon Isan
When we looked at the landscape of speech technology, we saw a clear gap. Most ASR systems struggle with dialects because there simply isn’t enough well-organized, high-quality data. Even our previously launched Typhoon ASR was trained mainly on central Thai.
We believe AI must be truly built for everyone in Thailand. So we started by piloting with Isan as the first dialect: it is considered the most widely spoken dialect in the country, with over 20 million people speaking it daily.
The mission to build AI truly for Thai guided how we approached this project: not just to build a model, but also to build NLP resources for the dialect. As an R&D team comprising linguists, researchers, engineers, and business/community practitioners who come together to develop language technology for Thai, we’re excited to contribute to this milestone.
Challenges
This year-long project was, of course, not an easy one. We faced two foundational challenges throughout:
- **Isan is primarily a spoken language with no clear written standard.** There is no widely accepted writing, spelling, or transcription standard, which makes ASR development difficult. Our linguistics team worked closely with local speakers, teachers, and experts to define consistent transcription and spelling conventions before model training could begin.
- **Isan is a (very) low-resource language.** We often say Thai—central Thai included—is already a low-resource language. Working with a dialect makes the gap even clearer: high-quality datasets are extremely scarce. To address this, the Typhoon linguistics team directly collected spoken data from local speakers across the region, then carefully curated and annotated it according to the established standards.
What We Release
Under Typhoon Isan, we’re releasing both the linguistic foundations and the technical tools needed for dialect AI: open datasets, spelling and transcription standards, and two ASR models designed to understand Isan speech.
1. Typhoon Isan Speech Corpus Suite
These are the core resources we created and used to develop both Typhoon Isan ASR and Typhoon Isan TTS, and we’re releasing them openly so researchers, educators, and developers can build on them:
1.1 Isan Speech Corpus


This dataset contains audio recordings of Isan (Northeastern Thai) speech, paired with rich transcriptions and demographic metadata. It is designed to support Automatic Speech Recognition (ASR), dialect study, and text normalization tasks for the Isan language.
The dataset features spontaneous responses to specific prompts, covering multiple domains (e.g., General and Finance), recorded by speakers from different provinces in Northeastern Thailand.
1.2 Isan Spelling Standard

A Thai-script–based orthographic system designed to represent Isan words consistently and systematically. It provides clear rules for writing Isan in a way that supports linguistic research, dataset creation, and AI model training.
1.3 Isan Speech Transcription Convention

A comprehensive guideline for transcribing spoken Isan in a consistent, machine-readable form. It defines rules for segmenting speech, marking tones, representing pronunciation, and handling variations across regions—ensuring high-quality annotations for AI and NLP training.
1.4 Isan Phonetic Dictionary

A curated phonetic lexicon that maps words to their Isan pronunciations. It provides consistent phonetic representations to support speech recognition and text-to-speech development.
Access the dictionary dataset here.
Access the phonetic transcription guideline here.
1.5 Isan Dialect Classification

An analytical report on how Isan accents vary across Northeastern provinces, using linguistic features to identify groups, patterns, and similarities among regional speech varieties.
1.6 Technical Report
A technical report detailing our development of the Isan speech dataset and the transcription protocols.
2. Typhoon Isan ASR
Typhoon Isan ASR is an open-source Automatic Speech Recognition model capable of accurately transcribing spoken Isan into text following the transcription standards we developed and published. It includes two model variants to suit different types of applications:
2.1 Typhoon Isan ASR Real-time

Typhoon Isan ASR Real-time is an open-source speech recognition model designed to support the Isan dialect alongside central Thai. It operates with high speed, accuracy, and low latency, even on standard hardware—making it ideal for real-time applications such as online meetings or intelligent assistant systems.
This model is fine-tuned from Typhoon ASR Real-time, which uses the NVIDIA NeMo FastConformer-transducer-large architecture, and is optimized to address the difficulties mainstream ASR systems face when handling Thai regional dialects.
2.2 Typhoon Isan ASR Whisper

Typhoon Isan ASR Whisper is an open-source ASR model for Thai speech, fine-tuned to support the Isan dialect. It is adapted from Whisper Medium (Biodatlab), built on OpenAI’s Whisper architecture. This enables the model to handle Thai speech—including code-switching with English and other languages—with high accuracy.
This variant addresses a common limitation of general-purpose ASR systems, which often fail to recognize regional dialects accurately, while remaining compatible with standard Whisper pipelines.
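Because this variant stays compatible with standard Whisper pipelines, it should load with the Hugging Face `transformers` ASR pipeline. A minimal sketch, assuming the checkpoint id matches the one in our evaluation table below; the audio filename is a placeholder:

```python
# Sketch: transcribing an Isan recording with the Whisper variant.
# Assumes the checkpoint "scb10x/typhoon-isan-asr-whisper" is available
# on the Hugging Face Hub; "interview.wav" is a placeholder path.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="scb10x/typhoon-isan-asr-whisper",
    chunk_length_s=30,  # process long recordings in 30-second chunks
)

result = asr("interview.wav")
print(result["text"])
```

Because the model follows the standard Whisper interface, existing batch-transcription or subtitling workflows built on Whisper should work without modification.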
Comparison: Typhoon Isan ASR Real-time vs. Typhoon Isan ASR Whisper
Choose Typhoon Isan ASR Real-time for live transcription, or Typhoon Isan ASR Whisper for high-accuracy transcription of recorded audio.
| Feature | Typhoon Isan ASR Real-time | Typhoon Isan ASR Whisper |
|---|---|---|
| Key Strengths | Low-latency real-time ASR; Runs fast on CPU or small GPUs; suitable for edge devices | High accuracy, multilingual support |
| Mode | Streaming, immediate transcription | Batch processing after audio upload |
| Model Size | ~115M parameters (lightweight) | ~800M parameters (larger, accuracy-focused) |
| Architecture | NVIDIA NeMo fastConformer-transducer-large | OpenAI Whisper Medium (fine-tuned by Biodatlab) |
| On-prem Deployment | Fully supported | Fully supported |
| Cost & Accessibility | Very low compute cost | Higher compute needs but still cheaper than commercial ASR |
Model Evaluation
To assess Typhoon Isan ASR's performance on natural Isan speech, we evaluated it against a range of strong ASR baselines: Gemini, academic dialect models, and Whisper-based systems. Our internal test set contains 500 utterances from speakers across 10 provinces in Northeastern Thailand (Khon Kaen, Roi Et, Udon Thani, Ubon Ratchathani, Chaiyaphum, Maha Sarakham, Kalasin, Nong Bua Lamphu, Sakon Nakhon, and Yasothon), capturing a broad range of regional accents.
Character Error Rate (CER)
Character Error Rate (CER) reflects accuracy at the character level and is widely used for evaluating Thai and low-resource languages. A lower CER indicates higher transcription accuracy.
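Concretely, CER is the character-level Levenshtein (edit) distance between the reference transcript and the model's hypothesis, divided by the reference length. A minimal sketch in Python; the function name and example strings are ours, not part of the released tooling:

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance / reference length."""
    # Dynamic-programming Levenshtein distance over characters.
    prev = list(range(len(hypothesis) + 1))
    for i, rc in enumerate(reference, start=1):
        curr = [i]
        for j, hc in enumerate(hypothesis, start=1):
            cost = 0 if rc == hc else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1] / max(len(reference), 1)

# Example: one dropped character in a 10-character reference.
print(cer("helloworld", "hellowrld"))  # 1 edit / 10 chars = 0.1
```

For Thai and Isan text, which lack whitespace word boundaries, character-level scoring avoids the segmentation ambiguity that makes word error rate (WER) unreliable.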
CER Results

| Model | CER | Notes |
|---|---|---|
| scb10x/typhoon-isan-asr-whisper | 0.0885 | Whisper Medium fine-tuned on the Typhoon Isan dataset. |
| Gemini-2.5-pro | 0.1020 | Large-scale commercial ASR model used as a general benchmark. |
| scb10x/typhoon-isan-asr-realtime | 0.1065 | FastConformer-based real-time model optimized for low latency on standard hardware. |
| scb10x/whisper-medium-dialect-exp2-ep5 | 0.1772 | Whisper Medium fine-tuned on existing dialect datasets from NECTEC and SLSCU. |
| SLSCU_korat_model | 0.7008 | Academic model from Chulalongkorn University primarily trained on Korat (Nakhon Ratchasima) speech. |
Typhoon Isan ASR performs competitively with Gemini—while remaining fully open-source.
The real-time variant achieves accuracy on par with a large commercial ASR system like Gemini, and the Whisper variant outperforms it, demonstrating that open-source, domain-specific models can match proprietary solutions on dialectal speech.
The Whisper-medium-dialect baseline helps us understand how much our new data and standards improve dialect ASR.
By evaluating a model trained on existing dialect datasets from NECTEC and SLSCU, we observe a clear performance gap: Typhoon Isan achieves significantly lower CER. This shows that the new datasets, standardized transcription rules, and linguistic processes we developed directly lead to accuracy improvements.
Video Demo
This video shows example results of Typhoon Isan ASR as well as Typhoon Isan TTS.
This video shows an intelligent voice agent whose pipeline connects Typhoon Isan ASR, Typhoon Isan TTS, and the Typhoon LLM (Typhoon 2.5), demonstrating what is already possible today.
Quick Access to Typhoon Isan Resources
- Typhoon Isan ASR Real-time
- Typhoon Isan ASR Whisper
- Isan Speech Corpus
- Isan Linguistic Resources
Looking Forward
Typhoon Isan marks an important step in our larger vision: AI that understands every dialect, every accent, and every identity in Thailand.
When AI can process local languages, it unlocks benefits far beyond technology—cultural preservation, accessibility, economic inclusion, and empowerment for millions of people who have long been underrepresented in the digital world.
Benefits to Users
- **Easier access to ASR technology:** Supports both central Thai and Isan without relying on foreign services.
- **Lower development and deployment cost:** Small and efficient models with open licenses allow organizations to run ASR even on everyday hardware.
- **Improved communication in the Isan region:** Enables schools, local businesses, and public sector organizations to adopt voice technology more effectively.
- **Greater language equity in AI:** Helps rural communities access AI in their own language.
Example Use Cases
- Intelligent assistants or call centers that support Isan
- Tools for journalists or researchers to transcribe interviews with local communities
- Smart City or public-service voice interfaces
- Government agencies or organizations needing Thai + Isan transcription
- Multimedia workflows such as auto-subtitling or podcast transcription
For us, this is only the beginning. If you’re a developer, researcher, or simply someone who cares about language, we invite you to explore the dataset, try the models, and build with us.
We plan to continue engaging with communities, expanding datasets, and building models that reflect the linguistic diversity of Thailand. This is why we are hosting an event today, "TYPHOON: Hed Hai AI Jai Isan," to create a public discussion around local AI and bring together diverse stakeholders: businesses, technology providers, AI researchers and academics, linguists, and general users. The discussions will be recorded and released in a future blog post.