Today, our team at Typhoon is proud to share a milestone we’ve been working toward for more than a year: Typhoon Isan, a language technology suite featuring the first production-ready ASR model supporting the Isan dialect, developed with systematic transcription and spelling standards, alongside a collection of open research and linguistic datasets.
This launch represents more than just a new model. It reflects the kind of AI we want to build—technology that understands real people, real voices, and the rich linguistic identity of Thailand—while reinforcing our belief that open research is essential to strengthening Thailand’s NLP and AI ecosystem.
Why We Started Building Typhoon Isan
When we looked at the landscape of speech technology, we saw a clear gap. Most ASR systems struggle with dialects because there simply isn’t enough well-organized, high-quality data. Even our previously launched Typhoon ASR was trained mainly on central Thai.
We believe AI must be truly built for everyone in Thailand. So we started by piloting with Isan as the first dialect: it is considered the most widely spoken dialect in the country, with over 20 million people speaking it daily.
The mission to build AI truly for Thai guided how we approached this project: not just to build a model, but also to build NLP resources for the dialect. As an R&D team comprising linguists, researchers, engineers, and business/community practitioners who come together to develop language technology for Thai, we’re excited to contribute to this milestone.
Challenges
This year-long project was, of course, not an easy one. We faced two foundational challenges throughout:
- **Isan is primarily a spoken language with no clear written standard.** There is no widely accepted writing, spelling, or transcription standard, which makes ASR development difficult. Our linguistics team worked closely with local speakers, teachers, and experts to define consistent transcription and spelling conventions before model training could begin.
- **Isan is a (very) low-resource language.** We often say Thai—central Thai included—is already a low-resource language. Working with a dialect makes the gap even clearer: high-quality datasets are extremely scarce. To address this, the Typhoon linguistics team directly collected spoken data from local speakers across the region, then carefully curated and annotated it according to the established standards.
What We Release
Under Typhoon Isan, we’re releasing both the linguistic foundations and the technical tools needed for dialect AI: open datasets, spelling and transcription standards, and two ASR models designed to understand Isan speech.
1. Typhoon Isan Speech Corpus Suite
These are the core resources we created and used to develop both Typhoon Isan ASR and Typhoon Isan TTS, and we’re releasing them openly so researchers, educators, and developers can build on them:
1.1 Isan Speech Corpus


This dataset contains audio recordings of Isan (Northeastern Thai) speech, paired with rich transcriptions and demographic metadata. It is designed to support Automatic Speech Recognition (ASR), dialect study, and text normalization tasks for the Isan language.
The dataset features spontaneous responses to specific prompts, covering multiple domains (e.g., General and Finance), recorded by speakers from different provinces in Northeastern Thailand.
1.2 Isan Spelling Standard

A Thai-script–based orthographic system designed to represent Isan words consistently and systematically. It provides clear rules for writing Isan in a way that supports linguistic research, dataset creation, and AI model training.
1.3 Isan Speech Transcription Convention

A comprehensive guideline for transcribing spoken Isan in a consistent, machine-readable form. It defines rules for segmenting speech, marking tones, representing pronunciation, and handling variations across regions—ensuring high-quality annotations for AI and NLP training.
1.4 Isan Phonetic Dictionary

A curated phonetic lexicon that maps words to their Isan pronunciations. It provides consistent phonetic representations to support speech recognition and text-to-speech development.
Access the dictionary dataset here.
Access the phonetic transcription guideline here.
1.5 Isan Dialect Classification

An analytical report on how Isan accents vary across Northeastern provinces, using linguistic features to identify groups, patterns, and similarities among regional speech varieties.
1.6 Technical Report
A technical report detailing our development of the Isan speech dataset and the transcription protocols.
2. Typhoon Isan ASR
Typhoon Isan ASR is an open-source Automatic Speech Recognition model capable of accurately transcribing spoken Isan into text following the transcription standards we developed and published. It includes two model variants to suit different types of applications:
2.1 Typhoon Isan ASR Real-time

Typhoon Isan ASR Real-time is an open-source speech recognition model designed to support the Isan dialect alongside central Thai. It operates with high speed, accuracy, and low latency, even on standard hardware—making it ideal for real-time applications such as online meetings or intelligent assistant systems.
This model is fine-tuned from Typhoon ASR Real-time, which uses the NVIDIA NeMo FastConformer-transducer-large architecture, and is optimized to address the difficulties mainstream ASR systems face when handling Thai regional dialects.
2.2 Typhoon Isan ASR Whisper

Typhoon Isan ASR Whisper is an open-source ASR model for Thai speech, fine-tuned to support the Isan dialect. It is adapted from Whisper Medium (Biodatlab), built on OpenAI’s Whisper architecture. This enables the model to handle Thai speech—including code-switching with English and other languages—with high accuracy.
This variant addresses a common limitation of general-purpose ASR systems, which often fail to recognize regional dialects accurately, while remaining compatible with standard Whisper pipelines.
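Because this variant stays compatible with standard Whisper pipelines, it should load with the Hugging Face `transformers` ASR pipeline. A minimal sketch, assuming the checkpoint id matches the one in our evaluation table below; the audio filename is a placeholder:

```python
# Sketch: transcribing an Isan recording with the Whisper variant.
# Assumes the checkpoint "scb10x/typhoon-isan-asr-whisper" is available
# on the Hugging Face Hub; "interview.wav" is a placeholder path.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="scb10x/typhoon-isan-asr-whisper",
    chunk_length_s=30,  # process long recordings in 30-second chunks
)

result = asr("interview.wav")
print(result["text"])
```

Because the model follows the standard Whisper interface, existing batch-transcription or subtitling workflows built on Whisper should work without modification.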
Comparison: Typhoon Isan ASR Real-time vs. Typhoon Isan ASR Whisper
Choose Typhoon Isan ASR Real-time for live transcription, or Typhoon Isan ASR Whisper for high-accuracy transcription of recorded audio.
| Feature | Typhoon Isan ASR Real-time | Typhoon Isan ASR Whisper |
|---|---|---|
| Key Strengths | Low-latency real-time ASR; Runs fast on CPU or small GPUs; suitable for edge devices | High accuracy, multilingual support |
| Mode | Streaming, immediate transcription | Batch processing after audio upload |
| Model Size | ~115M parameters (lightweight) | ~800M parameters (larger, accuracy-focused) |
| Architecture | NVIDIA NeMo fastConformer-transducer-large | OpenAI Whisper Medium (fine-tuned by Biodatlab) |
| On-prem Deployment | Fully supported | Fully supported |
| Cost & Accessibility | Very low compute cost | Higher compute needs but still cheaper than commercial ASR |
Model Evaluation
To assess Typhoon Isan ASR's performance on natural Isan speech, we evaluated it against a range of strong ASR baselines: Gemini, academic dialect models, and Whisper-based systems. Our internal test set contains 500 utterances from speakers across 10 provinces in Northeastern Thailand (Khon Kaen, Roi Et, Udon Thani, Ubon Ratchathani, Chaiyaphum, Maha Sarakham, Kalasin, Nong Bua Lamphu, Sakon Nakhon, and Yasothon), capturing a broad range of regional accents.
Character Error Rate (CER)
Character Error Rate (CER) reflects accuracy at the character level and is widely used for evaluating Thai and low-resource languages. A lower CER indicates higher transcription accuracy.
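Concretely, CER is the character-level Levenshtein (edit) distance between the reference transcript and the model's hypothesis, divided by the reference length. A minimal sketch in Python; the function name and example strings are ours, not part of the released tooling:

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance / reference length."""
    # Dynamic-programming Levenshtein distance over characters.
    prev = list(range(len(hypothesis) + 1))
    for i, rc in enumerate(reference, start=1):
        curr = [i]
        for j, hc in enumerate(hypothesis, start=1):
            cost = 0 if rc == hc else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1] / max(len(reference), 1)

# Example: one dropped character in a 10-character reference.
print(cer("helloworld", "hellowrld"))  # 1 edit / 10 chars = 0.1
```

For Thai and Isan text, which lack whitespace word boundaries, character-level scoring avoids the segmentation ambiguity that makes word error rate (WER) unreliable.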
CER Results

| Model | CER | Notes |
|---|---|---|
| scb10x/typhoon-isan-asr-whisper | 0.0885 | Whisper Medium fine-tuned on the Typhoon Isan dataset. |
| Gemini-2.5-pro | 0.1020 | Large-scale commercial ASR model used as a general benchmark. |
| scb10x/typhoon-isan-asr-realtime | 0.1065 | FastConformer-based real-time model optimized for low latency on standard hardware. |
| scb10x/whisper-medium-dialect-exp2-ep5 | 0.1772 | Whisper Medium fine-tuned on existing dialect datasets from NECTEC and SLSCU. |
| SLSCU_korat_model | 0.7008 | Academic model from Chulalongkorn University primarily trained on Korat (Nakhon Ratchasima) speech. |
Typhoon Isan ASR performs competitively with Gemini—while remaining fully open-source.
The real-time variant achieves accuracy on par with a large commercial ASR system like Gemini, and the Whisper variant outperforms it, demonstrating that open-source, domain-specific models can match proprietary solutions on dialectal speech.
The Whisper-medium-dialect baseline helps us understand how much our new data and standards improve dialect ASR.
By evaluating a model trained on existing dialect datasets from NECTEC and SLSCU, we observe a clear performance gap: Typhoon Isan achieves significantly lower CER. This shows that the new datasets, standardized transcription rules, and linguistic processes we developed directly lead to accuracy improvements.
Video Demo
This video shows example results of Typhoon Isan ASR as well as Typhoon Isan TTS.
This video shows an intelligent voice agent whose pipeline connects Typhoon Isan ASR, Typhoon Isan TTS, and the Typhoon LLM (Typhoon 2.5), demonstrating what is already possible today.
Quick Access to Typhoon Isan Resources
- Typhoon Isan ASR Real-time
- Typhoon Isan ASR Whisper
- Isan Speech Corpus
- Isan Linguistic Resources
Looking Forward
Typhoon Isan marks an important step in our larger vision: AI that understands every dialect, every accent, and every identity in Thailand.
When AI can process local languages, it unlocks benefits far beyond technology—cultural preservation, accessibility, economic inclusion, and empowerment for millions of people who have long been underrepresented in the digital world.
Benefits to Users
- **Easier access to ASR technology:** Supports both central Thai and Isan without relying on foreign services.
- **Lower development and deployment cost:** Small and efficient models with open licenses allow organizations to run ASR even on everyday hardware.
- **Improved communication in the Isan region:** Enables schools, local businesses, and public sector organizations to adopt voice technology more effectively.
- **Greater language equity in AI:** Helps rural communities access AI in their own language.
Example Use Cases
- Intelligent assistants or call centers that support Isan
- Tools for journalists or researchers to transcribe interviews with local communities
- Smart City or public-service voice interfaces
- Government agencies or organizations needing Thai + Isan transcription
- Multimedia workflows such as auto-subtitling or podcast transcription
For us, this is only the beginning. If you’re a developer, researcher, or simply someone who cares about language, we invite you to explore the dataset, try the models, and build with us.
We plan to continue engaging with communities, expanding datasets, and building models that reflect the linguistic diversity of Thailand. This is why we are hosting an event today, "TYPHOON: Hed Hai AI Jai Isan," to create a public discussion around local AI and bring together diverse stakeholders: businesses, technology providers, AI researchers and academics, linguists, and general users. The discussions will be recorded and released in a future blog post.