2025 Wrap Up: A Big Year of Typhoon

A year of models, papers, platforms, and community momentum across Thailand and Southeast Asia

Oravee (Orn) Smithiphol

December 24, 2025

Looking Back on a Big Year at Typhoon

2025 was a big year for Typhoon.

From shipping new generations of models to publishing research at top-tier conferences, from growing an open-source community to deploying Typhoon across real-world infrastructure, this year marked a turning point for the project. What started as an ambitious effort to build Thai-optimized AI has grown into a broader ecosystem of models, tools, research, and collaborators.

Most importantly, this year belongs to the people behind it. To our collaborators, users, researchers, partners, and community members: thank you for building, testing, questioning, and pushing Typhoon forward with us. None of this would have happened without your trust and support.

2025 Highlights at a Glance

2025 was a year of rapid building, deeper research, and expanding real-world adoption for Typhoon. Here’s a snapshot of what the year looked like:

  • Model releases: 14 total

    12 new models across text, reasoning, audio, vision, OCR, and translation, plus 2 major updates

  • Research output with conference presence: 14 total

    Peer-reviewed publications across top-tier conferences including ICLR, ACL, EMNLP, Interspeech, and IJCNLP–AACL

  • Community & ecosystem events: 7 hosted or co-hosted

    Events hosted or co-hosted, alongside 20+ speaking engagements across academia, industry, and developer communities

  • Knowledge sharing & community content: 30+ bilingual posts

    A fully bilingual (TH / EN) blog with 30+ posts in each language, including 20+ in-depth articles beyond release announcements, covering best practices, tutorials, thought leadership, and real-world community use cases

  • Website growth:

    60,000 new website visitors discovering Typhoon models, research, and resources

  • Community growth:

    Discord community more than tripled, growing from ~400 to 1,300 members

  • Free API usage:

    14M API calls serving approximately 320M tokens

  • Model downloads:

    Total Hugging Face downloads increased from 330K to 6M. This makes Typhoon the most downloaded open-source Thai-optimized model suite.


What We Shipped — by Category

Model Releases in 2025

In 2025, we shipped 12 new model releases and 2 major updates, spanning reasoning, language, speech, OCR, and regional dialect support.

  1. Typhoon 2

    A major generation upgrade to the Typhoon family, delivering stronger overall performance, improved Thai language understanding, and better efficiency for real-world deployment. Learn more.

  2. Typhoon 2 Audio (Research Preview)

    An early research preview exploring audio-language modeling capabilities, laying the groundwork for future speech understanding and multimodal audio applications. Learn more.

  3. Typhoon 2 Vision (Research Preview)

    A research preview of Typhoon’s vision-language direction, focusing on multimodal understanding for images and documents in Thai and regional contexts. Learn more.

  4. Typhoon T1 (Research Preview)

    Our first open reasoning model, and the first of its kind released from Southeast Asia. Learn more.

  5. Typhoon 2 R1

    Advanced 70B reasoning model combining Typhoon 2 and DeepSeek R1 with enhanced math and coding performance. Learn more.

  6. Typhoon 2.1 Gemma

    Lightweight, high-performance text models with a controllable thinking mode, built on Gemma.
    Learn more.

  7. Typhoon OCR 1.0 (May) & 1.5 (November)

    OCR-focused models optimized for Thai and English documents, with significant accuracy and robustness improvements in v1.5.
    Learn more.

  8. Typhoon Translate 1.0 (June) & 1.5 (November)

    Dedicated Thai–English translation models, upgraded in v1.5 for better fluency and consistency.
    Learn more.

  9. Typhoon ASR Real-time

    A lightweight, low-latency streaming speech recognition model designed for real-world applications.
    Learn more.

  10. Typhoon 2.5

    Our latest generation text model, with improved Thai fluency, stronger agentic capabilities, and better efficiency.
    Learn more.

  11. Typhoon Isan ASR Real-time

    A lightweight streaming ASR model trained specifically on the Isan dialect.
    Learn more.

  12. Typhoon Isan ASR (Whisper-based)

    A Whisper-based speech recognition model fine-tuned for Isan dialect speech.
    Learn more.

Have you tried them all? :)


Open-Source Datasets, Benchmarks, Source Code, and Other Resources

Beyond models, 2025 was a big year for open resources. We released datasets, benchmarks, tools, and developer assets to help the community build, evaluate, and deploy AI systems more effectively.

  1. Typhoon Application Week

    A collection of 7 open-source web applications showcasing real-world use cases built with Typhoon, complete with GitHub repositories that are free to download and explore.

    Available at: apps.opentyphoon.ai

  2. Thai Social Values Dataset

    This dataset contains survey questions and responses designed to explore social attitudes and values among people in Thailand in 2025. It includes a comprehensive set of carefully crafted questions and collected responses aimed at facilitating research on social perspectives, values, and cultural attitudes, as well as on crowdsourcing algorithms. This work was a collaboration with Stanford University.
    Access here.

  3. SeaCrowd-VL

    A visual dataset focused on Southeast Asia, aimed at improving multimodal understanding in regional contexts.
    Learn more.

  4. ThaiOCRBench

    The first robust, Thai-specific OCR benchmark, created to support more reliable evaluation of OCR and document understanding models.
    Learn more.

  5. n8n Template

    A ready-to-use workflow template to help developers integrate Typhoon models into agentic and automation pipelines.
    Template link, Blog post

  6. Typhoon Isan Speech Corpus

    An open speech dataset and accompanying linguistic research for the Isan dialect.
    Full details.


Platforms and Infrastructure Partners

Making strong models is only part of the work. In 2025, we invested heavily in infrastructure, partnerships, and distribution channels to make Typhoon easier to access, deploy, and scale—from local experimentation to production environments.

  1. Typhoon Platform & Documentation Refresh

    A fully refreshed Typhoon website, Playground, bilingual blog, and bilingual developer documentation, making it easier for both Thai and global developers to get started and build with Typhoon.

  2. Typhoon API Pro via Together AI

    Offered a production-grade Typhoon API through Together AI from March through the end of 2025, enabling reliable access for higher-throughput and production use cases.

  3. Availability on Ollama and OpenRouter

    Typhoon models made easily accessible through popular developer platforms, enabling quick local experimentation and flexible routing across providers; a minimal local-call sketch appears after this list.

  4. NVIDIA NIM × Typhoon

    Typhoon models integrated into NVIDIA NIM, supporting production-grade deployment and enterprise-ready inference workflows.

  5. Availability on Float16

    Typhoon models made available through GPU infrastructure provider Float16, supporting developers who want greater control over their deployments.

  6. AWS GAIA Program

    Joined AWS’s GAIA program, with plans to launch Typhoon’s next-generation production environment on AWS in Q1 2026.
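
For quick local experimentation, Ollama exposes an OpenAI-compatible endpoint, so a locally pulled Typhoon model can be called in a few lines of Python. Below is a minimal sketch, assuming Ollama is running on its default port; the model tag is a placeholder, so check the Ollama library for the exact Typhoon build name. The same pattern works against OpenRouter's OpenAI-compatible endpoint by swapping in its base URL and an API key.

```python
from openai import OpenAI

# Ollama serves an OpenAI-compatible API on localhost:11434/v1 by default.
# The model tag below is a placeholder -- pull the actual Typhoon build first
# (e.g. `ollama pull <typhoon-model-tag>`) and substitute its real name.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="typhoon-model-tag",  # placeholder tag
    messages=[
        # "Hello, please introduce yourself" in Thai.
        {"role": "user", "content": "สวัสดีครับ ช่วยแนะนำตัวหน่อย"}
    ],
)
print(response.choices[0].message.content)
```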


Co-Released Artifacts from Institutional Collaborations

We also worked closely with research institutions, industry partners, and ecosystem leaders to co-release research artifacts and applied resources.

  1. SiData+ at Siriraj Hospital

    A medical reasoning model developed through close collaboration, scheduled for public release in early 2026.

  2. Stanford University

    TalkArena is an open research platform designed for evaluating and comparing large audio models.
    Access here.

  3. VISTEC

    Multiple research outputs, including:

    • Thai dialect research (ACL Workshop)
    • Multilingual mathematical reasoning (ACL)
    • Safety model robustness (ACL Workshop)

  4. AI Singapore × Typhoon

    SEALION Audio - A collaborative release expanding audio capabilities for Southeast Asian languages.
    Learn more.

  5. Agoda’s AI Developer Report

    Typhoon contributed insights to Agoda’s regional AI developer report.
    Download the report.

  6. Gemmaverse

    Typhoon 2.1 Gemma model featured in Google DeepMind’s Gemmaverse ecosystem.
    Read the story.


Papers & Conferences

Research has always been at the core of Typhoon. In 2025, our work was published across leading AI conferences, covering language, reasoning, speech, multimodal evaluation, safety, and Southeast Asian–focused datasets.

Below is the full list of 14 papers accepted this year, including both papers led by the Typhoon team (as first author) and collaborative work with our academic and industry partners.

  • Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging — An Open Recipe

    ICLR 2025

    Presents a practical, open recipe for rapidly converting language-specific LLMs into reasoning-capable models using model merging techniques (a toy sketch of weight-space merging appears after this list).

    [Paper]

  • Typhoon T1: An Open Thai Reasoning Model

    ICLR 2025

    Introduces Typhoon T1, an open reasoning model designed specifically for Thai, addressing both linguistic and cultural reasoning challenges.

    [Paper]

  • Enhancing Low-Resource Language and Instruction-Following Capabilities of Audio Language Models

    Interspeech 2025

    Explores methods to improve audio language models for low-resource languages, with a focus on instruction-following and real-world usability.

    [Paper]

  • SkillAggregation: Reference-Free LLM-Dependent Aggregation

    ACL 2025

    Proposes a reference-free evaluation framework that leverages LLMs to aggregate and assess complex skills without ground-truth labels.

    [Paper]

  • Mind the Gap! Static and Interactive Evaluations of Large Audio Models

    ACL 2025

    Examines the discrepancy between static benchmarks and interactive evaluations for large audio models, highlighting gaps in current evaluation practices.

    [Paper]

  • Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia

    ACL 2025

    Introduces SEA-VL, a large-scale multicultural vision-language dataset, and analyzes trade-offs between data collection strategies in Southeast Asia.

    [Paper]

  • Towards Better Understanding of Program-of-Thought Reasoning in Cross-Lingual and Multilingual Environments

    ACL 2025

    Investigates how program-of-thought reasoning generalizes across languages, uncovering challenges in cross-lingual and multilingual settings.

    [Paper]

  • Shortcut Learning in Safety: The Impact of Keyword Bias in Safeguards

    ACL 2025

    Analyzes how keyword-based shortcuts can undermine safety mechanisms, revealing hidden vulnerabilities in safeguard systems.

    [Paper]

  • ThaiInstruct: An Instruction-Following Dataset for Culturally-Aware, Multitask, and Multi-Domain Evaluation in Thai

    EMNLP 2025

    Presents a large-scale instruction-following dataset designed to evaluate cultural awareness, task diversity, and domain coverage in Thai.

    [Paper]

  • Prior Prompt Engineering for Reinforcement Fine-Tuning

    EMNLP 2025

    Studies how prompt design prior to reinforcement fine-tuning impacts learning efficiency and downstream performance.

    [Paper]

  • Unlearning vs. Obfuscation: Are We Truly Removing Knowledge?

    EMNLP 2025

    Examines whether current unlearning methods genuinely remove knowledge or merely obscure it, with implications for model safety and compliance.

    [Paper]

  • FinCoT: Grounding Chain-of-Thought in Expert Financial Reasoning

    EMNLP 2025

    Introduces FinCoT, a framework for grounding chain-of-thought reasoning in expert-level financial knowledge.

    [Paper]

  • Talk Less, Call Right: Enhancing Role-Play LLM Agents with Automatic Prompt Optimization and Role Prompting

    EMNLP 2025

    Proposes techniques to improve role-play agents by optimizing prompts and role definitions for more effective tool use and decision-making.

    [Paper]

  • ThaiOCRBench: A Task-Diverse Benchmark for Vision-Language Understanding in Thai

    IJCNLP–AACL 2025

    Introduces ThaiOCRBench, a comprehensive benchmark covering diverse OCR and vision-language tasks tailored for Thai documents.

    [Paper]
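
To give a flavor of what model merging means in the first paper above, here is a toy sketch of its simplest form: linear interpolation between two checkpoints that share an architecture. The file names and mixing weight are placeholders, and the paper's actual recipe is more involved than this.

```python
import torch

# Toy illustration only: linear interpolation of two checkpoints that share
# the same architecture and parameter names. File names and the mixing
# weight are placeholders, not the paper's actual recipe.
BASE_CKPT = "thai-llm.pt"        # language-specific base model (placeholder)
DONOR_CKPT = "reasoning-llm.pt"  # reasoning-capable donor model (placeholder)
ALPHA = 0.5                      # how far to move toward the donor weights

base = torch.load(BASE_CKPT, map_location="cpu")
donor = torch.load(DONOR_CKPT, map_location="cpu")

# Interpolate every tensor; parameter names must match exactly for this to be meaningful.
merged = {name: (1 - ALPHA) * base[name] + ALPHA * donor[name] for name in base}

torch.save(merged, "merged-llm.pt")
```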


Events & Speaking

Events We Hosted or Co-Hosted

In 2025, we were committed to engaging with the community, bringing researchers, developers, and practitioners together to learn, share, and build with Typhoon.

  1. Typhoon 2 Launch Event

    The official launch of Typhoon 2, featuring model deep dives and discussions on real-world use cases and research directions.

  2. ML Research Meetup #1

    A research-focused meetup bringing together students, researchers, and practitioners to exchange ideas on modern ML and LLM research.

  3. Cursor Meetup Bangkok

    A meetup we supported that brought Cursor users together to mingle and explore AI-powered coding workflows, with practical demonstrations and community-driven discussions.

  4. SEA AI Developer Meetup with AI Singapore

    Co-hosted with AI Singapore (AISG) to connect AI developers across Southeast Asia, share perspectives on local and regional AI models, deliver AI technology keynotes, and introduce a regional AI hackathon.

  5. LLM Fine-Tuning and Deployment Bootcamp

    An intensive full-day bootcamp, in partnership with Float16, focused on fine-tuning and deploying large language models in production environments.

  6. Typhoon Community Meetup

    A casual, community-first gathering for Typhoon users to share projects, give feedback, and connect with the core team.

  7. Typhoon เฮ็ดให้ AI ใจอีสาน (roughly, "Making AI with an Isan Heart")

    A community event highlighting AI for the Isan dialect, combining technical talks with cultural context and local use cases.

Speaking Engagements

As interest in AI continued to grow, so did invitations for the Typhoon team to share our work. In 2025 alone, we were invited to nearly 30 speaking engagements, and ultimately spoke at 20+ events across a wide range of audiences.

Our talks spanned:

  • Academic and student audiences, including institutions such as KMITL, CMKL, KMUTT, and Mahidol University, where we shared research insights and practical guidance for building with LLMs.
  • Business and practitioner audiences, including programs like NIA ACC, Fortune Magazine’s AI Brainstorm, and Techsauce Global Summit — where we ran a two-hour, hands-on workshop demonstrating agentic AI workflows using n8n.
  • Industry-specific conferences, such as the Bangkok Digital Finance Conference.
  • Developer and tech communities, including FOSSASIA Summit and SuperAI Engineer, where we engaged deeply with builders on open-source models, tooling, and real-world implementation challenges.

These conversations helped shape how Typhoon evolves, grounding our work in real needs and real feedback from the ecosystem.


Sharing Knowledge, Best Practices, and Community Stories

In 2025, we doubled down on one of our core commitments: making AI knowledge accessible, practical, and inclusive.

We launched a fully bilingual blog (Thai / English) to ensure that insights, best practices, and real-world experiences are easy to access for everyone—developers, researchers, and practitioners alike. Over the course of the year, we published 30+ blog posts in each language. Excluding model release announcements, we shared 20+ in-depth articles spanning best practices, tutorials, thought leadership, and community stories.

Knowledge & Best Practices

We published long-form, deeply researched articles aimed at helping practitioners build better AI systems in practice, not just in theory. Notable pieces include:

  • Mastering Agentic Workflows: 20 Principles to Build Smarter AI Systems

    A nearly 10,000-word deep dive into evaluation-driven development, context engineering, tool usage, and agentic workflow design. Explore

  • A Practical Guide to Agentic Self-Reflection and Other Methods to Improve LLM Inference on Complex Questions

    A hands-on guide to improving reasoning performance using self-reflection and related prompting techniques (a toy sketch of the loop appears after this list). Explore

  • The Current Landscape of Reasoning Model Development

    An overview of how reasoning models are evolving, including key approaches, trade-offs, and open challenges. Explore
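
To make the self-reflection idea from the guide above concrete, here is a toy sketch of a draft, critique, and revise loop. It assumes the Typhoon API is OpenAI-compatible, and the base URL and model tag shown are placeholders rather than the article's exact setup; check the official documentation for the real values.

```python
from openai import OpenAI

# Assumed OpenAI-compatible endpoint; the base URL and model tag are
# placeholders -- consult the Typhoon docs for the actual values.
client = OpenAI(base_url="https://api.opentyphoon.ai/v1", api_key="YOUR_TYPHOON_API_KEY")
MODEL = "typhoon-instruct"  # illustrative tag

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

question = "A train leaves at 14:40 and arrives at 17:05. How long is the trip?"

# 1) Draft an answer, 2) have the model critique it, 3) revise using the critique.
draft = ask(question)
critique = ask(
    f"Question: {question}\nDraft answer: {draft}\n"
    "List any mistakes or missing steps in the draft."
)
final = ask(
    f"Question: {question}\nDraft answer: {draft}\nCritique: {critique}\n"
    "Write an improved final answer."
)
print(final)
```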

Tutorials & How-To Guides

We also published practical tutorials to lower the barrier to entry for building with Typhoon, covering:

  • Local deployment using tools like Ollama and LM Studio
  • Agentic workflows using n8n
  • Integrations with emerging standards such as MCP

These guides were designed to help developers move quickly from experimentation to working systems; the sketch below shows the kind of agentic pattern they build toward.
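
Here is a minimal sketch of that pattern: a single tool-calling round trip against an OpenAI-compatible chat endpoint. The base URL, model tag, and the get_weather helper are placeholders for illustration; the n8n and MCP guides wrap this same request/response cycle in their own tooling.

```python
import json
from openai import OpenAI

# Assumed OpenAI-compatible endpoint; base URL, key, and model tag are placeholders.
client = OpenAI(base_url="https://api.opentyphoon.ai/v1", api_key="YOUR_TYPHOON_API_KEY")
MODEL = "typhoon-instruct"  # illustrative tag

def get_weather(city: str) -> str:
    """Stand-in tool; a real workflow would call an actual service here."""
    return json.dumps({"city": city, "forecast": "sunny", "temp_c": 33})

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Khon Kaen right now?"}]
response = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
message = response.choices[0].message

# If the model requested a tool call, execute it and send the result back.
if message.tool_calls:
    messages.append(message)
    for call in message.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": get_weather(**args),
        })
    response = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)

print(response.choices[0].message.content)
```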

Beyond hands-on guides, we shared perspectives on broader AI trends and strategic questions, including:

  • Why Local Language Models Matter

    An exploration of the advantages of local-language models for accuracy, culture, and real-world adoption. Explore

Community Stories & Real-World Use Cases

Finally, we highlighted how Typhoon is being used in practice through community and partner stories, including:

  • SiData+ at Siriraj Hospital — Admin chatbot use case
  • VISAI — Legal chatbot
  • Thailand Development Research Institute (TDRI) — Big Data Text Analytics
  • RISA — AI exam tutor
  • Typhoon Community Meetup — multiple real-world projects shared by the community

Together, these stories reflect what matters most to us: not just building models, but helping people use AI meaningfully in real contexts.


Wrapping Up 2025 — and Looking Ahead

As we look back on 2025, what stands out most isn’t just the number of models released or papers published—it’s the momentum. The momentum of a growing community, of research turning into usable systems, and of ideas moving from prototypes to production.

We head into 2026 with a strong foundation: deeper research, broader infrastructure access, and clearer signals from the people who use Typhoon every day. There’s still a lot to explore, improve, and build—but if 2025 is any indication, we’re just getting started.

Thank you for being part of the journey. We’re excited to keep building with you in the year ahead!