Explore our research publications and technical papers that advance Thai language AI development. From foundational models to cutting-edge applications, discover the scientific contributions driving innovation in Thai NLP.
Access our latest research publications covering Thai language models, multimodal systems, and evaluation frameworks.
ThaiOCRBench provides a standardized framework for assessing VLMs in low-resource, script-complex settings and offers actionable insights for improving Thai-language document understanding.
A Thai-script–based orthographic system designed to represent Isan words consistently and systematically. It provides clear rules for writing Isan in a way that supports linguistic research, dataset creation, and AI model training.
A comprehensive guideline for transcribing spoken Isan in a consistent, machine-readable form. It defines rules for segmenting speech, marking tones, representing pronunciation, and handling variations across regions—ensuring high-quality annotations for AI and NLP training.
This work studies how to aggregate granular human feedback into reliable overall judgments, proposing methods tailored to complex settings such as the evaluation of machine learning systems.
AudioJudge investigates how large audio models can be used to evaluate speech, analyzing which design choices and configurations lead to reliable automatic assessments.
ThaiOCRBench introduces a diverse benchmark for testing vision-language models on Thai OCR and understanding tasks, enabling more robust evaluation of Thai-capable multimodal systems.
ThaiSafetyBench proposes a benchmark for evaluating language model safety in culturally specific Thai scenarios, highlighting gaps and risks in current safety alignment.
This paper explores medical reasoning models that output ranked lists of answers instead of a single prediction, aiming to better capture diagnostic uncertainty and support clinical decision-making.
This work investigates techniques for extending audio context in large audio-language models, enabling better comprehension of long-form audio such as lectures and conversations.
Mangosteen provides an open large-scale Thai text corpus designed for pretraining language models, supporting research and development of Thai-centric NLP systems.
This report investigates approaches for prompting a tool-augmented large language model (LLM) to act as a role-playing dialogue agent in the API track of the Commonsense Persona-grounded Dialogue Challenge (CPDC) 2025.
This paper presents FinCoT, a structured chain-of-thought (CoT) prompting framework that embeds domain-specific expert financial reasoning blueprints to guide large language models' behaviors.
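To make the idea concrete, here is a minimal sketch of what a blueprint-guided prompt might look like; the blueprint text and function name below are invented for illustration and are not taken from the FinCoT paper.

```python
# Illustrative structured-CoT prompt that embeds an expert "blueprint"
# before the question. The blueprint below is an invented example;
# FinCoT's actual blueprints come from financial domain experts.
BLUEPRINT = (
    "Expert steps for bond valuation:\n"
    "1. Identify the cash flows.\n"
    "2. Determine the appropriate discount rate.\n"
    "3. Discount each cash flow and sum the results.\n"
)

def blueprint_prompt(question: str) -> str:
    """Prepend the structured reasoning blueprint to the question."""
    return (
        f"{BLUEPRINT}\nFollow the steps above, showing each step.\n\n"
        f"Question: {question}"
    )
```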
Unlearning has emerged as a critical capability for large language models (LLMs) to support data privacy, regulatory compliance, and ethical AI deployment. Recent techniques often rely on obfuscation, injecting incorrect or irrelevant information to suppress knowledge. This work examines whether such obfuscation genuinely removes the targeted knowledge.
This paper investigates prior prompt engineering (pPE) in the context of reinforcement fine-tuning (RFT), where language models (LMs) are incentivized to exhibit behaviors that maximize performance through reward signals.
We present WangchanThaiInstruct, a human-authored Thai dataset for evaluation and instruction tuning, covering four professional domains and seven task types.
This paper presents an integrated architecture and training strategy that improves performance in Thai while retaining strong English capabilities. Our model combines audio understanding and speech instruction following, two capabilities that were previously treated separately.
The paper explores how to improve reasoning in multilingual environments using Program-of-Thought (PoT) prompting, a technique that separates reasoning (written as code) from execution (done by an interpreter).
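As a concrete illustration, here is a minimal sketch of the PoT loop. The `ask_model` placeholder stands in for any text-generation call, and the convention of storing the result in an `answer` variable is ours for illustration, not the paper's exact prompt.

```python
# Minimal Program-of-Thought sketch: the model writes Python (the
# reasoning step), and a local interpreter runs it (the execution step).

def ask_model(prompt: str) -> str:
    raise NotImplementedError("plug in any LLM client here")

POT_TEMPLATE = (
    "Solve the problem by writing Python code only.\n"
    "Store the final answer in a variable named `answer`.\n\n"
    "Problem: {question}\n"
)

def program_of_thought(question: str):
    code = ask_model(POT_TEMPLATE.format(question=question))
    scope: dict = {}
    exec(code, scope)  # execution is delegated to the interpreter
    return scope.get("answer")
```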
The paper introduces SEA-VL, a large-scale, open-source, multicultural vision-language dataset specifically designed to address the underrepresentation of Southeast Asian (SEA) cultures in AI and machine learning research. By combining three methods—crowdsourcing, web crawling, and image generation—the authors collected 1.28 million culturally relevant image-caption pairs from 11 SEA countries, far surpassing existing datasets in both scale and cultural diversity.
This paper presents TalkArena, a new platform for evaluating Large Audio Models (LAMs) through interactive user engagement rather than static benchmarks. By collecting over 7,500 interactions from 484 users using speech-based queries, the authors uncover that users mainly use audio interfaces for tasks that benefit from speed and ease—like seeking knowledge or advice—rather than tasks requiring nuanced speech understanding.
Safeguarding LLMs requires separating harmful prompts from safe ones, a task for which existing safeguard models often rely on surface keywords. We frame this reliance as a shortcut learning problem and conduct experiments revealing how existing models depend on specific keywords for classification rather than semantic understanding. Performance evaluations across six safety benchmarks show that models perform well when keyword distributions align but degrade on out-of-distribution prompts. Results from our counterfactual analysis demonstrate that current safeguard models are vulnerable to keyword distribution shifts due to shortcut learning. These findings highlight the importance of addressing shortcut learning to enhance the robustness of safeguard models.
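One way to operationalize such a counterfactual analysis is a keyword-insertion probe, sketched below. This is illustrative only: `classify` stands in for any safeguard model, and the binary label set is an assumption.

```python
# Hypothetical probe for keyword shortcuts in a safety classifier:
# append a trigger keyword to otherwise benign prompts and measure how
# often the predicted label flips.

def classify(prompt: str) -> str:
    raise NotImplementedError("plug in a safeguard model here")

def keyword_flip_rate(benign_prompts, keyword: str) -> float:
    """Fraction of benign prompts whose label flips to 'harmful' when a
    single trigger keyword is appended (a counterfactual edit)."""
    flips = 0
    for p in benign_prompts:
        if classify(p) == "safe" and classify(f"{p} {keyword}") == "harmful":
            flips += 1
    return flips / len(benign_prompts)
```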
This paper explores data selection and model merging to enhance language-specific LLMs (e.g., Thai) with DeepSeek R1-level reasoning. Using only public datasets and a $120 budget, we achieve this without compromising performance on language tasks.
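For readers unfamiliar with model merging, the sketch below shows its simplest form, linear interpolation of two checkpoints that share an architecture. Production recipes are more involved, and this function is illustrative rather than the paper's method.

```python
# Minimal linear merge of two state dicts (e.g., a language-specific
# checkpoint and a reasoning-tuned checkpoint with identical keys).
# Values are expected to be tensors or arrays supporting arithmetic.

def merge_state_dicts(sd_a: dict, sd_b: dict, alpha: float = 0.5) -> dict:
    """Element-wise interpolation: alpha * A + (1 - alpha) * B."""
    return {k: alpha * sd_a[k] + (1.0 - alpha) * sd_b[k] for k in sd_a}
```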
An open-source effort to develop a Thai reasoning model, accompanied by a comprehensive ablation study.
This work introduces SkillAggregation, a method for aggregating judgments from multiple LLMs without reference labels, extending crowd-layer ideas to NLP and achieving strong results across tasks.
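The sketch below illustrates the reference-free aggregation setting with a simple baseline: weight each judge by its agreement with the per-item majority, then take a weighted vote. SkillAggregation itself estimates judge reliabilities more carefully, so treat this as a baseline for the problem, not the paper's method.

```python
# Baseline aggregation of labels from multiple LLM judges, no reference
# labels required.
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def aggregate(all_judgments):
    """all_judgments: list of {judge_name: label} dicts, one per item."""
    maj = [majority(list(j.values())) for j in all_judgments]
    n = len(all_judgments)
    # Weight each judge by how often it matches the per-item majority.
    weight = {g: sum(j[g] == m for j, m in zip(all_judgments, maj)) / n
              for g in all_judgments[0]}
    # Weighted vote per item.
    return [max(set(j.values()),
                key=lambda lab: sum(w for g, w in weight.items()
                                    if j[g] == lab))
            for j in all_judgments]
```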
Mind the Gap compares static and interactive evaluation setups for large audio models, highlighting gaps between benchmark performance and real-world interactive behavior.
SEA-VL is a multicultural vision-language dataset for Southeast Asia built from crowdsourcing, web crawling, and generation, enabling better evaluation and training of regional vision-language models.
This paper analyzes program-of-thought reasoning for multilingual and cross-lingual settings, studying how reasoning programs transfer across languages and where failure modes arise.
This paper evaluates audio language models for low-resource languages like Thai and proposes data and training strategies that jointly improve audio comprehension and speech instruction-following.
Large language models excel at instruction-following in English, but their performance in low-resource languages like Thai remains underexplored. Existing benchmarks often rely on translations, missing cultural and domain-specific nuances needed for real-world use. We present WangchanThaiInstruct, a human-authored Thai dataset for evaluation and instruction tuning, covering four professional domains and seven task types. Created through a multi-stage quality control process with annotators, domain experts, and AI researchers, WangchanThaiInstruct supports two studies: (1) a zero-shot evaluation showing performance gaps on culturally and professionally specific tasks, and (2) an instruction tuning study with ablations isolating the effect of native supervision. Models fine-tuned on WangchanThaiInstruct outperform those using translated data in both in-domain and out-of-domain benchmarks. These findings underscore the need for culturally and professionally grounded instruction data to improve LLM alignment in low-resource, linguistically diverse settings.
This paper studies how prior prompt engineering influences reinforcement fine-tuning of language models, comparing different prompting strategies and showing that carefully designed prior prompts can induce distinct and beneficial behaviors.
The authors compare unlearning and obfuscation approaches for removing knowledge from language models, examining how much sensitive information actually remains accessible after each method.
This paper presents Typhoon 2, Thai-optimized models for text, vision, and audio. It outlines methods like continual pre-training and post-training to enhance Thai performance, with evaluation across tasks. The series includes models from 1 to 70 billion parameters, safety tools, and advances in document understanding and speech processing.
This paper explores multilingual reasoning distillation in LLMs, proposing d-CoT-nR, a novel approach that incorporates incorrect rationales alongside positive ones to enhance learning. Experiments on multilingual high-school exams show that d-CoT-nR improves accuracy in unseen languages and step-by-step reasoning, outperforming existing methods focused primarily on English. In collaboration with VISTEC.
This work addresses the challenge of overshadowed entities in entity disambiguation (ED) by proposing a debiasing technique to prevent shortcut learning during training. Unlike knowledge-based methods, this approach avoids added computational overhead at inference. Experiments show state-of-the-art performance on ED datasets, offering a fast and effective solution for improving ED. In collaboration with VISTEC.
McCrolin is a multi-consistency cross-lingual training framework designed to enhance consistency, ranking stability, and robustness in cross-lingual QA systems. Using multi-task learning, McCrolin achieves state-of-the-art results on standard QA datasets and excels with varying input sizes. It demonstrates strong generalizability across different encoder architectures and sizes. In collaboration with VISTEC.
This paper evaluates audio language models in low-resource languages, using Thai as an example, revealing their limitations despite multilingual pretraining. It explores data mixtures to optimize models for both a target language and English, integrating audio comprehension and speech instruction-following into a unified framework. The proposed model, Typhoon-Audio, significantly outperforms open-source models and rivals state-of-the-art systems like Gemini-1.5-Pro in both English and Thai.
CrossCheckGPT introduces a reference-free method for ranking hallucinations in multimodal foundation models, leveraging cross-system consistency as a measure of robustness. Applicable across domains and tasks, it uses explicit and implicit consistency metrics to assess hallucination levels. The method demonstrates high correlation with human judgments and supports new benchmarks, including the first audio-visual hallucination benchmark, AVHalluBench. In collaboration with University of Cambridge, Tsinghua University.
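The core idea of cross-system consistency can be sketched as follows. Here `support` is a placeholder for any entailment or similarity scorer, and the function is a simplification of the paper's explicit and implicit consistency metrics.

```python
# Sketch of cross-system consistency scoring: a response is judged less
# hallucinated when other systems' responses support its sentences.

def support(sentence: str, reference: str) -> float:
    raise NotImplementedError("e.g., an NLI model or embedding similarity")

def consistency_score(response_sents, other_responses) -> float:
    """Average best support each sentence receives from other systems."""
    return sum(max(support(s, r) for r in other_responses)
               for s in response_sents) / len(response_sents)
```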
The Typhoon series introduces Thai LLMs optimized for low-resource challenges, using continual training and ThaiExam for evaluation. Fine-tuned for Thai tasks, Typhoon outperforms open-source models and rivals GPT-3.5 in Thai, with greater efficiency.
This paper frames LLM safety classification as a shortcut learning problem, showing that safeguards often rely on keyword distributions rather than deep semantic understanding, leading to brittleness under distribution shifts.
FinCoT proposes a structured chain-of-thought prompting framework grounded in expert financial blueprints, improving model accuracy and interpretability on CFA-style financial reasoning tasks.
This work studies prompting strategies for tool-augmented role-play dialogue agents, proposing rule-based role prompting that reduces over-speaking and improves tool calling to achieve better task performance.
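To illustrate the flavor of rule-based role prompting, a hypothetical system prompt might look like the following; the persona slot and the rules shown are assumptions for illustration, not the report's actual prompt.

```python
# Hypothetical rule-based role prompt for a tool-augmented dialogue agent.
SYSTEM_PROMPT = (
    "You are {persona}. Stay in character at all times.\n"
    "Rules:\n"
    "- Keep replies to at most two sentences (avoid over-speaking).\n"
    "- Call a tool only when the request needs external information.\n"
    "- After a tool call, summarize the result in character.\n"
)

def build_system_prompt(persona: str) -> str:
    return SYSTEM_PROMPT.format(persona=persona)
```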