Explore our research publications and technical papers that advance Thai language AI development. From foundational models to cutting-edge applications, discover the scientific contributions driving innovation in Thai NLP.
Access our latest research publications covering Thai language models, multimodal systems, and evaluation frameworks.
This paper presents TYPHOON 2, a series of Thai-optimized models for text, vision, and audio. It describes the continual pre-training and post-training methods used to strengthen Thai performance and evaluates the models across a range of tasks. The series spans models from 1 to 70 billion parameters and also includes safety tools, along with advances in document understanding and speech processing.
The TYPHOON series introduces Thai LLMs built to address low-resource challenges, using continual training and the ThaiExam benchmark for evaluation. Fine-tuned for Thai tasks, TYPHOON outperforms open-source models and rivals GPT-3.5 on Thai while being more efficient.
CrossCheckGPT introduces a reference-free method for ranking hallucinations in multimodal foundation models, leveraging cross-system consistency as a measure of robustness. Applicable across domains and tasks, it uses explicit and implicit consistency metrics to assess hallucination levels. The method correlates strongly with human judgments and supports new benchmarks, including AVHalluBench, the first audio-visual hallucination benchmark. In collaboration with the University of Cambridge and Tsinghua University.
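As an illustration of the cross-system consistency idea, the sketch below ranks a handful of systems by how well each one's outputs agree with the others' on the same prompts. The token-overlap similarity and the demo outputs are stand-in assumptions for illustration only, not the explicit/implicit consistency metrics used in the paper.

```python
# Minimal sketch of reference-free cross-system consistency ranking.
# Higher consistency with other systems is taken as a proxy for fewer hallucinations.

def token_overlap(a: str, b: str) -> float:
    """Crude stand-in similarity: Jaccard overlap of word sets."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def consistency_scores(outputs: dict[str, list[str]]) -> dict[str, float]:
    """Score each system by the average similarity of its outputs to every
    other system's outputs on the same prompts."""
    systems = list(outputs)
    n_prompts = len(next(iter(outputs.values())))
    scores: dict[str, float] = {}
    for s in systems:
        total = 0.0
        for other in systems:
            if other == s:
                continue
            total += sum(
                token_overlap(outputs[s][i], outputs[other][i])
                for i in range(n_prompts)
            ) / n_prompts
        scores[s] = total / (len(systems) - 1)
    return scores

if __name__ == "__main__":
    demo = {
        "model_a": ["the capital of thailand is bangkok"],
        "model_b": ["bangkok is the capital of thailand"],
        "model_c": ["the capital of thailand is chiang mai"],
    }
    print(consistency_scores(demo))  # model_c ranks lowest (least consistent)
```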
This paper evaluates audio language models in low-resource languages, using Thai as an example, revealing their limitations despite multilingual pretraining. It explores data mixtures to optimize models for both a target language and English, integrating audio comprehension and speech instruction-following into a unified framework. The proposed model, TYPHOON-Audio, significantly outperforms open-source models and rivals state-of-the-art systems like Gemini-1.5-Pro in both English and Thai.
An open-source effort to develop a Thai reasoning model, with a comprehensive ablation study.
This paper explores data selection and model merging to equip language-specific LLMs (e.g., Thai) with DeepSeek R1-level reasoning. Using only public datasets and a $120 budget, the approach achieves this without compromising performance on target-language tasks.
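A minimal sketch of the weight-space merging step, assuming two same-architecture checkpoints (e.g., a Thai-tuned model and a reasoning-tuned model). The file names and the single interpolation ratio are illustrative assumptions; the paper's actual data-selection and merging recipe is more involved.

```python
import torch

def linear_merge(state_a: dict, state_b: dict, alpha: float = 0.5) -> dict:
    """Interpolate parameters: merged = alpha * A + (1 - alpha) * B.
    Assumes both state dicts come from the same architecture."""
    merged = {}
    for name, tensor_a in state_a.items():
        tensor_b = state_b[name]
        if tensor_a.dtype.is_floating_point:
            merged[name] = alpha * tensor_a + (1.0 - alpha) * tensor_b
        else:
            merged[name] = tensor_a  # integer buffers etc.: keep one copy as-is
    return merged

# Usage (paths are placeholders):
# state_a = torch.load("thai_model.pt", map_location="cpu")
# state_b = torch.load("reasoning_model.pt", map_location="cpu")
# torch.save(linear_merge(state_a, state_b, alpha=0.5), "merged_model.pt")
```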
This work introduces SkillAggregation, a novel method for combining judgments from multiple LLMs in NLP tasks without relying on reference labels. Extending the Crowdlayer approach from image classification, SkillAggregation leverages estimates of judge skill during inference. Experiments show that SkillAggregation consistently outperforms existing aggregation methods, achieving state-of-the-art results across most tasks. In collaboration with the University of Cambridge and Stanford University.
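For intuition, the snippet below shows a much-simplified, reference-free judge aggregation: each judge is weighted by its agreement with an initial majority vote, and the final label comes from a skill-weighted vote. This is an illustrative stand-in, not the Crowdlayer-based estimator that SkillAggregation extends.

```python
from collections import Counter

def aggregate(judgments: list[list[str]]) -> list[str]:
    """judgments[j][i] = label assigned by judge j to example i."""
    n_judges, n_items = len(judgments), len(judgments[0])

    # Unweighted majority vote as an initial reference (no gold labels used).
    majority = [
        Counter(judgments[j][i] for j in range(n_judges)).most_common(1)[0][0]
        for i in range(n_items)
    ]

    # Estimate each judge's "skill" as its agreement rate with the majority.
    skill = [
        sum(judgments[j][i] == majority[i] for i in range(n_items)) / n_items
        for j in range(n_judges)
    ]

    # Final labels from a skill-weighted vote.
    final = []
    for i in range(n_items):
        weights = Counter()
        for j in range(n_judges):
            weights[judgments[j][i]] += skill[j]
        final.append(weights.most_common(1)[0][0])
    return final

print(aggregate([["yes", "no"], ["yes", "yes"], ["no", "yes"]]))  # ['yes', 'yes']
```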
This paper explores multilingual reasoning distillation in LLMs, proposing d-CoT-nR, a novel approach that incorporates incorrect rationales alongside positive ones to enhance learning. Experiments on multilingual high-school exams show that d-CoT-nR improves accuracy in unseen languages and step-by-step reasoning, outperforming existing methods focused primarily on English. In collaboration with VISTEC.
This work addresses the challenge of overshadowed entities in entity disambiguation (ED) by proposing a debiasing technique to prevent shortcut learning during training. Unlike knowledge-based methods, this approach avoids added computational overhead at inference. Experiments show state-of-the-art performance on ED datasets, offering a fast and effective solution for improving ED. In collaboration with VISTEC.
McCrolin is a multi-consistency cross-lingual training framework designed to enhance consistency, ranking stability, and robustness in cross-lingual QA systems. Using multi-task learning, McCrolin achieves state-of-the-art results on standard QA datasets and excels with varying input sizes. It demonstrates strong generalizability across different encoder architectures and sizes. In collaboration with VISTEC.