Stabilizing Reasoning in Medical LLMs with Continued Pretraining and Reasoning Preference Optimization
Discover AI via YouTube
Overview
This conference talk explores how medical language models can be improved through continued pretraining and reasoning preference optimization. Learn how researchers took a 72B-parameter LLM, enriched it with domain-specific Japanese medical knowledge via continued pretraining, and then applied preference optimization techniques to achieve more stable reasoning. Discover the methodology behind this specialized medical language model for Japan, presented in the context of MedAI Japan, and consider how similar approaches could be implemented for other languages and countries. The presentation covers work by Wataru Kawakami, Keita Suzuki, and Junichiro Iwasawa of Preferred Networks Inc. and The University of Tokyo, highlighting their Preferred-MedLLM-Qwen-72B model, available on Hugging Face, which builds on the Qwen2.5 base model supporting 29 languages.
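Since the model is released on Hugging Face, it can be tried with the standard transformers API. The following is a minimal sketch, not the authors' code; the repository ID "pfnet/Preferred-MedLLM-Qwen-72B" and the sample prompt are assumptions based on the model name mentioned in the talk, so check the Preferred Networks organization page on Hugging Face for the exact identifier.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository ID; the actual org/name on Hugging Face may differ.
MODEL_ID = "pfnet/Preferred-MedLLM-Qwen-72B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # a 72B model requires multiple GPUs or heavy offloading
    device_map="auto",           # shard layers across available devices (needs `accelerate`)
)

# Example: a Japanese medical prompt ("Explain first-line drugs for hypertension.")
prompt = "高血圧の第一選択薬について説明してください。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Print only the newly generated continuation, not the prompt itself.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```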
Syllabus
Stabilizing Reasoning in Medical LLM (MedAI Japan)
Taught by
Discover AI