This video explores "Light-R1," a set of optimized DeepSeek R1-based models in 32B and 14B sizes. Learn about these new open-source models, which outperform the original DeepSeek R1 32B and 14B versions. Discover the technical advances made by researchers from Qiyuan Tech and Renmin University in their paper "Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond." The 26-minute presentation details how these optimized models achieve superior reasoning capabilities and performance metrics. Access the improved models directly on Hugging Face through the provided links for both the 32B and 14B versions. Ideal for AI researchers, developers, and enthusiasts interested in state-of-the-art language models and reasoning capabilities.
Overview
Syllabus
Improved DeepSeek R1-32B & R1-14B: NEW Light-R1
Taught by
Discover AI