
DeepSeekMath - Pushing the Limits of Mathematical Reasoning in Open Language Models

Yannic Kilcher via YouTube

Overview

Explore a detailed analysis of a research paper that introduces DeepSeekMath 7B, a groundbreaking language model specifically designed for mathematical reasoning. Learn how this 7B parameter model achieves remarkable performance on complex mathematical benchmarks through innovative techniques like Group Relative Policy Optimization (GRPO) and specialized data selection. Understand the model's architecture, which builds upon DeepSeek-Coder-Base-v1.5 and utilizes 120B math-related tokens from Common Crawl, alongside natural language and code data. Discover how this model approaches the performance levels of larger models like Gemini-Ultra and GPT-4 on the competition-level MATH benchmark, achieving a 51.7% score without external toolkits and reaching 60.9% with self-consistency over 64 samples. Delve into the technical details of GRPO, a variant of Proximal Policy Optimization (PPO), and how it optimizes both mathematical reasoning abilities and memory usage.
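The key idea behind GRPO is that it drops PPO's learned value network as a baseline: for each question, the policy samples a group of answers, and each answer's reward is normalized against the group's mean and standard deviation to form its advantage. A minimal sketch of that group-relative advantage computation (the function name and zero-variance handling are illustrative, not taken from the paper or video):

```python
import math

def group_relative_advantages(rewards):
    """GRPO-style advantage: score each sampled output relative to
    the other outputs in its group, replacing PPO's value baseline.

    rewards: list of scalar rewards, one per sampled answer to the
    same question.
    """
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    # If every answer scored the same, there is no signal to rank
    # them; returning zero advantages is one reasonable convention.
    if std == 0:
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]
```

Because the baseline comes from the group statistics rather than a separate critic model, no value network needs to be trained or kept in memory, which is one source of the memory savings the talk highlights.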

Syllabus

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (Paper Explained)

Taught by

Yannic Kilcher

