

Training a Reasoning Model Using DeepSeek with 7GB VRAM - A Fast Fine-tuning Guide

Machine Learning With Hamza via YouTube

Overview

This 27-minute tutorial video shows how to fine-tune a Large Language Model (LLM) for reasoning tasks using the GRPO (Group Relative Policy Optimization) reinforcement learning algorithm on modest GPU hardware. It covers the full process: an explanation of how GRPO works, environment setup, data preparation, model configuration, reward function implementation, training, analysis of the training results, and testing of the fine-tuned model. Fine-tuning runs locally with Unsloth, a Python library for fast fine-tuning, and requires only 7GB of VRAM. Supporting resources include GitHub repositories, Hugging Face documentation, and Unsloth notebooks.
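For readers unfamiliar with GRPO, the core idea can be sketched in a few lines: several completions are sampled for the same prompt, each is scored by a reward function, and each score is normalized against the rest of its group. The sketch below is not taken from the video; the function name, the `eps` constant, and the use of the population standard deviation are assumptions made for illustration, and real implementations may differ in these details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group of completions for a single prompt.

    Hypothetical helper for illustration only. `eps` (an assumed value)
    avoids division by zero when every completion gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, libraries may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions sampled for the same prompt, scored by some reward function.
rewards = [0.0, 1.0, 1.0, 2.0]
advantages = group_relative_advantages(rewards)

# Completions that beat the group average get a positive advantage,
# those below it get a negative one.
for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f} advantage={advantage:+.3f}")
```

Because the baseline is the group's own mean reward, this scheme does not need a separate learned value model, which is one reason GRPO suits limited VRAM.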

Syllabus

00:00 Intro
01:02 Explaining GRPO
08:03 Environment setup guidelines
10:20 Data, model & reward functions
17:57 Training
21:24 Training results
23:47 Testing
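
As a rough illustration of the reward functions covered in the 10:20 segment, the sketch below scores completions on two criteria: whether they follow an expected output format, and whether the extracted answer matches a reference. It is not the video's code. The `<reasoning>`/`<answer>` tag format, the function names, the signatures, and the reward values are all assumptions chosen for the example.

```python
import re

# Assumed output format: reasoning first, then the final answer, each in tags.
ANSWER_PATTERN = re.compile(
    r"<reasoning>.*?</reasoning>\s*<answer>(.*?)</answer>", re.DOTALL
)


def format_reward(completions):
    """Give a small reward (0.5, an assumed value) for following the format."""
    return [0.5 if ANSWER_PATTERN.search(text) else 0.0 for text in completions]


def correctness_reward(completions, expected_answers):
    """Give a larger reward (2.0, an assumed value) for a matching answer."""
    scores = []
    for text, expected in zip(completions, expected_answers):
        match = ANSWER_PATTERN.search(text)
        extracted = match.group(1).strip() if match else None
        scores.append(2.0 if extracted == expected else 0.0)
    return scores


completions = [
    "<reasoning>2 + 2 = 4</reasoning><answer>4</answer>",  # well-formed, correct
    "The answer is 4",  # correct value, but not in the expected format
]
print(format_reward(completions))
print(correctness_reward(completions, ["4", "4"]))
```

Splitting the score into a format reward and a correctness reward lets the model earn partial credit for structure before it reliably produces correct answers.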

Taught by

Machine Learning With Hamza

