ABOUT THE COURSE: This course introduces the fundamental concepts underlying Large Language Models (LLMs). It starts with an introduction to the various problems in NLP and discusses how to approach the problem of language modeling using deep learning. It describes the architectural intricacies of Transformers and the pre-training objectives of the different Transformer-based models. It also discusses recent advances in LLM research, including LLM alignment, prompting, parameter-efficient adaptation, hallucination, bias, and ethical considerations. This course prepares a student to comprehend, critique, and approach various research problems on LLMs.

INTENDED AUDIENCE: UG and PG students in CSE, EE, ECE, IT, Maths, etc.

PREREQUISITES: Mandatory: Machine Learning, Python Programming. Optional: Deep Learning.

INDUSTRY SUPPORT: All industries whose work involves machine learning, such as Google, Microsoft, Adobe, IBM, Accenture, JP Morgan, Wipro, Flipkart, Amazon, etc.
Overview
Syllabus
Week 1
- Introduction to NLP – Part 1 (NLP pipeline, applications of NLP)
- Introduction to NLP – Part 2 (distributional semantics)
- Introduction to Deep Learning (Perceptron, ANN, backpropagation, CNN)
- Word vectors (Word2Vec, GloVe, fastText)
- Introduction to Statistical Language Models (N-gram LM, Perplexity, Smoothing; see the bigram LM sketch after the syllabus)
- Language Models with CNN and RNN
- Introduction to PyTorch
- Implementation of RNNs and LSTMs using PyTorch
- Sequence-to-sequence models, Beam search
- Attention and self-attention (see the self-attention sketch after the syllabus)
- Introduction to Transformers
- Positional Embedding, various tokenization strategies
- Implementation of Transformers using PyTorch
- Transfer Learning: ELMo, BERT (Encoder-only Model)
- Transfer Learning: GPT (Decoder-only Model), T5 (Encoder-decoder model)
- Prompting (hard and soft) and Instruction fine-tuning (FLAN)
- Advanced prompting (Chain-of-Thought, Graph of Thoughts, Prompt Chaining, etc.)
- Introduction to HuggingFace Library (see the pipeline sketch after the syllabus)
- Alignment with Human Feedback: RLHF, RLAIF
- Parameter-efficient adaptation (Prompt tuning, Prefix tuning, LoRA; see the LoRA sketch after the syllabus)
- Knowledge graphs (KGs)
a. Representation, completion
b. Tasks: Alignment and isomorphism
c. Distinction between graph neural networks and neural KG inference
- Open-book question answering: The case for retrieving from structured and unstructured sources; retrieval-augmented inference and generation
- Retrieval augmentation techniques
a. Key-value memory networks in QA for simple paths in KGs
b. Early HotPotQA solvers, pointer networks, reading comprehension
c. REALM, RAG, FiD, Unlimiformer
d. KGQA (e.g., EmbedKGQA, GrailQA)
- Overview of recently popular models such as GPT-4, Llama 3, Claude 3, Mistral, and Gemini
- Ethical NLP – Bias and Toxicity
- Conclusion
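Illustrative code sketches

The sketches below are minimal, self-contained illustrations of a few syllabus topics. They are assumptions about how the material might be demonstrated in code, not the course's own reference implementations.

For the statistical language modeling topic, here is a bigram model with add-one (Laplace) smoothing and a perplexity computation over a toy corpus; the corpus and helper names are illustrative.

```python
# Bigram language model with add-one smoothing and perplexity (toy example).
from collections import Counter
import math

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]

# Count unigrams and bigrams, padding each sentence with <s> and </s>.
unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    tokens = ["<s>"] + sent + ["</s>"]
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

vocab_size = len(unigrams)

def bigram_prob(prev, word):
    # Add-one smoothing: P(word | prev) = (c(prev, word) + 1) / (c(prev) + |V|)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

def perplexity(sentence):
    tokens = ["<s>"] + sentence + ["</s>"]
    log_prob = sum(math.log(bigram_prob(p, w)) for p, w in zip(tokens, tokens[1:]))
    return math.exp(-log_prob / (len(tokens) - 1))

print(perplexity(["the", "cat", "sat"]))  # lower: every bigram was seen in training
print(perplexity(["the", "dog", "ran"]))  # higher: contains an unseen bigram
```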
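For the attention and self-attention topics, this is a sketch of single-head scaled dot-product self-attention in PyTorch; the tensor sizes and layer names are illustrative assumptions.

```python
# Single-head scaled dot-product self-attention over a batch of token embeddings.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
batch, seq_len, d_model = 2, 5, 16
x = torch.randn(batch, seq_len, d_model)  # stand-in for token embeddings

# Learned projections for queries, keys, and values.
w_q = torch.nn.Linear(d_model, d_model, bias=False)
w_k = torch.nn.Linear(d_model, d_model, bias=False)
w_v = torch.nn.Linear(d_model, d_model, bias=False)
q, k, v = w_q(x), w_k(x), w_v(x)

# Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, with d_k = d_model here.
scores = q @ k.transpose(-2, -1) / (d_model ** 0.5)  # (batch, seq_len, seq_len)
weights = F.softmax(scores, dim=-1)                   # each row sums to 1
output = weights @ v                                  # (batch, seq_len, d_model)
print(output.shape)  # torch.Size([2, 5, 16])
```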
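For the HuggingFace Library topic, the pipeline API gives a quick preview; the models used here (the library's default sentiment classifier and gpt2) are common illustrative choices, not ones prescribed by the course.

```python
# Quick tour of the Hugging Face transformers pipeline API.
from transformers import pipeline

# Sentiment analysis with a pretrained encoder-only (BERT-style) model.
classifier = pipeline("sentiment-analysis")
print(classifier("Large language models are fascinating."))

# Text generation with a pretrained decoder-only (GPT-style) model.
generator = pipeline("text-generation", model="gpt2")
print(generator("Language modeling is", max_new_tokens=20)[0]["generated_text"])
```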
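For the parameter-efficient adaptation topic, this sketch attaches LoRA adapters to GPT-2 using the peft library; the base model, target module, and hyperparameters are illustrative assumptions.

```python
# LoRA: train small low-rank adapters while keeping the base model frozen.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")

lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    target_modules=["c_attn"],  # GPT-2's fused query/key/value projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # reports only a small fraction as trainable
```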
Taught by
Prof. Tanmoy Chakraborty, Prof. Soumen Chakrabarti