No data analysis is complete without statistics. Once the data have been collected, statistical tools aim to extract the information hidden in them. Sampling theory and regression analysis are two of the important tools that play a fundamental role in extracting such information. The role of these classical topics of statistics needs to be understood in the context of data science: they have fundamental applicability in data science and need to be understood from a computational perspective through software. This course details the introductory tools of sampling theory and regression analysis. Its objective is to teach how to use these tools with the popular free statistical software R and how to interpret the output.

INTENDED AUDIENCE: UG (undergraduate) students of science and engineering. Students of humanities with a basic mathematical and statistical background, as well as working professionals in analytics, can also take the course.

PREREQUISITES: “Introduction to R Course” and “Essentials of Data Science With R Software – 1 - Probability and Statistical Inference” are preferred. A mathematics background up to class 12 is needed. Some minor statistics background is desirable.

INDUSTRIES SUPPORT: All industries with an R&D setup will use this course.
Week 1: Introduction to Data Science and Calculations with R Software
Week 2: Basic Fundamentals of Sampling
Week 3: Simple Random Sampling
Week 4: Simple Random Sampling with R
Week 5: Stratified Random Sampling
Week 6: Stratified Random Sampling with R
Week 7: Bootstrap Methodology with R
Week 8: Introduction to Linear Models and Regression, and Simple Linear Regression Analysis
Week 9: Simple Linear Regression Analysis with R
Week 10: Multiple Linear Regression Analysis
Week 11: Multiple Linear Regression Analysis with R
Week 12: Variable Selection using LASSO Regression