
# Statistics and R

### Overview

This course teaches the R programming language in the context of statistical data and statistical analysis in the life sciences.

We will learn the basics of statistical inference in order to understand and compute p-values and confidence intervals, all while analyzing data with R code. We provide R programming examples in a way that will help make the connection between concepts and implementation. Problem sets requiring R programming will be used to test understanding and ability to implement basic data analyses. We will use visualization techniques to explore new data sets and determine the most appropriate approach. We will describe robust statistical techniques as alternatives when data do not fit assumptions required by the standard approaches. By using R scripts to analyze data, you will learn the basics of conducting reproducible research.

Given the diversity in educational background of our students, we have divided the course materials into seven parts. You can take the entire series or individual courses that interest you. If you are a statistician, you should consider skipping the first two or three courses. Similarly, if you are a biologist, you should consider skipping some of the introductory biology lectures. Note that the statistics and programming aspects of the class ramp up in difficulty relatively quickly across the first three courses. We start with simple calculations and descriptive statistics. By the third course we will be teaching advanced statistical concepts such as hierarchical models, and by the fourth, advanced software engineering skills such as parallel computing and reproducible research concepts.

These courses make up two Professional Certificates and are self-paced:

Data Analysis for Life Sciences:

• PH525.1x: Statistics and R for the Life Sciences
• PH525.2x: Introduction to Linear Models and Matrix Algebra
• PH525.3x: Statistical Inference and Modeling for High-throughput Experiments
• PH525.4x: High-Dimensional Data Analysis

Genomics Data Analysis:

• PH525.5x: Introduction to Bioconductor
• PH525.6x: Case Studies in Functional Genomics

This class was supported in part by NIH grant R25GM114818.

### Taught by

Michael Love and Rafael Irizarry

## Reviews

3.5 rating, based on 20 Class Central reviews

4.1 rating at edX based on 59 ratings


• A wonderfully presented course which is a part of a larger series of 8 related courses, this course covered the basics of using R and a general overview of statistics. Course material is released every week, but all the quizzes were due about 4 mont…
• Brandt Pence
(Note, I took this before the reorganization of the courses. I believe the material in the first two-three courses remains the same, so my comments should still be valid here.) This is the first course in the PH525 sequence offered by HarvardX on…
• Chris Falter
Pro: If you watch the videos, read the material, and do the exercises, you will emerge with a working understanding of statistics foundations (normal distribution, Student's t-distribution, Monte Carlo simulations, etc.) and R.

Con: The instructors were sometimes very sloppy in their explanations; they tended to use hard-to-grasp lingo in the videos and even in the exercises. Between the forums and the exercise explanations, however, I was able to *eventually* understand the exercises that were poorly worded initially.
• Anonymous
The instruction videos are very sloppy, and the text book and other resources are not very helpful as well. The exercises have questions on topics that have either not been discussed or very poorly described. Additionally, the language of the questi…
• Ayse N.
The way this course is taught feels pretty sloppy; it is easy to feel lost. They teach one way, and the answers they provide for some exercises are written in a completely different way that they have never taught.

To be able to understand some things, you need to already know a bit about the topic.

Also, the way they name variables is quite cringe-worthy: in some places they name a variable "X", and another variable is "x"; since R is case sensitive, no need to worry, right?
• Max Pietsch
I have a background in computer science but none in statistics. I began to get lost in the part about t-tests. This was basic statistical information, so someone with that background would be good to go. To get through the first quarter of the course I had to do a lot of googling for how to work with R, which was fine and helped me learn R.
I'm doing a Master's in Analytics, and this course by Rafael Irizarry is helping me so much in my studies. Amazing course. You get to learn all the necessary tools for data analytics in this. The instructor teaches everything slowly and gradually. Look nowhere else. First take this course for data science and then go for some other course. Highly recommended. I don't know why some people have given this course a low rating.
• Anonymous
I took intro to stats two years ago, now I'm facing econometrics so needed to learn R and brush up on stats. So far (into Week 3) it is a good course for that. There is a fair amount of puzzling things out for yourself, but that's probably a good thing, too. It is a tremendous value for the price ;-)
• Anonymous
Not suitable for beginners.
The tutor gave examples that are not used in the exercises, so you will get lost easily.
• Robert Grutza
The instructor was very good. The material was presented in a logical manner. Nice sample R code with explanations. Etc....
• Anonymous
The course is not well organised, nor does the instructor explain the material in depth. The videos are almost useless; many terms are used in them without explanation, for example, hypergeometric distribution.
• Piotr Dziuba
• Jinwook