Advanced Bioconductor

Harvard University via edX

Details

Provider: edX
Pricing: Free Online Course (Audit)
Languages: English
Certificate: $219.00 Certificate Available
Duration & workload: 4 weeks, 2-4 hours a week
Sessions: On-Demand
Level: Advanced
Subtitles: English

Part of: Data Analysis for Genomics

Overview

In this course, we begin with approaches to visualization of genome-scale data, and provide tools to build interactive graphical interfaces to speed discovery and interpretation. Using knitr and rmarkdown as basic authoring tools, the concept of reproducible research is developed, and the concept of an executable document is presented. In this framework reports are linked tightly to the underlying data and code, enhancing reproducibility and extensibility of completed analyses. We study out-of-memory approaches to the analysis of very large data resources, using relational databases or HDF5 as "back ends" with familiar R interfaces. Multiomic data integration is illustrated using a curated version of The Cancer Genome Atlas. Finally, we explore cloud-resident resources developed for the Encyclopedia of DNA Elements (the ENCODE project). These address transcription factor binding, ATAC-seq, and RNA-seq with CRISPR interference.
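The out-of-memory idea mentioned above can be illustrated with a minimal sketch. The course itself works in R; the example below is only a Python stand-in using the standard-library `sqlite3` module, and the table name `expr`, the toy rows, and the helper functions are hypothetical. The point is that the summary is computed inside the database, so only one row per gene is ever pulled into the analysis session.

```python
import sqlite3

# Hypothetical toy data: (gene, sample, count) rows standing in for a table
# that would be too large to load in full.
rows = [
    ("geneA", "s1", 10), ("geneA", "s2", 14),
    ("geneB", "s1", 3), ("geneB", "s2", 5),
]

def build_backend(path=":memory:"):
    """Create a SQLite database holding the expression table.

    ":memory:" is used here only so the sketch runs anywhere; a file path
    would keep the table on disk, which is the out-of-memory case.
    """
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE expr (gene TEXT, sample TEXT, count INTEGER)")
    con.executemany("INSERT INTO expr VALUES (?, ?, ?)", rows)
    con.commit()
    return con

def mean_count_per_gene(con):
    """Let the database aggregate; only one row per gene reaches Python."""
    query = "SELECT gene, AVG(count) FROM expr GROUP BY gene ORDER BY gene"
    return dict(con.execute(query))

con = build_backend()
means = mean_count_per_gene(con)
print(means)
```

The same pattern applies to any back end that can answer a query without materializing the full table in the session.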

Given the diversity in the educational backgrounds of our students, we have divided the series into seven parts. You can take the entire series or individual courses that interest you. If you are a statistician, you should consider skipping the first two or three courses; similarly, if you are a biologist, you should consider skipping some of the introductory biology lectures. Note that the statistics and programming aspects of the class ramp up in difficulty relatively quickly across the first three courses. By the third course we will be teaching advanced statistical concepts such as hierarchical models, and by the fourth, advanced software engineering skills such as parallel computing and reproducible research concepts.

These courses make up two Professional Certificates and are self-paced:

Data Analysis for Life Sciences:

PH525.1x: Statistics and R for the Life Sciences
PH525.2x: Introduction to Linear Models and Matrix Algebra
PH525.3x: Statistical Inference and Modeling for High-throughput Experiments
PH525.4x: High-Dimensional Data Analysis

Genomics Data Analysis:

PH525.5x: Introduction to Bioconductor
PH525.6x: Case Studies in Functional Genomics
PH525.7x: Advanced Bioconductor

This class was supported in part by NIH grant R25GM114818.

HarvardX requires individuals who enroll in its courses on edX to abide by the terms of the edX honor code. HarvardX will take appropriate corrective action in response to violations of the edX honor code, which may include dismissal from the HarvardX course; revocation of any certificates received for the HarvardX course; or other remedies as circumstances warrant. No refunds will be issued in the case of corrective action for such violations. Enrollees who are taking HarvardX courses as part of another program will also be governed by the academic policies of those programs.

HarvardX pursues the science of learning. By registering as an online learner in an HX course, you will also participate in research about learning. Read our research statement to learn more.

Harvard University and HarvardX are committed to maintaining a safe and healthy educational and work environment in which no member of the community is excluded from participation in, denied the benefits of, or subjected to discrimination or harassment in our program. All members of the HarvardX community are expected to abide by Harvard policies on nondiscrimination, including sexual harassment, and the edX Terms of Service. If you have any questions or concerns, please contact [email protected] and/or report your experience through the edX contact form.

Taught by

Michael Love and Rafael Irizarry

Reviews

3.0 rating, based on 1 Class Central review

Brandt Pence

(Note I took these prior to their reorganization/combination. Back then they were 4 one-week courses, of which I took three, so I will review those modules below and assume that the material in the new course is similar).

>>>>RNA-seq:

This is the first case study course and 5th overall course in the PH525 sequence offered by HarvardX through edX. I had high hopes for these case studies going in, but I left somewhat disappointed. I had hoped to learn how to work through an entire genomics pipeline, but instead most of the questions were similar to those in the Intro to Bioconductor course where much of the hard work was done for you.

RNA-seq is a powerful technique that has essentially replaced microarrays and may someday replace many RT-PCR studies for analysis of gene expression. The major problem with these courses is the massive datasets generated by these studies and the advanced techniques necessary to analyze them. Most personal computers cannot run an RNA-seq pipeline without running out of memory, so computer clusters are often used. This makes it next to impossible to let students go through the entire analysis pipeline, which is what I had hoped would happen in this class. Online resources like Galaxy help with this to an extent by letting individuals run their pipelines on remote servers, but for a programming class this type of analytical strategy isn't appropriate.

Nevertheless, I did learn something about how to use Bioconductor software to explore RNA-seq data, and that aligns well with the rest of the courses in this sequence. The value here is in giving students an overview of this field so that they are prepared for further study or to explore available datasets on their own. I had hoped for a bit more out of this sequence, but the relatively small time commitment and the fact that the courses were free mean I can't complain too much.

Overall, three stars. I had hoped for a little more exposure to some of the RNA-seq analysis workflow, but the technical difficulty involved in using such large datasets makes that impossible for this type of course. This course only took a few hours of work to complete, so it is a minimal time investment that yields a decent introduction to RNA-seq analysis.

>>>>ChIP-seq:

This is the third case study and 7th overall course in the PH525 sequence. This was by far the least useful course in the sequence, and I'm not saying that due to the low grade I received. The whole course probably took 2-3 hours to complete, and there was not a single programming exercise in the entire class. The questions were not particularly straightforward, and many of them had multiple answers with only 2 attempts allowed. While Dr. Liu did cover the material reasonably well, the answers to some of the homework questions were either not apparent from the lectures or not covered at all, and the relatively small number of points available meant that missing a few questions reduced your grade considerably.

ChIP-seq (chromatin immunoprecipitation sequencing) is a powerful technique used to find transcription factor binding sites on genomic DNA by pulling down bound DNA fragments using antibodies against the transcription factor, then sequencing the resulting DNA fragments. While the class itself was a decent overview of the technique and some of the associated technologies, there was no aspect of it related to actually introducing students to performing analyses themselves, which until now had been the theme of the PH525 sequence.

Overall, two stars. This course could use a revamp to focus more on introducing data analysis techniques and making the homework questions a little clearer. There were some complaints in the forums about the homework difficulty in this course, which in my experience is unusual for edX (but almost a default on Coursera).

>>>>DNA Methylation:

This is the fourth and final case study and 8th and final course overall in the PH525 sequence. This was the most useful case study course in the sequence of the three I've taken (I skipped the course on variant discovery due to its reliance on a Linux virtual machine, although I will probably go back and complete that course at some point). One of Dr. Irizarry's research areas involves analysis of DNA methylation through various techniques, and as a result the material in this course is somewhat better than in the other case studies.

DNA methylation is an epigenetic modification that generally reduces gene expression. Epigenetics is an extremely hot field at the moment, and the chance to learn a bit about how the data for these studies are generated was exciting. The course suffered a bit from the same things that plagued the other courses in this sequence in that much of the data analysis pipeline isn't really feasible for the average student (although much of this is based on the limited ability of normal laptops to handle the volume of data these studies generate). Therefore, many of the difficult parts were already completed for you as in previous classes, but this course was still a great introduction to the topic.

Overall, four stars. The best of the case studies I've taken so far. I would like to see longer courses that go more in depth on these data analysis techniques (perhaps including introducing command line analyses for fastq processing and the like), but I don't know if that will be possible without asking students to pay for cloud computing resources like Amazon AWS.
