Section 1: Data in R
Identify the components of RStudio; Identify the subjects and types of variables in R; Summarise and visualise univariate data using tools such as histograms and box plots.
Section 2: Visualising relationships
Produce plots in ggplot2 in R to illustrate the relationship between pairs of variables; Understand which type of plot to use for different variables; Identify methods to deal with large datasets.
Section 3: Manipulating and joining data
Organise different data types, including strings, dates and times; Filter subjects in a data frame, select individual variables, group data by variables and calculate summary statistics; Join separate data frames into a single data frame; Learn how to implement these methods in MapReduce.
Section 4: Transforming data and dimension reduction
Transform data so that it is more appropriate for modelling; Use various methods, including Q-Q plots and the Box-Cox transformation, to assess and transform variables so that they are approximately normally distributed; Reduce the number of variables using PCA; Learn how to apply these techniques when modelling data with linear models.
Section 5: Summarising data
Estimate model parameters, using both point and interval estimates; Differentiate between the statistical concepts of parameters and statistics; Use statistical summaries to infer population characteristics; Work with strings; Learn about k-mers in genomics and their relationship to perfect hash functions as an example of text manipulation.
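As an illustration of the k-mer outcome above, here is a minimal sketch in Java, assuming a DNA alphabet of A, C, G and T. Each base is given a 2-bit code, so every k-mer maps to a distinct integer in the range 0 to 4^k - 1, which is one simple way to obtain a perfect (collision-free) hash. The class and method names (KmerSketch, encode, countKmers) are hypothetical and not part of the course material.

```java
public class KmerSketch {
    // Assumed 2-bit code per base: A=0, C=1, G=2, T=3.
    static int baseCode(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("Unexpected base: " + c);
        }
    }

    // Maps a k-mer to a unique integer in [0, 4^k); valid for k <= 15 so the result fits in an int.
    static int encode(String kmer) {
        int code = 0;
        for (int i = 0; i < kmer.length(); i++) {
            code = code * 4 + baseCode(kmer.charAt(i));
        }
        return code;
    }

    // Counts every k-mer in seq, using the encoded value directly as the array index.
    static int[] countKmers(String seq, int k) {
        int[] counts = new int[1 << (2 * k)]; // 4^k slots, one per possible k-mer
        for (int i = 0; i + k <= seq.length(); i++) {
            counts[encode(seq.substring(i, i + k))]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countKmers("ACGTAC", 2);
        System.out.println("AC occurs " + counts[encode("AC")] + " times");
    }
}
```

Because the encoding is one-to-one, no collision handling is needed; the trade-off is that the table grows as 4^k, so this only suits small k.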
Section 6: Introduction to Java
Use complex data structures; Implement your own data structures to organise data; Explain the differences between classes and objects; Motivate object-orientation.
Section 7: Graphs
Encode directed and undirected graphs in different data structures, such as matrices and adjacency lists; Execute basic algorithms, such as depth-first search and breadth-first search.
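To make the graph outcomes above concrete, the following is a minimal sketch in Java of an adjacency-list encoding together with breadth-first and depth-first search. The class name GraphSketch and its methods are hypothetical, and vertices are assumed to be numbered from 0 to n - 1.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class GraphSketch {
    private final List<List<Integer>> adj = new ArrayList<>();
    private final boolean directed;

    public GraphSketch(int n, boolean directed) {
        this.directed = directed;
        for (int i = 0; i < n; i++) {
            adj.add(new ArrayList<>());
        }
    }

    // An undirected edge is stored in both adjacency lists.
    public void addEdge(int u, int v) {
        adj.get(u).add(v);
        if (!directed) {
            adj.get(v).add(u);
        }
    }

    // Breadth-first search: visits vertices in order of distance from start.
    public List<Integer> bfs(int start) {
        List<Integer> order = new ArrayList<>();
        boolean[] seen = new boolean[adj.size()];
        Queue<Integer> queue = new ArrayDeque<>();
        seen[start] = true;
        queue.add(start);
        while (!queue.isEmpty()) {
            int u = queue.remove();
            order.add(u);
            for (int v : adj.get(u)) {
                if (!seen[v]) {
                    seen[v] = true;
                    queue.add(v);
                }
            }
        }
        return order;
    }

    // Depth-first search: follows each branch to its end before backtracking.
    public List<Integer> dfs(int start) {
        List<Integer> order = new ArrayList<>();
        dfsVisit(start, new boolean[adj.size()], order);
        return order;
    }

    private void dfsVisit(int u, boolean[] seen, List<Integer> order) {
        seen[u] = true;
        order.add(u);
        for (int v : adj.get(u)) {
            if (!seen[v]) {
                dfsVisit(v, seen, order);
            }
        }
    }

    public static void main(String[] args) {
        GraphSketch g = new GraphSketch(5, false);
        g.addEdge(0, 1);
        g.addEdge(0, 2);
        g.addEdge(1, 3);
        g.addEdge(2, 4);
        System.out.println("BFS: " + g.bfs(0)); // [0, 1, 2, 3, 4]
        System.out.println("DFS: " + g.dfs(0)); // [0, 1, 3, 2, 4]
    }
}
```

An adjacency matrix would store the same edges in an n-by-n boolean array; it offers constant-time edge lookup but uses more memory on sparse graphs.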
Section 8: Probability
Determine the probability of events when the probability distribution is discrete; Learn how to approximate.
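For a discrete distribution, the probability of an event is the sum of the probabilities of the outcomes that make up the event. The sketch below shows this in Java for a fair six-sided die, which is an assumed example rather than one taken from the course; the names DiscreteProbSketch, fairDie and probability are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntPredicate;

public class DiscreteProbSketch {
    // Probability mass function of a fair six-sided die (assumed example).
    static Map<Integer, Double> fairDie() {
        Map<Integer, Double> pmf = new LinkedHashMap<>();
        for (int face = 1; face <= 6; face++) {
            pmf.put(face, 1.0 / 6.0);
        }
        return pmf;
    }

    // P(event) = sum of pmf(x) over all outcomes x that belong to the event.
    static double probability(Map<Integer, Double> pmf, IntPredicate event) {
        double total = 0.0;
        for (Map.Entry<Integer, Double> e : pmf.entrySet()) {
            if (event.test(e.getKey())) {
                total += e.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        double pEven = probability(fairDie(), x -> x % 2 == 0);
        double pAtLeastFive = probability(fairDie(), x -> x >= 5);
        System.out.printf("P(even) = %.4f, P(at least 5) = %.4f%n", pEven, pAtLeastFive);
    }
}
```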
Section 9: Hashing
Apply hash functions to basic data structures in Java; Implement your own hash functions and execute these, as well as built-in ones; Differentiate good from bad hash functions based on the concept of collisions.
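One way to see the difference between a good and a bad hash function is to count collisions on the same set of keys. The sketch below, in Java, compares a deliberately weak hash (the sum of character codes, which cannot tell anagrams apart) with a base-31 polynomial hash of the same form as the built-in String.hashCode. The class name HashSketch, the method names and the sample keys are assumptions chosen for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.ToIntFunction;

public class HashSketch {
    // Weak hash: ignores character order, so all anagrams collide.
    static int weakHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h += s.charAt(i);
        }
        return h;
    }

    // Polynomial hash with base 31, the same formula String.hashCode is documented to use.
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Number of keys whose hash value was already produced by an earlier key.
    static int collisions(List<String> keys, ToIntFunction<String> hash) {
        Set<Integer> seen = new HashSet<>();
        int count = 0;
        for (String key : keys) {
            if (!seen.add(hash.applyAsInt(key))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("ab", "ba", "abc", "acb", "bca", "cab");
        System.out.println("weak: " + collisions(keys, HashSketch::weakHash)); // 4
        System.out.println("poly: " + collisions(keys, HashSketch::polyHash)); // 0
    }
}
```

On these six keys the weak hash produces only two distinct values, while the polynomial hash separates all of them because each character is weighted by its position.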
Section 10: Bringing it all together
Understand the context of big data in programming.