Give your career a competitive edge in data mining techniques, data analytics, data visualization, and statistical machine learning from the #1-ranked school for innovation in the U.S.
Learn to navigate large, complex datasets through interactive exploration.
With zettabytes of data being collected annually, governments, companies, and people have more access to data than ever before. With so much data, it can be hard to know where to start looking for important insights or trends to drive business decisions.
Data mining techniques provide the first level of abstraction over raw data by extracting patterns. Big data analytics tools, which are increasingly critical, turn those patterns into meaningful information for better business decisions, and statistical learning theory is applied to find a predictive function based on the data.
You’ll learn to apply mathematical theory and decision-making techniques that are vital to big data analysis, classification, clustering, and association rule mining through real-world projects designed by faculty from Arizona State University.
By committing to 6-9 months of online study, you can earn the Big Data MasterTrack Certificate, a pathway to the online Master of Computer Science degree at Arizona State University.
Course 1: CSE 511 Data Processing at Scale - Database systems provide convenient access to disk-resident data through efficient query processing, indexing structures, concurrency control, and recovery. This course delves into new frameworks for processing and generating large-scale datasets with parallel and distributed algorithms, covering the design, deployment, and use of state-of-the-art data processing systems that provide scalable access to data.

Specific topics covered include:

* Efficient query processing
* Indexing structures
* Distributed database design
* Parallel query execution
* Concurrency control in distributed parallel database systems
* Data management in cloud computing environments
* Data management in Map/Reduce-based systems
* NoSQL database systems

Learners completing this course will be able to:

* Perform queries (e.g., SQL) and analytics tasks in state-of-the-art database systems
* Apply leading-edge techniques to design and tune distributed and parallel database systems
* Utilize existing NoSQL database systems as appropriate for specified cases
* Perform database operations (e.g., selection, projection, join, and group-by) in state-of-the-art cluster computing systems such as Hadoop and Spark
* Perform scalable data processing operations (e.g., selection, projection, join, and group-by) in cloud computing environments, including Amazon Web Services (AWS)

[Read the full course brief>>](https://asuonline.asu.edu/docs/cse_511.pdf "ASU Big Data MasterTrack Certificate Data Processing at Scale Course Brief")
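To give a rough sense of the database operations named in the Course 1 outcomes (selection, projection, join, and group-by), here is a minimal single-machine sketch in plain Python. It is not taken from the course materials: the tables, column names, and helper functions are all hypothetical, and systems such as Hadoop or Spark run the same logical operations in parallel across a cluster.

```python
from collections import defaultdict

# Hypothetical sample tables, invented purely for illustration.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 25.0},
    {"order_id": 2, "customer_id": 11, "amount": 40.0},
    {"order_id": 3, "customer_id": 10, "amount": 15.0},
    {"order_id": 4, "customer_id": 12, "amount": 5.0},
]
customers = [
    {"customer_id": 10, "region": "West"},
    {"customer_id": 11, "region": "East"},
    {"customer_id": 12, "region": "West"},
]


def select(rows, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """Projection: keep only the named columns of each row."""
    return [{col: row[col] for col in columns} for row in rows]


def hash_join(left, right, key):
    """Inner equi-join: index the right table by key, then probe with the left."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def group_by_sum(rows, key, value):
    """Group-by with a SUM aggregate over the value column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


# Roughly: SELECT region, SUM(amount) FROM orders JOIN customers
#          USING (customer_id) WHERE amount >= 10 GROUP BY region
large_orders = select(orders, lambda r: r["amount"] >= 10.0)
joined = hash_join(large_orders, customers, "customer_id")
slim = project(joined, ["region", "amount"])
totals_by_region = group_by_sum(slim, "region", "amount")
print(totals_by_region)
```

The hash join builds an in-memory index on one table and probes it with the other, which is one common join strategy among several.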
Course 2: CSE 572 Data Mining - Data mining was once called “knowledge discovery in databases.” Advances in processing power and speed over the last decade have allowed users to move beyond manual, tedious, and time-consuming practices to quick, easy data analysis that harnesses the power of machine learning and high-performance computing. This course will introduce you to the fundamentals of data mining and pattern recognition. You will gain a deeper understanding of data through hands-on experience in the topic areas of big data analysis, classification, clustering, and association rule mining. Advanced topics such as reinforcement learning, deep learning, transfer learning, and Google's DeepMind will also be covered. By the end of the course, you will be able to apply state-of-the-art data mining technology to real-world applications, analyze and compare competing techniques, and design optimal solutions for a given set of application-driven constraints.

Specific topics covered include:

* Data Mining Fundamentals
* Machine Learning
* Data Collection
* Deep Learning
* Data Visualization
* Reinforcement Learning
* Data Mining Algorithms

Learners completing this course will be able to:

* Differentiate among major data mining techniques such as classification, cluster analysis, and association rule mining
* Apply common data mining algorithms to discover relationships and patterns in large datasets
* Implement more advanced learning algorithms such as deep learning and reinforcement learning
* Utilize open-source tools to build a data mining project to solve a specific problem

[Read the full course brief>>](https://asuonline.asu.edu/docs/cse_572_course_brief_v2.pdf "ASU Big Data MasterTrack Certificate Data Mining Course Brief")
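Association rule mining, one of the Course 2 topic areas, usually scores a candidate rule such as "diapers implies beer" by its support and confidence. The sketch below computes both measures in plain Python. It is an illustration rather than course material: the market-basket transactions and the function names are hypothetical, and a full algorithm such as Apriori would additionally search for the itemsets worth scoring.

```python
# Hypothetical market-basket transactions, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that
    also contain the consequent. Raises ZeroDivisionError if the
    antecedent never occurs."""
    both = count(set(antecedent) | set(consequent), transactions)
    return both / count(antecedent, transactions)


# Score the rule {diapers} -> {beer}.
rule_support = support({"diapers", "beer"}, transactions)
rule_confidence = confidence({"diapers"}, {"beer"}, transactions)
print(rule_support, rule_confidence)
```

Here three of the five transactions contain both items, and three of the four transactions containing diapers also contain beer.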
Course 3: CSE 575 Statistical Machine Learning - The link between inference and computation is central to statistical machine learning, which combines the computational sciences with statistics. In addition to artificial intelligence, fields such as information management, finance, bioinformatics, and communications are significantly influenced by developments in statistical machine learning. This course investigates the data mining and statistical pattern recognition that support artificial intelligence. Main topics covered include supervised learning, unsupervised learning, and deep learning, including major components of machine learning and the data analytics that enable it.

Specific topics covered include:

* Probability distributions
* Maximum likelihood estimation
* Naive Bayes
* Logistic regression
* Support vector machines
* Clustering
* Principal component analysis
* Neural networks
* Convolutional neural networks

Learners completing this course will be able to:

* Distinguish between supervised learning and unsupervised learning
* Apply common probability distributions in machine learning applications
* Use cross-validation to select parameters
* Use maximum likelihood estimation (MLE) for parameter estimation
* Implement fundamental learning algorithms such as logistic regression and k-means clustering
* Implement more advanced learning algorithms such as support vector machines and convolutional neural networks
* Design a deep network using an exemplar application to solve a specific problem
* Apply key techniques employed in building deep learning architectures

[Read the full course brief>>](https://asuonline.asu.edu/docs/cse_575.pdf "ASU Big Data MasterTrack Certificate Statistical Machine Learning")
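As a small illustration of maximum likelihood estimation, one of the Course 3 topics, the sketch below fits a univariate Gaussian to a handful of numbers. It is not drawn from the course materials: the data and function names are hypothetical, and the Gaussian model is an assumption chosen because its MLE has a simple closed form (the sample mean, and the variance computed with a divisor of n rather than n - 1).

```python
import math


def gaussian_mle(samples):
    """Closed-form MLE for a univariate Gaussian: returns (mean, variance)."""
    n = len(samples)
    mu = sum(samples) / n
    # The MLE of the variance divides by n, not by n - 1.
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var


def log_likelihood(samples, mu, var):
    """Log-likelihood of the samples under a Gaussian N(mu, var)."""
    n = len(samples)
    squared_error = sum((x - mu) ** 2 for x in samples)
    return -0.5 * n * math.log(2 * math.pi * var) - squared_error / (2 * var)


# Hypothetical observations, invented for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, var_hat = gaussian_mle(data)
print(mu_hat, var_hat)

# The fitted parameters should score at least as well as nearby alternatives.
best = log_likelihood(data, mu_hat, var_hat)
print(best >= log_likelihood(data, mu_hat + 0.5, var_hat))
print(best >= log_likelihood(data, mu_hat, var_hat * 1.5))
```

Comparing the log-likelihood at the estimate against nearby parameter values is a quick sanity check that the closed-form solution is a maximum.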
Course 4: CSE 578 Data Visualization - Visual representations generated by statistical models help us to make sense of large, complex datasets through interactive exploration, thereby enabling big data to realize its potential for informing decisions. This course covers techniques and algorithms for creating effective visualizations based on principles from graphic design, visual art, perceptual psychology, and cognitive science to enhance the understanding of complex data.

Specific topics covered include:

* Data transformations
* Exploratory querying
* Statistical graphics
* Time series analysis
* Exploratory spatial data analysis

Learners completing this course will be able to:

* Develop exploratory data analysis and visualization tools using Python and Jupyter notebooks
* Apply design principles for a variety of statistical graphics and visualizations including scatterplots, line charts, histograms, and choropleth maps
* Combine exploratory queries, graphics, and interaction to develop functional tools for exploratory data analysis and visualization

[Read the full course brief>>](https://asuonline.asu.edu/docs/cse_578.pdf "ASU Big Data MasterTrack Certificate Data Visualization Course Brief")