This course introduces you to the core concepts, processes, and tools you need to build a foundational knowledge of data engineering. You will gain an understanding of the modern data ecosystem and the roles Data Engineers, Data Scientists, and Data Analysts play in this ecosystem.
The Data Engineering Ecosystem has several components. It spans disparate data types, formats, and sources of data. Data Pipelines gather data from multiple sources, transform it into analytics-ready data, and make it available to data consumers for analytics and decision-making. Data repositories, such as relational and non-relational databases, data warehouses, data marts, data lakes, and big data stores, process and store this data. Data Integration Platforms combine disparate data into a unified view for data consumers. You will learn about each of these components in this course. You will also learn about Big Data and some of the tools used to process it.
A typical Data Engineering lifecycle includes architecting data platforms, designing data stores, and gathering, importing, wrangling, querying, and analyzing data. It also includes performance monitoring and fine-tuning to ensure systems are performing at optimal levels. In this course, you will learn about the data engineering lifecycle. You will also learn about security, governance, and compliance.
Data Engineering is recognized as one of the fastest-growing fields today. The course discusses the career opportunities available in the field and the different paths you can take to enter it.
The course also includes hands-on labs that guide you through creating your IBM Cloud Lite account, provisioning a database instance, loading data into that instance, and running some basic queries that help you understand your dataset.
What is Data Engineering?
In this module, you will learn about the different entities that come together to form a modern data ecosystem and the roles Data Engineers, Data Scientists, Data Analysts, Business Analysts, and Business Intelligence Analysts play in this ecosystem. You will learn what data engineering is and the key tasks in a data engineering lifecycle. You will also gain an understanding of the responsibilities of a data engineer, the skill sets they need to succeed, and what a typical day in the life of a data engineer looks like. At the end of the module, you will be guided to create a Lite account on IBM Cloud.
The Data Engineering Ecosystem
In this module, you will learn about the data engineering ecosystem, the different types of data structures, file formats, sources of data, and the languages data professionals use in their day-to-day tasks. You will gain an understanding of several different types of data repositories, such as relational and non-relational databases, data warehouses, data marts, and data lakes. You will learn about ETL and ELT processes, data pipelines, and data integration platforms. You will also gain an understanding of what big data is and the tools used for processing and storing it. During this module, you will be guided to provision an instance of IBM Db2 using the Cloud Lite account you created in the previous module.
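To make the difference between ETL and ELT concrete, here is a minimal sketch rather than a production pipeline. It assumes Python's built-in sqlite3 module as a stand-in for a data repository, and the sample rows, table names, and the `clean`, `run_etl`, and `run_elt` functions are all hypothetical names chosen for illustration. In ETL the data is transformed before it is loaded; in ELT the raw data is loaded first and transformed inside the repository.

```python
import sqlite3

# Hypothetical raw records, standing in for data gathered from a source system.
RAW_ROWS = [("  Alice ", "42.50"), ("BOB", "17.00"), ("carol", "8.25")]


def clean(name, amount):
    """Transform step: normalize the name and convert the amount to a float."""
    return name.strip().title(), float(amount)


def run_etl(conn):
    """ETL: transform in application code, then load the cleaned rows."""
    conn.execute("CREATE TABLE sales_etl (customer TEXT, amount REAL)")
    cleaned = [clean(name, amount) for name, amount in RAW_ROWS]
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", cleaned)
    return conn.execute(
        "SELECT customer, amount FROM sales_etl ORDER BY customer"
    ).fetchall()


def run_elt(conn):
    """ELT: load the raw rows as-is, then transform inside the repository with SQL."""
    conn.execute("CREATE TABLE sales_raw (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", RAW_ROWS)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT UPPER(SUBSTR(TRIM(customer), 1, 1)) "
        "       || LOWER(SUBSTR(TRIM(customer), 2)) AS customer, "
        "       CAST(amount AS REAL) AS amount "
        "FROM sales_raw"
    )
    return conn.execute(
        "SELECT customer, amount FROM sales_elt ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_etl(connection))
    print(run_elt(connection))
```

Both functions produce the same analytics-ready rows; what differs is where the transformation runs. ELT also keeps the untransformed data in `sales_raw`, which is one reason it is often paired with data lakes.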
Data Engineering Lifecycle
In this module, we will walk you through the data engineering lifecycle. You will learn about the architecture of a data platform, factors for selecting and designing data stores, and the different facets of security as it applies to data platforms and data lifecycle management. You will also learn about the process, steps, and tools used for gathering, importing, wrangling, and querying data. You will gain an understanding of performance monitoring and the steps you can take to troubleshoot performance issues. We will also talk about governance regulations, why we need them, and how technology enables compliance with regulations. During this module, you will be guided to load data from a CSV file into the IBM Db2 instance you created in the previous module. You will also be guided to explore your dataset using some basic SQL queries that will be provided to you.
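The load-and-explore workflow described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module in place of IBM Db2, reads a small made-up CSV from a string instead of a file, and the `orders` table, its columns, and the `load_csv` and `explore` functions are hypothetical. The lab itself uses the tools and queries provided in the course.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; in practice this would be a file on disk.
CSV_TEXT = """product,quantity,unit_price
widget,4,2.50
gadget,1,10.00
widget,2,2.50
"""


def load_csv(conn, text):
    """Create a table and load every CSV row into it; return the row count."""
    conn.execute(
        "CREATE TABLE orders (product TEXT, quantity INTEGER, unit_price REAL)"
    )
    reader = csv.DictReader(io.StringIO(text))
    rows = [
        (row["product"], int(row["quantity"]), float(row["unit_price"]))
        for row in reader
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return len(rows)


def explore(conn):
    """Run two basic exploratory queries: a row count and per-product totals."""
    row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    totals = conn.execute(
        "SELECT product, SUM(quantity), SUM(quantity * unit_price) "
        "FROM orders GROUP BY product ORDER BY product"
    ).fetchall()
    return row_count, totals


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print("rows loaded:", load_csv(connection, CSV_TEXT))
    print(explore(connection))
```

Counting rows and aggregating by a category column are typical first queries for checking that a load succeeded and for getting a feel for the dataset.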
Career Opportunities and Data Engineering in Action
In this module, you will learn about career opportunities in the field of Data Engineering and the different paths you can take to build your skills as a Data Engineer. At the end of the module, you will be presented with the final graded assignment, which is divided into two parts. The first part includes a couple of quiz questions, and the second part includes open-ended questions that will be reviewed and graded by a peer.