Health data are notable for how many types there are, how complex they are, and how much is at stake in getting them right. These data are used to treat the patient from whom they derive, but they also serve other purposes. Examples of such secondary use of health data include population health (e.g., who requires more attention), research (e.g., which drug is more effective in practice), quality (e.g., is the institution meeting benchmarks), and translational research (e.g., are new technologies being applied appropriately). By the end of this course, students will be able to recognize the different types of health and healthcare data, articulate a coherent and complete question, interpret queries designed for secondary use of EHR data, and interpret the results of those queries.
Introduction to Databases and Data Types
In this module, we will begin by introducing and defining databases and placing them within the context of clinical informatics. We will continue by introducing the common health data types, such as demographics, diagnoses, medications, procedures, and utilization data. We will finish this module by reviewing emerging health data types, such as lab orders/results, vital signs, social data, and patient-generated data.
Data Sources and Data Challenges
In this module, we will review the specifications of data extracted from insurance claims and electronic health records. We will then discuss the common challenges in using health data, specifically issues with data quality, data interoperability, and data system architectures. Finally, we will describe the “Big Data” challenges of health data and explain some of the data problems that may hinder analytical efforts.
Formulating Data Questions
With this understanding of the data available, it’s time to see how to turn questions you and your colleagues will have into queries the database can understand. Besides getting rules of thumb for doing this translation, you will also be introduced to three online tools available to test some of these skills. You will also watch an interview with Sam Meiselman, course instructor and the data manager in charge of the Johns Hopkins Enterprise Data Warehouse, who has to use these skills on a daily basis.
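As a rough illustration of what this translation can look like, consider the question "How many patients aged 65 or older have a type 2 diabetes diagnosis?" The sketch below is not taken from the course materials. The tables (`patient`, `diagnosis`), their columns, the sample rows, and the function name are all hypothetical, and a real enterprise data warehouse will be far more complex. The sketch does show two decisions the query forces you to make explicit: the date on which age is measured, and which codes count as a diagnosis (here, ICD-10 codes beginning with E11).

```python
import sqlite3

# Hypothetical, heavily simplified schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, birth_year INTEGER);
CREATE TABLE diagnosis (patient_id INTEGER, icd10_code TEXT, diagnosis_date TEXT);
""")

# Made-up sample rows; patient 1 has the same diagnosis recorded twice.
conn.executemany("INSERT INTO patient VALUES (?, ?)",
                 [(1, 1950), (2, 1985), (3, 1948)])
conn.executemany("INSERT INTO diagnosis VALUES (?, ?, ?)", [
    (1, "E11.9", "2023-03-01"),
    (1, "E11.9", "2023-09-15"),
    (2, "E11.9", "2023-05-20"),
    (3, "I10", "2023-02-11"),
])


def count_older_diabetes_patients(conn, as_of_year=2023, min_age=65):
    """Count distinct patients at least min_age years old in as_of_year
    who have at least one diagnosis code starting with E11."""
    query = """
        SELECT COUNT(DISTINCT p.patient_id)
        FROM patient AS p
        JOIN diagnosis AS d ON d.patient_id = p.patient_id
        WHERE d.icd10_code LIKE 'E11%'
          AND ? - p.birth_year >= ?
    """
    return conn.execute(query, (as_of_year, min_age)).fetchone()[0]


print(count_older_diabetes_patients(conn))
```

Counting distinct patient identifiers rather than diagnosis rows matters here: the question asks about patients, and one patient can carry the same diagnosis on many encounters.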
Real World Applications of Data Science in Health Informatics
To drive home the recurring message on the challenges and art of translating questions into queries, you will see interviews with two professionals: one who comes from the data management side of the equation, and one who comes from the domain side. They will give you perspectives that are both similar (the need to understand the problem for which the data are being retrieved) and different (the multiplicity of data available vs. the richness of the domain problem).