Overview
Explore common pitfalls in data science and statistics in this 55-minute conference talk from Analytics 2018. Learn how to avoid mistakes when preparing and engineering data in T-SQL or on other database systems. The talk covers soft skills, data gathering, general statistics principles, and the handling of outliers, including their removal and a multivariate search for them. It then turns to data normalization, clustering, and correlation, the distinction between causation and correlation, visualization, and prediction, and closes with a recap and questions.
Syllabus
Introduction
Who is a data scientist
Agenda
Soft Skills
Gathering Data
General Statistics
Factor
Test Differences
Causation and Correlation
Outliers
Removing outliers
Multivariate search for outliers
Things we inadvertently forget
Data normalization
Clustering
Normalization
Correlation
Visualization
Problem rules
Prediction
Recap
Questions
Taught by
PASS Data Community Summit