Data Formats for Data Science

EuroPython Conference via YouTube

Overview

This 43-minute EuroPython Conference talk surveys data formats for data science beyond CSV and HDFS. It covers the strengths and limitations of different storage solutions, including plain text, structured formats, and specialized scientific data formats such as HDF5, ROOT, and NetCDF. It compares Python tools for handling data of different structures and sizes: xarray, PyROOT, rootpy, h5py, PyTables, bcolz, and blaze. The talk offers practical guidelines for choosing a format based on data characteristics and computational requirements, and closes with emerging trends in columnar databases such as MonetDB for high-speed in-memory analytics.

Syllabus

Intro
Data formats for data science
Textual data format
LowTake
CSV
Python
Textual Data
Binary Data
New HDF5 File
PyTables
Groups
DataChannelKing
ROOT
ROOT Files
rootpy
root_numpy
NoSQL DB
HDFS
HDFS III
Example
Python Code Example
Tools

Taught by

EuroPython Conference
