Pandas
● Series
● DataFrame
● Index
PyTables
Exercises
Recap
Following the contents of the course
● Series
○ One dimensional array object with an index
● DataFrame
○ Two dimensional array object with index and columns
● Index
○ Used to label index and columns in Series and DataFrame
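As a quick refresher on the three objects above, here is a minimal sketch. The weather data and the variable names are made up for illustration and are not taken from the course notebook.

```python
import pandas as pd

# Series: a one-dimensional array with an index.
temperatures = pd.Series([21.5, 19.0, 23.1], index=["mon", "tue", "wed"])

# DataFrame: a two-dimensional array with an index (rows) and columns.
weather = pd.DataFrame(
    {"temp": [21.5, 19.0, 23.1], "rain": [0.0, 4.2, 0.3]},
    index=["mon", "tue", "wed"],
)

# Row labels and column labels are both Index objects.
row_labels = weather.index
column_labels = weather.columns

# Label-based access goes through the index.
tuesday_temp = temperatures["tue"]   # a scalar
wednesday = weather.loc["wed"]       # a Series indexed by the column labels
```

Note that selecting one row of a DataFrame gives back a Series whose index is the DataFrame's column Index.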
Pandas
Open the notebook and follow the examples
PyTables
● Python interface to HDF5 files
○ Similar to h5py
● Focused on relational data (tables)
● NumPy deals with large datasets in-memory
● PyTables uses NumPy containers as in-memory buffers to push the I/O
bandwidth towards the platform limits
● It does not support transactions, so take care when writing data in
parallel
● It provides different types of containers
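To make the "relational data" point concrete, the sketch below writes a small table to an HDF5 file and queries it. The `Reading` row layout, the file name and the query condition are hypothetical choices for this example; the calls used (`open_file`, `create_table`, `read_where`) are part of the PyTables API.

```python
import os
import tempfile

import tables


class Reading(tables.IsDescription):
    """Hypothetical row layout: one sensor reading per row."""
    sensor_id = tables.Int32Col()
    value = tables.Float64Col()


# Write to a scratch directory so the example leaves the working dir untouched.
path = os.path.join(tempfile.mkdtemp(), "readings.h5")

with tables.open_file(path, mode="w") as h5:
    table = h5.create_table("/", "readings", Reading)
    row = table.row
    for i in range(1000):
        row["sensor_id"] = i % 4
        row["value"] = float(i)
        row.append()          # rows are buffered in memory
    table.flush()             # push the buffered rows to disk

with tables.open_file(path, mode="r") as h5:
    table = h5.root.readings
    n_rows = table.nrows
    # The condition is evaluated by PyTables; only matching rows are returned,
    # as a NumPy structured array.
    high = table.read_where("value >= 990")
    high_values = high["value"].tolist()
```

The result of the query is an ordinary NumPy array, so it can be passed straight to NumPy or pandas code.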
PyTables
● Group
● Table, EArray
○ Allow compression
○ Enlargeable
● CArray
○ Allows compression
○ Not enlargeable
● Array
○ Can be used only with relatively small datasets (i.e. those that fit in memory)
○ Not enlargeable
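The following sketch shows how the compression and enlargeability properties look in code, using three PyTables array containers (`Array`, `CArray`, `EArray`) inside a `Group`. The node names, shapes and compression settings are arbitrary choices made for this example.

```python
import os
import tempfile

import numpy as np
import tables

path = os.path.join(tempfile.mkdtemp(), "containers.h5")
filters = tables.Filters(complevel=5, complib="zlib")  # illustrative settings

with tables.open_file(path, mode="w") as h5:
    group = h5.create_group("/", "demo")

    # Array: written in one go from an in-memory object; fixed shape,
    # no compression.
    h5.create_array(group, "plain", np.arange(10))

    # CArray: chunked and compressible, but the shape is fixed at creation.
    carray = h5.create_carray(
        group, "chunked",
        atom=tables.Float64Atom(), shape=(100, 100), filters=filters,
    )
    carray[:10, :] = 1.0      # fill the first 10 rows; the rest stay 0.0

    # EArray: the dimension declared with length 0 can grow via append().
    earray = h5.create_earray(
        group, "growing",
        atom=tables.Float64Atom(), shape=(0, 3), filters=filters,
    )
    earray.append(np.ones((4, 3)))
    earray.append(np.zeros((2, 3)))

    plain_shape = h5.root.demo.plain.shape
    chunked_sum = float(carray[:].sum())
    growing_shape = earray.shape
```

A reasonable rule of thumb under these assumptions: use an `EArray` (or a `Table`) when the final size is not known in advance, and a `CArray` when the shape is known but the data is too large to write in one piece.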
● h5py is an attempt to map the HDF5 feature set to NumPy as closely as possible
● PyTables is more focused on speed and dealing with really large datasets