You are on page 1of 7

Open Source Project Contribution

(zarr-python)

Ankit Agg rw l
(2022H1030051G)
a

what is zarr ?

• Storage of chunked ,Compressed,N-dimensional arrays.


• It is a python package
• We can read and write files to cloud storage system(e.g AWS3)

Issue
Expose chunk and/or slice information (byte offset and size) to the API and make it available to the user

Dependencies

• Dask- Dask is a flexible library for parallel computing in Python


• H5py- Store huge amounts of numerical data, and easily manipulate that data from NumPy
• Kerchunk- Kerchunk is a library that provides a unified way to represent a variety of
chunked, compressed data formats (e.g. NetCDF/HDF5, GRIB2, TIFF, …)
• Netcd4 - Many features not found in earlier versions of the library, such as hierarchical groups,
zlib compression, multiple unlimited dimensions, and new data types.
• Pytest - Makes it easy to write small tests, yet scales to support complex functional testing for
applications and libraries.
• xarray -Makes working with labelled multi-dimensional arrays simple, efficient, and fun
• zarr - Format for the storage of chunked, compressed, N-dimensional arrays.

Work done so far


• Installing git
• Cloning the repository and installing dependencies
• Learnt various libraries like zarr,Hdpy

What needs to be done

• Write the code to x the issue


• Send the pull request
fi

Thank you

You might also like