
NumPy is a library for the Python programming language, adding support for large,
multi-dimensional arrays and matrices, along with a large collection of high-level
mathematical functions to operate on these arrays.
Data manipulation in Python is mostly done with two packages: NumPy and Pandas.

Most day-to-day data analysis is done with Pandas. NumPy adds a few things
alongside it.

• NumPy is limited in what it can do with data. It works best with numeric data
such as int or float types. It can store Boolean and text values, but it offers
few tools for manipulating text or tables whose columns have different types.
Therefore, Pandas is used for that kind of data.

• NumPy uses arrays. Arrays are similar to lists. Arrays exist in most programming
languages, such as C, C++ and Java, whereas the list type described here is
specific to Python.

• For numerical work, arrays are generally much faster and more memory-efficient than lists.

• An array can only store homogeneous data, i.e. all the values inside the array
must have the same data type. A list, however, can hold a collection of
different types of data.
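
The difference between lists and arrays can be sketched as follows. This is a minimal illustration, assuming NumPy is installed; the variable names and sample values are made up for the example.

```python
import numpy as np

# A Python list can mix different types of data freely.
mixed_list = [1, 2.5, "three", True]

# A NumPy array has a single data type (dtype) shared by all elements.
ints = np.array([1, 2, 3])
floats = np.array([1, 2.5, 3])          # the integers are upcast to float64
flags = np.array([True, False, True])   # Boolean arrays are allowed
words = np.array(["a", "bb", "ccc"])    # so are fixed-width text arrays

# Arithmetic applies to every element at once, with no explicit loop.
doubled = ints * 2

# A Boolean array can act as a mask that selects elements.
selected = ints[flags]

print(ints.dtype, floats.dtype, flags.dtype, words.dtype)
print(doubled, selected)
```

The integer dtype that NumPy picks can vary by platform, so the printed dtypes may differ slightly from one machine to another.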

Pandas has its own data structures. Plain Python has some built-in tools with which
it stores and manipulates data: lists, tuples, sets, dictionaries and strings. These
are the data structures of Python. Their task is to store the data and give access
to it. For similar operations (storing data, accessing it and manipulating it),
Pandas also has two tools.

Pandas Series - 1D data - used when you're dealing with one column of a data set.
Pandas DataFrame - 2D data - used when you're dealing with two or more columns
of a data set.
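
The two structures can be sketched as follows. This is a small illustration, assuming Pandas is installed; the column names and values are invented for the example.

```python
import pandas as pd

# Series: one labelled column of values.
ages = pd.Series([25, 32, 47], name="age")

# DataFrame: a table of columns; each column can have its own data type.
people = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen"],
    "age": [25, 32, 47],
    "member": [True, False, True],
})

# Selecting a single column of a DataFrame gives back a Series.
age_column = people["age"]

print(ages.ndim, people.ndim)       # number of dimensions of each structure
print(people.shape)                 # (rows, columns)
print(type(age_column).__name__)
```

Unlike a NumPy array, the DataFrame here holds text, numbers and Boolean values side by side, one type per column.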
