pd.Series
Data = pd.Series([list], index=[list]): the first parameter works like a
NumPy array, while the second parameter indexes it like a dictionary, where
the index is the key and the first parameter holds the values. If index is
not passed, the index defaults to the usual 0, 1, 2, 3, 4, 5, … . A Series
holds only a 1D array.
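A minimal sketch of the constructor described above. The values, the labels and the variable names (data, default) are made up for illustration; note that the class is spelled pd.Series with a capital S.

```python
import pandas as pd

# Hypothetical sample values and labels, chosen only for illustration.
data = pd.Series([0.25, 0.5, 0.75, 1.0], index=["a", "b", "c", "d"])
default = pd.Series([0.25, 0.5, 0.75, 1.0])  # no index passed

print(data["b"])            # look up by label, like a dictionary key -> 0.5
print(list(default.index))  # default labels -> [0, 1, 2, 3]
```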
Data.values: printing this gives the values passed in the first parameter.
It is a NumPy array. Data itself is a pandas object (a Series).
Data.index: printing this returns either the second parameter or, if it
was not passed, the default range from 0 up to the last position
(length - 1). Its type is a pandas Index object.
Printing ‘Data’ shows 2 columns, where the left column is the index and
the right column is the values.
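The attributes above can be checked with a short sketch like the following. The Series contents and the name data are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical Series used only to inspect its parts.
data = pd.Series([10, 20, 30], index=["x", "y", "z"])

print(type(data.values))  # the values are held in a NumPy array
print(type(data.index))   # the index is a pandas Index object
print(data)               # index labels on the left, values on the right
```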
We can pass a dictionary to pd.Series; the keys become the index and the
values become the values.
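As a small illustration of building a Series from a dictionary (the dictionary contents and the names mapping and data are hypothetical):

```python
import pandas as pd

# Hypothetical dictionary: keys become the index, values become the values.
mapping = {"a": 1, "b": 2, "c": 3}
data = pd.Series(mapping)

print(list(data.index))   # -> ['a', 'b', 'c']
print(list(data.values))  # -> [1, 2, 3]
```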
If we pass labels in the second parameter, i.e. index=[n1, n2, n3, …, nx],
this is called an explicit index: if we slice with labels [nd:nx], both
endpoints are included. But if we slice the traditional way, i.e. [x:d]
where both are integer positions, then x is included but not d. This is
called implicit indexing.
How to index
Data[x] returns the value that goes with the given index label.
Data[x:y] returns the values from x to (y-1) with the implicit index, or
the values from x to y (inclusive) with the explicit index.
Note: if the index labels we passed are themselves integers, Data[x:y]
uses the implicit index by default, whereas Data[x] uses the explicit
index by default. To avoid confusion between the explicit and implicit
index when the labels are also numbers, do this:
Data.iloc[x:y] or Data.iloc[x] to force the implicit (positional) index.
Data.loc[x:y] or Data.loc[x] to force the explicit (label) index.
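The difference between the two kinds of index can be sketched as follows. Both Series (labeled, numbered) and their contents are invented for the example; the second one uses integer labels on purpose, since that is the confusing case.

```python
import pandas as pd

# Hypothetical Series with string labels.
labeled = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])
print(list(labeled.loc["a":"c"]))  # explicit: end label included -> [10, 20, 30]
print(list(labeled.iloc[0:2]))     # implicit: end position excluded -> [10, 20]

# Hypothetical Series whose labels are integers.
numbered = pd.Series(["p", "q", "r"], index=[1, 3, 5])
print(numbered[1])                 # plain [x] uses the label 1 -> 'p'
print(numbered.loc[1])             # explicit label 1 -> 'p'
print(numbered.iloc[1])            # implicit position 1 -> 'q'
print(list(numbered.loc[1:3]))     # labels 1 through 3 inclusive -> ['p', 'q']
print(list(numbered.iloc[1:3]))    # positions 1 and 2 -> ['q', 'r']
```

Writing .loc or .iloc every time makes the intent visible to the reader, whatever the labels happen to be.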
pd.DataFrame
The first way to use this object is to pass multiple pd.Series in a
dictionary, like:
Super_data = pd.DataFrame({data: pd.Series, data1: pd.Series, …})
Printing it shows columns named data, data1, … and the index labels of
the pd.Series as rows; the intersection of each row and column is filled
with the corresponding value from that pd.Series.
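A minimal sketch of this construction, assuming two small Series with the same labels. The contents are made up, and the names are written in lowercase (data, data1, super_data) only to follow Python naming habits.

```python
import pandas as pd

# Hypothetical Series sharing the same index labels.
data = pd.Series({"a": 1, "b": 2, "c": 3})
data1 = pd.Series({"a": 10, "b": 20, "c": 30})

# Dictionary keys become column names; the Series labels become the rows.
super_data = pd.DataFrame({"data": data, "data1": data1})

print(super_data)
print(super_data.loc["b", "data1"])  # cell at row 'b', column 'data1' -> 20
```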
You can transpose it like a NumPy array by using the .T attribute.
Super_data.values: this returns a 2D array of the values in the cells.
The indexing methods from the NumPy notes for matrices apply here.
Remember that the matrix is Super_data.values. Alternatively, we can put
.iloc after Super_data to use the positional matrix indexing from the
NumPy notes directly on the DataFrame. So basically, with .iloc after our
DataFrame we can index it like a matrix.
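The two routes to matrix-style indexing can be compared in a short sketch. The DataFrame contents and the names super_data and matrix are assumptions for the example.

```python
import pandas as pd

# Hypothetical DataFrame for illustration.
super_data = pd.DataFrame(
    {"data": [1, 2, 3], "data1": [10, 20, 30]},
    index=["a", "b", "c"],
)

matrix = super_data.values         # plain 2D NumPy array of the cells
print(matrix[0, 1])                # row 0, column 1 -> 10
print(super_data.iloc[0, 1])       # same cell through .iloc -> 10
print(super_data.iloc[:2, :1])     # first two rows, first column, still a DataFrame
print(super_data.T.shape)          # transposed: rows and columns swapped -> (2, 3)
```

One difference worth keeping in mind: indexing the .values array gives back NumPy objects, while .iloc keeps the row and column labels on the result.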
Super_data.columns returns the column labels.
To add another column, say new_data, do this:
Super_data[new_data] = pd.Series({pass the key and value})
where the keys match the index labels of Super_data.
del Super_data[new_data] will delete the column.
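Adding and deleting a column might look like the sketch below. The data is invented, and wrapping the key/value pairs in pd.Series is a choice made here so that the values are matched to the rows by label.

```python
import pandas as pd

# Hypothetical DataFrame with a single column.
super_data = pd.DataFrame({"data": [1, 2, 3]}, index=["a", "b", "c"])

# Add a column: keys of the Series are matched against the row labels.
super_data["new_data"] = pd.Series({"a": 100, "b": 200, "c": 300})
cols_after_add = list(super_data.columns)
print(cols_after_add)   # -> ['data', 'new_data']

# Delete the column again.
del super_data["new_data"]
cols_after_del = list(super_data.columns)
print(cols_after_del)   # -> ['data']
```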
To access a sub-matrix (a subset of the rows) of Super_data:
Super_data[Super_data[column] > value], or with <, != or == in place
of >, keeps only the rows where the comparison holds.
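Row selection by a condition can be sketched like this. The DataFrame, the threshold 15 and the names mask and subset are hypothetical.

```python
import pandas as pd

# Hypothetical DataFrame for illustration.
super_data = pd.DataFrame(
    {"data": [1, 2, 3], "data1": [10, 20, 30]},
    index=["a", "b", "c"],
)

# The comparison produces one True/False per row.
mask = super_data["data1"] > 15
print(list(mask))            # -> [False, True, True]

# Indexing with the mask keeps only the rows marked True.
subset = super_data[mask]
print(list(subset.index))    # -> ['b', 'c']
```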
pd.DataFrame([{key: value}, {key: value}]) returns data composed of rows
and columns. The keys become the columns, the values fill the cells, and
each dictionary becomes one row. Since no index is given, the index will
be 0, 1, 2, 3, 4, …
Note: each index label here stands for one dictionary; also mind that the
dictionaries are passed in a list.
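A minimal sketch of the list-of-dictionaries form, with made-up keys and values (the names rows and frame are also assumptions):

```python
import pandas as pd

# Hypothetical list of dictionaries: one dictionary per row.
rows = [{"a": 1, "b": 2}, {"a": 3, "b": 4}]
frame = pd.DataFrame(rows)

print(list(frame.columns))  # keys become columns -> ['a', 'b']
print(list(frame.index))    # default row labels -> [0, 1]
print(frame.loc[1, "a"])    # row 1 comes from the second dictionary -> 3
```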