
PIVOT TABLES

INTRODUCTION
A pivot table is a summary table that condenses a
large dataset by grouping its rows and aggregating
their values.
It is a common tool in data processing.
The summary may include the mean, median, sum,
or other statistics of each group.
Pivot tables were originally associated with MS
Excel, but we can also create one in Python with
the pivot_table() method of a Pandas DataFrame.
WORKING OF PIVOT TABLES
• Import Libraries: First, import the Pandas library,
which provides functionality for data
manipulation and analysis.
• Create DataFrame: Create a DataFrame
containing your data.
• Create a Pivot Table: Use the pivot_table()
method of the DataFrame to create a pivot table.
• View the Pivot Table: View the resulting pivot
table by printing it or displaying it.
SYNTAX
pivot = df.pivot_table(index=['COLUMN_NAME'],
                       columns=['COLUMN_NAME'],
                       values=['COLUMN_NAME'],
                       aggfunc='NAME_OF_FUNCTION')
• index specifies which column to use as the
index in the pivot table.
• columns specifies which column to use to
create new columns in the pivot table.
• values specifies which column(s) to aggregate.
• aggfunc specifies the aggregation function to
use when multiple values need to be
combined.
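As a rough illustration of how the four parameters fit together, the sketch below builds a table with one row per Residence and one column per Gender. It uses the same sample data as this document's example. The fill_value argument is an addition not described above; it is assumed here only to show 0 instead of NaN where a combination has no rows. The variable name by_residence is illustrative.

```python
import pandas as pd

# Same sample data as the example in this document.
data = {'Student': ['Akash', 'Aman', 'bharti', 'Umang', 'Satish'],
        'Computers': [35, 35, 40, 40, 20],
        'Maths': [50, 20, 30, 40, 10],
        'Residence': ['HP', 'chd', 'Haryana', 'Punjab', 'HP'],
        'Gender': ['M', 'M', 'F', 'F', 'M']}
df = pd.DataFrame(data)

# index   -> one row per Residence
# columns -> one new column per Gender value
# values  -> the Maths marks are aggregated
# aggfunc -> marks falling in the same cell are summed
# fill_value=0 (an assumption for readability) fills empty cells.
by_residence = df.pivot_table(index='Residence',
                              columns='Gender',
                              values='Maths',
                              aggfunc='sum',
                              fill_value=0)
print(by_residence)
```

Each cell holds the total Maths marks for one Residence and Gender pair.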
EXAMPLE
import pandas as pd

# Sample data: marks of five students with residence and gender
data = {'Student': ['Akash', 'Aman', 'bharti', 'Umang', 'Satish'],
        'Computers': [35, 35, 40, 40, 20],
        'Maths': [50, 20, 30, 40, 10],
        'Residence': ['HP', 'chd', 'Haryana', 'Punjab', 'HP'],
        'Gender': ['M', 'M', 'F', 'F', 'M']}
df = pd.DataFrame(data)
print(df)

# Total marks in each subject, one row per student
pivot = df.pivot_table(index=['Student'],
                       values=['Maths', 'Computers'],
                       aggfunc='sum')
print(pivot)

# Total Maths marks, one row per gender
pivot = df.pivot_table(index=['Gender'],
                       values=['Maths'],
                       aggfunc='sum')
print(pivot)
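The summaries above use only 'sum'. The introduction also mentions the mean and median, so here is a minimal sketch of those statistics on the same sample data. It assumes that aggfunc is given a list of function names, which makes pandas compute one result column per function. The variable name stats is illustrative.

```python
import pandas as pd

# Same sample data as the example above.
data = {'Student': ['Akash', 'Aman', 'bharti', 'Umang', 'Satish'],
        'Computers': [35, 35, 40, 40, 20],
        'Maths': [50, 20, 30, 40, 10],
        'Residence': ['HP', 'chd', 'Haryana', 'Punjab', 'HP'],
        'Gender': ['M', 'M', 'F', 'F', 'M']}
df = pd.DataFrame(data)

# Passing a list to aggfunc applies every listed function to the
# Maths marks of each gender and places the results side by side.
stats = df.pivot_table(index='Gender',
                       values='Maths',
                       aggfunc=['mean', 'median'])
print(stats)
```

The result has one row per gender, with the mean in the first column and the median in the second.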
ADVANTAGES
• Ease of Use: Python's Pandas library provides an intuitive interface
for creating pivot tables. With just a few lines of code, you can
generate complex pivot tables to summarize and analyze your data.

• Integration with Data Analysis Workflow: Pivot tables integrate
seamlessly into Python's data analysis workflow. You can incorporate
pivot tables into your scripts or Jupyter Notebooks alongside other
data manipulation and visualization tasks.

• Flexibility: Pandas pivot tables offer a high degree of flexibility. You
can easily customize pivot tables by specifying index, columns,
values, and aggregation functions, allowing for a wide range of
analysis options.
• Performance: Pandas performs grouping and aggregation with
vectorized operations, so pivot tables remain efficient on large
datasets. Tables with millions of rows can typically be summarized
quickly, provided the data fits in memory.

• Interactive Data Exploration: In a Jupyter Notebook, you can explore
your data interactively with pivot tables. By re-running pivot_table()
with different arguments, you can adjust the layout, filter the data,
and drill down into specific subsets to gain insights and identify
patterns.

• Data Cleaning and Transformation: Pivot tables can assist in data
cleaning and transformation tasks. You can reshape and
reorganize your data as needed, pivoting, unpivoting, and
aggregating to prepare it for further analysis or visualization.
• Scalability: The same pivot table code applies to datasets of varying
sizes, so an analysis developed on a small sample can be run unchanged
on the full dataset. This makes pivot tables suitable for both
small-scale and large-scale data analysis projects.

• Integration with Visualization: Pandas pivot tables integrate seamlessly
with Python's visualization libraries such as Matplotlib and Seaborn. You
can easily create visual representations of your pivot table results,
enhancing the clarity and interpretability of your analysis.

• Reproducibility: Python's code-based approach to data analysis promotes
reproducibility. You can save your pivot table configurations as part of
your code, ensuring that your analysis is easily reproducible and
shareable with others.
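Several of the points above (filtering, drilling down, pivoting and unpivoting) can be sketched on the sample data from this document. This is one possible approach, not the only one: the filter is an ordinary boolean mask applied before pivoting, and the unpivot step assumes the melt() method of the DataFrame. The names hp_only, hp_pivot, long_df, summary, Subject, and Marks are illustrative.

```python
import pandas as pd

# Same sample data as the example in this document.
data = {'Student': ['Akash', 'Aman', 'bharti', 'Umang', 'Satish'],
        'Computers': [35, 35, 40, 40, 20],
        'Maths': [50, 20, 30, 40, 10],
        'Residence': ['HP', 'chd', 'Haryana', 'Punjab', 'HP'],
        'Gender': ['M', 'M', 'F', 'F', 'M']}
df = pd.DataFrame(data)

# Drill down: keep only the students from HP, then pivot that subset.
hp_only = df[df['Residence'] == 'HP']
hp_pivot = hp_only.pivot_table(index='Student',
                               values=['Maths', 'Computers'],
                               aggfunc='sum')
print(hp_pivot)

# Unpivot: turn the two subject columns into (Subject, Marks) rows,
# giving one row per student and subject.
long_df = df.melt(id_vars=['Student', 'Gender'],
                  value_vars=['Computers', 'Maths'],
                  var_name='Subject',
                  value_name='Marks')
print(long_df)

# Pivot the long table into a Gender x Subject table of average marks.
summary = long_df.pivot_table(index='Gender',
                              columns='Subject',
                              values='Marks',
                              aggfunc='mean')
print(summary)
```

Because every step is plain code, changing the filter or the aggregation function and re-running the script reproduces the whole analysis.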
