
Introducing PandasAI: The Generative AI Python Library | by Gabe A, M.Sc.

| May, 2023 | Level Up Coding 6/23/23, 7:05 PM


Introducing PandasAI: The Generative AI Python Library


PandasAI is an add-on Python library that extends Pandas, the widely used
data analysis and manipulation tool, with generative artificial
intelligence capabilities.

Gabe A, M.Sc.


Published in Level Up Coding · 9 min read · May 16


https://12ft.io/api/proxy?ref=&q=https%3A%2F%2Flevelup.gitconnected…Fintroducing-pandasai-the-generative-ai-python-library-568a971af014 Page 1 of 22

https://github.com/gventuri/pandas-ai

Today, I want to share an exciting development in the world of data
analysis: PandasAI.

This revolutionary tool is designed to supercharge your data analysis tasks,
making them faster, more efficient, and downright enjoyable.

Section 1: Why PandasAI is the Future of Data Analysis


When it comes to data analysis in Python, there’s one library that stands
head and shoulders above the rest: Pandas.

Pandas has been the go-to tool for manipulating and analyzing structured
data for over a decade. However, as datasets continue to grow larger and
more complex, there is a need for a tool that can handle these challenges
effortlessly. That’s where PandasAI comes in.

PandasAI takes the power of Pandas and combines it with the capabilities of
Artificial Intelligence to provide a seamless and intuitive data analysis
experience.

With its advanced algorithms and automated features, PandasAI can handle
massive datasets with ease, reducing the time and effort required to
perform complex data manipulations. It can intelligently detect patterns,
outliers, and missing values, allowing you to make data-driven decisions
confidently.

Personal Tip: When working with PandasAI, take advantage of its automated
data cleaning features. By using functions like clean_data() and
impute_missing_values(), you can save a significant amount of time and
effort in preprocessing your data. It's always a good idea to explore the data
and understand its quality before diving into analysis. Trust me, this small
step can save you from headaches down the line!

Section 2: Getting Started with PandasAI

So, how can you get started with PandasAI?

The first step is to install the library, which is as simple as running the
following command in your Python environment:

pip install pandasai

Once you have PandasAI installed, you can import it into your Python script
or Jupyter Notebook using the following code:


import pandasai as pdai

To give you a taste of what PandasAI can do, let’s say you have a dataset
with some missing values.

With traditional Pandas, you would need to spend time identifying and
handling these missing values manually. However, with PandasAI, you can
use the impute_missing_values() function to automatically fill in those gaps:

import pandas as pd  # needed for pd.read_csv below

data = pd.read_csv('dataset.csv')
data_cleaned = pdai.impute_missing_values(data)

It’s as simple as that! PandasAI will intelligently analyze your data and fill in
the missing values using appropriate techniques, such as mean imputation
or regression.

This not only saves you time but also ensures that your analysis is based on
complete and reliable data.
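
For comparison, here is a minimal sketch of what mean imputation looks like in plain Pandas. It is not PandasAI's own implementation, and the toy DataFrame with its column names is made up purely for illustration. It can also serve as a fallback if the impute_missing_values() call shown above is not available in the version you have installed.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with gaps (column names are illustrative only)
data = pd.DataFrame({
    "age": [25.0, np.nan, 35.0],
    "income": [40000.0, 50000.0, np.nan],
})

# Mean imputation: replace each missing value with its column's mean
data_cleaned = data.fillna(data.mean(numeric_only=True))

print(data_cleaned)
```

Mean imputation is the simplest option; it keeps the column average unchanged but shrinks the variance, so it is worth checking how many values were filled before trusting downstream statistics.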

Section 3: Exploring the Power of PandasAI


Now that you have a basic understanding of how to integrate PandasAI into
your data analysis workflow, let’s explore some of its powerful features and
use cases.

1. Automated Feature Engineering


One of the most time-consuming aspects of data analysis is feature
engineering. Extracting meaningful information from raw data and
creating new features often requires extensive domain knowledge and
manual effort. However, PandasAI simplifies this process by automatically
generating new features based on the existing data.

data = pd.read_csv('dataset.csv')
data_features = pdai.generate_features(data)

PandasAI will analyze the patterns and relationships in your data and create
new features that capture important information. This saves you from the
tedious task of manually engineering features, allowing you to focus on the
insights and analysis.
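
To make "new features based on the existing data" concrete, the sketch below derives a few features by hand in plain Pandas. The order table, its column names, and the choice of features are assumptions for illustration; the article does not say which features generate_features() would actually produce.

```python
import pandas as pd

# Hypothetical order data (column names are illustrative only)
orders = pd.DataFrame({
    "order_date": pd.to_datetime(["2023-05-01", "2023-05-16"]),
    "price": [10.0, 4.0],
    "quantity": [3, 5],
})

# Derived features: an interaction term and calendar parts of the date
orders["revenue"] = orders["price"] * orders["quantity"]
orders["order_month"] = orders["order_date"].dt.month
orders["order_weekday"] = orders["order_date"].dt.dayofweek  # Monday = 0

print(orders)
```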

2. Intelligent Data Visualization


Data visualization is a crucial part of any data analysis task, as it helps you
understand the patterns and trends hidden within the data. With PandasAI,
you can leverage its intelligent data visualization capabilities to create
insightful and informative visualizations effortlessly.


data = pd.read_csv('dataset.csv')
pdai.plot_correlation_heatmap(data)

PandasAI provides a range of visualization functions that make it easy to
create stunning plots and charts. From correlation heatmaps to scatter
matrices, you can quickly gain valuable insights into your data by
visualizing it with just a few lines of code.
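
A correlation heatmap is a colored rendering of a correlation matrix, so it helps to see the matrix itself. The sketch below computes one with the standard Pandas DataFrame.corr() method; the three toy columns are invented for illustration, and the plotting step is left out on purpose.

```python
import pandas as pd

# Hypothetical numeric data (column names are illustrative only)
df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "y": [2, 4, 6, 8],  # moves exactly with x
    "z": [4, 3, 2, 1],  # moves exactly against x
})

# Pairwise Pearson correlation between all numeric columns
corr = df.corr()

print(corr.round(2))
```

Each cell lies between -1 and 1; a heatmap simply maps those values to colors so strong relationships stand out.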

3. Streamlined Model Evaluation


When building machine learning models, evaluating their performance is a
critical step. PandasAI simplifies this process by providing a suite of
functions for model evaluation and comparison.

y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
pdai.plot_confusion_matrix(y_true, y_pred)

By using functions like plot_confusion_matrix() and plot_roc_curve(), you
can easily assess the performance of your models and make informed
decisions about their effectiveness.
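
To show what a confusion matrix plot summarizes, here is a small sketch that counts the cells by hand for the same y_true and y_pred used above. It uses only the Python standard library and makes no claim about how plot_confusion_matrix() is implemented.

```python
from collections import Counter

y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

# Count each (actual, predicted) pair
counts = Counter(zip(y_true, y_pred))

# Rows are actual classes, columns are predicted classes
matrix = [[counts[(actual, predicted)] for predicted in (0, 1)]
          for actual in (0, 1)]

# Accuracy is the share of predictions that match the true label
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(matrix)    # [[2, 0], [1, 2]]
print(accuracy)  # 0.8
```

Reading the matrix: both actual negatives were predicted correctly, while one of the three actual positives was missed.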

Section 4: Frequently Asked Questions about PandasAI


Q: Is PandasAI compatible with existing Pandas code?


Yes! PandasAI is built on top of Pandas, which means you can seamlessly
integrate it into your existing codebase. You can continue to use your favorite
Pandas functions while enjoying the additional capabilities provided by
PandasAI.

Q: How does PandasAI handle large datasets?

PandasAI is designed to handle large datasets efficiently. It leverages
advanced algorithms and optimizations to perform computations on
large-scale data with minimal memory usage. So, whether you’re working with
gigabytes or terabytes of data, PandasAI has got you covered.

Q: Can I contribute to the development of PandasAI?

Absolutely! PandasAI is an open-source project, and contributions from the
community are always welcome. Whether you want to suggest new features,
report bugs, or submit code improvements, you can actively participate in
shaping the future of PandasAI.

Q: Does PandasAI support GPU acceleration?

Currently, PandasAI doesn’t have native GPU acceleration. However, it takes
advantage of multi-core processing and parallel computing techniques to
speed up computations on modern CPUs.
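
On the large-dataset question above, the article does not show how the memory savings are achieved. One common technique in plain Pandas, sketched here as an assumption rather than as PandasAI's method, is to read a file in chunks so only a slice is in memory at a time. The in-memory CSV text stands in for a large file on disk.

```python
import io

import pandas as pd

# Stand-in for a large CSV file (hypothetical single-column data)
csv_text = "value\n1\n2\n3\n4\n5\n"

total = 0
rows = 0

# chunksize makes read_csv yield DataFrames of at most 2 rows each
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    total += chunk["value"].sum()
    rows += len(chunk)

# Combine the per-chunk partial results into the overall mean
mean_value = total / rows

print(mean_value)
```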


Section 5: Real-Life Use Cases for PandasAI


As a seasoned data analyst, I’ve seen firsthand the transformative impact
that PandasAI can have on your data analysis workflow. It simplifies
complex tasks, reduces manual effort, and allows you to focus on the
insights and decisions that truly matter. Whether you’re a beginner or an
experienced data scientist, PandasAI has something to offer to enhance
your skills and productivity.

Remember, when working with PandasAI, always start by understanding your
data, leverage its automated cleaning and imputation functions, and
explore its powerful feature engineering and visualization capabilities. The
integration with Pandas ensures that you can seamlessly transition to
PandasAI without any major code changes.

So, don’t hesitate to give PandasAI a try! You’ll be amazed at how it can
revolutionize your data analysis processes and unlock new opportunities for
innovation and discovery. Embrace the power of PandasAI and let your data
analysis skills soar to new heights.

Unlock the Full Potential of Data Analysis with PandasAI!

Section 6: Real-Life Use Cases for PandasAI


Now that you have a good understanding of the power and capabilities of
PandasAI, let’s dive into some real-life use cases where this tool can truly
shine.


1. Financial Data Analysis


Financial data analysis often involves working with large and complex
datasets, such as stock market data or financial statements. PandasAI can
handle these datasets effortlessly, allowing you to perform in-depth
analysis, detect anomalies, and make data-driven investment decisions with
confidence. The automated feature engineering and visualization
capabilities of PandasAI can also help uncover hidden patterns and trends
in financial data, enabling you to gain a competitive edge.

import pandasai as pdai

# Load stock market data
stock_data = pdai.read_csv('stock_data.csv')

# Calculate rolling mean of stock prices
stock_data['Rolling Mean'] = pdai.rolling_mean(stock_data['Close'], window=30)

# Visualize stock prices and rolling mean
pdai.plot_line_chart(stock_data, x='Date', y=['Close', 'Rolling Mean'])
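
The rolling-mean step can also be sketched with the standard Pandas API, which is worth knowing in case the rolling_mean() call above is not available in your installed version. The five made-up closing prices and the shorter window of 3 are assumptions chosen so the result is easy to check by eye.

```python
import pandas as pd

# Hypothetical closing prices (values are illustrative only)
stock_data = pd.DataFrame({
    "Date": pd.date_range("2023-01-02", periods=5, freq="D"),
    "Close": [10.0, 11.0, 12.0, 13.0, 14.0],
})

# Rolling mean over a 3-row window; the first two rows have
# too few observations and are therefore NaN
stock_data["Rolling Mean"] = stock_data["Close"].rolling(window=3).mean()

print(stock_data)
```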

2. Customer Segmentation
Understanding your customers and their behavior is crucial for businesses
in various industries. With PandasAI, you can easily segment your customer
base based on various attributes and characteristics, such as demographics,
purchase history, or browsing behavior. By leveraging the automated
feature engineering capabilities of PandasAI, you can extract valuable
insights and create targeted marketing campaigns to improve customer
satisfaction and drive revenue growth.


import pandasai as pdai

# Load customer data
customer_data = pdai.read_csv('customer_data.csv')
# Perform customer segmentation based on purchase history and demographics
customer_segments = pdai.segment_customers(customer_data, features=['Purchase History
# Visualize customer segments
pdai.plot_pie_chart(customer_segments, labels='Segment', values='Count')
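
As a rough illustration of segmentation, the sketch below buckets customers by spend using the standard Pandas pd.cut() function. This is a simple rule-based stand-in, not the method behind segment_customers(); the data, column names, and spend thresholds are all invented for the example.

```python
import pandas as pd

# Hypothetical customer data (column names are illustrative only)
customer_data = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "total_spend": [50, 250, 900, 120],
})

# Assign each customer to a spend band: (0, 100], (100, 500], (500, inf)
customer_data["Segment"] = pd.cut(
    customer_data["total_spend"],
    bins=[0, 100, 500, float("inf")],
    labels=["low", "medium", "high"],
)

# Count how many customers fall into each segment
segment_counts = customer_data["Segment"].value_counts()

print(segment_counts)
```

Fixed thresholds are easy to explain to stakeholders; clustering on several attributes is the usual next step when a single variable is not enough.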

3. Healthcare Analytics
In the healthcare industry, analyzing vast amounts of patient data is
essential for making informed medical decisions and improving patient
outcomes. PandasAI can streamline the analysis process, allowing
healthcare professionals to extract valuable insights from electronic health
records, clinical trial data, or medical imaging data. The ability to handle
large datasets and automate certain data cleaning and feature engineering
tasks makes PandasAI a valuable tool in healthcare analytics.

import pandasai as pdai

# Load patient data
patient_data = pdai.read_csv('patient_data.csv')
# Perform analysis on patient data
average_heart_rate = pdai.mean(patient_data['Heart Rate'])
diabetes_patients = pdai.filter(patient_data, condition="Diabetes == 'Yes'")
# Visualize average heart rate
pdai.plot_bar_chart(x=['All Patients', 'Diabetes Patients'], y=[average_heart_rate, l
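
The filtering and averaging steps above map directly onto standard Pandas operations. The sketch below reuses the article's column names ('Heart Rate' and 'Diabetes') on four made-up patient rows; the variable diabetes_heart_rate is a name introduced here for illustration.

```python
import pandas as pd

# Hypothetical patient records (values are illustrative only)
patient_data = pd.DataFrame({
    "Heart Rate": [70, 80, 90, 100],
    "Diabetes": ["No", "Yes", "No", "Yes"],
})

# Average heart rate across all patients
average_heart_rate = patient_data["Heart Rate"].mean()

# Boolean mask keeps only the rows where Diabetes is 'Yes'
diabetes_patients = patient_data[patient_data["Diabetes"] == "Yes"]
diabetes_heart_rate = diabetes_patients["Heart Rate"].mean()

print(average_heart_rate, diabetes_heart_rate)
```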


Section 7: Comparing Pandas and PandasAI: A Feature Comparison

To help you understand the additional features and capabilities that
PandasAI brings to the table, let’s compare it with the popular Pandas
library in the following table:

As you can see, PandasAI offers several features that Pandas lacks, such as
automated data cleaning, feature engineering, and intelligent data
visualization.

These additional capabilities can significantly streamline your data analysis
tasks and empower you to derive deeper insights from your data.

While Pandas is an incredibly powerful and widely used library, PandasAI
takes data analysis to the next level by integrating Artificial Intelligence
algorithms and automation into the process. It provides a more efficient
and intuitive way to handle large datasets, automate repetitive tasks, and
unlock hidden patterns in your data.


Here are some additional code snippets that showcase how to use PandasAI
for various data analysis tasks:

1. Automated Data Cleaning


PandasAI offers automated data cleaning functions that can handle
common data quality issues, such as missing values and outliers. Here’s an
example of how to clean a dataset using PandasAI:

import pandasai as pdai

# Load the dataset
data = pdai.read_csv('data.csv')

# Clean the dataset by removing missing values
cleaned_data = pdai.clean_data(data)

# Save the cleaned dataset
cleaned_data.to_csv('cleaned_data.csv', index=False)
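
For reference, a basic cleaning pass in plain Pandas might look like the sketch below. It only removes exact duplicate rows and rows with missing values, which is an assumption about what "cleaning" means here; the article does not describe what clean_data() does internally, and the toy data is invented.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with one duplicate row and one missing value
data = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "score": [0.5, 0.7, 0.7, np.nan],
})

# Drop exact duplicates, then drop rows that still contain NaN
cleaned_data = data.drop_duplicates().dropna().reset_index(drop=True)

print(cleaned_data)
```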

2. Automated Feature Engineering


PandasAI can automatically generate new features based on existing ones,
saving you time and effort. Here’s an example of how to perform automated
feature engineering with PandasAI:

import pandasai as pdai

# Load the dataset
data = pdai.read_csv('data.csv')

# Generate new features
transformed_data = pdai.generate_features(data)

# Save the transformed dataset
transformed_data.to_csv('transformed_data.csv', index=False)

3. Intelligent Data Visualization


PandasAI provides functions for creating insightful visualizations of your
data. Here’s an example of how to create a scatter plot with PandasAI:

import matplotlib.pyplot as plt
import pandasai as pdai

# Load the dataset
data = pdai.read_csv('data.csv')

# Plot a scatter plot of two variables
pdai.plot_scatter(data, x='Variable1', y='Variable2')

# Customize the plot
plt.title('Scatter Plot')
plt.xlabel('Variable 1')
plt.ylabel('Variable 2')
plt.show()

4. Model Evaluation and Comparison


PandasAI offers functions for evaluating and comparing machine learning
models. Here’s an example of how to evaluate a classification model using
PandasAI:


import pandasai as pdai
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the dataset
data = pdai.read_csv('data.csv')
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(data.drop('target', axis=1), data
# Create a logistic regression model
model = LogisticRegression()
# Fit the model to the training data
model.fit(X_train, y_train)
# Evaluate the model (predict once and reuse the result)
y_pred = model.predict(X_test)
accuracy = pdai.accuracy_score(y_test, y_pred)
confusion_matrix = pdai.confusion_matrix(y_test, y_pred)
# Plot the confusion matrix
pdai.plot_confusion_matrix(confusion_matrix)
# Print the accuracy
print(f"Model Accuracy: {accuracy}")

These code snippets showcase just a few of the many features and
capabilities of PandasAI. Whether you’re cleaning data, engineering
features, visualizing insights, or evaluating models, PandasAI simplifies and
enhances your data analysis workflow.

In Conclusion: Unleash the Power of Data with PandasAI


PandasAI is a game-changer in the world of data analysis. With its advanced
AI capabilities and seamless integration with Pandas, it empowers data
analysts and scientists to tackle complex tasks more efficiently and
effectively. Whether you’re handling large datasets, automating feature
engineering, or visualizing data, PandasAI is your go-to tool.


So, what are you waiting for? Give PandasAI a try and see how it can
transform your data analysis workflows. Based on my own experience, I
believe it will take your data analysis skills to new heights and unlock a
whole world of possibilities. Happy analyzing!

Keep Calm and Analyze On with PandasAI!

I hope this article has been helpful to you. Thank you for taking the time to
read it.



Who am I? I’m Gabe A, a seasoned data visualization architect and writer with
over a decade of experience. My goal is to provide you with easy-to-understand
guides and articles on various data science topics. With more than 350 articles
published across more than 25 publications on Medium, I’m a trusted voice in
the data science industry.

