Vertical Bar Graph + Adding Value Labels

import matplotlib.pyplot as plt


x=["Toyota", "Nissan", "Honda"]
y=[230, 245, 235]

def addlabels(x,y):
    for i in range(len(x)):
        plt.text(i,y[i],y[i], ha="center")

plt.bar(x,y, color=["r", "g", "b"], width=0.5)


plt.title("Speed Test (kmph)")
plt.xlabel("Car Brand")
plt.ylabel("Kilometres Per Hour")
addlabels(x, y)
plt.show()
Before we go through the structure of the code, keep in mind that Python is case
sensitive. If you write a capital letter somewhere instead of a small letter, the
code might not run, so be careful with capitalization. You also need to be wary of
where to put quotation marks and which brackets to use. Using third brackets [ ]
instead of first brackets ( ) in some places will also break the code. Now let’s begin.
import matplotlib.pyplot as plt

This imports the module we’ll use to make the bar graph. Matplotlib is a Python
library used to create visualisations like bar graphs, charts etc. Writing "as plt"
lets us type plt instead of the full name every time.
x=["Toyota", "Nissan", "Honda"]
y=[230, 245, 235]

We’re doing this to define the values. We use third brackets [ ] if the variable has
more than 1 value (this is called a list). If a value is quantitative data (a number
you’re gonna do math with), you don’t need to add quotation marks. But you need to
add quotation marks for any qualitative data, meaning any text, like “Banana”. You
can even put quotation marks around numbers that you won’t add or subtract, like
“2023”.
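
To make the difference concrete, here’s a small sketch. The variable names speed, bonus, year and brands are made up just for this example. It shows that numbers without quotation marks get added like numbers, while numbers inside quotation marks are treated as text and just get glued together.

```python
# Numbers without quotation marks: Python does real math with them.
speed = 230
bonus = 15
total = speed + bonus      # 245

# Numbers inside quotation marks are text (strings), so + joins them.
year = "2023"
joined = year + "1"        # "20231", not 2024

# Third brackets [ ] hold several values in one variable (a list).
brands = ["Toyota", "Nissan", "Honda"]

print(total)
print(joined)
print(len(brands))
```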
def addlabels(x,y):
    for i in range(len(x)):
        plt.text(i,y[i],y[i], ha="center")

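You need to write this if you want to add value labels. If you don’t want value
labels, you can leave it out. The loop goes through the bars one by one, and
plt.text(i,y[i],y[i]) writes the value y[i] at bar number i, at height y[i].
ha="center" (horizontal alignment) centers each label over its bar.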
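
If the loop feels abstract, this sketch collects what addlabels would hand to plt.text on each pass instead of drawing anything. The name label_positions is made up for the example. It relies on matplotlib placing text categories like these at positions 0, 1, 2 along the axis.

```python
x = ["Toyota", "Nissan", "Honda"]
y = [230, 245, 235]

# Same loop as addlabels, but storing the arguments instead of drawing.
# i is the bar's position on the x-axis (0, 1, 2); y[i] is both the
# height where the label sits and the text that gets written.
label_positions = []
for i in range(len(x)):
    label_positions.append((i, y[i], str(y[i])))

print(label_positions)
```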
plt.bar(x,y, color=["r", "g", "b"], width=0.5)

plt.bar tells matplotlib that we’re going to make a bar graph. In the brackets, x and
y are the variables that we want to put in the bar graph. Then we add a comma and
write the extra options such as color and width. color takes one colour per bar ("r",
"g" and "b" are short for red, green and blue). Width changes the width of the bars.
plt.title("Speed Test (kmph)")
plt.xlabel("Car Brand")
plt.ylabel("Kilometres Per Hour")
addlabels(x, y)
plt.show()

plt.title puts a title on top of the bar graph.
plt.xlabel and plt.ylabel put the text in the brackets onto the x and y axes of the
graph.
addlabels(x,y) puts the value labels on top of the bars. The value labels won’t
show up on the graph if you don’t write this line.
plt.show() is used to show the bar graph.
Line Graph + Add Value Labels
import matplotlib.pyplot as plt
x=["Toyota", "Nissan", "Honda"]
y=[230, 245, 235]

def addlabels(x,y):
    for i in range(len(x)):
        plt.text(i,y[i],y[i], ha="center")

plt.plot(x,y, linewidth=1.5)
plt.title("Speed Test (kmph)")
plt.xlabel("Car Brand")
plt.ylabel("Kilometres per hour")
addlabels(x, y)
plt.show()

The only difference in this code is that instead of plt.bar(x,y) we did plt.plot(x,y). We
didn’t add the part with the width because there is no bar width to change here.
Instead we used linewidth=1.5 to change the thickness of the line.
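
One more option you might find handy on a line graph: marker="o" is a standard plt.plot option that draws a dot at every data point. This is just a sketch of the same graph with that option added. The variable name lines is made up so we can hold on to what plt.plot gives back.

```python
import matplotlib.pyplot as plt

x = ["Toyota", "Nissan", "Honda"]
y = [230, 245, 235]

# marker="o" draws a dot at every data point on top of the line.
lines = plt.plot(x, y, linewidth=1.5, marker="o")

plt.title("Speed Test (kmph)")
plt.xlabel("Car Brand")
plt.ylabel("Kilometres per hour")
plt.show()
```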
Horizontal Bar Graph
import matplotlib.pyplot as plt
x=["Toyota", "Nissan", "Honda"]
y=[230, 245, 235]

plt.barh(x,y, color=["r", "g", "b"], height=0.5)


plt.title("Speed Test (kmph)")
plt.xlabel("Kilometres Per Hour")
plt.ylabel("Car Brand")
plt.show()

Not much difference from the vertical bar graph. You have to switch the xlabel and
ylabel for this one though, or else your labels will be wrong.
plt.barh(x,y, color=["r", "g", "b"], height=0.5)

And instead of plt.bar we gotta write plt.barh. The h at the end of plt.barh is
necessary or else your bar graph won’t be horizontal. And instead of width=0.5 we
have to write height=0.5. Everything else is more or less the same as a vertical bar
graph.
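
The horizontal example above has no value labels. If you want them, one way (a sketch, assuming the same idea as addlabels from the vertical graph) is to swap the first two values inside plt.text, because now the value runs along the x-axis and the bar number runs along the y-axis. The function name addlabels_h is made up for this example.

```python
import matplotlib.pyplot as plt

x = ["Toyota", "Nissan", "Honda"]
y = [230, 245, 235]

def addlabels_h(x, y):
    # For horizontal bars the value is the x-position and the bar
    # number is the y-position, so the first two arguments are swapped.
    labels = []
    for i in range(len(x)):
        labels.append(plt.text(y[i], i, y[i], va="center"))
    return labels

plt.barh(x, y, color=["r", "g", "b"], height=0.5)
plt.title("Speed Test (kmph)")
plt.xlabel("Kilometres Per Hour")
plt.ylabel("Car Brand")
labels = addlabels_h(x, y)
plt.show()
```

va="center" (vertical alignment) keeps each label level with the middle of its bar.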
Scatter Plot
import matplotlib.pyplot as plt
x=["Toyota", "Nissan", "Honda"]
y=[230, 245, 235]

plt.scatter(x,y, color=["r", "g", "b"])


plt.title("Speed Test (kmph)")
plt.xlabel("Car Brand")
plt.ylabel("Kilometres Per Hour")
plt.show()

You should be able to understand how things work now. Instead of plt.bar, plt.plot or
plt.barh, we used plt.scatter. This creates a scatter plot. The height and width
options don’t work inside the brackets here, so keep that in mind.
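
If you want to change how big the dots are, plt.scatter has its own option for that called s (marker size). Here’s a sketch with s=120. The number is picked just for illustration, and the variable name points is made up.

```python
import matplotlib.pyplot as plt

x = ["Toyota", "Nissan", "Honda"]
y = [230, 245, 235]

# s sets the marker size; a bigger number gives bigger dots.
points = plt.scatter(x, y, color=["r", "g", "b"], s=120)

plt.title("Speed Test (kmph)")
plt.xlabel("Car Brand")
plt.ylabel("Kilometres Per Hour")
plt.show()
```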
Pandas - Dataframe
Firstly we’ll learn how to make a dataframe, which is basically a table of data.
import pandas as pd
data = {
    'Name': ['P', 'M', 'K'],
    'height (cm)': [177, 180, 185]
}
df = pd.DataFrame(data)
print(df)

The output is a small table: the two columns Name and height (cm), with the rows
numbered 0, 1 and 2 on the left.

So now we learn what the code does. Firstly we have to import the pandas library.
import pandas as pd

Then we have to state the data.


data = {
    'Name': ['P', 'M', 'K'],
    'height (cm)': [177, 180, 185]
}

You don’t have to write it exactly in this format. You can write it all on one line
like this too and it’ll still work:
data = {'Name': ['P', 'M', 'K'], 'height (cm)': [177, 180, 185]}

Just make sure you type the variable name. I used the word data here but you can
even use x. Whatever variable name you use here, you gotta use that same name
further down in the code. The variable contents need to be in second brackets { }.
Each column inside needs a name in quotation marks, then a colon, then the list of
values for that column in third brackets [ ].
The df = pd.DataFrame(data) part turns the data into a dataframe and stores it in
the variable df. This is a crucial part of the code. Then finally, to see the data we
type print(df), which shows the full table. This is where the df from the previous
line of code gets used.

If you have a huge table of data, printing the whole thing might not be a good
option. In that case, instead of print(df), we can view the first few rows by using:

print(df.head(2))

Or we can print the last few rows by using:


print(df.tail(2))

The number inside the brackets sets how many rows you want to see. I put 2 here, so
print(df.head(2)) only shows the first 2 rows and print(df.tail(2)) only shows the
last 2 rows. You can use any number you want really. Leaving the brackets empty
shows 5 rows: the first 5 for df.head() and the last 5 for df.tail().
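
Here’s a quick sketch to try this out. The 3-row table from before is too small to see the difference, so this uses a made-up table with 8 rows (the names A to H and the heights are invented for the example).

```python
import pandas as pd

# A made-up table with 8 rows so the difference is visible.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "height (cm)": [170, 172, 174, 176, 178, 180, 182, 184],
})

first_two = df.head(2)      # first 2 rows
last_two = df.tail(2)       # last 2 rows
default_head = df.head()    # no number given: first 5 rows
default_tail = df.tail()    # no number given: last 5 rows

print(first_two)
print(default_tail)
```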
Matplotlib + pandas - creating a bar graph
import pandas as pd
import matplotlib.pyplot as plt

x = {'Name': ['P', 'M', 'K'], 'height (cm)': [177, 180, 185]}


df = pd.DataFrame(x)

plt.bar(df['Name'], df['height (cm)'], color='skyblue')


plt.xlabel('Name')
plt.ylabel('height (cm)')
plt.title('hehe')
plt.show()
Setting up the dataframe is the same as before.

import pandas as pd
import matplotlib.pyplot as plt

x = {'Name': ['P', 'M', 'K'], 'height (cm)': [177, 180, 185]}


df = pd.DataFrame(x)

Then we gotta turn it into a bar graph, this is where things are a bit different. So here
goes.

plt.bar(df['Name'], df['height (cm)'], color='skyblue')

Use plt.bar(). In the brackets you call the variable df and name the first column,
which is 'Name' (this goes on the x-axis). Then we name the second column, which is
'height (cm)' (this sets the bar heights). Make sure the names are typed exactly as
you set them in the dataframe. It’s case sensitive so you gotta make sure the
capitalization is the same too. The rest is the same as the usual matplotlib bar
chart code.
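
If you’re not sure how a column name was typed, you can ask pandas to list the names with df.columns. This is a small sketch using the same table as above. The variable names column_names, has_capital and has_lowercase are made up for the example.

```python
import pandas as pd

x = {'Name': ['P', 'M', 'K'], 'height (cm)': [177, 180, 185]}
df = pd.DataFrame(x)

# List the exact column names, including capitals and spaces.
column_names = list(df.columns)
print(column_names)

# 'Name' exists, but 'name' (small n) does not.
has_capital = 'Name' in df.columns
has_lowercase = 'name' in df.columns
print(has_capital, has_lowercase)
```

Writing df['name'] with a small n would give an error here, since no column has that exact name.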

plt.xlabel('Name')
plt.ylabel('height (cm)')
plt.title('hehe')
plt.show()

plt.xlabel(“”) puts a label on the bottom (x-axis). plt.ylabel(“”) puts a label on
the left (y-axis).
Pandas + importing a file into google colab
The following code lets you upload a file to google colab, so that you can use it
later in your code.

from google.colab import files

uploaded = files.upload()

It should give you a box like this

Click on choose files and upload your spreadsheet file. The file type matters because
xlsx files and csv files are read differently (pd.read_excel for xlsx, pd.read_csv
for csv). I uploaded a csv file here.

Then you gotta open a new code box and import pandas (if you haven’t yet). And
then write the following code to make pandas read the spreadsheet you just
uploaded.

import pandas as pd

df = pd.read_csv("data.csv")

The name is case sensitive. So make sure you type the name of your file correctly.
It’ll look like this

If your data has a lot of rows and you use print(df) now, it’ll show you the first 5
and last 5 rows of your data. It’ll also show you the number of rows and columns.

You can use print(df.head()) or print(df.tail()) here to see the beginning and ending
rows of your data. Like this.
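
Since csv and xlsx files are read with different pandas functions, here’s a rough sketch of a helper that picks the right one from the file ending. The function name read_table and the file example_upload.csv are made up for this example. The tiny csv is written by the code itself so it can run on its own, and reading xlsx files usually needs an extra package such as openpyxl to be installed.

```python
import os
import tempfile
import pandas as pd

def read_table(filename):
    # Hypothetical helper: pick the pandas reader from the file ending.
    name = filename.lower()
    if name.endswith(".csv"):
        return pd.read_csv(filename)
    if name.endswith(".xlsx"):
        return pd.read_excel(filename)  # needs an Excel engine installed
    raise ValueError("Unsupported file type: " + filename)

# Write a tiny made-up csv file so the example runs on its own.
csv_path = os.path.join(tempfile.gettempdir(), "example_upload.csv")
with open(csv_path, "w") as f:
    f.write("Duration,Pulse\n60,110\n45,117\n")

df = read_table(csv_path)
print(df)
```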
Imported CSV File + Pandas + Matplotlib
Now we’ll make a scatter plot from the csv file we uploaded. First upload the csv file
if you haven’t yet (refer to the previous section if you forgot how to upload a file.)

import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("data.csv")

plt.scatter(df['Duration'], df['Pulse'], color='red')


plt.xlabel('Duration')
plt.ylabel('Pulse')
plt.title('Duration vs pulse')
plt.show()
Things are the same as before really. We make pandas read the file once and store it
in the variable df, so we can use df whenever we need the data.

import pandas as pd
df = pd.read_csv("data.csv")

Then we make the scatter plot. I viewed the data using print(df) and saw that there
were 4 headings: ‘Duration’, ‘Pulse’, ‘Maxpulse’, and ‘Calories’.

I decided to make a scatter plot using ‘Duration’ and ‘Pulse’. So I wrote
df['Duration'] and df['Pulse'] inside the plt.scatter brackets.

plt.scatter(df['Duration'], df['Pulse'], color='red')

The rest is the same as any usual matplotlib code

plt.xlabel('Duration')
plt.ylabel('Pulse')
plt.title('Duration vs pulse')
plt.show()
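You gotta understand what data works with what chart. Replacing plt.scatter with
plt.plot here will give you wildly messed up results, because plt.plot joins the
points with a line in the order the rows appear in the file, so the line zigzags back
and forth whenever the rows aren’t sorted by the x value.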

You don’t want that now, do you?
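
If you do want a line through data like this, one option (a sketch, not the only way) is to sort the rows by the x column before calling plt.plot. The numbers below are made up to stand in for the uploaded file, they are not the real contents of data.csv. The names df_sorted and lines are also just for this example.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up rows standing in for the uploaded data.csv (not the real file).
df = pd.DataFrame({
    "Duration": [60, 30, 45, 20, 60],
    "Pulse": [110, 103, 109, 95, 117],
})

# plt.plot joins the points in row order, so sort by the x column first.
df_sorted = df.sort_values("Duration")

lines = plt.plot(df_sorted["Duration"], df_sorted["Pulse"], marker="o")
plt.xlabel("Duration")
plt.ylabel("Pulse")
plt.title("Duration vs pulse")
plt.show()
```

Even sorted, a line only makes sense if the points really follow one another, so a scatter plot is often still the better pick for this kind of data.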
