
Week 3

1) DataFrame Creation: Create a Pandas DataFrame with the given data: `{'Name':
['Anna', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'Occupation': ['Engineer', 'Doctor',
'Teacher']}`.
2) Reading CSV: Given a CSV file named "data.csv", write a program to read and
display the first 5 rows.
3) Data Selection: Using the previously created DataFrame, select and print the 'Name'
and 'Age' columns.
4) GroupBy: Given a DataFrame with columns 'Name', 'Occupation', and 'Salary', group
the data by 'Occupation' and calculate the average salary for each profession.
5) Data Cleaning: Import a CSV file into a DataFrame. Check for missing values in all
columns and replace them with the mean of the respective column.
6) Basic Line Plot: Plot a simple line graph for the numbers `[2, 4, 6, 8, 10]` against `[10,
20, 30, 40, 50]`.
7) Bar Plot: Given a list of fruits `['apple', 'banana', 'cherry']` and their corresponding
counts `[5, 7, 3]`, plot a bar graph.
8) Histogram: Create a histogram for the given list of ages:
`[21,22,23,24,22,25,26,26,28,30,31,32,33,34,35,35,36,38,40]`
9) Scatter Plot: Given two lists of numbers, `[1, 3, 5, 7, 9]` and `[10, 9, 30, 25, 50]`, plot
a scatter plot and assign a title and labels to the x and y axes.
10) Subplots: Plot a sine and cosine curve on two subplots that share the same x-axis.

Program 1: DataFrame Creation: Create a Pandas DataFrame with the given data: `{'Name':
['Anna', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'Occupation': ['Engineer', 'Doctor', 'Teacher']}`.
Solution:
import pandas as pd
print("Kushal | 22103091")
d={
"Name": ["Anna", "Bob", "Charlie"],
"Age": [25, 30, 35],
"Occupation": ["Engineer", "Doctor", "Teacher"],
}
df = pd.DataFrame(d)
print(df)

Output:

Program 2: Reading CSV: Given a CSV file named "data.csv", write a program to read and
display the first 5 rows.
Solution:
import pandas as pd

print("Kushal | 22103091")
df = pd.read_csv("data.csv")
print(df.head())

Output:

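For trying Program 2 without a file on disk, the following is a minimal sketch that reads CSV text from an in-memory buffer instead. The CSV content and the names `csv_text` and `first_five` are assumptions made for illustration; they are not the contents of the actual "data.csv".

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for "data.csv" (six data rows)
csv_text = """Name,Age,Occupation
Anna,25,Engineer
Bob,30,Doctor
Charlie,35,Teacher
Dan,40,Nurse
Eve,45,Pilot
Frank,50,Chef
"""

# read_csv accepts any file-like object, so StringIO works like a file here
df = pd.read_csv(io.StringIO(csv_text))

# head() returns the first 5 rows when called without an argument
first_five = df.head()
print(first_five)
```

Passing a number, for example `df.head(3)`, changes how many rows are returned.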
Program 3: Data Selection: Using the previously created DataFrame, select and print the
'Name' and 'Age' columns.
Solution:
import pandas as pd

print("Kushal | 22103091")
d={
"Name": ["Anna", "Bob", "Charlie"],
"Age": [25, 30, 35],
"Occupation": ["Engineer", "Doctor", "Teacher"],
}
df = pd.DataFrame(d)
print(df[["Name", "Age"]])

Output:

Program 4: Given a DataFrame with columns 'Name', 'Occupation', and 'Salary', group the
data by 'Occupation' and calculate the average salary for each profession.
Solution:
import pandas as pd

print("Kushal | 22103091")
d2 = {
"Name": ["Anna", "Bob", "Charlie", "Daniel", "Mark", "Joe"],
"Salary": [100000, 80000, 35000, 50000, 120000, 200000],
"Occupation": ["Engineer", "Doctor", "Teacher", "Doctor", "Teacher", "Engineer"],
}
df = pd.DataFrame(d2)
avgsal = df.groupby("Occupation")["Salary"].mean()
print(avgsal)

Output:

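As a quick check on the Program 4 result, the sketch below uses the same salaries and occupations and computes the mean together with the group size. The `agg(["mean", "count"])` call and the name `summary` are additions for illustration, not part of the original solution.

```python
import pandas as pd

# Same salary and occupation values as in Program 4
df = pd.DataFrame(
    {
        "Salary": [100000, 80000, 35000, 50000, 120000, 200000],
        "Occupation": ["Engineer", "Doctor", "Teacher", "Doctor", "Teacher", "Engineer"],
    }
)

# One row per occupation, with the average salary and the number of people
summary = df.groupby("Occupation")["Salary"].agg(["mean", "count"])
print(summary)
```

With two people in each occupation, each mean is simply the midpoint of the two salaries: 65000 for Doctor, 150000 for Engineer, and 77500 for Teacher.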
Program 5: Import a CSV file into a DataFrame. Check for missing values in all columns
and replace them with the mean of the respective column.
Solution:
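import pandas as pd

df = pd.read_csv("q5.csv")
print("Kushal | 22103091")
print(df.isnull().sum())
# Column means are only defined for numeric columns
column_means = df.mean(numeric_only=True)
df_filled = df.fillna(column_means)
print(df_filled)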

Output:

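Since "q5.csv" is not included, here is a minimal sketch of the same cleaning steps on a small in-memory DataFrame. The data and column names are assumptions chosen for illustration. It restricts the mean to numeric columns, because a text column such as 'Name' has no mean.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each numeric column
df = pd.DataFrame(
    {
        "Name": ["Anna", "Bob", "Charlie", "Dana"],
        "Age": [25.0, np.nan, 35.0, 30.0],
        "Salary": [100000.0, 80000.0, np.nan, 60000.0],
    }
)

# Step 1: count missing values per column
missing_counts = df.isnull().sum()
print(missing_counts)

# Step 2: compute per-column means, skipping non-numeric columns
numeric_means = df.mean(numeric_only=True)

# Step 3: fillna with a Series fills each column using the value
# whose label matches the column name
df_filled = df.fillna(numeric_means)
print(df_filled)
```

Missing values in non-numeric columns are left untouched by this approach and would need a separate rule, such as a placeholder string.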
Program 6: Plot a simple line graph for the numbers `[2, 4, 6, 8, 10]` against `[10, 20, 30, 40,
50]`
Solution:
import matplotlib.pyplot as plt

x = [2, 4, 6, 8, 10]
y = [10, 20, 30, 40, 50]
plt.figure(figsize=(8, 6))
plt.plot(x, y, marker="o", linestyle="-")
plt.title("Simple Line Graph (Kushal | 22103091)")
plt.xlabel("X Axis")
plt.ylabel("Y Axis")
plt.grid(True)
plt.show()

Output:

Program 7: Given a list of fruits `['apple', 'banana', 'cherry']` and their corresponding counts
`[5, 7, 3]`, plot a bar graph

Solution:
import matplotlib.pyplot as plt

fruits = ["apple", "banana", "cherry"]
counts = [5, 7, 3]

plt.figure(figsize=(8, 6))
plt.bar(fruits, counts, color="skyblue")
plt.title("Fruit Counts (Kushal | 22103091)")
plt.xlabel("Fruits")
plt.ylabel("Counts")
plt.grid(axis="y", linestyle="--", alpha=0.7)
plt.show()

Output:

Program 8: Histogram: Create a histogram for the given list of ages:
`[21,22,23,24,22,25,26,26,28,30,31,32,33,34,35,35,36,38,40]`
Solution:
import matplotlib.pyplot as plt

ages = [21, 22, 23, 24, 22, 25, 26, 26, 28, 30, 31, 32, 33, 34, 35, 35, 36, 38, 40]

plt.figure(figsize=(8, 6))
plt.hist(ages, bins=10, color="skyblue", edgecolor="black", alpha=0.7)
plt.title("Age Distribution (Kushal | 22103091)")
plt.xlabel("Age")
plt.ylabel("Frequency")
plt.grid(axis="y", linestyle="--", alpha=0.7)
plt.show()

Output:

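To see what `bins=10` does to the ages in Program 8 without opening a plot window, the sketch below computes the bin edges and counts with `numpy.histogram`, which splits the data range into equal-width bins in the same way. The names `counts` and `edges` are chosen for illustration.

```python
import numpy as np

ages = [21, 22, 23, 24, 22, 25, 26, 26, 28, 30, 31, 32, 33, 34, 35, 35, 36, 38, 40]

# 10 equal-width bins spanning min(ages) to max(ages)
counts, edges = np.histogram(ages, bins=10)

# Print each bin's range alongside how many ages fall into it
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.1f} to {right:.1f}: {n}")
```

Ten bins always produce eleven edges, and the counts add up to the number of ages in the list.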
Program 9: Given two lists of numbers, `[1, 3, 5, 7, 9]` and `[10, 9, 30, 25, 50]`, plot a
scatter plot and assign a title and labels to the x and y axes.
Solution:
import matplotlib.pyplot as plt
x = [1, 3, 5, 7, 9]
y = [10, 9, 30, 25, 50]
plt.figure(figsize=(8, 6))
plt.scatter(x, y, color="skyblue", label="Data Points")
plt.title("Scatter Plot of Two Lists (Kushal | 22103091)")

plt.xlabel("X Axis")
plt.ylabel("Y Axis")
plt.legend()
plt.grid(True)
plt.show()

Output:

Program 10: Plot a sine and cosine curve on two subplots that share the same x-axis.
Solution:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 2 * np.pi, 100)
y_sin = np.sin(x)
y_cos = np.cos(x)
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 8), sharex=True)
ax1.plot(x, y_sin, color="blue", label="Sine")
ax1.set_title("Sine Curve")
ax1.set_ylabel("Amplitude")
ax2.plot(x, y_cos, color="red", label="Cosine")
ax2.set_title("Cosine Curve")
ax2.set_ylabel("Amplitude")
ax2.set_xlabel("X")
ax1.legend()
ax2.legend()

plt.tight_layout()
plt.show()

Output:
