You are on page 1of 33

1.

Create a panda’s series from a dictionary of values and a ndarray

import pandas as pd
import numpy as np

# Create a Series from a dictionary


data_dict = {'A': 10, 'B': 20, 'C': 30}
series_from_dict = pd.Series(data_dict)
print("Series from Dictionary:")
print(series_from_dict)

# Create a Series from an ndarray


data_ndarray = np.array([40, 50, 60])
series_from_ndarray = pd.Series(data_ndarray)
print("\nSeries from ndarray:")
print(series_from_ndarray)

Series from Dictionary:


A 10
B 20
C 30
dtype: int64

Series from ndarray:


0 40
1 50
2 60
dtype: int64

2. Print elements above the 75th percentile:


import pandas as pd

# Create a Series
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
series = pd.Series(data)

percentile_75 = series.quantile(0.75)
above_percentile_75 = series[series > percentile_75]

print("Elements above the 75th percentile:")


print(above_percentile_75)

output

Elements above the 75th percentile:


7 80
8 90
9 100
dtype: int64

3. Create a Data Frame quarterly sales where each row contains the item
category, item name, and expenditure. Group the rows by the category and
print the total expenditure per category.

import pandas as pd

data = {
'Category': ['A', 'B', 'A', 'B', 'C'],
'Item Name': ['Item1', 'Item2', 'Item3', 'Item4', 'Item5'],
'Expenditure': [100, 150, 200, 120, 80]
}

df = pd.DataFrame(data)

total_expenditure = df.groupby('Category')['Expenditure'].sum()
print("Total expenditure per category:")
print(total_expenditure)

output
Total expenditure per category:
Category
A 300
B 270
C 80
Name: Expenditure, dtype: int64

4. Create a data frame for examination result and display row labels,
column labels data types of each column and the dimensions
import pandas as pd

import pandas as pd

data = {
'Name': ['Alice', 'Bob', 'Charlie'],
'Math': [85, 70, 92],
'Science': [90, 88, 78]
}

df = pd.DataFrame(data)

print("DataFrame for examination results:")


print(df)

print("\nColumn data types:")


print(df.dtypes)

print("\nDimensions (rows, columns):")


print(df.shape)
output

DataFrame for examination results:


Name Math Science
0 Alice 85 90
1 Bob 70 88
2 Charlie 92 78

Column data types:


Name object
Math int64
Science int64
dtype: object

Dimensions (rows, columns):


(3, 3)

5. Filter out rows based on different criteria such as duplicate rows


import pandas as pd
import pandas as pd

data = {
'Name': ['Alice', 'Bob', 'Alice', 'Charlie', 'Bob'],
'Score': [85, 70, 85, 92, 70]
}

df = pd.DataFrame(data)

# Remove duplicate rows based on all columns


df_no_duplicates = df.drop_duplicates()

print("DataFrame without duplicate rows:")


print(df_no_duplicates)

output
DataFrame without duplicate rows:
Name Score
0 Alice 85
1 Bob 70
3 Charlie 92

6. Importing and exporting data between pandas and CSV file

import pandas as pd
# Export DataFrame to CSV file
data = {'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 28]}
df = pd.DataFrame(data)
df.to_csv('output.csv', index=False)

# Import CSV file into DataFrame


imported_df = pd.read_csv('output.csv')
print("Imported DataFrame from CSV:")
print(imported_df)

Output
Imported DataFrame from CSV:
Name Age
0 Alice 25
1 Bob 30
2 Charlie 28

7)Create a Series from ndarray


i)Without index

import pandas as pd1


import numpy as np1
data = np1.array(['a','b','c','d'])
s = pd1.Series(data)
print(s)
Output
1a
2b
3c
4d
dtype: object
Note : default index is starting
from 0

ii)With index position

import pandas as p1
import numpy as np1
data = np1.array(['a','b','c','d'])
s = p1.Series(data,index=[100,101,102,103])
print(s)
Output
100 a
101 b
102 c
103d dtype:
Object

iii)Create a Series from dict


Eg.1(without index)
import pandas as pd1
import numpy as np1
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd1.Series(data)
print(s)
Output
a 0.0
b 1.0
c 2.0
dtype: float64

Eg.2 (with index)


import pandas as pd1
import numpy as np1
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd1.Series(data,index=['b','c','d','a'])
print(s)
Output
b 1.0
c 2.0
d NaN
a 0.0
dtype: float64

8.(A) Creation of Series from Scalar Values:


(A) Creation of Series from Scalar Values
A Series can be created using scalar values as shown in
the example below:
>>> import pandas as pd #import Pandas with alias pd
>>> series1 = pd.Series([10,20,30]) #create a Series
>>> print(series1) #Display the series
Output:
0 10
1 20
2 30
dtype: int64

User-defined labels to the index


and use them to access elements of a Series.
The following example has a numeric index in random order.
>>> series2 = pd.Series(["Kavi","Shyam","Ra
vi"], index=[3,5,1])
>>> print(series2) #Display the series

Output:
3 Kavi
5 Shyam
1 Ravi
dtype: object

(B) Creation of Series from NumPy Arrays


We can create a series from a one-dimensional (1D)
NumPy array, as shown below:

>>> import numpy as np # import NumPy with alias np


>>> import pandas as pd
>>> array1 = np.array([1,2,3,4])
>>> series3 = pd.Series(array1)
>>> print(series3)

Output:
01
12
23
34
dtype: int32
(C) Creation of Series from Dictionary
Dictionary keys can be used to construct an index for a
Series . Here, keys of the dictionary dict1 become indices in the series.

>>> dict1 = {'India': 'NewDelhi', 'UK':


'London', 'Japan': 'Tokyo'}
>>> print(dict1) #Display the dictionary
{'India': 'NewDelhi', 'UK': 'London', 'Japan':
'Tokyo'}
>>> series8 = pd.Series(dict1)
>>> print(series8) #Display the series

Output:
India NewDelhi
UK London
Japan Tokyo
dtype: object

9)
Pandas Series
Head function

import pandas as pd1


s = pd1.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
print (s.head(3))
Output
a1
b. 2
c. 3
dtype: int64
Pandas Series tail function e.g import pandas as pd1 s =
pd1.Series([1,2,3,4,5],index = ['a','b','c','d','e']) print (s.tail(3)) Output c 3 d.
4 e. 5 dtype: int64

10)
Pandas Series
tail function

import pandas as pd1


s = pd1.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
print (s.tail(3))
Output
c3
d. 4
e. 5
dtype: int64
11)Accessing Data from Series with indexing and slicing

import pandas as pd1


s = pd1.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
print (s[0])# for 0 index position
print (s[:3]) #for first 3 index values
print (s[-3:]) # slicing for last 3 index values
Output
1
a. 1
b. 2
c. 3
dtype: int64 c 3
d. 4
e. 5
dtype: int64

12)Create a DataFrame from Dict of ndarrays / Lists


e.g.1
import pandas as pd1
data1 = {'Name':['Freya', 'Mohak'],'Age':[9,10]}
df1 = pd1.DataFrame(data1)
print (df1)

Output
Name Age
1 Freya 9
2 Mohak 10

13)Create a DataFrame from List of Dicts


e.g.1
import pandas as pd1
data1 = [{'x': 1, 'y': 2},{'x': 5, 'y': 4, 'z': 5}]
df1 = pd1.DataFrame(data1)
print (df1)

Output
xyz
0 1 2 NaN
1 5 4 5.0
14)Row Selection, Addition, and Deletion
#Selection by Label
import pandas as pd1
d1 = {'one' : pd1.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd1.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])} df1
= pd1.DataFrame(d1)
print (df1.loc['b'])

Output
one 2.0
two 2.0
Name: b, dtype: float64

15)selection by integer location

import pandas as pd1


d1 = {'one' : pd1.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd1.Series([1, 2, 3, 4], index=['a', 'b', 'c','d'])}
df1 = pd1.DataFrame(d1)
print (df1.iloc[2])

Output
one 3.0
two 3.0
Name: c, dtype: float64
16)Iterate over rows in a dataframe

import pandas as pd1


import numpy as np1
raw_data1 = {'name': ['freya', 'mohak'],
'age': [10, 1],
'favorite_color': ['pink', 'blue'],
'grade': [88, 92]}
df1 = pd1.DataFrame(raw_data1, columns = ['name', 'age',
'favorite_color', 'grade'])
for index, row in df1.iterrows():
print (row["name"], row["age"])

Output
freya 10
mohak 1
17)Binary operation over dataframe with series

import pandas as pd
x = pd.DataFrame({0: [1,2,3], 1: [4,5,6], 2: [7,8,9] })
y = pd.Series([1, 2, 3])
new_x = x.add(y, axis=0)
print(new_x)
Output 0 1 2
0147
1 4 10 16
2 9 18 27
18)Binary operation over
dataframe with dataframe

import pandas as pd
x = pd.DataFrame({0: [1,2,3], 1: [4,5,6], 2: [7,8,9] })
y = pd.DataFrame({0: [1,2,3], 1: [4,5,6], 2: [7,8,9] })
new_x = x.add(y, axis=0)
print(new_x)
Output
012
0 2 8 14
1 4 10 16
2 6 12 18
19)Merging/joining dataframe

import pandas as pd
left = pd.DataFrame({
'id':[1,2],
'Name': ['anil', 'vishal'],
'subject_id':['sub1','sub2']})
right = pd.DataFrame(
{'id':[1,2],
'Name': ['sumer', 'salil'],
'subject_id':['sub2','sub4']})
print (pd.merge(left,right,on='id'))

Output1

id Name_x subject_id_x Name_y subject_id_y


0 1 anil sub1 sumer sub2
1 2 vishal sub2 salil sub4

1)Plot following data on line chart:

Day Monday Tuesday Wednesday Thursday Friday

Income 510 350 475 580 600

1. Write a title for the chart “The Weekly Income Report”.


2. Write the appropriate titles of both the axes.
3. Write code to Display legends.
4. Display red color for the line.
5. Use the line style – dashed
6. Display diamond style markers on data points

input
import matplotlib.pyplot as pp
day
=['Monday','Tuesday','Wednesday','Thursday','Friday']
inc = [510,350,475,580,600]
pp.plot(day,inc,label='Income',color='r',linestyle='d
ashed',marker='D')
pp.title("The Weekly Income Report")
pp.xlabel("Days")
pp.ylabel("Income")
pp.legend()
pp.show()

Output:
2) A Shivalik restaurant has recorded the following data into their register
for their income by Drinks and Food. Plot them on the line chart.
Day Monday Tuesday Wednesday Thursday Friday

Drinks 450 560 400 605 580

Food 490 600 425 610 625

input
import matplotlib.pyplot as pp
day
=['Monday','Tuesday','Wednesday','Thursday','Friday']
dr = [450,560,400,605,580]
fd = [490,600,425,610,625]
pp.plot(day,dr,label='Drinks',color='g',linestyle='do
tted',marker='+')
pp.plot(day,fd,label='Food',color='m',linestyle='dash
dot',marker='x')
pp.title("The Weekly Restaurant Orders")
pp.xlabel("Days")
pp.ylabel("Orders")
pp.legend()
pp.show()

Output:
3) Write a program to plot a range from 1 to 30 with step value 4. Use
the following algebraic expression to show data.
y = 5*x+2
import matplotlib.pyplot as pp
import numpy as np
x = np.arange(1,30,4)
y = 5 * x + 2
pp.plot(x,y)
pp.show()
Output:
4)Display following bowling figures through bar chart:
Overs Runs

1 6

2 18

3 10

4 5

import matplotlib.pyplot as pp
overs =[1,2,3,4]
runs=[6,18,10,5]
pp.bar(overs,runs,color='m')
pp.xlabel('Overs')
pp.xlabel('Runs')
pp.title('Bowling Spell Analysis')
pp.xticks([1,2,3,4])
pp.yticks([5,10,15,20])
pp.show()

Output:
5)Given the school result data,analyse the performance of 5 students on
different parameter, e.g. subject wise.
Below is implementation code /source code
Here is program code to analyse the performance of student data on
different parameter, e.g. subject wise or class wise

1 import matplotlib.pyplot as plt


2 import pandas as pd
3 import numpy as np
4 marks = { "English" :[67,89,90,55],
5 "Maths":[55,67,45,56],
6 "IP":[66,78,89,90],
7 "Chemistry" :[45,56,67,65],
8 "Biology":[54,65,76,87]}
9 df =
1 pd.DataFrame(marks,index=['Sumedh','Athang','Sushi
0 l','Sujata'])
1 print("******************Marksheet****************
1 ")
1 print(df)
2 df.plot(kind='bar')
1 plt.xlabel(" ")
3 plt.ylabel(" ")
1 plt.show()
4
1
5

Below is output:
******************Marksheet****************
English Maths IP Chemistry Biology
Sumedh 67 55 66 45 54
Athang 89 67 78 56 65
Sushil 90 45 89 67 76
Sujata 55 56 90 65 87

Below is bar chart showing performance of students


subject-wise
MYSQL
1. Create a student table with the student id, name, and
marks as attributes where the student id is the primary key.

i)Create table student


Input:
CREATE TABLE student
( student_id INT PRIMARY KEY, name VARCHAR(50), marks
INT );
Output: Table student is created

ii)insert 5 values:
Input:
INSERT INTO student values(1, 'John', 85),
(2, 'Jane', 75),
(3, 'Michael', 90),
(4, 'Emma', 95),
(5, 'William', 78);

2. Insert the details of a new student in the above table

Input:
INSERT INTO student (student_id, name, marks) VALUES (6, 'Olivia',
88);
Output: 1 new record is inserted
3. Delete the details of a student in the above table.

Input:
DELETE FROM student WHERE student_id = 3;

Output: Student with student_id 3 is deleted.

4. Use the select command to get the details of the students with marks
more than 80.

Input:
SELECT * FROM student WHERE marks > 80;

Output:
| student_id | name | marks |
|------------|----------|-------|
| 1 | John | 85 |
| 3 | Michael | 90 |
| 4 | Emma | 95 |
| 6 | Olivia | 88 |
5. Find the min, max, sum, and average of the marks in a student
marks table.

Input:
SELECT MIN(marks) AS min_marks, MAX(marks) AS max_marks, SUM(marks) AS
sum_marks, AVG(marks) AS avg_marks
FROM student;

Output:
| min_marks | max_marks | sum_marks | avg_marks |
|-----------|-----------|-----------|-----------|
| 75 | 95 | 423 | 84.6 |

6.Find the total number of customers from each country in the


table (customer ID, customer Name, country) using group by.

i) create table customers


Input:
CREATE TABLE customers ( customer_id INT PRIMARY KEY,
customer_name VARCHAR(50), country VARCHAR(50) );

ii)Describe the structure of the customer table:


Input:
DESCRIBE customer;

Output:

| Field | Type | Null | Key | Default | Extra |

|---------------|--------------|------|-----|---------|----------------|

| customer_id | int(11) | NO | PRI | NULL | auto_increment|

| customer_name | varchar(50) | YES | | NULL | |

| country | varchar(50) | YES | | NULL | |

iii)Insert 6 values into the customer table:

Input:
INSERT INTO customers VALUES(1, 'Alice', 'USA'),
(2, 'Bob', 'Canada'),
(3, 'Charlie', 'USA'),
(4, 'David', 'UK'),
(5, 'Eva', 'Canada'),
(6, 'Frank', 'Germany');

Output: Table customers is created, and 6 records are inserted.

iv)Display all values in the customer table:


Input:
SELECT * FROM customer;

Output:

| customer_id | customer_name | country |


|-------------|---------------|----------|
| 1 | Alice | USA |
| 2 | Bob | Canada |
| 3 | Charlie | USA |
| 4 | David | UK |
| 5 | Eva | Canada |
| 6 | Frank | Germany |

v)Find the total number of customers from each country using GROUP
BY:
Input:
SELECT country, COUNT(*) AS num_customers FROM
customers GROUP BY country;
Output:

| country | num_customers |
|---------|---------------|
| Canada | 2 |
| Germany | 1 |
| UK |1 |
| USA | 2 |

7.Write a SQL query to order the (student ID, marks) table in


descending order of the marks.

Input:
SELECT student_id, marks
FROM student
ORDER BY marks DESC;
Output:
| student_id | marks |
|------------|-------|
|4 | 95 |
|3 | 90 |
|6 | 88 |
|1 | 85 |
|5 | 78 |
|2 | 75 |

You might also like