LAB EXPERIMENT #7: Linear Transformation for Random Variables
Pre-requisite
1. Required basic knowledge of probability theory.
Pre-lab.
1. Explain what a random variable is.
Answer: A random variable is a variable whose value is unknown, or a function that assigns values to each of an experiment's outcomes.
Output:
[Handwritten listing of the first and last rows of the data set with columns sepal_length, sepal_width, petal_length, petal_width, species; row values not legible.]
mean:
sepal_length    5.843333
sepal_width     3.054000
petal_length    3.758667
petal_width     1.198667
dtype: float64
median:
sepal_length    5.80
sepal_width     3.00
[petal_length and petal_width medians not legible]
dtype: float64
19CS2205 Data Science
In-lab.
Data-set
The Iris flower data set or Fisher's Iris data set is a multivariate data set introduced by the British statistician, eugenicist, and biologist Ronald Fisher in 1936. The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. Based on the combination of these four features, Fisher developed a linear discriminant model to distinguish the species from each other.
The dataset is available at the following link.
https://www.kaggle.com/arshid/iris-flower-dataset
a. Read data from the .CSV file. Get the basic statistics like mean, median, variance and standard deviation of the attributes sepal_length, sepal_width, petal_length, petal_width of the data set.
b. Find the correlation between the attributes (sepal_length, sepal_width) and (petal_length, petal_width).
c. Make a linear transformation of the attribute sepal_length by adding 1.5 to it. Similarly make a linear transformation of the attribute petal_length by multiplying it by two.
d. Now find the correlation between the attributes (sepal_length, sepal_width) and (petal_length, petal_width). Then analyze it and draw a conclusion.
e. Make a linear transformation of the attribute sepal_length by subtracting 1.5 from it. Similarly make a linear transformation of the attribute petal_length by dividing it by two.
f. Now find the correlation between the attributes (sepal_length, sepal_width) and (petal_length, petal_width). Then analyze it and draw a conclusion.
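Tasks a to f can be sketched with pandas as follows. This is a minimal illustration, not a model answer. The five-row inline sample is hypothetical and only mimics the column layout of the Kaggle file, and the names of the transformed columns (added_sepal_length and so on) are assumptions chosen for readability.

```python
import io
import pandas as pd

# Hypothetical five-row sample with the same columns as the Kaggle IRIS.csv file.
SAMPLE_CSV = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

# For the real data, replace this with pd.read_csv("IRIS.csv") (path is an assumption).
df = pd.read_csv(io.StringIO(SAMPLE_CSV))
cols = ["sepal_length", "sepal_width", "petal_length", "petal_width"]

# Task a: basic statistics. pandas computes the sample variance (ddof=1) by default.
stats = df[cols].agg(["mean", "median", "var", "std"])
print(stats)

# Task b: Pearson correlation of the two attribute pairs.
corr_sepal = df["sepal_length"].corr(df["sepal_width"])
corr_petal = df["petal_length"].corr(df["petal_width"])

# Tasks c and e: linear transformations (shift by 1.5, scale by 2 or 1/2).
df["added_sepal_length"] = df["sepal_length"] + 1.5
df["multiplied_petal_length"] = df["petal_length"] * 2
df["subtracted_sepal_length"] = df["sepal_length"] - 1.5
df["divided_petal_length"] = df["petal_length"] / 2

# Tasks d and f: correlations after the transformations.
corr_sepal_added = df["added_sepal_length"].corr(df["sepal_width"])
corr_petal_multiplied = df["multiplied_petal_length"].corr(df["petal_width"])
corr_sepal_subtracted = df["subtracted_sepal_length"].corr(df["sepal_width"])
corr_petal_divided = df["divided_petal_length"].corr(df["petal_width"])

print("sepal pair:", corr_sepal, corr_sepal_added, corr_sepal_subtracted)
print("petal pair:", corr_petal, corr_petal_multiplied, corr_petal_divided)
```

Within each pair the three printed correlations should agree up to floating-point rounding, because the Pearson correlation is unchanged when a variable is shifted by a constant or scaled by a positive constant.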
df = pd.read_csv('/content/IRIS.csv')
Output: [handwritten, partly legible]
std:
sepal_length    0.828066
sepal_width     0.433594
petal_length    1.764420
dtype: float64
var:
sepal_length    0.685694
sepal_width     0.188004
petal_length    3.113179
petal_width     0.582414
dtype: float64
Correlation matrix (legible entries):
(sepal_length, sepal_width)    -0.109369
(sepal_length, petal_length)    0.871754
(sepal_length, petal_width)     0.817954
(sepal_width, petal_length)    -0.420516
(sepal_width, petal_width)     -0.356544
(petal_length, petal_width)     0.962757
Rows for the transformed columns added_sepal_length and division_petal_length are also present.
df['added_sepal_length'] = df['sepal_length'] + 1.5
print(df.head())
df['division_petal_length'] = df['petal_length'] / 2
print(df['division_petal_length'])
[remaining handwritten statements not legible]
Output:
[Handwritten listing of the first rows of the data set (all Iris-setosa) with columns sepal_length, sepal_width, petal_length, petal_width, species and the transformed columns; values not legible.]
Output:
[Handwritten listing of the first rows of the supermarket sales data (for example branch A, Yangon, Health and beauty; branch C, Naypyitaw, Electronic accessories; Home and lifestyle), followed by the Min-Max scaled values of unit price and Total; numbers not reliably legible.]
Post-lab.
DATA SET: available at the following link.
https://www.kaggle.com/aungpyaeap/supermarket-sales
Handwritten note: Electronic accessories, Health and beauty, Home and lifestyle.
Find the correlation between the attributes unit-price and Total. Then transform the attributes unit-price and Total using normalization. Then find the correlation between unit-price and Total.
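The post-lab task can be sketched as below, assuming that "normalization" means min-max scaling to the range [0, 1] (the task text does not name the method). The five rows are made-up values, and min_max is a hypothetical helper, not part of pandas.

```python
import pandas as pd

# Made-up rows; the real file has the columns "Unit price" and "Total" among others.
df = pd.DataFrame({
    "Unit price": [12.5, 30.0, 45.2, 60.1, 99.9],
    "Total": [52.5, 189.0, 94.9, 441.7, 314.7],
})

def min_max(series):
    """Rescale a numeric column linearly so its minimum maps to 0 and its maximum to 1."""
    return (series - series.min()) / (series.max() - series.min())

# Correlation before normalization.
corr_before = df["Unit price"].corr(df["Total"])

# Apply the helper to each column, then recompute the correlation.
scaled = df.apply(min_max)
corr_after = scaled["Unit price"].corr(scaled["Total"])

print(scaled)
print("before:", corr_before, "after:", corr_after)
```

Min-max scaling is itself a linear transformation with a positive scale factor, so the two correlations should match up to rounding. sklearn.preprocessing.MinMaxScaler with its default feature_range performs the same rescaling.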
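import pandas as pd
import numpy as np
df2 = pd.read_csv('/content/supermarket_sales - Sheet1.csv')
df2.head()
# correlation before normalization
df2['Unit price'].corr(df2['Total'])
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
# scale both attributes to the range [0, 1]
scaled = scaler.fit_transform(df2[['Unit price', 'Total']])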
Viva Voce:
X = {1, 2, 3, 4, 5, 6}
2. Consider X (a random variable) to be the number of heads obtained in three tosses of a coin.
Answer: The possible numbers of heads are 0, 1, 2, 3, so X = {0, 1, 2, 3}.
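The three-toss question can be checked by enumerating all outcomes. The sketch below assumes a fair coin, which the question does not state, and uses only the standard library.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 equally likely sequences of three tosses (fair coin assumed).
outcomes = list(product("HT", repeat=3))

# X maps each sequence to its number of heads.
counts = Counter(sequence.count("H") for sequence in outcomes)

# Probability of each value of X as an exact fraction.
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(X = {x}) = {p}")
```

Under the fair-coin assumption the values 0, 1, 2, 3 have probabilities 1/8, 3/8, 3/8, 1/8.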
Pre-lab
1. What is expectation of a random variable?
Answer: The expectation of a random variable is its long-term average value. The expected value is usually denoted as E(X).
2. Calculate expectation of a discrete random variable.
Answer: For a discrete random variable, the expected value is calculated by summing the product of each value of the random variable and its associated probability, taken over all values of the random variable: E(X) = Σ x · f(x).
P(Y = y | X = x) = P(Y = y), for all x and y.
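The summation described in the answer to question 2 can be written as a short function. This is a sketch using a fair six-sided die as the example distribution. The die and the helper name expected_value are assumptions for illustration.

```python
from fractions import Fraction

def expected_value(pmf):
    """Return the sum of x * P(X = x) over all values x in the PMF."""
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in pmf.items())

# Fair die: each face 1..6 has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

print(expected_value(die_pmf))  # prints 7/2
```

Exact fractions are used so that the check on the total probability is not affected by floating-point rounding.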
Output:
[Handwritten listing of the first and last rows of the full data set and of the Iris-setosa rows, with columns sepal_length, sepal_width, petal_length, petal_width, species; values not legible.]
Unique values of sepal_length: [5.1 4.9 4.7 4.6 5. 5.4 ...]
[Table with columns Sepal length, Frequency, Probability; values not legible.]
In-lab
Data-set
The Iris flower data set or Fisher's Iris data set is a multivariate data set introduced by the British statistician, eugenicist, and biologist Ronald Fisher in 1936. The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. Based on the combination of these four features, Fisher developed a linear discriminant model to distinguish the species from each other.
The dataset is available at the following link.
https://www.kaggle.com/arshid/iris-flower-dataset
a. From the above data set only consider the species iris-setosa, get all the distinct values and their frequency for the attribute sepal-length.
b. Calculate the probability of each distinct value of the attribute sepal-length for species iris-setosa.
c. Now, calculate the expected value of the attribute sepal-length for species iris-setosa and draw your conclusion.
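Tasks a to c can be sketched with pandas value_counts instead of explicit loops. The eight-row frame below is a hypothetical stand-in for the real file, and the column names sepal_length and species are assumed to match the Kaggle CSV.

```python
import pandas as pd

# Hypothetical stand-in for the data set: seven setosa rows and one other row.
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7, 5.0, 5.1, 5.0, 5.0, 6.3],
    "species": ["Iris-setosa"] * 7 + ["Iris-virginica"],
})

# Keep only the sepal_length values of the species of interest.
setosa = df.loc[df["species"] == "Iris-setosa", "sepal_length"]

# Task a: distinct values and their frequencies.
frequency = setosa.value_counts().sort_index()

# Task b: probability of each distinct value as a relative frequency.
probability = frequency / frequency.sum()
print(pd.DataFrame({"frequency": frequency, "probability": probability}))

# Task c: expected value = sum of value * probability.
expected_value = (probability.index.to_numpy() * probability.to_numpy()).sum()
print(expected_value, setosa.mean())
```

With probabilities taken as relative frequencies, the expected value should coincide with the ordinary sample mean up to rounding, which is one conclusion the task may be aiming at.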
Writing space for the Problem: (For Student's use only)
import pandas as pd
import numpy as np
df = pd.read_csv('/content/IRIS.csv')
# keep only the Iris-setosa rows
df1 = df.loc[df['species'] == 'Iris-setosa']
temp = df1['sepal_length'].unique()
print(temp)
frequency = []
probability = []
for i in temp:
    count = 0
    for j in df1['sepal_length']:
        if i == j:
            count = count + 1
    frequency.append(count)
    probability.append(count / len(df1))
print(pd.DataFrame({'Sepal length': temp, 'Frequency': frequency, 'Probability': probability}))
expected_value = 0
for i in range(len(temp)):
    expected_value = expected_value + temp[i] * probability[i]
print('Expected value', expected_value)
Output:
[Handwritten listing of the first and last rows of the full data set and of the Iris-virginica rows, with columns sepal_length, sepal_width, petal_length, petal_width, species; values not legible.]
[Unique values, followed by a table with columns length, Frequency, Probability; values not legible.]
Post-Lab:
a. From the above data set only consider the species Iris-virginica, get all the distinct values and their frequency for the attribute Petal Length.
df = pd.read_csv('IRIS.csv')
# keep only the Iris-virginica rows
df1 = df.loc[df['species'] == 'Iris-virginica']
temp = df1['petal_length'].unique()
print(temp)
frequency = []
probability = []
for i in temp:
    count = 0
    for j in df1['petal_length']:
        if i == j:
            count = count + 1
    frequency.append(count)
    probability.append(count / len(df1))
print(frequency)
print(probability)
c. Now, calculate the expected value of the attribute petal-length for species Iris-virginica and draw your conclusion.
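For the conclusion in part c, a short derivation may help. It assumes that each probability is the relative frequency n_x / N, where n_x is the number of samples with value x and N is the number of samples of the species. These symbols are introduced here for illustration and do not appear in the worksheet.

```latex
E[X] \;=\; \sum_{x} x \, P(X = x)
     \;=\; \sum_{x} x \, \frac{n_x}{N}
     \;=\; \frac{1}{N} \sum_{i=1}^{N} x_i
     \;=\; \bar{x}
```

Under that assumption the expected value is the arithmetic mean of the attribute for the selected species.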
print(pd.DataFrame({'Petal length': temp, 'Frequency': frequency, 'Probability': probability}))
expected_value = 0
for i in range(len(temp)):
    expected_value = expected_value + temp[i] * probability[i]
print('Expected Value', expected_value)
Output:
Expected Value = [value not legible]
Viva-Voce
1. What do you understand by probability mass function (PMF) and probability density function (PDF)?
Answer:
Probability mass function (PMF): a function that gives the probability that a discrete random variable is exactly equal to some value.
Joint probability: the probability of two events occurring simultaneously.