
Name  : Kasiful Aprianto (12.7209)
Class : 4 KS 2

APG Assignment: Multivariate QQ Plot

Problem 1:
Given (problem on page 186):

No.    x1    x2    x3    x4
  1  1889  1651  1561  1778
  2  2403  2048  2087  2197
  3  2119  1700  1815  2222
  4  1645  1627  1110  1533
  5  1976  1916  1614  1883
  6  1712  1712  1439  1546
  7  1943  1685  1271  1671
  8  2104  1820  1717  1874
  9  2983  2794  2412  2581
 10  1745  1600  1384  1508
 11  1710  1591  1518  1667
 12  2046  1907  1627  1898
 13  1840  1841  1595  1741
 14  1867  1685  1493  1678
 15  1859  1649  1389  1714
 16  1954  2149  1180  1281
 17  1325  1170  1002  1176
 18  1419  1371  1252  1308
 19  1828  1634  1602  1755
 20  1725  1594  1313  1646
 21  2276  2189  1547  2111
 22  1899  1614  1422  1477
 23  1633  1513  1290  1516
 24  2061  1867  1646  2037
 25  1856  1493  1356  1533
 26  1727  1412  1238  1469
 27  2168  1896  1701  1834
 28  1655  1675  1414  1597
 29  2326  2301  2065  2234
 30  1490  1382  1214  1284

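Draw the QQ plot!

The squared generalized distance of each observation is

$$ d_i^2 = (x_i - \bar{x})' S^{-1} (x_i - \bar{x}), \quad i = 1, \dots, n $$

Find S

$$ \bar{x} = \frac{1}{n} X' \mathbf{1}_{(n \times 1)} $$

$$ d_i = x_i - \bar{x} $$

$$ d = \begin{bmatrix}
x_{11} - \bar{x}_1 & x_{12} - \bar{x}_2 & \cdots & x_{14} - \bar{x}_4 \\
x_{21} - \bar{x}_1 & x_{22} - \bar{x}_2 & \cdots & x_{24} - \bar{x}_4 \\
\vdots & \vdots & & \vdots \\
x_{n1} - \bar{x}_1 & x_{n2} - \bar{x}_2 & \cdots & x_{n4} - \bar{x}_4
\end{bmatrix} $$

$$ S = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})' = \frac{1}{n-1} d'd $$

The divisor n - 1 is the one used in the code in the "QQ plot Code" section. With S in hand, the squared distance of observation i is

$$ d_i^2 = d_i' S^{-1} d_i $$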
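The matrix steps above can be sketched in a few lines of R. This is only an illustration on made-up toy data (the matrix `X` below is not from the assignment), and it uses the built-in `solve()` in place of the hand-written inverse shown later in the report; `mahalanobis()` from base R serves as a cross-check.

```r
# Toy data: 5 observations on 2 variables (illustrative values only).
X <- matrix(c(2, 4, 6, 8, 10,
              1, 3, 2, 5, 4), ncol = 2)
n <- nrow(X)

# Mean vector: x_bar = (1/n) X' 1
x_bar <- (1 / n) * t(X) %*% matrix(1, nrow = n, ncol = 1)

# Deviation matrix d: subtract each column mean from its column
d <- sweep(X, 2, as.vector(x_bar))

# Sample covariance matrix with divisor n - 1
S <- (1 / (n - 1)) * t(d) %*% d

# Squared distance of every row: d_i' S^{-1} d_i
d2 <- rowSums((d %*% solve(S)) * d)

# Cross-check with the base R implementation of the same quantity
d2_check <- mahalanobis(X, center = as.vector(x_bar), cov = S)
```

A handy sanity check is that the squared distances always add up to (n - 1) times the number of variables when S uses the divisor n - 1.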

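Find the standard chi-square quantiles

Build the probabilities p in increasing order:

$$ p = \left[ \frac{1 - \frac{1}{2}}{n}, \; \frac{2 - \frac{1}{2}}{n}, \; \dots, \; \frac{i - \frac{1}{2}}{n}, \; \dots, \; \frac{n - \frac{1}{2}}{n} \right] $$

Obtain the qchisq value for each probability p (qchisq also needs the degrees of freedom, df):
std.quantile = qchisq(p, df)

Sort the d^2 values, then plot std.quantile on the x-axis and d^2 on the y-axis.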
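As a minimal sketch of the quantile step, the probabilities (i - 1/2)/n can be built directly or with `ppoints()`, and then passed to `qchisq()`. The choice of `df` is shown here as an explicit variable: the values printed in this report agree with `df = 2`, whereas the usual reference distribution for p variables is chi-square with p degrees of freedom, so `df_p` below is included only for comparison.

```r
n <- 30                      # number of observations in Problem 1
i <- 1:n
p <- (i - 0.5) / n           # probabilities (i - 1/2) / n

# ppoints() with a = 0.5 produces the same sequence
p_alt <- ppoints(n, a = 0.5)

# Quantiles with df = 2 (matches the values printed in the report)
df_2 <- qchisq(p, df = 2)

# Quantiles with df = 4, the number of variables in Problem 1
df_p <- qchisq(p, df = 4)

# For df = 2 the chi-square quantile has the closed form -2 * log(1 - p)
closed_form <- -2 * log(1 - p)
```

Sorting the squared distances and plotting them against either quantile vector gives the QQ plot; only the horizontal scale changes with `df`.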

Plot result for Problem 1

> plot1$nilai
       d.kuadrat std.chisq.quantile
 [1,]  0.1295714         0.03361424
 [2,]  0.4635159         0.10258659
 [3,]  0.6000129         0.17402275
 [4,]  0.7665400         0.24810530
 [5,]  0.7962096         0.32503786
 [6,]  1.0792484         0.40504853
 [7,]  1.3632124         0.48839392
 [8,]  1.3980776         0.57536414
 [9,]  1.4649908         0.66628889
[10,]  1.4876570         0.76154499
[11,]  1.9307771         0.86156583
[12,]  2.2191409         0.96685330
[13,]  2.3816050         1.07799300
[14,]  2.5385575         1.19567400
[15,]  2.5838186         1.32071472
[16,]  2.6959024         1.45409746
[17,]  2.9951752         1.59701539
[18,]  3.3979804         1.75093747
[19,]  3.5018290         1.91770069
[20,]  3.9900603         2.09964425
[21,]  4.5767867         2.29981117
[22,]  4.9883498         2.52226244
[23,]  5.0557446         2.77258872
[24,]  5.2076098         3.05879041
[25,]  5.4770196         3.39289858
[26,]  6.2837628         3.79423997
[27,]  7.6166439         4.29686883
[28,]  9.8980384         4.96981330
[29,] 12.2647550         5.99146455
[30,] 16.8474070         8.18868912

The points in the plot follow a straight line, so the given data can be said to follow a
(multivariate) normal distribution.
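One rough way to put a number on "follows a straight line" is the correlation between the two columns of the output. The sketch below simply copies the 30 printed pairs from the Problem 1 output; the use of the correlation coefficient as a straightness measure is an addition for illustration, not part of the original answer.

```r
# Values copied from the plot1$nilai output of Problem 1
d.kuadrat <- c(
  0.1295714, 0.4635159, 0.6000129, 0.7665400, 0.7962096, 1.0792484,
  1.3632124, 1.3980776, 1.4649908, 1.4876570, 1.9307771, 2.2191409,
  2.3816050, 2.5385575, 2.5838186, 2.6959024, 2.9951752, 3.3979804,
  3.5018290, 3.9900603, 4.5767867, 4.9883498, 5.0557446, 5.2076098,
  5.4770196, 6.2837628, 7.6166439, 9.8980384, 12.2647550, 16.8474070)
std.chisq.quantile <- c(
  0.03361424, 0.10258659, 0.17402275, 0.24810530, 0.32503786, 0.40504853,
  0.48839392, 0.57536414, 0.66628889, 0.76154499, 0.86156583, 0.96685330,
  1.07799300, 1.19567400, 1.32071472, 1.45409746, 1.59701539, 1.75093747,
  1.91770069, 2.09964425, 2.29981117, 2.52226244, 2.77258872, 3.05879041,
  3.39289858, 3.79423997, 4.29686883, 4.96981330, 5.99146455, 8.18868912)

# Correlation of the QQ points: values near 1 indicate a near-linear pattern
r <- cor(std.chisq.quantile, d.kuadrat)

# Least-squares line through the points, as drawn in red by the report's code
garis <- lm(d.kuadrat ~ std.chisq.quantile)
print(r)
print(coef(garis))
```

A correlation close to 1 supports the visual impression of a straight line, although it is a descriptive summary rather than a formal test of normality.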

Problem 2:
Using the Hepatic Carcinoma data (Iizuka et al.), hep_japan.gct.txt
http://www.broadinstitute.org/mpr/publications/projects/Cancer_Susceptibility/hep_japan.gct

Excerpt of the data (first and last two sample columns):

No.   NAME             NonBC3T  HBV5T  ...  BL26T/HCV174T  BL27T/HCV182T
1     AFFX-BioB-5_at      16.7   27.4  ...           71.4          146.9
2     AFFX-BioB-M_at       1.8   -3.3  ...           51.1          117.3
3     AFFX-BioB-3_at      14.1   78.6  ...           25.3           52.2
4     AFFX-BioC-5_at     219.1  254.2  ...          188.5          314.7
5     AFFX-BioC-3_at     161.4  213.6  ...          135.4          311.1
6     AFFX-BioDn-5_at     -1.8   74.1  ...          250.3          505.8
...
7124  X16699_at
7125  X83863_at
7126  Z17240_at
7127  L49218_f_at
7128  M71243_f_at
7129  Z78285_f_at

Values shown for rows 7124 to 7129, listed in the order they appear in the source (the row and column of each value cannot be recovered from the layout):
-12.8, -15.7, -17.7, -29.5, -0.4, -1.2, 58.0, 37.1, 53.7, 66.4, 30.0, 15.9,
-17.0, -11.2, -6.4, -9.2, 6.9, 3.7, -4.0, -13.7, -19.6, -19.3, -18.4, -22.1

Description column:

AFFX-BioB-5_at: J04423 E coli bioB gene biotin synthetase (-5, -M, -3 represent transcript regions 5 prime, Middle, and 3 prime respectively)
AFFX-BioB-M_at: J04423 E coli bioB gene biotin synthetase (-5, -M, -3 represent transcript regions 5 prime, Middle, and 3 prime respectively)
AFFX-BioB-3_at: J04423 E coli bioB gene biotin synthetase (-5, -M, -3 represent transcript regions 5 prime, Middle, and 3 prime respectively)
AFFX-BioC-5_at: J04423 E coli bioC protein (-5 and -3 represent transcript regions 5 prime and 3 prime respectively)
AFFX-BioC-3_at: J04423 E coli bioC protein (-5 and -3 represent transcript regions 5 prime and 3 prime respectively)
AFFX-BioDn-5_at: J04423 E coli bioD gene dethiobiotin synthetase (-5 and -3 represent transcript regions 5 prime and 3 prime respectively)
...
X16699_at: X16699, class C, 20 probes, 20 in all_X16699 2053-2130, Human mRNA for cytochrome P-450HP
X83863_at: X83863, class A, 20 probes, 20 in X83863cds 1151-1241, H.sapiens mRNA for prostaglandin E receptor (EP3f)
Z17240_at: Z17240, class C, 20 probes, 20 in all_Z17240 956-1014, Homo sapiens for mRNA encoding HMG2B
L49218_f_at: L49218, class A, 20 probes, 20 in L49218exon 4-91, Homo sapiens retinoblastoma susceptibility protein (RB1) E413K 1 bp deletion mutant (resulting in premature stop at amino acid 416) gene, exon 13 (L11910 bases 73717-73901).
M71243_f_at: M71243, class B, 16 probes, 14 in M71243mRNA 25-38: 2 not in GB record, Human glycophorin Sta (type A) exons 3 and 4, partial. /gb=M71243 /ntype=DNA /annot=exon
Z78285_f_at: Z78285, class A, 20 probes, 20 in Z78285 3-137, H.sapiens mRNA (clone 1A7).

The data consist of 7129 observations on 60 variables.
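For reference, a standard GCT file starts with a version line (`#1.2`) and a dimensions line before the column header, so reading one usually means skipping two lines. The sketch below assumes that layout and uses a tiny in-memory excerpt (three probes, two samples, shortened descriptions) in place of the real file; whether the author's `hep_japan.gct.txt` still carried those two leading lines is not known from the report.

```r
# Tiny stand-in for a GCT file: version line, dimensions line, header, data.
# Values are the first three rows and first two sample columns shown above;
# the descriptions are shortened for the example.
gct_text <- paste(
  "#1.2",
  "3\t2",
  "NAME\tDescription\tNonBC3T\tHBV5T",
  "AFFX-BioB-5_at\tE coli bioB gene\t16.7\t27.4",
  "AFFX-BioB-M_at\tE coli bioB gene\t1.8\t-3.3",
  "AFFX-BioB-3_at\tE coli bioB gene\t14.1\t78.6",
  sep = "\n")

# Skip the two preamble lines, keep sample names exactly as written
gct <- read.delim(text = gct_text, skip = 2, header = TRUE,
                  check.names = FALSE, stringsAsFactors = FALSE)

# Use the probe names as row names and drop the two annotation columns
rownames(gct) <- gct$NAME
x <- as.matrix(gct[, -c(1, 2)])
print(dim(x))   # rows are probes (observations), columns are samples (variables)
```

With the full file, the same steps would be expected to give a 7129 by 60 numeric matrix, matching the dimensions stated above.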


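Draw the QQ plot!

The steps are the same as in Problem 1, now with n = 7129 observations and p = 60 variables.

$$ d_i^2 = (x_i - \bar{x})' S^{-1} (x_i - \bar{x}), \quad i = 1, \dots, n $$

Find S

$$ \bar{x} = \frac{1}{n} X' \mathbf{1}_{(n \times 1)} $$

$$ d_i = x_i - \bar{x} $$

$$ d = \begin{bmatrix}
x_{11} - \bar{x}_1 & x_{12} - \bar{x}_2 & \cdots & x_{1p} - \bar{x}_p \\
x_{21} - \bar{x}_1 & x_{22} - \bar{x}_2 & \cdots & x_{2p} - \bar{x}_p \\
\vdots & \vdots & & \vdots \\
x_{n1} - \bar{x}_1 & x_{n2} - \bar{x}_2 & \cdots & x_{np} - \bar{x}_p
\end{bmatrix} $$

$$ S = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})' = \frac{1}{n-1} d'd $$

$$ d_i^2 = d_i' S^{-1} d_i $$

Find the standard chi-square quantiles

Build the probabilities p in increasing order:

$$ p = \left[ \frac{1 - \frac{1}{2}}{n}, \; \frac{2 - \frac{1}{2}}{n}, \; \dots, \; \frac{i - \frac{1}{2}}{n}, \; \dots, \; \frac{n - \frac{1}{2}}{n} \right] $$

Obtain the qchisq value for each probability p (qchisq also needs the degrees of freedom, df):
std.quantile = qchisq(p, df)

Sort the d^2 values, then plot std.quantile on the x-axis and d^2 on the y-axis.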

Plot result for Problem 2

> plot2$plot
[Figure: QQ plot for Problem 2]

> plot2$nilai[1:20,]
      d.kuadrat std.chisq.quantile
 [1,] 0.2838672       0.0001402770
 [2,] 0.3198427       0.0004208607
 [3,] 0.3306744       0.0007014836
 [4,] 0.3358435       0.0009821460
 [5,] 0.3366150       0.0012628478
 [6,] 0.3392422       0.0015435889
 [7,] 0.3512671       0.0018243695
 [8,] 0.3512719       0.0021051895
 [9,] 0.3539925       0.0023860489
[10,] 0.3541168       0.0026669478
[11,] 0.3544927       0.0029478861
[12,] 0.3560702       0.0032288639
[13,] 0.3561437       0.0035098812
[14,] 0.3584968       0.0037909380
[15,] 0.3602497       0.0040720343
[16,] 0.3622715       0.0043531701
[17,] 0.3625841       0.0046343454
[18,] 0.3642986       0.0049155602
[19,] 0.3689574       0.0051968146
[20,] 0.3690493       0.0054781086
*only the first 20 of the 7129 observations are shown

The points in the plot do not follow a straight line; they curve upward, so the given data
can be said not to follow a (multivariate) normal distribution.

QQ plot Code
# preprocessing data #
# data1 is filled column by column: 30 values each for x1, x2, x3, x4
data1 <- matrix(c(
  # x1
  1889, 2403, 2119, 1645, 1976, 1712, 1943, 2104, 2983, 1745,
  1710, 2046, 1840, 1867, 1859, 1954, 1325, 1419, 1828, 1725,
  2276, 1899, 1633, 2061, 1856, 1727, 2168, 1655, 2326, 1490,
  # x2
  1651, 2048, 1700, 1627, 1916, 1712, 1685, 1820, 2794, 1600,
  1591, 1907, 1841, 1685, 1649, 2149, 1170, 1371, 1634, 1594,
  2189, 1614, 1513, 1867, 1493, 1412, 1896, 1675, 2301, 1382,
  # x3
  1561, 2087, 1815, 1110, 1614, 1439, 1271, 1717, 2412, 1384,
  1518, 1627, 1595, 1493, 1389, 1180, 1002, 1252, 1602, 1313,
  1547, 1422, 1290, 1646, 1356, 1238, 1701, 1414, 2065, 1214,
  # x4
  1778, 2197, 2222, 1533, 1883, 1546, 1671, 1874, 2581, 1508,
  1667, 1898, 1741, 1678, 1714, 1281, 1176, 1308, 1755, 1646,
  2111, 1477, 1516, 2037, 1533, 1469, 1834, 1597, 2234, 1284),
  ncol = 4, nrow = 30)
# data2: tab-separated file; the first two columns are NAME and Description
data2 <- read.csv(file = "C:/Users/kasiful/Documents/hep_japan.gct.txt",
                  header = TRUE, sep = "\t")
rownames(data2) <- data2$NAME
data2 <- data2[, -c(1, 2)]
# determinant function (Gaussian elimination), used when computing the inverse
determinan <- function(x) {
  n <- nrow(x)
  tanda <- 1  # sign of the determinant; flips on every row swap
  for (i in seq_len(n - 1)) {
    if (x[i, i] == 0) {
      # zero pivot: swap with the first row below that is nonzero in column i
      # and flip the sign
      for (j in (i + 1):n) {
        if (x[j, i] != 0) {
          tmp <- x[j, ]
          x[j, ] <- x[i, ]
          x[i, ] <- tmp
          tanda <- tanda * (-1)
          break
        }
      }
      # no nonzero entry found below the pivot: the matrix is singular
      if (x[i, i] == 0) {
        return(0)
      }
    }
    # eliminate the entries below the pivot
    for (j in (i + 1):n) {
      tmp <- x[j, i] / x[i, i] * x[i, ]
      x[j, ] <- x[j, ] - tmp
    }
  }
  # determinant = product of the diagonal of the triangular matrix, times the sign
  hasil <- prod(diag(x)) * tanda
  return(hasil)
}
# matrix inverse by Gauss-Jordan elimination, used by the QQ plot function
invers_gauss_jordan <- function(x) {
  if (determinan(x) == 0) {
    stop("matrix is singular")
  }
  n <- nrow(x)
  # augment x with the identity matrix: [x | I]
  x <- cbind(x, diag(n))
  # forward elimination (working downwards)
  for (i in seq_len(n - 1)) {
    if (x[i, i] == 0) {
      # zero pivot: swap with the first row below that is nonzero in column i
      for (j in (i + 1):n) {
        if (x[j, i] != 0) {
          tmp <- x[j, ]
          x[j, ] <- x[i, ]
          x[i, ] <- tmp
          break
        }
      }
    }
    # scale the pivot row so that the pivot equals 1
    x[i, ] <- x[i, ] / x[i, i]
    # eliminate the entries below the pivot
    for (j in (i + 1):n) {
      x[j, ] <- x[j, ] - x[j, i] * x[i, ]
    }
  }
  # backward elimination (working upwards)
  x[n, ] <- x[n, ] / x[n, n]
  if (n > 1) {
    for (i in n:2) {
      for (j in 1:(i - 1)) {
        x[j, ] <- x[j, ] - x[j, i] * x[i, ]
      }
    }
  }
  # the right-hand block of [I | x^-1] is the inverse
  hasil <- x[, (n + 1):(2 * n), drop = FALSE]
  return(hasil)
}
# build the multivariate QQ plot
qqplot_mtv <- function(x) {
  n <- nrow(x)
  # mean vector: x.bar = (1/n) X' 1
  x.bar <- 1 / n * t(x) %*% matrix(1, nrow = n, ncol = 1)
  # deviations from the mean, column by column
  x.min.xbar <- matrix(nrow = n, ncol = ncol(x))
  for (i in 1:ncol(x)) {
    x.min.xbar[, i] <- x[, i] - x.bar[i, ]
  }
  # sample covariance matrix (divisor n - 1) and its inverse
  L <- t(x.min.xbar) %*% x.min.xbar
  s <- 1 / (n - 1) * L
  s.inverse <- invers_gauss_jordan(s)
  # squared generalized distance of every observation
  d.kuadrat <- numeric(n)
  for (i in 1:n) {
    temp <- x.min.xbar[i, ]
    d.kuadrat[i] <- drop(t(temp) %*% s.inverse %*% temp)
  }
  d.kuadrat <- sort(d.kuadrat)
  # chi-square quantiles at the probabilities (i - 0.5) / n
  std.chisq.quantile <- (1:n - 0.5) / n
  # df = 2 is the value that produced the output listed in this report;
  # under multivariate normality the reference distribution has df = ncol(x)
  std.chisq.quantile <- qchisq(std.chisq.quantile, df = 2)
  data <- cbind(d.kuadrat, std.chisq.quantile)
  colnames(data) <- c("d.kuadrat", "std.chisq.quantile")
  # scatter plot with a fitted least-squares line in red
  plot(std.chisq.quantile, d.kuadrat)
  garis <- lm(d.kuadrat ~ std.chisq.quantile)
  abline(garis, col = "red")
  simpan_plot <- recordPlot()
  hasil <- list(data, simpan_plot)
  names(hasil) <- c("nilai", "plot")
  return(hasil)
}
#########################
#   print the results   #
#########################
plot1 <- qqplot_mtv(data1)
plot1$nilai
plot1$plot
plot2 <- qqplot_mtv(data2)
plot2$nilai[1:20, ]
plot2$plot
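For comparison, the same table of QQ-plot coordinates can be sketched with base R functions instead of the hand-written determinant and inverse. The function name `qqplot_mtv_base` and the toy matrix are hypothetical, introduced only for this illustration; `df` defaults to the number of variables, which is the usual chi-square reference, and can be set to 2 to mirror the listing above.

```r
# Compact variant using base R: cov(), colMeans(), mahalanobis(), qchisq().
# qqplot_mtv_base is a hypothetical name, not part of the original code.
qqplot_mtv_base <- function(x, df = ncol(x)) {
  x <- as.matrix(x)
  n <- nrow(x)
  # sorted squared generalized distances (covariance uses divisor n - 1)
  d.kuadrat <- sort(mahalanobis(x, center = colMeans(x), cov = cov(x)))
  # chi-square quantiles at the probabilities (i - 0.5) / n
  std.chisq.quantile <- qchisq((seq_len(n) - 0.5) / n, df = df)
  cbind(d.kuadrat, std.chisq.quantile)
}

# Toy data: 6 observations on 2 variables (illustrative values only)
x_toy <- matrix(c(1, 3, 4, 6, 8, 9,
                  2, 1, 5, 4, 9, 7), ncol = 2)
nilai_toy <- qqplot_mtv_base(x_toy)
print(nilai_toy)
```

The result can be drawn with `plot(nilai_toy[, 2], nilai_toy[, 1])`; when `df` equals the number of variables, the natural reference line is `abline(0, 1)` rather than a fitted regression line.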
