R language, Lecture 2: Graphical analysis (CuuDuongThanCong.com)
Overview
• Data
• Bar chart
• Histogram (frequency plot)
• Strip chart
• Box plot
• Scatter plot (xy plot)
Data
• Body composition data measured by X-ray absorptiometry
• 43 males and females aged 11 to 28
• Variable names:
– id
– age
– sex
– dur
– weight
– height
– lm (lean mass)
– pclm (percent lean mass)
– fm (fat mass)
– pcfm (percent fat mass)
– bmc (bone mineral content)
CuuDuongThanCong.com https://fb.com/tailieudientucntt
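Reading data into R
setwd("c:/works/stats")                     # folder that contains comp.txt
bc <- read.table("comp.txt", header=TRUE)   # first line holds the variable names
attach(bc)                                  # make the columns available by name
names(bc)                                   # list the variable names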
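The reading step can be tried without the file. The sketch below is only an illustration: it embeds the first three rows of the data listing as text, and the name bc_demo is hypothetical. It also shows two ways to reach a column without attach(), which some users prefer because attach() places the column names on the search path.

```r
# First three rows of the data listing, embedded as text so the sketch runs
# without comp.txt (the real file is read by file name instead).
txt <- "id age sex dur weight height lm pclm fm pcfm bmc
1 15 M 5 39 148 32.96 84.50 4.86 12.5 1.33
2 16 M 8 45 162 38.16 84.80 4.15 9.2 1.89
3 11 M 4 23 132 18.51 80.50 2.99 13.0 0.74"

bc_demo <- read.table(text = txt, header = TRUE)  # bc_demo is a hypothetical name

names(bc_demo)             # column names, as names(bc) does for the full data
str(bc_demo)               # type of each column
bc_demo$age                # one column, without attach()
with(bc_demo, mean(lm))    # evaluate an expression inside the data frame
```

Here lm inside with() refers to the lean mass column, not to the model-fitting function of the same name.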
Viewing the data
bc
id age sex dur weight height lm pclm fm pcfm bmc
1 1 15 M 5 39 148 32.96 84.50 4.86 12.5 1.33
2 2 16 M 8 45 162 38.16 84.80 4.15 9.2 1.89
3 3 11 M 4 23 132 18.51 80.50 2.99 13.0 0.74
4 4 19 M 9 46 159 35.92 78.10 6.73 14.6 1.59
5 5 19 M 6 56 166 46.63 83.00 5.61 10.2 2.56
6 6 22 M 12 50 152 42.13 84.00 3.93 8.1 2.12
7 7 16 M 8 53 170 45.23 85.00 5.15 9.8 2.21
8 8 12 M 5 35 151 25.26 72.20 9.02 25.6 0.95
9 9 21 M 8 46 166 39.44 85.70 4.64 10.1 2.00
10 10 15 M 6 45 165 38.47 85.50 3.92 8.9 1.70
11 11 13 M 5 32 142 25.50 79.70 4.26 13.9 0.99
12 12 20 M 6 40 153 32.70 82.00 4.66 12.0 1.38
...
40 40 12 M 10 39 155 33.00 84.60 3.50 9.2 1.43
41 41 15 M 6 45 154 36.00 80.00 5.33 12.5 1.52
42 42 22 M 7 46 157 38.50 84.00 4.63 10.3 1.86
43 43 25 M 13 45 162 37.35 83.00 4.34 10.0 1.70
[Figure: two bar charts titled "Sex distribution" showing the counts of F and M, one with vertical bars and one with horizontal bars; count axis from 0 to 30]
Frequencies by group: barplot
agegroup <- cut(age, 3)
agesex <- table(sex, agegroup)
barplot(agesex)
[Figure: bar plot of the sex by age group table, one bar per age group; count axis from 0 to 25]
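The three calls above (cut, table, barplot) can be followed step by step on a small sample. This is a sketch under stated assumptions: the ages are those of the 16 subjects visible in the data listing, but the sex labels are invented for illustration, and all names ending in _demo are hypothetical.

```r
# Ages of the 16 subjects visible in the data listing (rows 1-12 and 40-43).
age_demo <- c(15, 16, 11, 19, 19, 22, 16, 12, 21, 15, 13, 20, 12, 15, 22, 25)
# HYPOTHETICAL sex labels, invented only so that both groups appear;
# the real labels are in comp.txt.
sex_demo <- rep(c("M", "F"), times = 8)

agegroup_demo <- cut(age_demo, 3)              # 3 equal-width age intervals
agesex_demo <- table(sex_demo, agegroup_demo)  # rows: sex, columns: age group
agesex_demo

# Stacked bars (the default), one bar per age group, with a legend for sex.
barplot(agesex_demo, legend.text = TRUE, xlab = "Age group", ylab = "Count")
# Side-by-side bars instead of stacked ones.
barplot(agesex_demo, beside = TRUE, legend.text = TRUE)
```

Because the table has sex in its rows, each bar corresponds to an age group and is divided by sex; table(agegroup_demo, sex_demo) would give one bar per sex instead.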
hist(age, breaks=20)
hist(age, breaks=40)
hist(age, breaks=50)
[Figure: histograms of age drawn with different breaks settings; x-axis: age, y-axis: Frequency]
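What the breaks argument does can be checked without drawing anything. The sketch below assumes only base R and uses the ages of the 16 subjects visible in the data listing; age_demo, h and h5 are hypothetical names.

```r
age_demo <- c(15, 16, 11, 19, 19, 22, 16, 12, 21, 15, 13, 20, 12, 15, 22, 25)

# A single number for breaks is only a suggestion: hist() turns it into
# "pretty" boundaries, so the number of bins actually used can differ.
h <- hist(age_demo, breaks = 20, plot = FALSE)
h$breaks    # bin boundaries actually chosen
h$counts    # number of observations in each bin

# An explicit vector of boundaries is used exactly as given.
h5 <- hist(age_demo, breaks = seq(10, 25, by = 5), plot = FALSE)
h5$counts

# Side-by-side comparison of two settings.
op <- par(mfrow = c(1, 2))
hist(age_demo, breaks = 5, xlab = "age", main = "breaks = 5")
hist(age_demo, breaks = 20, xlab = "age", main = "breaks = 20")
par(op)
```

With few observations, a large number of breaks tends to produce many bins holding zero or one subject, which is why the shape of the histogram changes so much between settings.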
Data distribution: histogram
par(mfrow=c(2,2))
hist(age)
hist(weight)
hist(lm)
hist(fm)
[Figure: 2 x 2 panel of histograms of age, weight, lm and fm ("Histogram of age", "Histogram of weight", ...); y-axis: Frequency]
[Figure: histogram of lm (y-axis: Frequency) next to a kernel density plot (y-axis: Density; N = 43, Bandwidth = 2.607)]
[Figure: normal Q-Q plot; x-axis: Theoretical Quantiles]
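The calls behind the density plot and the Q-Q plot are not shown in the text. One plausible way to draw such plots in base R is sketched below; treating lm as the plotted variable is an assumption, the values are the lean masses of the 16 subjects visible in the data listing, and lm_demo is a hypothetical name.

```r
lm_demo <- c(32.96, 38.16, 18.51, 35.92, 46.63, 42.13, 45.23, 25.26,
             39.44, 38.47, 25.50, 32.70, 33.00, 36.00, 38.50, 37.35)

op <- par(mfrow = c(1, 3))
hist(lm_demo, xlab = "lm", main = "Histogram")
d <- density(lm_demo)            # kernel density estimate, default bandwidth
plot(d, main = "Density")        # the x-axis label reports N and the bandwidth
qqnorm(lm_demo)                  # sample quantiles against normal quantiles
qqline(lm_demo)                  # reference line through the quartiles
par(op)

d$bw                             # bandwidth chosen for this 16-value subset
```

Points that stay close to the reference line in the Q-Q plot suggest that a normal distribution is a reasonable description of the variable.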
Continuity of the data: stripchart
stripchart(lm, xlab="Lean mass; kg")
[Figure: strip chart of lm; x-axis: "Lean mass; kg", from 20 to 45]
[Figure: box plots of lm and fm]
LM
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  18.51   31.91   35.92   35.65   40.14   46.63
FM
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  2.990   4.250   5.270   6.500   8.795  12.800
[Figure: box plots of lm and fm by sex (F, M)]
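The summaries and box plots on these slides can be reproduced along the following lines. This is a sketch, not the lecture's own code: it uses the 16 subjects visible in the data listing, so its numbers will differ from the full-sample summaries above, and the _demo names are hypothetical.

```r
lm_demo <- c(32.96, 38.16, 18.51, 35.92, 46.63, 42.13, 45.23, 25.26,
             39.44, 38.47, 25.50, 32.70, 33.00, 36.00, 38.50, 37.35)
fm_demo <- c(4.86, 4.15, 2.99, 6.73, 5.61, 3.93, 5.15, 9.02,
             4.64, 3.92, 4.26, 4.66, 3.50, 5.33, 4.63, 4.34)

summary(lm_demo)   # Min., 1st Qu., Median, Mean, 3rd Qu., Max.
summary(fm_demo)

op <- par(mfrow = c(1, 3))
# Jitter spreads tied or near-tied values so they do not overprint.
stripchart(lm_demo, method = "jitter", xlab = "Lean mass; kg")
boxplot(fm_demo, main = "fm")
boxplot(lm_demo, main = "lm")
par(op)

# With the full data frame, a formula splits the box plot by group:
#   boxplot(lm ~ sex, data = bc)
```

The box plot draws the same quartiles that summary() prints, so the two outputs can be read against each other.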
Analysing association: scatter plot
[Figure: two scatter plots of lm against age]
[Figure: a further scatter plot against age]
[Figure: scatter plot with x-axis "Age" and y-axis "Kg", each point drawn as the letter M or F]
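The plotting calls for this slide did not survive in the text. A minimal sketch of how such scatter plots might be drawn in base R follows; it uses the 16 subjects visible in the data listing, all of whom are recorded as M, so no F appears here. The _demo names are hypothetical, and the smooth trend line is an addition for illustration.

```r
age_demo <- c(15, 16, 11, 19, 19, 22, 16, 12, 21, 15, 13, 20, 12, 15, 22, 25)
lm_demo  <- c(32.96, 38.16, 18.51, 35.92, 46.63, 42.13, 45.23, 25.26,
              39.44, 38.47, 25.50, 32.70, 33.00, 36.00, 38.50, 37.35)
sex_demo <- rep("M", 16)   # the 16 listed subjects are all recorded as M

# Plain scatter plot with a smooth trend line added.
plot(age_demo, lm_demo, xlab = "Age", ylab = "Kg")
lines(lowess(age_demo, lm_demo))

# Same axes, but each point drawn as the subject's sex letter.
plot(age_demo, lm_demo, type = "n", xlab = "Age", ylab = "Kg")
text(age_demo, lm_demo, labels = sex_demo)

r <- cor(age_demo, lm_demo)   # Pearson correlation between age and lean mass
r
```

Drawing with type = "n" sets up the axes without points, so that text() can place a letter at each coordinate.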
Analysing multiple associations
data <- data.frame(age, weight, lm, fm, bmc)
pairs(data)
[Figure: scatter plot matrix of age, weight, lm, fm and bmc]
pairs(data, lower.panel=panel.smooth,
      upper.panel=matrix.cor)
# matrix.cor is not a base R function; it has to be defined before this call
Result
[Figure: scatter plot matrix of age, weight, lm, fm and bmc; lower panels show scatter plots with smooth curves, upper panels show correlation coefficients with significance stars]
         weight    lm        fm       bmc
age      0.48 **   0.36 *    0.095    0.56 ***
weight             0.88 ***  0.11     0.85 ***
lm                           0.36 *   0.86 ***
fm                                    0.16
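Since the definition of matrix.cor is not shown, the sketch below writes a stand-in panel function. It is an assumption about what such a function might do, not the lecture's code: cor_label and panel_cor are hypothetical names, the star cut-offs (0.001, 0.01, 0.05) are a common convention chosen here, and the data are the 16 subjects visible in the data listing.

```r
demo <- data.frame(
  age    = c(15, 16, 11, 19, 19, 22, 16, 12, 21, 15, 13, 20, 12, 15, 22, 25),
  weight = c(39, 45, 23, 46, 56, 50, 53, 35, 46, 45, 32, 40, 39, 45, 46, 45),
  lm     = c(32.96, 38.16, 18.51, 35.92, 46.63, 42.13, 45.23, 25.26,
             39.44, 38.47, 25.50, 32.70, 33.00, 36.00, 38.50, 37.35),
  fm     = c(4.86, 4.15, 2.99, 6.73, 5.61, 3.93, 5.15, 9.02,
             4.64, 3.92, 4.26, 4.66, 3.50, 5.33, 4.63, 4.34),
  bmc    = c(1.33, 1.89, 0.74, 1.59, 2.56, 2.12, 2.21, 0.95,
             2.00, 1.70, 0.99, 1.38, 1.43, 1.52, 1.86, 1.70)
)

# Hypothetical helper: correlation as text, followed by significance stars.
# This is NOT the matrix.cor used in the lecture.
cor_label <- function(x, y, digits = 2) {
  ct <- cor.test(x, y)                        # Pearson r and its p-value
  stars <- c("***", "**", "*", "")[
    findInterval(ct$p.value, c(0.001, 0.01, 0.05)) + 1]
  paste(format(unname(ct$estimate), digits = digits), stars)
}

# Hypothetical panel function: writes the label in the middle of the panel.
panel_cor <- function(x, y, ...) {
  old <- par(usr = c(0, 1, 0, 1))   # unit coordinates inside this panel
  on.exit(par(old))                 # restore the panel's own coordinates
  text(0.5, 0.5, cor_label(x, y))
}

pairs(demo, lower.panel = panel.smooth, upper.panel = panel_cor)
```

pairs() calls the panel function once per pair of variables, passing the two columns as x and y, so any function with that signature can be used for upper.panel or lower.panel.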
Summary
• R is strong at graphical analysis