
*

E-mail: shariat@iust.ac.ir
-1388/09/21 :1389/03/12:



: ) (CART

.1

. 3

24000

%70

] .[1388

3.

50 60

1 ] .[1386

[Han et al., 2006].

169648 7241

. 90

153

1389

.2

[Mussone et al., 1999].

10

[Al-Ghamdi, 2002, Bedard et al., 2002, Valent et al., 2002,
Wood et al., 2002, Yau, 2004].

5 .

(Al-Ghamdi, 2002).

[Wood et al., 2002]

[Bedard et al., 2002, Valent et al., 2002]. 11

7 8

[Renski et al., 1999, Khattak et al., 2002, Kockelman et al., 2002,
Kweon et al., 2003, Zajac et al., 2003, Abdel-Aty et al., 2005].

9.

[Kweon et al., 2003].

[Renski et al., 1999].

[Zajac et al., 2003].



[Sohn et al., 2001]. 12

.3

[Abdelwahab et al., 2001], [Abdelwahab et al., 2002], [Delen et al., 2006].

30353

14 .


[Delen et al., 2006].

-1-3

13

[Stewart, 1996, Sohn et al., 2001, Chong et al., 2005,
Tesema et al., 2005, Chang et al., 2006].

12604

15

[Han et al., 2006].

) (

[Chong et al., 2005]. 4658

( .

.
16

[Berry et al., 2004].


120km/h 120km/h

.2



21 :

$$ p(j, m) = \frac{\pi(j)\, N_j(m)}{N_j}, \qquad p(j \mid m) = \frac{p(j, m)}{p(m)}, \qquad p(m) = \sum_{j=1}^{J} p(j, m) \qquad (1) $$

$$ Gini(m) = 1 - \sum_{j=1}^{J} p^2(j \mid m) \qquad (2) $$
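A minimal sketch of the node probabilities and Gini index in Equations (1) and (2), assuming that π(j) is the prior probability of class j, N_j(m) the number of class-j records in node m, and N_j the number of class-j records in the root node. The function names and all counts are hypothetical and are not taken from the study data.

```python
from typing import Dict, Tuple


def node_probabilities(
    priors: Dict[int, float],
    class_totals: Dict[int, int],
    node_counts: Dict[int, int],
) -> Tuple[float, Dict[int, float]]:
    """Return p(m) and the conditional probabilities p(j|m) for one node."""
    # p(j, m) = pi(j) * N_j(m) / N_j
    joint = {j: priors[j] * node_counts[j] / class_totals[j] for j in priors}
    # p(m) = sum over classes of p(j, m)
    p_m = sum(joint.values())
    # p(j | m) = p(j, m) / p(m)
    cond = {j: joint[j] / p_m for j in joint}
    return p_m, cond


def gini(cond: Dict[int, float]) -> float:
    """Gini(m) = 1 - sum_j p(j|m)^2."""
    return 1.0 - sum(p * p for p in cond.values())


# Hypothetical two-class node with equal priors
priors = {0: 0.5, 1: 0.5}
class_totals = {0: 900, 1: 100}   # N_j in the root node
node_counts = {0: 90, 1: 50}      # N_j(m) in node m

p_m, cond = node_probabilities(priors, class_totals, node_counts)
node_gini = gini(cond)
print(p_m, cond, node_gini)
```

With equal priors the rare class is weighted up: it holds 50 of the 140 records in this node but receives a conditional probability of 5/6.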

:
*

J ( j )

22 j

.1

) 2-5 ( Nj(m)

j Nj m

17 18

j p(j|m)

) N1

j m ) Gini(m

N2( 19 .

Gini(m)

) Gini(m

20 .

2 .


) 23

N1 (1

)(

24 -

25

- :

$$ \text{misclassification cost} = \sum_{t=1}^{T} p(t) \left[ 1 - \sum_{j=1}^{J} p^2(j \mid t) \right] \qquad (3) $$
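Equation (3) can be read as a weighted sum of node impurities over the T terminal nodes of the tree. The sketch below assumes that p(t) is the probability of reaching terminal node t and that p(j|t) are the class probabilities in that node. The function name and the two-leaf example are hypothetical.

```python
from typing import List, Sequence, Tuple


def tree_cost(leaves: Sequence[Tuple[float, List[float]]]) -> float:
    """Sum over terminal nodes of p(t) * (1 - sum_j p(j|t)^2)."""
    total = 0.0
    for p_t, class_probs in leaves:
        impurity = 1.0 - sum(p * p for p in class_probs)
        total += p_t * impurity
    return total


# Hypothetical tree with two terminal nodes: (p(t), [p(0|t), p(1|t)])
leaves = [
    (0.6, [0.9, 0.1]),  # fairly pure leaf
    (0.4, [0.5, 0.5]),  # maximally impure two-class leaf
]
cost = tree_cost(leaves)
print(cost)
```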

[Giudici, 2003].

) p (t t

[SPSS, Clementine, 2008]

T .

)
( .

. 26
27

28

- 29 .
) (
.

.3


. 3

2-3

)( .

) (

) xj (...

$$ VIM(x_j) = \sum_{m=1}^{M} \frac{n_m}{N}\, \Delta Gini(S(x_j, m)) \qquad (4) $$
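A minimal sketch of the variable importance measure in Equation (4), assuming that ΔGini(S(x_j, m)) is the reduction in Gini index credited to variable x_j at node m, n_m is the number of records in node m, and N is the number of records in the root node. The variable names and the numbers are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def variable_importance(
    splits: Iterable[Tuple[str, int, float]], n_total: int
) -> Dict[str, float]:
    """Accumulate (n_m / N) * delta_gini for every node that uses a variable.

    Each item in `splits` is (variable name, n_m, Gini reduction at that node).
    """
    vim: Dict[str, float] = {}
    for variable, n_m, delta_gini in splits:
        vim[variable] = vim.get(variable, 0.0) + (n_m / n_total) * delta_gini
    return vim


# Hypothetical tree with three splits on 1000 records
splits = [
    ("seat_belt", 1000, 0.10),   # root split
    ("overtaking", 600, 0.05),
    ("seat_belt", 400, 0.02),
]
vim = variable_importance(splits, n_total=1000)
print(vim)
```

Weighting by n_m / N means a split near the root, which affects many records, counts more than an equally strong split deep in the tree.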

) S ( x j , m xj m

)) Gini ( S ( x j , m m

nm xj M m
N


N .

169648

xj

7241

. nm
N

.
.

xj

. 13 1

1 .

) (

11 )
(

.4

1 ) (

[Stewart, 1996].

(1385-1387)

30 31

) (21579 km

%70

%30 .

114

k 32 .

) (

[Stewart, 1996].

) (... )

%30

(... )

(...
) (... ) (

.5

1-5

) (

) (

( .

) (

2 ( . %27/9


( )

(π(j) = 0.5). 3

[Dissanayake et al., 2002 and Delen et al., 2006].

)) ((j


) 2-1 -

. 4

12 (

. 1-2-

2-1-

π(0) = 156719/(156719+11138+1791) = 0.92

π(1) = (11138+1791)/(156719+11138+1791) = 0.08
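The two prior probabilities above follow directly from the class counts. The short check below reproduces the arithmetic; the dictionary keys are placeholder labels, since the class names are not legible in this part of the text.

```python
# Class counts as they appear in the prior-probability expressions
counts = {"class_0": 156719, "class_1": 11138, "class_2": 1791}

total = sum(counts.values())

# Binary grouping: the largest class against the other two combined
pi_0 = counts["class_0"] / total
pi_1 = (counts["class_1"] + counts["class_2"]) / total

print(total, round(pi_0, 2), round(pi_1, 2))
```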

2-1 -

)(

) (

[Steinberg et al., 2007].

2-5

. 3

%60

. 8

)) (j 1



.1

) (

.1 : .2 .3

) (

.1 :.2 .3

.1.2

) (

) (

.1 .2 .3

.1 .2 .3 .4
.5 .6 .7 .8
.9 .10 .11 .13
.14 .15 .16 .17
.18 .19 .20 .21 4
.22 .23 .24 .25 .26
.1 .2

.3 .4 .5
.6 .7 .8 .9
.10 .11

.1.2 .3 .4 .5 .6

.1.2 .3

.1.2 .3 .4 .5 .6 .7

.1.2 .3 .4 .5 .6 .7

.1 .2 .3 ) (.4 ) (.5 .6

.1.2 .3

) (

) (

.1.2 .3 .4 .5 .6 .7 .8 .9
.10.11 .12 .13 .14 .15

.2

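Class distribution, all 169648 records: 92.37% (156719), 6.56% (11138), 1.05% (1791)
Class distribution, 7241-record data set: 43.79% (3171), 49.10% (3556), 7.10% (514)

Class (total records)   Training (70%)   Correctly predicted   Test (30%)   Correctly predicted
156719                  109656           83463 (76.11%)        47063        35736 (75.93%)
11138                   7765             2324 (29.93%)         3373         941 (27.90%)
1791                    1297             665 (51.27%)          494          244 (49.39%)
Total (169648)          118718           86452 (72.82%)        50930        36921 (72.49%)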

1.1
1.2
2.1
2.2

-4

1.1
1.2
2.1
2.2

-4

.4 : :



.3
1-2

1-1

2-2

2-1

56/80%

55/00%

58/12%

71/10%

57/55%

73/36%

40/70%

47/40%

58/33%

88/05%

59/88%

67/33%

57/62%

63/17%

69/51%

70/00%

56/91%

55/32%

59/06%

70/86%

57/59%

72/59%

44/43%

50/35%

3-5

(Al-Ghamdi, 2002)

[Sohn et al., 2001 and Bedard et al., 2002, Valent et al., 2002,
Kweon et al., 2003, Delen et al., 2006].

) (
.

4 -2

1 2-1

-4

-4

1-2 2-1 %90

4-5

-4

33

1 -2

5 .



-4 .
1.2
1-
2

11.1-1

2.2
2-2

2.1
2-
1

0/65

0/94

0/96

0/4

0/07

0/01

0/01

0/13

0/07

0/01

0/01

0/09

0/07

0/01

0/01

0/08

0/07

0/01

0/01

0/04

0/07

0/01

0/01

0/04

0/01

0/01

0/04

0/01

0/01

0/04

0/04

0/04

0/04

-4 .
1.21
2

1.1
1-
1

2.2
2-2

2.12
-1

0/61

0/59

0/53

0/31

0/11

0/2

0/23

0/26

0/07

0/06

0/06

0/14

0/07

0/05

0/06

0/14

0/07

0/04

0/04

0/14

0/07

0/03

0/03

0/02

0/02

0/02

0/02

0/02

. 6

12 11

. 17 18

. 4

) (5111718 18

4 3 . 3

( .

2 5 6

. 5

( 14 5 3 1

23 19 18 16 15 25 %68


( )

3 (.

4 9

5 3

15 16

1 -2

.5 1.2



5. Functional form
6. Classification and Regression Tree (CART)
7. Independent variable
8. Target variable
9. Classification and Prediction
10. Logit models
11. Ordered probit
12. Artificial Neural Networks
13. Decision tree
14. Variable importance
15. Nominal
16. Leaf or terminal node
17. Splitter
18. Purity
19. Child node
20. Parent node
21. Gini index
22. Prior probability
23. Root node
24. Misclassification cost
25. Goodness of fit
26. Maximal tree
27. Overfitting
28. Pruning
29. Cost-complexity
30. Train
31. Test
32. K-fold cross validation
33. Rules

.6

.
" "if-then .

.



.


.


.
.

.

.9

- )(1386

"

( .

"

3.221-213 .

)" (1388 "

.7

- Abdel-Aty, M. and Keller, J. (2005) "Exploring the overall
and specific crash severity levels at signalized
intersections", Accident Analysis and Prevention,
37(3), pp. 417-425.

- Abdelwahab, H. T. and Abdel-Aty, M. A. (2001)
"Development of artificial neural network models to
predict driver injury severity in traffic accidents at
signalized intersections", Transportation Research
Record, 1746(1), pp. 6-13.

. 8
1. Two-lane Two-way rural roads
2. Injury severity
3. Data mining
4. Regression type generalized linear models

- Abdelwahab, H. T. and Abdel-Aty, M. A. (2002)
"Artificial neural networks and logit models for
traffic safety analysis of toll plazas", Transportation
Research Record, 1784(1), pp. 115-125.

- Kweon, Y. and Kockelman, K. (2003) "Driver
attitudes and choices: Seatbelt use, speed limits,
alcohol consumption and crash histories", 82nd
Annual Meeting of Transportation Research Board,
Washington D. C.

- Al-Ghamdi, A. (2002) "Using logistic regression to
estimate the influence of accident factors on accident
severity", Accident Analysis and Prevention, 34(6),
pp. 729-741.

- Mussone, L. and Ferrari, A. (1999) "An analysis of
urban collisions using an artificial intelligence
model", Accident Analysis and Prevention, 31(6),
pp. 705-718.

- Bedard, M. and Guyatt, G. (2002) "The
independent contribution of driver, crash and vehicle
characteristics to driver fatalities", Accident Analysis
and Prevention, 34(6), p. 717.

- Renski, H. (1999) "Effect of speed limit increases
on crash injury severity: Analysis of single-vehicle
crashes on North Carolina interstate highways",
Transportation Research Record, 1665(1), pp. 100-108.

- Berry, M. J. A. and Linoff, G. S. (2004) "Data
mining techniques: for marketing, sales, and
customer relationship management", Wiley Computer
Publishing.

- Sohn, S. Y. and Shin, H. (2001) "Pattern
recognition for road traffic accident severity in
Korea", Ergonomics, 44(1), pp. 107-117.

- Chang, L. Y. and Wang, H. W. (2006) "Analysis of
traffic injury severity: An application of non-parametric
classification tree techniques", Accident
Analysis and Prevention, 38(5), pp. 1019-1027.

- SPSS Clementine Software (2008)

- Steinberg, D. and Golovnya, M. (2007) "CART 6.0,
User's Guide".

- Chong, M. and Abraham, A. (2005) "Traffic
accident analysis using machine learning paradigms",
Informatica, 29, pp. 89-98.

- Stewart, J. (1996) "Applications of classification
and regression tree methods in roadway safety
studies", Transportation Research Record: Journal of
Transportation Research Board, 1542(1), pp. 1-5.

- Delen, D. and Sharda, R. (2006) "Identifying
significant predictors of injury severity in traffic
accidents using a series of artificial neural networks",
Accident Analysis and Prevention, 38(3), pp. 434-444.

- Tesema, T. B. and Abraham, A. (2005) "Rule
mining and classification of road traffic accidents
using adaptive regression trees", International
Journal of Simulation Systems, 6, pp. 80-94.

- Dissanayake, S. and Lu, J. J. (2002) "Factors
influential in making an injury severity difference to
older drivers involved in fixed object-passenger car
crashes", Accident Analysis and Prevention, 34(5),
pp. 609-618.

- Valent, F. and Schiava, F. (2002) "Risk factors for
fatal road traffic accidents in Udine, Italy", Accident
Analysis and Prevention, 34(1), pp. 71-84.

- Wood, D. and Simms, C. (2002) "Car size and
injury risk: a model for injury risk in frontal
collisions", Accident Analysis and Prevention, 34(1),
pp. 93-99.

- Giudici, P. (2003) "Applied data mining: Statistical
methods for business and industry", Wiley.

- Yau, K. (2004) "Risk factors affecting the severity
of single vehicle traffic accidents in Hong Kong",
Accident Analysis and Prevention, 36(3), pp. 333-340.

- Khattak, A. et al. (2002) "Risk factors in large
truck rollovers and injury severity: Analysis of
single-vehicle collisions"

- Han, J. and Kamber, M. (2006) "Data mining:
concepts and techniques", Morgan Kaufmann.

- Kockelman, K. and Kweon, Y. (2002) "Driver
injury severity: An application of ordered probit
models", Accident Analysis and Prevention, 34(3),
pp. 313-322.

- Zajac, S. and Ivan, J. (2003) "Factors influencing
injury severity of motor vehicle-crossing pedestrian
crashes in rural Connecticut", Accident Analysis and
Prevention, 35(3), pp. 369-379.


Analysis of Two-Lane, Two-Way Rural Roads Traffic Injury
Severity Based on Data Mining Models
A. Shariat Mohaymany, Assistant Professor, Department of Civil Engineering, Iran University
of Science and Technology, Tehran, Iran.
A. Tavakoli Kashani, Ph.D. Candidate, Department of Civil Engineering, Iran University
of Science and Technology, Tehran, Iran.
E-mail: shariat@iust.ac.ir

ABSTRACT
In this study, factors influencing driver and occupant injury severity on two-lane, two-way roads
in Iran are identified. Statistical models are among the most common methods used to analyze
crash severity. In addition to examining such models and describing their weaknesses, this study
introduces the CART method. Classification and regression trees (CART), one of the most common
data mining methods, were used to analyze traffic crash data over a three-year period
(2006-2008). One of the most useful outputs of such models is a decision tree with explicit
rules. In the analysis, the three-class prediction problem was decomposed into a set of binary
prediction models, and eight models were then analyzed. This yielded higher overall prediction
accuracy and a significant increase in the prediction accuracy of the fatality class, which was
nearly 0% in most previous studies. Results indicated that improper overtaking and not wearing
seat belts are the most important factors affecting injury severity.
Keywords: Data mining, Classification and Regression Trees (CART), injury severity, rural
roads

Transportation Research Journal (TRJ)

200

Vol.7, No.2, Summer 2010
