Variables used for analysis:

We first explored the data by examining the total number of observations (13,587), the number of
variables (34), the number of missing values for each variable, and the type of each variable. The
data consist mainly of categorical and text variables, which makes association rule mining a
natural choice for extracting useful insights. The data also contain a few discrete numeric
variables.
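The exploration step described above can be sketched as follows. This is only an illustration: the four-row sample and its values are made up, the `summarize` helper is hypothetical, and only the column names come from the variables discussed in this report. It uses the standard library alone so that it runs without extra packages.

```python
import csv
import io
from collections import Counter

# Hypothetical four-row sample; the real data has 13,587 rows and 34 columns.
SAMPLE_CSV = """provstate,attacktype1_txt,targtype1_txt,nkill
Baghdad,Bombing/Explosion,Private Citizens & Property,5
Punjab,Armed Assault,Police,
Baghdad,Bombing/Explosion,Military,0
,Assassination,Government (General),1
"""


def summarize(rows):
    """Return row count, column count, missing-value counts, and inferred types."""
    columns = list(rows[0].keys())
    missing = Counter()
    kinds = {}
    for col in columns:
        values = [r[col] for r in rows]
        present = [v for v in values if v != ""]
        missing[col] = len(values) - len(present)
        # Treat a column as numeric if every non-missing value parses as an integer.
        is_numeric = bool(present) and all(v.lstrip("-").isdigit() for v in present)
        kinds[col] = "numeric" if is_numeric else "categorical/text"
    return len(rows), len(columns), dict(missing), kinds


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
n_rows, n_cols, missing, kinds = summarize(rows)
print("observations:", n_rows, "variables:", n_cols)
print("missing values per variable:", missing)
print("inferred types:", kinds)
```

On the full data set the same summary would be expected to report the 13,587 observations and 34 variables mentioned above.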

Variable | Reason
Province / Administrative Region / State (provstate) | Kept so that the data can be analysed state-wise.
Attack Information (attacktype1_txt) | Kept because this categorical variable is one of the main factors on which to classify the data. It has no missing values, which makes classifying on it easier.
Target / Victim Type (targtype1_txt) | Kept because it describes the type of victim, which is one of the main factors in finding associations.
Total Number of Fatalities (nkill) | A numerical variable that stores the total number of fatalities. It indicates the gravity of the incident, so it is important for drawing conclusions about associations.
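To illustrate how the four selected variables could feed into association rules, here is a minimal hand-rolled sketch. It is not the method used in this analysis. The records are invented, and the fatality cut-offs in `fatality_band` and the support and confidence thresholds are assumptions chosen for the example. It only considers rules between pairs of single items, whereas a full algorithm such as Apriori would also explore larger itemsets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records using the four selected columns; all values are made up.
records = [
    {"provstate": "Baghdad", "attacktype1_txt": "Bombing/Explosion",
     "targtype1_txt": "Private Citizens & Property", "nkill": 5},
    {"provstate": "Baghdad", "attacktype1_txt": "Bombing/Explosion",
     "targtype1_txt": "Military", "nkill": 12},
    {"provstate": "Punjab", "attacktype1_txt": "Armed Assault",
     "targtype1_txt": "Police", "nkill": 0},
    {"provstate": "Baghdad", "attacktype1_txt": "Bombing/Explosion",
     "targtype1_txt": "Private Citizens & Property", "nkill": 3},
    {"provstate": "Punjab", "attacktype1_txt": "Armed Assault",
     "targtype1_txt": "Police", "nkill": 1},
]


def fatality_band(nkill):
    """Bin the fatality count into a categorical item (cut-offs are assumed)."""
    if nkill == 0:
        return "none"
    return "low" if nkill <= 5 else "high"


def to_items(record):
    """Encode one record as a set of 'column=value' items."""
    items = {f"{k}={v}" for k, v in record.items() if k != "nkill"}
    items.add(f"nkill={fatality_band(record['nkill'])}")
    return frozenset(items)


def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence, lift) tuples."""
    n = len(transactions)
    item_counts = Counter(i for t in transactions for i in t)
    pair_counts = Counter(p for t in transactions for p in combinations(sorted(t), 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for ante, cons in ((a, b), (b, a)):
            confidence = count / item_counts[ante]
            if confidence >= min_confidence:
                lift = confidence / (item_counts[cons] / n)
                rules.append((ante, cons, support, confidence, lift))
    return rules


transactions = [to_items(r) for r in records]
rules = pair_rules(transactions)
for ante, cons, support, confidence, lift in sorted(rules, key=lambda r: -r[4]):
    print(f"{ante} -> {cons}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
```

Binning nkill first matters because association rules work on categorical items. Left as a raw count, almost every distinct value would become its own rare item.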
