
DATA GATHERING

2.0 DATA GATHERING AND COLLECTION

Evaluation, design, planning and operation of transportation systems require rich data, and
systematic and efficient data collection plans. All persons involved in transport and land-use
planning will at some stage be involved with data collection. The need for data and techniques
for data analysis must be adapted to the problem and site in question. Different types of data
require different reduction techniques as well as methods for accurate statistical data analysis.
One of the greatest challenges of any statistical survey is producing high quality, useful data with
limited budget and resources.

Data in transportation engineering paints a picture of the existing transportation situation
and is used in the planning and design of transport systems. Data can also be used to
validate or calibrate transport planning models. The models, in turn, help transport
planners to predict the future state of the system accurately so as to formulate appropriate policies.

SAMPLING PROCEDURES
A sample is a collection of units which forms part of a larger population and
which is specially selected to represent the whole population. Four considerations are of particular
importance:
• What are the units which comprise the sample?
• What is the population which the sample seeks to represent?
• How large should the sample be?
• How is the sample to be selected?
The objective of sampling is to obtain a small sample from an entire population such that the
sample is representative of the entire population. The need for sampling is based on the
realisation that in transport studies we are often dealing with very large populations. To attempt
to survey all members of these populations would be impossible.

Target Population

The target population is the complete group about which one would like to collect information.
The elements of this group may be people, households, vehicles, geographical areas or any other
discrete units. The definition of the target population will, in many cases, follow directly from
the objectives of the survey. For example, consider a survey which has the objectives of
determining the travel patterns of elderly, low-income residents living in a particular area of
Nyeri County. In order to define the target population for this study, it is necessary to define
‘elderly’, ‘low income’ and the precise geographical extent of the study. One way to define ‘the
elderly’ is by the retirement age (60 years). ‘Low income’ may be defined as those spending less
than one dollar per day. The geographical extent may be defined as the entire county. The target
population is thus defined as ‘Retirement-age residents of Nyeri County spending less than one
dollar per day’.

Sampling units
The survey population is composed of individual elements. Thus continuing with the example of
the Nyeri County survey, the elements of the population are the individual elderly residents.
However, the selection of a sample from this population is based on the selection of sampling
units from the population. Sampling units may or may not be the same as elements of the
population; in many cases, they are aggregations of elements. Thus in the Nyeri County survey
example given above, it may be decided to select elderly households rather than individual
elderly residents. Thus, the sampling unit for the survey may be defined as an ‘Elderly
household’. The adopted definition of an elderly household is one in which at least one elderly
individual (as previously defined) resides on a permanent basis. Generally, sampling units may
include such entities as individuals, households, companies, geographic regions
(zones, cities, states and nations), vehicles, intersections or road links, and other features of the
transport network.

Sampling frame

Having identified the desired survey population and selected a sampling unit, it is necessary to
obtain a sampling frame from which to draw the sample. A sampling frame is a base list or
reference which properly identifies every sampling unit in the survey population. Clearly, the
sampling frame should contain all, or nearly all, the sampling units in the survey population.
Depending on the population and sampling units being used, some examples of sampling frames
which could be used for various transport surveys include:
• Electoral rolls
• Block lists (lists of all dwellings on residential blocks)
• Lists by utility companies (e.g. electricity service connections)
• Mailing lists
• Local area maps
• Census lists (if available)
• Society membership lists
• Motor vehicle registrations
Each of these sampling frames suffers from certain deficiencies including inaccuracy,
incompleteness, duplication, inadequacy or being out of date. A common reason for such
deficiencies is that the list that the researcher plans to use as a sampling frame has been compiled
for a completely different reason. Continuing the example of elderly people in Nyeri County, a
sampling frame may be obtained from the pensions list at the Pensions Department, electoral
roll, social and religious organization membership, social services department, etc.
If no adequate sampling frames can be found, then it may be necessary to conduct a preliminary
survey with a view to establishing a suitable sampling frame. Alternatively, the survey can be
designed using a larger than required sampling frame and using filter questions at the beginning
of each questionnaire (or interview) to eliminate irrelevant sampling units from the survey.

Sampling methods

The accuracy of parameter estimates made from a sample is totally dependent on the sampling
being performed in an acceptable fashion. Almost always, the only acceptable sampling methods
are based on some form of random sampling. Random sampling entails the selection of units
from a population by chance methods such as flipping coins, rolling a die, using tables of random
numbers or through the use of pseudo-random numbers generated by recursive mathematical
equations.
The essence of pure random sampling is that sampling of each unit is performed independently
and that each unit in the population has an equal probability of being selected in the sample (at
the start of sampling). There are many types of sampling methods, each of which is based on the
random sampling principle. The most frequently encountered methods are:
1. Simple random sampling
2. Stratified random sampling
3. Variable fraction stratified random sampling
4. Multi-stage sampling
5. Cluster sampling
6. Systematic sampling
In addition, there are a number of other sampling methods which, while used in transport
surveys, are not based on random sampling and are therefore not highly recommended. These
methods include quota sampling and expert sampling. This section will describe each of the
above sampling methods, indicating their strengths and weaknesses.
Simple Random Sampling
Simple random sampling is the simplest of all random sampling methods and is the basis of all
other random sampling techniques. In this method, each unit in the population is assigned an
identification number and then these numbers are sampled at random to obtain the sample.
For example, consider a population of 100 sampling units. The task is to select a random sample
of 10 sampling units from this population. The first step in simple random sampling is to name
each of the sampling units. This is often done by assigning a unique identification number to
each of the sampling units, even if they already have unique identifying names. The elements in
the population of 100 units may be numbered from 1 to 100. Using random number selection
methods a set of ten random numbers is now selected (e.g. 2, 14, 17, 32, 33, 41, 62, 66, 72, 99),
and the sampling units corresponding to these numbers are included in the sample. The question
arises in this method as to whether sampling should be performed with replacement, or without
replacement. The usual practice is to sample without replacement such that each unit can be
included in the sample only once; this is particularly so when dealing with individuals or
households which are being sampled for an interview survey. It is possible, however, to sample
with replacement and simply include the results from those sampling units selected more than
once as many times as they are selected. That is, if a household is selected twice, then you do not
interview the household twice but merely include the results from this household twice in the
data set.
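The selection described above can be sketched in Python. This is only a minimal illustration: the function name `simple_random_sample` and its `seed` argument are assumptions introduced here, and Python's pseudo-random generator stands in for the tables of random numbers mentioned earlier.

```python
import random


def simple_random_sample(population_size, sample_size, replace=False, seed=None):
    """Return sampled unit numbers from a population numbered 1..population_size."""
    rng = random.Random(seed)  # optional seed, used only to make a draw repeatable
    unit_ids = range(1, population_size + 1)
    if replace:
        # With replacement: a unit can be drawn more than once and is then
        # counted in the data set as many times as it was drawn.
        return sorted(rng.choices(unit_ids, k=sample_size))
    # Without replacement (the usual practice): each unit appears at most once.
    return sorted(rng.sample(unit_ids, sample_size))


# Ten units out of a population of 100, sampled without replacement.
print(simple_random_sample(100, 10, seed=1))
```

Sampling without replacement is the default here because it matches the usual practice for interview surveys described above.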
Stratified Random Sampling
The sample of ten sampling units identified above may well be a good representation of the 100
sampling units in the population. However, if we have some prior information about the
population, it may be clear that this is not the case. For example, assume that the sampling frame
developed in the previous section (list of 1-100) comes from a list of employees in a company
and that, for some other reason, the employees are listed by gender such that the first 40
employees are female and the second 60 employees are male. We might have inadvertently over-
sampled females (selecting 5 out of 40) and under-sampled males (selecting 5 out of 60). As a
result, any inferences drawn from this sample will be biased towards the behaviour or attitudes of
females because they are over-represented in the sample compared to their representation in the
population. To overcome this problem, stratified random sampling makes use of prior
information to subdivide the population into strata of sampling units such that the units within
each stratum are as homogeneous as possible with respect to the stratifying variable. Each
stratum is then sampled at random using the same sampling fraction for each stratum. When the
same sampling fraction is used in each stratum, this method is sometimes called proportionate
stratified sampling. The resulting sample will then have the correct proportion of each stratum
within the whole population, and one source of error will have been eliminated.
It is important that the prior information should relate to the variables which are to be measured
in the survey. For example, if one were attempting to measure trip generation rates in a survey,
then stratification on the basis of car ownership would be more useful than stratification on the
basis of the day of the week on which the respondent was born (assuming both data sets were
available prior to sampling).
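As a rough sketch of proportionate stratified sampling, the employee example (units 1-40 female, units 41-100 male) can be sampled at a common fraction of 10%. The function name, the dictionary layout and the `seed` argument are illustrative assumptions, not part of any standard procedure.

```python
import random


def proportionate_stratified_sample(strata, fraction, seed=None):
    """Sample every stratum at the same sampling fraction.

    strata: dict mapping a stratum name to the list of unit numbers in it.
    """
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n = round(len(units) * fraction)  # stratum sample size
        sample[name] = sorted(rng.sample(units, n))
    return sample


strata = {
    "female": list(range(1, 41)),    # units 1-40
    "male": list(range(41, 101)),    # units 41-100
}
sample = proportionate_stratified_sample(strata, 0.10, seed=1)
for name, units in sample.items():
    print(name, len(units), units)
```

With the same fraction in both strata, the sample contains 4 females and 6 males, which mirrors the 40:60 split in the population.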
Variable Fraction Stratified Random Sampling
The above discussion of stratified random sampling has implicitly assumed that within each
stratum, the same sampling fraction will be used. Whilst this may often be the case, an added
advantage of stratified sampling is that it allows different sampling fractions to be used in each
stratum. Such variable fraction sampling may be desirable in three distinct situations:
• First, as will be shown later, the accuracy of results obtained from a sample depends on
the absolute size of the sample, not on the fraction of the population included in the
sample. In some populations, stratification on the basis of a particular parameter may
result in a highly variable number of sampling units in each stratum. If a constant
sampling fraction were used for each stratum, one would obtain highly variable sample
sizes in each of the strata, and hence highly variable degrees of accuracy in each stratum,
all other things being equal. To obtain equal accuracy in each of the strata, it would be
necessary to use different sampling fractions in each stratum so that approximately equal
sample sizes were obtained for each stratum.
• The second factor which affects the accuracy of a parameter estimate obtained from a
sample is the variability of that parameter within the population; higher variability
parameters require higher sample sizes for a specified degree of accuracy. If the sampling
units within different strata exhibit different degrees of variability with respect to a
parameter of interest, then it would be necessary to use high sampling fractions for strata
with high variability if equal degrees of accuracy were to be obtained for each stratum.
• The third reason for choosing variable fraction sampling is more pragmatic than
theoretical. It may be that the costs of sampling and/or collecting the data may vary
across the different strata. In such a case, a trade-off is necessary between the anticipated
costs of reduced accuracy and the known costs of obtaining the data. It may therefore be
desirable to reduce the sampling fraction in those strata where data is more expensive to
obtain.
Two drawbacks with variable sampling fraction methods should be noted:
• First, the method may require far greater prior information about the population,
including the size of each stratum, the variability of the specified parameter within each
stratum, and the cost of data collection within each stratum.
• Second, because sampling units in each stratum no longer have the same chance of
selection (because some strata are being deliberately oversampled), one of the two basic
conditions for random sampling has now been violated. Therefore, the raw data obtained
is no longer a random sample representation of the entire population. It will be necessary
to assign weightings to each of the strata samples during analysis to generate population
estimates which are truly representative of the population.
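One common way of applying such weightings, assumed here purely for illustration, is to weight each stratum's sample mean by the size of that stratum in the population. The strata sizes and trip rates below are hypothetical, and the function name is introduced only for this sketch.

```python
def weighted_population_mean(strata):
    """Estimate a population mean from a variable fraction stratified sample.

    strata: list of (stratum_population_size, sample_values) pairs.
    """
    total_units = sum(size for size, _ in strata)
    weighted_sum = 0.0
    for size, values in strata:
        stratum_mean = sum(values) / len(values)
        weighted_sum += size * stratum_mean  # weight by stratum size
    return weighted_sum / total_units


# Hypothetical data: stratum A has 900 units, stratum B only 100, yet both
# contribute four sampled trip rates (so stratum B is deliberately oversampled).
strata = [
    (900, [2, 2, 2, 2]),
    (100, [6, 6, 6, 6]),
]
raw_values = [v for _, values in strata for v in values]
print(sum(raw_values) / len(raw_values))   # unweighted sample mean: 4.0
print(weighted_population_mean(strata))    # weighted estimate: 2.4
```

The unweighted mean overstates the influence of the oversampled stratum; the weighted estimate restores each stratum to its share of the population.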
Multi-Stage Sampling
In simple random sampling, the first stage in the process is to enumerate (give names or numbers
to) the entire population. While this may be feasible for small populations, it is clearly more
difficult with larger populations. For example, identifying every individual in a large city or a
nation is clearly a non-trivial task. In such circumstances, another variation of random sampling
is called for. Multistage sampling is a random sampling technique which is based on the process
of selecting a sample in two or more successive, contingent stages. Consider, for example, a
multi-stage survey of travel patterns for an entire nation.
a) First-stage: divide nation into counties and sample from total population of counties.
b) Second-stage: divide selected counties into sub-counties and sample from these sub-
counties within each selected county.
c) Third-stage: divide selected sub-counties into Census Collectors' Districts and sample
Census Collectors' Districts.
d) Fourth-stage: divide selected Census Collectors' Districts into households and sample
households.
e) Fifth-stage: divide selected households into individuals and sample individuals.
At the end of this process we have a random sample of individuals from the nation (i.e. every
individual had an equal chance of being selected at the start of the process) provided that
appropriate sampling procedures are used at each of the stages.
It should also be noted that at each stage in the multi-stage sampling process, different sampling
methods can be applied. Thus stratified sampling and variable fraction sampling can be applied
to meet certain objectives.
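A two-stage version of this process can be sketched as follows. The county and household names are hypothetical, and simple random sampling is assumed at both stages; as noted above, other methods could be substituted at either stage.

```python
import random


def two_stage_sample(frame, n_primary, n_secondary, seed=None):
    """Sample primary units first, then secondary units within each selected one.

    frame: dict mapping a primary unit (e.g. a county) to its list of
    secondary units (e.g. households).
    """
    rng = random.Random(seed)
    # Stage 1: only the list of primary units is needed.
    chosen_primary = rng.sample(sorted(frame), n_primary)
    # Stage 2: secondary units are listed only for the selected primary units.
    return {
        p: rng.sample(frame[p], min(n_secondary, len(frame[p])))
        for p in chosen_primary
    }


# Hypothetical frame: 5 counties with 20 households each.
frame = {
    f"county_{c}": [f"county_{c}_household_{h}" for h in range(1, 21)]
    for c in range(1, 6)
}
print(two_stage_sample(frame, n_primary=2, n_secondary=3, seed=1))
```

The practical benefit is that households need to be enumerated only in the counties selected at the first stage.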
Multi-stage sampling can also be used in the design of on-board transit surveys. In such surveys,
the sampling unit is the transit passenger, but it would be impractical to have a sampling frame
based on the names of all transit passengers. Rather, the sample of transit passengers can be
drawn in a four-stage process, where each stage takes account of a different dimension in the
transit passenger population:
Stage 1: Geographically-stratified sampling of routes
Stage 2: Sampling of vehicles from the selected routes
Stage 3: Time-stratified sampling of runs on selected vehicles
Stage 4: Surveying of all passengers on selected runs
Cluster Sampling
Cluster sampling is a variation of multi-stage sampling. In this method, the total population is
first divided into clusters of sampling units, usually on a geographic basis. These clusters are
then sampled randomly and the units within the cluster are either selected in total or else sampled
at a very high rate. Like multi-stage sampling, cluster sampling can be much more economical
than simple random sampling both in drawing the sample and in conducting the survey. The art
of cluster definition is to find economical clusters which maintain heterogeneity in the
parameters to be estimated.
Systematic Sampling
When random sampling is being performed in conjunction with a sampling frame list, it is
frequently more convenient to use a technique called systematic sampling rather than rely on the
use of random numbers to draw a sample. Systematic sampling is a method of selecting units
from a list through the application of a selection interval, i, such that every i-th unit on the list,
following a random start, is included in the sample. The selection interval is simply derived as
the inverse of the desired sampling fraction. For example, to draw a 10% sample from the population of 100 units
discussed in earlier sections, a number between 1 and 10 is selected at random, say 4.
Unit 4 and every 10th unit after it (i.e. 14, 24, 34, ... 94) are then included in the sample.
The major task in systematic sampling lies in the preparation of an appropriate sampling frame
list.
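The procedure can be sketched in a few lines of Python. The function name and the `seed` argument are assumptions made for this illustration, and the sampling frame is taken to be a list numbered from 1.

```python
import random


def systematic_sample(population_size, sampling_fraction, seed=None):
    """Select every i-th unit after a random start, where i = 1 / sampling_fraction."""
    rng = random.Random(seed)
    interval = round(1 / sampling_fraction)   # selection interval i
    start = rng.randint(1, interval)          # random start between 1 and i
    return list(range(start, population_size + 1, interval))


# A 10% sample from a list of 100 units gives an interval of 10.
print(systematic_sample(100, 0.10, seed=1))
```

Only one random number is needed, which is why the method is convenient when working from a printed list.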
Non-Random Sampling Methods
Quota sampling, as the name suggests, is based on the interviewer obtaining responses from a
specified number of respondents. The quota may be stratified into various groups, within each of
which a quota of responses must be obtained. This method, for example, is often used when
interviewing passengers disembarking from aircraft or other transport modes and for many types
of street interviews where passers-by are stopped and asked questions.
The major problem with quota sampling is not that quotas are used for each sub-group (after all,
this is the basis of stratified sampling), but that the interviewer is doing the sampling in the field
and this sampling procedure may be far from random, unless strictly controlled. Left to
themselves, interviewers will generally pick respondents from whom they feel they will most
readily obtain a response. Thus passers-by who appear more willing to cooperate, are not in a
hurry, and are of a social class comparable to the interviewer will more likely be interviewed. In
a household survey, households which are closer to the interviewer's residence (and hence
require less travel to reach), households whose members are more often at home, and households
without barking dogs are more likely to be interviewed. Such preferential selection can often
cause gross biases in the parameters to be estimated in the survey.
Expert sampling, on the other hand, takes the task of sampling away from the interviewer and
places it in the hands of an "expert" in the field of study being addressed by the survey. The
validity of the sample chosen then relies squarely on the judgment of the expert. While such
expert sampling may well be appropriate in the development of hypotheses and in exploratory
studies, it does not provide a basis for the reliable estimation of parameter values since it has
been repeatedly shown that people, no matter how expert they are in a particular field of study,
are not particularly skilled at deliberately selecting random samples. A more appropriate role for
the expert in sample surveys is in the definition of the survey population and strata within this
population, leaving the task of selecting sampling units from these strata to the aforementioned
random sampling methods.

Sampling error and sampling bias

Despite all our best intentions in sample design, the parameter estimates made from sample
survey data will always be just that: estimates. There are two distinct types of errors which occur
in survey sampling and which, when combined, contribute to measurement error in sampled
data.
The first of these errors is termed sampling error, and is the error which arises simply because
we are dealing with a sample and not with the total population. Sampling error is primarily a
function of the sample size and the inherent variability of the parameter under investigation.
However, sampling error should not affect the expected values of parameter averages; it merely
affects the variability around these averages and determines the confidence which one can place
in the average values.
The second type of error in data measurement is termed sampling bias. It is a completely
different concept from sampling error and arises from mistakes made in choosing the sampling
frame, the sampling technique, or in many other aspects of the sample survey.
Sampling bias is different from sampling error in two major respects. First, whilst sampling error
only affects the variability around the estimated parameter average, sampling bias affects the
value of the average itself and hence is a more severe distortion of the sample survey results.
Second, while sampling error can never be eliminated and can only be minimized by increasing
the sample size, sampling bias can be virtually eliminated by careful attention to various aspects
of sample survey design. Small sampling error results in precise estimates, while small sampling
bias results in accurate estimates.
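The distinction can be illustrated with a small simulation. Everything in it is an assumption made for the sake of the sketch: a synthetic population of 10,000 daily trip rates drawn from a normal distribution (mean 10, standard deviation 3), and a deliberately faulty sampling frame that omits every unit with a rate of 8 or less.

```python
import random
import statistics


def sample_means(population, sample_size, repeats, rng):
    """Mean of each of `repeats` simple random samples of the given size."""
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]


rng = random.Random(42)
population = [rng.gauss(10.0, 3.0) for _ in range(10_000)]
true_mean = statistics.fmean(population)

# Sampling error: the spread of the estimates shrinks as the sample grows,
# but the estimates stay centred on the true mean.
small = sample_means(population, 25, 200, rng)
large = sample_means(population, 400, 200, rng)
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)

# Sampling bias: a faulty frame shifts the average itself, and a larger
# sample does nothing to correct it.
biased_frame = [x for x in population if x > 8.0]
biased = sample_means(biased_frame, 400, 200, rng)
bias = statistics.fmean(biased) - true_mean

print(f"spread of estimates, n=25:  {spread_small:.3f}")
print(f"spread of estimates, n=400: {spread_large:.3f}")
print(f"shift caused by the biased frame: {bias:.3f}")
```

Increasing the sample size from 25 to 400 narrows the spread of the estimates, whereas the shift produced by the faulty frame remains even at the larger sample size.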

Data requirements and survey


Practical considerations in data collection

In transport surveys and data collection, practical limitations have a strong influence in determining
the most appropriate type of survey for a given situation. The data gathering techniques and the methods
of data analysis will in turn depend on the type of survey chosen. The design of a
survey requires considerable skill and experience. Data collection entails, among other things, recruiting and
training staff, questionnaire design, supervision and quality control. The following are the practical
constraints in transportation studies.

(i) The length of the study: This indirectly determines how much time and effort can be devoted
to the data collection stage. It is important to achieve a balance in the study and to avoid spending
the largest part of the study budget on data collection, analysis and validation.

(ii) Limits of the study area: It is important to ignore formal political boundaries and concentrate on the
whole area of interest. It is important to distinguish between area of interest and the study area as defined
in the project brief; the former is normally larger as we would expect the latter to develop in a period of,
say 20 years. The definition of the area of interest depends on the type of policies examined and decisions
to be made.

(iii) Study resources: It is important to know:

• How many personnel, and of what level, will be available for the study.
• What type of computing facilities will be available and what restrictions on their use will
exist.

Note that the time and study resources available should be commensurate with the importance of the
decisions to be made from the results. The greater the cost of a wrong decision, the more resources
should be devoted to getting it right. Other possible restrictions to be taken into account include:

• Physical, e.g. the size and topography of the locality
• Social and environmental, e.g. reluctance of the population to answer certain types of question
• The often likely reluctance of travelers to answer (yet another) questionnaire

Responding to questions may be time consuming and may be seen by some as a violation of
privacy. The problem may result in refusal to answer questions or giving misleading answers.

There is a need to obtain permission from the authorities before embarking on any traffic survey that involves
disrupting travelers.

There are basically two main types of transportation study, based on the study time horizon, namely:

• Short-range transportation study.
• Strategic (or long-range) transport study with an analysis horizon of, say, 20 years. In this
case data may be needed not only about trips but also about land use, employment and other
activities in general. It is with reference to this type of study that we now discuss data collection
methods and surveys.

Typical information or data needs


The main types of data that must be collected for a transportation study are as follows:

1. Infrastructure and existing services inventory, i.e. data on the transportation system or
networks over which traffic is accommodated (e.g. public and private transport networks, traffic
signals). These may be important, for example, for model calibration, especially of assignment models.
2. Land use inventories and inventories of associated human activities: residential zones (housing density),
commercial and industrial zones (by type and establishment), parking spaces, etc. These are particularly useful for
trip generation models.
3. Inventories of travel: O-D travel surveys (at households, cordons and screen lines) and associated traffic
counts, flows, speeds and travel time measurements (to build speed-flow relationships) for modeling trip distribution
and generation.
4. Socio-economic information (income, car ownership, family size and structure, etc.), especially for trip
generation and modal split.

DELINEATING (SKETCH OUT) STUDY AREA

It is necessary to define the area of interest of the study. Its external boundary is known as the external cordon.
Once this is defined, the area is divided into zones in order to obtain a clear and spatially disaggregated picture of the
origins and destinations of trips, and to quantify spatially variables such as population and employment.
The area outside the external cordon may also be divided into zones, but at a lesser level of detail (larger zones).
Inside the study area there can be other internal cordons as well as screen lines (i.e. artificial divides following a
natural or artificial boundary with few crossings, such as a river or railway line).

The information derived from O-D surveys is put to the following uses:

• Establishing travel characteristics for various types of land use
• Establishing travel demand on existing or future transport facilities
• Establishing the adequacy of existing parking facilities
• Determining the number of potential users of, say, a proposed bypass in order to justify its construction

Terminologies used in O-D surveys:

1. Screen line: a line established along a physical boundary (a natural or artificial feature) with few
crossings on it. It divides the study area into zones for the purpose of checking the accuracy of survey data (e.g. from the household survey)
and correcting it.
2. Cordon: an imaginary line defining the boundary of the study area. Data obtained at the cordon may also be useful for
correcting survey data.
3. Desire lines: lines connecting the centroids of various zones such that their widths are proportional to travel
volume and their directions give the direction of movement/travel.
Zoning of study area

Study area characteristics:

1. Should be reasonably close to an area of existing or anticipated development.
2. Should incorporate natural boundary lines as much as possible.
3. Should be suitably located such that roadside interviews (RSI) may be conveniently carried out.
4. Zones (areas of similar trip characteristics due to their uniformity of land use, e.g. CBD, industrial
district, residential area) should be sizable enough to be reliably analyzed while at the same time ensuring
that the data collected are statistically significant.

2.3.3 Origin-Destination (O-D) surveys

Methods of conducting Origin-Destination surveys

1. Roadside interviews (RSI)


Interviewers stop and interview road users at the origin, the destination or, often, at some intermediate point. The
interview should be short and precise to avoid unnecessary delay to the road users. It is not usually possible to
stop and interview all motorists, hence sampling procedures are normally used. At the planning stage it is
necessary to determine the sample size needed to achieve a sufficient degree of accuracy when the sample is taken as
representative of the whole traffic flow in the study area. The sample size should be as small as
possible, consistent with the objective of the survey, in order to minimize personnel requirements and the delay to
traffic.

Three main sampling methods in RSI


• A fixed number of vehicles may be stopped (e.g. 3 out of every 12) and the next 9 allowed to pass, to give a
25% sample.
• All vehicles that arrive at the site during a predetermined period of time are stopped; thus all
vehicles would be stopped every alternate half-hour in order to obtain a 50% by-time sample.
• The third method is similar to the second in that no attempt is made to maintain a fixed relationship between the number
of drivers interviewed and the total number of vehicles on the road; instead a variable sampling fraction is obtained,
with interviewers selecting the next vehicle after completing an interview.
Just as it is not practical to stop all vehicles on the road at a given time, so also it is not normally possible
to carry out RSI throughout the year. Instead a survey can be carried out during a period when the traffic
pattern is considered to be representative of the whole year or when traffic problems are considered to be
most acute/severe.

Advantages of RSI
• The data obtained is for actual trips made.
• Requires minimum personnel.

Disadvantages

• Only trips passing through the interview stations are captured.
• Stopping vehicles causes delays and may cause congestion.

RSI sampling procedure


a. Time cluster sampling
During each hour of the survey, a period t is selected during which all vehicles are stopped and interviewed, followed by a
further period T during which all vehicles are allowed to pass.

Sample size % = 100t/(T + t)

b. Volume cluster sampling
A predetermined volume x of vehicles is stopped and interviewed and a further volume X is allowed to
pass.

Sample size % = 100x/(x + X)

c. Variable rate sampling
Interviews are conducted at a constant rate, so that the sample size depends on the traffic flow during any particular
period.

d. Systematic sampling: randomly pick the first vehicle, and then pick every nth vehicle.
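The two sample size formulas above can be checked with a short calculation. The function names are introduced here for illustration only; the inputs reuse the figures quoted earlier for RSI sampling (alternate half-hours, and 3 vehicles stopped out of every 12).

```python
def time_based_sample_pct(t, T):
    """Sample size % = 100t / (T + t): t is the stopping period, T the free period."""
    return 100 * t / (T + t)


def volume_based_sample_pct(x, X):
    """Sample size % = 100x / (x + X): x vehicles stopped, X allowed to pass."""
    return 100 * x / (x + X)


# Vehicles stopped every alternate half-hour: t = 30 min, T = 30 min.
print(time_based_sample_pct(30, 30))    # 50.0

# 3 vehicles stopped, then the next 9 allowed to pass.
print(volume_based_sample_pct(3, 9))    # 25.0
```

Both results agree with the 50% and 25% samples stated for the corresponding RSI sampling methods.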

2. Registration number (number plate) method.

Observers are stationed at the roadside to record registration numbers, time of passing the
station, classification of vehicles and the direction of travel. Two variations of this method can be
applied, viz.:
• First variation: observers with synchronized watches are stationed around or within the survey area;
as a vehicle passes an observer, its passing time and registration number are recorded. At the end of the
survey the records of all observation sites are compared and each vehicle trip through the survey area is
traced.

By noting entry and exit times, the journey time of each vehicle can be estimated and compared with the
known journey time for the same trip. The vehicle can then be classified as stopping or non-stopping within the
study (survey) area.
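The matching step can be sketched as below. The registration numbers, times, known journey time and the 5-minute tolerance are all hypothetical values chosen for illustration; the source does not specify how large a difference should count as a stop.

```python
from datetime import datetime


def match_journeys(entry_records, exit_records, known_journey_min, tolerance_min=5):
    """Match plates seen at an entry and an exit station and classify each trip.

    entry_records, exit_records: dicts mapping registration number -> datetime.
    tolerance_min is an assumed allowance above the known journey time.
    """
    results = {}
    for plate, t_in in entry_records.items():
        t_out = exit_records.get(plate)
        if t_out is None or t_out < t_in:
            continue  # vehicle not traced through the survey area
        journey_min = (t_out - t_in).total_seconds() / 60
        stopped = journey_min > known_journey_min + tolerance_min
        results[plate] = (journey_min, "stopping" if stopped else "non-stopping")
    return results


# Hypothetical observations from two stations with synchronized watches.
entries = {
    "KAA 001A": datetime(2024, 1, 1, 8, 0),
    "KBB 002B": datetime(2024, 1, 1, 8, 5),
    "KCC 003C": datetime(2024, 1, 1, 8, 10),   # never seen at the exit station
}
exits = {
    "KAA 001A": datetime(2024, 1, 1, 8, 9),
    "KBB 002B": datetime(2024, 1, 1, 8, 50),
}
for plate, (minutes, label) in match_journeys(entries, exits, known_journey_min=10).items():
    print(plate, minutes, label)
```

A vehicle recorded at only one station is left out of the result, since its trip through the survey area cannot be traced.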

Advantage
• Can be used where traffic is heavy since it does not interfere with traffic flow.

Disadvantage
• Requires more personnel.
• Use is limited to single carriageways.

• Second variation: the registration numbers of all vehicles within the study area are recorded on a given day.
These are later compared with the motor vehicle registration list; the origin of each vehicle is assumed
to be where it is registered and its destination where it is parked.
Advantage
• Simplicity of the method in terms of preparation, cost and equipment.

Disadvantage

• Information obtained is limited, since no data are obtained regarding through traffic, time of journey,
final destination or location of stoppages.
• It is confined to small survey areas due to the number of observers required.
• It is also prone to personal errors.
3. Postcard method (questionnaire)
A postage-prepaid card in questionnaire form is given to drivers to complete and return with
detailed information on their trips, trip purpose, vehicle type and type of people travelling. There are two methods:
• The postcard is sent to the owners of all registered vehicles within the survey area.
• The card is handed to motorists as they pass through a selected area.

Advantage

• Cost-effective, since delays, observers, preparation and equipment are kept to a minimum.

Disadvantage

 Lack of co-operation from drivers


 Information is obtained only on regular trips
 Data could be unreliable, since it could be weighted in favor of a particular facility of interest to
the motorist.
4. Home interview method
Because of the many variables in urban areas, e.g. traffic generated from great distances, the presence of
narrow streets, well-developed and highly valued property and constrictive topography, it is not possible
to decide where highway improvements should be made simply by observing traffic flows. It is also not
practical to obtain data by stopping vehicles within a heavily trafficked area. Hence a home interview
sampling technique is usually adopted to obtain comprehensive and detailed information. Because travel
trends are habitual patterns, they can be determined by applying statistical sampling methods, provided
the sample is representative of the persons in the geographical area.

Interviewers call at the homes of road users to interview them about the journeys they have made in a day.
This method provides additional information on household structure and size and on the occupations of
household members. The sample size for a home interview survey is a function of the purpose of the survey
and of the distribution of the various types of land use. Traditional sample sizes, based on experience
from past large population surveys, can be used:

Population (×10³)     Recommended sample size
50-150                12.5%
150-300               10%
300-500               7%

Advantages
 Direct information is obtained
 Socio-economic details are also obtained for the households concerned.

Disadvantage
 Lack of co-operation from the household

The accuracy of the home interview method is checked by comparing its results with those of other methods
of collecting data, e.g. screen-line counts.

TRAFFIC FLOW CHARACTERISTICS:


I. Random flows:
Occur in light to moderate traffic away from the influence of intersections/junctions and other
traffic control devices, so that the driver is free to choose his or her own operating speed.
II. Platooned flows:
Occur where vehicles are bunched together with short spacings between successive vehicles, so that a
driver's speed depends on the speed of the vehicles in front, e.g. where traffic lights are used.
III. Semi-random flows:
Cases of isolated platoons of vehicles with small spacings between successive vehicles, particularly at
restrictions to flow. The driver chooses his or her own speed. Occur away from the influence of
intersections.

Traffic headway distribution:

Density (k) is defined as the number of vehicles occupying a given length of lane or roadway, averaged over
time (given in vehicles/km); k is given as:

\[ k = \frac{\text{no. of vehicles } (N)}{\text{length of road } (L)} \]

Traffic flow or volume (q) is the number of vehicles passing a point during a given time interval. It
is measured in vehicles/second or vehicles/hour.

\[ q = \frac{\text{no. of vehicles } (N)}{\text{time } (t)} \]

Spacing (s) is the distance between successive vehicles in a traffic stream, measured from front bumper to
front bumper. Headway is the corresponding time between successive vehicles as they pass a point on a
roadway.

\[ \text{Mean spacing} = \frac{\text{length of road } (L)}{\text{no. of vehicles } (N)} = \frac{1}{k} \]

\[ \text{Mean headway} = \frac{\text{time } (t)}{\text{no. of vehicles } (N)} = \frac{1}{q} \]
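
As a quick numerical check of these four relations, the short Python sketch below computes density, flow, mean spacing and mean headway. The function name and the input figures are illustrative assumptions. Two separate counts are used because the number of vehicles on a road length and the number passing a point are different observations.

```python
def stream_measures(n_on_road, length_km, n_passing, duration_h):
    """Return density, flow, mean spacing and mean headway.

    n_on_road:  vehicles occupying the road length at an instant (N for density)
    length_km:  length of road L in km
    n_passing:  vehicles passing a point during the count (N for flow)
    duration_h: duration of the count t in hours
    """
    k = n_on_road / length_km   # density, veh/km
    q = n_passing / duration_h  # flow, veh/h
    return {
        "density_veh_per_km": k,
        "flow_veh_per_h": q,
        "mean_spacing_m": 1000.0 / k,   # 1/k converted from km to m
        "mean_headway_s": 3600.0 / q,   # 1/q converted from h to s
    }


# Illustrative figures: 40 vehicles on 2 km; 600 vehicles counted in 30 minutes.
m = stream_measures(n_on_road=40, length_km=2.0, n_passing=600, duration_h=0.5)
print(m)
```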
ORIGIN DESTINATION (O-D) STUDIES/SURVEYS

O-D data give, among other things, the origin and destination of traffic by zone, and hence
describe the distribution of trips in terms of volumes, types of traffic, purpose of trips etc.

Origin and destination data are important in determining the locations of, and demand for, new
thoroughfares, bridges, tunnels and parking places/parking garages. They help in determining the desire
lines of travel in a given area and provide a basis for estimating the volumes of traffic which could use
new or improved routes and terminals at selected positions. Hence these data are important in planning for
future demand and for the infrastructure and traffic management requirements of the area of study.
Investment in future infrastructure and the design of the infrastructure facilities both require
knowledge of O-D data.

The methods or techniques of data collection depend on the type of survey being undertaken. They range
from simple observations to intensive home interviews. The observations should be made for representative
periods on weekdays unaffected by weather conditions or other unusual events. The complete area of
investigation is divided into traffic zones, these are divided into mosaic subdivisions, and counting
stations are located.

The important O-D study surveys and data collection techniques are as follows:

[1] HOUSEHOLD OR HOME SURVEYS

 Although the household-based O-D survey is the most expensive, difficult and time-consuming, it is
also the one which offers the possibility of obtaining the most useful and comprehensive data. However,
when interest is centered not on gathering data for the complete model system but only for parts of it,
e.g. for mode choice and assignment in short-term studies, other methods could be used. Examples are
corridor-based journey-to-work studies and workplace interviews of a sample of employees, conducted with
the permission of their employers. Such data would be choice-based in terms of destination but random with
respect to mode of travel.
 Data collection techniques include household interviews and questionnaires [and special techniques
called travel-diary survey techniques]. Interviews and questionnaires are administered by trained
interviewers to an appropriately selected random sample of households in the area of study. The
interviewers are sent to the selected dwellings in the survey area to collect data from the members of the
households. Every family member above 12 years old should be interviewed in person. Questionnaire design
must be appropriate: simple and easily understood questions; short, yet obtaining sufficient information;
avoiding open-ended questions as much as possible; etc.

GENERAL CONSIDERATIONS, ESPECIALLY FOR HOUSEHOLD SURVEYS (***these also apply to other types of
O-D survey in some or most aspects)

 It is known that both the procedures and the measurement instruments used to collect information on
site have a direct and profound influence on the results derived from any data collection effort. Thus, the
development and use of the instruments designed to measure activity patterns outside the household are
important aspects. Empirical measurement of travel behavior is one of the main inputs to the
decision-making process in urban transport planning. It provides the basis for the formulation and
estimation of models to explain and predict future activities. Thus, methodological deficiencies must be
avoided at every stage of the transport planning process.
 Criticisms of household or workplace O-D surveys:
- They only measure average rather than actual travel behavior of individuals;
- Only part of an individual's movements can be investigated;
- Information (e.g. about travel times) is often poorly estimated by the interviewee, as are the
distances and costs of travel. The values of these variables obtained from the survey are therefore
inaccurate with respect to reality. The bias is systematic in nature and is apparently related to user
attitudes towards each mode, e.g. access, waiting and transfer times in public transport tend
to be over-estimated.

 Survey date: the appropriate date on which to conduct an O-D survey depends on its
objectives. To obtain travel behavior data about the inhabitants of the study area during a typical
working day, the survey date should not coincide with holidays, weekends or bad weather.
 Days and times to conduct the survey: to obtain data for a typical working day,
Mondays and Fridays could be avoided; the former because of absenteeism and the latter because more
trips are recorded than on other working days. Household surveys should be done at
times when people are likely to be at home, e.g. between 18.00 and 21.00 hours. For workplace
surveys, the best times are within normal working hours. A pilot survey could be done first
to confirm this.
 Survey period: ideally the whole selected sample should be interviewed on one day to
obtain a real snapshot of what happens on a typical day. But because of the number of people to be
interviewed, it has become standard practice to conduct the survey over a period of several
days. It is assumed that the sum of the responses over these days is a good representation of the
answers that would be obtained in a complete single-day survey.

Questionnaire design: the questionnaire must be designed so that the interviewee's resistance to
answering questions is minimized. Difficult questions (e.g. about income) are thus placed at the end of the
interview. The questionnaire should satisfy the following:
 The questions should be simple and direct.
 The number of open-ended questions should be minimized.
 The information about travel must be elicited with reference to the activities which originated the trips.
 For household surveys, each member older than 12 years should be personally interviewed.

The three sections of a household O-D survey are:

I. Personal characteristics and identification: the questions here are designed to classify
household members according to relation to the head of household, sex, age, possession of a driving
license, education level and activity. It is important to define a complete set of activities.
II. Trip data: this part aims at detecting and characterizing all trips made by the household members
identified in part (I). A trip is defined as any movement greater than 300 metres from an origin to a
destination with a given purpose. Trips are characterized by variables such as origin and
destination, trip purpose, trip start and end times, mode used, walking distance, public transport and
transport station or bus stop etc.
III. Household characteristics: the questions here seek to obtain socio-economic information about
the household. Examples: characteristics of the house, identification of household vehicles (including a
code to identify their usual user), house ownership and income.
 Sample size: traditionally, household surveys have been based on very large random samples,
with the sample sizes recommended as given below. It is argued that these allow for large
eventual losses.

TRADITIONAL O-D SURVEY SAMPLE SIZE

POPULATION OF AREA       SAMPLE SIZE (DWELLING UNITS)
                         RECOMMENDED       MINIMUM
Under 50,000             1 in 5 (20%)      1 in 10 (10%)
50,000-150,000           1 in 8            1 in 20
150,000-300,000          1 in 10           1 in 35
300,000-500,000          1 in 15           1 in 50
500,000-1,000,000        1 in 20           1 in 70
Over 1,000,000           1 in 25 (4%)      1 in 100 (1%)
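
The table can be applied mechanically, as in the Python sketch below. The table does not say which band a boundary population (e.g. exactly 150,000) belongs to; the sketch assumes it belongs to the higher band. The function name and the example figures are illustrative only.

```python
import math

# (upper population limit of band, recommended 1-in-n, minimum 1-in-n)
BANDS = [
    (50_000, 5, 10),
    (150_000, 8, 20),
    (300_000, 10, 35),
    (500_000, 15, 50),
    (1_000_000, 20, 70),
    (float("inf"), 25, 100),
]


def traditional_sample(population, dwelling_units):
    """Return the recommended and minimum number of dwelling units to sample.

    Assumption: a population exactly on a band boundary falls in the
    higher band, since the table does not specify this.
    """
    for upper, recommended, minimum in BANDS:
        if population < upper:
            return {
                "recommended": math.ceil(dwelling_units / recommended),
                "minimum": math.ceil(dwelling_units / minimum),
            }


# Illustrative town: 200,000 inhabitants living in 40,000 dwelling units.
print(traditional_sample(200_000, 40_000))
```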

Methods which estimate sample size by a more logical and less wasteful statistical approach require
knowledge of the variable to be estimated, its coefficient of variation, and the desired accuracy of
measurement together with the level of significance associated with it. Sample size selection is not easy.
First, it is necessary to have great clarity about the survey objectives; secondly, a decision must be
reached about how much effort should be spent in order to achieve a given level of accuracy in the result.

The coefficient of variation (CV) was an unknown in the past, but it may now be estimated using
information from the large number of household O-D surveys which have been conducted in recent
years. The accuracy level (% error acceptable to the analyst) and its confidence level are
context-dependent matters to be decided by the analyst on the basis of personal experience. Once these
three factors are known, the sample size may be computed. The formula by M.E. Smith for computing the
sample size (n) is

\[ n = \frac{CV^2 \, Z_\alpha^2}{E^2} \]

where CV is the coefficient of variation, E is the level of accuracy (expressed as a proportion) and
Z_α is the value of the standard normal variate for the confidence level (α) required.[Recall :
EXAMPLE:
Assume that we need to measure the number of trips per household in a certain area, and that we have
the following data about the CV of this variable:

Area                              CV
Average for the nation (1969)     0.87
Area/region A (1967)              0.86
Area/region B (1964)              1.07
Area/region C (1962)              1.05

From these, the CV for the study area is estimated as 1.00.
Assume that we ask for a 0.05 (5%) level of accuracy at the 95% level of confidence. For α = 95%, the
value used is Z_α = 1.645 [from the normal distribution table; this is the one-tailed 95% value, whereas
the two-tailed value would be 1.96]. Thus we get:

\[ n = \frac{(1.0)^2 \, (1.645)^2}{(0.05)^2} = 1082.41 \approx 1083 \]

Thus, a sample of about 1,100 observations would suffice to ensure trip rates within a 5% tolerance
95% of the time.
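
The calculation is easy to repeat for other values of CV, E and Z. The Python sketch below is a minimal illustration of the formula n = CV² Z² / E². The function name is an assumption, and the result is rounded up to the next whole observation. The second call uses the two-tailed 95% value Z = 1.96 for comparison; that value is not used in the example above.

```python
import math


def sample_size(cv, error, z):
    """Sample size n = CV^2 * Z^2 / E^2, rounded up to a whole observation.

    cv:    coefficient of variation of the variable being measured
    error: acceptable error E, as a proportion (0.05 for 5%)
    z:     standard normal value for the chosen confidence level
    """
    return math.ceil((cv ** 2) * (z ** 2) / (error ** 2))


n_example = sample_size(cv=1.0, error=0.05, z=1.645)    # figures from the example
n_two_tailed = sample_size(cv=1.0, error=0.05, z=1.96)  # assumed alternative Z
print(n_example, n_two_tailed)
```

The required sample grows with the square of CV and of Z, and shrinks with the square of E, so halving the acceptable error quadruples the sample.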
[2] CORDON SURVEYS:

Cordon surveys provide useful information about external-external and external-internal trips.
Their objective is to determine the number of trips that enter, leave and/or cross the cordoned area, thus
helping to complete the information obtained from the household O-D survey.
The main survey is taken at the external cordon and the others at internal cordons, by stopping a sample
of the vehicles passing a cordon control station.
Data collection techniques

 Mail-return or postcard questionnaires: the road user is asked to answer the questions on
the postcard/questionnaire and mail it back; postage is pre-paid. In some cases, the return
postcards or questionnaires are mailed (later) to vehicle owners whose license plates were
recorded at the control stations during the survey. They are asked to complete them and mail them
back. The problem is that fewer than 50% of questionnaires are returned.
 License plate observations: observers are stationed at selected (control) points to record
all entries to and exits from the area being surveyed. License plate numbers, recorded in a time
series, are matched to determine the routes traversed by the vehicles. Alternatively, coloured tags
could be fixed to the vehicles as they enter the cordon area and removed as they leave it.
 Route interviews/roadside interviews: suitable questionnaires are used by trained
personnel to interview road users at the control stations. It is important to have uniformed police
present for the survey. This technique is more accurate. Vehicles are stopped at regular intervals,
e.g. every 10th vehicle.

[3] SCREEN-LINE SURVEY:

Done at screen-lines, i.e. lines (e.g. rivers, railways, motorways etc.) which divide the study area into
large zones with few crossings between them. The procedure is analogous to that of the cordon survey.
The data serve to fill in gaps and validate information from household and cordon surveys. Data
collection techniques are similar to those used in cordon surveys, viz.: (1) postcard/mail-return
questionnaires; (2) license plate observation or survey and/or questionnaires; (3) roadside interviews.

[4] TRAVEL DIARY SURVEYS

This is a special type of household survey aimed at obtaining more details from travelers/households.
Diaries are given to each member of the household travelling at the time of the study. The diaries are
carried and self-completed by the subjects during the day as they travel. Thus the diaries must be easy to
carry, easy to understand and easy to complete.
The households are visited twice: first, to deliver the diaries and explain the procedure; second, to
collect the completed diaries the following day.

O-D SURVEY DATA CORRECTION:

This is done to achieve results which are not only representative of the whole population, but also
reliable and valid. The series of correction steps is as follows (in the order given):

 Correction by household size: in sampling it is possible to over-sample households of
bigger size and under-sample households of small size. To solve this problem, the sample family size
must be compared with the census family size and corrected appropriately.
 Sociodemographic correction: this is necessary if differences in the distribution of the
variables ‘sex’ and ‘age’ are detected between the sample and the population (i.e. the census). The
consistency of the definitions of family and household in both cases must be checked.
 Non-response correction: this problem is caused by possible differences in travel
behavior between those who do and those who do not answer the survey. Correction factors may be
estimated on the basis of the number of visits required to complete the questionnaire at different
types of households.
 Correction for non-reported trips: this problem arises because non-mandatory trips
tend to be under-reported. Thus the number of trips by purpose obtained in the O-D survey
should be checked against those of travel diaries, where detailed information about each trip should
have been gathered.

Sample expansion:
The corrected data are expanded in order to be representative of the total population; to achieve this,
an expansion factor is defined for each study zone:

\[ f_i = \frac{A - \dfrac{A\,(C + CD/B)}{B}}{B - C - D} \]

where f_i is the expansion factor for zone i, A is the total number of addresses in the original population
list, B is the total number of addresses selected as the original sample, C is the total number of
sampled addresses that were non-eligible in practice (e.g. demolished, non-residential), and D is the
number of sampled addresses where no response was obtained.
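
A small Python sketch of the expansion factor follows, using the symbols A, B, C and D as defined above. The function name and the zone figures are invented for illustration. The check on the denominator is an added safeguard, not part of the formula.

```python
def expansion_factor(a, b, c, d):
    """Expansion factor f_i = (A - A*(C + C*D/B)/B) / (B - C - D).

    a: addresses in the original population list for the zone
    b: addresses selected as the original sample
    c: sampled addresses found to be non-eligible
    d: sampled addresses with no response
    """
    usable = b - c - d
    if usable <= 0:
        # Added safeguard: no usable responses means no factor can be computed.
        raise ValueError("no usable responses in this zone")
    return (a - a * (c + c * d / b) / b) / usable


# Illustrative zone: 10,000 addresses, 1,000 sampled, 50 non-eligible, 150 non-response.
f = expansion_factor(a=10_000, b=1_000, c=50, d=150)
print(f)
```

With no non-eligible addresses and no non-response (C = D = 0) the factor reduces to A/B, the inverse of the sampling rate.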
Validation of the results: there are three validation processes for O-D survey data. The first is an
on-site check of the completeness and coherence of the data, followed by their coding and digitizing in
the office. The second is a computational check of valid ranges for most variables and, in general, of the
internal consistency of the data. These processes help to ensure that the data are free from errors. The
last process consists of traffic counts at cordons and screen-lines during the O-D survey period. The
corrected and expanded survey data are compared with the information obtained from the counts (vehicles
and pedestrians, suitably transformed by means of occupancy rates also measured on site).

ANALYSIS OF O-D DATA RESULTS

The data are analysed graphically and statistically.
The results of an O-D survey are expressed in the form of O-D matrices and desire-line graphs. A
desire-line is a straight line between the origin and destination of traffic (trips). The thickness of
the band represents, to scale, the number of trips.
Separate graphs may be drawn to show desire-lines for through trips, internal trips and trips between
internal and external zones.
The survey area is divided into uniform squares. The number of trips crossing each square is noted and
then contour lines are drawn through squares of like desire-line density.
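
How trip records become an O-D matrix and a set of desire-line volumes can be sketched as follows. The zone labels and trips are invented. Adding the two directions of travel between a zone pair into one band is an assumption made here for plotting purposes; the text does not specify it.

```python
from collections import Counter


def od_matrix(trips, zones):
    """Build an O-D matrix: rows are origin zones, columns are destination zones."""
    counts = Counter(trips)
    return [[counts[(o, d)] for d in zones] for o in zones]


def desire_line_volumes(trips):
    """Number of trips per zone pair, both directions combined (assumption).

    Intra-zonal trips are skipped since they have no line to draw.
    """
    lines = Counter()
    for o, d in trips:
        if o != d:
            lines[tuple(sorted((o, d)))] += 1
    return dict(lines)


# Hypothetical expanded trip records as (origin zone, destination zone).
zones = ["A", "B", "C"]
trips = [("A", "B"), ("A", "B"), ("B", "A"), ("A", "C"), ("C", "C")]

matrix = od_matrix(trips, zones)
bands = desire_line_volumes(trips)
print(matrix)
print(bands)
```

The band volumes would then set the thickness of each desire-line when drawn to scale.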

OTHER SURVEY APPROACHES AND DATA TYPES

Stated preference surveys: the data collection approaches discussed so far implicitly assume that the
data correspond to revealed preference (RP) information, that is, data about actual or observed
choices made by individuals. Strictly speaking, these are seldom actually observed “choices” but rather
what people report they do or did the previous day.

What distinguishes RP data from stated preference (SP) data is that in SP surveys
individuals are asked about “what they would do” or “how they would rank certain options” in a
hypothetical situation. For good results the approach needs carefully designed data collection methods,
not only in terms of survey design expertise, but also in terms of operational requirements (e.g.
resources).
Stages in an SP data collection exercise:
An SP experiment involves the construction of a set of hypothetical (but realistic) options known as
technologically feasible alternatives. The stages are:
1) Identify the range of choices (i.e. the options and their level of disaggregation), the attributes to
be considered and their likely levels of variation.
2) Design an initial version of the experiment and of the survey instrument (e.g. questionnaires);
using simulated data, check that the design allows all the parameters of the model to be recovered.
3) Pre-test the survey instrument using small stratified samples in order to consider the opinions of
the largest possible number of sectors of interest in the population.
4) Evaluate the pre-test results both in terms of the quality of the survey instrument and of the
intuitive quality of the responses obtained by population stratum; correct the instrument before its
distribution (for the actual survey exercise).

Longitudinal data collection:

What has been dealt with so far is data gathering for “cross-sectional data”. This refers to information
about trip patterns revealed by a cross-section of individuals at a single point in time. A fundamental
assumption here is that a measure of the response to incremental change may simply be found by
computing the derivatives of a demand function with respect to the policy variables in question, i.e.
that a realistic stimulus-response relation may be derived from model parameters estimated from
observations at one point in time. Such data lack recognition of inter-temporality.

Longitudinal data collection includes the element of time. It is a time-series data collection method.
Longitudinal or time-series data incorporate, by design, information on response through a series of
observations over time. They may provide the means to directly test, or even reject, hypotheses relating
to response.

Forms of longitudinal survey or data collection are:

 Repeated cross-sectional survey: here similar measurements are made on samples from an
equivalent population at different points in time, without ensuring that any respondent is included in more
than one round of data collection. It provides a series of snapshots of the population at several points
in time.
 Panel survey (also called before-and-after survey): here similar measurements, sometimes called
waves, are made on the same sample at different points in time.
 Rotating panel survey: a panel survey in which some elements are kept in the panel for only a
portion of the duration of the survey.
 Split panel survey: this is a combination of panel and rotating panel survey.
 Cohort study: a panel survey based on elements from population subgroups that have shared a
similar experience (e.g. birth during a given year). Panel data may become unrepresentative of the
initial population as the sample ages with time.

OTHER TRAFFIC STUDY SURVEYS AND DATA COLLECTION

Traffic surveys are carried out:

a. To obtain knowledge of the type and volume of traffic at present and to estimate the
future traffic that the road is expected to carry.
b. To determine the facilities to be provided on the roads, such as traffic regulation and control,
intersections etc., so that improvements on the basis of traffic density may be carried out.
c. To design the geometric features and pavement thickness of roads on the basis of the traffic
survey.
d. To design drainage systems, bridges, culverts etc.
e. To help, through surveys relating to accidents, in redesigning road widths, road curves, traffic
signals, intersections etc.

1. Traffic volume survey/study

 Can be carried out for vehicles and pedestrians separately or combined. It is carried out
during peak hours, i.e. the hours when maximum traffic occurs, in order to obtain the
required information. On rural roads it is carried out for 7 days continuously, once during the peak
season (e.g. during harvesting and marketing) and once during the low season.
 The objective is to get a record of all types of vehicle at an intersection or other roadway
location. Observation stations are located where traffic is heavy; up and down traffic are recorded
separately on straight roads. At intersections, vehicle traffic is classified by movement: turning
left, turning right or going straight. A load meter may be used to record the loads of vehicles.
 The methods of counting the vehicles are:
a. Automatic recorders: these can only determine the number of vehicles, not the type or
direction of vehicles. They are of two types: fixed and portable. They are suitable
for long counts. As these recorders do not give directional counts, and two or more
vehicles passing at the same time are recorded as only one vehicle, a sample manual
count is taken to obtain directional or classified counts.
b. Manual count: here enumerators record the volume of flow on a prepared form/sheet.
Pedestrian counts can be undertaken simultaneously; this helps in planning signals, pedestrian
protection, and the installation of barriers, islands and sidewalks. Though it is not practical to
carry out a manual count for 24 hours, it is the best method of obtaining accurate and reliable
classified volumes.
c. Moving car method: an observer in a moving car travels once against the traffic for a specific
time and a second time along with the traffic, recording the number of vehicles met and overtaken
respectively. The volume is calculated by the formula

\[ V = \frac{x + y}{t_m + t_o} \]

where V = volume in vehicles per minute in one direction,
x = number of vehicles met when moving against the direction of traffic in t_m minutes,
y = number of vehicles overtaken when moving along with the traffic in the desired direction in
t_o minutes.

d. Modern video recorders: the recordings are processed by computer using specially designed
software.
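
For the moving car method in (c), the formula can be evaluated as in this short Python sketch. The counts and run times are invented, and x, y, t_m and t_o are taken as defined in the text.

```python
def moving_car_volume(x, y, t_m, t_o):
    """Volume V = (x + y) / (t_m + t_o), in vehicles per minute in one direction.

    x:   vehicles met while travelling against the traffic
    y:   vehicles counted while travelling with the traffic (as defined in the text)
    t_m: duration of the run against the traffic, in minutes
    t_o: duration of the run with the traffic, in minutes
    """
    return (x + y) / (t_m + t_o)


# Illustrative runs: 80 vehicles met in 3 minutes, 4 counted in 4 minutes.
v_per_min = moving_car_volume(x=80, y=4, t_m=3.0, t_o=4.0)
v_per_hour = v_per_min * 60
print(v_per_min, v_per_hour)
```

Several pairs of runs are normally averaged before the result is used, since a single pair is sensitive to short-term fluctuations in flow.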

Analysis:
This is done to obtain

 Hourly, daily, yearly and seasonal traffic variation.


 Volume and direction of traffic
 Variation of vehicular flow on different parts of a road system.
 Proportion of commercial, heavy vehicles, slow vehicles etc.
2. Road parking studies:

The number of vehicles parked in a specified area at a given instant (the accumulation) is evaluated. A
survey of this accumulation is facilitated by an inventory of the curb space, showing locations and
regulations in effect, and an inventory of off-street space, showing locations, types and fees. In the case
of garages and lots, the observer may count the number of marked stations. Parked vehicles are noted,
giving their types, time of arrival, time of departure and any violations. Parking studies give
information regarding the supply of and demand for the facility.
Parking surveys are made using the same techniques as those of O-D survey data collection.
Analysis:
The actual number of vehicles which park is the parking volume. The number of vehicle-hours of
parking is the parking load. The magnitude of the load and its distribution are a measure of the overall
usage of the facility.
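
The three parking quantities can be computed from arrival and departure times as sketched below in Python. The records are invented, and times are given as decimal hours for simplicity. Treating a vehicle as parked from its arrival up to, but not including, its departure time is an assumption of this sketch.

```python
def parking_volume(records):
    """Parking volume: the number of vehicles that parked during the survey."""
    return len(records)


def parking_load(records):
    """Parking load in vehicle-hours: the sum of all parking durations."""
    return sum(departure - arrival for arrival, departure in records)


def accumulation(records, instant):
    """Number of vehicles parked at a given instant.

    Assumption: a vehicle counts as parked from arrival up to, but not
    including, its departure time.
    """
    return sum(1 for arrival, departure in records if arrival <= instant < departure)


# Hypothetical survey records: (arrival, departure) in decimal hours.
records = [(8.0, 10.0), (8.5, 9.0), (9.0, 12.0)]

print(parking_volume(records))     # vehicles
print(parking_load(records))       # vehicle-hours
print(accumulation(records, 9.0))  # vehicles parked at 09:00
```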
ROAD ACCIDENT STUDIES
There is a chance of hazard in traffic movements. Events occur in the traffic stream which result in death,
injury and property damage. An accident is the result of a failure on the part of the road user, the
vehicle or the facility provided to function properly in the traffic movement.
Analysis of accident facts makes it possible to understand the conditions surrounding the functional
failure of a traffic facility and the causes of the failure. The items to be analyzed include:
a) Highway design standards
b) Warrants for traffic control devices
c) Channelization schemes
d) Street and highway lighting
e) Pedestrian safety facilities
f) Traffic regulations
Accurate records of traffic accidents are essential for the solution of traffic problems. Information on
both minor and serious accidents is important, since the circumstances leading to the accidents may be the
same in both cases. From an engineering point of view, reporting accidents by location on the section of a
road or highway is important. Correlation of the accident history with traffic flow and roadway geometry
and conditions will help in establishing the causes of accidents, after which remedies and corrections
may be suggested.
Traffic accident records are kept by the traffic police at the traffic police headquarters. Details and
more information on accidents can be obtained at the police stations near the scenes of the accidents or
which cover the areas/divisions/districts where the accidents occurred. The accident details must be
carefully extracted from the records by the analyst/researcher. It is also necessary to visit accident
scenes to ascertain facts where necessary.
Analysis
Traffic accidents are usually analysed as rates per 100,000 population, per ten thousand vehicles or per
hundred million vehicle-miles, giving seasonal and daily variations and types.
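
The three rates can be computed as in the Python sketch below. All the input figures are invented for illustration, and the function and key names are assumptions.

```python
def accident_rates(accidents, population, vehicles, vehicle_miles):
    """Express an accident count as the three rates named in the text."""
    return {
        "per_100k_population": accidents * 100_000 / population,
        "per_10k_vehicles": accidents * 10_000 / vehicles,
        "per_100m_vehicle_miles": accidents * 100_000_000 / vehicle_miles,
    }


# Illustrative year: 150 accidents, 500,000 inhabitants, 60,000 registered
# vehicles and 250 million vehicle-miles travelled.
rates = accident_rates(
    accidents=150,
    population=500_000,
    vehicles=60_000,
    vehicle_miles=250_000_000,
)
print(rates)
```

Rates of this kind allow areas or years with different populations and traffic levels to be compared on a like-for-like basis.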

 Accident spot maps: these show accident spots by pins, pasted spots or symbols for
a street, locality or section of a highway.
 Collision diagram: this is a diagram showing details of the paths of the vehicles and
pedestrians involved in an accident before and after the collision. Collision diagrams are
generally not drawn to scale. They help to study accident patterns and to determine remedial
measures. Where fault has to be established, the diagrams are drawn to scale, showing skid marks
before and after the collision.
 Condition diagram: this is a drawing to scale showing physical conditions such as curb lines,
sign and signal posts, property lines, sidewalks, driveways and type of road surface. These
diagrams are used in the analysis of accident causes.
