Itc TDM Notes 2017

Theory
Introductory notes in
transport planning and
travel demand modelling
Prof. Dr. Ir. M.F.A.M. van Maarseveen (University of Twente)

Dr. Ir. M.H.P. Zuidgeest
PART A – TRANSPORT PLANNING
1 INTRODUCTION TO TRANSPORT MODELLING

1.1 Urban transport planning
1.1.1 Introduction
1.1.2 The urban transport planning process
1.1.3 The role of models in urban transport planning
1.1.4 Continuous transport planning
1.2 Theoretical background on transport systems
1.2.1 Introduction to travel behaviour and transport systems
1.2.2 Consumer travel behaviour
1.2.3 Demand
1.2.4 Supply
1.2.5 Equilibrium
1.3 Modelling issues
1.3.1 General modelling issues
1.3.2 Aggregate or disaggregate modelling
1.3.3 Types of models
1.4 The classic four-stage model
1.5 Limitations of the four-stage model
1.5.1 Limitations of trip generation models
1.5.2 Limitations of trip distribution models
1.5.3 Limitations of modal split models
1.5.4 Limitations of assignment models
2 DATA COLLECTION
2.1 Introduction to survey planning
2.2 Sampling methods
2.2.1 Sample design
2.2.2 Sampling methods
2.3 Errors in data collection and modelling
2.3.1 Types of error
2.3.2 Model complexity versus data accuracy
2.4 Data collection
2.4.1 Household-based survey
2.4.2 Non-household based survey
2.4.3 Data correction, expansion and validation
2.4.4 Stated preference surveys
2.4.5 Longitudinal surveys
PART B – TRAVEL DEMAND MODELLING
3 NETWORKS AND STUDY AREA DEFINITION

3.1 Zoning design
3.2 Network representation
3.2.1 Schematisation
3.2.2 Link properties
3.2.3 Network costs
2 UNIVERSITY OF TWENTE, THE NETHERLANDS

INTRODUCTION TO TRANSPORT PLANNING AND MODELLING
4 TRIP GENERATION MODELLING

4.1 Introduction
4.2 Classification of trips
4.3 Methods to model trip generation
4.3.1 Growth-factor modelling
4.3.2 Regression analysis
4.3.3 Category analysis technique for trip generation
4.4 Balancing
5 TRIP DISTRIBUTION MODELLING

5.1 Introduction
5.2 Updating a base-year table with future forecasts
5.1.1 Uniform growth factor
5.1.2 Singly constrained growth factor method
5.1.3 Doubly constrained growth factor method
5.2 The Gravity method
5.3 Tri-proportional fitting
5.4 Some practical notes
6 MODAL SPLIT MODELLING

6.1 Introduction
6.2 Trip Interchange modal-split models
7 TRAFFIC ASSIGNMENT MODELLING

7.1 Introduction
7.2 Classification of traffic assignment models
7.3 Traffic assignment algorithms
7.3.1 All – or – nothing assignment (AON)
7.3.2 Wardrop’s user-equilibrium assignment (DUE)
7.3.3 Method of Successive Averages (MSA)
7.3.4 Stochastic user-equilibrium assignment (SUE)
8 RESUME ON COMPLETE TRANSPORT MODEL

8.1 Study area
8.2 Stage 1: Trip generation
8.3 Stage 2: Trip distribution
8.4 Stage 3: Modal split
8.5 Stage 4: Assignment
APPENDIX A USED REFERENCES

PART A – Transport Planning

1 Introduction to transport modelling¹

This chapter gives a brief introduction to urban transport planning, the field that gave rise to the need for
transport models. Some aspects of the theoretical background on transport systems, based on economic
theory, are explained in section 1.2. Section 1.3 explains modelling issues and gives an
overview of existing models. Section 1.4 explains the classic four-stage transport model, and
section 1.5 discusses its limitations.
1.1 Urban transport planning
1.1.1 Introduction
In a community, decision-makers mostly base their decisions on plans that provide information
such as forecasts of future developments in a certain policy area. Transport is one of those policy areas. It is
of major importance to the community, as the economic and social health of an area depends on the
performance of the transport system.
Urban transport planning started in the United States in the 1950s with the Detroit and Chicago
Transport Studies, and was used to inform decision-makers on the transport system. Urban transport
planning analyses the transport system, gives forecasts on future performance of the system and
suggests measures to improve this performance in order to meet the level desired.
In earlier times the studies were mainly concerned with providing capacity for the growing demand
for motorcar travel. Now, after nearly fifty years, the major concern is the environmental effects,
and studies focus on how to restrain the growth of motorcar travel, with transport pricing as the
main instrument.
The rise of urban transport planning included the start of the development of transport models, as they
are an essential component of urban transport planning. The techniques developed in the United States
were imported to the United Kingdom in the 1960s, followed by important theoretical developments in
both the United States and Europe in the next twenty years.
Although the development of transport models has been evolutionary rather than revolutionary, two
important changes have taken place:
• A theoretical framework was developed, compatible with economic theory, providing a justification
and clarification of methods that were originally proposed on practical grounds.
• The major increase in computing power made it possible to analyse problems at a significantly larger
scale and level of detail.
¹ Based on M.D. Meyer & E.J. Miller, Urban transportation planning, 2001, p. 1-35, 256-261; J. de D. Ortúzar & L.G.
Willumsen, Modelling transport, 1994, p. 1-32; R. Tolley & B. Turton, Transport systems, policy and planning, 1995, p.
197-210; E.A. Beimborn, A transportation modelling primer, 1995; M. Taylor, W. Young & P. Bonsall, Understanding
traffic systems: Data, analysis and presentation, 1996, p. 32-33.

1.1.2 The urban transport planning process
The general framework for the urban transport planning process is derived from the pioneer urban
studies of Chicago and Detroit mentioned earlier. A systems approach is used (a system can be defined
as a set of objects and the relationships between them) in which land use and transport facilities are the
‘objects’ and the relationship between them is traffic.
To understand a system it has to be analysed. The basic components of this analysis are:
• Definition: what problem is the plan intended to solve?
• Projection: how will the situation develop if the problem continues?
• Constraints: what are the limits of finance, time etc. within which planning must take place?
• Options: what are the alternatives and their pros and cons?
• Formulation: what are the main alternative plans, i.e. packages of available options within the
prevailing constraints?
• Testing: how would each of the alternative plans work out in practice?
• Evaluation: which plan gives the greatest value (within the constraints) in terms of solving the problems
already defined?
The results and proposals of this analysis would be fed back into the political process for appraisal. Often
adaptations will have to be made to the plan, so the process can be seen as a learning one.
The basic stages of the process are the pre-analysis, technical analysis and post-analysis phase (figure
1.1).
Figure 1.1 A general representation of the urban transport planning process

Source: Pas (1986)

The pre-analysis phase

After the identification of the problems, the goals can be defined. These are usually broad statements
such as “provide a safe, energy-efficient transport system”, and are operationalised by a set of specific
objectives that can be used to evaluate the alternatives. These could be for example: “reduce the
number of traffic accidents” or “reduce the consumption of fossil fuels per person”.
After that, data need to be collected on land use, transport and travel inventories, the distribution of land
use, current travel patterns, preferred travel modes and the socio-economic situation. Data
collection is discussed further in chapter 2.
Finally, alternatives are generated.
The technical analysis phase

The technical analysis phase involves predicting the traffic flow on the links of a specified network.
Forecasting techniques are used for this, which are generically known as the urban transport planning
modelling system. A major input is the future distribution of houses, employment, shops and other land
uses in the urban area. These are predicted by models relating to land uses, population trends,
employment and income levels and the performance of the urban economy, and by references to urban
physical planning proposals.
The urban transport modelling system consists of a set of submodels which tackle the problem in four
stages:
1 Whether to make a trip (trip generation)
2 Where to go (trip distribution)
3 Which mode of transport to use (modal split)
4 Which route to use (traffic assignment)
As these lecture notes are mainly about modelling transport, the following sections and chapters concentrate
on this four-stage model.
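To make the chaining of the four stages concrete, the sketch below passes the output of each stage to the next. It is only an illustration under strong assumptions: the two zones, the trip rate, the exponential cost weighting, the fixed car share and the single route per zone pair are all invented here, and the methods actually used for each stage are the subject of the later chapters.

```python
import math

# All zone names, rates, costs and shares below are made-up illustrative values.
HOUSEHOLDS = {"A": 100, "B": 50}      # households living in each zone
JOBS = {"A": 20, "B": 80}             # attractiveness of each zone as a destination
COST = {("A", "A"): 1.0, ("A", "B"): 3.0,
        ("B", "A"): 3.0, ("B", "B"): 1.0}   # travel cost between zones
ROUTES = {("A", "B"): ["link_AB"], ("B", "A"): ["link_BA"]}  # one route per pair


def trip_generation(households, trips_per_household=2.0):
    """Stage 1: how many trips start in each zone."""
    return {zone: n * trips_per_household for zone, n in households.items()}


def trip_distribution(productions, attractions, cost, beta=0.5):
    """Stage 2: spread each zone's trips over the destinations.

    Destinations are weighted by attractiveness and by an assumed
    exponential decay with travel cost.
    """
    trips = {}
    for origin, produced in productions.items():
        weights = {dest: a * math.exp(-beta * cost[origin, dest])
                   for dest, a in attractions.items()}
        total = sum(weights.values())
        for dest, w in weights.items():
            trips[origin, dest] = produced * w / total
    return trips


def modal_split(trips, car_share=0.7):
    """Stage 3: divide each origin-destination flow over modes (fixed share)."""
    return {od: {"car": t * car_share, "bus": t * (1.0 - car_share)}
            for od, t in trips.items()}


def assignment(trips_by_mode, routes):
    """Stage 4: load car trips onto the links of their (single) route."""
    flows = {}
    for od, by_mode in trips_by_mode.items():
        for link in routes.get(od, []):   # intrazonal trips use no link here
            flows[link] = flows.get(link, 0.0) + by_mode["car"]
    return flows


productions = trip_generation(HOUSEHOLDS)
od_matrix = trip_distribution(productions, JOBS, COST)
by_mode = modal_split(od_matrix)
link_flows = assignment(by_mode, ROUTES)
print(link_flows)
```

The point of the sketch is the data flow: zone totals become an origin-destination table, the table is split by mode, and the car part ends up as flows on network links.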
The post-analysis phase

The outcome of the technical analysis phase is a set of predictions of the likely impacts of the various
plans. In the post-analysis phase the alternatives are evaluated, and after the decision has been made,
the chosen alternative can be implemented. The system is then monitored so that the observed
effects can be compared with the forecast effects, in order to identify the measures needed to keep the new system on
track.
1.1.3 The role of models in urban transport planning

The preceding sections have made clear that models take a central place in the urban transport planning
process. Why are models so important?
A model can be defined as a (simplified) representation of a part of the real world – the system of
interest – which concentrates on certain elements considered important for its analysis from a particular
point of view. There are physical and abstract models. The former are used, for example, in architecture.
The latter include the analytical models that are used in transport planning.
Models are necessary in transport planning because it is impossible to conduct experiments on existing
infrastructure and, of course, on infrastructure and transport modes that do not yet exist (e.g. roads, rail track,
a new mode of bus transport).

Figure 1.2
Source: http//www.uwm.edu/Dept/CUTS.primer.htm
Analytical models attempt to replicate the system of interest and its behaviour by means of
mathematical equations. These equations are based on theoretical statements about the system and its
behaviour. The models are often complex and large amounts of data are required. The value of these
models is limited to a range of problems under specific conditions.
Because the effects of future developments on transport cannot be observed in advance, models are used to predict them. The
mathematical equations depend on a range of variables that might change in the future. A model is
designed with base-year data, but if it is adequate it can also be solved for future values of the variables,
which delivers the desired forecasts. However, one should always keep in mind that models are subject
to bias when used for forecasting.
1.1.4 Continuous transport planning

In section 1.1.2 it can be seen that modelling is part of a problem-solving process, in this case the urban
transport planning process. The role of the model in transport planning can be presented as contributing
to the key steps of a decision-making framework. Although this seems to be largely the same as the
framework in section 1.1.2 it is more specific on the modelling part of the process, and therefore
provides some extra information (figure 1.3).
1. Formulation of the problem. A problem can be defined as a mismatch between expectations and
perceived reality. The formal definition of a transport problem requires reference to objectives,
standards and constraints. The objectives are a definition of an ideal but achievable future state.
Standards are provided to compare whether minimum performance is being achieved at different
levels of interest (e.g. if all signalised junctions in a city operate at a 90% degree of saturation, this
can indicate a network overload). Constraints can be of many types: financial, temporal,
geographical, technical, or areas/buildings that should not be threatened by new proposals.
2. Collection of data about the present state of the system of interest in order to support the
development of the analytical model. Data collection and model development are closely interrelated
as the latter defines which types of data are needed.
3. Construction of an analytical model of the system of interest. In general one would select the
simplest modelling approach possible to make a choice between schemes on a sound basis. The
construction of an analytical model involves specifying it, estimating and calibrating its parameters
and validating its performance.
4. Generation of solutions for testing. This can be achieved in many ways: ranging from tapping the
experience and creativity of local transport planners and interested parties to construction of a large-
scale design model, perhaps using optimisation techniques.
5. In order to test the solutions or schemes proposed in the previous step it is necessary to forecast
the future values of the planning variables which are used as inputs to the model.
6. Testing the model and solution. The performance of the model is tested under different scenarios
to confirm its reasonableness.
7. Evaluation of solutions and recommendation of a plan/strategy/policy. This involves operational,
economic, financial, and social assessment of alternative courses of action on the basis of the
indicators produced by the model. A combination of skills is required here, from economic analysis to
political judgment.
8. Implementation of the solution and search for another problem to tackle; this requires recycling
through this framework starting again at point (1).
[Flowchart boxes, in order: Formulation of the problem; Data collection; Construct analytical model and calibrate; Generate solutions for testing / Forecast planning variables; Test model and solution; Evaluate solutions and recommend best one; Implement solutions]
Figure 1.3 A framework for rational decision making with models

Source: Ortúzar & Willumsen (1994)
1.2 Theoretical background on transport systems
1.2.1 Introduction to travel behaviour and transport systems²

Transport systems are composed of a complex set of relationships between the demand, the locations
they service and the networks that support movements. They are mainly dependent on the commercial
environment from which are derived operational attributes such as transport costs, capacity, efficiency,
reliability and speed. Such conditions are closely related to the development of transport networks, both
in capacity and in spatial extent. Transport systems are also evolving within a complex set of
relationships between transport supply, mainly the operational capacity of the network, and transport
demand, the mobility requirements of an economy.
What are the differences between an airplane, an oil tanker, a car and a bicycle? Many indeed, but they
each share the common goal of fulfilling a derived transport demand, and they thus all fill the purpose of
supporting mobility. Transport is a service that must be utilized immediately and thus cannot be stored.
Mobility must occur over transport infrastructures, providing a transport supply. In several instances,
transport demand is answered in the simplest means possible, notably by walking. However, in some
cases elaborate and expensive infrastructures and modes are required to provide mobility, such as for
international air transport.
² Based on: T. Schoemaker, Samenhang in vervoer- en verkeerssystemen, 2002, Coutinho, Bussum; Rodrigue et
al., The geography of transport systems, 2006, Routledge.

An economic system including numerous activities located in different areas generates movements that
must be supported by the transport system. Without movements infrastructures would be useless and
without infrastructures movements could not occur, or would not occur in a cost efficient manner. This
interdependency can be considered according to two concepts, which are transport supply and demand.
Transport supply is the expression of the capacity of transport infrastructures and modes, generally over
a geographically defined transport system and for a specific period of time. Therefore, supply is
expressed in terms of infrastructures (capacity), services (frequency) and networks. The number of
passengers, volume (for liquids or containerized traffic), or mass (for freight) that can be transported per
unit of time and space is commonly used to quantify transport supply. Transport demand is the
expression of the transport needs, whether those needs are satisfied fully, partially or not at all. Like
transport supply, it is expressed in terms of number of people, volume, or tons per unit of time and
space.
There is a simple statistical way to measure transport supply and demand for passengers or freight. The
passenger-km is a common measure expressing the realized passenger transport demand as it compares
a transported quantity of passengers with a distance over which it gets carried. The ton-km is a common
measure expressing the realized freight transport demand. Although both the passenger-km and ton-km
are most commonly used to measure realized demand, the measure can equally apply for transport
supply.
For instance, the transport supply of a Boeing 747-400 flight between New York and London would be
426 passengers over 5,500 kilometres (with a transit time of about 5 hours). This implies a transport
supply of 2,343,000 passenger-km. In reality, there could be a demand of 450 passengers for that
flight, or 2,475,000 passenger-km, even though the actual capacity would be only 426 passengers (if a
Boeing 747-400 is used). In this case the realized demand would be 426 passengers over 5,500
kilometres out of a potential demand of 450 passengers, implying a system where demand is at about 106% of
capacity (450/426 ≈ 1.06).
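The arithmetic of this example can be checked with a few lines of code. The sketch below uses only the passenger numbers and the distance quoted in the example; the function and variable names are chosen here for illustration.

```python
def passenger_km(passengers, distance_km):
    """Transport quantity: number of passengers multiplied by distance carried."""
    return passengers * distance_km


capacity = 426        # seats on the flight, from the example
distance_km = 5500    # New York - London, from the example
potential = 450       # passengers who want to take the flight, from the example

supply = passenger_km(capacity, distance_km)
potential_demand = passenger_km(potential, distance_km)
# Realized demand cannot exceed what the aircraft can carry.
realized_demand = passenger_km(min(potential, capacity), distance_km)
demand_to_capacity = potential / capacity

print(supply, potential_demand, realized_demand)
print(f"demand is at {demand_to_capacity:.0%} of capacity")
```

Realized demand equals supply here because the flight is full; the difference between potential and realized demand is the part of the transport need that is not satisfied.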
Transport demand is generated by the economy, which is composed of persons, institutions and
industries and which generates movements of people and freight. When these movements are expressed
in space they create a pattern, which reflects mobility and accessibility. The location of resources,
factories, distribution centers and markets is obviously related to freight movements.
The transport system is built-up of three interdependent layers:

1. Travel patterns of travellers and goods, both being the elements that need to move or be
transported. Travel patterns are a consequence of the geographical or spatial distribution of
activities, travel distances, trip purposes, time of day effects etc.
2. Transport services, for enabling the movement of travellers and goods, using different transport
modes. These services can be public, semi-public or private (incl. walking). The space-time
distribution of transport services is strongly related to the distribution of activities in space-time.
For private means this follows directly from people's choices to travel. For ‘public’ means the travel
market determines whether services will be offered (supplied to the demand for travel).
3. Traffic services, for enabling the movement of transport modes through physical infrastructure and
management and operations of the infrastructure (incl. pricing policies).
The total traffic and transport system can be depicted as an interrelated system of layers (see figure 1.4).
Between the layers two types of market exist: the travel market, between the demand for travel (travel patterns) and the supply of
transport services, and the traffic market, between the demand of these services for infrastructure and
the supply of infrastructure.

[Figure 1.4 diagram, contents by layer]
Travel demand
  Distribution in space and time: travel patterns
  Elements: travellers, freight
  Transport/traffic domain: land-use/transport interaction; activity-based analysis; travel demand modelling; travel and gender; accessibility analysis; impact analysis (macro); ...
(travel market equilibrium)
Travel supply = traffic demand
  Distribution in space and time: transport services
  Elements: modes of transport
  Transport/traffic domain: network and corridor design; location allocation; intermodal planning (network, corridor, nodes); BRT & NMT services planning; routing and logistics; ...
(traffic market equilibrium)
Traffic supply
  Distribution in space and time: traffic services
  Elements: traffic infrastructure
  Transport/traffic domain: traffic control and optimization; infrastructure maintenance; BRT & NMT services control; impact analysis (micro); ...
Supporting tools shown alongside the layers: remote sensing and (GIS) databases; mapping and visualisation
Figure 1.4 The three-layer traffic and transport system.

Adapted from T. Schoemaker, 2002.
The theoretical background on transport systems can largely be derived from economic theory. There are
four aspects of economic theory that will be explained for transport problems. These are consumer travel
behaviour, demand, supply and equilibrium.
1.2.2 Consumer travel behaviour

The basic premise of the theory of consumer behaviour is that an individual selects, from all affordable
bundles of goods, the bundle that yields the greatest utility (i.e. satisfaction). The individual’s decision-making
consists of maximising a utility function U subject to a budget constraint Y:
$$\max U = U(X_1, \ldots, X_n)$$
$$\text{subject to } Y = P_1 X_1 + \cdots + P_n X_n$$
where
X_1, …, X_n = goods that are consumed
P_1, …, P_n = prices of goods
Y = income
Figure 1.5 presents the solution of this problem when two types of goods (X1 and X2) are considered. The
indifference curve u represents the combinations of X1 and X2 that yield a given utility level. The
income line y represents the combinations of X1 and X2 that can be bought with a given income level.
The equilibrium is reached at point E, the point at which the individual’s valuation of the
goods is the same as the market valuation.

Figure 1.5 Consumer utility maximizing behaviour
Source: Meyer & Miller (2001)
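As a numerical illustration of the two-good case, the sketch below searches along the budget line for the bundle with the highest utility. The Cobb-Douglas utility function with equal weights, the prices and the income are assumptions made here for the example; none of them come from the source.

```python
def utility(x1, x2, alpha=0.5):
    """Assumed Cobb-Douglas utility U = x1^alpha * x2^(1 - alpha)."""
    return (x1 ** alpha) * (x2 ** (1.0 - alpha))


def best_bundle(income, p1, p2, steps=10000):
    """Search along the budget line Y = p1*x1 + p2*x2 for the highest utility."""
    best = (0.0, income / p2, 0.0)              # (x1, x2, utility) when x1 = 0
    for k in range(steps + 1):
        x1 = (income / p1) * k / steps          # quantity of good 1 tried
        x2 = max(0.0, (income - p1 * x1) / p2)  # the rest of the budget buys good 2
        u = utility(x1, x2)
        if u > best[2]:
            best = (x1, x2, u)
    return best


# Illustrative values: income 100, price of good 1 is 2, price of good 2 is 5.
x1, x2, u = best_bundle(income=100.0, p1=2.0, p2=5.0)
print(x1, x2, u)
```

With these assumed values the search settles on 25 units of X1 and 10 units of X2. At that bundle the ratio of marginal utilities (X2/X1 = 0.4 for this utility function) equals the price ratio P1/P2 = 2/5, which is the tangency between indifference curve and income line at point E.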
In this basic premise, the assumption is made that utility is generated by the quantity of goods, while, in
most cases, it is generated by the attributes of goods. The demand for a good therefore depends on its
price, characteristics and the characteristics of the consumer.
In case of transport, the “good” being demanded is a certain transport service. The “price” consists of all
perceived costs of the traveller, not only the monetary costs of the trip but also the time spent travelling.
Long-run costs are rarely used in utility functions as they are unlikely to influence the decision of a
traveller. If the monetary value of time (or other factors) is known, time and price can be combined to
yield a generalised cost of travel. This is, however, not necessary.
The utility of a trip, and therefore the demand for it, depends upon the characteristics of:
• The trip to be made
• The available modes
• The individuals making the trips
1.2.3 Demand
Demand for travel is actually a derived demand, as it is generated by the desire to join in activities, and
generally not by the desire just to travel. The transport system provides a physical connection between
activities.
Because of its derived nature, transport demand cannot be analysed without considering the socio-economic
activity system, which is served by the transport system and generates travel demand. The
accessibility provided by the transport system can, over longer periods, influence where people live and
where economic activities occur. Predicting land use patterns is therefore necessary when travel demand
is forecast over longer periods of time.
Travel can be characterised in terms of time, monetary cost, inconvenience, discomfort, and so on,
associated with the trip. These characteristics represent the disutility or ‘generalised cost of travel’, as
one would prefer to spend less time travelling, incur less expense, and be more comfortable. It is
reasonable to assume that a potential trip maker will choose the option with the maximum (personal)
utility out of the mobility options available for the specified trip.
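A minimal sketch of this choice rule is given below. It combines money cost and travel time into a generalised cost using a value of time, and picks the option with the lowest generalised cost (that is, the highest utility). The three options, their costs and times, and the value of time are hypothetical numbers chosen for illustration; inconvenience and discomfort are left out.

```python
VALUE_OF_TIME = 10.0   # assumed money units per hour; not taken from the text

# Hypothetical options for one trip: out-of-pocket cost and door-to-door time.
options = {
    "car":     {"cost": 6.0, "hours": 0.5},
    "bus":     {"cost": 2.0, "hours": 1.0},
    "bicycle": {"cost": 0.0, "hours": 1.5},
}


def generalised_cost(option, value_of_time=VALUE_OF_TIME):
    """Money cost plus travel time converted into money units."""
    return option["cost"] + value_of_time * option["hours"]


def choose(options, value_of_time=VALUE_OF_TIME):
    """Return the option with the lowest generalised cost (highest utility)."""
    costs = {mode: generalised_cost(o, value_of_time) for mode, o in options.items()}
    return min(costs, key=costs.get), costs


chosen, costs = choose(options)
chosen_low_vot, costs_low_vot = choose(options, value_of_time=3.0)
print(chosen, costs)
print(chosen_low_vot, costs_low_vot)
```

With a value of time of 10 the car has the lowest generalised cost (11 against 12 for the bus and 15 for the bicycle). With a value of time of 3 the bicycle becomes the cheapest option (4.5), which shows how the characteristics of the individual change the outcome for the same trip and the same modes.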
1.2.4 Supply
The supply curve expresses the quantity of a given good that will be supplied or produced as a function
of the price of the good. This function is always upward sloping (or at least non-decreasing),
indicating that greater quantities of the good will be produced only if the price of the good rises. This is
because producing greater quantities leads to higher marginal operating costs. In the long run, however, these costs can
be reduced, which leads to a supply curve that lies below the original one.

It is clear that the supply function, as is the case with the demand function, depends on more factors
than just the price of the good, including the prices of the input factors and the technology used to
produce the good.
In transport, one of the possible definitions of supply is system performance. This can be seen as the
whole of travel times, headways and capacities provided by the transport system given a certain capital
investment, operating strategy and demand level. This leads to an inverted supply function, in the sense
that the price (e.g. travel times and costs) is now a function of the quantity of the good (i.e. the level
of demand accommodated, the flows).
1.2.5 Equilibrium
In figures 1.5 and 1.6 the demand and supply curves are drawn in one diagram. The point of intersection
between the curves is called the equilibrium point. At this point the quantity demanded is equal to the
quantity supplied.
If shifts in demand and supply curves do not occur, markets can be expected to move towards the
equilibrium point. This can be explained in the following way:
• If demand is higher than supply, prices rise because customers bid them up. This
stimulates an increase in supply and a decrease in demand, thus driving the market towards the
equilibrium point.
• Conversely, if supply is higher than demand, prices fall, stimulating a decrease
in supply and an increase in demand.
It is assumed that there will also be equilibrium within a transport system, or at least it will arrive in such
a state after being left undisturbed for some time. There will of course be disequilibria due to, for
example, traffic accidents, but these will always be transient. It is, however, difficult to compare the
units of travel for demand and supply. As for demand, the units are counted in number of trips or
distances, while for supply the response of the system is related to volumes of traffic at different places
and times.
In figure 1.6 the vertical axis denotes supply in terms of the ‘price’ of travel offered (travel time + travel
costs = generalized costs) and the horizontal axis denotes demand in terms of the number of
trips made, i.e. the flow. The figure shows that if a shift occurs in, e.g., the supply curve (from supply 1 to
supply 2), a ‘new’ equilibrium point is found through the same mechanism as set out above. The
‘newly’ derived travel demand (on top of the already ‘revealed’ demand) is often called ‘induced’
demand.
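The shift of the equilibrium point can be illustrated numerically. The sketch below assumes a linear demand curve and two linear cost-flow (inverted supply) curves, with all coefficients invented for the example, and finds each equilibrium by bisection on the flow.

```python
def demand(cost, a=2000.0, b=50.0):
    """Trips made at a given generalised cost (assumed linear demand curve)."""
    return max(0.0, a - b * cost)


def make_supply(free_flow_cost, slope):
    """Return an assumed linear cost-flow curve: generalised cost rises with flow."""
    return lambda flow: free_flow_cost + slope * flow


def equilibrium(supply, lo=0.0, hi=2000.0, tol=1e-9):
    """Bisection on the flow at which the demand at the supplied cost equals the flow."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if demand(supply(mid)) > mid:   # more trips wanted than the current flow
            lo = mid
        else:
            hi = mid
    flow = (lo + hi) / 2.0
    return flow, supply(flow)


supply_1 = make_supply(free_flow_cost=10.0, slope=0.010)   # existing system
supply_2 = make_supply(free_flow_cost=10.0, slope=0.005)   # after added capacity

flow_1, cost_1 = equilibrium(supply_1)
flow_2, cost_2 = equilibrium(supply_2)
induced = flow_2 - flow_1
print(flow_1, cost_1, flow_2, cost_2, induced)
```

With these assumed curves the equilibrium moves from 1,000 trips at a generalised cost of 20 to 1,200 trips at a generalised cost of 16. The additional 200 trips are the induced demand: they appear only because the improved supply lowers the price of travel.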

Figure 1.6 Demand-supply equilibriums
Source: Zuidgeest (2005)
1.3 Modelling issues

In this section, some critical modelling issues are discussed that are relevant to the choice of the model.
Besides general modelling issues, aggregate versus disaggregate modelling is discussed. The modeller
also has to choose between cross-sectional and time-series data, and between revealed and stated preference
data. These two choices are closely related to methods of data collection, and will therefore be
discussed in section 2.4.
Section 1.3.3 provides an overview of types of models that can be used in transport studies at different
levels of detail.
1.3.1 General modelling issues

The roles of theory and data
Many people tend to associate the word “theory” with endless series of formulae and algebraic
manipulations. In transport modelling this association has largely been correct, as it is difficult to
understand and replicate the complex interactions between human beings, which are an inevitable
feature of transport systems. In fact, it has occurred that a “pragmatic” transport model, built by
practitioners who despised theory, showed elasticities with a wrong sign.
It is often possible to derive the same functional form from different theoretical perspectives (e.g. the
functional form of the gravity model that will be discussed later can be derived from analogy with
physics, entropy maximisation and maximum utility formalisms). The model output, however, is
dependent on the theory adopted. Using a theoretical framework also increases confidence in a model's
ability to forecast future behaviour.
There are two classical styles of approach to the development of theory:

• Deductive: building a model and testing its predictions against observations
• Inductive: starting with data and attempting to infer general laws
The deductive approach has been found more productive in pure sciences, and the inductive approach
has been preferred in the analytical social sciences. In both cases data play a central role. The
availability and nature of data in many cases restricts the choice of a model to a single option.
An issue closely related to the question of data is the type of variables to be represented in the model.
Models predict a number of dependent (endogenous) variables given other independent (explanatory)
variables. Data is needed on each variable to test the model. One type of variable is the policy variable,
which is interesting because it is under control of the decision-maker, and can therefore be varied by the
analyst in order to evaluate different policies.
Model specification
The following themes can be recognised, concerning model specification:
• Model structure: can the system be modelled by a simple structure which assumes, for example,
that all alternatives are independent? Or is it necessary to build a complex model to be able to
calculate probabilities of choice conditional on previous selections?
• Functional form: is it possible to use linear forms, or does the problem require postulating more
complex non-linear functions?
• Variable specification: which variables are used and in which form should they enter the model
(e.g. if income is assumed to influence individual choice, should it enter the model as a variable in its own right, or
as a deflator of a cost variable)?
Model calibration, validation and use

A model can simply be represented as a mathematical function of variables X and parameters θ, such as:
$$Y = f(X, \theta)$$
Calibrating a model means finding the right values for the parameters. This is further explained in box 1.1.
The large majority of transport models have been built on cross-sectional data. This has led to a
tendency to interpret validation of the model in terms of the goodness-of-fit achieved between
observed behaviour and the base-year predictions. Although this is a necessary condition for model
validation, it is not sufficient. Validation requires comparing the model predictions with information not
used during the model estimation process.
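The difference between goodness-of-fit on the calibration data and validation on information not used for calibration can be sketched as follows. The one-parameter model Y = θ·X, the least-squares criterion and all observations are assumptions made for this example only.

```python
def model(x, theta):
    """Assumed one-parameter model Y = f(X, theta) = theta * X."""
    return theta * x


def calibrate(xs, ys):
    """Least-squares value of theta for the model Y = theta * X."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def rmse(xs, ys, theta):
    """Root mean squared error between model predictions and observations."""
    squared = [(model(x, theta) - y) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared) / len(squared)) ** 0.5


# Made-up observations: X = households in a zone, Y = observed trips.
calibration_x = [100, 200, 300, 400]
calibration_y = [210, 390, 620, 780]
holdout_x = [150, 350]          # zones kept apart, not used for calibration
holdout_y = [320, 690]

theta = calibrate(calibration_x, calibration_y)
fit_error = rmse(calibration_x, calibration_y, theta)      # goodness-of-fit
validation_error = rmse(holdout_x, holdout_y, theta)       # validation
print(theta, fit_error, validation_error)
```

Here the calibrated parameter is 1.99 trips per household, with an error of about 15.6 trips on the calibration zones and about 15.9 trips on the held-out zones. Only the second figure says something about how the model performs on data it has not seen.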
Starting with the modelling task, a modeller has to decide which variables are going to be predicted by
the model and which variables will be required as input to it. Some variables will never enter the model
because the modeller lacks control over them or because the theory behind the model ignores them. This
implies immediately a degree of error and uncertainty, which gets compounded with other errors
inherent to modelling, for example: sampling errors and errors due to the simplification of reality that is
unavoidable to make the model practical. See figure 1.5 for an overview on the modelling process.
Box 1.1 Calibration or estimation?
It is interesting to mention that the twin concepts of model calibration and model estimation have
traditionally taken on different meanings in the transport field. Calibrating a model requires choosing its
parameters, assumed to have a non-null value, in order to optimise one or more goodness-of-fit
measures, which are a function of the observed data. This procedure has been associated with the
physicists and engineers responsible for the first generation of transport models, who did not worry
unduly about the statistical properties of these indices, e.g. how large any calibration error could be.
Estimation involves finding the values of the parameters that make the observed data more likely
under the model specification; in this case one or more parameters can be judged non-significant and left
out of the model. Estimation also considers the possibility of examining empirically certain specification
issues; for example, structural and/or functional form parameters may be estimated.

This procedure has tended to be associated with the engineers and econometricians responsible for the
second generation of models, who placed much importance on the statistical testing possibilities offered
by their methods. However, in essence both procedures are the same because the way to decide which
parameter values are better is by examining certain previously defined goodness-of-fit measures. The
difference is that these measures generally have well-known statistical properties, which in turn allow
confidence limits to be built around the estimated values and model predictions.
Source: J. de D. Ortúzar & L.G. Willumsen, Modelling transport, 1994, p. 19
Figure 1.5 Modelling and sampling

The main use of models in practice is for conditional forecasting (i.e. the model produces estimates of dependent
variables given a set of independent variables). Typical forecasts are conditional in two ways:
q In relation to the values assigned to the policy variables, of which the impact is being tested with the
model
q In relation to the assumed values of other variables
A model is normally used to test a range of alternative plans for a range of possible future values of the
other variables. This means that the model has to be ‘run’ many times to generate all outcomes for the
ranges that are mentioned above. A lot of computing power is needed to guarantee a quick turnaround
time, given that transport models involve complex equilibration processes and contain considerable
amounts of data.
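The number of runs grows with the product of the number of plans and the number of scenarios. A minimal sketch, in which `toy_model`, the plan names, the scenario names and all values are hypothetical stand-ins for a full transport model and its inputs:

```python
from itertools import product

# Hypothetical policy alternatives: assumed reduction in car trips per plan.
plans = {"do_nothing": 0.0, "bus_lane": 0.10, "road_pricing": 0.20}
# Hypothetical scenarios for the other variables: assumed demand growth factor.
scenarios = {"low_growth": 1.05, "medium_growth": 1.20, "high_growth": 1.40}

BASE_CAR_TRIPS = 10_000  # invented base-year value


def toy_model(reduction, growth):
    """Stand-in for a full transport model run: forecast car trips."""
    return BASE_CAR_TRIPS * growth * (1.0 - reduction)


# One model run per (plan, scenario) combination.
results = {
    (plan, scenario): toy_model(plans[plan], scenarios[scenario])
    for plan, scenario in product(plans, scenarios)
}

for (plan, scenario), trips in results.items():
    print(f"{plan:13s} {scenario:14s} {trips:8.0f}")
```

Three plans and three scenarios already require nine runs; with a real model each run involves a full equilibration.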
1.3.2 Aggregate or disaggregate modelling

The level of aggregation selected for the measurement of data is an important issue in the general
design of a transport planning study. Although a greater level of detail – leading to a higher degree of
accuracy – should improve the quality of a forecasting model, the costs of data collection and analysis,
and of most other aspects of the modelling exercise, will probably increase.
Of central interest is the aggregation of exogenous data, that is, information about items other than the
travel behaviour (this is the endogenous or dependent variable, which the model attempts to replicate).
Exogenous data can be seen in many cases as input to the value of the independent variables.

In aggregate or first generation models (such as the trip distribution and modal split models that will be
discussed in chapters 5 and 6), the model at base aims at representing the behaviour of more than one
individual. These models were used up to the late 1970s. They became familiar, demanded relatively few
skills and have the property of offering a ‘recipe’ for the complete modelling process. First generation
models have on the other hand been criticised for their inflexibility, inaccuracy and cost.
Disaggregate or second generation models attempt to represent the behaviour of individuals (e.g.
discrete choice models). They became increasingly popular in the 1980s, and offer substantial
advantages over the traditional methods while remaining practical in many application studies. However,
they demand a higher level of statistical and econometric skills from the analyst than is the case with
aggregate models.
The difference between first and second generation model systems has often been overstated, as the
disaggregate models have been seen as ‘revolutionary’ while eventually it became clear that an
‘evolutionary’ view was more adequate. In many cases there is a complete equivalence between the
models. The difference lies in the treatment of the description of behaviour, particularly during the model
development process. The disaggregate approach is superior in that case.
The issue is whether one of the two approaches is to be preferred, and in what circumstances. It has been
concluded that there is no definitive approach appropriate to all situations; the best
approach therefore needs to be chosen for each situation.
1.3.3 Types of models

The various levels of investigation involved in traffic impact analyses call for a range of traffic models, as
there is no one model that can provide answers for the full range of problems. The hierarchy of traffic
network models (described below) shows which model should be used for which purpose. The different
levels represent modelling in different scales.
The hierarchy starts at the most detailed level.
q Microscopic simulation, of individual units in a traffic stream. For example, for the assessment of
individual vehicle or driver performance at an intersection or along a link.
q Macroscopic flow models, in which the flow units are assumed to behave in some collective
fashion.
q Simulation models of flows in intersection clusters, for the optimisation of network performance
(e.g. delays at traffic signals when the flows on each road section or link are fixed).
q Dense network models, which simulate flows in small-scale networks where the level of flow on
each link can vary in response to changes in the traffic control system and traffic congestion levels.
These models focus on short time periods (e.g. a peak hour).
q Strategic network models, which simulate or optimise network flows in the large-scale networks,
which represent a regional or metropolitan transport system. These models focus on long time
periods (e.g. 24 hour flows).
q Land use impact assessment models, that focus on the extent of changes to new land use
facilities (e.g. a retail centre) and use a rudimentary description of the transport system serving that
facility in predicting its impacts on the surrounding region.
q Sketch planning models, of land use-transport interactions.
The models presented in this lecture note can be used as components of the fourth to seventh models in the
above list (dense network models to sketch planning models).
1.4 The classic four-stage model

Years of experimentation and development have resulted in a general structure which has been called
the classic transport model. It resulted from practice in the 1960s and has remained more or less
unaltered despite major improvements in modelling techniques in the 1970s and 1980s. Figure 1.6
shows this general structure.
The approach starts with considering a network and zoning system (see chapter 3) and the
collection of data (see chapter 2). These data are used to estimate a model of the total number of trips
generated by or attracted to each zone of the study area: the trip generation model (see chapter 4).
The next step is to allocate these trips to particular destinations, so a trip matrix can be produced. This is
called trip distribution (see chapter 5). The following stage is usually the modelling of the choice of
mode, which is called modal split (see chapter 6). The last stage in the classic model requires the
assignment of the trips by each mode to their corresponding networks (see chapter 7).
Figure 1.6 The classic four-stage transport model

Comments on the classic transport model

The classic transport model is presented as a sequence of four sub-models (trip generation, trip
distribution, modal split, and assignment). However, travel decisions are actually rarely taken in this
sequence; a contemporary view is that the location of each sub-model in the sequence depends on the
form of utility function assumed to govern travel choices.
The classic transport model is seen as concentrating on only a limited range of travellers’ responses.
Current thinking requires an analysis of a wider range of responses to transport problems and schemes.
For example, when a trip maker is faced with increased congestion, he can respond with a range of
simple changes to:
q The route followed to avoid congestion or take advantage of new links
q The mode used to get to the destination
q The time of departure to avoid the most congested part of the peak
q The destination of the trip to a less congested area
q The frequency of journeys by undertaking the trip on another day
Alternative methods
Some contemporary approaches attempt to treat simultaneously the choices of trip frequency,
destination and mode of travel, thus collapsing trip generation, distribution and modal split in one single
model. Other approaches emphasise the role of the household activities and the travel choices they
entail: the so-called activity-based models. These are more difficult to cast into the four-stage model,
and they are not yet in operational use. However, these models provide an improved understanding of
travel behaviour and are therefore likely to enhance conventional modelling approaches in the future.
Using the classic transport model

The sequence as described above is the most common one, but not the only possible one. Some studies
have placed modal split between trip generation and trip distribution, to obtain greater emphasis on
decision variables depending on the trip generation unit (e.g. household). However, this makes it difficult
to include attributes of the journey and modes in the modal split model. Therefore it could be better to perform trip
distribution and modal split simultaneously.
It should be noted that the classic model makes trip generation inelastic to the level of service provided
in the transport system. This is probably unrealistic, but only recently have techniques been developed
which can take systematic account of these effects.
Once the model has been calibrated and validated for the base year conditions it must be applied to one
or more planning horizons. Therefore different scenarios and plans should be developed that describe the
transport system and planning variables under alternative futures. After that, the model can be run again
(several times, depending on the number of alternatives) with this new input. A comparison can then be
made, most likely between costs and benefits, of different schemes under different scenarios, from which
the most attractive programme can be chosen. Which programme this is depends on the scenario conditions assumed.
An important issue in the classic four-stage model is the consistent use of variables affecting demand.
For example, at the end of the assignment stage, new flow levels and therefore new travel times are
obtained. These are unlikely to be the same as the travel times assumed when the trip distribution and
modal split models were run. This calls for a re-run of these models, but if the assignment is then
run again, it will again result in a new set of travel times. Repeating
this procedure (iterating) has been found not to lead necessarily to equilibrium where travel times are concerned.
There are methods to find equilibrium in the assignment, which will be discussed in chapter 7. There is therefore a
particular risk of choosing the wrong plan, depending on how many iterations one is prepared to
undertake.
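The consistency problem can be made concrete with a small sketch of a feedback loop between modal split and assignment. It is not the method of chapter 7: it assumes a single origin-destination pair, one car per trip, a binary logit modal split, a BPR-type volume-delay function, and an averaged update known as the method of successive averages. All parameter values are invented.

```python
import math

TOTAL_TRIPS = 3000.0      # trips per hour between one origin and one destination
TRANSIT_TIME = 40.0       # minutes, assumed independent of flow
FREE_FLOW_TIME = 20.0     # minutes by car on an empty road
CAPACITY = 2000.0         # vehicles per hour
BETA = 0.1                # assumed sensitivity to time differences (per minute)


def car_time(car_flow):
    """Assignment stand-in: BPR-type volume-delay function (an assumption)."""
    return FREE_FLOW_TIME * (1.0 + 0.15 * (car_flow / CAPACITY) ** 4)


def car_demand(time_by_car):
    """Modal split stand-in: binary logit on the travel time difference."""
    share = 1.0 / (1.0 + math.exp(BETA * (time_by_car - TRANSIT_TIME)))
    return TOTAL_TRIPS * share


car_flow = 0.0
for k in range(1, 201):
    implied_flow = car_demand(car_time(car_flow))   # re-run modal split
    car_flow += (implied_flow - car_flow) / k       # averaged update

# Difference between the demand implied by the final travel time and the flow
# that produced that travel time; zero means input and output are consistent.
residual = car_demand(car_time(car_flow)) - car_flow
print(round(car_flow), round(car_time(car_flow), 1), round(residual, 3))
```

The residual measures how far the travel time used in modal split still differs from the travel time produced by the assignment. Stopping after only a few iterations leaves a larger residual, which is the source of the risk mentioned above.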
1.5 Limitations of the four-stage model

In this section some limitations of the different sub-models of the four-stage model are
discussed. It provides a critical view of these models, which should not be seen as a barrier to
using them but rather as a way of improving understanding of them.
1.5.1 Limitations of trip generation models

q Independent decisions. Travel behaviour is a complex process where often decisions of one
household member are dependent on others in the household. For example, childcare needs may
affect how and when people travel to work. This interdependency for trip making is not considered.
q Limited trip purposes. With no more than four to eight trip purposes considered, a simplified trip
pattern results. All shopping trips are treated the same whether shopping is done for groceries or
lumber. Home based “other” trips cover a wide variety of purposes – medical, visit friends, banking,
etc. which are influenced by a wider variety of factors than those used in the modelling.
q Limited variables. Trip making is found as a function of only a few variables such as car ownership,
household size and employment. Other factors such as the quality of transit service, ease of walking
or bicycling, fuel prices, land use design and so forth are not typically included.
q Combinations of trips (trip chaining) are ignored. Travellers may often combine a variety of
purposes into a sequence of trips as they run errands and link together activities. This is called trip
chaining and is a complex process. The modelling process treats such trip combinations in a very
limited way. For example, non-home based trips are calculated based only on employment
characteristics of zones and do not consider how members of a household co-ordinate their errands.
q Feedback, cause and effect problems. Trip generation models sometimes calculate trips as a
function of factors that in turn could depend on how many trips there are. For example shopping trip
attractions are found as a function of retail employment, but it could also be argued that the number
of retail employees at a shopping centre would depend on how many people come there to shop. This
'chicken and egg' problem comes up frequently in travel forecasts and is difficult to avoid. Another
example is that trip making depends on car availability, but it could be also argued that the number
of automobiles a household owns would depend upon how active they are in making trips.
1.5.2 Limitations of trip distribution models

q Constant trip lengths: In order for the model to be used as a forecasting tool it must be assumed
that the average lengths of trips that occur now will remain constant in the future. Since trip lengths
are measured by travel time this means that improvements in the transport system that reduce
travel times are assumed to be balanced by a further separation of origins and destinations.

q Use of automobile travel times only to represent 'distance'. The gravity model requires a
measurement of the distance between zones. This is almost always based on automobile travel times
rather than transit travel times and leads to a wider distribution of trips (they are spread out over a
wider radius of places) than if transit times were used. This process limits the ability to represent
travel patterns of households that locate on a transit route and travel to points along that route. This
may be particularly important if a rail transit system is being analysed.
q Limited effect of socio-economic-cultural factors. The gravity model distributes trips only on the
basis of size of the trip ends (trip productions, trip attractions) and travel times between the trip
ends. Thus the model would predict a large number of trips between a high-income residential area
and a nearby low-income employment area or between a Spanish-speaking neighbourhood and a
nearby non-Spanish-speaking neighbourhood. The actual distribution of trips is affected by the nature
of the people and activities that are involved and their socio-economic and cultural characteristics as
well as the size and distance factors used in the model. For example such factors as: differences in
income, crime conditions, and attractiveness of the route are not considered. Furthermore, groups of
travellers might avoid some areas of the city and favour others based on socio-economic-cultural
reasons. Adjustments are sometimes made in the model to account for such factors, but this is difficult
because the effects of such factors on travel are hard to quantify, let alone to predict how they would
change over time.
q Feedback problems: Travel times are needed to calculate trip distribution, but travel times
depend upon the level of congestion on streets in the network. The level of congestion is not known
during the trip distribution step since that is found in a later calculation. Normally what is done is that
travel times are assumed and checked later. If the assumed values differ from the actual values, the
model should be iterated a number of times to get the inputs and outputs of the model to balance.
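The limited set of inputs to the gravity model is visible in a minimal sketch. It assumes a production-constrained form with an exponential deterrence function; the zone data and the parameter `BETA` are invented, and other forms of the model exist.

```python
import math

productions = [1000.0, 500.0]          # trips produced by origin zones 0 and 1
attractions = [800.0, 400.0, 300.0]    # trips attracted by destination zones 0, 1, 2
travel_time = [                        # car travel time in minutes, origin to destination
    [5.0, 15.0, 25.0],
    [20.0, 10.0, 15.0],
]
BETA = 0.1  # assumed deterrence parameter (per minute)


def gravity(productions, attractions, travel_time, beta):
    """Production-constrained gravity model with exponential deterrence."""
    trips = []
    for p_i, times in zip(productions, travel_time):
        # Attractiveness of each destination: size discounted by travel time.
        weights = [a_j * math.exp(-beta * t) for a_j, t in zip(attractions, times)]
        total = sum(weights)
        trips.append([p_i * w / total for w in weights])
    return trips


trip_matrix = gravity(productions, attractions, travel_time, BETA)
for row in trip_matrix:
    print([round(t) for t in row])
```

Nothing but trip end sizes and travel times enters the calculation, so two zone pairs with the same sizes and times receive the same number of trips regardless of who lives or works there.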
1.5.3 Limitations of modal split models

q Mode choice is only affected by time and cost characteristics. An important thing to
understand about mode choice analysis is that shifts in mode usage are predicted to occur
only if there are changes in the characteristics of the modes, i.e. there must be a change in the in-vehicle
time, out-of-vehicle time or cost of the automobile or transit for the model to predict changes
in demand. Thus if one substitutes a light rail transit system for a bus system without changes in
travel times or costs from the bus system, the model would not show any difference in demand.
People are assumed to make travel choices based only on the factors in the model; factors not in the
model will have no effect on the results predicted by the model.
q Omitted factors. Factors that are not included in the model, such as concerns about crime, safety and security,
have no effect. They are assumed to be included as a result of the calibration process.
However, if an alternative has different characteristics for some of the omitted factors, the model will
predict no change. Such effects need to be factored in by hand and require considerable skill and
assumptions.
q Access times are simplified. No consideration is given to the ease of walking in a community and
the characteristics of a waiting facility in the choice process. Strategies to improve local access to
transit or the quality of a place to wait do not have an effect on the models.
q Constant weights. The importance of time, cost and convenience is assumed to remain constant for
a given trip purpose. Trip purpose categories are very broad (e.g. 'shop', 'other'). Differences in the
importance of time and cost within these categories are ignored.
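The first limitation in the list above can be reproduced with a small sketch. It assumes a logit model with a linear utility function of in-vehicle time, out-of-vehicle time and cost; the coefficients and the mode attributes are invented for illustration.

```python
import math

# Assumed coefficients of a linear utility function (invented values).
COEFF = {"in_vehicle_min": -0.03, "out_of_vehicle_min": -0.06, "cost": -0.4}


def utility(attributes):
    """Systematic utility: weighted sum of the attributes in the model."""
    return sum(COEFF[name] * value for name, value in attributes.items())


def logit_shares(modes):
    """Logit shares: share_m = exp(V_m) / sum over k of exp(V_k)."""
    exp_v = {mode: math.exp(utility(attr)) for mode, attr in modes.items()}
    total = sum(exp_v.values())
    return {mode: value / total for mode, value in exp_v.items()}


car = {"in_vehicle_min": 20, "out_of_vehicle_min": 5, "cost": 4.0}
bus = {"in_vehicle_min": 30, "out_of_vehicle_min": 10, "cost": 2.0}
light_rail = dict(bus)  # same times and cost as the bus system it replaces

with_bus = logit_shares({"car": car, "transit": bus})
with_rail = logit_shares({"car": car, "transit": light_rail})
print(with_bus, with_rail)
```

Because the light rail system enters the model with the same three attributes as the bus, the predicted shares are identical. Any difference in comfort, image or reliability would have to be added outside the model.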
1.5.4 Limitations of assignment models

q Intersection delay is ignored. Most traffic assignment procedures assume that delay occurs on the
links rather than at intersections. This is a good assumption for through roads and freeways but not
for highways with extensive signalised intersections. Intersections involve highly complex movements
and signal systems. They are highly simplified in traffic assignment and the assignment process does
not modify control systems in reaching equilibrium. Use of sophisticated traffic signal systems,
freeway ramp meters or enhanced network control of traffic cannot be easily analysed with
conventional traffic assignment procedures.
q Travel only occurs on the network. It is assumed that all trips begin and end at a single point in a
zone (the centroid) and occur only on the links included in the network. Not all roads and streets are
included in the network, nor are all possible trip beginning and end points. The zone/network
system is a simplification of reality and excludes some travel, especially shorter trips. To get total
travel, say for air pollution analysis, a certain percentage of off network travel must be added to
assignment results.
q Capacities are simplified. To determine the capacity of roadways and transit systems requires a
complex process of calculations that consider many factors. In most travel forecasts this is greatly
simplified. Capacity is found based only on the number of lanes of a roadway and its type (freeway or
arterial). Most travel demand models used for large transport planning studies do not consider
factors such as truck movement and highway geometry, which also affect capacity, in their
calculations.
q Time of day variations. Traffic varies considerably throughout the day and during the week. The
travel demand forecasts are made on a daily basis for a typical weekday and then converted to peak
hour conditions. Daily trips are multiplied by an "hour adjustment factor", for example 10%, to
convert them to peak hour trips. The number assumed for this factor is very critical. A small
variation, say plus or minus one percent, will make a large difference in the level of congestion that
would be forecast on a network.
q Emphasis on peak hour travel. As described above, forecasts are done for the peak hour on a
typical weekday. A forecast for the peak hour of the day does not provide any information on what is
happening in the other 23 hours of the day. The duration of congestion beyond the peak hour, i.e. peak
spreading, is not determined. In addition travel forecasts are made for an 'average weekday'.
Variation in travel by time of year or day of the week is usually not considered.
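The sensitivity to the hour adjustment factor mentioned under time of day variations can be checked with a short calculation. The sketch assumes a single link and a BPR-type volume-delay curve; the daily volume, capacity and free-flow time are invented.

```python
DAILY_TRIPS = 20_000      # vehicles per day on one link (invented)
CAPACITY = 2_000          # vehicles per hour (invented)
FREE_FLOW_TIME = 10.0     # minutes


def peak_hour_time(factor):
    """Peak-hour travel time on the link for a given hour adjustment factor."""
    peak_flow = DAILY_TRIPS * factor
    ratio = peak_flow / CAPACITY
    return FREE_FLOW_TIME * (1.0 + 0.15 * ratio ** 4)   # assumed BPR-type curve


for factor in (0.09, 0.10, 0.11):
    delay = peak_hour_time(factor) - FREE_FLOW_TIME
    print(f"factor {factor:.2f}: v/c = {DAILY_TRIPS * factor / CAPACITY:.2f}, "
          f"delay = {delay:.2f} min")
```

Under these assumptions, moving the factor from 9% to 11% more than doubles the forecast delay, because delay rises steeply once the volume approaches capacity.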

2 Data collection³
Data are an essential component of transport modelling. It is therefore important that a transport
planner has at least some knowledge of data collection methods. Even if he will never perform a data
collection survey, the planner needs this knowledge to interpret the collected data in the right way. It
also expands his knowledge about transport systems in general.
Data are needed for three main purposes:

q Description of the present situation, often called the base-year situation
q Input to development and use of transport models
q Monitoring the effects of the implementation of policies, strategies and investments
The data that are generally required for transport studies can be subdivided into the categories of
supply and demand:
Supply data
q Capacity (function of number of lanes or public transport vehicles)
q Design speed
q Type of service provided (e.g. freeway, local road, express-bus service, train service)
q Use restrictions (e.g. turn prohibitions, parking permitted or prohibited, operation only in peak)
q Parking places
Demand data
q Volumes of use by time of day, trip purpose, means of travel and specific location
q Current actual speed (peak and off-peak)
q Costs and times experienced by users, by time of day or by origin-destination locations
q Attributes of users that relate to levels of use and methods of use (e.g. income, age, car ownership,
household size, working status)
It will not be possible to collect all these types of data in just one survey, because of the difference in
survey methods, survey instruments and sampling procedures.
This chapter describes methods by which demand data can be collected. It starts with an introduction to
survey planning. The next section is on sampling methods, followed by a section on errors in data
collection and modelling. The last section is on data collection methods. The following chapters
describe how the data are analysed for use in the submodels of the traditional four-stage
transport model.
2.1 Introduction to survey planning

Data collection is expensive and time consuming. Therefore it is necessary to pay sufficient attention to
planning, designing and conducting traffic surveys, so the amount and type of data to be collected is
clear. The stages of traffic data collection, presented in figure 2.1, are explained below.
The existence of feedback loops in this figure indicates that survey
design (ideally) is not purely a sequential process. For example, the survey instruments may have to be
modified as an outcome of a pilot survey.
3 Based on: P. R. Stopher, ‘Survey and sampling strategies’, In: D.A. Hensher & K.J. Button (eds.) Handbook of
transport modelling, 2000, p. 229-250; M. Taylor, W. Young & P. Bonsall, Understanding traffic systems: Data, analysis
and presentation, 1996, p. 129-156, 247-266; J. de D. Ortúzar & L. G. Willumsen, Modelling Transport, 1994, p. 55-108

Figure 2.1 Stages in the design and conduct of a traffic survey

Source: Taylor, Young and Bonsall (1996)
Objectives
At the start of the data collection exercise it is necessary to define the objectives of the survey, as they
can be seen as the starting point of the survey. Questions to be asked to define the objectives are, for
example:
q Is the survey required as part of an ongoing monitoring process or an ad-hoc investigation?
q Are the results supposed to relate to a specific place or are general results sought?
q What hypotheses are to be tested?
q What level of disaggregation is required?
Availability of existing data

When the objectives are defined, it is worth considering whether relevant data are already available. Data
previously collected may remove or reduce the need to collect further data. Sometimes updating
techniques may be applied to correct existing data sets for current (base-year) situations. Furthermore
these data may help to define the required sample sizes or other data collection items. In that case the
pilot survey may no longer be necessary.
Specification of requirement for new data

From the objectives and the availability of existing data it may be possible to distil the requirements
for new data. However, a distinction has to be made between the data essential for the survey and
precisely specified, and the data that may be optional and less precisely defined. One should not be
tempted to include a lot of redundant items in the specification ‘just in case’ they turn out to be useful.
Of course, the available resources also will restrict this. Together with the knowledge of available
resources, the survey instrument can be selected and the sampling strategy designed.
Available resources
Examples of resources are time, people and money. Usually, these resources are a constraint on the
specification of the survey. Compromises often have to be made between what the analyst ideally
wants, and what can actually be afforded. A typical household survey, for example, may easily involve
hundreds of interviews, requiring a lot of labour and therefore money.
Choice of survey instrument

The choice of a survey instrument is in some cases quite simple, e.g. when there exists just one
procedure or piece of equipment that can do the specified task. It is however more common that there
are several alternative procedures or pieces of equipment available to do the job. Which procedure or
equipment is to be used will be judged by considering the weaknesses and strengths of these methods in
the light of the specifications made earlier in the process.
Design of sample
The sample design is interrelated with the choice of survey instrument. For example, a certain
instrument might be chosen that needs a minimum number of observations to reduce measurement
error, which determines the sample size. On the other hand, if the sample requires data recorded every
second, manual measurement techniques are inadequate.
More about sample design is discussed in section 2.2.
Survey plan
The initial survey plan will be based on the decisions taken on survey instrument and sample design. It
will also include operational/procedural aspects such as the recruitment of staff, acquisition of
equipment and the schedule of key events (figure 2.2).
Figure 2.2 Typical survey schedule

Source: Taylor, Young & Bonsall (1996)
Each survey schedule is unique, and has to be drawn up as a critical path flow chart. The job is about
fitting the required steps around fixed dates and other constraints. These will include the latest date by
which results are required, the window of opportunity for the survey (e.g. in case seasonal factors play
a role) and constraints in resources (staff, equipment). Each survey also has its own additional
constraints.
Although everyone would like schedules to include time for unforeseen circumstances, this will almost
always be impossible because of the compromise that has to be made between the ideal survey plan and
the plan according to which data are delivered on the agreed date.
Pilot survey
The pilot survey is an important element in the survey plan. In the pilot survey the survey instruments
and the associated procedures can be tested. It may be reduced in scope when standard procedures and
equipment are being used, but when innovations are introduced a pilot survey is vital for the success of
the survey. The piloting can be done at different levels, but in all cases it is necessary to reserve
sufficient time and resources for revision and redesign of the survey plan. In the worst cases the pilot survey
may lead to major adjustments to the survey plan, or even to its abandonment. It is of course better to
reach that conclusion at this stage of the survey than later.
Conduct of main survey, data processing and archiving

When the pilot survey has shown the techniques and procedures to work, the main survey can be
conducted and followed by data processing as required. It is important that all procedures used are
carefully archived along with other information relevant to the conduct of the survey, in particular factors that
could have affected the data (e.g. weather conditions).
In addition to these procedures it is also good practice to archive the raw data. This can be useful if,
for example, the processing was potentially subject to error. Storing raw data can be quite costly, so a limit
is usually put on the storage time. However, this should not be the case with the other elements
of the survey report, since they are indispensable for correct interpretation of the results.
2.2 Sampling methods

There are two ways to collect data on a population:
q Observe every member of a population
q Observe every member of a sample of this population
Sampling is used when it is not economically (or sometimes technically) feasible to observe an entire
population. However, the problem then arises how to expand the data in the sample to data valid for
the entire population. So there are two difficulties:
q How to ensure a representative sample
q How to extract valid conclusions from a sample
In this section sample design will be explained, along with the description of sampling methods
applicable to transport studies.
2.2.1 Sample design

The stages of sample design are described below.
Target population
The target population is the population that is of interest for the given study area and from which the
sample has to be drawn. This can be a population that is directly influenced by changes to be made to
the transport system, but it can also be a population outside the area of interest, which will be used as
a comparison to rule out the effects that have nothing to do with the proposed changes.
It should be investigated (e.g. by means of a pilot survey) if there are important subgroups in the
population for which the effects are significantly different from the rest of the population.
Sampling unit
The definition of the sampling unit depends primarily on the nature and purpose of the study, but may
also be constrained by practical considerations involved in collecting the required data. A population
consists of individuals, or individual items, such as persons or vehicles. Sampling units can be
individuals, but also more aggregate units like households, buses or geographical areas.
Sampling frame
The sampling frame is a list that contains all members of the target population, so that
the actual size of the population can always be determined from it. An example of a sampling frame is a list of
all vehicles registered in a certain area.
Sampling method
There are two main methods in sampling: random sampling and judgement sampling. In random
sampling all members of the target population have the same chance of being chosen for the sample.
Judgement sampling uses personal knowledge, expertise and opinion to identify sample members. Judgement samples
have a certain convenience and can be used in case studies, for example. However, they cannot
represent the target population because they have no statistical meaning. They can be used in pilot
surveys to examine the possible extremes of outcomes using minimal resources.
In random sampling there are four basic methods available:
q Simple random sampling
q Stratified random sampling
q Cluster sampling
q Systematic sampling
These methods are described further in section 2.2.2.

Sampling error and bias
Sampling error is due to the fact that we are dealing with a sample, and not with the complete
population. This means that this error will always be present. It does not affect the expected values of
the means, but the variability around them. It is a function of sample size, and the inherent variability
of the parameter under investigation. Enlarging the sample size reduces the error, but can never
eliminate it.
Sampling bias is caused by mistakes in defining the target population, in selecting the sampling method
or in any other stage of sample design. It differs from sampling error in two ways: it affects the mean
value of the estimated parameter as well as the variability around it, and it can be eliminated by taking
care during the sample design and data collection stages.
The two errors described above combined contribute to the measurement error of the data. More about
errors in section 2.3.
Sample size
As seen above, the reliability of a sample increases with its size. Since a larger sample also implies
higher costs, a trade-off has to be made: an optimum sample size has to be found at which both
reliability and costs are reasonable.
This sample size depends on three factors:

q Variability of the parameters in the target population
q Degree of accuracy required for each parameter
q Population size (this will only affect sample size in case of small populations)
In box 2.1 the determination of sample size is explained.
Requirements for sample design

The success of a statistical investigation depends largely on the size and representativeness of the
sample used in the traffic survey. For useful results, the investigator needs to:
q Define the precise aims of the survey
q Define the target population
q Avoid introducing unquantified bias into the data
q Specify the parameters to be estimated and the desired level of accuracy
q Ensure the sample is of sufficient size and truly represents the target population
2.2.2 Sampling methods

As stated in the previous section, all sampling methods applicable to transport studies are random
sampling methods. In this section the four basic methods are explained further.
Simple random sampling

The sample is selected by a method that gives each possible sample the same probability of being
chosen. Each individual in the population is associated with a unique number (sequential from 1 to N,
where N is the population size). The sample is then selected by drawing random numbers from this
range. Tables of random digits are available for this purpose.
The sampling can be performed “with replacement” (i.e. a population member may be chosen more
than once) or “without replacement”. In the latter case, the selection probabilities change slightly from
one draw to the next, because the remaining population shrinks. However, with large populations and
relatively small samples, this effect is minimal.
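As a minimal sketch of the two variants, assume the sampling frame is simply a list of household
numbers. The frame size, sample size and seed below are invented for illustration, and Python's
standard `random` module stands in for a table of random digits:

```python
import random

# Hypothetical sampling frame: households numbered 1..N (illustrative only).
N = 1000
frame = list(range(1, N + 1))
n = 50

rng = random.Random(42)  # fixed seed so the draw can be repeated

# Without replacement: no household can appear twice in the sample.
sample_without = rng.sample(frame, n)

# With replacement: the same household may be drawn more than once.
sample_with = rng.choices(frame, k=n)

print(len(set(sample_without)))  # always n distinct households
print(len(set(sample_with)))     # at most n distinct households
```

With N much larger than n, repeated draws in the second variant are rare, which is why the difference
between the two variants is usually negligible in practice.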
Systematic sampling
Systematic sampling is a simple and convenient method of selecting a pseudo-random sample. First a
random starting point in the sampling frame is chosen, and from this starting point every nth element in
the sampling frame is selected. For example, when the sample should contain 10% of the population,
the starting point is selected at random from the first 10 elements, and after that every 10th element
is selected.
Systematic sampling has two advantages over simple random sampling:
q It is quick and demands only limited resources
q It can easily be applied by unskilled workers

Box 2.1: Determination of sample size
The Central Limit Theorem (CLT, from Statistics theory) postulates that the estimates of the mean tend
to become distributed Normal as the sample size (n) increases. This holds if n>30, or if the population
has a Normal-like distribution.
Consider a (target) population of size N that is distributed with mean µ and variance $\sigma^2$. The CLT
states that the mean $\bar{x}$ of successive samples is distributed Normal with mean µ and standard
deviation $se(\bar{x})$, the standard error of the mean, given by:

$$se(\bar{x}) = \sqrt{\frac{(N - n)\,\sigma^2}{n\,(N - 1)}}$$

If only one sample is considered, the best estimate of µ is $\bar{x}$ and the best estimate of $\sigma^2$ is
$S^2$ (the sample variance), which leads to the estimated standard error:

$$se(\bar{x}) = \sqrt{\frac{(N - n)\,S^2}{n\,N}}$$

This is a function of three factors: N, n and S². For large populations and small sample sizes, which is
mostly the case, the factor (N-n)/N is very close to 1, which reduces the function to:

$$se(\bar{x}) = \frac{S}{\sqrt{n}}$$

This means that, for example, quadrupling the size of the sample will only halve the standard error. The
required sample size may now be estimated using the last two equations, first calculating n' from the
last one:

$$n' = \frac{S^2}{se(\bar{x})^2}$$

Then correcting it for finite population size, if necessary:

$$n = \frac{n'}{1 + \dfrac{n'}{N}}$$
Although the above is quite objective, two important problems make it less straightforward in practice.
First, the sample variance S² can only be calculated once the sample has been drawn. It therefore has
to be estimated from other sources.
Second, an acceptable level for the standard error has to be chosen. This is related to the desired
degree of confidence to be associated with the use of the sample mean as an estimate of the population
mean. Confidence is in practice specified as an interval around the mean, therefore there are two
judgements needed to calculate an acceptable standard error:
- A confidence level for the interval must be chosen (a confidence level of 95% means that wrongly
accepting the sample mean as the true mean occurs in 5% of the cases).
- It is necessary to specify the limits of the confidence interval around the mean, either in absolute or
relative terms.
For more information on this subject the reader is referred to statistics theory.
Based on: J. de D. Ortúzar & L.G. Willumsen, Modelling Transport, 1994, p. 58-59
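The two-step calculation in box 2.1 can be sketched in a few lines of Python. The function name, the
rounding up to a whole number of observations, and the example values (S = 3.0 trips per household per
day, a target standard error of 0.25, a population of 1000 households) are assumptions made here for
illustration; they do not come from the box.

```python
import math

def required_sample_size(S, se_target, N=None):
    """Sample size following box 2.1.

    S         : estimate of the sample standard deviation (from other sources)
    se_target : acceptable standard error of the mean
    N         : population size; if None, no finite-population correction is applied
    """
    n_prime = S**2 / se_target**2              # n' = S^2 / se(x)^2
    if N is not None:
        n_prime = n_prime / (1 + n_prime / N)  # n = n' / (1 + n'/N)
    return math.ceil(n_prime)                  # round up (assumption: be conservative)

print(required_sample_size(3.0, 0.25))          # 144
print(required_sample_size(3.0, 0.25, N=1000))  # 126
```

In this example the finite-population correction lowers the requirement from 144 to 126 households;
for a population of several hundred thousand the correction would be negligible.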

Systematic sampling is called pseudo-random because the sample is completely defined once the first
element has been selected. Initially every element has the same probability of being included in the
sample, but not every possible sample has an equal probability of being selected. Bias could be
introduced if some surveyed parameters correlate with the order of elements in the sampling frame, for
example registration plates or age of motor vehicles.
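A minimal sketch of systematic sampling, assuming the sampling frame is an ordered Python list (the
vehicle labels, the frame size of 200 and the seed are invented for illustration):

```python
import random

def systematic_sample(frame, step, rng):
    """Pick a random start among the first `step` elements, then take every step-th element."""
    start = rng.randrange(step)  # index 0 .. step-1
    return frame[start::step]

# Hypothetical sampling frame of 200 registered vehicles.
frame = [f"vehicle-{i:04d}" for i in range(1, 201)]
sample = systematic_sample(frame, step=10, rng=random.Random(1))
print(len(sample))  # 20, i.e. 10% of the frame
```

The sketch makes the pseudo-random character visible: with a step of 10 there are only 10 possible
samples, one for each starting point, whereas simple random sampling could produce any combination of
20 vehicles.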
Stratified random sampling

This method requires division of the target population into relatively homogenous groups, or strata. This
can be done, for example, by income, age or car ownership. These cases are about a certain property of
the population. There is also another way of dividing the population, for example by choice of transport
mode. This is called a choice based sample, which is a subset of stratified random sampling.
There are two methods to draw a stratified random sample:

q Draw a simple random sample of a specified size from each stratum, corresponding to the
proportion of that stratum in the target population
q Draw equal-sized samples from each stratum and weight the results according to the proportion
of that stratum in the target population
The advantage of stratified random sampling is that differences between subgroups of the population
can be recognised, which is not the case with simple random sampling.
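The two ways of drawing a stratified sample can be sketched as follows. The population of 1000
households, its split into 600 households without a car and 400 with a car, the total sample size of
100 and the seed are all invented for illustration.

```python
import random

rng = random.Random(7)

# Hypothetical population: (household id, stratum); the shares are invented.
population = [(i, "no car") for i in range(600)] + [(i, "car") for i in range(600, 1000)]

# Group the population by stratum.
strata = {}
for household in population:
    strata.setdefault(household[1], []).append(household)

total = len(population)
n = 100  # total sample size

# Method 1: proportional allocation, each stratum sampled according to its population share.
proportional = {name: rng.sample(members, round(n * len(members) / total))
                for name, members in strata.items()}

# Method 2: equal-sized samples plus weights that restore the population shares.
per_stratum = n // len(strata)
equal = {name: rng.sample(members, per_stratum) for name, members in strata.items()}
weights = {name: (len(members) / total) / (per_stratum / n)
           for name, members in strata.items()}

print({name: len(s) for name, s in proportional.items()})  # {'no car': 60, 'car': 40}
print(weights)
```

In the second method each response from the "no car" stratum counts for more than one household in
the results and each response from the "car" stratum for less, so that the weighted sample again has
the 60/40 split of the population.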
In box 2.2 an example is given to calculate the probabilities to find an individual with certain properties
in a sample drawn from a given population using different sampling methods.
Cluster sampling
This method requires division of the target population into clusters, from which a random sample will
be drawn. For example, a random selection of streets (read: clusters) can be made from a municipality
for a trip generation survey, and then each household in those streets will be surveyed. This is
convenient in case of the distribution of a mail questionnaire or a household interview survey.
The assumption is made here that there would be no bias in selecting some streets to represent the
municipality. However, there could be a bias due to misrepresenting socio-economic groups, age of
housing, etc.
Both stratified random sampling and cluster sampling divide the target population in well-defined
groups. The difference is that stratified random sampling should be used when each group has small
internal variations, but there is a wide variation between the groups. Cluster sampling should be used
when the groups have a considerable internal variation, but the groups have essentially the same
characteristics.
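The street example can be sketched as a two-step procedure: draw the clusters at random, then survey
every household inside them. The municipality of 40 streets, the number of households per street and
the choice of 8 streets are invented for illustration.

```python
import random

rng = random.Random(3)

# Hypothetical municipality: 40 streets (clusters), each with 5 to 30 households.
streets = {f"street-{s:02d}": [f"s{s:02d}-h{h:02d}" for h in range(rng.randint(5, 30))]
           for s in range(40)}

# Step 1: draw a simple random sample of clusters (streets).
chosen_streets = rng.sample(sorted(streets), 8)

# Step 2: survey every household in the chosen streets.
surveyed = [household for street in chosen_streets for household in streets[street]]

print(len(chosen_streets), len(surveyed))
```

Unlike the stratified case, the number of households finally surveyed is not fixed in advance: it
depends on which streets happen to be drawn.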
Box 2.2: Example sampling methods
Assume that for the purposes of a transport study the population of a certain area has been classified
according to two income categories, and that there are only two means of transport available (car and
bus) for the journey to work. Let us also assume that the population distribution is given by:
            Low income   High income   Total
Bus user    0.45         0.15          0.60
Car user    0.20         0.20          0.40
Total       0.65         0.35          1.00
1. Random sample. If a random sample is taken, it is clear that the same population distribution
would be obtained.
2. Stratified sample. Consider a sample with 75% low income (LI) and 25% high income (HI)
travellers.

From the previous table it is possible to calculate the probability of a low-income traveller using bus, as:

$$P(\text{Bus} \mid \text{LI}) = \frac{P(\text{LI \& Bus})}{P(\text{LI \& Bus}) + P(\text{LI \& Car})} = \frac{0.45}{0.45 + 0.20} = 0.692$$

Now, given the fact that the stratified sample has 75% of individuals with low income, the probability of
finding a bus user with low income in the sample is 0.75 x 0.692 = 0.519. Proceeding analogously, the
following table of probabilities for the stratified sample may be built:

            Low income   High income   Total
Bus user    0.519        0.107         0.626
Car user    0.231        0.143         0.374
Total       0.750        0.250         1.000
3. Choice based sample. Let us assume now that we take a sample of 75% bus users and 25% car
users. In this case, the probability of a bus user having low income may be calculated as:

$$P(\text{LI} \mid \text{Bus}) = \frac{P(\text{LI \& Bus})}{P(\text{LI \& Bus}) + P(\text{HI \& Bus})} = \frac{0.45}{0.45 + 0.15} = 0.75$$

Therefore, the probability of finding a low-income traveller choosing bus in the sample is 0.75 x 0.75 =
0.563. Proceeding analogously, the following table of probabilities for the choice-based sample may be
built:

            Low income   High income   Total
Bus user    0.563        0.187         0.750
Car user    0.125        0.125         0.250
Total       0.688        0.312         1.000
Source: J. de D. Ortúzar & L.G. Willumsen, Modelling Transport, 1994, p. 62-63
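The calculations in box 2.2 follow one pattern: take the conditional probability within each group and
multiply it by the share of that group in the sample. A short sketch of that pattern, in which the
function name `resample` and the dictionary layout are choices made here for illustration, reproduces
the figures of both tables to within rounding:

```python
# Population shares from box 2.2: joint probabilities of (mode, income).
population = {
    ("bus", "low"): 0.45, ("bus", "high"): 0.15,
    ("car", "low"): 0.20, ("car", "high"): 0.20,
}

def resample(joint, axis, shares):
    """Joint distribution of a sample whose margin along `axis` is fixed to `shares`.

    axis = 0 fixes the mode shares (choice-based sample),
    axis = 1 fixes the income shares (stratified sample).
    """
    margin = {}
    for key, p in joint.items():
        margin[key[axis]] = margin.get(key[axis], 0.0) + p
    # Conditional probability within each group, times the group's share in the sample.
    return {key: shares[key[axis]] * p / margin[key[axis]] for key, p in joint.items()}

stratified = resample(population, axis=1, shares={"low": 0.75, "high": 0.25})
choice_based = resample(population, axis=0, shares={"bus": 0.75, "car": 0.25})

for name, table in [("stratified", stratified), ("choice-based", choice_based)]:
    print(name, {key: round(p, 3) for key, p in table.items()})
```

Neither sample reproduces the population shares of 0.45, 0.15, 0.20 and 0.20, which is why results
from stratified and choice-based samples have to be weighted before they are expanded to the
population.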
2.3 Errors in data collection and modelling

The statistical procedures used for (travel demand) modelling assume that:
q The correct functional specification of the model is known in advance
q The data used to estimate the model parameters have no errors
However, it often occurs that these conditions are not satisfied. And even if they were, the problem
remains that model forecasts are usually subject to errors due to inaccuracies in the values of
explanatory variables in the design year.
Models are often built with the aim of forecasting demands in the (near) future. The trade-off has to be
made between model complexity and data accuracy to fit within the required forecasting precision and
the study budget. Two types of errors should therefore be distinguished:
q Errors that could cause even correct models to yield incorrect forecasts
q Errors that actually cause incorrect models to be estimated
Section 2.3.1 describes the different types of error that can arise while building, calibrating and
forecasting with models. Section 2.3.2 discusses the trade-off between model complexity and data
accuracy.

2.3.1 Types of error
Measurement errors
These errors arise from inaccurate measurement of data in the base year. Examples are questions badly
registered by the interviewee, network measurement errors, and coding and digitising errors. These
errors tend to be higher in developing countries. Improving the data-collection effort or allocating
more resources to data quality control can reduce them, but this is more expensive.
Another source of error is the difficulty in defining the variables to be measured. This is, however,
not a measurement error. The same holds for perception error: the error caused by the fact that the
model is based on the perceptions of the users, while it is not known what these users will perceive in
the future.
Sampling errors
These errors are, as explained before, due to using a sample instead of the entire population. Increasing
sample size can reduce sampling error. However, the sample size must be quadrupled to halve the error.
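Written out with the simplified standard error from box 2.1 (a sketch that ignores the
finite-population correction), the required increase in sample size for any reduction factor k follows
directly:

$$se(\bar{x}) = \frac{S}{\sqrt{n}} \quad\Rightarrow\quad \frac{S}{\sqrt{k^2 n}} = \frac{1}{k}\,\frac{S}{\sqrt{n}}$$

Cutting the sampling error by a factor k therefore requires a sample k² times as large; k = 2 gives
the factor of four.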
Computational errors
Models are generally based on iterative procedures. In case of complex models the exact solution is
often not found because of the computational cost. This gives rise to computational errors. In most
cases these errors are small in comparison with other errors, except for cases such as assignment to
congested networks, as will be seen later in chapter 7, or equilibration between supply and demand in
complete model systems.
Specification errors
These arise either because the phenomenon being modelled is not well understood or because it needs
to be simplified for whatever reason. Important subclasses of this type of error are the following:
q Inclusion of an irrelevant variable
q Omission of a relevant variable
q Exclusion of taste variations on the part of the individuals
q Other specification errors: the use of model forms which are not appropriate (e.g. linear functions
representing non-linear effects)
Increasing model complexity can reduce all specification errors, but only at increasing cost. It has to
be accepted that specification errors may be present in all feasible models.
Transfer errors
These occur when a model developed in one context of time and/or place is applied in a different one.
Adjustments can be made, but it can always be the case that behaviour is different in another context.
In case of a spatial transfer (using the same model in another place), errors can be reduced or
eliminated by partial or complete re-estimation of the model in the new context, although complete
re-estimation would mean that there is no cost advantage in using an existing model.
In case of a temporal transfer (using the same model for future situations), re-estimation is not possible
because of the lack of data, which means that any error must be accepted.
Aggregation errors
These arise out of the need to use groups of people instead of individuals to make forecasts. Behaviour
would have been captured better if modelling were done on the individual level. Important subclasses of
this type of error are the following:
q Data aggregation (this causes some form of specification error because population averages are
used instead of individual values)
q Aggregation of alternatives
q Model aggregation
2.3.2 Model complexity versus data accuracy

As mentioned earlier in this section, there has to be a trade-off between model complexity and data
accuracy. The level of complexity of a model influences the precision of the forecasts it produces, and
the level of complexity that can be supported is determined by the accuracy of the data. Increasing
accuracy costs money, so the aim is to find an optimum combination of model complexity and data
accuracy given the study budget.

In box 2.3 the calculation of the influence of errors in input variables on model accuracy is explained.
Complexity can be defined as an increase in the number of variables and/or in the number of algebraic
operations on the variables. The specification error ($e_s$) will decrease with increasing complexity.
However, there is more data to be measured, so the measurement error ($e_m$) will increase.
If the total modelling error is defined as $E = \sqrt{e_s^2 + e_m^2}$, it can be seen that the minimum
of E does not necessarily lie at the point of maximum complexity. This is shown in figure 2.3. The
figure also shows that the larger the measurement error, the more the total error grows as complexity
increases.
Figure 2.4 illustrates that if data are not of good quality (which is often the case in developing or
poor countries), it might be safer to predict with simpler and more robust models. However,
better-specified models are always preferable.
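A toy calculation can make the trade-off concrete. The functional forms below, a specification error
a / c that falls with complexity c and a measurement error b * c that grows with it, are assumptions
chosen purely for illustration, as are the values of a and b; the notes do not prescribe any
particular form.

```python
import math

def total_error(c, a=10.0, b=0.4):
    """E = sqrt(e_s^2 + e_m^2) with assumed e_s = a / c and e_m = b * c."""
    e_s = a / c   # specification error falls with complexity (assumed form)
    e_m = b * c   # measurement error grows with complexity (assumed form)
    return math.sqrt(e_s**2 + e_m**2)

levels = range(1, 11)  # complexity levels 1..10, arbitrary units

best_good_data = min(levels, key=lambda c: total_error(c, b=0.4))
best_poor_data = min(levels, key=lambda c: total_error(c, b=1.6))
print(best_good_data, best_poor_data)  # 5 3
```

Under these assumptions the total error is smallest at an intermediate level of complexity, and with
poorer data (a larger b) the optimum moves towards a simpler model.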
Box 2.3: The influence of errors in input variables on model accuracy
Consider the observed variables $x_i$ with associated errors $e_{x_i}$ (standard deviations). The output
error derived from the propagation of error in a function such as:

$$z = f(x_1, x_2, x_3, \ldots, x_n)$$

can be found with the following formula:

$$e_z^2 = \sum_i \left( \frac{\partial f}{\partial x_i} \right)^2 e_{x_i}^2 + \sum_i \sum_{j \neq i} \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j} \, e_{x_i} e_{x_j} r_{ij}$$

In this formula $r_{ij}$ is the coefficient of correlation between $x_i$ and $x_j$. The formula is exact
for linear functions, and a reasonable approximation in other cases.
It is clear from this formula that not using correlated variables can reduce error.
The partial derivative of $e_z$ with respect to $e_{x_i}$ (ignoring the correlation term) is:

$$\frac{\partial e_z}{\partial e_{x_i}} = \left( \frac{\partial f}{\partial x_i} \right)^2 \frac{e_{x_i}}{e_z}$$

This yields the marginal improvement rate per variable. Along with the estimated marginal costs of
enhancing data accuracy it should be possible to determine an optimum improvement budget. However, it
is not that simple because these marginal costs are not linear but proportionate to the amount of error
reduction.
From the above derivative two rules can be deduced that can be used in determining which variable
should be improved to achieve the largest reduction of total error:
q Concentrate the improvement effort on those variables with a large error
q Concentrate the effort on the most relevant variables, i.e. those with the largest value of
$\partial f / \partial x_i$, as they have the largest effect on the dependent variable
Based on: J. de D. Ortúzar & L.G. Willumsen, Modelling Transport, 1994, p. 70-71
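The propagation formula and the marginal improvement rate of box 2.3 can be evaluated numerically. The
sketch below assumes a hypothetical linear model z = 2 x1 + 0.5 x2 with input errors of 3 and 8; the
model, the error values and the function name are invented for illustration.

```python
import math

def propagated_error(grads, errors, corr=None):
    """Output error e_z from input errors, following the formula in box 2.3.

    grads[i]   : partial derivative df/dx_i at the point of interest
    errors[i]  : standard deviation e_xi of input x_i
    corr[i][j] : correlation coefficient r_ij (None means uncorrelated inputs)
    """
    k = len(grads)
    var = sum((grads[i] * errors[i]) ** 2 for i in range(k))
    if corr is not None:
        var += sum(grads[i] * grads[j] * errors[i] * errors[j] * corr[i][j]
                   for i in range(k) for j in range(k) if j != i)
    return math.sqrt(var)

# Hypothetical linear model z = 2*x1 + 0.5*x2, so the partial derivatives are constants.
grads = [2.0, 0.5]
errors = [3.0, 8.0]

e_z = propagated_error(grads, errors)
# Marginal improvement rate per variable: (df/dx_i)^2 * e_xi / e_z
rates = [g**2 * e / e_z for g, e in zip(grads, errors)]
print(round(e_z, 3), [round(r, 3) for r in rates])
```

Although x2 has the larger error in this example, x1 has the higher marginal improvement rate because
of its larger partial derivative. The two rules of the box therefore have to be weighed against each
other, not applied separately.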

Figure 2.3 Variation of error with complexity
Figure 2.4 Influence of the measurement error

2.4 Data collection

In this section, methods of data collection are discussed. The method to be used depends on practical
limitations and the type of data that is needed. Furthermore, a particular data collection strategy can
be chosen, which also depends on the purpose of the model as a whole.
Practical limitations
q Length of the study. This determines how much time and money can be devoted to data collection.
q Study horizon. Whether the design year is near or far away may determine the type of survey that
will be used.
q Limits of the study area. Formal political boundaries (county or district boundaries) should be
ignored in favour of the whole area of interest.
q Study resources. These have to be known in advance to determine which method is to be used.
Questionnaire respondents must also be seen as a resource!

Data needs in transport modelling

The data needs are determined by the type of model the data will be used for:
q Land use inventory. Needed for: trip generation models.
q Socio-economic information. Needed for: trip generation and modal split models.
q O-D travel surveys and traffic counts. Needed for: trip distribution models and calibration.
q Infrastructures and existing services inventory. Needed for: assignment models and calibration.
The emphasis in this section will be on data collection on the demand side, more specifically on O-D
travel surveys. For this purpose the study area has to be defined. The external boundary is known as the
external cordon. The area within this cordon has to be divided into (internal) zones. More about how
this should be done is discussed in chapter 3. The area outside the cordon is also divided into
(external) zones, which are substantially larger than the internal zones. Inside the study area there
can be internal cordons and screen-lines. A screen-line is an artificial divide following a natural or
artificial boundary with few crossings (e.g. a river or a railway track).
Data collection strategies

Data collection strategies can be classified in two ways:
q Cross-section and Time Series
q Revealed and Stated Preference
Cross-sectional data are collected at a single point in time while, for example, longitudinal data are
collected at different points in time, using the same sample every time. The methods of data collection
can be the same for both strategies. Longitudinal data are discussed further in section 2.4.5.
Revealed preference data are the observed choices and decisions of the travellers, while stated
preference data are collected using the response to hypothetical choices. Of course, the latter can only
be done with interviews or questionnaires. More about stated preference surveys is discussed in section
2.4.4.
2.4.1 Household-based survey

A household-based survey is the most expensive and difficult type of O-D survey. However, such surveys
yield a lot of useful data and are therefore widely used. The considerations below may also apply to
other types of O-D survey.
General considerations
It is widely recognised that the procedures and measuring instruments used for data collection influence
the data collection results, and should therefore be included in the survey planning process. Of course,
each method has its shortcomings and criticisms. For household-based surveys, some of the frequent
criticisms are:
q The surveys measure average rather than actual travel behaviour of individuals
q Only part of the individual’s movements can be investigated
q Information is often poorly estimated by the interviewee (e.g. travel times)
Analysis of these criticisms has led to two conclusions. First, travel behaviour should not be sought
in general terms (averages) but should be related to a specific point of reference in time. This has
led to substantial improvement of the measurement procedure. Second, the various activities should not
be examined in isolation, but as a complete pattern of activities. For example, asking for the starting
and ending times of an activity proved to lead to more accurate results than asking for travel times.
This resulted, for example, in the travel diary method, which will be discussed below.
Survey date
The date on which the O-D survey should be performed depends on its objectives, but mostly the
objective will be to survey travel behaviour during a working day. These working days can best be
selected in spring or autumn, as summer includes holidays and in winter travel behaviour can be
influenced by climatic conditions.

Days and times to conduct the survey
Monday and Friday should not be selected as the day about which travel is surveyed. Monday suffers a
relatively high rate of absenteeism, while on Friday more trips are registered than on other working
days because it precedes the weekend. As a survey should ask about the previous day to ensure good
recollection, it is usually carried out on Wednesdays, Thursdays and Fridays.
The time of day should be between 18.00 and 21.00, as the probability of finding people at home is
then highest.
Survey period
Ideally all households in the selected sample should be interviewed on one single day. In practice this
is done over several days, which reduces the number of interviewers needed; the interviewers, in turn,
become more experienced in the job. In most cases, the sum of the responses over several working days
seems to be a good representation of the answers that would have been obtained on one single day.
Questionnaire design
The order in which the questions are asked should minimise resistance on the part of the interviewee.
This means that ‘difficult’ questions should be at the end of the interview. Furthermore, the following
requirements should be satisfied in composing the questionnaire or interview:
q The questions should be simple and direct
q The number of open questions should be minimised
q The information about travel must be elicited with reference to the activities which originated the
trips
q Each member of the household older than 12 years should be interviewed personally. For younger
members, another member of the household may be allowed to answer on their behalf.
In general, household O-D surveys have three distinct sections:

q Personal characteristics and identification. In this part questions are designed to classify the
household members according to the following aspects: relation to the head of the household, sex,
age, possession of a driving licence, educational level and activity. In order to reduce possibility of
subjective classification, a complete set of activities should be defined.
q Trip data. The trips made by household members have to be detected and characterised using
questions to determine: origins and destinations (nearest road junction or post code), trip purpose,
trip start and ending times, mode used, walking distance (including transfers), public transport line
and transfer station or bus stop. Trips of up to 300 metres are normally not recognised as trips.
q Household characteristics. This section includes questions designed to obtain socio-economic
information about the household, such as: characteristics of the house, identification of household
vehicles and their usual user, house ownership and income.
Sample size
Traditionally, large random samples were used for household O-D surveys. Table 2.1 shows the values
that are postulated as recommended practice but are rarely used. Where they were used, particularly in
developing countries, the sample sizes were enlarged by 20% to compensate for validation losses. This
was done because such sample sizes were believed to be essential.
To reduce these enormous sample sizes, statistics can be used to estimate the sample size required (see
box 2.1). However, this requires knowledge about the variable to be estimated, its coefficient of
variation, and the desired accuracy of measurement together with the level of significance associated
with it.
Population of area       Sample size (dwelling units)
                         Recommended    Minimum
Under 50 000             1 in 5         1 in 10
50 000 – 150 000         1 in 8         1 in 20
150 000 – 300 000        1 in 10        1 in 35
300 000 – 500 000        1 in 15        1 in 50
500 000 – 1 000 000      1 in 20        1 in 70
Over 1 000 000           1 in 20        1 in 100
Table 2.1 Sample sizes recommended in traditional surveys
Source: Bruton, Introduction to Transportation Planning, 1985

Travel diary surveys

A travel diary survey is a special type of household survey. There are two objectives for which it is
used: to aid the general correction process of the O-D survey (section 2.4.3) and to provide a databank
that can be used to estimate disaggregate modal choice models.
The data collection takes place in two steps:
q A first visit to each household in the sample (which should not be the same household as in the O-D
survey!) during which the diaries are presented and explained. At the same time, socio-economic
data are collected as in the O-D survey. Each member of the household is asked to fill in the diaries
with complete details of their travel for the following day. If the survey covers more than one day,
more diaries will be provided.
q A second visit, on the day following the last surveyed day, to collect the completed forms and to
help complete them if necessary.
The socio-economic information and trip rates by purpose are registered first and used for the
correction process of the O-D survey. After that the data of each trip is more precisely considered and
used for disaggregate choice model estimation.
The travel diaries should satisfy the following design objectives:

q Ease of transport: a small format allowing their storage in pockets or handbags is required.
q Ease of understanding to the user: for example the instructions should be in view on each page of
the diary.
q Ease of completion: offer pre-codified options wherever possible to reduce the need for written text.
Workplace surveys
Workplace surveys are very similar to household surveys. The difference is that the data are collected
at the workplace and not at home. This is particularly suitable for corridor-based journey-to-work
studies. The local authority asks a sample of employers in a certain district for permission to interview a
sample of their employees. In some cases it is efficient to ask for the sample of employees to be
distributed by residence. However, it must be noted that the data collected from such a sample are
choice based, and not random as in the household case. Nevertheless, the sample is random with respect
to mode.
The best survey times are, of course, during normal working hours. The survey period is thus
considerably longer than for a household survey, which is attractive because the interviewers’ time is
used much more efficiently.
2.4.2 Non-household based survey

Some non-household based survey methods are explained below.
Cordon survey
A cordon survey provides information on trips originating in external zones and ending in or passing
through the study area. With this information O-D matrices following from household surveys can be
completed. The external cordon is defined as the boundary of the complete study area. Internal cordons
can also be used. The location of the cordons should be carefully chosen (see figure 2.5). The survey
can be performed using techniques such as roadside interviews or registration plate matching, which
will be discussed later.

Figure 2.5 Setting a cordon line for a study area
Screen-line survey
A screen-line survey can be performed using the same methods as the cordon survey. A screen-line was
defined above as an artificial divide. The survey is performed at the crossings of this screen-line.
The data may be used to fill gaps in, and to validate, data from household and cordon surveys.
However, care must be taken when correcting household data, because it might not be easy to make the
comparison without introducing bias.
Roadside interview
Roadside interviews provide useful information about trips that do not originate in the study area and therefore
cannot be detected in household surveys. They are often a better method for estimating trip-matrices
than home interviews because larger samples are possible. The data can also be useful for validating
and extending household based information.
In roadside interviews a sample of drivers and passengers of vehicles crossing a roadside station are
asked a limited set of questions. These must include at least origin, destination and trip purpose.
Information about age, sex and income is desired but seldom asked due to time limitations. The
experienced interviewer can however collect at least part of these data from observation of the vehicle
and its occupants. A typical roadside interview form can be seen in figure 2.6.

Figure 2.6 Typical roadside interview form

The conduct of a roadside interview survey requires a good deal of organisation and planning to avoid
unnecessary delays, ensure safety and deliver quality results. Important elements in the success of
these surveys are:
q Identification of suitable sites
q Co-ordination with the police
q Arrangements for lighting and supervision
The determination of sample size in roadside interview surveys is discussed in box 2.4.
Box 2.4 Sample size for roadside interviews
To determine the sample size the following expression can be used:
p (1 - p )
n> 2
æeö p (1 - p )
ç ÷ +
èzø N
Where n is the number of passengers to survey, p is the proportion of trips with a given destination, e is
an acceptable error (expressed as a proportion), z is the standard Normal variate value for the required
confidence level, and N is the population size (i.e. observed passenger flow at a roadside station). It can
be seen that for a given N, e and z, the value p = 0.5 yields the highest (i.e. most conservative) value
for n in the above formula. Taking this value and considering e = 0.1 (i.e. maximum error of 10%) and
z = 1.96 (corresponding to a confidence level of 95%), the values shown in the following table are
obtained:

N                   n                   100 n/N
(passengers/hour)   (passengers/hour)   (%)
100                 49                  49.0
200                 65                  32.5
300                 73                  24.3
500                 81                  16.2
700                 85                  12.1
900                 87                  9.7
1100                89                  8.1
Source: J. de D. Ortúzar & L.G. Willumsen, Modelling Transport, 1994, p. 81-82
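The sample size expression of box 2.4 can be checked with a few lines of Python. This is only a minimal sketch: the function name roadside_sample_size and its default arguments are chosen here for illustration, and the result is rounded up to the next whole passenger.

```python
import math

def roadside_sample_size(N, p=0.5, e=0.1, z=1.96):
    """Smallest whole number of passengers n satisfying the expression in box 2.4.

    N: observed passenger flow, p: proportion of trips with a given destination,
    e: acceptable error (as a proportion), z: standard Normal variate.
    """
    variance = p * (1.0 - p)
    n = variance / ((e / z) ** 2 + variance / N)
    return math.ceil(n)

flows = [100, 200, 300, 500, 700, 900, 1100]
sizes = {N: roadside_sample_size(N) for N in flows}
for N, n in sizes.items():
    print(f"N = {N:5d}   n = {n:3d}   100 n/N = {100.0 * n / N:5.1f} %")
```

With p = 0.5, e = 0.1 and z = 1.96 the sketch gives the same values of n as the table in box 2.4.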
On-board survey
These can be seen as the counterpart of the roadside interview for public transport. A surveyor or
interviewer collects data on board a public transport vehicle. This can be done in two ways:
q Participatory surveys. The surveyor interviews the passengers, or hands out survey forms, which
have to be filled out during the trip or later.
q Non-participatory surveys. The surveyor counts the passengers getting on or off the vehicle and the
passengers on board between two stops. In a so-called fare-box survey, the surveyor records the
number of fares paid and the number of passes and transfers used, and the data are correlated with
the total fares taken in the fare-box. This is less useful when the majority of the passengers uses
Registration plate matching

This survey is performed by recording registration numbers of the vehicles passing at points on a
(internal or external) cordon or screen-line. These numbers are matched afterwards, which makes it
possible to create a cordon O-D matrix.
This method is very susceptible to error. If a vehicle passes two observation points and is incorrectly
recorded at one of them, one inbound and one outbound trip will be deduced from the data instead of
one through trip. Reducing the workload of the observers improves the results; this can be done by
lowering the sampling rate and by recording only part of the registration plate.
However, with too low a sampling rate certain cells in the matrix will not be observed. And if the
recorded part of the plate is too brief, spurious matches occur: records are treated as the same vehicle
although they belong to different vehicles. Recording the time at which each registration plate was
observed helps to reduce this last type of error.
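The matching step, including the use of recorded times to limit spurious matches, could be sketched as follows. The record layout (station, partial plate, minute), the function name match_plates and the 60-minute window are assumptions made for this illustration only; they are not part of the survey method as described above.

```python
def match_plates(inbound, outbound, max_minutes=60):
    """Match partial plates recorded entering and leaving a cordon.

    inbound, outbound: lists of (station, plate, minute) records.
    A pair is accepted when the plates are equal and the outbound record
    follows the inbound one within max_minutes (a hypothetical threshold).
    Returns a cordon O-D matrix as {(in_station, out_station): trips}.
    """
    matrix = {}
    used = set()  # indices of outbound records that are already matched
    for in_station, plate, t_in in sorted(inbound, key=lambda r: r[2]):
        candidates = [
            (t_out, idx, out_station)
            for idx, (out_station, out_plate, t_out) in enumerate(outbound)
            if idx not in used
            and out_plate == plate
            and 0 <= t_out - t_in <= max_minutes
        ]
        if not candidates:
            continue  # no plausible exit record: no through trip is deduced
        t_out, idx, out_station = min(candidates)  # earliest plausible exit
        used.add(idx)
        key = (in_station, out_station)
        matrix[key] = matrix.get(key, 0) + 1
    return matrix

inbound = [("A", "12AB", 0), ("A", "77XY", 5), ("B", "12AB", 200)]
outbound = [("C", "12AB", 20), ("D", "77XY", 90), ("C", "12AB", 215)]
od = match_plates(inbound, outbound)
print(od)
```

In this example the partial plate 12AB occurs twice; the time window keeps the two vehicles apart, and the record 77XY is left unmatched because the exit was observed too long after the entry.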
Tag survey (probe vehicles)

This method is used to observe the dispersal of vehicles from different points in the CBD, shopping
centre, transport terminal or leisure complex. To do this, the vehicles in the car park are ‘tagged’ with
small stickers that have a different colour for each location. After that the appearance of the tagged
vehicles at other locations in the network is logged. The stickers should remain affixed for long enough
but should also be easy to remove by the vehicle owner.
Nowadays vehicles are increasingly equipped with a GPS system that helps drivers to find their
way through the network. In commercial transport (e.g. cargo, bus companies) it is also used as a means
to monitor the location of the company vehicles in the network. It can be said that these vehicles are
electronically tagged and may therefore be used as probe vehicles that can be followed through the
network. This might be a source of O-D and route choice data. However, privacy concerns are expected
to prevent it from becoming an important source.
Headlight survey
This method can be used to observe the patterns of dispersion of vehicles from special events, and also
general O-D patterns. It involves placing a sign that asks drivers to put their headlights on and to keep
them lit until they finish their journey or are told otherwise. An observer directly downstream of the site
counts the proportion of vehicles that have indeed put their headlights on. Further on in the network the
vehicles passing with their headlights on are counted. This count is multiplied by the inverse of
the earlier observed proportion of vehicles having their headlights on, which gives an estimate of the flow
of vehicles from the original site to downstream sites. The method depends on the assumption that
drivers will not extinguish their headlights before arriving at their destination, and that the pattern is
not distorted by other factors that influence drivers to put on their headlights (e.g. tunnels, bad
weather). Also, some vehicles have their headlights on anyway. An observer should therefore be placed
upstream of the sign to record the proportion of vehicles that already have their headlights on,
so that this can be used as a correction.
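The estimate can be written out as a small calculation. The sketch below is one possible way to apply the upstream correction and is not taken from the source: it assumes that vehicles from the signed site are lit with the proportion observed just downstream of the sign, and that all other vehicles at the counting site are lit with the upstream (baseline) proportion. The function and argument names are illustrative.

```python
def headlight_flow_estimate(lit_count, total_count, compliance, baseline=0.0):
    """Estimate how many vehicles at a counting site came from the signed site.

    lit_count:   vehicles counted with headlights on at the counting site
    total_count: all vehicles counted at the counting site
    compliance:  proportion with headlights on directly downstream of the sign
    baseline:    proportion with headlights on upstream of the sign (assumption:
                 this proportion also holds for all other traffic)
    With baseline = 0 this reduces to lit_count * (1 / compliance).
    """
    if compliance <= baseline:
        raise ValueError("compliance must exceed the baseline proportion")
    return (lit_count - baseline * total_count) / (compliance - baseline)

# Without correction: 240 lit vehicles, 80% of drivers complied with the sign.
uncorrected = headlight_flow_estimate(240, 1000, 0.8)
# With correction: 10% of vehicles had their headlights on anyway.
corrected = headlight_flow_estimate(310, 1000, 0.8, baseline=0.1)
print(round(uncorrected), round(corrected))
```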
2.4.3 Data correction, expansion and validation

Data correction
There is a great need for observed data to be corrected to be not only representative for the population
but also reliable and valid. Just expanding the sample data to the population is not appropriate. A series
of correction steps to be performed is discussed below.
q Correction by household size. Samples are usually selected from lists of addresses; therefore it is
possible that larger households are over-represented in the sample, and smaller households
under-represented, compared to the population. This should be corrected using household size data
from the entire population.
q Socio-demographic correction. This is necessary if differences in distribution of the variables sex and
age are detected between the sample and the population. The definitions of family and household
size should be consistent in both cases. This correction must be done after the correction by
household size.
q Non-response correction. Travel behaviour may differ between people who do and do not answer the
survey: those who are difficult to contact tend to travel more. Correction may be possible on
the basis of the number of visits needed to complete the questionnaire at different types of
household. This correction must be performed after the previous two, and may induce significant
changes in the data.
q Correction for non-reported trips. The traditional type of home survey tends to underestimate non-
mandatory trips. The number of trips by purpose of the O-D survey should be checked with those of
the travel diaries. In travel diaries more detailed information of each journey should have been
gathered. For a method to perform the correction see box 2.5.
Sample expansion
After being corrected, the data have to be expanded to represent the total population. An expansion
factor can be defined for each study zone as the ratio between the total number of addresses in the
zone and the number of addresses in the sample. A more accurate method is needed, however,
because the information on the total number of addresses may be outdated. The expansion factor can
then be determined using the following formula:

$$ F_i = \frac{A - A\,(C + C D / B) / B}{B - C - D} $$

where F_i is the expansion factor, A is the total number of addresses in the population list, B is the
total number of addresses selected as the original sample, C is the number of sampled addresses that
were non-eligible in practice (e.g. demolished, non-residential) and D is the number of addresses where
no response was obtained.
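As an illustration, the expansion factor can be computed as below. The numbers in the example are hypothetical and serve only to show how non-eligible addresses and non-response change the factor compared to the simple ratio A/B.

```python
def expansion_factor(A, B, C, D):
    """Expansion factor F_i as defined in the formula above.

    A: addresses in the population list, B: addresses in the original sample,
    C: sampled addresses that were non-eligible, D: addresses without response.
    """
    responding = B - C - D
    if responding <= 0:
        raise ValueError("no responding eligible addresses left in the sample")
    return (A - A * (C + C * D / B) / B) / responding

# Hypothetical zone: 1000 listed addresses, 100 sampled, 10 non-eligible, 20 non-response.
simple_ratio = 1000 / 100
factor = expansion_factor(1000, 100, 10, 20)
print(f"simple ratio A/B = {simple_ratio:.2f}, expansion factor = {factor:.2f}")
```

When C and D are both zero the formula reduces to the simple ratio A/B.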
Validation of results
The data obtained in O-D surveys are normally submitted to three validation processes.
• On site checks of completeness and coherence of the data, followed by coding and digitising in the
office.
• Computational check of valid ranges for most variables and in general of the internal consistency of
the data.
• The corrected and expanded survey data are compared with the traffic counts on
cordons and screen-lines, performed during the O-D survey. This usually presents some practical
problems, as in the case of car trips the route choice information is normally lacking.

Box 2.5 A correction method for non-reported trips
q Divide the households into categories (say defined by income, number of cars and family
size); the total number of categories is limited by the condition that each one must have at least
30 observations from the travel diary survey (i.e. to ensure that their mean trip rate is normally
distributed).
q Calculate the average number of trips by purpose (and its variance) for each category, and
for both the O-D survey and travel diary data; let the means be X_a and X_b and the variances
S_a and S_b respectively. Calculate D = X_a - X_b.
q The minimum detectable difference (d) between the means of a certain variable X in two
samples with sizes N_a and N_b, for an 80% probability of finding that their actual difference
(D) is significant at the 95% level, is given by:
$$ d = 2.8 \left( \frac{S_a}{N_a} + \frac{S_b}{N_b} \right)^{1/2} $$
q If |D| > d, the difference is significant; therefore if the average trip rate in that category is
smaller in the O-D survey than in the travel diary, it has to be factored to equal the average trip
rate for the diaries. If the reverse occurs no correction is performed (i.e. the factor is one).
q If |D| ≤ d, the difference is not significant and no correction is required.
Source: J. de D. Ortúzar & L.G. Willumsen, Modelling Transport, 1994, p. 85
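The steps of box 2.5 for a single household category can be sketched as follows. The sketch assumes that subscript a refers to the O-D survey and b to the travel diary, and that the difference is compared in absolute value; the function name and the example figures are hypothetical.

```python
import math

def trip_correction_factor(mean_od, var_od, n_od, mean_diary, var_diary, n_diary):
    """Correction factor for non-reported trips in one household category.

    Returns the factor by which the O-D survey trip rate is multiplied;
    a factor of 1.0 means that no correction is performed.
    """
    # Minimum detectable difference d (80% power, 95% significance level)
    d = 2.8 * math.sqrt(var_od / n_od + var_diary / n_diary)
    D = mean_od - mean_diary
    if abs(D) > d and mean_od < mean_diary:
        # Significant under-reporting: scale up to the diary trip rate
        return mean_diary / mean_od
    return 1.0

# Hypothetical category with 100 households in each data set
under_reported = trip_correction_factor(2.0, 1.0, 100, 2.6, 1.0, 100)
not_significant = trip_correction_factor(2.0, 1.0, 100, 2.2, 1.0, 100)
over_reported = trip_correction_factor(3.0, 1.0, 100, 2.0, 1.0, 100)
print(under_reported, not_significant, over_reported)
```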
2.4.4 Stated preference surveys

The previous sections on household-based and non-household-based surveys both discussed revealed
preference surveys. In these surveys actual choices made by individuals are observed. However, the
observation depends on what people report they do. Revealed preference data have the following
limitations:
q Observation of actual choices may not provide sufficient variability for constructing good models for
evaluation and forecasting.
q The observed behaviour may be dominated by a few factors making it very difficult to detect the
relative importance of other variables.
q It is difficult to collect responses for policies which are entirely new, for example a completely
new mode or road-pricing system.
In practice it is hardly possible to experiment with, for example, a new mode of transport to collect data
on choice behaviour. Instead a stated preference (SP) survey can be performed, which consists of a
quasi-experiment based on hypothetical situations set up by the researcher. Individuals are asked in
interviews or questionnaires what they would choose to do in a hypothetical situation.
A basic problem of SP data collection is whether individuals would in reality choose the same option as
they stated they would. In the 1970s it appeared that only half the people did what they said they would
do. Fortunately this discrepancy has been reduced considerably due to improvements in survey design,
requirements for trained survey staff and quality assurance procedures.
Experimental design
First a set of hypothetical but realistic alternatives must be constructed. These are called
technologically feasible alternatives. There are four tasks in designing these alternatives:
q The identification of the range of choices (e.g. car and rail or different types of service within a
mode)
q The selection of attributes included in each option (e.g. travel time, cost, waiting time)
q The selection of the measurement unit for each attribute
q The specification of number and magnitude of attribute levels
It should always be kept in mind that the alternatives should be designed in such a way that they elicit
realistic responses.

The attribute combinations presented in the alternatives are usually independent from one another. This
implies that the number of alternatives would be $n^a$, where a is the number of attributes and n is the
number of levels they can take. If this number of alternatives is used it is called a full factorial design.
However, when a or n increases, the number of alternatives grows very quickly (exponentially in a), which
will induce fatigue in the respondent and reduce the value of the responses. Therefore a method called
fractional factorial design is used, which excludes part of the alternatives (on certain grounds) at the
cost of being unable to recover one or more interaction effects. The complexity of the design should
however be kept such that there are at most three simultaneous changes in the alternatives. This has
been found to result in the most reliable answers.
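The size of a full factorial design is easy to demonstrate by enumerating all combinations of attribute levels. The attributes and levels below are hypothetical; the sketch only shows how quickly $n^a$ grows and does not construct a fractional factorial design.

```python
from itertools import product

def full_factorial(levels_per_attribute):
    """Return every combination of attribute levels as a list of dicts."""
    names = list(levels_per_attribute)
    level_lists = [levels_per_attribute[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]

# Hypothetical SP experiment: a = 3 attributes with n = 3 levels each
attributes = {
    "travel_time_min": [20, 30, 40],
    "cost": [1.0, 2.0, 3.0],
    "waiting_time_min": [5, 10, 15],
}
alternatives = full_factorial(attributes)
print(len(alternatives))  # n ** a = 3 ** 3

# Growth of the full factorial design with the number of attributes
for a in range(2, 6):
    print(a, 3 ** a)
```

With five attributes of three levels each the full design already contains 243 alternatives, far more than a respondent can be asked to evaluate.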
The way in which the attributes are presented must be similar to the way they are perceived by the
traveller. This can range from pictures of means of public transport to the way people perceive
“frequency”. In the latter case it has been shown that nobody thinks in trains per hour or per day, but
rather in, for example, the waiting time at the station until departure. Research has to be done before
starting the survey to ascertain in which way a certain attribute is perceived.
Questionnaire design
There are three main ways of collecting information on preferences about alternatives:
q Ranking responses. All alternatives are presented at once, and should be put in order of preference
by the respondent. The number of alternatives has to be limited to avoid inducing fatigue.
Furthermore, respondents may treat the ranking as a judgement, which does not
necessarily correspond to the type of choices they face in real life.
q Rating techniques. These are widely used in market research. The respondents are asked to express
their degree of preference for an option using an arbitrary scale. Usually this is a scale between 1
and 10, where 1 = ‘strong dislike’, 5 = ‘indifference’ and 10 = ‘strong preference’.
q Choice experiments. The respondents have to choose between two options. Instead of this binary
choice, the degree of preference can also be expressed on a 5-point scale: ‘definitely choose A’,
‘probably choose A’, ‘cannot choose’, ‘probably choose B’, and
‘definitely choose B’.
Sampling strategy
SP surveys are statistically efficient, because each interviewee produces not just one observation but
several. Samples can therefore typically be smaller than is the case with revealed preference surveys.
However, if each interview results in 10 responses on 10 hypothetical choices, this provides information
about the variation within the individual but not between the individuals. For a representative model
both kinds of information are needed, and only an adequately sized and representative sample can do
this.
To forecast demand it is thus necessary to survey many types of individuals in order to obtain
representative results. Otherwise large samples would be needed to achieve enough observations on
minority choices. Using a choice based sample may be very cost-efficient in this case, but it may induce
additional bias because of the different ways different individuals perceive the choice context.
2.4.5 Longitudinal surveys

The discussion above has been concentrated on cross-sectional data collection. In this section the
longitudinal or time-series data collection methods will be discussed.
There are several types of longitudinal survey:
q Repeated cross-sectional survey. Similar measurements are made on samples of an equivalent
population at different points in time without ensuring that any respondent is included in more than
one round of data collection. Observations should be treated as if they were obtained from a single
cross-sectional survey as inferences about the population may be biased with this type of data.
q Panel survey. Similar measurements are made on the same sample at different points in time,
called ‘waves’. This section will concentrate on panel surveys, as they are apparently preferred.
There are different types of panel survey.
q Rotating panel survey. Some elements are kept in the panel for only a portion of the survey duration.
q Split panel survey. This is a combination of panel survey and rotating panel survey.
q Cohort study. This is a panel survey based on elements of groups that have shared a similar
experience (e.g. birth during a given year).

Representative sampling
Panel designs are often criticised for becoming unrepresentative as their samples age over time. This is
only true of cohort study designs, which start from an unrepresentative sample. A panel
design should attempt to maintain a representative sample of the entire population over time. All
families and individuals should have a known probability of entering the sample. However, this is not
simple.
Sources of error in panel data

A panel design may add to the quality of the data, if it is properly done. But there are also higher rates
of non-response and there is a risk of contamination. These issues are explained below.
q Effects on response error. The respondents in the panel have repeated contact with interviewers
and questionnaires, which may improve the data due to the following reasons:
q Repeated interviewing reduces the amount of time between event and interview
q Repeated contact increases the chance that respondents understand the purpose of the study
q Data quality tends to improve in later ‘waves’ of the panel, probably because of learning by the
respondents and/or interviewers.
q Non-response issues. Very little can be done about non-response, but the information
gathered can be used to determine the characteristics of non-respondents.
q Response contamination. The initial ‘waves’ of response may differ from the subsequent waves.
Therefore they are often not used for comparative purposes. Another issue is whether membership of
the panel influences behaviour or reported behaviour. This seems to depend on the type of
behaviour measured.

PART B – Travel Demand Modelling

3 Networks and study area definition⁴
In this chapter some aspects on the definition of networks and study area will be discussed. The
network can be seen as the supply of capacity for travelling, and will be used to derive the cost of
travelling (or deterrence) on the network which is an important input to the trip distribution and
assignment stages of the four-stage transport model.
In defining the network and study area a compromise has to be made between the level of
detail (accuracy) and cost.
3.1 Zoning design

First the study area should be carefully defined, for which the following ideas could be helpful:
q Consider the schemes to be modelled and the nature of the trips of interest: mandatory or optional,
long or short distance, etc.
q Preferably the majority of trips should have their origin or destination in the study area
(however, this is not always possible)
q The study area should be somewhat bigger than the area of interest to cope with future changes
When the study area is known, a zoning system must be defined. In a zoning system the individual
households and premises in a study area are aggregated into manageable chunks so they can be used
in a model. These chunks are called internal zones. In the area outside the study area external zones
are defined, which are usually larger and are used to model the traffic entering, leaving or passing
through the study area. However the latter is often omitted. Traffic between zones is called interzonal
traffic; traffic within zones is called intrazonal traffic. Of course the volume of intrazonal traffic
increases with increasing zonal size.
Zones vary in size, with the smallest about the size of a block in the downtown area (Central Business
District = CBD), whereas the largest on the urban fringe may be several square kilometres in area. An
area with a million people could have 700 to 800 zones. The study-area is accordingly divided into
zones. In general the following factors are considered relevant in the design of a zoning system for a
study area (Black, 1981):
q zones should contain distinctive land-use patterns such as residential or industrial use
q characteristics of the activities within a zone should be as homogeneous as possible so that derived
zonal means are representative of activity in the whole zone
q the zone system should conform to census (survey) collection areas (e.g. postal codes)
q zonal boundaries need to follow, where possible, rivers and other physical barriers to movement.
These zones are represented in the models as if all their characteristics (attributes) and properties were
concentrated in a single point called the zone centroid, connected to the road network with a centroid
connector, or zone connector.
An example of a zoning system is shown in figure 3.1.
⁴ Based on: J. de D. Ortúzar & L.G. Willumsen, Modelling Transport, 1994, p. 102-108

Figure 3.1: Structure of zones with highway network


3.2 Network representation
This section discusses the way to describe a transport network in a computer model.
3.2.1 Schematisation
A transport network may be represented at different levels of aggregation. The highest level of
aggregation, for example, assumes a continuous equation of the average traffic capacity per unit of
area. Mostly, however, discrete elements are used which are called links.
Normal practice is to model the network as a directed graph, which is a system of nodes and links
joining them. The nodes represent junctions, and the links represent homogeneous stretches
of road between two junctions. Virtual nodes may be needed if the attributes of a link change
somewhere other than at a junction. The links have attributes like length, speed, number of lanes etc.
Links have to be unidirectional, so two-way links are split into two one-way links. The zone centroids
are represented as nodes, and the centroid (or zone) connectors as links. See figure 3.1.
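A directed graph of this kind could be stored in a computer model along the following lines. This is a minimal sketch with hypothetical node names and attribute values; real modelling packages use their own, more elaborate data structures.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Link:
    """A unidirectional link between two nodes with a few basic attributes."""
    from_node: str
    to_node: str
    length_km: float
    speed_kmh: float
    lanes: int = 1
    is_connector: bool = False  # True for centroid (zone) connectors

def two_way(a, b, **attrs):
    """Split a two-way road into two one-way links with equal attributes."""
    return [Link(a, b, **attrs), Link(b, a, **attrs)]

links = []
links += two_way("J1", "J2", length_km=1.2, speed_kmh=50, lanes=2)
links += two_way("J2", "J3", length_km=0.8, speed_kmh=30)
links.append(Link("J3", "J1", length_km=1.5, speed_kmh=50))  # one-way street
# Zone centroid Z1 is a node, attached to junction J1 by centroid connectors
links += two_way("Z1", "J1", length_km=0.3, speed_kmh=20, is_connector=True)

# Adjacency list: for every node the links that leave it
adjacency = {}
for link in links:
    adjacency.setdefault(link.from_node, []).append(link)

for node in sorted(adjacency):
    print(node, "->", sorted(l.to_node for l in adjacency[node]))
```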
A problem in this network schematisation is that connectivity to each link joining a specific node is
offered at no cost, while some turning movements are more difficult to perform than others. One might
have to wait for a long time to turn left (or right, for countries driving left) for example. The turning
movements can be banned or penalised. This can be done manually by introducing dummy links with a
certain cost for each turning movement, or semi-automatically by an advanced computer program.
One of the key decisions is how many levels of the road hierarchy should be represented in the model.
Generally, only the roads that have a traffic-flow or access function will be represented. It is also said
that one should include at least one level of roads lower than the level of interest.
For public transport modelling special routes can be modelled with attributes such as frequency,
capacity and travel times. This will not be discussed here.
3.2.2 Link properties

The level of detail provided about the attributes of links depends on the general resolution of the
network and on the type of model used.
The attributes are minimally:
q length
q travel speed
q capacity
In addition one can provide information on:

q type of road
q road width, number of lanes, or both
q an indication of the presence or otherwise of bus lanes, or prohibitions of use by certain vehicles
q banned turns, or turns to be undertaken only when suitable gaps in the opposing traffic become
available etc.
q type of junction and details including signal timings
q storage capacity for queues and their presence at the start of a signal phase
These are not the only attributes that can influence a driver's choice of a specific route; others,
such as tolls or the scenic quality of a route, have also been found. These attributes depend on the
location and should be included when this is considered necessary.
3.2.3 Network costs

Most of the assignment techniques assume that drivers seek to minimise a linear combination of time
and distance (generalised cost of travel). Although it has been seen that other route features can also
be of importance, the great majority of network models in use today, deal only with travel time and
distance.
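A linear combination of time and distance can be illustrated with a small cheapest-path search. The weights alpha and beta, the link data and the function names below are hypothetical; the search itself is an ordinary Dijkstra algorithm and is only meant to show how the weights influence the route that is found.

```python
import heapq

def generalised_cost(time_min, distance_km, alpha=1.0, beta=0.5):
    """Linear combination of travel time and distance (weights are assumptions)."""
    return alpha * time_min + beta * distance_km

def cheapest_path_cost(links, origin, destination, alpha=1.0, beta=0.5):
    """Dijkstra search on directed links given as (from, to, time_min, distance_km)."""
    graph = {}
    for a, b, time_min, distance_km in links:
        cost = generalised_cost(time_min, distance_km, alpha, beta)
        graph.setdefault(a, []).append((b, cost))
    best = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == destination:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # outdated queue entry
        for nxt, link_cost in graph.get(node, []):
            new_cost = cost + link_cost
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(queue, (new_cost, nxt))
    return None  # destination cannot be reached

links = [
    ("A", "B", 2.0, 3.0),  # fast but long detour, first part
    ("B", "C", 2.0, 3.0),  # fast but long detour, second part
    ("A", "C", 5.0, 2.0),  # slow but short direct link
]
time_only = cheapest_path_cost(links, "A", "C", alpha=1.0, beta=0.0)
time_and_distance = cheapest_path_cost(links, "A", "C", alpha=1.0, beta=0.5)
print(time_only, time_and_distance)
```

With time as the only cost component the detour via B is cheapest (4.0); once distance is weighted as well, the direct link becomes the cheapest route (6.0).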
When travel time is modelled as a function of flow there are two different cases:
q Delay on a link is assumed to depend only on the flow on the link itself (inter-urban area)

q Delay on a link depends in an important way on flow on other links (urban area)
When facing the issue of equilibration of supply and demand the first case is easier than the second.
However, there are techniques to balance demand and supply in the case of link-delay models
depending on flows on several links.

4 Trip generation modelling
4.1 Introduction
In this chapter the stage of trip generation modelling will be discussed. After some introductory notes
three methods of trip generation modelling will be explained: growth factor modelling, regression
analysis and category analysis.
The following scheme shows the process of trip generation modelling.
INPUT: socio-economic data; land-use description of the activity system
MODEL: trip generation model
OUTPUT: number of trips generated in different zones of the study area
Figure 4.1 Trip generation modelling
First some basic definitions will be given:

q Trip – a one way movement from a point of origin to a point of destination
q Home-based trip – a trip of which the home is either the origin or the destination
q Non-home-based-trip – a trip of which the home is neither the origin nor the destination
q Trip production model – derives the total number of trips produced in a zone, irrespective of their
destination
Trip production is largely dependent on the characteristics of the household (income, household
structure, mode availability), the characteristics of the zone (land-use, housing density,
industrialisation) and the accessibility of the zone (quality and quantity of transport possibilities
(infrastructure and modes)).
q Trip attraction model – derives the total number of trips attracted to a zone, irrespective of their
origin
Trip attraction is largely dependent on job availability, land-use (for industry, education, shops, health
care, banks, governmental offices, recreation, airports, harbour etc.)
The terms origin and destination do not always have the same meaning as production and attraction. In
home-based trips the home of the trip-maker is always the trip production end, regardless of the
direction of travel. With non-home-based trips the trip is always produced at the origin. The zone of the
non-home activity for a home-based trip, or the destination zone for a non-home-based trip, always
attracts the trip.
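The difference between origins and destinations on the one hand and productions and attractions on the other can be made concrete with a short example. The trip records and the field name home_end are hypothetical and only serve this illustration.

```python
def productions_attractions(trips):
    """Count trip productions and attractions per zone.

    Each trip is a dict with 'origin', 'destination' and 'home_end', where
    home_end is 'origin', 'destination' or None (non-home-based trip).
    """
    produced, attracted = {}, {}
    for trip in trips:
        if trip["home_end"] == "destination":
            # Home-based trip returning home: produced at the home (destination) end
            production_zone, attraction_zone = trip["destination"], trip["origin"]
        else:
            # Home-based trip leaving home, or non-home-based trip:
            # produced at the origin, attracted to the destination
            production_zone, attraction_zone = trip["origin"], trip["destination"]
        produced[production_zone] = produced.get(production_zone, 0) + 1
        attracted[attraction_zone] = attracted.get(attraction_zone, 0) + 1
    return produced, attracted

trips = [
    {"origin": 1, "destination": 2, "home_end": "origin"},       # home to work
    {"origin": 2, "destination": 1, "home_end": "destination"},  # work to home
    {"origin": 2, "destination": 3, "home_end": None},           # work to shop
]
produced, attracted = productions_attractions(trips)
print(produced, attracted)
```

Zone 1 is the origin of only one trip but produces two trips, because the return trip from work is also assigned to the home end.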
4.2 Classification of trips

To make a good estimate of the trip production and attraction a classification of trips is necessary.
This classification can be done by trip purpose, time of day, person type or modal choice.
By trip purpose
There are five categories that are usually employed:
q Trips to work

q Trips to school or college

q Shopping trips
q Social and recreational trips
q Other trips
By time of day
Trips are usually classified in peak and off-peak trips. There are significant differences between these
periods, for example in trip purpose. There is also a difference between the morning and the evening
peak.
By person type
Travel behaviour is largely dependent on socio-economic attributes, so the classification by person type
is important. The following (stratified) categories are usually employed:
q Income level
q Car ownership
q Household size
By modal choice
The trips can be classified by modal choice, like car, train, bus or bicycle. Of course the availability of a
certain mode plays a role in this.
4.3 Methods to model trip generation
4.3.1 Growth-factor modelling

Most methods attempt to predict the number of trips generated by household or zone as a function of
relations to be defined from available data. One of these methods is the growth-factor method, which
will be discussed here.
The growth-factor model can be described as:

$$ T_i = F_i \, t_i $$
where Ti and ti are respectively future and current trips in zone i and Fi is a growth factor.
The factor Fi should be estimated. Usually it is related to population (P), income (I) and car ownership
(C), in a function such as:
$$ F_i = \frac{f(P_i^d, I_i^d, C_i^d)}{f(P_i^c, I_i^c, C_i^c)} $$
where f is a function of the given variables, and d and c denote the design and current years
respectively.
Although easy to use, the growth-factor method is subject to large error, as can be seen in the example
in box 4.1. Therefore it is only used in practice to predict the future number of external trips to an area:
there are not too many of them and there are no simple ways to predict them otherwise.

Box 4.1 Example: growth factor modelling
Consider a zone with 250 households with car and 250 households without car. Assuming we know the
average trip generation rates of each group:
car-owning households produce: 6.0 trips/day
non-car-owning households produce: 2.5 trips/day
we can easily deduce that the current number of trips per day is:
$$ t_i = 250 \times 2.5 + 250 \times 6.0 = 2125 \text{ trips/day} $$
Let us also assume that in the future all households will have a car; therefore, assuming that income
and population remain constant (which is a safe hypothesis in the absence of other information), we
can estimate a simple multiplicative growth factor as:
$$ F_i = C_i^d / C_i^c = 1 / 0.5 = 2 $$
Then, we can estimate the number of future trips as:
$$ T_i = F_i \, t_i = 2 \times 2125 = 4250 \text{ trips/day} $$
However the method is very crude, as we will demonstrate. If we use our information about average
trip rates and make the assumption that these will remain constant, we can estimate the future
number of trips as:
$$ T_i = 500 \times 6 = 3000 \text{ trips/day} $$
which means that the growth factor method would overestimate the number of trips by approximately
42%. This is very serious because trip generation is the first stage of the modelling process; errors
here are carried through the entire process and may invalidate work on subsequent stages.
Source: J. de D. Ortúzar & L.G. Willumsen, Modelling transport, 1994, p. 118
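The arithmetic of box 4.1 can be reproduced in a few lines. The variable names are chosen here for illustration; the numbers are those of the box.

```python
car_rate, no_car_rate = 6.0, 2.5           # trips/day per household (box 4.1)
households = 500
car_share_now, car_share_future = 0.5, 1.0  # share of car-owning households

# Current trips: 250 households without car and 250 with car
current_trips = households * (
    car_share_now * car_rate + (1 - car_share_now) * no_car_rate
)

# Growth-factor estimate: scale current trips by the growth in car ownership
growth_factor = car_share_future / car_share_now
growth_estimate = growth_factor * current_trips

# Trip-rate estimate: apply the (constant) trip rates to the future households
rate_estimate = households * (
    car_share_future * car_rate + (1 - car_share_future) * no_car_rate
)

overestimate_pct = 100.0 * (growth_estimate / rate_estimate - 1.0)
print(current_trips, growth_estimate, rate_estimate, round(overestimate_pct, 1))
```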
4.3.2 Regression analysis⁵

The general form of trip generation models is:
$$ Y = A + B_1 X_1 + B_2 X_2 + B_3 X_3 + \dots $$
where Y is, for example, the number of trips per household, X1 the car ownership, X2 the family income
and X3 the family size in case of a production equation. The coefficients A and B1, B2, B3, ... are
determined through multiple linear regression. For background on regression analysis the reader is
referred to statistics theory. Note that it is important to have at least some basic knowledge on this
subject to be able to judge the results of a regression exercise.
The model parameters are established using base-year data. Once the equations are calibrated, they
are used to estimate future travel for a target year. The coefficient of determination R² may be used
as a measure of the goodness of fit of the calibrated equation to the data; R² takes a value between
0 and 1. The closer R² is to 1, the better the linear relationship describes the data.
⁵ More information on multiple regression analysis can be found in any textbook on statistics, e.g. Andy Field’s
Discovering Statistics using SPSS – Chapter 5.

Question: Suppose we have Y = 50.5 + 0.80 X1 + 1.75 X3, with R² = 0.34. Should you use this trip
generation model?⁶
Recall that residential land-use can be seen as an important trip generator (producer). Non-residential
land-use (shopping, industry etc.) in many cases is a good attractor of trips.
To calculate the total number of trips produced in a zone (the total number of origins O_i, i.e. trips
departing from zone i) from a regression model at household level⁷, we have to sum the trips produced
by all households h in the zone, $O_i = \sum_h O_h$, with for example:
$$ O_h = 0.91 + 1.44 E_h + 1.07 C_h $$
where E_h is the number of employed residents in household h and C_h the number of cars available to
household h.
Besides this we might have an equation for the trips attracted to the zones (the total number of
destinations D_j, i.e. trips arriving in zone j), in this case already at zonal level:
$$ D_j = 3.29 E_j + 64.68 R_j + 58.53 $$
where E_j is the employment in zone j and R_j the retail floor space in the zone in m².
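Aggregating the household-level production equation to a zonal total can be sketched as follows. The coefficients are those of the example equation above; the list of households is hypothetical.

```python
def household_trips(employed, cars):
    """Trips produced by one household: O_h = 0.91 + 1.44 E_h + 1.07 C_h."""
    return 0.91 + 1.44 * employed + 1.07 * cars

# Hypothetical zone with four households: (employed residents, cars available)
households = [(1, 0), (2, 1), (2, 2), (0, 1)]

# Zonal total: sum of the household productions
O_i = sum(household_trips(employed, cars) for employed, cars in households)
print(round(O_i, 2))
```

Note that the constant 0.91 is added once per household, so the zonal total depends on the number of households as well as on employment and car availability.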
Table 4.1 gives some trip generation rates taken from the handbook on Trip Generation of the American
Institute of Transportation Engineers (ITE). This handbook (3 volumes) gives trip generation rates for
all kinds of land-uses.
Land use                          Daily vehicle trip rate   Per

Residential
  Single-family                   9.55                      Dwelling unit (DU)
  Apartment                       6.47                      DU
  Condo/townhouse                 5.86                      DU
  Mobile-home park                4.81                      Occupied DU
  Planned unit development        7.44                      DU
Retail
  Shopping centre
    < 100,000 sq. ft.             70.7                      1000 sq. ft. floor space
    100,000 to 500,000 sq. ft.    38.7                      1000 sq. ft. floor space
Office
  General                         11.85                     1000 sq. ft. floor space
  Medical                         34.17                     1000 sq. ft. floor space
  Research and development        7.7                       1000 sq. ft. floor space
Table 4.1 Average number of daily vehicle trip ends for some land-uses (ITE, 1998)
4.3.3 Category analysis technique for trip generation

With category analysis techniques the population in a study area is divided into a number of
homogeneous groups (or categories) with certain socio-economic characteristics. Per category the travel
behaviour is assumed to be steady over time. By predicting the future composition of the population the
future travel behaviour of the population can be predicted.
6
The R2 is far from 1, so apparently the variables used are not representative enough for estimating the
trip generation in the area. Incorporating some other variables might solve the problem. Another reason might be
the sample size: the number of families or companies interviewed may be too small to aggregate.
7
Sometimes zonal regression models also exist, which calculate the total number of trips originating in a zone as a
function of, for example, the population Pi, the number of houses Hi and the number of cars Ci in the zone, as in a
study in Toronto: Oi = 0.351Pi + 0.145Hi - 0.253Ci. Explain the minus sign for Ci!

In this method the cell rates are computed per purpose group by dividing the total number of trips in a
cell h, by purpose p, by the number of households H(h) in it, as follows:
$t^p(h) = \frac{T^p(h)}{H(h)}$
where tp(h) is now the average number of trips with purpose p (and at a certain time period) made by
members of households of type h. Types are defined by the stratification chosen: for example, a cross-
classification based on m household sizes and n car ownership classes will yield mn types h.
The principle of category analysis is illustrated in table 4.2.
It has been shown that the trip production of households is largely dependent on car ownership and
household size. Suppose a division is made into 3 classes of car ownership and 4 classes of household
size. The number of trips per household is now determined as:

Household size    Car ownership
                  0       1       2+
1                 0.12    0.94    -
2 or 3            0.60    1.38    2.16
4                 1.14    1.74    2.60
5                 1.02    1.69    2.60
Table 4.2 Results of a category analysis
Like the regression models, the category method also allows for aggregation, by multiplying the trip
rates by the number of households of type h in zone i, ai(h). If Hn(h) is the set of households of type h
containing persons of type n, then the total trip productions with purpose p by person type n in zone i
are as follows:
$O_i^{np} = \sum_{h \in H^n(h)} a_i(h) \, t^p(h)$
Category analysis is seldom used for trip attraction calculations.
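A small sketch of this aggregation in Python is given below. The trip rates are those of table 4.2; the number of households per category in the zone is a hypothetical input, and the category labels are only chosen for readability.

```python
# Trip rates per household from table 4.2,
# keyed by (household size class, car ownership class)
trip_rates = {
    ("1", "0"): 0.12, ("1", "1"): 0.94,
    ("2 or 3", "0"): 0.60, ("2 or 3", "1"): 1.38, ("2 or 3", "2+"): 2.16,
    ("4", "0"): 1.14, ("4", "1"): 1.74, ("4", "2+"): 2.60,
    ("5", "0"): 1.02, ("5", "1"): 1.69, ("5", "2+"): 2.60,
}

# Hypothetical number of households of each type in one zone
households_in_zone = {("1", "0"): 100, ("2 or 3", "1"): 200, ("4", "2+"): 50}

def zone_trip_productions(rates, households):
    """Sum over household types of (households of that type) x (trip rate)."""
    return sum(rates[h] * count for h, count in households.items())

productions = zone_trip_productions(trip_rates, households_in_zone)
print(round(productions, 1))
```

Forecasting with this method then comes down to predicting the future number of households per category; the rates themselves are assumed to stay constant.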
4.4 Balancing
Since every trip that originates somewhere also has to end somewhere, the total number of trips
produced must equal the total number of trips attracted (Σi Oi = Σj Dj, the totals are in balance).
Unfortunately, since the input data normally come from different sources (production data are
socio-economic data from household interviews, and attraction data come from aggregated statistical
sources on land-use), this is seldom the case (Σi Oi ≠ Σj Dj). Normally, production data are more
accurate, and therefore the zonal attractions are multiplied by a ‘correction’ (balancing) factor f:
$f = \frac{\sum_{i=1}^{I} O_i}{\sum_{j=1}^{J} D_j}$
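The balancing step can be sketched as follows in Python. The production and attraction totals below are hypothetical; the function simply applies the factor f defined above to every zonal attraction.

```python
def balance_attractions(productions, attractions):
    """Scale the attractions so that their total equals the total productions."""
    f = sum(productions) / sum(attractions)
    return f, [f * d for d in attractions]

# Hypothetical, unbalanced trip-end totals for four zones
productions = [400, 460, 400, 702]   # total 1962
attractions = [250, 380, 470, 780]   # total 1880

f, balanced = balance_attractions(productions, attractions)
print(round(f, 4), [round(d, 1) for d in balanced])
```

After balancing, the attractions keep their relative proportions, but their total equals the total of the productions.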

5 Trip distribution modelling
5.1 Introduction
In this chapter the stage of trip distribution modelling will be discussed. Since we know for every zone
in our study area the total number of trips originating and arriving (Oi and Dj), we are now interested in
where those productions go to, and where the attractions come from. The outcome is the
origin-destination trip table.
The following scheme shows the process of trip distribution modelling.
[Figure: INPUT (number of trips generated in the different zones of the study area) -> Trip Distribution
Model -> OUTPUT (matrix with person trips distributed amongst different origins and destinations:
OD-matrix [person trips])]
Figure 5.1 Trip distribution modelling
In general two situations now exist:

1. you already have a base-year or old OD table
2. you have to derive an OD trip table from scratch
An important concept used in this stage is the so-called impedance. Impedance can represent travel
time, cost, distance, or a combination of factors. Generally, impedance is the weighted sum of various
types of time and cost; it is therefore also called generalised cost. As an equation:
$c_{ij} = \min_r \left( T_{ij} + \frac{C_{ij}}{\gamma} \right)$
where the minimum generalised cost of a trip from i to j (the minimum being taken over the alternatives
r, e.g. routes) is a combination of the travel time Tij and the money costs Cij, converted into time using
the value-of-time γ (in the Netherlands, for example, something between 5 (recreational) and 20
(business) euro per hour).
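The effect of the value-of-time on the generalised cost can be illustrated with a short Python sketch. The two routes and their times and costs are hypothetical; time is expressed in hours and money in euro, so that dividing by the value-of-time (euro per hour) gives hours.

```python
def generalised_cost(routes, value_of_time):
    """Minimum over the routes of travel time plus money cost converted to time.

    routes: list of (travel_time_in_hours, money_cost_in_euro)
    value_of_time: euro per hour
    """
    return min(time + cost / value_of_time for time, cost in routes)

# Hypothetical alternatives: a slow cheap route and a fast expensive (e.g. tolled) route
routes = [(0.50, 1.0), (0.30, 3.0)]

cost_recreational = generalised_cost(routes, value_of_time=5)   # slow route is cheapest
cost_business = generalised_cost(routes, value_of_time=20)      # fast route is cheapest
print(cost_recreational, cost_business)
```

With a low value-of-time the money cost weighs heavily and the slow route has the lowest generalised cost; with a high value-of-time the fast route does.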
5.2 Updating a base-year table with future forecasts

Assume the following OD trip table:
              Arrivals
Departures    1      2      ...    j      ...    n      Σj Tij
1             T11    T12    ...           ...    T1n    O1
2             T21    T22    ...           ...    T2n    O2
...
i                                  Tij                  Oi
...
m             Tm1    Tm2    ...           ...    Tmn    Om
Σi Tij        D1     D2     ...    Dj     ...    Dn     T
Table 5.1 Standard OD trip matrix

From this table it must be clear that Σj Tij = Oi for all i, and Σi Tij = Dj for all j. So, if you
already have a (trustworthy) base-year OD table, growth factor models might help you to predict
(forecast) the traffic for the coming 10 years.
Three types of growth factors are important:

1. Uniform growth factor model
2. Singly constrained growth factor model (origin or destination constrained)
3. Doubly constrained growth factor model
5.2.1 Uniform growth factor

In this case you have limited information. Therefore the whole base-year table is multiplied by one
factor τ (for example, τ = 1.2 means an increase of total traffic of 20% in 10 years). This method is
seldom used.
As a formula, in the uniform growth factor method each cell of the OD table is multiplied by the general
growth rate:
$T_{ij} = \tau \, t_{ij} \quad \forall i, j$
where Tij are the ‘updated’ future trips, and tij the ‘old’ base-year trips.
5.2.2 Singly constrained growth factor method

Consider the situation where information is available on the expected growth in trips originating
(production Oi) in each zone, for example trips to school. In this case the base-year matrix (table) can
be updated with an origin-specific (singly constrained) growth factor τi, which is applied to the
corresponding row of the table. The same is possible for the trips arriving in each zone, for example if
information is available on the expected growth in the number of shopping trips because a big shopping
centre is built on the other side of the canal. In this case each column is multiplied by a
destination-specific growth factor Γj; again this is singly constrained.
As formulas, these can be written respectively as:

$T_{ij} = \tau_i \, t_{ij}$ for each row i
$T_{ij} = \Gamma_j \, t_{ij}$ for each column j
As an example8 consider the following ‘old’ or base-year OD table:
              Arrivals
Departures    1      2      3      4      Σj Tij
1             5      50     100    200    355
2             50     5      100    300    455
3             50     100    5      100    255
4             100    200    250    20     570
Σi Tij        205    355    455    620    1635
Table 5.2 Base-year trip matrix
Question: Calculate the future matrix according to the uniform growth factor method, assuming no
other information is available to you (growth factor τ = 1.2)9.
Suppose that for the origins the growth is already predicted, which is displayed in the column ‘Target
Oi’.
              Arrivals
Departures    1      2      3      4      Σj Tij    Target Oi
1             5      50     100    200    355       400
2             50     5      100    300    455       460
3             50     100    5      100    255       400
4             100    200    250    20     570       702
Σi Tij        205    355    455    620    1635      1962
Table 5.3 Origin-constrained growth trip matrix
8
Example taken from (Ortúzar & Willumsen, 1994).
9
Just multiply each cell by the growth factor 1.2; for example, cell (1,1) becomes 6 and the row total Σj T1j (total trip
production of row 1) becomes 6+60+120+240 = 426 trips.
By multiplying each row i by the ratio τi = (Target Oi)/(Σj tij), the updated OD table is derived.

              Arrivals
Departures    1        2        3        4        Σj Tij    Target Oi
1             5.6      56.3     112.7    225.4    400       400
2             50.5     5.1      101.1    303.3    460       460
3             78.4     156.9    7.8      156.9    400       400
4             123.2    246.3    307.9    24.6     702       702
Σi Tij        257.7    464.6    529.5    710.1    1962      1962
Table 5.4 Expanded origin-constrained growth trip matrix
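The origin-constrained update can be checked with a few lines of Python. This is a minimal sketch using the base-year matrix of table 5.2 and the target productions of table 5.3; the function name is chosen here for illustration only.

```python
# Base-year matrix (table 5.2) and target productions (table 5.3)
base = [
    [5, 50, 100, 200],
    [50, 5, 100, 300],
    [50, 100, 5, 100],
    [100, 200, 250, 20],
]
target_origins = [400, 460, 400, 702]

def origin_constrained(base, targets):
    """Multiply every row by its own growth factor (target Oi / base-year row total)."""
    result = []
    for row, target in zip(base, targets):
        growth = target / sum(row)
        result.append([growth * trips for trips in row])
    return result

future = origin_constrained(base, target_origins)
column_totals = [sum(column) for column in zip(*future)]
print([round(total, 1) for total in column_totals])
```

The row totals now equal the targets exactly, whereas the column totals simply follow from the row factors; nothing constrains them.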
5.2.3 Doubly constrained growth factor method

Updating the OD trip table becomes much more difficult when information is available on the future
number of trips both originating and arriving in each zone of the study area. In this case we have two
sets of growth rates, τi for the origins and Γj for the destinations (two separate but not independent
singly constrained growth factor problems).
You might wonder, for example, whether you can take the average growth factor (τi + Γj)/2?10
Several methods exist to solve this problem, but the best established one is the so-called Furness
method, which uses balancing factors. This method will also be useful when discussing the gravity model.
The balancing factors are called Ai and Bj, and with the origin and destination growth factors τi and Γj
the model can be written as:

$T_{ij} = \tau_i A_i \Gamma_j B_j \, t_{ij}$, or
$T_{ij} = a_i b_j \, t_{ij}$
when the growth factors are incorporated in ai = τi Ai and bj = Γj Bj.
These factors ai and bj have to be recalculated until Σj Tij = Oi and Σi Tij = Dj, according to an
iterative process called the Furness method:
1. set all bj = 1.0 and solve for ai (as in the singly constrained case)
2. with the latest ai, solve for bj, i.e. satisfy the attraction constraint
3. keeping bj fixed, solve for ai, and repeat steps 2 and 3 until the changes are sufficiently small, for
example less than 5% difference between target and estimated value.
This method is also called bi-proportional fitting of the base-year data to expected future data.
An example will make clear that the theory of this method is much more difficult than the ‘doing’.
Assume the following doubly constrained growth factor problem:

              Arrivals
Departures    1      2      3      4      Σj Tij      Target Oi
1             5      50     100    200    355         400
2             50     5      100    300    455         460
3             50     100    5      100    255         400
4             100    200    250    20     570         702
Σi Tij        205    355    455    620    1635 [11]
Target Dj     260    400    500    802                1962
Table 5.5 Doubly-constrained growth trip matrix
10
If you try this you will notice that the average growth factor is not converging to your target productions and
attractions.
11
Remember from previous paragraph that the total productions and attractions had to be balanced, otherwise we
would have had a problem at this stage, with a different row and column total!
First calculate the correction factors ai by dividing the target Oi's by the row totals Σj Tij, which
gives the next table:
Step 1:
              Arrivals
Departures    1      2      3      4      Σj Tij    Target Oi    ai
1             5      50     100    200    355       400          1.13
2             50     5      100    300    455       460          1.01
3             50     100    5      100    255       400          1.57
4             100    200    250    20     570       702          1.23
Σi Tij        205    355    455    620    1635
Target Dj     260    400    500    802              1962
bj            1      1      1      1
Step 2:
Accordingly, ‘correct’ all rows with the ai's and derive the new bj's (target Dj divided by the column
total), resulting in:
              Arrivals
Departures    1         2         3         4         Σj Tij    Target Oi    ai
1             5.63      56.34     112.68    225.35    400.00    400          1
2             50.55     5.05      101.10    303.30    460.00    460          1
3             78.43     156.86    7.84      156.86    400.00    400          1
4             123.16    246.32    307.89    24.63     702.00    702          1
Σi Tij        257.77    464.57    529.51    710.14
Target Dj     260       400       500       802                 1962
bj            1.01      0.86      0.94      1.13
Step 3:
The biggest difference between a column total and its target value is 14% (bj = 0.86), so the next step
is to multiply all columns by the bj's, obtaining:
              Arrivals
Departures    1         2         3         4         Σj Tij    Target Oi    ai
1             5.68      48.51     106.40    254.50    415.09    400          0.96
2             50.99     4.35      95.46     342.53    493.33    460          0.93
3             79.11     135.06    7.41      177.15    398.73    400          1.00
4             124.22    212.08    290.73    27.82     654.85    702          1.07
Σi Tij        260.00    400.00    500.00    802.00
Target Dj     260       400       500       802
bj            1         1         1         1
The biggest difference between a row total and its target value is now 7% (ai = 1.07).
The solution of this problem, after three iterations on rows and columns (three sets of corrections for all
rows and three for all columns), can be shown to be12:
              Arrivals
Departures    1         2         3         4         Σj Tij    Target Oi
1             5.25      44.12     98.24     254.25    401.85    400
2             45.30     3.81      84.78     329.11    462.99    460
3             77.04     129.50    7.21      186.58    400.34    400
4             132.41    222.57    309.77    32.07     696.82    702
Σi Tij        260       400       500       802       1962
Target Dj     260       400       500       802                 1962
Table 5.6 Expanded doubly-constrained growth trip matrix
12
Try this out yourself. This is easiest using some kind of spreadsheet programme like MS-Excel.

Check that the remaining difference between the totals and the targets is now smaller than the target accuracy of 5%!
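Instead of a spreadsheet, the Furness iterations can also be sketched in Python. The data are those of table 5.5; the function name and the fixed number of iterations are choices made for this illustration, and in practice one would rather iterate until a tolerance is met.

```python
base = [
    [5, 50, 100, 200],
    [50, 5, 100, 300],
    [50, 100, 5, 100],
    [100, 200, 250, 20],
]
target_origins = [400, 460, 400, 702]
target_destinations = [260, 400, 500, 802]

def furness(base, target_o, target_d, iterations=3):
    """Alternately scale rows to the target Oi and columns to the target Dj."""
    trips = [row[:] for row in base]
    for _ in range(iterations):
        # Row correction: factor ai = target Oi / current row total
        for i, row in enumerate(trips):
            a = target_o[i] / sum(row)
            trips[i] = [a * t for t in row]
        # Column correction: factor bj = target Dj / current column total
        column_totals = [sum(column) for column in zip(*trips)]
        b = [target_d[j] / column_totals[j] for j in range(len(column_totals))]
        trips = [[t * b[j] for j, t in enumerate(row)] for row in trips]
    return trips

result = furness(base, target_origins, target_destinations, iterations=3)
for row in result:
    print([round(t, 2) for t in row], round(sum(row), 2))
```

Because the last correction is on the columns, the column totals match their targets exactly after every iteration, while the row totals only approach theirs.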
5.3 The Gravity method

The gravity method is normally used if no ‘old’ base-year OD table exists, or if too many changes
have taken place in the activity system or transport system. Otherwise, you can stick to a matrix update
using growth factors. Since we are interested in obtaining an OD trip table, we make use of the
information we already have on total productions and total attractions for the zones (the outcome of the
trip generation model) and the generalised costs cij of travelling between zones (these have to be
estimated using a shortest path algorithm, see box 7.1).
In general, if the generalised costs of making a trip increase, the ‘likeliness’ f(cij) of making that trip
decreases, which sounds quite logical. An important type of distribution function is the negative
exponential function:
$f(c_{ij}) = \exp(-\beta c_{ij})$
with β a calibrated coefficient, with a value around 0.05. Another one is the power function:
$f(c_{ij}) = c_{ij}^{-\alpha}$
Figure 5.1: Some distribution functions (Immers & Stada, 1999)
These functions do not represent reality very well. For example, for car use we normally expect that
the distribution value is small for short distances and increases with cij up to a certain maximum at a
moderate value of cij, after which it again behaves like an exponential function (decreasing with
increasing costs).
To solve this problem the exponential distribution function is usually combined with the power function:
$f(c_{ij}) = c_{ij}^{\alpha} \exp(-\beta c_{ij})$
with α = 0.5 and β = 0.12. The power function, negative exponential and combined function are shown in
figure 5.1.
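The three distribution functions are easy to compare numerically. The sketch below evaluates them in Python; the cost values at which they are evaluated are arbitrary, and the parameter values are the ones mentioned in the text.

```python
import math

def exponential(cost, beta):
    """Negative exponential function: exp(-beta * c)."""
    return math.exp(-beta * cost)

def power(cost, alpha):
    """Power function: c ** (-alpha)."""
    return cost ** (-alpha)

def combined(cost, alpha, beta):
    """Combined function: c ** alpha * exp(-beta * c)."""
    return cost ** alpha * math.exp(-beta * cost)

alpha, beta = 0.5, 0.12
for cost in (1, 2, 4, 8, 16, 32):
    print(cost,
          round(exponential(cost, beta), 3),
          round(power(cost, alpha), 3),
          round(combined(cost, alpha, beta), 3))
```

Setting the derivative of the combined function to zero shows that it reaches its maximum at c = α/β, here about 4.2; below that value it increases with cost, above it the exponential term dominates.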

Question: Comment on the practicality of the exponential model, based on the following statement: an
absolute increase in resistance (impedance) at low values of resistance cij has the same effect on the
likeliness of the trip as the same increase at higher values of resistance cij.13
This distribution function is now used in the gravity model. The gravity model is based on Newton’s
law of gravity. Trip making behaviour is influenced by external factors such as total trip ends (Oi and Dj)
and distance travelled (distribution function). In its simplest formulation it states that the number of
trips between two zones is proportional to the product of both populations (P) divided by the square of
the distance (d) between the two zones:
$T_{ij} = \alpha \frac{P_i P_j}{d_{ij}^2}$
In our case this model is further generalised as:

$T_{ij} = \alpha \, O_i D_j f(c_{ij})$
which can also be formulated as $T_{ij} = \mu P_i A_j f(c_{ij})$.
In practice the gravity model is applied to the singly constrained case (only the productions or the
attractions are known for the future) and the doubly constrained case (both the productions and the
attractions are known for the future). Again the balancing factors Ai and Bj are introduced (to replace
α (or µ)) and are grouped again to form ai and bj:
$T_{ij} = a_i b_j f(c_{ij})$
which is the same as with the growth factor method, but with the base-year table tij replaced by the
initial impedances14 (a function of the generalised costs) of travelling between each origin and destination.
An example is given using the negative exponential function as distribution function, f(cij) = exp(-β cij).
First we want to know the initial ‘costs’ (in time and/or money) of travelling between each O and D as
well as the future trip-ends (target Oi and Dj):
              Costs cij (to)
From          1       2       3       4       Target Oi
1             3       11      18      22      400
2             12      3       13      19      460
3             15.5    13      5       7       400
4             24      18      8       5       702
Target Dj     260     400     500     802     1962
Table 5.7 Cost matrix and trip-end totals for gravity model estimation
From this table we derive the ‘likeliness’ that someone is making that trip. For example, it seems
unlikely that many people will go from zone 4 to zone 1, since the costs are very high: 24. This may be
because there is a canal between zones 4 and 1, which makes the trip very lengthy by forcing people to
make a big detour.
13
Intuitively you must agree that an increase in resistance from 5 to 10 minutes has a larger impact on one's
destination choice than the same absolute increase in travel time from 120 to 125 minutes. Therefore the
negative exponential function will only fit empirical data to a certain extent.
14
Most textbooks on transport modelling give different names for the resistance of travelling: impedance, friction,
generalised costs etc. Also distribution function, friction function etc.

Filling in all these cij in the function exp(-β cij), assuming β = 0.10, we get:
              Arrivals
Departures    1       2       3       4       Σj
1             0.74    0.33    0.17    0.11    1.35
2             0.30    0.74    0.30    0.15    1.49
3             0.21    0.27    0.61    0.50    1.59
4             0.09    0.17    0.45    0.61    1.31
Σi            1.34    1.51    1.52    1.36    5.74
Table 5.8 Matrix exp(-bcij) and sums to prepare for a gravity model run
This table together with the target values forms the basis for the Furness problem previously discussed.
The complete initial table, with the first row factors ai (target Oi divided by the row total), is:

              Arrivals
Departures    1       2       3       4       Σj      Target Oi    ai
1             0.74    0.33    0.17    0.11    1.35    400          296.30
2             0.30    0.74    0.30    0.15    1.49    460          308.72
3             0.21    0.27    0.61    0.50    1.59    400          251.57
4             0.09    0.17    0.45    0.61    1.31    702          535.88
Σi            1.34    1.51    1.52    1.36    5.74
Target Dj     260     400     500     802             1962
bj            1       1       1       1
After a few steps the resulting gravity model matrix is obtained:
              Arrivals
Departures    1             2         3         4         Σj Tij
1             155.37        99.00     64.46     74.17     393.36
2             57.54         200.22    106.73    90.98     455.56
3             25.87         47.01     137.16    192.77    402.81
4             20.86 [15]    53.77     191.65    444.08    710.37
Σi Tij        260           400       500       802       1962
Table 5.10 The resulting gravity model matrix
This OD table is very important, because at this stage we not only know exactly how many trips are
produced in and attracted to every zone, but also how many trips go from one zone to another.
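The whole gravity model run can be sketched in Python as follows. The cost matrix and the target Dj are taken from table 5.7; the target for origin 3 is taken as 400 here, which is an assumption made so that productions and attractions both add up to 1962. The function name, the tolerance and the stopping rule are choices for this illustration, so the result will be close to, but not identical with, table 5.10, which was obtained after only a few steps.

```python
import math

costs = [
    [3, 11, 18, 22],
    [12, 3, 13, 19],
    [15.5, 13, 5, 7],
    [24, 18, 8, 5],
]
target_origins = [400, 460, 400, 702]        # origin 3 assumed to be 400
target_destinations = [260, 400, 500, 802]

def gravity_model(costs, target_o, target_d, beta, tolerance=1e-6, max_iterations=1000):
    """Doubly constrained gravity model: Furness iterations on exp(-beta * cij)."""
    trips = [[math.exp(-beta * c) for c in row] for row in costs]
    for _ in range(max_iterations):
        # Scale rows to the target productions
        for i, row in enumerate(trips):
            a = target_o[i] / sum(row)
            trips[i] = [a * t for t in row]
        # Scale columns to the target attractions
        column_totals = [sum(column) for column in zip(*trips)]
        trips = [[t * target_d[j] / column_totals[j] for j, t in enumerate(row)]
                 for row in trips]
        # Stop when the row totals are also (almost) equal to their targets
        if all(abs(sum(row) - o) <= tolerance * o for row, o in zip(trips, target_o)):
            break
    return trips

od_matrix = gravity_model(costs, target_origins, target_destinations, beta=0.10)
for row in od_matrix:
    print([round(t, 1) for t in row])
```

Changing a single cost, for example lowering the cost from zone 4 to zone 1, and running the model again shows directly how a new connection redistributes the trips.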
A weakness of this OD table is that the result is based on a distribution function that uses a
unimodal (single-mode) travel cost cij, whereas you would expect different modes to have different
distribution effects. To cope with this, some authors have proposed using the minimum cost of traversing
a certain pair ij (min{cijm}), or the average cost over the modes, as in the following formula:
$c_{ij} = \frac{1}{M} \sum_m c_{ij}^m$
with M the number of modes. Another option is a weighted average cost function, where βijm is the mode
choice proportion:
$c_{ij} = \sum_m \beta_{ij}^m \, c_{ij}^m$
In simultaneous distribution / modal split models this problem is addressed (see also chapter 6).
15
Indeed a small number of trips go from zone 4 to zone 1! Imagine now what would have happened if the connection
between zone 4 and zone 1 was made ‘cheaper’ (in length and/or time) by the construction of a tunnel.
5.3 Tri-proportional fitting
In the preceding sections the method of bi-proportional fitting was used to solve doubly constrained
problems. The constraints concerned the totals of origins and destinations per zone. However, a third
constraint, concerning the trip length distribution, can be introduced.
Trip length can also be seen as the impedance, or cost, used in the gravity model. Instead of trip
length ranges, cost-bins will be used.
The following example illustrates the tri-proportional fitting method, using the same problem as in the
gravity model example (table 5.7). The model can be solved using Furness iterations as shown before,
but now three steps have to be taken in each iteration instead of two.
The balancing factors are now ai, bj and Fk, where k denotes the cost-bin. The target values per cost-bin
are given in table 5.11, the resulting matrix after the iterations in table 5.12.
Ranges (cost-bins)    1.0-4.0    4.1-8.0    8.1-12.0    12.1-16.0    16.1-20.0    20.1-24+
Number of trips       365        962        160         150          230          95
Table 5.11 Target values trip length distribution for tri-proportional gravity model calibration
              Arrivals
Departures    1        2        3        4        Σj Tij    ai
1             161.6    102.5    60.8     72.5     397.4     1.27
2             56.5     199.4    101.2    101.0    458.0     1.13
3             18.9     48.7     116.7    217.1    401.4     0.60
4             23.0     49.5     221.3    411.5    705.3     1.14
Σi Tij        260      400      500      802      1962
bj            0.57     0.70     0.87     1.63

Ranges (cost-bins)    1.0-4.0    4.1-8.0    8.1-12.0    12.1-16.0    16.1-20.0    20.1-24+
Number of trips       365        962        160         150          230          95
Tk                    360.9      966.5      159.0       149.8        230.3        95.5
Fk                    224.55     220.13     87.54       102.05       54.66        34.90
Table 5.12 The resulting matrix after five complete iterations
5.4 Some practical notes

Sparse matrices
Sparse trip matrices contain zero values, and are quite common. As samples are used it is possible that
zero values are observed for a certain OD-relation, while in reality this relation is not zero but very low
having a smaller chance to be included in the sample. This can lead to error during expansion of the
matrix. It is possible that for example a bi-proportional algorithm will not converge to a solution due to
the empty cells in the matrix. Generally this can be solved by filling the empty cells with a low number,
like 1.
External trips
Trips with one end outside the study area are not considered in the synthetic gravity model. It is
impossible as the distance or cost of the external trips is unknown.
The practice is to exclude external trips from the synthetic modelling process. Roadside interviews at
cordon points will lead to the desired information on external-external and external-internal trips. These
data can then be updated using the Furness growth factor method. A number of the trip ends from the
trip attraction models correspond to the external-internal trips and have to be subtracted from the trip
end totals to be used as constraints.

Intrazonal trips
To estimate the number of intrazonal trips cost values are given to centroid connectors, which is a crude
method but a necessary approximation in some cases. However it is preferable to remove intrazonal
trips from the synthetic modelling process and forecast them for example as a fixed proportion of the
trip-ends.
Intrazonal trips do not use the modelled network, so it is less essential to model them accurately.
If a coarse zoning system is used, however, the problem can be significant.
Productions-Attractions, Origins-Destinations
To assign a trip matrix onto a network it is necessary that it has the shape of an origin-destination
matrix. However, in synthetic modelling, the trip productions and attractions are used. As we have seen
before, in home based trips the home end is always the production end, which would in case of a trip to
work and home again lead to two trips with the same production and attraction end, but with different
origin and destination.
In case of a 24-hour trip matrix the OD-matrix is almost the same as the production/attraction matrix,
as it is assumed that each produced trip is made once in each direction every day.
In case of a shorter time period this will be different, because it is not certain in which direction a
trip will take place. There are two different approaches to solve this:
1. Produce a matrix for a single purpose, typically ‘to work’, and assume that these trips follow just one
direction of travel (production-attraction during the morning peak). Some corrections have to be made for
shift work, flexible working hours etc.
2. Use survey data to determine the proportion of the matrices for each purpose that will fit within the
time of day, for example 70% production-attraction and 30% attraction-production trips.

6 Modal split modelling
6.1 Introduction
In this chapter modal split modelling will be discussed. At this stage in the modelling process we have
information on the number of trips between every origin and destination in the study-area. These are
person-trips, not vehicle-trips, since we do not yet know who is cycling, walking, taking a car, public
transport etc. Therefore we need to predict the mode use, the so-called modal split.
The following scheme shows the modal split modelling process.
[Figure: INPUT (number of person trips between origins and destinations) -> Modal Split Model ->
OUTPUT (number of vehicle trips between origins and destinations: OD-matrix [vehicle trips])]
Figure 6.1 Modal split modelling
The factors influencing the choice of mode may be classified into three groups (Ortúzar & Willumsen, 1994):
1. Characteristics of the trip maker. The following features are generally believed to be important:
   - car availability and/or ownership
   - possession of a driving licence
   - household structure (young couple, couple with children, retired, single etc.)
   - income
   - decisions made elsewhere, for example the need to use a car at work, take children to school, etc.
   - residential density
2. Characteristics of the journey. Mode choice is strongly influenced by:
   - the trip purpose; for example, the journey to work is normally easier to undertake by public
     transport than other journeys because of its regularity and the adjustment possible in the long run
   - the time of day when the journey is undertaken. Late trips are more difficult to accommodate
     by public transport
3. Characteristics of the transport facility. These can be divided into two categories.
   a. quantitative factors such as:
      - relative travel time: in-vehicle, waiting and walking times by each mode
      - relative monetary costs (fares, fuel and direct costs)
      - availability and cost of parking
   b. qualitative factors which are less easy to measure:
      - comfort and convenience
      - reliability and regularity
      - protection, security
6.2 Trip Interchange modal-split models

This model is a so-called post-distribution model (there also exist simultaneous models, which
calculate mode choice as part of the trip distribution). The models use the logit model. The term logit
refers to the S-shaped logit curve shown in figure 6.1. The logit formulation is a share model (like the
gravity model discussed in the previous chapter) that divides the persons between the various modes
depending on each mode’s relative desirability for any given trip. Modes are said to be relatively more
desirable if they are faster, cheaper, or have other more favourable features than competing modes.
Figure 6.1: Modal split (logit curve)

The first models included only one or two characteristics of the journey, typically (in-vehicle) travel
time. It was observed that an S-shaped curve seemed to represent this kind of behaviour better. In the
curve the proportion of trips by mode 1 (T’ij/Tij) against the cost or time difference is given.
Apart from the use of the logit curve there exists the sequential mode choice model using the theory of
utility as discussed in Intermezzo II. The better a mode is, the more utility it has for the potential
traveller. The logit model takes the following form to trade off the relative utilities of the various
modes; the probability of using mode i, Pi, is given by:
$P_i = \frac{e^{U(i)}}{\sum_{r=1}^{n} e^{U(r)}}$
where U(i) is the utility of mode i, U(r) is the utility of mode r and n the number of modes under
consideration. This model is also called a multinomial logit model (or binomial logit model if only two
modes, often car versus public transport, are considered).
An example in box 6.1 will show the use of a simple binomial logit model.

Box 6.1 Example: binomial logit model
Assume the calibrated utility functions for car and public transport travel16:
Car: Ucar= -0.3-0.04X-0.1Y-0.03C
Public Transport (PT): UPT=-0.04X-0.1Y-0.03C
, where Ui = utility function of mode i

X = in-vehicle travel time
Y = out-of-vehicle travel time
C = cost of travel/income
A zone in our study area has the following characteristics:
                             Car    Public transport
In-vehicle time (min)        15     20
Out-of-vehicle time (min)    5      10
Travel costs (cent)          300    75
What is the probability that a person from a certain zone with an average income per inhabitant of USD
10,000 will travel by public transport?
Calculate Ucar and UPT with the values from the table:

Ucar = -0.3 - 0.04(15) - 0.1(5) - 0.03(300/10000) = -1.4
UPT = -0.04(20) - 0.1(10) - 0.03(75/10000) = -1.8
The probability of the trip maker taking PT is:
$P_{PT} = \frac{e^{U_{PT}}}{e^{U_{PT}} + e^{U_{car}}} = \frac{e^{-1.8}}{e^{-1.8} + e^{-1.4}} = 0.40$, or 40%, so $P_{car} = 1 - P_{PT} = 60\%$
If now for example the OD table for this zone gives 620 trips to a certain other zone, determine the
number of trips by each mode.
Normally, once we know the modal split for the different zones of our study area, the OD trip table (from
the trip distribution step) is transformed into n mode-specific OD trip tables, using the modal split and
the vehicle occupancy. Using vehicle-occupancy factors we can translate 300 person trips by car in a
certain zone into the number of vehicles we will actually find on the road. For example, a
vehicle-occupancy factor of 1.3 gives 231 vehicle-trips. For bicycle and pedestrian trips this factor is
of course 1.0. The factor is normally not used for public transport, since occupancy factors are route-
and time-dependent.
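The logit calculation and the conversion to vehicle trips can be sketched in Python as below. The utility functions and the zone characteristics are those of box 6.1 (in-vehicle time 15 min by car and 20 min by public transport); the function names and the dictionary layout are illustrative choices, not part of the original example.

```python
import math

def utility_car(in_vehicle, out_of_vehicle, cost_over_income):
    return -0.3 - 0.04 * in_vehicle - 0.1 * out_of_vehicle - 0.03 * cost_over_income

def utility_pt(in_vehicle, out_of_vehicle, cost_over_income):
    return -0.04 * in_vehicle - 0.1 * out_of_vehicle - 0.03 * cost_over_income

def logit_shares(utilities):
    """Multinomial logit: share of each mode is exp(U) / sum of exp(U) over all modes."""
    highest = max(utilities.values())  # subtracted for numerical stability only
    exps = {mode: math.exp(u - highest) for mode, u in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}

def vehicle_trips(person_trips, occupancy):
    """Convert person trips into vehicle trips with an occupancy factor."""
    return person_trips / occupancy

income = 10000
utilities = {
    "car": utility_car(15, 5, 300 / income),
    "pt": utility_pt(20, 10, 75 / income),
}
shares = logit_shares(utilities)

# Split the 620 person trips of the example over the two modes
person_trips = {mode: 620 * share for mode, share in shares.items()}
print({mode: round(share, 3) for mode, share in shares.items()})
print({mode: round(trips) for mode, trips in person_trips.items()})
print(round(vehicle_trips(300, 1.3)))
```

Only the difference between the utilities matters for the shares: adding the same constant to both utility functions leaves the result unchanged.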
The simultaneous trip distribution/modal split model incorporates mode-specific distribution functions
fm(cijm) in the gravity model:
$T_{ij} = \sum_m T_{ij}^m = a_i b_j \sum_m f_m(c_{ij}^m)$
which means that the distribution of trips over the different modes m between every origin O and
destination D is proportional to the mode-specific distribution function values fm(cijm).
The only thing that is left for analysis is the actual number of vehicles we will encounter on the roads.
This is the last step, route assignment, in which we translate the OD tables into vehicles on the different
routes in the network.
16
Why are all signs negative?

7 Traffic assignment modelling
7.1 Introduction
In this chapter traffic assignment modelling will be discussed.
The following scheme shows the process of traffic assignment modelling.
[Figure: INPUT (number of vehicle trips between origins and destinations per time period) -> Traffic
Assignment Model -> OUTPUT (traffic volumes, noise, pollutant emissions, speeds etc. on the different
routes in the network)]
Figure 7.1 Traffic assignment modelling
In the assignment procedure the transport engineer or planner predicts the routes the vehicle-trips
estimated in the previous steps will take and assigns them to the different links. For example, if a trip
goes from a suburb to downtown, the model predicts the specific streets or public transport routes to be
used. The trip assignment process begins by constructing a map representing the vehicle and public
transport networks in the study area. These maps show the possible routes that trips can take.
The intersections, called nodes, are identified on the network map, so that the sections between them,
called links, can be identified. Once the links are defined by their nodes, the length, type of facility,
location in the area, number of lanes, speed, maintenance condition and travel time are recorded for each
link. For public transport, extra information on routes, frequencies, headways etc. is necessary. This
information allows us (in practice the computer) to determine the routes that travellers might take
between any two points (nodes) on the network and to assign trips between zones (represented by the
centroid with its centroid connector to the network) to these routes.
The output of trip assignment shows the routes that all trips will take, and therefore the number of cars
on each roadway (link) and the number of passengers on each public transport route.
With this, and the use of the previous step the planner can obtain realistic information/estimates of the
effects of policies and programs on travel demand (for example the introduction of a tunnel connection
between two zones, which are separated through the canal). The planner can assess the performance of
alternative transport systems and identify various impacts that the system will have on the urban area,
such as energy use, pollution and accidents. With this information on how transport systems perform
and the magnitude of their impacts, the planners can provide decision makers (politicians) with some of
the information they need to evaluate alternative methods of supplying the community with transport
services.
Box 7.1 Dijkstra Algorithm
The assignment models discussed in this chapter make use of shortest-path algorithms to obtain the
shortest path between each origin and destination. Only one of these methods, Dijkstra's algorithm, will
be discussed here; it is the one most commonly used.

If we want to obtain the shortest path from node s towards node t in a network, we start building a tree
(a shortest path tree) starting in s (this algorithm type is often called a tree builder algorithm).
Starting from s we ‘walk’ through the network, labelling each node u with the label L(u). L(u) indicates
the length of the temporary shortest path from s to u. It can still be replaced by a shorter one, until
no other node can be found that decreases L(u); the label then becomes definitive.
Let:
V      set of nodes
T      set of nodes with temporary label
s      the origin node
t      the destination node
u, v   a node
e      a link
The algorithm:
Step 1: L(s) := 0 and for all v ∈ V, v ≠ s: L(v) := ∞
Step 2: T := V
Step 3: let u ∈ T for which L(u) is MIN: do
    If L(u) = ∞ then STOP; no solution
    If u = t then T := T - {u} and STOP (L(t) is the length of the shortest path from s to t)
Step 4: for each link e from u to v ∈ T: do
    If L(v) > L(u) + length(e) then
        L(v) := L(u) + length(e)
Step 5: T := T - {u}; return to step 3
Consider the following transport network with nodes A to F and these link lengths:
A-B 2, A-D 6, A-E 7, B-C 4, B-D 2, C-F 3, D-E 2, D-F 4, E-F 3
Assuming a directed graph (one-way traffic from west to east), find the shortest path from A to F. Try to
reconstruct the table below yourself.
A  B  C  D  E  F  T
0  ∞  ∞  ∞  ∞  ∞  {A,B,C,D,E,F}
0  2  ∞  6  7  ∞  {B,C,D,E,F}
0  2  6  4  7  ∞  {C,D,E,F}
0  2  6  4  6  8  {C,E,F}
0  2  6  4  6  8  {E,F}
0  2  6  4  6  8  {F}
The shortest path A-F has a length of 8 units. By recalling the predecessor nodes (the node that made
each label definitive) the route can be reconstructed backwards: A-B-D-F.
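
The labelling procedure of Box 7.1 can be sketched in a few lines of Python. This is only an illustration, not the implementation used in any particular transport package. It keeps the temporary labels in a priority queue (a common variant of the set T) and stores predecessors so that the route can be traced back. The link lengths are read off the worked table above; the lengths of C-F and E-F (both 3) are an assumption, since the table does not determine them uniquely.

```python
import heapq

def dijkstra(links, source, target):
    """Return (length, path) of the shortest path in a directed graph."""
    graph = {}
    for (u, v), length in links.items():
        graph.setdefault(u, []).append((v, length))
        graph.setdefault(v, [])
    label = {node: float("inf") for node in graph}   # L(u), initially infinite
    pred = {}                                        # predecessor of each node
    label[source] = 0
    queue = [(0, source)]                            # temporary labels, smallest first
    definitive = set()
    while queue:
        dist, u = heapq.heappop(queue)
        if u in definitive:
            continue                                 # outdated queue entry
        definitive.add(u)                            # L(u) becomes definitive
        if u == target:
            break
        for v, length in graph[u]:
            if dist + length < label[v]:
                label[v] = dist + length
                pred[v] = u
                heapq.heappush(queue, (label[v], v))
    if label[target] == float("inf"):
        return label[target], []                     # no solution
    path = [target]
    while path[-1] != source:
        path.append(pred[path[-1]])
    return label[target], path[::-1]

# Directed links (west to east) of the example network.
links = {
    ("A", "B"): 2, ("A", "D"): 6, ("A", "E"): 7,
    ("B", "C"): 4, ("B", "D"): 2,
    ("C", "F"): 3, ("D", "E"): 2, ("D", "F"): 4, ("E", "F"): 3,
}
length, path = dijkstra(links, "A", "F")
print(length, "-".join(path))
```

Storing the predecessor at the moment a label is improved is what allows the route to be reconstructed backwards from F.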

7.2 Classification of traffic assignment models

The demand for transport, given the OD trip table, varies with time. Likewise, network conditions
may differ with time (e.g. as a function of travel demand). Different traffic assignment
models have therefore been developed. Recent research focuses on developing dynamic traffic
assignment models, in which demand and supply may change over time. These models are the basis for
recent scientific insights in Dynamic Traffic Management and Intelligent Transport Systems.
In static traffic assignment models the demand for transport as well as the supply of the infrastructure
network are regarded as independent of time. The demand is assumed to be homogeneously spread over the
routes (in space and time); these models are therefore also called steady-state models. In this course
we stick to static assignment models.
Within static assignment models a distinction can be made between deterministic and stochastic assignment
models. In deterministic traffic assignment models all travellers are assumed to have the
same perfect knowledge and to act on it rationally and identically. Stochastic models, on the other hand,
allow for imperfect knowledge and taste variation, and are therefore more realistic (but also more
complicated).
Within each of these, a further distinction can be made between uncongested assignment (the so-
called all-or-nothing method if deterministic, or stochastic assignment if stochastic) and
congested assignment (the so-called user-equilibrium method if deterministic, or stochastic user
equilibrium method if stochastic).
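The four methods are summarised in table 7.1.

              Deterministic              Stochastic
Uncongested   All-or-nothing (AON)       Stochastic assignment
Congested     User equilibrium (DUE)     Stochastic user equilibrium (SUE)
Table 7.1 Classification of static assignment models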
For congested assignment, knowledge of the speed (travel time)/flow relationship of roads is
necessary. This relationship basically says that the higher the flow imposed on the network or on a
link, the lower the speed (or the higher the travel time).
The well-known Bureau of Public Roads (BPR) function is often used for link loading, i.e. for travel
time versus flow:
$$T_Q = T_0 \left[ 1 + \alpha \left( \frac{V}{V_{max}} \right)^{\beta} \right]$$
where
T_Q = travel time at traffic flow V
T_0 = free-flow travel time
V = traffic flow, volume (veh/hr)
V_max = saturation flow (veh/hr)
α, β = parameters
For example, suppose that on a certain 1 km stretch of highway the free-flow travel time is 0.6 minutes
(100 km/hr), α = 0.15 and β = 4, the volume is 2400 veh/hr and the saturation flow is 2100 veh/hr. The
travel time then increases from 0.6 minutes under free-flow conditions to
0.6 × [1 + 0.15 × (2400/2100)^4] ≈ 0.75 minutes, which corresponds to a speed of about 80 km/hr.
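
The BPR calculation in the example can be checked with a short Python sketch. The function name and argument names are chosen here for illustration only; the two parameters default to the values 0.15 and 4 used in the example.

```python
def bpr_travel_time(t0, volume, capacity, alpha=0.15, beta=4.0):
    """Travel time on a link according to the BPR function."""
    return t0 * (1.0 + alpha * (volume / capacity) ** beta)

# 1 km stretch: free-flow time 0.6 min, 2400 veh/hr on a saturation flow of 2100 veh/hr.
t = bpr_travel_time(t0=0.6, volume=2400, capacity=2100)   # minutes
speed = 60.0 / t                                          # km/hr over 1 km
print(round(t, 2), round(speed))
```

Because of the fourth power, travel time rises slowly at low flows and steeply once the flow approaches or exceeds the saturation flow.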

Figure 7.1. Speed-flow and cost-flow relationship
A number of factors are thought to influence the choice of route when driving between two points.
These include journey time, distance, monetary cost (fuel and other), congestion, type of road, scenery
and signposting. A generalised cost expression makes it possible to incorporate these
factors, though in practice only time and monetary costs are used.
7.3 Traffic assignment algorithms
7.3.1 All-or-nothing assignment (AON)

The simplest route choice and assignment method is the all-or-nothing assignment. This method
assumes that there are no congestion effects, that all drivers consider the same attributes for route
choice and that they perceive and weigh them in the same way. The absence of congestion effects
means that link costs are fixed; the assumption that all drivers perceive the same costs means that
every driver from i to j must choose the same route. Therefore, all drivers are assigned to one route
between i and j and no driver is assigned to other, less attractive in terms of generalised costs, routes.
These assumptions are probably reasonable in sparse and uncongested networks where there are few
alternative routes and they are very different in cost.
The assignment algorithm itself is the procedure that loads the OD trip table (after modal split) to the
shortest path trees and produces flows VA,B (between nodes A and B).
An example taken from (Ortúzar & Willumsen, 1994): Consider the simple network in figure 7.2a and its
associated simplified OD table:
From/To A B C D
A 0 0 400 200
B 0 0 300 100
C 0 0 0 0
D 0 0 0 0
Section a) shows the travel costs (times) on each link; section b) the corresponding trees based on
these costs together with the contributions to the total flow after assignment; these are shown in
section c).
All-or-nothing assignment is generally of limited interest to the planner; it may be used to represent
some sort of 'desire line', i.e. what drivers would like to do in the absence of congestion. However, its
most important practical feature is as a basic building block for other types of assignment techniques,
among others the user-equilibrium assignment discussed next.

7.3.2 Wardrop’s user-equilibrium assignment (DUE)

Equilibrium methods generally use the speed or travel time/flow relationship to spread traffic
over the network. Wardrop's method uses an equilibrium condition known as Wardrop's first principle:
Under equilibrium conditions traffic arranges itself in congested networks in such a way that no
individual trip maker can reduce his route costs by switching routes.
If all trip makers perceive costs in the same way, this becomes: Under equilibrium conditions traffic
arranges itself in congested networks such that all used routes between an O-D pair have equal and
minimum costs while all unused routes have greater or equal costs.
As an example, consider an idealised town served by a low-capacity through route via the town centre
(capacity 1000 veh/hr) and a high-capacity bypass (capacity 3000 veh/hr), which is longer but much
faster. Total travel demand is given as V=2000 vehicles.
Figure 7.2. A simple network, its trees and flows from loading a trip table

Bypass
Town centre
Figure 7.3 Town served by a bypass and a town centre route
Assume that the absolute capacity restriction for each route is replaced with a corresponding time-
flow relationship:
$$C_b = 15 + 0.005 V_b$$
$$C_t = 10 + 0.02 V_t$$
where C_b and C_t are the travel costs (in minutes) via the bypass and the town-centre route
respectively, and V_b and V_t are their corresponding flows (footnote 17).
The flows on both routes satisfy Wardrop's equilibrium when the corresponding costs are identical.
By equating both equations it is possible to find the direct solution to Wardrop's equilibrium as a
function of the total flow V = V_b + V_t:
$$15 + 0.005 V_b = 10 + 0.02 (V - V_b) \rightarrow V_b = 0.8 V - 200$$
From this you can see that for V > 250 both routes will be used (for V ≤ 250 all traffic uses the
town-centre route). For example, at V = 2000, V_b = 1400 vehicles and V_t = 600 vehicles, and the cost on
each route is 22 minutes. The total system costs in this situation (often of interest to the politician
or planner) are 1400 × 22 + 600 × 22 = 2000 × 22 = 44,000 minutes. The planner might be interested in
lowering these total system costs; the minimum total system costs for the demand of 2000 vehicles can
therefore be calculated as MIN(V_b C_b + V_t C_t) (footnote 18).
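
The two-route calculation can be reproduced with the sketch below. The function names are illustrative. The system-optimum flow follows from setting the derivative of the total cost V_b C_b + V_t C_t with respect to V_b to zero, which for these two cost functions gives V_b = 0.8V - 100; this derivation is added here and is not part of the original text.

```python
def route_costs(v_bypass, v_total):
    """Costs (minutes) on the bypass and the town-centre route."""
    v_town = v_total - v_bypass
    return 15 + 0.005 * v_bypass, 10 + 0.02 * v_town

def total_cost(v_bypass, v_total):
    """Total system cost in vehicle-minutes."""
    c_bypass, c_town = route_costs(v_bypass, v_total)
    return v_bypass * c_bypass + (v_total - v_bypass) * c_town

def user_equilibrium(v_total):
    """Bypass flow at Wardrop's equilibrium (zero if only the town route is used)."""
    return max(0.0, 0.8 * v_total - 200)

def system_optimum(v_total):
    """Bypass flow that minimises the total system cost."""
    return max(0.0, 0.8 * v_total - 100)

V = 2000
vb_ue = user_equilibrium(V)
vb_so = system_optimum(V)
print(vb_ue, route_costs(vb_ue, V), total_cost(vb_ue, V))
print(vb_so, route_costs(vb_so, V), total_cost(vb_so, V))
```

At the system optimum the two route costs are no longer equal, which is why individual drivers would not arrive at this pattern on their own.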
This analytical calculation of Wardrop's equilibrium is only possible if two (or at most three) routes
are considered. An iterative method is therefore necessary for more complex networks. Two iterative
methods are popular: the Incremental Method and the Method of Successive Averages (MSA). Only
the latter is discussed in these notes.
7.3.3 Method of Successive Averages (MSA)

In an iterative assignment algorithm the 'current' flow on a link is calculated as a linear combination of
the current flow in the previous iteration and an auxiliary flow resulting from an all-or-nothing
assignment in the present iteration. The algorithm can be described by the following steps:
1. Select a suitable initial set of current link costs, usually free-flow travel times. Initialise all flows
V_a = 0; make n = 0.
2. Build the set of minimum cost trees with the current costs; make n = n + 1.
3. Load the whole of the trip table T all-or-nothing onto these trees, obtaining a set of auxiliary flows F_a.
4. Calculate the current flows as V_a^n = (1 - f) V_a^{n-1} + f F_a, with f between 0 and 1. Taking f = 1/n
produces a solution convergent to Wardrop's equilibrium.
5. Calculate a new set of current link costs based on the flows V_a^n. If the flows or costs have not
changed significantly in two consecutive iterations, stop; otherwise proceed with step 2.
If we apply this method to the town-centre example from (Ortúzar & Willumsen, 1994) we get the
following table (footnote 19).

Iteration        f     Flow town  Cost town  Flow bypass  Cost bypass
1          F           2000                  0
           Vn    1     2000       50         0            15
2          F           0                     2000
           Vn    1/2   1000       30         1000         20
3          F           0                     2000
           Vn    1/3   667        23.3       1333         21.7
4          F           0                     2000
           Vn    1/4   500        20         1500         22.5
5          F           2000                  0
           Vn    1/5   800        26         1200         21
6          F           0                     2000
           Vn    1/6   667        23.3       1333         21.7
7          F           0                     2000
           Vn    1/7   571        21.4       1429         22.1
8          F           2000                  0
           Vn    1/8   750        25         1250         21.25
9          F           0                     2000
           Vn    1/9   667        23.3       1333         21.7
10         F           0                     2000
           Vn    1/10  600        22         1400         22

17 Draw both time-flow diagrams in one figure!
18 Prove that the minimum total system costs are 43,750 minutes (with Vb=1500 and Vt=500).
19 Recalculate this yourself.
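
The MSA iterations for the bypass example can be recalculated with the following sketch. It is written only for this two-route case, so the 'minimum cost tree' of step 2 reduces to picking the cheaper of the two routes; the function and variable names are illustrative.

```python
def cost_town(flow):
    return 10 + 0.02 * flow

def cost_bypass(flow):
    return 15 + 0.005 * flow

def msa_two_routes(demand, iterations):
    """Method of Successive Averages with f = 1/n for the two-route example."""
    v_town, v_bypass = 0.0, 0.0
    c_town, c_bypass = cost_town(0.0), cost_bypass(0.0)   # free-flow costs
    rows = []
    for n in range(1, iterations + 1):
        # All-or-nothing loading onto the currently cheapest route (auxiliary flows F).
        if c_town <= c_bypass:
            f_town, f_bypass = demand, 0.0
        else:
            f_town, f_bypass = 0.0, demand
        f = 1.0 / n
        v_town = (1 - f) * v_town + f * f_town
        v_bypass = (1 - f) * v_bypass + f * f_bypass
        c_town, c_bypass = cost_town(v_town), cost_bypass(v_bypass)
        rows.append((n, v_town, c_town, v_bypass, c_bypass))
    return rows

rows = msa_two_routes(demand=2000, iterations=10)
for n, v_town, c_town, v_bypass, c_bypass in rows:
    print(n, round(v_town), round(c_town, 2), round(v_bypass), round(c_bypass, 2))
```

In a real network the all-or-nothing step would load every OD pair onto its shortest-path tree instead of choosing between two fixed routes.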
7.3.4 Stochastic user-equilibrium assignment (SUE)

This method differs slightly from the deterministic case in that the travel time is perceived
differently by different users. The travel time on a route is assumed to be a stochastic function with a
deterministic part and a subjective part (the error term ε_a per link a). The average is still the
deterministic travel time.
The Method of Successive Averages can still be used for the SUE, but now in every iteration the travel
times are randomly determined as follows:
$$C_a = c_a + z \varphi c_a$$
where C_a is the resistance perceived by the traveller/user, c_a the deterministic resistance (the
physical resistance due to, for example, the length of link a), φ a factor indicating the variation
(standard deviation) of the uncertainty interval, and z a random number drawn from the standard normal
distribution N(0,1).
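
The random draw of perceived costs can be sketched as follows. The link names, the cost values and the value 0.1 for the variation factor are hypothetical and serve only to show the mechanism; within an SUE procedure such a draw would be made before each all-or-nothing loading.

```python
import random

def perceived_costs(link_costs, variation, rng):
    """Draw perceived costs: deterministic cost plus z * variation * cost, z ~ N(0,1)."""
    return {link: cost + rng.gauss(0.0, 1.0) * variation * cost
            for link, cost in link_costs.items()}

rng = random.Random(42)                       # fixed seed for repeatable draws
costs = {"town": 22.0, "bypass": 22.0}        # hypothetical deterministic costs (minutes)
sample = perceived_costs(costs, variation=0.1, rng=rng)
print(sample)

# Averaged over many draws the perceived cost approaches the deterministic cost.
draws = [perceived_costs(costs, 0.1, rng)["town"] for _ in range(10000)]
mean_town = sum(draws) / len(draws)
print(round(mean_town, 1))
```

With a variation factor of zero the draw returns the deterministic costs, so the procedure falls back to the deterministic case.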

8 Résumé on complete transport model
This chapter provides an overview of the complete four-stage transport model. Using a simple case, the
successive stages of the transport model are carried out with the techniques discussed in chapters 3
to 7.
8.1 Study area
[Figure: study area with seven zones, numbered 1 to 7, and the network linking their centroids]
Figure 8.1 Study area
In figure 8.1 the (hypothetical) study area for this case is given. It is divided into 7 zones, each with a
centroid. The network between the centroids is given, including 1 centroid connector in zone 1.
The trip production and attraction, as well as the socio-economic and land-use data are known for the
base-year. These will be used in the first stage to forecast the future production and attraction.
8.2 Stage 1: Trip generation

In table 8.1 the socio-economic and land-use data for each zone are given.
Zone    Population  Retail floor space  Employment  Production  Attraction
        (Pop)       in m² (FSp)         (E)         (P)         (A)
1       5000        20                  420         16500       2800
2       2000        300                 2500        6700        28000
3       3500        30                  430         11650       3650
4       3000        200                 3000        10500       23350
5       6000        70                  700         21400       8200
6       3500        40                  2500        11650       11300
7       2000        90                  450         6600        7700
Total   25000       750                 10000       85000       85000

Table 8.1 Data for the base-year

To calculate trip production for future years a simple linear regression is performed, which gives an
equation of the type:
$$y = \alpha x + \beta$$
For zone i this leads to:
$$P_i = \alpha \, Pop_i + \beta$$
The coefficients α and β can be derived from the linear regression, which gives the formula:
$$P_i = 3.56 \, Pop_i - 554.05$$
The trip attraction formula is derived using multiple linear regression, with an equation of the type:
$$y = \alpha x_1 + \beta x_2 + \gamma$$
For zone i this leads to:
$$A_i = \alpha W_i + \beta \, FSp_i + \gamma$$
where W_i is the employment in zone i (column E in table 8.1). The fitted formula is:
$$A_i = 3.29 W_i + 64.68 \, FSp_i + 518.53$$
The regression analysis can be performed with SPSS. This is explained in box 8.1.
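
For readers without SPSS, the simple regression for trip production can also be checked with a few lines of Python. This is a minimal least-squares sketch using the base-year data of table 8.1; it is not meant to replace a statistical package, which also reports R² and significance levels.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Base-year population and trip production per zone (table 8.1).
population = [5000, 2000, 3500, 3000, 6000, 3500, 2000]
production = [16500, 6700, 11650, 10500, 21400, 11650, 6600]

slope, intercept = simple_linear_regression(population, production)
print(round(slope, 3), round(intercept, 2))
```

The attraction model has two explanatory variables and therefore needs a multiple regression, which is more conveniently done with a statistical package.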
Box 8.1 Regression Analysis using SPSS
To get started the data given in table 8.1 is entered, which is shown in the next figure:

To perform the single linear regression for trip production, the following input is needed:
An output file is generated, containing the following tables from which the R² and the coefficients
can be read.
Model Summary
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate
1      ,996a  ,993      ,991               489,280
a. Predictors: (Constant), POP
Coefficientsa
                Unstandardized Coefficients  Standardized Coefficients
Model           B          Std. Error        Beta                       t        Sig.
1  (Constant)   -554,054   515,049                                      -1,076   ,331
   POP          3,555      ,135              ,996                       26,413   ,000
a. Dependent Variable: PROD
The same procedure can be followed for the attraction:

The output generated for the attraction gives:
Model Summary
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate
1      ,999a  ,998      ,997               491,587
a. Predictors: (Constant), EMPL, FSP
Coefficientsa
                Unstandardized Coefficients  Standardized Coefficients
Model           B          Std. Error        Beta                       t        Sig.
1  (Constant)   518,528    312,317                                      1,660    ,172
   FSP          64,678     2,570             ,692                       25,163   ,000
   EMPL         3,286      ,229              ,395                       14,359   ,000
a. Dependent Variable: ATTR
With these formulas the production and attraction in the future year can be calculated, using socio-
economic and land-use forecasts for the future year. These forecasts are given in the following table,
in which the new production and attraction are also calculated.
Zone    Population  Retail floor space  Employment  Production  Attraction
        (Pop)       in m² (FSp)         (E)         (P)         (A)
1       7000        66                  1100        24365.95    8406.41
2       2800        397                 3450        9413.95     37546.99
3       4900        65                  1711        16889.95    10351.92
4       4200        270                 5000        14397.95    34432.13
5       8400        120                 2010        29349.95    14893.03
6       4900        60                  3500        16889.95    15914.33
7       2800        120                 1418        9413.95     12945.35
Total   35000       1098                18189       120721.7    134490.2

Table 8.2 Data for the future year
As can be seen, the total production is not equal to the total attraction, so balancing has to be
performed. The attraction is multiplied by a factor f, calculated as:
$$f = \frac{\sum_i P_i}{\sum_j A_j} = \frac{120721.7}{134490.2} = 0.8976$$
In the following table the production and attraction are balanced, and the next stage can now be
performed.

Zone Production (P) Attraction (A)
1 24365.95 7545.799
2 9413.95 33703.09
3 16889.95 9292.136
4 14397.95 30907.12
5 29349.95 13368.35
6 16889.95 14285.09
7 9413.95 11620.06
Total 120721.7 120721.7

Table 8.3 Balanced production and attraction
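
The forecast and the balancing step can be reproduced with the sketch below, using the future-year data of table 8.2. The constant of the attraction formula is taken as +518.53, the sign reported in the SPSS output of box 8.1 and the one that reproduces the attraction values of table 8.2.

```python
# Future-year forecasts per zone (table 8.2).
population = [7000, 2800, 4900, 4200, 8400, 4900, 2800]
floor_space = [66, 397, 65, 270, 120, 60, 120]
employment = [1100, 3450, 1711, 5000, 2010, 3500, 1418]

# Regression formulas with the rounded coefficients used in the text.
production = [3.56 * pop - 554.05 for pop in population]
attraction = [3.29 * emp + 64.68 * fsp + 518.53
              for emp, fsp in zip(employment, floor_space)]

# Balance the attractions so that their total equals the total production.
factor = sum(production) / sum(attraction)
balanced_attraction = [factor * a for a in attraction]

print(round(sum(production), 1), round(sum(attraction), 1), round(factor, 4))
for zone, (p, a) in enumerate(zip(production, balanced_attraction), start=1):
    print(zone, round(p, 2), round(a, 2))
```

Scaling the attractions rather than the productions reflects the choice made in the text; it treats the production totals as the more reliable estimate.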
8.3 Stage 2: Trip distribution

In this case a gravity model will be used to determine the trip distribution. Figure 8.2 gives the
(fictitious) distances on the network.
[Figure: the network of the study area with the length of each link]
Figure 8.2 Distances
The gravity model:
$$T_{ij} = \mu P_i A_j f(c_{ij})$$
The distribution function used for this case:
$$f(c_{ij}) = \frac{1}{c_{ij}^2}$$
[Plot: distribution function f(cij) against cij, for cij from 1 to 19]
Figure 8.3 Distribution function
The deterrence between the zones cij can be derived by building a shortest-path-tree. The following
table shows the resulting matrix for cij.
1 2 3 4 5 6 7
1 - 5.0 7.5 7.5 10.0 12.5 16.0
2 5.0 - 3.5 2.5 5.0 7.5 11.0
3 7.5 3.5 - 4.0 7.0 8.0 13.0
4 7.5 2.5 4.0 - 3.0 5.0 9.0
5 10.0 5.0 7.0 3.0 - 9.0 6.0
6 12.5 7.5 8.0 5.0 4.0 - 6.0
7 16.0 11.0 13.0 9.0 6.0 6.0 -
Table 8.4 Matrix cij
In box 8.2 an example is given of shortest-path-tree-building using the Dijkstra algorithm.
Then the start matrix is determined using the above distribution function; each cell contains the value
of f(cij) for the specific OD-pair.
1 2 3 4 5 6 7
1 - 0.04 0.017778 0.017778 0.01 0.0064 0.003906
2 0.04 - 0.081633 0.16 0.04 0.017778 0.008264
3 0.017778 0.081633 - 0.0625 0.020408 0.015625 0.005917
4 0.017778 0.16 0.0625 - 0.111111 0.04 0.012346
5 0.01 0.04 0.020408 0.111111 - 0.012346 0.027778
6 0.0064 0.017778 0.015625 0.04 0.0625 - 0.027778
7 0.003906 0.008264 0.005917 0.012346 0.027778 0.027778 -
Table 8.5 Startmatrix f(cij)

Box 8.2 Shortest-path-tree-building using Dijkstra algorithm
As an example, the shortest path tree with node 1 as origin node, and node 7 as destination
node will be built using the Dijkstra algorithm. The length of the links will be used as given in
figure 8.2. The Dijkstra algorithm can be found in chapter 7, in box 7.1.
Initialisation:
Step 1: L(1) = 0, L(2,3,4,5,6,7) = ∞
Step 2: T = {2,3,4,5,6,7}
Step 3: L(2) = 5, L(3) = 7,5 MIN: L(2)
Step 4: Calculate paths via links from node 2:
Path        L(2)  L(3)  L(4)  L(5)  L(6)  L(7)
Old         5     7,5   ∞     ∞     ∞     ∞
Calculated  -     8,5   7,5   10    -     -
New         5     7,5   7,5   10    ∞     ∞
Step 5: T-{2} = {3,4,5,6,7}

Iteration 1:
Step 3: MIN: L(3)
Path        L(2)  L(3)  L(4)  L(5)  L(6)  L(7)
Old         5     7,5   7,5   10    ∞     ∞
Calculated  -     -     11,5  -     15,5  -
New         5     7,5   7,5   10    15,5  ∞
Step 5: T-{3} = {4,5,6,7}

Iteration 2:
Step 3: MIN: L(4)
Path        L(2)  L(3)  L(4)  L(5)  L(6)  L(7)
Old         5     7,5   7,5   10    15,5  ∞
Calculated  -     -     -     10,5  12,5  -
New         5     7,5   7,5   10    12,5  ∞
Step 5: T-{4} = {5,6,7}

Iteration 3:
Step 3: MIN: L(5)
Path        L(2)  L(3)  L(4)  L(5)  L(6)  L(7)
Old         5     7,5   7,5   10    12,5  ∞
Calculated  -     -     -     -     18    16
New         5     7,5   7,5   10    12,5  16
Step 5: T-{5} = {6,7}

Iteration 4:
Step 3: MIN: L(6)
Path        L(2)  L(3)  L(4)  L(5)  L(6)  L(7)
Old         5     7,5   7,5   10    12,5  16
Calculated  -     -     -     -     -     18,5
New         5     7,5   7,5   10    12,5  16
Step 5: T-{6} = {7}

Iteration 5:
Step 3: MIN: L(7) => STOP (L(7) is the length of the shortest path from 1 to 7)
The lengths of the shortest paths from node 1 to nodes 2, 3, 4, 5, 6 and 7 can be read from the last table.

In the following tables the gravity model will be estimated using Furness iterations.
Destination
Origin 1 2 3 4 5 6 7 total target a
1 0,000 0,040 0,018 0,018 0,010 0,006 0,004 0,096 24365,95 254177,4
2 0,040 0,000 0,082 0,160 0,040 0,018 0,008 0,348 9413,95 27076,87
3 0,018 0,082 0,000 0,063 0,020 0,016 0,006 0,204 16889,95 82850,32
4 0,018 0,160 0,063 0,000 0,111 0,040 0,012 0,404 14397,95 35661,88
5 0,010 0,040 0,020 0,111 0,000 0,012 0,028 0,222 29349,95 132419,9
6 0,006 0,018 0,016 0,040 0,063 0,000 0,028 0,170 16889,95 99305,33
7 0,004 0,008 0,006 0,012 0,028 0,028 0,000 0,086 9413,95 109478,5
total 0,096 0,348 0,204 0,404 0,272 0,120 0,086 1,529
target 7545,799 33703,09 9292,136 30907,12 13368,35 14285,09 11620,06 120721,7
b 78715,22 96938,5 45580,74 76552,99 49185,05 119114,9 135134,3
Table 8.6 Gravity model: initialisation
Destination
1 0,00 10167,09 4518,77 4518,77 2541,77 1626,74 992,82 24365,95 24365,95 1
2 1083,07 0,00 2210,37 4332,30 1083,07 481,37 223,76 9413,95 9413,95 1
3 1472,91 6763,32 0,00 5178,15 1690,81 1294,54 490,23 16889,95 16889,95 1
4 634,00 5705,90 2228,87 0,00 3962,43 1426,48 440,28 14397,95 14397,95 1
5 1324,20 5296,80 2702,43 14713,31 0,00 1634,86 3678,36 29349,95 29349,95 1
6 635,55 1765,45 1551,65 3972,21 6206,58 0,00 2758,50 16889,95 16889,95 1
7 427,62 904,73 647,78 1351,62 3041,09 3041,09 0,00 9413,95 9413,95 1
total 5577,36 30603,29 13859,85 34066,36 18525,76 9505,07 8583,95 120721,7
target 7545,799 33703,09 9292,136 30907,12 13368,35 14285,09 11620,06 120721,7
b 1,352933 1,10129 0,670435 0,907262 0,721609 1,502892 1,353696
Table 8.7 Gravity model: step 1
Destination
1 0,00 11196,92 3029,54 4099,71 1834,17 2444,81 1343,97 23949,11 24365,95 1,017405
2 1465,33 0,00 1481,91 3930,53 781,56 723,45 302,91 8685,68 9413,95 1,083847
3 1992,75 7448,38 0,00 4697,94 1220,10 1945,55 663,62 17968,33 16889,95 0,939984
4 857,76 6283,85 1494,31 0,00 2859,32 2143,84 596,01 14235,08 14397,95 1,011441
5 1791,55 5833,31 1811,80 13348,83 0,00 2457,01 4979,38 30221,89 29349,95 0,971149
6 859,86 1944,27 1040,28 3603,84 4478,72 0,00 3734,18 15661,15 16889,95 1,078462
7 578,55 996,37 434,30 1226,28 2194,48 4570,44 0,00 10000,41 9413,95 0,941357
total 7545,80 33703,09 9292,14 30907,12 13368,35 14285,09 11620,06 120721,6
target 7545,799 33703,09 9292,136 30907,12 13368,35 14285,09 11620,06 120721,7
b 1 1 1 1 1 1 1
Table 8.8 Gravity model: step 2
Destination
1 0,00 11391,80 3082,27 4171,06 1866,09 2487,36 1367,36 24365,95 24365,95 1
2 1588,19 0,00 1606,16 4260,10 847,09 784,11 328,31 9413,95 9413,95 1
3 1873,16 7001,36 0,00 4415,99 1146,88 1828,78 623,79 16889,95 16889,95 1
4 867,57 6355,74 1511,41 0,00 2892,04 2168,37 602,83 14397,95 14397,95 1
5 1739,86 5665,01 1759,53 12963,70 0,00 2386,12 4835,72 29349,95 29349,95 1
6 927,33 2096,82 1121,90 3886,60 4830,13 0,00 4027,16 16889,95 16889,95 1
7 544,62 937,94 408,83 1154,36 2065,79 4302,41 0,00 9413,95 9413,95 1
total 7540,73 33448,68 9490,10 30851,81 13648,01 13957,15 11785,17 120721,7
target 7545,799 33703,09 9292,136 30907,12 13368,35 14285,09 11620,06 120721,7
b 1,000672 1,007606 0,97914 1,001793 0,979509 1,023496 0,98599
Table 8.9 Gravity model: step 3
As can be seen in table 8.9, after three steps all balancing factors b lie within 3% of 1, i.e. the
accuracy is higher than 97%.
The resulting OD-matrix is used in the next stage.
1 2 3 4 5 6 7
1 0 11392 3082 4171 1866 2487 1367
2 1588 0 1606 4260 847 784 328
3 1873 7001 0 4416 1147 1829 624
4 868 6356 1511 0 2892 2168 603
5 1740 5665 1760 12964 0 2386 4836
6 927 2097 1122 3887 4830 0 4027
7 545 938 409 1154 2066 4302 0
Table 8.10 OD-matrix
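
The Furness iterations can be sketched as follows. The function is a minimal illustration: odd steps scale the rows to the production targets (factors a) and even steps scale the columns to the attraction targets (factors b). The values of cij are copied from table 8.4 as printed, with 0 on the diagonal used here to exclude intrazonal trips; results may differ slightly from the tables in the last digits because of rounding.

```python
def furness(start, row_targets, col_targets, steps):
    """Alternately balance rows (odd steps) and columns (even steps)."""
    t = [row[:] for row in start]
    n = len(t)
    for step in range(1, steps + 1):
        if step % 2 == 1:                                   # row factors a_i
            for i in range(n):
                a = row_targets[i] / sum(t[i])
                t[i] = [a * x for x in t[i]]
        else:                                               # column factors b_j
            for j in range(n):
                b = col_targets[j] / sum(t[i][j] for i in range(n))
                for i in range(n):
                    t[i][j] *= b
    return t

c = [  # table 8.4 as printed; 0 marks the diagonal
    [0, 5.0, 7.5, 7.5, 10.0, 12.5, 16.0],
    [5.0, 0, 3.5, 2.5, 5.0, 7.5, 11.0],
    [7.5, 3.5, 0, 4.0, 7.0, 8.0, 13.0],
    [7.5, 2.5, 4.0, 0, 3.0, 5.0, 9.0],
    [10.0, 5.0, 7.0, 3.0, 0, 9.0, 6.0],
    [12.5, 7.5, 8.0, 5.0, 4.0, 0, 6.0],
    [16.0, 11.0, 13.0, 9.0, 6.0, 6.0, 0],
]
start = [[0.0 if cij == 0 else 1.0 / cij ** 2 for cij in row] for row in c]
production = [24365.95, 9413.95, 16889.95, 14397.95, 29349.95, 16889.95, 9413.95]
attraction = [7545.799, 33703.09, 9292.136, 30907.12, 13368.35, 14285.09, 11620.06]

step1 = furness(start, production, attraction, steps=1)
od = furness(start, production, attraction, steps=3)
for row in od:
    print([round(x) for x in row])
```

Stopping after a row step means the productions are matched exactly, while the attractions are matched only approximately; further iterations would reduce the remaining difference.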
8.4 Stage 3: Modal split

To determine the modal split a logit model will be used in which we assume that there are:
Two different travel modes (car, bus)
Two different market segments (that experience a different utility)
The utility functions for market segment 1 are:
$$U_{Car,1} = -0.012 t - 0.40 c + 2.50$$
$$U_{Bus,1} = -0.01 t - 0.50 c$$
and for market segment 2:
$$U_{Car,2} = -0.02 t - 0.35 c + 3.50$$
$$U_{Bus,2} = -0.015 t - 0.35 c$$
in which t = travel time and c = out-of-pocket cost.

The travel times and costs per OD-pair are given in the following tables:
Travel time by car
     1     2     3     4     5     6     7
1    -     5.0   7.5   7.5   10.0  12.5  16.0
2    5.0   -     3.5   2.5   5.0   7.5   11.0
3    7.5   3.5   -     4.0   7.0   8.0   13.0
4    7.5   2.5   4.0   -     3.0   5.0   9.0
5    10.0  5.0   7.0   3.0   -     4.0   6.0
6    12.5  7.5   8.0   5.0   4.0   -     6.0
7    16.0  11.0  13.0  9.0   6.0   6.0   -
Travel time by bus
     1     2     3     4     5     6     7
1    -     7.5   10.0  10.0  15.0  19.0  22.5
2    7.5   -     6.0   5.0   7.5   11.0  16.0
3    10.0  6.0   -     5.5   10.0  12.5  20.0
4    10.0  5.0   5.5   -     5.0   7.5   13.5
5    15.0  7.5   10.0  5.0   -     6.5   9.0
6    19.0  11.0  12.5  7.5   6.5   -     10.0
7    22.5  16.0  20.0  13.5  9.0   10.0  -
Car cost
     1     2     3     4     5     6     7
1    -     1.25  0.75  1.50  1.00  1.25  1.60
2    0.50  -     0.35  1.00  0.50  0.75  1.10
3    0.75  1.10  -     1.15  0.70  0.80  1.30
4    0.75  1.00  0.40  -     0.30  0.50  0.90
5    1.00  1.25  0.70  1.05  -     0.40  0.60
6    1.25  1.50  0.80  1.25  0.40  -     0.60
7    1.60  1.80  1.30  1.65  0.60  0.60  -
Bus cost
     1     2     3     4     5     6     7
1    -     1.50  1.50  1.50  1.50  1.50  2.00
2    1.50  -     1.25  1.25  1.25  1.25  1.50
3    1.50  1.25  -     1.25  1.25  1.25  1.50
4    1.50  1.25  1.25  -     1.25  1.25  1.50
5    1.50  1.25  1.25  1.25  -     1.25  1.50
6    1.50  1.25  1.25  1.25  1.25  -     1.50
7    2.00  1.50  1.50  1.50  1.50  1.50  -
Table 8.11 Travel time and cost of both modes
The following binomial logit model is used:
$$P_i = \frac{e^{U_i}}{\sum_j e^{U_j}}$$
with P_i the probability of choosing mode i.
In the following tables the probability of choosing the car is calculated per OD-pair for both market
segments, using the utility functions given above. We are only interested in the car since only the car
network in the study area is considered.
Pij(car) segment 1 1 2 3 4 5 6 7
1 0.9241 0.94075 0.95073 0.9346 0.94685 0.94213 0.9475
2 0.9554 0.92414 0.95271 0.9396 0.94979 0.94506 0.9447
3 0.9507 0.9372 0.92414 0.9354 0.94588 0.94449 0.94125
4 0.9507 0.93963 0.95129 0.9241 0.95343 0.94979 0.94868
5 0.9468 0.93339 0.94588 0.9381 0.92414 0.95175 0.95382
6 0.9421 0.92724 0.94449 0.9334 0.95175 0.92414 0.95426
7 0.9475 0.92811 0.94125 0.9319 0.95382 0.95426 0.92414
Pij(car) segment 2 1 2 3 4 5 6 7
1 0.97069 0.973403 0.977302 0.97069 0.975873 0.973979 0.974852
2 0.97942 0.970688 0.97886 0.97372 0.977577 0.975636 0.974913
3 0.9773 0.972682 0.970688 0.97174 0.975932 0.975517 0.973661

4 0.9773 0.973725 0.978119 0.97069 0.979113 0.977577 0.976626
5 0.97587 0.971041 0.975932 0.97301 0.970688 0.978437 0.978752
6 0.97398 0.968553 0.975517 0.97104 0.978437 0.970688 0.979061
7 0.97485 0.96817 0.973661 0.96982 0.978752 0.979061 0.970688
Table 8.12 Modal split car per OD-pair

The population fraction per market segment is as follows:
Zone Market segment 1 Market segment 2

1 0.37 0.63
2 0.58 0.42
3 0.42 0.58
4 0.66 0.34
5 0.31 0.69
6 0.53 0.47
7 0.34 0.66
Table 8.13 Population fraction per market segment
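Assuming an average car occupancy of 1.4 persons, the total number of trips by car
can be determined with the following formula:
$$T_{ij}^{Car} = \frac{\left( P_{ij}^{Car,seg1} \, m_i^{seg1} + P_{ij}^{Car,seg2} \, m_i^{seg2} \right) T_{ij}}{1.4}$$
where m_i^{seg1} and m_i^{seg2} are the population fractions of the two market segments in origin
zone i (table 8.13).
The definitive OD-matrix is then: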
Tij (car) 1 2 3 4 5 6 7
1 0 7822 2130 2852 1286 1709 942
2 1095 0 1105 2903 582 536 224
3 1293 4790 0 3017 789 1257 428
4 595 4319 1037 0 1988 1485 413
5 1202 3882 1215 8910 0 1653 3354
6 634 1418 769 2641 3327 0 2778
7 376 640 281 789 1432 2983 0
Table 8.14 OD-matrix
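
The modal split calculation for a single OD-pair can be sketched as follows, here for the trips from zone 1 to zone 2. The function name is illustrative; the travel times and costs come from table 8.11, the segment fractions of zone 1 from table 8.13 and the number of person trips from table 8.10.

```python
import math

def car_probability(t_car, c_car, t_bus, c_bus, segment):
    """Binomial logit probability of choosing the car for one market segment."""
    if segment == 1:
        u_car = -0.012 * t_car - 0.40 * c_car + 2.50
        u_bus = -0.010 * t_bus - 0.50 * c_bus
    else:
        u_car = -0.020 * t_car - 0.35 * c_car + 3.50
        u_bus = -0.015 * t_bus - 0.35 * c_bus
    return math.exp(u_car) / (math.exp(u_car) + math.exp(u_bus))

# OD-pair 1-2: car 5.0 min and 1.25 cost units, bus 7.5 min and 1.50 cost units.
p_seg1 = car_probability(5.0, 1.25, 7.5, 1.50, segment=1)
p_seg2 = car_probability(5.0, 1.25, 7.5, 1.50, segment=2)

# Weight by the segment fractions of origin zone 1 and convert persons to cars.
car_share = 0.37 * p_seg1 + 0.63 * p_seg2
car_trips = car_share * 11391.80 / 1.4
print(round(p_seg1, 5), round(p_seg2, 6), round(car_trips))
```

Repeating this for every OD-pair, with the segment fractions of the origin zone, gives the car OD-matrix.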
8.5 Stage 4: Assignment

For this case an all-or-nothing assignment is used. The shortest-path trees were already built to
determine the trip distribution. The following table summarises which OD-pairs use each link.
Link Used by OD-pair i-j

1-2 1-2 2-1 1-4 4-1 1-5 5-1 1-6 6-1 1-7 7-1
1-3 1-3 3-1
2-3 2-3 3-2
2-4 1-4 4-1 1-6 6-1 2-4 4-2 2-6 6-2
2-5 1-5 5-1 1-7 7-1 2-5 5-2 2-7 7-2
3-4 3-4 4-3 3-5 5-3 3-7 7-3
3-6 3-6 6-3
4-5 3-5 5-3 3-7 7-3 4-5 5-4 4-7 7-4
4-6 1-6 6-1 2-6 6-2 4-6 6-4
5-6 5-6 6-5
5-7 1-7 7-1 2-7 7-2 3-7 7-3 4-7 7-4 5-7 7-5
6-7 6-7 7-6
Table 8.15 Used links by OD-pairs
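
The loading step can be sketched as follows: for every link, the car trips of all OD-pairs that use it are added up, in both directions together. The OD-matrix is taken from table 8.14 and the link usage from table 8.15; the dictionary layout and function name are chosen here for illustration.

```python
# Car OD-matrix (table 8.14); od[i][j] holds the trips from zone i+1 to zone j+1.
od = [
    [0, 7822, 2130, 2852, 1286, 1709, 942],
    [1095, 0, 1105, 2903, 582, 536, 224],
    [1293, 4790, 0, 3017, 789, 1257, 428],
    [595, 4319, 1037, 0, 1988, 1485, 413],
    [1202, 3882, 1215, 8910, 0, 1653, 3354],
    [634, 1418, 769, 2641, 3327, 0, 2778],
    [376, 640, 281, 789, 1432, 2983, 0],
]

# Table 8.15: zone pairs whose shortest path uses each link (each pair counts both ways).
link_usage = {
    (1, 2): [(1, 2), (1, 4), (1, 5), (1, 6), (1, 7)],
    (1, 3): [(1, 3)],
    (2, 3): [(2, 3)],
    (2, 4): [(1, 4), (1, 6), (2, 4), (2, 6)],
    (2, 5): [(1, 5), (1, 7), (2, 5), (2, 7)],
    (3, 4): [(3, 4), (3, 5), (3, 7)],
    (3, 6): [(3, 6)],
    (4, 5): [(3, 5), (3, 7), (4, 5), (4, 7)],
    (4, 6): [(1, 6), (2, 6), (4, 6)],
    (5, 6): [(5, 6)],
    (5, 7): [(1, 7), (2, 7), (3, 7), (4, 7), (5, 7)],
    (6, 7): [(6, 7)],
}

def load_links(od, link_usage):
    """All-or-nothing loading: two-way load per link."""
    return {link: sum(od[i - 1][j - 1] + od[j - 1][i - 1] for i, j in pairs)
            for link, pairs in link_usage.items()}

loads = load_links(od, link_usage)
for link, load in loads.items():
    print(link, load)
```

Because link costs are fixed in an all-or-nothing assignment, the link usage table does not change with the loads and a single pass is sufficient.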

Now the traffic can be loaded onto the network. Figure 8.4 shows the loads per link, per 24 hours.
Figure 8.4 Assignment

Appendix A Used references
Black, 1981: Urban transport planning – theory and practice, Croom Helm, London, England
Bovy & van der Zijp, 1999: Transportation modelling, Lecture note CTvk4800, Delft University of
Technology, the Netherlands.
Hensher & Button [Eds.], 2000: Handbook of transport modelling, Elsevier Science Ltd., UK
Immers & Stada, 1999: Verkeersmodellen, Lecture note H111, Catholic University of Leuven,
Belgium (in Dutch).
Institute of Transportation Engineers, 1997: Trip generation, USA
Manheim, 1979: Fundamentals of transportation systems analysis, The MIT press, USA
Meyer & Miller, 2001: Urban transportation planning: a decision-oriented approach, Mc Graw Hill
series in transportation, USA
Ortúzar [Eds.], 1992: Simplified transport demand modelling, PTRC Education and Research services
Ltd., England
Ortúzar & Willumsen, 1994: Modelling Transport, 2nd. Ed. Wiley & Sons, England.
Pas, 1986: The urban transportation planning process. In Hanson S. (eds.) The geography of urban
transportation Guilford Press, NY, 49-70.
Rodriguez, Comtois & Slack, 2006: The geography of transport systems. Routledge, England.
Schoemaker, 2002: Samenhang in vervoer- en verkeerssystemen. Uitgeverij Coutinho. [in Dutch].
Slavin, 2003: The Role of GIS in Land Use and Transport Planning. In: Hensher, Button, Haynes &
Stopher. Handbook of Transport Geography and Spatial Systems, (Handbooks in Transport, Volume
5), Elsevier Science, The Netherlands.
Taylor, Young & Bonsall, 1996: Understanding traffic systems: data, analysis and presentation
Ashgate Publishing Ltd., England
Tolley & Rodney, 1995: Transport systems, policy and planning: a geographical approach Longman,
England
Zuidgeest, 2005. Sustainable urban transport development: a dynamic optimisation approach. TRAIL
Research School, The Netherlands.
