Acknowledgement
My special thanks go to Professor Ma for his devoted effort in introducing us to the basic
lessons needed for our future careers and for guiding us through every stage of this work. This work
was only possible because of constant effort over the course of study, so I extend my sincere
appreciation for his valuable advice, constant support, commitment, dedication, encouragement,
precious guidance, creative suggestions, and critical comments in the course Modelling Methods and
Applications in Traffic and Transportation. To conclude, I am grateful for his patience and strong
commitment to the advancement of this course, and I wish him all the best. I accept full
responsibility for any errors that may occur.
Analyzing the Effect of Household Travel Survey Characteristics on Number of Trips Made By the Household
Abstract
With increasing diversification of lifestyles in recent decades, it is becoming more necessary to
consider factors other than gender and age, such as household composition, for accurate travel
behavior analysis. Understanding the amount and type of travel by the residents is important for
planners and policymakers. This study examines how changes in household characteristics influence
the number of trips made by households in the study area.
Household travel data is a critical component of the travel-demand forecasting process. In addition to
providing information on regional travel characteristics, these data are used to estimate and update
travel-demand models designed to analyze proposed transportation policy decisions. The foundation
of this report is the data collected in household travel survey sample of 2000 households drawn from
the 1990 US National Personal Transportation Survey (NPTS). Household and personal
characteristics influence the average number of trips made by a household. The objective of the paper
was to obtain household travel survey data, to determine the relevant household characteristics, and
to identify possible interrelationships between the number of trips made by households and the
household characteristic variables. This study employs simple statistical methods to determine
relationships between household characteristic variables and travel attributes. Descriptive analysis
is used to provide a clearer understanding of household characteristics, such as the number of
persons, income, workers, age, driving population, and number of automobiles in the household.
Furthermore, regression and correlation analyses are used to explore relationships between household
characteristics and the number of trips made by the households. Household composition is an
important indicator for trip pattern analysis, and an aggregate-level understanding of household
characteristics and of how the transportation system serves people is a critical component of
developing policies, plans, and programs that optimize system performance, provide for the mobility
needs of travelers, and maintain economic vitality. This information is also important for assessing
system performance for economic vitality, mobility, equity, and a host of other system objectives.
Key words: Household Characteristics, Number of Trips Made by the Household, Descriptive
Analysis, Correlational Analyses, Normality/Skewness, Regression Analyses, SPSS
Several factors appear to be impacting travel behavior in ways not observed over the past several
decades: advances in communication technology as a potential substitution for travel, increases in
fuel costs, effects of environmental concerns regarding the impact of roadway travel, and fundamental
changes in the income and wealth of the public. If a "new normal" for person travel is being shaped,
it is important for planners and policymakers to understand it (Nancy, 2016). The uncertainty in future
per capita travel demand as well as in demographic and economic growth post-recession suggests a
significant degree of uncertainty regarding estimates of future overall travel demand. Thus, there is
greater uncertainty regarding future transportation infrastructure and service needs. This suggests that
transportation planning will need to embrace scenario planning to explore the robustness of plans in
light of various forecasts for future travel demand. Collectively, these factors create a challenging but
exciting time for transportation planning (Tomasz, 2016).
Different studies have employed simple statistical methods to determine relationships between
household characteristic variables and travel attributes. The overall findings indicate that trips
made by households for work, school, shopping, and recreation depend on household size, household
income, number of workers in the household, number of persons aged in the household, automobiles in
the household, and number of drivers in the household (Nancy, 2016).
Household travel surveys are the primary means for collecting travel-behavior data in a region. They
provide information to measure and assess transportation system performance, and data used in the
prediction of future demands on the regional transportation system. Other reasons for their conduct
may include the replacement of a prior survey and the development of inputs for re-calibrating an
existing travel forecasting model. Finally, household travel surveys may serve a number of secondary
purposes such as measuring public reaction to a proposed transportation policy decision such as the
imposition of tolls or addition of a light-rail service (Jacob, 2019). Household travel surveys (HTS)
are an important data source for transport planning and research. Household size has consistently
proven the most significant variable in predicting trip rates. Vehicle ownership or availability is also
highly significant in predicting trip rates (McDonald and Stopher, 2014) and is critical for predicting
mode shares (Ou and Yu, 2008). Household income is also significant in predicting trip rates and is
highly significant in the prediction of work trip lengths (Elmi et al., 2017). According to 1995 NPTS
data, while household size is the most critical variable, vehicles and income have significant
impacts across the size categories.
Household surveys are stratified sample surveys. A sampling plan is developed that specifies the
number of households and number of trips made by the household to be surveyed cross-classified by
household size, household income, number of workers in the household, and number of persons aged
in the household. In some areas, the sampling stratification also includes the number of automobiles
in the household and number of drivers in the household. The number of household surveys to be
analyzed is based on the estimated number and distribution of households in the population and the
expected amount of travel that will be generated by those households. Statistically, the sampling plan
is designed to evaluate the overall relationship between the surveyed household characteristics and
the level of total person trips in the households (Nancy, 2016).
The household survey was designed to measure the amount of household travel. A target population
is identified, from which a sampling frame is defined. For travel surveys the target population is
further defined as residents in households. Most travel surveys limit households to non-institutional
and non-group homes (such as penitentiaries, dormitories, hospitals, and nursing homes). Specific
information was gathered from 2000 different household surveys ranging from the poor, middle
Household travel data is a critical component of the travel-demand forecasting process. In addition to
providing information on regional travel characteristics, these data are used to estimate and update
travel-demand models designed to analyze proposed transportation policy decisions. The data are
typically generated through a household based survey in which a sample of the population records
their travel patterns over a given time period. This information is combined with socio demographic
information about the sample to develop relationships between individual/household characteristics
and their observed travel patterns (Stephen, 2000). While we may be at a tipping point in terms of
fundamental travel behavior trends, transportation resources are increasingly scarce. We cannot
afford to make decisions that are not based on the best possible data on travel behavior (Tomasz,
2016).
1. Whether there is a relationship between the number of trips made by an individual household and
other household characteristic variables.
2. Whether changes in household characteristic variables influence the number of trips made by an
individual household.
The following section reports on a variety of household characteristics obtained from the household
travel survey. In this section, these household and person characteristics are related to household
travel characteristics. Household size, household income, household life cycle, household vehicle
availability, household licensed drivers, and household employment all affect the amount of
household travel. A household consists of persons living at the same residential address who share
meals and have some type of relationship. Household characteristics provide an overview of the type
of structures that renters live in, as well as their household living arrangements, number of members
per household, number of vehicles per household, and so on. The household sector includes individuals
or groups of individuals (Jacob, 2019).
The first step in setting up the simulation was to categorize the households into relatively
homogeneous groupings with respect to each dependent variable. The initial idea was to develop one
household schema to predict all four trip attributes. This would entail treating each attribute as a
multivariate entity (i.e., simultaneously) and is one approach used to model household activity-travel
patterns for the TRANSIMS project (Vaughn et al., 1999). In recent years, the incorporation of
powerful classification algorithms into standard statistical software has increased the options
available for this type of market segmentation problem. In this paper, the SPSS add-on module was
used to assist with the initial delineation of categories that differ with respect to a dependent
variable.
Both households and families are the basic units in the society to define demography. The main
difference between family and household is that a family refers to a group of members who maintain
kinship with each other by living in the same dwelling or different dwellings, whereas a household
refers to a group of people who may or may not maintain kinship with each other while living in the
same dwelling (Tomasz, 2016). The living arrangements of individuals may have different stages.
Thus, an individual may live in a family household with parents and may leave the family household
when he or she grows up. After that, they may live in a non-family household with their friends.
Likewise, one can form a new household with a spouse and have children. Consequently, it becomes
a separate family household. Due to many reasons like migrations, many people around the world
live in non-family households. Therefore, all households cannot be considered as families. A
household refers to a small social unit composed of members living in the same house, apartment, or
annex. Here, a group of people shares the same dwelling irrespective of their kinship (Westat, 2019).
Most of the time, members of a household are family members.
But in some instances, there can be students and workers who share the same house. These are non-
family households – a dwelling shared by non-related people. In contrast, family households consist
of two or more individuals who share a blood relationship, kinship, or adoption.
Vehicle availability is the number of vehicles available to members of a household for making trips,
and vehicle occupancy is the number of occupants in a vehicle during a vehicle trip, including the
driver. In general, as the number of vehicles available to the household increases, daily household
travel increases. This household characteristic also affects forecasting and the demand for public
transportation: as household vehicle availability increases, the household demand for public
transportation tends to decrease. Different studies report household trip rates as a function of the
number of vehicles available to household members for travel. As expected, households with no
vehicles available made fewer trips per household than households that have vehicles available;
however, households with no vehicles still make a meaningful number of trips (CAMPO, 2017).
Generally, vehicle ownership, or vehicle availability, is one of the most important factors
influencing transportation demand. Growth in vehicle ownership has far outpaced growth in the U.S.
population, more than tripling in the last four decades. Vehicle ownership varies across the nation.
Overall, 8.6 percent of U.S. households do not have access to a vehicle (either by choice or by
circumstance) according to the 2019 American Community Survey. Not surprisingly, income is one
of the major determinants of the number of vehicles in a household (i.e., lower-income households
tend to own fewer or no vehicles compared to higher-income households). However, additional factors
influence vehicle ownership besides income. Households with no vehicles were more likely to live in
urban areas, be renters, and have incomes under $25,000 as compared to households with at least one
vehicle. This is likely due to changes in household size, labor force participation, and access to
alternative transportation modes (such as on-demand transportation and shared modes). For example,
as household size decreases, the number of vehicles per household also declines, as there are fewer
drivers (McDonald and Stopher, 2014).
A person trip is a trip by one or more persons in any mode of transportation. Each person is considered
as making one person trip. For example, four persons traveling together in one auto are counted as
four person trips. A trip is travel between two addresses for the purpose of carrying out one or more
activities (e.g., a trip from home to work or a family trip from home to the beach); alternatively, a
trip can be defined as an individual movement by motorized means of transport in one direction
(NHTS, 2017). Each trip has two ends: the first is located at the start of the trip (origin) and the
second at the trip end (destination). Trips are usually divided into home-based and non-home-based.
The trip generation
process aims at estimating the total number of trips generated from and attracted to each traffic
analysis zone of the study area for each trip purpose. It predicts the number of trips originating in or
Trip generation analysis focuses on residences and residential trip generation is thought of as a
function of the social and economic attributes of households. At the level of the traffic analysis zone,
residential land uses "produce" or generate trips. Traffic analysis zones are also destinations of trips,
trip attractors. The analysis of attractors focuses on non-residential land uses (Westat, 2019).
Although different urban settings will have different trip compositions, most of the trips undertaken
in urban areas across the world are work-based:
Work. Commutes performed towards the workplace, which represent approximately of daily
commutes.
Business (work). Trips from the workplace to a business destination.
Personal. Trips related to personal activities such as restaurants, the library, or the post office.
Shopping. Commutes towards any store regardless of its size, merchandise, or whether or not
any purchases are made.
Social and recreational. Social trips are related to activities such as visiting family and friends.
Recreational trips are performed with the intention of recreation such as cultural or sports
events. These trips represent about 27% of daily commutes.
Education. Commutes towards a learning establishment by those seeking any type of training,
regardless of the level of learning. These commutes represent 10% of the daily travel total.
The method has a number of advantages over difference testing in that it is quantitative and can be
used to describe differences between products and the main sensory drivers (be they positive or
negative, identified within products or especially when combined with objective consumer testing
and objective multivariate data analysis). However, the method can be expensive and time consuming
because of the necessity to train and profile individual panelists over extended periods of time,
days or even weeks. It is also not a method that can be readily used for routine analysis. Later we
will discuss ‘flash profiling’ (FP) as a compromise method of analysis. Descriptive analysis is a
method where defined sensory terms are quantified by sensory panelists. A list of descriptive terms
is determined initially and is referred to as a lexicon or descriptive vocabulary; these terms
describe the specific
sensory attributes in a meat sample and can be used to evaluate the changes in these attributes (Byrne
Data aggregation and data mining are two techniques used in descriptive analysis to summarize
historical data. In data aggregation, data is first collected and then sorted to make the datasets
more manageable. Descriptive techniques often include constructing tables of quantiles and means,
measures of dispersion such as variance or standard deviation, and cross-tabulations or "crosstabs"
that can be used to examine many disparate hypotheses. These hypotheses often highlight differences
among subgroups. Measures like segregation, discrimination, and inequality are studied using
specialized descriptive techniques. Discrimination is measured with the help of audit studies or
decomposition methods. More segregation on the basis of type or inequality of outcomes need not be
wholly good or bad in itself, but it is often considered a marker of unjust social processes; accurate
measurement of the different steps across space and time is a prerequisite to understanding these
processes (Ayush, 2021).
A table of means by subgroup is used to show important differences across subgroups, which often
leads to inferences and conclusions. When we notice a gap in earnings, for example, we naturally
tend to extrapolate reasons for those patterns. But this also enters the province
of measuring impacts which requires the use of different techniques. Often, random variation causes
difference in means, and statistical inference is required to determine whether observed differences
could happen merely due to chance. A crosstab or two-way tabulation is supposed to show the
proportions of components with unique values for each of two variables available, or cell proportions.
For example, we might tabulate the proportion of the population that has a high school degree and
also receives food or cash assistance, meaning a crosstab of education versus receipt of assistance is
supposed to be made. Then we might also want to examine row proportions, or the fractions in each
education group who receive food or cash assistance, perhaps seeing assistance levels dip
extraordinarily at higher education levels (Ayush, 2021).
Column proportions can also be examined, giving the fraction of the population at each level of
education within a column, but this runs in the opposite direction from any causal effect. We might
come across a surprisingly high number or proportion of recipients with a college education, but
this might simply be a result of there being more college graduates than people who have less than a
high school degree.
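The distinction between cell, row, and column proportions can be made concrete with a small sketch. The records below are hypothetical and are not drawn from the survey; the function name `crosstab_proportions` is likewise only illustrative.

```python
from collections import Counter

# Hypothetical records (not survey data): (education level, receives assistance?)
records = [
    ("less than high school", True),
    ("less than high school", True),
    ("less than high school", False),
    ("high school", True),
    ("high school", False),
    ("high school", False),
    ("college", True),
    ("college", False),
    ("college", False),
    ("college", False),
]


def crosstab_proportions(pairs):
    """Return cell, row, and column proportions for (row_key, col_key) pairs."""
    cells = Counter(pairs)
    row_totals = Counter(r for r, _ in pairs)
    col_totals = Counter(c for _, c in pairs)
    n = len(pairs)
    # Cell proportion: share of the whole sample in each (row, column) cell.
    cell_prop = {k: v / n for k, v in cells.items()}
    # Row proportion: share within each education group.
    row_prop = {(r, c): v / row_totals[r] for (r, c), v in cells.items()}
    # Column proportion: share within each assistance group.
    col_prop = {(r, c): v / col_totals[c] for (r, c), v in cells.items()}
    return cell_prop, row_prop, col_prop


cell_prop, row_prop, col_prop = crosstab_proportions(records)
print(row_prop[("college", True)])   # share of college group receiving assistance
print(col_prop[("college", True)])   # share of recipients who are college educated
```

Row proportions answer "what fraction of each education group receives assistance", while column proportions answer "what fraction of recipients has each education level", which is why the two can tell very different stories.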
The descriptive analysis includes mean, standard deviation, skewness and kurtosis values. Mean or
average value, a measure of central tendency, is popularly used to indicate the center of distribution.
In addition, the standard deviation is used to see how the data have deviated from the mean. Kurtosis
and skewness are generally used to delineate the shape of the distribution. The mode is the value that
occurs most frequently. The appropriate measures of central tendency depend on the level of
measurement: Nominal: Mode; Ordinal: Median, Mode; Scale: Mean, Median, Mode.
Descriptive statistics are numerical and graphical methods used to summarize data and bring forth the
underlying information. The numerical methods include measures of central tendency and measures
of variability. Descriptive statistics in SPSS can be accessed by clicking Analyze Menu → Descriptive
Statistics. Detailed information can be obtained using Frequencies, Descriptive, Explore or Crosstabs.
There are, however, different procedures depending on whether you have a categorical or continuous
variable. Some of the statistics (e.g. mean, standard deviation) are not appropriate if you have a
categorical variable.
Frequencies

Statistics: Total number of persons in household
  N (valid)                 2000
  N (missing)                  0
  Mean                      2.67
  Median                    2.00
  Mode                         2
  Std. Deviation           1.416
  Variance                 2.004
  Skewness                 0.943
  Std. Error of Skewness   0.055
  Minimum                      1
  Maximum                     10
  Sum                       5347
Bar Chart
Frequencies

Statistics: Number of workers in the HH
  N (valid)                 2000
  N (missing)                  0
  Mean                      1.18
  Median                    1.00
  Mode                         1
  Std. Deviation           0.933
  Variance                 0.870
  Skewness                 0.679
  Std. Error of Skewness   0.055
  Minimum                      0
  Maximum                      7
  Sum                       2368

Frequency table: Number of workers in the HH
  Value   Frequency   Percent   Valid Percent   Cumulative Percent
  0             503      25.2            25.2                 25.2
  1             791      39.6            39.6                 64.7
  2             579      29.0            29.0                 93.7
  3              99       5.0             5.0                 98.6
  4              20       1.0             1.0                 99.6
  5               7       0.4             0.4                100.0
  7               1       0.1             0.1                100.0
  Total        2000     100.0           100.0
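As a cross-check, the summary statistics for the number of workers can be recomputed directly from the frequency table above. The sketch below uses only the frequencies reported there. The skewness and standard-error formulas are the usual sample-adjusted ones; that SPSS uses exactly these is an assumption here, although the results agree with the reported output to three decimals. The function name `summarize` is illustrative.

```python
import math
import statistics

# Frequencies for "Number of workers in the HH", copied from the table above.
freq = {0: 503, 1: 791, 2: 579, 3: 99, 4: 20, 5: 7, 7: 1}


def summarize(freq):
    """Compute descriptive statistics from a {value: count} frequency table."""
    n = sum(freq.values())
    total = sum(x * f for x, f in freq.items())
    mean = total / n
    ss = sum(f * (x - mean) ** 2 for x, f in freq.items())
    variance = ss / (n - 1)                      # sample variance
    # Expand the table into raw values to get the median.
    values = sorted(x for x, f in freq.items() for _ in range(f))
    # Adjusted (sample) skewness, assumed to match the SPSS definition.
    m2 = ss / n
    m3 = sum(f * (x - mean) ** 3 for x, f in freq.items()) / n
    skew = (m3 / m2 ** 1.5) * math.sqrt(n * (n - 1)) / (n - 2)
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return {
        "n": n,
        "sum": total,
        "mean": mean,
        "median": statistics.median(values),
        "mode": max(freq, key=freq.get),
        "variance": variance,
        "std_dev": math.sqrt(variance),
        "skewness": skew,
        "se_skewness": se_skew,
    }


stats = summarize(freq)
for name, value in stats.items():
    print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```

The same function can be applied to any of the other frequency tables by swapping in their counts.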
Frequencies

Statistics: Number of automobiles
  N (valid)                 2000
  N (missing)                  0
  Mean                      1.79
  Median                    2.00
  Mode                         2
  Std. Deviation           1.020
  Variance                 1.040
  Skewness                 0.769
  Std. Error of Skewness   0.055
  Minimum                      0
  Maximum                      8
  Sum                       3574

Frequency table: Number of automobiles
  Value   Frequency   Percent   Valid Percent   Cumulative Percent
  0             149       7.4             7.4                  7.4
  1             652      32.6            32.6                 40.1
  2             823      41.2            41.2                 81.2
  3             267      13.4            13.4                 94.6
  4              78       3.9             3.9                 98.5
  5              26       1.3             1.3                 99.8
  6               3       0.2             0.2                 99.9
  7               1       0.1             0.1                100.0
  8               1       0.1             0.1                100.0
  Total        2000     100.0           100.0
Correlation analysis is a statistical method used to discover whether there is a relationship between
two variables or datasets and how strong that relationship may be. Essentially, correlation analysis
is used for spotting patterns within datasets. A positive correlation means that both variables
increase together, while a negative correlation means that as one variable decreases, the other
increases. Correlation coefficients: three coefficients are commonly used to measure statistical
correlation, namely Spearman's, Kendall's, and Pearson's. Pearson's coefficient is conventionally
written r (Spearman's is written ρ and Kendall's τ). Spearman's rank and Pearson's coefficient are
the two most widely used, depending on the type of data researchers have to hand. In this paper we
use the Pearson correlation method to discover the relationship, and its strength, between the
number of trips made by the household and each of the seven household characteristic variables
(number of persons in the household, number of persons aged 0-4 in the household, number of persons
aged 5-21 in the household, number of workers in the household, number of drivers in the household,
household income, and household vehicle ownership).
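To make the Pearson coefficient concrete, here is a minimal sketch of its computation. The paper itself uses SPSS; the function name `pearson_r` and the small household-size and trip-count lists are hypothetical and serve only as an illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical illustration data (not taken from the survey).
household_size = [1, 2, 2, 3, 4, 5]
trips = [2, 4, 5, 7, 9, 10]

r = pearson_r(household_size, trips)
print(f"r = {r:.3f}")
```

A value of r near +1 for these made-up numbers reflects that larger households in the example make more trips; with the real survey data the coefficient would be read from the SPSS correlation output.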
Interpreting results: For a positive correlation, any score from +0.5 to +1 indicates a strong
positive correlation, which means that both variables increase together. The line of best fit, or
trend line, is placed to best represent the data on the graph; in this case it follows the data points
upwards. For a negative correlation, any score from -0.5 to -1 indicates a strong negative
correlation, which means that as one variable increases, the other decreases. In this case the line
of best fit slopes downwards. A score of 0 indicates that there is no linear correlation, or
relationship, between the two variables.
As the model summary tables show, all household characteristic variables are positively correlated
with the number of trips made by the household. In addition, the number of persons in the household,
the number of persons aged 5-21 in the household, the number of workers in the household, and the
number of drivers in the household have a relatively high degree of correlation with the number of
trips made by the household; the number of automobiles in the household and household income have a
moderate degree of correlation; and the number of persons aged 0-4 in the household has a relatively
low degree of correlation with the number of trips made by the household.
Checking the normality assumption is necessary to decide whether a parametric or non-parametric test
needs to be used. Descriptive statistics in SPSS can also describe the distribution of scores on
continuous variables (skewness and kurtosis). These statistics are important when using parametric
statistical techniques (t-tests, ANOVA, correlation, or regression). Skewness indicates whether the
distribution is symmetric, while kurtosis provides information about the ‘peakedness’ of the
distribution. In statistics, normality tests are used to determine whether a data set is well modeled
by a normal distribution. Many statistical procedures require that a distribution be normal or nearly
normal. There are both graphical and statistical methods for evaluating normality: graphical methods
include the histogram and normality plot, and statistically, two numerical measures of shape,
skewness and excess kurtosis, can be used to test for normality.
If the distribution is perfectly normal, you would obtain skewness and kurtosis values of 0 (rather
an uncommon occurrence in the social sciences). Positive skewness values indicate positive skew
(scores clustered to the left at the low values). Negative skewness values indicate a clustering of
scores at the high end (right-hand side of a graph). Most researchers consider data to be
approximately normal in shape if the skewness and kurtosis values fall anywhere from -1.0 to +1.0.
To display skewness and kurtosis in the output, select the Options button after entering the
variables in the Variable(s) list box, then tick Skewness and Kurtosis in the dialog box that opens.
Statistics (N valid = 2000 and N missing = 0 for every variable)
  Variable                                Skewness   Std. Error of Skewness
  Number of trips made by a household        1.650                    0.055
  Total number of persons in household       0.943                    0.055
  Number of persons aged 0-4                 2.628                    0.055
  Number of persons aged 5-21                1.771                    0.055
  Number of workers in the HH                0.679                    0.055
  Number of drivers in the HH                0.607                    0.055
  Household income                           1.063                    0.055
  Number of automobiles                      0.769                    0.055
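The rule of thumb stated earlier (skewness between -1.0 and +1.0 is treated as approximately normal) can be applied mechanically to the reported skewness values. The sketch below only encodes that rule; the function name `screen_skewness` is illustrative and kurtosis is not checked because it is not reported in the tables.

```python
# Skewness values as reported in the SPSS output above.
reported_skewness = {
    "Number of trips made by a household": 1.650,
    "Total number of persons in household": 0.943,
    "Number of persons aged 0-4": 2.628,
    "Number of persons aged 5-21": 1.771,
    "Number of workers in the HH": 0.679,
    "Number of drivers in the HH": 0.607,
    "Household income": 1.063,
    "Number of automobiles": 0.769,
}


def screen_skewness(skewness, limit=1.0):
    """Flag each variable as approximately normal if |skewness| <= limit."""
    return {name: abs(value) <= limit for name, value in skewness.items()}


approx_normal = screen_skewness(reported_skewness)
for name, ok in approx_normal.items():
    label = "within -1.0 to +1.0" if ok else "outside -1.0 to +1.0"
    print(f"{name}: {reported_skewness[name]:.3f} ({label})")
```

Under this rule, household size, workers, drivers, and automobiles fall inside the range, while trips, both age-group counts, and household income fall outside it; all values are positive, so every variable is right-skewed to some degree.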
5. Household Income
Regression models describe the relationship between variables by fitting a line to the observed data.
Linear regression models use a straight line, while logistic and nonlinear regression models use a
curved line. Regression allows you to estimate how a dependent variable changes as the independent
variable(s) change.
Assumptions of simple linear regression: Simple linear regression is a parametric test, meaning that
it makes certain assumptions about the data. These assumptions are:
Homogeneity of variance (homoscedasticity): the size of the error in our prediction doesn’t
change significantly across the values of the independent variable.
Independence of observations: the observations in the dataset were collected using statistically
valid sampling methods, and there are no hidden relationships among observations.
Normality: The data follows a normal distribution.
The relationship between the independent and dependent variable is linear: the line of best fit
through the data points is a straight line (rather than a curve or some sort of grouping factor).
The dependent and independent variables are measured at the continuous level.
There needs to be a linear relationship between the two variables. This can be checked by creating
a scatterplot in SPSS Statistics that plots the dependent variable against the independent variable.
There should be no significant outliers. An outlier is an observed data point whose dependent
variable value is very different from the value predicted by the regression equation.
The regression models above describe the relationship between variables by fitting a line to the
observed data. Linear regression models use a straight line, and regression allows us to estimate
how a dependent variable changes as the independent variable(s) change. An independent variable,
sometimes called an experimental or predictor variable, is a variable that is being manipulated in an
experiment in order to observe the effect on a dependent variable, sometimes called an outcome
variable. The formula for a simple linear regression is:

$$y = B_0 + B_1 x + e$$

y is the predicted value of the dependent variable (y) for any given value of the independent
variable (x).
B0 is the intercept, the predicted value of y when x is 0.
B1 is the regression coefficient, that is, how much we expect y to change as x increases.
x is the independent variable (the variable we expect is influencing y).
e is the error of the estimate, or how much variation there is in our estimate of the
regression coefficient.
Linear regression finds the line of best fit through your data by searching for the regression
coefficient (B1) that minimizes the total error (e) of the model.
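The fitting step can be sketched with the closed-form least-squares solution for one predictor. This is an illustration rather than the SPSS procedure used in the paper; the function name `fit_line` and the small data set are hypothetical, and B0 and B1 follow the notation of the formula above.

```python
def fit_line(x, y):
    """Ordinary least squares for y = B0 + B1*x; returns (B0, B1, R squared)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)      # total sum of squares
    if sxx == 0 or sst == 0:
        raise ValueError("x and y must each take more than one distinct value")
    b1 = sxy / sxx                 # slope minimizing the squared error
    b0 = mean_y - b1 * mean_x      # line passes through the point of means
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    r_squared = 1 - sse / sst      # share of variance explained by the line
    return b0, b1, r_squared


# Hypothetical illustration data (not taken from the survey).
household_size = [1, 2, 3, 4, 5]
trips = [3, 5, 6, 9, 11]

b0, b1, r_squared = fit_line(household_size, trips)
print(f"trips = {b0:.2f} + {b1:.2f} * household_size, R^2 = {r_squared:.3f}")
```

For these made-up numbers the fitted line adds about two trips per additional household member; R squared corresponds to the value SPSS reports in its Model Summary table.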
SPSS Statistics generate quite a few tables of output for a linear regression. In this section, we show
you only the three main tables required to understand the results from the linear regression procedure,
assuming that no assumptions have been violated.
The first table of interest is the Model Summary table, as shown table provides
the R and R2 values. The R value represents the simple correlation. Therefore, according to
the above tables using the (the "R" Column), which indicates number of persons in the
household and number of persons aged 5-21 in the household have relatively high degree of
correlation with number of trips made by the household; number of workers in the household,
number of drivers in the household, and number of automobiles in the household have
moderate degree of correlation with number of trips made by the household; and number of
persons aged 0-4 in the household and household income have relatively minimum degree of
correlation with number of trips made by the household.
The next table is the ANOVA table, which reports how well the regression equation fits the
data (i.e., predicts the dependent variable). To read it, look at the "Regression" row and go to
the "Sig." column, which gives the statistical significance of the regression model that was
run. Here, p < 0.0005, which is less than 0.05 and indicates that, overall, the regression model
statistically significantly predicts the outcome variable (i.e., it is a good fit for the data).
The third table, the Coefficients table, provides the information needed to predict the
dependent variable from the independent variable, and to determine whether the independent
variable contributes statistically significantly to the model (by looking at the "Sig." column).
The values in the "B" column under "Unstandardized Coefficients" give the terms of the
regression equation. In the tables above, every independent variable contributes statistically
significantly to its model.
The last three lines of the model summary are statistics about the model as a whole. The most
important thing to notice here is the p value of the model. Here it is significant (p < 0.001),
which means that this model is a good fit for the observed data.
It can also be helpful to include a graph with your results. For a simple linear regression, you
can simply plot the observations on the x and y axis and then include the regression line and
regression function.
Multiple regression analysis investigates the relationship between a dependent variable and several
independent variables. An independent variable is an input, driver or factor that has an impact on a
dependent variable (which can also be called an outcome). A variable is not only something that we
measure, but also something that we can manipulate and something we can control for. Variables can
further be characterized as either categorical or continuous.
Assumptions: Multiple linear regression follows the same logic as univariate linear regression,
except that there is more than one independent variable and there should be no collinearity among
the independent variables. Multiple regression analyses are also affected by three factors, namely
sample size, missing data and the nature of the sample.2
A small sample may only reveal connections among variables with a strong relationship.
Therefore, the sample size must be chosen based on the number of independent variables and
the expected strength of the relationship.
Many missing values in the data set may affect the sample size. Therefore, all the missing
values should be adequately dealt with before conducting regression analyses.
The subsamples within the larger sample may mask the actual effect of independent and
dependent variables. Therefore, if subsamples are predefined, a regression within the sample
could be used to detect true relationships. Otherwise, the analysis should be undertaken on the
whole sample.
Regression
Variables Entered/Removed (a)
Model  Variables Entered (b)                                 Variables Removed  Method
1      Number of automobiles, Number of persons aged 0-4,    .                  Enter
       Number of persons aged 5-21, Household income,
       Number of workers in the HH, Number of drivers in
       the HH, Total number of persons in household
a. Dependent Variable: Number of trips made by a household
b. All requested variables entered.
Model Summary (b)
Model  R         R Square  Adjusted R Square  Std. Error of the Estimate
1      .632 (a)  .399      .397               4.862

Change Statistics
Model  R Square Change  F Change  df1  df2   Sig. F Change
1      .399             189.116   7    1992  .000
a. Predictors: (Constant), Number of automobiles, Number of persons aged 0-4, Number of persons aged 5-21,
Household income, Number of workers in the HH, Number of drivers in the HH, Total number of persons in
household
b. Dependent Variable: Number of trips made by a household
ANOVA (a)
Model 1     Sum of Squares  df    Mean Square  F        Sig.
Regression  31294.464       7     4470.638     189.116  .000 (b)
Residual    47090.088       1992  23.640
Total       78384.552       1999
a. Dependent Variable: Number of trips made by a household
b. Predictors: (Constant), Number of automobiles, Number of persons aged 0-4, Number of persons aged 5-21,
Household income, Number of workers in the HH, Number of drivers in the HH, Total number of persons
in household
Coefficients (a)
Columns: B and Std. Error are Unstandardized Coefficients; Beta is the Standardized Coefficient;
Tolerance and VIF are Collinearity Statistics.
Model 1                               B         Std. Error  Beta  t       Sig.  Tolerance  VIF
(Constant)                            .575      .324              1.777   .076
Total number of persons in household  .454      .184        .103  2.460   .014  .173       5.764
Number of persons aged 0-4            .355      .289        .029  1.228   .220  .528       1.894
Number of persons aged 5-21           2.325     .209        .348  11.145  .000  .310       3.224
Number of workers in the HH           1.036     .156        .154  6.657   .000  .561       1.783
Number of drivers in the HH           1.044     .206        .140  5.066   .000  .397       2.519
Household income                      1.495E-5  .000        .052  2.713   .007  .818       1.222
Number of automobiles                 .219      .151        .036  1.450   .147  .499       2.003
a. Dependent Variable: Number of trips made by a household
Collinearity Diagnostics (a)
Variance Proportions columns: (Constant); Persons = Total number of persons in household;
Age 0-4 = Number of persons aged 0-4; Age 5-21 = Number of persons aged 5-21;
Workers = Number of workers in the HH; Drivers = Number of drivers in the HH;
Income = Household income; Autos = Number of automobiles.
Model 1
Dimension  Eigenvalue  Condition Index  (Constant)  Persons  Age 0-4  Age 5-21  Workers  Drivers  Income  Autos
1          5.896       1.000            .00         .00      .00      .00       .00      .00      .00     .00
2          .841        2.647            .00         .00      .50      .00       .00      .00      .00     .00
3          .641        3.033            .01         .00      .00      .29       .00      .00      .02     .01
4          .242        4.938            .06         .00      .00      .01       .54      .01      .19     .00
5          .165        5.975            .07         .01      .00      .00       .18      .04      .69     .07
6          .124        6.889            .36         .01      .02      .00       .10      .00      .04     .48
7          .062        9.748            .15         .00      .00      .00       .17      .82      .06     .41
8          .029        14.290           .34         .98      .47      .69       .01      .11      .00     .02
a. Dependent Variable: Number of trips made by a household
Residuals Statisticsa
Minimum Maximum Mean Std. Deviation N
Predicted Value 1.07 29.87 7.08 3.957 2000
Residual -19.872 39.333 .000 4.854 2000
Std. Predicted Value -1.520 5.760 .000 1.000 2000
Std. Residual -4.087 8.090 .000 .998 2000
a. Dependent Variable: Number of trips made by a household
Regression is a well-known statistical technique for fitting mathematical relationships between
dependent and independent variables. In the case of a trip generation equation, the dependent variable
is the number of trips and the independent variables are the various factors that influence trip
generation. These independent variables are the land use and socioeconomic characteristics discussed
earlier. Furthermore, regression analysis has four primary purposes: description, estimation,
prediction and control. By description, regression can explain the relationship between dependent and
independent variables. Estimation means that, by using the observed values of the independent
variables, the value of the dependent variable can be estimated. Regression analysis can also be useful
for predicting outcomes. It assumes that the distribution of the dependent data is normal, that there is
a linear relationship between the dependent and independent variables, and that the independent
variables are not highly correlated, because high correlation among the independent variables may
distort the estimated relationship between each independent variable and the dependent variable.
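The requirement that the independent variables are not highly correlated can be checked from the Collinearity Statistics in the Coefficients table. The sketch below uses the standard identity VIF = 1 / Tolerance with the tolerance values transcribed from that table. The cut-off of 5 is only a common rule of thumb and is an assumption here, not a threshold stated in this study; the short variable names follow the prediction equation used in this report.

```python
# Tolerance values transcribed from the Coefficients table (rounded by SPSS).
tolerance = {
    "hhsize": 0.173,
    "num0to4": 0.528,
    "num5to21": 0.310,
    "numwork": 0.561,
    "numdrive": 0.397,
    "income": 0.818,
    "numcars": 0.499,
}

# VIF column as printed in the same table, kept only for comparison.
reported_vif = {
    "hhsize": 5.764,
    "num0to4": 1.894,
    "num5to21": 3.224,
    "numwork": 1.783,
    "numdrive": 2.519,
    "income": 1.222,
    "numcars": 2.003,
}

VIF_THRESHOLD = 5.0  # assumed rule-of-thumb cut-off, not taken from the study

# Variance inflation factor is the reciprocal of tolerance.
vif = {name: 1.0 / tol for name, tol in tolerance.items()}
flagged = [name for name, value in vif.items() if value > VIF_THRESHOLD]

for name, value in vif.items():
    print(f"{name:9s} VIF = {value:.3f} (table: {reported_vif[name]:.3f})")
print("Above threshold:", flagged)
```

Small differences from the printed VIF column come from the rounding of the tolerance values. Under the assumed cut-off, only the total number of persons in the household stands out, which is plausible because household size overlaps with the age-group, worker and driver counts.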
The first table of interest is the Model Summary table. This table provides the R, R2, adjusted
R2, and the standard error of the estimate, which can be used to determine how well a
regression model fits the data. The "R" column represents the value of R, the multiple
correlation coefficient. R can be considered to be one measure of the quality of the prediction
of the dependent variable; in this case, number of trips made by the household. A value of
0.632 indicates a moderately good level of prediction.
The "R Square" column represents the R2 value (also called the coefficient of determination),
which is the proportion of variance in the dependent variable that can be explained by the
independent variables. Our value of 0.399 shows that the independent variables explain 39.9%
of the variability of the dependent variable, number of trips made by the household. The
"Adjusted R Square" (adj. R2) value should also be reported, because it corrects R2 for the
number of independent variables in the model; here it is 0.397.
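The R, R2 and adjusted R2 values can be reproduced from the sums of squares in the ANOVA table. The sketch below uses the standard formulas R2 = SS_regression / SS_total and adj. R2 = 1 - (1 - R2)(n - 1)/(n - k - 1), with n = 2000 households and k = 7 independent variables; the variable names are illustrative.

```python
import math

# Values transcribed from the ANOVA table.
ss_regression = 31294.464
ss_total = 78384.552
n = 2000   # households in the sample
k = 7      # independent variables in the model

r_squared = ss_regression / ss_total
multiple_r = math.sqrt(r_squared)
# Adjusted R2 penalises R2 for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R = {multiple_r:.3f}, R2 = {r_squared:.3f}, adjusted R2 = {adj_r_squared:.3f}")
```

Because the sample is large relative to the number of predictors, the adjustment is small and the two values differ only in the third decimal place.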
Statistical significance: the F-ratio in the ANOVA table (see above) tests whether the overall
regression model is a good fit for the data. The table shows that the independent variables
statistically significantly predict the dependent variable, F (7, 1992) = 189.116, p < .0005,
i.e., the regression model is a good fit of the data.
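As a check on the ANOVA table, the F-ratio is the regression mean square divided by the residual mean square, and the standard error of the estimate is the square root of the residual mean square. The sketch below recomputes both from the sums of squares and degrees of freedom in the table; it does not compute the p-value, which would need an F distribution routine.

```python
import math

# Values transcribed from the ANOVA table.
ss_regression, df_regression = 31294.464, 7
ss_residual, df_residual = 47090.088, 1992

ms_regression = ss_regression / df_regression   # mean square, regression
ms_residual = ss_residual / df_residual         # mean square, residual
f_ratio = ms_regression / ms_residual
std_error_estimate = math.sqrt(ms_residual)

print(f"F({df_regression}, {df_residual}) = {f_ratio:.3f}")
print(f"Std. error of the estimate = {std_error_estimate:.3f}")
```

Both results agree with the SPSS output (F = 189.116 and a standard error of the estimate of 4.862) up to rounding.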
Estimated model coefficients: The general form of the equation to predict number of trips
made by the household from seven household characteristics (number of persons in the
household, number of persons aged 0-4 in the household, number of persons aged 5-21 in the
household, number of workers in the household, number of drivers in the household,
household income, and household vehicle ownership) is:
Predicted ntrip = 0.575 + (0.454 x hhsize) + (0.355 x num0to4) + (2.325 x num5to21) +
(1.036 x numwork) + (1.044 x numdrive) + (1.495E-5 x income) + (0.219 x numcars)
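To show how the fitted equation is applied, the sketch below evaluates it for one household, using the intercept and the "B" column of the Coefficients table (all seven coefficients are positive there). The function name and the example household are hypothetical and are not drawn from the survey data.

```python
# Unstandardized coefficients (B column) transcribed from the Coefficients table.
INTERCEPT = 0.575
COEFFICIENTS = {
    "hhsize": 0.454,      # total number of persons in household
    "num0to4": 0.355,     # persons aged 0-4
    "num5to21": 2.325,    # persons aged 5-21
    "numwork": 1.036,     # workers in the household
    "numdrive": 1.044,    # drivers in the household
    "income": 1.495e-5,   # household income
    "numcars": 0.219,     # automobiles
}


def predict_ntrip(household):
    """Predicted number of trips for a dict holding all seven characteristics."""
    return INTERCEPT + sum(
        coef * household[name] for name, coef in COEFFICIENTS.items()
    )


# Hypothetical household (illustration only, not a survey record).
example = {
    "hhsize": 4, "num0to4": 0, "num5to21": 2,
    "numwork": 2, "numdrive": 2, "income": 40000, "numcars": 2,
}
print(f"Predicted trips: {predict_ntrip(example):.2f}")  # roughly 12.2 trips
```

Each coefficient is the expected change in the number of trips for a one-unit increase in that characteristic with the others held constant, so an additional person aged 5-21 adds about 2.3 predicted trips.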
In summary, a multiple regression was run to predict the number of trips made by the household
from the number of persons in the household, number of persons aged 0-4 in the
household, number of persons aged 5-21 in the household, number of workers in the
household, number of drivers in the household, household income, and household vehicle
ownership. These variables statistically significantly predicted ntrip, F (7, 1992) = 189.116,
p < .0005, i.e., the regression model is a good fit of the data. Five variables (number of persons
in the household, persons aged 5-21, workers, drivers, and household income) added
statistically significantly to the prediction, p < .05, while two variables (persons aged 0-4 and
number of automobiles) did not, p > .05.
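The split between significant and non-significant predictors can be read directly from the "Sig." column of the Coefficients table. The sketch below transcribes B, Std. Error and Sig. from that table, compares each Sig. value with the 0.05 level used in this report, and recomputes t as B / Std. Error where the printed standard error allows it. The dictionary layout and names are illustrative.

```python
# (B, Std. Error, Sig.) transcribed from the Coefficients table.
# The income Std. Error is printed as .000 by SPSS, so it is stored as None.
coef_table = {
    "hhsize": (0.454, 0.184, 0.014),
    "num0to4": (0.355, 0.289, 0.220),
    "num5to21": (2.325, 0.209, 0.000),
    "numwork": (1.036, 0.156, 0.000),
    "numdrive": (1.044, 0.206, 0.000),
    "income": (1.495e-5, None, 0.007),
    "numcars": (0.219, 0.151, 0.147),
}

ALPHA = 0.05  # significance level used throughout the report

significant = [name for name, (_, _, sig) in coef_table.items() if sig < ALPHA]
not_significant = [name for name in coef_table if name not in significant]

# t statistic = B / Std. Error (skipped where the standard error is unavailable).
t_ratios = {name: b / se for name, (b, se, _) in coef_table.items() if se}

print("Significant at 0.05:", significant)
print("Not significant:", not_significant)
for name, t in t_ratios.items():
    print(f"{name:9s} t = {t:.3f}")
```

The recomputed t values differ slightly from the printed ones because B and Std. Error are rounded to three decimals in the table.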
Therefore, regression analysis is a powerful and useful statistical procedure with many implications
for travel behavior research. It enables researchers to describe, predict and estimate relationships and
to draw plausible conclusions about the interrelated variables of any studied phenomenon.
Regression also allows for controlling one or more variables when researchers are interested in
examining the relationship among specific variables. While planning and conducting regression
analysis, researchers should consider the type and number of dependent and independent
variables as well as the nature and size of the sample. Choosing the wrong type of regression analysis
with a small sample may result in erroneous conclusions about the studied phenomenon.
This paper found that household characteristics are significantly important for predicting the number
of trips made by the household, which suggests that household composition is an important
indicator for trip pattern analysis. To capture this behavior, the study argues that simple statistical
methods are sufficient to determine the relationships between household characteristics and travel
attributes. Therefore, descriptive, regression, and correlation analyses were used to explore the
character of, and the relationships between, household characteristics and the number of trips
made by the households.
According to the correlation analysis, all household characteristic variables are positively
correlated with the number of trips made by the household, but the number of persons aged 0-4 in the
household has the lowest degree of correlation relative to the other characteristics. According to
the regression results, the number of persons aged 0-4 and the number of automobiles in the
household do not contribute significantly (p > .05); the number of persons in the household and
household income are significant but have small standardized effects; and the number of persons
aged 5-21, the number of workers, and the number of drivers in the household have the strongest
positive effects on the number of trips made by the household. Therefore, the output of this paper
will contribute to academic knowledge and understanding of the subject matter, as it paves the way
for further investigation of the issues and points out constraints and challenges to the related sectors
and concerned offices for efficient budget allocation and resource management.
References
1. Professor Ma, (2022). Different Power Points and Documents
2. Abley, S., Chou, M., & Malcolm, D. (2008). National travel profiling part A: description of daily travel
patterns (Research Report 353). http://worldcat.org/isbn/9780478334081
3. Akinlotan, M., Primm, K., Khodakarami, N., Bolin, J., & Ferdinand, A. O. (2021). Rural-Urban
Variations in Travel Burdens for Care: Findings from the 2017 National Household Travel Survey.
JULY, 1–20. https://srhrc.tamhsc.edu/docs/travel-burdens-07.2021.pdf
4. Aschauer, F., Hössinger, R., Axhausen, K. W., Schmid, B., & Gerike, R. (2018). Implications of survey
methods on travel and non-travel activities: A comparison of the austrian national travel survey and
an innovative mobility-activity-expenditure diary (MAED). European Journal of Transport and
Infrastructure Research, 18(1), 4–35. https://doi.org/10.18757/ejtir.2018.18.1.3217
5. Cong, X. (2012). Using traditional household survey and GPS data for advanced travel behavior
and emission analysis. https://drum.lib.umd.edu/handle/1903/13550
6. Council, M. R., & City, K. (2020). 2019 Kansas City Regional Household Travel Survey Final Report.
3129(February).
7. Currans, K. M., & Clifton, K. J. (2015). Using household travel surveys to adjust ITE trip generation
rates. Journal of Transport and Land Use, 8(1), 85–119. https://doi.org/10.5198/jtlu.2015.470
8. Dambula, I., & Chibwana, E. N. B. (2004). Characteristics of households and household members.
Population (English Edition), 9–24.
9. Department for Transport. (2019). Analyses from the National Travel Survey. Statistical Release,
January, 34.
https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775032/2019-nts-commissioned-analyses.pdf
10. Dowds, J., Harvey, C., Lamondia, J., Howerter, S., Ullman, H., & Aultman-Hall, L. (2018). Advancing
Understanding of Long-Distance and Intercity Travel with Diverse Data Sources. A Research Report
from the National Center for Sustainable Transportation.
11. Extract, D., & Guide, U. (1978). National travel survey. Annals of Tourism Research, 5(4), 466.
https://doi.org/10.1016/0160-7383(78)90383-3
12. Federal Highway Administration. (2009). 2009 National Household Travel Survey - Florida Data
Analysis. March. http://www.fdot.gov/planning/trends/special/nhts.pdf
13. Federal Highway Administration. (2017). Typecasting Neighborhoods and Travelers. December.
14. Garrett, M. (2014). National Household Travel Survey. In Encyclopedia of Transportation: Social
Science and Policy. https://doi.org/10.4135/9781483346526.n341
15. Gauteng Province Department of Roads and Transport. (2020). Gauteng Province Household Travel
Survey Report 2019/20.
16. Greaves, S. (2000). Simulating Household Travel Survey Data in Metropolitan Areas.
http://digitalcommons.lsu.edu/cgi/viewcontent.cgi?article=8356&context=gradschool_disstheses
17. Federal Highway Administration. (2021). The Transportation Future: Trends, Transportation, and Travel.
November.
18. Hobbs, F. D. (1979). Traffic Surveys and Analysis. Traffic Planning and Engineering, 94–172.