**Regression Cost Model**

Introduction
Regression analysis – A statistical method by which estimates are made of the value of a variable from a knowledge of the values of one or more other variables, and the errors involved in this estimating process are measured.
Normally used in situations where the relationship between the variables is not unique.
Main types
§ Simple Linear Regression Analysis
§ Multiple Linear Regression Analysis

Assumptions
§ The standard deviation in the error associated with the dependent variable (cost) remains constant throughout the domain
§ This error is normally distributed
§ The effect of any variable is always expressed in terms of a fixed cost increase or decrease, irrespective of project size or type

Simple Linear Regression Analysis
Two-variable linear regression – describes the relationship between two variables by computing a straight line through the data obtained
Dependent variable (y) – the value to be estimated
Independent variable (x) – the factor from which the estimates are made
The expression for a straight line: y = a + bx
Constant (a) – the value of y when the independent variable is zero
Coefficient of x (b) – the slope of the line
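The constant a and the slope b of the straight line y = a + bx can be estimated by least squares. The sketch below is illustrative only: the function name `fit_simple_linear` and the data are assumptions made for this example, not part of the original slides.

```python
def fit_simple_linear(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # slope of the line
    a = mean_y - b * mean_x    # value of y when x is zero
    return a, b


# Hypothetical data lying exactly on y = 2 + 3x
x = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 8.0, 11.0, 14.0]
a, b = fit_simple_linear(x, y)
```

With real data the points will not lie exactly on the line, and a and b are then the values that minimise the sum of squared vertical distances.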

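[Figure: fitted straight line with the dependent variable on the vertical axis and the independent variable on the horizontal axis; intercept a, slope b = tan θ]
§ Prediction within the range of values in the dataset is known as interpolation
§ Prediction outside this range of the data is known as extrapolation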
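The distinction between interpolation and extrapolation can be checked mechanically before a prediction is used. This is a minimal sketch; `prediction_kind` and the sample values are hypothetical names and data chosen for illustration.

```python
def prediction_kind(x_new, x_observed):
    """Classify a prediction point against the range of the observed data."""
    lowest, highest = min(x_observed), max(x_observed)
    if lowest <= x_new <= highest:
        return "interpolation"
    return "extrapolation"


# Hypothetical observed values of the independent variable
observed = [12.0, 18.0, 25.0, 40.0]
inside = prediction_kind(30.0, observed)
outside = prediction_kind(55.0, observed)
```

Extrapolated predictions rely on the fitted relationship holding outside the data, so they generally deserve more caution.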

Steps of SLR Model

Specification
§ Begins with theoretical reasoning on the relationship between variables
§ Form equations to represent the relationships between variables
§ Since the population parameters are unknown, a sample is considered and the model is built with estimated values
Estimation
§ Least squares estimation procedure is used most of the time
§ Includes a series of statistical tests to make sure that the estimated model is a good representation of the postulated relationship

Validation
§ Evaluate the quality of the model
§ Evaluated on the basis of the following statistics
o Coefficient of determination
o Standard error
o F ratio test: the ratio of the regression mean square to the residual mean square
o T ratio test: the ratio of the coefficient to its standard error
Forecasting
§ Forecasting should be satisfactory to the users
§ Accuracy depends on the acceptable error amount of the model
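The four validation statistics named above can be computed directly for a simple linear regression. The following is a sketch under assumed data; the function name `slr_validation_stats` and the observations are hypothetical.

```python
import math


def slr_validation_stats(x, y):
    """Return R^2, standard error of estimate, F ratio and slope t ratio."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x

    sst = sum((yi - mean_y) ** 2 for yi in y)                      # total
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))    # residual
    ssr = sst - sse                                                # regression
    mse = sse / (n - 2)            # residual mean square (n - 2 d.o.f.)

    return {
        "r_squared": ssr / sst,                # coefficient of determination
        "standard_error": math.sqrt(mse),      # standard error of estimate
        "f_ratio": ssr / mse,                  # regression MS / residual MS
        "t_ratio": b / math.sqrt(mse / sxx),   # slope / its standard error
    }


# Hypothetical observations, roughly following y = 2x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
stats = slr_validation_stats(x, y)
```

In the two-variable case the F ratio equals the square of the slope's t ratio, which gives a quick consistency check on the calculation.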

Multiple Linear Regression Analysis
This aims to relate the dependent variable to several independent variables.
§ Dependent / Response variable: y
§ Independent / Explanatory variables: x1, x2, x3, …, xn
y = a + b1x1 + b2x2 + b3x3 + … + bnxn + e
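The coefficients a, b1, …, bn of the multiple linear regression equation can be estimated by least squares by solving the normal equations. The sketch below uses only the Python standard library; the helper names `solve_linear_system` and `fit_multiple_linear` and the data are assumptions for illustration.

```python
def solve_linear_system(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(row) + [r] for row, r in zip(A, rhs)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):                    # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def fit_multiple_linear(X, y):
    """Return [a, b1, ..., bn] for y = a + b1*x1 + ... + bn*xn."""
    rows = [[1.0] + list(r) for r in X]               # leading 1 for constant a
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(XtX, Xty)


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2 (so e = 0)
X = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
y = [9.0, 8.0, 19.0, 18.0, 26.0]
coefficients = fit_multiple_linear(X, y)
```

Solving the normal equations directly is adequate for a small illustration; statistical packages typically use more numerically stable decompositions.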

Steps of MLR Model

Specification
§ Begins with theoretical reasoning on the relationship between variables
§ Selecting a full set of explanatory (independent) variables
Estimation
§ Determining the correlation coefficients between all possible pairs
§ Resolving multicollinearity
§ Eliminating non-significant variables one at a time until all the remaining variables are significant
o Use of the t-ratio: a large t-ratio is desirable
o Use of the F-ratio: test for the significance of the overall dependence of y on the variables (x1, x2, …, xn)

Estimation (cont.)
§ Constructing a multiple linear regression model
o Making estimates of the coefficients in the regression model; the method of least squares is used due to its simplicity
Validation
§ Validation is done before practical use in the construction industry, using another actual project
Forecasting
§ If validation is a success, the model is put to practical use on construction projects

Application in Construction Industry: Simple Linear Regression Analysis
§ Case Study: Consider the possible sample values of bricklayer-hours and areas of brickwork from 10 fictitious contracts in the following table
[Table: bricklayer-hours and areas of brickwork for the 10 contracts]

§ Plotted scatter diagram

§ Scatter is caused by the factors other than area which affect the hours required
o Bricklayer-hours: Dependent / Response variable
o Areas of brickwork: Independent / Regression variable
§ To avoid individual judgement in constructing the line, the method of least squares is used
§ In fitting the regression line to a set of data, several parameters are estimated which need to be tested for significance before being accepted
§ As an overall guide to the strength of association between the two variables, the correlation coefficient is calculated
o Perfect correlation = 1
o Calculated correlation coefficient = 0.998

§ Shows an excellent degree of correlation which cannot be found by using one variable only
§ Standard error of estimate, the anticipated difference between the actual values and what the regression line predicts, should be calculated
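The correlation coefficient and the standard error of estimate can both be obtained from the same sums of squares. The figures below are hypothetical stand-ins, not the values from the case study table, and the function name is an assumption for this sketch.

```python
import math


def correlation_and_standard_error(x, y):
    """Return (correlation coefficient r, standard error of estimate)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    # Residual sum of squares about the least-squares line
    sse = syy - sxy ** 2 / sxx
    standard_error = math.sqrt(sse / (n - 2))
    return r, standard_error


# Hypothetical areas of brickwork (independent) and bricklayer-hours (dependent)
areas = [10.0, 20.0, 30.0, 40.0, 50.0]
hours = [22.0, 41.0, 58.0, 83.0, 99.0]
r, standard_error = correlation_and_standard_error(areas, hours)
```

The standard error is expressed in the units of the dependent variable, here bricklayer-hours, which makes it easier to judge against an acceptable error than r alone.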

Application in Construction Industry: Multiple Linear Regression Analysis
Case Study: Obtaining a model that will estimate productivity rates of concrete operations. For this, a wastewater project was observed in the North-East of Scotland (Project A). The regression analysis methodology used in this study is backward elimination, stepwise regression.
§ Firstly listing out the independent variables of Project A
o Type of pour
o Total volume (m3)
o Number of trucks on job
o Average volume of load (m3)
o Start time
o Average truck cycle time (minutes)
o Number of loads
o Weather
o Concrete mix

§ Calculating the correlation coefficients between all possible pairs by using the inbuilt functions of Microsoft Excel
§ Resolving multicollinearity by removing one variable (Total volume) out of the two highly correlated variables (i.e. Total volume and No. of Loads)
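The pairwise correlation screening described here was done in Microsoft Excel in the case study; the same idea can be sketched in code. The data, the variable names and the 0.9 threshold below are all assumptions made for illustration, not values from the study.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def highly_correlated_pairs(variables, threshold=0.9):
    """List pairs of explanatory variables whose |r| exceeds the threshold."""
    return [(p, q) for p, q in combinations(variables, 2)
            if abs(pearson(variables[p], variables[q])) > threshold]


# Hypothetical observations of three explanatory variables
variables = {
    "total_volume": [30.0, 60.0, 90.0, 120.0, 150.0],
    "number_of_loads": [5.0, 10.0, 15.0, 21.0, 25.0],
    "number_of_trucks": [3.0, 2.0, 4.0, 2.0, 3.0],
}
flagged = highly_correlated_pairs(variables)
```

One variable from each flagged pair is then a candidate for removal before the regression is run.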

§ Estimating partial regression coefficients and the corresponding t-statistics from the regression on actual productivity for all eight explanatory variables
§ Insignificant variables have t-statistics with small absolute values and should be eliminated
§ Carrying out two further runs, eliminating the insignificant variables, concrete mix (t-statistic = 0.97) and the start time (t-statistic = 1.72), from the regression model
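Backward elimination on t-statistics can be sketched as a loop that refits the model and drops the weakest variable each run. This is an illustrative sketch only: the cut-off `t_critical=2.0`, the function names and the data are assumptions, not the procedure or values used in the case study.

```python
import math


def invert(A):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(A)
    M = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]


def t_statistics(X, y):
    """Least-squares fit with a constant; return t-ratios of b1..bn."""
    rows = [[1.0] + list(r) for r in X]
    n, k = len(rows), len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    inv = invert(XtX)
    coef = [sum(inv[i][j] * Xty[j] for j in range(k)) for i in range(k)]
    sse = sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, yi in zip(rows, y))
    mse = sse / (n - k)                     # residual mean square
    # t-ratio = coefficient / its standard error; skip the constant term
    return [coef[i] / math.sqrt(mse * inv[i][i]) for i in range(1, k)]


def backward_eliminate(columns, y, t_critical=2.0):
    """Drop the variable with the smallest |t| until all |t| >= t_critical."""
    kept = dict(columns)
    while kept:
        names = list(kept)
        X = list(zip(*(kept[name] for name in names)))
        t = t_statistics(X, y)
        weakest = min(range(len(names)), key=lambda i: abs(t[i]))
        if abs(t[weakest]) >= t_critical:
            break
        del kept[names[weakest]]
    return list(kept)


# Hypothetical data: y depends on x1 plus noise; x2 is unrelated to y
columns = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0],
}
y = [5.5, 7.5, 11.5, 13.5, 17.5, 19.5, 23.5, 25.5]
kept = backward_eliminate(columns, y)
```

Removing one variable per run matters because the t-statistics of the remaining variables change each time the model is refitted.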

§ An important assumption made is that the variability of the data does not change for different levels of the response or explanatory variables
o This is checked by carrying out residual plots
§ Constructing a multiple linear regression model for actual productivity for a single server concrete system
Pactual = 1.56Tn + 0.31Tp + 1.37Ln – 6.75Va + 0.01Ct + 0.59W 0.95
Tp = Type of pour
Va = Average volume of concrete
Tn = Number of trucks on job
W = Weather
Ct = Average cycle time
Ln = Number of loads

§ Validation is done by using actual concrete pours from another wastewater project in Scotland by a different contractor (Project B)
§ The actual productivities achieved on 32 operations observed on Project B are compared to the predicted productivities using the derived regression model
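One way to compare actual and predicted productivities on a validation project is an average percentage error. The slides do not say which measure was used, so the mean absolute percentage error below is an assumption, as are the function name and the figures.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / actual, expressed in percent."""
    ratios = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical productivities for two validation pours
actual = [10.0, 20.0]
predicted = [11.0, 18.0]
error_percent = mean_absolute_percentage_error(actual, predicted)
```

Whether the resulting error is acceptable depends on the tolerance the users of the forecast are willing to accept.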

Drawbacks in regression
Multicollinearity
§ If the explanatory variables in multiple regression are correlated, and the correlation coefficient (positive or negative) is high, it is difficult to get their separate effects on the dependent variable
§ Leads to poorly estimated partial regression coefficients
Omitted variables
§ If independent variables that have significant relationships with the dependent variable are left out of the model, the results will not be satisfactory. E.g. location, quality etc. cannot be quantified
§ Bias in selecting independent variables
Endogeneity
§ Changes in the dependent variable cause changes in the independent variable

Development of Regression Model
Development of technologies in computing, accessing, processing and storing data
§ All major statistical software packages perform least squares regression analysis
§ Simple linear and multiple regression using least squares can be done in some spreadsheet applications and on some calculators
§ Specialized regression software has been developed for use in fields such as survey analysis and neuroimaging
§ The Constructive Cost Model (COCOMO): an example of an algorithmic software cost estimation model developed using a basic regression formula

Conclusions
Regression analysis falls under the Algorithmic Cost Model, which uses mathematical formulae linking costs/inputs with metrics to produce an estimated output.
It is a widely used method, not just in the construction industry.
It is used not only for estimating costs but also for forecasting productivity, time and any other parameter.
When there is only one major factor affecting the response, SLR can be used.
When there is more than one major factor affecting the response, MLR can be used.
There are several drawbacks and limitations in this method.
The knowledge of using regression analysis in specialized cost estimation software, in spreadsheets and on a calculator is beneficial for the Quantity Surveyor.

