
Practical Implementation on Linear Regression Analysis



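The simple linear regression model is $y = \beta_0 + \beta_1 x + \varepsilon$,
where $\beta_0$ is the intercept, $\beta_1$ is the slope (regression
coefficient), and $\varepsilon$ is the error.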

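• $\sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 = \text{SSE}$ (sum of squared errors)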
Concept of Linear Regression

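• To minimize SSE, set its partial derivatives with respect to $\beta_0$ and $\beta_1$ to zero and solve for the estimates $\beta_0'$ and $\beta_1'$.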
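The minimization can be written out as the short derivation below. It is the standard least-squares argument rather than something taken from the slides; the primed symbols $\beta_0'$ and $\beta_1'$ denote the estimates, and $\bar{x}$, $\bar{y}$ are the sample means.

```latex
\begin{aligned}
\text{SSE}(\beta_0, \beta_1) &= \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 \\
\frac{\partial\,\text{SSE}}{\partial \beta_0} &= -2 \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0 \\
\frac{\partial\,\text{SSE}}{\partial \beta_1} &= -2 \sum_{i=1}^{n} x_i \,(y_i - \beta_0 - \beta_1 x_i) = 0 \\
\beta_1' &= \frac{\sum_{i} x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_{i} x_i^2 - n\,\bar{x}^2}
          = \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)} \\
\beta_0' &= \bar{y} - \beta_1'\,\bar{x}
\end{aligned}
```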
Given Example
Scatter Plot of given data set
Observation     x     y      xy     x^2     y^2
1              50   122    6100    2500   14884
2              53   118    6254    2809   13924
3              54   128    6912    2916   16384
4              55   121    6655    3025   14641
5              56   125    7000    3136   15625
6              59   136    8024    3481   18496
7              62   144    8928    3844   20736
Σ             389   894   49873   21711  114690
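As a quick check, the product and square columns and the Σ row can be recomputed from the x and y values alone. This is a minimal sketch in plain Python; the variable names are illustrative.

```python
# x and y values for the seven observations in the example table.
x = [50, 53, 54, 55, 56, 59, 62]
y = [122, 118, 128, 121, 125, 136, 144]

# Derived columns: xy, x^2 and y^2 for each observation.
xy = [xi * yi for xi, yi in zip(x, y)]
x2 = [xi ** 2 for xi in x]
y2 = [yi ** 2 for yi in y]

# Column totals, matching the Σ row of the table.
sums = {"x": sum(x), "y": sum(y), "xy": sum(xy), "x2": sum(x2), "y2": sum(y2)}
print(sums)
```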

$\beta_1' = \dfrac{\operatorname{cov}(x, y)}{\operatorname{var}(x)}$
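The slope formula can be applied directly to the seven observations of the example. The sketch below divides both the covariance and the variance by n (any common normalization cancels in the ratio). The intercept formula, intercept = mean(y) - slope * mean(x), is the standard least-squares result and is an assumption here, since the slides do not show it.

```python
# x and y values for the seven observations in the example table.
x = [50, 53, 54, 55, 56, 59, 62]
y = [122, 118, 128, 121, 125, 136, 144]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Covariance and variance with the same 1/n normalization.
cov_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / n
var_x = sum((xi - x_mean) ** 2 for xi in x) / n

beta1 = cov_xy / var_x            # slope estimate
beta0 = y_mean - beta1 * x_mean   # intercept estimate (standard result, assumed)

print(f"slope = {beta1:.4f}, intercept = {beta0:.4f}")
# slope is approximately 2.0503, intercept approximately 13.7759
```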
Fitted Regression Curve
Calculating Error
• Once the fitted regression line is known, the fitted value of $y$ corresponding to any
observed data point can be calculated. For example, the fitted value
corresponding to the 21st observation in the above table is:

• The observed response at this point is $y_{21} = 194$. Therefore, the residual at this
point is:
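The same calculation can be sketched for the seven-observation example: each residual is the observed response minus the fitted value. The fit uses the standard least-squares estimates, and the names `fitted` and `residuals` are illustrative.

```python
# x and y values for the seven observations in the example table.
x = [50, 53, 54, 55, 56, 59, 62]
y = [122, 118, 128, 121, 125, 136, 144]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares slope and intercept.
beta1 = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
         / sum((xi - x_mean) ** 2 for xi in x))
beta0 = y_mean - beta1 * x_mean

# Fitted value and residual (observed minus fitted) for each observation.
fitted = [beta0 + beta1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

for i, (xi, yi, fi, ei) in enumerate(zip(x, y, fitted, residuals), start=1):
    print(f"{i}: x={xi}  y={yi}  fitted={fi:.3f}  residual={ei:+.3f}")
```

With an intercept in the model, the residuals sum to zero up to rounding, which is a convenient sanity check on the table of errors.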
Calculated Error Table
Coefficient of Determination ($R^2$)

SST = SSR + SSE


$R^2 = \dfrac{\text{SSR}}{\text{SST}} = 1 - \dfrac{\text{SSE}}{\text{SST}}$
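The decomposition can be checked numerically on the seven-observation example. In this sketch SST is the total sum of squares about the mean of y, SSR is the sum of squares explained by the fitted line, and SSE is the residual sum of squares; these are the usual definitions, assumed here because the slides do not spell them out.

```python
# x and y values for the seven observations in the example table.
x = [50, 53, 54, 55, 56, 59, 62]
y = [122, 118, 128, 121, 125, 136, 144]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares fit.
beta1 = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
         / sum((xi - x_mean) ** 2 for xi in x))
beta0 = y_mean - beta1 * x_mean
fitted = [beta0 + beta1 * xi for xi in x]

sst = sum((yi - y_mean) ** 2 for yi in y)              # total variation
ssr = sum((fi - y_mean) ** 2 for fi in fitted)         # explained by the line
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) # left in the residuals

r2 = 1 - sse / sst
print(f"SST={sst:.3f}  SSR={ssr:.3f}  SSE={sse:.3f}  R^2={r2:.4f}")
```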
Assignments
Q.1 Implement the given example from the slides, calculating the error and $R^2$.
Q.2 Apply linear regression to the Appliances energy prediction Data Set:
(i) Calculate the error for each sample and $R^2$ for each column.
(ii) Plot the fitted line.
• Attribute Information:

1. vendor name: 30
(adviser, amdahl,apollo, basf, bti, burroughs, c.r.d, cambex, cdc, dec,
dg, formation, four-phase, gould, honeywell, hp, ibm, ipl, magnuson,
microdata, nas, ncr, nixdorf, perkin-elmer, prime, siemens, sperry,
sratus, wang)
2. Model Name: many unique symbols
3. MYCT: machine cycle time in nanoseconds (integer)
4. MMIN: minimum main memory in kilobytes (integer)
5. MMAX: maximum main memory in kilobytes (integer)
6. CACH: cache memory in kilobytes (integer)
7. CHMIN: minimum channels in units (integer)
8. CHMAX: maximum channels in units (integer)
9. PRP: published relative performance (integer)
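One possible way to organize the per-column part of the assignment is sketched below: fit a simple regression of the target on each numeric column separately and collect the $R^2$ values. The helper names `simple_linear_fit` and `r2_per_column` are hypothetical, and the numbers are placeholders for illustration only, not rows from any data set.

```python
def simple_linear_fit(x, y):
    """Least-squares fit of y on x; returns (intercept, slope, residuals, r2)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    beta1 = sxy / sxx
    beta0 = y_mean - beta1 * x_mean
    # Error for each sample: observed minus fitted.
    residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    sst = sum((yi - y_mean) ** 2 for yi in y)
    return beta0, beta1, residuals, 1 - sse / sst


def r2_per_column(columns, target):
    """Fit the target on each column separately and return {column name: R^2}."""
    return {name: simple_linear_fit(values, target)[3]
            for name, values in columns.items()}


# Placeholder values for illustration only (NOT real data).
columns = {"col_a": [1.0, 2.0, 3.0, 4.0], "col_b": [2.0, 1.0, 4.0, 3.0]}
target = [3.0, 5.0, 7.0, 9.0]

r2 = r2_per_column(columns, target)
print(r2)
```

For the plotting step, the fitted line for a column is the set of points (x, intercept + slope * x), which can be drawn over a scatter plot of that column against the target.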
