
Extrapolation Technique Summarized

The extrapolation technique (aka curve fitting) is a simplistic model that uses past gross population trends to project future population levels. The defining characteristic of trend extrapolation is that future values of any variable are determined solely by its historical values. (SLPP, p. 161, emphasis added)

Basic Procedure:
1) Identify overall past trend and fit proper curve
2) Project future populations based upon your chosen curve

We fit most of these curves with a linear equation. A linear transformation is required to make projections for all but the Parabolic Curve.

Advantages:
1) Low data requirements
2) Very easy methodology
3) 1 + 2 = Low resource requirements (money, skills, etc.)

Disadvantages:
1) Uses only aggregate data
2) Assumes that past trends will predict the future

Visualizing the Technique


[Figure: Leon County Population, 1940-1990. x-axis: Year; y-axis: Population.]

The Curves to Be Fit

Linear Curve: Plots a straight line based on the formula: Y = a + bX

Geometric Curve: Plots a curve based upon a rate of compounding growth over discrete intervals via the formula: Y = ae^{bX}

Parabolic (Polynomial) Curve: A curve with one bend and a constantly changing slope. Formula: Y = a + bX + cX^2

Modified Exponential Curve*: An asymptotic growth curve that recognizes that a region will reach an upper limit of growth. It takes the form: Y = c + ab^X

Gompertz Curve*: Describes a growth pattern that is quite slow, increases for a time, and then tapers off as the population approaches a growth limit. Form: Y = ca^{b^X}

Logistic Curve*: Similar to the Gompertz Curve, this is useful for describing phenomena that grow slowly at first, increase rapidly, and then slow with approach to a growth limit. Form: Y = (c + ab^X)^{-1}

* = Asymptotic Curves

The Linear Curve (Y = a + bX)


Fits a straight line to population data. The amount of growth per period is assumed to be constant, with non-compounding incremental growth. Calculated exactly as in linear regression (least-squares criterion).

Advantages:
--Simplest curve
--Most widely used
--Useful for slow or non-growth areas

Disadvantages:
--Rarely appropriate to demographic data

Example: Y = 55,000 + 6,000(X)
In plain language, this equation tells us that for each year that passes, we project that an additional 6,000 people will be added to the population. So, in 10 years we would project 60,000 more people using this equation (6,000 * 10).

Evaluation: Generally used as a starting point for curve fitting.

Manatee County Linear Curve


Year    Actual Data    Projection
1950         34,704        21,421
1960         69,168        67,862
1970         97,115       114,303
1980        148,442       160,743
1990        211,707       207,184
2000        264,002       253,625
2010                      300,066
2020                      346,507
2030                      392,948

Y Int: -9034568.9
Slope: 4644.09714
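The least-squares fit behind this table can be sketched in a few lines of Python. This is a minimal illustration, not the spreadsheet the slides were built from; the helper name `fit_linear` is hypothetical, and the data are the 1950-2000 observations from the table.

```python
# Least-squares fit of Y = a + bX to the Manatee County data (base period 1950-2000).
years = [1950, 1960, 1970, 1980, 1990, 2000]
pops = [34704, 69168, 97115, 148442, 211707, 264002]


def fit_linear(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_linear(years, pops)

# Project the three future decades shown in the table.
projections = {year: intercept + slope * year for year in (2010, 2020, 2030)}
for year, value in projections.items():
    print(year, round(value))
```

The fitted slope and intercept should agree with the Y Int and Slope values listed under the table, up to rounding.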

[Figure: Manatee County Linear Regression Projections. Series: Actual Data, Projection; x-axis: Year (1950-2030); y-axis: Population.]

The Geometric Curve (Y = ae^{bX})

In this curve, growth is assumed to compound at set intervals at a constant growth rate. To transform this equation into a linear equation, we use logarithms.

Advantages:
--Assumes a constant rate of growth
--Still simple to use

Disadvantage:
--Does not take into account a growth limit

Example: Y = 55,000 * (1.00 + 0.06)^X
In plain language, this equation tells us that we have a 6% growth rate. After one year we project a population of 58,300. After 10 years we would project a population of 98,497.

Evaluation: Pretty good for short-term projections in fast-growing areas. However, over the long run, this curve usually generates unrealistically high numbers.

Manatee County Geometric Curve


Year    Actual Data    Log of Pop    Log Proj    Projection
1950         34,704        4.5404      4.6158        41,281
1960         69,168        4.8399      4.7885        61,454
1970         97,115        4.9873      4.9613        91,484
1980        148,442        5.1716      5.1341       136,189
1990        211,707        5.3257      5.3069       202,741
2000        264,002        5.4216      5.4797       301,813
2010                                   5.6525       449,298
2020                                   5.8253       668,855
2030                                   5.9981       995,702

Y Int: (29.080)
Slope: 0.0173
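The logarithmic transformation used in this table can be sketched as follows: take base-10 logs of the observed populations, fit a straight line to the logs, then take the antilog of the projected log. This is an illustrative sketch; the helper names are hypothetical, and base-10 logs are assumed because the "Log of Pop" column matches them.

```python
import math

years = [1950, 1960, 1970, 1980, 1990, 2000]
pops = [34704, 69168, 97115, 148442, 211707, 264002]


def fit_linear(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Step 1: linearize by taking base-10 logs of the populations.
log_pops = [math.log10(p) for p in pops]

# Step 2: ordinary least squares on (year, log population).
log_intercept, log_slope = fit_linear(years, log_pops)


def project_geometric(year):
    """Step 3: project the log, then take the antilog."""
    return 10 ** (log_intercept + log_slope * year)


for year in (2010, 2020, 2030):
    print(year, round(project_geometric(year)))
```

Small differences from the table are expected because the table shows logs rounded to four decimals.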

[Figure: Manatee County Geometric Curve Projections. Series: Actual Data, Projection; x-axis: Year (1950-2030); y-axis: Population.]

The Parabolic Curve (Y = a + bX + cX^2)

Generally has a constantly changing slope and one bend. Very similar to the Linear Curve except for the additional parameter (c). The curve grows very quickly when c > 0 and declines quickly when c < 0.

Advantage:
--Models fast growing areas

Disadvantages:
--Poor for long range projections (familiar refrain?)
--No Growth Limit
--More complex

Example: Y = 43.46 + 8.78(X) + 0.581(X^2)
When X = 0, Y = 43.46. When X = 6, Y = 117.1.

Evaluation: The same verdict as for the Geometric Curve: good for fast growing areas, but poor over the long run.

Manatee County Parabolic Curve


Even Number of Observations

Year    Actual Data    Index    Index Squared    Index ^4    Index x Observed    Index Squared x Observed    Projection
1950         34,704       -5               25         625             -173520                      867600        35,136
1960         69,168       -3                9          81             -207504                      622512        65,118
1970         97,115       -1                1           1              -97115                       97115       103,330
1980        148,442        1                1           1              148442                      148442       149,771
1990        211,707        3                9          81              635121                     1905363       204,441
2000        264,002        5               25         625             1320010                     6600050       267,341
2010                       7               49        2401                                                       338,471
2020                       9               81        6561                                                       417,830
2030                      11              121       14641                                                       505,419
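The columns of this table are the sums needed to solve for a, b, and c. A minimal sketch, assuming the standard least-squares normal equations for an index centered on zero (so the odd-power sums vanish); the function name is hypothetical:

```python
pops = [34704, 69168, 97115, 148442, 211707, 264002]
index = [-5, -3, -1, 1, 3, 5]   # centered index values for 1950 ... 2000


def fit_parabola_centered(xs, ys):
    """Fit Y = a + bX + cX^2, assuming sum(x) == 0 and sum(x^3) == 0."""
    n = len(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x ** 2 for x in xs)                      # Index Squared column
    sum_x4 = sum(x ** 4 for x in xs)                      # Index ^4 column
    sum_xy = sum(x * y for x, y in zip(xs, ys))           # Index x Observed
    sum_x2y = sum(x ** 2 * y for x, y in zip(xs, ys))     # Index Squared x Observed
    b = sum_xy / sum_x2
    c = (n * sum_x2y - sum_x2 * sum_y) / (n * sum_x4 - sum_x2 ** 2)
    a = (sum_y - c * sum_x2) / n
    return a, b, c


a, b, c = fit_parabola_centered(index, pops)


def project_parabolic(x):
    return a + b * x + c * x ** 2


# Index 7, 9, 11 correspond to 2010, 2020, 2030.
for x in (7, 9, 11):
    print(x, round(project_parabolic(x)))
```

Because c comes out positive for this data, the projected increments grow with every decade, which is the behavior the evaluation above warns about for long horizons.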

[Figure: Manatee County Parabolic Curve. Series: Actual Data, Projections; x-axis: Year (1950-2030); y-axis: Pop.]

Modified Exponential Curve (Y = c + ab^X)

The first of the Asymptotic Curves. Takes into account an upper or lower limit when computing projected values. The asymptote can be derived from local analysis or supplied by the model itself.

Advantages:
--Growth limit is introduced
--Best fitting growth limit

Disadvantages:
--Much more complex calculations
--Growth limit can be misleading (too high or too low)

Example: Yc = 114 - 64(0.75)^X
The growth limit is 114. The curve takes into account the number of time periods: as X gets larger, Y gets closer to the growth limit. When X = 0, Y = 50; when X = 2, Y = 78, etc.

Evaluation: This curve largely depends upon the growth limit. If the limit is reasonable, then the curve can be a good one. Also, the ability to calculate the growth limit within the model is very useful.

Manatee County Modified Exponential Curve


Year     Index    Actual Data    Projection
1950         0         34,704        38,242
1960         1         69,168        65,630
1970         2         97,115       100,535
1980         3        148,442       145,022
1990         4        211,707       201,722
2000         5        264,002       273,987
2010         6                      366,090
2020         7                      483,476
2030         8                      633,087
Total                 825,138
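The slides do not say how the best-fitting curve was computed. One common approach, assumed here, is the three-group partial-sums method: split the observations into three equal groups and solve for a, b, and c from the group totals. The function name is hypothetical.

```python
pops = [34704, 69168, 97115, 148442, 211707, 264002]   # index 0 ... 5


def fit_modified_exponential(values):
    """Fit Y = c + a * b**X (X = 0, 1, 2, ...) by three-group partial sums."""
    n = len(values) // 3
    if n == 0 or len(values) != 3 * n:
        raise ValueError("need a non-empty multiple of three observations")
    s1 = sum(values[:n])
    s2 = sum(values[n:2 * n])
    s3 = sum(values[2 * n:])
    b_n = (s3 - s2) / (s2 - s1)           # this ratio equals b ** n
    if b_n <= 0 or b_n == 1:
        raise ValueError("group totals do not fit a modified exponential")
    b = b_n ** (1 / n)
    a = (s2 - s1) * (b - 1) / (b_n - 1) ** 2
    c = (s1 - a * (b_n - 1) / (b - 1)) / n
    return a, b, c


a, b, c = fit_modified_exponential(pops)


def project_mod_exp(x):
    return c + a * b ** x


for x in range(9):                         # index 0 ... 8 = 1950 ... 2030
    print(1950 + 10 * x, round(project_mod_exp(x)))
```

Under this assumption the output should track the Projection column closely. For this base period the fitted b is greater than 1, so the fitted c is not an upper bound on growth, which is one way the model-derived limit can mislead.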

[Figure: Manatee County Pop Projections, Best Fitting Mod Exp Curve. Series: Actual Data, Projection; x-axis: Year (1950-2030); y-axis: Population.]

The Gompertz Curve (Y = ca^{b^X})

Describes a growth pattern that is initially quite slow, increases for a period, and then tapers off. Like the Mod Exp curve, the upper limit can be assumed or derived by the model.

Advantage:
--Reflects very common growth patterns

Disadvantages:
--Getting even more complex
--Misleading growth limit (limit can be high or low)

Example: log Yc = 2.699 - 1.056(0.9221)^X
The equation itself is tough to understand. When X = 0, log Y = 1.64, so Y = 44.0 (via antilog calculation). Note: The antilog of 2.699 is 500 (the growth limit).

Evaluation: A very useful curve that can be fitted to all kinds of growth patterns. However, as with the previous curve, using an assumed growth limit can be problematic unless it is reasonable and makes sense for the case at hand.

Manatee County Gompertz Curve


Year     Index    Actual Data    Log of Obs Value    Log of Proj    Projection
1950         0         34,704              4.5404         4.5788        37,910
1960         1         69,168              4.8399         4.8015        63,319
1970         2         97,115              4.9873         4.9952        98,906
1980         3        148,442              5.1716         5.1636       145,754
1990         4        211,707              5.3257         5.3100       204,186
2000         5        264,002              5.4216         5.4373       273,726
2010         6                                            5.5480       353,169
2020         7                                            5.6442       440,755
2030         8                                            5.7278       534,378
Total                 825,138
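Taking logs turns the Gompertz curve into a modified exponential in log Y, so the same fitting machinery applies to the "Log of Obs Value" column. The sketch below assumes the three-group partial-sums method and base-10 logs; the slides do not name the fitting method, and the function names are hypothetical.

```python
import math

pops = [34704, 69168, 97115, 148442, 211707, 264002]   # index 0 ... 5


def fit_partial_sums(values):
    """Fit Z = c + a * b**X (X = 0, 1, 2, ...) by three-group partial sums."""
    n = len(values) // 3
    if n == 0 or len(values) != 3 * n:
        raise ValueError("need a non-empty multiple of three observations")
    s1 = sum(values[:n])
    s2 = sum(values[n:2 * n])
    s3 = sum(values[2 * n:])
    b_n = (s3 - s2) / (s2 - s1)           # this ratio equals b ** n
    if b_n <= 0 or b_n == 1:
        raise ValueError("group totals do not fit this curve")
    b = b_n ** (1 / n)
    a = (s2 - s1) * (b - 1) / (b_n - 1) ** 2
    c = (s1 - a * (b_n - 1) / (b - 1)) / n
    return a, b, c


# Gompertz: log10(Y) = c + a * b**X, so fit the logs of the observations.
log_a, b, log_c = fit_partial_sums([math.log10(p) for p in pops])


def project_gompertz(x):
    return 10 ** (log_c + log_a * b ** x)


# When 0 < b < 1 the curve levels off at the antilog of the constant term.
growth_limit = 10 ** log_c

for x in range(9):                         # index 0 ... 8 = 1950 ... 2030
    print(1950 + 10 * x, round(project_gompertz(x)))
```

Here the limit is derived by the model rather than assumed, so it is worth checking whether the printed `growth_limit` is plausible for the county before trusting the later decades.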

[Figure: Manatee County Pop Projections, Best Fitting Gompertz Curve. Series: Actual Data, Projection; x-axis: Year (1950-2030); y-axis: Population.]

The Logistic Curve (Y = (c + ab^X)^{-1})

VERY similar to the Mod Exp and the Gompertz curves, except that we take the reciprocals of the observed values. A very popular curve.

Advantages:
--Has proven to be a good projection tool
--Considered a bit more stable than the Gompertz curve

Disadvantages:
--Complex!
--Hard to interpret the formula

Example: Yc^{-1} = 0.0020 + 0.0217(0.8015)^X
Another difficult-to-interpret equation. When X = 0, Y = 42.1. When X = 6, Y = 128.9. Note: The reciprocal of 0.002 is 500 (the growth limit).

Evaluation: Considered to be the best of the extrapolation curves. It reflects a well-known growth pattern. It is more stable than the Gompertz curve and it does not have a misleading growth limit.

Manatee County Logistic Curve


Year     Index    Actual Data    Recip of Observed    Recip of Proj    Projection
1950         0         34,704           0.00002882      0.000026959        37,093
1960         1         69,168           0.00001446      0.000016313        61,300
1970         2         97,115           0.00001030      0.000010246        97,601
1980         3        148,442           0.00000674      0.000006788       147,321
1990         4        211,707           0.00000472      0.000004817       207,588
2000         5        264,002           0.00000379      0.000003694       270,700
2010         6                                          0.000003054       327,434
2020         7                                          0.000002689       371,848
2030         8                                          0.000002481       403,002
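The reciprocal transformation can be sketched the same way as the other asymptotic curves: fit a modified exponential to 1/Y, then invert the projected reciprocal. The three-group partial-sums method is an assumption here, since the slides do not name the fitting method, and the function names are hypothetical.

```python
pops = [34704, 69168, 97115, 148442, 211707, 264002]   # index 0 ... 5


def fit_partial_sums(values):
    """Fit Z = c + a * b**X (X = 0, 1, 2, ...) by three-group partial sums."""
    n = len(values) // 3
    if n == 0 or len(values) != 3 * n:
        raise ValueError("need a non-empty multiple of three observations")
    s1 = sum(values[:n])
    s2 = sum(values[n:2 * n])
    s3 = sum(values[2 * n:])
    b_n = (s3 - s2) / (s2 - s1)           # this ratio equals b ** n
    if b_n <= 0 or b_n == 1:
        raise ValueError("group totals do not fit this curve")
    b = b_n ** (1 / n)
    a = (s2 - s1) * (b - 1) / (b_n - 1) ** 2
    c = (s1 - a * (b_n - 1) / (b - 1)) / n
    return a, b, c


# Logistic: 1/Y = c + a * b**X, so fit the reciprocals of the observations.
a, b, c = fit_partial_sums([1 / p for p in pops])


def project_logistic(x):
    return 1 / (c + a * b ** x)


# When 0 < b < 1 the reciprocal approaches c, so the population approaches 1/c.
growth_limit = 1 / c

for x in range(9):                         # index 0 ... 8 = 1950 ... 2030
    print(1950 + 10 * x, round(project_logistic(x)))
```

The projections flatten toward `growth_limit`, which is why the 2030 figure in the table is far below the geometric and parabolic results.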

[Figure: Manatee County Pop Projections, Best Fitting Logistic Curve. Series: Actual Data, Projection; x-axis: Year (1950-2030); y-axis: Population.]

The Curve Fitting Procedure


1) Plot the data in a chart
2) Eyeball the data:
--Identify and eliminate erroneous data
--Identify past population trends
--Eliminate curves that don't fit the data
3) Process the data using the chosen curves and plot your results in charts
4) Use quantitative procedures to identify best-fitting curves
5) Make your choice of forecast based upon a combination of quantitative and qualitative evaluations of the various projections

Many issues affect the fit of the various curves:
--Choice of the Base Period, including the Base Year
--Calibration of projections
--Use of Growth Limits
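Step 4 does not name a specific quantitative procedure. As one possible illustration, the sketch below compares the six Manatee County fits over the base period using the mean absolute percentage error (MAPE). The choice of MAPE is an assumption, not something the slides specify; the fitted values are copied from the Projection columns of the tables above.

```python
actual = [34704, 69168, 97115, 148442, 211707, 264002]   # 1950 ... 2000

# Base-period fitted values from each curve's Projection column.
fitted = {
    "linear": [21421, 67862, 114303, 160743, 207184, 253625],
    "geometric": [41281, 61454, 91484, 136189, 202741, 301813],
    "parabolic": [35136, 65118, 103330, 149771, 204441, 267341],
    "modified exponential": [38242, 65630, 100535, 145022, 201722, 273987],
    "gompertz": [37910, 63319, 98906, 145754, 204186, 273726],
    "logistic": [37093, 61300, 97601, 147321, 207588, 270700],
}


def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(o - p) / o for o, p in zip(observed, predicted)]
    return 100 * sum(errors) / len(errors)


scores = {name: mape(actual, values) for name, values in fitted.items()}

# Lower scores mean a closer fit to the observed base-period data.
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:22s}{score:6.2f}%")
```

A low error over the base period only says the curve matches the past. Step 5 still applies: the analyst weighs that score against whether the long-run shape of the curve is believable.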

Understanding Extrapolation
One basic principle for using the extrapolation technique effectively: the choice of the Base Period can have a significant impact upon the projection generated. In the Manatee County example, if we vary the Base Period and use the Lin Reg method, we get the following results:

                          Base Period
Year    Actual Data    1920-2000    1950-2000    1980-2000
1970         97,115
1980        148,442
1990        211,707
2000        264,002
2010                     253,817      300,066      323,610
2020                     284,749      346,507      381,390
2030                     315,680      392,948      439,170
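The effect of the Base Period can be checked directly by refitting the linear curve on different slices of the data. The sketch below covers the 1950-2000 and 1980-2000 base periods only, because the 1920-1940 observations are not listed in these slides; the helper names are hypothetical.

```python
observed = {
    1950: 34704, 1960: 69168, 1970: 97115,
    1980: 148442, 1990: 211707, 2000: 264002,
}


def fit_linear(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def project_from_base(first_year, target_years):
    """Fit on observations from first_year onward and project target_years."""
    xs = [year for year in observed if year >= first_year]
    ys = [observed[year] for year in xs]
    intercept, slope = fit_linear(xs, ys)
    return {year: intercept + slope * year for year in target_years}


targets = (2010, 2020, 2030)
base_1950 = project_from_base(1950, targets)
base_1980 = project_from_base(1980, targets)

for year in targets:
    print(year, round(base_1950[year]), round(base_1980[year]))
```

The shorter base period gives more weight to the rapid growth since 1980, so its projections run higher in every decade.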

[Figure: The Effect of Different Base Periods on Population Projections. Series: Actual Data, 1920-2000, 1950-2000, 1980-2000; x-axis: Year (1920-2030); y-axis: Population.]

Improving Extrapolation Projections through Calibration


The Linear Curve also helps to illustrate one improvement to the extrapolation technique: analysts often calibrate their model to fit the projection to the observed data. Calibration is simply an adjustment that makes the projected population consistent with the launch year population. It is calculated by subtracting the estimated population from the observed population in the Launch Year (Observed - Estimated).

In our Manatee County example, the adjustment for BY1950 is:

Observed Pop 2000:    264,002
Estimated Pop 2000:   253,625
Calibration:          +10,377

This figure is then added to all subsequent projections that use this combination of curve type (Lin Reg) and base period (1950-2000). Calibration is typically used with the Lin Regression technique, but it can be used with the others as well.
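The calibration arithmetic is small enough to show in full. The sketch below uses the launch-year figures quoted above and the uncalibrated 1950-2000 linear projections from the Manatee County Linear Curve table; the variable names are illustrative.

```python
observed_launch = 264002      # observed population, launch year 2000
estimated_launch = 253625     # linear-curve estimate for 2000

# Calibration = Observed - Estimated in the launch year.
calibration = observed_launch - estimated_launch

# Uncalibrated linear projections (base period 1950-2000).
raw_projections = {2010: 300066, 2020: 346507, 2030: 392948}

# The same adjustment is added to every subsequent projection.
calibrated = {year: value + calibration for year, value in raw_projections.items()}

print("calibration:", calibration)
for year, value in calibrated.items():
    print(year, value)
```

The adjustment shifts the whole projected line so that it passes through the observed 2000 population. It does not change the slope.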

Improving Extrapolation Projections through Upper Limits


The three asymptotic curves (Mod Exp, Gompertz, Logistic) each have two derivations, which offer an opportunity to fine-tune our projections:

1) Under one approach, the model itself calculates a limit to population growth.
2) Alternatively, the analyst can set an upper limit for the population. This upper limit can be generated by a carrying capacity analysis (as in Monroe County (the Keys)) or from some other study that generates an upper population bound.

The concept of growth limits has been found to be very useful in projections because populations cannot grow infinitely; there is some limit to their growth. There is evidence that incorporating this concept into the extrapolation technique generates better projections.
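The second approach, where the analyst supplies the limit, can be sketched for the Modified Exponential curve. With the limit c fixed, the gap (c - Y) shrinks geometrically, so its logarithm is linear in X and ordinary least squares can estimate the remaining parameters. This fitting method is an assumption, since the slides do not describe how the assumed-limit version is computed. The 1.2 million limit is the assumed value used in the Manatee County example; the function name is hypothetical.

```python
import math

pops = [34704, 69168, 97115, 148442, 211707, 264002]   # index 0 ... 5
UPPER_LIMIT = 1_200_000                                  # analyst-supplied limit


def fit_with_assumed_limit(values, limit):
    """Fit Y = limit + a * b**X by regressing ln(limit - Y) on X."""
    if any(v >= limit for v in values):
        raise ValueError("every observation must lie below the assumed limit")
    xs = list(range(len(values)))
    zs = [math.log(limit - v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    # ln(limit - Y) = ln(-a) + X * ln(b), so a is negative.
    return -math.exp(intercept), math.exp(slope)


a, b = fit_with_assumed_limit(pops, UPPER_LIMIT)
projections = [UPPER_LIMIT + a * b ** x for x in range(9)]   # 1950 ... 2030

for x, value in enumerate(projections):
    print(1950 + 10 * x, round(value))
```

Every projection stays below the supplied limit by construction, so the quality of the result rests on how defensible that limit is.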

[Figure: County Population Projections, Best Fitting Modified Exponential Curve. Manatee County example, BP 1950-2000; limit calculated by model. Series: Actual Data, Projection; x-axis: Year (1950-2030); y-axis: Population.]

[Figure: County Mod Exp UL Pop Projections. Upper limit assumed to be 1.2 million people. Series: Actual Data, Projection; x-axis: Year (1950-2020); y-axis: Population.]
