A CASE STUDY
Submitted by
BACHELORS OF ENGINEERING
IN
COMPUTER SCIENCE
Chandigarh University
April 2024
Introduction
One of the core human endeavours is the attempt to comprehend and forecast the
world we live in. Many disciplines, including science, engineering, economics, and
many more, mostly depend on our capacity to examine data and identify
underlying patterns. The least squares approach proves to be an effective and
adaptable tool in this endeavour. This introduction explores the fundamental ideas
of least squares, including its applications, theoretical foundations, and real-world
implications.
Suppose you have a set of data points recording people's heights at various ages.
You might inspect these points visually and try to draw a straight line that best
represents the general trend. The line will not pass through every data point, so
there will inevitably be differences between the actual heights and the heights
predicted by the line. These deviations, which we refer to as residuals, measure
how far the model is from the data.
The goal of the least squares technique is to find the line that minimises the
overall difference between the fitted model and the data. Simply adding up the
residuals is not sufficient, because positive and negative residuals can cancel and
leave a sum near zero even when the fit is poor. The least squares approach gets
around this by squaring each residual before adding them together, so every
deviation contributes a non-negative amount to the overall error measure, the sum
of squared errors (SSE). The least squares line is the line that minimises the SSE.
Because squaring penalises large residuals more heavily than small ones, this line
balances the overall goodness of fit against the size of individual errors.
SSE = ∑(yi - ŷi)²
where yi is the observed value and ŷi is the value predicted by the line for the
i-th data point.
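The cancellation problem described above can be seen in a few lines of Python. This is a minimal sketch: the data points and the candidate line are hypothetical values chosen for illustration, not data from this case study.

```python
# Hypothetical data points and a candidate line y = m*x + b (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.5, 5.5, 8.0]
m, b = 2.0, 0.0

# Residual = observed value minus the value predicted by the line.
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]

# Positive and negative residuals cancel in a plain sum ...
sum_of_residuals = sum(residuals)

# ... but not after squaring, so the SSE still registers the misfit.
sse = sum(r * r for r in residuals)

print("residuals:", residuals)
print("sum of residuals:", sum_of_residuals)
print("SSE:", sse)
```

Here the plain sum of residuals is zero even though two points lie off the line, while the SSE is positive.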
Let us look at a simple example. Ms. Dolma tells her class, "Students who spend
more time on their assignments are getting better grades." A student wants to
estimate the grade he would get for spending 2.3 hours on an assignment. With the
least-squares method, he can build a predictive model from past data on hours and
grades and use it to estimate the grade far more reliably than by guessing. The
method is simple to apply: it requires nothing more than some data and perhaps a
calculator.
The least-squares method is a statistical method used to find the line of best fit,
of the form y = mx + b, for the given data. The graph of this equation is called
the regression line. The objective is to make the sum of the squares of the errors
as small as possible, which is why the method is called the least-squares method.
It is widely used in data fitting, where the best fit is the one that minimises the
sum of squared errors, each error being the difference between an observed value
and the corresponding fitted value. The sum of squared errors also measures how
much of the variation in the observed data the fitted line leaves unexplained. For
example, we have 4 data points and using this method we arrive at the following
graph.
The least-squares method finds the curve that best fits a set of observations by
minimising the sum of squared residuals, or errors. Let the given data points be
(x1, y1), (x2, y2), (x3, y3), …, (xn, yn), in which the x's are values of the
independent variable and the y's are values of the dependent variable. The method
is used to find a straight line of the form y = mx + b, where m is the slope and b
is the y-intercept. The formulas for the slope m and the intercept b are:
m = (n∑xy - ∑x∑y)/(n∑x² - (∑x)²)
b = (∑y - m∑x)/n
The steps to fit the least-squares line using the above formulas are as follows.
Step 1: Draw a table with 4 columns where the first two columns are for x and y
points.
Step 2: In the next two columns, find xy and x².
Step 3: Find ∑x, ∑y, ∑xy, and ∑x².
Step 4: Find the value of slope m using the above formula.
Step 5: Calculate the value of b using the above formula.
Step 6: Substitute the value of m and b in the equation y = mx + b
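Find the value of m by using the formula,
m = 65/50 = 13/10 = 1.3
Find the value of b by using the formula,
b = (∑y - m∑x)/n
b = (25 - 1.3×15)/5
b = (25 - 19.5)/5
b = 5.5/5 = 1.1
Substituting m and b into y = mx + b, the line of best fit is y = 1.3x + 1.1.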
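The six steps can be sketched in Python as follows. The function name `least_squares_line` is an illustrative choice, and the data points are hypothetical: they were picked only so that n = 5, ∑x = 15 and ∑y = 25 match the values used in the calculation above, and they should not be read as the original data set.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for the line y = m*x + b using the summation formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    # Steps 1-3: build the column sums of x, y, x*y and x^2.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Step 4: slope m = (n*Sxy - Sx*Sy) / (n*Sx2 - Sx^2).
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    m = (n * sum_xy - sum_x * sum_y) / denominator

    # Step 5: intercept b = (Sy - m*Sx) / n.
    b = (sum_y - m * sum_x) / n
    return m, b


# Hypothetical data (assumption): chosen so that n = 5, sum(x) = 15, sum(y) = 25.
xs = [1, 2, 3, 4, 5]
ys = [3, 3, 5, 6, 8]

# Step 6: substitute m and b into y = m*x + b.
m, b = least_squares_line(xs, ys)
print(f"y = {m:.2f}x + {b:.2f}")
```

For these points the numerator is 65 and the denominator is 50, so the sketch gives the same slope and intercept as the hand calculation.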
The choice of which least squares method to use depends on the specific characteristics
of the data and the type of relationship you're trying to model. Understanding these
different types will equip you to select the most appropriate method for your analysis.
A Population Growth Data Curve Fitting
Understanding population dynamics is essential to comprehending both biological
systems and human society. Forecasting and evaluating population growth
accurately helps in planning for future needs and also sheds light on the factors
that drive environmental and social change. To gain this insight, curves are
frequently fitted to population data points that have been gathered over time.
We'll examine the least squares method in this investigation, which is an effective
tool for curve fitting and may be used to examine population growth data. We'll
learn how this approach enables us to identify the mathematical formula that most
accurately represents the underlying trend in population data, giving us the ability
to project future growth patterns and obtain understanding of the variables behind
these shifts.
A reliable and popular method for analysing population data is the least squares
method. It gives a statistically sound basis for understanding population growth
trends by minimising the squared differences between the actual population data
and the values predicted by the fitted curve. As we go along, we'll look at the
mathematical foundations of this approach, examine the models that are frequently
applied to population growth, and see what these models can teach us about the
dynamics of human populations.
REAL LIFE CASE STUDY
Scenario: São Paulo, Brazil, is one of the largest urban areas in the world. To
allocate resources and plan urban development, it is essential to understand how
its population has grown, and in particular the rate of urbanisation. São Paulo's
population was recorded every five years from 1970 to 2020.
Here's how to fit a straight line to the São Paulo population growth data using the
least squares method:
1. Calculations:
Mean of Years (x̄): Σx / n (sum of all years, measured from 1970, divided by the number
of data points)
Mean of Population (ȳ): Σy / n (sum of all population values divided by the number of
data points)
Σxy: Σ(x - x̄)(y - ȳ) (sum of the products of the deviations of years and population from
their means)
Σx²: Σ(x - x̄)² (sum of squared deviations of years from their mean)
2. Data:
Years since 1970 (x)    Population (y, presumably in millions)
0                       8.4
5                       9.8
10                      11.4
15                      13.2
20                      14.8
25                      16.4
30                      17.8
35                      19.1
40                      20.4
45                      21.7
50                      22.5
3. Calculate Means:
Mean of Years (x̄): (0 + 5 + 10 + 15 + 20 + 25 + 30 + 35 + 40 + 45 + 50) / 11 = 275 / 11 = 25
Mean of Population (ȳ): (8.4 + 9.8 + 11.4 + 13.2 + 14.8 + 16.4 + 17.8 + 19.1 + 20.4 + 21.7
+ 22.5) / 11 = 175.5 / 11 ≈ 15.9545
4. Calculate Σxy and Σx²:
Σxy: [(0-25) * (8.4-15.9545) + (5-25) * (9.8-15.9545) + ...] = 799.5
Σx²: [(0-25)² + (5-25)² + ...] = 2750
5. Calculate Slope and Intercept:
Slope (β): Σxy / Σx² = 799.5 / 2750 ≈ 0.2907
Y-Intercept (α): ȳ - β x̄ = 15.9545 - 0.2907 * 25 ≈ 8.69
Interpretation:
The equation y = 0.291x + 8.69 represents the least squares regression line for the São
Paulo population growth data. Here's how to interpret it:
Slope (β): 0.291 indicates a positive trend: on average the population grew by about 0.29
per year, or roughly 1.45 per five-year interval, in the units of the data. A straight line
has a constant slope, so it cannot show growth slowing down. The slowdown is visible in
the data itself, where the five-year increase shrinks from 1.6 to 1.8 in the 1980s to 0.8
between 2015 and 2020.
Y-Intercept (α): 8.69 is the fitted population in 1970 (adjusted year 0). It is close to, but
not equal to, the recorded value of 8.4, because the line is fitted to all eleven points
rather than forced through the first one.
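The calculation steps above can be checked with a short script. This is a minimal sketch that applies the deviation-from-mean formulas to the eleven data points listed in step 2. The variable names are illustrative, and the script prints the sums, slope and intercept instead of assuming them.

```python
# Years since 1970 and the corresponding population values from the data table.
years = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
population = [8.4, 9.8, 11.4, 13.2, 14.8, 16.4, 17.8, 19.1, 20.4, 21.7, 22.5]

n = len(years)
x_mean = sum(years) / n
y_mean = sum(population) / n

# Sum of products of deviations, and sum of squared deviations of the years.
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(years, population))
s_xx = sum((x - x_mean) ** 2 for x in years)

# Slope and intercept of the least squares line y = alpha + beta * x.
beta = s_xy / s_xx
alpha = y_mean - beta * x_mean

print(f"x_mean = {x_mean:.4f}, y_mean = {y_mean:.4f}")
print(f"S_xy = {s_xy:.4f}, S_xx = {s_xx:.4f}")
print(f"y = {beta:.4f}x + {alpha:.4f}")
```

Because the population values rise in every interval, the fitted slope is positive.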
Logistic Model: As São Paulo matures, its growth might slow down or stabilize. We
can explore a logistic model, y = K / (1 + a·e^(-bx)), where K is the level at which
the population saturates, which captures this trend. The least squares method can be
applied to both the linear and the logistic model, and the one with the higher
R-squared value fits the data better.
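One way to carry out this comparison is sketched below. It assumes the three-parameter logistic form y = K / (1 + a·e^(-bx)) and a hypothetical saturation level K = 25, which is an assumption made only for illustration and is not a value from the case study. With K fixed, the model is linearised as ln(K/y - 1) = ln(a) - bx and fitted by ordinary least squares. The helper names `fit_line` and `r_squared` are illustrative.

```python
import math

years = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
population = [8.4, 9.8, 11.4, 13.2, 14.8, 16.4, 17.8, 19.1, 20.4, 21.7, 22.5]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_mean - slope * x_mean


def r_squared(observed, predicted):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Linear model fitted directly to the data.
slope, intercept = fit_line(years, population)
linear_pred = [intercept + slope * x for x in years]

# Logistic model with an ASSUMED saturation level K (hypothetical value).
K = 25.0  # must exceed the largest observed population for the log to exist
z = [math.log(K / y - 1) for y in population]  # linearised response
z_slope, z_intercept = fit_line(years, z)
b = -z_slope
a = math.exp(z_intercept)
logistic_pred = [K / (1 + a * math.exp(-b * x)) for x in years]

r2_linear = r_squared(population, linear_pred)
r2_logistic = r_squared(population, logistic_pred)
print(f"R^2 linear   = {r2_linear:.4f}")
print(f"R^2 logistic = {r2_logistic:.4f} (K assumed to be {K})")
```

The linearised fit minimises squared errors in the transformed variable, not in the population itself, and the result depends on the assumed K. A nonlinear least squares routine that also estimates K would be the more careful choice.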
Time Period: Extending the data range beyond 1970-2020 might reveal historical
inflection points where growth patterns shifted. This can inform model selection and
improve the accuracy of our analysis.
Incorporating Additional Data:
ADVANTAGES
The least squares method has emerged as a powerful tool for analyzing population
growth data. Here's a closer look at its key advantages in curve-fitting for population
studies:
2. Model Flexibility:
Least squares isn't limited to fitting straight lines. It can be applied to various
models, including exponential, logistic, or even higher-order polynomial functions.
This flexibility allows us to capture diverse population growth patterns, from rapid
exponential growth in developing countries to the stabilizing trends observed in
developed nations.
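As an illustration of this flexibility, an exponential model y = A·e^(rx) can be fitted with the same straight-line machinery by taking logarithms, since ln(y) = ln(A) + rx is linear in x. The sketch below uses synthetic, noise-free data generated from assumed values A = 2 and r = 0.3, and the function name `fit_exponential` is hypothetical.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = A * exp(r * x) by least squares on the pairs (x, ln y)."""
    if any(y <= 0 for y in ys):
        raise ValueError("all y values must be positive to take logarithms")
    n = len(xs)
    log_ys = [math.log(y) for y in ys]
    x_mean = sum(xs) / n
    l_mean = sum(log_ys) / n
    s_xl = sum((x - x_mean) * (l - l_mean) for x, l in zip(xs, log_ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    r = s_xl / s_xx                      # growth rate = slope of the log-line
    A = math.exp(l_mean - r * x_mean)    # initial level = exp(intercept)
    return A, r


# Synthetic data (assumption): generated from A = 2, r = 0.3 without noise.
xs = [0, 1, 2, 3, 4]
ys = [2.0 * math.exp(0.3 * x) for x in xs]

A, r = fit_exponential(xs, ys)
print(f"A = {A:.4f}, r = {r:.4f}")
```

With noisy data the log transform weights relative errors, not absolute ones, so the result can differ from a direct nonlinear least squares fit.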
While the least squares method offers a powerful tool for analyzing population
growth data, it's not without limitations. Here's a closer look at some key
disadvantages to consider:
APPLICATIONS
The least squares method extends far beyond its role in curve-fitting population
growth data. Its versatility makes it a cornerstone technique in various scientific
disciplines, engineering applications, and even everyday life. Here's a glimpse into
its diverse applications:
1. Science and Engineering:
Physics: Least squares is used to analyze experimental data in physics, fitting
models to observations and estimating physical constants (e.g., analyzing the
relationship between pressure and volume in gases using Boyle's Law).
Chemistry: In chemistry, least squares is used to analyze data from titrations
(determining unknown concentrations) or fitting models to spectroscopic data to
identify chemical compounds.
Engineering: Engineers use least squares to analyze stress-strain relationships in
materials, calibrate sensors, and optimize design parameters for structures and
machines.
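The Boyle's Law example can be sketched as a one-parameter least squares fit. Boyle's Law states that P = k / V at constant temperature, so regressing P on u = 1/V through the origin gives k = ∑(u·P) / ∑(u²). The measurements and their units below are hypothetical, generated without noise from k = 100.

```python
def fit_boyle_constant(volumes, pressures):
    """Least squares estimate of k in P = k / V (regression through the origin)."""
    inverse_volumes = [1.0 / v for v in volumes]
    numerator = sum(u * p for u, p in zip(inverse_volumes, pressures))
    denominator = sum(u * u for u in inverse_volumes)
    return numerator / denominator


# Hypothetical measurements (assumption): volumes in litres, pressures in kPa.
volumes = [1.0, 2.0, 4.0, 5.0]
pressures = [100.0, 50.0, 25.0, 20.0]

k = fit_boyle_constant(volumes, pressures)
print(f"estimated k = {k:.2f}")
```

With real measurements the pressures would scatter around k / V, and the same formula would return the k that minimises the sum of squared pressure errors.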
4. Everyday Applications:
Search Engines: Some ranking and recommendation systems use regression models
fitted by least squares to estimate how relevant a result is to your query, aiming to
minimize the discrepancy between predicted and observed relevance.
Image and Signal Processing: Least squares is used in image processing for tasks
like noise reduction and image compression. It's also used in signal processing to
filter out unwanted noise from signals.
Beyond the List: The Power of Versatility
The applications of least squares extend far beyond the examples listed here. Its
ability to find the "best fit" line or curve through a dataset makes it a valuable tool in
any field where analyzing data and uncovering underlying relationships is crucial. As
new scientific and technological advancements emerge, we can expect even more
innovative applications of the least squares method to take shape.
CONCLUSION
Our exploration has shed light on the power and limitations of least squares in
analyzing population growth data. We've established its role in curve-fitting, future
growth estimation, and informing policy decisions. A truly comprehensive
understanding of population trends also requires combining least squares with
other methods and with additional data.