Introduction to Microeconomics
Table of Contents
1. Learning Outcomes
2. Introduction
3. Evolution of the Subject
4. Methodology of Economics: Positive Economics and Normative Economics
5. Art or Science
6. Scope of Economics - Related Subjects
7. Models and Hypotheses
8. Market and Equilibrium
8.1 Demand and Supply
9. Concept of ceteris paribus – General Equilibrium and Partial Equilibrium
10. Static and Dynamic Equilibrium
11. Short-Run and Long Run Equilibrium
12. Nobel Prize in Economics
13. Summary
14. Exercises
15. Glossary
16. References
17. Activity
18. Quiz
1. Learning Outcomes
After you have read this chapter, you should be able to:
- define Micro-Economics, Macro-Economics, Market, Demand, Supply, Equilibrium, Partial
and General Equilibrium, Static and Dynamic Equilibrium
- understand the central problems of an economy
- identify variables, constants and parameters
- differentiate Micro-Economics from Macro-Economics
- appreciate the scope of the subject of Economics
- apply the knowledge of basic Economics
2. Introduction
different Y's). Each such combination has its position on the X-Y plane. Join them to
get the Production Possibility Curve (PPC).
Each point on the PPC represents a maximum of X (at a certain Y) or a maximum of
Y (at a certain X). All points on or below the PPC represent combinations of
X and Y that are attainable by the society concerned, but only points on the PPC
represent the maximum X (given the Y's) or the maximum Y (given the X's). Points
below the PPC (including the two axes and so, the origin) represent what the society
concerned can produce, but without using its (scarce) resources fully.
When, for some reason or the other, the society becomes capable of producing more
of X (at every given Y) or more of Y (at every given X), the PPC shifts
outward. This indicates Economic Growth. When the reverse happens, the PPC
shrinks back.
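The regions described above can be illustrated with a small sketch. The text does not specify the shape of the PPC, so the quarter-circle frontier X^2 + Y^2 = capacity^2, the function names and all the numbers below are assumptions made purely for illustration.

```python
import math

def max_y(x, capacity=10.0):
    """Largest Y producible at a given X on an ASSUMED frontier
    X^2 + Y^2 = capacity^2 (hypothetical shape, not from the text)."""
    return math.sqrt(max(capacity ** 2 - x ** 2, 0.0))

def classify(x, y, capacity=10.0):
    """Say whether the combination (x, y) is on, below or beyond the PPC."""
    if x < 0 or y < 0 or x > capacity:
        return "unattainable"
    frontier = max_y(x, capacity)
    if math.isclose(y, frontier, abs_tol=1e-9):
        return "on the PPC"          # resources fully used
    if y < frontier:
        return "below the PPC"       # attainable, resources not fully used
    return "unattainable"            # beyond the frontier

print(classify(6, 8))                # on the frontier
print(classify(3, 4))                # inside the frontier
print(classify(8, 8))                # outside the frontier
# Economic growth: a larger (hypothetical) capacity shifts the PPC outward,
# so a previously unattainable combination becomes attainable.
print(classify(8, 8, capacity=12.0))
```

Raising `capacity` plays the role of the outward shift of the PPC; lowering it would make the curve shrink back.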
Economics studies these Central Problems of human society, and has by now come
to study much more.
How did economics evolve as a subject?
Etymologically, the word Economics derives from the Greek words oikos (house) and
nomos (management).
A wife doing a good job of running the household was often called thrifty or
economical. However, from the second half of the seventeenth century, the word
Economics came to be applied to a much wider context, viz., the management of
the various resources of a whole country or nation.
Adam Smith is known as the 'father' of the subject of Economics. His book An
Inquiry into the Nature and Causes of the Wealth of Nations, first published in 1776,
is widely regarded as the first comprehensive treatise on Economics. Smith's concern
was about nations or countries, that is, it was a macro-type concern, although the
term macro was not in use then.
Later T.R. Malthus, David Ricardo and even J.S. Mill wrote important treatises on the
subject, taking the same overall perspective. They made sweeping generalizations,
taking long-run perspectives. They are known as Classical economists. Thus, by and
large, Classical economists did not take a micro-approach but rather a macro one.
Towards the end of the 19th century, economists began to study economic issues on
a more specific and individual level. A new or neo approach evolved that has come to
be called the Neo-Classical approach. It concentrated on how the price and quantity
of specific goods (and services) were determined in the market through a rational
balancing of their 'marginal' costs and benefits ('utilities' and
'productivities'). Foremost among these Neo-Classical economists (also described as
'marginalists') were Menger, Jevons and Alfred Marshall. It is their work, especially
Marshall's Principles of Economics (first published in 1890), that constitutes the
foundation of Micro-economics. There evolved the concept of the 'economic
man', that is, the individual producer or consumer, who made his choices or
decisions perfectly rationally. On the basis of his preferences or cost patterns, the
consumer's objective was always to maximize his utility or satisfaction and the
producer's, to maximize his profits. The individual consumer or producer was the unit
concerned, not the entire national entity.
Although the Classical economists had been concerned with the nation or the country
as a whole, and therefore are more macro than micro, Macro-economics as an area
of study began really after the Great Depression of the 1930s. Europe and America
both suffered from it and so did their colonies in other parts of the globe. Widespread
unemployment followed the closing down of production units. Both employers and
employees suffered. It was then that John Maynard Keynes came up with his
analysis of the phenomenon in terms of Aggregate Demand falling short of
Aggregate Supply and emphasized the role of the Government of a country in
stepping up its own expenditure in order to give Aggregate Demand a boost.
His analysis laid the foundation of Macro-Economics. Later John Hicks, Milton
Friedman, Robert Lucas and others contributed to the subject of Macro-Economics.
Paul A. Samuelson has emphasized that there is no essential opposition between
Macro-Economics and Micro-Economics. Both are "vital" to the understanding of the
subject (Economics, 7th edn, p 362). It is usual in most universities to offer a course
in Micro-Economics prior to that in Macro-Economics. But it is not a necessary
practice and is changing.
In the words of Paul A. Samuelson, "Macroeconomics deals with the big picture –
with the macro aggregates of income, employment, and price levels. But do not
think that microeconomics deals with unimportant details. After all, the big picture is
made up of its parts." (Economics, 7th edn, p 362)
5. Art or Science
A Social Science
Even if we use the term science to describe Economics, we must remember that it is
a Social Science. It does not study individuals in isolation, each doing everything by
himself or herself. It studies individuals as members of a society, nation or Economy.
An economy is the same as a country or society but considered only in its economic
aspects. Every society or country has numerous people engaged in activities of all
sorts. Some work in the fields, some work in factories, and yet others in offices.
Some perform agricultural activities, some industrial, and some provide services. Those
who are in agriculture need to get industrial products and, say, banking services.
Those who are factory-workers, say, need to get hold of foodstuff, and use some
kind of transport services. The people engaged in the services sector need both food
and clothing. Thus all the three sectors, with their separate kinds of activities, need to
have relations. All the people of an economy need to act as well as inter-act. This
they do by exchanging the products of their various activities in various markets.
The epithet 'Social' covers this aspect of the subject of Economics.
However, for analytical purposes, Economics sometimes uses the concept of a
Robinson Crusoe Economy, or an economy consisting of a single person performing
all the economic activities by himself. Robinson Crusoe is the title of a novel written
in 1719 by Daniel Defoe, thought to be based on the life of Alexander Selkirk, who was
marooned on an island for over four years. In the novel, Crusoe survives all by
himself on his island for 28 years. A Robinson Crusoe Economy is
thus a theoretical concept where the economy has a single member.
Economics has a wide scope and has connections with various subjects.
Mathematics and Statistics are necessary for the study of Economics. Mathematics
helps economists to analyze economic realities and to derive conclusions from
them. Statistics aids this process by systematizing the economic realities as data and
inferring from them by accepted statistical tools. In fact, the application of Statistics
to Economics has led to the development of a relatively new subject: Econometrics.
It helps in empirical study and in making projections both into the past and the future.
Without a sound mathematical base, it is next to impossible to cope with academic
Economics. However, to have a general awareness of the economic occurrences of
the world, basic intelligence will do. To quote Samuelson, "Although every
introductory textbook must contain geometrical diagrams, knowledge of
mathematics itself is needed only for the higher reaches of economic theory. Logical
reasoning is the key to success in the mastery of basic economic principles, and
shrewd weighing of empirical evidence is the key to success in mastery of economic
applications." (Economics, Ch 1, p 5)
Actually, the earlier term for Economics was Political Economy. Several universities
still have a common department for Politics and Economics. Political Science is a
useful subject to supplement a course in Economics. History is also a subject that
has a close connection with Economics. Economic History is a compulsory paper in
many courses in Economics, undergraduate as well as post-graduate. Several
universities offer a post-graduate course in Economic Geography.
In recent times several subjects or courses have emerged from Economics, e.g.,
Commerce, Business Economics, Business Administration, Business Management.
While based on the fundamentals of Economics, they have their own distinctive
course contents. But papers on both Micro-Economics and Macro-Economics figure in
all of them.
Economics has to deal with a complex mass of realities. So it sometimes puts them
into a simplified framework or Model. A Model is a theoretical construct that
represents economic realities by a set of inter-related variables. These relationships
can be logical or quantitative. Putting them in a Model helps economists to
analyze realities better and even to make predictions about the future.
Economists often posit or propose explanations for economic phenomena. These are
known as Hypotheses. A hypothesis is not a theory. Only if a Hypothesis is verified
or found to be true, can we call it a Theory. To be verified or falsified, that is, tested,
a hypothesis has to be framed in a certain way. Such a hypothesis is called a
Scientific hypothesis. Sometimes economists have no alternative but to take a
certain hypothesis to be true, and proceed on the basis of it. Such a hypothesis is
called a Working hypothesis. Statistics and Econometrics are the tools used in
verifying a hypothesis.
Laws of Economics
The Classical and Neoclassical economists often used the term 'law' to describe the
tendencies that they observed in the functioning of the economy or society. The Law of
Demand, the Law of Diminishing Returns, Say's Law and Okun's Law are just a few
examples. In no sense are these binding or enforceable or universal laws.
However, law in the usual sense of the term does have a close connection with
Economics. For the market to function well, there must be law and order in the
country. This is a basic idea of Neo-Classical Economics. Laws influence economic
occurrences. For example, the Permanent Settlement of 1793 had a far-reaching
influence on India’s agriculture. After Independence, the government had to pass
several Abolition of Intermediaries Acts in order to correct the agricultural situation.
The Monopolistic and Restrictive Trade Practices Act, the Consumer Protection Act and
the various economic reforms all testify to the close connection between Law and
Economics.
The word Market comes from Latin mercatus which meant trading, buying or selling
at an appointed time or place. A market is not necessarily a marketplace. It is a
context or background where buying and selling are taking place. The haat, bazaar
and mandi, the shop and the mall are markets. But online or telephonic sale and
purchase, which are quite common these days, are also market transactions.
The distinguishing feature of the market is that market transactions are exchanges,
usually performed through the medium of money. The seller (who is sometimes
though not always the producer) of certain commodities/services brings them to the
market and offers certain quantities of them at a certain price. He
thus supplies them in the market. The (prospective) buyer comes to the market
wanting to get certain commodities/services at a certain price. He thus demands
them in the market. If the demand of the buyer and the supply of the seller match at
a certain configuration of price and quantity, the transaction takes place. If not, it
does not.
The transaction is thus both a sale and a purchase. It is a sale from the point of view
of the Seller (producer), that is, from the Supply side. It is a purchase from the point
of view of the Buyer, that is, the Demand side.
The transaction has two aspects or dimensions to it, viz., a quantity and a price. For
example, the seller is agreeable to selling 2 kgs of rice at the rate of Rs 50 per kg, and
the buyer finds this offer reasonable. "Two kgs of rice at Rs 50 per kg" is then the
description of the transaction. The total amount spent by the buyer/consumer and
received by the seller/supplier is thus Rs 100 (50 x 2), and this is called the
Expenditure from the buyer's point of view and the Revenue from the seller's. The
transaction configuration and the total expenditure/revenue are thus distinct
concepts.
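The arithmetic of the rice example can be written out as a minimal sketch. The function and variable names are assumptions introduced here for illustration; the figures are those of the example.

```python
# Transaction configuration from the example: a price and a quantity.
price_per_kg = 50   # Rs per kg of rice
quantity_kg = 2     # kgs of rice

def transaction_value(price, quantity):
    """Total amount changing hands: price multiplied by quantity."""
    return price * quantity

# The same amount is Expenditure for the buyer and Revenue for the seller.
expenditure = transaction_value(price_per_kg, quantity_kg)
revenue = transaction_value(price_per_kg, quantity_kg)

print("Configuration:", (quantity_kg, price_per_kg))
print("Expenditure (Rs):", expenditure)
print("Revenue (Rs):", revenue)
```

The configuration is a pair of numbers (quantity, price), whereas expenditure or revenue is a single number derived from it, which is why the two are distinct concepts.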
The transaction configuration is known as the Equilibrium configuration, or simply,
Equilibrium. It is called so because it represents a matching or balancing of two
aspects – the Buyer's and the Seller's, that is, the Demand side and the Supply side.
In Latin, aequus means equal and libra means scales or balances. (That is why in the
Zodiac, the sign Libra is shown by a pair of scales.) When the two pans on the two
sides of a pair of scales hang at the same level, there is aequilibrium, or, in
English, Equilibrium. Neither of the pans goes up or down any more, and unless there
is some external disturbance, the balance, or equilibrium, holds.
The word Demand is from Latin demandare which means to claim or commission.
Supply is from Latin supplere which means to fill up or complete.
In the context of Economics it was Adam Smith in 1776 who first used them as
corresponding concepts. Marshall has compared them to the two blades of a pair of
scissors. Just as the scissors cannot work without either of the two blades, Market
Equilibrium cannot be determined without reference to both Demand and Supply.
There exists at any one time a definite relationship between the market
price of a good and the quantity demanded of that good. This relationship
between price and quantity demanded/bought is called the Demand
schedule or Demand function or Demand curve.
One usual form that the Demand curve can take is downward-sloping from left to
right. Based on the Demand schedule below, this is depicted as follows:
Demand Schedule
Point   Price (P)    Quantity Demanded (Qd)
        Rs per kg    Kg
A       5            9
B       4            10
C       3            12
D       2            15
E       1            20
Demand Curve
Prices are measured on the vertical axis and the quantities demanded on the
horizontal. Each pair of Q,P numbers from the Demand Schedule is plotted here as a
point on the Q-P plane, and a smooth curve passed through the points to yield the
Demand 'curve'. It slopes downwards from left to right, showing an Inverse or
Negative relation between price and quantity.
There exists at any one time a definite relationship between the market
price of a good and the quantity the producers of that good are willing to
offer or supply. This relationship between price and quantity supplied is
called the Supply schedule or Supply function or Supply curve.
Based on the Supply schedule below, a supply curve can be depicted. Usually it
slopes upwards from left to right.
Supply Schedule
Point   Price (P)    Quantity Supplied (Qs)
        Rs per kg    Kg
A       5            18
B       4            16
C       3            12
D       2            7
E       1            0
Prices are measured on the vertical axis and the quantities supplied on the
horizontal. Each pair of Q,P numbers from the Supply Schedule is plotted here as a
point on the Q-P plane, and a smooth curve passed through the points to yield the
Supply 'curve'. It slopes upwards from left to right, showing a Direct or Positive
relation between price and quantity.
To find the Equilibrium, the two schedules must be matched or the two curves
superimposed on each other. At the price where the quantity demanded is the same
as the quantity offered, that is, at the point where the Demand curve and the Supply
curve intersect, there is a perfect matching or balancing, i.e., equilibrium.
Putting the two schedules together, we find that only at P=3 will both Qd and Qs be
the same, viz., 12. Putting the two curves together, we find that they intersect
only at the point (12, 3). At the point (12, 3), thus, there is equilibrium. This
equilibrium holds until and unless there is some external reason tipping the scales
either way. At any price lower than Rs 3 per kg, suppliers would not come forth with
the quantity that the buyers are demanding, so there is Excess Demand. At any price
that is higher, buyers will not be demanding the quantity that suppliers are willing
to supply at those (higher) prices, so there is Excess Supply. At any price higher or
lower than Rs 3 per kg, there will thus be Excess Supply or Excess Demand in the market.
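The matching of the two schedules can be checked mechanically. The sketch below uses the price and quantity figures of the two schedules; the dictionary layout and function names are assumptions made for illustration.

```python
# Schedules from the text: price (Rs per kg) -> quantity (kg).
demand = {5: 9, 4: 10, 3: 12, 2: 15, 1: 20}   # Qd at each price
supply = {5: 18, 4: 16, 3: 12, 2: 7, 1: 0}    # Qs at each price

def excess_demand(price):
    """Qd minus Qs: positive means Excess Demand, negative Excess Supply."""
    return demand[price] - supply[price]

def find_equilibria():
    """Return every (quantity, price) pair at which Qd equals Qs."""
    return [(demand[p], p) for p in sorted(demand) if excess_demand(p) == 0]

equilibria = find_equilibria()
print("Equilibrium (Q, P):", equilibria)

for p in sorted(demand, reverse=True):
    gap = excess_demand(p)
    if gap > 0:
        status = f"Excess Demand of {gap} kg"
    elif gap < 0:
        status = f"Excess Supply of {-gap} kg"
    else:
        status = "Equilibrium"
    print(f"P = {p}: Qd = {demand[p]}, Qs = {supply[p]} -> {status}")
```

Only the price of Rs 3 per kg gives a zero gap; every higher price shows Excess Supply and every lower price shows Excess Demand.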
The above Demand and Supply are individual in nature, belonging to an individual
person, household or firm. In Macro-Economics the corresponding concepts are
Aggregate Demand and Aggregate Supply. They represent the total demand and
supply of the economy as a whole.
Types of Markets
Markets can be of different types depending on the type of goods and services being
bought and sold in them.
The most common is the market for specific goods or commodities, i.e., of concrete,
physical things like items of food and clothing. Services can also be bought and sold,
e.g., travel and entertainment, or the services of physicians and lawyers. The Share
Market is where shares of various companies are bought and sold. Domestic market
refers to markets within the boundaries of a country, whereas the Foreign or
International market refers to transactions taking place across two different
countries using two different currencies. All these markets come under the purview
of Economics. But there is a difference in the way in which Micro-Economics and
Macro-Economics look at markets.
Micro-Economics looks at markets in the sense of individual buyers (consumers or
households) and individual sellers (producers or firms) coming together to perform
their respective roles in the market transactions. It is concerned with whether there
are numerous buyers and sellers or just a few (or even one), whether the product
(good, commodity, or service) is homogeneous or differentiated, whether there is
perfect information about the products (output) and factors of production (input),
whether the factors (inputs) can freely move between alternative uses, and such
conditions. Depending upon the configuration of such conditions, the market takes
different forms such as Perfect Competition, Monopolistic Competition, Monopoly, and
so on. A large part of Micro-Economics is devoted to the study of these market
forms.
In Macro-Economics, the markets concerned are overall or aggregate in nature.
In addition to studying the exchange of actual goods and services, Micro-Economics
also studies the attainment of Satisfaction or Welfare that comes from such
exchange, both at individual and social levels. In fact, this is one of the basic
questions that Adam Smith was preoccupied with. How is Social Welfare, as distinct
from the welfare of individuals, to be reached? Welfare Economics is the part of
Micro-Economics which studies this, and there is no counterpart in Macro-Economics.
Economics is a complex subject, rooted in the reality but often analyzed through
abstract thinking and mathematical methods.
As symbols of that reality, Economics makes use of the Mathematical concepts:
Variables, Constants and Parameters.
Variables are entities that take different values. They are usually symbolized by x, y,
z, and take values positive and negative, ranging from minus infinity to plus infinity.
Constants are entities that, for one particular analytical exercise, take one
particular value. They are usually symbolized by a, b, c, or alpha, beta, gamma. They,
again, can take any value between minus and plus infinity but can take only one such
value during a particular analysis.
Parameters are entities that can be assigned different values for different variants of
an exercise but, in any one particular variant, can take only one such value.
Variables can be dependent or independent. An Independent variable takes on
values by itself. A Dependent variable takes on values according to or as per the
Independent variable. This relation of dependence between the Independent and the
Dependent variable(s) is known as a functional relationship, or simply, a Function. It
means that the Dependent variable functions according to the Independent variable.
It is a most powerful tool in the study of Economics, both Micro and Macro.
In Economics, a Function may involve more than one variable. Usually, several
variables are interlinked. To examine whether any two have a causal (cause-effect)
relationship, it may be necessary to rule out others that complicate the issue or get
in the way of analyzing it. Then what is done is to make an assumption known as the
ceteris paribus assumption.
In Latin, Ceteris means 'other things or the rest' and Paribus means 'at par or
equal'. The phrase ceteris paribus thus means 'other things being the same'. It
qualifies or conditions a causal relationship between an independent variable and the
dependent variable that depends on it or functions according to it.
Suppose we take up the following Functional Relationship:
The Quantity (Qx) of a Commodity being demanded (the commodity being symbolized by x)
depends on the Price (Px) of the Commodity, the Prices of other commodities (say, y
and z) that can complement or substitute it, and the Income (Y) and Tastes (T) of the
person making the demand.
Symbolically this can be written as
Qx = f(Px, Py, Pz, Y, T)
where Qx is the dependent variable, Px, Py, Pz, Y and T are the independent
variables, and f is the functional form.
Now if we want to focus on the causal relationship between the Price of the
commodity (Px) and the Quantity of it that is demanded (Qx), and for the time being
put aside the prices of other commodities and the income and tastes of the consumer,
this can be written as
Qx = f(Px), ceteris paribus.
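The ceteris paribus step can be sketched in code. The text leaves the functional form f unspecified, so the linear form and every coefficient below are assumptions chosen only to make the idea concrete; `functools.partial` is used to hold the other variables fixed.

```python
from functools import partial

def quantity_demanded(px, py, pz, income, taste):
    """Qx = f(Px, Py, Pz, Y, T) with a HYPOTHETICAL linear form.
    Here y is treated as a substitute (+) and z as a complement (-)."""
    return 20 - 2.0 * px + 0.5 * py - 0.5 * pz + 0.01 * income + taste

# Ceteris paribus: freeze Py, Pz, Y and T at illustrative values,
# leaving a function of Px alone, i.e. Qx = f(Px).
demand_given_px = partial(quantity_demanded, py=4, pz=2, income=100, taste=0)

for px in (1, 2, 3):
    print(f"Px = {px}: Qx = {demand_given_px(px)}")
```

With the other variables frozen, any change in Qx across the loop can be attributed to Px alone, which is exactly what the assumption is meant to achieve.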
This simple yet powerful technique, used extensively by Alfred Marshall, is known as
Partial Equilibrium Analysis. However, it lets only one market (at a time) be in
equilibrium and may not capture the complexities of the real world.
General Equilibrium Analysis is a contrasting technique, first formalized by Léon
Walras. This does not use the ceteris paribus assumption. It lets the inter-dependence
of the various variables play itself out. Prices of Commodities are
determined simultaneously and mutually. All markets are simultaneously in
equilibrium.
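The contrast between the two techniques can be illustrated with a deliberately small sketch. The two linear markets, their coefficients and the function names are all assumptions made for illustration; they are not taken from Marshall or Walras.

```python
from fractions import Fraction

# HYPOTHETICAL schedules for two related goods x and y:
#   Market x: Qd_x = 10 - 2*Px + Py,   Qs_x = 3*Px
#   Market y: Qd_y = 12 + Px - Py,     Qs_y = 2*Py
# Setting Qd = Qs in both markets gives the linear system
#    5*Px - 1*Py = 10
#   -1*Px + 3*Py = 12

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 linear system by Cramer's rule using exact fractions."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("no unique solution")
    x = Fraction(b1 * a22 - a12 * b2, det)
    y = Fraction(a11 * b2 - a21 * b1, det)
    return x, y

# General equilibrium: both prices determined simultaneously.
px, py = solve_2x2(5, -1, -1, 3, 10, 12)
print("General equilibrium prices:", px, py)

def partial_equilibrium_px(py_fixed):
    """Partial equilibrium in market x alone, holding Py fixed:
    10 - 2*Px + Py = 3*Px  ->  Px = (10 + Py) / 5."""
    return Fraction(10 + py_fixed) / 5

print("Px if Py is held at 0:", partial_equilibrium_px(0))
print("Px if Py is held at its general-equilibrium value:",
      partial_equilibrium_px(py))
```

The partial-equilibrium price of x depends on whatever value is assumed for Py, and it coincides with the general-equilibrium price only when Py happens to be held at the value that also clears market y.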
A run is a length of time, not exactly specified. If all factors of production can be
varied during a length of time, it is called the Long Run. If some factors can be
varied but others cannot, i.e., are fixed, it is the Short Run. A Short Run Equilibrium
is one that holds
The highest recognition for economists is the "Sveriges Riksbank Prize in Economic
Sciences in Memory of Alfred Nobel". Though created by Sweden's Central Bank in
1968, nearly 75 years after the Nobel Prizes in physics, chemistry, literature, peace and
medicine/physiology were set up in 1895, this is regarded as the Nobel Prize in
Economics. The first two to receive it were Ragnar Frisch and Jan Tinbergen in
1969. Paul A. Samuelson received it in 1970. In 1998, it went to Amartya Sen from
India.
13. Summary
Economics studies human choice among alternative uses of scarce
resources.
It is a Social Science and has a wide scope. It aids the understanding of
the central problems of an economy.
Demand and Supply of goods and services determine their
Equilibrium Price and Quantity in the Market.
Markets can be of various forms.
Equilibrium can be Partial and General, Long-Run and Short-Run,
Dynamic and Static.
14. Exercises
Short Questions
Long Questions
15. Glossary
Variables
Constants
Hypothesis
Model
Demand supply
Market
Equilibrium
Static Equilibrium
Dynamic Equilibrium
Long Run
Short run
General equilibrium
Partial Equilibrium
16. References
17. Activity
Go to the nearby market for fruits and vegetables and observe the people going
about the daily business of buying and selling.
Go to a mall or supermarket and do the same.
Jot down any differences you may find.
18. Quiz
Discipline Courses-I
Semester-I
Paper I: Principles of Economics (POE)
Unit-I
Lesson: Nature and Subject Matter of Economics
Lesson Developer: Neha Goel
College/Department: Shyamlal College, University of Delhi
1: Learning Outcomes
2: Introduction
3: The Basic Competitive Model
4: Incentives and Information
4.1: Property Rights
4.2: Prices, property rights and profits
5: Rationing
6: Opportunity Sets
7: Economic Systems and Gains from Trade
8: Comparative advantage and Trade
9: Summary
10: Exercises
11: References
12: MCQs
1. Learning Outcomes
After you have read this chapter, you should be able to:
- Understand the subject matter of economics
- Explain the basic competitive model
- Understand how prices, property rights and profits provide information and
incentives
- Define rationing and understand its types
- Explain opportunity sets and economic systems
- Have a better understanding of the concept of comparative advantage, trade and
gains from trade
2. Introduction
Have you ever wondered, when you go to a market and the mango seller sells
mangoes for Rs. 50/kg, who decides the price? And if you want to buy 5 kgs, why does
the seller sell the mangoes at Rs. 40/kg? Have you ever thought about why swimming is
free in the river Ganges but paid for in a hotel swimming pool? Have you
ever wondered why Chinese items are cheap or why we export rice? Have you ever
thought about how goods and services get produced, sold or priced differently in
different markets, or how they are traded among different countries? There may be a
one-word answer to all of this: competition. We now discuss various terms and definitions
that will help us answer the questions above.
There are two kinds of participants in the market, i.e., Producers and Consumers. There
are a large number of buyers and sellers in a competitive market and thus they compete
among themselves. Producers compete with each other by providing the desired
products to the consumers at the lowest possible price, and the consumers compete with
one another by paying the price for the products they are willing to buy, while others
may not be able to afford the product. This is known as the basic competitive model.
The basic competitive model is the model which assumes that the firms are interested
in profit maximization, consumers are rational or self-interested and the
markets are perfectly competitive.
The consumers are assumed to be rational as they make choices in their own self-interest,
i.e., they make a choice such that their satisfaction is maximized. For example,
Ram may prefer leisure over work and may accept a lower income in exchange for longer
holidays, while Rahul may be ambitious and hardworking and willing to work longer hours
to fulfill his dream of buying a bungalow.
The firms are also assumed to be rational as they operate with the motive of profit
maximization.
Perfectly competitive markets are those where a single producer has no power to set
the price of a product, as there are many sellers selling homogeneous products and the
market mechanism (interaction between demand and supply) determines the price and
quantity of the product.
Prices: generally, the market mechanism, i.e., the interaction between demand and supply,
determines prices. For example, the price of an air cooler may be lower in winter due to
low demand but rises in summer as demand increases. Sometimes, the government
intervenes and determines the price either fully or partially. For example, the Government
of India (GOI) fixes the price of critical goods like petrol and necessities like wheat to
protect the interests of both producers and consumers.
** Test Yourself: What are the assumptions of the basic competitive model? How
are prices determined in a competitive market?
Incentives are the core of economics. Without incentives, why would anyone take the
risk of inventing a new product, save for future contingencies or work hard? In an
economic system where the government takes the decisions, it makes a
central plan to decide what to produce, how to allocate the resources to produce those
goods and to whom to sell the goods. This economic system has a drawback: since
nobody owns the resources that are used in the production of goods and services, the
resources may not be fully utilized. So property rights should be enforced on the
resources so that the private owner has an incentive to utilize the resources fully in
producing goods and services.
For example, the WTO (World Trade Organization) framework includes the protection of
Intellectual Property Rights (IPR). Intellectual property is an original creative work of
the mind expressed in tangible form, e.g., literary works, industrial designs etc. Thus, it
should be legally protected. IPR is needed to control or manage intellectual property so
that the creator benefits from it and thus has an incentive for innovation. IPR also
encourages and promotes creativity, as no one can copy another person's work without
permission. The internet has helped tremendously in spreading information among people,
saving time and money. Information and incentives are provided by market economies
through prices, profits and property rights.
earn income from it, to trade/exchange it with others and to enforce property rights.
Property rights are laws created by governments which keep a check on how a resource
is used and who the owner of that resource is: government, individuals or collective
bodies. For example, if you lend your car to your sister, she will not have a legal
property right to the car. Or suppose your car gets stolen: the thief will not have a
legal property right to the car but only an economic property right to it.
A good or a resource should have properly defined property rights, and the possession of
the rights must be enforced so that the use of the good or resource can be controlled.
Property can be classified into four groups:
1. Open access property:– This kind of property is neither owned, nor controlled or
managed by any individual. It is non-excludable, i.e., no individual can be excluded from
using this property. However, this kind of property can be rival, i.e., a unit used or
consumed by one consumer is no longer available to another. Thus, if someone uses it,
the quantity available for another individual gets
reduced. For example, if there is a fishery in a village, Fisherman A can catch any
amount of fish from the fishery; the more fish he catches, the fewer fish will
be available for the other fishermen. Thus, the government should define proper property
rights so that the good or resource is used ethically and remains available for all. The
government can divide the fishery into portions among the fishermen, within which each
can fish. Thus, the government can convert an open access property into
common, state or private property by enforcing property rights on it.
2. Common property:– This kind of property is jointly owned by a group of individuals.
The joint owners decide together how to use, control or manage the property and who
should be excluded from using it. The enforcement of the property rights and the benefits
from the property are shared by the joint owners, so it is easier to resolve conflicts,
if any, than in the case of open access property. For example, an amusement park in a
colony is the common property of the residents of that colony. The residents manage and
control the park and can decide who can use it.
3. Private property:– This kind of property is owned by an individual or a group of
individuals. For example, a building can be owned by an individual or a group of people.
Private properties are both excludable and rival i.e. the owner/owners decide how to
use, control or manage the property and who should be excluded from using it. The
owner may decide whether to rent out the building or reside himself.
4. Public property:– This kind of property is controlled and managed by the
government, although owned or used by all the individuals. For example, street lights,
monuments etc.
Limited supply of a good or resource implies a higher price for that good, for example
gold or petrol, whereas goods available in bulk are cheaper, for example paper or clothes
from the local market. Thus, prices provide information about the availability or scarcity
of a good or a resource. The consumers respond to this information by buying the goods if
they are willing to buy and able to pay for them, and the producers respond to this
information with the motive of profit maximization. The producers can maximize their
profits by using smaller amounts of scarce resources and producing what the consumers are
willing to buy. For example, if the tomato yield is low due to heavy rains, the price of
tomatoes rises; rational consumers would reduce their consumption, whereas the rise in
price would signal rational producers (farmers) to grow more tomatoes. Thus, prices are a
signal for firms and individuals to take rational decisions.
This profit motive can only be effective if there are clearly defined property rights for a
good or resource. There must be properly defined private property for firms and
individuals so that they have an incentive to invest in a new plant or technology, hire
trained candidates, or produce goods and services utilizing the available resources
efficiently. For example, Mr. A has property rights on his building as he bought it, and he
may decide to rent out a part of it to get some return on his investment. If the tenant
doesn’t pay the rent on time, Mr. A suffers the consequence, i.e. a loss in income. Thus,
if he made the right decision, the rent is the reward he gets for renting out his
building; if he made a wrong decision, the loss gives him an incentive to check the
financial stability of the tenant next time.
** Test Yourself: Define property rights. How do prices, profits and property
rights provide information and incentives?
5. Rationing
Rationing is just another way to deal with the problem of scarcity in economics. It is a
way to control or manage the distribution of scarce goods or resources. Ration may be
defined as the allotment of resources to an individual. Rationing keeps in check the size
of the ration being distributed on a particular day/time. Now we may discuss the various
ways in which rationing is used:
2. Rationing by Queues:– It is a system in which goods are allocated to those who are
willing to wait in a queue. The price of the good does not change, because the goods are
not allocated according to willingness or ability to pay. Like rationing by lottery, it
is a fair system. For example, interviewers interview the candidates in the order of the
queue, or a doctor sees patients according to the queue.
However, this system is also inefficient, as waiting in a queue wastes time, which is an
important resource. For example, in the case of medical care, if some people are willing
and able to pay a higher amount to get treated, the hospital would earn more money, which
could be used to employ more doctors, thus reducing the queue and improving the medical
facilities too.
** Test Yourself: What do you mean by rationing? Why do we need it? Which
is the best and fairest way of rationing?
6. Opportunity Sets
An opportunity set is the group of available options; it emerges from the core ideas of
trade-offs and scarcity in economics. We already know that budget and time constraints
define the availability of choices. Since resources, including time, are scarce, people
must make choices such that in order to get one good, they have to sacrifice the
consumption of another good. For example, you go to a movie hall and find two movies
showing: Superman and Batman. Now you have the following choices: watch either
Superman or Batman, watch both movies back to back, or watch neither of them. This is
your opportunity set. Watching Iron Man is completely irrelevant as it is outside your
opportunity set. You may spend time yearning for Iron Man or any other movie,
but it makes no sense. Thus, an opportunity set defines the limits on the choices
made by an individual. We need to discuss the following concepts to understand it clearly:
1. Budget constraint – opportunity sets in which money imposes constraints are known
as budget constraints. For example, Seema has Rs. 100 and she consumes two goods,
burgers and pizzas, priced at Rs. 10 and Rs. 20 respectively. The opportunity set and
PPC of Seema are as follows:
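Seema's opportunity set can also be listed mechanically. The Python sketch below is only an illustration: the function names are our own, and it assumes that burgers and pizzas can be bought only in whole units, which the text does not state.

```python
# Illustrative sketch of Seema's budget constraint (figures taken from the text).
INCOME = 100        # Rs.
PRICE_BURGER = 10   # Rs. per burger
PRICE_PIZZA = 20    # Rs. per pizza


def affordable_bundles(income, p_burger, p_pizza):
    """Every whole-unit (burgers, pizzas) bundle that costs no more than income."""
    bundles = []
    for pizzas in range(income // p_pizza + 1):
        max_burgers = (income - pizzas * p_pizza) // p_burger
        for burgers in range(max_burgers + 1):
            bundles.append((burgers, pizzas))
    return bundles


def budget_line(income, p_burger, p_pizza):
    """Bundles that exhaust the income: the outer boundary of the opportunity set."""
    return [
        (burgers, pizzas)
        for burgers, pizzas in affordable_bundles(income, p_burger, p_pizza)
        if burgers * p_burger + pizzas * p_pizza == income
    ]


if __name__ == "__main__":
    for burgers, pizzas in budget_line(INCOME, PRICE_BURGER, PRICE_PIZZA):
        print(f"{burgers} burgers and {pizzas} pizzas")
```

Each step along the boundary gives up two burgers for one more pizza, because a pizza costs twice as much as a burger.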
2. Time constraint – opportunity sets in which time imposes constraints are known as
time constraints. For example, a farmer works for 8 hrs and produces two goods,
butter and rice. He can produce 16 kgs of rice in 8 hrs or 4 kgs of butter in 8 hrs. If he
devotes half of his time to each, he can produce 8 kgs of rice and 2 kgs of butter in
4 hrs each (if he chooses not to trade, i.e. point A). Following is the PPC and time
constraint of the farmer:
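The points on the farmer's PPC can be tabulated from the figures above. The sketch below assumes that output is proportional to hours worked (2 kgs of rice or 0.5 kgs of butter per hour), which is implied by the example but not stated as a general rule; the names are our own.

```python
from fractions import Fraction

TOTAL_HOURS = 8
RICE_PER_HOUR = Fraction(16, 8)    # 16 kgs of rice in 8 hrs
BUTTER_PER_HOUR = Fraction(4, 8)   # 4 kgs of butter in 8 hrs


def output(hours_on_rice):
    """Return (rice_kgs, butter_kgs) when the remaining hours go to butter."""
    hours_on_butter = TOTAL_HOURS - hours_on_rice
    return (hours_on_rice * RICE_PER_HOUR, hours_on_butter * BUTTER_PER_HOUR)


if __name__ == "__main__":
    for hours in range(TOTAL_HOURS + 1):
        rice, butter = output(hours)
        print(f"{hours} hrs on rice: {float(rice)} kgs rice, {float(butter)} kgs butter")
```

Splitting the day equally, `output(4)`, reproduces point A of the text: 8 kgs of rice and 2 kgs of butter.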
3. Cost and Opportunity Cost – Making choices among scarce resources (a trade-off)
involves some cost (the cost of sacrificing one good to choose another) and some benefit
(the satisfaction from consuming the good that we choose). For
example, you may choose to attend a birthday party (benefit) at the cost of skipping
your tuition class (cost). Making trade-offs may involve diminishing marginal utility,
i.e. the utility of an additional unit of a good diminishes as more and more units are
available for consumption. For example, suppose electricity is a scarce resource. If one
unit of electricity is available, we may use it for lighting; if two units are available,
we may use the second for cooking; if three units are available, we may use the third for
washing clothes; and if more units are available, we may use them for less important
purposes like playing video games. Thus, as more and more units of electricity are
available to us, the utility of each additional unit decreases.
If we look at an opportunity set, relative prices (the price of one good in terms of
another) explain the trade-off. Let us get back to the example of Seema consuming pizza
and burger. In our example, a burger costs Rs. 10 and a pizza costs Rs. 20. Now,
Relative price of pizza in terms of burger = Rs. 20/Rs. 10 = 2. Thus, for every pizza she
sacrifices, she can get two burgers.
Now, from the concept of relative prices and trade-offs, we derive the concept of
opportunity cost. When resources are scarce, we make choices. When we make choices,
we sacrifice consumption of one good to consume another good. The cost of sacrificing the
good which is the best alternative to the good that we choose to consume is known as
opportunity cost. For example, if Mr. X built a house which is bigger than his
requirement, he could keep a paying guest in one of the rooms and earn Rs. 2000 per
month. If he does not want to keep the paying guest, the Rs. 2000 rent foregone is the
opportunity cost of not keeping a paying guest. Or suppose after graduation you get a
job paying Rs. 20,000 per month, but you choose to continue your studies and go for post
graduation. The income of Rs. 20,000 is part of the opportunity cost of the time
that you choose to spend studying and not working. This foregone income must be added to
your college fees to get the opportunity cost of attending college.
** Test Yourself: Apply the concept of opportunity cost to explain why some
students from lower income group cannot complete schooling.
An economic system is an organized way in which a state or country allocates its
resources and produces goods and services, utilizing those resources in the best
possible way.
We assume that the domestic market is perfectly competitive and thus the producers
and consumers rationally decide to utilize the resources efficiently. Thus, the problem of
scarcity and choice is solved by the market forces, which determine what to produce, for
whom to produce and how much to produce. Suppose we take an economic system
where countries are engaged in free trade, i.e. trading without policy restrictions. We
assume that a country produces and exports the good in which it has a comparative
advantage (its cost of producing that good relative to the other good is lower) and
imports the good in which it has a greater opportunity cost. For example, an Indian farmer
works for 8 hrs and produces two goods, butter and rice. He can produce 16 kgs of rice in
8 hrs or 4 kgs of butter in 8 hrs. If he devotes half of his time to each, he can produce
8 kgs of rice and 2 kgs of butter in 4 hrs each (if there is no trade). His labor cost of
producing one unit of rice is 0.5 hrs (8 hrs/16) and his labor cost of producing one unit
of butter is 2 hrs (8 hrs/4). Thus, the opportunity cost of producing one unit of rice =
0.5/2 = 0.25 units of butter, and the opportunity cost of producing one unit of butter =
2/0.5 = 4 units of rice. Thus, he will produce and export rice, in which his opportunity
cost is lower, and will import butter.
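The arithmetic behind these opportunity costs can be checked with a short Python sketch. The helper names are our own, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction


def labour_cost(hours, output_kgs):
    """Hours of labour needed to produce one kg of a good."""
    return Fraction(hours, output_kgs)


def opportunity_cost(cost_of_good, cost_of_other_good):
    """Units of the other good given up to produce one more unit of this good."""
    return cost_of_good / cost_of_other_good


rice_hours = labour_cost(8, 16)     # 0.5 hrs per kg of rice
butter_hours = labour_cost(8, 4)    # 2 hrs per kg of butter

oc_rice = opportunity_cost(rice_hours, butter_hours)     # kgs of butter per kg of rice
oc_butter = opportunity_cost(butter_hours, rice_hours)   # kgs of rice per kg of butter

if __name__ == "__main__":
    print(f"Opportunity cost of rice:   {float(oc_rice)} kgs of butter")
    print(f"Opportunity cost of butter: {float(oc_butter)} kgs of rice")
```

The two opportunity costs are reciprocals of each other, so a low opportunity cost in rice necessarily means a high one in butter.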
We take the basic competitive model for the domestic market.
When there is no trade, equilibrium price and quantity in the domestic economy are P*
and Q* respectively. The area of the triangle AP*E is the consumer surplus (as the market
price is lower than the price the consumers are willing to pay) and the area of the
triangle BP*E is the producer surplus (as the market price is higher than the price at
which they are willing to sell). Total social welfare is equal to the area of triangle AEB
(i.e. CS + PS).
When free trade is allowed, the world price is Pw, which is lower than the domestic price.
This means the domestic producers will supply less of the good and the consumers will be
better off if they import it. Thus quantity supplied falls to Q1 and quantity
demanded rises to Q2. The gap between demand and supply is filled by imports.
Thus Q1Q2 is the amount of imports by the domestic consumers. Clearly, the
domestic producers cannot charge a price higher than Pw, as no one will be willing to buy
the good at a higher price if cheaper imports are freely accessible. Now, the consumer
surplus increases to the area APwD and the producer surplus reduces to the area BPwC.
However, the total social welfare has increased from the area of the triangle ABE to the
area ABCDE. Thus, the net gain in welfare due to trade is represented by the area of the
triangle CDE. The lower the world price, the higher would be the gain
from trade.
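The welfare comparison can be made concrete with numbers. The sketch below is purely illustrative: the linear curves Qd = 100 – P and Qs = P – 20 and the world price of 40 are hypothetical values chosen by us, not figures from the lesson. Point A corresponds to the demand intercept (100) and point B to the supply intercept (20).

```python
# Hypothetical linear curves (NOT from the lesson): Qd = 100 - P and Qs = P - 20.
DEMAND_INTERCEPT = 100   # price at which quantity demanded is zero (point A)
SUPPLY_INTERCEPT = 20    # price at which quantity supplied is zero (point B)
AUTARKY_PRICE = 60       # P*: solves 100 - P = P - 20
WORLD_PRICE = 40         # Pw: assumed to lie below the no-trade price


def quantity_demanded(price):
    return max(DEMAND_INTERCEPT - price, 0)


def quantity_supplied(price):
    return max(price - SUPPLY_INTERCEPT, 0)


def surpluses(price):
    """Consumer and producer surplus as triangle areas at the given price."""
    consumer = 0.5 * (DEMAND_INTERCEPT - price) * quantity_demanded(price)
    producer = 0.5 * (price - SUPPLY_INTERCEPT) * quantity_supplied(price)
    return consumer, producer


cs_no_trade, ps_no_trade = surpluses(AUTARKY_PRICE)
cs_trade, ps_trade = surpluses(WORLD_PRICE)
imports = quantity_demanded(WORLD_PRICE) - quantity_supplied(WORLD_PRICE)   # Q1Q2
gain = (cs_trade + ps_trade) - (cs_no_trade + ps_no_trade)                  # area CDE

if __name__ == "__main__":
    print("No trade:", cs_no_trade, ps_no_trade)
    print("Free trade:", cs_trade, ps_trade)
    print("Imports:", imports, "Net gain:", gain)
```

With these assumed numbers consumers gain more than producers lose, and the net gain equals half of imports times the price fall, i.e. the area of the triangle CDE.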
**Test Yourself: How does an economy in a free trade regime get benefit
from international trade?
We can see from the table above that India has a lower opportunity cost in producing
rice (0.25 as compared to America’s 0.67) and America has a lower opportunity cost in
producing butter (1.5 as compared to India’s 4). Therefore, India has a comparative
advantage in producing rice and America has a comparative advantage in producing
butter, and each country should specialize in, and export to the other country, the good
in which it has a comparative advantage. Thus, a country has a comparative
advantage in producing a good if the opportunity cost of producing it in the home
country is less than that in the foreign country.
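The rule stated above can be applied mechanically to the opportunity costs quoted in the text. In the sketch below the dictionary layout and the function name are our own; only the four numbers come from the lesson.

```python
# Opportunity costs quoted in the text (units of the other good given up).
OPPORTUNITY_COST = {
    "India": {"rice": 0.25, "butter": 4.0},
    "America": {"rice": 0.67, "butter": 1.5},
}


def comparative_advantage(costs):
    """Map each good to the country with the lowest opportunity cost of producing it."""
    goods = next(iter(costs.values())).keys()
    return {
        good: min(costs, key=lambda country: costs[country][good])
        for good in goods
    }


if __name__ == "__main__":
    for good, country in comparative_advantage(OPPORTUNITY_COST).items():
        print(f"{country} has a comparative advantage in {good}")
```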
9. SUMMARY
10. Exercise
11. References
1. Joseph E. Stiglitz and Carl E. Walsh, Economics, W.W. Norton & Company, Inc., New
York, 4th edition, 2007.
2. N. Gregory Mankiw, Economics: Principles and Applications, South Western, Cengage
Learning Pvt. Ltd., 4th edition, 2007.
A. Rationing by queues
B. Rationing by lotteries
C. Health care rationing
D. All of the above
ANSWERS:
1. D
2. D
3. C
4. D
5. C
6. B
7. A
8. C
9. C
10. C
Discipline Courses-I
Semester-I
Paper I: Principles of Economics (POE)
Unit-II
Lesson: Demand, Supply and Market Equilibrium
Lesson Developer: Ankur Bhatnagar
College/Department: Satyawati College, University of Delhi
CONTENTS:
1. Learning Outcome
4. Determinants of demand
7. Determinants of supply
1. LEARNING OUTCOME
2. CONCEPT OF DEMAND
When we say that a consumer demands a good like a car, it implies that she is
willing to pay a ‘certain’ price in return for a pre-determined amount of the good.
This ‘willingness’ lies at the heart of demand theory. In economics, demand
is expressed in terms of Desire, Ability and Willingness.
Consider a BMW sports car with a price tag of Rs. 25 lakh. An 18-year-old student
would like to own this car. However, she would not constitute demand for this car
because she lacks the ability to pay the stated price of the car. She has the desire to
drive it and the willingness to pay for it (she does not want it for free), but lacks the
ability to pay the stated price since she is a student with no income. Thus, demand
is not just willingness to pay for a good at a stated price but also the desire and
ability to pay for it. However, she may be willing and able to pay a lower price of
Rs. 5 lakh. If this price is acceptable to the makers of BMW, then she constitutes
demand for the car.
Assuming that desire and ability exist, we can say that demand for a good is
equivalent to willingness to pay for a good. This explains why the terms ‘demand
curve’ and ‘willingness to pay curve’ are used interchangeably.
A consumer demand schedule gives the various combinations of price and demand
of a good for a consumer in a table form. For example, it tells us the willingness of
a consumer to pay for oranges at certain prices. The relationship between price and
quantity is shown using specific values in the table below. At a price of
Rs. 10/dozen, the consumer is willing to purchase 4 dozen. At a price of
Rs. 30/dozen the demand falls to 2 dozen.
Price (Rs./dozen)    Quantity demanded (dozen)
10                   4
30                   2
Both points satisfy the linear demand function
Qd = 5 – 0.1P
Notice that the sign for P is negative, which indicates that demand curve is
downward sloping. Another way of saying this is that slope of demand curve is
negative.
The market demand schedule provides the total demand for a good in the market.
It represents the sum of demand by all consumers. It is the horizontal summation
of all individual demand curves.
EXAMPLE:
Assume 3 consumers in the market, whose demand schedules are given below. Let
us graphically and numerically show the market demand; we assume the following
demand functions:
Q*=30-7P
P    Consumer 1    Consumer 2    Consumer 3    Total demand = market demand
1    2             5             4             2+5+4 = 11
2    1             3             3             1+3+3 = 7
3    0             1             2             0+1+2 = 3
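Horizontal summation simply adds, at each price, the quantities demanded by every consumer. The Python sketch below reproduces the market demand column from the three individual schedules in the table; the variable and function names are our own.

```python
# Individual demand schedules from the table: price -> (consumer 1, consumer 2, consumer 3).
INDIVIDUAL_DEMAND = {
    1: (2, 5, 4),
    2: (1, 3, 3),
    3: (0, 1, 2),
}


def market_demand(schedules):
    """Horizontal summation: add the individual quantities at each price."""
    return {price: sum(quantities) for price, quantities in schedules.items()}


if __name__ == "__main__":
    for price, total in market_demand(INDIVIDUAL_DEMAND).items():
        print(f"P = {price}: market demand = {total}")
```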
4. DETERMINANTS OF DEMAND
Demand for a good is determined by monetary and non-monetary factors. These can
be expressed using the demand function
Qd = f (Px, Py, M, F)
where Px: price of the good, Py: price of a related good,
M: income of the consumer and F: non-monetary factors like season, fashion, etc.
The last factor is subjective and can’t be defined in a mathematical expression.
We now examine the relation between demand for a good and each determinant
separately.
Demand and Px: The relation between demand and price of a good is based on the
law of demand. As price rises, the demand for a good will fall, ceteris paribus
(assuming all other factors – Py, M, F – are unchanged). This explains the negative
slope of a demand curve. In some cases this law may not be obeyed and there can
be a positive relation between price and demand. Such goods are exceptions to the
law of demand and are called Giffen goods.
Demand and Py: There can be two types of relation between X and Y. The first is
that they are complements to each other. This means they are consumed
together and it is not useful to consume them alone. A rise in the price of Y will cause a
fall in demand for both X and Y. Common examples include a mobile phone and
a SIM card (a mobile phone is useless without a SIM card), and shoes and socks (it is
not comfortable to wear shoes without socks). The other relation is that of
substitutes. As the price of Y rises, the demand for X will increase as the demand for Y
declines; X substitutes for Y. Common examples include a laptop and a personal
computer, or a WiFi connection and a data card for use on a mobile phone (a phone
that needs Internet connectivity needs to use only one of these: WiFi or a data card).
Demand and M: Most goods are ‘normal’, as their demand rises with a rise in income
levels. Therefore, the relation is positive. For some ‘inferior’ goods the relation is
negative. Take the case of a non-branded shoe bought from the local market. As
income rises, a consumer may not opt for a similar shoe, and may want to buy a
branded shoe like Nike/Adidas. Therefore, the non-branded shoe sees a decline in
demand even when the income of the consumer rises. This non-branded local shoe is
an inferior good.
Note that when we examine the relation between demand and each determinant,
we assume that all other determinants are unchanged. So when income changes, Px
and Py are unchanged. This is also referred to as the ‘ceteris paribus’ condition. It can
be translated to mean that all other things remain constant.
A firm’s supply schedule gives the various combinations of price and output of a
good for a firm in a table form. For example, it tells us the ability and willingness of
a firm to produce a certain amount of output of a good at a certain price. The
relationship between price and quantity is shown using specific values in the table
below. At a price of Rs. 10/dozen, the orange seller (firm) is willing to sell 4 dozen.
At a price of Rs. 30/dozen the supply rises to 8 dozen.
Price (Rs./dozen)    Quantity supplied (dozen)
10                   4
30                   8
Both points satisfy the linear supply function
Qs = 2 + 0.2P
Notice that the sign for P is positive, which indicates that supply curve is upward
sloping. Another way of saying this is that slope of supply curve is positive.
The market supply schedule provides the total supply for a good in the market. It
represents the sum of supply by all firms for a good. It is the horizontal summation
of all individual supply curves.
EXAMPLE:
Assume 2 firms in the market, whose supply schedules are given below. Let us
graphically and numerically show the market supply; we assume the following
functions:
Q*=5+5P
P    Firm 1    Firm 2    Total supply = market supply
1    5         3         5+3 = 8
2    8         5         8+5 = 13
3    11        7         11+7 = 18
7. DETERMINANTS OF SUPPLY
Supply of a good is determined by the costs involved in producing the good and by
non-cost factors as well. These can be expressed using the supply function
Qs = f (Px, Pinputs, T, F), where Px is the price of the good and Pinputs the prices of
inputs.
Supply and Px: The relation between supply and price of a good is based on the law
of supply. As price rises, the supply of a good will rise, ceteris paribus (assuming all
other determinants unchanged) . This explains the positive slope of a supply curve.
Supply and Pinputs: It is common sense that the price at which a firm is willing to sell
a good will depend on the cost of producing it. This cost depends on the cost and
availability of inputs. The higher the price of inputs, the higher will be the price of
the good. A common example is the local fruit seller who increases the prices of his fruits
whenever the price of petrol is increased. Petrol/diesel is used to transport fruits
from the grower to the final consumer through the fruit seller. The transport
costs are therefore part of the cost of supplying the fruits until they reach the
consumer, which is you. Thus, a higher price of inputs will decrease supply.
Note that when we examine the relation between supply and each determinant, we
assume that all other determinants are unchanged. So when input prices change,
Px, F and T are unchanged. This is also referred to as the ‘ceteris paribus’ condition. It
can be translated to mean that all other things remain constant.
The shifts in the demand curve are based on the determinants of demand. We can
distinguish between two types of shifts of the demand curve based on the cause of
the shift
A similar shift occurs when price of the good Y, which is a complement to X falls.
This fall causes an increase in the demand for X shown as a movement from D1 to
D2. Some other examples are listed in the table below:
Note that a shift of the demand curve is caused by changes in factors other than
the good’s own price (Py, M, F) alone.
Note that a movement along the demand curve is caused by changes in the price of the
good alone.
The shifts in the supply curve are based on the determinants of supply as was the
case for demand. We can again distinguish between two types of shifts of the
supply curve based on the cause of the shift.
shown as a move along a given supply curve. When price falls we move from A to
B, showing that quantity of X supplied has decreased. An increase in quantity
supplied is shown as a movement from C to D, when Px increases.
A similar shift occurs when price of an input declines. This fall causes a decline in
the cost of production of the good. The savings are used to produce more of X so
that we move to point B, without any change in Px. Some other examples are listed
in the table below:
Note that a shift of the supply curve is caused by changes in factors other than
the good’s own price (Pinputs, T, F) alone.
Note that a movement along the supply curve is caused by changes in the price of the
good (Px) alone.
Equilibrium is a position of ‘rest’ for all economic agents. At this point no agent will
like to change its position in terms of demand, supply or price. To determine
equilibrium we need the demand and supply curves. Equilibrium is determined
where demand equals supply. In a diagram it is easy to show that P* and Q* are the
equilibrium values of price and quantity. We can easily show how P* is derived.
Consider price P1, where demand = Q2 and supply = Q1. Demand > supply, so
we have a position of excess demand, which is called a SHORTAGE. Consumers are
willing to pay a price of P1 for Q2 while suppliers want to sell only Q1 at this price.
When suppliers realize that consumers want more than Q1 (which they had
produced), they increase production in the next period, for which they ask a higher
price. The red arrow shows this. As long as a shortage remains, producers will continue to
increase production until demand equals supply. At that point there is no reason to change
the production levels or the demand levels.
Consider price P2, where demand = Q4 and supply = Q3. Demand < supply, so
we have a position of excess supply, which is called a SURPLUS. Consumers are
willing to pay a price of P2 for only Q4 while suppliers want to sell Q3 at this price.
When suppliers realize that consumers want less than Q3 (which they had
produced), they cut production in the next period, and are willing to offer this
lower output at a lower price. The blue arrow shows this. As long as a surplus remains,
producers continue to decrease production until demand equals supply. At that point there
is no reason to change the production levels or the demand levels.
Thus we conclude that a surplus causes prices to fall while a shortage causes prices to rise. At
equilibrium there is no shortage and no surplus, since demand = supply. We now investigate the
effects of changes in demand and supply on equilibrium price and quantity.
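The adjustment described above can be illustrated numerically. The sketch below reuses the two linear functions quoted earlier in this lesson, Qd = 5 – 0.1P and Qs = 2 + 0.2P. Those were given for a single consumer and a single firm, so treating them as a market is our simplifying assumption; the function names and the trial prices of Rs. 5 and Rs. 20 are also our own.

```python
from fractions import Fraction


def demand(price):
    """Qd = 5 - 0.1P (dozens of oranges)."""
    return Fraction(5) - Fraction(1, 10) * price


def supply(price):
    """Qs = 2 + 0.2P (dozens of oranges)."""
    return Fraction(2) + Fraction(1, 5) * price


def equilibrium():
    """Solve 5 - 0.1P = 2 + 0.2P for P, then read the quantity off the demand curve."""
    price = (Fraction(5) - Fraction(2)) / (Fraction(1, 10) + Fraction(1, 5))
    return price, demand(price)


def market_state(price):
    """Classify a price as producing a shortage, a surplus or equilibrium."""
    gap = demand(price) - supply(price)
    if gap > 0:
        return "shortage"
    if gap < 0:
        return "surplus"
    return "equilibrium"


if __name__ == "__main__":
    p_star, q_star = equilibrium()
    print(f"P* = Rs. {p_star}/dozen, Q* = {q_star} dozen")
    for trial_price in (5, 10, 20):
        print(f"At Rs. {trial_price}: {market_state(trial_price)}")
```

A price below P* produces a shortage and pushes the price up; a price above P* produces a surplus and pushes the price down, which is the conclusion stated above.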
Case 1: Increase in demand. The demand curve shifts to the right (D1 to D2), leading to higher price
and quantity.
Case 2: Decrease in demand. The demand curve shifts to the left (D1 to D3), leading to lower price
and quantity.
Case 3: Increase in supply. The supply curve shifts to the right, leading to higher quantity and lower
price.
Case 4: Decrease in supply. The supply curve shifts to the left, leading to higher price and lower
quantity.
Case 5: Increase in demand and supply. We have three possible cases shown in the diagram below. Note
that quantity will always rise (as shown by the arrow) while the effect on price depends on the
relative size of the increases in demand and supply.
Case 6: Decrease in demand and supply. We have three possible cases shown in the diagram below. Note
that quantity will always fall while the effect on price depends on the relative size of the decreases
in demand and supply.
Case 7: Increase in demand and decrease in supply. We have three possible cases shown in the diagram
below. Note that price will always rise while the effect on quantity depends on the relative size of
the changes in demand and supply.
Case 8: Increase in supply and decrease in demand. We have three possible cases shown in the diagram
below. Note that price will always fall while the effect on quantity depends on the relative size of
the changes in demand and supply.
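These cases can be checked with a small numerical experiment. The sketch below assumes linear curves Qd = a – 0.1P and Qs = c + 0.2P, starting from a = 5 and c = 2 (the orange example of this lesson), and models a shift as a change in the intercept a or c. The size of each shift is a hypothetical value chosen by us for illustration.

```python
from fractions import Fraction

DEMAND_SLOPE = Fraction(1, 10)   # b in Qd = a - b*P
SUPPLY_SLOPE = Fraction(1, 5)    # d in Qs = c + d*P
BASE_A, BASE_C = 5, 2            # starting intercepts: P* = 10, Q* = 4


def equilibrium(a, c, b=DEMAND_SLOPE, d=SUPPLY_SLOPE):
    """Solve a - b*P = c + d*P and return (price, quantity)."""
    price = (Fraction(a) - Fraction(c)) / (b + d)
    quantity = Fraction(a) - b * price
    return price, quantity


def direction(new, old):
    if new > old:
        return "rises"
    if new < old:
        return "falls"
    return "unchanged"


def compare(a, c):
    """Direction of change in (price, quantity) relative to the starting equilibrium."""
    base_price, base_quantity = equilibrium(BASE_A, BASE_C)
    price, quantity = equilibrium(a, c)
    return direction(price, base_price), direction(quantity, base_quantity)


if __name__ == "__main__":
    scenarios = {
        "Case 1, demand increases": (8, 2),
        "Case 2, demand decreases": (4, 2),
        "Case 3, supply increases": (5, Fraction(7, 2)),
        "Case 4, supply decreases": (5, 1),
        "Case 5, both increase (demand shift larger)": (8, Fraction(7, 2)),
        "Case 5, both increase (supply shift larger)": (6, 5),
    }
    for label, (a, c) in scenarios.items():
        price_change, quantity_change = compare(a, c)
        print(f"{label}: price {price_change}, quantity {quantity_change}")
```

In the two Case 5 runs quantity rises both times, while price rises in one and falls in the other, depending on which shift is larger.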
Subject: Economics
Delhi
Institute of Lifelong Learning,University of Delhi
The Concept of Demand, Supply & Elasticity
Table of Contents
Introduction
o 1.1 demand and Supply
o 1.1.Supply
o 1.1.3 Equilibrium
o 1.2.1 Elasticity of Demand
o Summary
o Exercise
o Glossary
1.1.1 Demand
The quantity of a good an individual is willing to buy over a specific time period is a
function of the price of the good, the individual’s money income, and the prices of other
goods. In simple mathematical language it can be expressed as:
Qdx = f (Px, I, Po) (1.1)
where Qdx = the quantity of good X demanded by the individual over the specific time
period, Px = the price of good X, I = the individual’s money income, and Po = the prices
of other goods.
Holding money income and the prices of other goods constant, this can be written as
follows:
Qdx = f (Px, Ī, P̄o) (1.2)
where the ‘bar’ on top of I and Po means that they are kept constant. Equation (1.2) can
also be written as
Qdx = f (Px) cet. par. (1.3)
where, cet. par. = ‘ceteris paribus’ means everything else held constant.
Eqn. (1.3) implies that the quantity of good X demanded by an individual over a specific
time period is a function of the price of that good, while holding constant everything else
that affects the individual’s demand for the good.
Qdx = 32 – 4Px cet. par. is a specific functional relationship indicating precisely how Qdx
depends on Px. That is, by substituting various prices of good X into this specific demand
function, we get the particular quantity of good X demanded by the individual per unit of
time at these various prices. Thus, we get the individual’s demand schedule.
In general, the individual’s demand schedule for a good is a table giving us the quantity
demanded of the good at various alternative prices of the good, keeping constant the
prices of other goods and money income and tastes of the consumer. The graphic
representation of the individual’s demand schedule gives us that person’s demand curve.
In the previous example where the demand function for an individual for good X is given
as Qdx = 32 – 4Px, if we substitute various prices of X into the demand function we will
get the individual’s demand schedule as given in Table 1.1.
Table 1.1
Px (in Rs.) 8 7 6 5 4 3 2 1 0
Qdx 0 4 8 12 16 20 24 28 32
Plotting each pair of values as a point on a graph and joining the resulting points, we get
the individual’s demand curve for good X. In Fig. 1.1 it is shown as dx
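The schedule in Table 1.1 can be generated directly from the demand function. The Python sketch below is a minimal illustration with names of our own choosing; flooring the quantity at zero for prices above Rs. 8 is our assumption, consistent with the statement that the individual buys nothing at that price.

```python
def quantity_demanded(price):
    """Qdx = 32 - 4Px, with the quantity floored at zero (assumption for Px > 8)."""
    return max(32 - 4 * price, 0)


# Prices from Rs. 8 down to Rs. 0, in the same order as Table 1.1.
schedule = [(price, quantity_demanded(price)) for price in range(8, -1, -1)]

if __name__ == "__main__":
    for price, quantity in schedule:
        print(f"Px = Rs. {price}: Qdx = {quantity}")
```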
The individual buys good X only when its price falls below Rs. 8. At a price of Rs. 7 she
buys 4 units of X. As the price falls further, she purchases more of X because it is
becoming less expensive. At a price of Re. 1, she buys 28 units. However, even at a price
of Rs. 0 she would not take more than 32 units because additional units of X may result
in a storage and disposal problem for the consumer. This is called the ‘saturation point’
for the individual. So the maximum quantity that the individual will ever demand of good
X per time period is 32 units.
In drawing the demand curve dx in Fig. 1.1, we assume complete divisibility, so that
price and quantity demanded can both change by infinitely small steps. This enables us
to draw a demand curve by joining the points A, B, C, D... I by a continuous, smooth
line. Another point to be noted about the construction of the demand curve is that the
independent variable, price, is measured on the vertical axis, and the dependent
variable, quantity, on the horizontal axis, which is contrary to the usual mathematical
convention for drawing a curve. But this is a convention which economists follow so that
they can draw the demand curve of the consumers and the cost curves of the firms on the
same set of axes. The demand curve drawn this way is also called the inverse demand curve.
In the given example, the demand curve for good X is a straight line of the form
Qdx = a – b Px, (1.4)
where ‘a’ (32) is the quantity intercept and ‘–b’ (–4) is the slope, i.e.,
ΔQdx/ΔPx = –b.
When we plot the demand curve, we actually plot the inverse demand curve, which is
given as:
Px = α – β Qdx, (1.5)
where α = a/b is the price intercept and –β = –(1/b) is the slope of the inverse demand
curve, i.e., ΔPx/ΔQdx = –β.
In our example, α = (32/4) = 8 is the price intercept, and –β = –(1/4) is the slope of
the inverse demand curve.
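The step from the demand function to the inverse demand function is a simple rearrangement. The derivation below is a sketch in LaTeX notation; it uses only the symbols a, b, α and β from the text and assumes b > 0.

```latex
\begin{align*}
Q_{dx} &= a - b\,P_x \\
b\,P_x &= a - Q_{dx} \\
P_x &= \frac{a}{b} - \frac{1}{b}\,Q_{dx} = \alpha - \beta\,Q_{dx},
  \qquad \alpha = \frac{a}{b}, \quad \beta = \frac{1}{b} \\
\text{With } a = 32,\ b = 4: \quad P_x &= 8 - \tfrac{1}{4}\,Q_{dx}
\end{align*}
```

For instance, Qdx = 16 gives Px = 8 – 4 = 4, which agrees with Table 1.1.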
Though in the previous example the demand curve derived is linear, it is not always so.
Suppose Table 1.2 gives us the demand schedule for good Y for an individual.
Table 1.2
Py (in Rs.)  90  80  70  60  50  40  30  20  10  0
Qdy          0   1   2   3   5   8   12  16  20  30
It is clear from the previous equation that the slope of the demand curve is negative and
it varies with the price. Hence, the demand curve that will be derived from equation-(i)
is a downward sloping, non-linear curve.
Let us suppose that a = 100 and b = 1. Then the specific demand function will be
Qdx = 100/Px, which gives the following demand schedule:
Px (in Rs.) 0 1 2 4 5
Qdx (in units) ∞ 100 50 25 20
The demand curve derived will be non-linear. In fact, it will be a rectangular hyperbola,
i.e., it will be asymptotic to both the axes and the areas of the rectangles formed under
the curve will be equal to each other.
In the given figure, dx is a demand curve which is a rectangular hyperbola. Area of the
rectangle OP1AQ1=area of OP2BQ2=area of OP3CQ3=area of OP4DQ4=100.
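The equal-area property can be verified from the schedule above. In the Python sketch below the price and quantity pairs are taken from the table (the Px = 0 row, where quantity is unbounded, is left out), and the function name is our own. The area of each rectangle under the curve is the consumer's total expenditure, price times quantity.

```python
# (Px, Qdx) pairs from the schedule; the Px = 0 row is omitted because Qdx is unbounded.
SCHEDULE = [(1, 100), (2, 50), (4, 25), (5, 20)]


def rectangle_area(price, quantity):
    """Area of the rectangle under the demand curve at a point, i.e. total expenditure."""
    return price * quantity


areas = [rectangle_area(price, quantity) for price, quantity in SCHEDULE]

if __name__ == "__main__":
    for (price, quantity), area in zip(SCHEDULE, areas):
        print(f"Px = {price}, Qdx = {quantity}: area = {area}")
```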
The individual’s demand curve for a good represents a maximum boundary of the
individual’s intentions. For the various alternative prices of a good, the demand curve
shows the maximum quantity of the good the individual intends to purchase per unit of
time. For various alternative quantities of a good, the demand curve shows the
maximum price the individual is willing to pay. For example, in Fig. 1.2, point E on the
demand curve indicates two things. First, if the price is given as Rs. 50, the individual
will buy a maximum of 5 units of good Y. Second, the maximum price that the individual
will be willing to pay to buy 5 units of Y is Rs. 50.
When there is a change in the price of one good, other things remaining constant, the
quantity demanded of that good changes and the consumer moves along the same
demand curve. The movement along the same demand curve for a good is known as a
change in the quantity demanded of the good, which occurs due to a change in its own
price, ceteris paribus.
For example, in Table 1.2, when price of Y falls from say Rs.50 to Rs.40, the quantity
demanded of Y rises or expands from 5 units to 8 units and the consumer moves from
point E to point F on the same demand curve ‘dy’.
However, when any of the ‘ceteris paribus’ conditions changes holding own price of the
good constant, the entire demand curve ‘shifts’ either to the right or to the left. A
rightward shift is called an increase in demand (rather than an increase in the quantity
demanded), and this shows that at any given price of the good, the consumer buys more
of the good. Similarly, with a leftward shift the consumer buys less of the good at any
given price. This is known as a decrease in demand.
Increase in Demand
Case Study
The Government, in an effort to control the spread of oral cancer, is contemplating two
policy options to reduce tobacco (Gutka) consumption. One option is to
tax the tobacco manufacturers, thereby increasing the price and thus reducing
(contracting) the quantity of tobacco demanded. Alternatively, the Government can make
use of public service announcements, health warnings on tobacco products, restrictions on
advertisements of tobacco products etc. These measures would shift the demand curve
of tobacco products to the left, implying a decrease in the demand for tobacco products.
Shifts in the demand curve occur due to changes in the income of the consumer, in the
prices of other goods or in the tastes of the consumer. When the consumer’s money income
increases, while everything else remains constant, the consumer’s demand for a good
usually increases, so that the consumer demands more of the good at the same price of
the good. These goods are referred to as normal goods. For example, with an increase in
the consumer’s income, the consumer’s demand for ‘mango’ may increase even though
price of ‘mango’ has not changed. This will lead to a rightward shift of the consumer’s
demand curve for mango. Similarly, a decrease in income will lead to a leftward shift of
the consumer’s demand curve.
Sometimes, with a rise in an individual’s income, the demand for certain goods may fall.
These goods are known as inferior goods. For example, with a rise in income a consumer
may demand fewer potatoes and switch over to better quality vegetables or fruits.
The individual’s demand curve for a good shifts when the prices of other goods in the
economy change, the own price of the good remaining constant. A change in the prices of
other goods will affect the demand for the good in question significantly when these
other goods are either close substitutes or complements of the given good.
A close substitute is a good that performs essentially the same function as the original,
so that a small increase in the price of the substitute will induce the consumer to buy
more of the original good even though its price has not changed. Thus, the demand
curve for the original good will shift to the right. For example, let us suppose that for a
consumer ‘Tropicana’ fruit juice is a close substitute of ‘Real’ fruit juice. If the price of ‘Real’
increases from Rs.70 to Rs.75, then the demand for ‘Tropicana’ will increase from 2 litres
to 3 litres a month even though its price has remained unchanged at Rs.65 per litre.
A complement is a good that is used in conjunction with the particular good in question.
For example, pizzas and coke are complements of each other. When the price of pizza rises,
the price of coke remaining the same, the quantity demanded of pizza falls and, with it,
the demand for coke, so the demand curve for coke will shift to the left.
In fig. 1.4, d1 represents the demand curve for coke when price of one pizza was
Rs.100. At that time the consumer was consuming 5 bottles of coke at a price of
Rs.10/bottle. When price of pizza rises to Rs.150/unit, the demand curve for coke shifts
leftward to the position d2 and at the same price of coke (which is Rs.10/bottle), the
consumer reduces the demand to 3 bottles. This happens, because with an increased
price of pizza, consumption of both pizza as well as coke, falls. The opposite will happen
if price of pizza falls.
When a consumer buys a number of goods, it is possible for her to substitute other
goods for a particular good if its price rises. But the ability to substitute away from a
good increases with the narrowness of its definition. That is, the more narrowly a good is
defined, the more substitutes are available for it, whereas the more broadly a good is
defined, the fewer substitutes are available for it.
For example, food is a broader category than fruit, and fruit is a broader category than
mango. As other goods in the individual’s consumption basket are very poor substitutes
for food, with a rise in the price of food the consumer will find it difficult to substitute
anything else for it. Whereas, if the good in question is fruit, then meat, milk,
vegetables etc. are substitutes for fruits. So a rise in the price of fruits may induce the
consumer to substitute meat, milk or vegetables for fruits. Mango is even more
narrowly defined than fruits. Because other fruits like orange, banana and apple are
closer substitutes for mango than milk is for fruits, with a rise in the price of
mango the consumer will readily switch over to other fruits.
The market demand for a good gives the alternative quantities of the good demanded
per time period, at various alternative prices, by all the individuals in the market. The
market demand for a good, therefore, depends on all the factors that determine the
individual’s demand and also on the number of buyers of the good in the market.
In particular, if there are 100 identical buyers in the market for good X, each having the
demand function Qdx = 32 – 4 Px, the market demand function will be simply given by
100 Qdx, i.e.,
QDx = 100 Qdx = 3200 – 400 Px (1.6)
where QDx is the market quantity demanded of X. The
market demand schedule can be derived by substituting various prices of X into this
demand function. The market demand curve is a graphical presentation of the market
demand schedule. Table 1.3 gives us the market demand schedule and fig. 1.5 gives the
market demand curve.
Table 1.3
Px (in Rs.) 8 7 6 5 4 3 2 1 0
QDx 0 400 800 1,200 1,600 2,000 2,400 2,800 3,200
Plotting each pair of values as a point on a graph and joining the resulting points, we get
the market demand curve. In fig. 1.5 Dx gives us the market demand curve for good X.
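As a minimal sketch, the schedule in Table 1.3 can be generated from the individual demand function Qdx = 32 – 4 Px and the 100 identical buyers assumed above. The names `individual_demand` and `market_demand` are illustrative only.

```python
N_BUYERS = 100  # number of identical buyers assumed in the text

def individual_demand(price):
    """Individual demand function Qdx = 32 - 4 Px."""
    return 32 - 4 * price

def market_demand(price, n_buyers=N_BUYERS):
    """With identical buyers, market demand is n times individual demand."""
    return n_buyers * individual_demand(price)

# Market demand schedule for prices Rs.8 down to Rs.0.
market_schedule = {p: market_demand(p) for p in range(8, -1, -1)}

for price, quantity in market_schedule.items():
    print(f"Px = {price}: QDx = {quantity}")
```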
In practice, individuals have different preferences and so they have different demand
functions for the same good X. In this case of people having different demand curves for
the same good, we can derive the market demand curve by horizontally adding up the
individual demand curves.
For example, suppose there are just two individual buyers in the market for good Y
whose individual demand schedules are given in Table 1.4.
Table 1.4
Price Qd1y Qd2y
90 0 1
80 1 2
70 2 5
60 3 8
50 5 12
40 8 15
30 12 17
20 16 20
10 20 25
0 30 35
In fig. 1.6 we draw the individual demand curves d1y & d2y; their horizontal
summation gives us the market demand curve Dy. At each price the quantities demanded
by the two buyers are summed up to give the market demand.
When the price is Rs.100, neither individual demands y. At Rs.90,
individual 2 demands 1 unit of y but individual 1 still has zero demand; thus the market
demand is 1 unit. At Rs.80, individual 1 demands 1 unit & individual 2 demands 2 units,
so market demand is 3 units. At Rs.40, 1’s demand is 8 units, 2’s demand is 15 units and so
market demand is 23 units, and so on. In fig. 1.6, the market demand curve coincides with
individual 2’s demand curve till point A2; beyond that, to derive the market demand we
horizontally add up the points on the individual demand curves. For example, to derive
point B on the market demand curve we add up P8B1 and P8B2. So P8B = P8B1 + P8B2, so
that B2B is equal to P8B1. Similarly, P4F = P4F1 + P4F2, such that F2F = P4F1, and so on.
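The horizontal summation can be sketched in a few lines using the schedules of Table 1.4. The variable names are illustrative, not part of the original lesson.

```python
# Individual demand schedules from Table 1.4 (price in Rs., quantity in units).
prices = [90, 80, 70, 60, 50, 40, 30, 20, 10, 0]
qd1y = [0, 1, 2, 3, 5, 8, 12, 16, 20, 30]
qd2y = [1, 2, 5, 8, 12, 15, 17, 20, 25, 35]

# Horizontal summation: at each price, add the quantities of the two buyers.
market_demand = {p: q1 + q2 for p, q1, q2 in zip(prices, qd1y, qd2y)}

for price, quantity in market_demand.items():
    print(f"Price = {price}: market demand = {quantity}")
```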
Solved Problem
Question:
Suppose that a good is demanded by just two consumers A and B. Their demand curves
are
qa = 80-8P
qb = 40-10P
ii) Plot the individual demand curves and the market demand curve on the same set of
axes.
Solution:
i) Individual Demand Schedule for A
(ii)
Plotting the individual demand schedules of A and B we get the demand curves dada’ and
dbdb’ respectively. The market demand curve, daCD, is a horizontal summation of the
two individual demand curves. Its price-intercept is at Rs.10 because if the price is Rs.10
or more neither consumer demands the good and hence the market
demand is zero. From the demand schedule of B it is clear that for any price greater than
or equal to Rs.4, B’s demand for the good is zero. Thus the market demand curve will
coincide with A’s demand curve between the prices Rs.10 and Rs.4. For any price below
Rs.4 we can obtain the market demand by adding the demands of A and B. For
example, at price Rs.2, A’s demand is 64 units and B’s demand is 20 units and the
market demand is 64+20=84 units. In the given figure Pea+Peb=PE. At zero price the
market demand is at its maximum of 120 units.
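The kink in the market demand curve at Rs.4 can be reproduced with a short sketch. It assumes, as the solution does, that a consumer's demand is zero whenever the demand function would give a negative quantity; the function names are illustrative.

```python
def demand_a(price):
    """Consumer A: qa = 80 - 8P, truncated at zero."""
    return max(0, 80 - 8 * price)

def demand_b(price):
    """Consumer B: qb = 40 - 10P, truncated at zero."""
    return max(0, 40 - 10 * price)

def market_demand(price):
    """Horizontal summation of the two individual demands."""
    return demand_a(price) + demand_b(price)

for price in range(10, -1, -1):
    print(f"P = {price}: qa = {demand_a(price)}, qb = {demand_b(price)}, "
          f"market = {market_demand(price)}")
```

Between Rs.10 and Rs.4 only A buys, so the market quantity equals A's quantity; below Rs.4 both terms are positive.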
1.1.Supply
Supply curves describe the seller’s desire to make the good available. The quantity of a
good that an individual firm is willing to supply over a specific time period is a function of
the price of the good and the cost of production. In order to derive the firm’s supply
curve of a good, we just vary the price of the good, factors influencing the cost of
production being held constant. The factors which influence cost of production are (i) the
prices of the factors of production which have helped in the production of the good, (ii)
technology and (iii) for agricultural goods, climate and weather conditions. A single firm’s
supply curve of a good shows the alternative quantities of the good that the firm is
willing to supply over a specific period of time at various alternative prices for the good,
while keeping the above constant.
(1.7)
Where Qsx = the quantity supplied of good X by the single producer, over the specific
time period,
g = a function of,
Tech = technology,
The bar on top of the last three factors indicates that they are kept constant.
Equation (1.7) or (1.7´) is a general functional relationship. In order to derive a single
firm’s supply schedule and supply curve, we must get that firm’s specific supply function.
Suppose it is Qsx = –50 + 25 Px.
If we substitute various prices of X into the above supply function we will get the
individual supply schedule as given in Table 1.5.
Table 1.5
Px (in Rs.) 10 9 8 7 6 5 4 3 2
Qsx 200 175 150 125 100 75 50 25 0
The supply schedule and the supply curve show that the producer will supply the good
only if the price is higher than Rs.2. If the price is Rs.2 or less the price is so low that it
does not even cover the cost of production so that the firm does not intend to produce
and sell the good.
In the above example the supply curve is an upward-sloping straight line. An upward
sloping supply curve implies that the higher is the price of the good, the more willing the
producer will be to supply the good. A producer’s positively sloped supply curve for a
good represents in one sense a maximum and in another sense a minimum boundary of
the producer’s intentions. At any given price, it would indicate the maximum quantity of
a good that the producer is willing to supply. To put it in a different way, if a given
quantity of a good is to be supplied, the supply curve would indicate the minimum price
at which the producer would be willing to supply that quantity. For example, let us take
the point D on the supply curve sx in fig. 1.7. That point indicates that if the price is
Rs.7, then the producer will be willing to supply a maximum of 125 units of the good. It
also indicates that if the producer has to supply 125 units of the good, then Rs.7 is the
minimum price at which he would supply that quantity.
Even though the supply curve is usually positively sloped, it could also have a zero,
infinite, or a negative slope, and no generalisation is possible. Also when the supply
curve is positively sloped it can be linear, as in the given example, or non-linear.
When factors other than the own price of the good that affect its supply change,
the entire supply curve shifts. This is referred to as a change or shift in supply, as
distinguished from a change in the quantity supplied.
Fig. 1.8 is an extension of fig.1.7. Given the supply curve sx, when the price rises from Rs.4
to Rs.7, the producer moves along the same supply curve sx from C to D and the quantity
supplied increases from 50 to 125 units. When, owing to a decrease in the cost of production,
the supply curve shifts from sx to s’x, the producer shifts from point C on sx to C’ on s’x and
increases the supply of the good from 50 to 80 units even at the same price of Rs.4.
The market or aggregate supply of a good gives the alternative amounts of the good
supplied per time period at various alternative prices by all the producers of this good in
the market. In addition to all the factors that influence individual producer’s supply, the
market supply depends also on the number of producers of the good in the market.
If all the producers face identical cost conditions such that they have the same supply
functions then the market supply function can be derived simply by multiplying the
individual supply function by the number of producers in the market. In the previous
example, if there are 100 identical producers in the market having the supply function
Qsx = –50 + 25 Px, then the market supply function will be given by
QSx = 100 × Qsx = –5,000 + 2,500 Px
The market supply curve is simply a graphical presentation of the market supply
schedule which can be drawn very much in the same way as fig. 1.7, only the scale on
the horizontal axis will have to change.
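A minimal sketch of the individual and market supply schedules follows. It assumes that the firm supplies nothing when the supply function Qsx = –50 + 25 Px would be negative (that is, at prices of Rs.2 or less); the function names are illustrative.

```python
N_PRODUCERS = 100  # number of identical producers assumed in the text

def individual_supply(price):
    """Qsx = -50 + 25 Px, with zero supply when the price is Rs.2 or less."""
    return max(0, -50 + 25 * price)

def market_supply(price, n_producers=N_PRODUCERS):
    """With identical producers, market supply is n times individual supply."""
    return n_producers * individual_supply(price)

for price in range(10, 1, -1):
    print(f"Px = {price}: Qsx = {individual_supply(price)}, "
          f"QSx = {market_supply(price)}")
```

The market column is the individual column scaled by 100, which is why only the scale of the horizontal axis changes.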
When individual producers face different cost conditions they will have different supply
functions and supply curves. In this case the market supply curve will be given by the
horizontal summation of the individual supply curves of all the firms in the market.
Let Table 1.7 give the supply schedules of the three producers of good X in the market.
Table 1.7: Quantity supplied (per time period)
Px (in Rs.) Firm 1 Firm 2 Firm 3
5 15 25 30
4 12 20 25
3 5 15 18
2 0 10 12
1 0 0 5
0 0 0 0
The individual supply curves of the three firms are drawn on the same set of axes in fig.
1.9 as sx1, sx2 and sx3. The market supply curve is given by Sx (OEDCBASx), which is the
horizontal summation of sx1, sx2 & sx3. Various points on the market supply curve are
obtained by adding up the quantities supplied by the individual producers at different
price levels. For example, at price Rs.5 (or P5) the quantity supplied by firm 1 is P5A1
(15), by firm 2, P5A2 (25) and by firm 3 it is P5A3 (30). So the total quantity supplied in
the market at price P5 is P5A1 + P5A2 + P5A3 = P5A (70 units). The market supply curve
coincides with Firm 3’s supply curve as the price rises from Re.0 to Re.1, because Firm 3
is the only supplier in that range; after that it becomes the horizontal sum of sx1, sx2 & sx3.
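The market supply schedule implied by Table 1.7 can be computed with the following sketch; the dictionary layout and names are illustrative only.

```python
# Supply schedules from Table 1.7: price (Rs.) -> quantities of Firms 1, 2, 3.
supply_schedules = {
    5: (15, 25, 30),
    4: (12, 20, 25),
    3: (5, 15, 18),
    2: (0, 10, 12),
    1: (0, 0, 5),
    0: (0, 0, 0),
}

# Horizontal summation: add the three firms' quantities at each price.
market_supply = {price: sum(qs) for price, qs in supply_schedules.items()}

for price, quantity in market_supply.items():
    print(f"Px = {price}: market supply = {quantity}")
```

At Re.1 the market quantity equals Firm 3's quantity alone, which is the segment where the two curves coincide.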
1.1.3 Equilibrium
Equilibrium is said to exist when opposing forces are in balance. In the market for a
particular good, demand and supply are like two opposing forces. The market is in
equilibrium at the price where the amount that is demanded equals the amount supplied.
This price is called the equilibrium price, and the quantity demanded and supplied at this
price is the equilibrium quantity. Market equilibrium is shown graphically in Fig.1.10.
In fig.1.10 Dx is the market demand curve and Sx the market supply curve. They
intersect at point E. Only at price OP* is the quantity demanded equal to the quantity
supplied, both being equal to OQ*. At any price higher than OP* supply exceeds demand,
and at any price below OP* demand exceeds supply, so they are not in balance. So the
equilibrium price is OP* and the equilibrium quantity OQ*.
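As a numerical illustration, the sketch below finds the equilibrium for the market demand function QDx = 3200 – 400 Px and the market supply function QSx = –5,000 + 2,500 Px derived earlier. Treating these two functions as belonging to the same market is an assumption made here for illustration; the lesson does not itself solve this pair.

```python
def market_demand(price):
    """QDx = 3200 - 400 Px."""
    return 3200 - 400 * price

def market_supply(price):
    """QSx = -5000 + 2500 Px, with zero supply at prices of Rs.2 or less."""
    return max(0, -5000 + 2500 * price)

# Setting 3200 - 400 P = -5000 + 2500 P gives P* = 8200 / 2900.
equilibrium_price = (3200 + 5000) / (400 + 2500)
equilibrium_quantity = market_demand(equilibrium_price)

def excess_demand(price):
    """Positive when demand exceeds supply, negative when supply exceeds demand."""
    return market_demand(price) - market_supply(price)

print(f"P* = {equilibrium_price:.2f}, Q* = {equilibrium_quantity:.1f}")
print(f"Excess demand at Rs.2: {excess_demand(2)}")
print(f"Excess demand at Rs.4: {excess_demand(4)}")
```

Below the equilibrium price excess demand is positive, and above it excess demand is negative, matching the description of fig.1.10.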
Own price elasticity of demand or, simply, the price elasticity of demand refers to the
relative responsiveness of the quantity demanded of a good to a change in
its own price. The coefficient of price elasticity of demand is given by the percentage
(proportionate) change in the quantity demanded of a good divided by the percentage
(proportionate) change in its own price.
Since price and quantity demanded are inversely related, the coefficient of price
elasticity of demand is a negative number. In order to avoid dealing with negative
values, a minus sign is often introduced into the formula for the coefficient of price
elasticity. Thus, the formula for own price elasticity of demand for good X is given by the
following:
ηxx = – (∆Qx/Qx) / (∆Px/Px) (1.8)
where the numerator gives the proportionate change in the quantity demanded of X and
the denominator gives the proportionate change in the price of X.
Equation (1.8) can also be written as
ηxx = – (∆Qx/∆Px) × (Px/Qx) (1.9)
For an infinitesimally small change in quantity and price, the formula for price elasticity
will be
ηxx = – (dQx/dPx) × (Px/Qx) (1.10)
where dQx/dPx is
the inverse of the slope of the demand curve at the point where the price is Px and the
quantity demanded of the good is Qx. Equation (1.10) can, therefore, be written as:
ηxx = – (1/slope of the demand curve) × (Px/Qx) (1.11)
To measure elasticity between two points on the demand curve we may use the formula
given by equation (1.9). But while applying this formula to measure elasticity between
two points on a demand curve we would get different results depending on whether we
move from the higher price to the lower price or from the lower price to the higher one. For
example, suppose we want to measure elasticity between points D & F on the market
demand curve Dx given in fig.1.5, which is reproduced in fig. 1.11. If we let the price fall
from Rs.5 to Rs.3 and move from D to F on the demand curve Dx, then elasticity will be
ηxx = – (800/(–2)) × (5/1200) = 5/3 ≈ 1.67
Whereas, if we let the price rise from Rs.3 to Rs.5 and move from point F to point D on
the same demand curve Dx, then elasticity will be
ηxx = – ((–800)/2) × (3/2000) = 0.6
Thus, though we are measuring elasticity between the same pair of points on a demand
curve, we are getting different results depending on whether we are moving from the
higher to the lower price or from the lower to the higher price. This problem arises because the
elasticity of demand tends to vary from one point to another on the demand curve, and
for a large change in price and quantity we need an average value over the entire range.
Thus, when we deal with large changes in price and quantity, we should use the
following Arc Elasticity formula:
ηxx = – (∆Qx/∆Px) × (P1 + P2)/(Q1 + Q2)
where P1 and P2 are the prices between which we want to find out the elasticity and Q1
and Q2 are the corresponding quantities demanded.
Following this formula, the elasticity between the points D and F on the demand curve
Dx in fig.1.11 will be
ηxx = – (800/(–2)) × (5 + 3)/(1200 + 2000) = 400 × (8/3200) = 1
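The dependence on direction, and the way the arc measure removes it, can be checked with a short sketch. It takes D as (Rs.5, 1,200 units) and F as (Rs.3, 2,000 units) from Table 1.3, and assumes the arc formula uses the sums of the two prices and of the two quantities; the function names are illustrative.

```python
def elasticity_from(p_start, q_start, p_end, q_end):
    """Elasticity using the starting point as the base: -(dQ/dP) * (P/Q)."""
    dq = q_end - q_start
    dp = p_end - p_start
    return -(dq / dp) * (p_start / q_start)

def arc_elasticity(p1, q1, p2, q2):
    """Arc elasticity: -(dQ/dP) * (P1 + P2) / (Q1 + Q2)."""
    return -((q2 - q1) / (p2 - p1)) * ((p1 + p2) / (q1 + q2))

D = (5, 1200)  # price, quantity at point D
F = (3, 2000)  # price, quantity at point F

d_to_f = elasticity_from(*D, *F)
f_to_d = elasticity_from(*F, *D)
arc = arc_elasticity(*D, *F)

print(f"D to F: {d_to_f:.2f}, F to D: {f_to_d:.2f}, arc: {arc:.2f}")
```

The arc value is the same whichever point is listed first, because both the sums and the ratio of changes are symmetric in the two points.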
Graphically the price elasticity at a point on a linear demand curve is shown by the ratio
of the segments of the line to the right and to the left of the particular point. It can also
be described as the ratio of the lower segment to upper segment. Let us look at the
linear demand curve given in fig.1.11.
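Triangles AKE and ELJ are similar triangles and, therefore, their sides are proportionate.
It is clear from the figure that E is the mid point of the demand curve AJ. Therefore, EJ =
EA and hence the elasticity at E, given by the ratio of the lower segment EJ to the upper
segment EA, is equal to 1.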
Arc elasticity between two points on the demand curve is equivalent to finding elasticity
at the point midway between the two points. In fig.1.11, where the demand curve is a straight
line, the point midway between D and F is the point E, which corresponds to the price
Rs.4.
In fig. 1.12, where the demand curve Dx is non-linear the point midway between D and
F is the point E′ which lies on the straight line joining the two points. So the arc elasticity
between the two points D and F on the non-linear demand curve Dx, is given by the
elasticity at point E′ which does not lie on the demand curve. In fig.1.12, the elasticity
History
Source: http://en.wikipedia.org/wiki/Price_elasticity_of_demand
The illustration that accompanied Marshall's original definition of PED, the ratio of PT to Pt
Example : Given the market demand function QDx = 3200 – 400 Px,
(i) Derive the market demand schedule.
(ii) Find elasticity when price falls from Rs.5 to Rs.4.
(iii) Find elasticity at Px = Rs.3.
Solution:
(i) Substituting prices from Rs.8 to Rs.0 into the demand function gives the market
demand schedule:
Px (in Rs.) 8 7 6 5 4 3 2 1 0
Qx (in Kgs) 0 400 800 1,200 1,600 2,000 2,400 2,800 3,200
(ii) Since the price has fallen by Re.1, it is a finite change and so we use the concept of
arc elasticity:
ηxx = – (400/(–1)) × (5 + 4)/(1200 + 1600) = 400 × (9/2800) = 9/7 ≈ 1.29
(iii) Here we have to find elasticity at a point on the demand curve. So we use the point
method. At Px = Rs.3, Qx = 2,000 and dQx/dPx = –400, so
ηxx = – (–400) × (3/2000) = 0.6
So the slope of the curve is only one of the factors that determine elasticity. The second
factor is the position of the point indicated by (P/Q), at which elasticity is evaluated.
Using this concept we can derive some important results on elasticity of demand.
(i) First, the elasticity of a downward-sloping straight-line demand curve varies from
infinity at the price axis to zero at the quantity axis. A straight line has a constant slope,
so its reciprocal is also constant at every point on the demand curve. So the value of
elasticity at any point will depend only on the ratio P/Q. At the price axis, Q = 0, and P/Q
tends to infinity. Thus elasticity approaches infinity as quantity approaches zero.
In fig.1.13, the elasticity is equal to infinity at point D on the demand curve DE. As we
move down the line DE, price decreases and quantity increases steadily; thus P/Q is
falling steadily so that elasticity is also falling. At the quantity axis, that is, at point E on
the demand curve, price is zero, so the ratio P/Q equals zero and hence elasticity is
equal to zero.
This result can be interpreted in another way by using the definition of elasticity.
Elasticity refers to percentage changes. Starting from point D on the demand curve, the
smallest reduction in price will increase the quantity demanded from zero to some
positive amount. Because the previously demanded quantity was zero, the increase is
infinite in percentage terms. So elasticity at point D is equal to infinity. At point E, any
increase in price from zero to a positive number is an infinite percentage increase
because the price was previously zero, while the percentage change in quantity is finite.
Therefore elasticity at point E is equal to zero. By
using the geometrical formula for point elasticity, we can derive that elasticity at the
An Example
Let us take the same demand function given in the previous example:
QDx = 3200-400Px.
At the price intercept of the demand curve, Px=8 and QDx=0, and ηxx = -(-400)*(8/0) = ∞.
At Px=4, QDx=1600 and ηxx = 400*(4/1600) = 1. It can be observed that it is the mid
point of the given straight line demand curve.
At Px=2, QDx=2400 and ηxx = 400*(2/2400)= 1/3<1.
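These values can be reproduced for every price on the schedule with a minimal sketch of the point formula ηxx = –(dQx/dPx) × (Px/Qx) for the demand function QDx = 3200 – 400 Px. Returning infinity where the quantity is zero is a convention adopted here to mirror the text; the function names are illustrative.

```python
import math

SLOPE_DQ_DP = -400  # dQx/dPx for QDx = 3200 - 400 Px

def quantity(price):
    """Market demand QDx = 3200 - 400 Px."""
    return 3200 - 400 * price

def point_elasticity(price):
    """Point elasticity -(dQ/dP) * (P/Q); infinite at the price intercept."""
    q = quantity(price)
    if q == 0:
        return math.inf
    return -SLOPE_DQ_DP * price / q

for price in range(8, -1, -1):
    print(f"Px = {price}: QDx = {quantity(price)}, "
          f"elasticity = {point_elasticity(price)}")
```

Elasticity exceeds one above the mid point (Px = 4), equals one at it, and is below one beneath it.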
We just saw that elasticity varies along a linear demand curve. There is another form of
demand curve (which is frequently used in empirical work) on which elasticity remains
the same at each point. The functional form of the demand curve is already given in
Value Addition 1.1:
Qx = a(1/Px^b),
on which the elasticity at every point is equal to b. Suppose a=100 and b=1, so that
Qx=100/Px. If Px=1, Qx=100 and ηxx= -(dQx/dPx)*(Px/Qx) = (100/Px^2)*(Px/Qx) =
(100/1)*(1/100) = 1=b.
Thus the demand curve Qx=100/Px is a unit elastic demand curve. It is a rectangular
hyperbola. Such a demand curve is illustrated in Fig.1.v.1.
Suppose we assume a=100 and b=2, then Qx=100/Px^2.
If Px=2, Qx=25 and ηxx= -(dQx/dPx)*(Px/Qx) =
(200/Px^3)*(2/25)=(200/8)*(2/25)=2=b.
Instead, if Px=5, Qx=4 and ηxx= (200/5^3)*(5/4)=(200/125)*(5/4)=2=b.
Thus, when b=2, the elasticity of demand is equal to two at each point on this demand
curve.
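The constant-elasticity property can also be checked numerically, without differentiating by hand. The sketch below assumes the demand function Qx = a/Px^b and approximates dQx/dPx by a central difference; the step size `h` and the function names are choices made here for illustration.

```python
def demand(price, a=100.0, b=2.0):
    """Constant-elasticity demand Qx = a / Px**b."""
    return a / price ** b

def point_elasticity(price, a=100.0, b=2.0, h=1e-6):
    """-(dQ/dP) * (P/Q), with dQ/dP approximated by a central difference."""
    slope = (demand(price + h, a, b) - demand(price - h, a, b)) / (2 * h)
    return -slope * price / demand(price, a, b)

for price in [1, 2, 5, 10]:
    print(f"b = 2, Px = {price}: elasticity is about "
          f"{point_elasticity(price):.4f}")
    print(f"b = 1, Px = {price}: elasticity is about "
          f"{point_elasticity(price, b=1.0):.4f}")
```

Up to numerical error, the elasticity equals b at every price tried.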
(ii) Second, comparing two straight line demand curves of the same slope, the one
farther from the origin is less elastic at each price than the one closer to the origin.
In fig. 1.14, D1E1 and D2E2 are two parallel straight line demand curves. Let us take the
price P; A and B are the corresponding points on the demand curves D1E1 and D2E2
respectively. Since the two curves are parallel, ∆Q/∆P is the same at points A and B. Price
is also the same. On the curve farther from the origin (D2E2) quantity is larger (i.e., OQ2
> OQ1) and hence P/Q is smaller; thus elasticity is smaller.
Generally, elasticity is measured at a particular price and in that case, at each price,
elasticity on D2E2 will be less than the elasticity on D1E1. But if we measure elasticity at
a particular quantity, then we will get a different result. For example, elasticity at
quantity Q2, on the demand curve D1E1 is (∆Q/∆P).(CQ2/OQ2) and on the demand
curve D2E2 is (∆Q/∆P).(BQ2/OQ2). As BQ2 is more than CQ2, so, at the quantity Q2,
the demand curve D2E2 is more elastic than the demand curve D1E1 .
(iii) Third, of two intersecting straight line demand curves the steeper demand curve will
be less elastic than the flatter one at the point of intersection.
In fig. 1.15, D1E1 and D2E2 are two straight line demand curves intersecting at point A.
D1E1 is steeper than D2E2. At the point of intersection A, P/Q is the same on the two
demand curves. On the steeper demand curve D1E1, the absolute slope ∆P/∆Q is larger
than on the flatter demand curve D2E2; thus, the ratio ∆Q/∆P is smaller on the steeper
curve than on the flatter curve, so that elasticity is lower.
Thus, if we take two intersecting straight line demand curves, the flatter demand curve
will show greater elasticity than the steeper one at a given price. But it is not always
true that a flatter demand curve will show greater elasticity than a steeper one. In fact,
if two straight line demand curves having different slopes start from the same point on
the price axis, the elasticities on the two demand curves will be the same at a given
price.
In fig. 1.16 DE1 & DE2 are two straight line demand curves starting from the same point
D on the price axis, DE1 being the steeper one. At price P, elasticity of the demand
curve DE1 will be OP/PD and that of DE2 will also be OP/PD. Hence elasticity is the same
at price P on the two demand curves. Thus, a flatter demand curve does not necessarily
signify a greater elasticity than a steeper one.
IV. Total Expenditure and Price Elasticity
When the price of a good increases, the consumer
spends more on each unit of the good bought. At the same time she buys fewer units of
the good. If the price effect outweighs the quantity effect, the total expenditure on the
good rises. If the quantity effect outweighs the price effect, then total expenditure falls.
If the elasticity of demand is less than one, then a 1 per cent increase in price leads to
less than a 1 per cent decrease in quantity demanded and the price effect outweighs the
quantity effect leading to rise in the expenditure on the good.
If the elasticity exceeds one, a small increase in price causes a more than proportionate
fall in the quantity demanded, so the quantity effect dominates and total expenditure
falls. If the elasticity is equal to one, a given percentage increase in price leads to an
equal percentage fall in the quantity bought and the total expenditure remains the same.
In general, if η < 1, then the change in price and the change in total expenditure move
in the same direction; if η > 1, then the change in price and the change in total
expenditure move in the opposite directions and if η = 1, with a change in price, the
total expenditure remains the same. It is clear that, the money spent by purchasers of a
good is received by the sellers. The total expenditure on the good by the consumers is
thus the total revenue for the sellers. Thus the previous relationship also holds good
between elasticity and total revenue. This relationship can be formally proved as follows:
Total revenue = TR = P × Q
Differentiating with respect to price,
dTR/dP = Q + P × (dQ/dP) = Q × [1 + (P/Q) × (dQ/dP)] = Q × (1 – ηxx)
So when ηxx < 1, then dTR/dP > 0; that is, total revenue and price move in the same
direction. When ηxx > 1, then dTR/dP < 0; that is, total revenue and price move in opposite
directions. When ηxx = 1, then dTR/dP = 0; that is, with a change in price there is no change in
total revenue.
Example : Using only the total expenditure criterion, determine if the demand
schedules given in the following table are elastic, inelastic, or unitary elastic.
Price (in Rs.) 5 4 3 2 1
Qx 120 150 200 300 600
Qy 120 160 225 350 725
Qz 120 140 175 250 475
Ans.
(i) For good X, as the total expenditure on the good remains the same at Rs.600, so the
demand for X is unitary elastic.
(ii) For good Y, the total expenditure on the good is rising with a fall in the price. That is,
price and total expenditure are moving in the opposite direction, so the demand is
relatively elastic.
(iii) In case of good Z, with a fall in the price total expenditure also falls. So the demand
is relatively inelastic.
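The total expenditure criterion used in this answer can be sketched as follows. The classification rule assumes, as in the table, that prices are listed from highest to lowest, so a rising expenditure column means expenditure rises as price falls; the function name `classify` is illustrative.

```python
prices = [5, 4, 3, 2, 1]  # listed from highest to lowest, as in the table
quantities = {
    "X": [120, 150, 200, 300, 600],
    "Y": [120, 160, 225, 350, 725],
    "Z": [120, 140, 175, 250, 475],
}

def classify(prices, qty):
    """Classify demand by how total expenditure moves as the price falls."""
    expenditure = [p * q for p, q in zip(prices, qty)]
    changes = [later - earlier
               for earlier, later in zip(expenditure, expenditure[1:])]
    if all(c == 0 for c in changes):
        return expenditure, "unitary elastic"
    if all(c > 0 for c in changes):
        return expenditure, "elastic"      # expenditure rises as price falls
    if all(c < 0 for c in changes):
        return expenditure, "inelastic"    # expenditure falls as price falls
    return expenditure, "mixed"

results = {good: classify(prices, qty) for good, qty in quantities.items()}
for good, (expenditure, label) in results.items():
    print(f"Good {good}: total expenditure = {expenditure} -> {label}")
```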
The link between elasticity and revenue may answer the following questions: Why would
Brazil, one of the world's largest producers of coffee, burn some of its coffee harvest
as a way of increasing the value of coffee exports? Why would the OPEC countries lower
production if their goal was greater income? Why does agricultural income fall in years of
a good harvest?
In the coffee example, what could we expect when Brazilian officials reduce the supply of
coffee? Coffee drinkers seem to need their coffee and they can be expected to pay
whatever they need to pay to get their coffee fix. In this situation the reduction in supply
will lead to a substantial increase in price as the demanders compete for the smaller
supply. The net effect on revenue will be positive, with the increase in price (P) more
than compensating for the decreased quantity (Q).
The relationship between elasticity and total revenue can be explained in the following
way. Let's assume there is an increase in supply - the supply curve shifting to the right.
Total revenue is by definition equal to the price times the quantity sold (P*Q). In the
diagrams below the initial situation is described by the black supply curve (inner curve).
The revenue earned from selling the output is the area A + B. After the increase in
supply shifts the supply curve to the right (red line), revenue equals the area B + C.
Revenue will increase as a result of the increase in supply if (area C) > (area A). In the
diagrams below we see that this happens when the demand curve is flat - when demand
is elastic. When demand is elastic, revenue will increase if we decrease the price or
increase supply. Revenue and output move in the same direction while revenue and
price move in opposite directions when demand is elastic. When demand is inelastic,
revenue will decrease if we decrease the price or increase supply. Revenue and output
move in opposite directions while revenue and price move in the same direction when
demand is inelastic.
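The comparison of areas can also be checked with numbers. The sketch below is only an illustration: it assumes straight-line demand and supply curves, and the coefficients are made up for the example (they are not the curves drawn in the diagrams). It computes equilibrium revenue before and after a rightward shift of the supply curve, once for a flat demand curve and once for a steep one.

```python
def equilibrium(a, b, c, d):
    """Solve demand Q = a - b*P and supply Q = c + d*P for (price, quantity)."""
    price = (a - c) / (b + d)
    quantity = a - b * price
    return price, quantity


def revenue_change(a, b, c, d, shift):
    """Change in revenue P*Q when the supply curve shifts right by `shift` units."""
    p1, q1 = equilibrium(a, b, c, d)
    p2, q2 = equilibrium(a, b, c + shift, d)
    return p2 * q2 - p1 * q1


# Hypothetical flat (elastic) demand curve: Q = 100 - 10P
elastic_change = revenue_change(a=100, b=10, c=10, d=5, shift=15)

# Hypothetical steep (inelastic) demand curve: Q = 100 - 2P
inelastic_change = revenue_change(a=100, b=2, c=10, d=4, shift=12)

print("Elastic demand, change in revenue:", elastic_change)
print("Inelastic demand, change in revenue:", inelastic_change)
```

With these assumed curves the increase in supply raises revenue in the elastic case and lowers it in the inelastic case, which is the pattern described above.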
Guidelines
We can now come up with some guidelines that tell us what to do with price or output if
our goal is to raise revenue. The general rules appear below.
When demand is inelastic:
Output and revenue are negatively related: to raise revenue you would lower
output.
Price and revenue are positively related: to raise revenue you would raise price.
Institute of Lifelong Learning,University of Delhi
The Concept of Demand, Supply & Elasticity
When demand is elastic:
Output and revenue are positively related: to raise revenue you would raise
output.
Price and revenue are negatively related: to raise revenue you would lower price.
To understand the relationship between elasticity and revenue, let's look at the dilemma
faced by OPEC countries. The OPEC countries once controlled the supply of oil and they
were meeting to decide what to do about their levels of oil production. Some wanted to
raise output while others wanted to lower output. When would the strategy to lower
output be most effective?
Let's begin with the basics - Revenue = P*Q. The change in revenue will depend upon
the changes in price and quantity. The decision to restrict output (decrease in Q) as a
means to increase revenue works when we have reason to believe that revenue and
output tend to move in opposite directions (Revenue increases when Quantity falls). If
we cut production, the only way that this will increase revenue is if the price rises
substantially. This will happen if we are talking about a product where price does not
have much of an effect on demand - a product where demand is inelastic.
Now let's look at the previous graph. Because demand is inelastic, the curve is steep so
the appropriate diagram is the one on the right. The original equilibrium is where the
supply curve and demand curve intersect [price = P1 and the quantity = Q1]. Total
revenue is equal to the area A + B. If the supply is increased, the supply curve shifts
out, then the new equilibrium will generate revenue equal to the area B + C. If we
compare the revenues we see that the decision to expand output will lower revenue
when demand is inelastic. In this case, if OPEC thought that demand was inelastic, the
group should agree to restrict output which is exactly what they did.
With the help of the previous graph we can explain how good news for farming can be
bad news for farmers. Generally, the demand curve for agricultural products is fairly inelastic,
so the appropriate diagram is the one in the right panel. When an improvement in
farm technology or favourable weather shifts the supply curve of, say, wheat
from S to S’, price falls steeply but quantity demanded increases only slightly, and total
revenue falls.
Solved Problem
Question: Suppose the price of a good is Rs.10 and its demand elasticity at this price
is 0.5. Suppose that due to a rise in its price its demand falls by 10 percent. What is the
new price? What happens to the total expenditure on the good after a rise in its price?
Calculate the percentage change in the total expenditure.
Answer: We know that ηxx = (percentage change in quantity / percentage change in
price).
So, 0.5 = 10 / {(∆P/P) × 100} = 10P / (100 × ∆P) = (10 × 10) / (100 × ∆P) = 1/∆P
Or, ∆P = 2.
So the new price is Rs. 12.
As the elasticity of demand is less than one, total expenditure on the good will rise
with a rise in the price.
The percentage change in total expenditure can be written as:
{(P2Q2 – P1Q1) / P1Q1} × 100
= {(P2Q2 / P1Q1) – (P1Q1 / P1Q1)} × 100
= {(P2/P1)(Q2/Q1) – 1} × 100. (I)
Now, we know that {(Q2 – Q1) / Q1} × 100 = –10.
Therefore, {(Q2/Q1) – 1} = –1/10,
Or, (Q2/Q1) = 9/10.
Putting this value in (I), we get the percentage change in total expenditure as:
{(12/10)(9/10) – 1} × 100
= {(108 – 100) / 100} × 100
= 8.
So total expenditure on the good rises by 8 percent.
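The arithmetic of this solved problem can be retraced in a few lines. The sketch below uses only the figures given in the question (price Rs.10, elasticity 0.5, a 10 percent fall in quantity); the variable names are chosen here for illustration.

```python
price_old = 10.0            # Rs., given in the question
elasticity = 0.5            # own price elasticity (absolute value), given
pct_fall_in_quantity = 10.0 # percent, given

# elasticity = (% change in quantity) / (% change in price)
pct_rise_in_price = pct_fall_in_quantity / elasticity
price_new = price_old * (1 + pct_rise_in_price / 100)

# Percentage change in expenditure = {(P2/P1)(Q2/Q1) - 1} * 100
quantity_ratio = 1 - pct_fall_in_quantity / 100
pct_change_in_expenditure = ((price_new / price_old) * quantity_ratio - 1) * 100

print("New price (Rs.):", price_new)
print("Change in total expenditure (%):", round(pct_change_in_expenditure, 6))
```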
V. Factors affecting price elasticity
The size of the price elasticity of demand depends on the following factors.
(i) First, the price elasticity of demand for a good is larger the closer and the more
numerous are the substitutes available. For example, the demand for oranges is more
elastic than the demand for salt because oranges have closer and more numerous
substitutes (like bananas, mangoes, etc.) than salt. Thus, if the prices of both salt and
oranges rise by the same percentage, the decrease in the demand for oranges will be
greater than that for salt.
The more narrowly a good is defined, the larger is the number of substitutes available
and hence the larger is its elasticity of demand. For this reason the demand for a
particular brand of a product will be more elastic than the demand for the product in
general. For example, the detergent brand ‘Surf’ has many substitutes like ‘Ariel’, ‘Tide’,
‘Nirma’ etc., and hence an increase in the price of Surf will induce consumers to buy
other brands, so the demand for Surf will fall to a great extent. If, instead, the price of
detergent powder in general increases, the demand for it will not fall to a great extent
because close substitutes are not available. Thus, the demand for Surf will be much more
elastic than the demand for ‘detergent powder’ in general.
In the extreme case, if a good is defined so that it has perfect substitutes, its elasticity
of demand is infinite. For example, if a particular petrol pump charges a higher price for
petrol than the market price, it would lose all its customers, as buyers will switch over
to other petrol pumps which are selling identical products at the market price.
(ii) Second, the elasticity of demand depends on the nature of the need that the good
satisfies. In general, luxury goods are price elastic, while necessities are price inelastic.
For example, goods like cereals, cooking gas, sugar, salt, potatoes, electricity and transport
to and from the place of work are necessities, and with a rise in their price the quantity
demanded will not be reduced significantly. Goods like entertainment, eating
out, holidaying, etc., on the other hand, are luxuries and their demand will be price elastic.
(iii) The proportion of income spent on a good is another factor determining its elasticity.
The higher the proportion, the greater is the price elasticity of demand. Examples are
durable goods like electrical appliances, cars etc. In contrast, a consumer spends a very
small proportion of her income on goods like salt, vegetables, milk etc., and
their demand will be price inelastic.
(iv) Another factor is the time period over which consumers adjust to a price change.
The longer the adjustment period, the greater will be the elasticity of demand. For
example, immediately after a rise in the price of LPG, a household may not be able to
reduce its demand for it, but in the longer run it will be able to replace LPG by either
piped natural gas or electricity and hence its demand for LPG will decrease. So the demand
for LPG will be more elastic in the long run than in the short run.
In fig.1.17, point A on the market demand curve DD′ shows that at the price of OP per
unit, the quantity demanded by the buyers or quantity sold by the sellers is OQ. Total
revenue is equal to the price per unit of the good times the quantity of the good sold.
From the standpoint of sellers, OP×OQ or the area of the rectangle OPAQ is the total
revenue obtainable when a price of OP per unit is charged.
Average revenue (AR) is the revenue per unit of output sold. That is,
AR = TR/Q = (P × Q)/Q = P,
where Q is the quantity sold at the price P. Thus, AR is identically equal to price. In Fig.
1.17, when quantity sold is OQ, price as well as average revenue is equal to AQ which is
the height of the demand curve corresponding to OQ. So the market demand curve can
be considered as the AR curve from the point of view of the seller.
Marginal revenue (MR) is the change in total revenue attributable to a one-unit change
in output sold. In general, MR is calculated by dividing the change in TR by the change in
output. That is,
MR = ΔTR/ΔQ, (1.15)
where ΔTR is the change in TR and ΔQ is the change in output. In particular, when ΔQ =
1, MR = ΔTR. In equation (1.15) the changes are finite. For infinitesimally small changes
in quantity and revenue,
MR = d(TR)/dQ, (1.16)
or, marginal revenue at any point on the TR curve is given by the slope of the total
revenue curve at that point.
The first two columns in Table 1.8 give the demand schedule of the good. Column 3 is
derived by multiplying columns (1) and (2), and it gives us total revenue. The change in
total revenue resulting from each additional unit of the good sold gives the marginal
revenue, which is shown in column 4. Because average revenue is identically equal to
the price of the good, column 1 also gives us the AR.
In fig. 1.18, panel (a) gives the total revenue curve, which is drawn by plotting the points
given in columns (2) and (3) and then joining these points by straight line segments.
This is done because the data given in the table are discrete. The TR curve rises steadily
till 4 units of the good are sold, remains constant at Rs.20 between the 4th and 5th units and
then declines. Panel (b) gives the corresponding demand (AR) and marginal revenue
curves. Points on the TR and D curves are plotted at each level of output.
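The way columns 3 and 4 are built from a demand schedule can be sketched in code. The schedule below is an assumption made for this illustration: it is chosen to agree with the figures quoted in the text (a price of Rs.7 at 2 units, Rs.6 at 3 units, and total revenue of Rs.20 at both 4 and 5 units), but it should not be read as a copy of Table 1.8.

```python
# Hypothetical demand schedule consistent with the figures quoted in the text.
prices = [8, 7, 6, 5, 4, 3]       # Rs. per unit (also the AR at each quantity)
quantities = [1, 2, 3, 4, 5, 6]   # units sold

# Total revenue = price * quantity at each row of the schedule.
total_revenue = [p * q for p, q in zip(prices, quantities)]

# Marginal revenue of each unit = TR at that quantity minus TR at the
# previous quantity (TR is zero when nothing is sold).
previous_tr = [0] + total_revenue[:-1]
marginal_revenue = [tr - prev for tr, prev in zip(total_revenue, previous_tr)]

for q, p, tr, mr in zip(quantities, prices, total_revenue, marginal_revenue):
    print(f"Q={q}  P=AR={p}  TR={tr}  MR={mr}")
```

In this assumed schedule TR rises up to the 4th unit, stays at Rs.20 for the 5th unit (where MR is zero) and then falls (where MR is negative).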
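When the demand curve is a downward sloping straight line, we can easily derive the
corresponding MR curve. Let the demand curve be
P = a – bQ, (1.17)
where a and b are positive constants. Then
TR = P × Q = aQ – bQ², (1.18)
and MR = d(TR)/dQ = a – 2bQ. (1.19)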
From equation (1.19) we can derive two important relationships between MR and the
demand curves when the demand curve is a downward sloping straight line. The MR
curve has the same intercept as the demand curve and a slope which is twice as large in
absolute value as the slope of the demand curve. Equation (1.19) can be rewritten as
follows :
MR = a – bQ – bQ = P – bQ (1.20)
As b is a positive constant, for any positive value of Q, MR will be less than the price.
When Q = 0, MR will be equal to the price.
In general, the marginal revenue is given by
MR = d(TR)/dQ = P + Q(dP/dQ), (1.21)
where dP/dQ is the slope of the demand curve at the relevant point. When the demand
curve is negatively sloped, dP/dQ is negative, and hence, MR is less than the price. When
the demand for a good is perfectly elastic and the demand curve is horizontal, dP/dQ = 0,
and hence, MR will be equal to the price.
Thus, the MR curve lies below the demand curve when the latter has a negative slope. The
reason is that to sell more units the price must be lowered, not just on the last unit, but
on all previous (or intra-marginal) units as well. For example, in Table 1.8, to increase
quantity sold from 2 to 3 units, price is reduced from Rs.7 to Rs.6 per unit. Therefore,
the MR on the 3rd unit of the good is given by the current price of Rs.6 minus the Re.1
reduction in price on each of the previous two units. So MR on the 3rd unit is given by
Rs.6 – Rs.2 = Rs.4, which is lower than the price of Rs.6.
For a given quantity, price measures the height of the demand curve. Since MR < P, the
MR curve lies below the demand curve.
Solved Problem
Question: Given the demand function Qx = 100-10Px, derive the equations for TR,
AR and MR functions. On the basis of your answer derive the relationship between AR
and MR.
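Answer: Rearranging the demand function gives the inverse demand function
Px = 10-(1/10)*Qx. (i)
TR = Px*Qx = 10*Qx-(1/10)*Qx². (ii)
AR = TR/Qx = 10-(1/10)*Qx = Px.
MR = d(TR)/dQx = 10-(1/5)*Qx. (iii)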
The AR curve has a constant slope of –(1/10), implying that it is a downward sloping
straight line. Further, it has a vertical intercept equal to 10.
From (iii) it is clear that the MR curve also has the same vertical intercept as the AR
curve (10). Its slope is –(1/5), which is twice the slope of the AR curve in
absolute terms.
For any positive quantity, AR>MR. For example, at Qx=20, AR=Px=8 and MR=10-(1/5)*20 = 6;
at Qx=50, AR=Px=5 and MR= 10-(1/5)*50 =0; at Qx=80, AR=Px=2, and MR = 10-
(1/5)*80 = -6.
This illustrates that the MR curve lies below the demand curve (i.e., the AR curve) when
the demand curve is a downward sloping straight line.
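The result of this solved problem can be cross-checked numerically. The sketch below takes the inverse demand function Px = 10 – (1/10)Qx from the problem, computes MR as the numerical slope of the TR curve, and compares it with the formula MR = 10 – (1/5)Qx. The function names and the step size are choices made for this illustration.

```python
def price(q):
    """Inverse demand (AR) from the solved problem: Px = 10 - Qx/10."""
    return 10 - q / 10


def total_revenue(q):
    """TR = Px * Qx."""
    return price(q) * q


def marginal_revenue(q, step=1e-6):
    """Slope of the TR curve at q, approximated by a central difference."""
    return (total_revenue(q + step) - total_revenue(q - step)) / (2 * step)


for q in (20, 50, 80):
    print(f"Qx={q}  AR={price(q)}  MR(numerical)={marginal_revenue(q):.4f}  "
          f"MR(formula)={10 - q / 5}")
```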
We can use the relationship given in equation (1.21) to construct the marginal revenue
curve corresponding to a given demand curve. This is shown in fig.1.19 where in panel
(a) the demand curve is linear and in panel (b) it is non-linear. In panel (a), we can find
marginal revenue corresponding to point E on the demand curve Dx by dropping
perpendicular EA to the vertical axis and EC to the horizontal axis. We know from
drawn from the demand curve to the vertical axis. For example, AK = ½AE and OF′ =
½ODx. We can prove this as follows. We know that the slope of the MR curve is twice
the slope of the demand curve when the demand curve is linear.
In Fig. 1.19(a), slope of MR curve =(OD/OF’) and slope of the demand curve
=(OD/ODx).
Thus, (OD/OF’)=2(OD/ODx). So, OF’= (ODx/2).
This gives another way to derive the MR curve geometrically corresponding to a linear
demand curve.
To find the marginal revenue corresponding to any point on a non-linear demand curve,
we draw a tangent to the demand curve at that point and then proceed as described
above. For example, to find the MR corresponding to point E on the non-linear demand
curve D′x given in panel (b) of fig. 1.19, we draw the tangent AB and then drop
perpendicular EG to the vertical axis and EL to the horizontal axis. Following equation
(1.21), we can prove that the MR corresponding to point E will be OG – AG = Rs. 10 –
Rs.5 = Rs.5. This is shown as point E′. Similarly, corresponding to point F on the
demand curve Dx′, MR will be equal to zero which is shown as point F′. Joining points
like E′, F′ we get the marginal revenue curve MR′x corresponding to the nonlinear
demand curve D′x.
The marginal revenue is related to the price and the elasticity of demand by the
following formula:
MR = P(1 – 1/η), (1.23)
where η is the absolute value of the own price elasticity of demand.
For a downward sloping straight line demand curve the relationship (1.23) is shown in
fig. 1.20. On the demand curve DD’, M is the mid-point and hence η = 1 at that point.
Corresponding to point M on the demand curve, MR = 0 and the MR curve intersects
the quantity axis. For any point above M on the demand curve, η > 1 and hence MR > 0.
For example, at point K on the demand curve, P = KC and MR = BC.
For any point below M on the demand curve, η < 1, and hence MR < 0, e.g., at point L,
η < 1 and MR curve goes below the quantity axis.
For a unitary elastic demand curve, elasticity is equal to one at every point on the
demand curve and hence MR = 0 for every level of output. In fact, a unitary elastic
demand curve has the shape of a rectangular hyperbola and its corresponding MR curve
will merge with the horizontal axis. A rectangular hyperbola is a downward sloping
curve which is asymptotic to both the axes and for which the areas of the rectangles
formed under the curve are equal.
In fig. 1.21 let DD’ be a unitary elastic demand curve. Then at every point on the
demand curve total revenue remains the same. Total revenue at point A on the demand
curve is given by the area of the rectangle OP1AQ1. Similarly, total revenue at points B
and C on the demand curve are given respectively by the areas of OP2BQ2 and OP3CQ3.
Thus area of OP1AQ1 = area of OP2BQ2 = area of OP3CQ3. Hence, the demand curve DD’
is a rectangular hyperbola and since MR = 0 whenever η = 1, the MR curve merges with
the quantity axis.
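The link between marginal revenue, price and elasticity can be illustrated numerically. The sketch below assumes a straight-line demand curve P = a – bQ with illustrative values a = 10 and b = 0.1, computes the point elasticity η = (1/b)(P/Q), and compares P(1 – 1/η) with the direct expression a – 2bQ. It then checks that a rectangular hyperbola P = k/Q (with an assumed k = 20) gives the same total revenue at every quantity.

```python
def point_elasticity(a, b, q):
    """Absolute own price elasticity on the line P = a - b*Q at quantity q."""
    p = a - b * q
    return (1 / b) * (p / q)


def mr_from_elasticity(a, b, q):
    """Marginal revenue computed as P * (1 - 1/eta)."""
    p = a - b * q
    return p * (1 - 1 / point_elasticity(a, b, q))


def mr_direct(a, b, q):
    """Marginal revenue of a linear demand curve: a - 2*b*Q."""
    return a - 2 * b * q


A, B = 10, 0.1                      # assumed intercept and slope
for q in (20, 50, 80):              # above, at, and below the mid-point
    print(q, point_elasticity(A, B, q), mr_from_elasticity(A, B, q), mr_direct(A, B, q))

# Unitary elastic demand: P = k/Q, so P*Q is the same at every point.
K = 20
hyperbola_revenue = [(K / q) * q for q in (1, 2, 4, 5, 10)]
print(hyperbola_revenue)
```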
ηxy = (ΔQx/Qx) / (ΔPy/Py) = (ΔQx/ΔPy) × (Py/Qx), (1.24)
where ηxy = cross price elasticity of demand between good X and good Y,
ΔQx = change in the quantity demanded of X,
ΔPy = change in the price of Y.
When goods X and Y are substitutes of each other, a rise in the price of Y will lead to an
increase in the demand for X and hence (ΔQx/ΔPy) > 0 and, therefore, ηxy > 0.
On the other hand, if goods X and Y are complements of each other, then a rise in the
price of Y will lead to a reduction in the quantity demanded of Y and also a reduction in
the quantity demanded of X, so that (ΔQx/ΔPy) < 0 and, therefore, ηxy < 0.
If X and Y are not related to each other, so that a change in the price of Y does not
cause any change in the quantity demanded of X, then ηxy = 0.
It should be noted that the value of ηxy need not be equal to the value of ηyx because
the responsiveness of quantity demanded of X with respect to a change in the price of Y
need not equal the responsiveness of quantity bought of Y to a change in price of X.
If the goods X and Y are produced by two firms belonging to the same industry, then X
and Y will be substitutes of each other and their cross price elasticity will be a large
positive number. For example ‘Tropicana’ and ‘Real’ belong to the same packaged fruit
juice industry and a rise in the price of one will lead to a rise in the quantity demanded
of the other and so they will have a high positive cross price elasticity. Thus, high
positive cross elasticities among a group of commodities are frequently used to define the
boundaries of an industry. If the cross price elasticity among a group of goods equals
zero or is negligible, then the goods will belong to different industries rather than to the
same industry.
Example: Find the cross elasticity of demand between Coffee (X) and Tea (Y) and
between Coffee (X) and Milk (Z), for the data given in Table 1.9. Also interpret your
results.
Table 1.9
Good          Before                             After
              Price          Quantity            Price          Quantity
              (Rs./unit)     (units/month)       (Rs./unit)     (units/month)
Tea (Y)       Rs.250/kg      0.5 kg/month        Rs.500/kg      0.25 kg/month
Coffee (X)    Rs.500/kg      0.2 kg/month        Rs.500/kg      0.3 kg/month
Milk (Z)      Rs.15/litre    60 litres/month     Rs.18/litre    45 litres/month
Coffee (X)    Rs.500/kg      0.2 kg/month        Rs.500/kg      0.1 kg/month
Ans. Taking the initial price and quantity as the base,
ηxy = (ΔQx/ΔPy) × (Py/Qx) = (0.1/250) × (250/0.2) = 0.5.
Since ηxy is positive, tea and coffee are substitutes of each other.
Now, ηxz = (ΔQx/ΔPz) × (Pz/Qx) = (–0.1/3) × (15/0.2) = –2.5.
Since ηxz is negative, coffee and milk are complements of each other.
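The calculation for Table 1.9 can be sketched in code. The function below is an illustration, not the only possible convention: it assumes the cross price elasticity is measured as (ΔQx/ΔPy)(Py/Qx) with the initial ("before") price and quantity as the base. An arc (mid-point) formula would give different magnitudes but the same signs.

```python
def cross_elasticity(qx_before, qx_after, py_before, py_after):
    """Cross price elasticity of X with respect to the price of Y,
    using the initial price and quantity as the base (an assumption)."""
    change_in_qx = qx_after - qx_before
    change_in_py = py_after - py_before
    return (change_in_qx / change_in_py) * (py_before / qx_before)


def relation(eta):
    """Interpret the sign of a cross price elasticity."""
    if eta > 0:
        return "substitutes"
    if eta < 0:
        return "complements"
    return "unrelated"


# Coffee (X) and Tea (Y): tea price rises from Rs.250 to Rs.500 per kg.
eta_xy = cross_elasticity(0.2, 0.3, 250, 500)
# Coffee (X) and Milk (Z): milk price rises from Rs.15 to Rs.18 per litre.
eta_xz = cross_elasticity(0.2, 0.1, 15, 18)

print(round(eta_xy, 4), relation(eta_xy))
print(round(eta_xz, 4), relation(eta_xz))
```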
A Brain Teaser
Suppose that the cross price elasticity of demand for two goods is minus infinity. What
would you infer about the two goods?
Source: http://tutor2u.net/economics/revision-notes/as-markets-crossprice-elasticity-of-demand.html
How can businesses make use of the concept of cross price elasticity of demand?
Pricing strategies for substitutes: If a competitor cuts the price of a rival product, firms
use estimates of cross-price elasticity to predict the effect on the quantity demanded and
total revenue of their own product. For example, two or more airlines competing with
each other on a given route will have to consider how one airline might react to its
competitor’s price change. Will many consumers switch? Will they have the capacity to
meet an expected rise in demand? Will the other firm match a price rise? Will it follow a
price fall?
Consider for example the cross-price effect that has occurred with the rapid expansion of
low-cost airlines in the European airline industry. This has been a major challenge to the
existing and well-established national air carriers, many of whom have made
adjustments to their business model and pricing strategies to cope with the increased
competition.
Pricing strategies for complementary goods: For example, popcorn, soft drinks and
cinema tickets have a high negative value for cross elasticity: they are strong
complements. Popcorn has a high mark-up, i.e., popcorn costs pennies to make but sells
for more than a pound. If firms have a reliable estimate of the cross price elasticity of
demand they can estimate the effect, say, of a two-for-one cinema ticket offer on the
demand for popcorn. The additional profit from extra popcorn sales may more than
compensate for the lower cost of entry into the cinema.
Advertising and marketing: In highly competitive markets where brand names carry
substantial value, many businesses spend huge amounts of money every year on
persuasive advertising and marketing. There are many aims behind this, including
attempting to shift out the demand curve for a product (or product range) and also build
consumer loyalty to a brand. When consumers become habitual purchasers of a product,
the cross price elasticity of demand against rival products will decrease. This reduces the
size of the substitution effect following a price change and makes demand less sensitive
to price. The result is that firms may be able to charge a higher price, increase their total
revenue and turn consumer surplus into higher profit.
These relationships can be extended to the case when the consumer consumes any
number of goods : If the own-price elasticity of demand for good X exceeds one, then in
some average sense, the other goods are substitutes for X. If the own-price elasticity is
less than one, then in that same sense, the other goods are complements. This
proposition can be formally derived as follows :
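Let the consumer spend her income, I, on the purchase of n goods. Then her budget
constraint is given as:
I = P1Q1 + P2Q2 + ... + PnQn.
Let the price of good 1 change, with income and all other prices remaining constant.
Differentiating the budget constraint with respect to P1, we get
0 = {Q1 + P1(∂Q1/∂P1)} + P2(∂Q2/∂P1) + ... + Pn(∂Qn/∂P1).
Multiplying and dividing the 2nd to nth terms on the R.H.S. of the previous equation by
(P1/Qj), j=2,3,...,n, and multiplying through by P1, we get,
E1(η11 – 1) = E2η21 + E3η31 + ... + Enηn1, (1.26)
where η11 is the own-price elasticity of demand for good 1 (taken as a positive number);
ηj1 is the cross-price elasticity between the jth good and good 1; E1 is the expenditure
on good 1 and Ej is the expenditure on the jth good, j = 2, ..., n.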
The R.H.S. of equation (1.26) gives us the weighted sum of the cross-price elasticities
between good 1 and other goods. Equation (1.26) implies that if η11 > 1, then this
weighted sum of cross-price elasticities will be positive, indicating that on average the
other goods are substitutes for good 1. On the other hand, if η11 < 1, then the weighted
sum of cross-price elasticities will be negative, implying that in some average sense the
other goods are complements of good 1.
Question: Suppose a consumer spends her entire income on the purchase of two
goods, X and Y. Suppose further that the consumer’s own price elasticity for X was more
than one. Then prove that X and Y are substitutes.
Solution: Let Px and Qx be the price and quantity bought of X respectively, and Py
and Qy be the price and quantity of good Y. Let I be the income of the consumer.
As the consumer is spending her entire income on the two given goods so,
I= Px. Qx + Py .Qy (i)
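Let there be a change in the price of X, income of the consumer and price of Y remaining
constant. Thus, differentiating (i) with respect to Px we get
0 = Qx + Px(∂Qx/∂Px) + Py(∂Qy/∂Px).
Writing ηxx = –(∂Qx/∂Px)(Px/Qx) for the own price elasticity of X (taken as a positive
number) and ηyx = (∂Qy/∂Px)(Px/Qy) for the cross price elasticity of Y with respect to
the price of X, and multiplying through by Px, this becomes
PyQy ηyx = PxQx (ηxx – 1). (ii)
It is clear from equation (ii) that if ηxx > 1, then ηyx > 0, implying that goods X and Y are
substitutes of each other.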
Income elasticity of demand (ηI) is the percentage change in the quantity demanded of a
good divided by the percentage change in the consumer’s income. Symbolically,
ηI = (ΔQx/Qx) / (ΔI/I) = (ΔQx/ΔI) × (I/Qx).
If X is a normal good for the consumer then with a change in her income the quantity
demanded of X will change in the same direction and so (ΔQx/ΔI) > 0 and hence ηI will be
positive. If X is an inferior good then ηI will be negative. A normal good can be further
classified as a necessity if ηI is less than one and as a luxury if ηI is greater than one.
Most of the broadly defined goods such as food, fuel, housing, education, clothing etc.
are normal goods, while narrowly defined inexpensive goods such as coarse rice, jawar,
bajra, vanaspati, synthetic clothes, 555 detergent powder etc. are usually considered as
inferior goods. Among normal goods, food, fuel, clothing etc. are necessities while
higher education and housing are luxuries.
It should be kept in mind that this classification of goods into normal and inferior, and
necessity and luxury is not strictly defined. In fact, the same good can be regarded as a
luxury by some individuals and as a necessity or even an inferior good by other
individuals. Even the same individual might consider a good as a luxury at a lower level
of income, as a necessity at intermediate level of income and as an inferior good at high
level of income.
Example: From the income quantity relationship given in Table 1.10, find the income
elasticity of demand between the various successive levels of income and determine over
what range of the consumer’s income the good is a luxury, a necessity, or an inferior
good for the consumer.
Table 1.10
Point                  A        B        C        D        E         F
Income (Rs./month)     2,000    4,000    6,000    8,000    10,000    12,000
Quantity (Kg/month)    100      300      500      650      700       600
Since ηI is negative, the good has now become an inferior good for the consumer.
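The classification asked for in this example can be sketched in code using the data of Table 1.10. The sketch assumes that the elasticity between two successive points is measured as (ΔQ/ΔI)(I/Q) with the lower-income point as the base; other conventions (such as an arc formula) would change the magnitudes. The labels follow the rule stated earlier: luxury if ηI > 1, necessity if ηI lies between 0 and 1, inferior if ηI < 0.

```python
incomes = [2000, 4000, 6000, 8000, 10000, 12000]   # Rs./month, points A to F
quantities = [100, 300, 500, 650, 700, 600]        # kg/month, points A to F
points = "ABCDEF"


def income_elasticity(i1, q1, i2, q2):
    """(dQ/dI) * (I/Q) with the first point as the base (an assumption)."""
    return ((q2 - q1) / (i2 - i1)) * (i1 / q1)


def classify(eta):
    """Label the good from its income elasticity (eta = 1 is treated as a
    necessity here only to keep the sketch simple)."""
    if eta < 0:
        return "inferior"
    if eta > 1:
        return "luxury"
    return "necessity"


elasticities = [
    income_elasticity(incomes[k], quantities[k], incomes[k + 1], quantities[k + 1])
    for k in range(len(incomes) - 1)
]
labels = [classify(eta) for eta in elasticities]

for k, (eta, label) in enumerate(zip(elasticities, labels)):
    print(f"{points[k]} to {points[k + 1]}: eta_I = {eta:.3f} ({label})")
```

Under this convention the good is a luxury over the lowest income ranges, a necessity over the middle ranges, and an inferior good between points E and F, where quantity falls as income rises.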
It can be shown that if a consumer’s income elasticity of demand for a particular good is
greater than one, then with a rise in the consumer’s income, the proportion of income
spent on the good will increase. The opposite will take place if the ηI is less than one. If
ηI = 1, then with a rise in income, the proportion of income spent on the good will
remain the same.
Let us suppose that at income I, the individual consumes Qx units of the good at a price
of Px per unit and with a rise in income by ΔI, ceteris paribus, she consumes ΔQx more
units of X.
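The proportion of income spent on X is Pr1 = PxQx/I before the rise in income and
Pr2 = Px(Qx + ΔQx)/(I + ΔI) after it. Dividing,
Pr2/Pr1 = {1 + (ΔQx/Qx)} / {1 + (ΔI/I)} = {1 + ηI(ΔI/I)} / {1 + (ΔI/I)}. (1.30)
From (1.30) it is clear that if ηI > 1, then Pr2/Pr1 > 1, i.e., the proportion of income spent
on good X will increase with a rise in income. Similarly, if ηI < 1, the proportion of
income spent on X will decrease and if ηI = 1, then Pr2 = Pr1, implying that the proportion
of income spent on X will remain the same with a rise in income.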
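This result can be illustrated with the first four points of Table 1.10. The sketch below is a rough check under stated assumptions: the price of the good is set to a hypothetical Re.1 per kg (it cancels out of the comparison), and the income elasticity is measured with the lower-income point as the base.

```python
incomes = [2000, 4000, 6000, 8000]     # Rs./month, points A to D of Table 1.10
quantities = [100, 300, 500, 650]      # kg/month
PRICE = 1.0                            # hypothetical price; it cancels out below


def income_elasticity(i1, q1, i2, q2):
    """(dQ/dI) * (I/Q) with the first point as the base (an assumption)."""
    return ((q2 - q1) / (i2 - i1)) * (i1 / q1)


def income_share(income, quantity, price=PRICE):
    """Proportion of income spent on the good."""
    return price * quantity / income


comparisons = []
for k in range(len(incomes) - 1):
    eta = income_elasticity(incomes[k], quantities[k], incomes[k + 1], quantities[k + 1])
    share_before = income_share(incomes[k], quantities[k])
    share_after = income_share(incomes[k + 1], quantities[k + 1])
    comparisons.append((eta, share_after > share_before))
    print(f"eta_I = {eta:.3f}, share {share_before:.4f} -> {share_after:.4f}")
```

Wherever the computed elasticity exceeds one the share of income spent on the good rises, and where it is below one the share falls.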
A well known result involving the income elasticity of demand is that the weighted sum
of all income elasticities is equal to unity. It can be proved as follows.
Let the consumer spend all her income, I, on n goods whose prices are given as P1, P2,
..., Pn. Let Q1, Q2,...,Qn be the quantities consumed of the n goods respectively. Then the
budget constraint can be written as:
I = P1Q1+P2Q2+...+PnQn (i)
Let there be a change in the income of the consumer, prices of all the goods remaining
constant. The effect can be shown by partially differentiating (i) with respect to I.
Thus, (∂I/∂I) = P1(∂Q1/∂I) + P2(∂Q2/∂I) + ... + Pn(∂Qn/∂I).
Multiplying and dividing the jth term on the R.H.S. by (I/Qj), we get
1 = [(∂Q1/∂I)(I/Q1)](P1Q1/I) + [(∂Q2/∂I)(I/Q2)](P2Q2/I) + … + [(∂Qn/∂I)(I/Qn)](PnQn/I)
Or, 1 = ηI1(E1/I) + ηI2(E2/I) + ... + ηIn(En/I),
where ηIj is the income elasticity of the jth good and Ej is the total expenditure on the jth
good, j = 1,2,...,n.
Or, 1 = λ1ηI1 + λ2ηI2 + ... + λnηIn,
where λj = Ej/I is the proportion of income spent on the jth good. It is clear that
λ1+λ2+...+λn = 1.
Hence, it is proved that the weighted sum of income elasticities is equal to one, where
the proportions of income spent on the respective goods serve as the weights.
An interesting implication of this result is that in a world of n commodities that a
consumer consumes, there has to be at least one normal good. In other words, all goods
cannot be inferior goods.
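The result can be checked numerically for a simple two-good case. Everything specific in the sketch below is an assumption made for illustration: the prices, the income level, and the demand function for good 1 (Q1 = 2√I) are hypothetical, and good 2 simply absorbs whatever income is left so that the budget constraint holds. The income elasticities are approximated by finite differences.

```python
import math

P1, P2 = 10.0, 5.0          # hypothetical prices of goods 1 and 2


def demand(income):
    """Quantities of the two goods at a given income (illustrative only)."""
    q1 = 2.0 * math.sqrt(income)        # assumed demand for good 1
    q2 = (income - P1 * q1) / P2        # the rest of income is spent on good 2
    return q1, q2


def income_elasticities(income, step=1e-3):
    """Income elasticity of each good, using a central difference for dQ/dI."""
    base = demand(income)
    up = demand(income + step)
    down = demand(income - step)
    return [((u - d) / (2 * step)) * (income / b) for u, d, b in zip(up, down, base)]


income = 10_000.0
q1, q2 = demand(income)
shares = [P1 * q1 / income, P2 * q2 / income]     # the weights (lambda_j)
etas = income_elasticities(income)
weighted_sum = sum(s * e for s, e in zip(shares, etas))

print("shares:", shares)
print("income elasticities:", [round(e, 4) for e in etas])
print("weighted sum:", round(weighted_sum, 6))
```

In this example good 1 has an income elasticity below one, so good 2 must have an income elasticity above one for the weighted sum to equal unity.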
Elasticity of supply measures the responsiveness of the quantity supplied of a good with
respect to a change in its own price with everything else held constant. Algebraically,
elasticity of supply (ηs) is the proportionate (percentage) change in the quantity supplied
of a good divided by the proportionate (percentage) change in the price of the good.
Thus,
ηs = (ΔQx/Qx) / (ΔPx/Px) = (ΔQx/ΔPx) × (Px/Qx). (1.31)
For infinitesimally small changes in price and quantity, the formula for elasticity is given
by
ηs = (dQx/dPx) × (Px/Qx), (1.32)
where dQx/dPx is the inverse of the slope of the supply curve at the point where price is
given by Px and quantity supplied by Qx.
Thus, equation (1.32) gives us the elasticity at a point on the supply curve. Just like the
elasticity of demand, the arc elasticity of supply gives us the elasticity between two
points on the supply curve and can be given by slightly modifying the formula (1.31).
Thus,
Arc elasticity of supply = (ΔQx/ΔPx) × {(Px1 + Px2) / (Qx1 + Qx2)},
where Px1 & Qx1 are the original price and quantity and Px2 & Qx2 are the new price and
quantity. If the supply curve of a good is upward sloping, the coefficient of elasticity of
supply will have a positive sign. The supply curve is said to be elastic if ηs > 1, inelastic if
ηs < 1, and unitary elastic if ηs = 1. It should be noted that for a positively sloped
supply curve, an increase in the price will always lead to an increase in the total revenue
of the seller and vice-versa.
The elasticity of supply depends on the period of time allowed for adjustment. As
adjustment in supply is easier in the long run than in the short run, so supply of a good
will be more elastic in the long run than in the short run.
1.2.2 (a) When the supply curve is a positively sloped straight line crossing the price
axis, then all along the line, ηs > 1.
In fig. 1.22, SSx is a linear supply curve. Point A on the supply curve corresponds to
price P1 and quantity Q1. Elasticity of supply at point A on supply curve is given by
Similarly, we can prove that at any other point on the supply curve ηs > 1.
1.2.2 (b) When the supply curve is a positively sloped straight line passing through the
origin, then all along the line, ηs = 1.
In fig. 1.23 OSx is the supply curve. Elasticity of supply at any point A on the curve is
given by
1.2.2 (c) When the supply curve is a positively sloped straight line crossing the
quantity axis then ηs < 1.
In fig. 1.24, elasticity of supply at point A on the supply curve SSx is given by
1.2.2 (d) When the supply curve is curvilinear the elasticity of supply at any point on
the curve can be determined by drawing a tangent to the curve at that point and
proceeding in the manner as we had done for a linear supply curve. If the tangent
crosses the price axis, then ηs > 1, if it crosses the origin, then ηs = 1 and if it crosses
the quantity axis then ηs < 1. ηs at point A1 on the curve given in fig. 1.25 is greater
than one, at point A2, ηs is equal to one and at point A3 it is less than one.
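The three cases (a) to (c) can also be seen algebraically. As a sketch, assume a straight-line supply curve P = c + dQ with d > 0; the symbols c and d and the numbers used are introduced here for illustration and do not appear in the figures. Then ηs = (dQ/dP)(P/Q) = (1/d)(c + dQ)/Q = 1 + c/(dQ), so the elasticity exceeds one when the line cuts the price axis (c > 0), equals one when it passes through the origin (c = 0), and is below one when it cuts the quantity axis (c < 0).

```python
def supply_elasticity(c, d, q):
    """Point elasticity of supply on the line P = c + d*Q (d > 0) at quantity q > 0."""
    p = c + d * q
    return (1 / d) * (p / q)


SLOPE = 0.5
QUANTITY = 10

crossing_price_axis = supply_elasticity(c=2, d=SLOPE, q=QUANTITY)      # case (a)
through_origin = supply_elasticity(c=0, d=SLOPE, q=QUANTITY)           # case (b)
crossing_quantity_axis = supply_elasticity(c=-2, d=SLOPE, q=QUANTITY)  # case (c)

print(crossing_price_axis, through_origin, crossing_quantity_axis)
```

For a curvilinear supply curve, case (d), the same function can be applied to the tangent at the chosen point, with c and d taken as the intercept and slope of that tangent.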
Summary
1. Demand-supply analysis is an economic model of price determination in a
market.
2. The demand schedule, depicted graphically as the demand curve, represents
the amount of some good that buyers are willing and able to purchase at various
prices, assuming all determinants of demand other than the price of the good in
question, such as income, personal tastes, the price of substitute goods, and the
price of complementary goods, remain the same. Following the law of demand,
the demand curve is almost always represented as downward-sloping, meaning
that as price decreases, consumers will buy more of the good.
3. Price of the good concerned remaining the same a change in the price of
substitutes and/complements and a change in the consumer’s income leads to a
shift in the demand curve.
4. The supply schedule, depicted graphically as the supply curve, represents
the amount of some good that producers are willing and able to produce and sell
at various prices, ceteris paribus, that is, assuming all determinants of supply
other than the price of the good in question, such as technology and the prices of
factors of production, remain the same.
5. When factors other than own price of the good, such as prices of the inputs
and/technology change, the supply curve shifts.
6. Equilibrium is arrived in a competitive market at that price which equates the
quantity demanded of a good to quantity supplied. It occurs at the intersection of
the demand and supply curves.
7. Elasticity of demand measures the responsiveness of the quantity demanded
of a good to a change in the factors which affect demand.
8. Own price elasticity of demand is given by the percentage change in the
quantity demanded of a good divided by the percentage change in its price. Arc
elasticity measures elasticity between two points on the demand curve and Point
elasticity measures elasticity at a specific point on the demand curve. Price
elasticity of demand for a good will be higher the larger the number of substitutes
available and the longer the time allowed for demand to adjust to a change in its
price.
9. Cross price elasticity is given by the percentage change in the demand of a
good divided by the percentage change in the price of some other good. In case
the two goods are substitutes cross price elasticity is greater than zero and if they
are complements it is less than zero.
10. Income elasticity is given by the percentage change in the quantity bought of a
good divided by the percentage change in the consumer’s income. For normal
goods the coefficient of income elasticity is positive and for inferior goods it is
negative.
11. Elasticity of supply is given by the percentage change in the quantity supplied
of a good divided by the percentage change in the price of the good.
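The elasticity definitions in points 8 and 11 can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the demand curve Q = 100 - 5P is hypothetical, and the arc elasticity uses the midpoint (average price and quantity) convention, which is one common way of measuring elasticity between two points.

```python
def point_elasticity(dq_dp, p, q):
    """Own price elasticity at a single point: (dQ/dP) * (P / Q)."""
    return dq_dp * p / q


def arc_elasticity(p1, q1, p2, q2):
    """Elasticity between two points, using average price and quantity as bases."""
    avg_p = (p1 + p2) / 2
    avg_q = (q1 + q2) / 2
    return ((q2 - q1) / avg_q) / ((p2 - p1) / avg_p)


def quantity_demanded(p):
    """Hypothetical linear demand curve: Q = 100 - 5P."""
    return 100 - 5 * p


# Point elasticity at P = 10 (the slope dQ/dP of this curve is -5 everywhere).
e_point = point_elasticity(-5, 10, quantity_demanded(10))

# Arc elasticity between P = 8 and P = 12.
e_arc = arc_elasticity(8, quantity_demanded(8), 12, quantity_demanded(12))

print(f"point elasticity at P = 10: {e_point:.2f}")
print(f"arc elasticity between P = 8 and P = 12: {e_arc:.2f}")
```

The same two functions apply to supply, cross price and income elasticity once the relevant quantity and the relevant price or income are substituted.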
Exercise
1. Use supply and demand curves to illustrate how each of the following events would
affect the price of butter and the quantity of butter bought and sold: (a) an increase in
the price of its substitute, margarine; (b) an increase in the price of milk; (c) a decrease
in average income levels.
2. Suppose the demand curve for a good is given by Qx = 10 – 2Px, where Px is the
price of the good. Determine the own price elasticity of demand for good X at Px = Rs. 1 and
Px = Rs. 2.
a. The individual’s demand curve for a good represents a maximum boundary of the
individual’s intentions
b. A producer’s positively sloped supply curve for a good represents in one sense a
maximum and in another sense a minimum boundary of the producer’s
intentions.
6. What will be the shape of a unitary elastic demand curve and its corresponding
marginal revenue curve? Explain giving reasons.
9. Neena consumes two goods X and Y with a fixed income; if her cross elasticity of
demand for X with respect to the price of Y is greater than zero, then we can infer that her
demand for Y is less elastic. True or false, and why?
10. If the inverse demand function is p = a – bq, where a and b are positive
constants, what is the price elasticity at q = 0, at q = a/b and at q = a/(2b)?
11. The price elasticity of demand for a given commodity is alleged to be greater:
iv. At higher prices rather than lower prices. Explain the supporting argument in
each case and analyse its validity.
12. How should a linear downward sloping demand curve shift if elasticity at each
price is to remain the same? Explain using a diagram.
13. A consumer spends all her income on two goods X and Y. If a 50% increase in the
price of good X does not change the amount consumed of Y, what is the price elasticity
of demand for good X?
14. Suppose a consumer spends her entire income (I) on the purchase of ‘n’ goods whose
quantities are denoted as q1, q2, ..., qn and prices as p1, p2, ..., pn. Prove that if the
own-price elasticity of demand for a particular good exceeds one, then in some average
sense, the other goods are substitutes for the given good and if the own-price elasticity
is less than one, then in that same sense, the other goods are complements of the given
good.
15. Using calculus, prove that the total amount spent on a good varies directly with
the price when elasticity is less than one, and inversely with the price when
elasticity is greater than one.
16. Suppose that when Sachin’s income increases (prices of all goods unchanged), he
devotes the entire increment in income to increasing his purchase of food. Is Sachin’s
income elasticity of demand for food greater than, equal to, or less than one?
17. What is the price elasticity of demand supposed to measure? State the point
elasticity and arc elasticity formulas for measuring elasticity of demand. When should
each be used?
18. Compare the elasticities of two intersecting straight line demand curves at the point
of intersection.
19. Prove that of two parallel straight line demand curves, the one farther to the right
has a smaller price elasticity at each price.
20. Does a flatter demand curve necessarily signify a greater elasticity than a steeper
one?
21. Prove that the proportion of income spent on a good rises with a rise in income if
the income elasticity of demand for the good is greater than one.
a. When the supply curve of a good is an upward sloping straight line passing
through the origin, then all along the supply curve, elasticity of supply is equal to
one.
b. When the supply curve is an upward sloping straight line crossing the price axis, then
elasticity of supply is greater than one at every point on the curve.
c. When the supply curve is a positively sloped straight line intersecting the
horizontal axis, then elasticity of supply is less than one at every point on the
supply curve.
Glossary
Complements - Two goods which are consumed together and for which the quantity
demanded of one is negatively related to the price of the other.
Demand - The quantity of a good or service which an individual or group desires at the
ruling price.
Demand function - A functional relation between quantity demanded and all of the
variables that influence it.
Demand schedule - A numerical tabulation showing the quantities that are demanded
at various alternative prices.
Substitutes - Two goods are substitutes of each other if they satisfy essentially the
same want of the consumer such that the quantity demanded of one is positively related
to the price of the other.
Supply - The relation between the quantity of some commodity that producers are
willing to produce and sell per period of time and the price of that commodity, ceteris
paribus.
Supply function - A mathematical relation between the quantity supplied and all the
variables that influence it.
The Theory Of Consumer Choice
Table of Contents
Learning Outcomes
Introduction
The Budget Constraint
Slope of the budget constraint
Consumer Preferences and Indifference Curves
Indifference Curves: Properties
Types of Indifference Curves and their shapes
Optimization
Changes in Income and Consumer’s Choices
Changes in Prices and Consumer Choice
Income and Substitution effects
Equivalent and Compensating Variation
Demand Curve: Derivation
Application of The Theory of Consumer Choice
Slope of the demand curve: case of a Giffen good
Wages and Labor Supply
Interest Rates and Household Savings
Conclusion
Summary
Exercises
Glossary
References
Web-links
Learning Outcomes
This chapter aims to give the reader a deep insight into the Theory of Consumer Choice.
The lesson deals with questions like “How does a consumer decide what to buy?”, “What
trade-offs does he face while making such decisions?” and “How do the decisions change
with changes in factors like prices, incomes and interest rates?”. After reading the chapter,
the reader should be able to understand the concepts of affordability and the budget constraint,
indifference curves and how they depict consumer preferences, the impact of changes in
income and prices on the consumer’s choice, and income and substitution effects. The chapter
ends with the derivation of the demand curve and a few applications of the Theory of Consumer
Choice. The practice questions at the end of the lesson will help in developing a better
understanding of the concepts discussed in the lesson.
Introduction
The theory of demand has its foundations in the theory of consumer choice. Analysis of
consumer behavior is a prerequisite for dealing with the theory of demand. The Theory of
Consumer Choice relies on the assumption that the consumer is rational: he knows
his income, the commodities available and their prices, and uses this knowledge to
decide what to buy. The trade-offs faced by consumers while making a choice
play an important role in the theory of consumer choice. How much to spend on different
commodities, given income and prices, how much time to devote to leisure and
work, and whether to consume more in the present or to save more for the future are a few
important questions that a consumer encounters in his day to day life. In due course,
we will see how the theory of consumer choice addresses these questions.
A consumer would prefer a greater quantity or better quality of the goods he
consumes. However, his income limits the amount of money he can spend on
consumption of those goods. It is important to understand this constraint. To take a simple
example, let’s study the case of a consumer who consumes only two commodities: burgers
and milkshake. Suppose that the consumer earns a monthly income of Rs.1000, the price of
a burger is Rs.20 and that of a glass of milkshake is Rs.10. Table 1 lists several
combinations of milkshake and burgers that the consumer can choose from, given his income
and the prices of the two goods.
Table 1: Combinations of burgers and milkshake that the consumer can afford to consume
Glasses of   Number of   Spending on       Spending on     Total spending
milkshake    burgers     milkshake (Rs.)   burgers (Rs.)   (Rs.)
0            50          0                 1000            1000
10           45          100               900             1000
20           40          200               800             1000
30           35          300               700             1000
40           30          400               600             1000
50           25          500               500             1000
60           20          600               400             1000
70           15          700               300             1000
80           10          800               200             1000
90           5           900               100             1000
100          0           1000              0               1000
The first row in table 1 shows that if all the income is spent on burgers, the consumer will
be able to consume 50 burgers but no milkshake. If instead he spends the entire income
on milkshake, he will be able to consume 100 glasses of milkshake but no burgers. Figure
1 depicts the consumer’s budget constraint. The vertical axis plots glasses of milkshake while
the horizontal axis plots the number of burgers. At point A the consumer spends all his
income on burgers, while at point B he consumes 100 glasses of
milkshake but no burgers. At point C the consumer spends equal amounts of income on burgers
and milkshake (25 burgers and 50 glasses). The downward sloping line BCA shows the trade-off the consumer faces in
consuming burgers and milkshake, given income and prices. Consuming more burgers
leaves less money with the consumer to buy milkshake. Hence, as the consumption of one
commodity rises, the consumption of the other commodity has to fall, if income and
the prices of the commodities are kept fixed.
The slope of the budget constraint measures the rate at which the consumer can trade one good for
the other. The slope between any two points is calculated as the ratio of the change in the vertical
distance to the change in the horizontal distance. For instance, between points C and A in figure
1, the vertical distance is 50 glasses of milkshake and the horizontal distance
is 25 burgers, so the slope is 2 glasses of milkshake per burger in absolute value. The slope of the budget
constraint equals the ratio of the prices of the two commodities. Since the price of a
burger is Rs.20 and the price of a glass of milkshake is Rs.10, the opportunity cost of a
burger is 2 glasses of milkshake. This is the trade-off that the
market offers the consumer: he can trade 2 glasses of milkshake for a burger.
Since the budget constraint is downward sloping, the slope itself is a negative
number (-2).
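The arithmetic behind Table 1 and the slope can be reproduced with a few lines of code. The sketch below uses the income and prices from the example; the function and variable names are illustrative only.

```python
INCOME = 1000          # monthly income in Rs.
PRICE_BURGER = 20      # Rs. per burger
PRICE_MILKSHAKE = 10   # Rs. per glass of milkshake


def burgers_affordable(milkshakes, income=INCOME,
                       price_burger=PRICE_BURGER,
                       price_milkshake=PRICE_MILKSHAKE):
    """Burgers the consumer can buy after paying for the given glasses of milkshake."""
    return (income - price_milkshake * milkshakes) / price_burger


def budget_slope(price_burger=PRICE_BURGER, price_milkshake=PRICE_MILKSHAKE):
    """Slope with milkshake on the vertical axis and burgers on the horizontal axis."""
    return -price_burger / price_milkshake


# Reproduce the bundles on the budget line in steps of 10 glasses.
for milkshakes in range(0, 101, 10):
    burgers = burgers_affordable(milkshakes)
    total = PRICE_MILKSHAKE * milkshakes + PRICE_BURGER * burgers
    print(f"{milkshakes:>3} glasses, {burgers:>4.0f} burgers, total spending Rs.{total:.0f}")

print("slope of the budget constraint:", budget_slope())
```

Every bundle printed exhausts the income of Rs.1000, and the slope equals the price ratio with a negative sign.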
1.) Higher indifference curves carry a greater level of satisfaction than lower
ones: The consumer’s preference for greater quantities is reflected in the
indifference curve approach as well. Higher indifference curves contain bundles with
larger quantities of goods than lower ones, so the consumer prefers higher
indifference curves to lower ones.
2.) Indifference curves slope downwards: When a consumer likes both
goods, a rise in the quantity of one good must be matched by a fall in the quantity of the
other for the consumer to stay at the same level of satisfaction. This is what makes
indifference curves slope downwards.
3.) Indifference curves do not intersect: This property is best illustrated through a
graph. In figure 3, suppose points A and B lie on one indifference curve
and points B and C lie on another indifference curve that crosses it at B. The consumer
is then equally satisfied at points A and B, and equally satisfied at points B and C.
This would imply that the consumer is indifferent between points A and C, which is
not possible because point C has a greater amount of both goods. We reach a
contradiction, so indifference curves cannot cross.
4.) Indifference curves are bowed inwards: The slope of an indifference
curve equals the marginal rate of substitution, which depends on the amounts of
the two goods that the consumer currently consumes. People are more willing to give
up a commodity which they possess in large quantity and less willing
to give up one which they hold in meagre amounts. If the consumer has many
glasses of milkshake and few burgers, he will be willing to give up more
glasses of milkshake for each additional
burger. However, as he acquires more and more burgers, the number of
glasses of milkshake that he gives up for every burger falls. This explains
why indifference curves are bowed inwards. As figure 4 illustrates, at point A the
consumer has a lot of milkshake but few burgers, so he would have to be
given many glasses of milkshake to make him give up
one burger. At point B, on the other hand, the consumer has a lot of burgers but little
milkshake, so he will be willing to give up a burger for a few glasses of
milkshake. The marginal rate of substitution at point A is 5 glasses of milkshake per
burger, while the marginal rate of substitution at point B is 1 glass of milkshake per
burger.
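The falling marginal rate of substitution can be seen in a small numerical sketch. It assumes a hypothetical utility function U = burgers x milkshake, which is not taken from the text; for this function the marginal rate of substitution (glasses of milkshake per burger) is simply milkshake divided by burgers. The two bundles are chosen to lie on the same indifference curve.

```python
import math


def utility(burgers, milkshakes):
    """Hypothetical utility function U = burgers * milkshakes."""
    return burgers * milkshakes


def mrs(burgers, milkshakes):
    """Glasses of milkshake the consumer would give up for one more burger.

    For U = b * m the ratio of marginal utilities is MU_b / MU_m = m / b.
    """
    return milkshakes / burgers


# Point A: many glasses of milkshake, few burgers.
point_a = (10.0, 50.0)
# Point B: same utility level, but burgers and milkshake in equal amounts.
side = math.sqrt(utility(*point_a))
point_b = (side, side)

print("MRS at A:", mrs(*point_a))
print("MRS at B:", mrs(*point_b))
```

Moving along the curve from A towards B, the consumer holds more burgers and less milkshake, and the rate at which he is willing to trade milkshake for burgers falls from 5 to 1.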
1.) Perfect Substitutes: Perfect substitutes are shown by straight line indifference
curves. The slope along these straight lines stays constant, which means that the rate
at which one good can be exchanged for the other is constant. For instance, a pack of
20 black paper clips can be perfectly substituted for a pack of 20 green paper clips by
a person who does not have any color preference.
2.) Perfect Complements: When the two goods are perfect complements, the
indifference curves representing such preferences are L-shaped or right angled. A
good example of perfect complements is a pair of shoes. A bundle of 5 left shoes and 7
right shoes yields only 5 pairs of shoes.
3.) Good with zero utility: If the consumer gets zero satisfaction out of one good, he
will consume the other good, which gives him positive utility, and will not be willing
to sacrifice any amount of it for the good that offers no satisfaction. For
example, eggs offer no satisfaction to a vegetarian.
4.) A Necessity: Certain commodities are absolute necessities, and there might
be a minimum quantity of such goods which is necessary for living. The indifference
curve in such a case becomes steeper as the consumption of the absolute necessity
falls towards the minimum quantity needed for sustenance.
Figure 8: A Necessity
5.) Good that offers negative utility beyond a particular level of consumption: Beyond a
particular point of consumption, if a consumer consumes or is forced to consume
more of a particular good, he starts getting negative utility out of further
consumption. In such a case the indifference curve becomes positively sloped
beyond that point of consumption. If the extra units can be disposed of without
incurring any costs, the indifference curves become horizontal.
Figure 9: Good that Offers Negative Utility Beyond a Particular Level of Consumption
6.) A good that is not consumed: When a consumer in equilibrium does not
consume any amount of one good, it is called a corner solution. In this case the
indifference curve cuts the axis of the good which is not consumed. The
indifference curve is flatter than the budget line.
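The first two cases in this list can be written as simple utility functions. The sketch below is illustrative only; the function names are assumptions, and the numbers follow the shoe and paper clip examples in the text.

```python
def pairs_of_shoes(left_shoes, right_shoes):
    """Perfect complements: only complete pairs matter, so utility is the minimum."""
    return min(left_shoes, right_shoes)


def usable_clips(black_clips, green_clips):
    """Perfect substitutes (no color preference): only the total number matters."""
    return black_clips + green_clips


# Extra right shoes add nothing without matching left shoes.
print(pairs_of_shoes(5, 7))   # 5 pairs
print(pairs_of_shoes(5, 9))   # still 5 pairs

# One pack of black clips is exactly as good as one pack of green clips.
print(usable_clips(20, 0) == usable_clips(0, 20))
```

The minimum function produces the L-shaped indifference curves, while the sum produces straight lines with a constant slope.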
Optimization
Optimization involves two important components: the consumer’s budget
constraint and the consumer’s preferences. The consumer’s optimum can be
explained graphically. As shown in figure 11, the optimum is reached where the budget
constraint is tangent to an indifference curve, i.e. at point C. At point B, the consumer is on
a lower indifference curve, although given the budget constraint he can afford to
move to a higher level of satisfaction. Point A is not affordable for the consumer. At the
optimum, the slope of the indifference curve is equal to the slope of the budget constraint,
i.e. the marginal rate of substitution is the same as the relative price. At this point the
market’s valuation of the goods is equal to the value that the consumer places on the two goods.
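The tangency condition can be verified for a specific case. The sketch below assumes a hypothetical Cobb-Douglas utility function with equal weights on the two goods, which is not given in the text, and uses the income and prices of the burger and milkshake example. For such preferences the consumer spends a fixed share of income on each good, and at the resulting bundle the marginal rate of substitution equals the price ratio.

```python
ALPHA = 0.5  # hypothetical share of income spent on burgers


def optimal_bundle(income, price_burger, price_milkshake, alpha=ALPHA):
    """Utility-maximizing bundle for U = b**alpha * m**(1 - alpha)."""
    burgers = alpha * income / price_burger
    milkshakes = (1 - alpha) * income / price_milkshake
    return burgers, milkshakes


def mrs(burgers, milkshakes, alpha=ALPHA):
    """Marginal rate of substitution MU_b / MU_m for the same utility function."""
    return (alpha / (1 - alpha)) * (milkshakes / burgers)


income, price_burger, price_milkshake = 1000, 20, 10
burgers, milkshakes = optimal_bundle(income, price_burger, price_milkshake)

print("optimal bundle:", burgers, "burgers and", milkshakes, "glasses of milkshake")
print("MRS at the optimum:", mrs(burgers, milkshakes))
print("price ratio:", price_burger / price_milkshake)
```

With a different utility function the optimal bundle would change, but at any interior optimum the equality between the marginal rate of substitution and the price ratio would still hold.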
A change in income has important effects on the consumer’s choice. When the income of
the consumer changes while the prices of the two goods stay the same, the slope of
the budget constraint does not change. Instead, the budget
constraint shifts outward or inward in a parallel fashion, depending on whether income rises or
falls. With a rise in income, the consumer can afford to reach a higher
indifference curve with a better consumption bundle. Depending on his
preferences, the consumer will consume at the point on the new budget constraint where
an indifference curve is tangent to it. If the consumption of a good rises with a rise in
income, it is called a normal good. If the consumer decreases his consumption of
a good as income rises, the good is said to be an inferior good. We can illustrate this
graphically. Graph A in figure 12 shows that as income rises the consumer raises his
consumption of both milkshake and burgers, so both these goods are normal. In
graph B, however, the consumption of burgers rises while that of milkshake falls, showing that
milkshake is an inferior good.
Now we consider the impact of a change in price. Suppose the price of a burger falls from
Rs.20 to Rs.10, while the price of a glass of milkshake and the income of the consumer stay the
same. The slope of the budget constraint falls from 2 glasses of milkshake per burger to 1
glass of milkshake per burger, so the budget constraint pivots and becomes relatively
flatter. If the consumer spends all his money on burgers he will now be able to consume 100
burgers. The new budget constraint is AD. The new point of consumption again
depends on the consumer’s preferences. As figure 13 shows, at the new optimum the
consumer has more burgers and less milkshake.
The price effect can be separated into an income effect and a substitution effect. Suppose that
after the price change the consumer’s income is adjusted so that he stays at the same level
of satisfaction (the original indifference curve) as before the price change, but faces the
new relative prices. His response in terms of quantity demanded is termed the substitution
effect. If the money income is then restored and the consumer moves to a higher or a lower
indifference curve, that response is called the income effect. In figure 14, graph A, we can see
that as the price of a burger falls the consumer moves from point A to point C. This change
can be broken down into two steps. In the first step, the consumer moves from point A to
point B, which lies on the same indifference curve as point A but reflects the new set of
relative prices; this is the substitution effect. In the second step, the consumer moves from
point B to point C on the new indifference curve, still facing the new set of relative prices;
this is the income effect. The substitution effect is therefore shown by rotating the budget
constraint around the original indifference curve, while the income effect is shown by a
parallel shift in the budget constraint. The movement from point A to point B involves only a
change in relative prices, with no change in the level of satisfaction. The movement
from point B to point C involves a change in the level of satisfaction and no change in
relative prices.
The substitution effect always works in the same direction: if the relative price of a
commodity falls, more of that commodity is consumed. The income effect, however, can work
in either direction: more or less of a good can be consumed when its
relative price falls, depending on whether the good is normal or inferior. If the good is
normal, the increase in real income due to a fall in price leads to an increase in the
consumption of that good. The income effect and substitution effect then work in the same
direction, and the demand curve is negatively sloped.
If, on the other hand, the good is inferior, less of it is consumed when real income rises
due to the price fall. The substitution effect still suggests that the quantity
consumed of the commodity should rise when its relative price falls, so the end
result depends on the relative strength of the two effects. If the substitution effect
outweighs the negative income effect, the quantity demanded increases when the
relative price of the good falls: the good is inferior, yet its
demand curve still has a negative slope. If the negative income effect outweighs the
substitution effect, the demand curve is positively sloped. This is the case of a
Giffen good. Giffen goods are inferior goods whose negative income effect
is strong enough to overpower the substitution effect.
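The two-step decomposition can be worked through numerically. The sketch below is a rough illustration under stated assumptions: it uses a hypothetical Cobb-Douglas utility function with equal weights (not given in the text) and the fall in the burger price from Rs.20 to Rs.10 with income of Rs.1000 and a milkshake price of Rs.10. The compensated bundle corresponds to point B: it is the cheapest bundle at the new prices that keeps the consumer on the original indifference curve.

```python
ALPHA = 0.5  # hypothetical share of income spent on burgers


def demand(income, p_burger, p_milkshake, alpha=ALPHA):
    """Utility-maximizing bundle for U = b**alpha * m**(1 - alpha)."""
    return alpha * income / p_burger, (1 - alpha) * income / p_milkshake


def utility(burgers, milkshakes, alpha=ALPHA):
    return burgers ** alpha * milkshakes ** (1 - alpha)


def expenditure(u, p_burger, p_milkshake, alpha=ALPHA):
    """Minimum income needed to reach utility u at the given prices."""
    return u * (p_burger / alpha) ** alpha * (p_milkshake / (1 - alpha)) ** (1 - alpha)


income, p_milkshake = 1000, 10
p_old, p_new = 20, 10

b_a, m_a = demand(income, p_old, p_milkshake)            # original optimum (point A)
u_a = utility(b_a, m_a)
compensated_income = expenditure(u_a, p_new, p_milkshake)
b_b, m_b = demand(compensated_income, p_new, p_milkshake)  # compensated optimum (point B)
b_c, m_c = demand(income, p_new, p_milkshake)            # new optimum (point C)

substitution_effect = b_b - b_a
income_effect = b_c - b_b

print(f"burgers at A, B, C: {b_a:.2f}, {b_b:.2f}, {b_c:.2f}")
print(f"substitution effect: {substitution_effect:.2f} burgers")
print(f"income effect: {income_effect:.2f} burgers")
```

Both effects are positive here because burgers are a normal good under these assumed preferences, so the total rise in burger consumption is the sum of the two.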
Two very important concepts attached to the income effect of a price change are
equivalent variation and compensating variation.
Equivalent Variation: it is the amount of money income that, if given to the consumer instead
of the price change, would make him as satisfied as he is after the price change. This
can be shown graphically by shifting the original budget constraint in a parallel manner until
it touches the new indifference curve reached after the price change.
Compensating Variation: it is the amount of income that needs to be taken away from a
consumer when the price of a good falls to make him return to the level of satisfaction
he had before the price change, i.e. the original indifference curve. In the graph this
magnitude can be shown by the vertical distance OL.
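Both measures can be computed for a concrete case. The sketch below again assumes a hypothetical Cobb-Douglas utility function with equal weights (an assumption, not something stated in the text) and a fall in the burger price from Rs.20 to Rs.10, with income of Rs.1000 and a milkshake price of Rs.10.

```python
ALPHA = 0.5  # hypothetical share of income spent on burgers


def indirect_utility(income, p_burger, p_milkshake, alpha=ALPHA):
    """Utility reached at the optimum for U = b**alpha * m**(1 - alpha)."""
    burgers = alpha * income / p_burger
    milkshakes = (1 - alpha) * income / p_milkshake
    return burgers ** alpha * milkshakes ** (1 - alpha)


def expenditure(u, p_burger, p_milkshake, alpha=ALPHA):
    """Minimum income needed to reach utility u at the given prices."""
    return u * (p_burger / alpha) ** alpha * (p_milkshake / (1 - alpha)) ** (1 - alpha)


income, p_milkshake = 1000, 10
p_old, p_new = 20, 10

u_before = indirect_utility(income, p_old, p_milkshake)
u_after = indirect_utility(income, p_new, p_milkshake)

# Compensating variation: income that can be taken away at the NEW prices
# while leaving the consumer on the original indifference curve.
compensating_variation = income - expenditure(u_before, p_new, p_milkshake)

# Equivalent variation: extra income needed at the OLD prices to reach
# the indifference curve attained after the price fall.
equivalent_variation = expenditure(u_after, p_old, p_milkshake) - income

print(f"compensating variation: Rs.{compensating_variation:.2f}")
print(f"equivalent variation:   Rs.{equivalent_variation:.2f}")
```

The two amounts differ because they are measured at different sets of prices: the compensating variation uses the new prices and the equivalent variation uses the old ones.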
The law of demand says that as the price of a good rises, the quantity demanded of it falls;
this is shown by a regular downward sloping demand curve. However, there are cases
where the law of demand is violated. In the case of Giffen goods, the demand curve is
upward sloping. Let’s take the example of a consumer in China who consumes rice and
chicken. As the graph shows, the consumer initially consumes at point C. Now
the price of rice rises, and the relevant budget constraint becomes DA. The consumer consumes at
the new point E, where he has more rice and less chicken. This happens because rice is
a Giffen good for this consumer. As the price of rice rises, the consumer becomes poorer in
real terms. Because rice is an inferior good for him, the income effect leads him to buy more rice and less
chicken, while the substitution effect directs him to buy less rice and
more chicken. Since the income effect outweighs the substitution effect, the consumer
ends up buying more rice and less chicken.
The notion of Giffen goods was first introduced by Alfred Marshall in his book
Principles of Economics in the year 1895. The idea of Giffen goods is attributed
to Robert Giffen, who pointed out how a rise in the price of bread reduces the
real income of poor families. The marginal utility of money rises for these families in
such a manner that, rather than buying more of other foods, they end up consuming
more bread, which is still the cheapest of the available foods.
Though Giffen goods are rare in reality, Jensen and Miller found evidence of
Giffen behavior. In their study, they present data from a field experiment in which
they tried to gauge the response of poor households in China to changes
in the prices of staple food items. Evidence was found for Giffen behavior in
the case of rice and wheat when their prices were subsidized.
Source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2964162/
The theory of consumer choice can also be used to analyze labor supply decisions, i.e.
how much time should be allocated to work and how much to leisure. Let’s take the
example of Subhash, who works at an ice cream parlor. Subhash is awake for 120 hours in
a week. He can spend this time in leisure or he can work and earn a wage of Rs.100 per
hour of work. For every hour of work Subhash gets consumption worth Rs.100. One
hour of leisure means Subhash loses out on this consumption, so the opportunity cost of one
hour of leisure is Rs.100 worth of consumption. If he works for 120 hours in a week, he
earns Rs.12000 but enjoys no leisure, and if he doesn’t work at all, he earns nothing and
doesn’t consume anything but gets 120 hours of leisure. As shown in the graph, the
consumer can make an optimal choice consisting of work and leisure hours.
Now, suppose Subhash’s wage rises from Rs.100 per hour to Rs.150 per hour. There are
two possible outcomes. With the rise in the wage rate the budget constraint rotates from BA
to BC and becomes relatively steeper. With a higher wage rate, for every hour of leisure
foregone, the consumer enjoys higher consumption. The optimal choice depends on
Subhash’s preferences. With a rise in the wage rate, consumption will definitely rise, but what
happens to leisure depends on Subhash’s response. Subhash can respond to the rise in the
wage rate by enjoying either more leisure or less of it.
When the wage rate rises, the substitution effect says that since leisure is now relatively
expensive compared to consumption, Subhash should work more and hence consume
more. The income effect, on the other hand, says that as the wage rate rises the consumer becomes
better off, since he gets a higher wage for all the hours that he works. Assuming that
both leisure and consumption are normal goods, the income effect encourages Subhash
to work less and enjoy more leisure. If the substitution effect is stronger than the income
effect, the labor supply curve will be upward sloping, as shown in part a of figure 19, and if the
income effect is stronger than the substitution effect, the labor supply curve slopes backward, as
depicted in part b of the figure. In both cases, consumption rises. Hence the labor
supply curve can slope upward or backward.
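The budget constraint between leisure and consumption can be tabulated directly from the figures in the example. The sketch below only describes the constraint; it does not predict which point Subhash chooses, since that depends on his preferences. The function and variable names are illustrative.

```python
HOURS_AWAKE = 120  # hours available per week for work or leisure


def consumption(leisure_hours, wage, hours_available=HOURS_AWAKE):
    """Weekly consumption (Rs.) when all wage income is spent."""
    hours_worked = hours_available - leisure_hours
    return wage * hours_worked


for wage in (100, 150):
    print(f"wage = Rs.{wage} per hour")
    for leisure in (0, 40, 80, 120):
        print(f"  leisure {leisure:>3} h -> consumption Rs.{consumption(leisure, wage)}")
    # Each extra hour of leisure costs one hour's wage in foregone consumption.
    print(f"  opportunity cost of an hour of leisure: Rs.{wage}")
```

The constraint is anchored at 120 hours of leisure and zero consumption for both wage rates, and it becomes steeper when the wage rises.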
The savings of a household depend on the interest rate. A consumer’s lifetime can be divided
into two periods: the first is young age, when he works and earns, and the second is
old age, when he is retired. Suppose Samir, the consumer, earns Rs.100000 in his young age,
which he can use for present consumption and saving. In old age Samir will live on the
savings that he makes in his young age. If the interest rate is 10 percent, for every rupee
saved in young age, Samir gets to enjoy consumption worth Rs.1.10 in old age. Samir
has to find an optimal combination of consumption in young age and consumption in
old age, which is shown in the figure. If he consumes all of his income today, he will
be able to enjoy Rs.100000 worth of consumption but he will starve in his old age. On the
other hand, if he consumes nothing in the present he will be able to enjoy consumption
worth Rs.110000 in his old age.
Now let’s see what happens when the interest rate rises to 30 percent. Again the
substitution effect and income effect come into the picture. Consumption in old age
will certainly rise, but consumption in young age will depend on the income and
substitution effects. When the interest rate rises to 30 percent, the budget constraint rotates
outwards from BA to BC and becomes steeper. For every rupee saved in young
age, Samir gets Rs.1.30 worth of consumption in his old age. The substitution effect says
that as the interest rate rises, consumption in young age becomes costly relative to
consumption in old age, so it makes Samir save more in young age. The
income effect says that the rise in the interest rate makes Samir better off compared to his
original position, and if consumption in both periods is a normal good, Samir
would want to consume more in both periods, thereby saving less in young age.
Figure 21 shows both cases: part a, where consumption in young age falls as the
substitution effect overpowers the income effect, and part b, where consumption in
young age rises because the income effect is stronger than the substitution effect. Hence
the effect of changes in the interest rate on savings is ambiguous, which has implications for
tax policy.
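The two-period budget constraint can be sketched with the figures from Samir’s example. As with the labor supply case, the code only traces the constraint under each interest rate; the function name and the sample consumption levels are illustrative, and the actual choice depends on preferences.

```python
INCOME_YOUNG = 100_000  # Rs. earned in young age


def old_age_consumption(young_consumption, interest_rate, income=INCOME_YOUNG):
    """Consumption available in old age: savings plus the interest earned on them."""
    savings = income - young_consumption
    return savings * (1 + interest_rate)


for rate in (0.10, 0.30):
    print(f"interest rate = {rate:.0%}")
    for young in (0, 50_000, 100_000):
        old = old_age_consumption(young, rate)
        print(f"  consume Rs.{young} when young -> Rs.{old:.0f} when old")
```

The constraint is anchored at the point where all income is consumed in young age, and a higher interest rate raises the old-age consumption obtained from any given amount of saving.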
Conclusion
The Theory of Consumer Choice helps us understand the important factors that contribute
to decision making at the level of a consumer. Decisions to consume different quantities of
different goods, the allocation of time between work and leisure, inter-temporal choices and types of
preferences are explained with the help of consumer choice theory. Indifference curves,
the budget constraint and the tools of optimization together form the core of consumer theory. In
reality, not every consumer carries out these optimization exercises explicitly, but
every consumer knows that his choice is limited by his budget. Given the constraint of
income, the consumer has to reach the best possible combination of goods for which he has
a preference.
The Theory of Consumer Choice provides a useful framework for analyzing real world
choices and it has several applications, a few of which we have already gone through.
Summary
In this lesson we have learnt that:
Exercise
Review Questions
Q.1 Draw the budget constraint for Ravi, who has a weekly income of Rs.2000. He
consumes only two goods: Sandwiches and Pepsi. The price of a sandwich is Rs.20 and that
of a glass of Pepsi is Rs.10. Determine the slope of the budget line.
Q.3 How are the concepts of equivalent and compensating variation different?
Q.4 Draw the demand curve for a Giffen good, taking hypothetical values.
Q.6 Draw the set of indifference curves for left-hand and right-hand gloves.
Q.7 Can an indifference curve be upward sloping? What are the cases when it can slope
upwards?
[Hint: Think of a good (bad) that gives negative utility like pollution or a good that starts
giving out negative utility beyond a point of consumption]
Q.8 A consumer who consumes apples and oranges gets a salary raise. Illustrate how the
consumption choice changes with the rise in income when both apples and oranges are normal
goods.
Q.9 As an extension of question 8, what would happen if apples were an inferior good?
a. Are affordable, given the income and the prices of the goods.
b. Are unaffordable, given the income and the prices of the goods.
c. Give equal level of satisfaction.
d. Indicate the bundles of goods that exhaust the total given income.
a. Several combinations of goods that give the consumer an equal level of satisfaction
b. Various combinations of goods that the consumer can afford.
c. The level of income that the consumer can use to buy goods.
d. All of the above.
Q.3 If the prices of the two goods stay constant and the income increases:
Q.5 A normal good is the one, the demand for which rises:
a. The goods that can perfectly replace each other or can be used perfectly in place of
each other.
b. The goods that are used in conjunction with each other.
c. Both a. and b.
d. None of the above.
a. The slope of the budget constraint is equal to the slope of the indifference curve.
b. The slope of the budget constraint is greater than the slope of the indifference curve.
c. The slope of the budget constraint is less than the slope of the indifference curve.
d. None of the above.
Q.8 Suppose the price of burgers falls (burgers are a normal good). As a result, Sam’s real
income rises, which he uses to buy a greater number of burgers each week. This effect is
called:
a. Substitution effect.
b. Income effect.
c. Price effect.
d. None of the above.
Q.9 Samir consumes pizza and Pepsi (both goods are assumed to be normal). His
income and the price of Pepsi stay the same, while the price of pizza rises. An example of
the substitution effect in this case is:
a. Samir buys more Pepsi and less pizza as the price of pizza rises.
b. Samir buys more pizza when his income rises.
c. Samir buys more pizza as its price has risen.
d. None of the above.
Answer 2. An indifference curve is the locus of bundles of goods that give the same level
of satisfaction to the consumer.
Answer 3. A rise in income does not affect the slope of the budget constraint. A rise in
income, with prices fixed, lets the consumer consume more of both goods, so
there is a parallel outward shift in the budget constraint.
Answer 4. When the prices of both the goods rise by the same percentage, the relative
prices of the two goods stay constant. With the rise in prices, less of both the goods can be
afforded given the same level of income, hence the budget constraint shows a parallel shift
inwards.
Answer 5. Normal goods are the goods, the demand for which rises as the income rises.
Answer 6. Perfect substitutes are the goods that can replace each other perfectly to satisfy
the needs of the consumer.
Answer 7. The point of optimum occurs where the indifference curve is tangential to the
budget constraint. This is the point where the marginal rate of substitution is equal to the
relative prices.
Answer 8. The fall in the price of burgers, other things constant, raises the real income of the
consumer. The consumer's purchasing power rises, and he can use it to buy
more burgers (burgers being a normal good).
Answer 9. When the price of pizza rises relative to that of Pepsi, it becomes more expensive
to consume pizza relative to Pepsi, so the substitution effect will induce
the consumer to consume more Pepsi and less pizza.
Answer 10. Indifference curves cannot intersect each other, as an intersection would violate the
principle of transitivity.
Answer 2. Option b is incorrect: it is the budget line that shows which bundles of
consumption are affordable. Option c is wrong because the indifference curve shows the
preferences of the consumer, whereas the budget constraint depicts the income of the consumer.
Hence option d (All of the above) is also incorrect.
Answer 3. Options a and b are incorrect because there is no change in the prices of the
goods. Since the relative price of the two goods is the same, the slope will not change.
Hence the budget constraint cannot become flatter or steeper. Option d is also incorrect,
since a rise in income expands the consumption possibilities for both goods; the budget
constraint shifts inwards in a parallel manner only when income decreases.
Answer 4. When the prices of both goods rise by the same percentage, there is no
change in the slope of the budget constraint. The budget constraint cannot get flatter or
steeper, so options a and b are not possible. A rise in the prices of both goods means that,
given the income, the consumer can consume less of both commodities; hence
the budget constraint cannot shift outwards in a parallel manner. So option c is also wrong.
Answer 5. A normal good is defined with respect to income. It is a good, the
demand for which rises when a consumer’s income rises. So option a is incorrect, since the
demand for a normal good falls as income falls. Option c is wrong, since the demand for
normal goods is related to income. Option d is ruled out since the answer is option b.
Answer 6. Option b is incorrect: perfect complements are the goods that are used in
conjunction with each other. Options c and d are also incorrect.
Answer 7. At the optimum, the slopes of the indifference curve and the budget
constraint are equal. If these slopes are not equal, the consumer can still
do better by moving towards the optimum. At the optimum, the consumer attains the highest
satisfaction possible given his income, since the rate at which he is willing to trade one good
for the other is equal to the trade-off between the two goods set by the market, i.e. the ratio
of the prices of the two goods. So options b, c and d are incorrect.
Answer 8. Option a is incorrect since the substitution effect makes him consume more
burgers because, after the fall in price, burgers become relatively cheaper, not because his
purchasing power has risen. Option c is incorrect since the price effect is the overall effect of
the price change, which includes both the income and substitution effects; the result of the
price effect depends on the substitution and income effects. Option d is also ruled out. The correct
answer is the income effect.
Answer 9. Option b is incorrect since a rise in income does not involve any substitution
effect. Option c is incorrect because pizza is a normal good: when the price of pizza rises, both the
substitution and income effects make the consumer consume less of it, not more.
Option d is ruled out since option a is correct.
Answer 10. Options a, c and d are incorrect because they are properties of
indifference curves. Option b is correct because indifference curves do not intersect.
Glossary
Trade-off: To give up one thing for another, which might be of more or less equal value to
the decision maker.
Rational: A behavior based on logical reasoning, taking into account all the information
available, without any inconsistencies.
Slope: It is a measure that gives out the rate at which one variable changes for a unit
change in the other.
Utility: Utility is the satisfaction one gets out of consuming a good or service.
Substitutes: Goods that can replace each other to satisfy the needs of the consumer.
Complements: Goods which are usually used in conjunction with each other to satisfy
various uses.
Necessity: A good, the consumption of which is necessary for survival. The proportion of
expenditure on such goods falls with a rise in income.
Optimization: Making a choice which is cost effective or which delivers the best result, given
the constraints.
Inter-temporal: This term refers to the decisions made in the present and the future.
Decisions regarding consumption and savings made in the present have an impact on the
alternatives available in the future.
Normal good: These are the goods, the demand for which rises as the income rises.
Inferior good: The goods for which the demand falls as the income rises.
Giffen good: A rare type of good, for which the demand rises as the price rises.
Giffen goods violate the law of demand.
References
Mankiw, N.G. (2007), “Principles of Microeconomics”, Ch.21, Cengage Learning
Lipsey, Richard & Chrystal, Alec (2011), “Economics”, Ch.5, (PP.91, 97-99), Oxford
University Press
Web Link
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2964162/
Semester-I
Principle of Economics
Table of Contents
Learning Outcomes
Short Run Conditions and Long Run Directions
Maximizing Profits
Minimizing Losses
The Short Run Industry Supply Curve
Long Run Directions: A Review
Long Run Costs: Economies and Diseconomies of Scale
Long Run Adjustments to Short Run Conditions
Output Markets: A Final Word
Conclusion
Summary
Exercises
References
Appendix
Learning Outcome
In this chapter we learn how a firm in an industry takes decisions
about cost and output as it moves from the short run to the long run. We will learn whether the
equilibrium condition in the long run is the same as in the short run for all firms, i.e. those
firms that are earning profits, those that are trying to reduce losses and those that decide to
shut down in the short run. If not, what is the condition of equilibrium in a competitive
market in the long run? Can new firms enter or exit, and can existing firms expand, in the
long run? After reading this chapter one should be able to answer questions about
economies and diseconomies of scale and how they arise. What is the shape of the
industry supply curve in the short run and in the long run? What do you understand by returns to scale?
Maximizing Profits
Assuming the industry is competitive, we can say that profitable firms will try to
maximize their profits in the short run.
Profits = Total Revenue - Total Cost
where
Total Revenue = Price x Quantity
Total Cost = Total Fixed Cost + Total Variable Cost
Let us construct a hypothetical situation in which a firm purchases land for
production costing INR 1,000,000. Suppose the investor expects to earn a minimum return of
10% p.a. That means the firm has to pay INR 100,000 as the normal rate of return, which is
thus a part of fixed cost. Other fixed costs equal INR 10,000. Variable cost to the firm includes
wages and raw material amounting to INR 160,000. The firm sells its good at INR 50. As
the firm is competitive in nature, it can sell all it wants at the single price of INR
50. Assume the firm supplies 8,000 units of the good.
So, its Total Revenue is INR 400,000.
Total Cost = 160,000 + 110,000 = INR 270,000.
So, Total Profit = TR - TC = INR 130,000.
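The arithmetic of this example can be checked with a short script. This is only a minimal sketch: the variable names are illustrative, and the variable cost is taken as INR 160,000, the figure used in the total cost line.

```python
# Figures from the worked example (all amounts in INR).
price = 50                       # market price per unit
quantity = 8_000                 # units supplied
land_cost = 1_000_000            # purchase price of the land
normal_return_percent = 10       # minimum return expected by the investor
other_fixed_cost = 10_000
total_variable_cost = 160_000    # wages and raw material

# The normal return on the investment is treated as part of fixed cost.
normal_return = land_cost * normal_return_percent // 100
total_fixed_cost = normal_return + other_fixed_cost

total_revenue = price * quantity
total_cost = total_fixed_cost + total_variable_cost
profit = total_revenue - total_cost

print(f"TR = {total_revenue:,}  TC = {total_cost:,}  Profit = {profit:,}")
```

Keeping the normal return inside fixed cost means that a profit of zero in this script would correspond to the firm just earning its normal rate of return.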
In figure 1, a) shows the competitive industry and b) shows a representative firm. The market
clears at a price of INR 50, and we assume that a firm can sell any quantity at that price but is
constrained by its capacity in the short run. Assume that the representative firm produces
3000 units of output. We know that in a competitive market, a profit maximizing firm
produces up to the point where price equals marginal cost. In the short run, the MC curve
slopes upward because the fixed factor constrains capacity. Total revenue is equal to the
area (TP0q). Total cost is equal to the area (AC0q). Profit is simply the difference between
TR and TC, which is equal to the area (ABPC). This firm is earning positive profits.
Minimizing Losses
A firm is suffering a loss if it is neither earning positive profits nor breaking even. Such firms
fall into the following two categories:
a) Those that shut down their operations immediately and bear losses equal to
the fixed cost, as that helps them to minimize their losses.
b) Those that keep doing business in the short run to minimize their losses.
Fixed cost (FC) needs to be paid whether the firm stays in business or shuts down; that is
why firms cannot completely exit the market in the short run. To decide whether to continue
operating, every firm has to check whether doing so is advantageous. As fixed cost has to
be paid in either case, the decision depends on variable cost and the revenue earned. So, in the
short run a firm will keep operating as long as the revenue earned is more than variable cost.
Operating profit (or loss) is the difference between total revenue (TR) and total variable cost
(TVC).
If TR > TVC, the firm will keep operating, as the operating profit helps to offset fixed cost
and thus reduces losses. If TR < TVC, the firm will be better off shutting down rather than
increasing its losses, since by operating its total losses would be more than the fixed cost.
Suppose that the above mentioned firm now sells at INR 30 due to competitive forces.
This means that total revenue (TR) will now be equal to INR 240,000 (30 x 8000).
Variable cost remains at INR 160,000 and total fixed cost (FC) at INR 110,000. So, total
cost equals INR 270,000. Thus the firm is bearing losses equal to INR 30,000. Now in the
short run, the firm has to decide whether to remain in business or shut down. If it
shuts down, it bears no variable cost but bears a loss equal to its fixed cost of INR
110,000. If it decides to remain in business, it makes an operating profit of
INR 80,000, which can be used to offset its total fixed cost and thus reduces its losses from
INR 110,000 to INR 30,000. Thus the firm's total loss will be only INR 30,000.
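The comparison between operating and shutting down can be sketched as a small function. The function name and its return layout are illustrative assumptions, not part of the lesson; the rule it applies is the one stated above (operate when TR > TVC, shut down otherwise).

```python
def short_run_decision(price, quantity, total_variable_cost, total_fixed_cost):
    """Compare the loss from operating with the loss from shutting down."""
    total_revenue = price * quantity
    operating_profit = total_revenue - total_variable_cost
    # A negative value here would mean the firm earns an overall profit.
    loss_if_operating = total_fixed_cost - operating_profit
    # Fixed cost has to be paid even when nothing is produced.
    loss_if_shut_down = total_fixed_cost
    # At TR == TVC both losses are equal; this sketch then reports "shut down".
    decision = "operate" if operating_profit > 0 else "shut down"
    return decision, operating_profit, loss_if_operating, loss_if_shut_down


decision, operating_profit, loss_operating, loss_shut_down = short_run_decision(
    price=30, quantity=8_000, total_variable_cost=160_000, total_fixed_cost=110_000
)
print(decision, operating_profit, loss_operating, loss_shut_down)
```

Calling the function with a price low enough that revenue falls short of variable cost (for example `price=15`) returns "shut down", because operating would then add to the loss already caused by fixed cost.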
Figure 2 shows a) the industry and b) a representative firm suffering losses but
showing an operating profit in the short run. Assume that the market price determined by
supply and demand is INR 35. Again, as the firm is operating in a competitive market,
it will operate where price (MR) is equal to marginal cost (MC) and produce
output (q*) equal to 5000 units. Total revenue is the product of price and quantity and thus
equals INR 175,000. Total cost is the product of average total cost (ATC) and the quantity
produced. Average total cost and average variable cost at q* are INR 41 and INR 30
respectively. So, total cost is INR 205,000. Thus the firm is suffering losses equal to INR
30,000. But the firm is earning an operating profit equal to INR 25,000.
We know that average fixed cost (AFC) is the difference between average total cost (ATC)
and average variable cost (AVC). So, AFC is INR 11. Thus, TFC = 11 x 5000 = INR 55,000.
Now, if the firm shuts down it will bear losses equal to INR 55,000. By operating, it will
reduce this loss to INR 30,000, as it will be earning operating profits.
NOTE: As long as P > AVC, the firm will be better off operating than shutting down in the short
run.
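The same reasoning can be written in per-unit terms, using the figures quoted for figure 2. This is a minimal sketch with illustrative variable names; it only restates the arithmetic of the example.

```python
# Per-unit figures at q* from the example (INR).
price, atc, avc, q_star = 35, 41, 30, 5_000

afc = atc - avc                             # average fixed cost
total_fixed_cost = afc * q_star             # loss borne if the firm shuts down
total_loss = (atc - price) * q_star         # loss borne if the firm operates
operating_profit = (price - avc) * q_star   # revenue left after variable cost

# Short run rule from the note: operate as long as price exceeds AVC.
keep_operating = price > avc

print(afc, total_fixed_cost, total_loss, operating_profit, keep_operating)
```

The loss from operating equals the fixed cost minus the operating profit, which is why covering variable cost is enough to justify staying open in the short run.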
The short run industry supply curve is the sum of the individual firms' MC curves (above AVC)
for all the firms in an industry. The industry supply curve is the horizontal summation of the
quantities supplied by the individual firms in the industry at each price level.
Figure 4 shows the industry supply curve in the short run as a horizontal sum of the MC
curves (above AVC) of all the firms in the industry.
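Horizontal summation can be illustrated numerically. The sketch below assumes three hypothetical firms with linear marginal cost curves; the firm parameters and function names are invented for illustration and do not come from the lesson.

```python
# Hypothetical firms with marginal cost MC(q) = intercept + slope * q.
# With linear MC and no other variable cost, AVC(q) = intercept + slope * q / 2,
# so the minimum of AVC (the shut-down price) equals the intercept.
firms = [
    {"intercept": 10, "slope": 0.5},
    {"intercept": 20, "slope": 0.25},
    {"intercept": 30, "slope": 1.0},
]


def firm_supply(firm, price):
    """Quantity at which P = MC, or zero when the price does not exceed minimum AVC."""
    if price <= firm["intercept"]:
        return 0.0
    return (price - firm["intercept"]) / firm["slope"]


def industry_supply(all_firms, price):
    """Horizontal summation: add the quantities of all firms at one price."""
    return sum(firm_supply(firm, price) for firm in all_firms)


for p in (10, 20, 30, 40):
    print(p, industry_supply(firms, p))
```

The summation is over quantities at a given price, not over prices at a given quantity, which is what "horizontal" refers to. Adding a firm to the list raises the quantity supplied at every price above its shut-down point.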
The short run industry supply curve can shift for two reasons. First, a change in the price of
an input shifts the individual firms' MC curves simultaneously.
Second, with a change in the number of firms in the industry in the long run, the industry supply
curve shifts as the number of individual firms' supply curves increases or decreases.
If the number of firms increases, the industry supply curve shifts to the right,
and if the number of firms decreases, the industry supply curve shifts to the left.
An individual firm's long run decision depends on what its costs are likely to be at
different scales of production. To arrive at long-run cost, firms must compare their costs
at different scales of plant. With an increase in the scale of production there may be
economies of scale, which reduce production costs and help the firm expand, or
perhaps diseconomies of scale, as complexities arise when firms become large in size.
Technological change is one of the most important sources of economies of scale.
When a firm adopts a capital intensive technique, say in producing electronic gadgets, the cost
per unit falls when the gadgets are produced using machines rather than by using labour
only. Another source of economies of scale, apart from technology, is sheer size. For example,
bulk ordering of inputs reduces the cost of inputs for large companies.
Figure 5 shows a firm exhibiting economies of scale. It shows the short run and long run
AC curves for a firm at different levels of output. The long run average cost curve (LRAC) shows
the different scales on which a firm can choose to operate in the long run. At any time, the scale
of operation determines the short run cost curves. The LRAC is also known as the envelope of the
short run average cost curves because it wraps around the short run cost curves. Each point
on the LRAC curve represents the least cost associated with that level of production.
In the short run, the firm chooses a particular scale of production, which locks it into one
cost curve, whereas in the long run the scale of production can vary and with it the cost
curves also vary.
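The envelope idea can be sketched numerically: at each output level, the long run average cost is the lowest of the short run average costs across the available plant sizes. The three plants and their cost parameters below are hypothetical, chosen only so that larger plants reach lower minimum average cost (economies of scale).

```python
# Hypothetical plants: SRAC(q) = fixed_cost / q + q / capacity.
plants = {
    "small":  {"fixed_cost": 100, "capacity": 1},
    "medium": {"fixed_cost": 400, "capacity": 16},
    "large":  {"fixed_cost": 900, "capacity": 100},
}


def srac(plant, q):
    """Short run average cost of producing q units with a given plant."""
    return plant["fixed_cost"] / q + q / plant["capacity"]


def lrac(q):
    """Lowest average cost at output q, and the plant size that achieves it."""
    return min((srac(plant, q), name) for name, plant in plants.items())


for q in (10, 80, 300):
    cost, best_plant = lrac(q)
    print(q, best_plant, cost)
```

With only three plant sizes the result is the lower boundary of three U shaped curves; with a continuum of plant sizes this boundary becomes the smooth LRAC curve described above.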
All short run AC curves are U shaped, because we assume a fixed scale of plant that
constrains production, and MC rises when diminishing returns set in. In the long run, there is
no fixed factor and the scale of plant can be changed.
Note: the same firm can face diminishing returns in the short run and still exhibit
economies of scale in the long run.
The long run average cost curve (LRAC) can take different shapes. Its shape depends on how
cost changes with the change in the scale of production. Every firm tries to take advantage of
economies of scale and avoid diseconomies of scale, and thus tries to operate at the optimal
scale of plant, i.e. tries to minimize AC.
Both the entrance of new firms and the expansion of old firms will shift the short
run industry supply curve from S to S’. As the industry supply curve is nothing but the sum
of the MC curves of all the firms, it shifts for two reasons. Firstly, new
firms are being added, so their MC curves are also added. Secondly, existing firms are
expanding, so their individual MC curves also shift to the right. Finally, each firm in the
competitive industry will choose to operate at the optimal scale of operation in the long run.
In the long run, profits are driven to zero, and the equilibrium condition will be
P* = SRMC = SRAC = LRAC
When the price is above P*, profits exist and new firms will enter; when the
price is below P*, firms make losses and existing firms will exit. Only at P = P* will there
be equilibrium in the industry.
The final long run competitive equilibrium condition remains the same:
P* = SRAC = SRMC = LRAC
In the long run profits will be zero, and at this point firms will be operating at the
optimal scale.
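The entry process that drives profits to zero can be illustrated with a small simulation. Everything in this sketch is an assumption made for illustration: a linear market demand curve, identical firms with a quadratic cost function, and entry of one firm at a time while profits remain positive.

```python
import math

# Hypothetical market demand: Q = a - b * P.
a, b = 1000.0, 10.0
# Hypothetical identical firms: TC(q) = F + c*q + (d/2)*q**2, so MC(q) = c + d*q.
F, c, d = 50.0, 10.0, 1.0


def market_price(n_firms):
    """Price at which demand equals the summed supply of n_firms, each setting P = MC."""
    return (a + n_firms * c / d) / (b + n_firms / d)


def firm_profit(price):
    """Profit of one firm producing where price equals marginal cost."""
    q = (price - c) / d
    return price * q - (F + c * q + 0.5 * d * q ** 2)


n_firms = 10
while firm_profit(market_price(n_firms)) > 1e-9:   # positive profits attract entry
    n_firms += 1

p_star = c + math.sqrt(2 * F * d)                  # minimum of average cost
print(n_firms, market_price(n_firms), p_star)
```

Each entrant shifts the industry supply curve to the right and lowers the market price. Entry stops when the price has fallen to the minimum of average cost, which is the P* of the equilibrium condition above, and each firm then earns zero profit.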
allocation of the society’s resources. In the long run the profits will lure more investment
and resources in that goods market. So, just a change in the demand for a product will lead
to reallocation of resources.
Conclusion
In this chapter, we learned about the three short run conditions in which any firm may find
itself, and how short run changes affect the shape of the long run cost curves. We
learned when and how economies of scale and diseconomies of scale arise with
a change in the scale of production. We also learned what long run adjustments
take place when short run conditions change the equilibrium. Investment moves
towards profit opportunities.
Summary
1. In the short run, any firm will be earning positive profits, suffering losses or just
breaking even. In case of breaking even the firm is just earning a normal rate of
return.
2. A firm that is earning profit in the short run and expects to continue doing so has an
incentive to expand in the long run. Profits also provide an incentive for new firms to
enter the industry.
3. A firm incurring losses in the short run will either shut down and bear a loss equal to
the fixed cost, or keep operating when revenue is enough to cover
variable cost, so that the operating profit can be used to reduce the losses.
4. Any time the price is below the minimum point on the AVC curve, there will be
operating losses and the firm will shut down. The minimum point on the AVC curve is
called the shut-down point. At all prices above the shut-down point, the MC curve
shows the profit-maximizing level of output.
5. The short run supply curve of a firm in a perfectly competitive industry is the portion
of its MC curve that lies above its AVC.
6. The industry supply curve shifts either in the short run, when something causes MC to
change across the industry, or in the long run, with the entry or exit of firms.
7. When an increase in the firm’s scale of production leads to a fall in AC, the firm is
exhibiting increasing returns to scale or economies of scale. When AC does not change
with the scale of production, the firm exhibits constant returns to scale. When AC rises
with the increase in the scale of production, the firm exhibits decreasing returns to scale
or diseconomies of scale.
8. A firm’s LRAC shows the costs associated with the different scales on which it can choose to
operate in the long run.
9. When there are short run profits, new firms will enter and existing firms will expand in the
industry, which shifts the supply curve to the right and in turn leads to a fall in
price until profits are eliminated. When there are losses, some firms will exit and some will
reduce scale, which shifts the supply curve to the left and increases the price until
losses are eliminated.
10. Long run competitive equilibrium is reached when profits are zero and
P=SRMC=SRAC=LRAC.
2. Is it possible for a firm to exhibit both diminishing returns in the short run and
increasing returns to scale in the long run? Explain.
5. Discuss the actual long-run adjustments that are likely to take place in response to
short run profits and losses.
a) price exceeds average variable cost but is less than average total cost.
b) price exceeds marginal cost.
c) revenues are smaller than variable costs of production.
d) revenues are greater than variable costs of production but less than total costs.
2. When a firm expands its scale of operations, and such expansion leads to lower cost
per unit, we say that the firm exhibits:
a) Diminishing returns.
b) Constant returns to scale.
c) Increasing returns to scale.
3. A firm will choose to operate rather than shut down in the competitive industry as
long as
5. The short run supply curve of a firm in a perfectly competitive industry is the
Answer 2. Option c) When AC falls with a rise in the scale of output, the firm exhibits
increasing returns to scale (IRS).
Answer 3. Option d) A firm will choose to operate rather than shut down as long as price
is more than AVC. In other words, as long as the firm is earning operating profits it will choose
to operate rather than shut down.
Answer 4. Option d) A firm attains long run competitive equilibrium when it is
producing at the optimal scale, profits are zero and P=LRMC=SRAC=SRMC.
Answer 5. Option b) The short run supply curve of a firm in a perfectly competitive industry
is the portion of its MC curve that lies above its AVC.
Answer 2. Option a) is incorrect because diminishing returns is a short run concept, and in the
short run the scale of production does not change. Option b) is not correct because under
constant returns to scale (CRS) AC does not change. Option d) is wrong as a fixed factor of
production is relevant only to the short run.
Answer 3. Options a), b) and c) are incorrect because at all these points there is no situation of
operating losses.
Answer 4. All of the options are correct, as a firm attains long run competitive equilibrium
when it is producing at the most efficient level, which means it is producing at the
minimum point of the LRAC curve and P=LRMC=SRAC=SRMC.
Answer 5. Option a) is incorrect because for a competitive firm P = MC, and any price below
AVC means shut down, that is, no production. Options c) and d) are incorrect
because even if firms are not able to earn economic profits they will continue producing as long as
they are generating operating profits.
Appendix to Chapter 8
Long run competitive equilibrium is achieved when profits are driven to zero and
P=LRAC=SRAC=SRMC. Here all firms are producing at the optimal scale. It takes time to
achieve this long run equilibrium. In a dynamic economy, the long run equilibrium point
keeps changing. As population and stocks of input factors grow, and
as preferences and technology change, some sectors will contract and some will expand. To
adjust to these long term changes, the industry has to consider both internal and external
factors.
The shape of the long run average cost (LRAC) curve is determined by the extent of internal
economies (or diseconomies). When a firm’s scale of operation changes, as it either expands
or contracts, its AC will increase, decrease or remain constant along the LRAC curve. A
firm with internal economies that expands its scale will find its AC decreasing, and a
firm facing diseconomies that expands its scale will find its AC increasing.
External economies and diseconomies arise from an industry wide expansion. When an
industry faces external diseconomies, the LRAC curve shifts upward, i.e. cost increases
regardless of how much a firm produces. When an industry enjoys external economies, the LRAC
curve shifts downward, i.e. cost falls at all potential levels of output. This is because
external economies (or diseconomies) reduce (or increase) costs.
We will see examples of an expanding industry facing external economies and external
diseconomies in figures 9 and 10 respectively.
In figure 9, the industry and a representative firm are both at long run
equilibrium. P* is the equilibrium price determined by the intersection of the demand curve, DD1,
and the supply curve, SS1. At this point all firms have zero economic profits and the price P*
touches the LRAC curve at its minimum point, i.e. at the optimal point. When demand
increases, the demand curve shifts from DD1 to DD2, and the price rises
from P* to P**. The rising price increases profit opportunities, so new firms will enter and
old firms will expand. This shifts the supply curve from SS1 to SS2, driving the price down.
If the LRAC falls from LRAC1 to LRAC2 as a result of the expansion, the final price will be below
the original price, P*. So, the long run industry supply curve (LRIS) slopes downwards when an
industry is enjoying external economies. Such an industry is also known as a decreasing cost
industry.
Similarly, in figure 10 the industry and a representative firm are both at
long run equilibrium. P* is the equilibrium price determined by the intersection of the demand
curve, DD1, and the supply curve, SS1. When demand increases, the demand curve shifts from
DD1 to DD2, and the price rises from P* to P**. With the rise in price, new
firms will enter and existing firms will expand, shifting the supply curve from SS1 to SS2 and
driving the price down. If the LRAC rises as a result of the expansion, the final price will be
above the original price, P*. So, the long run industry supply curve (LRIS) slopes upwards when
an industry is facing external diseconomies. Such an industry is known as an increasing cost industry.
Reference
1) Case, Karl E. and Fair, Ray C., Principles of Economics, 9th Edition.
Introductory Microeconomics
Table of Contents
Learning Outcomes
Introduction
Production
Production Process
Choice of Technology
Conclusion
Summary
Exercises
Glossary
References
Web-links
Appendix
Learning Outcomes
The objective of this lesson is to acquaint the reader with the behavior of profit
maximizing firms in a perfectly competitive market structure, the production process and
how decisions related to the production process are taken. After having gone through
the chapter, the reader should be able to understand the concept of perfect competition.
The decisions regarding the amount of output to be produced, which production
technology should be used to produce the output and the quantity of inputs to be used in
production are crucial for any firm; the lesson looks into these aspects as well. The
reader will attain a deeper understanding of concepts like profits, total revenue, total
cost and the normal rate of return. The chapter analyses in detail how the decisions and
responses of firms differ in the short and the long run. It also discusses the factors that
influence the important decisions that firms have to take, and the production process, which
involves a discussion of the types of technology, the concepts of total, marginal and average
product and the production function. The last section of the chapter discusses the question of
the choice of technology and its connection with the input markets. The practice questions
at the end of the chapter will help the reader get a clear picture of the topics discussed in
the lesson. The appendix to this chapter introduces the idea of isocosts, isoquants and the
cost-minimizing optimum combination of inputs.
Introduction
An idea parallel to the concept of household decisions and consumer choice is that of the
production decisions taken by firms. Households decide what and how much to
consume of different goods, given prices and income, and they make decisions about the
number of hours to work. Firms in the market are involved in a similar exercise,
wherein they decide which inputs to use and in what quantity in the
production process, such that the least cost is incurred given the input prices, and the level of
output that will be profitable for them to produce. All firms involved in production aim
at maximizing their profits and minimizing their costs; therefore optimization is equally vital
for firms. Firms are involved in the production process, in which inputs
are combined to produce output. The decisions related to production have a definite
implication for the profits and the viability of the firm. Figure 1 illustrates the circular flow
diagram, which shows the demand and supply of inputs as well as output. It shows
the demand and supply decisions of firms and households. It is very important to understand
the questions that firms face and how they are answered. The chapter aims at
solving all these puzzles.
Production
Production can be defined as a process whereby inputs are combined, processed and
converted into output. Production is a vital function of a firm, whatever its size and internal
structure. There is a set of assumptions on which the analysis in this chapter is based. The
assumptions are listed below:
Production is not confined only to firms: The function of production is not confined only
to the firms in the market. Households can also process and convert inputs like land, labor,
capital etc. into output. A household that has a kitchen garden, combines land, labor,
manure, fertilizers, seeds and other tools to grow vegetables. Government utilizes various
factors of production to provide various services of public utility.
Firms are different from households and government in the sense that they produce goods
or services to meet the demand for those goods or services in order to make profits.
Firms differ from each other on the basis of their size, type of organization and the market
structure that they function in. We analyze the case of perfect competition here.
As a result, each firm in the industry takes the market price, which is determined by
supply and demand, as given. These firms can be described as “price-takers”. At the given
price, the firms can decide how much output to supply, what quantity of inputs to purchase and
how to produce the output.
Since the products produced by the firms are homogeneous, no firm can charge a price
above the market price, as consumers will in that case easily shift to the other sellers in
the market and the firm that set a price above the market price will incur losses. Also, it is
clear that no firm would want to charge a price below the market price, since it can
sell any quantity of output at the given market price. The demand for the output produced by
such a firm is therefore perfectly elastic. For instance, let’s consider the case of Ram, who sells
pens in a perfectly competitive market. Part a in figure 2 depicts the supply and demand
conditions in the market. Say the price set by the market is Rs.5 per pen. Part b in figure 2
represents the demand curve faced by a perfectly competitive firm for its output. It
would not be beneficial for Ram to raise the price of a pen above Rs.5, since consumers
will shift their demand to the other sellers and he will not be able to sell any pens. On the
other hand, he would not want to set a price below Rs.5, because he can sell as many
pens at this price as he wants.
Figure 2 : A perfectly competitive market structure and the demand faced by a single firm
In a perfectly competitive industry it is also assumed that firms can easily enter and exit
the industry. If the existing firms in the industry are earning high profits, new firms will
enter. For example, even if there are several stationery shops, there are no barriers to a
new stationery shop springing up. Exit is similarly easy: if a firm is incurring losses, it
can shut down the business. Firms might face losses when technology changes, when tastes
and preferences change, when costs of production rise or when prices fall due to intense
competition. Though it is hard to find a perfectly competitive set up in the real world,
certain markets are quite close to perfect competition in their structure and way of
functioning. A few examples are the markets for agricultural goods and street vendors who
sell food and other articles.
We have already discussed that any firm functions in the market primarily to make profits.
Since profits are so vital for any firm, it is important to understand what profit is. Profit can
be defined as the difference between the total revenue and the total cost of the firm.
Total revenue is the amount of money that a firm receives from selling its output; it is the
quantity of output sold (q) multiplied by the per unit price of that good (p). Total cost, or
total economic cost, includes three elements. The first is the explicit (accounting or out of
pocket) cost, which is the cost of raw materials and other inputs used in production. The
normal rate of return on capital and the opportunity cost of each factor of production are
the other two elements of the total economic cost; both can be categorized as implicit costs.
Opportunity costs are implicit and need to be included in the total cost incurred by the firm.
For example, a person who owns a business also contributes his labor services to it but
does not get any wage in return. Instead of running his own business, he could have worked
as an employee and earned a wage for his labor. The wage that this person forgoes
is the opportunity cost of his labor, which needs to be added to the total economic cost.
The opportunity cost of capital is equally important. It is accounted for by including the
normal rate of return to capital in the total economic costs.
Capital is required to establish a firm or to start a business. Money is required to purchase
and set up machinery, equipment, furniture etc. This capital will stay tied up
with the business for a long period. Fresh investments also need to be made, even when the
firm or the business has been in place for a long time. There is an opportunity cost attached
to this invested capital. The investor or the proprietor, instead of investing his funds in the
business, could have invested them in some financial security, which would have given him
returns. This rate of return is the opportunity cost of using or investing one’s capital in the
business.
The concept of rate of return needs to be understood. A person who has invested his funds
in a business will get a stream of returns. The rate of return can be described as the annual
flow of net returns on investment, expressed as a proportion of the total investment. A
normal rate of return, on the other hand, can be defined as the rate of return that is just
sufficient to keep the investors and owners satisfied. If the rate of return falls below the
normal rate of return, the owners earn less from the business than they could earn by
putting their funds in financial securities, bonds or anywhere else. Under normal conditions,
i.e. when the firm earns a steady stream of revenue and there is no uncertainty about the
future, the normal rate of return will be quite close to the rate of return on risk-free
government securities.
Let’s define economic profit now. Economic profit is the difference between the total
revenue and the total economic costs.
With this definition, it is easy to see that when a firm earns a rate of return equal to the
normal rate of return, its economic profit is zero. On the other hand, if the firm is earning
a positive profit, it implies that the rate of return is above the normal rate of return
to capital. A positive level of profit will keep the investors happy and motivate new firms to
enter the industry. A negative profit means that the rate of return is below the normal rate
of return to capital. In such a case some firms might shut down and move out of
the industry, a few might contract, and fresh investment will be hard to come by.
A Numerical Example
Suppose Ravi is planning to start a small-scale business. He plans to sell radios. To start the business, he
needs a shop. The money required to purchase this shop is Rs.1000000. Ravi has decided to sell 30000
radios annually at a price of Rs.100 per radio. He purchases the radio from a supplier and each radio
costs him Rs.50. He will need a person to stay on the sales counter, who will work for a yearly wage of
Rs.400000. The rate of interest on government securities is 10%, so Ravi wants to earn at least 10% on
his investment. Let’s work out the profit for this venture.
Ravi earns a revenue of Rs.3000000 from his venture. The total economic costs have been
calculated by including the opportunity cost of capital, i.e. the normal return to capital.
The economic profit generated by this venture is Rs.1000000.
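The arithmetic behind these figures can be retraced with a short Python sketch. It uses only the numbers stated in Ravi's example; the variable names are illustrative, and the normal return is assumed to be 10% of the Rs.1000000 invested in the shop.

```python
# All amounts are in rupees and come from Ravi's example in the text.
shop_investment = 1_000_000      # capital tied up in the shop
radios_sold = 30_000             # radios sold per year
price_per_radio = 100            # selling price
cost_per_radio = 50              # purchase price from the supplier
annual_wage = 400_000            # wage of the person at the sales counter
normal_return_percent = 10       # rate of interest on government securities

# Total revenue = quantity sold x price per unit
total_revenue = radios_sold * price_per_radio

# Explicit (out of pocket) costs
cost_of_radios = radios_sold * cost_per_radio

# Implicit cost: the normal return on the capital invested (assumed 10% of the shop's cost)
normal_return_on_capital = shop_investment * normal_return_percent // 100

total_economic_cost = cost_of_radios + annual_wage + normal_return_on_capital
economic_profit = total_revenue - total_economic_cost

print("Total revenue:      ", total_revenue)        # 3000000
print("Total economic cost:", total_economic_cost)  # 2000000
print("Economic profit:    ", economic_profit)      # 1000000
```

Leaving the normal return out of the costs would give a profit of Rs.1100000; that larger figure ignores the opportunity cost of capital and is therefore not the economic profit.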
The short run can be defined with the help of two features. In the short run, at least one
factor of production for the firms existing in the industry is fixed, i.e. the quantity or
scale of that factor cannot be altered. Also, in the short run the entry and the exit of firms
from the industry is difficult; it can be said that there are bottlenecks in the entry and the
exit of the firms. A firm winding up its business in order to move out of the industry might
still find some locked up fixed costs which are yet to be recovered. The factor of production
which is fixed differs from one industry to the other. For a firm the plant or the machinery
can be a limit, for a professional his time can be a constraint, and for a bakery the place of
work, i.e. the shop, might be a constraint. Land as an input is also fixed in the short run.
The long run is the time period where no factor of production is fixed, nor are there any
restrictions or difficulties in the entry and the exit of the firms from the industry. Firms are
free to alter the scale at which they operate.
Firms are always on the lookout for an optimal method of production. The optimal
method of production is the one through which the firm incurs the least cost in production.
Once the cost-minimizing method of production has been chosen and the market prices of
the good and the inputs are known, the firm can decide on the quantity of output to be sold
and the quantity of inputs to purchase. This has been illustrated in figure 4.
Production Process
Production is any process through which the inputs are processed and converted into
output. Production technology is a functional relationship between the inputs and the
output. For example producing a cotton shirt requires cotton, threads, buttons, dyes,
machinery, electricity, laborers and other inputs. It is possible that a good can be produced
through a number of different production techniques. The technology can be labor intensive
or capital intensive. A production technique that uses more of labor relative to capital, is
called a labor intensive production technology. On the other hand, a production technique
that uses more of capital relative to labor, is a capital-intensive technology. For example, to
make a swimming pool in a resort 50 laborers can be employed, with necessary tools and
equipment. This is a labor intensive technique. On the other hand, the swimming pool can
also be made with the help of 15 laborers, a crane and other machinery. This is a
capital-intensive technique. Since the firm tries to choose the method of production which
minimizes the cost, a firm in an economy with an abundant supply of cheap labor will use
labor-intensive techniques of production. However, in an economy where labor is in short
supply and wages are high, firms will tend to use more capital
relative to labor in the production process.
A production function can be described as the mathematical relationship between the inputs
and the output. The total product function shows the total number of units of output that
will result from using different quantities of inputs.
For example, in a bakery one worker, working alone, can produce 12 cookies in an hour. If
another worker is added, the two workers produce a total of 27 cookies in an hour, which
means that the second worker adds 15 cookies an hour. With the third worker the
total number of cookies produced rises to 37, i.e. the third worker adds only 10 cookies.
This could be because with three workers the kitchen gets crowded and workers get in
each other’s way. Also, the number of ovens is fixed, so the three workers have to share a
fixed number of ovens; there is a capital constraint. Note that we assume that all the
workers are equally efficient; it is the constraint of space and capital which leads to the fall
in the number of cookies added to the total production by the third worker. With the addition
of the fourth and the fifth worker, these constraints are felt more strongly and the addition
made to the total production of cookies by each worker falls. With the fourth worker, the
total production of cookies rises to 40 and with the fifth worker it rises to 41 cookies. With
the sixth worker there is no further rise in the total production.
Marginal product can be defined as the additional units of output that can be produced by
employing an additional unit of a particular input, holding the quantity of other inputs fixed.
Table 2 above shows the marginal product of labor. The first unit of labor in the bakery
produces 12 cookies, the second unit of labor adds 15 cookies to the total production, the
marginal product of the third worker is 10 cookies, fourth worker adds 3 cookies, fifth
worker produces 1 cookie, while the marginal product of the sixth unit of labor is 0. Part b
of figure 5 shows the curve for marginal product of labor.
According to the law of diminishing returns or the law of variable proportions, beyond a
particular point, if additional units of a variable input are employed along with fixed inputs,
the marginal product of the variable input falls.
In the Essay on the Influence of a Low Price of Corn on the Profits of Stock (1815), the
British economist David Ricardo introduced the law of diminishing marginal returns. Ricardo
derived the law mostly out of his observations of agriculture and land, labor and capital
involved in it.
It is in the short run that the firm, factory or farmer faces the constraint of fixed
inputs. Hence the law of diminishing returns always applies in the short run.
Average product is the amount of output produced on average by each unit of the
variable input employed. Table 2 also shows the average product of labor. The average
product of labor is calculated by dividing the total output by the total number of units of
labor used. For instance, the average product of the first two units of labor is 13.5 (27/2),
while the average product of 6 units of labor is 6.83 (41/6).
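The marginal and average product figures for the bakery can be recomputed from the total product alone. The sketch below is only an illustration: the list of total product values is taken from the cookie example in the text, and the variable names are chosen here for convenience.

```python
# Total cookies per hour with 0, 1, 2, ... 6 workers (from the bakery example).
total_product = [0, 12, 27, 37, 40, 41, 41]

workers = range(1, len(total_product))

# Marginal product: extra output from the n-th worker, other inputs held fixed.
marginal_product = [total_product[n] - total_product[n - 1] for n in workers]

# Average product: total output divided by the number of workers employed.
average_product = [total_product[n] / n for n in workers]

for n, mp, ap in zip(workers, marginal_product, average_product):
    print(f"workers={n}  total={total_product[n]}  MP={mp}  AP={ap:.2f}")
```

The marginal product comes out as 12, 15, 10, 3, 1 and 0, which is the pattern of diminishing returns described above, while the average product peaks at 13.5 with two workers and falls to 6.83 with six.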
The average product and marginal product are related to each other; however, the average
product changes more slowly than the marginal product. If the marginal product exceeds
the average product, the average product increases; if it is below the average product, the
average product falls. For instance, Sam participates in a competition that has five rounds
and he has already completed two rounds. Suppose he gets points for each round and his
average for the first two rounds is 10. If he scores 8 in the third round, his average for
three rounds will fall, but not all the way to 8: the average will be 9.33. If he gets 12 points,
his average will rise, but not all the way up to 12: the average will be 10.67. Table 2 shows
that the marginal product falls from the third worker onwards. Though the average product
also falls with the marginal product, it falls more slowly than the marginal product. Figure 6
shows the graph of the total product and the graph of marginal and average product. The
marginal product curve is nothing but a depiction of the slope of the total product function.
As figure 6 shows, the marginal and average product curves start out together. While the
marginal product is rising and is above the average product curve, the average product
rises with it but at a slower pace. The marginal product curve reaches its maximum at point
A, before the average product reaches its maximum at point B. At point A, the marginal
product curve begins to fall, since at this point the additional units of output that an extra
worker generates begin to fall due to fixed inputs or capacity constraints. The average
product of labor continues to rise till point B, where it reaches its highest point and is equal
to the marginal product of labor. Beyond point B and till point C, the marginal product
continues to decline and is less than the average product of labor. The average product
also follows this decline in the marginal product. At point C, the marginal product falls to 0,
i.e. an additional unit of labor cannot add to the output. This is where the firm reaches its
capacity and the total product is at its maximum.
Inputs are usually used in conjunction with each other. Labor and capital are two inputs
which can be seen as complementary in nature. Using more capital in the production
process can raise the productivity of labor. So, if the demand for cookies is on a rise, while
the bakery has hit its capacity of production, where all the workers are working with fixed
inputs for example a single oven, the owner of the bakery can think of expanding the
production capability of the bakery. He can infuse more capital in terms of another oven for
the bakery. The additional oven can raise the productivity of the labor as it will raise the
average output that a single worker can produce in an hour.
Choice of Technology
As we have discussed, inputs are used in conjunction with each other. The factors of
production are complementary in nature. Labor and capital are used together in production
and raise each other’s productivity. However, different factors of production also act as
substitutes for each other. If capital is expensive relative to labor in an economy, the firms
will be motivated to shift to labor-intensive techniques of production. Similarly if labor is
relatively expensive compared to capital, the firms would want to shift to capital intensive
techniques. The type of production technique which will be chosen by the firm depends on
the prices of inputs determined by the input markets. Suppose, Rahul wants to manufacture
150 toys in a week. Table 3 shows several options of production technology that can be
used to produce these 150 toys.
Analyzing the different production technologies, it is easy to see that out of all the options,
technology A is the most labor intensive while technology E is the most capital intensive.
Since the firm chooses the production technique which minimizes the cost, its ultimate
decision will depend on the market prices of the inputs. Let’s assume that the wage rate
(W) is Rs.1 and the cost of capital per hour (R) is Rs.5. The total cost corresponding to each
production technique can be calculated given the input prices.
Given the input prices, technology A is the one that will produce 150 toys at the least cost,
which is Rs.25, as shown in table 4. All the other technologies cost more than this amount.
Hence, the firm will choose technology A, which is the most labor-intensive technique. Now,
if the wage rate rises to Rs.7 and the cost of using capital per hour stays fixed at Rs.5, the
cost-minimizing production technique after the rise in the wage rate is option E. The cost of
production with technology E is Rs.42 after the rise in wages, as shown in table 4. So, the
firm will choose option E, which is the most capital-intensive technique out of all the options.
Hence, the cost of production depends on the available production techniques and the input
prices decided by the input markets.
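The selection rule can be sketched in Python. The input combinations below are hypothetical: they are not the entries of Table 3, and were chosen only so that the costs agree with the Rs.25 and Rs.42 figures quoted in the text. The function names are likewise illustrative.

```python
# HYPOTHETICAL (capital, labor) combinations, each assumed to produce 150 toys.
# These are NOT the values of Table 3; they are placeholders for illustration.
technologies = {
    "A": (3, 10),  # most labor intensive
    "B": (4, 6),
    "C": (5, 4),
    "D": (6, 3),
    "E": (7, 1),   # most capital intensive
}


def total_cost(capital, labor, wage, rental):
    """Total cost = (R x units of capital) + (W x units of labor)."""
    return rental * capital + wage * labor


def cheapest(techs, wage, rental):
    """Return the name and cost of the cost-minimizing technology."""
    costs = {name: total_cost(k, l, wage, rental) for name, (k, l) in techs.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]


print(cheapest(technologies, wage=1, rental=5))  # ('A', 25)
print(cheapest(technologies, wage=7, rental=5))  # ('E', 42)
```

With cheap labor the labor-intensive option wins, and once the wage rises relative to the cost of capital the choice switches to the capital-intensive option, which is the substitution described above.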
Conclusion
The lesson throws light on important elements that go into the decision making process of a
firm. The ultimate goal of any firm is to generate profits for itself. Decisions taken by the
firm affect its profit. These decisions concern the quantity of output to be produced, the
choice of production technique and the quantity of inputs to purchase. Hence, it is important
to understand the market structure in which a firm operates, the types of production
techniques that are available and what the cost of production depends on. The decisions
are made such that profits are maximized while costs are minimized.
Summary
The chapter focuses on how the production decisions are taken at the firm level. Case of the
firm functioning in a perfectly competitive set up has been discussed. Following points
summarize the chapter.
Firms differ in size and structure. For instance a firm functioning in a perfectly
competitive industry is a price-taker.
Perfect competition is a market structure where there are several firms that are
small in size relative to the industry, each firm produces identical goods and there is
no restriction on entry and exit of the firms.
The demand curve facing a firm in a perfectly competitive industry is perfectly
elastic, i.e. at this price the firm can sell any amount of output, but it will not be able
to sell anything if it fixes a price above this price. Also the firm will not want to
reduce the price it charges below the market price.
Profit maximizing firms have to take three basic decisions. The first being the
quantity of output to produce, second, the choice of production technique and the
third, the quantity of inputs to purchase.
The ultimate aim of the firm is to make profit. Profit is the difference between total
revenue and total cost of the firm.
The total economic costs include out of pocket costs that are explicit in nature, the
opportunity cost of each input and the normal rate of return to capital which are
implicit in nature.
The normal rate of return to capital is the rate of return which is sufficient to keep
the investors and owners satisfied. In normal conditions, it is quite close to the rate
of interest on risk-free government securities.
If the firm makes positive profit, it implies that the rate of return that it earns is
greater than the normal rate of return to capital.
Decisions made by the firm also take into consideration the time period. Short run
differs from the long-run since it involves fixed inputs and the entry and exit of the
firms from the industry is constrained.
Decisions to be taken by the firm depend on market price of the good it produces,
the production technologies available and the input prices.
A production function entails how the inputs are related to output. It is a
mathematical relationship between inputs and output.
Marginal product is the additional units of output produced by employing an
additional unit of variable input. The law of diminishing returns states that beyond a
particular point, if additional units of a variable input are employed along with fixed
inputs, the marginal product of the variable input falls.
Average product is the average amount of output produced by each unit of variable
input employed. It is related to the marginal product. It rises when the marginal product is
above the average product, it is equal to the marginal product at its highest level
and falls when the marginal product falls below it.
Capital and labor are inputs, complementary in nature, but they can also act as
substitutes.
A profit maximizing firm uses the technology that minimizes the cost of production,
given the prices of inputs and various production techniques.
Exercise
Review Questions
Q.1 Discuss the features of a perfectly competitive market structure. Why are the firms in a
perfectly competitive industry called “Price-Takers”?
Q.2 Why is normal rate of return to capital added while calculating total economic costs?
Q.3 What is the law of diminishing returns? In the table given below determine whether
there is a case of diminishing returns.
Q.4 Draw curves for total product, marginal product and average product. Illustrate the
relation between average product and marginal product.
Q.5 What does the choice of the cost-minimizing production technique depend on?
Q.2 Which one out of the following is included to calculate the total economic costs:
a. Revenue.
b. Profit.
c. Price of the output.
d. Normal rate of return to capital.
a. 4 months.
b. 6 months to a year.
c. The time period where all the inputs are variable.
d. The time period where one or more of the inputs are fixed.
a. When additional units of a variable input are used with fixed inputs, the marginal
product of that variable input declines.
b. When additional units of a variable input are used with fixed inputs, the marginal
product of that variable input rises.
c. When additional units of a variable input are used with fixed inputs, the marginal
product of that variable input becomes constant.
d. None of the above.
Q.5 While choosing the production technology, the profit-maximizing firm should keep in
mind:
a. Input-prices.
b. Available production techniques.
c. Market price of output.
d. All of the above.
Answer 2. The opportunity cost of capital is accounted for by including the normal rate of
return to capital in the total economic costs.
Answer 3. Short-run is the time period where one or more of the inputs are fixed.
Answer 4. The law of diminishing returns states that when additional units of a variable input
are used with fixed inputs, the marginal product of that variable input declines.
Answer 2. Option a is incorrect, revenue is not included in the total economic costs. Option
b is also not correct, profit is calculated by deducting costs from revenue. Option c is
incorrect since price of the output is used in calculating the revenue earned by a firm.
Answer 3. Option a and b are incorrect, short-run is not earmarked by months or years.
Option c defines the long-run.
Answer 4. Option b is incorrect, the law states that as more and more units of variable input
are used with fixed inputs, the marginal product of the variable input declines. Option c is
incorrect for the same reason. Option d is ruled out.
Glossary
Average Product: average product is the ratio of total product to the total units of the
variable input. It is the average product produced by each unit of variable input.
Capital-Intensive Technology: The production technology that uses greater number of units
of capital relative to the units of labor.
Labor-Intensive Technology: The production technology that uses greater number of units
of labor relative to the units of capital.
Marginal Product: The additional units of output produced by an additional unit of variable
input employed.
Homogeneous Products: The goods that are identical to each other in terms of quality and
characteristics.
References
Case, Karl E. and Fair, Ray C. (2007), “Principles of Economics”, Ch.7, 8th edition, Pearson
Education Inc.
Web Link
http://www.econlib.org/library/Enc/bios/Ricardo.html
Appendix
Introduction to Isocosts and Isoquants
Table A.1 Various Combinations of Capital (K) and Labor (L) which can be used to produce
output units 75, 150 and 225.
Combination   Output = 75   Output = 150   Output = 225
              K     L       K     L        K     L
A             1     9       2     10       3     11
B             2     6       3     7        4     8
C             3     4       4     5        5     6
D             6     3       8     3        9     4
Table A.1 shows various combinations of capital and labor that can be used to produce three
different levels of output of good x. These levels of output are 75 units, 150 units and 225
units. A curve that shows different combinations of inputs, capital and labor, to produce a
given level of output is called an isoquant. Figure A.1 shows isoquants for three different
levels of output, using the data shown in the table A.1. Each isoquant represents infinite
combinations of inputs that can be used to produce the corresponding level of output. There
can be several isoquants corresponding to several levels of output. The higher the isoquant,
the greater is the level of output attached to it.
Figure A.1 : Isoquants showing combinations of labor and capital to produce levels of output
Q1 = 75, Q2 = 150 and Q3 = 225
Figure A.2 shows the slope of the isoquant, where the isoquant has been drawn for the level
of output 75 units. Points F and G represent two points on the isoquant. When one moves
from point F to point G, the capital employed falls and the units of labor rise. The output lost
due to the fall in the number of units of capital employed is given by $\Delta K$ multiplied by
$MP_K$, where $\Delta K$ is the change in capital and $MP_K$ is the marginal product of capital. The
marginal product of capital is the number of additional units of output produced by
employing another unit of capital. To keep the level of output constant along the isoquant,
this loss in output must be made up by the addition to the output from employing more units
of labor. This addition to the output is similarly calculated as $\Delta L$ multiplied by $MP_L$.
Since the two changes must offset each other, $\Delta K \cdot MP_K = -\Delta L \cdot MP_L$, which gives
the slope of the isoquant: $\Delta K / \Delta L = -MP_L / MP_K$.
The ratio of marginal product of labor to the marginal product of capital is called the
marginal rate of technical substitution. It measures the rate at which a firm can substitute
capital in place of labor, keeping the level of output fixed.
Isocosts
A curve that shows the combinations of capital and labor that can be purchased at a given
total cost is called an isocost line. Like isoquants, isocost lines are infinite in number.
Figure A.3 shows three such lines. If the price of labor is $P_L$ and the price of capital is
$P_K$, the isocost line for total cost $TC$ is given by the equation:
$P_K \cdot K + P_L \cdot L = TC$
The lowest isocost line represents the combinations of capital and labor corresponding to
the lowest cost of production. Suppose the price of labor is Rs.1 and the price of capital is
Rs.1, figure A.3 shows three isocost lines corresponding to total cost of Rs.4, Rs.5 and Rs.6.
In figure A.4 the slope of the isocost line has been shown for total cost Rs.16, $P_L$ = Rs.1 and
$P_K$ = Rs.2. The isocost line shows the combinations of capital and labor that can be
purchased for a total cost of Rs.16. To draw the isocost line, the endpoints can be marked.
Point A of the isocost line is given by $TC / P_K = 16/2$, i.e. 8 units of capital. Similarly, point B
is given by $TC / P_L = 16/1$, i.e. 16 units of labor. The slope of the isocost line is given by:
$\Delta K / \Delta L = -\frac{TC / P_K}{TC / P_L} = -\frac{P_L}{P_K}$
This formula gives the slope of the above isocost line, which is -1/2.
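The endpoints and slope of this isocost line can be checked with a few lines of Python. This is a minimal sketch: the function and argument names are illustrative, capital is assumed to be on the vertical axis and labor on the horizontal axis, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def isocost_line(total_cost, price_labor, price_capital):
    """Return (capital intercept, labor intercept, slope) of an isocost line.

    Capital is taken to be on the vertical axis and labor on the horizontal axis.
    """
    max_capital = Fraction(total_cost, price_capital)  # point A: all money spent on capital
    max_labor = Fraction(total_cost, price_labor)      # point B: all money spent on labor
    slope = -Fraction(price_labor, price_capital)      # change in K per extra unit of L
    return max_capital, max_labor, slope


# Values from the example: total cost Rs.16, price of labor Rs.1, price of capital Rs.2.
point_a, point_b, slope = isocost_line(16, 1, 2)
print(point_a, point_b, slope)  # 8 16 -1/2
```

Changing the total cost shifts the line outward or inward without changing its slope, since the slope depends only on the ratio of the two input prices.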
Let’s draw another diagram with three isoquants showing different levels of output, i.e. 100,
150 and 200 units of good X. Figure A.6 shows the cost minimizing production technology
for these levels of output. We maintain that the price of labor $P_L$ = Rs.1 and the price of
capital $P_K$ = Rs.1. The minimum cost of producing 100 units of good X is represented by the
isocost line with total cost of Rs.4, that of 150 units by the isocost line with total cost of
Rs.5 and that of 200 units by the isocost line with total cost of Rs.6.
Figure A.6 : Cost minimizing production technology for Q1 = 100, Q2 = 150 & Q3 = 200
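At each cost-minimizing point the isoquant is tangent to an isocost line, so their slopes are
equal: $MP_L / MP_K = P_L / P_K$, where $MP_L$ and $MP_K$ are the marginal products and $P_L$ and
$P_K$ the prices of labor and capital. Rearranging gives:
$MP_L / P_L = MP_K / P_K$
This is the firm’s cost-minimizing condition. The left side is the output produced by the last
rupee spent on labor and the right side is the output produced by the last rupee spent on
capital. If these two measures are not equal, the firm can lower the cost by substituting
more labor for capital or vice versa. Figure A.7 shows the total cost curve that represents
the minimum cost to produce different levels of output.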
Summary of Appendix
A few important points discussed in the appendix that need to be reviewed are:
An isoquant represents infinite combinations of inputs that can be used to produce
the corresponding level of output.
The slope of the isoquant is given by: $\Delta K / \Delta L = -MP_L / MP_K$. The ratio of marginal product
of labor to the marginal product of capital is called the marginal rate of technical
substitution. It measures the rate at which a firm can substitute capital in place of
labor, keeping the level of output fixed.
A curve that shows several combinations of capital and labor to produce output at a
given cost, is called an isocost line. The slope of the isocost line is given by:
$\Delta K / \Delta L = -P_L / P_K$, where $P_L$ and $P_K$ are the prices of labor and capital.
The point of equilibrium where the slope of isoquant is equal to the slope of isocost
line shows the cost-minimizing technology of production for a given level of output.
The equilibrium condition is: $-MP_L / MP_K = -P_L / P_K$ or $MP_L / MP_K = P_L / P_K$. The same
condition can be written as: $MP_L / P_L = MP_K / P_K$.
Q.2 Give the equilibrium condition for the cost-minimizing production technique.
a. The different combinations of inputs that can be used to produce output at a given
total cost.
b. The budget of the consumer.
c. The different combinations of inputs to produce a given level of output.
d. The combinations of two goods that leave a consumer equally satisfied.
Answer 2. Higher the isoquant, higher is the level of output that it represents.
Answer 3. The cost-minimizing production technique is given by the point where the slopes
of the isocost line and the isoquant curve for the given level of output are the same. So, it is
the point where they are tangent.
Isoquant: A curve that shows several combinations of inputs to produce a given level of
output.
Marginal rate of technical substitution: The rate at which capital can be substituted in place
of labor by the firm, holding the level of output fixed.
Table of Contents
Learning Outcomes
Concept of Monopoly
Why monopoly arises?
Pricing and the Output Decision of the Monopoly
Welfare Cost of the Monopoly
Public Policy towards Monopolies
Price Discrimination
Comparing Monopoly and Competition
Conclusion
Summary
Exercises
References
Learning Outcome
We are going to analyse the monopoly market here. After reading this chapter one would be
able to answer questions like what a monopoly is, how it arises, what makes monopoly
different from competition and what the government does to control the problem of
monopoly. One would also be able to explain price discrimination and the inefficiency caused
by monopoly. In this chapter we try to explain the monopoly industry with the use of both
hypothetical data and diagrams, which should make the concepts clearer.
Monopoly
A monopoly is an industry with a single firm that is the sole seller of a product for which
there are no close substitutes and which is protected by barriers to entry. The key
characteristics of monopoly are: single seller, no close substitute, barriers to entry… It is a
type of imperfect competition. Monopolists are price makers.
that drug and sell that drug for that period of time in the market. It prevents any other
company from producing a similar drug. Similarly, copyrights give an author the exclusive
right to sell his book. They act as a stimulus for innovations, discoveries and
new original work. They are a reward for the extensive research through which new
knowledge or new products are developed.
Natural Monopoly: A natural monopoly arises when the entire market demand for a good or
service can be met by a single firm at a lower cost than by two or more firms. It arises
when there are economies of scale over the relevant range of output, usually in industries
where production requires a very high fixed cost and a relatively negligible marginal cost.
Oil and gas pipelines are an example: their construction requires a huge fixed cost, while
the cost of supplying an extra unit of oil is negligible. It is therefore better for a single
firm to produce the output at least cost, since with more firms the output per firm is lower
and the average cost is higher. Natural monopoly also depends on the size of the market: as
the market grows, the monopoly may give way to a competitive market. Figure 1 shows a
monopoly arising from economies of scale. When a firm’s ATC declines continually, the firm
is a natural monopoly, and it is better that one firm produces the entire output at the
least cost.
Quantity Price TR AR MR
0 18 0 - -
1 16 16 16 16
2 14 28 14 12
3 12 36 12 8
4 10 40 10 4
5 8 40 8 0
6 6 36 6 -4
In figure 2, the demand curve shows how quantity and price are related. The market
demand curve facing a monopoly is downward sloping because more can be sold only at a
lower price. The MR curve lies below the demand curve (AR curve) because, to increase the
quantity sold, the price must fall on all units. MR becomes negative when selling an extra
unit requires a price cut large enough that TR starts declining. The demand curve and the
MR curve start at the same point, indicating that MR and the price of the good are the same
for the first unit sold.
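The revenue columns of the table can be recomputed from the quantity and price columns alone. The short Python sketch below does this; the function name `revenue_table` is illustrative and not part of the original text.

```python
# Price schedule from the table above: quantity -> price
prices = {0: 18, 1: 16, 2: 14, 3: 12, 4: 10, 5: 8, 6: 6}

def revenue_table(prices):
    """Return rows of (quantity, price, TR, AR, MR) for a price schedule."""
    rows = []
    prev_tr = None
    for q in sorted(prices):
        p = prices[q]
        tr = p * q                      # total revenue = price x quantity
        ar = tr / q if q > 0 else None  # average revenue is undefined at q = 0
        mr = tr - prev_tr if prev_tr is not None else None  # change in TR
        rows.append((q, p, tr, ar, mr))
        prev_tr = tr
    return rows

rows = revenue_table(prices)
for row in rows:
    print(row)
```

AR equals the price at every positive quantity, while MR falls twice as fast as the price and turns negative at the sixth unit, which is the pattern the paragraph above describes.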
In the case of a monopoly firm, if MR>MC it raises profit by producing more, and if MC>MR
it raises profit by reducing output.
In figure 3, the quantity produced is on the horizontal axis and price and cost are on the
vertical axis. The profit maximizing output Q* is where the MC curve intersects the MR curve
from below, and the price P* is read off the demand curve at Q*. The monopoly sells Q* units
of output at price P*.
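As a rough illustration of the MR and MC rule, the sketch below applies it to the chapter's demand schedule. The constant marginal cost of 6 and the absence of fixed cost are assumptions made only for this example; they do not come from the text.

```python
# Demand schedule from the chapter's table: quantity -> price
prices = {0: 18, 1: 16, 2: 14, 3: 12, 4: 10, 5: 8, 6: 6}
MC = 6  # assumed constant marginal cost (hypothetical, not from the text)

def best_output(prices, mc):
    """Expand output one unit at a time while the unit's MR covers its MC."""
    q_star = 0
    for q in range(1, max(prices) + 1):
        mr = prices[q] * q - prices[q - 1] * (q - 1)  # extra revenue from unit q
        if mr < mc:
            break  # producing this unit would lower profit
        q_star = q
    return q_star, prices[q_star]

q_star, p_star = best_output(prices, MC)
profit = (p_star - MC) * q_star  # assumes no fixed cost
print(q_star, p_star, profit)
```

Under these assumptions the firm stops at three units: the fourth unit adds 4 to revenue but 6 to cost, so producing it would reduce profit.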
In a perfectly competitive market, any positive profits are competed away by the entry of
new firms, whereas in monopoly the barriers to entry protect profits from being eroded.
Important Note: A monopoly firm does not have a supply curve. In perfect competition, the
supply curve is the upward sloping part of the MC curve that lies above the AVC curve, and
how much a perfectly competitive firm produces depends on how it moves along its MC curve
as the price changes. The amount produced by a monopoly, however, depends on both its MC
curve and the demand curve it faces, because the monopolist chooses price and quantity
together.
pay. Producer’s surplus, on the other hand, is the difference between what producers receive
and the cost of producing the good.
In a perfectly competitive market, output is produced where P=MC, so the market leads to the
best allocation of resources. Thus, total surplus (TS) is as large as possible.
Deadweight loss
A benevolent social planner tries to maximize total surplus, and so chooses the level of
output where the demand curve (AR curve) intersects the MC curve. The demand curve
represents the value of the good to consumers and the marginal cost curve represents the
cost at the margin. For the social planner, the most efficient level of output is therefore
where P=MC. A monopoly does not produce the efficient level of output because it produces
where P > MR = MC. Under monopoly the price therefore exceeds the marginal cost of
production, and consumers buy less than the efficient quantity.
Deadweight loss measures the loss in efficiency caused when a monopoly produces less than
the efficient level of output. Compared with perfect competition, a monopoly produces less
and charges a higher price. The deadweight loss is the area of the triangle between the
demand curve and the MC curve, from the monopoly quantity to the efficient quantity. The
loss arises because the monopoly charges a price above MC, so consumers who are willing to
pay more than MC but less than the monopoly price do not buy the good. Thus monopoly pricing
power leads to deadweight loss: by charging a high price, the monopoly keeps some potential
consumers out of the market and reduces the size of total surplus.
In figure 4, quantity is on the x axis and price and cost are on the y axis. Pm and Qm are
the price charged and the quantity produced by the monopoly; Pc and Qc are those of the
competitive market. We observe that Pm>Pc and Qm<Qc, which shows that a monopoly firm
charges more and supplies less to the market. The downward sloping MR curve is the marginal
revenue curve of the monopoly, MC is the marginal cost curve and DD is the demand curve.
Where the MC curve intersects the demand curve (AR=P), we get the competitive market
equilibrium price and quantity. Where the MC curve intersects the MR curve from below, we
get the monopoly’s equilibrium quantity, with the price given by the demand curve at that
quantity. The triangle ABC represents the deadweight loss of the monopoly.
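The size of the deadweight loss triangle can be worked out for a simple linear case. In the sketch below, the demand curve P = 18 - 2Q reproduces the price column of the chapter's demand table, while the constant marginal cost of 6 is a hypothetical value chosen for illustration. Quantity is treated as continuous, so MR = 18 - 4Q here rather than the unit-by-unit MR of the table.

```python
# Linear demand P = a - b*Q; a = 18, b = 2 reproduces the chapter's price table.
a, b = 18.0, 2.0
mc = 6.0  # assumed constant marginal cost (hypothetical)

# Competitive outcome: P = MC on the demand curve.
q_c = (a - mc) / b
p_c = mc

# Monopoly outcome: MR = a - 2*b*Q equals MC; price is read off the demand curve.
q_m = (a - mc) / (2 * b)
p_m = a - b * q_m

# Deadweight loss: triangle between demand and MC, from q_m to q_c.
dwl = 0.5 * (p_m - mc) * (q_c - q_m)
print(q_c, p_c, q_m, p_m, dwl)
```

With these numbers the monopoly sells half the competitive quantity at twice the competitive price, consistent with Pm>Pc and Qm<Qc.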
To deal with this problem, policymakers can do one of two things: subsidize the monopolist,
or let the monopolist charge more than MC, for example through average cost (AC) pricing.
Both options lead to deadweight loss (DWL), so inefficiencies arise under either pricing
mechanism. In such a situation it may be better to let the monopoly keep some of the benefit
from lowering its costs, which acts as an incentive, but this requires prices above MC.
Each of these policies has its own drawbacks, so policymakers face a trade-off between the
problem of monopoly and the costs of its solutions.
Price Discrimination
In perfect competition there are many firms selling similar products, so a firm that tries
to charge more than the market price loses almost all of its customers. In a monopoly
market, only one firm sells a given product. When it raises its price, it loses some but not
all of its customers.
A monopolist often tries to sell different units of the same commodity at different prices;
this practice is known as price discrimination. Price discrimination is not possible in a
competitive market: a firm that charges more than the market price loses all its customers
to other firms, and one that charges less is no longer producing at its efficient level.
There are three types of price discrimination:
First degree price discrimination: This happens when the monopolist sells different units
of output to different people at different prices. The monopolist charges each consumer a
price equal to his maximum willingness to pay, that is, his reservation price. Consumer
surplus is therefore zero and producer surplus equals total surplus. This is also known as
perfect price discrimination. All the gains from trade are exhausted, so such a monopoly
creates no deadweight loss. An example is a doctor who charges patients different fees
depending on their financial condition, assuming that the patients live in the neighbourhood
and the doctor is familiar with their finances. Figure 6 compares a single-price monopoly
with a perfectly price-discriminating monopolist.
Third degree price discrimination: Here the monopolist sells output to different groups of
consumers at different prices, but every unit sold to a given group sells for the same
price. Customers with different elasticities of demand are charged differently: customers
with a low elasticity of demand are charged more, whereas customers with a high elasticity
of demand are charged less. In this way a monopoly maximizes its profits. An example is a
discount to students on laptops.
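The link between elasticity and price can be sketched with the standard markup rule P = MC / (1 - 1/|e|), which follows from setting MR = P(1 - 1/|e|) equal to MC. This rule is not stated in the text above, and the marginal cost and the two elasticities below are hypothetical values chosen only for illustration.

```python
def markup_price(mc, elasticity):
    """Price at which MR = MC when demand has the given elasticity (|e| > 1)."""
    e = abs(elasticity)
    if e <= 1:
        raise ValueError("profit maximization requires elastic demand (|e| > 1)")
    return mc / (1 - 1 / e)

mc = 6.0  # hypothetical marginal cost
groups = {"general buyers": 2.0, "students": 4.0}  # hypothetical elasticities
for name, e in groups.items():
    print(name, round(markup_price(mc, e), 2))
```

The group with the more elastic demand (students, in this illustration) is charged the lower price, which matches the student discount example.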
Price Discrimination: Not possible in perfect competition. Possible in monopoly, where it
is one of the firm’s behaviours.
Conclusion
In this chapter we discussed the monopoly market. We learned that monopoly arises because
of barriers to entry, and how a monopoly market behaves differently from a perfectly
competitive market. The profit maximization condition for the monopoly is MR=MC, with MC
cutting MR from below. In monopoly, P > MR = MC, because the monopolist must lower the price
on all units in order to sell one more. We also learned that monopoly produces an
inefficient level of output and thus causes deadweight loss, and that policymakers can
alleviate the problem through antitrust laws, regulation and public ownership. Through price
discrimination, monopolists can themselves eliminate DWL or at least reduce it.
The real world lies between the two extremes of perfect competition and monopoly. Different
degrees of price discrimination are practised in the real world.
Summary
1) Monopoly is a firm that is the only player in the market. It arises because of barriers
to entry. Barriers to entry can be because of any of the following:
a) Monopolies Created by Government
b) Monopoly Resources
c) Patents and copyright laws
d) Natural Monopolies
4) Public Policies towards monopoly: Antitrust laws, Regulation and Public Ownership.
5) Monopoly behaviour and price discrimination: The practice of selling the same good
at different prices to different consumers is called price discrimination. There are
three types of price discrimination. Examples include movie tickets, discount coupons
and airfare pricing.
3) Why does a monopoly not have a supply curve? How does a monopoly determine the output
it produces?
4) What are the different public policies to handle monopoly behaviour? Are they
efficient?
5) Why do firms price discriminate? What are the types of price discrimination? Give an
example of each.
a) Upward sloping
b) Horizontal to the x-axis
c) Vertical to the y-axis
d) Downward sloping
a) Increase the price of the last unit sold but maintain the price of all previous
units sold constant.
b) Decrease the price of the last unit sold but maintains the price of all other units
sold constant.
c) Decrease the price on all units sold.
d) Increase the price of all units sold.
Answer 2. A monopoly has a downward sloping demand curve, because to sell more of a good
it has to lower the price.
Answer 3. The pure monopoly’s demand curve and the market demand curve are one and the
same. As the monopoly is the only seller in the market, its demand curve is the market
demand curve.
Answer 4. For a monopoly firm, price must exceed marginal revenue. To sell an extra unit,
the monopolist has to reduce the price not only on the extra unit but also on all units
previously sold, so TR rises by less than the price of the extra unit.
Answer 5. A perfectly price discriminating monopoly charges every consumer a price equal to
his reservation price, or maximum willingness to pay, which makes consumer surplus equal
to zero.
Answer 6. To sell an extra unit, monopolists have to reduce the price not only on the extra
unit sold but also on all previously sold units, which lowers the revenue earned on those
units.
Answer 2. Options a), b) and c) are incorrect: to sell more, a monopoly has to reduce its
price, so its demand curve must be downward sloping. It cannot be upward sloping,
horizontal or vertical.
Answer 3. Option a) is incorrect because the price a monopoly charges is constrained by the
willingness of consumers to pay for the good it produces. Option b) is wrong as there is no
entry or exit in purely monopolistic markets. Option d) is also incorrect because pure
monopolies do not worry about potential entrants, since entry is impossible.
Answer 4. There exists a definite relation between price and marginal revenue, so option d)
is incorrect. If price were equal to marginal revenue, more could be sold without reducing
the price, which is not the case with monopoly. So options a) and c) are also wrong.
Answer 6. Option a) is incorrect as the law of demand states that quantity demanded
decreases as price increases. Option b) is incorrect as the price of all units of output sold must
decrease. Option d) is also incorrect as the law of demand states that quantity demanded
decreases as price increases.
Reference
1) Mankiw, N.G. “Principles of Economics”, 4th Ed., pp. 240–267.
2) Varian, Hal R. “Intermediate Microeconomics”, 7th Ed., pp. 445–454.
Table of Contents
Learning Outcomes
Concept of Factors of Production
The Demand for Labour
The Production Function and the Marginal Productivity of Labour
The Value of Marginal Product and The demand for labour
What causes the Labour Demand curve to shift?
The Supply of labour
What causes the Labour Supply curve to shift?
Equilibrium in the labour market
Shifts in labour demand and labour supply
The other Factors of Production
Conclusion
Summary
Exercises
References
Learning Outcome
In this chapter we learn how equilibrium is determined in the factor markets, looking at
the economy from the supply side. What are the factors of production? How does a
competitive firm decide how much of a factor to buy? What causes their demand and supply
to change, especially in the labour market? Why does the equilibrium wage equal the value
of the marginal product of labour? How are factors of production paid?
We keep two assumptions about the firm in mind. First, it is competitive in both output and
input markets. Second, it is profit maximizing.
Figure 1 : The supply & demand in the goods & input market
Labour (L)  Output (Q)  MPL  VMPL (P x MPL)  Wage (W)  Marginal profit (VMPL - W)
0 0 - - - -
1 10 10 100 50 50
2 18 8 80 50 30
3 24 6 60 50 10
4 28 4 40 50 -10
5 30 2 20 50 -30
Figure 2 graphs the production function, with labour employed on the x-axis and quantity
produced on the y-axis. The production function describes how inputs are transformed into
output with a given technique. As labour input increases, the MPL decreases: each
additional worker contributes less to output than the one before. For this reason, the
production function becomes flatter as the number of workers increases.
To decide how much labour to hire, the firm has to work out how much each worker
contributes to its revenue. The firm is concerned only with profit, which is total revenue
minus total cost (here, total wages paid). The worker’s contribution therefore has to be
converted into a value. The value of the marginal product of any input is the market price
of the output multiplied by the marginal product of that input; for labour, VMPL = P x MPL.
This is also known as the marginal revenue product. In table 1 we assume that the price of
one unit of output is $10 in a competitive market and that the wage is $50. It makes sense
for the firm to hire labour up to the point where the value created by the last worker
equals the wage rate. In our example the third worker produces $60, which is more than the
wage, while the fourth worker produces only $40, which is less. Thus a competitive, profit
maximizing firm will demand only 3 workers, i.e. it hires up to the point where VMPL = W
(with whole workers, up to the last worker whose VMPL is at least W).
We graph the VMPL in figure 3. It is a downward sloping curve because the MPL diminishes as
more labour is employed. As we take the market wage as given, the wage is a horizontal line
at that rate. The firm demands the quantity of labour at which the two lines intersect,
where VMPL = W.
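The hiring rule can be checked against the figures in table 1. The sketch below uses the output column of the table, the $10 output price and the $50 wage given in the text; the function name `labour_demand` is illustrative.

```python
PRICE = 10  # price of one unit of output, from the text
WAGE = 50   # market wage, from the text
output = [0, 10, 18, 24, 28, 30]  # total output for 0..5 workers (table 1)

def labour_demand(output, price, wage):
    """Hire workers one at a time while VMPL = price * MPL covers the wage."""
    hired = 0
    for n in range(1, len(output)):
        mpl = output[n] - output[n - 1]  # marginal product of the n-th worker
        vmpl = price * mpl               # value of that marginal product
        if vmpl < wage:
            break  # this worker would add less revenue than the wage
        hired = n
    return hired

print(labour_demand(output, PRICE, WAGE))
```

Calling `labour_demand(output, 20, WAGE)` with a hypothetical output price of $20 returns 4 instead of 3, which illustrates how a higher output price raises the firm's demand for labour at a given wage.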
a) The output price: A change in the output price changes the value of the marginal
product and thus shifts the labour demand curve. If the price rises, the firm demands more
labour, and vice versa.
c) The supply of other factors: A change in the supply of one factor affects the marginal
product of the other factors. For instance, if we keep adding labour to a given piece of
land, the productivity of each additional worker declines.
The supply of labour is decided by households and is a function of the wage rate offered in
the competitive market. There is always a trade-off between leisure and work: the labour
supply curve shows how households’ labour-leisure decision responds to the wage rate. If we
spend more hours working, less time is available for leisure. To enjoy one hour of leisure,
we must forgo one hour’s wage, which is why the wage is the opportunity cost of leisure.
The labour supply curve is upward sloping because the higher the wage rate, the more labour
a household is willing to supply. As the wage rate rises, the opportunity cost of leisure
rises with it. When the wage increases, the substitution effect encourages the worker to
work more and earn more, while the income effect leads the worker to work less and enjoy
more leisure. At a very high wage rate, however, the labour supply curve bends backward,
reflecting that the household prefers more leisure to more work. At this point the income
effect outweighs the substitution effect. Because leisure is a normal good, the income
effect makes the worker take more leisure, whereas the substitution effect makes him work
more.
much labour as is profitable for them at the equilibrium wage, i.e. hire until VMPL=W.
The equilibrium wage and employment change with changes in the supply of and demand for
labour.
Now consider a change in technology: a labour augmenting technology is introduced. This
raises the MPL, and therefore the value of the marginal product of labour, so labour demand
rises. When labour demand increases, the equilibrium wage rises along with equilibrium
employment. As the VMPL rises, firms wish to hire more workers and offer a higher wage
rate, because hiring more labour is now profitable.
In figure 6, as labour demand rises due to the change in technology, the demand curve
shifts from DD0 to DD1. At the initial wage rate W0, the demand for labour exceeds the
supply of labour. To attract more workers, firms have to offer the higher wage rate W1,
since additional workers are willing to join only at a higher wage. This rise in the wage
reflects the rise in the MPL, which raises the VMPL.
Labour demand and labour supply together determine the equilibrium wage and equilibrium
employment. Any shift in labour demand and/or labour supply changes the equilibrium level
of employment and the wage. At the same time, profit maximization by the firms that demand
labour ensures that the equilibrium wage always equals the VMPL.
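The effect of a demand shift on the equilibrium can be sketched with straight-line curves. The linear forms and every parameter value below are hypothetical and chosen only to illustrate the direction of the change; they are not taken from the text or its figures.

```python
def labour_equilibrium(a, b, c, d):
    """Solve demand W = a - b*L against supply W = c + d*L for (L, W)."""
    employment = (a - c) / (b + d)
    wage = c + d * employment
    return employment, wage

# Hypothetical parameters, not taken from the text.
before = labour_equilibrium(a=100, b=2, c=20, d=2)
after = labour_equilibrium(a=120, b=2, c=20, d=2)  # labour demand shifts out
print(before, after)
```

An outward shift of labour demand (a larger intercept `a`) raises both equilibrium employment and the equilibrium wage, which is the movement from W0 to W1 described for the technology change.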
production. Capital means the accumulated goods produced in the past that are being used
in the production of new goods and services.
All factors of production are paid the value of their marginal product. The MP of a factor
in turn depends on the quantity of that factor available relative to the other factors.
When the quantity of one factor keeps increasing while the other factors remain constant,
or rise by less, that factor faces a diminishing MP. A factor in abundant supply has a low
MP and thus a low price, and a scarce factor has a high MP and thus a high price. So
whenever the supply of a factor falls, its price rises.
Production of any good usually depends on some combination of the factors of production,
which are used together in the production process. So a change in the supply of any factor
affects the marginal products (MP) of the other factors. Thus a change in the supply of any
factor changes the earnings of all the other factors.
For example, think of an industry where the demand for labour depends on the number of
units of capital available. An increase in capital in the industry means a relative
abundance of capital and a relative scarcity of labour, which lowers the MP of capital and
raises the MP of labour. Thus the rental price of capital falls and the wage rate rises.
Conclusion
In this chapter we learned how factors of production are paid in the production process.
The quantity of each factor employed is determined by the forces of demand and supply.
Demand, in turn, depends on marginal productivity. At equilibrium, each factor of
production earns the value of its marginal product. A change in the demand for and/or
supply of any factor changes its own equilibrium payment as well as the payments to other
factors.
Summary
1) Factors of production are the inputs that are used in the production of goods and
services.
2) The labour market equilibrium is determined by the interaction of demand and supply.
The demand for labour is determined by its marginal productivity. Workers are paid
wages in return for their services. As the amount of labour employed increases, MPL
diminishes. In a competitive market, a profit maximizing firm demands labour up to
the point where VMPL = P x MPL = W.
3) The labour demand curve shifts due to a change in any of the following:
a) the output price
b) Technological change
c) The supply of other Factors
4) The labour supply decision is taken by the household, which has to make a trade-off
between leisure and work. The labour supply curve is upward sloping up to some wage
level, but beyond that level it bends backward.
5) The labour supply curve shifts due to a change in any of the following:
a) Changes in Tastes
b) Changes in Alternative Opportunities
c) Immigration
7) Equilibrium in the markets for other factors is also attained at the point where the
value of the factor’s marginal product equals its rental price. These other factors of
production are used along with labour in the production of goods and services.
3) What are the factors that shift the demand and supply curves of labour?
5) How does a change in the supply of one factor affect the return to the other
factors?
a) Falls
b) Rises
c) Remain constant
d) may rise or fall
a. VMPL > W.
b. VMPL < W.
c. MPL = W.
d. VMPL = W.
5) When the supply of one factor increases, what impact does it have on the rental
price of other factors in the production process?
Answer 2. As labour input rises, the MPL falls. That is, as more labour is employed, each
additional worker contributes less and less to the production of output.
Answer 3. At the equilibrium point, workers receive the value of their contribution to the
production of goods and services, and firms employ as much labour as is profitable at the
equilibrium wage, i.e. they hire until VMPL=W.
Answer 4. At a very high wage rate the labour supply curve bends backward, reflecting that
the household prefers more leisure to more work. At this point the income effect outweighs
the substitution effect.
Answer 5. Production of any good usually depends on some combination of the factors of
production. So an increase in the supply of any factor of production increases the return
to the other factors.
Answer 2. Options b), c) and d) are incorrect: as more labour is employed, each additional
worker contributes less and less to the production of output, so MP falls.
Answer 3. A firm employs more labour up to the point where the worker’s marginal
contribution, valued at the market price, equals the wage. The profit maximizing
equilibrium is therefore where VMPL is exactly equal to W, so a), b) and c) are wrong.
Answer 5. Options b), c) and d) are wrong. A rise in the supply of one factor causes the
marginal product of that factor to fall, and there is a direct link between the rental
price and productivity. The rental price of the other factor increases because it becomes
relatively scarce.
Glossary
Derived demand: The demand for a factor of production, which arises from the demand for
the goods and services the factor is used to produce.
Income Effect for wages: With a rise in the wage rate, the worker can afford more leisure,
and so spends more time in leisure and less time at work.
Substitution Effect for wages: With a rise in the wage rate, the opportunity cost of
leisure rises, which makes the worker spend more hours working and less time in leisure.
Reference
1) Mankiw, N.G. “Principles of Economics”, 4th Ed., pp. 334–348.
Table of Contents
1.Learning Outcomes
2.Introduction
3. Evolution of the Subject
4. Positive Economics and Normative Economics - Methodology
5. Art or Science
6. Scope of Economics - Related Subjects
7. Models and Hypotheses
8. Macroeconomic Variables
9. Laws of Economics
10. Market, Equilibrium, Demand, Supply
11. Markets in Macro-economics
12. Concept of Aggregate Demand and Supply
13. Closed Economy and Open Economy
14. Partial and General Equilibrium Analysis
15. Static and Dynamic Equilibrium
16. Short-Run and Long-Run Equilibrium
17. Nobel Prize in Economics
18. Summary
19. Exercises
20. Glossary
21. References
22. Activity
1.Learning Outcomes
After you have read this chapter you should be able to define Micro-
Economics, Macro-Economics, Market, Demand, Supply, Equilibrium, Partial
and General Equilibrium, Static and Dynamic Equilibrium, Long Run and
Short Run, understand the central problems of an economy, identify
variables, constants and parameters, real and nominal variables,
differentiate Micro-Economics from Macro-Economics, the scope of the
subject of Economics, apply the knowledge of basic Economics
Value Addition:
Focus of the Section
Topic Economics
This section is to make you aware of what Economics is.
The purpose of this section is to make you familiar with the various
Definitions of Economics, the Evolution of the subject, its Scope,
Methodology, Tools and Basic Concepts.
2.Introduction
The words Macro and Micro come from the Greek words makros (long or large) and
mikros (small).
As for Economics, there are two basic definitions.
Economics
Etymologically, the word Economics derives from the Greek words oikos (house) and
nomos (management).
But since the second half of the 17th century, the word Economics has come to be
used in the wider context of a whole country or nation rather than the household.
Adam Smith is known as the ‘father’ of the subject of Economics. His book An
Inquiry into the Nature and Causes of the Wealth of Nations, first published in 1776,
is the first-ever treatise on Economics. Smith’s concern was about nations or
countries, that is, it was a Macro-type concern, although the term Macro was not in
use then.
Later T.R. Malthus, David Ricardo and J.S. Mill wrote important treatises on the
subject, taking the same overall, long-run perspective and making sweeping
generalizations. They are known as Classical economists and have also been
described as Magnificent Economists because they dealt with big issues on a broad
background. One of the tenets of the Classical economists was that in the long run
there is no unemployment in the economy. Jean-Baptiste Say (1767–1832), a French
economist, stated that “products are paid with products”, which came to be popularly
interpreted as “Supply creates its own Demand.” Given sufficient time, imbalances in
the economy will be smoothed out, and people, or governments, need not worry
about them. This was the basic standpoint of the Classical economists and is known
as “Say’s Law”.
While the Classical approach prevailed throughout the 19th century, a Neo-Classical
approach took shape towards the end of the 19th century. Economists began to study
economic issues on a more specific and individual level. The approach concentrated
on how the price and quantity of specific goods (and services) were determined in
the market through a rational balancing of their ‘marginal’ costs and benefits
(utilities and productivities). Foremost among these Neo-Classical economists (also
described as ‘marginalists’) were Menger, Jevons and Alfred Marshall. It is their
work that constitutes the foundation of Micro-economics, where the individual
consumer or producer, not the entire national entity, is the unit concerned.
Although the Classical economists were concerned with the nation or the country
as a whole, and were therefore more Macro than Micro in approach, Macro-economics
as a subject developed only after the Great Depression. On 24 October 1929, the
New York Stock Exchange (on Wall Street) crashed. Many rich and successful people
lost everything, and some took their own lives in desperation. Widespread
unemployment followed the closing down of production units. Both employers and
employees felt the impact. Not just America and Europe but their colonies too
suffered. It was a global crisis.
It was then that John Maynard Keynes came up with his analysis of the
phenomenon in terms of Aggregate Demand falling short of Aggregate Supply, and
emphasized the role of the Government of a country in stepping up its own
expenditure in order to correct that shortfall or gap.
His analysis laid the foundation of Macro-Economics. Later John Hicks, Milton
Friedman, James Tobin, A.W. Phillips, Edmund Phelps, Robert Lucas, T.J. Sargent,
Robert Barro and others contributed to the subject of Macro-Economics, bringing in
the roles of Money and Expectations. Lucas, Sargent and Barro are often called the
New Classical economists.
To sum up in the words of Paul A. Samuelson, “Macroeconomics deals with the big
picture – with the macro aggregates of income, employment, and price levels. But do
not think that microeconomics deals with unimportant details. After all, the big
picture is made up of its parts.” (Economics, 7th edn, p 362). So he concludes that
there is no essential opposition between the two.
Traditionally and in most universities, a course in Micro-Economics is taught prior to
one in Macro-Economics.
5. Art or Science?
A Social Science
Even if we use the term science to describe Economics, we must remember that it is
a Social Science. It does not study individuals in isolation, each doing everything
alone. It studies individuals as members of a society, nation or Economy.
An economy is the same as a country or society, but considered only in its economic
aspects. Every society or country has numerous people engaged in activities of all
sorts. Some work in the fields, some in factories, and yet others in offices.
Some perform agricultural activities, some industrial, and some provide services.
Those who are in agriculture need industrial products and, say, banking services.
Those who are factory workers need foodstuff and some kind of transport services.
The people engaged in the services sector need both food and clothing. Thus all
three sectors, with their separate kinds of activities, need to have relations with
one another. All the people of an economy need to act as well as interact. This
they do by exchanging the products of their various activities in various markets.
The epithet ‘Social’ covers this aspect of the subject of Economics.
However, for analytical purposes, Economics sometimes uses the concept of a
Robinson Crusoe Economy, an economy consisting of a single person performing
all the economic activities by himself. Robinson Crusoe is the title of a novel by
Daniel Defoe, published in 1719 and thought to be based on the life of Alexander
Selkirk, who was marooned alone on an island for more than four years; in the novel,
Crusoe spends 28 years on his island. A Robinson Crusoe Economy is thus a
theoretical concept where the economy has a single member.
Economics has a wide scope and has connections with various subjects.
Mathematics and Statistics are necessary for the study of Economics. Mathematics
helps economists to analyze economic realities and to derive conclusions from
them. Statistics aids this process by systematizing the economic realities as data and
drawing inferences from them with accepted statistical tools. In fact, the application of Statistics
to Economics has led to the development of a relatively new subject: Econometrics.
It helps in empirical study and in making projections both into the past and the future.
Without a sound mathematical base, it is next to impossible to cope with academic
Economics. However, to have a general awareness of the economic occurrences of
the world, basic intelligence will do. To quote Samuelson, “Although every
introductory textbook must contain geometrical diagrams, knowledge of
mathematics itself is needed only for the higher reaches of economic theory. Logical
reasoning is the key to success in the mastery of basic economic principles, and
shrewd weighing of empirical evidence is the key to success in mastery of economic
applications.” (Economics, Ch 1, p 5)
Actually, the earlier term for Economics was Political Economy. Several universities
still have a common department for Politics and Economics. Political Science is a
useful subject to supplement a course in Economics. History is also a subject that
has a close connection with Economics. Economic History is a compulsory paper in
every course in Economics, undergraduate as well as post-graduate. Several
universities offer a post-graduate course in Economic Geography.
In recent times several subjects or courses have emerged from Economics, e.g.,
Commerce, Business Economics, Business Administration, Business Management.
While based on the fundamentals of Economics, they have their own distinctive
course contents. But papers on both Micro-Economics and Macro-Economics figure in
all of them.
Economics has to deal with a complex mass of realities. So it sometimes puts them
into a simplified framework or Model. A Model is a theoretical construct that
represents economic realities by a set of inter-related variables. These relationships
can be logical or quantitative. But putting them in a Model helps economists to
analyze realities better and even make predictions about the future.
Economists often posit or propose explanations for economic phenomena. These are
known as Hypotheses. A hypothesis is not a theory. Only if a Hypothesis is verified
or found to be true, can we call it a Theory. To be verified or falsified, that is, tested,
a hypothesis has to be framed in a certain way. Such a hypothesis is called a
Scientific hypothesis. Sometimes economists have no alternative but to take a
certain hypothesis to be true, and proceed on the basis of it. Such a hypothesis is
called a Working hypothesis. Statistics and Econometrics are the tools used in
verifying a hypothesis.
Economics is a complex subject, rooted in reality but often analyzed through
abstract thinking and mathematical methods.
As symbols of that reality, Economics makes use of the mathematical concepts of
Variables, Constants and Parameters.
Variables are entities that take different values. They are usually symbolized by x, y,
z, and can take positive or negative values ranging from minus infinity to plus infinity.
Constants are entities that, for one particular analytical exercise, take one
particular value. They are usually symbolized by a, b, c, or alpha, beta, gamma.
Again, a constant can take any value between minus infinity and plus infinity, but only one
such value during a particular analysis.
Parameters are entities that can be assigned different values for different variants of
an exercise but, in any one particular variant, can take only one such value.
Variables can be dependent or independent.
An Independent variable takes on values by itself.
A Dependent variable takes on values according to the values of the Independent
variable. This relation of dependence between the Independent and the Dependent
variable(s) is known as a functional relationship, or simply, a Function. For example,
C = f(Y) is the Consumption Function, which says that consumption C depends upon
National Income Y.
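As a minimal sketch of a functional relationship, the snippet below assumes a linear form C = a + bY for the Consumption Function. The text only states that C depends on Y; the linear form, the names `autonomous` and `mpc`, and their values are illustrative assumptions. Here Y plays the role of the independent variable, C the dependent variable, and a and b behave as parameters that stay fixed within one variant of the exercise.

```python
def consumption(national_income, autonomous=100.0, mpc=0.8):
    """Hypothetical linear consumption function C = a + b*Y.

    autonomous (a) and mpc (b) are illustrative parameters, not values
    taken from the text.
    """
    return autonomous + mpc * national_income


# Y is the independent variable; C is the dependent variable.
for y in (0, 500, 1000):
    print(y, consumption(y))
```

Changing `autonomous` or `mpc` gives a different variant of the same exercise, which is how the text distinguishes a parameter from a variable.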
In Economics, a Function may involve more than one variable. Usually, several
variables are interlinked. To examine whether any two have a causal (cause-effect)
relationship, it may be necessary to rule out others that complicate the issue or get
in the way of analyzing it. This is done under an assumption known as the ceteris
paribus assumption.
8. Macroeconomic Variables
Important variables in Macro-economics are National Income, Disposable Income,
Consumption, Saving, etc. Sometimes the ratio of two variables may be regarded as
a variable in itself, e.g., Consumption/National Income is a separate variable, viz.,
the Average Propensity to Consume. Variables may be Real or Nominal.
Nominal variables are those expressed in terms of money, usually in terms of
current prices. Real variables are those expressed in real terms, or constant prices,
which means that they are 'deflated', or corrected for possible fluctuations in the
price level. The Deflator used is usually the General Price Level.
A Stock variable measures the quantity of the variable at a particular point of time,
e.g., the capital a businessman holds on a given date. A Flow variable
measures the quantity of a variable over a period of time, e.g., the investment he
has made in his business in a given year or the profit he has made in the course of it.
National Income (usually symbolized by Y) is the sum total of the money measures
of goods and services produced in a country during a year. It is also the Net
National Product at Factor Cost, i.e., the sum total of income generated by an
economy during a year. It is a flow variable.
Gross Domestic Product (GDP) is the sum total of the money measures of the
goods and services produced during a year within the national boundaries of that
country.
Personal Income is that part of the National income which is actually received by
the persons or households of the economy. Corporate Income Taxes, Undistributed
Corporate Profits, Savings of Non-Departmental Enterprises, Income from Property
and Entrepreneurship of Government Administrative Departments and Social
Security contributions do not go to persons or households. They have to be deducted
from the National Income to get at its personal component.
On the other hand, Transfer Payments (pensions, unemployment doles), though they
are not earned income, add to the amount that individuals and households have
to spend. They have to be added to the National Income so as to get at its personal
component.
Thus,
Personal Income = National Income - Corporate Income Taxes - Undistributed
Corporate Profits - Savings of Non-Departmental Enterprises - Income from Property
and Entrepreneurship of Government Administrative Departments - Social Security
Contributions + Transfer Payments
The interest rate is the rate at which loanable funds are lent out in the economy.
In any actual economy there are several rates of interest prevailing simultaneously.
But for theoretical purposes, we take it that a uniform interest rate (r or i) prevails.
The Nominal Interest Rate is adjusted for changes in the Price Level to get the Real Interest Rate.
Government Expenditure (G) refers to all purchases made by all governmental bodies
in an economy, e.g., on provision of infrastructure, public transport, administration,
defence, space research. G is generally taken to be autonomous.
Net Exports (NX) refer to the value of goods and services produced in a country and exported
abroad (X) after the deduction of the value of goods and services produced abroad
but imported (M) by the country. That is, NX = X - M.
Y = C + I + G + NX.
The General Price Level can be measured by the GDP Deflator:
$$\text{GDP Deflator} = \frac{\text{Nominal GDP}}{\text{Real GDP}}$$
The rate at which the General Price Level increases is known as the Inflation rate.
Thus
$$\text{Inflation rate} = \frac{P_t - P_{t-1}}{P_{t-1}}$$
where t refers to the present time-period and (t-1) the previous one.
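The two ratios above can be sketched in a few lines of Python. The GDP figures are hypothetical and chosen only for illustration, and using the GDP Deflator as the measure of the General Price Level P is an assumption of this sketch.

```python
def gdp_deflator(nominal_gdp, real_gdp):
    """GDP Deflator = Nominal GDP / Real GDP."""
    return nominal_gdp / real_gdp


def inflation_rate(p_now, p_prev):
    """Inflation rate = (P_t - P_{t-1}) / P_{t-1}."""
    return (p_now - p_prev) / p_prev


# Hypothetical figures for two consecutive years (not data from the text).
p_prev = gdp_deflator(nominal_gdp=1000.0, real_gdp=1000.0)  # price level in year t-1
p_now = gdp_deflator(nominal_gdp=1210.0, real_gdp=1100.0)   # price level in year t

print(round(inflation_rate(p_now, p_prev) * 100, 2), "per cent")
```

In this made-up example nominal GDP grows faster than real GDP, so the deflator rises and the inflation rate is positive.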
9. Laws of Economics
The Classical and Neoclassical economists often used the term 'law' to describe the
tendencies that they observed in the functioning of the economy or society. The Law of
Demand and the Law of Diminishing Returns in Micro-economics, and Say's Law and
Okun's Law in Macro-economics, are just a few examples. In no sense are these
binding or enforceable or universal laws.
However, law in the usual sense of the term does have a close connection with
Economics. It is a basic idea of neo-Classical Economics that, for the smooth
functioning of the market, there must be law and order in the country. The law of the
land influences its economic performance.
The word Market comes from the Latin mercatus, which meant trading, or buying and selling
at an appointed time or place. A market is not necessarily a marketplace. It is a
conjunction or coming-together of buyers and sellers. The haat, bazaar and mandi,
the shop and the mall are markets. But online or telephonic sale and purchase,
which are quite common these days, are also market transactions.
The distinguishing feature of the market is that market transactions are exchanges,
usually performed through the medium of money. The seller (who is sometimes,
though not always, the producer) of certain commodities/services brings them to the
market and offers certain quantities of them at a certain price. He
thus supplies them in the market. The (prospective) buyer comes to the market
wanting to get certain commodities/services at a certain price. He thus demands
them in the market. If the demand of the buyer and the supply of the seller match at
a certain configuration of price and quantity, the transaction takes place. If not, it
does not.
The transaction is thus both a sale and a purchase. It is a sale from the point of view
of the Seller (producer), that is, from the Supply side. It is a purchase from the point
of view of the Buyer, that is, the Demand side.
The transaction configuration is known as the Equilibrium.
In Latin, aequus means equal and libra means scales or balance. (That is why, in the
Zodiac, the sign Libra is shown by a pair of scales.) When the two pans on the two
sides of a pair of scales hang steady at the same level, there is aequilibrium, or,
in English, Equilibrium.
The word Demand is from Latin demandare which means to claim or commission.
Supply is from Latin supplere, to fill up or complete.
In the context of Economics it was Adam Smith in 1776 who first used them as
corresponding concepts. Marshall has compared them to the two blades of a pair of
scissors. Just as the scissors cannot work without either of the two blades, Market
Equilibrium cannot be determined without reference to both Demand and Supply.
Aggregate Demand and Aggregate Supply are two important concepts of Macro-
economics.
Aggregate Demand refers to the overall or national demand for goods and
services. It comes not from individual persons, households or even groups, but from
all the citizens taken together.
Similarly, Aggregate Supply refers to the overall or national supply of goods and
services, or the national income it generates.
When Aggregate Demand equals Aggregate Supply, there is equilibrium in the
Macro-economic sense, in the overall or national market for goods and services, or
simply, the Goods Market. When Aggregate Demand falls short of Aggregate Supply,
there is Depression. When Aggregate Supply falls short of Aggregate Demand, there
is Inflation. Inflation, Depression and Unemployment are fundamental concerns of
Macro-economics.
Macro-economic Theory distinguishes between the Closed Economy and the Open
Economy.
A Closed Economy has no (or negligible) interactions with the rest of the world. All
production, consumption and market exchange is internal and in the same domestic
currency. There are no Exports or Imports, and Net Exports are zero. There is no
Foreign Investment in other countries or by foreign countries, and Net Foreign
Investment is zero as well. It is in this Closed Economy framework or model that the
Goods Market and Money Market are analyzed to yield an overall equilibrium.
An Open Economy has transactions with the rest of the world. It exports and
imports, borrows and lends. It invests abroad and other countries invest in it. All this
involves the use of at least two, if not more, currencies. Net Exports, Net Foreign
Investment, and the Exchange Rate are thus important elements in Macro-economics.
In Economics, a Function may involve more than one variable. Usually, several
variables are interlinked. To examine whether any two have a causal (cause-effect)
relationship, it may be necessary to rule out others that complicate the issue or get
in the way of analyzing it. This is done by making an assumption known as
ceteris paribus, which means 'other things being the same'. It qualifies or conditions
a causal relationship between an independent variable and the dependent variable
that depends on it or functions according to it. In Latin, ceteris means 'other things
or the rest' and paribus means 'at par or equal'.
Partial Equilibrium Analysis is a study of economic occurrences where a causal
relationship is studied between two variables, keeping other related variables
constant or fixed under the assumption of ceteris paribus, 'other things being the
same'. However, it lets only one market (at a time) be in equilibrium and may not
capture the complexities of the real world. General Equilibrium Analysis lets the
inter-dependence of the various variables play itself out. Prices of commodities
are determined simultaneously and mutually. All markets are simultaneously in
equilibrium. Macro-economics, as of now, uses Partial Equilibrium analysis
rather than General Equilibrium.
A run is a length of time, not exactly specified. If all factors of production can be
varied during a length of time, it is called the Long Run. If some factors can be
varied but others cannot, i.e., are fixed, it is the Short Run. A Short Run
Equilibrium, one that holds in the Short Run, is achieved in Macro-economics if
Aggregate Demand is equal to Aggregate Supply. But if there is a gap, there is
disequilibrium, leading to unemployment, depression or inflation. The Classical
economists held that in the Long Run the disequilibrium situation will correct itself.
Wages and prices will adjust, and this flexible character will ensure
equilibrium in the Long Run.
It was precisely against this attitude that Keynes wrote: “The long run is a
misleading guide to current affairs. In the long run we are all dead. Economists set
themselves too easy, too useless a task if in tempestuous seasons they can only tell
us that when the storm is long past the ocean is flat again” (A Tract on Monetary Reform,
1923, Ch 3).
Wages are not so flexible in the Short Run, and this 'sticky' character of wages
may stand in the way of restoring equilibrium in the Short Run. Keynesian analysis is
Short Run analysis.
However, modern Macro-economics also includes the
study of inflation and output in the Long Run, using the Dynamic Aggregate Demand
Curve and the Dynamic Aggregate Supply Curve.
The highest recognition for economists is the “Sveriges Riksbank Prize in Economic
Sciences in Memory of Alfred Nobel”, first awarded in 1969. Among the important
Macro-economists who have received it are
Milton Friedman in 1976, James Tobin in 1981, Robert Lucas in 1995,
Robert A. Mundell in 1999 and E. S. Phelps in 2006.
18. Summary
Economics studies human choice among alternative uses of scarce
resources.
It is a Social Science and has a wide scope. It aids the understanding of
the central problems of an economy.
Demand and Supply of goods and services determine their
Equilibrium Price and Quantity in the Market.
Aggregate Demand and Aggregate Supply determine macro-economic
Equilibrium.
Markets can be of various types and forms.
Equilibrium can be Partial and General, Long-Run and Short-Run,
Dynamic and Static.
19. Exercises
Short Questions
Long Questions
20. Glossary
Variables
Constants
Hypothesis
Model
Demand supply
Market
Equilibrium
Static Equilibrium
Dynamic Equilibrium
Long Run
Short run
General equilibrium
Partial Equilibrium
Consumption
National Income
Gross Domestic Product
Disposable income
Money
Consumption
Savings
Net Exports
Interest rate
21. References
22. Activity
From newspaper and official statistics, find out the National Income, Inflation Rate
and the Unemployment Rate for the previous two years.
Talk to some householders as well as factory workers about what they feel about
Inflation and Unemployment.
Quiz
Discipline Courses-I
Semester-I
Paper I: Principles of Economics (POE)
Unit-III
Lesson: National Income Accounting
Lesson Developer: Rakhi Arora and Vaishali Kapoor
College/Department: Rajdhani College, University of Delhi
Institute of Lifelong Learning, University of Delhi
National Income Accounting
Table of Contents:
1. Learning outcomes
2. Introduction
3. What is macroeconomics?
4. Measurement of GDP
6. Price Indexes
7. Summary
8. Exercises
9. Glossary
10. References
Learning outcomes:
After you have read this chapter, you should be able to:-
INTRODUCTION
Newspapers these days are full of headlines symptomatic of the worsened condition of
the global economy. They suggest that policy makers and economists are worried
about what form the ongoing global financial crisis will take, how all economies will be
affected, and whether all economies will emerge as gainers and take the lead. One needs
to know how economists predict these crises and their repercussions, and how they study
the symptoms of any disturbance in the economy and provide the cure.
Economists and researchers keep studying every economy with the help of the various economic
variables and tools at hand, and of economic data, mostly produced by the government, that are
widely released in newspapers, journals and articles. These statistics are
used to study the economy, and policy makers use them to monitor the ongoing development
processes in the economy and to formulate policies.
This chapter broadly covers the macroeconomic issues and the macroeconomic
variables in Section one. Section two discusses the Gross Domestic Product (GDP),
an indicator of the health of an economy, and Section three explains the meaning of the
Consumer Price Index (CPI), which represents the overall level of prices. This chapter largely
focuses on the accounting of National Income.
WHAT IS MACROECONOMICS?
Macroeconomics is the study of the structure and performance of national economies and
of the policies that governments use to try to affect economic performance.
[Figure: National income of India (Rs. crore), 1950-51 to 2011-12]
(iii) Inflation
Economists have devoted much effort to identifying the costs and
consequences of even moderate inflation. The key questions that need to be
addressed are: Who are the gainers and losers from inflation? What costs does inflation
impose on society, and how severe are they? What are the causes of inflation? What are the
best ways to curb it? Figure 2 shows the behavior of consumer goods prices over
time in India. They have been rising since 1950 and have doubled in the last decade
owing to droughts, the US crisis and recession.
[Figure 2: Consumer goods prices in India, 1950-51 to 2011-12]
This issue focuses on the economic links among nations, such as international trade and
borrowing, that affect the performance of individual economies and of the world
economy as a whole. The recent crisis affected the entire globe even though it
originated in the US as the sub-prime crisis. This shows that countries are linked by
international trade and, even more so, by financial flows.
MEASUREMENT OF GDP
The national income accounts are an accounting framework used in measuring current
economic activity. They are set up in a way that mirrors the structure
of the economy. Working through these accounts is an important first step towards
understanding how the macro economy works.
The economic activity that occurs during a period of time can be measured in the following
3 ways:
All three approaches portray an identical picture of the economy. The money value
computed in any of the above ways is technically known as the National Income of the
economy.
VARIANTS OF GDP
GDP is the market value of all final goods and services produced by normal residents
as well as non-residents in the domestic territory of a country in a year. It includes
the market value of only final goods and ignores intermediate goods to avoid the
problem of double counting, i.e., to count all goods and services produced in any
given year only once.
GDP = C + I + G + NX
Where,
C= Value of final consumer goods and services produced in a year and consumed by
households.
NX=Exports-Imports
Gross National Product (GNP) is defined as the total market value of all final goods and
services produced in a year by normal residents of a country. These residents may be national or
non-national companies that have set up plants in India.
GNP = GDP + NFIA
Where,
NFIA (Net Factor Income from Abroad) is the difference between the factor income received
from abroad by normal
residents of India for rendering factor services in other countries and the factor
income paid to foreign residents for factor services rendered by them in the
domestic territory of India.
Capital goods wear out or fall in value as a result of their consumption or use in
the production process. This consumption of fixed capital, or fall in the value of fixed
capital due to wear and tear, is called depreciation. Depreciation is to be
deducted from GDP to get Net Domestic Product (NDP). NDP is thus the net market value, i.e.,
the value after providing for depreciation, of all final goods and services produced by normal residents
as well as non-residents in the domestic territory of a country in a year. Therefore,
NDP = GDP - Depreciation
Net National Product (NNP) refers to the market value of goods and services produced by
normal residents of
a country in a year after providing for depreciation.
It is also known as National Income at market price.
NNP = GNP - Depreciation
Or
NNP = GDP - Depreciation + NFIA
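The relations among the variants of GDP can be checked with a short sketch. The function names and the figures for GDP, depreciation and NFIA are hypothetical and used only to illustrate the identities stated above.

```python
def gnp(gdp, nfia):
    """GNP = GDP + NFIA."""
    return gdp + nfia


def ndp(gdp, depreciation):
    """NDP = GDP - Depreciation."""
    return gdp - depreciation


def nnp(gdp, depreciation, nfia):
    """NNP = GDP - Depreciation + NFIA."""
    return gdp - depreciation + nfia


# Hypothetical figures (say, in Rs. crore). A negative NFIA means that more
# factor income is paid abroad than is received from abroad.
gdp, depreciation, nfia = 1000, 80, -20

print("GNP:", gnp(gdp, nfia))                # 980
print("NDP:", ndp(gdp, depreciation))        # 920
print("NNP:", nnp(gdp, depreciation, nfia))  # 900
```

Both routes to NNP agree: GNP less depreciation gives 980 - 80 = 900, the same as GDP less depreciation plus NFIA.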
A useful way to study the economic interactions among the four sectors in the economy is
through a circular flow diagram, which shows the income received and payments made by
each sector. The phenomenon of three methods of measurement of national income giving
identical results can be shown diagrammatically through the circular flow of money in the
economy.
Let's analyze the circular flow step by step. Households provide their services to the firms
and the government, and in return they get wages. The circular flow diagram above shows the
flow of wages into the household sector as compensation for their services. Interest on
corporate and government bonds and dividends from firms are another receipt of the
households. Social security benefits, veterans' benefits, and welfare payments are also
received by some of the households from the government. Payments of this kind from
the government, for which the recipient does not supply any good/service/labor, are called
Transfer Payments. All these receipts constitute the total income received by the
households.
Households pay out by purchasing goods/services from the firms and by paying taxes to the
government. These components constitute the total payments by the households. The
gap between the total receipts and total payments of the households is what they save/dis-save.
Savings are categorized as a 'leakage' from the circular flow as they withdraw
current income/purchasing power from the system.
Goods/services are sold to the households and the government by the firms. Revenues are
generated by these sales which are shown as a flow into the firm sector in the diagram
above. Wages, interest and dividends are paid by the firms to the households and taxes are
paid by the firms to the government. These expenses are shown as flows out of the firm
sector.
Taxes are collected by the government from the households and the firms. The government
also makes payments by purchasing goods/services from the firms, paying wages and
interest to the households, and making transfer payments to the households. Households
spend part of their income on imports and the rest on domestically produced goods/services.
GDP can be obtained by adding up the four major categories of expenditure in the national
income accounts: Consumption, Investment, Government purchases of goods/services, and net
exports of goods/services. According to the expenditure approach, GDP is measured as the
total spending on the final goods/services produced in the nation during a specified period
of time.
Symbolically,
Y = total income = total expenditure;
C = consumption;
I = investment;
G = government purchases of goods/services;
NX = net exports of goods/services.
Y = C + I + G + NX.
1. Consumption.
2. Investment
Investment includes both spending on new capital goods, called fixed investment,
and increases in firms' inventory holdings, called inventory investment. Investment
in India accounted for 37% of GDP in 2010. Fixed investment in turn has two major
components.
Much like the distinction between private-sector consumption and investment, some
part of government purchases goes towards current needs (such as employee
salaries), while some is devoted to acquiring capital goods (such as office buildings).
4. Net Exports
Net exports are exports minus imports. Exports are the goods and services produced
within a country that are purchased by foreigners. They were about 22% of GDP in 2010
in India.
Imports are the goods and services produced abroad that are purchased by a
country's residents; they were about 26% of GDP in 2010 in India. Net exports are
positive if exports are greater than imports and negative if imports exceed exports.
Exports are added to total spending because they represent spending (by foreigners)
on final goods and services produced in a country. Imports are subtracted from total
spending because consumption, investment, and government purchases are defined
to include imported goods and services. Subtracting imports ensures that total
spending, C+I+G+NX, reflects spending only on output produced in the country.
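A small numerical sketch may make the role of imports clearer. All figures and the split of each spending category into a domestic and an imported part are hypothetical; the point is only that C + I + G + NX ends up equal to spending on domestically produced output.

```python
def gdp_expenditure(c, i, g, exports, imports):
    """Expenditure approach: Y = C + I + G + NX, with NX = exports - imports."""
    nx = exports - imports
    return c + i + g + nx


# Hypothetical spending, split into domestically produced and imported parts.
c_domestic, c_imported = 500, 100
i_domestic, i_imported = 200, 50
g_domestic, g_imported = 150, 10
exports = 120
imports = c_imported + i_imported + g_imported  # 160

gdp = gdp_expenditure(
    c_domestic + c_imported,   # C includes imported consumer goods
    i_domestic + i_imported,   # I includes imported capital goods
    g_domestic + g_imported,   # G includes imported government purchases
    exports,
    imports,
)

# Spending on output actually produced in the country.
domestic_output = c_domestic + i_domestic + g_domestic + exports

print(gdp, domestic_output)  # 970 970
```

Because the imported parts enter once through C, I and G and are removed once through NX, they cancel out of the total.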
According to the Income Approach, National Income is the sum of eight types of
income. It totals the income received by producers, inclusive of profits and the tax
payments to the government. The eight components are as follows:
6. Taxes on production and imports: These include indirect business taxes, such as sales
tax and excise taxes, that are paid by businesses to central, state, and local
governments, as well as customs duties and taxes on residential real estate and
motor vehicle licenses paid by households. These taxes have averaged about 7% of
GDP for the past 25 years.
In addition to the eight components of national income just described, three other items
need to be accounted for to obtain GDP:
Statistical discrepancy;
Depreciation; and
Net factor payments.
GDP, measured in rupee terms, is the sum of the value of the output produced in the economy, i.e.,
the sum of the products of the prices of the different commodities produced and their respective quantities.
$$\text{Nominal GDP} = \sum_i p_i q_i$$
Where,
$p_i$ = price of the i-th commodity
$q_i$ = quantity of the i-th commodity
The value of goods/services measured at current prices is usually called nominal GDP by
economists. Nominal GDP cannot accurately reflect how well an
economy is able to satisfy the demands of households, firms and the government. If
all prices double and the quantities remain unchanged, then nominal GDP would double.
But it would be misleading to state that the ability of the economy to satisfy these demands
has doubled, as the quantity of every good produced remains unchanged.
A better and more reliable measure to monitor the economy's well-being would be one that
is not influenced by changes in prices. Hence, economists use real GDP.
Real GDP is the measurement of the value of goods/services using a constant
set of prices, i.e., it tells us the effect on the value of output when only quantities
change and prices don't.
A real variable is an economic variable that is measured at base year prices. Real economic
variables measure the physical quantity of economic activity. Real
GDP measures the physical volume of an economy's final production, using base year
prices. Nominal GDP measures the value of an economy's final output, using current
market prices.
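The price-doubling case described above can be reproduced in a short sketch. The two-commodity economy, its prices and its quantities are hypothetical; the helper `gdp_value` simply sums price times quantity over the commodities.

```python
def gdp_value(prices, quantities):
    """Sum of price times quantity over all commodities."""
    return sum(prices[good] * quantities[good] for good in quantities)


# Hypothetical two-commodity economy.
base_prices = {"rice": 10, "cloth": 50}      # base year prices
current_prices = {"rice": 20, "cloth": 100}  # every price has doubled
quantities = {"rice": 100, "cloth": 10}      # quantities are unchanged

nominal_gdp = gdp_value(current_prices, quantities)  # valued at current prices
real_gdp = gdp_value(base_prices, quantities)        # valued at base year prices

print(nominal_gdp, real_gdp)  # 3000 1500
```

Nominal GDP doubles from 1500 to 3000 although nothing more is produced, while real GDP stays at 1500.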
PRICE INDEXES
A measure of the average level of prices for some specified set of goods/services relative to
the prices in a specified base year is called a Price Index. For instance, the GDP deflator
is a price index that measures the overall level of prices of the goods/services included in GDP. It is
defined as follows:
$$\text{Real GDP} = \frac{\text{Nominal GDP}}{\text{GDP deflator}/100}$$
The GDP deflator (divided by 100) is the amount by which nominal GDP must be divided,
or “deflated”, to obtain real GDP. In our example, we have already computed nominal GDP
and real GDP, so we can now calculate the GDP deflator by rewriting the preceding formula
as:
$$\text{GDP deflator} = 100 \times \frac{\text{Nominal GDP}}{\text{Real GDP}}$$
The GDP deflator deals with the average level of prices of the goods/services that are included in
GDP. The CPI, or Consumer Price Index, is available monthly. The Bureau of Labor Statistics
constructs the CPI by sending people out each month to find the current prices of a fixed
list, or “basket”, of consumer goods and services, including many specific items of food,
clothing, housing, and fuel. The CPI for the month is then computed as:
CPI = 100 × (current cost of a basket of consumer items) / (cost of the same basket of items in
the reference base period).
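The CPI calculation can be sketched as follows. The basket, the item names and all prices are hypothetical and do not come from any official survey; the function only applies the ratio of basket costs described above.

```python
def consumer_price_index(basket, current_prices, base_prices):
    """CPI = 100 * (current cost of the basket) / (base-period cost of the same basket)."""
    current_cost = sum(qty * current_prices[item] for item, qty in basket.items())
    base_cost = sum(qty * base_prices[item] for item, qty in basket.items())
    return 100 * current_cost / base_cost


# Hypothetical fixed basket: quantities bought of each item.
basket = {"food": 10, "clothing": 2, "fuel": 5}
base_prices = {"food": 20, "clothing": 100, "fuel": 40}     # basket costs 600
current_prices = {"food": 22, "clothing": 110, "fuel": 44}  # basket costs 660

print(consumer_price_index(basket, current_prices, base_prices))  # 110.0
```

A CPI of 110 says that the same basket costs 10 per cent more than it did in the reference base period. The quantities are held fixed, so only price changes move the index.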
SUMMARY
National income of the economy can be computed either by adding up the expenditure
incurred by the residents in a year or by adding up everybody's income. Either of the
two sums will yield the same result, as the income of one person is the expenditure of another
and vice versa.
Computations can be made easier by remembering the following equations:
1. Net + CFC = Gross
2. FC + NIT = MKT Price
3. Domestic + NFIA = National
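The three conversion rules can be chained, as in the sketch below. It assumes that CFC stands for consumption of fixed capital (depreciation), that NIT stands for net indirect taxes, and that the second rule converts a factor-cost measure into a market-price measure; the function names and figures are hypothetical.

```python
def to_gross(net, cfc):
    """Rule 1: Net + CFC = Gross."""
    return net + cfc


def to_market_price(factor_cost, nit):
    """Rule 2 (assumed reading): Factor Cost + NIT = Market Price."""
    return factor_cost + nit


def to_national(domestic, nfia):
    """Rule 3: Domestic + NFIA = National."""
    return domestic + nfia


# Hypothetical figures: start from NDP at factor cost.
ndp_fc, cfc, nit, nfia = 800, 80, 50, -20

gdp_fc = to_gross(ndp_fc, cfc)         # GDP at factor cost
gdp_mp = to_market_price(gdp_fc, nit)  # GDP at market price
gnp_mp = to_national(gdp_mp, nfia)     # GNP at market price

print(gdp_fc, gdp_mp, gnp_mp)  # 880 930 910
```

The rules can be applied in any order, since each one only adds a single adjustment term.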
In expenditure method, national income is computed by adding consumption (C),
government expenditure (G), investment (I) and net exports (NX) in a given
accounting year.
While calculating national income by income method, following components are
added:Compensation of employees, Proprietors‟ income, Rental income of persons,
corporate profits, Net interest, Taxes on production and imports, Current surplus of
government enterprises, and Business current transfer payments (net).
Real economic variables deal with the physical quantity of the economic activity
using base year prices. For example, real GDP is at constant prices i.e. it measures
physical production of this year at base year prices. In contrast, nominal GDP is
current rupee- GDP i.e. rupee value of an economy‟s final output, measured at
current market prices.
A price indexmeasures theaverage level of prices of a basket of goods/services
relative to the prices of the same basket in a specified base year.
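As a rough illustration of how the expenditure method and the three conversion equations in the summary fit together, consider the following Python sketch. The function names and the data are hypothetical, and the chain of conversions assumes that national income means net national product at factor cost.

```python
def gdp_at_market_price(c, i, g, nx):
    """Expenditure method: GDP at market prices = C + I + G + NX."""
    return c + i + g + nx


def national_income(gdp_mp, cfc, nit, nfia):
    """Move from GDP at market prices to national income (NNP at factor cost)."""
    ndp_mp = gdp_mp - cfc      # Net = Gross - CFC
    ndp_fc = ndp_mp - nit      # Factor cost = Market price - NIT
    nnp_fc = ndp_fc + nfia     # National = Domestic + NFIA
    return nnp_fc


# Hypothetical data (illustrative only).
gdp_mp = gdp_at_market_price(c=600, i=200, g=150, nx=-50)
ni = national_income(gdp_mp, cfc=60, nit=40, nfia=10)
print(gdp_mp)  # 900
print(ni)      # 810
```

Each line of `national_income` applies one of the three equations, rearranged so that the known quantity is on the right-hand side.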
EXERCISES
Q1. Which of the following items will be included while calculating GDP of India? Why and
why not?
Q1. What are the major macroeconomic issues that every economy has to deal with? Explain
with reference to the Indian scenario.
Q2. What are the approaches to measuring economic activity? Why do they give the same
answer?
Q3. List all the components of total spending. Why are imports subtracted when GDP is
computed by the expenditure method?
Q4. “For assessing growth performance of an economy real GDP is a better measure”.
Comment.
NUMERICALS
Q1. Calculate national income from expenditure method and gross domestic product at
factor cost by income method:
d. Change in stock 50
e. Exports 40
g. Subsidies 20
i. Imports 50
Q2. Calculate Gross National Product at market prices from the following data:
e. Net exports 10
f. Interest 4,000
g. Rent 4,176
i. Subsidies 1,348
Q4. From the following data, calculate national income and gross domestic product:
d. Profit 100
e. Interest 80
f. Rent 40
g. Royalty 20
Q5. Consider a three-good economy. Calculate nominal GDP for
year 1 and year 2 and real GDP for year 2 from the following information:
GLOSSARY
References
3. http://data.worldbank.org/indicator/NE.CON.PETC.ZS
Institute of Lifelong Learning, University of Delhi
Money: Demand and Supply
Discipline Courses-I
Semester-I
Paper I: Principles of Economics(POE)
Unit-IV
Lesson: Money: Demand and Supply
Lesson Developer: Rakhi Arora and Vaishali Kappor
College/Department: Rajdhani College, University of Delhi
Table of Contents:
1. Learning outcomes
2. Introduction
3. What is money
a. Origins of money
b. Functions of money
4. Money demand
5. Money supply
6. Summary
7. Exercises
8. Glossary
9. References
Learning outcomes:
After you have read this chapter, you should be able to:
a) Define money
b) Explain the different types of money that have emerged over the course of its evolution
c) Identify the various functions of money in an economy
d) Define money demand
e) List the various factors that determine the demand for money
f) Define money supply
g) State the various measures of money supply
INTRODUCTION
Currency notes in circulation are, in form, promissory notes. For instance, a
hundred-rupee note states, “I promise to pay the bearer the sum of 100 rupees”, along
with the signature of the RBI Governor. This makes it legal tender, and it is therefore widely
accepted for making transactions.
To facilitate transactions in India, the RBI prints currency notes of different denominations.
But how does the RBI know how many currency notes will be sufficient for all the
transactions in the economy? The RBI estimates its note-printing requirements on the
basis of the money demanded by the people.
Section one of this chapter covers the evolution of money, its functions and the quantity theory of
money. Section two discusses the reasons for demanding money and
the factors determining money demand, and section three explains the different
measures of money supply.
WHAT IS MONEY
The origins of money go far back into antiquity. Many primitive tribes seem to have made
some use of it.
An unusual form of money developed in Nazi prisoner of war (POW) camps during
World War II. The prisoners were supplied with various goods such as food,
clothing and cigarettes, but no attention was paid to personal preferences.
Their endowments therefore induced the prisoners to trade with
each other to obtain a better bundle of goods.
1. Metallic money
Different commodities have been used as money at one time or another, but gold and silver
proved to have great advantages over stones or other metals. Before coins were invented, the metals were carried in
bulk. When a purchase was made, the requisite
quantity of the metal was carefully weighed on a scale. The invention of coinage eliminated
the need to weigh the metal at each transaction, but it created an important role for an
authority, usually a monarch, who made the coins by mixing gold or silver with base metals
to give them a convenient size and durability, and who affixed a seal that acted as a
guarantee of the amount of precious metal contained in the coin. This was convenient as long as
everyone knew that the coin would be accepted at 'face value'. The face value was nothing
more than a statement that a certain weight of gold or silver was contained therein.
However, coins often could not be taken at their face value. A form of counterfeiting, namely
clipping a thin slice off the edge of the coin and keeping the valuable metal, became
common. This of course served to undermine the acceptability of coins even if they were
stamped. To get around this problem, the idea arose of minting coins with a rough
edge; the absence of the rough edge would immediately indicate that the coin had been
clipped.
Some rulers were quick to seize the chance to work a really profitable
fraud by ordering their subjects to bring their coins to the mint to be melted down and
coined afresh with a new stamp. Between the melting down and the recoining, however, the
rulers had only to toss some further inexpensive base metal in with the molten coins. This
debasing of the coinage allowed the ruler to earn a handsome profit by minting more new
coins than the number of old ones collected, and putting the extras in the royal vault.
Consider a fifty-fifty ratio of gold to the cheap metal alloyed with it. If, for instance,
subjects brought 50 coins to be reminted, the ruler would mix in cheap metal and obtain 100
minted coins: 50 to be returned to the subjects and 50 to be retained by the ruler.
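The arithmetic of this example can be checked with a small Python sketch. The function name is hypothetical, and the sketch assumes that the metal of each old coin is simply spread over new coins in which only a given share is the original metal.

```python
def remint(coins_collected, gold_share):
    """Return (coins minted, coins kept by the ruler) after debasement.

    Assumes the metal of the old coins is spread over new coins in which
    only `gold_share` of the content is the original metal.
    """
    coins_minted = coins_collected / gold_share
    coins_kept = coins_minted - coins_collected
    return coins_minted, coins_kept


minted, kept = remint(coins_collected=50, gold_share=0.5)
print(minted, kept)  # 100.0 50.0
```

With a fifty-fifty ratio the stock of coins doubles, and the whole increase accrues to the ruler.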
The result of this debasement was inflation. The subjects had the same number of coins as
before (50), and hence could demand the same quantity of goods. When rulers paid
their bills, however, the recipients of the extra coins could be expected to spend them. This
caused a net increase in demand, which in turn bid up prices.
Gresham's law
In the middle of the 16th century, when Queen Elizabeth I came to the throne of
England, the coinage was severely debased. To help trade, Elizabeth minted
new coins that contained their full face value in gold. As soon as these new coins
were put into circulation, they disappeared. Why?
Suppose you possess one new and one old coin, each with the same face value. If
you had to pay a bill, you would use the debased coin, as you part with less gold this
way, and if you wished to obtain a certain amount of gold bullion by melting gold ...
It was the experience of such inflations that led early economists to stress the link between
the quantity of money and the price level. This relationship is popularly known as the
Quantity Theory of Money (QTM); it will be discussed later in this chapter.
To this day the revenue generated from the power to create currency is known as
seigniorage. Seigniorage was not normally revenue generated by debasement; originally it was an
explicit duty, or tax, levied on the mint. In the modern context the possibility of debasement
does not arise, so the term seigniorage is applied to the revenue that accrues to the government
from the power to print banknotes (since banknotes have very low production costs relative
to their face value) and from another source: commercial banks are required to place
non-interest-bearing deposits at the central bank.
2. Paper money
The next milestone in the evolution of money was the emergence of paper currency. Paper
currency originated with goldsmiths. Initially, the public began to deposit their gold with
goldsmiths, since they had secure safes. Goldsmiths gave their depositors a
promise to hand over the gold whenever demanded. Whenever a depositor needed to make
a large purchase, the depositor would go to the goldsmith, reclaim some of the gold deposited
and hand it over to the seller of the goods. If the seller had no immediate need for the gold,
he would carry it back to the goldsmith for safekeeping on his behalf.
But this is a roundabout procedure: the initial depositor reclaimed the gold and handed it over to the seller, who
then deposited it with the goldsmith again. So why engage in the risky business of physically
transferring the gold? As long as the goldsmith was considered reliable and people had confidence
in him, the buyer only needed to transfer the goldsmith's receipt to the seller. If
this seller in turn wished to transact with a third party who also found the goldsmith to be
reliable, that transaction could also be effected by passing on the goldsmith's receipt. The receipt
was in this case as good as a transfer of the gold itself.
When it came into being in this way, paper money represented a promise to pay so much
gold on demand. The promise was made first by goldsmiths and later by banks.
Such paper money, which became banknotes, was backed by precious metal and was
convertible on demand into this metal.
Early on, many goldsmiths and banks discovered that it was not necessary to keep a full
ounce of gold in the vaults for every claim to an ounce circulating as paper money. At any one
time some of the bank's customers would be withdrawing gold, others would be depositing it,
and most would be trading in the bank's paper notes without indicating any need or desire to
convert them into gold.
As a result the bank was able to issue more money (initially notes, but later deposits)
redeemable in gold than the amount of gold that it held in its vaults. This was good business,
because the money could be invested profitably in interest-earning loans (often called advances)
to individuals and firms. The demand for loans arose, as it does today, because some
customers wanted credit to help them over hard times or to buy equipment for their businesses.
To this day banks have many more claims outstanding against them than they actually have
in reserves available to pay those claims. We say that the currency issued in such a situation is
fractionally backed by the reserves.
The major problem with a fractionally backed, convertible currency was maintaining its
convertibility into the precious metal by which it was backed. An imprudent bank that issued
too much paper money would find itself unable to redeem its currency in gold when the demand for
gold was even slightly higher than usual. It would then have to suspend payments, and all
holders of its notes would suddenly find that the notes were worthless. A prudent
bank that kept a reasonable relationship between its note issue and its gold reserve, however, would
find that it could meet a normal range of demand for gold without any trouble.
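The contrast between a prudent and an imprudent bank can be sketched with a toy calculation in Python. Everything here is an assumption made for illustration: the function names, the gold reserve, the note issues and the shares of notes presented for redemption are hypothetical numbers, not figures from the chapter.

```python
def backing_ratio(gold_reserves, notes_outstanding):
    """Fraction of the note issue that is backed by gold in the vault."""
    return gold_reserves / notes_outstanding


def can_redeem(gold_reserves, notes_outstanding, share_presented):
    """True if the bank can pay gold for the share of notes presented at one time."""
    gold_demanded = share_presented * notes_outstanding
    return gold_demanded <= gold_reserves


GOLD = 100.0               # hypothetical gold reserve, same for both banks
prudent_notes = 400.0      # hypothetical note issue of a prudent bank
imprudent_notes = 2000.0   # hypothetical note issue of an imprudent bank

for label, notes in (("prudent", prudent_notes), ("imprudent", imprudent_notes)):
    ratio = backing_ratio(GOLD, notes)
    normal = can_redeem(GOLD, notes, share_presented=0.04)    # a normal day
    stressed = can_redeem(GOLD, notes, share_presented=0.06)  # slightly higher demand
    print(label, ratio, normal, stressed)
# prudent 0.25 True True
# imprudent 0.05 True False
```

Under these assumed numbers both banks cope with normal demand, but only the prudent bank survives a slightly higher demand for gold.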
3. Fiat money
As time went on, note issue by private banks became less common, and central banks,
which are (usually) state-owned institutions, took control of the currency. Over time central
banks have assumed a monopoly in the provision of money (cash) to the economy. As a
result, they have the responsibility of controlling monetary conditions in the economy, and
ultimately they determine the value of a nation's (or group of nations') currency.
Originally the central banks issued paper currency that was fully convertible into gold. In
those days, gold would be brought to the central bank, which would issue currency in the
form of gold certificates that asserted that the gold was available on demand. The gold supply
thus set some upper limit on the amount of currency. However, central banks, like private
banks before them, could issue more currency than they had in gold, because in normal
times only a small fraction of the currency was presented for payment at any one time.
Thus, even though the need to maintain convertibility under a gold standard put an upper
limit on note issue, central banks had substantial discretionary control over the quantity of
currency outstanding.
In primitive societies, the stone money of Yap (on the tiny Micronesian island of Yap) and
seashells (in America and New Guinea) performed the medium-of-exchange function of
money. Prominent economists such as Keynes, Friedman and Mankiw give these as
examples of fiat money, on the grounds that the stone money of Yap
and seashells are considered not useful and not convertible, and have no
other legal status. Dror Goldberger, however, questions whether these
ever existed as fiat money in their respective societies. He explains
that the stone money of Yap and seashells should not be considered examples of fiat
money because they were intrinsically valuable to their primitive users; treating
them as fiat money would ignore the fact that the money then
in circulation had aesthetic value and a religious use too.
Almost all countries abandoned the gold standard during the period 1914-1928.
Once currencies were no longer convertible into gold, money derived its value from its
acceptability in exchange. Fiat money is widely acceptable because it is declared by
government order, or fiat, to be legal tender. Legal tender is anything that by law must be
accepted when offered either for the purchase of goods or services or to discharge a debt.
Today almost all currency is fiat money. Fiat money is valuable because it is accepted by
convention and in law in payment for the purchase of goods or services and for the discharge
of debts.
Many people are disturbed to learn that present-day paper money is neither backed by nor
convertible into anything more valuable, and that it consists of nothing but pieces of paper
whose value derives from common acceptance. Many people believe that their money
should be more substantial than this. Yet money is, in fact, nothing more than pieces of
paper.
FUNCTIONS OF MONEY
Money acts as a medium of exchange and also serves as a unit of account and a store of
value.
1. Medium of Exchange
In the absence of money, goods would have to be exchanged by barter (one good being swapped directly for
another). The major difficulty with barter is that each transaction
requires a double coincidence of wants: each party must want what the other has to offer, so a great deal of time is spent searching for a
suitable trading partner. Thus a thirsty economics lecturer would have to find
a brewer who wanted to learn economics before he could swap a lesson in economics for a
pint of beer.
The severity of this problem is reduced by using money as a medium of exchange.
Output can be sold for money, and the money can be used subsequently to purchase the required commodities
from others. So a monetary economy typically involves exchanges of goods
and services for money and of money for goods, but not of goods for goods.
The double coincidence of wants, which is required for barter, is unnecessary when a
medium of exchange is used.
By facilitating transactions, money makes possible the benefits of specialization and the
division of labor, which in turn contribute to the efficiency of the economic system.
2. Unit of Account
As a unit of account, money is the basic unit for measuring economic value. In India, for
example, virtually all prices, wages, asset values and debts are expressed in rupees. Having
a single uniform measure of value is convenient. For example, pricing all goods in India in
rupees, instead of pricing some goods in yen, some in gold and some in Microsoft
shares, simplifies comparison among different goods.
3. Store of Value
Money is a convenient way to store purchasing power. The money received in
exchange for goods sold today may be stored until it is required. However, money must
have a relatively stable value to be a satisfactory store of value. A rise in the price level
reduces the purchasing power of money, because more money is required to buy
a typical basket of goods. When the price level is stable, the purchasing power of a given
sum of money is also stable; when the price level is highly variable, this is not so, and the
usefulness of money as a store of value is undermined.
Money is usually held to buy goods and services. People hold more money when they need
more money for making transactions. Thus, the number of rupees exchanged in
transactions is related to the quantity of money in the economy.
The link between transactions and money is expressed in the quantity equation in the
following manner:
M × V = P × T ………………………………..(1)
where,
M = the quantity of money
V = the velocity of money, i.e. the number of times money changes hands in a given year
P = the price of a typical transaction
T = the number of times goods/services are exchanged for money in a year
Therefore, the right side of the equation tells us about the transactions and the left side of the
equation tells us about the money used to make the transactions.
For example, suppose that 60 slices of cheese are sold in a given year at Rs.5 per slice.
Then,
T = 60 slices per year
P = Rs.5 per slice
P × T = Rs.300 per year
Suppose further that the quantity of money in the economy is Rs.20. Then,
V = (P × T)/M
= (Rs.300/year)/(Rs.20)
= 15 times per year
(i.e.) for Rs.300 of transactions per year to take place with Rs.20 of money, each rupee
must change hands 15 times per year.
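The cheese example can be reproduced with a short Python sketch. The function name is an assumption made for illustration; the figures are those of the example (60 slices, Rs.5 per slice, Rs.20 of money).

```python
def transactions_velocity(price, transactions, money):
    """Rearranged quantity equation: V = (P * T) / M."""
    return price * transactions / money


T = 60      # slices of cheese sold per year
P = 5.0     # rupees per slice
M = 20.0    # rupees of money in the economy

value_of_transactions = P * T
V = transactions_velocity(P, T, M)

print(value_of_transactions)  # 300.0 rupees per year
print(V)                      # 15.0 times per year
assert M * V == P * T         # the quantity equation holds as an identity
```

Holding P and T fixed, a smaller money stock would require each rupee to change hands more often.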
The quantity equation is an identity: a change in one variable must be matched by a change
in one or more of the other variables so as to maintain the equality.
If Y denotes the amount of output (real GDP) and P the price of one unit of output (the GDP deflator), then
Value of output = PY
PY = nominal GDP.
Therefore,
M × V = P × Y ………………………………..(2)
As Y is also total income, V in this version of the quantity equation is called the
income velocity of money.
The income velocity of money tells us the number of times a rupee note enters someone's
income in a given period of time. This is the version of the equation most commonly used.
When we express the quantity of money in terms of the quantity of goods and services it can buy, it
helps us analyze the effect of money on the economy. The amount M/P is called
real money balances; it measures the purchasing power of the stock of money.
For example, suppose an economy produces only cheese. If M = Rs.20 and P = Rs.5 per slice, then M/P = 4
slices of cheese. That is, at current prices, the money stock in the economy can buy 4
slices.
A money demand function shows what determines the quantity of real money balances
people wish to hold. A simple money demand function is:
(M/P)^d = kY
where,
k = a constant that tells us how much money people want to hold for every rupee of
income.
This equation states that the quantity of real money balances demanded is proportional to
real income.
The money demand function is like the demand function for a particular good. Here the
“good” is the convenience of holding real money balances. Owning an
automobile makes it easier for a person to travel; similarly, holding money makes it easier
for a person to make transactions. Hence, just as higher income leads to a
greater demand for automobiles, higher income also leads to a greater demand
for real money balances.
Setting the demand for real money balances (M/P)^d equal to the supply M/P, we get,
M/P = kY
M (1/k) = PY
MV = PY,
where,
V = 1/k.
This shows the link between the demand for money and the velocity of money.
When k is large, i.e. people wish to hold a lot of money for each rupee of income, then V is
small, i.e. money changes hands infrequently. On the contrary, when k is small, i.e. people
wish to hold only a little money, then V is large, i.e. money changes hands frequently.
Therefore, we can deduce that the money demand parameter k and the velocity of money V are
negatively related to each other.
The quantity equation in its transactions form is:
MV = PT ………………………………..(i)
But Cambridge economists linked money to income via the quantity theory of money:
M^d = kPY ………………………………..(ii)
Money demand is a function of nominal income, i.e. PY. A fraction k of this nominal
income is demanded by the public to be held as cash.
On comparing the two we find that Y in equation (ii) is the physical quantity of
output (real income) and so corresponds to transactions T in equation (i). This yields
V = 1/k or k = 1/V, i.e. one is the reciprocal of the other.
For example, if the stock of money that people wish to hold equals one-fourth of the value
of total income (transactions), then k is 0.25 and V, the reciprocal of k, is 4. If the money
supply is to be one-fourth of the value of transactions, each rupee must be used on
average four times.
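The reciprocal relation between k and V, and the real money balances example, can be sketched in Python as follows. The function names are assumptions, and the real income of 16 slices is a hypothetical figure chosen so that the numbers match the Rs.20 money stock and Rs.5 price used earlier.

```python
def real_money_balances(money, price_level):
    """Purchasing power of the money stock: M / P."""
    return money / price_level


def velocity_from_k(k):
    """Income velocity implied by the money demand parameter: V = 1 / k."""
    return 1.0 / k


def nominal_money_demand(k, price_level, real_income):
    """Cambridge form: M^d = k * P * Y."""
    return k * price_level * real_income


print(real_money_balances(20.0, 5.0))  # 4.0 slices of cheese
print(velocity_from_k(0.25))           # 4.0

# Hypothetical economy: real income of 16 slices at Rs.5 per slice, k = 0.25.
P, Y, k = 5.0, 16.0, 0.25
M = nominal_money_demand(k, P, Y)
V = velocity_from_k(k)
print(M, M * V, P * Y)                 # 20.0 80.0 80.0

# A larger k goes with a smaller V, and vice versa.
for k_value in (0.1, 0.25, 0.5):
    print(k_value, velocity_from_k(k_value))
```

The third printed line shows that MV equals PY once the money stock is set equal to the amount demanded.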
THE ASSUMPTION OF CONSTANT VELOCITY
If we assume that the velocity of money is constant, the quantity equation becomes
the quantity theory of money.
In reality, velocity does change if the money demand function changes. For instance,
people's average money holdings fell when automatic teller machines
were introduced, which means that k fell and velocity rose. Nevertheless, the assumption of constant
velocity provides a good approximation in many situations.
The assumption of constant velocity makes the quantity equation a theory of the determination
of nominal GDP. The quantity equation says:
MV = PY
where V is taken to be fixed.
Therefore, a change in the quantity of money (M) must cause a proportionate change in
nominal GDP (PY), i.e., if V is fixed, the quantity of money determines the rupee value of the
economy's output.
Three building blocks help us study the determination of the overall level of prices:
1. The level of output, Y, is determined by the factors of production and the production
function.
2. The nominal value of output, PY, is determined by the money supply. This conclusion
follows from the quantity equation and the fixed velocity of money.
3. The ratio of the nominal value of output, PY, to the level of output, Y, gives the price level, P.
This theory explains what happens when the central bank changes the money supply.
Any change in the money supply causes a proportionate change in nominal GDP, as
velocity is fixed. The change in nominal GDP shows up as a change in the price
level, as the factors of production and the production function have already determined
real GDP. Therefore, the quantity theory of money states that the price level is proportional
to the money supply.
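The three building blocks can be traced step by step in a short Python sketch. The function name and all the numbers (output, velocity, money supply) are hypothetical and serve only to show that doubling M doubles P when V and Y are fixed.

```python
def price_level(money_supply, velocity, real_output):
    """Price level implied by the quantity theory with fixed V and given Y."""
    nominal_gdp = money_supply * velocity   # block 2: PY = MV, with V fixed
    return nominal_gdp / real_output        # block 3: P = PY / Y


Y = 200.0   # block 1: real output, fixed by factors of production (hypothetical)
V = 4.0     # assumed constant velocity (hypothetical)

p_before = price_level(100.0, V, Y)   # money supply of 100
p_after = price_level(200.0, V, Y)    # central bank doubles the money supply
print(p_before, p_after)              # 2.0 4.0
```

Because Y and V do not respond to the change in M, the whole adjustment falls on the price level.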
As the percentage change in the price level is the inflation rate, this
theory of the price level is also a theory of the inflation rate. The quantity
equation in percentage-change form is:
% change in M + % change in V = % change in P + % change in Y
With velocity constant, the percentage change in V is zero. This analysis therefore states that the rate of inflation (the percentage change in P) is determined by the growth in the money
supply, except for a term that is itself determined by the exogenous growth in
output.
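In percentage terms, the relation described above can be sketched as follows. The function name and the growth rates are hypothetical, and the additive form is the usual approximation for small percentage changes rather than an exact identity.

```python
def inflation_rate(money_growth, output_growth, velocity_growth=0.0):
    """Percentage-change form of the quantity equation, solved for inflation.

    %change in P = %change in M + %change in V - %change in Y
    (all arguments in percent per year; velocity growth is zero by assumption).
    """
    return money_growth + velocity_growth - output_growth


print(inflation_rate(money_growth=8.0, output_growth=3.0))  # 5.0
print(inflation_rate(money_growth=3.0, output_growth=3.0))  # 0.0
```

In the second case money grows exactly as fast as output, so the price level stays constant.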
MONEY DEMAND
The amount of wealth that everyone in the economy wishes to hold in the form of money
balances is called the demand for money. Because people are choosing how to divide their
given stock of wealth between money and bonds, it follows that if we know the demand for
money we also know the demand for bonds. With a given level of wealth, a rise in the
demand for money necessarily implies a fall in the demand for bonds: if people wish to hold
1 billion more of money, they must wish to hold 1 billion less of bonds. It also follows that if households are in equilibrium with respect to their
money holdings, they are in equilibrium with respect to their bond holdings.
1. Transactions Motive
Money is required for making most transactions. Consumers pass money to
firms to pay for the goods and services they produce, and firms pass
money to employees for the labor services they supply. Money
balances held for this reason are called transactions balances.
In an imaginary world in which the receipts and disbursements of consumers and firms were
perfectly synchronized, it would be unnecessary to hold transactions balances. If, every
time a consumer spent 10, she received 10 as part payment of her wages, no transactions
balances would be needed. In the real world, however, receipts and payments are not
perfectly synchronized.
Consider the balances that are held because of wage payments. Suppose, for purposes of illustration,
that firms pay wages every Friday and that employees spend all their wages on goods and services,
with the expenditure spread evenly over the week. Thus on Friday morning firms hold
balances equal to the weekly wage bill; on Friday afternoon the employees will hold these balances.
Over the week workers' balances will be drawn down as a result of purchasing goods and services.
Over the same period the balances held by firms will build up as a result of selling goods and services until,
on the following Friday, firms will again have amassed balances equal to the wage bill that must be
met on that day.
What determines the size of the transactions balances to hold? It is clear that in our example
total transactions balances vary with the value of the wage bill. If the wage bill doubles for any reason, the
transactions balances held by firms and households for this purpose will also double. As it is with
wages, so it is with all other transactions: the size of the balances held is positively related
to the value of the transactions.
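The weekly wage cycle described above can be traced with a small Python sketch. The function name and the wage bill of 700 are hypothetical; the sketch assumes, as in the illustration, that workers spend their wages evenly over a seven-day week and that everything they spend flows back to firms.

```python
def transactions_balances(wage_bill, days_since_payday, days_in_week=7):
    """Return (workers' balances, firms' balances) for the weekly wage cycle."""
    spent_so_far = wage_bill * days_since_payday / days_in_week
    workers = wage_bill - spent_so_far   # wages not yet spent
    firms = spent_so_far                 # receipts from selling goods and services
    return workers, firms


# Day 0 is just after wages are paid; day 7 is just before the next payday.
for day in range(8):
    workers, firms = transactions_balances(700.0, day)
    print(day, workers, firms, workers + firms)

# Doubling the wage bill doubles the balances held at every point in the week.
print(transactions_balances(700.0, 3))    # (400.0, 300.0)
print(transactions_balances(1400.0, 3))   # (800.0, 600.0)
```

At every point in the week the balances of workers and firms together add up to the wage bill, which is why the total varies with the value of the wage bill.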
It is the average value of money balances that people wish to hold over a particular period that is
relevant for economics, but we need to know how money demand is related to GDP rather
than to total transactions. In fact, the value of all transactions exceeds the value of final
output. When the miller buys wheat from the farmer and when the baker buys flour from
the miller, both are transactions against which money balances must be held, although only
the value added at each stage is part of GDP.
Generally there will be a stable, positive relationship between transactions and GDP. A rise
in GDP leads to a rise in the total value of all transactions and hence to an associated
rise in the demand for transactions balances. This allows us to relate transactions balances
to GDP.
2. Precautionary Motive
Sometimes your vehicle breaks down unexpectedly, or you have to make an
impromptu visit to a sick relative. At times like these, expenditures crop up out of the
blue. As a precaution against cash crises, when receipts are abnormally low or
disbursements are abnormally high, firms and individuals carry money balances.
Precautionary balances provide a cushion against uncertainty about the timing of
cash flows. The larger such balances are, the greater the protection
against running out of cash because of temporary fluctuations in cash flows.
The seriousness of the risk of a cash crisis depends on the penalties that are inflicted for
being caught without sufficient money balances. A firm is unlikely to be pushed into
insolvency, but it may incur considerable costs if it is forced to borrow money at high
interest rates in order to meet a temporary cash crisis.
The precautionary motive arises because individuals and firms are uncertain about the
degree to which payments and receipts will be synchronized.
The precautionary motive, like the transactions motive, causes the demand for money to
vary positively with the money value of GDP.
For most purposes the transactions and precautionary motives can be merged, as they both
show that desired money holdings are positively related to GDP. Indeed, they both show
money being held in relation to transactions, either planned or potential.
3. Speculative Motive
People also hold money because it is an asset. Individuals and firms usually hold some money
in order to avoid the uncertainty inherent in the
fluctuating prices of other financial assets. Money held for this reason is called
speculative balances. This motive was first analyzed by Keynes, and the classic modern
analysis was developed by Professor James Tobin, the 1981 Nobel Laureate in economics.
Any holder of money balances forgoes the extra interest income that could be earned if
bonds were held instead. However, market interest rates fluctuate, and so do the market
prices of existing bonds (their present values depend on the interest rate). Because their
prices fluctuate, bonds are a risky asset. Many individuals and firms do not like risk; they
are said to be risk-averse.
When choosing between holding money and bonds, wealth holders must balance the extra interest income that could be earned by holding
bonds against the risk that bonds carry. At one extreme, if individuals hold all their wealth in the form of bonds, they earn
extra interest on their entire wealth, but they also expose their entire wealth to the risk of
changes in the price of bonds. At the other extreme, if people hold all their wealth in the
form of money, they earn less interest income, but they do not face the risk of unexpected
changes in the price of bonds. Wealth holders usually do not take either extreme position.
They hold part of their wealth as money and part of it as bonds, i.e. they diversify their
holdings. The fact that some proportion of wealth is held in money and some in bonds
suggests that, as wealth rises, so will desired money holdings.
Although one individual's wealth may rise or fall rapidly, the total wealth of a society
changes only slowly. For the analysis of short-term fluctuations in GDP, the effects of
changes in wealth are fairly small, and we will ignore them for the present. Specific
individuals may undergo large wealth changes in response to bond price changes, but with
inside wealth the total effect is negligible: when lenders gain, borrowers lose; and when
lenders lose, borrowers gain. Over the long term, however, variations in aggregate wealth
can have a major effect on the demand for money.
Wealth that is held in cash or deposits earns less interest than could be earned by holding
bonds; hence the reduction in risk involved in holding money carries an opportunity cost in
terms of forgone interest earnings. The speculative motive leads individuals and firms to
add to their money holdings until the reduction in risk obtained by the last pound added is
just balanced (in each wealth holder's view) by the cost in terms of the interest forgone on
that pound. A fall in the rate of return on bonds for the same level of risk will encourage
people to hold more of
their wealth as money and less in bonds. A rise in the rate of return for a given level of
risk will cause people to hold more bonds and less money.
The speculative motive implies that the demand for money will be negatively related to the
rate of interest.
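We express the effects of the price level, real income, and interest rates on money demand
as
$$M^d = P \times L(Y, i) \tag{5}$$
where
$M^d$ = the aggregate demand for money in nominal terms,
$P$ = the price level,
$Y$ = real income or output,
$i$ = the nominal interest rate on non-monetary assets,
$L$ = a function relating money demand to real income and the nominal interest rate.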
Equation (5) states that nominal money demand, $M^d$, is proportional to the price level, $P$.
Hence, if the price level $P$ doubles (given real income and the rate of interest), nominal
money demand $M^d$ also doubles, reflecting the fact that twice as much nominal money is
required to conduct the same real transactions.
Equation (5) also indicates that, for any given $P$, $M^d$ depends (through the function $L$) on
real income, $Y$, and the nominal interest rate on non-monetary assets, $i$. An increase in real
income, Y, raises the demand for liquidity and thus increases money demand. An increase in
the nominal interest rate, i, makes non-monetary assets more attractive which reduces
money demand.
We could have included the nominal interest rate on money, $i^m$, in the above equation
because an increase in the interest rate on money makes people more willing to hold money
and thus increases money demand. Historically, however, the nominal interest rate on
money has varied much less than the nominal interest rate on non-monetary assets (for
example, currency and a portion of checking accounts have always paid zero interest) and
thus has been ignored by many statistical studies of the equation. For simplicity, we do not
explicitly include $i^m$ in the equation.
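Because the nominal interest rate is the sum of the real interest rate, $r$, and the expected
rate of inflation, $\pi^e$, money demand can also be written as
$$M^d = P \times L(Y, r + \pi^e) \tag{6}$$
Equation (6) shows that, for any expected rate of inflation $\pi^e$, an increase in the real interest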
rate increases the nominal interest rate and reduces the demand for money. Similarly, for
any real interest rate, an increase in the expected rate of inflation increases the nominal
interest rate and reduces the demand for money.
Nominal money demand, $M^d$, measures the demand for money in terms of rupees. If we
divide both sides of Eq. (6) by the price level, $P$, we get
$$\frac{M^d}{P} = L(Y, r + \pi^e) \tag{7}$$
The expression $M^d/P$ is called real money demand or the demand for real balances. Real
money demand, $M^d/P$, depends on real income (or output), $Y$, and on the nominal interest
rate, which is the sum of the real interest rate, $r$, and expected inflation, $\pi^e$. The
function $L$, which relates real money demand to output and interest rates in Eq. (7), is
called the money demand function.
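To make the roles of the price level, real income, and the interest rate concrete, the following Python sketch evaluates Eqs. (6) and (7) under an assumed linear form L(Y, i) = kY - hi. The linear form, the parameter values k and h, and the function names are hypothetical choices for illustration only; the text does not specify a functional form for L.

```python
# Sketch of Eqs. (6) and (7) under an ASSUMED linear form L(Y, i) = k*Y - h*i.
# The form and the values of k and h are hypothetical, chosen only for illustration.

def real_money_demand(Y, r, expected_inflation, k=0.5, h=1000.0):
    """Real money demand Md/P = L(Y, r + expected inflation)."""
    i = r + expected_inflation  # nominal interest rate
    return k * Y - h * i


def nominal_money_demand(P, Y, r, expected_inflation, k=0.5, h=1000.0):
    """Nominal money demand Md = P * L(Y, r + expected inflation)."""
    return P * real_money_demand(Y, r, expected_inflation, k, h)


base = nominal_money_demand(P=100, Y=1000, r=0.04, expected_inflation=0.06)
doubled_price = nominal_money_demand(P=200, Y=1000, r=0.04, expected_inflation=0.06)
higher_inflation = nominal_money_demand(P=100, Y=1000, r=0.04, expected_inflation=0.08)

print(doubled_price / base)     # doubling P doubles nominal money demand
print(higher_inflation < base)  # higher expected inflation lowers money demand
```

Under these assumptions, real money demand is unchanged when only the price level changes, while nominal money demand scales with P.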
1. Wealth: When wealth increases, part of the extra wealth may be held as money,
increasing total money demand. However, with income and the level of transactions
held constant, a holder of wealth has little incentive to keep extra wealth in money rather
than in higher-return alternative assets. Thus the effect of an increase in wealth on money
demand is likely to be small.
2. Risk: Holding money isn't usually risky, as it pays a fixed nominal interest rate (zero in
the case of cash). However, the demand for safer assets, including money, may increase if the
risk of alternative assets such as stocks and real estate increases greatly. Therefore, money
demand in the economy increases with the riskiness of alternative assets.
However, money doesn't always carry a low risk. In a period of erratic inflation, even if the
nominal return on money is fixed, the real return on money (the nominal return minus
inflation) may become quite uncertain, making money risky. Money demand then will fall as
people switch to inflation hedges (assets whose real returns are less likely to be affected by
erratic inflation) such as gold, consumer durable goods, and real estate.
Over the past three decades, economists have performed hundreds of statistical studies of
the money demand function. The results of these studies often are expressed in terms of
elasticities (income and interest elasticities), which measure the change in money demand
resulting from changes in factors affecting the demand for money.
(A) Income elasticity of money demand is the percentage change in money demand
resulting from a 1% increase in real income. Thus, for example, if the income elasticity of
money demand is 2/3, a 3% increase in real income will increase money demand by 2%
(2/3 x 3% = 2%).
(B) The interest elasticity of money demand is the percentage change in money demand
resulting from a 1% increase in the interest rate. If the interest rate increases from 5% to 6%,
this is not a 1% increase in the interest rate; it is a 20% increase in the return (one
percentage point on a base of five). This has to be kept in mind while dealing with the
interest elasticity of money demand.
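The two elasticity calculations above can be checked with a few lines of Python. This is a minimal sketch; the function names are illustrative, and the interest elasticity of -0.1 used at the end is a hypothetical value, not one given in the text.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


def pct_change_in_money_demand(elasticity, pct_change_in_variable):
    """Approximate % change in money demand implied by an elasticity."""
    return elasticity * pct_change_in_variable


# Income elasticity 2/3 and a 3% rise in real income (figures from the text).
income_effect = pct_change_in_money_demand(2 / 3, 3.0)

# A rise in the interest rate from 5% to 6% is a 20% increase, not 1%.
rate_increase = pct_change(5.0, 6.0)

# HYPOTHETICAL interest elasticity of -0.1, for illustration only.
interest_effect = pct_change_in_money_demand(-0.1, rate_increase)

print(income_effect, rate_increase, interest_effect)
```

With the assumed interest elasticity, the one-percentage-point rise in the interest rate lowers money demand by roughly 2%, because the relevant change is the 20% rise in the rate itself.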
Measures of Money
If we take the value of all currency (including coins) held outside of bank vaults and add to it
the value of all demand deposits, traveler's cheques, and other checkable deposits, the result is
defined as M1, or transactions money; that is, the money that can be directly used
for transactions to buy things. It is also known as narrow money.
A checkable deposit is any deposit account with a bank or other financial institution on
which a check can be written. Checkable deposits include demand deposits, negotiable
order of withdrawal (NOW) accounts, and accounts that automatically transfer funds from
savings to checking (or vice versa) when the balance on one of those accounts reaches a
predetermined level.
A. Monetary Aggregates
Weekly compilation
Fortnightly compilation
M1 = currency with the public + demand deposits with the banking system +
other deposits with the RBI
= currency with the public + current deposits with the banking system +
demand liabilities portion of saving deposits with the banking system +
other deposits with the RBI
M2 = M1 + time liabilities portion of saving deposits with the banking system +
certificates of deposit issued by banks + term deposits (excluding FCNR(B)
deposits) with a contractual maturity of up to and including one year with the
banking system
= currency with the public + current deposits with the banking system +
saving deposits with the banking system + certificates of deposit issued by banks +
term deposits (excluding FCNR(B) deposits) with a contractual maturity of up to
and including one year with the banking system + other deposits with the RBI
B. Liquidity Aggregates
Monthly compilation
L1 = M3 + all deposits with the post office savings banks (excluding National
Savings Certificates)
L2 = L1 + term deposits with term lending institutions and refinancing institutions
(FIs) + term borrowing by FIs + certificates of deposit issued by FIs
L3 = L2 + public deposits of non-banking financial companies
2. M2:
If we add near monies, close substitutes for transactions money, to M1, we get M2, called
broad money because it includes not-quite-money monies such as savings accounts,
money market accounts, and other near monies.
On June 26, 2000, M2 was $4,778.2 billion, considerably larger than the total M1 of
$1,103.3 billion. The main advantage of looking at M2 instead of M1 is that M2 is sometimes
more stable. When banks introduced new forms of interest-bearing checking accounts in
the early 1980s, M1 shot up as people switched their funds from savings accounts to
checking accounts. However, M2 remained fairly constant because the fall in savings
account deposits and the rise in checking account balances were both part of M2, canceling
each other out.
One of the very broad definitions of money includes the amount of available credit on credit
cards (your charge limit minus what you have charged but not paid) as part of the money
supply.
SUMMARY
Money is anything that serves as a medium of exchange. Today almost all currency is
fiat money. Fiat money is neither convertible into anything nor has any intrinsic value, but
it is nevertheless valuable because it is accepted by convention and in law in payment for the
purchase of goods or services and for the discharge of debts.
Money acts as a medium of exchange and can also serve as a store of value and a
unit of account.
There is a direct, one-to-one relationship between the quantity of money and prices in
the economy. This is known as the Quantity Theory of Money.
The public holds cash, or demands money, for three reasons: for making day-to-day
transactions, for saving it for bad days, and as an alternative to holding bonds.
Money demand is a function of real income (nominal income and prices) and the interest
rate. Other factors that determine money demand are wealth, risk, liquidity of
EXERCISES
Q1. Which of the functions of money (medium of exchange, unit of account and
store of value) are satisfied by the following items?
a. Credit card
b. Subway token
Q2. State whether you agree or disagree with the following statements and explain why:
a. If money supply increases by 10%, overall prices change by less than 10%.
b. If an economy is experiencing inflation, then it is better to hold money balances.
c. Higher real income means a greater demand for money.
Q3. What is fiat money? How is it different from the types of money used in earlier ages?
Q4. Define velocity. Discuss the role of velocity in the quantity theory of money.
NUMERICALS
a. Suppose that P = 100, Y = 1000 and i = 0.10. Find real money demand, nominal
money demand and velocity.
b. Suppose the price level doubles from P = 100 to P = 200. Find real money demand,
nominal money demand and velocity.
GLOSSARY
REFERENCES
4. www.rbi.org
5. Dror Goldberg, Famous Myths of "Fiat Money", Journal of Money, Credit and Banking, Vol.
37, No. 5 (Oct., 2005), pp. 957-967
Credit Creation and Monetary Policy
Discipline Courses-I
Semester-I
Paper I: Principles of Economics (POE)
Unit-III
Lesson: Credit Creation and Monetary Policy
Lesson Developer: Rakhi Arora and Vaishali Kapoor
College/Department: Rajdhani College, University of Delhi
Table of Contents
1. Learning outcomes
2. Introduction
3. Credit Creation
b. Reserves
e. Money multiplier
4. Monetary policy
b. Reserve requirement
d. Bank rate
5. Summary
6. Exercises
7. Glossary
8. References
Learning outcomes
After you have read this chapter, you should be able to:-
INTRODUCTION
Banks are the financial intermediaries in the economy whose primary task is the acceptance
of deposits and the provision of loans. The questions that usually come to mind are: how
do banks operate; who controls the banks; what proportion of accepted deposits is loaned
out; who decides all that; and how does it affect the economy?
The Reserve Bank of India (RBI) is the apex bank controlling the operations of all the
commercial banks in the economy. The RBI controls money supply and credit availability in the
economy. After the recession of 2008, the RBI has been consistently lowering the CRR and the
repo rate. Why does the RBI take such steps? The answer is that the RBI injects money into the
system, but one would like to know how this actually happens.
This chapter is an attempt to answer all the above questions. It focuses on credit
creation in the economy in Section 1 and on the use of different macroeconomic
tools to inject money into or withdraw money from the economy in Section 2.
CREDIT CREATION
Commercial banks are different from other financial institutions as they have the ability to
create credit in the economy. They accept deposits from the public, a part of which is loaned
out while the remainder is kept as reserves. Banks are in reality capable of providing
more loans than the amount of cash held by them. The questions that need to be answered
are: what proportion of the total deposits of a bank is to be given as loans and what proportion
is to be kept as cash by the bank; how can banks expand loans by more than the
quantity of cash they have; and what mechanism is at work?
We study the mechanism of credit creation in an economy in this section.
The Reserve Bank of India, the central bank, controls money supply in India in two ways.
First, the RBI prints money and so directly controls money supply in the economy.
Second, the RBI uses monetary policy as a tool to control money supply indirectly.
Along with the central bank, money supply also depends on the depository institutions (i.e.,
commercial banks) and on the public, which holds money either as cash in hand or as deposits in banks.
Muhammad Yunus, winner of the 2006 Nobel Peace Prize, in his effort to create economic and social
development from below, proclaims that credit is directly instrumental to economic development,
poverty reduction and improved welfare of all citizens, and hence that credit should be a human
right. Yunus considers the right to credit to be a moral one, based on the fact that without access
to the opportunities that credit can provide there is little chance that the poor will be able to
improve their position.
The Reserve Bank of India comprises two departments, viz. the Issue Department and the Banking
Department. The Issue Department relates to the sole function of currency management.
The Banking Department deals with the rest of the banks in the country and reflects the impact of
all other functions of the Reserve Bank.
Balance sheet of the Issue Department (Rs. thousands; selected entries)
Liabilities                              Assets
Notes in circulation   11022,063,648     GOI Rupee securities   10,464,300
Total notes issued     11034,734,496
On the right-hand side of the balance sheet are the Reserve Bank's assets: what it owns.
Its assets comprise gold coins and bullion, foreign securities, rupee coins, government
securities and commercial paper. On the left-hand side are the RBI's liabilities: what it owes to
others. Currency issued by the RBI, whether held by the public or in the Banking Department, is a
debt obligation of the RBI.
Likewise, in the Banking Department's balance sheet, assets are securities purchased and
investments made and notes held by it (Rs. 89,169 as shown in the balance sheet of the Issue
Department). Liabilities comprise reserve deposits. These reserve deposits are liabilities
of the Reserve Bank and assets of commercial banks, as these are deposit accounts at the Reserve
Bank held by commercial banks.
For simplification, from here on we will assume no difference between the Banking and Issue
Departments and will consider a combined balance sheet for the two departments. Let us
set up the following example, which could apply to almost any currency and central bank.
Liabilities                Assets
Reserve deposits   200     Gold   100
*The above balance sheet contains selected items, which would be required for further
analysis.
The sum of reserve deposits and currency (including both currency held by the public and vault
cash held by banks) is called the monetary base, also known as high-powered money and
denoted by H.
$$H = C + R \tag{1}$$
where
H is high-powered money,
C is currency,
R is reserve deposits.
Next, consider the balance sheets of all commercial banks in the private sector. Suppose all
banks are combined together and their consolidated balance sheet looks like the following:
Liabilities            Assets
Deposits   3000        Vault cash          100
                       Reserve deposits    200
                       Loans              2700
Total      3000        Total              3000
Banks' assets consist of vault cash and reserve deposits, both of which appeared as liabilities on
the central bank's balance sheet, plus loans that banks have extended to the public. Banks'
liabilities consist of deposits accepted from the public. The money you deposit in your bank
account is your asset but a liability for the bank.
RESERVES
Out of total deposits of Rs.3000, banks kept Rs.200 as reserve deposits and Rs.100 as vault
cash to meet the demands for withdrawals by depositors. These are known as bank reserves.
They are 10% of total deposits. How is this reserve deposit ratio of 0.10
(=300/3000) fixed?
Why don't banks keep all deposits as reserves? Depositors can write cheques for an
amount up to their deposits or withdraw their entire deposits. If banks
hold all deposit money in reserve, they are said to follow 100% reserve banking.
Instead, banks anticipate the withdrawal demand of all sorts of depositors and then decide
what amount to hold as reserves. For example, suppose there are three
depositors, A, B and C, and each has the same amount of deposits in their respective
accounts. A withdraws his entire salary every month, B withdraws half of his salary, and C
withdraws none. In this case, banks would choose 0.50 as the reserve deposit ratio so as to
meet the requirements of their depositors, since on average half of the deposits are withdrawn.
A generalization across all customers is made in this way, and the rest of the money is lent out by banks.
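The three-depositor example amounts to averaging the expected withdrawal fractions. A minimal Python sketch, assuming (as in the text) that all depositors hold equal deposits; the function and variable names are illustrative:

```python
# Expected fraction of deposits withdrawn by each depositor (from the example).
withdrawal_fractions = {"A": 1.0, "B": 0.5, "C": 0.0}


def reserve_deposit_ratio(fractions):
    """Average withdrawal fraction, assuming every depositor holds an equal deposit."""
    return sum(fractions.values()) / len(fractions)


ratio = reserve_deposit_ratio(withdrawal_fractions)
print(ratio)  # share of deposits the bank keeps; the rest can be lent out
```

If deposits were unequal, the average would need to be weighted by deposit size; the simple mean applies only under the equal-deposit assumption.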
Bank Runs
If a rumor spreads that a bank will not be able to honor the cash
requirements of its depositors, all depositors rush to the bank so that they do
not lose their money. This is known as a run on the bank. Since banks follow a
fractional reserve banking system, they would not be able to honor all withdrawal
requests. The final outcome would be panic in the economy. Bank
runs were evident at the time of the Great Depression of 1929 and the Great Recession of 2008.
As in our example, the reserve deposit ratio is 0.1, which is less than 1; this is known as a
fractional reserve banking system. Every bank follows fractional reserve banking
because keeping 100% of deposits as reserves would mean that it merely performs the function of a
safe vault and earns no profit, or a very low profit if the central bank pays some interest
on such reserve deposits. In an economy, reserve deposit ratios are set by the central
bank.
Suppose there are only private banks in an economy that follows a fractional reserve banking
system with a reserve deposit ratio of 0.10. Suppose one of the banks, bank A, accepts
a deposit of Rs.100 and, keeping reserves of 10%, loans out the rest.
Bank A's balance sheet would then look like the following:
Liabilities          Assets
Deposits   100       Reserves   10
                     Loans      90
Now this Rs.90, which is loaned out to someone in the non-bank public, is deposited in the
borrower's bank, say, bank B.
Bank B's balance sheet, after accepting the deposit and lending out money to the public after
keeping reserves, would appear like the one below:
Liabilities          Assets
Deposits   90        Reserves   9
                     Loans      81
This process of credit expansion will continue, as this Rs.81 is now deposited in the next
borrower's bank, and so on. Let us try to figure out what the total amount of deposits and loans
will be in the end.
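Bank     Deposits   Reserves   Loans
A        100        10         90
B        90         9          81
:        :          :          :
Total    1000       100        900
$$\text{Total deposits} = 100 + 0.9 \times 100 + 0.9(0.9 \times 100) + \dots = \frac{100}{1 - 0.9} = \text{Rs. } 1000$$
$$\text{Total loans} = 90 + 0.9 \times 90 + 0.9(0.9 \times 90) + \dots = \frac{90}{1 - 0.9} = \text{Rs. } 900$$
$$\text{Total reserves} = 1000 - 900 = \text{Rs. } 100$$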
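The round-by-round expansion can also be traced numerically. The Python sketch below starts from the Rs.100 deposit and the reserve deposit ratio of 0.10 used in the example and assumes, as the text does at this stage, that the public holds no cash; the function name and the number of rounds are illustrative choices.

```python
def credit_expansion(initial_deposit, reserve_ratio, rounds):
    """Accumulate deposits, reserves and loans over successive rounds of lending."""
    total_deposits = total_reserves = total_loans = 0.0
    deposit = initial_deposit
    for _ in range(rounds):
        reserve = reserve_ratio * deposit  # kept back by the bank
        loan = deposit - reserve           # lent out, then redeposited in the next bank
        total_deposits += deposit
        total_reserves += reserve
        total_loans += loan
        deposit = loan
    return total_deposits, total_reserves, total_loans


deposits, reserves, loans = credit_expansion(100.0, 0.10, rounds=500)
print(round(deposits, 2), round(reserves, 2), round(loans, 2))

# Closed form for comparison: initial deposit divided by the reserve ratio.
print(100.0 / 0.10)
```

After enough rounds the running totals converge to the geometric-series limits of about Rs.1000 of deposits, Rs.100 of reserves, and Rs.900 of loans.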
One bank in a multi-bank system cannot produce a large multiple expansion of deposits
based on an original accretion of cash when other banks do not also expand their deposits.
In the banking system in this example, a multiple increase in deposit money is created
when all banks with excess reserves (i.e., money left after keeping the required reserve ratio
of 0.1) expand their deposits in step with each other.
In the above setup, the implicit assumption was that depositors do not wish to hold cash
out of their deposits. But in reality, the public wishes to hold some cash, say, equal to
10 percent of the size of its bank deposits. How does this affect the process of credit
creation?
As we already know, high-powered money (H) is the sum of currency (C) and reserves (R):
$$H = C + R \tag{1}$$
which means that total cash is held either by the banks or by the public. Let the required reserve
deposit ratio be r. Then,
$$R = rD \tag{2}$$
where D is total deposits. Let the cash deposit ratio of the public be b. Then,
$$C = bD \tag{3}$$
Substituting (2) and (3) into (1),
$$H = bD + rD$$
$$D = \frac{H}{b + r} \tag{4}$$
In Equation (4), the deposit multiplier becomes 1/(b+r) (in the case of a cash drain), which is to be
compared with the previous deposit multiplier, 1/r, where the cash deposit ratio was assumed to be
zero. If the cash deposit ratio in equation (4) is set to zero, the deposit multiplier again becomes
1/r. A positive value of b lowers the increase in deposits, as it is cash drained out of the
expansion process.
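The effect of the cash drain on deposits can be seen by evaluating Equation (4) with and without it. This is a small Python sketch; H = 100 is a hypothetical monetary base chosen for illustration, while r = 0.1 and b = 0.1 are the ratios used in the text.

```python
def deposits_from_base(H, r, b=0.0):
    """Total deposits D = H / (b + r), Equation (4)."""
    return H / (b + r)


H = 100.0  # hypothetical monetary base, for illustration only
no_cash_drain = deposits_from_base(H, r=0.1)            # multiplier 1/r
with_cash_drain = deposits_from_base(H, r=0.1, b=0.1)   # multiplier 1/(b+r)

print(no_cash_drain, with_cash_drain)
```

With b = 0.1 the deposit multiplier falls from 10 to 5, so the same base supports roughly half as many deposits.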
MONEY MULTIPLIER
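Money supply, M, is the sum of currency and deposits:
$$M = C + D \tag{5}$$
$$M = bD + D \quad \text{from (3)}$$
$$M = D(1 + b)$$
$$M = \frac{1 + b}{b + r} H \quad \text{from (4)}$$
where $\frac{1+b}{b+r}$ is the money multiplier.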
So the money multiplier is $m = \frac{M_3}{M_0} = \frac{\text{Rs. } 71{,}986.8 \text{ billion}}{\text{Rs. } 14{,}200.5 \text{ billion}} \approx 5.0693$.
[Figure: money multiplier, April 2008 onward]
The size of the money multiplier is greater the smaller the banks' reserve deposit ratio, r,
and the smaller the cash deposit ratio, b. Both b and r are drains on the deposit or credit
expansion process.
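Both the theoretical multiplier and the observed ratio quoted above are easy to compute. In the Python sketch below, the values b = 0.1 and r = 0.1 repeat the ratios used earlier in the chapter, and the M3 and M0 figures are those quoted in the text; the function name is illustrative.

```python
def money_multiplier(b, r):
    """Money multiplier (1 + b) / (b + r)."""
    return (1 + b) / (b + r)


# Theoretical multiplier for a cash deposit ratio and reserve deposit ratio of 0.1 each.
m_theory = money_multiplier(b=0.1, r=0.1)

# A smaller r or a smaller b gives a larger multiplier.
m_lower_r = money_multiplier(b=0.1, r=0.05)
m_lower_b = money_multiplier(b=0.05, r=0.1)

# Observed multiplier from the figures quoted in the text (Rs. billion).
m_observed = 71986.8 / 14200.5

print(m_theory, m_lower_r, m_lower_b, round(m_observed, 4))
```

The comparison confirms the direction of both effects: lowering either drain raises the multiplier.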
MONETARY POLICY
Monetary policy is the policy of the central bank of an economy that deals with the quantity
of money to be supplied in the economy. Monetary policy is an important tool for influencing the
macroeconomy. Money supply has a direct one-to-one relationship with prices in the economy
(a result of the quantity theory of money), which implies that, if the central bank wishes
to contain the inflation rate in the economy, it can do so by changing the
monetary base of the economy. One of the primary objectives of monetary policy is to
contain the inflation rate.
Considering money supply and money demand as functions of the interest rate, money
demand slopes downward to the right while money supply is vertical. Money demand is
negatively related to the interest rate, as was observed in the last chapter. Money supply is
determined by the central bank's decision on high-powered money, so it is fixed at some given
level, irrespective of the interest rate. For this, consider the following figure.
If the central bank decides to increase the money supply in the economy, then the money supply
curve shifts to the right from m0 to m1 and the equilibrium interest rate falls from r0 to r1, as
shown in the above figure.
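The fall in the equilibrium interest rate can be illustrated numerically. The Python sketch below assumes a hypothetical linear money demand curve, M/P = kY - hr, together with a vertical money supply; the functional form, the parameter values, and the money supply figures are all illustrative assumptions, not values from the text.

```python
def equilibrium_interest_rate(M, P, Y, k=0.5, h=1000.0):
    """Solve M/P = k*Y - h*r for r, given a fixed (vertical) money supply M.

    The linear demand curve and the values of k and h are hypothetical.
    """
    return (k * Y - M / P) / h


r0 = equilibrium_interest_rate(M=40000.0, P=100.0, Y=1000.0)  # initial supply m0
r1 = equilibrium_interest_rate(M=42000.0, P=100.0, Y=1000.0)  # larger supply m1

print(r0, r1)  # the equilibrium rate is lower at the larger money supply
```

Because money demand slopes downward, the larger money stock is willingly held only at a lower interest rate.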
This fall in the interest rate induces investment in the economy. We know from the
chapter on National Income Accounting that investment is a part of national income.
So as money supply in an economy expands, the interest rate falls, which induces investment in
the economy, and hence national income increases. Raising national income could thus be the second
objective of monetary policy.
As discussed in the last section on the money multiplier, money supply is determined by three
factors: H (high-powered money), r (reserve deposit ratio) and b (cash deposit ratio).
The central bank can change the monetary base of the economy or can change the
requirement for reserve deposits.
Central banks control money supply in the economy through the following policy
instruments:
If the RBI purchases securities from private investors, they receive currency or deposits
as a result of this transaction, which means that it increases the monetary base and
thus the money supply. This purchase of assets is known as an open market purchase.
The sale of assets to the public by the central bank is called an open market sale. It
reduces the monetary base and the money supply. Open market purchases and sales
are collectively called open market operations.
For example, if the RBI purchases assets worth Rs.100 cr., then the monetary base increases by
Rs.100 cr. Assuming a money multiplier of 10, total money supply increases by Rs.1000 cr.
in the economy due to the RBI's open market purchases.
Table 1: Effect of open market operations on the money supply
[Balance sheets (Liabilities Rs. | Assets Rs.) of the central bank, the commercial banks, and Shyam, in Panels 1 to 3. Panel 1 totals: Rs. 100, Rs. 100, and Rs. 5, respectively.]
Let us look at Table 1 to understand how open market operations affect money supply in the
economy. In Panel 1, the central bank has Rs.100 of government securities. Its liabilities
consist of Rs.20 of reserve deposits and Rs.80 of currency. With a required reserve ratio of 0.2,
Rs.20 of reserves can support Rs.100 (=20/0.2) of deposits in commercial banks. Panel 1 also
shows Shyam's financial position.
Now imagine that the central bank decides to make an open market sale of securities worth Rs.5
to a private investor, Shyam. Shyam writes a cheque to the central bank to complete this
transaction. Reserve deposits at the central bank (that is, the reserves of commercial banks)
are reduced by Rs.5. These changes are shown in Panel 2.
The story does not end here. Since reserves are reduced to Rs.15, which can now support
deposits of only Rs.75 (=15/0.2), loans in the final equilibrium are reduced to
Rs.60. Banks do not call in loans; rather, loans and deposits are reduced by slowing
down the rate of new lending as old loans come due and are paid off. Deposits have
changed by Rs.25 (from Rs.100 to Rs.75). In this example, the change in money (Rs.25) is
equal to the money multiplier (5) times the change in reserves (Rs.5). Money supply, defined
as the sum of deposits and currency, decreased from Rs.180 to Rs.155.
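The arithmetic of this open market sale can be reproduced in a few lines of Python. The sketch uses the figures from the example (currency of Rs.80, reserves falling from Rs.20 to Rs.15, required reserve ratio 0.2) and assumes, as the example does, that banks hold no excess reserves; the function names are illustrative.

```python
def deposits_supported(reserves, required_reserve_ratio):
    """Maximum deposits that a given level of reserves can support."""
    return reserves / required_reserve_ratio


def money_supply(currency, reserves, required_reserve_ratio):
    """Money supply = currency + deposits (no excess reserves assumed)."""
    return currency + deposits_supported(reserves, required_reserve_ratio)


before = money_supply(currency=80.0, reserves=20.0, required_reserve_ratio=0.2)
after = money_supply(currency=80.0, reserves=15.0, required_reserve_ratio=0.2)

change_in_reserves = 20.0 - 15.0
multiplier = 1 / 0.2

print(before, after, before - after, multiplier * change_in_reserves)
```

Currency of Rs.80 plus deposits of Rs.100 gives a starting money supply of about Rs.180, and the Rs.25 fall equals the multiplier of 5 times the Rs.5 fall in reserves.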
Changes in reserves (discussed in the last section) bring changes in the money supply.
When a central bank changes the required reserve ratio in the economy, the money
multiplier changes and hence the money supply changes.
Suppose the central bank announces that the required reserve ratio is reduced from 20 percent to
12.5 percent. The changes in the money supply are shown in Table 2.
Initially, when the required reserve ratio is 20%, the balance sheets of the central bank and
commercial banks are as shown in Panel 1 of Table 2. When the required reserve ratio is lowered
to 12.5%, then out of Rs.500 of deposits only Rs.62.5 need be kept as reserves and the extra
Rs.37.5 can be lent out, which in turn creates deposits of Rs.37.5 times the money
multiplier (8), i.e., Rs.300 of additional deposits are created.
So the deposits that can be supported with a 12.5% required reserve ratio become Rs.800,
and reserves equal 12.5% of deposits (Rs.800), i.e., Rs.100. Money supply has increased
from Rs.600 (Rs.100 of currency and Rs.500 of deposits) to Rs.900 (Rs.100 of currency and Rs.800 of
deposits).
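The effect of the cut in the required reserve ratio follows from the excess reserves it frees up. The Python sketch below uses the figures from the example (deposits Rs.500, reserves Rs.100, currency Rs.100, ratio lowered to 12.5%) and assumes no cash drain, so the multiplier is 1 divided by the new ratio; the function name is illustrative.

```python
def deposits_after_ratio_cut(deposits, reserves, new_ratio):
    """Deposits once excess reserves freed by a lower ratio are fully lent out."""
    required = new_ratio * deposits  # reserves now required
    excess = reserves - required     # reserves freed for lending
    multiplier = 1 / new_ratio       # deposit multiplier, no cash drain assumed
    return deposits + excess * multiplier


currency = 100.0
old_deposits = 500.0
new_deposits = deposits_after_ratio_cut(old_deposits, reserves=100.0, new_ratio=0.125)

print(currency + old_deposits, currency + new_deposits)  # money supply before and after
```

The freed Rs.37.5 of reserves, multiplied by 8, adds Rs.300 of deposits, which matches the move from Rs.600 to Rs.900 described above.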
The Cash Reserve Ratio (CRR) is the share of their deposits that banks have to keep with the
RBI (India's central bank). The Statutory Liquidity Ratio (SLR) refers to the share that a
commercial bank is required to maintain in the form of gold or government securities before
providing credit to customers.
Bank Rate
The bank rate, also referred to as the discount rate, is the rate of interest that a central bank
charges on the loans it advances to commercial banks. When banks borrow, money
supply increases. The central bank's lending of money to banks is called discount window
lending. The higher the discount rate, the higher the cost of borrowing and the less
borrowing banks will want to do. If the central bank wants to curtail the growth of
money supply, it can raise the discount rate and discourage banks from borrowing from it,
restricting the growth of reserves (and ultimately deposits).
CRR : 4%
SLR : 23%
A repo, or repurchase agreement, is the sale of securities to the central bank together with an
agreement for the commercial banks to buy back the securities at a later date. The
repurchase price is greater than the original sale price; the difference effectively
represents interest, and the implied rate is called the repo rate. A reverse repo is the sale of
securities to commercial banks together with an agreement for the central bank to buy back the securities at a later
date. An increase in reverse repo rate can prompt banks to park more funds with the central
bank to earn higher return on idle cash. It is also a tool, which can be used by the central
bank to drain excess money out of banking system.
SUMMARY
Banks create money by making loans. When a bank makes a loan to a customer, it
creates a deposit in that customer's account. This deposit becomes part of the money
supply. Banks can create money only when they have excess reserves, and the credit
creation process is successful only when all banks loan out their excess reserves.
Money supply in the economy is determined by the monetary base times the money
multiplier. In the simplest case, with no cash drain, the money multiplier is equal to
1/required reserve ratio.
The central bank pursues monetary policy and controls money supply in the economy.
Central banks can change either the monetary base or the multiplier through their policies.
Central banks have the following tools to control the money supply: (1) open
market operations (the buying and selling of already existing government
securities); (2) changing the required reserve ratio (reducing this ratio increases the
multiplier); (3) changing the discount rate (raising the discount rate decreases money
supply); and (4) changing the repo and reverse repo rates.
EXERCISES
Q1. Decide whether the RBI has taken the correct step as per the requirement or not. What
would be the outcome?
b. During period of rapid real growth, RBI should inject money in the economy.
Q2. What are the ways in which a central bank can influence the money supply?
Q3. What would happen to money supply if general public chose to hold (a) no cash, (b)
no bank deposits?
Q4. What is the money multiplier? What factors determine its value?
Numericals
Reserves 200
Total 350 Total 350
Calculate:
GLOSSARY
Bank Reserves: Liquid assets held by banks to meet depositors' demands for withdrawals
are called bank reserves.
Reserve-Deposit Ratio: The fraction of a bank's outstanding deposits that is kept as
reserves is known as the reserve-deposit ratio.
Fractional Reserve Banking: If the reserve-deposit ratio is less than 1, i.e. reserves are a
fraction of deposits, then the banking system is known as fractional reserve
banking.
Money Multiplier: The money multiplier is the multiple by which the total supply of money
can increase for every unit increase in reserves. The money multiplier is equal to
1 / required reserve ratio.
Open Market Operations: When the central bank purchases securities from, or sells them to,
private investors in the economy, the money supply increases or decreases, respectively. These open
market purchases and sales are collectively known as open market operations.
Cash Reserve Ratio: The cash reserve ratio is the share of their deposits that banks have
to keep with the central bank.
Statutory Liquidity Ratio: The statutory liquidity ratio is the share of deposits that a commercial
bank is required to maintain in the form of gold or government securities before providing credit
to customers.
Repo Rate: The repo (repurchase) rate is the rate at which the RBI lends short-term
money to banks.
Reverse Repo Rate: The reverse repo rate is the rate at which banks park their short-term
excess liquidity with the RBI.
Bank Rate: The bank rate is the rate of interest that a central bank charges on its loans
and advances to a commercial bank.
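The money multiplier entry above can be illustrated with a short calculation. This is a minimal sketch: the function name and the figures (a 10% reserve ratio and Rs.100 of new reserves) are assumptions chosen only for illustration, not data from the lesson.

```python
def money_multiplier(required_reserve_ratio: float) -> float:
    """Simple money multiplier: 1 / required reserve ratio."""
    if not 0 < required_reserve_ratio <= 1:
        raise ValueError("reserve ratio must lie in (0, 1]")
    return 1 / required_reserve_ratio


# Hypothetical figures, for illustration only.
ratio = 0.10          # banks keep 10% of deposits as reserves
new_reserves = 100.0  # additional reserves, in rupees

multiplier = money_multiplier(ratio)
# Maximum increase in money supply supported by the new reserves.
max_increase = multiplier * new_reserves
print(multiplier, max_increase)
```

A lower reserve ratio gives a larger multiplier, which is why reducing the required reserve ratio expands the money supply.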
REFERENCES
3. www.epw.in
4. www.rbi.org
5. Marek Hudon, Should Access to Credit Be a Right?, Journal of Business Ethics, Vol. 84, No.
1 (Jan., 2009), pp. 17-28
Inflation and its social costs
Discipline Courses-I
Semester-I
Paper I: Principles of Economics (POE)
Unit-IV
Lesson: Inflation and its Social Costs
Lesson Developer: Rakhi Arora and Vaishali Kapoor
College/Department: Rajdhani College, University of Delhi
Table of Contents
1. Learning outcomes
2. Introduction
3. Inflation
b. Measuring inflation
c. Indicators of inflation
a. Expected inflation
b. Unexpected inflation
5. Hyperinflation
b. Costs of hyperinflation
6. Summary
7. Exercises
8. Glossary
9. References
Learning outcomes
After you have read this chapter, you should be able to:-
Introduction
The financial year 2010-11 started with headline inflation of 11% in April 2010, and
inflation has been a troubling issue on the Indian economic scene over the last few years.
Why are economists and policymakers so concerned about curbing inflation in the economy?
Even a layman understands that inflation, i.e. a rise in prices, makes him poorer. Inflation can
be seen as a devaluation of money or a fall in the purchasing power of the
consumer.
Inflation is universal because the money supply needs to be raised over time and the cost of inputs and
wages increases. Is it normal? When does inflation become a cause for worry? We
discussed in the last chapter that the central bank has macroeconomic tools to contain inflation.
But when would they be required?
Inflation
Inflation is a persistent increase in the general level of prices. Milton Friedman, the Nobel
Prize winning economist, said: “inflation is always and everywhere a monetary
phenomenon”. By this he meant that inflation arises whenever the money
supply grows faster than the economy over a period of time. Inflation has also been
defined as “too much money chasing too few goods”. This is the monetarist view. But
inflation may also be the result of demand-pull factors and/or supply shocks. Let’s study the
various other causes of inflation.
Causes of inflation
Prices may rise due to excessive demand. An economy can experience excessive
demand due to an increase in population, a high rate of investment (which is demand
for capital goods), an increase in government expenditure, and the increasing role
of black money.
A price rise can also occur when increases in particular prices or wage rates are passed
through the economy. These include rises in wages, profit margins and the costs of
inputs. Fresh taxation also raises the price level.
Crude oil price changes are a reason for concern as they affect the prices of other
commodities in many ways. The prices of the following goods change when the crude oil
price changes.
c. Structural rigidities
d. Nature of Economy
Measuring Inflation
Inflation plays a vital role in economic policy making, as well as in individual decision making.
The Consumer Price Index (CPI) is the tool most commonly used to measure inflation by
economists, policy makers and consumers. The following exercise will help us to understand
how the CPI is constructed, and it distinguishes between the level of a variable, i.e. the CPI,
and the rate of change in that variable, i.e. the inflation rate. It also helps us to understand which
biases the CPI is subject to.
Consider seven periods, viz. a base period and then periods 1 to 6. Suppose that in the base
period you have Rs. 30 to spend on three goods, viz. Good 1, Good 2 and Good 3, each
costing Rs. 5. Further suppose that aggregation and averaging give the constituents of a
typical consumer's basket, which is assumed to be as follows: 3 units of Good 1, 2 units of Good
2 and 1 unit of Good 3. A fourth good, Good 4, is introduced in period 5.
The following table provides the data on price change of three goods in the economy:-
Cost of basket will now be calculated with varying prices in each period. Now Price Index
can be constructed by dividing the cost of the goods and services in the representative
basket in the current period by the cost of the same representative basket in the base
period and then multiplying it by 100. This gives the following results:-
Price Index & inflation rates for basket of 3 goods computed for six periods
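The index construction described above can be sketched in a few lines. The basket (3, 2 and 1 units) and the base price of Rs. 5 come from the text; the prices for periods 1 and 2 are hypothetical values chosen only to show a rise followed by a fall, and the function names are illustrative.

```python
# Fixed representative basket from the text: units of each good.
BASKET = {"Good 1": 3, "Good 2": 2, "Good 3": 1}

# Prices in Rs.; period 0 is the base period (each good costs Rs. 5).
# Periods 1 and 2 are hypothetical figures, not the lesson's table.
PRICES = [
    {"Good 1": 5, "Good 2": 5, "Good 3": 5},  # base period
    {"Good 1": 5, "Good 2": 8, "Good 3": 5},  # period 1: Good 2 dearer
    {"Good 1": 5, "Good 2": 6, "Good 3": 5},  # period 2: Good 2 cheaper
]


def basket_cost(prices, basket=BASKET):
    """Cost of buying the fixed basket at the given prices."""
    return sum(prices[good] * qty for good, qty in basket.items())


def price_index(prices_by_period, basket=BASKET):
    """Index = 100 * (current cost of basket) / (base-period cost)."""
    base_cost = basket_cost(prices_by_period[0], basket)
    return [100 * basket_cost(p, basket) / base_cost for p in prices_by_period]


def inflation_rates(index):
    """Percentage change in the index between consecutive periods."""
    return [100 * (cur - prev) / prev for prev, cur in zip(index, index[1:])]


index = price_index(PRICES)
rates = inflation_rates(index)
print(index)
print(rates)
```

The index is a level, while the inflation rate is its percentage change; a fall in the index between two periods shows up as a negative rate.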
In the construction of the Price Index (in periods 5 and 6), we ignored Good 4, which was introduced
in the market in period 5. Similarly, the Consumer Price Index (CPI) also ignores the introduction of
new products.
Also, a fixed basket was assumed throughout, which might not be the case in real life.
When the price of Good 2 increased in period 1, the consumer might have substituted Good 1
or Good 3 for Good 2. Because the CPI ignores such substitution, it tends to overestimate inflation;
this limitation is called substitution bias.
Another important thing to learn from this example is that prices sometimes decline
between periods, as from period 1 to period 2, so one gets a negative inflation rate. A fall in
prices is referred to as “deflation”.
Indicators of inflation
The WPI (Wholesale Price Index) represents the rate of increase in the wholesale prices of
products. However, what matters to the common man is the consumer price. Though
prices in the wholesale market grow at a slower pace (about 2-3 percent), consumer prices
measured in terms of the CPI grow at a much faster pace (about 8-9 percent).
The two indices are calculated differently, both in the weights assigned to products and
in the kinds of items included in the basket of products.
At present, there are five different price indices, namely the Wholesale Price Index (WPI),
the Consumer Price Index for Industrial Workers (CPI-IW), the Consumer Price Index for Urban
Non-Manual Employees (CPI-UNME), the Consumer Price Index for Agricultural Labourers (CPI-AL)
and the Consumer Price Index for Rural Labourers (CPI-RL). CPI-IW is the best known of the consumer price
indicators as it is used for wage indexation. The Wholesale Price Index has nevertheless continued to
be the most prominent measure of headline inflation in the Indian economy because of its weekly
availability. It is an economy-wide index, which covers close to 676 commodities.
If one looks at the table below, the values of the two indices, WPI and CPI-IW, and that of the GDP
deflator have been rising consistently since 1999-2000. The figures in parentheses are all
positive, indicating that the Indian economy has been experiencing inflation in the last few years.
In India, inflation is due to both cost-push and demand-pull factors. Owing to the drought of 2002,
other bad weather conditions, or hikes in petroleum prices, India experienced inflation in the last
decade. According to the Economic Survey of 2007-08, inflation in India is a structural as well
as a monetary phenomenon.
WPI has been on the rise since the first half of 2006. The futures trading system, especially in
products like cereals, pulses, milk, sugar and edible oils, is being blamed for this. CPI-IW inflation
remained high even in 2009-10 (in double digits from July 2009 to July 2010). The major
contributors to high CPI-IW inflation were food and housing.
A new series of the Wholesale Price Index with base year 2004-05 was released on September 14, 2010. A
comparison of the weighting diagram and the number of commodities in the old and new series for
the groups is given in the table below:
Core Inflation
Core inflation is a measure of inflation, which excludes those items that have volatile price
movement, especially food and energy. Therefore, it is a preferred instrument for designing
long-term policy. Core inflation, which was 0.55 percent in November 2009, reached its
peak in April 2010 at 8.07 per cent.
Cost of Inflation
An economy that is experiencing inflation has to bear many costs. Policymakers,
economists and especially politicians are keen to take steps to
curb inflation because of public pressure. Inflation is keenly watched and widely debated by
all the stakeholders in the economy, as it is considered to be a serious economic problem.
Let’s study the costs that an economy has to face when there is inflation.
Suppose prices rise by half a percent every week. What would be the cost of such
predictable inflation?
When there is inflation, it seems at first that you can command
fewer goods than before. But is that really true? If you pay higher prices for goods
and services, then the seller gets a higher income, and so do you when you charge
a higher price. So if nominal incomes keep pace with the inflation rate,
the fall in purchasing power is just a fallacy. Inflation in itself does not
lower the real purchasing power of the consumer.
When an economy faces inflation, the value of money is eroded. To guard against that, the public
chooses to keep money in banks. But how is it ensured that this money is not losing
its value? The answer lies in the interest rate offered on deposits.
The nominal interest rate is the rate at which people pay interest to, or receive interest from,
commercial banks. The real interest rate is the nominal interest rate adjusted for the
effect of inflation; it tells us at what pace the purchasing power of our
deposited money is growing, or at least whether it is being eroded.
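The adjustment from nominal to real interest rates can be sketched as follows. The approximation (real rate = nominal rate minus inflation) is the usual textbook form; the exact version and the sample rates of 8% and 5% are assumptions added here for illustration.

```python
def real_rate_approx(nominal: float, inflation: float) -> float:
    """Textbook approximation: real rate = nominal rate - inflation rate."""
    return nominal - inflation


def real_rate_exact(nominal: float, inflation: float) -> float:
    """Exact growth in purchasing power of one rupee held on deposit."""
    return (1 + nominal) / (1 + inflation) - 1


# Hypothetical rates: 8% nominal interest, 5% inflation.
print(real_rate_approx(0.08, 0.05))
print(real_rate_exact(0.08, 0.05))
```

For low rates the two figures are close; the gap widens as inflation rises, which is why the approximation is unreliable at very high inflation.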
Inflation imposes a cost on the public by distorting the amount of money
they choose to hold. Higher inflation leads to a higher nominal interest rate via the Fisher effect and
hence to lower real money balances. People then hold lower money balances on
average, which means they make more frequent trips to the bank to
withdraw money. They might withdraw Rs.1000 twice a week instead of Rs.2000 once a week.
This cost of wearing out one’s shoes (while making frequent trips to the bank) is
metaphorically called the shoe leather cost of inflation.
3. Menu costs
Menu costs arise because high inflation causes firms to change their
posted prices more often. Changing prices is costly, as it requires printing
and distributing a new catalogue. The costs that arise in this way are called
menu costs, because firms have to revise the price lists in their menu cards
more often whenever the rate of inflation is high.
Another cost of inflation arises because some provisions in the tax code do
not take the effects of inflation into account. A classic example is the failure of tax
laws to deal with inflation in the case of the tax on capital gains. Suppose you buy a
stock today for, say, Rs.100 and sell it a year from now for Rs.115. It seems
reasonable for the government to tax your capital gain of Rs.15 (Rs.115 - Rs.100).
Suppose, however, that the economy has an inflation rate of 15% over the same period.
In that case, you have not earned any real income from this investment. But the
tax code fails to take the effect of inflation into account, and the government levies a tax
on the nominal rather than the real income earned. This is how inflation distorts tax
liabilities.
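The capital gains example can be worked through numerically. The purchase price, sale price and inflation rate are those of the example; the 20% tax rate and the function name are assumptions introduced only for illustration.

```python
def capital_gain_outcome(buy, sell, inflation, tax_rate):
    """Return nominal gain, tax due, and real gain before and after tax.

    Real values are expressed in purchase-date rupees.
    """
    nominal_gain = sell - buy
    tax = tax_rate * max(nominal_gain, 0)  # tax is levied on the nominal gain
    real_gain_before_tax = sell / (1 + inflation) - buy
    real_gain_after_tax = (sell - tax) / (1 + inflation) - buy
    return nominal_gain, tax, real_gain_before_tax, real_gain_after_tax


# Figures from the example; the 20% tax rate is hypothetical.
nominal, tax, real_before, real_after = capital_gain_outcome(
    buy=100, sell=115, inflation=0.15, tax_rate=0.20
)
print(nominal, tax, round(real_before, 2), round(real_after, 2))
```

Although the real gain before tax is zero, tax is still due on the nominal gain of Rs.15, so the investor ends up with a real loss.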
A further cost arises because firms that face menu costs change their prices
infrequently, so inflation causes variation in relative prices. For example, suppose McDonalds revises
its menu prices in the month of January every year. If the rate of inflation is zero,
then the firm’s prices relative to the overall price level are constant over the year.
But if inflation is 0.5 percent per month, then by the end of the year the firm’s relative
prices have fallen by about 6 percent. The firm’s prices would be relatively high early in the year, when
sales tend to be low, and relatively low later in the year, when sales tend
to be high. Hence, inflation not only brings variability in relative prices but also
leads to an inefficient allocation of resources.
6. Inconvenience
Another cost of inflation is the inconvenience of living in a world where prices,
and hence the value of the rupee, keep changing. Money is used as a yardstick for
measuring economic transactions and therefore, when an economy experiences
The costs of unexpected inflation are more destructive than those of anticipated,
regular inflation. Unexpected inflation leads to an arbitrary redistribution of wealth in an
economy. This is best understood by examining long-term
loans. Most loan agreements specify a fixed nominal interest rate, which is the sum of the
real interest rate and the rate of inflation expected over the term of the loan. If inflation
turns out to be different from what both parties expected, then the ex post real
return that the debtor pays to the creditor differs from what both parties expected. The
debtor gains and the creditor loses if inflation is higher than expected; conversely, if
inflation is lower than expected, the creditor gains and the debtor loses. Suppose a loan
agreement states that a sum of Rs.100 is lent for a year at a rate of 10% (the rate of expected
inflation). If actual inflation turns out to be 15%, the debtor gains, as he/she
repays the loan with a smaller real amount. On the other hand, if inflation turns out to be 5%,
the creditor gains because the repayment is worth more than expected in real terms.
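The redistribution in the loan example can be made concrete by computing the real value of the repayment. This is a sketch under the assumption, taken from the example, that the 10% nominal rate just matches expected inflation; the function name is illustrative.

```python
def real_repayment(principal: float, nominal_rate: float,
                   actual_inflation: float) -> float:
    """Repayment after one year, measured in start-of-year rupees."""
    return principal * (1 + nominal_rate) / (1 + actual_inflation)


# Rs.100 lent at a fixed 10% nominal rate, under three inflation outcomes.
for inflation in (0.05, 0.10, 0.15):
    print(inflation, round(real_repayment(100, 0.10, inflation), 2))
```

When actual inflation equals the expected 10%, the creditor gets back exactly Rs.100 of purchasing power; at 15% the real repayment falls below Rs.100 (the debtor gains), and at 5% it rises above Rs.100 (the creditor gains).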
The free silver movement, the election of 1896, and the Wizard of Oz
The redistributions of wealth caused by unexpected changes in the price level are often
a source of political turmoil, as evidenced by the Free Silver movement in the nineteenth
century. From 1880 to 1896 the price level in the United States fell 23 percent. This
deflation was good for creditors, primarily the bankers of the Northeast, but it was bad
for debtors, primarily the farmers of the South and West. One proposed solution to this
problem was to replace the gold standard with a bimetallic standard, under which
both gold and silver could be minted into coins. The move to a bimetallic standard would
increase the money supply and stop the deflation.
The silver movement dominated the presidential election of 1896. William McKinley,
the Republican nominee, campaigned on a platform of preserving the gold standard.
Individuals with fixed pensions are also hurt by unexpected inflation, since workers and
firms agree on a fixed nominal pension to be paid when the worker retires. As
in our previous example, the worker loses when inflation is higher than expected, because the fixed
pension has a lower real worth when he retires. Like any debtor, the firm loses if
inflation is lower than anticipated.
Given the impact of inflation on the positions of debtors and creditors, it is puzzling that
contracts in nominal terms are still widespread. One might expect some sort of indexation
to the changing price level. In economies where inflation is high and volatile, indexation
is prevalent. Hence, loans are made available at floating rather than fixed
interest rates.
Hyperinflation
When inflation surpasses the benchmark of 50% per month, i.e. a little above
1% per day, it is termed hyperinflation. Such a high rate of inflation, when compounded
over several months, leads to very large increases in the level of prices.
A 50% inflation rate per month implies a more than 100-fold
increase in the level of prices over a year, and a more than 2-million-fold increase over 3 years.
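The compounding behind these figures can be checked directly. The sketch below assumes a 30-day month for the daily equivalent; the function names are illustrative.

```python
def price_level_multiple(monthly_rate: float, months: int) -> float:
    """Factor by which prices rise when inflation compounds monthly."""
    return (1 + monthly_rate) ** months


def daily_equivalent(monthly_rate: float, days_per_month: int = 30) -> float:
    """Daily inflation rate that compounds to the given monthly rate."""
    return (1 + monthly_rate) ** (1 / days_per_month) - 1


print(price_level_multiple(0.50, 12))   # one year at 50% per month
print(price_level_multiple(0.50, 36))   # three years at 50% per month
print(daily_equivalent(0.50))           # a little above 1% per day
```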
Excessive growth in the money supply causes hyperinflation. The price level
rises when the central bank prints money, and hyperinflation results when it prints
money rapidly. A reduction in the rate of money growth by the central bank can stop
hyperinflation.
When the government faces a budget deficit, it seeks to borrow, but it may fail to do so if
lenders consider it a bad credit risk. It then resorts to deficit financing to
cover the budget deficit, which results in rapid money growth and
hyperinflation.
Fiscal problems worsen with the advent of hyperinflation. Real tax revenue falls as inflation rises,
because of delays in tax collection. Thus, the government’s reliance on
seigniorage is self-reinforcing. Fast money creation causes hyperinflation, which results
in higher budget deficits, and consequently money is created even more rapidly.
When the scale of the trouble becomes evident, the government musters the political will to
reduce its spending and raise taxes. These
fiscal reforms reduce the need for seigniorage, which in
turn permits a reduction in money growth. It can therefore be said that even if inflation is
always a monetary phenomenon, the end of hyperinflation is almost always a
fiscal phenomenon.
It is widely accepted that hyperinflation takes a high toll on society. The
costs of hyperinflation are similar in kind to those of moderate inflation; it is just that, owing to their
severity, they are more noticeable.
Business executives devote a great amount of time and energy to cash
management when cash loses its value quickly. This time and energy is diverted from more socially valuable
activities, such as production and investment decisions, so
the economy runs less efficiently during hyperinflation. In a nutshell, the
shoe leather costs associated with reduced money holdings are very severe under
hyperinflation.
Menu costs become significant in times of hyperinflation, as firms are required to
change prices frequently. Regular business practices, such as printing and distributing
catalogues with fixed prices, become infeasible. For instance, during the German hyperinflation of the 1920s, a
waiter in a restaurant had to call out new prices every half hour at every table.
In a similar fashion, in times of hyperinflation relative prices do not reflect
true scarcity. It becomes very difficult for customers to shop for the best price, as
prices fluctuate significantly and frequently. Consumer behaviour is also distorted in a
variety of ways by extremely volatile and rapidly rising prices.
Finally, there is the sheer inconvenience of living with hyperinflation. The
monetary system no longer does its best to facilitate exchange when it is as troublesome
to carry money to the grocery store as it is to carry the groceries back home. The ready
solution adopted by governments is to add more and more zeros to the paper
currency, but this often fails to keep pace with the exploding price level.
In due course, the costs of hyperinflation become unbearable, as money can no longer perform its functions as
a store of value, medium of exchange and unit of account. Barter replaces
money as a common means of exchange, and more stable unofficial monies, such as cigarettes,
replace the official money.
Summary
The costs of expected inflation include shoe leather costs, menu costs, the cost of tax
distortions, relative price variability and the inconvenience of making inflation
corrections. In addition, unexpected inflation causes arbitrary redistributions of
wealth between debtors and creditors.
Hyperinflations usually begin when the government resorts to deficit financing to cover
its budget deficits. Most of the costs of inflation become more severe during
hyperinflation.
Glossary
Exercises
Q1. In a country experiencing a low rate of inflation it is quoted from a newspaper: “low
inflation has a downside: 45 million recipients of social security and other benefits will see
their checks go up by just 2.8 percent next year.”
a. Why does inflation lead to increase in social security and other benefits?
b. Is this effect cost of inflation? Why or why not?
Q2. State whether the following statements are true or false. Why or why not?
Q3. If inflation rises from 6 to 8 percent, what happens to real and nominal interest rates
according to the Fisher effect?
Q1. What is inflation and what are its causes? How is it measured?
Q2. List all the costs of inflation and rank them according to how important you think they
are.
Q3. How does inflation affect the ability of money to serve its functions- medium of
exchange, unit of account and store of value?
Numericals
Q1. The CPI in a country is 113 in the year 2010-11 and rises to 133 in the year 2011-12.
What can we say about the inflation rate in the economy? Over the same period, the WPI
decreased from 109 to 101. How would you explain such changes in the economy?
Q2. In a country, the velocity of money is constant. Real GDP grows by 5 percent per year,
the money stock grows by 14 percent per year and the nominal interest rate is 11 percent.
What is the real interest rate?
References
3.http://www.caribank.org/uploads/publicationsreports/staffpapers/Inflation%20starts%20i
n%20LACV2e%20manuscript.pdf
5. Denise Hazlett and Cynthia D. Hill, Calculating the Candy Price Index: A Classroom
Inflation Experiment, The Journal of Economic Education, Vol. 34, No. 3 (Summer, 2003),
pp. 214-223
Discipline Courses-I
Semester-I
Paper II: Mathematical Methods for Economics: Preliminaries-I
Unit-I
Lesson: Preliminaries-II
Lesson Developer: Sanjeev Kumar
College/Department: Dyal Singh College, University of Delhi
CONTENTS:
References
Introduction
According to R. Dedekind, "in science, what can be proved should not be believed
without proof". Theorems are the most important outcome of every branch of
mathematics. The proof of these theorems is the heart of mathematics, and it distinguishes
mathematics from other disciplines. Put simply, a proof is a chain of
reasoning that establishes the truth of a particular statement or proposition. For example,
the Pythagoras theorem is an important proven result of this kind.
Example: Prove that the sum and the product of any three consecutive even numbers are
always a multiple of 6 and a multiple of 8, respectively.
Proof: Each of the three consecutive even numbers must be a multiple of 2, so we can write
them as 2N, 2N + 2 and 2N + 4, where N is a whole number.
Take the first case; the sum of the three consecutive even numbers is
$2N + (2N + 2) + (2N + 4) = 6N + 6$
$= 6(N + 1)$
Take the second case; the product of the three consecutive even numbers is
$2N(2N + 2)(2N + 4) = 8N(N + 1)(N + 2)$
Direct Proof:
A direct proof is a mathematical argument that uses rules of inference to derive the
conclusion from the premises.
Proof: Let x and y be two even numbers. Then there exist integers m and n such
that
x = 2m and y = 2n
x + y = 2m + 2n
= 2(m + n)
Indirect Proof:
4x >3 x2 + 3
(iii) Contradiction Proof: suppose the statement is not true then there exist an
x such that;-x2 + 4x -3>0 and x>0
P(n + 1) is true, whenever P(n) is true i.e. P(n) is true implies that P (n + 1) is also
true, so P(n) is true for all natural numbers
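Theorem: Prove that the sum of the first n odd natural numbers is $n^2$,
i.e. $1 + 3 + 5 + 7 + 9 + \dots + (2n - 1) = n^2$
Suppose the statement holds for n = k. Adding the next odd number, 2k + 1, to both sides of the equation for n = k, we get
$1 + 3 + 5 + \dots + (2k - 1) + (2k + 1) = k^2 + 2k + 1$
Then, we conclude;
$1 + 3 + 5 + \dots + (2k + 1) = (k + 1)^2$
So the statement is also true for k + 1, and therefore it holds for all natural numbers n.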
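The identity can also be checked numerically for small n. This is only a sanity check, not a proof: it verifies finitely many cases, whereas induction covers every natural number. The function name is illustrative.

```python
def sum_first_n_odds(n: int) -> int:
    """Sum 1 + 3 + 5 + ... + (2n - 1) computed term by term."""
    return sum(2 * k - 1 for k in range(1, n + 1))


# Compare the term-by-term sum with n**2 for the first 100 values of n.
all_match = all(sum_first_n_odds(n) == n ** 2 for n in range(1, 101))
print(all_match)
```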
The three methods of proof above are all forms of deductive reasoning.
Deductive reasoning is based on consistent rules of logic, and proof is an
important part of it. The second type of reasoning is called inductive reasoning, which is used in
many branches of science and social science. In this type of reasoning, the
conclusion is drawn from a few observations. For example: if the price level has increased over
the last 20 years, then prices will also increase in the coming year. This example
demonstrates inductive reasoning. In fact, there is no guarantee that the price level will increase in
the coming year. So inductive reasoning is not recognized as a form of proof in
mathematics.
Then, $\frac{x + y}{2} > \sqrt{xy}$
$(x + y)^2 > 4xy$
Conclusion is true if
Problem Set
x2 + 3x -2 > 0 x>0
3. Use mathematical induction to prove that $n < 2^n$ for all natural numbers n.
4. Prove that the sum of the squares of three consecutive numbers, less two,
is always a multiple of 3.
$1 + 2 + 3 + 4 + \dots + n = \frac{1}{2}n(n + 1)$
Introduction
S = {a, b, c, d}
Types of Sets
Finite Set: A set having a finite number of elements is called a finite set.
Infinite Set: A set having an infinite number of elements is called an infinite set.
Null (empty) Set: If a set has no elements, it is called the null set. It is
denoted by $\emptyset$ (phi).
For example: S = { }
The Universal Set: The set of all objects under consideration is called the universal set, and it is denoted by U.
Equal Sets: If two sets have the same elements, they are called equal sets.
For example: A = {1, 2, 3} and B = {3, 2, 1}. Then A = B, so A and B are equal sets.
Let S = {1, 2, 3, 4, 5}
Power Set: The power set of S is the set of all subsets of S. It is denoted by P(S).
Example 2: S= {a, b, c}
=8
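The count of 8 subsets for S = {a, b, c} can be confirmed by listing the power set. A minimal sketch using the standard library; the function name is illustrative.

```python
from itertools import chain, combinations


def power_set(s):
    """Return all subsets of s, from the empty set up to s itself."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [set(subset) for subset in subsets]


S = {"a", "b", "c"}
P = power_set(S)
print(len(P))  # a set with n elements has 2**n subsets
```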
Venn Diagrams
A Venn diagram is a diagrammatic representation of sets. It is an easy way to
understand set theory.
AB =
Set Operations
$A \cup B$ (A union B): $A \cup B$ is the set of those elements which belong to set A or to
set B or to both sets;
$A \cup B = \{x : x \in A \text{ or } x \in B\}$
$A \cap B$ (A intersection B): The set of elements common to set A and set B is called $A \cap B$.
$A \cap B = \{x : x \in A \text{ and } x \in B\}$
AB = {2, 3}
Associative Law
Distribution Law
De Morgan’s Law
$(A \cup B)^c = A^c \cap B^c$
Or, $(A \cap B)^c = A^c \cup B^c$
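De Morgan's laws can be checked on a small example with Python's built-in sets. The universal set U and the sets A and B below are hypothetical values chosen only for illustration.

```python
# Hypothetical universal set and two subsets of it.
U = set(range(1, 11))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}


def complement(s, universe=U):
    """Elements of the universal set that are not in s."""
    return universe - s


# Complement of a union equals the intersection of the complements.
first_law = complement(A | B) == complement(A) & complement(B)
# Complement of an intersection equals the union of the complements.
second_law = complement(A & B) == complement(A) | complement(B)
print(first_law, second_law)
```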
(i) A ’ B = A B’ (F)
(ii) A B A B =B (T)
(iii) XY = XZ Y=Z (F)
(iv) A (BC) = (AB) (AC) (T)
X(YZ) = {2, 3, 4, 5}
Example 3: In a survey of the reading habits of 100 students, it was found that 50 students
used the university library, 40 students had their own library and 30 students borrowed from
friends. It was also found that 20 students used both libraries, i.e. the university library and their own library, 50
students used their own library as well as borrowed from friends, while 10 students used
the university library and also borrowed from friends. How many students used all
three sources of books?
Solution: Let A, B and C denote the sources of books, i.e. the university library, own library and borrowing
from friends.
Given: $n(A \cup B \cup C) = 100$, $n(A) = 50$, $n(B) = 40$, $n(C) = 30$, $n(A \cap B) = 20$,
$n(B \cap C) = 50$ and $n(A \cap C) = 10$
$n(A \cap B \cap C) = n(A \cup B \cup C) - n(A) - n(B) - n(C) + n(A \cap B) + n(B \cap C) + n(A \cap C)$
= 100 – 50 – 40 – 30 + 20 + 50 + 10
= 60 students
where $x_1$ and $x_2$ are the quantities of the goods, $p_1 > 0$ and $p_2 > 0$ are their respective prices,
and M > 0 is the income of the consumer. Illustrate in a diagram the sets;
Now, BC = C
A = {(x, y): x>0 for all y and y>0 for all x.}
A = {(x, y): x>0 for all y and y>0 for all x.}
PROBLEM SET
Answer:
(3) If x and y are finite sets, then prove the following:
(I) $n(x \cup y) = n(x) + n(y) - n(x \cap y)$
(II) $n(x / y) = n(x) - n(x \cap y)$
Answer: 10 students
Answer: A B = {(1, ), (1, ), (1, ), (2, ),(2, ),(2, ),(3, ),(3, ),(3, )}
(6) Suppose A = {a, b, c}, B = {a, b, c, d} and C = {a, b, c, d, e}; then prove that $A \subset B$
and $B \subset C$ implies $A \subset C$
(7) Asked, if you will vote for ‘x’ party the following responses are recorded
Male: 20 40 10
Female: 40 15 15
Youth (Just as 18 years) 20 10 10
Find; (i) n(A) (ii) n(AS) (iii) n (YN)’ (iv) n[A (YN)]
(8) (i) Given, A= {(x, y): x-y0}, B= {(x, y): |x| y0}, C= {(x, y): x y1}
(a)Set A= {(x, y): y IxI} (b) Set B = {(x, y): y 1/IxI} (c) Set AB
If both sets are equal, i.e. A = B, then R is called a binary relation on the set A.
Notation:
Example:
Inverse Relation: Let R be a relation from A to B. Then inverse relation R-1 can
be defined as:
Solution: R = A× B
= {(a,1),(a,2),(a,3),(b,1),(b,2),(b,3),(c,1),(c,2),(c,3)}
(i) If y<x
(ii) If y = x
$R = \{(x, y) : x, y \in A,\ y = x\}$
$R = \{(1, 1), (2, 2), (3, 3)\}$
(iii) If x = 2y
$R = \{(x, y) : x, y \in A,\ x = 2y\}$
$R = \{(2, 1)\}$
Here the domain and range are {2} and {1} respectively.
Example 3: Let A is set of real numbers and the relation R defined as:
Solution:
Example 4: If $R_1$ and $R_2$ are transitive relations on a set A, is $R_1 \cup R_2$ transitive?
Solve by taking an example.
Here, we note that $R_1$ and $R_2$ are both transitive but $R_1 \cup R_2$ is not transitive.
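A small counterexample of this kind can be checked mechanically. The relations R1 and R2 below are hypothetical and are not necessarily the ones used in the lesson's own example; relations are represented as sets of ordered pairs.

```python
def is_transitive(relation) -> bool:
    """True if (a, b) and (b, d) in the relation always imply (a, d)."""
    return all(
        (a, d) in relation
        for (a, b) in relation
        for (c, d) in relation
        if b == c
    )


# Hypothetical relations on A = {1, 2, 3}.
R1 = {(1, 2)}
R2 = {(2, 3)}
union = R1 | R2  # contains (1, 2) and (2, 3) but not (1, 3)

print(is_transitive(R1), is_transitive(R2), is_transitive(union))
```

Each relation on its own is transitive (there is no chain of two pairs to test), but the union contains the chain (1, 2), (2, 3) without the pair (1, 3), so it fails the test.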
Example 5: Suppose A = {α, β, γ}, B = {1, 2, 3, 4} and C = {a, b, c, d}; find $R_2 \circ R_1$
PROBLEM SET
1. Let xQy is a relation based on the set of integers, given that 2x-y = 1, then prove
that the relation is not reflexive.
2. Let xRy is the relation of set of real numbers such that x/y=2. Then describe the
relation R2. Is the relation reflexive?
3. If the relation R from A to B is given by
R = {(x, y): x, y A×B, x = 2y+1}, graph the relation.
4. If x and y are the set of all real numbers then explain why the statement
y= |x| - 1 and y = x2 -1 give the same relation?
{Hint: x is the absolute value and it taken always positive}
Functions: A function from a set A to a set B is a rule that assigns a unique element in B to
each element of A.
xR y or yRx or y = f(x)
Solution:
Types of function
$f(x) = \frac{1}{x}$ & $g(x) = \frac{1}{x^2}$
Range (R) = $[0, \infty)$
Let $y = f(x) = x^3$
Or f (-x) = -f (x)
y = f(x) = logx
Example: Is the rule that assigns to each of the 50 students in a class his or her marks, out of a
maximum of 100 marks, a function? If yes, is the function one-to-one?
Solution: Yes, it is a function, because each student is assigned exactly one mark.
The function is one-to-one only if every student obtains a different mark
out of 100; if two students obtain the same mark, it is not one-to-one.
Note: The detailed discussion about the functions is given in next chapter.
PROBLEM SET
ANSWERS
REFERENCES
Allen, R.G.D., Mathematical Analysis for Economists, London: Macmillan and Co. Ltd
Chiang, Alpha C., Fundamental Methods of Mathematical Economics, New York: McGraw Hill
Carl P. Simon and Lawrence Blume, Mathematics for Economists, London: W. W. Norton & Co.
Michael Hoy, John Livernois, Chris McKenna, Ray Rees, Thanasis Stengos, Mathematics for
Economists, Addison-Wesley Publishers Ltd.
Discipline Courses-I
Semester-I
Paper II: Mathematical Methods for Economics: Preliminaries-I
Unit-II
Lesson: Functions, Sequence and Series
Lesson Developer: S. K. Taneja
College/Department: Ramlal Anand College (Eve.) , University of
Delhi
Functions, Sequence and Series
Contents:
1. Learning Outcome
2. Graphs and Functions
2.1 Linear functions
2.2 Point-point formula
2.3 Quadratic Function
2.4 Polynomial Functions
2.5 Rational Functions
2.6 Graphing Rational functions
3. Sequence
3.1 Bounded sequence
3.2 Finite sequence and Infinite sequence
3.3 Limit of a sequence
3.4 Convergent sequence
3.5 Divergent Sequence
3.6 Oscillatory Sequence
4. Series
4.1 Convergence and Divergence of Series
4.2 Arithmetic Series
4.3 Geometric Series
5. Exercises
6. References
1. Learning Outcome
After reading this lesson you will know the
various types of functions, i.e. linear, quadratic, polynomial and
rational functions, and their graphs. Sequences and
series are also covered in this lesson. Various types of
sequences i.e. Bounded sequence, Finite sequence and
Infinite sequence, Limit of a sequence, Convergent
Figure 1
The sign of the coordinate in each quadrant are shown in the figure.
Quadrants are numbered anticlockwise.
There are several functions which are utilized in economics and some
of them are:
Polynomial function :
Rational functions : $f(x) = \frac{g(x)}{h(x)}$
Power function :
Given any equation in x and y, we can depict the set of points in the
coordinate system; which satisfy this equation. This set of points is called
the graph of the equation. The graph of a linear equation is a straight line.
$Ax + By + C = 0$
$y = -\frac{A}{B}x - \frac{C}{B}$
The equation has now been written in the slope-intercept form. We can write it as
$y = mx + c$
Figure 2
The slope of a straight line conveys the steepness and direction of the
line.
In figure (a), the negative sign conveys that the straight line is
negatively sloped. The magnitude of the slope (m) conveys the steepness of the
line.
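If two points on a straight line are given, we can find the slope
and the equation of the line. Given two points $(x_1, y_1)$ and $(x_2, y_2)$:
slope $m = \dfrac{y_1 - y_2}{x_1 - x_2}$
$y_1 - y_2 = m(x_1 - x_2)$
Replacing $(x_2, y_2)$ with a general point (x, y) on the line, with $m = \dfrac{y_1 - y_2}{x_1 - x_2}$, gives
$y - y_1 = m(x - x_1)$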
For any point (x, y) to be on a straight line passing through the point
$(x_1, y_1)$ with slope m, it must be true that
$\dfrac{y - y_1}{x - x_1} = m$
Rearranging this
$y - y_1 = m(x - x_1)$
$y = m(x - x_1) + y_1$
$\;\; = mx + (y_1 - m x_1)$
or $y = mx + c$, where $c = y_1 - m x_1$
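The point-point and point-slope formulas above can be sketched in a few lines of Python. This is only an illustration: the function name `line_through` and the sample points (2, 5) and (4, 11) are assumptions chosen for the example, not part of the lesson.

```python
from fractions import Fraction

def line_through(p1, p2):
    """Return (m, c) for the line y = m*x + c through two points with integer coordinates."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical line: the slope is undefined")
    m = Fraction(y1 - y2, x1 - x2)   # slope from the two-point formula
    c = y1 - m * x1                  # intercept, from y = mx + (y1 - m*x1)
    return m, c

m, c = line_through((2, 5), (4, 11))
print(f"y = {m}x + ({c})")
```

Exact fractions are used so that slopes such as -1/2 are not rounded.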
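Line parallel to y = mx + c
Solution: Parallel lines have equal slopes, so the slope of the required line is also m
and its equation is given by
$y - y_1 = m(x - x_1)$
Suppose m = 3 and the line passes through the point (2, 5); then
$y - 5 = 3(x - 2)$
$y = 3x - 6 + 5 = 3x - 1$
So m = -1/2
Equation is $y - 3 = -\tfrac{1}{2}(x - 8) \;\Rightarrow\; y = -\tfrac{1}{2}x + 7$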
We take an example
$y = x^2$
To graph this function, simply pick some representative values of x and solve for
f(x), which is usually referred to as y in graphing. Plot the resulting ordered
pairs [x, f(x)] and connect them with a smooth line. The procedure is shown
below for
$y = x^2$
x     y     (x, y)
-3    9     (-3, 9)
-2    4     (-2, 4)
1     1     (1, 1)
2     4     (2, 4)
3     9     (3, 9)
Figure 4
$x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
$x^2 + bx = 0$, where a = 1 and c = 0,
can be converted into a perfect square by taking one half of the coefficient of
x, $\tfrac{b}{2}$, squaring it, $\left(\tfrac{b}{2}\right)^2$, and adding it to the original expression to obtain
$x^2 + bx + \dfrac{b^2}{4} = \left(x + \dfrac{b}{2}\right)^2$
Example:
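$x^2 + 12x + 35 = 0$
$x^2 + 12x = -35$
Take half of the coefficient of x, 12/2 = 6, and square it: $6^2 = 36$
$x^2 + 12x + 6^2 = -35 + 6^2$
$(x + 6)^2 = 1$
Take the square root of both sides and then solve for x
$x + 6 = \pm\sqrt{1} = \pm 1$
$x = -7$ and $x = -5$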
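As a rough check of the steps above, the following Python sketch completes the square for an equation of the form x^2 + bx + c = 0. The function name `solve_by_completing_square` is an assumption made for this illustration, and only real roots are handled.

```python
import math

def solve_by_completing_square(b, c):
    """Solve x^2 + b*x + c = 0 by completing the square (real roots only)."""
    half = b / 2                 # half the coefficient of x
    rhs = half ** 2 - c          # (x + b/2)^2 = (b/2)^2 - c
    if rhs < 0:
        return []                # no real roots
    root = math.sqrt(rhs)
    return sorted({-half - root, -half + root})

# The worked example: x^2 + 12x + 35 = 0
print(solve_by_completing_square(12, 35))
```

For the worked example the intermediate values are half = 6 and rhs = 36 - 35 = 1, matching the steps in the text.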
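$ax^2 + bx + c = 0$
Write
$a\left(x^2 + \dfrac{b}{a}x + \dfrac{c}{a}\right) = 0$
Now take half of $\tfrac{b}{a}$, which is $\tfrac{b}{2a}$. Take the square of $\tfrac{b}{2a}$. Add and
subtract $\left(\tfrac{b}{2a}\right)^2$ in the bracketed expression: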
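$a\left[x^2 + \dfrac{b}{a}x + \left(\dfrac{b}{2a}\right)^2 + \dfrac{c}{a} - \left(\dfrac{b}{2a}\right)^2\right] = 0$
$a\left(x + \dfrac{b}{2a}\right)^2 + \dfrac{4ac - b^2}{4a} = 0$
or $a\left(x + \dfrac{b}{2a}\right)^2 - \dfrac{b^2 - 4ac}{4a} = 0$
The vertex is at $\left(-\dfrac{b}{2a},\ \dfrac{4ac - b^2}{4a}\right)$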
From the above exercise, a quadratic function can be expressed in this form:
$y = a(x - h)^2 + k$
where the axis is $(x - h) = 0$, i.e. x = h, and the vertex is (h, k). The term h
shifts the function by h units from the origin. Whether the function shifts to the
right or to the left depends on the sign of h. For example, if
$y = (x - 3)^2 + 16$
The term k shifts the function up or down depending upon the sign of k. In
our example,
$y = (x - 3)^2 + 16$
the graph has been shifted 3 units to the right of the origin and 16 units above
the x-axis.
If a > 0, the parabola opens up and the vertex is the lowest point of the
function.
If a < 0, the parabola opens down and the vertex is the highest point.
If |a| > 1 the parabola is narrower than if |a| = 1. If 0 < |a| < 1 it is wider
than if |a| = 1.
Figure 5
The function
$f(x) = a_n x^n + a_{n-1}x^{n-1} + \dots + a_1 x + a_0, \quad a_n \neq 0$
is called a polynomial of degree n. Linear, quadratic and cubic functions are
also examples of polynomials.
A polynomial equation of degree n has at most n real solutions or roots, but it need
not have any.
A number r is a root of the equation $f(x) = 0$ if $f(r) = 0$.
Example :
$3x^3 + 5x^2 - 3x - 2 = 0$
For a rational root b/c in lowest terms, the values of b are limited to the factors of 2,
which are ±1, ±2, and the values of c are limited to the factors of 3, which are ±1, ±3.
Hence the only possible rational roots are ±1, ±2 and ±1/3, ±2/3.
It follows that if an equation f(x) = 0 has integral coefficients and the lead
coefficient is 1 (i.e. $a_n = 1$), then any rational root must be an integer that divides the constant term.
Example :
$3x^3 + 5x^2 - 3x - 2 = 0$
Whether these possible roots are actually roots of the equation can
be found by putting each value in the equation in place of x and
checking whether f(r) = 0. If the equation is satisfied, the value is a
root of the equation.
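The procedure just described (list the candidate roots b/c, then substitute each one) can be sketched in Python as follows. The function names are illustrative assumptions, and the signs of the example polynomial, 3x^3 + 5x^2 - 3x - 2, are assumed because they are not legible in the source.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_root_candidates(coeffs):
    """coeffs: integer coefficients, highest degree first, nonzero constant term."""
    lead, const = coeffs[0], coeffs[-1]
    cands = set()
    for b in divisors(const):          # b divides the constant term
        for c in divisors(lead):       # c divides the leading coefficient
            cands.add(Fraction(b, c))
            cands.add(Fraction(-b, c))
    return sorted(cands)

def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    value = 0
    for a in coeffs:
        value = value * x + a
    return value

coeffs = [3, 5, -3, -2]
candidates = rational_root_candidates(coeffs)
roots = [r for r in candidates if evaluate(coeffs, r) == 0]
print("candidates:", [str(r) for r in candidates])
print("roots:", [str(r) for r in roots])
```

Under the assumed signs, only one of the eight candidates satisfies the equation; the remaining roots of the cubic are not rational.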
Example :
$x^3 + 2x^2 - 23x - 60 = 0$
1.) Write the terms of the dividend in descending powers of the variable and fill
in any missing terms using zero for the coefficient (in our example there is no
missing term).
$(x^3 + 2x^2 - 23x - 60) \div (x - 5)$
Write the constant term a of the divisor (x - a) on the left of the division symbol and write the
coefficients of the dividend to the right of the symbol.
5 | 1    2   -23   -60
Bring down the first coefficient of the dividend to the third row.
5 | 1    2   -23   -60
  |
    1
Multiply the term in the quotient row (third row) by the divisor value and write the
product in the second row below the second term in the first row. Add
the numbers in the column formed and write the sum as the second term in
the quotient row.
5 | 1    2   -23   -60
  |      5
    1    7
Multiply the last term in the quotient row by the divisor value, write the product under the next term in
the top row, add the column and write the sum in the quotient row. Continue
this process until all of the terms in the top row have a number under them.
5 | 1    2   -23   -60
  |      5    35    60
    1    7    12     0
The third row is the quotient row, with the last term being the remainder.
The degree of the quotient polynomial is one less than the degree of the
dividend because we have divided by a linear factor. The terms in the
quotient row are the coefficients of the quotient polynomial. The degree of
the quotient polynomial is 2.
$(x^3 + 2x^2 - 23x - 60) \div (x - 5) = x^2 + 7x + 12 + \dfrac{0}{x - 5}$
or $x^2 + 7x + 12$
The zero remainder proves that 5 is a root of the equation.
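The synthetic division steps above translate directly into a short loop. The sketch below is one possible way to write it in Python; the function name `synthetic_division` is an assumption made for the illustration.

```python
def synthetic_division(coeffs, a):
    """Divide a polynomial (coefficients, highest degree first) by (x - a).

    Returns (quotient_coefficients, remainder).
    """
    row = [coeffs[0]]                      # bring down the first coefficient
    for coef in coeffs[1:]:
        row.append(coef + row[-1] * a)     # multiply by a, add to the next column
    return row[:-1], row[-1]

# The worked example: (x^3 + 2x^2 - 23x - 60) / (x - 5)
quotient, remainder = synthetic_division([1, 2, -23, -60], 5)
print(quotient, remainder)
```

The list `row` plays the role of the quotient row in the table, and its last entry is the remainder.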
$f(x) = \dfrac{g(x)}{h(x)}, \quad h(x) \neq 0$
Example :
$f(x) = \dfrac{2x + 3}{x^2 - 4}$
$x^2 - 4 = 0 \Rightarrow x = 2$ and $x = -2$
i) If the degree of g(x) is less than the degree of h(x) then the rational
function has a horizontal asymptote of y = 0
ii) If the degree of g(x) is equal to the degree of h(x), then f(x) has a
horizontal asymptote of $y = \dfrac{a_n}{b_n}$, where $a_n$ is the coefficient of the
highest degree term of g(x) (the numerator) and $b_n$ is the coefficient of
the highest degree term of h(x) (the denominator).
iii) If the degree of g(x) is greater than the degree of h(x), then f(x) does
not have a horizontal asymptote.
The graph of f(x) may cross the horizontal asymptote in the interior of
its domain. This is because, in determining the asymptote, we are concerned only with how f(x)
behaves as $x \to \infty$ or $x \to -\infty$.
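The three degree rules above can be collected into a small helper. This is a sketch only: the function name `horizontal_asymptote` and the convention of passing integer coefficient lists (highest degree first) are assumptions made for the example.

```python
from fractions import Fraction

def horizontal_asymptote(num, den):
    """Horizontal asymptote of g(x)/h(x).

    num, den: integer coefficient lists, highest degree first, with nonzero
    leading terms. Returns L for the asymptote y = L, or None if there is none.
    """
    deg_g, deg_h = len(num) - 1, len(den) - 1
    if deg_g < deg_h:
        return Fraction(0)                   # rule (i): y = 0
    if deg_g == deg_h:
        return Fraction(num[0], den[0])      # rule (ii): ratio of leading coefficients
    return None                              # rule (iii): no horizontal asymptote

print(horizontal_asymptote([5], [1, -2]))        # degree 0 over degree 1
print(horizontal_asymptote([4, 5], [3, -2]))     # equal degrees
print(horizontal_asymptote([1, 0, 0], [1, 1]))   # numerator of higher degree
```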
i) For $f(x) = \dfrac{g(x)}{h(x)}$ we first determine the holes: values of x for which
both g(x) and h(x) are zero. After any holes are located, we reduce
f(x) to lowest terms.
ii) Once f(x) is in lowest terms we find the asymptote, symmetry, zeros
and y intercept if they exist.
iv) Plot the zeros and y intercept and plot other points to determine how
the graph approaches the asymptotes.
Example :
$y = \dfrac{x^3 - 2x^2 - 3x}{x}$
At x = 0, $y = \dfrac{0}{0}$.
$\dfrac{x^3 - 2x^2 - 3x}{x} = x^2 - 2x - 3$ when $x \neq 0$.
There is no y intercept but there are zeros at (3, 0) and (-1, 0). We plot
the zeros and place an open circle around the point (0, -3) to indicate the
hole in the graph. Now select the corresponding points and plot them.
Figure 6
Example :
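$y = \dfrac{5}{x - 2}$
i) Setting the denominator to zero, $x - 2 = 0$, gives the vertical asymptote x = 2.
ii) Since the degree of g(x) is less than the degree of h(x), the horizontal
asymptote is
y = 0.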
iii) When x = 0, $y = -\dfrac{5}{2}$.
There are no holes, nor are there any zeros (the graph does not cross the x-axis).
Figure 7
3. Sequence:
A sequence is a function whose domain is the set (or a subset) of
natural numbers N.
For example
$f(n) = \dfrac{1}{n}$
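$1, \tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{4}, \dots, \tfrac{1}{k}, \dots$
The numbers in the list are called the terms of the sequence.
$a_1 = 1, \quad a_2 = \tfrac{1}{2}, \quad a_3 = \tfrac{1}{3}, \quad \dots, \quad a_n = \tfrac{1}{n}$
We write the sequence by placing braces around the formula for the nth term.
$f(n) = \tfrac{1}{n}, \quad n \in N$
$\{a_n\} = \left\{\tfrac{1}{n}\right\}$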
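Example :
(1) $1, \tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{8}, \dots$ with $a_n = \dfrac{1}{2^{n-1}}$
(2) $1, -1, 1, -1, \dots$ with $a_n = (-1)^{n+1}$
(3) $1, \tfrac{1}{3}, \tfrac{1}{5}, \tfrac{1}{7}, \dots$ with $a_n = \dfrac{1}{2n - 1}$
(4) The sequence $\tfrac{2}{4}, \tfrac{4}{5}, \tfrac{6}{6}, \tfrac{8}{7}, \dots$ has $a_n = \dfrac{2n}{n + 3}$
Given the nth term of a sequence, one can find out the different terms of the
sequence.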
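To illustrate how the terms follow from the nth-term formula, the sketch below generates the first four terms of each example sequence. The helper name `first_terms` and the variable names are assumptions made for this illustration; exact fractions are used so the terms print as in the text (in lowest terms).

```python
from fractions import Fraction

def first_terms(formula, count):
    """Return [a_1, ..., a_count] for a sequence given by its nth-term formula."""
    return [formula(n) for n in range(1, count + 1)]

halving = first_terms(lambda n: Fraction(1, 2 ** (n - 1)), 4)
alternating = first_terms(lambda n: (-1) ** (n + 1), 4)
odd_reciprocals = first_terms(lambda n: Fraction(1, 2 * n - 1), 4)
ratio_terms = first_terms(lambda n: Fraction(2 * n, n + 3), 4)

for name, terms in [("1/2^(n-1)", halving), ("(-1)^(n+1)", alternating),
                    ("1/(2n-1)", odd_reciprocals), ("2n/(n+3)", ratio_terms)]:
    print(name, [str(t) for t in terms])
```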
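A sequence is decreasing if
$a_1 > a_2$
$a_2 > a_3$
$a_n > a_{n+1}$
and increasing if
$a_1 < a_2$
$a_2 < a_3$
$a_n < a_{n+1}$
A sequence is bounded above if $a_n \le M$ for all $n \in N$,
and bounded below if $a_n \ge m$ for all $n \in N$.
The values $a_1$, $a_2$, and so on are the terms of the sequence. The terminal
value is $a_n$, so it is a finite sequence.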
$a_k = \dfrac{1}{k}$ where $k \in N$
$1, \tfrac{1}{2}, \tfrac{1}{3}, \dots, \tfrac{1}{n}, \dots$
$a_n \to L$
$\lim_{n \to \infty} a_n = L$
In simple words, it means the nth term gets closer and closer to L as n tends to
infinity (note that L is a finite number).
If Lim an
x
1
(i)
2n1
1 1 1
(ii) 1, , ,
2 3 4
1
1 1
n 1
(iii) an
2
n
(iv) an
n 1
1 1 1 1
(v) 1, , , , ,.....
4 9 16 25
(i) an
5n 2
4n 1
n
(ii) an 3
1 3
n
(iii) an n !
2n
(i) n2
(ii) 2n
(iii) 2n
(iv) 1
n
Oscillates or not
(i) 1
n
(ii) 1
n
1 1
(iii) 1, 2, , 3, ......
2 3
4. Series:
A series is a special type of sequence. If $a_i$, i = 1, 2, 3, 4, ... is a
sequence, then
$S_n = a_1 + a_2 + a_3 + \dots + a_n$
The series $S_n$ is a finite series since it is the sum of a finite sequence. We can use
the symbol $\Sigma$ (sigma) for the summation:
$S_n = \sum_{i=1}^{n} a_i$
Any result derived for sequences therefore also
applies to series.
If $S_n = \sum_{i=1}^{n} a_i$ is the series associated with a sequence $a_i$ and
$\lim_{n \to \infty} \dfrac{a_{n+1}}{a_n} = L$
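Where n denotes the number of terms, a + (n-1)d is the nth term or last term of
the arithmetic series.
$S_n = \dfrac{n}{2}\left[2a + (n - 1)d\right]$
For a geometric series with first term a and common ratio $r \neq 1$,
$S_n = \dfrac{a(1 - r^n)}{1 - r}$
The sum of an infinite geometric series where $|r| < 1$ is $S = \dfrac{a}{1 - r}$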
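The three sum formulas above can be tried out numerically. The following Python sketch is an illustration only; the function names and the sample values are assumptions chosen for the example.

```python
def arithmetic_sum(a, d, n):
    """Sum of the first n terms of an arithmetic series: n/2 * [2a + (n-1)d]."""
    return n * (2 * a + (n - 1) * d) / 2

def geometric_sum(a, r, n):
    """Sum of the first n terms of a geometric series: a(1 - r^n)/(1 - r)."""
    if r == 1:
        return a * n                 # the formula does not apply when r = 1
    return a * (1 - r ** n) / (1 - r)

def infinite_geometric_sum(a, r):
    """Sum of an infinite geometric series, defined only for |r| < 1."""
    if abs(r) >= 1:
        raise ValueError("the infinite sum exists only for |r| < 1")
    return a / (1 - r)

print(arithmetic_sum(1, 1, 100))          # 1 + 2 + ... + 100
print(geometric_sum(1, 2, 10))            # 1 + 2 + 4 + ... + 512
print(infinite_geometric_sum(1, 0.5))     # 1 + 1/2 + 1/4 + ...
```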
Example :
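Test the convergence of $\sum_{n=1}^{\infty} \dfrac{n^n}{n!}$
Using the ratio test, we examine $\lim_{n \to \infty} \dfrac{a_{n+1}}{a_n}$
$a_1 = 1, \quad a_2 = \dfrac{2^2}{2!}, \quad a_3 = \dfrac{3^3}{3!}$
$a_n = \dfrac{n^n}{n!}, \quad a_{n+1} = \dfrac{(n+1)^{n+1}}{(n+1)!}$
$\dfrac{a_{n+1}}{a_n} = \dfrac{(n+1)^{n+1}}{(n+1)!} \cdot \dfrac{n!}{n^n} = \dfrac{(n+1)^{n+1}}{(n+1)\, n^n} = \dfrac{(n+1)^n}{n^n} = \left(1 + \dfrac{1}{n}\right)^n$
$\lim_{n \to \infty} \left(1 + \dfrac{1}{n}\right)^n = e > 1$, where e = 2.71828..., so the series is divergent.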
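A quick numerical check of the ratio computed above: the sketch below evaluates (1 + 1/n)^n for growing n and compares it with e. The function names `term` and `ratio` are assumptions made for this illustration.

```python
import math
from fractions import Fraction

def term(n):
    """a_n = n^n / n!, as an exact fraction."""
    return Fraction(n ** n, math.factorial(n))

def ratio(n):
    """a_{n+1} / a_n, using the simplified form (1 + 1/n)^n."""
    return (1 + 1 / n) ** n

for n in (1, 10, 100, 1000):
    print(n, ratio(n))
print("e =", math.e)
```

The printed ratios increase towards e, which is greater than 1, in line with the conclusion that the series diverges.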
5. Exercises:
(1) How many terms of the arithmetic sequence 24, 22, 20, ..... are needed to
give a sum of 150?
(2) How long will it take to pay off a debt of Rs. 880 if Rs. 25 is paid in the
first month, Rs. 27 in the second and Rs. 29 in the third month?
(3) The second term of a geometric sequence is 3 and the fifth term is
81/8. Find the eighth term.
(4) The first term of a geometric series is 375 and the fourth term is 192.
Find the common ratio and the sum of the first four terms.
(5) A man agrees to work at the rate of Rs. 1 for the first day, Rs. 2 for
the second day, Rs. 4 for the third day, Rs. 8 for the fourth day, etc.
How much would he receive at the end of 15 days?
(6) The population of a certain town will increase 3% each year for four
years. What is the percentage increase in population after four years?
6. References
K. Sydsaeter and P. Hammond, Mathematics for Economic Analysis,
Pearson Educational Asia, Delhi, 2002.
Limit and Continuity
Discipline Courses-I
Semester-I
Paper II: Mathematical Methods for Economics: Preliminaries-I
Unit-II
Lesson: Limit and Continuity
Lesson Developer: S. K. Taneja
College/Department: Ramlal Anand College (Eve.) , University of
Delhi
Content:
1. Learning Outcome
2. Limit
4. Asymptote
5. Continuity
7. Reference
1. Learning Outcome:
After reading this chapter you will know the concept
of a limit, limits of rational functions, and asymptotes. In addition to
limits, the concept of continuity and the intermediate value theorem
are explained in detail.
2. Limit:
$f(x) = \dfrac{x^3 - 1}{x - 1}$
The function is not defined for x = 1, since the result is 0/0, which makes
no sense. However, we try to see what happens to f(x) when x is slightly
below or above 1. Take a calculator and find the values of f(x) when
x takes values which are slightly more than 1 and slightly less than 1.
Some of the values are given below in table 1.
As x approaches 1, f(x) takes values which are closer and closer to 3. So, we
can say that f(x) tends to 3 as x tends to 1. This is written as
$\lim_{x \to 1} \dfrac{x^3 - 1}{x - 1} = 3$
Given the above example, the idea of a limit should be clear intuitively.
What we are looking at is what happens to the value of the function when
the independent variable x approaches a particular value.
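The calculator experiment described above can be reproduced with a few lines of Python. The particular x values chosen here are assumptions for the illustration and need not match the lesson's own table.

```python
def f(x):
    """f(x) = (x^3 - 1) / (x - 1), undefined at x = 1."""
    return (x ** 3 - 1) / (x - 1)

print("x       f(x)")
for x in (0.9, 0.99, 0.999, 1.001, 1.01, 1.1):
    print(f"{x:<7} {f(x):.6f}")
```

The values on both sides of x = 1 settle towards 3, even though f(1) itself cannot be computed.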
Suppose y = f(x)
$\lim_{x \to v} f(x) = L$
Now x can approach v either from the right hand side (i.e. x takes
values which are greater than v) or from the left hand side (i.e. x tends to v
taking values which are less than v). When x approaches v from the left hand
side we say L is the left hand limit of f(x):
$\lim_{x \to v^-} f(x) = L$
Similarly, for the right hand limit,
$\lim_{x \to v^+} f(x) = L$
Look at the function $f(x) = \dfrac{|x|}{x}$
x f(x)
1 1
5 1
2.5 1
25 1
$\lim_{x \to 0} f(x) = \lim_{x \to 0} \dfrac{|x|}{x}$ does not exist
$f(x) = \dfrac{1}{x}$
In this case, when x tends to 0 from the right hand side the value of
the function increases, and when x is very close to zero the value of the
function approaches $+\infty$. On the other hand, when x approaches 0 from the left
hand side the value of the function gets closer and closer to $-\infty$.
$\lim_{x \to a} f(x) = L$
$|f(x) - L| < \epsilon$
whenever $0 < |x - a| < \delta$
Let $f(x) = x^2$
Note that
$\lim_{x \to 2} x^2 = 4$
The two neighborhoods define a rectangle (see the diagram), with two of its
corners lying on the curve. It can be seen that for every value of x lying in
the neighborhood of 2, the corresponding value of f(x) lies in the
neighborhood of 4. Thus 4 fulfills the definition of the limit.
Example:
$f(x) = x^2, \quad x \neq 2$
$\lim_{x \to 2} f(x) = 4$
We can make x closer and closer to 2 from both sides (left hand side
and right hand side).
As we put these values in the function, the value of the function gets closer
and closer to 4.
This happens even though the function may not be defined when x = 2.
In order to prove that the limit of the function is 4, we use the formal
definition of limit.
We must show that given any $\epsilon > 0$ we can find $\delta > 0$ such that
$|x^2 - 4| < \epsilon$ when $0 < |x - 2| < \delta$
Choose $\delta \le 1$ so that $0 < |x - 2| < 1$
$-1 < x - 2 < 1$
$1 < x < 3, \quad x \neq 2$
So $|x + 2| < 5$
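The text stops at the bound |x + 2| < 5. One standard way to finish the argument (an assumption here, since the remaining steps are not shown in the source) is to note that |x^2 - 4| = |x - 2| |x + 2| < 5δ, so that δ = min(1, ε/5) works. The sketch below checks this choice numerically on sample points; it is a sanity check under that assumption, not a proof.

```python
def delta_for(eps):
    """Assumed choice of delta for the limit of x^2 at x = 2."""
    return min(1.0, eps / 5)

def check(eps, samples=1000):
    """Test |x^2 - 4| < eps at sample points with 0 < |x - 2| < delta."""
    d = delta_for(eps)
    for k in range(1, samples):
        offset = d * k / samples          # 0 < offset < delta
        for x in (2 - offset, 2 + offset):
            if not abs(x * x - 4) < eps:
                return False
    return True

for eps in (10, 1, 0.5, 0.01):
    print(eps, delta_for(eps), check(eps))
```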
Theorems of limit.
(iv) $\lim_{x \to a} \dfrac{f(x)}{g(x)} = \dfrac{L_1}{L_2}$, provided $L_2$ is not 0 ($L_2 \neq 0$)
Them lim f x a
x B
Theorem:
$f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n$
$\lim_{x \to a} f(x) = a_0 + a_1 a + a_2 a^2 + \dots + a_n a^n = f(a)$
This means that the limit of a polynomial f(x) at x = a is the same as the
value of the polynomial at x = a. In the case of a polynomial, to find the
limit at x = a we only need to evaluate the polynomial at x = a.
Example
x4 4
f x
x3
x4 4 lim x 4 4
lim f x lim x 2
x 2 x 2 x3 lim x 3
x 2
20
20
1
We can say if
h x
f x
g x
There is a useful principle for polynomial which in simple words states that.
‘The end behavior of a polynomial matches the end behavior of its highest
degree term'.
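$\lim_{x \to \pm\infty} \left(a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n\right) = \lim_{x \to \pm\infty} a_n x^n$
$f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n$
$f(x) = x^n \left(\dfrac{a_0}{x^n} + \dfrac{a_1}{x^{n-1}} + \dfrac{a_2}{x^{n-2}} + \dots + a_n\right)$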
4. Asymptote
(a) (b)
1 1
Lim Lim
x a xa x a xa
In the case of the above two functions the vertical asymptote is the
line x =a
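Look at the function $y = \dfrac{3x + 1}{x}$
$f(x) = 3 + \dfrac{1}{x}$
$\lim_{x \to \infty} f(x) = 3, \quad \lim_{x \to -\infty} f(x) = 3$
We can define:
$\lim_{x \to \infty} f(x) = L$ or $\lim_{x \to -\infty} f(x) = L$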
Very often, when the limit of the function does not exist, we may be
interested in finding how the function f(x) behaves when x tends to $\pm\infty$, when x
tends to 0 (zero), or when x tends to a value N.
Example:
4 x 5
3x 2
Vertical Asymptote
2
3x 2 0 x
3
2 2
This only vertical asymptote is x . As x approaches from the left or
3 3
2
right f(x) approach the vertical line x ,
3
4x 5
But, the Horizontal asymptote lim
x 3x 2
45
lim x 4
x
3 2 3
x
Example:
2 x 3
x2 2x 3
Vertical Asymptote
x2 2 x 3 0
x 3 x 1 0
Horizontal Asymptote
2x 3 2 3
lim lim x
x
x 2x 3
2 x
1 2 3
x x
2
= 2
1
2x 3
lim
x
x2 2 x 3
23x
lim
x
x 2
2x 3 x2
23x 20
lim 2
x 1 2 x 3 x2 1 0 0
5. Continuity
In the first figure, when x takes a value slightly greater than a, the
value of f(x) jumps up from y1 to y2. The function is not continuous at
x = a. If the function is continuous at a point x, there will be small changes in
the value of f(x) for small changes in the value of x.
Definition :
(iii) $\lim_{x \to a} f(x) = f(a)$
The function drawn in Fig (1a) is discontinuous at x=a. The limit does not
exist.
of the function at x= a exists but f(x) is not equal to the value of the
function at x = a.
Actually the third condition implies the first two conditions, since it
means $\lim_{x \to a} f(x) = f(a)$. This actually means the limit exists and the function is
defined at x = a.
Example:
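$\dfrac{x^2 - 4}{x - 2}$ at x = 2
$f(x) = \begin{cases} \dfrac{x^2 - 4}{x - 2}, & x \neq 2 \\ 3, & x = 2 \end{cases}$
Since $\lim_{x \to 2} f(x) = 4 \neq f(2) = 3$, the function is not continuous at x = 2.
With f(2) redefined as 4, the function is continuous at x = 2:
$f(x) = \begin{cases} \dfrac{x^2 - 4}{x - 2}, & x \neq 2 \\ 4, & x = 2 \end{cases}$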
Notes :
$\lim_{x \to a} f(x) = f(a)$
(i) f + g is continuous at c
(ii) f - g is continuous at c
(iii) fg is continuous at c
x2 9
Example :
x2 5x 6
lim f g x f h .
xh
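That is
$\lim_{x \to c} f(g(x)) = f\left(\lim_{x \to c} g(x)\right)$
$\lim_{x \to 3} |5 - x^2| = |5 - 9| = 4$
$f(3) = 4$
$\lim_{x \to 3} f(x) = f(3)$
$\lim_{x \to 3} |5 - x^2| = \left|\lim_{x \to 3} (5 - x^2)\right| = |-4| = 4$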
Example: $f(x) = \sqrt{4 - x^2}$
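The natural domain is the closed interval [-2, 2]. We have to examine
the continuity of f on the open interval (-2, 2) and at the two end points -2 and 2.
Take an arbitrary point c in (-2, 2):
$\lim_{x \to c} f(x) = \lim_{x \to c} \sqrt{4 - x^2} = \sqrt{4 - c^2} = f(c)$
$\lim_{x \to -2^+} f(x) = \lim_{x \to -2^+} \sqrt{4 - x^2} = \sqrt{4 - 4} = 0 = f(-2)$
$\lim_{x \to 2^-} f(x) = \lim_{x \to 2^-} \sqrt{4 - x^2} = 0 = f(2)$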
F(x) = k.
If f is continuous on [a, b] and if f(a) and f(b) are non-zero and have
opposite signs, then there is at least one solution of the equation
f(x) = 0
in the interval (a, b).
Example:
$f(x) = x^3 - x - 1$
We can make the approximation better by reducing the size of the interval [1, 2].
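Repeatedly halving the interval, keeping the half on which f changes sign, is one way to reduce its size. The sketch below applies this bisection idea to f(x) = x^3 - x - 1 on [1, 2]. The signs in f are assumed (they are not legible in the source), and the function name `bisect` and the tolerance are illustrative choices.

```python
def f(x):
    return x ** 3 - x - 1

def bisect(func, lo, hi, tol=1e-6):
    """Locate a root of a continuous func on [lo, hi] by repeated halving."""
    if func(lo) * func(hi) >= 0:
        raise ValueError("func(lo) and func(hi) must have opposite signs")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(mid) == 0:
            return mid
        if func(lo) * func(mid) < 0:   # sign change in the left half
            hi = mid
        else:                          # sign change in the right half
            lo = mid
    return (lo + hi) / 2

root = bisect(f, 1, 2)
print(root, f(root))
```

Here f(1) = -1 and f(2) = 5 have opposite signs, so the theorem guarantees a root in (1, 2).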
Example :
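$\lim_{x \to c} f(x)$ exists.
(i) $f(x) = \dfrac{x^2 - 1}{x - 1}$
(ii) $g(x) = \begin{cases} 1, & x > 1 \\ 0, & x = 1 \\ 1, & x < 1 \end{cases}$
Solution:
$f(x) = \begin{cases} \dfrac{x^2 - 1}{x - 1}, & \text{for } x \neq 1 \\ 2, & \text{if } x = 1 \end{cases}$
So if g(x) = 1 when x = 1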
Note: In case the limit does not exist at C then the discontinuity is
irremovable.
7. References:
Discipline Courses-I
Semester-I
Paper II: Mathematical Methods for Economics: Preliminaries-I
Unit-III
Lesson: Single Variable Optimization
Lesson Developer: Himanshu Singh
College/Department: Satyawati College, University of Delhi
Content:
1. Learning Outcome
2. Introduction of Geometric Properties of Functions
2.1 Linearity
2.2 Differentiability
3. Optimization
8. References
1. Learning Outcome
After reading this lesson you should be able to learn about the geometric properties of
functions (linearity and differentiability); optimization of function (maxima and
minima); geometric interpretation of Rolle’s theorem and mean value theorem;
existence, global and uniqueness of solution.
If an economic variable lives in one set, and changes in that variable help to explain
changes in another economic variable, the two variables are related. There is a
correspondence between the two sets of variables. If the first set is denoted by S1 and
the second set by S2, the correspondence (which is defined by f) is written f : S1 → S2
to suggest that f associates elements in S1 with elements in S2. Here f "sends" or
"transforms" or "maps" x in S1 into y and z in S2 (as shown in the following
diagram).
If each element in S1 gets sent to exactly one element of S2 (see fig. 3.1.b), then f is
called a function. Notice that more than one element in S1 may go to a single
element of S2.
The main problem is to find ways of combining functions and describing their
properties. Thus a diagram like that in fig. 3.2.a does not represent a continuous
function. Formally, continuity requires that for any distance ε > 0 around f(x), there is
a distance δ > 0 around x, so that points within distance δ of x get sent to points
within distance ε of f(x), as in fig. 3.2.b.
Simply stated, a function transforms a set of real numbers into another set of real
numbers. But a set of real numbers may have a certain coherence or structure. This
set, for example, may be open, or compact, or convex, etc. Because that set is
transformed by a rule or function representing an explanation or theory, the structure
of set S1 ought to have an analogue in the structure of the set S2.
Thus the important question to ask is whether or not a function preserves the relevant
structure of the set being transformed. Functions are appropriately classified by the
kinds of structures they preserve under the transformations they represent. For
example, continuous functions transform "nearby" points into "nearby" points. A
function is continuous at a point in the domain, called x, if points close to x get sent to
points close to the image or transformation of x, called f(x).
2.1 Linearity
f(λa) = λ f(a). The first condition connotes that a linear function preserves the additive
structure of the real numbers, while the second condition implies that a stretching of an
arrow from 0 to a by a factor λ is preserved under the action of a linear function. All
linear functions from R to R are continuous. In fact, linear functions were created to
preserve arithmetic properties, while continuous functions were created to preserve
topological (or geometric closeness) properties of the real numbers.
2.2 Differentiability: The differential calculus, when all is said and done, is the
study of linear approximation to nonlinear functions. If any nonlinear function has an
associated linear function that approximates it closely, then analysis of non-linear
functions is rather easy. However, it is not possible to associate a linear function with
an arbitrary function from R to R. It is sometimes possible, though, to carve up the
domain of the function to perform a local approximation. Gaining the simplicity of
linearity requires forsaking the global picture. If it is feasible to do a local
approximation in all the chunks of the carved up domain, it may be possible to patch
together the local approximations into some coherent picture. Differential calculus is
after all a local analysis.
Now, we shall examine special class of functions that have significant economic
applications.
Now, we will examine concavity and convexity of a function more formally. Suppose
f : S ⊂ R → R, so that f is defined on an open set S, and suppose that S is a convex set; f
is a concave function if, given any x and x̂ in S, and for all λ with 0 ≤ λ ≤ 1,
(i) $f(\lambda x + (1 - \lambda)\hat{x}) \ge \lambda f(x) + (1 - \lambda) f(\hat{x})$
Because $\lambda x + (1 - \lambda)\hat{x} = z \in S$, inequality (i) means that the value of f at z, that is, the value
of f at some point between x and x̂, is greater than or equal to the value
represented by the corresponding point on the line connecting f(x) and f(x̂). The function is
strictly concave if the inequality is strict for x ≠ x̂ and 0 < λ < 1:
$f(\lambda x + (1 - \lambda)\hat{x}) > \lambda f(x) + (1 - \lambda) f(\hat{x})$
Convex function:
A function f : S → R, defined on an open convex set S ⊂ R, is convex if, for any x
and x̂ in S and for all λ with 0 ≤ λ ≤ 1,
[2] $f(\lambda x + (1 - \lambda)\hat{x}) \le \lambda f(x) + (1 - \lambda) f(\hat{x})$.
A convex function has the property that a line drawn between two points on the graph
lies on or above the graph between those two points. Thus concave functions look like
parabolas opening downward, and convex functions look like parabolas opening
upward. Linear functions are certainly both concave and convex, but neither strictly
concave nor strictly convex.
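The defining inequalities can be tested numerically on a grid of λ values. The sketch below does this for a pair of points; it can only refute concavity or convexity on the sampled points, not prove it. The function names and the small tolerance `tol` (to absorb floating point rounding) are assumptions made for this illustration.

```python
def satisfies_concavity(f, x, x_hat, steps=50, tol=1e-12):
    """Check f(lam*x + (1-lam)*x_hat) >= lam*f(x) + (1-lam)*f(x_hat) on a grid of lam."""
    for k in range(steps + 1):
        lam = k / steps
        z = lam * x + (1 - lam) * x_hat
        if f(z) < lam * f(x) + (1 - lam) * f(x_hat) - tol:
            return False
    return True

def satisfies_convexity(f, x, x_hat, steps=50, tol=1e-12):
    """f is convex exactly when -f is concave, so reuse the concavity check."""
    return satisfies_concavity(lambda t: -f(t), x, x_hat, steps, tol)

print(satisfies_concavity(lambda t: -t * t, -1, 3))    # parabola opening downward
print(satisfies_convexity(lambda t: t * t, -1, 3))     # parabola opening upward
print(satisfies_concavity(lambda t: 2 * t + 1, -1, 3),
      satisfies_convexity(lambda t: 2 * t + 1, -1, 3)) # linear: both hold
```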
Test for concavity and convexity of a function: the f'' test. Intuitively, if the graph of f
lies above all of its tangents on an interval I, then it is called concave upward or
a convex function on I. If the graph of f lies below all of its tangents on I, it is called
concave downward or simply a concave function.
Let us now see how the second derivative of the given function helps determine the
intervals of concavity. For a concave function on some interval I, the slope of the tangent
line falls continuously over I, and for a convex function the slope of the tangent line
rises.
This means that for a concave function the derivative f' is decreasing, and therefore f''
is negative. For a convex function, the derivative f' is increasing throughout the
interval I, and hence f'' is positive. This reasoning can be reversed and suggests that
the following theorem is true [a proof can be provided with the help of the Mean Value
Theorem].
Test :
Definition:
In view of the Concavity Test, there is a point of inflection at any point where the
second derivative changes sign.
(ii) f'(x) > 0 for 0 < x < 4, f'(x) < 0 for x < 0 and for x > 4
(iii) f''(x) > 0 for x < 2, f''(x) < 0 for x > 2.
Condition (i) tells us that the graph has horizontal tangents at the points (0, 0) and (4,
6). Condition (ii) says that f is increasing (as f' > 0) on the interval (0, 4) and
decreasing on the intervals (-∞, 0) and (4, ∞). Condition (iii) says that the graph is
convex (concave upward) on the interval (-∞, 2) and concave on (2, ∞). Because the
curve changes from convex to concave when x = 2, the point (2, 3) is an inflection
point. We use this information to sketch the graph in fig. 3.5. Notice that we made the
curve bend upward when x < 2 and bend downward when x > 2.
Another related application of the second derivative is the following test for maximum
and minimum values. It is a consequence of the concavity test. The Second
Derivative Test: Suppose f'' is continuous near c.
(a) If f'(c) = 0 and f''(c) > 0, then f has a local minimum at c.
(b) If f'(c) = 0 and f''(c) < 0, then f has a local maximum at c.
For instance, part (a) is true because f''(x) > 0 near c and so f is convex near c.
This means that the graph of f lies above its horizontal tangent at c, and so f has a local
minimum at c. This may be seen from the diagram below:
Example 2:
Discuss the curve y = x^4 - 4x^3 with respect to concavity, points of
inflection and local maxima and minima. Use this information to sketch the curve.
Solution: If f(x) = x^4 - 4x^3, then
f'(x) = 4x^3 - 12x^2 = 4x^2(x - 3)
f''(x) = 12x^2 - 24x = 12x(x - 2)
To find the critical numbers we set f'(x) = 0 and obtain x = 0 and x = 3. To use the
second derivative test we evaluate f'' at these critical points:
f''(0) = 0, f''(3) = 36 > 0
Since f'(3) = 0 and f''(3) > 0, f(3) = -27 is a local minimum. Since f''(0) = 0, the
second derivative test gives no information about the critical number 0. But since f'(x)
< 0 for x < 0 and also for 0 < x < 3, the first derivative test tells us that f does not have a
local maximum or minimum at 0.
Since f''(x) = 0 when x = 0 or x = 2, we divide the real line into intervals with these
numbers as end points and complete the following schedule.
Interval    f''(x) = 12x(x - 2)    Concavity
(-∞, 0)     +                      convex
(0, 2)      -                      concave
(2, ∞)      +                      convex
The point (0, 0) is an inflection point since the curve changes from convex to concave
there. Also, (2, -16) is an inflection point since the curve changes from concave to
convex there.
Using the local minimum, the intervals of concavity and convexity and the points of
inflection, we plot the curve in the diagram given below:—
Note : The second derivative test is inconclusive when f''(c) = 0. At such a point
there might be a maximum, a minimum, or neither. The test also fails when f''(c) does not
exist. In all such cases the first derivative test must be used. In fact, even when both
tests apply, the first derivative test is often the easier one to use.
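The analysis of y = x^4 - 4x^3 can be double-checked with a short script. The derivatives below are written out by hand (f'(x) = 4x^3 - 12x^2 and f''(x) = 12x^2 - 24x); the function names and the test points are assumptions chosen for the illustration.

```python
def f(x):
    return x ** 4 - 4 * x ** 3

def d1(x):
    """First derivative: 4x^3 - 12x^2."""
    return 4 * x ** 3 - 12 * x ** 2

def d2(x):
    """Second derivative: 12x^2 - 24x."""
    return 12 * x ** 2 - 24 * x

def sign(v):
    return (v > 0) - (v < 0)

# Critical numbers 0 and 3, and what the second derivative test says there
for c in (0, 3):
    print("x =", c, " f' =", d1(c), " f'' =", d2(c), " f =", f(c))

# Sign of f' on either side of the critical numbers (first derivative test)
print([sign(d1(x)) for x in (-1, 1, 4)])

# Sign of f'' on (-inf, 0), (0, 2), (2, inf): + means convex, - means concave
print([sign(d2(x)) for x in (-1, 1, 3)])
```

The sign pattern of f' (negative, negative, positive) shows no extreme at 0 and a minimum at 3, and the sign pattern of f'' matches the schedule in the text.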
Example :
Discuss the curve with respect to maxima, concavity and points of inflection. Also
sketch the graph of the function given below:
$f(x) = x^{2/3}(6 - x)^{1/3}$
Solution :
$f'(x) = \dfrac{4 - x}{x^{1/3}(6 - x)^{2/3}}$
$f''(x) = \dfrac{-8}{x^{4/3}(6 - x)^{5/3}}$
Since f'(x) = 0 when x = 4 and f'(x) does not exist when x = 0 or x = 6, the critical
numbers are 0, 4, and 6.
Interval     4 - x    x^(1/3)    (6 - x)^(2/3)    f'(x)    f
x < 0        +        -          +                -        decreasing on (-∞, 0)
0 < x < 4    +        +          +                +        increasing on (0, 4)
4 < x < 6    -        +          +                -        decreasing on (4, 6)
x > 6        -        +          +                -        decreasing on (6, ∞)
We now apply the first derivative test to find local extreme values (maxima or minima).
Since f' changes sign from positive to negative at x = 4, f(4) = 2^(5/3) is a local maximum. The sign
of f' does not change at 6, so there is no maximum or minimum there.
Looking at the expression for f''(x) and noting that x^(4/3) ≥ 0 for all x, we have f''(x)
< 0 for x < 0 and for 0 < x < 6, and f''(x) > 0 for x > 6. So f is concave on (-∞, 0)
and (0, 6) and convex on (6, ∞), and the only point of inflection is (6, 0). The graph
is sketched below. Note that the curve has vertical tangents at (0, 0) and (6, 0) because
|f'(x)| → ∞ as x → 0 and as x → 6.
(1) Identify the domain of f and any symmetries the curve may have.
(2) Find f' and f''.
(3) Find the critical points of f, and identify the function's behaviour at each
one.
(4) Find intervals where the curve is increasing and where it is decreasing.
(5) Find the points of inflection, if any occur, and determine the concavity of
the curve.
(6) Identify any asymptotes.
(7) Plot the key points, such as the intercepts and the points found in steps
3,4,5 and sketch the curve.
Sketch the graph of $f(x) = \dfrac{(x + 1)^2}{1 + x^2}$
1. The domain of f is (-∞, ∞) and there are no symmetries about either axis or the
origin.
2. Find f' and f''. The x-intercept is at x = -1 and the y-intercept is y = 1 at x = 0.
$f'(x) = \dfrac{(1 + x^2)\cdot 2(x + 1) - (x + 1)^2 \cdot 2x}{(1 + x^2)^2} = \dfrac{2(1 - x^2)}{(1 + x^2)^2}$
Critical points: x = -1, x = 1
$f''(x) = \dfrac{(1 + x^2)^2 \cdot 2(-2x) - 2(1 - x^2)\cdot 2(1 + x^2)\cdot 2x}{(1 + x^2)^4} = \dfrac{4x(x^2 - 3)}{(1 + x^2)^3}$
3. Behavior at critical points: The critical points occur only at x = ±1, where f'(x)
= 0, since f' exists everywhere over the domain of f. At x = -1, f''(-1) = 1 > 0,
yielding a local minimum by the second derivative test.
At x = 1, f''(1) = -1 < 0, yielding a local maximum by the second derivative test.
4. Increasing and decreasing: We see that on the interval (-∞, -1) the derivative f'(x) < 0,
and the curve is decreasing. On the interval (-1, 1), f'(x) > 0 and the curve is increasing; it is
decreasing on (1, ∞) where f'(x) < 0 again.
5. Inflection points: Note that the denominator of the second derivative (step 2) is
always positive. The second derivative is zero when x = -√3, 0 and √3. It is negative on
(-∞, -√3), positive on (-√3, 0), negative on (0, √3) and positive again on (√3, ∞). Thus, each of these points is a point of
inflection. The curve is concave on the interval (-∞, -√3), convex on
(-√3, 0), concave on (0, √3), and convex again on (√3, ∞).
6. Asymptotes: Expanding the numerator of f(x) and then dividing both numerator
and denominator by x^2 yields:
$f(x) = \dfrac{(x + 1)^2}{1 + x^2} = \dfrac{x^2 + 2x + 1}{1 + x^2}$ (expanding the numerator)
$= \dfrac{1 + \dfrac{2}{x} + \dfrac{1}{x^2}}{\dfrac{1}{x^2} + 1}$ (dividing by x^2)
so f(x) → 1 as x → ±∞, and the line y = 1 is a horizontal asymptote.
Since f decreases on (-∞, -1) and then increases on (-1, 1), we know that f(-1) = 0
is a local minimum. Although f decreases on (1, ∞), it never crosses the horizontal
asymptote y = 1 on that interval (it approaches the asymptote from above). So the
graph never becomes negative, and f(-1) = 0 is an absolute minimum as well.
Likewise, f(1) = 2 is an absolute maximum because the graph never crosses the
asymptote y = 1 on the interval (-∞, -1), approaching it from below. There
are no vertical asymptotes, since the denominator 1 + x^2 is never zero (the range of f is 0 ≤ y ≤ 2).
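The key values in this sketch can be verified numerically. The script below uses the closed forms f'(x) = 2(1 - x^2)/(1 + x^2)^2 and f''(x) = 4x(x^2 - 3)/(1 + x^2)^3 obtained from the quotient rule; the function names and the sample grid are assumptions made for the illustration.

```python
def f(x):
    return (x + 1) ** 2 / (1 + x ** 2)

def d1(x):
    """First derivative: 2(1 - x^2) / (1 + x^2)^2."""
    return 2 * (1 - x ** 2) / (1 + x ** 2) ** 2

def d2(x):
    """Second derivative: 4x(x^2 - 3) / (1 + x^2)^3."""
    return 4 * x * (x ** 2 - 3) / (1 + x ** 2) ** 3

print("f(-1) =", f(-1), " f(1) =", f(1))          # extreme values
print("f'(-1) =", d1(-1), " f'(1) =", d1(1))      # critical points
print("f''(-1) =", d2(-1), " f''(1) =", d2(1))    # second derivative test
print("f(1e6) =", f(1e6), " f(-1e6) =", f(-1e6))  # behaviour far from the origin

# The sampled values stay inside the stated range 0 <= y <= 2
values = [f(k / 10) for k in range(-1000, 1001)]
print(min(values), max(values))
```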
3. OPTIMIZATION
The concept of rationality, inter alia, means that a decision maker (consumer, firm,
government, etc.) tries to find the best alternative out of those available to him. That
is, he tries to optimize.
A solution is a value of the choice variable which is in the feasible set (the set of alternatives
available) and which yields a maximum or minimum value of the objective function
over the feasible set.
Suppose y = f(x) is the objective function and the problem is to maximize f. The
feasible set is S. Then a solution to the problem is a value x* of the choice variable having the
property that:
$f(x^*) \ge f(x)$ for all $x \in S$.
It also frequently happens that we are concerned with the greatest or least value over
a certain neighborhood in the domain of the function rather than with the absolutely
greatest or least value over the entire domain. The next definition makes this idea
more precise.
A relative minimum is defined in the similar manner. The greatest value (if there is
one) or the global maximum on its entire domain is sometimes called absolute
maximum. The maxima and minima of a function are called the extremes of a
function. Note that the existence of a relative maximum (or minimum) at x0 implies
that the function is defined in some neighborhood of x0, N(x0). If x0 is an end point of
the domain of f, then the neighborhood is a left or right neighborhood, and the
extreme is sometimes called an end point (boundary point) extreme. The following
figure illustrates some of the ways in which extremes can occur.
The graph is that of a function with relative maximum at each of the points x = x1, x3
and x4, and with a relative (local) minimum for each of x = x2 and x5. The extreme at
The figure also suggests that if f(x0) is an extreme and f'(x0) exists, then f'(x0) = 0.
This seems to be so at x = x3 and x = x5. At x1, x2 and x4, the derivative apparently does
not exist (only one sided derivatives can exist at such points).
The converse of the theorem is not true. For example, the function defined by f(x) = x^3
has a zero derivative at x = 0 since f'(x) = 3x^2. Yet f has no extreme at x = 0, as f'(x)
> 0 for x ≠ 0, so that f is an increasing function over its entire domain, the set of all
real numbers. One can easily verify that the curve y = x^3 has a point of inflection with
a horizontal tangent at (0, 0).
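$f(x) = |1 - x^2|, \quad -2 \le x \le 2$
since $f(x) = \begin{cases} 1 - x^2 & \text{for } |x| \le 1 \\ -(1 - x^2) = x^2 - 1 & \text{for } 1 < |x| \le 2 \end{cases}$
$f'(x) = \begin{cases} -2x & \text{for } |x| < 1 \\ 2x & \text{for } 1 < |x| < 2 \end{cases}$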
Let the function be $f(x) = (x - 2)^{1/3}$. We first find the derivatives:
$f'(x) = \tfrac{1}{3}(x - 2)^{-2/3}$
and
$f''(x) = -\tfrac{2}{9}(x - 2)^{-5/3}$
In this case, there is no value of x such that f''(x) = 0; however, f''(x) fails to exist for x
= 2, and f'(2) does not exist either. The point (2, 0) is on the curve (because when x = 2, f(x)
= 0), and it is evident from the expression for f''(x) that the second derivative function
is continuous except at
x = 2. Furthermore, f''(x) > 0 for x < 2 and f''(x) < 0 for x > 2.
Accordingly, the curve is convex for x < 2 and concave for x > 2, so that the point (2, 0)
is a point of inflection. We also see that f'(x) → ∞ as x → 2, so that the
inflectional tangent is vertical (see figure below).
The main point to note here is that f'(2) does not exist, although (2, 0) does not
correspond to an extreme point.
Example: Find the extremes of the function defined by f(x) = 2x^3 + 3x^2 - 12x.
Here f'(x) = 6x^2 + 6x - 12 = 6(x + 2)(x - 1). We see that f'(-2) = 0, and f'(x) changes sign from plus to minus as x increases
through the value -2. The point (-2, 20) is a maximum point on the graph of f. Also,
f'(1) = 0 and f'(x) changes sign from minus to plus as x increases through the value 1.
The point (1, -7) is a minimum point on the graph.
Solution: $f'(x) = \tfrac{2}{3}x^{-1/3}$, $x \neq 0$.
The domain of f is all $x \in \mathbb{R}$, yet there is no value of x for which f'(x) = 0.
The only possible critical value of x is x = 0, for which f'(x) does not exist. Since
f'(x) < 0 for x < 0 and f'(x) > 0 for x > 0, f(0) = 0 is a relative (local) minimum of the
function.
Fig.
Clearly, if the curve of the function in panel (a) were turned so that the line AB was no
longer parallel to the x-axis, the geometric content of Rolle's Theorem would still be
true; the tangent line through point P (in panel b) would still be parallel to the secant
line AB. This apparently evident result is analytically formalised in the Mean Value
Theorem.
This equation is often written in the form obtained by solving for f(b):
$f(b) = f(a) + f'(c)(b - a)$, $c \in (a, b)$.
Example: Find the value(s) of c which the Mean Value Theorem (henceforth MVT) predicts if
$f(x) = x^3 - x$, a = 0, b = 3.
By the MVT,
$$f'(c) = \frac{f(b) - f(a)}{b - a}, \quad \text{so} \quad 3c^2 - 1 = \frac{24 - 0}{3 - 0} = 8, \quad \text{or} \quad c^2 = 3, \quad \text{or} \quad c = \sqrt{3} \ \text{or} \ -\sqrt{3}.$$
Since $c \in (0,3)$, i.e. 0 < c < 3, the positive root $\sqrt{3}$ is the desired value of c.
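The Mean Value Theorem example above can be checked numerically. The following is only a minimal sketch: the function names `f` and `f_prime` are illustrative choices, not part of the original notes.

```python
import math

def f(x):
    """Function from the example: f(x) = x^3 - x."""
    return x**3 - x

def f_prime(x):
    """Its derivative: f'(x) = 3x^2 - 1."""
    return 3 * x**2 - 1

a, b = 0.0, 3.0
secant_slope = (f(b) - f(a)) / (b - a)   # (24 - 0) / (3 - 0)

# Solve 3c^2 - 1 = secant_slope and keep the root lying in (a, b).
c = math.sqrt((secant_slope + 1) / 3)

print(secant_slope, c, f_prime(c))
```

The tangent slope at the computed c agrees with the slope of the secant through (a, f(a)) and (b, f(b)), which is what the theorem asserts.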
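Solution: Let $f(x) = \sqrt{x}$ and use the MVT in the form $f(x+h) = f(x) + h f'(x + \theta h)$,
$0 < \theta < 1$. Since $f'(x) = \dfrac{1}{2\sqrt{x}}$ and $f'(x + \theta h) = \dfrac{1}{2\sqrt{x + \theta h}}$, we have
$$\sqrt{x + h} = \sqrt{x} + \frac{h}{2\sqrt{x + \theta h}}.$$
With x = 100 and h = 10, this formula gives
$$\sqrt{110} = 10 + \frac{5}{\sqrt{100 + 10\theta}} < 10 + \frac{5}{\sqrt{100}} = 10.5.$$
Similarly,
$$\sqrt{110} > 10 + \frac{5}{\sqrt{110}} = 10 + \frac{\sqrt{110}}{22}, \quad \text{or,} \quad \frac{21}{22}\sqrt{110} > 10, \quad \text{so that} \quad \sqrt{110} > \frac{220}{21} \approx 10.476.$$
Finally, we have $10.476 < \sqrt{110} < 10.5$.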
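As a quick check of the two bounds obtained from the MVT, the sketch below compares them with the value of the square root computed directly. The variable names are illustrative only.

```python
import math

x, h = 100.0, 10.0

# Upper bound: replace sqrt(x + theta*h) by its smallest possible value sqrt(x).
upper = math.sqrt(x) + h / (2 * math.sqrt(x))

# Lower bound from sqrt(110) > 10 + 5/sqrt(110), i.e. (21/22)*sqrt(110) > 10.
lower = 220 / 21

actual = math.sqrt(x + h)
print(lower, actual, upper)
```

The directly computed root lies between the two bounds, as the argument predicts.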
Solution :
Hence the curve crosses the axes at (0,0) and (3a, 0).
2. Since y appears to even powers only, the curve must be symmetric
to the x-axis. This is the only simple symmetry the curve has.
3. We solve for y to obtain $y = \pm x\sqrt{\dfrac{3a - x}{x + a}}$, which shows that there is a vertical
asymptote at x = −a, provided that x = −a is within or on the boundary of the curve. As
we shall notice in item 4, the extent of the curve is $-a < x \le 3a$, so the curve lies to the right of
and is asymptotic to the line x = −a.
4. To determine the extent of the curve (the set of values of x for which y is real), we
solve the inequality $\dfrac{3a - x}{x + a} \ge 0$. The required solution is $-a < x \le 3a$.
Note : Solving for x in terms of y involves solving a general cubic equation. Since a
cubic equation. with real coefficients always has one or three distinct real roots, it is
clear that for each real value of y, there is always at least one corresponding real
value of x and there may be three such values. Consequently, there is no restriction
on the extent of the curve in the y-direction.
5. Maximum and minimum points on the curve: Let us consider the two separate
equations:
$$y_1 = x\sqrt{\frac{3a - x}{x + a}}, \qquad y_2 = -x\sqrt{\frac{3a - x}{x + a}}.$$
These two formulas represent the two branches of the curve. The two equations
can now be properly associated with functions rather than relations. Because of the
symmetry of the curve, one branch is symmetric to the other with respect to the x-axis, so
that we may restrict the discussion to $y_1$. Since
$$y_1' = \frac{3a^2 - x^2}{(x + a)^{3/2}(3a - x)^{1/2}},$$
we see that $x = \sqrt{3}\,a$ is a critical value of x.
y1` 0 for 3 a x 3 a, so that the curve is rising and falling over the
We see that $y_1'' = -\dfrac{12a^3}{(3a - x)^{3/2}(x + a)^{5/2}}$. It is clear here that the curve $y_1$ is concave.
In the above figure, concave portion of the curve corresponds to y1 and the convex
portion, which is obtained by symmetry, correspond to y2,
Note: The curve is known as a tri-sectrix because of the property that the angle is
one-third of angle if x > 0, y> 0 for p(x, y).
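Solution:
$$f'(x) = \begin{cases} -1 + 2x & \text{for } x \in (-\infty, -1) \\ -1 - 2x & \text{for } x \in (-1, 0) \\ 1 - 2x & \text{for } x \in (0, 1) \\ 1 + 2x & \text{for } x \in (1, \infty) \end{cases}$$
To investigate the existence of extreme values of f at −1, 0, 1, for $0 < \varepsilon < \tfrac{1}{4}$ we
note that f(−1) = 1,
$f(-1 - \varepsilon) = 1 + \varepsilon + (1 + \varepsilon)^2 - 1 = 1 + 3\varepsilon + \varepsilon^2$
$f(-1 + \varepsilon) = 1 - \varepsilon + 1 - (1 - \varepsilon)^2 = 1 + \varepsilon - \varepsilon^2$
$f(1 + \varepsilon) = 1 + \varepsilon + (1 + \varepsilon)^2 - 1 = 1 + 3\varepsilon + \varepsilon^2$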
Example: Show that the curve y = ax3 + bx2 + cx can have only one point of inflexion.
If a is positive, show the curvature changes from concave to convex from below as we
pass through the point of inflection from left to right. Deduce that the point of
inflection is also a stationary point if b2 = 3ac.
$f' = 3ax^2 + 2bx + c$
$f'' = 6ax + 2b$
$6ax + 2b = 0 \Rightarrow x = -\dfrac{b}{3a}$
Therefore, we have a point of inflection at $x = -\dfrac{b}{3a}$.
The point of inflection is also a stationary point if f' is also zero at that point:
$$f'\left(-\frac{b}{3a}\right) = 3a\left(-\frac{b}{3a}\right)^2 + 2b\left(-\frac{b}{3a}\right) + c = 0$$
Or, $\dfrac{b^2}{3a} - \dfrac{2b^2}{3a} + c = 0$
Or, $-\dfrac{b^2}{3a} + c = 0$
Or, $b^2 = 3ac$.
Hence, the point of inflection $x = -\dfrac{b}{3a}$ is also a stationary point if $b^2 = 3ac$.
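The condition derived above can be illustrated with one numerical case. This is a sketch under assumed values: the coefficients a = 1, b = 3 are arbitrary choices, and c is then picked so that b² = 3ac holds; the helper name `cubic_derivatives` is hypothetical.

```python
def cubic_derivatives(a, b, c):
    """Return f' and f'' of f(x) = a*x**3 + b*x**2 + c*x as callables."""
    first = lambda x: 3 * a * x**2 + 2 * b * x + c
    second = lambda x: 6 * a * x + 2 * b
    return first, second

# Assumed illustrative coefficients; c is chosen so that b^2 = 3ac.
a, b = 1.0, 3.0
c = b**2 / (3 * a)

f1, f2 = cubic_derivatives(a, b, c)
x_infl = -b / (3 * a)          # the unique root of f''(x) = 0

print(x_infl, f1(x_infl), f2(x_infl))
```

With a > 0 the second derivative is negative to the left of the inflection point and positive to the right, and with b² = 3ac the first derivative also vanishes there.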
5.3 Uniqueness of Solution
Given an optimization problem in which the feasible set is convex and the objective
function is non-constant and quasi-concave, a solution is unique if :
(c) both
H 2 – AB
which is independent of fixed amount of b used for the production
B
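$$f' = \frac{1}{a}\cdot\frac{2Hb - 2Aa}{2\sqrt{2Hab - Aa^2 - Bb^2}} - \frac{1}{a^2}\sqrt{2Hab - Aa^2 - Bb^2} = 0$$
Or, $\dfrac{1}{a^2}\cdot\dfrac{-2Hab + 2Bb^2}{2\sqrt{2Hab - Aa^2 - Bb^2}} = 0$
Or, $-2Hab + 2Bb^2 = 0$, or $a = \dfrac{B}{H}b$, or $b = \dfrac{aH}{B}$.
It is easy to check that at $a = \dfrac{B}{H}b$, $f'' < 0$.
Thus $AP_a$ will be maximum at $a = \dfrac{B}{H}b$.
To get $AP_{max}$, we substitute $b = \dfrac{aH}{B}$ and simplify.
$$AP_{max} = \sqrt{\frac{2Hab}{a^2} - A - \frac{Bb^2}{a^2}} = \sqrt{\frac{2Ha}{a^2}\cdot\frac{aH}{B} - A - \frac{B}{a^2}\cdot\frac{a^2H^2}{B^2}} = \sqrt{\frac{2H^2 - AB - H^2}{B}}$$
$AP_{max} = \sqrt{\dfrac{H^2 - AB}{B}}$. Hence proved.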
Example:
Determine the constant k so that the function $f(x) = x^2 + \dfrac{k}{x}$ may have (i) a minimum
at x = 2, (ii) a minimum at x = −3. Show that the function cannot have a local
maximum for any value of k.
Solution: $f'(x) = 2x - \dfrac{k}{x^2} = 0 \Rightarrow x^3 = \dfrac{k}{2}$
$f''(x) = 2 + \dfrac{2k}{x^3} > 0$ (for a minimum)
Therefore, $x = \left(\dfrac{k}{2}\right)^{1/3}$ is a point of minimum.
(i) $x = 2 \Rightarrow k = 16$
(ii) $x = -3 \Rightarrow \dfrac{k}{2} = -27 \Rightarrow k = -54$
At any stationary point (by the substitution $x^3 = \dfrac{k}{2}$), $f''(x) = 2 + \dfrac{2k}{x^3} = 2 + 2(2) = 6 > 0$, which is always positive.
Therefore, the function cannot have a local maximum for any value of k.
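Example: A wire of length L is cut into two pieces, one being bent to form a square
and the other to form an equilateral triangle. How should the wire be cut if the sum
of the two areas is to be a minimum?
Solution: Let the wire be cut at length $\ell$, and let this piece be used to form the triangle. Side = $\ell/3$.
Altitude $= \dfrac{\sqrt{3}}{2}\cdot\text{Side} = \dfrac{\sqrt{3}}{2}\cdot\dfrac{\ell}{3} = \dfrac{\ell}{2\sqrt{3}}$
$A_1$ = Area of the (equilateral) triangle $= \dfrac{\sqrt{3}}{4}(\text{Side})^2 = \dfrac{\sqrt{3}}{4}\left(\dfrac{\ell}{3}\right)^2 = \dfrac{\ell^2}{12\sqrt{3}}$
$A_2$ = Area of the square $= \left(\dfrac{L - \ell}{4}\right)^2$
A = Total Area $= A_1 + A_2 = \dfrac{\ell^2}{12\sqrt{3}} + \dfrac{(L - \ell)^2}{16}$
$$\frac{dA}{d\ell} = \frac{2\ell}{12\sqrt{3}} - \frac{2(L - \ell)}{16} = 0$$
Or, $\dfrac{\ell}{6\sqrt{3}} - \dfrac{L - \ell}{8} = 0$
Or, $\dfrac{4\ell - 3\sqrt{3}L + 3\sqrt{3}\ell}{24\sqrt{3}} = 0$
Or, $\dfrac{(4 + 3\sqrt{3})\ell - 3\sqrt{3}L}{24\sqrt{3}} = 0$
Or, $(4 + 3\sqrt{3})\ell = 3\sqrt{3}L$
Or, $\ell = \dfrac{3\sqrt{3}}{4 + 3\sqrt{3}}L$
$$\frac{d^2A}{d\ell^2} = \frac{4 + 3\sqrt{3}}{24\sqrt{3}} > 0$$
Hence, using the fraction $\dfrac{3\sqrt{3}}{4 + 3\sqrt{3}}$ of the total length L for the triangle
and the rest for the square gives the minimum total area.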
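The closed-form cut point can be compared with a brute-force search over possible cut lengths. The sketch below assumes a wire of length L = 1 (an arbitrary normalisation) and uses a simple grid search, which is not the method of the notes but an independent check of it.

```python
import math

def total_area(l, L):
    """Area of an equilateral triangle of perimeter l plus a square of perimeter L - l."""
    triangle = l**2 / (12 * math.sqrt(3))
    square = (L - l)**2 / 16
    return triangle + square

L = 1.0  # assumed wire length for illustration

# Closed-form minimiser from the first-order condition.
l_star = 3 * math.sqrt(3) * L / (4 + 3 * math.sqrt(3))

# Independent check: evaluate the area on a fine grid of cut points.
grid = [i * L / 10000 for i in range(10001)]
l_grid = min(grid, key=lambda l: total_area(l, L))

print(l_star, l_grid, total_area(l_star, L))
```

Roughly 56.5% of the wire goes to the triangle; putting the whole wire into either shape gives a larger total area.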
(2) If $f'(a) = f''(a) = \dots = f^{(n-1)}(a) = 0$ and $f^{(n)}(a) \neq 0$, then f(x) has a stationary value
at x = a, which is an inflectional value if n is odd, a maximum value if n is even
and $f^{(n)}(a) < 0$, and a minimum value if n is even and $f^{(n)}(a) > 0$.
This criterion is complete, and so both necessary and sufficient, subject to the
condition that the derivatives involved are finite and continuous.
There is no case of failure; unless the function is a constant (and hence without
maxima and minima), there must always be some derivative which is not zero.
Illustration: Let $y = (x - 1)^5$. We now investigate this function with respect to maxima,
minima or points of inflection.
$f(x) = y = (x - 1)^5$
$f' = 5(x - 1)^4 = 0 \Rightarrow x = 1$
$f'' = 20(x - 1)^3 = 0$ at x = 1
$f''' = 60(x - 1)^2 = 0$ at x = 1
$f^{(4)} = 120(x - 1) = 0$ at x = 1
$f^{(5)} = 120 \neq 0$ at x = 1
Here, the first non-zero derivative at x = 1 is the 5th-order derivative (an odd-ordered
derivative). Hence at x = 1, the function has a point of inflection.
At x = 1, $y = (1 - 1)^5 = 0$.
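For polynomials, the higher-order derivative criterion can be applied mechanically. The following sketch represents a polynomial by its list of coefficients; the function names and the returned labels are illustrative assumptions, and the routine only handles polynomials evaluated at exact (integer) points.

```python
def poly_derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...], where ci multiplies x**i."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def poly_eval(coeffs, x):
    """Evaluate the polynomial at x (an empty list is the zero polynomial)."""
    return sum(c * x**i for i, c in enumerate(coeffs))

def classify_stationary_point(coeffs, a):
    """Find the first non-vanishing derivative at a and apply the odd/even rule."""
    deriv = poly_derivative(coeffs)
    order = 1
    if poly_eval(deriv, a) != 0:
        return "not stationary"
    while deriv:
        deriv = poly_derivative(deriv)
        order += 1
        value = poly_eval(deriv, a)
        if value != 0:
            if order % 2 == 1:
                return "inflection"
            return "minimum" if value > 0 else "maximum"
    return "constant"

# (x - 1)^5 expanded: -1 + 5x - 10x^2 + 10x^3 - 5x^4 + x^5
quintic = [-1, 5, -10, 10, -5, 1]
# (x - 1)^4 expanded: 1 - 4x + 6x^2 - 4x^3 + x^4
quartic = [1, -4, 6, -4, 1]

print(classify_stationary_point(quintic, 1))
print(classify_stationary_point(quartic, 1))
```

The quintic reproduces the illustration (first non-zero derivative of odd order 5), while the quartic shows the even-order case.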
Example: (Curious Case) Find the maxima and minima of the function
$f(x) = y = \dfrac{2 - x}{x^2 + x - 2}$.
$$f' = \frac{x(x - 4)}{(x^2 + x - 2)^2}$$
There are stationary values y = −1 at x = 0 and $y = -\dfrac{1}{9}$ at x = 4.
To check the sign of the derivative near these stationary values x = 0, 4, we get the
following:
$f' = \dfrac{h(h - 4)}{(h^2 + h - 2)^2}$ at x = 0 + h
$f' = \dfrac{h(h + 4)}{(h^2 + 9h + 18)^2}$ at x = 4 + h
As is easy to see, the first expression changes from positive to negative as h is
given small values changing from negative to positive. The second expression
changes in the opposite sense as h is varied similarly. The function thus has a
maximum value −1 at x = 0 and a minimum value $-\dfrac{1}{9}$ at x = 4.
The curious feature of this case is that the maximum value of the function is smaller
than the minimum value. This apparently paradoxical result is due to the fact that the
function has infinite values at x = 1 and at x = −2. (At each of these values the
denominator of y is zero.) The graph of the function illustrates how the presence of
infinities influences the maximum and minimum values.
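Example: If a monopolist has a total cost function $C = ax^2 + bx + c$ and if the demand
law is $p = \beta - \alpha x^2$, then show that the output for maximum net revenue is
$$x = \frac{\sqrt{a^2 + 3\alpha(\beta - b)} - a}{3\alpha}$$
Solution:
Net Revenue $\pi = p \cdot x - C(x)$
$= (\beta - \alpha x^2)x - (ax^2 + bx + c)$
$= \beta x - \alpha x^3 - ax^2 - bx - c$
Necessary condition for a maximum: $\dfrac{d\pi}{dx} = 0$, i.e. $\beta - 3\alpha x^2 - 2ax - b = 0$
Or, $x = \dfrac{-2a \pm \sqrt{4a^2 - 12\alpha(b - \beta)}}{6\alpha}$
Or, $x = \dfrac{-2a \pm \sqrt{4[a^2 + 3\alpha(\beta - b)]}}{6\alpha}$
Hence, taking the positive root: $x = \dfrac{-a + \sqrt{a^2 + 3\alpha(\beta - b)}}{3\alpha} = \dfrac{\sqrt{a^2 + 3\alpha(\beta - b)} - a}{3\alpha}$
Sufficient condition for a maximum: $\dfrac{d^2\pi}{dx^2} < 0$
Here $\dfrac{d^2\pi}{dx^2} = -6\alpha x - 2a < 0$.
Hence the output for maximum net revenue is
$$x = \frac{\sqrt{a^2 + 3\alpha(\beta - b)} - a}{3\alpha}$$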
1. Without appealing to graphical ideas, find the location and nature of the
extreme of the following two functions and determine if they are differentiable
at these points :
1 3
(a) f ( x) x 2 x 2 3x 1
2
(b) f ( x) (2 x – 5) x 2 / 3
2. Show that the curve $y = 2x - 3 + \dfrac{1}{x}$ is convex from below for all positive
values of x. Is the same true for $y = ax + b + \dfrac{c}{x}$?
3. Show that the demand curve $p = \dfrac{a}{x + b} - c$ is downward sloping and
convex from below. Do the same properties hold for the marginal revenue curve?
6. Show that $y = x + \dfrac{1}{x}$ has one maximum and one minimum value and the latter
is larger than the former.
1 3
c Q – 7Q 2 111 Q 50
3
Q = 100 – p.
when about 30 sets are produced per week. What is the monopoly price and
net revenue at this level of output?
2x 2 4 x (1 x 2 ) (1 x) (1 – x)
9. Show that f ( x) 4 f ( x)
1
x 1 ( x 4 1) 2
Also find the maximum value of f on (0, ) . Show that f(–x) = f(x), for all x.
What are the maximum points for f on (– , ) ?
10. Find two positive numbers whose sum is 16 and whose product is as large as
possible.
11. Let C(Q) = a Qb + c, for a>0, b > 1, and c o, be cost function. Prove that the
average cost function has a minimum on (0, ), and find it.
6x3
12. Classify the stationary points of f(x) = , with respect to maxima,
x4 x2 2
maxima and point(s) of inflection.
14. Find possible inflection points for f(x) = x2ex. Draw its graph.
15. Find the intervals where the following Cubic cost function is convex and
where it is concave, and find the unique inflection point:
(16) Are the following functions concave or convex (assuming x > 0 in parts (b)
and (c) ?
1 x 1 –x
(a) e e (b) 2x – 3 + 4 lnx
2 2
1
8. References
K. Sydsaeter and P. Hammond, Mathematics for Economic Analysis,
Pearson Educational Asia, Delhi, 2002.
Discipline Courses-I
Semester-I
Paper II: Mathematical Methods for Economics: Preliminaries-I
Unit-IV
Lesson: Integration of Functions
Lesson Developer: Sanjeev Kumar
College/Department: Dyal Singh College, University of Delhi
CONTENTS:
References
Introduction
There are two limiting processes in calculus. The first is differentiation, in which we
study the tangent to a curve, or the rate of change of one variable due to a change in
another variable. The second is integration, in which we study the
area under a curve. Integration can be defined as:
“Integration is the process of finding a function from its derivative, and this function is
called the integral of the given function”.
Basically, we use integration to find the area under a curve. We can also find the
area under a curve geometrically. However, the concepts of integration and differentiation
are defined analytically and do not depend on geometry. A geometrical interpretation is used only to
aid intuition.
Let y = f(x) be a continuous and positive function on the closed interval [a, b], as in
figure (1). We have to find the area under the graph of this function over the closed
interval [a, b]. The question is how to compute this area (A). Let A(x) denote the
area under the curve over the interval [a, x]. Then
A(a) = 0,
because there is no area from ‘a’ to ‘a’, and the total area can be defined as
A = A(b).
Now, suppose that x increases by an amount $\Delta x$. Then $A(x + \Delta x)$ is the area under the
curve y = f(x) over the closed interval $[a, x + \Delta x]$. Hence
$$\Delta A = A(x + \Delta x) - A(x)$$
is the area under the curve y = f(x) over the closed interval $[x, x + \Delta x]$.
When $\Delta x$ is very small, this area cannot exceed the area of the rectangle with
edges $\Delta x$ and $f(x + \Delta x)$, and cannot be less than the area of the rectangle with edges $\Delta x$ and
f(x) (as the figure is drawn, with f increasing). Hence, for $\Delta x > 0$,
$$f(x)\,\Delta x \le A(x + \Delta x) - A(x) \le f(x + \Delta x)\,\Delta x$$
Or, $f(x) \le \dfrac{A(x + \Delta x) - A(x)}{\Delta x} \le f(x + \Delta x)$
If we let $\Delta x \to 0$ in the above inequality, then the interval $[x, x + \Delta x]$ shrinks to the
single point ‘x’ and the value $f(x + \Delta x)$ approaches f(x). So the function A(x), which
measures the area under the curve y = f(x) over the closed interval [a, x], is
differentiable, and its derivative is given by
$$A'(x) = f(x)$$
This proves that the derivative of the area function A(x) is the curve's height function
{i.e. y = f(x)}.
Now, suppose F(x) is another continuous function with the function y = f(x) as its
derivative. Then
$$\frac{d}{dx}[A(x) - F(x)] = A'(x) - F'(x) = 0,$$
so that A(x) = F(x) + C {C is some constant}.
If A(a) = 0, then
A(a) = F(a) + C = 0, so that C = −F(a) and A = A(b) = F(b) − F(a).
Institute of Lifelong Learning, University of Delhi
4
Mathematical Methods for Economics: Integration of Functions
In short, the method for finding the area under the curve y = f(x) and above the
x-axis from x = a to x = b has the following steps:
Find an arbitrary function F(x), continuous over the interval [a, b], such that
F'(x) = f(x) for all x in (a, b). The required area is then A = F(b) − F(a).
What happens if the function y = f(x) is negative on [a, b]? In this
case, the required area is A = −[F(b) − F(a)], because the area
of a region is always positive.
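Example 1:
Find the area under the straight line y = f(x) = x over the interval [0, 1].
Solution:
Let $F(x) = \dfrac{x^2}{2}$.
Then, using $\dfrac{d}{dx}(ax^n) = anx^{n-1}$ with n = 2 and $a = \dfrac{1}{2}$,
$F'(x) = \dfrac{2x}{2} = x = f(x)$
$A = F(1) - F(0) = \dfrac{1}{2} - 0 = \dfrac{1}{2}$
This answer is reasonable, since the region is a triangle with base 1 and height 1.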
Example 2:
Compute the area under the parabola $y = f(x) = x^2$ over the interval [a, b].
Solution:
Let $F(x) = \dfrac{1}{3}x^3$.
Then, $F'(x) = \dfrac{1}{3}\cdot 3x^2 = x^2 = f(x)$
$A = F(b) - F(a) = \dfrac{1}{3}b^3 - \dfrac{1}{3}a^3$
$A = \dfrac{1}{3}(b^3 - a^3)$
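Example 3:
Let $F(x) = \dfrac{1}{2}ax^2 + bx$, using $\dfrac{d}{dx}(ax^n) = anx^{n-1}$.
Then, $F'(x) = \dfrac{1}{2}\cdot 2ax + b\cdot 1 = ax + b$
So, the required area A over an interval $[\alpha, \beta]$ is given by
$A = F(\beta) - F(\alpha)$
$= \dfrac{1}{2}a\beta^2 + b\beta - \dfrac{1}{2}a\alpha^2 - b\alpha$
$= \dfrac{1}{2}a(\beta^2 - \alpha^2) + b(\beta - \alpha)$
$= \dfrac{1}{2}(\beta - \alpha)\,[a(\alpha + \beta) + 2b]$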
Example 4: Find the shaded area ‘A’ for the function $y = f(x) = e^{x/3} - 3$ over the closed
interval [0, 3 ln 3].
Let $F(x) = 3e^{x/3} - 3x$, using $\dfrac{d}{dx}(e^x) = e^x$.
Then, $F'(x) = f(x) = e^{x/3} - 3$. Since f(x) ≤ 0 on this interval,
A = −[F(b) − F(a)]
$= -[(3e^{\ln 3} - 3 \cdot 3\ln 3) - 3e^0] = 9\ln 3 - 6$
A ≈ 3.89 units
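The areas found with anti-derivatives in the examples above can be cross-checked by approximating the area directly with thin rectangles. This is a rough sketch: the midpoint rule and the helper name `midpoint_area` are choices made here, and the interval [1, 2] used for the parabola is an assumed illustration of the general [a, b].

```python
import math

def midpoint_area(f, a, b, n=100_000):
    """Approximate the signed area under f on [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Example 1: y = x on [0, 1]; the anti-derivative method gives 1/2.
area_line = midpoint_area(lambda x: x, 0.0, 1.0)

# Example 2 with the assumed interval [1, 2]: (b^3 - a^3)/3 = 7/3.
area_parabola = midpoint_area(lambda x: x**2, 1.0, 2.0)

# Example 4: f is negative on [0, 3 ln 3], so the area is minus the signed integral.
area_exp = -midpoint_area(lambda x: math.exp(x / 3) - 3, 0.0, 3 * math.log(3))

print(area_line, area_parabola, area_exp)
```

The third value agrees with 9 ln 3 − 6, which rounds to the 3.89 units quoted in Example 4.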
Problem Set
1. Find the area under the graph of the polynomial $y = f(x) = x^3$ over the interval [0, 1].
2. Find the bounded area under the graph of the function $y = f(x) = \dfrac{1}{2}(e^x + e^{-x})$ over the
interval (−1, 1).
3. Find the area under the straight line $y = f(x) = cx + d$ over the interval [0, 1].
4. Compute the area under the parabola $y = 4x^2$ over the interval [0, 1].
Ans. 1. Area (A) = $\dfrac{1}{4}$   2. Area (A) = $e - \dfrac{1}{e}$
3. Area (A) = $\dfrac{1}{2}(c + 2d)$   4. Area (A) = $\dfrac{4}{3}$
Introduction
The previous section of the present chapter discussed the problem of finding an anti-
derivative of the function f(x), i.e. a function F(x) whose derivative is f(x):
$F'(x) = f(x)$
Symbolically, if
$\dfrac{d}{dx}F(x) = f(x)$
then $\int f(x)\,dx = F(x) + C$
Here ‘C’ is a constant term; we know that the derivative of a constant is zero.
Because the constant of integration ‘C’ can take any value, the integral is called an indefinite integral.
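$\int x^n\,dx = \dfrac{1}{n + 1}x^{n+1} + C \quad \{n \neq -1\}$
Example: $\int x\,dx = \dfrac{x^2}{2} + C$
Exponential Rule: It is defined as
$\int e^x\,dx = e^x + C$
And, $\int a^x\,dx = \dfrac{a^x}{\log_e a} + C \quad \{a > 0 \ \&\ a \neq 1\}$
Examples:
$\int e^{ax}\,dx = \dfrac{1}{a}e^{ax} + C \quad \{a \neq 0\}$
$\int 2^x\,dx = \dfrac{2^x}{\log_e 2} + C$
$\int \dfrac{1}{x}\,dx = \ln|x| + C$
Example:
$\int \dfrac{1}{t}\,dt = \ln|t| + C$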
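Integral of Sum
$$\int [a_1f_1(x) + a_2f_2(x) + \dots + a_nf_n(x)]\,dx = a_1\int f_1(x)\,dx + a_2\int f_2(x)\,dx + \dots + a_n\int f_n(x)\,dx$$
Integral of Difference
$$\int [a_1f_1(x) - a_2f_2(x) - \dots - a_nf_n(x)]\,dx = a_1\int f_1(x)\,dx - a_2\int f_2(x)\,dx - \dots - a_n\int f_n(x)\,dx$$
Integral of Multiplication
$$\int f(x)g(x)\,dx = f(x)\int g(x)\,dx - \int\left[\frac{d}{dx}f(x)\int g(x)\,dx\right]dx$$
$$\int \frac{1}{\sqrt{x^2 + a^2}}\,dx = \log\left|x + \sqrt{x^2 + a^2}\right| + C$$
$$\int \frac{1}{\sqrt{x^2 - a^2}}\,dx = \log\left|x + \sqrt{x^2 - a^2}\right| + C$$
$$\int \frac{1}{a^2 - x^2}\,dx = \frac{1}{2a}\log\left|\frac{a + x}{a - x}\right| + C$$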
Example1:
( 5x 3 x 2 2 x 1 )dx
4
Find the integral
( 5x 3 x 2 2 x 1 )dx
4
Solution:
5x dx 3x dx 2xdx dx
4 2
=
= 5 x4 dx 3 x 2 dx 2 xdx dx
x5 3.x3 x2
= 5 C1 C2 2 C3 x C4
5 3 2
= x5 x3 x 2 x C1 C2 C3 C4
=x
5
x3 x 2 x C C C
1
C2 C3 C4
( e 1 )dx
x
Example 2: Evaluate
x3
1
(e 1)dx
x
Solution:
x3
e dx x dx 1dx
x 3
=
1
= e x x 2 x c
2
1
= ex xc
2 x2
( x 1 )2 2 x 1 / 2
Example 3: Find the integral x
dx
( x 1)2 2 x 1/2
Solution: x
dx
x 2 2 x 1 2 x 1/2
= dx
x1/2
1
= (x 2 x1/2 x 1/2 2 )dx
3/2
2 5/2 4 3/2
= x x 2 x1/2 2ln x c
5 3
x2
Example 4: Compute x 1 dx
x2
Solution: Let, x 1 dx
x 2 1 1
= dx
x 1
( x 1 )( x 1 ) 1
= ( x 1)
dx
1
= ( x 1 )dx ( x 1 ) dx
x2
= x log x 1 c
2
dx
Example5: Evaluate xc xd
dx
Solution: Let, xc xd
xc xd )
= dx
( x c x d )( x c x d
xc xd )
= ( x c ) ( x d ) dx
1 1
=
(c d )
( x c )1 / 2 dx
(c d )
( x d )1 / 2 dx
1 2 1 2
( x c)3/2 ( x d )3/2 c
(c d ) 3 cd 3
2 1
( x c)3/2 ( x d )3/2 c
3 (c d )
( 6 x 9 ) dx
8
Example 6: Find the integration
Let y 6x 9
1
Then, dy 6 dx or dx dy
6
1
( 6 x 9 ) dx 6 y dy
8 8
So, we get,
1 y9
= c
6 9
1
(6 x 9) dx 54 (6 x 9) C
8 9
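Example 7: Evaluate $\int \dfrac{x^2}{4 - x^2}\,dx$
Solution: Let, $\int \dfrac{x^2}{4 - x^2}\,dx = \int \dfrac{x^2 - 4 + 4}{4 - x^2}\,dx$
$= -\int 1\,dx + 4\int \dfrac{1}{4 - x^2}\,dx$
$= -x + 4\cdot\dfrac{1}{2\cdot 2}\log\left|\dfrac{2 + x}{2 - x}\right| + c$ (by the formula)
$= -x + \log\left|\dfrac{2 + x}{2 - x}\right| + c$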
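Example 8: Evaluate $\int x^2e^{2x}\,dx$
$$I = \int x^2e^{2x}\,dx = x^2\int e^{2x}\,dx - \int\left[\frac{d}{dx}(x^2)\int e^{2x}\,dx\right]dx$$
$= x^2\cdot\dfrac{e^{2x}}{2} - \int \dfrac{2x\,e^{2x}}{2}\,dx$
$= \dfrac{1}{2}x^2e^{2x} - \int xe^{2x}\,dx$
$= \dfrac{1}{2}x^2e^{2x} - \left[x\int e^{2x}\,dx - \int\left(\dfrac{d}{dx}(x)\int e^{2x}\,dx\right)dx\right]$
$= \dfrac{1}{2}x^2e^{2x} - \left[\dfrac{xe^{2x}}{2} - \int 1\cdot\dfrac{e^{2x}}{2}\,dx\right]$
$= \dfrac{1}{2}x^2e^{2x} - \dfrac{1}{2}xe^{2x} + \dfrac{1}{4}e^{2x} + c$
$= \dfrac{1}{2}e^{2x}\left(x^2 - x + \dfrac{1}{2}\right) + c$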
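Example 9: Find $\int \dfrac{1}{4x^2 - 9}\,dx$
Solution: $\int \dfrac{1}{4x^2 - 9}\,dx = \int \dfrac{1}{(2x - 3)(2x + 3)}\,dx$
Now, let $\dfrac{1}{(2x - 3)(2x + 3)} = \dfrac{A}{2x - 3} + \dfrac{B}{2x + 3}$ ................(i)
$= \dfrac{2Ax + 3A + 2Bx - 3B}{4x^2 - 9}$
$= \dfrac{2x(A + B) + 3(A - B)}{4x^2 - 9}$
Now compare both sides of the equation: $2x(A + B) + 3(A - B) = 1$
Hence 2(A + B) = 0 or A = −B, and 3(A − B) = 1, or B = −1/6 and A = 1/6. Now, by equation (i),
$$\int \frac{1}{4x^2 - 9}\,dx = \frac{1}{6}\int \frac{dx}{2x - 3} - \frac{1}{6}\int \frac{dx}{2x + 3} = \frac{1}{12}\ln|2x - 3| - \frac{1}{12}\ln|2x + 3| + C$$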
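A convenient way to check an indefinite integral is to differentiate the answer and compare it with the integrand. The sketch below does this numerically for the results of Example 8 (integration by parts) and Example 9 (partial fractions). The function names, the step size and the sample points are assumptions made here for illustration; the points avoid x = ±3/2, where the integrand of Example 9 is undefined.

```python
import math

def numerical_derivative(F, x, h=1e-6):
    """Central-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# Example 8: integrand and the anti-derivative obtained by parts.
def f_parts(x):
    return x**2 * math.exp(2 * x)

def F_parts(x):
    return 0.5 * math.exp(2 * x) * (x**2 - x + 0.5)

# Example 9: integrand and the anti-derivative obtained by partial fractions.
def f_partial(x):
    return 1 / (4 * x**2 - 9)

def F_partial(x):
    return (math.log(abs(2 * x - 3)) - math.log(abs(2 * x + 3))) / 12

sample_points = [-1.0, 0.5, 2.0, 3.0]
max_error = max(
    max(abs(numerical_derivative(F_parts, x) - f_parts(x)),
        abs(numerical_derivative(F_partial, x) - f_partial(x)))
    for x in sample_points
)
print(max_error)
```

A small maximum discrepancy suggests that both anti-derivatives are consistent with their integrands at the sampled points; it is a sanity check, not a proof.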
Solution: $Q(L) = \int 6L^{1/3}\,dL = \dfrac{18}{4}L^{4/3} + c$
Given L = 0, then Q(0) = 0 + c, or c = 0.
Problem Set
A
( 4 x 9 x 2 2 x 2 )dx r
3
(i) (ii) 5/2
dr
(3t 2t et )dt x
2
(iii) (iv) x dx
1
( ax b ) dx a( 1 ) ( ax b ) c
1
2. Prove that,
1 x
3. Find the integration (i) x 2
dx (ii) 2x 2
3
dx
1
2(e e x )dx
x
6. Evaluate
7. Given, f " ( t ) 1 / t 2 t 3 2 t 0 and f(1) 0, f' (1) 1/4 then find f(t).
8. Prove that,
2
t at b .dt = 2
( 3at 2b )( at b )3 / 2 c
15a
log x dx x e dx
5 x
9. Find the integration (i) (ii)
10. Find the general form of the function f(x), whose third derivative is x and also given
f"(0 ) f '(0 ) f (0 ) 0
1 2x 1
11. Evaluate, (i) x 2
a2
dx (ii) ( x 1 )( x 2 )( x 3 )
1. (i) x4 3 x3 x 2 2 x c
2A
(ii) c
3r 3/2
(iii) t 3 t 2 et c
2 5/2
(iv) x c
5
3. (i) 2[ x 2ln( x 2] C
1
(ii) ln(2 x 2 3] C
4
1
4. (i) ln( x 2 1)3/2 C
3
3
(iv) ( x 2 )4 / 3 ( 2 x 3 ) c
14
5. C( x ) x 2 4 x 40
1 x x 1 5
6. (e e ) 7. t t log t
2 20
9. (i) x log x x c
1 4
10. x
24
1 xa 1 3 7
11. (i) log c (ii)
2a xa 12( x a ) 5( x 2 ) 4( x 3 )
Introduction
Let F(x) be a continuous function over the interval [a, b] with derivative
f(x), i.e. $F'(x) = f(x)\ \forall x \in (a, b)$. Then the difference F(b) − F(a) is called the definite
integral of the function f(x) over the interval [a, b]. As seen in the first section of the present
chapter, this difference does not depend on which indefinite integral F of f is chosen, because
the constant of integration cancels. The definite integral of f(x) therefore depends only on
the function f(x) and its interval [a, b].
The definite integral can be written as
$$\int_a^b f(x)\,dx = \Big[F(x)\Big]_a^b = F(b) - F(a)$$
where $F'(x) = f(x)\ \forall x \in (a, b)$, and the numbers ‘b’ and ‘a’ are the upper and
lower limits respectively.
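Example 1: Find $\int_a^b x\,dx$
Solution: Let $I = \int_a^b x\,dx$
$I = \left[\dfrac{x^2}{2} + c\right]_a^b$
$= \left(\dfrac{b^2}{2} + c\right) - \left(\dfrac{a^2}{2} + c\right)$
$= \dfrac{b^2}{2} - \dfrac{a^2}{2} = \dfrac{1}{2}(b^2 - a^2)$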
Some Basic Properties of Definite Integral
$\int_a^a f(x)\,dx = F(a) - F(a) = 0$
$\int_a^b f(x)\,dx = \int_a^b f(y)\,dy = \int_a^b f(z)\,dz$
$\dfrac{d}{dt}\int_{a(t)}^{b(t)} f(x)\,dx = f\{b(t)\}\cdot b'(t) - f\{a(t)\}\cdot a'(t)$
Every continuous function is integrable: it has an anti-derivative, i.e.
$F'(x) = f(x),\ \forall x \in (a, b)$
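Example 2: Find $\int_1^2 \left(2x + \dfrac{1}{x}\right)dx$
Solution: Let $I = \int_1^2 \left(2x + \dfrac{1}{x}\right)dx$
$= \left[\dfrac{2x^2}{2} + \log x\right]_1^2$
$= \left[x^2 + \log x\right]_1^2$
= [4 + log 2] − [1 + 0]
= 3 + log 2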
Example 3: Find the area bounded by the parabola $x^2 = 4by$, the x-axis and its
ordinate at x = 3.
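Solution: The required area $= \int_0^3 y\,dx$
$= \int_0^3 \dfrac{x^2}{4b}\,dx$ {since $x^2 = 4by \Rightarrow y = \dfrac{x^2}{4b}$}
$= \dfrac{1}{4b}\left[\dfrac{x^3}{3}\right]_0^3$
$= \dfrac{1}{4b}\left(\dfrac{27}{3} - 0\right) = \dfrac{9}{4b}$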
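Example 4: Find $\int_1^4 |x - 2|\,dx$
Solution: We have
$$|x - 2| = \begin{cases} x - 2 & \text{if } x \ge 2 \\ -(x - 2) & \text{if } x < 2 \end{cases}$$
Then $\int_1^4 |x - 2|\,dx = -\int_1^2 (x - 2)\,dx + \int_2^4 (x - 2)\,dx$ {by the property of integration}
$= -\left[\dfrac{x^2}{2} - 2x\right]_1^2 + \left[\dfrac{x^2}{2} - 2x\right]_2^4$
$= -\left[\left(\dfrac{4}{2} - 4\right) - \left(\dfrac{1}{2} - 2\right)\right] + \left[\left(\dfrac{16}{2} - 8\right) - \left(\dfrac{4}{2} - 4\right)\right]$
$= -\left[-2 + \dfrac{3}{2}\right] + [0 + 2] = \dfrac{1}{2} + 2 = \dfrac{5}{2}$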
Example 5: Find the area between the regions of parabola y x2 and straight line y x
over the interval [-1,1]or ( x, y )x 2
y x
= 2 *Area OAB
2 xdx x 2 dx
1 1
=
0 0
x2 1 x3 1
= 2
2 0 3 0
1 1
= 2 0 0 = 2/3 square units
2 3
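Example 6: Evaluate $\int_0^T \dfrac{K}{T}e^{-Qt}\,dt$, where T > 0 and K and Q are positive constants.
Solution: Let $W(T) = \int_0^T \dfrac{K}{T}e^{-Qt}\,dt$
$= \dfrac{K}{T}\int_0^T e^{-Qt}\,dt$
$= \dfrac{K}{T}\left[-\dfrac{e^{-Qt}}{Q}\right]_0^T$
$= \dfrac{K}{TQ}\left[(-e^{-QT}) - (-e^{0})\right]$
$W(T) = \dfrac{K}{TQ}\left(1 - e^{-QT}\right)$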
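Example 7: Find the area included between the two parabolas $y^2 = 4x$ and $x^2 = 4y$.
At the points of intersection, $\dfrac{x^2}{4} = \sqrt{4x}$, i.e. $x^4 = 64x$,
Or, $x(x^3 - 64) = 0$
So, x = 0 and 4.
Required area $= \int_0^4 \left(\sqrt{4x} - \dfrac{x^2}{4}\right)dx$ {since $y^2 = 4x$ and $y = x^2/4$}
$= \left[2\cdot\dfrac{x^{3/2}}{3/2} - \dfrac{x^3}{12}\right]_0^4 = \dfrac{32}{3} - \dfrac{16}{3} = \dfrac{16}{3}$
≈ 5.3 square units.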
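The area between the two parabolas can also be approximated directly, which gives an independent check of the definite integral. This is a sketch only: the midpoint rule and the helper name `midpoint_integral` are assumptions made here, not part of the notes.

```python
import math

def midpoint_integral(f, a, b, n=200_000):
    """Approximate the definite integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Upper curve y = sqrt(4x) = 2*sqrt(x), lower curve y = x^2 / 4, between x = 0 and x = 4.
area = midpoint_integral(lambda x: 2 * math.sqrt(x) - x**2 / 4, 0.0, 4.0)
print(area, 16 / 3)
```

The approximation agrees with the exact value 16/3, which the example rounds to 5.3 square units.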
Example 8: Find $\dfrac{d}{dx}\int_x^{x^2} e^{-u^2}\,du$
Solution: By the direct property of integration, we get
$$\frac{d}{dx}\int_{a(x)}^{b(x)} f(u)\,du = f(b(x))\,b'(x) - f(a(x))\,a'(x)$$
Then,
$$\frac{d}{dx}\int_x^{x^2} e^{-u^2}\,du = e^{-(x^2)^2}\cdot 2x - e^{-x^2}\cdot 1 = 2xe^{-x^4} - e^{-x^2}$$
Problem Set
t t 2 dt
3y
1 2 3
3
(i)
1
e x dx (ii) (iii)
1
dy
0
10
d x 2 d u v2 d u 1
2. Find, (i) t dt
dx 0
(vi)
du u
e dv (iii)
du u
x4 1
dx
4. Find the area intercepted between the line 3 x 2 y 12 and the parable
3 2
y x
4
5. Find the area between the parabolas; y 2 4ax and x 2 4ay, a 0
6. Prove that
f ( x)dx 2 f ( x)dx, If f ( 2a x ) f ( x )
2a a
0 0
=0 If f ( 2a x ) f ( x )
7. Evaluate
1 3000
I f ( t )dt
1
(i) (t
0
t 4 t )dt (ii)
2000 1000
3000000
Given F ( t ) 4000 t
t
1 b
8. Prove that F ( t* ) f ( t )dt
ba a
If f (t ) is continuous function over the interval [a,b] and t* ( a ,b )
H int : Put F( t ) f ( x )dx
t
e2 1 4 39
1. (i) (ii) (iii)
e 3 10
1
2. (i) x2 (ii) 2e
u 2
(iii)
2 u4 1
3. 32 sq. units
4. 27 sq. units
16 3
5. a sq. units
3
13
7. (i) (ii) I 352
12
1.4 ECONOMIC APPLICATION OF INTEGRATION
Introduction
Integration has an important role in economics. The present section shows the role
of integration in economics by illustrating some important examples.
If f(r) is the income distribution function of individuals over the interval [a, b], then the number of
individuals with incomes in [a, b]
$= n\int_a^b f(r)\,dr$
Total income of individuals earning between a and b $= n\int_a^b rf(r)\,dr$
so that the mean income of these individuals is
$$m = \frac{\int_a^b rf(r)\,dr}{\int_a^b f(r)\,dr}$$
Example 1: If the income distribution of the population over the interval [a, b] is given by
$f(r) = Ar^{-5/2}$ {A is a positive constant}, then determine the mean income in the
given group.
Solution: Let
$$\int_a^b f(r)\,dr = \int_a^b Ar^{-5/2}\,dr = A\left[-\frac{2}{3}r^{-3/2}\right]_a^b = \frac{2}{3}A\left(a^{-3/2} - b^{-3/2}\right)$$
And
$$\int_a^b rf(r)\,dr = \int_a^b Ar\cdot r^{-5/2}\,dr = A\int_a^b r^{-3/2}\,dr = 2A\left(a^{-1/2} - b^{-1/2}\right)$$
$$m = \frac{2A\left(a^{-1/2} - b^{-1/2}\right)}{\frac{2}{3}A\left(a^{-3/2} - b^{-3/2}\right)} = \frac{3\left(a^{-1/2} - b^{-1/2}\right)}{a^{-3/2} - b^{-3/2}}$$
Now, suppose b is very large; then $b^{-1/2}$ and $b^{-3/2}$ are close to zero, and $m \approx 3a$.
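The mean-income formula for the distribution $f(r) = Ar^{-5/2}$ can be evaluated for concrete income bounds. In the sketch below the bounds a = 1 and b = 4 are assumed values chosen for illustration, and the function name `mean_income` is hypothetical; the constant A cancels, so it does not appear.

```python
def mean_income(a, b):
    """Mean income m = 3(a^(-1/2) - b^(-1/2)) / (a^(-3/2) - b^(-3/2)) for f(r) = A r^(-5/2)."""
    total_income = 2 * (a**-0.5 - b**-0.5)          # integral of r f(r), divided by A
    head_count = (2 / 3) * (a**-1.5 - b**-1.5)      # integral of f(r), divided by A
    return total_income / head_count

m_small = mean_income(1.0, 4.0)      # assumed bounds for illustration
m_large = mean_income(1.0, 1e12)     # upper bound very large

print(m_small, m_large)
```

As the upper bound grows, the mean approaches three times the lower bound, matching the limiting statement above.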
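Consumer surplus (CS) and producer surplus (PS): These can also be
calculated by using the definite integral. Consumer surplus is given by
$$CS = \int_0^{x_0} f(x)\,dx - p_0x_0$$
where p = f(x) is the demand function and $(x_0, p_0)$ are the quantity and price at which the good is bought.
The present discounted value (PDV) of a continuous income stream f(t) over [0, T] at interest rate r is
$$PDV = \int_0^T f(t)e^{-rt}\,dt$$
and its discounted value (DV) at time s is
$$DV = \int_{t=s}^{T} f(t)e^{-r(t-s)}\,dt$$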
Example 2: Find total cost function from the given marginal cost function;
MC f '(q) 2 3q1/2 5 / q 1/2 , Given; f(1) = 11
q3/2 q1/2
Solution: F (q) f '(q)dq (2 3q1/2 5q 1/2 )dq 2q 3. 5. c
3/ 2 1/ 2
TC 2q 2q3 / 2 c
Example 3: If the marginal revenue function is given; Pm= ,
(x )
2
Then, show that P= is the demand law
(x )
dR
Solution: R P.x and MR=
dx
R MR.dx dx
(x )
2
( x )1
R x A
1
R P.x x A,
(x )
We know that if output x=0 then revenue is also zero. Then A =
x
R P x x x
(x ) (x )
OR, P , Hence proved.
x
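Example 4: If marginal revenue (MR) = $16 - q^2$, find the maximum total revenue; also
find the total revenue, average revenue and demand functions.
Total revenue is at its maximum where MR = 0: $16 - q^2 = 0 \Rightarrow q = 4$
$$TR_{max} = \int_0^4 MR\,dq = \int_0^4 (16 - q^2)\,dq = \left[16q - \frac{q^3}{3}\right]_0^4 = \frac{128}{3}$$
Total Revenue (TR) $= \int (16 - q^2)\,dq = 16q - \dfrac{q^3}{3} + c$;
when q = 0, TR = 0, so c = 0.
Average Revenue (AR) $= \dfrac{TR}{q} = 16 - \dfrac{q^2}{3}$, which is also the demand function, since p = AR.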
dc 0.001 2
Solution: C .dy ( 0.5 0.001 y 0.5 y y A
dy 2
Example 6: The sales of a product are depicted by the function $S(t) = 100e^{-0.5t}$, where t is
the number of years since the launch of the product, find
Solution: a) $\int_0^3 100e^{-0.5t}\,dt \approx 155.4$
b) S(4) − S(3): $\int_3^4 100e^{-0.5t}\,dt \approx 17.6$
e) $\int_0^{\infty} 100e^{-0.5t}\,dt = 200$
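The three sales totals can be reproduced from the anti-derivative of the sales function. This is a minimal sketch: the function name `cumulative_sales` and its default parameters are illustrative choices, and the reading of each part as "total sales over an interval" is inferred from the integrals in the solution.

```python
import math

def cumulative_sales(t0, t1, rate=0.5, scale=100.0):
    """Integral of scale * exp(-rate * t) from t0 to t1, using the anti-derivative."""
    return (scale / rate) * (math.exp(-rate * t0) - math.exp(-rate * t1))

first_three_years = cumulative_sales(0, 3)
fourth_year = cumulative_sales(3, 4)
lifetime = cumulative_sales(0, math.inf)

print(first_three_years, fourth_year, lifetime)
```

The first two values agree with the figures in the solution after rounding, and the lifetime total is the finite limit 100/0.5 = 200.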
Solution: Given, P 30 2 x x2 30
For x 3, then p = 20
3
CS =
0
(30 2 x x 2 )dx P x
3
2 x2 x3
=
30 x 2 3 3 20
0
= 90-9-9-60=12 units
Example 8: The demand and supply laws are $P_d = (6 - x)^2$ and $P_s = 14 + x$ respectively. Find
the consumer surplus (CS), if:
(i) The demand and price are determined under perfect competition; and
(ii) The demand and price are determined under monopoly and the supply function is
identified with the marginal cost function.
(i) Under perfect competition, $P_d = P_s$: $(6 - x)^2 = 14 + x \Rightarrow x^2 - 13x + 22 = 0 \Rightarrow x = 2$
Then, P = 14 + x = 16
$$CS = \int_0^2 (36 - 12x + x^2)\,dx - 16 \cdot 2 = \frac{56}{3}$$
(ii) Under monopoly, $TR = P_d \cdot x = (36 - 12x + x^2)x = 36x - 12x^2 + x^3$
$MR = 36 - 24x + 3x^2$
And supply price: $P_s = 14 + x$, with the supply function $P_s$ = MC
MR = MC
$36 - 24x + 3x^2 = 14 + x$
i.e. x = 1, or 7.33
At x = 1, $P_d = 25$
Hence, $CS = \int_0^1 (36 - 12x + x^2)\,dx - 25 \cdot 1 = \dfrac{16}{3}$ units
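Both consumer-surplus figures can be recomputed step by step: solve the relevant quadratic for output, then subtract expenditure from the area under the demand curve. In this sketch the helper names are hypothetical, and taking the smaller root of each quadratic follows the choice made in the example.

```python
import math

def demand(x):
    return (6 - x) ** 2

def supply(x):
    return 14 + x

def smaller_root(a, b, c):
    """Smaller real root of a*x^2 + b*x + c = 0 (assumes a > 0 and a positive discriminant)."""
    disc = b * b - 4 * a * c
    return (-b - math.sqrt(disc)) / (2 * a)

def consumer_surplus(x):
    """Area under the demand curve from 0 to x, minus the amount actually paid."""
    area_under_demand = 36 * x - 6 * x**2 + x**3 / 3
    return area_under_demand - demand(x) * x

# Perfect competition: (6 - x)^2 = 14 + x  ->  x^2 - 13x + 22 = 0
x_comp = smaller_root(1, -13, 22)
# Monopoly: MR = MC  ->  3x^2 - 25x + 22 = 0
x_mono = smaller_root(3, -25, 22)

cs_comp = consumer_surplus(x_comp)
cs_mono = consumer_surplus(x_mono)
print(x_comp, cs_comp, x_mono, cs_mono)
```

Output falls from 2 to 1 and consumer surplus from 56/3 to 16/3 when the market moves from perfect competition to monopoly.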
D = 20 − 4x and S = 4 + 4x
In equilibrium, 20 − 4x = 4 + 4x
or, 8x = 16
then, x = 2
and, P = 4 + 8 = 12
$$PS = P \cdot x - \int_0^2 (4 + 4x)\,dx = 24 - \left[4x + 2x^2\right]_0^2 = 24 - 16 = 8 \text{ units}$$
Problem Set
1. If the inverse demand function of commodity Q is given by $P = 3q^{-1/2}$ and presently 100
units are being sold, then find the consumer surplus.
Ans. 30
2. Let the interest rate vary over time and be represented by r(t). What is the present value of a flow
of income P(t) from t = a to t = b using this variable interest rate?
Ans. $\int_a^b e^{-\int_a^t r(s)\,ds}\,P(t)\,dt$
REFERENCES
Allen, R.G.D., Mathematical Analysis for Economists, London: Macmillan and Co. Ltd.
Chiang, Alpha C., Fundamental Methods of Mathematical Economics, New York: McGraw-Hill.
Simon, Carl P. and Blume, Lawrence, Mathematics for Economists, London: W. W. Norton & Co.
Sydsaeter, Knut and Hammond, Peter J., Mathematics for Economic Analysis, Prentice Hall.
Hoy, Michael, Livernois, John, McKenna, Chris, Rees, Ray and Stengos, Thanasis, Mathematics for
Economists, Addison-Wesley Publishers Ltd.
DC-1
Semester-II
Contents
2.1Graphical Presentation
2.4 Steps in the construction of histogram when class intervals are unequal
4. Limitations of Statistics
5.1 Population
5.2 Sample
7. Exercises
8. References
1. Learning Outcomes
1. Introduction of Statistics.
2. Characteristics of Statistics.
2. Introduction to Statistics
It is widely employed in various activities of business, government, and the natural and
social sciences. It is not only facts and figures; it refers to a range of techniques and
procedures for analyzing, interpreting, displaying, and making decisions based on data.
Hence, there are five stages in a statistical investigation, explained in the following diagram, i.e.
(1) Collection of data: This is the first step and is the foundation of statistical analysis.
Therefore, data should be gathered with maximum care by the investigator himself (primary
data) or obtained from reliable published or unpublished sources (secondary data).
(2) Organization of data: Data must be organized by editing, classifying and tabulating, so
that the collected information is easily accessible.
(3) Presentation of data: Organized data must be presented in some systematic manner
so that statistical analysis becomes easier. Data can be shown with the help of tables,
graphs, diagrams, etc.
(4) Analysis of data: After collection, organization and presentation, the data are analysed by
various methods such as averages, dispersion, correlation, etc.
(5) Interpretation of data: The last step is the interpretation of data, which means drawing conclusions
on the basis of the analysis of the data. On the basis of these conclusions, various decisions can be taken.
The word "statistics" is commonly used in two ways. In the first, "statistics" is used in the
plural sense, meaning numerical facts or data, and is called "Descriptive Statistics". It deals with
collecting, analyzing, and clarifying data which are otherwise quite unwieldy and immense.
It seeks to do this in such a way that significant conclusions can be easily drawn from
the data. It may thus be seen as encompassing methods that bring out and highlight the
latent characteristics present in a set of numerical data. It makes the data easier to
understand and allows them to be reported systematically, in a manner that makes them
manageable for further consultation, investigation, and evaluation.
For example, the NSSO reports that the population of India was 449.6 million in 1960; 555.2
million in 1970; 699 million in 1980; 868.9 million in 1990; 1.042 billion in 2000 and 1.206
billion in 2010. This information is an example of descriptive statistics. It is still
descriptive statistics if we calculate the percentage growth from one decade to the next.
However, it is not descriptive statistics if we use these figures to estimate the population
for the year 2020, or the percentage growth of population from 2010 to 2020, because
the figures are then being used not to describe the past population but to predict the future
population.
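The descriptive calculation mentioned above, percentage growth from one decade to the next, can be sketched as follows. The figures are the ones quoted in the text (converted to millions); the names `population_millions` and `decade_growth` are illustrative.

```python
# Population figures quoted in the text, in millions.
population_millions = {
    1960: 449.6,
    1970: 555.2,
    1980: 699.0,
    1990: 868.9,
    2000: 1042.0,
    2010: 1206.0,
}

def decade_growth(pop):
    """Percentage growth between consecutive census years."""
    years = sorted(pop)
    return {
        (start, end): 100 * (pop[end] - pop[start]) / pop[start]
        for start, end in zip(years, years[1:])
    }

growth = decade_growth(population_millions)
for (start, end), pct in growth.items():
    print(f"{start}-{end}: {pct:.2f}%")
```

This only summarises observed data, so it stays within descriptive statistics; extrapolating these growth rates to 2020 would be an inferential step.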
Masses of unorganized data (e.g., census of population, earnings of workers, etc.) are of
little value. However, statistical methods are available to arrange this sort of data into a
useful form. Data can be arranged into a frequency distribution, and different graphs may be
used to describe the data. A well-thought-out, analytical grouping of data makes it easy to
describe the hidden characteristics of the data by means of a variety of summary measures.
These include measures of central tendency, dispersion, etc., which make up the necessary scope
of descriptive statistics.
Today, with the development of probability theory, statistics is used to make predictions or
comparisons about a totality of observations (the population) using data collected from a
very small portion of that population. This technique is called "Inferential Statistics". It is
also known as Inductive Statistics or Statistical Inference.
It is the technique of drawing conclusions from a set of data that is subject to random
variation, for instance sampling variation. More particularly, the term inferential statistics is
used to describe systems of procedures that help in drawing conclusions from
datasets arising from systems influenced by random variation, for example experimental
errors, random sampling, or random experimentation. The first and foremost requirements of
such a system of procedures for inference and induction are that it should be able to
provide reasonable answers when applied to well-defined situations and that it should be
general enough to be applied to all types of situations. These statistics are basically used to
test hypotheses and make estimations using sample data. The two branches of inferential
statistics are estimation and hypothesis testing. The result of inferential statistics can be
We often present statistical information in graphical form. A graph is useful for
capturing the reader's attention and for portraying a large amount of information. It can
be used to illustrate the way in which one property changes when some other property
undergoes a measured change. There are a number of types of graphs for the visualization of
data. They are given below:
Bar Graph: A graphical method of presenting qualitative data that have been
summarized in a frequency distribution or a relative frequency distribution.
Pie Chart: A graphical device for presenting qualitative data by subdividing a circle
into sectors that correspond to the relative frequency of each class.
Dot Plot: A graphical presentation of data, where the horizontal axis shows the range
of data values and each observation is plotted as a dot above the axis.
Histogram: A graphical method of presenting a frequency or a relative frequency
distribution or a density distribution.
Ogive: A graphical method of presenting a cumulative frequency distribution or a
cumulative relative frequency distribution.
Scatter Diagram: A graphical method of presenting the relationship between two
quantitative variables. One variable is shown on the horizontal and the other on the
vertical axis.
A frequency distribution is simply a grouping of the data, generally in the
form of a frequency distribution table, which gives a clearer picture than the individual
values. The most usual graphical presentation is a histogram or a frequency polygon.
A Histogram is a pictorial method of representing data. It appears similar to a Bar Chart but
has two fundamental differences:
1. Type of data: Bar graphs are usually used to display "categorical data", that is data that
fits into categories. Histograms on the other hand are usually used to present "continuous
data", i.e. data that represents measured quantity where, at least in theory, the numbers
can take on any value in a certain range.
2. Presentation of Data: Bar graphs and histograms also differ in the way they are
drawn. The bars in bar graphs are usually separated, whereas in histograms the bars
are adjacent to each other. The first rule is not always followed: sometimes bar graphs are
drawn with no spaces between the bars, but histograms are never drawn with spaces between the
bars.
children frequency
1 3
2 8
3 10
4 2
5 3
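For the construction of the histogram, the relative frequency has to be calculated for each
value, which is equal to:
$\text{Relative frequency} = \dfrac{\text{frequency of the value}}{\text{total number of observations}}$
children frequency relative frequency
1 3 3/26 ≈ 0.12
2 8 8/26 ≈ 0.31
3 10 10/26 ≈ 0.38
4 2 2/26 ≈ 0.08
5 3 3/26 ≈ 0.12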
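The relative frequencies in the table can be checked with a short Python sketch. The names `frequencies` and `relative_frequencies` are illustrative only; the data are the children/frequency table of the example.

```python
# Frequency table from the example: number of children -> frequency.
frequencies = {1: 3, 2: 8, 3: 10, 4: 2, 5: 3}


def relative_frequencies(freq):
    """Divide each frequency by the total number of observations."""
    total = sum(freq.values())
    return {value: count / total for value, count in freq.items()}


rel = relative_frequencies(frequencies)
for value, share in rel.items():
    # Rounded to two decimals, as in the table.
    print(value, frequencies[value], round(share, 2))
```

The relative frequencies add up to one, which is a quick check that no class has been left out.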
A histogram is constructed by drawing rectangles for each class of data. The height of each
rectangle is the frequency or relative frequency of the class. The width of each rectangle is
the same and the rectangles touch each other. The corresponding histogram of the above
example is:
Steps in construction
Intervals are often left the same width but if the data is scarce at the extremes then
classes may be joined.
If the intervals are not all the same width, calculate the frequency densities.
Hand drawn histograms usually show the frequency or frequency density vertically.
Histogram with unequal class intervals:
When classes have unequal widths, the vertical axis of a histogram must represent not
frequency (number of occurrences) but frequency density (frequency divided by the
class width), and the class widths must be accurately represented on the horizontal axis, so
that the area of each bar (not its height) represents the frequency of that class. The
frequency density shows the number of units vertically for every unit horizontally.
2.4 Steps in the construction of histogram when class intervals are unequal
1. Divide the frequency (or relative frequency) of each class by the corresponding class
width to get the frequency density.
2. Construct the histogram with frequency density as the height of each rectangle and
the class interval as its base.
Note:
• If the frequency distribution is inclusive, convert it into an exclusive one.
• If mid values are given, find out the lower and upper limits of the various
classes before constructing the histogram.
Consider the following example:
Score Frequency
0 - 50 25
50 - 60 10
60 - 100 20
Solution:
Score Frequency Density
0 - 50 25 25/50 = 0.5
50 - 60 10 10/10 = 1.0
60 - 100 20 20/40 = 0.5
When the density is computed from relative frequencies (relative frequency divided by class
width), the area of each rectangle is the relative frequency of the corresponding class. Since
the relative frequencies always add up to one, the total area of all rectangles in such a
density histogram is equal to one.
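As a rough check of the score example, the following Python sketch computes the frequency density of each class and, separately, the density based on relative frequencies, whose rectangles have a total area of one. The function names are illustrative assumptions, not part of the original text.

```python
# Score example: (lower limit, upper limit, frequency) for each class.
classes = [(0, 50, 25), (50, 60, 10), (60, 100, 20)]


def frequency_densities(classes):
    """Frequency divided by class width, as in the solution table."""
    return [freq / (upper - lower) for lower, upper, freq in classes]


def relative_densities(classes):
    """Relative frequency divided by class width."""
    total = sum(freq for _, _, freq in classes)
    return [(freq / total) / (upper - lower) for lower, upper, freq in classes]


densities = frequency_densities(classes)
rel_dens = relative_densities(classes)

# Area of a rectangle = height (density) x base (class width).
total_area = sum(
    d * (upper - lower) for d, (lower, upper, _) in zip(rel_dens, classes)
)
print(densities, total_area)
```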
Symmetric: A histogram is symmetric if its right half is exactly equal to its left half,
i.e., the two sides of the distribution are mirror images of each other. For example, in a
normal distribution, points are as likely to occur on one side of the average as on the other.
A bimodal histogram has two peaks. This happens when the data contain two different kinds of
individuals or objects. For example, a distribution of production data from a two-shift
operation might be bimodal, if each shift produces a different distribution of results.
A distribution is said to be positively skewed when the upper tail is stretched towards the
right compared with the left side, which means that the majority of the data has values
towards the lower end of its range. Most of the values tend to cluster toward the left side of
the x-axis (i.e. the smaller values), with increasingly fewer values at the right side of the
x-axis (i.e. the larger values).
For example, the distribution of personal income is positively skewed. Also, raw scores on
most measures of psychopathology are positively skewed.
A distribution is said to be negatively skewed when the tail on the left side of
the histogram is longer than that on the right side. Most of the values tend to cluster
toward the right side of the x-axis (i.e. the larger values), with increasingly fewer values on
the left side of the x-axis (i.e. the smaller values). The lower tail is stretched towards the
left, which means that the majority of the data has values towards the upper end of its
range. For example, a distribution of analyses of a very pure product would be negatively
skewed, because the product cannot be more than 100 percent pure.
Frequency polygon
It is constructed by joining the midpoints at the top of each column of the histogram. The
final section of the polygon often joins the midpoint at the top of each extreme rectangle to
a point on the x-axis half a class interval beyond the rectangle. This makes the area
enclosed by the polygon the same as that of the histogram.
In a recent campaign, many airlines reduced their summer fares in order to gain a
larger share of the market. The following data represent the prices of round-trip tickets
from Atlanta to Boston for a sample of nine airlines.
Answer: The dot plot is one of the simplest graphical presentations of data. The horizontal
axis shows the range of data values, and each observation is plotted as a dot above the
axis. The figure shows the dot plot for the above data. The four dots shown at the value of
160 indicate that four airlines were charging Rs.160 for the round-trip ticket from Atlanta to
Boston.
Answer (i)
Answer (ii)
Figure: Ogive for the cumulative frequency distribution of the waiting times at First County
Bank (horizontal axis: waiting times)
Pie Graphs
A pie graph illustrates how a whole is separated into parts. The data are presented in a
circle such that the area of the sector representing each category is proportional to the
part of the whole that the category represents.
For example, a circle graph is shown in Data Analysis given below. The title of the graph is
“United States Production of Photographic Equipment and Supplies in 1971”. There are 6
categories of photographic equipment and supplies represented in the graph.
Statistics deals with every aspect of human activity. It holds an important position
in different fields like Commerce, Industry, Chemistry, Economics, Mathematics, Biology,
Botany, Psychology, Astronomy etc. The application of statistics is therefore very wide.
A number of economic problems can be understood more easily by the use of statistical tools,
which also help in the formulation of economic policies. Statistical data and advanced
techniques of statistical analysis are immensely useful in the solution of a variety of
economic problems relating to production, consumption, distribution of income, wealth, prices,
saving, investment, unemployment etc. For instance, an analysis of the consumption pattern of
the people, based on relevant information collected from them, may reveal how income is spent
on different heads of consumption.
Statistical study is a quantitative tool widely used in economics. It is needed in a variety
of ways: to determine the validity of economic theories through the study of empirical
real-world data; to explain cause-effect relationships between variables so as to assist in
the making of sound public policy; to forecast future economic conditions in order to reduce
uncertainty in business and public policy decisions; and to fit mathematical models to
actual data.
National income accounts are a multipurpose measure for administrators and economists.
Various statistical measures are used for the construction of these accounts. In economic
research, statistical measures are used for collecting, organizing and analysing data
and for testing hypotheses on them.
In short, statistics is very useful in every field of economics. It provides facts and
direction for solving a problem, supports the evolution of economic laws, and also helps in
economic planning.
4. Limitations of Statistics
Despite the usefulness of statistics in almost all sciences (social, physical and natural),
one should not carry the impression that statistics is a kind of magic which always gives
accurate answers to problems. In spite of the wide scope of the subject it has
certain limitations, and if the data are not properly collected or interpreted
there is always a chance of drawing wrong conclusions. Therefore, it is necessary to know
the limitations of statistics. Some important limitations of statistics are the following:
Statistics deals with aggregates of facts. Single or isolated figures are not statistics.
Data are statistical when they relate to the measurement of masses, and not statistical
when they relate to an individual item or phenomenon as a separate entity. This is
considered to be a major handicap of statistics.
Statistics are numerical statements of facts and figures. They are not applicable to the
study of facts that are not quantitatively measurable, since such attributes cannot
be expressed in numbers. Qualitative phenomena, e.g., honesty, intelligence,
poverty, etc., cannot be studied in statistics unless these attributes are expressed in
terms of numerals. So, the qualitative aspect of a variable, or a subjective
phenomenon, falls outside the scope of statistics. This limits the scope of the
subject.
Statistical laws are not exact, as the laws of the natural sciences are. The conclusions
obtained are not universally true; they are true only under certain conditions and on
average. This is because statistics as a science is less exact
than the natural sciences, which limits its practical utility.
5.1 Population
The term "Population" normally means the persons in a town, region, state, or country and their
respective attributes such as gender, age composition, marital status, education and so on.
In statistics, the term "population" is used in a different sense. It is concerned not only
with the number of people living in an area; it also covers a population of households, or a
population of events, objects, procedures or observations, including services like visits to
the doctor or surgical operations. A population is thus a totality of creatures, events,
things, cases and so on. In short, a unit of a population is whatever you count or measure.
Normally, a population is relatively large in size, and it is hard to infer its attributes by
considering its elements individually. It may be impossible to survey the entire
population because not all of its members are observable. Even if it is possible to reach the
entire population, doing so is very costly and time consuming. Alternatively, the researcher
can take a subset of this population, called a sample. By using this sample, conclusions can
be drawn about the population under study.
A census may be preferred when the size of the population is not too large. It may also be
desirable to take recourse to a census where the respondents are not widely scattered, or
where the reliability required of the data makes a census unavoidable.
A conceptual population consists of all the values that might possibly have been observed.
It is contrasted with a tangible population, which consists of actual physical objects. For
example, a geologist weighs a rock several times on a
sensitive scale. Each time, the scale gives a different reading. Here the population is
conceptual because it consists of all the readings that the scale could in principle produce.
5.2 Sample
Example: If a researcher wants to find out the mean height of the students in a particular
class room, then the students in that room represent the population. But if the researcher
wants to find out the mean height of the students in the whole college, the students in
that particular room represent a sample of the students in that college. The basic
unit of the population is called an element of the population. Each student is an element of
the college population. Thus, a population is the totality of elements being studied and a sample is
The interesting relationship between population and sample is that the population can exist
without sample, but sample may not exist without population; thus, sample depends upon
population. A sample is not studied for its own sake. The basic objective of its study is to
draw inference about the population.
Samples are essential because in many kinds of research it is impractical (from
both a strategic and a resource perspective) to examine all the members of a particular
population. Census taking is often expensive and too time
consuming to provide information when it is needed. It is also not feasible to include the
whole population when elements are destroyed in order to obtain information. Instead, a few
participants (the sample) are selected in such a way that the sample is truly
representative of the population. The result obtained from the sample can then be
generalized to the population, i.e., information on a smaller group of participants is used
to infer about the group of all participants.
Normally, certain attributes of the items in the population are examined, for
example, the mean height of the children in a village. A characteristic may be categorical,
such as gender, or it may be numerical. In the former case, the value of the
characteristic is a category, for example female, whereas in the latter case the value is a
number, for example age = 40 years. A variable is a characteristic that may assume more
than one value, to which a numerical measure can be assigned.
Data result from making observations either on a single variable or simultaneously on two
or more variables.
Univariate data are data where researchers observe only one aspect of a
population or sample at a time, e.g. the height of students. With bivariate data,
researchers observe two aspects, and with more than two variables the data are multivariate,
e.g. height, weight, and age of students.
When a sample is obtained from a population, an investigator frequently uses the sample
information to draw some type of conclusion about the population. It is imperative that the
sample be representative of the group to which the conclusion is generalized. This branch of
statistics is called inferential statistics.
In order to use statistics to learn things about the population, the sample must be random.
A random sample is one in which every element of the population has a fair chance
of being chosen. The most commonly used sample is a simple random sample. It requires
that every possible sample of the selected size has an equal chance of being used.
Since simple random sampling does not always ensure a representative sample when the
population is heterogeneous, a sampling technique called stratified random sampling is
used in that case. A sample selected by this technique is more representative of the
population. The method can only be used when the population can be
divided into a number of distinct "strata" or groups. In stratified sampling, you first
identify the members of the population who belong to each group. Then, from each of the
sub-groups (called strata), you randomly select a sample in such a manner that the sizes of
the subgroups in the sample are proportional to their sizes in the population.
The strata used in stratified sampling must not overlap. Overlapping
subgroups would give some elements a higher chance of being selected in the
sample. If this happened, it would not be a proper probability sample.
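A minimal Python sketch of proportional stratified sampling follows. The strata names and sizes are hypothetical, and the function `stratified_sample` is an illustrative helper rather than a standard library routine; it assumes the strata do not overlap.

```python
import random


def stratified_sample(strata, sample_size, seed=None):
    """Draw a simple random sample from each stratum, with stratum sample
    sizes proportional to stratum sizes in the population."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation, rounded to a whole number of elements.
        k = round(sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical college of 1,000 students split into three strata.
strata = {
    "first_year": [f"F{i}" for i in range(600)],
    "second_year": [f"S{i}" for i in range(300)],
    "third_year": [f"T{i}" for i in range(100)],
}
picked = stratified_sample(strata, sample_size=100, seed=1)
sizes = {name: len(members) for name, members in picked.items()}
print(sizes)
```

Because of rounding, the stratum sample sizes may not add up exactly to the requested total when the proportions are not whole numbers; here they do.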
In a systematic sample, the items of the population are arranged in a list and then
every nth item in the list is selected for inclusion in the sample. For instance,
if the population of a research study consists of 5,000 students in a particular college and
the researcher requires a sample of 500 students, the students would be put into a list
and then every 10th student would be selected for the sample. To guard against any
possibility of human bias in this method, the researcher should pick the first element at
random. This is called a 'systematic sample with a random start'.
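The 5,000-student example can be sketched in Python as follows. The helper `systematic_sample` is an illustrative name, and the sketch assumes that the population size is an exact multiple of the sample size.

```python
import random


def systematic_sample(items, sample_size, seed=None):
    """Select every k-th item of the list after a random start."""
    rng = random.Random(seed)
    step = len(items) // sample_size      # 5000 // 500 = 10
    start = rng.randrange(step)           # random start within the first k items
    return items[start::step][:sample_size]


students = list(range(1, 5001))           # roll numbers 1 to 5000
sample = systematic_sample(students, 500, seed=7)
print(len(sample), sample[:3])
```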
A statistic is a statistical measure calculated from sample data. It is a value that expresses
a characteristic of a sample, for example the sample mean, and it is also used to infer about
the corresponding population parameter. Hence, a sample should represent the entire
population.
Consider a set of n data, x1, x2, …, xn. If this set of data represents a population, then its
mean value and standard deviation are given by:
$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$
$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2}$
Whereas, if this same set of data represents a sample, then its mean value and its standard
deviation are given by:
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$
Here, the formula for the mean is the same whether the data are for a population or a sample.
But the formula for the standard deviation depends on the interpretation of the data as a
population or a sample.
A statistic and a parameter are closely related. A parameter is a numerical value that
describes a characteristic of an entire population; measures such as the mean and standard
deviation of the whole population are known as population parameters. A
statistic is a numerical value that describes a sample but not the whole population.
Inferential statistics allows a researcher to make an informed guess about a
population parameter based on a statistic computed from a sample randomly drawn from that
population (see Figure).
Figure: Shows the relationship between sample and population with their statistical
measures:
For instance, suppose an investigator studies the population of dogs in Delhi. The mean
height of all the dogs in the town is a population "parameter". If the investigator selects a
sample of 50 dogs from the town and computes the mean height of the dogs in that sample,
the result is a 'statistic'.
The value of a parameter remains constant. It is not a random variable, because the
units in a population remain the same. A statistic, on the other hand, is a random variable
whose value varies from sample to sample, because the units selected in two or more samples
are not the same and different samples give different values. Variation in the value of a
statistic is called sampling fluctuation.
The use of statistics is a means to an end. A statistic is a quantity that helps to determine
the unknown parameters of a population on the basis of only a few observations.
It is used to estimate the degree to which sample statistics approximate the population
parameters. The investigator is basically concerned with the population and the estimated
parameters corresponding to that population. However, the investigator cannot obtain these
7. Exercise:
1. How does statistics help in the solution of economic problems? Explain with
examples.
Ans: The solution of various economic problems can be better analysed and understood
with the help of different statistical tools, as discussed in detail with examples in
section 2 of the chapter.
2. Explain the difference between a parameter and a statistic with the help of examples.
3. Explain the importance of Statistics in Economics. What are its limitations?
Ans: Statistics holds a central position in Economics; details are given in sections 2.2 and
2.3.
8. References:
1. Jay L. Devore, Probability and Statistics for Engineers, Cengage Learning, 2010.
2. John E. Freund, Mathematical Statistics, Prentice Hall, 1992.
3. Earl K. Bowen and Martin K. Starr, Basic Statistics for Business and Economics,
McGraw-Hill, 1983.
DC-1
Semester-II
TABLE OF CONTENTS
1 Learning Outcomes
2 Introduction
3 Types of Averages
3.1 Mathematical Averages
3.1a Arithmetic mean
3.2 Averages of position
3.2a Median
3.2b Mode
4. Measures of Dispersion
4.1 Range
4.2 Standard Deviation and Variance
5 Skewness, Moments and Kurtosis
6 References
1. Learning Outcomes
2. Introduction
One of the most widely used sets of summary figures is known as measures of location, which
are often referred to as averages or measures of central tendency or central location. The
purpose of computing the average value for a set of observations is to obtain a single value
which is representative of all items in the data set. This single value is the point of
location around which the observations of the sample (or population) cluster.
Statistical series may differ from each other in the following ways:
They may differ in the value of the variable around which most of the items cluster;
this can be measured by central tendency or averages.
They may differ in the extent to which items are dispersed around the central value;
this can be measured by measures of dispersion.
They may differ in the extent of departure from a normal distribution; this can
be measured by skewness and kurtosis.
3. Types of Averages
Mathematical averages:
a) Arithmetic mean
Averages of position:
a) Median
b) Mode
3.1 Mathematical Averages
Example: A sample of the marks in statistics of the top ten students of the class
was collected and is listed below. Calculate the sample mean.
Marks (out of 100): 99, 98, 96, 95, 94, 92, 89, 88, 86, 85
Solution:
$\bar{x} = \frac{\sum x_i}{n} = \frac{922}{10}$ = 92.2 marks.
Proof: Let =
Then =
=a or .
= i.e.
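Proof: $\bar{x} = \frac{\sum x_i}{n} = \frac{\sum (A + d_i)}{n}$, where $d_i = x_i - A$
$= \frac{\sum A + \sum d_i}{n}$ (A is a constant and $\sum A = nA$)
$= \frac{1}{n}\left(nA + \sum d_i\right)$
$\bar{x} = A + \frac{\sum d_i}{n}$
n = Number of items.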
Discrete Variables
$\bar{x} = A + \frac{\sum f d'}{n} \times h$ where $d' = \frac{x - A}{h}$
If the deviations are further divided by a common factor, and if this factor is represented
by i, then (step deviation method)
$\bar{x} = A + \frac{\sum f d'}{n} \times i$ where $d' = \frac{x - A}{i}$ and $n = \sum f$.
Continuous variables:
If the data set is for a continuous variable then the x's are the mid points of the classes.
If all the classes have the same interval then h is the class interval.
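Example: Calculate the mean marks obtained by the candidates from the following data:
Marks No. of candidates
10-20 7
20-30 15
30-40 18
40-50 25
50-60 30
60-70 20
70-80 16
80-90 7
90-100 2
Solution (taking A = 55 and h = 10):
Marks Mid-point (x) f d = x - 55 d' = d/10 f d'
10-20 15 7 -40 -4 -28
20-30 25 15 -30 -3 -45
30-40 35 18 -20 -2 -36
40-50 45 25 -10 -1 -25
50-60 55 30 0 0 0
60-70 65 20 10 1 +20
70-80 75 16 20 2 +32
80-90 85 7 30 3 +21
90-100 95 2 40 4 +8
Total n=140 -53
$\bar{x} = A + \frac{\sum f d'}{n} \times h$
$= 55 + \frac{-53}{140} \times 10$ = 51.21 marks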
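The step deviation calculation can be reproduced with a short Python sketch. It assumes the class 10-20 with frequency 7 as the first class, which is what the totals n = 140 and -53 in the solution imply; the function names are illustrative.

```python
# (lower limit, upper limit, frequency); the 10-20 class is an assumption
# inferred from the totals n = 140 and sum(f d') = -53.
classes = [
    (10, 20, 7), (20, 30, 15), (30, 40, 18), (40, 50, 25), (50, 60, 30),
    (60, 70, 20), (70, 80, 16), (80, 90, 7), (90, 100, 2),
]


def step_deviation_mean(classes, assumed_mean, width):
    """Mean = A + (sum of f*d' / n) * h, with d' = (mid-point - A) / h."""
    n = sum(f for _, _, f in classes)
    total = sum(f * (((lo + hi) / 2 - assumed_mean) / width) for lo, hi, f in classes)
    return assumed_mean + total / n * width


def direct_mean(classes):
    """Mean computed directly from the class mid-points, as a cross-check."""
    n = sum(f for _, _, f in classes)
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n


short_cut = step_deviation_mean(classes, assumed_mean=55, width=10)
direct = direct_mean(classes)
print(round(short_cut, 2), round(direct, 2))
```

Both methods give the same mean; the choice of the assumed mean A only affects how small the intermediate numbers are.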
In calculating the weighted arithmetic average, each value of the variable is multiplied by
its weight and the products so obtained are aggregated. The total is divided by the total of
the weights and the resulting figure is the weighted arithmetic average.
$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{w_1 x_1 + w_2 x_2 + \dots + w_n x_n}{w_1 + w_2 + \dots + w_n}$
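Example:
An examination was held to decide the award of a scholarship. The weights of various
subjects are different. The marks obtained by 3 candidates (out of 100 marks) are given
below. Who will get the scholarship?
Subject Weight A B C
Economics 4 87 85 90
Mathematics 3 90 88 80
Statistics 2 90 85 88
English 1 93 93 92
Solution:
Subject Weight (w) A B C wA wB wC
Economics 4 87 85 90 348 340 360
Mathematics 3 90 88 80 270 264 240
Statistics 2 90 85 88 180 170 176
English 1 93 93 92 93 93 92
Total 10 891 867 868
For A: $\bar{x}_w = \frac{891}{10}$ = 89.1
For B: $\bar{x}_w = \frac{867}{10}$ = 86.7
For C: $\bar{x}_w = \frac{868}{10}$ = 86.8
Candidate A has the highest weighted average and will therefore get the scholarship.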
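The weighted averages of the three candidates can be verified with the Python sketch below. The dictionary layout and the name `weighted_mean` are illustrative choices; the marks and weights are those of the example.

```python
weights = {"Economics": 4, "Mathematics": 3, "Statistics": 2, "English": 1}

marks = {
    "A": {"Economics": 87, "Mathematics": 90, "Statistics": 90, "English": 93},
    "B": {"Economics": 85, "Mathematics": 88, "Statistics": 85, "English": 93},
    "C": {"Economics": 90, "Mathematics": 80, "Statistics": 88, "English": 92},
}


def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    total = sum(weights[subject] * values[subject] for subject in weights)
    return total / sum(weights.values())


results = {name: weighted_mean(scores, weights) for name, scores in marks.items()}
winner = max(results, key=results.get)
print(results, winner)
```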
3.2(a) Median:
The second most popular measure of central location is the median. The median is found
by placing all the observations in order (ascending or descending). The observation that falls
in the middle of the series is termed the median. It means that the median divides the series
into two equal parts. The median of a sample is denoted by $\tilde{x}$ and the population
median is denoted by $\tilde{\mu}$.
Calculation of Median:
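Example:
Compute the median of the following items: 5, 7, 9, 12, 10, 8, 7, 15, 21
Solution:
Series in ascending order: 5, 7, 7, 8, 9, 10, 12, 15, 21
n = 9
Median ($\tilde{x}$) = Size of $\left(\frac{n+1}{2}\right)$th item
= $\left(\frac{9+1}{2}\right)$th item = 5th item
= 9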
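Example:
Compute the median of the following distribution of marks:
Marks (X) No. of students (f)
90 15
85 20
83 28
82 20
95 9
Solution:
Marks (X) f Cumulative frequency (C)
82 20 20
83 28 48
85 20 68
90 15 83
95 9 92
Median ($\tilde{x}$) = Size of $\left(\frac{N+1}{2}\right)$th item
= $\left(\frac{92+1}{2}\right)$th item = 46.5th item, i.e. 83.
The cumulative frequency C must be just greater than (N+1)/2. 48 is just greater than 46.5 and
the value of X (marks) corresponding to 48 is 83. Thus the median number of marks is 83.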
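The cumulative frequency rule used above can be sketched in Python as follows. The function `median_discrete` is an illustrative helper; it is a simplification that returns the first value whose cumulative frequency reaches the (N+1)/2 position and does not average two neighbouring values when the position falls exactly between them.

```python
def median_discrete(freq_table):
    """Median of a discrete frequency distribution given as (value, frequency) pairs."""
    rows = sorted(freq_table)                 # arrange values in ascending order
    n = sum(f for _, f in rows)
    position = (n + 1) / 2                    # position of the middle item
    cumulative = 0
    for value, f in rows:
        cumulative += f
        if cumulative >= position:            # first C just greater than the position
            return value


marks_table = [(90, 15), (85, 20), (83, 28), (82, 20), (95, 9)]
median_marks = median_discrete(marks_table)
print(median_marks)
```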
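Continuous Series:
$\tilde{x} = L + \frac{N/2 - C}{f} \times h$
where
$\tilde{x}$ = Value of median.
L = Lower limit of the median class.
N = Total frequency.
C = Cumulative frequency of the class preceding the median class.
f = Frequency of the median class.
h = Class interval.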
Further Classification
Trimmed Mean:
The mean is very sensitive to the outlying values in the data set, whereas the
median, being only the middle value, is not affected by them at all. Since extreme behaviour
of either type might be undesirable, we may need an alternative measure that is neither as
sensitive as the mean nor as insensitive as the median. The mean and median are opposite
extremes of the same family of measures. The mean is the average of all the data, whereas the
median is the middle value or the average of the two middle values. In other words, the mean
is calculated by trimming 0 per cent from each end of the sample, whereas for the median the
maximum possible amount is trimmed from each end. The trimmed mean is a compromise between
the mean and the median.
The trimmed mean has some of the advantages of both the mean and the median without some of
their disadvantages. The 10% trimmed mean is the mean computed by excluding the 10%
largest and 10% smallest values from the sample and taking the arithmetic mean of the remaining
80% of the sample. For example consider the data:
= = 7.375
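A trimmed mean can be computed with the minimal Python sketch below. The data set is hypothetical and is not the one used in the example above; `trimmed_mean` is an illustrative helper that drops a whole number of observations from each end.

```python
def trimmed_mean(values, proportion):
    """Mean after removing the given proportion of values from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)        # number of values trimmed per end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Hypothetical sample of 10 values with one large outlier.
data = [2, 4, 5, 6, 7, 7, 8, 9, 10, 40]

print(trimmed_mean(data, 0.0))   # ordinary mean: 0% trimmed from each end
print(trimmed_mean(data, 0.1))   # 10% trimmed mean: drops 2 and 40
```

With this data the ordinary mean is pulled up by the outlier 40, while the 10% trimmed mean stays close to the median.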
Quartile: The values which divide the data into four equal parts are known as
quartiles. There are three such points Q1, Q2, Q3, such that Q3 ≥ Q2 ≥ Q1, termed the
three quartiles. Q1, known as the first quartile, is the value which has 25% of the items
of the distribution below it and 75% of the items above it. It divides the first half of
the series into two equal parts. The second quartile Q2 coincides with the median and has an
equal number of observations above it and below it. Q3, the third quartile, has 75% of the
observations below it and 25% of the observations above it. It divides the second half of the
series into two equal parts.
$Q_1$ = $\left(\frac{N+1}{4}\right)$th item
$Q_3$ = $3\left(\frac{N+1}{4}\right)$th item
Deciles: These are the values which divide a series into 10 equal parts. There are 9
deciles, denoted by D1, D2, D3, ..., D9. The fifth decile is the median of the
series.
$D_1$ = value of $\left(\frac{N+1}{10}\right)$th item
$D_2$ = value of $2\left(\frac{N+1}{10}\right)$th item
Percentile: These values divide the series into 100 equal parts and there are 99
percentiles.
$P_1$ = value of $\left(\frac{N+1}{100}\right)$th item
$P_2$ = value of $2\left(\frac{N+1}{100}\right)$th item
$P_{99}$ = value of $99\left(\frac{N+1}{100}\right)$th item
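Quartiles, deciles and percentiles all follow the same pattern, which the Python sketch below illustrates. It assumes the (N+1)-based position rule, i.e. the k-th partition value is the item at position k(N+1)/parts, and it assumes linear interpolation between neighbouring items when the position is fractional; the helper name `partition_value` is illustrative.

```python
def partition_value(values, k, parts):
    """k-th partition value: item at position k * (N + 1) / parts (1-based)."""
    ordered = sorted(values)
    position = k * (len(ordered) + 1) / parts
    lower = int(position)                     # whole part of the position
    fraction = position - lower
    if lower < 1:
        return ordered[0]
    if lower >= len(ordered):
        return ordered[-1]
    # Interpolate between the lower-th and (lower + 1)-th items.
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


items = [5, 7, 9, 12, 10, 8, 7, 15, 21]      # series from the median example
q1 = partition_value(items, 1, 4)
q2 = partition_value(items, 2, 4)
q3 = partition_value(items, 3, 4)
d5 = partition_value(items, 5, 10)
print(q1, q2, q3, d5)
```

The second quartile and the fifth decile both coincide with the median of the series.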
3.2(b) Mode:
Mode is the most common item of the series. The mode is defined as the observation
(or observations) that occurs with the greatest frequency. Mode is denoted by ‘Mo’.
For example, in the following series the value of the mode is 25:
Value: 10 12 15 25 30 50
Frequency: 5 15 20 40 10 8
Some series are bimodal for example:
Marks: 30 35 40 45 50 60
No. Of student: 5 15 25 25 14 6
In this case mode: 40, 45 (two modes)
However, the maximum frequency does not necessarily signify the maximum
frequency density in all cases. For example:
Value: 10 12 14 16 18 20 22
Frequency: 50 80 70 78 77 30 15
In this series the value 12 has the highest frequency, 80, but it is not the mode of the
series because the maximum frequency density is around the value 16. The values on
either side of 16 have fairly large frequencies. The values 14, 16 and 18 account for a
frequency of 225 (70 + 78 + 77) out of a total of 400. So the concentration of frequencies
is around the value 16 rather than 12.
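Mode = $L_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times (L_2 - L_1)$
f1 = Stands for the frequency of the modal class.
f0 = Stands for the frequency of the preceding class.
f2 = Stands for the frequency of the succeeding class.
L1 = lower limit of modal class.
L2 = upper limit of modal class.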
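As an illustration, the Python sketch below applies the usual grouped-data mode formula, Mode = L1 + (f1 - f0)/(2f1 - f0 - f2) × (L2 - L1), where f1 is the frequency of the modal class. The frequencies 25, 30 and 20 are those of the classes 40-50, 50-60 and 60-70 in the marks distribution used earlier for the mean; the function name is illustrative.

```python
def grouped_mode(l1, l2, f0, f1, f2):
    """Mode of a grouped distribution.

    l1, l2: lower and upper limits of the modal class
    f0, f1, f2: frequencies of the preceding, modal and succeeding classes
    """
    return l1 + (f1 - f0) / (2 * f1 - f0 - f2) * (l2 - l1)


# Modal class 50-60 (f1 = 30), preceded by 40-50 (f0 = 25), followed by 60-70 (f2 = 20).
mode_marks = grouped_mode(l1=50, l2=60, f0=25, f1=30, f2=20)
print(round(mode_marks, 2))
```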
In a symmetrical distribution the mean, median and mode are identical, i.e. they have the
same value. If a distribution is skewed then:
For a positively skewed distribution, mean > median > mode
For a negatively skewed distribution, mean < median < mode
4. Measures of Variability:
A measure of average gives only partial information about the distribution. Data sets
may have the same average value but differ in other aspects. We also need to
obtain a measure of the spread of the distribution. A measure of variability of the data
set gives us another characteristic of the distribution.
4.1 Range:
Range is the simplest possible measure of dispersion. It is the difference between the values
of the extreme items of a series. Symbolically,
Range (R) = L - S, where L is the largest item and S is the smallest item.
Coefficient of Range = $\frac{L - S}{L + S}$
The main drawback of the range is that it depends on only the two most extreme observations
and disregards the position of the remaining n - 2 values.
The interquartile range (Q3 - Q1) is a measure of dispersion based on the upper quartile Q3
and the lower quartile Q1.
The spread of a distribution can be inferred from the deviations of the individual values of
the data from their average or mean value. Some of the deviations will be positive, if
$x_i > \bar{x}$, and others will be negative, for $x_i < \bar{x}$. However, we cannot use an
average of the deviations to measure the variability of the data, as
$\sum (x_i - \bar{x}) = 0$ so that $\frac{1}{n}\sum (x_i - \bar{x}) = 0$. One way out is the
mean deviation, Mean Deviation $= \frac{1}{n}\sum |x_i - \bar{x}|$, where the signs are
ignored. However, there are problems with using the mean deviation: its calculation is
mathematically illogical because the algebraic signs are ignored or omitted. The
commonly used measures of dispersion are the variance and standard deviation.
Variance: The variance is the mean of squared deviations about the mean of the series. The
sample variance, denoted by $s^2$, is given by
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{S_{xx}}{n-1}$
Note that the divisor is n - 1 in calculating the sample variance, while in calculating the
population variance, denoted by $\sigma^2$, the divisor is N.
What is the rationale behind it? Just as $\bar{x}$ will be used to make inferences about the
population mean µ, we should define the sample variance so that it can be used to make
inferences about $\sigma^2$. Note that $\sigma^2$ involves squared deviations about the
population mean µ. If we actually knew the value of µ, then we could define the sample
variance as the average squared deviation of the sample $x_i$'s about µ. However, the value of
µ is almost never known, so the sum of squared deviations about $\bar{x}$ must be used. But
the $x_i$'s tend to be closer to their average $\bar{x}$ than to the population average µ, so
using n as the divisor leads to an underestimate of $\sigma^2$. To compensate for this, the
divisor n - 1 is used rather than n. Thus if $\bar{x} \neq \mu$ then
$\sum (x_i - \mu)^2 > \sum (x_i - \bar{x})^2$.
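The sample standard deviation, denoted by s, is the square root of the variance:
$s = \sqrt{s^2} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$
Note that $s^2$ and s are both nonnegative. There is an alternative formula for $S_{xx}$ that
avoids calculating the deviations. The formula involves both $\left(\sum x_i\right)^2$,
summing and then squaring, and $\sum x_i^2$, squaring and then summing.
$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - n\bar{x}^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}$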
Properties of Variance:
Let
So
=
Variance is not independent of scale.
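To show $\sum (x_i - A)^2$ is minimum if $A = \bar{x}$:
Let $S = \sum (x_i - A)^2$
S will be minimum if $\frac{dS}{dA} = 0$ and $\frac{d^2S}{dA^2} > 0$
$\frac{dS}{dA} = -2\sum (x_i - A)$
Putting $\frac{dS}{dA} = 0$ gives $\sum (x_i - A) = 0$ or $A = \bar{x}$
$\frac{d^2S}{dA^2} = +2n > 0$
Hence $\sum (x_i - A)^2$ is minimum if $A = \bar{x}$.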
Standard deviation is the square root of the arithmetic average of the squares of the
deviations measured from the mean. Thus, in the calculation of standard deviation, first the
arithmetic average is calculated and the deviations of the various items from the arithmetic
average are squared. The squared deviations are totalled and the sum is divided by the
number of items. The square root of the resulting figure is the standard deviation of the series.
Symbolically: $s = \sqrt{\frac{\sum d^2}{n}} = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
Where s stands for the standard deviation of the sample data, $\sum d^2$ for the sum of the squares
of the deviations measured from the arithmetic average and n for the number of items.
Note that this form divides by n, whereas the sample variance defined earlier divides by $n-1$.
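Variance and standard deviation of population are denoted by $\sigma^2$ and σ respectively.
Definitional formula:
$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
Computational formula:
$\sigma^2 = \frac{\sum x_i^2}{N} - \left(\frac{\sum x_i}{N}\right)^2$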
10 8
11 5
12 4
Solution:
=
n = 48 = 432 =28 124
Arithmetic average = $\frac{\sum fx}{n}$ = 432/48 = 9
In continuous series the class intervals are usually of equal size, so the deviations from the
assumed average can be expressed in class interval units; in other words, step deviations
can be found by dividing the deviations by the magnitude of the class interval. The
formula for the calculation of standard deviation is then written as follows:-
$s = \sqrt{\frac{\sum f\,dx^2}{n} - \left(\frac{\sum f\,dx}{n}\right)^2} \times i$
Where i stands for the common factor or the magnitude of the class interval, dx stands for
the deviations in class interval units, f stands for the class frequency, and the other symbols
stand for what they stood for in the previous formula.
s=
Coefficient of variation:
The variance, and consequently the standard deviation, is independent of change of origin but
not of scale.
Moments:
“Moment is a familiar mechanical term for the measure of a force with reference to its
tendency to produce rotation. The strength of this tendency depends, obviously, on the
amount of the force and the distance from the origin of the point at which the force is
exerted”. In statistics, moments of a random variable about some point are used to describe
the various characteristics of a frequency distribution, viz. dispersion, skewness and
kurtosis. Central moments are defined as
$\mu_r = \frac{1}{n}\sum (x_i - \bar{x})^r$
Where $\mu_r$ is the rth central moment.
$\mu'_r = \frac{1}{n}\sum x_i^r$
Where $\mu'_r$ is the rth non-central moment about the origin.
Skewness:
Measures of Skewness:
If a particular distribution is found to be skewed, the next problem that arises is to measure
the extent of skewness. Measures of skewness are meant to give an idea about the extent of
asymmetry in a series. They are also called first measures of skewness. The first measures of
skewness are based on the fact that in a skewed distribution the values of mean,
median and mode do not coincide. This being so, the difference between any two of these
values indicates the extent of skewness. Thus the first measures of skewness are:-
This formula is based on the difference between mean and mode. It uses standard deviation as
the divisor. It is expressed as follows:
Jp = (Mean – Mode) / Standard deviation
The moment coefficient of skewness is:
$\beta_1 = \frac{\mu_3^2}{\mu_2^3}$
$\gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}}$
Kurtosis:
Kurtosis is yet another measure which tells us about the form of a distribution. It tells us
whether the distribution, if plotted on a graph, would give us a normal curve, a curve flatter
than the normal curve or a curve more peaked than the normal curve. If a distribution is more
peaked than the normal distribution it is called “leptokurtic”. If the distribution is flatter
than the normal distribution it is called “platykurtic”. The normal distribution is known as
“mesokurtic”.
In the figure given above curve No.1 is normal or mesokurtic, curve No.2 is more peaked
than the normal curve and is leptokurtic and curve No.3 is more flat than the normal curve
and is platykurtic.
6. References:
Jay L. Devore: Probability and Statistics for Engineering and the Sciences,
Cengage Learning, 8th edition.
DC-1
Sem-II
1: Learning Outcomes
2: Introduction
3: Some Important Terminology
3.1: Sample Space
3.2: Events
3.3: Mutually Exclusive Events
3.4: Mutually Exhaustive Events
3.5: Equally Likely Events
3.6: Independent Events
4: Definitions of Probability
5: Examples of Probability
6: Properties of Probability
7: Summary
8: Exercises
9: References
10: MCQs
1. Learning Outcomes
2. Introduction
Galileo, in the 17th century, laid down some ideas on dice games. These led to further ideas
and discussions that became the stepping stone of probability theory. Probability is
the branch of mathematics that deals with random experiments. Jerome Cardan, an
Italian mathematician, was the first to pen down the idea of probability. However, Pascal and
Fermat laid the basic and formal foundations of the subject.
3.1 Sample Space:
Generally denoted by the letter ‘S’, a sample space consists of all possible outcomes of a
random/non-deterministic experiment. Suppose we need to find out on which day of the
week 29th February 2016 will fall. Then the sample space is:
S = {Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday}
Thus, the days of the week become the elements or members of this set, where
each day is one possible outcome, i.e. 29th February can fall on any one of
these days. Here ‘S’ is the sample space and its elements, i.e. the days of the week,
are referred to as sample points.
Now suppose we need to find the possibility of getting a ‘6’ in a roll of a die. Thus,
the sample space becomes S = {1, 2, 3, 4, 5, 6}, i.e. in a roll of a single fair die, there is
equal probability of getting a 1, or a 2, or a 3, or a 4, or a 5, or a 6. Now suppose we
roll two cubic dice together. The number of points in the sample space is now given by N^n,
where n is the number of trials, i.e. the number of times the experiment is repeated, and N is the
number of outcomes of a single trial. So, the sample space for a die rolled twice (N^n = 6^2 = 36) is:
S = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6)
(2,1), (2,2), (2,3), (2,4), (2,5), (2,6)
(3,1), (3,2), (3,3), (3,4), (3,5), (3,6)
(4,1), (4,2), (4,3), (4,4), (4,5), (4,6)
(5,1), (5,2), (5,3), (5,4), (5,5), (5,6)
(6,1), (6,2), (6,3), (6,4), (6,5), (6,6)}
Thus, n(S) = 36
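The N^n count can be checked by listing the sample space directly. The following is a minimal sketch; the helper name `sample_space` is an assumption made for this illustration.

```python
from itertools import product

def sample_space(outcomes, n_trials):
    """All ordered results of n_trials repetitions: N**n sample points."""
    return list(product(outcomes, repeat=n_trials))

die = [1, 2, 3, 4, 5, 6]
two_rolls = sample_space(die, 2)      # (1, 1), (1, 2), ..., (6, 6)
three_coins = sample_space("HT", 3)   # ('H', 'H', 'H'), ..., ('T', 'T', 'T')

print(len(two_rolls))    # 6**2 = 36
print(len(three_coins))  # 2**3 = 8
```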
Now, suppose we paint one die white (W) and another die black (B). If we close our
eyes and pick up a die randomly and want to study the occurrence of a white or a
black die, then the sample space becomes S = {W, B}. Now suppose we just want to
study the occurrence of a black die. So either the black die occurs or it doesn’t. Thus,
we can give value/numbers to it i.e. if black occurs, the experiment is successful and
we give it a number 1 and if it doesn’t occur, we give it a number 0. Thus, the
sample space becomes S = {0,1}. Thus, n(S) = 2
If we pick a die twice in succession, each pick being either white or black, the sample space
becomes, S = {(W,B), (W,W), (B,W), (B,B)}. Thus, n(S) = 4
Similarly, if we toss an unbiased coin, the sample space S = {head, tail}. If we toss
two unbiased coins together, the sample space (N^n = 2^2 = 4) becomes,
S = {(H,T), (H,H), (T,H), (T,T)}. Thus, n(S) = 4
3.2 Events
Thus, we can say that an event ‘E’ is a subset of a sample space ‘S’ i.e. E ⊆ S.
If we want to find the event of a sum of 3 in a roll of two dice, then the event is:
E3 = {(1,2), (2,1)}. Thus, n(E3) = 2
Suppose now we toss three unbiased coins, the sample space is (N^n = 2^3 = 8):
S = {(HHH), (HTT), (HTH), (HHT), (TTT), (THH), (THT), (TTH)}, n(S) = 8
Now, if we want to find out the event of getting all tails,
ET = {(TTT)}, n(ET) = 1
If we want to find out the event of ‘No Tail’, thus
Thus, there are no common element between A and B (seen by the separated
circles). Thus, they are mutually exclusive events.
If and only if at least one among the several events necessarily occurs, we call
them to be mutually exhaustive. For example, if Congress and BJP are the two
parties fighting for 2014 general elections, then one of them must win the
election and thus they are mutually exhaustive. Similarly, if we roll a six-faced
die, the occurrences of 1, 2, 3, 4, 5 and 6 are mutually exhaustive. If Mr. A plays
a game with his friend in which a draw is not possible, Mr. A either wins or loses. Thus the
events win and lose are mutually exhaustive.
3.6 Independent Events
Two events, A and B, are said to be independent if the occurrence of event A has
no effect on the probability of occurrence of event B. For example, “I had my lunch today”
and “My pen got stolen in the college today” are independent events. To find the
probability that two independent events both occur, we multiply their individual
probabilities, i.e.
P(A and B) = P(A∩B) = P(A) * P(B).
Q- If two events are mutually exclusive, are they also independent, and
vice-versa?
Ans- No. Two mutually exclusive events satisfy P(A∩B) = 0, so they cannot also satisfy the
independence condition P(A∩B) = P(A) * P(B) unless P(A) = 0 or P(B) = 0.
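The difference between independence and mutual exclusivity can be checked on the 36-point sample space of two dice. The sketch below is illustrative only: the events chosen and the helper names `prob` and `independent` are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))  # two fair dice

def prob(event):
    """Classical probability: favourable sample points / all sample points."""
    return Fraction(sum(1 for w in space if event(w)), len(space))

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(lambda w: a(w) and b(w)) == prob(a) * prob(b)

first_is_six = lambda w: w[0] == 6
second_is_even = lambda w: w[1] % 2 == 0
first_is_one = lambda w: w[0] == 1

# Events on different dice are independent.
print(independent(first_is_six, second_is_even))
# 'first is 6' and 'first is 1' are mutually exclusive, hence not independent.
print(independent(first_is_six, first_is_one))
```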
NOTE: Sometimes it is unclear whether two events are related. For example, if we look
at stock prices, we know that Reliance and Tata are big companies with large capital and thus
their stocks are highly valued. It may be expected that investing in such
companies requires a big investment which would bring a higher return, but stock
prices fluctuate on account of other factors too (like political uncertainty,
recession, war, natural calamity, etc.). Thus, a stock price may rise or fall
independently of its underlying value.
** NOTE: If two events A and B are dependent on each other (i.e. they are neither
independent nor mutually exclusive), then we use conditional probability, which we
will study in the next chapter.
4. Definitions of Probability
Probability of an Event: We have already studied about sample space and events. Now,
suppose we want to assign some numerical values to all the possible outcomes of a random
experiment to find its probability of occurrence in every trial.
There are four different ways in which the term ‘probability’ has been defined:
1) Classical definition
2) Axiomatic definition
3) Empirical definition
4) Subjective definition
P(E) = N_E / N = n(E) / n(S)
i.e. the probability of an event E is the ratio of the number of outcomes favorable to event E,
n(E), to the total number of outcomes, n(S), i.e. number of favorable outcomes/total
number of outcomes.
For example, suppose three unbiased coins are tossed simultaneously and we have to find the
probability of the event ‘no head’. There are 2^3 = 8 possible outcomes (n(S)). If E is the
event ‘no head’, then the outcomes favorable to E = {TTT} and thus n(E) = 1, so P(E) = 1/8.
If n(E) = 0, i.e. there are no outcomes favorable to event E, then P(E) also becomes
zero and thus we can say that event E is impossible.
1) The classical definition can be applied only if the outcomes are mutually exclusive,
mutually exhaustive and equally likely. Otherwise this definition cannot be applied.
2) The phrase ‘equally likely’ in the definition means equally probable. So we may say that
the definition is circular in nature.
3) The definition fails if the number of possible outcomes is infinite.
Von Mises was the one to introduce this concept. This definition uses the basis of deductive
theory which is not widely accepted. According to this definition, if an event (E) is found to
occur m times out of N trials of a random experiment, its relative frequency is given by m/N.
As N increases indefinitely, this relative frequency approaches a limiting value P, which is
called the probability of the event E:
P(E) = lim (m/N) as N → ∞
This limit gets a meaning if we take this equation as an assumption to define P(E).
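The relative-frequency idea can be illustrated by simulation. The sketch below is an assumption-laden illustration, not part of the original lesson: it uses a pseudo-random fair die, a fixed seed, and a hypothetical function name, and it estimates the probability of rolling a 6 as m/N for increasing N.

```python
import random

def relative_frequency(n_trials, seed=0):
    """Relative frequency m/N of the event 'a 6 is rolled' in N simulated rolls."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) == 6)
    return hits / n_trials

# The estimates should settle near 1/6 as N grows.
for n in (100, 10_000, 200_000):
    print(n, relative_frequency(n))
```

For small N the estimate can be far from 1/6; only for large N does the relative frequency stabilise, which is the sense in which the limit defines P(E).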
5. Examples of probability
Example 1) Suppose we want to find the probability of getting the numbers 1 and 6, in any
order, in a throw of two dice.
Now, as we throw the second die, it must show one specific number, i.e. if number 1 occurred on
the first die, the second die has to be 6 (P(B) = 1/6) and vice-versa (P(A) = 1/6). Thus, the
probability of the second die (P(S)) being favourable is still 1/6.
Thus, we can say that the first and second dice are independent, as the occurrence of A or B on
the first die has no effect on the occurrence of A or B on the second die.
Thus, the probability of numbers 1 and 6 in any order = P(F) * P(S) = 1/3 * 1/6 = 1/18,
where P(F) = 2/6 = 1/3 is the probability that the first die shows either 1 or 6.
Example 2) Find the probability of getting a double six in the throw of two die.
Example 3) What is the probability that all three siblings born in a family will have different
birthdays.
- Solution: Suppose the three children were born on three different days.
If the first child was born on one of the 365 days, the second child must be born on any
one of the remaining 364 days, and in the same way third child has to be born in any of the
remaining 363 days.
P(birthday of 1st child) = 365/365
P(birthday of 2nd child) = 364/365
P(birthday of 3rd child) = 363/365
All the three events are independent of each other.
Thus, probability that all three siblings born in a family will have different birthdays =
(365*364*363) / (365*365*365) ≈ 0.9918
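The same product generalises to any number of children. A minimal sketch, assuming 365 equally likely birthdays and using a hypothetical function name:

```python
from fractions import Fraction

def all_different_birthdays(k, days=365):
    """P(k people all have different birthdays), assuming equally likely days."""
    p = Fraction(1)
    for i in range(k):
        # The (i + 1)-th person must avoid the i birthdays already taken.
        p *= Fraction(days - i, days)
    return p

p3 = all_different_birthdays(3)
print(float(p3))  # about 0.9918
```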
Example 4) If 5 students are sitting in a row at random, what is the probability that 4th and
5th students will sit together?
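- Solution: Number of ways in which 5 students can sit = 5! = 5*4*3*2*1 = 120 ways
Suppose the 4th and 5th students sit together. Treating the pair as one unit, the 4 units can
be arranged in 4! ways and the pair can arrange among themselves in 2! ways, giving 4! * 2! =
48 ways. Thus, the required probability = 48/120 = 2/5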
Example 5) From a pack of well-shuffled cards, two cards are drawn at random. Find the
probability that
i) Both cards are black
ii) One is a spade and other is a heart
- Solution: Number of ways in which two cards can be drawn from a pack of 52 cards =
52C2 = 1326 ways
These outcomes are equally likely, mutually exclusive and mutually exhaustive.
i) Total number of black cards = 26
Number of cases favorable to both the cards are black = 26C2 = 325
Thus, probability that both cards are black = 325/1326
ii) Number of ways in which two cards can be drawn from a pack of 52 cards = 52C2 = 1326
ways
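The counts in Example 5 can be reproduced with Python's `math.comb`. This is a sketch for checking the arithmetic; the variable names are illustrative, and the value for part (ii) is computed here by the same counting argument rather than quoted from the text.

```python
from fractions import Fraction
from math import comb

total = comb(52, 2)                        # ways to draw 2 cards from 52

# i) both cards black: choose 2 of the 26 black cards
both_black = Fraction(comb(26, 2), total)

# ii) one spade and one heart: choose 1 of 13 spades and 1 of 13 hearts
spade_and_heart = Fraction(comb(13, 1) * comb(13, 1), total)

print(total, both_black, spade_and_heart)
```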
Example 6) A bag contains 3 black balls and 7 white balls. If one ball is drawn at random,
find the probability that it will be black?
- Solution: Total number of balls = 10, number of ways in which one ball can be drawn
from them at random is given by 10C1 = 10 ways
A black ball is drawn in 3C1 = 3 ways
Thus, P(black ball) = 3/10 = 0.3
Example 7) What is the probability of the event ‘no head’ in the toss of two unbiased coin?
Example 8) Two cards are drawn together from a pack of 52 well shuffled cards. Find the
probability that one is a club and one is a diamond.
- Solution: n(S) = 52C2 = (52*51)/(2*1) = 1326
Let E = event of getting 1 club and 1 diamond.
Example 9) From a pack of 52 well shuffled cards, a card is drawn at random. Find the
probability that: i) the card drawn is a face card, ii) it is queen of spade or king of diamond.
- Solution: n(S) = 52
i) There are three face cards i.e. jack, queen and king of each suit. Thus there are 12 face
cards out of 52 {n(E)}.
Let E = event of getting a face card
Thus, P(E) = 12/52 = 3/13
Then, n(E) = 2.
Example 10) A bag contains 6 red and 8 green balls. One ball is drawn at random. What is
the probability that the ball drawn is red?
Example 11) A bag contains 4 white, 5 red and 6 blue balls. Three balls are drawn at
random from the bag. The probability that all of them are red, is:
6. Properties of Probability
We know that P(S) = 1, thus the sum of the probabilities of an event and its complement is 1,
so we may say that the probability of the complement of event A is:
P(Ac) = 1 – P(A)
iii) If there are two events A and B, the probability of their union is the sum of their
probabilities minus the probability of their intersection i.e.
P(AUB) = P(A) + P(B) – P(A∩B)
iv) If an event B is a subset of event A, then probability of event B will be less than or equal
to the probability of event A i.e.
P(B) ≤ P(A)
vi) If the sample space S is finite and consists of mutually exclusive events E1, E2, ……, En
such that S = {E1, E2, ………, En} then probability of S is:
P(S) = P(E1) + P(E2) + …… + P(En) = 1
For example, probability of getting an odd number in the throw of a die is:
P({1, 3, 5}) = P(1) + P(3) + P(5) = 1/6 + 1/6 + 1/6 = 1/2
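These properties can be verified on the sample space of a single fair die. In the sketch below the events A, B and C are arbitrary choices made for illustration, and `P` is a hypothetical helper implementing the classical definition.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}           # sample space of one fair die

def P(event):
    """Classical probability of an event (a subset of S)."""
    return Fraction(len(event & S), len(S))

A = {1, 3, 5}                    # odd number
B = {1, 2, 3, 4}                 # at most 4
C = {1, 3}                       # a subset of A

complement_rule = P(S - A) == 1 - P(A)
addition_rule = P(A | B) == P(A) + P(B) - P(A & B)
subset_rule = P(C) <= P(A)
total_is_one = sum(P({x}) for x in S) == 1
print(complement_rule, addition_rule, subset_rule, total_is_one)
```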
7. Summary
- All the subsets of a sample space are called events. Suppose the sample space consists of n
mutually exclusive events, thus S = {E1, E2, ………, En} and probability of S = P(E1) + P(E2) + …… + P(En)
- P(S) = 1
-
-
-
- If A and B are independent, P(A and B) = P(A∩B) = P(A) * P(B)
- If E1, E2, ………, En are mutually exclusive events, P(E1 U E2 U …… U En) = P(E1) + P(E2) + …… + P(En)
8. Exercise
ii) 1
iii)2/6
iv) 5/6
9. References
1. Jay L. Devore, Probability and Statistics for Engineers, Cengage Learning, 2010.
3. http://classof1.com/solution-library/view/math/probability/Multiple-choice-Questions-on-
theoretical-probability/621/probability/string/search
4. http://www.probabilitytheory.info/
1. Tickets numbered 1 to 20 are mixed up and then a ticket is drawn at random. Find the
probability that the ticket drawn has a number which is a multiple of 3 or 5?
i) 1/3
ii) 3/5
iii) 8/9
iv) 9/20
2. A bag contains 2 white, 3 black and 2 blue balls. Two balls are drawn at random. Find the
probability that none of the balls drawn is blue?
i) 10/21
ii) 2/21
iii) 2/7
iv) 5/21
Ans. i) 10/21
3. In a box, there are 8 red, 7 blue and 6 green balls. One ball is picked up randomly. What
is the probability that it is neither red nor green?
i) 1/3
ii) ¾
iii) 7/19
iv) 8/21
Ans. i) = 1/3
Let E= event that the ball drawn is neither red nor green
= event that the ball drawn is blue.
n(E) = 7.
i) 1/9
ii) 1/36
iii) 1/6
iv) 4/6
Let E = event of getting a sum of 7 = {(3, 4), (4, 3), (5, 2), (6, 1), (2, 5), (1, 6)}.
5. Three unbiased coins are tossed. What is the probability of getting at least two heads?
i) 3/4
ii) 1/4
iii) 3/8
iv) 4/8
Solution: Here S = {TTT, TTH, THT, HTT, THH, HTH, HHT, HHH}
6. In a simultaneous throw of two dice, what is the probability of getting two numbers whose
product is ‘not even’?
i) 1/4
ii) 3/4
iii) 3/8
iv) 5/16
Ans. i) 1/4
Let E = event that the product is even, thus E^c = event that the product is not even
Then, E= {(1, 2), (1, 4), (1, 6), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6), (3, 2), (3, 4),
(3, 6), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6), (5, 2), (5, 4), (5, 6), (6, 1),
(6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}
n(E) = 27.
P(E) = n(E)/n(S) = 27/36
P(E^c) = 1 – P(E) = 1 – 27/36
= 9/36
= ¼
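The count n(E) = 27 can be confirmed by enumerating all 36 outcomes. A minimal sketch (variable names are illustrative):

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))              # 36 outcomes
even_product = [w for w in space if (w[0] * w[1]) % 2 == 0]

p_even = Fraction(len(even_product), len(space))
p_not_even = 1 - p_even                                   # complement rule
print(len(even_product), p_not_even)
```

The product is odd only when both dice show odd numbers, which happens in 3 * 3 = 9 of the 36 outcomes.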
7. In a society, there are 15 blind and 10 deaf people. Three residents are selected at
random. The probability that 1 deaf and 2 blind residents are selected is:
i) 21/46
ii) 25/117
iii) 1/50
iv) 3/25
Ans. i) 21/46
Solution: Let S be the sample space and E be the event of selecting 1 deaf and 2 blind residents.
n(E) = 10C1 * 15C2 = 10 * 105
= 1050.
8. In a lottery of 35 chits, there are 10 chits with prizes and 25 chits left blank. A lottery is
drawn at random. Find the probability of getting a prize?
i) 2/5
ii) 1/5
iii) 5/7
iv) 2/7
Solution: n(S) = 35
n(Prize) = 10
Thus, P(prize) = 10/35 = 2/7
9. Two cards are drawn together at random from a pack of 52 well shuffled cards. Find the
probability that both cards are queens.
i) 1/15
ii) 1/663
iii) 1/221
iv) 1/26
Then, n(S) = 52C2 = (52*51)/(2*1) = 1326.
10. Two dice are tossed. The probability that the total score is ‘ not a prime number’ is:
i) 5/12
ii) 7/12
iii) 1/6
iv) 2/36
Let E = event that the sum is a prime number, thus E^c = event that the sum is not a prime
number.
Then E= { (1, 1), (1, 2), (1, 4), (1, 6), (2, 1), (2, 3), (2, 5), (3, 2), (3, 4), (4, 1), (4, 3),
(5, 2), (5, 6), (6, 1), (6, 5) }
n(E) = 15.
P(E) = 15/36
Thus, P(E^c) = 1 – 15/36
= 21/36
= 7/12
DC-1
SEM-II
Lesson:Conditional Probability
College:Shyamlal College
University of Delhi
Statistical Methods In Economics - I
1: Learning Outcomes
2: Introduction
3: Theorems Of Probability/Counting Techniques
3.1: Theorem Of Total Probability
3.2: Theorem Of Compound probability
4: Conditional Probability and Concept Of Independence
5: Examples of Conditional Probability
6: Multi-Stage Experiment and Bayes’ Theorem
7: Summary
8: Exercises
9: References
10: MCQs
1. Learning Outcomes
2. Introduction
CONDITIONAL PROBABILITY:
It is defined as the probability of occurrence of an event A, given that another event B has
already occurred. We may represent the conditional probability of event A given B as:
P(A/B) = P(A∩B) / P(B), provided P(B) > 0
For example, if a bag contains W white balls and B blue balls, and two balls are drawn at
Similarly, if the ball obtained in the first draw was blue, the conditional probability of getting
a white ball in the second turn is: W/(W+B-1)
Now, let A denote a white ball in the first draw, and B denote a white ball in the second draw, then
P(A) = W/(W+B) and
P(A∩B) = P(A) * P(B given A)
= W/(W+B) * (W-1)/(W+B-1)
According to this theorem, for any two mutually exclusive events A and B, i.e. P(A∩B) = 0,
we add the probabilities of the two events while calculating
the probability of the occurrence of either A or B, i.e. if A and B have no elements in
common, then
P(AUB) = P(A) + P(B)
For example, a class has 15 girls and 20 boys, so the total number of students is 15 + 20 = 35.
A student picked at random is either a girl or a boy, so P(GUB) = P(G) + P(B) = 15/35 + 20/35 = 1
If the two events are not mutually exclusive, i.e. P(A∩B) ≠ 0, then the addition rule becomes:
P(AUB) = P(A) + P(B) - P(A∩B) or
P(A or B) = P(A) + P(B) - P(A and B) or
P(A + B) = P(A) + P(B) - P(A X B)
According to this theorem, if we want to find the probability of two events A and B occurring
together, we multiply the probability of event A by the conditional
probability of event B given that event A has actually occurred, denoted by P(B/A).
P(B/A) is the ratio of number of events favorable to events A and B to the number of events
favorable to event A.
i.e. P(B/A) = P(A∩B) / P(A) or
P(A∩B) = P(A) * P(B/A)
This is also known as the multiplication rule.
For example, if P(AUB) = 6/8, P(A) = 6/16, P(B) = 2/4. Find P(B/A) and P(A/B).
P(A∩B) = P(A) + P(B) – P(AUB) = 6/16 + 2/4 – 6/8 = 1/8
P(B/A) = P(A∩B) / P(A) = (1/8) / (6/16) = 1/3
P(A/B) = P(A∩B) / P(B) = (1/8) / (2/4) = ¼
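The arithmetic of this example can be checked with exact fractions. A minimal sketch, with variable names chosen for illustration:

```python
from fractions import Fraction

p_union = Fraction(6, 8)   # P(AUB)
p_a = Fraction(6, 16)      # P(A)
p_b = Fraction(2, 4)       # P(B)

# Addition rule rearranged: P(A∩B) = P(A) + P(B) - P(AUB)
p_both = p_a + p_b - p_union

# Multiplication rule rearranged: P(B/A) = P(A∩B) / P(A), and likewise for P(A/B)
p_b_given_a = p_both / p_a
p_a_given_b = p_both / p_b
print(p_both, p_b_given_a, p_a_given_b)
```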
For example, following results were obtained for three subjects namely Mathematics (M),
Economics (E) and Hindi (H) in a class:
25% of the students passed in M
20% of the students passed in E
35% of the students passed in H
7% of the students passed in M and E
5% of the students passed in M and H
2% of the students passed in E and H
1% of the students passed in all the three subjects
Find the probability that a student got passing marks in at least one of the subject:
P(M) = .25, P(E) = .20, P(H) = .35, P(M∩E) = .07, P(M∩H) = .05, P(E∩H) = .02, P(M∩E∩H)
= .01.
Therefore, P(MUEUH) = P(M) + P(E) + P(H) – P(M∩H) – P(M∩E) – P(E∩H) + P(M∩E∩H) =
.25 + .20 + .35 - .07 - .05 - .02 + .01 = .67
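The three-event addition rule used here is an instance of inclusion and exclusion. The sketch below redoes the sum with exact fractions; the percentages are those given in the example, while the variable names are illustrative.

```python
from fractions import Fraction

def pct(x):
    """Convert a percentage to an exact probability."""
    return Fraction(x, 100)

singles = [pct(25), pct(20), pct(35)]   # P(M), P(E), P(H)
pairs = [pct(7), pct(5), pct(2)]        # P(M∩E), P(M∩H), P(E∩H)
triple = pct(1)                         # P(M∩E∩H)

# Add the single events, subtract the pairwise overlaps, add back the triple overlap.
p_at_least_one = sum(singles) - sum(pairs) + triple
print(float(p_at_least_one))
```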
If we have to make a sequence of n decisions, each with its own number of options,
then the total number of choices is given by:
C = c1 × c2 × c3 × . . . × cn
Where c1 = the number of ways to choose the 1st option,
c2 = the number of ways to choose the 2nd option, etc.
Permutation Rule:
If we want to place/draw E elements (out of n elements) in some sequential order, without
replacement, we use the rule of permutation. Thus, a permutation of E elements out of n is
defined as an ordering of E elements chosen from a set of n elements, without
repetition (E ≤ n). Order is what matters in a permutation, objects are drawn without
replacement, and the number of permutations is denoted as nPE.
Explanation: The first object can be placed in n ways; since this object won’t be
replaced, the second object can be placed in (n – 1) ways, and the process continues till the
last (E-th) object is placed in (n – E + 1) ways.
nPE = n(n – 1)(n – 2)...(n – E + 1) = n! / (n – E)!
For example, there are 7 digits numbered 1,2,3…7. Suppose we have to make a number
plate of 4 digits, without replacement. Find the probability that the number plate formed is
an even number.
Solution: For the number to be even, the last digit should be an even digit (2, 4 or 6).
The remaining three digits can be any of the remaining 6 digits, in any order.
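The number-plate example can be checked by brute force. The sketch below lists every 4-digit arrangement of the digits 1 to 7 without repetition and counts those ending in an even digit; it is an illustrative check with assumed variable names, not the lesson's own working.

```python
from fractions import Fraction
from itertools import permutations

plates = list(permutations(range(1, 8), 4))          # 7P4 ordered arrangements
even_plates = [p for p in plates if p[-1] % 2 == 0]  # last digit is 2, 4 or 6

p_even = Fraction(len(even_plates), len(plates))
print(len(plates), len(even_plates), p_even)
```

The count of even plates agrees with the reasoning in the solution: 3 choices for the last digit times 6P3 = 120 orderings of the other three digits.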
For Example, Suppose 3 boys and 5 girls sit together in a row in a class. Find the
probability that: i) all boys sit together, ii) same gender sit in the extreme ends.
i) Treat BBB as 1 object and arrange 6 objects, i.e. 5 girls and 1 BBB. Treating the girls as alike, the number of arrangements = 6!/(5! 1!) = 6
arrangements. The total number of arrangements of 3 boys and 5 girls is 8!/(3! 5!) = 56. Thus,
Probability that boys sit together = 6/56 = 3/28
ii) There can be two ways, i.e. one in which boys sit at the extreme ends and another in which girls sit at the extreme ends.
If girls are at each end, number of distinct arrangements of 6 remaining places in which
If boys are at each end, number of distinct arrangements of 6 remaining places in which
Combination Rule:
If we want to choose/draw E elements (out of n elements), without replacement, we use the
rule of combination. Thus, a combination of E elements out of n is defined as a
selection of E elements from a set of n elements, without repetition (E ≤ n). Here
order does not matter and the number of combinations is denoted as nCE or $\binom{n}{E}$.
Explanation: As with permutations, E objects can be placed in order in nPE = n!/(n – E)! ways.
Since the order of the E chosen objects does not matter here, each selection is counted E! times, so
nCE = nPE / E! = n! / (E! (n – E)!)
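The two counting rules translate directly into code. The following sketch defines them from factorials and compares them with Python's built-in `math.perm` and `math.comb` (available from Python 3.8); the function names `n_p_e` and `n_c_e` are assumptions made for this illustration.

```python
from math import comb, factorial, perm

def n_p_e(n, e):
    """Ordered selections of e objects from n, without repetition: n! / (n - e)!"""
    return factorial(n) // factorial(n - e)

def n_c_e(n, e):
    """Unordered selections: each is counted e! times among the ordered ones."""
    return n_p_e(n, e) // factorial(e)

print(n_p_e(5, 3), perm(5, 3))    # arranging 3 of 5 objects
print(n_c_e(10, 5), comb(10, 5))  # choosing 5 of 10 objects
```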
For Example: Suppose you pick up 13 cards out of a pack of 52 well-shuffled cards. Find
the probability that i) it has at least one king, ii) it has 3 queens, iii) 6 clubs, 4 diamonds, 2
hearts and 1 spade.
Solution: Since the order does not matter, we may randomly choose 13 cards out of 52
applying the combination rule and thus the size of our sample space becomes 52C13
i) We know that there are four kings in a pack of 52 cards, so at least one king
means we can choose either 1 king and 12 other cards [(4C1) (48C12)], or 2 kings
and 11 other cards [(4C2) (48C11)], or 3 kings and 10 other cards [(4C3) (48C10)],
or 4 kings and 9 other cards [(4C4) (48C9)].
Thus,
P(at least one king) = [(4C1)(48C12) + (4C2)(48C11) + (4C3)(48C10) + (4C4)(48C9)] / 52C13 = 1 – 48C13 / 52C13
ii) We know that there are four queens in a pack of 52 cards, thus exactly 3 queens means
we can choose 3 queens out of 4 in 4C3 ways and the other 10 cards from the 48 non-queens in 48C10 ways.
Thus,
P(3 queens) = (4C3 * 48C10) / 52C13
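iii) We know there are 13 cards of each suit i.e. spade, diamond, heart and club. We
choose 6 clubs in 13C6 ways, 4 diamonds in 13C4 ways, 2 hearts in 13C2 ways and 1
spade in 13C1 ways. Since the order of the cards in a hand does not matter, no further
arrangement factor is needed. Thus,
P(6 clubs, 4 diamonds, 2 hearts and 1 spade) = (13C6 * 13C4 * 13C2 * 13C1) / 52C13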
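The hand probabilities in this example can be evaluated with `math.comb`. The sketch below is illustrative only (the variable names are assumptions). It computes "at least one king" through the complement rule and checks it against a case-by-case sum over 1, 2, 3 and 4 kings.

```python
from fractions import Fraction
from math import comb

hands = comb(52, 13)   # number of 13-card hands

# i) at least one king = 1 - P(no king)
p_at_least_one_king = 1 - Fraction(comb(48, 13), hands)
# The same value, summing over exactly k kings for k = 1, 2, 3, 4
p_by_cases = sum(Fraction(comb(4, k) * comb(48, 13 - k), hands) for k in range(1, 5))

# ii) exactly 3 queens and 10 other cards
p_three_queens = Fraction(comb(4, 3) * comb(48, 10), hands)

# iii) 6 clubs, 4 diamonds, 2 hearts and 1 spade
p_suit_split = Fraction(comb(13, 6) * comb(13, 4) * comb(13, 2) * comb(13, 1), hands)

print(float(p_at_least_one_king), float(p_three_queens), float(p_suit_split))
```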
Example: If you draw a card from a pack of 52 well-shuffled cards, find out the probability
that it is the king of hearts, given that the card drawn is red.
Independent Events
We have already learnt about independent events in our last chapter, i.e. events A and B are
independent if the occurrence of event A does not affect the occurrence of event B. This implies that,
P(B/A) = P(B/A^c) = P(B)
From the compound probability theorem we get,
P(A∩B) = P(A) * P(B/A)
= P(A) * P(B)
Similarly, for three independent events A, B, and C,
P(A∩B∩C) = P(A) * P(B) * P(C) and so on.
5. Examples
Example 1: One ticket is drawn at random from 100 tickets numbered 00, 01, 02, 03, …, 99.
Let Ei be the event that the sum of the two digits on the ticket is i, where 0 ≤ i ≤ 18,
and let Fj be the event that the product of the two digits is j.
For each possible i, find P(Ei/F0).
Solution: F0 occurs when at least one digit is 0, i.e. for the tickets 00, 01, …, 09 and
10, 20, …, 90. Thus P(F0) = 19/100
(Ei∩F0) = event that the digits sum to i and at least one digit is 0. For i = 0 this is the
ticket 00 only, so P(E0∩F0) = 1/100; for 1 ≤ i ≤ 9 these are the tickets 0i and i0, so
P(Ei∩F0) = 2/100; for i ≥ 10 there is no such ticket, so P(Ei∩F0) = 0.
Thus, P(Ei/F0) = P(Ei∩F0) / P(F0) = 1/19 for i = 0, 2/19 for 1 ≤ i ≤ 9, and 0 for 10 ≤ i ≤ 18.
Example 2: Suppose a card is drawn from a pack of 52 well-shuffled cards. Find the
probability that the card is a black Ace given it is a spade.
Example 3: Suppose there is an Apple iPhone wholesaler, 20% of whose phones in the
showroom are duplicates. Suppose a retailer buys an iPhone from him. He has a 10% probability
of buying a duplicate iPhone. Find out the conditional probability that the retailer buys an
original iPhone.
Example 4: A manufacturer makes light bulbs and found that 5% of the bulbs have a
common defect. Researchers studied that 93% of these defective bulbs show a certain
behavioral characteristic, while this characteristic was exhibited in 2% of the non-defective
bulbs. A bulb was examined which showed a characteristic symptom. Given this behavioral
symptom, find the conditional probability that the bulb has a defect.
Solution: Let A = event that the bulb is defective
B = event that the bulb has a characteristic symptom
Given, P(A) = 0.05, P(B/A) = 0.93, and P(B/Ac) = 0.02
Thus, P(A/B) = P(B/A) * P(A) / [P(B/A) * P(A) + P(B/Ac) * P(Ac)]
= (0.93 * 0.05) / [(0.93 * 0.05) + (0.02 * 0.95)]
= 93 / 131
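The calculation in Example 4 follows a pattern that can be written once and reused. The sketch below is a hedged illustration for the two-case situation (A and its complement); the function name `posterior` and its parameter names are assumptions made for this example.

```python
from fractions import Fraction

def posterior(prior, p_b_given_a, p_b_given_not_a):
    """P(A/B) from P(A), P(B/A) and P(B/Ac), for an event A and its complement."""
    numerator = p_b_given_a * prior
    denominator = numerator + p_b_given_not_a * (1 - prior)
    return numerator / denominator

# Example 4: 5% of bulbs defective, symptom shown by 93% of defective
# bulbs and by 2% of non-defective bulbs.
p_defect_given_symptom = posterior(
    prior=Fraction(5, 100),
    p_b_given_a=Fraction(93, 100),
    p_b_given_not_a=Fraction(2, 100),
)
print(p_defect_given_symptom, float(p_defect_given_symptom))
```

Although only 5% of bulbs are defective, a bulb showing the symptom is defective with probability of about 0.71, because the symptom is rare among non-defective bulbs.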
Solution: Let A= event that the Woman lives till seventy years
B = event that the woman lives till eighty years
If the woman lives for eighty years, she would have already lived for 70 yrs, thus, B ⊆ A.
Thus, P(B/A) = P(A∩B) / P(A) = P(B) / P(A) = .55 / .70 = 55/70 = 11/14
P(D0|G) =
Example 7: In a survey, 85% of students say that they obey the rules in the school. Previous
experience shows that 20% of the students who do not obey the rules say that they obey, out of
fear of their parents. If a student is picked at random, find the probability that he does obey the
rules in the school. (Assume: all who obey the rules say that they obey.)
An experiment need not be single-stage: if it can be broken down into stages, it
becomes a multi-stage experiment.
Following results may take place when we apply conditional probabilities to events A and B
in case of a multi-stage experiment:
- I) Event A and event B may remain in the same stage and not enter another stage of
the experiment. Here we find the conditional probability of A given B using:
P(A/B)=P(A∩B) / P(B)
- II) Event A is still in the first stage that has already occurred whereas event B is in
the next stage that is yet to occur i.e. experiment is not yet complete.
- III) Event A is in the previous stage whereas event B will occur in a later stage and
the experiment is still incomplete
Bayes’ Theorem
Suppose the events B1, B2, B3, ….., Bn are mutually exclusive and together
cover the entire sample space. Now, let us assume an event A which may occur if and only
if one of the events B1, B2, B3, ….., Bn occurs. Suppose that the unconditional
probabilities P(B1), P(B2), P(B3),….., P(Bn) and the conditional probabilities
P(A/B1), P(A/B2), ….., P(A/Bn) are known.
Now taking event A to be given, we can find out the conditional probabilities of events B1,
B2, B3, ….., Bn. Thus, given that A has actually occurred, the conditional probability P(Bi/A)
can be calculated as:
P(Bi/A) = P(Bi) * P(A/Bi) / [P(B1) * P(A/B1) + P(B2) * P(A/B2) + ….. + P(Bn) * P(A/Bn)]
For example,
First urn contains 2 white and 4 blue marbles and second urn contains 2 white and 2 blue
marbles. A marble is transferred from urn 2 to urn 1 and then a marble is picked from urn 1
randomly. If it turns out to be a blue marble, what is the probability that the transferred
marble was white?
Solution: Let B1 = transferred marble was white, B2 = transferred marble was blue
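Let A = marble drawn from urn 1 is blue
P(B1) = ½, P(B2) = ½,
P(A/B1) = 4/7, P(A/B2) = 5/7 (after a white marble is transferred, urn 1 holds 3 white and
4 blue marbles; after a blue marble is transferred, it holds 2 white and 5 blue marbles)
P(B1/A) = (½ * 4/7) / [(½ * 4/7) + (½ * 5/7)] = 4/9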
For Example: An entrepreneur expects his profits to rise over the next four quarters
with probability 0.4. As a result, a plan to increase the plant size over the next 12
months was prepared. When past years' data were analysed, profit had been predicted in
8 out of ten cases in which profit occurred, whereas profit had also been predicted in
2 out of ten cases in which a loss occurred. Based on this information, how should the
entrepreneur revise his estimate of the probability that profit will occur?
By Bayes’ formula, the revised probability of profit, given that profit is predicted, is computed
as:
(0.4 * 0.8) / [(0.4 * 0.8) + (0.6 * 0.2)] = .32/.44 ≈ 0.73
7. Summary
- If two events A and B are not independent, we use conditional probability, which is defined
as the probability of occurrence of an event A, given that another event B has already
occurred:
P(A/B) = P(A∩B) / P(B)
- According to combination rule,
8. Exercise
9. References
1. Jay L. Devore, Probability and Statistics for Engineers, Cengage Learning, 2010.
10. MCQs
1. Suppose a medical company needs to find out if a certain drug can or cannot lead to an
improvement in symptoms for some patient with a particular medical condition. A study has
been done and following results were seen:
On the basis of the above table, given that the drug was provided, find out the conditional
probability that the patient shows improvement.
i) .3375
ii) .325
iii) .225
iv) .275
Ans: i) .3375
P(improvement / drug given) = 270 / 800
= .3375 (Ans.)
2. Taking in reference the study of the table provided in question 1, find the conditional
probability that the patient was given the drug, given that the patient shows improvement.
i) .225
ii) .692
iii) .667
iv) .665
P(drug given / improvement) = 270 / 390
= .692
3. Suppose two cards are drawn without replacement from a deck of 52 cards. Find the
probability that both the cards are aces.
i) .0045
ii) .0050
iii) .0065
iv) .0385
Ans. i) .0045
Now, assuming that first card drawn is an ace and it is not replaced, the probability that
second card is also an ace = 3/51
Thus, probability that both cards are aces = 4/52 * 3/51 = .0045
4. Suppose two balls are drawn at random without replacement from an urn containing 4
red, 2 white and 3 green balls. Find the probability that the balls drawn are same in color.
i) .28
ii) .14
iii) .50
iv) .56
Ans. i) .28
Hint: Probability that two balls are of same color = P(2R or 2W or 2G)
P(2R) = 4/9 * 3/8 (probability that first ball is red is 4/9 and assuming it to be true and
without replacing it, probability that second ball is red is 3/8)
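The hint can be carried through to the final answer with exact fractions. The sketch below is only an illustration: the function name and the dictionary of colour counts are assumptions introduced here, not part of the original exercise.

```python
from fractions import Fraction

def same_colour_probability(counts):
    """Probability that two balls drawn without replacement share a colour."""
    total = sum(counts.values())
    prob = Fraction(0)
    for n in counts.values():
        # P(first ball has this colour) * P(second has it too | first did)
        prob += Fraction(n, total) * Fraction(n - 1, total - 1)
    return prob

urn = {"red": 4, "white": 2, "green": 3}   # urn from question 4
p_same = same_colour_probability(urn)
print(p_same, round(float(p_same), 2))
```

Adding the three mutually exclusive cases (both red, both white, both green) is what justifies the simple sum inside the loop.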
i) .40
ii) .04
iii) .02
iv) .20
Probability that a bulb produced by machine B is defective = P(D/B) = .04
i) .08
ii) .028
iii) .029
iv) .027
Hint: Probability that a bulb produced by machine A is defective = P(D/A) = .01
Probability that a bulb produced by machine B is defective = P(D/B) = .04
Probability that a bulb produced by machine C is defective = P(D/C) = .03
7. Suppose we again go back to the study shown in the table provided in question 1: Can
you say that ‘taken drug’ and ‘improvement’ are independent events?
Ans. ii) No
8. Suppose you have 5 show pieces, and you want to arrange 3 of them in your show case.
In how many different ways can you arrange them?
i) 5C 3 = 10 ways
iii) 3/5
iv) None
Ans. ii) 5P 3
Hint: This is the case of permutation. Since we want to arrange the show pieces and are
concerned with order, number of ways we can arrange the show pieces =
5P3 = 5! / 2! = 60 ways
9. If a card is drawn from a deck of 52 well shuffled cards, what is the probability that it will
be a queen or a king?
i) 1/52
ii) 1/26
iii) 1/2
iv) 2/13
Ans. iv) 2/13
Hint: There are 4 queens and 4 kings, so 8 of the 52 cards are favourable: 8/52 = 2/13
10. In how many ways can we choose a sub-committee of 5 members from a club
consisting of 10 members?
i) 10P5 = 30240 ways
ii) 10C5 = 252 ways
iii) 5/10
iv) ½
Ans. ii) 10C5
Hint: This is an example of combination where we have to choose 5 members out of 10.
Thus, 10C5 = 252 ways
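The counts quoted in questions 8 and 10 can be checked with Python's standard library (`math.perm` and `math.comb`, available from Python 3.8). This is only a quick verification sketch; the variable names are illustrative.

```python
import math

# Question 8: arrange 3 of 5 show pieces, order matters (permutation)
arrangements = math.perm(5, 3)

# Question 10: choose 5 of 10 members, order does not matter (combination)
committees = math.comb(10, 5)

# Option i) of question 10 counts ordered selections instead
ordered_selections = math.perm(10, 5)

print(arrangements, committees, ordered_selections)
```

Each unordered committee of 5 corresponds to 5! = 120 ordered selections, which is why 30240 / 120 = 252.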
Discrete Random Variables And Probability Distributions
DC-1
Semester-II
Paper-III: Statistical Methods in Economics-I
Lesson: Discrete Random Variables And Probability
Distributions
Lesson Developer: Chandra Goswami
College/Department: Department of Economics, Dyal
Singh College, University of Delhi
TABLE OF CONTENTS
Learning Objectives
1. Random Experiments
2. Random Variables
Practice Questions
Content Developer
Chandra Goswami, Associate Professor, Department of Economics
Dyal Singh College, University of Delhi
Reference
Jay L. Devore: Probability and Statistics for Engineering and the Sciences, Cengage
Learning, 8th edition [Chapter 3]
Learning Objectives:
In this chapter you will learn what a random variable is and the two fundamentally
different types of random variables. You will learn how to arrive at the probability
distributions of discrete random variables and how to represent these graphically, as
well as by summary expressions. This provides the tool for evaluating the
probability that the random variable takes on specific values or a range of values. You
will also learn how the probability distribution can be used to specify a mathematical
model for the population distribution. This will help you to identify the characteristics of
the population. The chapter ends with practice questions so that you can test your
understanding of the chapter contents.
Chapter Outline
1. Random experiments
2. Random variables
3. Probability distributions for discrete random variables
4. Graphical presentations of probability distributions
5. Parameters of a probability distribution
6. The cumulative distribution function for discrete random variables
7. Deriving probability mass function from cumulative distribution function
1. RANDOM EXPERIMENTS
A random or chance experiment is an experiment which yields different possible
outcomes. These outcomes may be qualitative or quantitative. In case of qualitative
outcomes, we observe a specific attribute of the variable. Quantitative outcomes result
when we observe a number describing the attribute of the variable. Until the outcome is
observed there is uncertainty about which particular outcome will be the result of the
experiment. If the experiment is repeated under identical conditions different outcomes
are likely to be observed at each trial.
Example 1.1
If a balanced coin is tossed there are two equally possible (qualitative) outcomes, a head
(H) or a tail (T).
Example 1.2
It is known that wind speed and direction affect the time taken by aircraft to reach their
destination. The three possible outcomes for arrival time on any day are: before time, on
time, or delayed.
Example 1.3
If an unbiased die is tossed it will result in one of six possible outcomes, depending on
which face shows up: 1, 2, 3, 4, 5, or 6.
Example 1.4
If example 1.2 is restated to measure the extent of time delay in the aircraft reaching its
destination, we can denote the possible outcomes as x = 0 for on-time arrival (ie, as per
the scheduled time), x < 0 for before-time arrival (eg, x = −5 indicates
arrival 5 min ahead of the scheduled time), and x > 0 for late arrival (eg, x = 22 represents
arrival 22 min after the scheduled time). We obtain an infinite number of possible
outcomes since time is a continuous variable. Here extent of time delay (in minutes) is
the variable, where x may be negative, zero or positive.
Example 1.5
A bottling plant fills cold drinks in 200 ml bottles for its client. Although the machine is
calibrated to dispense 200 ml per fill, it is noted that the fill amount varies from bottle to
bottle by small amounts. If we denote X = amount filled in a bottle (in ml), since volume
is a continuous variable, the possible values of the variable lie in an interval around x = 200.
The outcomes in examples 1.1 and 1.2 are qualitative, and quantitative in examples 1.3,
1.4 and 1.5. There are a finite number of outcomes in examples 1.1, 1.2 and 1.3, whereas
the number of outcomes is infinite in examples 1.4 and 1.5. In methods of statistical
analysis we often need some numerical aspects of experimental outcomes. The mean, for
instance, is a numerical function of the outcomes.
2. RANDOM VARIABLES
If the exhaustive set of all possible outcomes of a random experiment is known then
probabilities of occurrence of the different outcomes can be assigned. The concept of a
Definition 1
For a given sample space S of some experiment, a random variable is any rule that
associates a number with each outcome in S
A random variable (rv) is thus a function defined over the elements of S. The domain of
the rv is the sample space and the range is a set of real numbers. A random variable is,
therefore, a variable that takes on numerical values determined by the outcome of a
random experiment. Thus, the value of the random variable will vary according to the
observed outcome of a random experiment. In general, random variables are functions
that associate numbers with some specific attribute of an experimental outcome. Random
variables will be denoted by uppercase letters, such as X and Y, and their values by the
corresponding lowercase letters, such as x and y.
Since the outcomes of a random experiment can be designated as a random variable, any
numerical function of the outcomes is also a random variable. It is random since its value
depends on which particular outcomes are observed. It is a variable since different
numerical values are possible. We can, therefore, assign probabilities to its possible
values. Therefore we can say that a random variable is a variable which can take one of
the different possible values in the sample space with an assigned probability. If X
denotes the rv and s the sample outcome, then X(s) = q where q is a real number.
Example 2.1
If X is a rv with m possible values x1, x2, x3,….xm and Y is a rv with n possible values
y1, y2,….yn then the linear function X + Y is also a random variable since x + y = xi + yj
where i = 1,2,….,m, and j = 1, 2,…..,n.
Exercise 1
A balanced coin and a fair die are tossed simultaneously. List the different possible
outcomes.
Solution:
Two possible outcomes of the coin are head (H) or tail (T). Six possible outcomes of the
die are 1, 2, 3, 4, 5, and 6. Since the coin and die are tossed simultaneously the possible
outcomes are as follows:
(H,1); (H,2); (H,3); (H,4); (H,5); (H,6); (T,1); (T,2); (T,3); (T,4); (T,5); (T,6)
Exercise 2
In Exercise 1, if head is denoted by 1 and tail by 0 so that x = 0, 1 and y = 1, 2, 3, 4, 5, 6,
then list the different possible outcomes for the linear function X + Y.
Solution:
x + y = 1, 2, 3, 4, 5, 6, 2, 3, 4, 5, 6, 7 according to the combinations listed in exercise 1
Exercise 3
Assigning appropriate probabilities to the values of the random variables X and Y in
exercise 2, determine the probabilities of x + y.
Solution:
Since the coin is balanced, P(x=0) = p(0) = ½ and P(x=1) = p(1) = ½. Similarly, for the
fair die, p(1) = p(2) = p(3) = p(4) = p(5) = p(6) = 1/6.
Since X and Y are independent, there are 12 possible equally likely outcomes. Therefore,
p(1) = p(7) = 1/12 and p(2) = p(3) = p(4) = p(5) = p(6) = 2/12
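The result of exercise 3 can be reproduced by enumerating every (coin, die) pair and adding up the probabilities of pairs with the same total. The following is a minimal sketch that assumes X and Y are independent, as stated in the solution; the helper name `pmf_of_sum` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}          # X: tail = 0, head = 1
die = {face: Fraction(1, 6) for face in range(1, 7)}   # Y: faces 1 to 6

def pmf_of_sum(pmf_x, pmf_y):
    """pmf of X + Y for independent discrete rvs given as {value: probability}."""
    result = defaultdict(Fraction)
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        result[x + y] += px * py   # independence: P(X = x, Y = y) = p(x) p(y)
    return dict(result)

pmf_sum = pmf_of_sum(coin, die)
for total in sorted(pmf_sum):
    print(total, pmf_sum[total])
```

Exact fractions are used so that the check that the probabilities sum to 1 is not affected by rounding.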
Definition 2
Any random variable whose only possible values are 0 and 1 is called a Bernoulli
random variable
If an unbiased coin is tossed repeatedly, on each toss there are only two possible
outcomes, so the outcome of each toss (coded as 1 for a head and 0 for a tail) is a
Bernoulli rv. If an experiment can result in only two possible outcomes – success or
failure – in each trial, coding success as 1 and failure as 0 gives a Bernoulli random variable.
There are fundamentally two different types of random variables: discrete random
variables and continuous random variables. The distinction between discrete and
continuous random variables lies in the number of possible values the rv can take. If the
rv can have a finite number or a countably infinite number of possible values it is a
discrete rv. If, on the other hand, the outcome can be any real number in a given interval,
the number of possibilities is uncountably infinite, and the rv is said to be continuous.
Definition 3
A discrete random variable is a rv whose possible values either constitute a finite set or
else can be listed in an infinite sequence which is “countably” infinite, where there is a
first element, a second element, a third element and so on.
Examples 1.1, 1.2 and 1.3 have possible values which constitute a finite set. The same
holds for exercise 1 and exercise 2. In all these cases the possible outcomes can be
counted.
Example 2.2
A new company wishes to establish its brand image. For this purpose it runs a series of
weekly newspaper advertisements until sales of its products reach the target level.
Reaching the level of target sales is considered a success. Success may be achieved in 1
week or 2 weeks or 3 weeks and so on. If we denote success by S and failure by F then
the sample space is S = {S, FS, FFS, FFFS, …}. We can define the random variable
X = number of weeks before the advertising campaign ends. Then, X(S) = 1, X(FS) = 2,
X(FFS) = 3, X(FFFS) =4, and so on. Any positive integer is a possible value of X. Thus,
the set of possible values of the rv X is countably infinite.
Definition 4
A random variable is continuous if both the following conditions apply
1. Its set of possible values consists either of all numbers in an interval on the
number line or all numbers in a disjoint union of such intervals.
2. No possible value of the random variable has a positive probability.
Condition 1 implies that there is no way to create a listing of all the infinite number of
possible values of the variable. Condition 2 implies that intervals of values have positive
probability. As the width of the interval diminishes, probability of the interval decreases.
In the limit, probability of the interval is zero as the width of the interval reduces to zero.
Example 2.3
The university team is scheduled to visit any minute during a three hour long
examination starting at 9am. We may want to find the probability that the team visits at a
given time or we may be interested in the probability that the visit takes place during a
given time interval. The sample space is from 0 to 180 minutes. The probability that the
team visits during an interval of length c minutes is c/180.
Variables such as time, height, distance, temperature, area, volume, weight, etc that
require measurement are continuous. In practice, however, limitations of measurement
instruments often do not allow measurement on a continuous scale. Yet it is useful to
study models of continuous variables as they often reflect real world situations.
various possible values of the rv X. The probability assigned to any value x of the rv will
be denoted by p(x).
Definition 5
The probability distribution or probability mass function (pmf) of a discrete random
variable is defined for every number x by p(x) = P(X = x) for each x within the range of
X.
Based on the postulates of probability, a function can serve as the pmf of X if and only if
p(x) satisfies the following two conditions
1. 0 ≤ p(x) ≤ 1 for each value within its domain
2. Σ p(x) = 1, where the summation is over all values x within its domain.
The first condition states that probability cannot be negative or exceed 1. The second
condition follows from the fact that all possible values of X are mutually exclusive and
collectively exhaustive so that the sum of the probabilities must equal 1. Thus, any
function which satisfies both properties can serve as the pmf of a discrete random
variable. Examples of pmf are Bernoulli Distribution, discrete Uniform Distribution,
Binomial Distribution, Negative Binomial Distribution, Hypergeometric Distribution and
Poisson Distribution.
Note that a function which satisfies the two conditions for one set of values of X may not
do so for another set of values. In the latter case the function cannot serve as a pmf of X.
To test whether a function is a pmf we need to check whether both conditions are
satisfied for the given X values.
Exercise 4
A balanced coin is tossed three times. Let X denote the rv that is defined as the total
number of heads. List the elements of the sample space and obtain the probability
distribution of the total number of heads observed. Find a formula for the pmf of the total
number of heads observed in three tosses of a fair coin.
Solution:
Denoting H = head and T = tail, elements of the sample space are
TTT, TTH, THT, HTT, THH, HTH, HHT, HHH.
Let the rv X = total number of heads observed in 3 tosses of a balanced coin. For a
balanced coin a head and a tail are equally likely outcomes so that P(H) = P(T) = ½. It
can be assumed that the outcome of any toss is independent of the outcomes of the other
two tosses of the coin. Then,
P(TTT) = P(X = 0) = p(0) = (1/2)(1/2)(1/2) = 1/8
P(TTH or THT or HTT) = P(X = 1) = p(1) = 3/8
P(THH or HHT or HTH) = P(X = 2) = p(2) = 3/8
P(HHH) = P(X = 3) = p(3) = 1/8
The probability distribution or pmf of X is given in the following table:
x 0 1 2 3
p(x) 1/8 3/8 3/8 1/8
Both conditions for a pmf are satisfied since 0 ≤ p(x) ≤ 1 for x = 0, 1, 2 and 3, and
Σ p(x) = 1, where the sum is over x = 0, 1, 2, 3.
Based on the probabilities we observe that numerators of the four fractions 1/8, 3/8, 3/8
and 1/8 are the binomial coefficients $\binom{3}{0}$, $\binom{3}{1}$, $\binom{3}{2}$, $\binom{3}{3}$. The formula for the pmf can,
therefore, be written as
$$p(x) = \frac{\binom{3}{x}}{8} \quad \text{for } x = 0, 1, 2 \text{ and } 3.$$
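As a check on exercise 4, the pmf obtained by listing the sample space can be compared with the binomial-coefficient formula. The sketch below assumes a balanced coin and independent tosses, as in the solution; the two function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb

def heads_pmf_by_enumeration(n):
    """pmf of the number of heads in n tosses, by listing all 2**n outcomes."""
    outcomes = list(product("HT", repeat=n))
    counts = {}
    for outcome in outcomes:
        heads = outcome.count("H")
        counts[heads] = counts.get(heads, 0) + 1
    # every outcome is equally likely for a balanced coin
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

def heads_pmf_by_formula(n):
    """pmf of the number of heads from p(x) = C(n, x) / 2**n."""
    return {x: Fraction(comb(n, x), 2 ** n) for x in range(n + 1)}

print(heads_pmf_by_enumeration(3))
print(heads_pmf_by_formula(3))
```

Calling either function with n = 4 gives the distribution asked for in the four-toss version of the problem.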
Exercise 5
A computer shop sells desktops, laptops, notebooks and tablets. A prospective buyer
enters the shop. The random variable can take five possible values. X = 0 if no purchase
is made, X = 1 if a tablet is purchased, X = 2 if a notebook is purchased, X = 3 if a laptop
is bought, and X = 4 if a desktop is bought. If 40% of buyers purchase a tablet, 35%
buyers opt for a notebook, 20% a laptop and 5% a desktop, what is the probability
distribution of X?
Solution:
The pmf is as follows
x 0 1 2 3 4
p(x) 0 0.40 0.35 0.20 0.05
Exercise 6
A balanced coin is tossed four times. Use the formula derived in exercise 4 to obtain the
pmf of X = total number of heads in four tosses of the coin.
Solution:
Total number of possible outcomes is 2^4 = 16 as the result of each toss is independent of
the remaining three tosses. Using the formula $p(x) = \binom{4}{x}/16$, the pmf is as follows:
x 0 1 2 3 4
p(x) 1/16 4/16 6/16 4/16 1/16
Exercise 7
Check whether the function given by f(x) = (x + 4)/30 for x = 0, 1, 2, 3, 4 can serve as the
probability distribution of a discrete random variable.
Solution:
For given values of x the value of the function is as follows:
f(0) = 4/30, f(1) = 5/30, f(2) = 6/30, f(3) = 7/30, f(4) = 8/30
Each of the above values is a positive fraction less than 1. Hence the first condition for a
pmf is satisfied. The sum of all the values of f(x), Σf(x) = (4 + 5 + 6 + 7 + 8)/30 = 1 so
that the second condition is also satisfied. Since both the required conditions for a pmf
are satisfied, the given function can serve as a pmf for a rv having the values 0,
1, 2, 3, and 4.
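The two-condition test used in exercise 7 can be written as a small routine. This is a sketch under the assumption that the candidate function returns exact fractions; `is_valid_pmf` and `candidate` are names chosen here for illustration.

```python
from fractions import Fraction

def is_valid_pmf(f, support):
    """Check both pmf conditions for the function f on the given set of values."""
    values = [f(x) for x in support]
    each_in_unit_interval = all(0 <= v <= 1 for v in values)   # condition 1
    sums_to_one = sum(values) == 1                             # condition 2
    return each_in_unit_interval and sums_to_one

def candidate(x):
    return Fraction(x + 4, 30)

print(is_valid_pmf(candidate, range(0, 5)))   # x = 0, 1, 2, 3, 4
print(is_valid_pmf(candidate, range(0, 4)))   # x = 0, 1, 2, 3 only
```

The second call illustrates the earlier remark that the same function may satisfy the conditions for one set of values of X and fail for another: over x = 0, 1, 2, 3 the values sum to 22/30 rather than 1.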
is distributed at various points on the number line. The pmf can be presented graphically
in probability histograms.
For a probability histogram, above each x with p(x) > 0 construct a rectangle centered at
x. The height of each rectangle is proportional to p(x). The area of the rectangle equals
p(xi) for X = xi. If the base of each rectangle is of unit width then the height will be equal
to p(xi) for X = xi.
Example 4.1
The pmf of exercise 5 is
x 0 1 2 3 4
p(x) 0 0.40 0.35 0.20 0.05
For all x > 4 , p(x) = 0. The probability histogram is drawn by representing 1 with the
interval 0.5 to 1.5, 2 with the interval 1.5 to 2.5, 3 with the interval 2.5 to 3.5, and so on.
The line graph and bar chart are also referred to as histograms. The line graph is drawn
by drawing lines of height p(x) for corresponding x values. The bar chart is drawn with
each rectangle centered at the x value with a height equal to the probability of the
corresponding value of the rv. The line graph and bar chart for the pmf of ex. 5 are
illustrated in Fig 2 and Fig 3 respectively.
Definition 6
Suppose p(x) depends on a quantity that can be assigned any one of a number of possible
values, with each different value determining a different probability distribution. Such a
quantity is called a parameter of the distribution.
The collection of all probability distributions for different values of the parameter is
called a family of probability distributions.
Example 5.1
We consider a random experiment that can give rise to just two possible mutually
exclusive and exhaustive outcomes 0 and 1. Then p(0) + p(1) = 1. Such a rv is called a
Bernoulli random variable. If we select α such that 0 < α < 1, the pmf of the Bernoulli rv
can then be expressed as
$$p(x) = \begin{cases} 1 - \alpha & x = 0 \\ \alpha & x = 1 \\ 0 & \text{otherwise} \end{cases}$$
For each of the possible values of α in the interval between 0 and 1, we obtain a different
probability distribution. We thus obtain a family of Bernoulli distributions with each pmf
determined by a particular value of α. Since the pmf depends on the particular value of α
we often write the pmf of the Bernoulli distribution as p(x; α) rather than just p(x). The
quantity α in the Bernoulli pmf is a parameter. The value of the parameter α distinguishes
one Bernoulli distribution from another. If α can take any value in the interval 0 to 1, we
obtain an infinite number of Bernoulli distributions, each for a different value of α.
The value of the parameter may be unknown. If the population size is very large it may
not be possible to examine all the population values to ascertain the value of α. We can
then use sample data to infer about the parameter value α, where the sample is a
representative subset of the population.
Example 5.2
If the discrete rv X can take any value x1, x2, x3, ……… xn with equal probability we
have a discrete Uniform Distribution. We can denote the minimum value x1 = α, and the
maximum value xn = β. Then the pmf of the Uniform Distribution can be expressed as
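$$p(x) = \begin{cases} \dfrac{1}{n} & x = x_1, x_2, \dots, x_n \ (\alpha \le x \le \beta) \\ 0 & \text{otherwise} \end{cases}$$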
We obtain a family of uniform distributions with the pmf of each distribution determined
by a particular set of values for α and β. The pmf can be denoted by p(x; α, β), where α
and β are the parameters of the distribution. For different combinations of values of α
and β we obtain different uniform distributions.
Definition 7
The cumulative distribution function F(x) of a discrete random variable X with pmf p(x)
is defined for every number x by
$$F(x) = P(X \le x) = \sum_{y:\, y \le x} p(y)$$
Thus, cdf is obtained by summing the pmf p(x) over all possible values of X = y
satisfying y ≤ x. We use F(x) to calculate the probability that the observed value of X
does not exceed x. It follows that P(X < x) ≤ P(X ≤ x) since the value x is included in
P(X ≤ x) and not in P(X < x). Only if P(X = x) = 0 then P(X < x) = P(X ≤ x). In all other
cases where P(X = x) > 0 the strict inequality holds, ie, P(X < x) < P(X ≤ x).
The first property states that F(x) is non-negative. F(x) = 0 for any value of X that is less
than the smallest permissible X value of the pmf since p(x) = 0 for all such values. It
follows that when all possible values of X have been considered F(x) = 1. For higher
values of X we again have p(x) = 0 so that F(x) remains unchanged at 1. The second
property implies that if p(b) = 0 then F(a) = F(b). Otherwise F(a) < F(b) when a < b.
The graph of F(x) is a step function. If X is a discrete rv whose set of possible values are
x1, x2, ……..., where x1 < x2 < x3 < ……….., the value of F(x) is constant in the interval
between two successive values xi-1 and xi, and then increases by p(xi) at xi. F(x) again
remains flat between xi and xi+1 when it jumps up (takes a step) by p(xi+1) at xi+1. This is
illustrated in Figure 4.
Since F(xi-1) < F(xi) and F(xi) < F(xi+1), at all points of discontinuity the cdf takes on the
greater of the two values. This is indicated by heavy dots in Figure 4. It can be seen that
as x increases, the cdf will change values only at those points that can be taken by the rv
with positive probability.
Example 6.1
Using the pmf in exercise 5,
x 0 1 2 3 4
p(x) 0 0.40 0.35 0.20 0.05
F(0) = P(X = 0) = 0
F(1) = P(X= 0 or 1) = 0 + 0.4 = 0.4
F(2) = P(X= 0 or 1 or 2) = 0.4 + 0.35 = 0.75
F(3) = P(X= 0 or 1 or 2 or 3) = 0.75 + 0.20 = 0.95
F(4) = P(X= 0 or 1 or 2 or 3 or 4) = 0.95 + 0.05 = 1
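The running totals in example 6.1 are a cumulative sum of the pmf taken in increasing order of x. A minimal sketch follows, with the pmf of exercise 5 entered as exact fractions; `cdf_from_pmf` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import accumulate

# pmf of exercise 5: p(0) = 0, p(1) = 0.40, p(2) = 0.35, p(3) = 0.20, p(4) = 0.05
pmf = {
    0: Fraction(0),
    1: Fraction(40, 100),
    2: Fraction(35, 100),
    3: Fraction(20, 100),
    4: Fraction(5, 100),
}

def cdf_from_pmf(pmf):
    """Return {x: F(x)} where F(x) = P(X <= x), for each possible value x."""
    xs = sorted(pmf)
    running_totals = accumulate(pmf[x] for x in xs)
    return dict(zip(xs, running_totals))

cdf = cdf_from_pmf(pmf)
for x, value in cdf.items():
    print(x, float(value))
```

The last cumulative value must equal 1, which gives a quick check that the pmf was entered correctly.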
Example 7.1
Given the cdf obtained in example 6.1
$$F(x) = \begin{cases} 0 & x < 1 \\ 0.40 & 1 \le x < 2 \\ 0.75 & 2 \le x < 3 \\ 0.95 & 3 \le x < 4 \\ 1 & 4 \le x \end{cases}$$
we get
p(0) = 0
p(1) = 0.4 -0 = 0.4
p(2) = 0.75 – 0.4 = 0.35
p(3) = 0.95 – 0.75 = 0.20
p(4) = 1 – 0.95 = 0.05
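The subtraction in example 7.1 takes the size of each jump of the cdf. It can be sketched as follows, assuming the cdf is given at each possible value of X in increasing order; `pmf_from_cdf` is an illustrative name.

```python
from fractions import Fraction

# cdf values at the possible values of X, as in example 6.1
cdf_values = {
    0: Fraction(0),
    1: Fraction(40, 100),
    2: Fraction(75, 100),
    3: Fraction(95, 100),
    4: Fraction(1),
}

def pmf_from_cdf(cdf):
    """Recover p(x) as the jump of F at x: p(x) = F(x) - F(previous value)."""
    pmf = {}
    previous = Fraction(0)   # F is 0 below the smallest possible value
    for x in sorted(cdf):
        pmf[x] = cdf[x] - previous
        previous = cdf[x]
    return pmf

recovered = pmf_from_cdf(cdf_values)
for x, p in recovered.items():
    print(x, float(p))
```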
To obtain the probability that value of X falls in the interval [a, b] such that a ≤ b, where
both a and b are included in the interval, we have to compute P(a ≤ X ≤ b) = F(b) – F(a-)
where a- denotes the largest possible X value that is strictly less than a. If the only
possible values of X are integers and a and b are both integers, then
P(a ≤ X ≤ b) = P(X = a or a + 1 or a + 2 or … or b) = F(b) – F(a – 1)
This principle can be used to find the probability that X takes the value a. By setting
b = a we obtain P(X = a) = p(a) = F(a) – F(a – 1).
This method is used to derive the pmf from the cdf.
For integer-valued X we can similarly compute P(a < X < b) = F(b – 1) – F(a) where a and b
are not included in the interval.
Note that F(b) – F(a) gives us P(a < X ≤ b) where b is included in the interval but a is not
included.
Example 7.2
Given the cdf obtained in example 6.1,
P(1 ≤ X ≤ 3) = F(3) – F(0) = 0.95 – 0 = 0.95
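For an integer-valued rv the three kinds of interval (both endpoints included, neither included, only the upper one included) use different cdf differences. The sketch below evaluates the step-function cdf of example 6.1 with `bisect`; the function names are assumptions made for illustration.

```python
from bisect import bisect_right
from fractions import Fraction

points = [0, 1, 2, 3, 4]   # possible values of X
cumulative = [Fraction(0), Fraction(40, 100), Fraction(75, 100),
              Fraction(95, 100), Fraction(1)]

def F(x):
    """Step-function cdf: F(x) = P(X <= x) for any real x."""
    i = bisect_right(points, x)   # number of possible values <= x
    return cumulative[i - 1] if i > 0 else Fraction(0)

def prob_closed(a, b):
    """P(a <= X <= b) for integers a <= b: F(b) - F(a - 1)."""
    return F(b) - F(a - 1)

def prob_open(a, b):
    """P(a < X < b) for integers a < b: F(b - 1) - F(a)."""
    return F(b - 1) - F(a)

def prob_half_open(a, b):
    """P(a < X <= b): F(b) - F(a)."""
    return F(b) - F(a)

print(float(prob_closed(1, 3)), float(prob_open(1, 3)), float(prob_half_open(1, 3)))
```

With a = 1 and b = 3 the three probabilities differ because both endpoints carry positive probability.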
Exercise 8
A study of number of delayed flights in an hour (X) at an airport due to fog in winter
revealed the following probability distribution of the rv X.
x 0 1 2 3 4 5 6
PRACTICE QUESTIONS
1. Suppose one die has spots 1, 2, 2, 3, 3, 4 and a second die has spots 1, 3, 4, 5, 6,
8. If both dice are rolled, list the sample space (all possible outcomes). Let the rv
X = total number of spots showing. What is the pmf of X? Show that this pmf is
the same as that for two normal dice, each having 1, 2, 3, 4, 5, 6 spots.
3. Urn 1 and urn 2 each have two red balls and two white balls. Two balls are drawn
simultaneously from each urn. Let
X1 = number of red balls in the sample from first urn, and
X2 = number of red balls in the sample from the second urn.
Find the pmf of X1 + X2
4. An urn contains four balls numbered 1, 2, 3, and 4. If two balls are drawn from
the urn at random and Z is the sum of the numbers on the two balls, find
(a) the probability distribution of Z and draw the histogram
(b) the cdf of Z and draw its graph
5. A coin is biased so that heads is twice as likely as tails. For three independent
tosses of the coin, find
(a) the probability distribution of X, the total number of heads
(b) the probability of getting at most two heads, using the cdf of X
(c) P(1 < X < 3) and P(X > 2), using the cdf
6. The amount of coffee (in grams) in a 230-gm jar filled by a certain machine is a
random variable whose probability density is given by
$$f(x) = \begin{cases} 0 & x \le 227.5 \\ \dfrac{1}{5} & 227.5 < x < 232.5 \\ 0 & x \ge 232.5 \end{cases}$$
Find the probabilities that a 230-gram jar filled by this machine will contain
(a) at most 228.65 gm of coffee
(b) anywhere from 229.34 to 231.66 gm of coffee
(c) at least 229.85 gm of coffee
8. Given the following cdf, derive the pmf of Y and draw the
(a) histogram of the pmf
(b) graph of the cdf
$$F(y) = \begin{cases} 0 & y < 1 \\ 0.05 & 1 \le y < 2 \\ 0.15 & 2 \le y < 4 \\ 0.50 & 4 \le y < 8 \\ 0.90 & 8 \le y < 16 \\ 1 & 16 \le y \end{cases}$$
(d) Could p(y) = y²/50 for y = 1, …, 5 be the pmf of Y?
x 1 3 4 6 12
p(x) 0.30 0.10 0.05 0.15 0.40
(i) Derive the cumulative distribution function (cdf) of X and draw the graph
of this cdf
(ii) Using the cdf, compute P(3 ≤ X < 6), P(3 < X < 6), and P(4 ≤ X).
DC-1
Semester-II
Paper-III: Statistical Methods in Economics-I
Lesson: Continuous random variables
And probability distributions
Lesson Developer: Chandra Goswami
College/Department: Department of Economics,
Dyal Singh College, University of Delhi
TABLE OF CONTENTS
Learning Objectives
Reference
Jay L. Devore: Probability and Statistics for Engineering and the Sciences,
Cengage Learning, 8th edition [Chapter 4]
Learning objectives:
In this chapter you will learn what is meant by a continuous random variable. You
will learn how to arrive at the probability distribution of such types of random
variables and how to represent these graphically, as well as presentation by
summary expressions. You will then learn how to derive cumulative distribution
functions from the probability distribution function. You will also be able to derive
the probability densities from the cumulative distribution function. If either the
probability density function or the cumulative distribution function is known then
you will be able to evaluate the probability that the random variable takes on
specific values or a range of values. You will also learn how to identify the
characteristics of the population distribution like the shape of the distribution.
Chapter Outline
1. Continuous random variables
2. Probability distributions for continuous random variables
3. Cumulative distribution functions for continuous random variables
4. Deriving probability densities from cumulative distribution functions
5. Percentiles of a continuous distribution
6. Shape of the probability distribution
Example 1.1
Students of a college are given an objective type test. The proportion of correct
answers that a student scores in the test is a continuous variable which can range
from 0 to 1. Measured as a percentage, the outcome varies from 0 to 100
percent.
Example 1.2
A student travels to college by metro. Trains run every 4 minutes in the
morning. If the student reaches the platform as one train is departing she will
have to wait for 4 minutes till the next train enters the station. If she reaches just
as one train enters the station then she will have to wait 0 minutes to board the
train. If she reaches after the earlier train has left and the next train is yet to
arrive, she will have to wait for a time period between 0 and 4 minutes. Waiting
time is a continuous variable with a minimum of 0 minutes and a maximum of 4
minutes.
Example 1.3
The daily consumption of water (in liters) by an individual at home varies from
day to day through any given year. It depends on various factors like amount of
time spent at home, weather conditions, time of year, how much of the time
spent at home is during waking hours, etc. Daily water consumption is a
continuous variable with a minimum value of 0 liters.
Definition 1
A random variable is continuous if both the following conditions apply
1. Its set of possible values consists either of all numbers in an interval on
the number line or all numbers in a disjoint union of such intervals.
2. No possible value of the random variable has a positive probability.
Condition 1 implies that there is no way to create a listing of all the infinite
number of possible values of the variable. Condition 2 implies that intervals of
values have positive probability. As the width of the interval diminishes,
probability of the interval decreases. In the limit, probability of the interval is zero
as the width of the interval reduces to zero.
Example 1.4
The university team is scheduled to visit any minute during a three hour long
examination starting at 9am. We may want to find the probability that the team
visits at a given time or we may be interested in the probability that the visit
takes place during a given time interval. The sample space is from 0 to 180
minutes. The probability that the team visits during an interval of length c
minutes is c/180.
This assignment of probabilities applies only to intervals on the measurement axis
from 0 to 180. The probability decreases as the interval becomes shorter. For an
interval of 5 seconds, the probability is computed as 5/10800 ≈ 0.000463. As the
length of the interval approaches zero, the probability that the team will visit also
approaches zero. That is why we always assign zero probability for a single point
on the number line. This does not mean that the team will not visit. The team will
visit at some point in the interval from 0 to 180 minutes even though each point
has zero probability.
Variables such as time, height, distance, temperature, area, volume, weight, etc
that require measurement are continuous. In practice, however, limitations of
To derive the probability distribution for a continuous rv let us first begin with a
discrete rv. Let X be a discrete rv which can take integer values such that x1 ≤ X
≤ xn, where x1 and xn are the minimum and maximum values respectively of the
rv X.
If x = x1, x2, …., xn then we can draw a probability histogram with n rectangles.
The area of the rectangle centered at xj is the proportion of the population that
has the value xj, ie, fj/N, where fj is the number of population members with value
xj and N is the population size. Summing over the n values of X we obtain
$$\sum_{i=1}^{n} \frac{f_i}{N} = 1$$
Now we allow X to take one additional value in each interval so that x1’ is midway
between x1 and x2; x2’ is midway between x2 and x3; and so on. Then total
number of x values will be 2n – 1 (instead of 2n, as there are n - 1 intervals).
With measurements of x taken at smaller intervals, the rectangles become
narrower, though the sum of the areas of all rectangles remains one.
Definition 2
Let X be a continuous random variable. Then a probability distribution or
probability density function (pdf) of X is a function f(x) such that for any two
numbers a and b with a ≤ b,
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$
The probability that X takes on a value in the interval [a, b] is the area under the
graph of the density function above the interval [a, b] on the number line.
The first condition requires non-negative values of pdf for any x value. The
second condition requires that area under the entire curve of f(x) should equal
one, ie X values are collectively exhaustive. If all possible values of X are
considered then the second condition will be satisfied. Examples of pdf are the
continuous Uniform Distribution, the Normal Distribution, the Exponential
Distribution, etc.
Unlike the pmf, where we can obtain P(X = c) as the probability that the discrete
rv X takes the value c, the probabilities for a continuous rv are always associated
with intervals. The pdf yields P(X = c) = 0 for any particular value of the rv X.
This follows from the definition of a continuous rv as specified in condition 2 of
definition 1.
This is not the case with discrete random variables. If both a and b are possible
values of the discrete rv X then these probabilities will, in general, all be different.
If a < b, then for the discrete rv X,
P(a ≤ X ≤ b) ≠ P(a < X ≤ b) ≠ P(a ≤ X < b) ≠ P(a < X < b).
Example 2.1
A milk vendor has a refrigerated storage tank of 1000 liters capacity, which is
filled each morning for sale during the day. It is not possible to predict the
amount of milk sold on any particular day. The sale of milk on any day can vary
from 0 lt. to 1000 lt. Past experience shows that any demand in the interval of 0
and 1000 is equally likely. The rv X indicates the sale of milk on a particular day.
The pdf of X is given by the continuous Uniform Distribution
\[ f(x) = \begin{cases} 0.001 & 0 \le x \le 1000 \\ 0 & \text{otherwise} \end{cases} \]
In general, if α and β are the lower and upper limits of the value that the
continuous rv X can take, then the pdf of X is
\[ f(x; \alpha, \beta) = \begin{cases} \dfrac{1}{\beta - \alpha} & \alpha \le x \le \beta \\ 0 & \text{otherwise} \end{cases} \]
The probability of an interval depends only on the width of the interval in case of
the uniform distribution.
In our example, β – α = 1000 so that 1/(β – α) = 0.001. We can use this to obtain
the probability that sale of milk on a particular day is between 200 and 500 liters
as follows:
P(200 < X < 500) = (500 – 200)(0.001) = 0.3
Note that α and β are the parameters of a population of the continuous rv X that
is described by a uniform distribution. We have a family of uniform distributions
for different values of the two parameters. Each distribution is specified by a
particular pair of values of α and β.
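As a quick numerical check of Example 2.1, the interval rule for the uniform distribution can be sketched in Python. The function name `uniform_interval_probability` is an illustrative assumption and not part of the lesson; it simply divides the width of the interval (clipped to [α, β]) by β – α.

```python
def uniform_interval_probability(a, b, alpha, beta):
    """P(a < X < b) for X uniform on [alpha, beta].

    The interval (a, b) is clipped to the support before measuring its width.
    """
    lo = max(a, alpha)
    hi = min(b, beta)
    if hi <= lo:
        return 0.0
    return (hi - lo) / (beta - alpha)


# Milk vendor example: alpha = 0, beta = 1000 litres
p_sale = uniform_interval_probability(200, 500, 0, 1000)
print(round(p_sale, 3))
```

Changing `alpha` and `beta` selects a different member of the uniform family described above.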
Exercise 1
Show that f(x) = 3x² for 0 < x < 1 represents a pdf and calculate P(0.1 < X < 0.5).
Solution
f(x) can represent a pdf if both conditions for a pdf are satisfied, ie, f(x) ≥ 0 and
$\int_{-\infty}^{\infty} f(x)\,dx = 1$.
Since f(x) = 3x² and x² ≥ 0 always, f(x) ≥ 0 for all x values. Therefore,
for 0 < x < 1, f(x) > 0 and the first condition is satisfied.
\[ \int_0^1 3x^2\,dx = \left[\frac{3x^3}{3}\right]_0^1 = 1 - 0 = 1, \]
which satisfies the second condition for a pdf.
Since both conditions are satisfied, f(x) = 3x² represents a pdf for 0 < x < 1.
Now,
\[ P(0.1 < X < 0.5) = \int_{0.1}^{0.5} 3x^2\,dx = (0.5)^3 - (0.1)^3 = 0.125 - 0.001 = 0.124 \]
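The two pdf conditions in Exercise 1 can also be checked numerically. The sketch below uses a simple midpoint rule; the helper name `integrate` and the number of sub-intervals are assumptions chosen for illustration, not something prescribed by the lesson.

```python
def integrate(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def pdf(x):
    """Candidate pdf from Exercise 1: 3x^2 on (0, 1), zero elsewhere."""
    return 3 * x**2 if 0 < x < 1 else 0.0


total_area = integrate(pdf, 0, 1)   # should be close to 1
prob = integrate(pdf, 0.1, 0.5)     # should be close to 0.124
print(round(total_area, 4), round(prob, 4))
```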
Example 2.2
The pdf for a continuous rv is given as
\[ f(x) = \begin{cases} e^{-x} & x > 0 \\ 0 & x \le 0 \end{cases} \]
so that as the x value increases from x = 0, f(x) decreases rapidly or exponentially,
as illustrated in Fig 3.
Now,
\[ P(a < X < b) = \int_a^b e^{-x}\,dx \]
This is the shaded area in figure 3.
If a = 2 and b = 5, then
\[ P(2 < X < 5) = \int_2^5 e^{-x}\,dx = \left[-e^{-x}\right]_2^5 = -(0.006738 - 0.135335) = 0.128597 \approx 0.13 \]
Therefore, 13 percent of the area under the curve of f(x) = e^{-x} lies above the
measurement axis in the interval [2, 5].
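The calculation in Example 2.2 can be reproduced with the antiderivative –e^{-x}. The following is a minimal sketch; the function name `exp_interval_probability` is an assumption made for illustration.

```python
import math


def exp_interval_probability(a, b):
    """P(a < X < b) for the pdf f(x) = e^(-x), x > 0, via the antiderivative -e^(-x)."""
    a = max(a, 0.0)          # the density is zero to the left of 0
    if b <= a:
        return 0.0
    return math.exp(-a) - math.exp(-b)


p_2_to_5 = exp_interval_probability(2, 5)
p_above_1 = exp_interval_probability(1, math.inf)   # upper tail P(X > 1)
print(round(p_2_to_5, 6), round(p_above_1, 3))
```

The second call evaluates an upper-tail probability by passing `math.inf` as the upper limit, since e^{-x} tends to 0 as x grows.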
Exercise 2
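Show that f(x) = e^{-x} for 0 < x < ∞ represents a pdf, and compute the probability
that X > 1.
Solution
f(x) = e^{-x} would represent a pdf if f(x) ≥ 0 and $\int_0^{\infty} f(x)\,dx = 1$ for 0 < x < ∞.
Since e^{-x} > 0 for every x, the first condition is satisfied.
\[ \int_0^{\infty} f(x)\,dx = \int_0^{\infty} e^{-x}\,dx = \left[-e^{-x}\right]_0^{\infty} = -[0 - 1] = 1 \]
\[ P(X > 1) = \int_1^{\infty} e^{-x}\,dx = -[0 - e^{-1}] = e^{-1} = 0.368 \]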
Exercise 3
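The pdf of the rv X is given by
\[ f(x) = \begin{cases} \dfrac{k}{\sqrt{x}} & 0 < x < 4 \\ 0 & \text{otherwise} \end{cases} \]
Find (a) the value of k, and (b) P(X > 1)
Solution
(a) Given that f(x) is a pdf we have $\int_0^4 \frac{k}{\sqrt{x}}\,dx = 1$.
Now
\[ \int_0^4 \frac{k}{\sqrt{x}}\,dx = k\left[\frac{x^{1/2}}{1/2}\right]_0^4 = 2k[2 - 0] = 4k \]
Equating 4k and 1 we get k = 1/4, so that f(x) = 1/(4√x) for 0 < x < 4.
(b)
\[ P(X > 1) = \int_1^4 \frac{1}{4\sqrt{x}}\,dx = \left[\frac{2\sqrt{x}}{4}\right]_1^4 = 1 - \frac{1}{2} = \frac{1}{2} = 0.5 \]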
Exercise 4
If the continuous random variable X can take only non-negative values and has
the density function f(x) = e^{2x} for x > 0, and 0 otherwise, what is the maximum
value of X?
Solution
If f(x) is a density function then the total area under it must equal one. Since
e^{2x} increases with x, this requires X to have a finite maximum value x such that
\[ \int_0^x e^{2y}\,dy = 1 \]
\[ \int_0^x e^{2y}\,dy = \left[\frac{e^{2y}}{2}\right]_0^x = \frac{e^{2x} - 1}{2} = 1 \;\Rightarrow\; e^{2x} = 3 \]
Therefore, 2x = ln 3 = 1.0986, so that x = 0.549
Hence, f(x) will be a density function for 0 < x < 0.549. Maximum value of X is
0.549
3 CUMULATIVE DISTRIBUTION FUNCTIONS FOR CONTINUOUS
RANDOM VARIABLES
Similar to the case of discrete random variables, there are many problems where
we need to know the probability that a continuous rv X takes a value that does
not exceed a specified value x. For this we need the cumulative distribution
function (cdf) of X.
Definition 3
If X is a continuous random variable then the cumulative distribution function
F(x) for X is defined for every number x by
\[ F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy \]
For each x, F(x) is the area under the density curve to the left of x. As x value
increases, F(x) also increases smoothly until F(x) =1 and then it continues as a
flat line parallel to the measurement axis.
The cdf gives the probability P(X ≤ x) obtained by integrating the pdf f(y)
between -∞ and x. As in the case of the discrete rv, here too F(-∞) = 0, F(∞) = 1, and
F(a) ≤ F(b) when a < b.
Also P(a ≤ X ≤ b) = F(b) – F(a) where a < b.
Since X is a continuous rv,
P(a ≤ X ≤ b) = P(a < X ≤ b) = P(a ≤ X < b) = P(a < X < b) = F(b) – F(a) where a < b.
Example 3.1
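Given the uniform distribution
\[ f(x; A, B) = \begin{cases} \dfrac{1}{B - A} & A \le x \le B \\ 0 & \text{otherwise} \end{cases} \]
the cdf is
\[ F(x) = \int_{-\infty}^{x} f(y)\,dy = \begin{cases} 0 & x < A \\ \dfrac{x - A}{B - A} & A \le x < B \\ 1 & x \ge B \end{cases} \]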
The pdf and cdf of the uniform distribution of a continuous rv are illustrated in Fig
4.
If the graph of the pdf is bell-shaped as in case of the Normal Distribution [fig 5
(a)], then the cdf will be as in Figure 5 (b)
Exercise 5
The density function of the rv X is given by
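\[ f(x) = \begin{cases} 6x(1 - x) & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \]
Obtain the cdf and compute P(X ≤ ½).
Solution
For 0 ≤ x ≤ 1,
\[ F(x) = \int_{-\infty}^{x} f(y)\,dy = \int_0^x 6y(1 - y)\,dy = 6\int_0^x (y - y^2)\,dy = 6\left[\frac{y^2}{2} - \frac{y^3}{3}\right]_{y=0}^{y=x} = 3x^2 - 2x^3 \]
If x < 0, F(x) = 0
If 0 ≤ x < 1, F(x) = 3x² – 2x³
If x = 1, F(x) = 3 – 2 = 1
If x > 1, F(x) = 1 since f(x) = 0 for x > 1
Therefore the cdf can be represented as follows
\[ F(x) = \begin{cases} 0 & x < 0 \\ 3x^2 - 2x^3 & 0 \le x < 1 \\ 1 & x \ge 1 \end{cases} \]
To compute P(X ≤ ½), we substitute x = ½ in F(x) since P(X ≤ ½) = P(X < ½)
for a continuous rv.
F(1/2) = 3(1/4) – 2(1/8) = 3/4 – 1/4 = 1/2 = 0.5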
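The piecewise cdf obtained in Exercise 5 translates directly into a small function. This is an illustrative sketch (the name `cdf` and the second interval are assumptions); it evaluates F(½) and shows how an interval probability follows from F(b) – F(a).

```python
def cdf(x):
    """cdf of the pdf f(x) = 6x(1 - x) on [0, 1]: F(x) = 3x^2 - 2x^3."""
    if x < 0:
        return 0.0
    if x >= 1:
        return 1.0
    return 3 * x**2 - 2 * x**3


p_half = cdf(0.5)                    # P(X <= 1/2)
p_middle = cdf(0.75) - cdf(0.25)     # P(1/4 < X < 3/4) = F(3/4) - F(1/4)
print(p_half, p_middle)
```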
Exercise 6
Show that the expression g(x) = (x + 1)/2 can serve as a cdf for -1 < x < 1.
Solution
If g(x) is to represent a cdf we must show that g(x) = 0 for x ≤ -1, g(x) = 1 for
x ≥ 1, and 0 < g(x) < 1 for the interval -1 < x < 1.
Now, g(-1) = (-1 + 1)/2 = 0, and g(1) = (1 + 1)/2 = 1. Since g(x) is linear with
positive slope ½, it increases steadily from 0 to 1 over the given interval. For
instance, let us select the value x = 0 in the given interval.
Then g(0) = ½ where 0 < ½ < 1.
Since all three requirements are satisfied, g(x) can serve as a cdf for -1 < x < 1
For a given cdf we can obtain the pdf by taking the derivative of F(x). By definition
3, if X is a continuous rv and the value of its probability density at y is f(y) then
the cdf is
\[ F(x) = \int_{-\infty}^{x} f(y)\,dy \]
Hence, f(x) = dF(x)/dx = F'(x) at every x at which the derivative F'(x) exists.
Example 4.1
In example 3.1, for the uniform distribution the cdf is
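\[ F(x) = \begin{cases} 0 & x < A \\ \dfrac{x - A}{B - A} & A \le x < B \\ 1 & x \ge B \end{cases} \]
The graph of F(x) is given in Fig 4(b).
It can be seen that F(x) is differentiable for A < x < B, where F'(x) = 1/(B – A).
At x = A and x = B, F(x) cannot be differentiated.
For x < A, F(x) = 0 and for x > B, F(x) = 1
Hence, F'(x) = f(x) = 0 if x < A, or, if x > B.
\[ f(x) = \begin{cases} 0 & x < A \\ \dfrac{1}{B - A} & A < x < B \\ 0 & x > B \end{cases} \]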
Exercise 7
A continuous rv Y has a cdf given by
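\[ F(y) = \begin{cases} 0 & y < 0 \\ y^2 & 0 \le y < 1 \\ 1 & y \ge 1 \end{cases} \]
Compute P(½ < Y < ¾) in two ways by using (a) the cdf, and (b) the pdf
Solution
(a) P(½ < Y < ¾) = F(¾) – F(½) = 9/16 – 1/4 = 5/16 = 0.3125
(b) Differentiating the cdf, f(y) = F'(y) = 2y for 0 < y < 1, and 0 otherwise.
Then
\[ P\left(\tfrac{1}{2} < Y < \tfrac{3}{4}\right) = \int_{1/2}^{3/4} 2y\,dy = \left[y^2\right]_{1/2}^{3/4} = \frac{9}{16} - \frac{1}{4} = \frac{5}{16} = 0.3125 \]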
Definition 4
Let p be a number between 0 and 1. The (100p)th percentile of the distribution
of a continuous random variable X, denoted by η(p), is defined by
\[ p = F[\eta(p)] = \int_{-\infty}^{\eta(p)} f(y)\,dy \]
Then η(p) is that value on the measurement axis such that 100p percent of the
area under the graph of f(x) lies to the left of η(p) and 100(1-p) percent lies to
the right. This is illustrated in Figure 7
Figure 7 Percentiles
If p = 0.3 then 30% of the area under the graph of f(x) lies to the left of η(0.3)
and 70% to the right of η(0.3). The 30th percentile is denoted by η(0.3) since
p = 0.3
Example 5.1
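For the rv X with the following pdf
\[ f(x) = \begin{cases} \dfrac{1}{8}(x + 1) & 2 \le x \le 4 \\ 0 & \text{otherwise} \end{cases} \]
To find the 75th percentile, η(0.75), we need to first obtain the cdf from the given
pdf. For 2 ≤ x ≤ 4,
\[ F(x) = \int_2^x \frac{1}{8}(y + 1)\,dy \]
Therefore,
\[ F[\eta(p)] = p = \int_2^{\eta(p)} \frac{1}{8}(x + 1)\,dx = \left[\frac{x^2}{16} + \frac{x}{8}\right]_2^{\eta(p)} = \frac{[\eta(p)]^2}{16} + \frac{\eta(p)}{8} - \frac{1}{2} \]
Substituting p = 0.75 = 3/4 we obtain
\[ \frac{3}{4} = \frac{[\eta(p)]^2}{16} + \frac{\eta(p)}{8} - \frac{1}{2} \;\Rightarrow\; \frac{[\eta(p)]^2}{16} + \frac{\eta(p)}{8} - \frac{5}{4} = 0 \]
Rearranging the terms (multiplying through by 16) we get
[η(p)]² + 2η(p) – 20 = 0
Solving the quadratic,
\[ \eta(p) = \frac{-2 \pm \sqrt{4 + 80}}{2} = -1 \pm 4.58 = 3.58 \text{ or } -5.58 \]
Since the minimum value of X is 2 and the maximum is 4, the 75th percentile is 3.58
because that is the only possible value that X can take. The alternative value
-5.58 does not fall in the range of possible values.
Hence, η(0.75) = 3.58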
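Percentiles can also be found numerically by searching for the point where the cdf reaches p. The sketch below applies bisection to the cdf of Example 5.1, F(x) = x²/16 + x/8 – 1/2 on [2, 4]; the function names and the tolerance are assumptions made for illustration.

```python
def cdf(x):
    """cdf of the pdf f(x) = (x + 1)/8 on [2, 4], valid for 2 <= x <= 4."""
    return x**2 / 16 + x / 8 - 0.5


def percentile(p, lo=2.0, hi=4.0, tol=1e-10):
    """Solve F(eta) = p on [lo, hi] by bisection (F is increasing there)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


eta_75 = percentile(0.75)   # 75th percentile
eta_50 = percentile(0.50)   # 50th percentile
print(round(eta_75, 2), round(eta_50, 3))
```

Bisection avoids solving the quadratic by hand and automatically discards roots outside the range of X, because the search is restricted to [2, 4].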
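Definition 5
The median of a continuous distribution, denoted by $\tilde{\mu}$, is the 50th percentile, so
that $\tilde{\mu}$ satisfies F($\tilde{\mu}$) = 0.5.
Example 6.1
The median of the pdf given in example 5.1 is computed by letting p = ½ so that
\[ F(\tilde{\mu}) = \frac{1}{2} = \int_2^{\tilde{\mu}} \frac{1}{8}(x + 1)\,dx = \left[\frac{x^2}{16} + \frac{x}{8}\right]_2^{\tilde{\mu}} = \frac{\tilde{\mu}^2}{16} + \frac{\tilde{\mu}}{8} - \frac{1}{2} \]
Therefore,
\[ \frac{\tilde{\mu}^2}{16} + \frac{\tilde{\mu}}{8} - 1 = 0 \;\Rightarrow\; \tilde{\mu}^2 + 2\tilde{\mu} - 16 = 0 \]
So that
\[ \tilde{\mu} = \frac{-2 \pm \sqrt{4 + 64}}{2} = -1 \pm 4.123 \]
Since 2 ≤ x ≤ 4, $\tilde{\mu}$ = 3.123.
Half the area of the density curve is to the left of 3.123 and the other half is to
the right.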
If a random variable has a symmetric pdf then the median will coincide with the
point of symmetry since half the area under the density curve lies on either side
of the point.
A positively skewed distribution has a long right-hand tail. Similarly, a negatively
skewed distribution has a long left-hand tail. Figure 8 illustrates the three kinds of
distributions.
Example 6.2
The incomes of employees of a company will usually be positively skewed as
there are a large number of low income workers and fewer employees with high
income.
Example 6.3
A well known manufacturing company assures that its product will last a
minimum period of three years. However, due to a defective component sourced
from one of the suppliers, the lifetime of a batch of the product is likely to be
drastically reduced. The distribution will then be negatively skewed.
It can be shown that for a symmetric pdf the median coincides with the mean of
the distribution. If the mean and median have different values then the
distribution is asymmetric, ie, skewed. If mean is less than median the
distribution is skewed to the left or negatively skewed. On the other hand a
distribution is positively skewed or skewed to the right when the mean is greater
than the median.
The mode of the distribution is that value of the random variable at which the
graph of the probability distribution reaches its highest point. If there is only one
peak or “high point” it is a unimodal distribution. If there are two modes it is
called a bimodal distribution. A distribution having more than two modes is said
to be multimodal.
Example 6.4
Suppose that the rv X has pdf
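\[ f(x) = \begin{cases} \dfrac{1}{9}(4 - x^2) & -1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases} \]
Differentiating f(x) with respect to x, we get
f'(x) = -2x/9
Setting f'(x) = 0 we get x = 0
Taking the second derivative, f''(x) = -2/9 < 0, so f(x) attains its maximum at
x = 0. The mode of the distribution is therefore x = 0.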
Comparison of the mode and median can also be used to indicate the shape of
the distribution. For a symmetric unimodal distribution mode = median. In case of a
positively skewed distribution, median > mode, whereas median < mode for a
negatively skewed distribution.
The other characteristics of the distribution like mean and variance can be
computed with the help of mathematical expectations.
PRACTICE QUESTIONS
1. Suppose the rv Y has the pdf f(y) = 4y³ for 0 < y < 1 and 0 otherwise.
Find P(0 < Y < ½).
\[ f(x) = \begin{cases} 0 & x \le 227.5 \\ \dfrac{1}{5} & 227.5 < x < 232.5 \\ 0 & x \ge 232.5 \end{cases} \]
Find the probabilities that a 230-gram jar filled by this machine will
contain
(a) at most 228.65 gm of coffee
(b) anywhere from 229.34 to 231.66 gm of coffee
(c) at least 229.85 gm of coffee
\[ F(x) = \begin{cases} 0 & x < 0 \\ \dfrac{x^2}{4} & 0 \le x < 2 \\ 1 & 2 \le x \end{cases} \]
value α = - 0.015 and maximum value β = 0.015. Find the probabilities that
such an error will
(i) be between – 0.002 and 0.003
(ii) exceed 0.005 in absolute value
DC-1
Semester-II
Paper-III: Statistical Methods in Economics-I
Lesson: Mathematical expectation discrete
Lesson Developer: Chandra Goswami
College/Department: Department of Economics, Dyal
Singh College, University of Delhi
Content Developer
Chandra Goswami, Associate Professor, Department of Economics
Dyal Singh College, University of Delhi
Reference
Jay L. Devore: Probability and Statistics for Engineering and the Sciences,
Cengage Learning, 8th edition [Chapter 3]
Learning objectives:
In this chapter you will learn how to obtain two main characteristics of the probability
distribution of a discrete random variable. The mean of the distribution is the point on the
number line where the distribution is centered and the variance is a measure of the spread of
the distribution. You will learn how to derive these characteristics of distributions of discrete
random variables. You will also learn how to apply the rules of mathematical expectation to
functions of random variables as well as to sums of random variables.
Chapter Outline
1. Expected value of a discrete random variable
2. Expectation of a function of a discrete random variable
3. Rules of mathematical expectation
4. Variance of a discrete random variable
5. Variance of a function of a discrete random variable
6. Covariance and variance of sums of random variables
7. Parameters of the probability mass function
If X is a discrete rv with a set of possible values D and pmf p(x), then we can define the
expected value of X, denoted by E(X) or μX, as follows
Definition 1
\[ E(X) = \mu_X = \sum_{x \in D} x \cdot p(x) \]
If D = {x1, x2, x3, ..., xn}, then E(X) = μX = x1.p(x1) + x2.p(x2) + x3.p(x3) + ... + xn.p(xn)
If it is clear to which X the expected value refers, μ may be used instead of μX.
The expected value of a rv is called its mean value. We can interpret the expected value as
the long-run average value that the rv takes over a large number of repeated trials of an
experiment performed in identical and independent fashion. When the trials are conducted in
this fashion then the outcome of any trial is independent of outcomes of the other trials.
If the value x is observed Nx times in N such trials then, as N becomes
infinitely large, the ratio Nx/N tends to the probability of occurrence of x. In other words,
Nx/N → P(x) as N → ∞. Thus, E(X) is the mean of the probability distribution of the random
variable X.
To compute the population average value of X we need only the possible values of X along
with their respective probabilities. The size of the population is immaterial as long as the pmf
is given. The mean value of X is a weighted average of the possible values of X, where the
weights are the probabilities of these values. The expected value μ may not coincide with any
of the possible values of X. Note that the mean will coincide with the median if the
distribution is symmetric.
The expected value of a rv X is also referred to as the first moment of X about the origin or
simply the first moment. The quantity E(X^n) is similarly the nth moment of X where n > 1.
Example 1.1
Example 1.2
If X has a pmf as follows
x 1 2 3 4
p(x) 0.002 0.146 0.588 0.264
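Definition 1 can be applied to the pmf of Example 1.2 in a few lines of Python. This is a minimal sketch; storing the pmf as a dictionary and the helper name `expected_value` are assumptions made for illustration.

```python
# pmf of Example 1.2: possible values of X mapped to their probabilities
pmf = {1: 0.002, 2: 0.146, 3: 0.588, 4: 0.264}


def expected_value(pmf):
    """E(X) = sum of x * p(x) over all possible values x."""
    return sum(x * p for x, p in pmf.items())


total_probability = sum(pmf.values())   # a valid pmf sums to 1
mean_x = expected_value(pmf)
print(round(total_probability, 3), round(mean_x, 3))
```

The mean works out to about 3.114, which is not itself one of the possible values of X.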
Example 1.3
Let X = number of trials till the first success is observed, and p = the probability of success.
The pmf of X is
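\[ p(x) = \begin{cases} p(1 - p)^{x-1} & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases} \]
The mean of X is E(X) obtained as follows
\[ E(X) = \sum_{x \in D} x \cdot p(x) = \sum_{x=1}^{\infty} x\,p(1 - p)^{x-1} = p\sum_{x=1}^{\infty} x(1 - p)^{x-1} \]
Now
\[ \frac{d}{dp}(1 - p)^x = -x(1 - p)^{x-1} \]
Substituting we get
\[ E(X) = p\sum_{x=1}^{\infty}\left[-\frac{d}{dp}(1 - p)^x\right] = -p\,\frac{d}{dp}\left[\sum_{x=1}^{\infty}(1 - p)^x\right] \]
Since
\[ \sum_{x=0}^{\infty}(1 - p)^x = \frac{1}{1 - (1 - p)} = \frac{1}{p}, \quad\text{therefore}\quad \sum_{x=1}^{\infty}(1 - p)^x = \frac{1}{p} - 1 \]
[This is a convergent geometric series as 0 < p < 1 and hence 0 < (1 – p) < 1]
\[ E(X) = -p\,\frac{d}{dp}\left[\frac{1}{p} - 1\right] = -p\left[-\frac{1}{p^2} - 0\right] = p \cdot \frac{1}{p^2} = \frac{1}{p} \]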
Alternately
x 1 x 1
It is possible to have a probability distribution in which large values of the rv X retain
relatively high probabilities. Such distributions with “heavy tails” may result in a mean
value that is not finite.
Example 1.4
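\[ p(x) = \begin{cases} \dfrac{k}{x^2} & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases} \qquad\text{where k is chosen so that}\quad \sum_{x=1}^{\infty} \frac{k}{x^2} = 1 \]
\[ E(X) = \mu = \sum_{x=1}^{\infty} x \cdot \frac{k}{x^2} = k\sum_{x=1}^{\infty} \frac{1}{x} \]
E(X) is not finite as the harmonic series $\sum_{x=1}^{\infty} 1/x$ is equal to infinity; p(x) does not
decrease fast enough as x increases.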
Exercise 1
Find the expected value of the rv X having the pmf
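\[ p(x) = \begin{cases} \dfrac{|x - 2|}{7} & x = -1, 0, 1, 3 \\ 0 & \text{otherwise} \end{cases} \]
Solution
\[ E(X) = \sum_x x \cdot \frac{|x - 2|}{7} = (-1)\left(\frac{3}{7}\right) + (0)\left(\frac{2}{7}\right) + (1)\left(\frac{1}{7}\right) + (3)\left(\frac{1}{7}\right) = \frac{-3 + 0 + 1 + 3}{7} = \frac{1}{7} \]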
Exercise 2
Find E(X) where X is the outcome when we roll a fair die.
Solution
For a fair die each face of the die is an equally likely outcome.
p(1) = p(2) = p(3) = p(4) = p(5) = p(6) = 1/6
where x = 1, 2, 3, 4, 5, 6 denote the number of dots on the six faces of the die or outcome
when we roll the die.
E(X) = (1)(1/6) + (2)(1/6) + (3)(1/6) + (4)(1/6) + (5)(1/6) + (6)(1/6) = 21/6 = 7/2 = 3.5
Note that we can never observe the outcome to be 3.5 as X can only take the integer values
1, 2, 3, 4, 5, 6.
E(X) is simply the average value of X if we roll a fair die a large number of times.
Exercise 3
An investor is considering three strategies for a $1,000 investment. The estimated probable
returns are:
Strategy 1: A profit of $10,000 with probability 0.15 and a loss of $1,000 with probability
0.85
Strategy 2: A profit of $1,000 with probability 0.50, a profit of $500 with probability 0.30
and a loss of $500 with probability 0.20
Strategy 3: A certain profit of $400.
Which strategy has the highest expected profit?
Solution
Let Xj = returns from investment in jth strategy, where j = 1, 2, 3
Strategy 1: E(X1) = (0.15)(10000) + (0.85)(-1000) = 1500 – 850 = $650
Strategy 2: E(X2) = (0.50)(1000) + (0.30)(500) + (0.20)(-500) = 500 + 150 – 100 = $550
Strategy 3: E(X3) = (1)(400) = $400
Since E(X1) > E(X2) > E(X3) therefore strategy 1 is most profitable.
Exercise 4
A group of 500 persons participate in a lottery with a first prize of 1000, two second prizes
of 500 each and five third prizes of 100 each. If the lottery is equitable, so that each
player’s expectation is zero, then what is the fair price of the lottery ticket?
Solution
Since there are 500 players, the probability of winning the first prize is 1/500, the probability
of winning a second prize is 2/500, and that of winning a third prize is 5/500. Then,
E(X) = (1000)(1/500) + (500)(2/500) + (100)(5/500) + (0)(492/500) = 2 + 2 + 1 = 5
The lottery will be equitable if the ticket price is 5, in which case expected earnings are 5 and
the cost to the player is 5 so that each player’s expectation is E(X) – 5 = 0
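Proposition 1
If the rv X has a set of possible values D and pmf p(x), then the expected value of any
function Y = h(X), denoted by E[h(X)] or μh(X), is
\[ E[h(X)] = \sum_{x \in D} h(x) \cdot p(x) \]
The expected value of Y or μh(X) is thus a weighted average of possible values of h(x) and the
weights are the corresponding probabilities. E[h(X)] is computed in the same way as E(X),
except that we substitute h(X) in place of X. Examples of h(X) are aX + b, e^X, ln X, etc.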
Example 2.1
Let X be the damage incurred (in $) in a certain type of accident during a given year. Possible
X values are 0, 1000, 5000 and 10000, with probabilities 0.8, 0.1, 0.08 and 0.02 respectively.
A particular company offers a $500 deductible policy. The company wishes to fix the
premium amount to be charged so that its expected profit is $100.
Since the company offers $500 deductible policies, the amount to be paid in case of accident
claim will be 0, 500, 4500 and 9500 with respective probabilities 0.8, 0.1, 0.08 and 0.02.
Expected payment for accident claim = (0)(0.8) + (500)(0.1) + (4500)(0.08) + (9500)(0.02)
= 50 + 360 + 190 = $600
Expected profit is the difference between the premium charged and the expected expenditure
on accident insurance claims.
Since expected profit is $100, the company should charge a premium of $700 so that
100 = 700 – 600.
Example 2.2
The pmf for the rv X is as follows
x 4 6 8
p(x) 0.5 0.3 0.2
and
Y = h(X) = 20 + 3X + 0.5X²
The possible Y values are 40, 56 and 76, obtained by substituting x = 4, 6, 8 in h(x)
E(Y) = E[h(X)] = (40)(0.5) + (56)(0.3) + (76)(0.2) = 20 + 16.8 + 15.2 = 52
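The computation in Example 2.2 can be sketched as follows. The helper name `expected_value_of` is a hypothetical choice for illustration; the sketch also evaluates h at E(X) to show that, for a non-linear h, E[h(X)] generally differs from h(E(X)).

```python
pmf = {4: 0.5, 6: 0.3, 8: 0.2}


def h(x):
    """Function of X used in Example 2.2."""
    return 20 + 3 * x + 0.5 * x**2


def expected_value_of(func, pmf):
    """E[func(X)] = sum of func(x) * p(x) over all possible values x."""
    return sum(func(x) * p for x, p in pmf.items())


mean_x = expected_value_of(lambda x: x, pmf)   # E(X)
mean_hx = expected_value_of(h, pmf)            # E[h(X)]
h_of_mean = h(mean_x)                          # h(E(X)), shown for comparison only
print(round(mean_x, 2), round(mean_hx, 2), round(h_of_mean, 2))
```

Here E[h(X)] is 52 while h(E(X)) is about 50.78, so substituting the mean into h is not a shortcut for computing the expectation of h(X).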
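Proposition 2
E[h(X)] = E(aX + b) = a.E(X) + b
ie, μaX+b = aμX + b
Proof
\[ E(aX + b) = \sum_{x \in D}(ax + b)\,p(x) = a\sum_{x \in D} x\,p(x) + b\sum_{x \in D} p(x) = aE(X) + b \]
since Σp(x) = 1.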
Rule 1
If b = 0, then for any constant a, E(aX) = aE(X)
Multiplication of X by the constant a changes the unit of measurement. The rule says that
expected value in the new units equals the expected value in the old units multiplied by the
factor a.
Rule 2
If a = 1, then for any constant b, E(X + b) = E(X) + b
If a constant is added to each possible value of X, there is a change in origin. Then the
expected value will be shifted by the same amount b.
Rule 3
If a = 0, then for any constant b, E(b) = Σb.p(x) = b Σp(x) = b
That is, the expected value of a constant is just its value. This is only logical. As a constant
value is a certainty, there is no probability associated with it. The expected value is the value
of the constant itself.
Proposition 3
\[ \text{If } n > 1,\quad E(X^n) = \sum_{x \in D} x^n \cdot p(x) \]
This follows from proposition 1 with h(X) = X^n. Thus, the second moment (about the origin) is
\[ E(X^2) = \sum_{x \in D} x^2 \cdot p(x) \]
Proposition 2 can be extended to more than one rv. Let X and Y be two discrete random
variables. If X has a set of possible values D with pmf p(x), and Y has a set of possible values
D* with pmf p(y), then for the function of the two random variables g(X,Y) = X + Y we have
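Proposition 4
E(X + Y) = E(X) + E(Y)
Proof
Let p(x, y) denote the joint pmf of X and Y, so that summing p(x, y) over y gives p(x) and
summing over x gives p(y). Then
\[ E(X + Y) = \sum_{x \in D}\sum_{y \in D^*}(x + y)\,p(x, y) = \sum_{x \in D} x\sum_{y \in D^*} p(x, y) + \sum_{y \in D^*} y\sum_{x \in D} p(x, y) = \sum_{x \in D} x\,p(x) + \sum_{y \in D^*} y\,p(y) = E(X) + E(Y) \]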
We can similarly show that the expected value of the sum of any number of random variables
equals the sum of their individual expectations.
Example 3.1
E(X + Y + Z) = E[(X + Y) + Z] = E(X + Y) + E(Z) = E(X) + E(Y) + E(Z)
Exercise 5
An individual who has automobile insurance from a certain company is randomly selected.
Let Y be the number of traffic rule violations for which the individual was booked during the
last three years. The pmf of Y is
y 0 1 2 3
p(y) 0.60 0.25 0.10 0.05
Exercise 6
An appliance dealer sells three different models of upright freezers having 13.5, 15.9, and
19.1 cubic feet of storage space, respectively. Let X = the amount of storage space of the
freezer purchased by the next customer.
Suppose X has the following pmf
x 13.5 15.9 19.1
p(x) 0.2 0.5 0.3
(a) Compute (i) E(X), and (ii) E(X²)
(b) If price of a freezer having capacity X cu. ft. is 25X – 8.5, what is the expected price
paid by the next customer to buy a freezer?
(c) Suppose that although the rated capacity of a freezer is X, the actual capacity is
h(X) = X – 0.01X². What is the expected actual capacity of the freezer purchased by
the next customer?
Solution
(a) (i) E(X) = (13.5)(0.2) + (15.9)(0.5) + (19.1)(0.3) = 2.7 + 7.95 + 5.73 = 16.38 cu ft
(ii) E(X²) = (13.5)²(0.2) + (15.9)²(0.5) + (19.1)²(0.3)
= 36.45 + 126.405 + 109.443 = 272.298
(b) Let Y = price of freezer, where Y = 25X – 8.5
E(Y) = 25E(X) – 8.5 = (25)(16.38) – 8.5 = $401
(c) Actual capacity = h(X) = X – 0.01X²
E[h(X)] = E(X) – (0.01)E(X²) = 16.38 – (0.01)(272.298) = 16.38 – 2.72 = 13.66 cu ft
Exercise 7
Let X = the outcome when a fair die is rolled once. If before the die is rolled you are offered
either 1/3.5 dollars or h(X) = 1/X dollars, would you accept the guaranteed amount or would
you gamble?
Solution
Guaranteed amount = $(1/3.5) = 0.2857 ≈ $0.29
Otherwise, h(X) = 1/X, where x = 1, 2, 3, 4, 5, 6
Since this is a fair die, P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6
\[ E[h(X)] = \frac{1}{1}\cdot\frac{1}{6} + \frac{1}{2}\cdot\frac{1}{6} + \frac{1}{3}\cdot\frac{1}{6} + \frac{1}{4}\cdot\frac{1}{6} + \frac{1}{5}\cdot\frac{1}{6} + \frac{1}{6}\cdot\frac{1}{6} \]
\[ = \frac{1}{6}\left[1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6}\right] = \frac{1}{6}\left[\frac{60 + 30 + 20 + 15 + 12 + 10}{60}\right] = \frac{147}{360} = 0.4083 \]
Therefore E[h(X)] = $0.41 > guaranteed amount $0.29. It is a better option to gamble.
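The comparison in Exercise 7 between 1/E(X) and E(1/X) can be verified with exact fractions. This is an illustrative sketch using Python's standard `fractions` module; the variable names are assumptions.

```python
from fractions import Fraction

faces = range(1, 7)          # outcomes of a fair die
p = Fraction(1, 6)           # probability of each face

expected_x = sum(x * p for x in faces)                     # E(X)
guaranteed = 1 / expected_x                                # 1 / E(X)
expected_gamble = sum(Fraction(1, x) * p for x in faces)   # E(1/X)

print(expected_x, guaranteed, expected_gamble)
print(round(float(guaranteed), 4), round(float(expected_gamble), 4))
```

Exact arithmetic reduces 147/360 to 49/120 and confirms that E(1/X) is larger than 1/E(X) for this distribution.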
One useful application of Proposition 1 is to obtain E(X²), ie, the second moment of X about
the origin. Here h(X) = X². If D is the set of all possible values that X can take, then
\[ E(X^2) = \sum_{x \in D} x^2 \cdot p(x) \]
Expected value of the rv X is the mean of the probability distribution or pmf of X. It tells us
what will be the value of X on the average when the experiment is repeated a very large
number of times in an identical and independent fashion (ie, replicated a very large number
of times). We need to also obtain the variance of X to examine the amount of variability in
the probability distribution of X. The mean and variance are useful measures for summarizing
the essential properties of the pmf.
Example 4.1
Let the rv X have the pmf p(x) = ½ for x = -1, 1 and let the rv Y have the pmf p(y) = ½ for
y = -100, 100.
E(X) = E(Y) = 0 but the pmf for Y is more spread out than that for X
Definition 2
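Let X have pmf p(x) and expected value μ. Then the variance of X, denoted by V(X) or σX², is
\[ V(X) = \sigma_X^2 = \sum_{x \in D}(x - \mu)^2 \cdot p(x) = E[(X - \mu)^2] \]
The standard deviation of X is σX = √V(X).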
Variance is thus the expected value of the function h(X) = (X – μ)2, ie, the squared deviation
of X from its mean. Hence, variance is the expected squared deviation. Variance is thus the
weighted average of squared deviations, where the weights are the probabilities.
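An alternative formula for Var(X) can be derived as follows, where x ∈ D and D is the set of
all possible values of the rv X:
\[ V(X) = \sum_{D}(x - \mu)^2 p(x) = \sum_{D} x^2 p(x) - 2\mu\sum_{D} x\,p(x) + \mu^2\sum_{D} p(x) = E(X^2) - 2\mu^2 + \mu^2 = E(X^2) - [E(X)]^2 \]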
Variance of X is, therefore, equal to the expected value of the square of X minus the square of
the expected value of X. It is often easier to compute V(X) using this formula than the
definitional formula E[(X – μ)2].
Example 4.2
Given the pmf of X in example 2.2, we can compute the mean, variance and standard
deviation of the probability distribution of the random variable X.
x p(x) x.p(x) x² x².p(x)
4 0.5 2 16 8
6 0.3 1.8 36 10.8
8 0.2 1.6 64 12.8
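The column totals of the table in Example 4.2 give E(X) and E(X²), from which the variance and standard deviation follow. The sketch below carries out this arithmetic using the shortcut formula V(X) = E(X²) – [E(X)]²; the variable names are assumptions made for illustration.

```python
import math

pmf = {4: 0.5, 6: 0.3, 8: 0.2}

mean = sum(x * p for x, p in pmf.items())              # total of the x.p(x) column
second_moment = sum(x**2 * p for x, p in pmf.items())  # total of the x^2.p(x) column
variance = second_moment - mean**2                     # shortcut formula
std_dev = math.sqrt(variance)

print(round(mean, 2), round(second_moment, 2), round(variance, 2), round(std_dev, 3))
```

For this pmf the totals are 5.4 and 31.6, so the variance is 31.6 – 29.16 = 2.44 and the standard deviation is about 1.562.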
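For the linear function h(X) = aX + b we have E[h(X)] = aμ + b, so that
\[ V(aX + b) = \sum_{D}[ax + b - (a\mu + b)]^2 \cdot p(x) \]
\[ = \sum_{D}[a(x - \mu)]^2 \cdot p(x) \]
\[ = a^2\sum_{D}[x - \mu]^2 \cdot p(x) \]
\[ = a^2 E[(X - \mu)^2] \]
\[ = a^2 V(X) \]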
Thus we get a simple relationship between V[h(x)] and V(X) for the linear function
h(x) = aX + b
Proposition 5
\[ V(aX + b) = \sigma^2_{aX+b} = a^2\sigma_X^2 \quad\text{and}\quad \sigma_{aX+b} = |a| \cdot \sigma_X \]
We need to take the absolute value |a| since √(a²) = ± a, and standard deviation can never be
negative.
Proposition 5 yields the following two rules of variance and standard deviation for the
function of a random variable.
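Rule 4
\[ V(aX) = \sigma^2_{aX} = a^2 V(X) \quad\text{and}\quad \sigma_{aX} = |a| \cdot \sigma_X \quad\text{when b = 0 in h(x) = aX + b} \]
Rule 5
\[ V(X + b) = \sigma^2_{X+b} = V(X) \quad\text{and}\quad \sigma_{X+b} = \sigma_X \quad\text{when a = 1 in h(x) = aX + b} \]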
Thus change in origin by adding or subtracting a constant b does not affect the variability of
the distribution. It just shifts the distribution to the left for b < 0, and to the right for b > 0.
However, change in the unit of measurement by multiplication or division by a constant a
impacts the variability. The new standard deviation is a product of the old standard deviation
and the absolute value of the conversion factor a. If 0 < |a| < 1 the distribution becomes
narrower. If |a| > 1, the new distribution is more spread out than before.
Exercise 8
The total cost for the production process is equal to $1000 plus two times the number of units
produced. The mean and variance for the number of units produced are 500 and 900
respectively. Find the mean and standard deviation of the total cost.
Solution
Let X denote the number of units produced where X is a rv. Then the cost function is
h(x) = 1000 +2X.
Given that E(X) = 500 and V(X) = 900,
E[h(x)] = E[1000 + 2X] = 1000 + 2E(X) = 1000 + (2)(500) = $2000
and
V[h(x)] = V(1000 + 2X) = 2²V(X) = (4)(900) = 3600
so that σh(x) = √3600 = $60
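The rules for a linear function h(X) = aX + b can be wrapped in a small helper and applied to Exercise 8. This is a sketch only; the function name `linear_mean_sd` and its argument order are assumptions.

```python
import math


def linear_mean_sd(a, b, mean_x, var_x):
    """Mean and standard deviation of aX + b, given E(X) and V(X).

    Uses E(aX + b) = a E(X) + b and SD(aX + b) = |a| SD(X).
    """
    mean = a * mean_x + b
    sd = abs(a) * math.sqrt(var_x)
    return mean, sd


# Exercise 8: cost = 1000 + 2X with E(X) = 500 and V(X) = 900
mean_cost, sd_cost = linear_mean_sd(2, 1000, 500, 900)
print(mean_cost, sd_cost)

# A negative multiplier leaves the spread unchanged in size
_, sd_negative = linear_mean_sd(-2, 1000, 500, 900)
print(sd_negative)
```

The second call illustrates why the absolute value |a| appears in Proposition 5: reversing the sign of a does not change the standard deviation.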
Example 6.1
V(X + X) = V(2X) = 4 V(X) whereas V(X) + V(X) = 2 V(X)
so that [V(X + X)] ≠ [V(X) + V(X)]
If the random variables are, however, independent then variance of a sum of the rv’s will
equal the sum of the respective variances. Recall that if X and Y are two independent rv’s
then the probability of occurrence of one variable is not affected by the probability of
occurrence of the other. To prove that V(X+Y) = V(X) + V(Y) when X and Y are
independent we need to first define the concept of covariance of two random variables.
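Definition 3
The covariance of two random variables X and Y is
Cov(X, Y) = E[{X – E(X)}{Y – E(Y)}] = E(XY) – E(X).E(Y)
Expanding V(X + Y) = E[{(X – E(X)) + (Y – E(Y))}²] gives
V(X + Y) = V(X) + V(Y) + 2Cov(X, Y)
If X and Y are independent, their joint pmf is the product p(xi).p(yj), so that
\[ E(XY) = \sum_{x_i \in D}\sum_{y_j \in D^*} x_i\,y_j\,p(x_i)\,p(y_j) = \sum_{x_i \in D} x_i\,p(x_i) \cdot E(Y) = E(X) \cdot E(Y) \]
Hence Cov(X, Y) = 0 and V(X + Y) = V(X) + V(Y) when X and Y are independent.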
From this it follows that if X1, X2,…….. are independent random variables, then
V(X1+ X2 +,……..) = ΣV(Xi) if each pair of Xi, Xj are mutually independent (i≠j).
Exercise 9
A company produces and sells security devices in two countries which do not permit
international trade in this item. Let X and Y denote the number of devices sold weekly in the
first country and the second country respectively. The profit function in the two countries are
h(x) = 200X - 100 and h(y) = 500Y- 250
Compute the mean and standard deviation of weekly total profits (measured in $) of the
company if the pmf’s of X and Y are as follows:
x 3 4 5 6
p(x) 0.1 0.2 0.3 0.4
and
y 1 2 3 4
p(y) 0.2 0.4 0.3 0.1
Solution
E(X) = 3(0.1) + 4(0.2) + 5(0.3) + 6(0.4) = 0.3 + 0.8 + 1.5 + 2.4 = 5
E(X²) = 9(0.1) + 16(0.2) + 25(0.3) + 36(0.4) = 0.9 + 3.2 + 7.5 + 14.4 = 26
V(X) = 26 – 25 = 1
E(Y) = 1(0.2) + 2(0.4) + 3(0.3) + 4(0.1) = 0.2 + 0.8 + 0.9 + 0.4 = 2.3
E(Y²) = 1(0.2) + 4(0.4) + 9(0.3) + 16(0.1) = 0.2 + 1.6 + 2.7 + 1.6 = 6.1
V(Y) = 6.1 – 5.29 = 0.81
Expected profits in the first country = E[h(x)] = E[200X - 100]
= 200.E(X) -100
= 200(5) - 100 = $900
Variance of profits in first country = V[200X -100] = 40000 V(X) = 40000(1) = 40,000
Expected profits in the second country = E[h(y)] = E[500Y – 250]
= 500.E(Y) – 250
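The remaining arithmetic of Exercise 9 can be sketched in Python. The sketch assumes that weekly sales X and Y in the two countries are independent (suggested by the absence of trade between them, but an assumption nonetheless), so that the variances of the two profits can be added; the helper name `mean_and_variance` is hypothetical.

```python
import math

pmf_x = {3: 0.1, 4: 0.2, 5: 0.3, 6: 0.4}
pmf_y = {1: 0.2, 2: 0.4, 3: 0.3, 4: 0.1}


def mean_and_variance(pmf):
    """Return E(V) and V(V) = E(V^2) - [E(V)]^2 for a pmf given as {value: probability}."""
    mean = sum(v * p for v, p in pmf.items())
    second_moment = sum(v**2 * p for v, p in pmf.items())
    return mean, second_moment - mean**2


mean_x, var_x = mean_and_variance(pmf_x)
mean_y, var_y = mean_and_variance(pmf_y)

# Profit functions from the exercise: 200X - 100 and 500Y - 250
mean_profit_1, var_profit_1 = 200 * mean_x - 100, 200**2 * var_x
mean_profit_2, var_profit_2 = 500 * mean_y - 250, 500**2 * var_y

total_mean = mean_profit_1 + mean_profit_2   # expectations always add
total_var = var_profit_1 + var_profit_2      # valid only under the independence assumption
total_sd = math.sqrt(total_var)

print(round(total_mean, 2), round(total_var, 2), round(total_sd, 2))
```

Under these assumptions each country contributes an expected profit of $900, giving a total mean of $1800, a total variance of 242,500 and a standard deviation of roughly $492.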
Exercise 10
Show that Cov(aX+b,cY+d) = acCov(X,Y)
Solution
Cov(aX+b,cY+d) = E[{aX + b – E(aX + b)}{cY + d – E(cY + d)}]
= E[{aX + b – aE(X) – b}{cY + d – cE(Y) – d}]
= E[(a{X – E(X)})(c{Y- E(Y)})]
= acE[{X – E(X)}{Y – E(Y)}] = acCov(X,Y)
If most of the population values are close to μ, the spread of the distribution is small and σ2 is
relatively small. If, however, there are x values that are far from μ that have large p(x), then
σ2 will be quite large.
Example 7.1
In example 4.1, p(x) = ½ for x = -1, 1 and p(y) = ½ for y = -100, 100
E(X) = μX = ½(-1) + ½(1) = 0,
and E(Y) = μY = ½(-100) + ½(100) = 0,
so that μX = μY = 0
V(X) = E(X²) – [E(X)]² = [½(1) + ½(1)] – 0 = 1, and
V(Y) = E(Y²) – [E(Y)]² = [½(10000) + ½(10000)] – 0 = 10,000
Therefore, V(Y) > V(X)
The characteristics of the distribution can now be specified. We can obtain the calculated
values of the mean and variance. The histogram of the distribution will show whether the
distribution is symmetric or asymmetric. It will also show whether the distribution is
unimodal, bimodal or multimodal.
Practice Questions
1. If X takes on the values 0, 1, 2, and 3 with probabilities 1/125, 12/125, 48/125, and 64/125
respectively, find E(X) and V(X). Use these results to find the mean and variance of
Y = 3X + 2
2. The pmf of the amount of memory X(GB) in a flash drive is given as follows:
x 1 2 4 8 16
p(x) 0.05 0.10 0.35 0.40 0.10
W W
E 0 and V 1
n
E ai X i
i 1
7. A stationery shop orders copies of a certain magazine each week. Let X = demand
for the magazine, with pmf
x 1 2 3 4 5 6
p(x) 1/15 2/15 3/15 4/15 3/15 2/15
Suppose the shop actually pays 5 for each copy of the magazine and the price to
customers is 10. If magazines left at the end of the week have no salvage value, is it
better to order three or four copies of the magazine?
9. Given that variables X and Y are independent and Z = aX – bY, prove that
Var(Z) = a²Var(X) + b²Var(Y)
10. Arun and Barun play a game in which they toss a fair coin three times. The one
obtaining heads first wins the game. If Arun tosses the coin first and if the total value
of the stakes is 20, how much should be contributed by each in order that the game
be considered fair?
11. A bakery sells bread for Rs. 15 each. Daily sales X is a random variable and has a
distribution with mean 530 and standard deviation 69
(i) Find the mean daily total revenues from the sale of bread
(ii) Find the standard deviation of total revenues from the sale of bread
(iii) If daily costs (in Rs) for making bread are given by C=1000+0.95X, find the
mean and variance of daily profits from sales of bread
12. A chemical supply company currently has in stock 100 kg of a certain compound,
which it sells to customers in 5-kg batches. Let X = the number of batches ordered by
a randomly chosen customer, and suppose that X has pmf
x 1 2 3 4
p(x) 0.2 0.4 0.3 0.1
Compute E(X) and V(X). Then compute the expected number of kgs left after the
customer’s order is shipped and the variance of the number of kgs left.
DC-1
Semester-II
Paper-III: Statistical Methods in Economics-I
Lesson: Mathematical expectation continuous
Lesson Developer: Chandra Goswami
College/Department: Department of Economics,
Dyal Singh College, University of Delhi
Content Developer
Chandra Goswami, Associate Professor, Department of Economics
Dyal Singh College, University of Delhi
Reference
Jay L. Devore: Probability and Statistics for Engineering and the Sciences,
Cengage Learning, 8th edition [Chapter 4]
Learning objectives:
In this chapter you will learn how to obtain two main characteristics of the
probability distribution of a continuous random variable. You will learn how to derive
the mean and variance of distributions of continuous random variables. You will also
learn how to apply the rules of mathematical expectation to functions of random
variables as well as to sums of random variables. The mean, variance, median and
mode, and coefficients of skewness and kurtosis will help you to identify the
characteristics and shape of the distribution.
Chapter Outline
1. Expected value of a continuous random variable
2. Expectation of a function of a continuous random variable
3. Variance of a continuous random variable
4. Variance of a function of a continuous random variable
5. Rules of mathematical expectation
6. Expectation and variance of sums of continuous random variables
7. Characteristics of the probability density function
Definition 1
The expected value or mean value of a continuous random variable X with probability
density function f(x) is
\[ \mu_X = E(X) = \int_{-\infty}^{\infty} x \cdot f(x)\,dx \]
When the pdf f(x) specifies a model for the distribution of X values in a numerical
population, then μX is the population mean.
Example 1.1
If a contractor’s profits on a construction job can be looked upon as a continuous rv
having the pdf
\[ f(x) = \begin{cases} \frac{1}{18}(x+1) & -1 < x < 5 \\ 0 & \text{otherwise} \end{cases} \]
where the units are $1000, her expected profit is
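\[ E(X) = \int_{-1}^{5} x \cdot \tfrac{1}{18}(x+1)\,dx = \frac{1}{18}\int_{-1}^{5} (x^2 + x)\,dx = \frac{1}{18}\left[\frac{x^3}{3} + \frac{x^2}{2}\right]_{-1}^{5} \]
\[ = \frac{1}{18}\left[\left(\frac{125}{3} + \frac{25}{2}\right) - \left(-\frac{1}{3} + \frac{1}{2}\right)\right] = \frac{1}{18}\left[\frac{126}{3} + \frac{24}{2}\right] = \frac{42 + 12}{18} = \frac{54}{18} = 3 \]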
Therefore, expected profit is $3,000
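The integral in Example 1.1 can be checked numerically. The sketch below is only an illustration: the helper name `simpson` and the choice of Simpson's rule are assumptions of this example, not part of the lesson, and the pdf is taken to be (x + 1)/18 on -1 < x < 5 (the range on which it integrates to 1).

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def profit_pdf(x):
    # Assumed pdf of the contractor's profit (units of $1000).
    return (x + 1) / 18 if -1 <= x <= 5 else 0.0


total_probability = simpson(profit_pdf, -1, 5)
expected_profit = simpson(lambda x: x * profit_pdf(x), -1, 5)
print(total_probability, expected_profit)
```

The first number confirms that the pdf integrates to one; the second is the expected profit in thousands of dollars.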
Exercise 1
The tread wear (in thousands of kilometers) that car owners get with a certain kind of
tyre is a rv X whose pdf is given by
\[ f(x) = \begin{cases} \frac{1}{30} e^{-x/30} & x > 0 \\ 0 & x \le 0 \end{cases} \]
What tread wear can a car owner expect to get with one of the tyres?
Solution
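\[ E(X) = \int_{0}^{\infty} x \cdot \frac{1}{30} e^{-x/30}\,dx = \frac{1}{30}\int_{0}^{\infty} x\, e^{-x/30}\,dx \]
Integrating by parts,
\[ E(X) = \frac{1}{30}\left\{ \left[ x \cdot \frac{e^{-x/30}}{-1/30} \right]_{0}^{\infty} - \int_{0}^{\infty} 1 \cdot \frac{e^{-x/30}}{-1/30}\,dx \right\} \]
\[ = \left[ -x e^{-x/30} \right]_{0}^{\infty} + \int_{0}^{\infty} e^{-x/30}\,dx = 0 + \left[ -30 e^{-x/30} \right]_{0}^{\infty} \]
\[ = 0 + 30 = 30 \]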
Therefore, average tread wear a car owner can expect to get is 30,000 km.
Definition 2
If X is a continuous random variable with pdf f(x) and h(X) is any function of X, then
\[ E[h(X)] = \mu_{h(X)} = \int_{-\infty}^{\infty} h(x) f(x)\,dx, \quad \text{provided that } \int_{-\infty}^{\infty} |h(x)| f(x)\,dx < \infty \]
Proposition 1
If h(X) is a linear function such as h(X) = aX + b, then E[h(X)] = aE(X) + b
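Proof:
\[ E(aX + b) = \int_{-\infty}^{\infty} (ax + b) f(x)\,dx = a \int_{-\infty}^{\infty} x f(x)\,dx + b \int_{-\infty}^{\infty} f(x)\,dx \]
= aE(X) + b, since the pdf integrates to 1.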
Example 2.1
If the pdf of X is given by
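\[ f(x) = \begin{cases} 2(1 - x) & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases} \]
and h(X) = 2X + 1
Then,
\[ E[h(X)] = \int_{0}^{1} (2x + 1) \cdot 2(1 - x)\,dx = 2\int_{0}^{1} (2x + 1 - 2x^2 - x)\,dx = 2\int_{0}^{1} (x + 1 - 2x^2)\,dx \]
\[ = 2\left[\frac{x^2}{2} + x - \frac{2x^3}{3}\right]_{0}^{1} = 2\left[\left(\frac{1}{2} + 1 - \frac{2}{3}\right) - 0\right] = 2\left(\frac{3}{2} - \frac{2}{3}\right) = 3 - \frac{4}{3} = \frac{5}{3} = 1.67 \]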
Exercise 2
An ecologist wishes to mark off a circular sampling region having radius 10 m.
However, the radius of the resulting region is actually a random variable R with pdf
\[ f(r) = \begin{cases} \frac{3}{4}\left[1 - (10 - r)^2\right] & 9 \le r \le 11 \\ 0 & \text{otherwise} \end{cases} \]
What is the expected area of the resulting circular region?
Solution
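Since the area of a circle is h(R) = πR², the expected area of the resulting circle is
\[ E[h(R)] = E(\pi R^2) = \pi E(R^2) \approx \tfrac{22}{7} E(R^2), \quad \text{where } E(R^2) = \int_{9}^{11} r^2 f(r)\,dr \]
Now,
\[ E(R^2) = \frac{3}{4}\int_{9}^{11} r^2\left[1 - (100 - 20r + r^2)\right]dr = \frac{3}{4}\int_{9}^{11} \left(-99r^2 + 20r^3 - r^4\right)dr \]
\[ = \frac{3}{4}\left[-\frac{99r^3}{3} + \frac{20r^4}{4} - \frac{r^5}{5}\right]_{9}^{11} \]
\[ = \frac{3}{4}\left[-33\left(11^3 - 9^3\right) + 5\left(11^4 - 9^4\right) - \tfrac{1}{5}\left(11^5 - 9^5\right)\right] \]
\[ = \frac{3}{4}\left[-33(1331 - 729) + 5(14641 - 6561) - \tfrac{1}{5}(161051 - 59049)\right] \]
\[ = \frac{3}{4}\left[-33(602) + 5(8080) - \tfrac{1}{5}(102002)\right] \]
\[ = \frac{3}{4}\left[-19866 + 40400 - 20400.4\right] = \frac{3}{4}(133.6) = 100.2 \]
Therefore, the expected area of the resulting circle is
\[ E[h(R)] = \tfrac{22}{7} E(R^2) = \tfrac{22}{7}(100.2) = 314.9143 \text{ sq. m.} \]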
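As a rough numerical check of Exercise 2, the sketch below integrates r² f(r) over [9, 11]. The helper `simpson` is an assumption of this example. It also shows how much the answer moves when π is taken from the math library instead of the approximation 22/7 used in the solution.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def radius_pdf(r):
    # f(r) = (3/4) * [1 - (10 - r)^2] on 9 <= r <= 11
    return 0.75 * (1 - (10 - r) ** 2)


e_r2 = simpson(lambda r: r * r * radius_pdf(r), 9, 11)
area_with_22_over_7 = (22 / 7) * e_r2
area_with_math_pi = math.pi * e_r2
print(round(e_r2, 4), round(area_with_22_over_7, 4), round(area_with_math_pi, 4))
```

The two area figures differ only because of the value used for π; E(R²) itself is the same in both.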
Exercise 3
The weekly demand for propane gas (in 1000s of gallons) from a particular facility is
an rv X with pdf
\[ f(x) = \begin{cases} 2\left(1 - \frac{1}{x^2}\right) & 1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases} \]
(a) Compute E(X)
(b) If 1.5 thousand gallons are in stock at the beginning of the week, how much of
the 1.5 thousand gallons is expected to be left at the end of the week?
Solution
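(a)
\[ E(X) = \int_{1}^{2} x \cdot 2\left(1 - \frac{1}{x^2}\right)dx = 2\int_{1}^{2}\left(x - \frac{1}{x}\right)dx = 2\left[\frac{x^2}{2} - \ln x\right]_{1}^{2} \]
\[ = 2\left[(2 - 0.693) - \left(\tfrac{1}{2} - 0\right)\right] = 1.614 \]
= 1,614 gallons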
(b) Amount in stock is 1.5 thousand gallons out of which the demand is a random
variable X thousand gallons.
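Amount left = h(x) = max{(1.5 - x), 0} thousand gallons
\[ E[h(X)] = \int_{1}^{2} \max\{(1.5 - x), 0\}\, f(x)\,dx = \int_{1}^{1.5} (1.5 - x) \cdot 2\left(1 - \frac{1}{x^2}\right)dx \]
\[ = 2\int_{1}^{1.5}\left(1.5 - x - \frac{1.5}{x^2} + \frac{1}{x}\right)dx = 2\left[1.5x - \frac{x^2}{2} + \frac{1.5}{x} + \ln x\right]_{1}^{1.5} \]
\[ = 2\left[\left((1.5)^2 - \frac{(1.5)^2}{2} + \frac{1.5}{1.5} + 0.4055\right) - \left(1.5 - \frac{1}{2} + 1.5 + 0\right)\right] \]
\[ = 2\left[2.5305 - 2.5\right] = 0.061 \]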
Therefore, the expected amount left in stock at the end of the week is 61 gallons.
(Since weekly demand for propane can vary between 1000 and 2000 gallons for
1 < x < 2, and amount in stock is 1.5 thousand gallons, amount left at the end of the
week can vary between the minimum of 0 if demand x = 1.5 or more, and maximum
of 500 gallons if demand is x = 1.)
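Both parts of Exercise 3 can be verified with a short numerical integration. This is a sketch only; the helper `simpson` and the variable names are assumptions of the example. The expected leftover is integrated over [1, 1.5] because max{1.5 - x, 0} is zero for x above 1.5.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def demand_pdf(x):
    # f(x) = 2 * (1 - 1/x^2) on 1 <= x <= 2 (demand in thousands of gallons)
    return 2 * (1 - 1 / x ** 2)


mean_demand = simpson(lambda x: x * demand_pdf(x), 1, 2)
expected_left = simpson(lambda x: (1.5 - x) * demand_pdf(x), 1, 1.5)
print(round(mean_demand, 3), round(expected_left, 3))
```

Both figures are in thousands of gallons, so they correspond to the 1,614 gallons and 61 gallons quoted in the solution.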
Definition 3
The variance of a continuous random variable X with pdf f(x) and mean value μ is
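\[ V(X) = \sigma_X^2 = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx \]
It can be shown that for continuous random variables, just like in the case of discrete
random variables,
\[ V(X) = E(X^2) - [E(X)]^2 \]
Proof
\[ \sigma_X^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = \int_{-\infty}^{\infty} (x^2 - 2\mu x + \mu^2) f(x)\,dx \]
\[ = \int_{-\infty}^{\infty} x^2 f(x)\,dx - 2\mu \int_{-\infty}^{\infty} x f(x)\,dx + \mu^2 \int_{-\infty}^{\infty} f(x)\,dx \]
\[ = E(X^2) - 2\mu E(X) + \mu^2 \quad \text{since } \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ = E(X^2) - \mu^2 \quad \text{since } E(X) = \mu \]
\[ = E(X^2) - [E(X)]^2 \]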
Example 3.1
In exercise 3, the pdf of X is given as
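\[ f(x) = \begin{cases} 2\left(1 - \frac{1}{x^2}\right) & 1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases} \]
V(X) = E(X²) – [E(X)]², where E(X) = 1.614
\[ E(X^2) = \int_{1}^{2} x^2 \cdot 2\left(1 - \frac{1}{x^2}\right)dx = 2\int_{1}^{2} (x^2 - 1)\,dx = 2\left[\frac{x^3}{3} - x\right]_{1}^{2} \]
\[ = 2\left[\left(\frac{8}{3} - 2\right) - \left(\frac{1}{3} - 1\right)\right] = 2\left(\frac{2}{3} + \frac{2}{3}\right) = \frac{8}{3} = 2.667 \]
V(X) = 2.667 – (1.614)² = 2.667 – 2.605 = 0.062
and σ_X = 0.249
= 249 gallons.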
Exercise 4
For what value of k does V(Y) =2 when pdf of Y is given as
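\[ f(y) = \begin{cases} \frac{2y}{k^2} & 0 < y < k \\ 0 & \text{otherwise} \end{cases} \]
Solution
\[ E(Y) = \int_{0}^{k} y \cdot \frac{2y}{k^2}\,dy = \frac{2}{k^2}\left[\frac{y^3}{3}\right]_{0}^{k} = \frac{2}{3}k \]
\[ E(Y^2) = \int_{0}^{k} y^2 \cdot \frac{2y}{k^2}\,dy = \frac{2}{k^2}\left[\frac{y^4}{4}\right]_{0}^{k} = \frac{1}{2}k^2 \]
\[ V(Y) = \frac{1}{2}k^2 - \frac{4}{9}k^2 = \frac{1}{18}k^2 \]
Given that V(Y) = 2,
\[ \frac{k^2}{18} = 2, \quad \text{so } k^2 = 36. \]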
Therefore, k = 6
The variance is the second moment about the mean, ie the second central moment. It
is obtained by taking the difference of the second moment about the origin {ie,E(X2)}
and the square of the first moment about the origin {ie, [E(X)]2}
and \( E\{[h(X)]^2\} = \int_{-\infty}^{\infty} [h(x)]^2 f(x)\,dx \)
Proposition 2
If h(X) is a linear function such as h(X) = aX + b, then V[h(X)] = a2V(X)
Proof:
When h(X) = aX + b, ie, a linear function of the rv,
E[h(X)] = E[aX + b]
= a E(X) + b
= aμ + b
and
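\[ V[h(X)] = \int_{-\infty}^{\infty} (ax + b)^2 f(x)\,dx - (a\mu + b)^2 \]
\[ = \int_{-\infty}^{\infty} (a^2x^2 + 2abx + b^2) f(x)\,dx - (a\mu + b)^2 \]
\[ = a^2 \int_{-\infty}^{\infty} x^2 f(x)\,dx + 2ab \int_{-\infty}^{\infty} x f(x)\,dx + b^2 \int_{-\infty}^{\infty} f(x)\,dx - (a\mu + b)^2 \]
\[ = a^2 E(X^2) + 2ab\mu + b^2 - a^2\mu^2 - 2ab\mu - b^2 \]
\[ = a^2 E(X^2) - a^2 \mu^2 \]
\[ = a^2\left\{E(X^2) - [E(X)]^2\right\} \]
\[ = a^2 V(X) \]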
Example 4.1
In example 1.1 the contractor’s profits X (in thousand dollars) was a continuous rv
with pdf
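\[ f(x) = \begin{cases} \frac{1}{18}(x+1) & -1 < x < 5 \\ 0 & \text{otherwise} \end{cases} \]
Expected profits = E(X) = 3 = $3000
\[ V(X) = E(X^2) - [E(X)]^2 = \int_{-1}^{5} x^2 \cdot \tfrac{1}{18}(x+1)\,dx - 9 = \frac{1}{18}\int_{-1}^{5} (x^3 + x^2)\,dx - 9 \]
\[ = \frac{1}{18}\left[\frac{x^4}{4} + \frac{x^3}{3}\right]_{-1}^{5} - 9 = \frac{1}{18}\left[\left(\frac{625}{4} + \frac{125}{3}\right) - \left(\frac{1}{4} - \frac{1}{3}\right)\right] - 9 \]
\[ = \frac{1}{18}(156 + 42) - 9 = 11 - 9 = 2 \]
and σ_X = √2 = 1.41421 = $1414.21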
Exercise 5
Let Y have the pdf
31 y 2
0 y 1
f ( y)
0 otherwise
1
4 y 2 y 2 y 3 dy
0
1
y2 y3 y4
4 2
2 3 4 0
1 2 1 1
4
2 3 4 3
1
E (Y ) y 2 .41 y dy
2 2
1
4 y 2 2 y 3 y 4 dy
0
1
y3 y4 y5
4 2
3 4 5 0
1 2 1 2
4
3 4 5 15
2 1 1
V (Y )
15 9 45
1 5
V (W ) (25)
45 9
Rule 1
If b = 0, then for any constant a, E(aX) = aE(X)
The rule says that expected value in the new units equals the expected value in the old
units multiplied by the factor a. Multiplication of X by the constant a changes the unit
of measurement
Rule 2
If a = 1, then for any constant b, E(X + b) = E(X) + b
If a constant is added to each possible value of X, there is a change in origin. Then the
expected value will be shifted by the same amount b to the right or left depending on
whether b is greater than or less than zero respectively.
Rule 3
If a = 0, then for any constant b, \( E(b) = \int b\, f(x)\,dx = b \int f(x)\,dx = b \)
Rule 4
\( V(aX) = \sigma_{aX}^2 = a^2 V(X) \) and \( \sigma_{aX} = |a| \cdot \sigma_X \), when b = 0 in h(X) = aX + b
Rule 5
\( V(X + b) = \sigma_{X+b}^2 = V(X) \) and \( \sigma_{X+b} = \sigma_X \), when a = 1 in h(X) = aX + b
A change in origin by adding or subtracting a constant b does not affect the variability
of the distribution
Rule 6
\( V(b) = 0 \) and \( \sigma_b = 0 \), when a = 0 in h(X) = aX + b
Thus the same rules of mathematical expectation for calculating the mean and
variance apply for distributions of both discrete and continuous random variables.
Proposition 3
Whether or not the random variables X and Y are independent, E(X+Y) = E(X) + E(Y)
This result can be extended to more than two linear functions of continuous random
variables. For example, if there are linear functions of X, Y and Z, the expected value
of the sum of the functions is the sum of the expected value of the functions.
Proposition 4
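If the random variables X and Y are independent, V(X+Y) = V(X) + V(Y)
Likewise, for functions g(X) and h(Y) of independent random variables X and Y,
V[g(X) + h(Y)] = V[g(X)] + V[h(Y)]
and
\[ \sigma_{g(X)+h(Y)} = \sqrt{V[g(X)] + V[h(Y)]} \]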
This result can be similarly extended to sums of linear functions of three or more than
three independent continuous random variables. We see that the methods for
obtaining the expectation and variance of sums of linear functions of continuous
random variables are the same as that for discrete variables.
Example 6.1
The independent random variables X and Y have the following density functions:
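\[ f(x) = \begin{cases} \frac{x}{2} & 0 < x < 2 \\ 0 & \text{otherwise} \end{cases} \]
\[ \text{and } f(y) = \begin{cases} 2(1 - y) & 0 < y < 1 \\ 0 & \text{otherwise} \end{cases} \]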
Let g(X) = 3X -1 and h(Y) = 2Y + 1.
Then,
E[g(X) + h(Y)] = E[g(X)] + E[h(Y)]
= [3E(X) – 1] + [2E(Y) + 1]
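\[ E(X) = \int_{0}^{2} x \cdot \frac{x}{2}\,dx = \frac{1}{2}\int_{0}^{2} x^2\,dx = \frac{1}{2}\left[\frac{x^3}{3}\right]_{0}^{2} = \frac{4}{3} \]
\[ E[g(X)] = 3\left(\frac{4}{3}\right) - 1 = 3 \]
\[ E(Y) = \int_{0}^{1} y \cdot 2(1 - y)\,dy = 2\int_{0}^{1} (y - y^2)\,dy = 2\left[\frac{y^2}{2} - \frac{y^3}{3}\right]_{0}^{1} = 2\left(\frac{1}{2} - \frac{1}{3}\right) = \frac{1}{3} \]
\[ E[h(Y)] = 2\left(\frac{1}{3}\right) + 1 = \frac{5}{3} \]
\[ E[g(X) + h(Y)] = 3 + \frac{5}{3} = \frac{14}{3} = 4.667 \]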
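Also
V[g(X) + h(Y)] = V[g(X)] + V[h(Y)]
Now,
V[g(X)] = 9V(X) = 9[E(X²) – (4/3)²]
and
V[h(Y)] = 4V(Y) = 4[E(Y²) – (1/3)²]
\[ E(X^2) = \int_{0}^{2} x^2 \cdot \frac{x}{2}\,dx = \frac{1}{2}\int_{0}^{2} x^3\,dx = \frac{1}{2}\left[\frac{x^4}{4}\right]_{0}^{2} = 2 \]
\[ V[g(X)] = 9\left(2 - \frac{16}{9}\right) = 2 \]
\[ E(Y^2) = \int_{0}^{1} y^2 \cdot 2(1 - y)\,dy = 2\int_{0}^{1} (y^2 - y^3)\,dy = 2\left[\frac{y^3}{3} - \frac{y^4}{4}\right]_{0}^{1} = \frac{1}{6} \]
\[ V[h(Y)] = 4\left(\frac{1}{6} - \frac{1}{9}\right) = \frac{2}{9} \]
\[ V[g(X) + h(Y)] = 2 + \frac{2}{9} = \frac{20}{9} = 2.222 \]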
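The figures in Example 6.1 can be reproduced numerically from the two densities and the rules E(aX + b) = aE(X) + b and V(aX + b) = a²V(X). The sketch below is illustrative; `simpson` and `mean_and_variance` are hypothetical helper names introduced for this example.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def mean_and_variance(pdf, a, b):
    """Return (E(T), V(T)) for a density supported on [a, b]."""
    mean = simpson(lambda t: t * pdf(t), a, b)
    second_moment = simpson(lambda t: t * t * pdf(t), a, b)
    return mean, second_moment - mean ** 2


mean_x, var_x = mean_and_variance(lambda x: x / 2, 0, 2)        # f(x) = x/2
mean_y, var_y = mean_and_variance(lambda y: 2 * (1 - y), 0, 1)  # f(y) = 2(1 - y)

# g(X) = 3X - 1 and h(Y) = 2Y + 1, with X and Y independent
expected_sum = (3 * mean_x - 1) + (2 * mean_y + 1)
variance_sum = 9 * var_x + 4 * var_y
print(round(expected_sum, 3), round(variance_sum, 3))
```

Note that the variance line relies on independence of X and Y, while the expectation line does not.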
For a unimodal distribution, equating the first derivative of the pdf to zero, at a point
where the second derivative is negative, gives the modal value. The mode is the value
of the rv at which the graph of the distribution reaches its highest point. The median
is obtained by computing the 50th percentile, so that half the distribution lies on either
side of the median.
Note that the mean, median, mode and standard deviation are all expressed in units of
measurement of the rv. If the units are changed by a multiplication factor a then the
values of all these measures will also be affected accordingly.
The moment measure of skewness requires the third moment E(X³). The moment
coefficient of skewness is independent of units of measurement. The formula for the
moment coefficient of skewness is \( E(X-\mu)^3 / \sigma^3 \), where the numerator is the third
central moment. This may be expressed in terms of the moments about the origin:
\( E(X-\mu)^3 = E(X^3) - 3\mu E(X^2) + 2\mu^3 \), where μ = E(X). Here too the coefficient will be
greater than, less than, or equal to zero if the distribution is positively skewed,
negatively skewed, or symmetric, respectively. The normal distribution is symmetric.
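\[ \text{Coefficient of kurtosis} = \frac{\text{4th central moment}}{(\text{2nd central moment})^2} = \frac{E(X-\mu)^4}{\sigma^4} \quad [\text{since deviations are taken about } E(X) = \mu] \]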
The coefficient of kurtosis is 3 for the normal distribution. It is less than 3 for a
density function that is flatter than the normal distribution, with short fat tails. If the
density function is more peaked than a normal distribution, with long tails, then the
coefficient is greater than 3.
The following figure illustrates the three types of distributions. For the purpose of
comparison of kurtosis, only symmetric distributions are shown.
PRACTICE QUESTIONS
2. A box is to be constructed so that its height is 5 inches and its base is Y inches
by Y inches, where Y is a random variable described by the pdf given below.
Find the expected volume of the box.
\[ f(y) = \begin{cases} 6y(1 - y) & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \]
If E(X) = 3/5, find a and b
\[ f(x) = \begin{cases} \frac{1}{8}(x + 1) & 2 \le x \le 4 \\ 0 & \text{otherwise} \end{cases} \]
Is the distribution symmetric? Give reasons for your answer.
10. The coefficient of variation (σ/μ) is a measure of the spread of the distribution
that is independent of the unit of measurement of the rv. If the rv Y is
described by the following pdf,
\[ f(y) = \begin{cases} 3(1 - y)^2 & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \]
what is the coefficient of variation of Y?
Contents
1. LEARNING OBJECTIVES
2 INTRODUCTION
2.1 DISCRETE DISTRIBUTIONS
2.1.1 BINOMIAL DISTRIBUTION
2.1.2 POISSON DISTRIBUTION
2.1.3 BINOMIAL APPROXIMATION TO POISSON
1. LEARNING OBJECTIVES
We discuss 4 distributions here.
Discrete distributions: Binomial, Poisson
Continuous: Uniform, Normal
For each we need to be familiar with
Probability mass function (for discrete distributions), probability density
function (for continuous distributions)
Cumulative distribution function
Conditions for a distribution to hold
Approximations (as applicable)
Mean ,variance and standard deviation
2. INTRODUCTION
What are theoretical distributions?
A theoretical distribution is based on a mathematical formula. Such distributions are derived
from a model or estimated from data, rather than obtained by conducting experiments physically
or enumerating a sample space. If certain conditions are fulfilled, we can say that a variable
follows a particular distribution.
2.1 DISCRETE DISTRIBUTIONS
2.1.1 BINOMIAL DISTRIBUTION
For any variable x to follow a binomial distribution the following conditions must be met.
1. There are n fixed and identical trials.
2. Each trial is independent of other trials, so that outcome of one trial does not affect
the outcome of any other trial.
3. Each trial has ONLY two possible outcomes S (Success) and F (Failure).
4. P(Success) =P(S) is denoted as p and is constant in each trial. P(Failure) = q = 1 − p.
The pmf gives the probability of r successes in n trials:
\[ P(X = r) = \binom{n}{r} p^r q^{n-r} = \frac{n!}{r!\,(n-r)!}\, p^r q^{n-r} \]
Mean = E(X) = np
Variance = V(X) = npq
TIP: Success and Failure are labelled in an arbitrary fashion. We can label any of the
two events in the sample space as success. For example a girl/boy child can be labelled
success or failure, without affecting the answer. However we need to be careful with
the value of r.
Q Assume that a die is tossed 5 times. What is the probability of getting exactly 2 fours?
We use this example to show that any of the two events in a binomial experiment can be
labelled as success.
Option 1:
Define getting a four to be a success so that p = 1/6. We want P(r = 2). Use n = 5, p = 1/6 and r = 2
to get b(2; 5, 0.167) = 5C2 * (0.167)^2 * (0.833)^3 = 0.161.
Option 2:
Now define any number except four to be a success so that p = 5/6. We want 2 fours, so
the number of successes = 5 - 2 = 3. We now want P(r = 3). Use n = 5, p = 5/6 and r = 3 to get
P(r = 3) = 5C3 * (0.833)^3 * (0.167)^2 = 0.161.
We have illustrated that the 'labelling' of success and failure has no impact on the answer as
long as the number of successes is chosen correctly.
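The two labelling options can be compared directly in code. This is a minimal sketch; the function name `binomial_pmf` is an assumption of the example, and exact fractions 1/6 and 5/6 are used instead of the rounded 0.167 and 0.833.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for a binomial random variable with n trials and success probability p."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


# Option 1: "four" is a success, we want 2 successes in 5 tosses.
p_two_fours = binomial_pmf(2, 5, 1 / 6)
# Option 2: "not a four" is a success, we want 3 successes in 5 tosses.
p_three_non_fours = binomial_pmf(3, 5, 5 / 6)
print(round(p_two_fours, 3), round(p_three_non_fours, 3))
```

Both calls describe the same event, so they return the same probability.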
The Poisson distribution is a discrete probability distribution for the number of events that
occur randomly in a given interval of time/period. This is unlike a binomial, hypergeometric
or negative binomial distributions that are based on an experiment that uses trials/ draws to
get probability of various outcomes.
Let X = the number of events in a given interval
λ = mean number of events per interval
The probability of observing r events in a given interval is given by the pmf
\[ P(X = r) = \frac{e^{-\lambda} \lambda^{r}}{r!}, \quad r = 0, 1, 2, 3, 4, \ldots \text{ and } e = 2.718282 \]
Mean= variance= λ
NOTE: the rate/number of events is always in terms of a specified interval like per hour or
minutes or days.
Q Historical data shows that there are 1.8 births per hour in a village. What is the
probability that 4 babies will be born in any given hour here?
Q Let X equal the number of typos on a printed page with a mean of 3 typos per page.
a. What is the probability that a randomly selected page has at least one typo on it? We
can find the requested probability directly from the pmf. The probability that X is at
least one is: P(X ≥ 1) = 1 − P(X = 0) = 1 − e^(−3) 3^0/0! = 1 − 0.0498 = 0.9502. That is, there
is just over a 95% chance of finding at least one typo on a randomly selected page
when the average number of typos per page is 3.
b. What is the probability that a randomly selected page has at most one typo on it?
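The Poisson questions above (births per hour and typos per page) all reduce to evaluating the pmf at a few points. The sketch below is an illustration only; `poisson_pmf` is a hypothetical helper name.

```python
from math import exp, factorial


def poisson_pmf(r, lam):
    """P(X = r) for a Poisson random variable with mean lam per interval."""
    return exp(-lam) * lam ** r / factorial(r)


# Births: mean 1.8 per hour, probability of exactly 4 births in an hour.
p_four_births = poisson_pmf(4, 1.8)

# Typos: mean 3 per page.
p_at_least_one_typo = 1 - poisson_pmf(0, 3)
p_at_most_one_typo = poisson_pmf(0, 3) + poisson_pmf(1, 3)
print(round(p_four_births, 4), round(p_at_least_one_typo, 4), round(p_at_most_one_typo, 4))
```

"At least one" is computed through the complement P(X = 0), while "at most one" adds the two smallest outcomes.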
b. What is the probability that an order will be faxed within the next 9 minutes?
NOTE that the rate was in terms of hours whereas this question talks in minutes. So
we convert minutes to hours. 9 minutes= 9/60 hours
c. What is the probability that more than 12 minutes will elapse between faxed orders?
12 minutes= 12/60 hours= .2 hours
Q The publisher of a medical journal claims that probability of an error is .005. the
errors on each page are independent of each other. If a journal has 400 pages,
If we use a binomial then, the answer is .270669. The answers are close to each other,
proving that approximation holds.
= .135226+.270671+.270671= .676653
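The closeness of the binomial and Poisson answers can be seen numerically. The sketch assumes, since the question text is incomplete here, that the quantity being compared is the probability of exactly one error in n = 400 pages with p = 0.005, so that λ = np = 2; the function names are hypothetical.

```python
from math import comb, exp, factorial


def binomial_pmf(r, n, p):
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


def poisson_pmf(r, lam):
    return exp(-lam) * lam ** r / factorial(r)


n, p = 400, 0.005
lam = n * p  # Poisson mean used for the approximation

binomial_one_error = binomial_pmf(1, n, p)
poisson_one_error = poisson_pmf(1, lam)
print(round(binomial_one_error, 6), round(poisson_one_error, 6))
```

With n large and p small, the two values agree to about five decimal places.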
EMPIRICAL RULE:
The Empirical Rule is based on the above concept of z distribution. It states that if a data set
is normally distributed with population mean µ and standard deviation σ, then the following
are true:
About 68% of the values lie within 1 standard deviation of the mean. In statistical
notation, this is represented as μ ± σ.
About 95% of the values lie within 2 standard deviations of the mean. The
statistical notation for this is μ ± 2σ.
About 99.7% of the values lie within 3 standard deviations of the mean, or
between μ − 3σ and μ + 3σ.
Consider an example:
Q The Bulb Co, Ltd finds that its average CFL lasts 1000 hours with a standard
deviation of 100 hours. Assume that CFL life is normally distributed.
a. What is the probability that a randomly selected CFL will burn out in 1200 hours or
less?
Let x be the life of CFL in hours:
E(x) = 1000. standard deviation(x)=100.We want P( x <1200)
b. What is the probability that a randomly selected CFL will last more than 1200 hours?
Z=( 1200-1000)/100= 2
Since area under the curve is 1 we find P(z>2) = 1-.977= 0.023. Thus, there is a 2.3%
probability that a CFL will last more than 1200 hours.
c. What is the probability that a randomly selected CFL will last between 1100 and 1200
hours?
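The three CFL probabilities can be read off a normal cdf instead of a z table. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are assumptions of the example.

```python
from statistics import NormalDist

# CFL life in hours: mean 1000, standard deviation 100
cfl_life = NormalDist(mu=1000, sigma=100)

p_at_most_1200 = cfl_life.cdf(1200)                         # part a
p_more_than_1200 = 1 - p_at_most_1200                       # part b
p_between_1100_1200 = cfl_life.cdf(1200) - cfl_life.cdf(1100)  # part c
print(round(p_at_most_1200, 4), round(p_more_than_1200, 4), round(p_between_1100_1200, 4))
```

Part c is the difference of two cumulative areas, which corresponds to P(1 < z < 2) on the standard normal scale.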
Q Chemical Company claims that its chemical X contains on the average 4.0 fluid ml of
caustic materials per liter. It further states that the distribution of caustic materials per liter is
normal and has a standard deviation of 1.3 fluid ml. What proportion of the individual liter
containers for this product will contain more than 5.0 fluid ml of X?
P(x > 5) = P(z > (5-4)/1.3) = P(Z > .769) = 0.220947, so 22.0947% of the individual
liter containers for this product will contain more than 5.0 fluid ml of caustic materials.
Z1 = 0.842
X1= .842*3.7 +12.8 =15.9154
TIP: we need to understand the relation between x , z and area under normal
curve. Given any one of them we must be able to find the other two.
X Z AREA
The z table can be understood like this. The first column gives the value of z, while the
other columns contain the area to the left of a given z value. From the values shown
below:
i. P( z < .4) = .6554
ii. P( z <.46)= .6772
iii. P( z > .4) = 1-P( z <.4) = 1-.6554= .3446
iv. P( .4< z< 1.1) = .8643 -.6554= .2089
α         percentile   zα value    area to left of zα   area to right of zα
0.1       90           1.281552    90%                  10%
0.05      95           1.644854    95%                  5%
0.025     97.5         1.959964    97.50%               2.50%
0.01      99           2.326348    99.00%               1.00%
0.005     99.5         2.575829    99.50%               0.50%
0.001     99.9         3.090232    99.90%               0.10%
0.0005    99.95        3.290527    99.95%               0.05%
We can now calculate percentiles for non-standard normal distributions (those that do
not have mean 0 and standard deviation 1). If a data set X is normally distributed with
population mean µ and standard deviation σ, then the
100(1-α)th percentile for X is given as μ + (100(1-α)th percentile for z distribution) * σ.
Take an example:
Q. The CAT exam is used to enter prestigious IIMs in India. Assume that scores are
based on a normal distribution with a mean of 1500 and a standard deviation of 300. IIM
Bangalore offers an interview to those who are in the top 3%, while IIM Guwahati offers an
interview to top 8%. How much do you need to score to guarantee an interview in both
places?
Let scores obtained in CAT be denoted by X.
For Bangalore: We want a minimum 97th percentile here to qualify
Using α = .03 we get zα as 1.88 according to the standard normal distribution (z
distribution).
Converting this to X distribution implies that 97th percentile for X is 1500
+1.88*300= 2064. So we need a score of 2064 to get into IIM Bangalore.
For Guwahati: We want a minimum of the 92nd percentile here to qualify.
Using α = .08 we get zα as 1.405 according to the standard normal distribution (z
distribution).
Converting this to the X distribution implies that the 92nd percentile for X is 1500
+ 1.405*300 = 1921.5. So we need a score of 1921.5 to get into IIM Guwahati.
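The two cut-off scores are percentiles of a normal distribution with mean 1500 and standard deviation 300, so they can be obtained from the inverse cdf. A sketch using `statistics.NormalDist.inv_cdf` from the Python standard library; the variable names are assumptions of the example.

```python
from statistics import NormalDist

cat_scores = NormalDist(mu=1500, sigma=300)

# Top 3% means the 97th percentile; top 8% means the 92nd percentile.
cutoff_bangalore = cat_scores.inv_cdf(0.97)
cutoff_guwahati = cat_scores.inv_cdf(0.92)

# Same result via the standard normal: mu + z * sigma
z_97 = NormalDist().inv_cdf(0.97)
cutoff_bangalore_via_z = 1500 + z_97 * 300
print(round(cutoff_bangalore, 1), round(cutoff_guwahati, 1))
```

The small differences from 2064 and 1921.5 come from rounding z to 1.88 and 1.405 in the table lookup.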
=0 for x >b
questions:
a. What is probability density function of x?
f(x) = 1/(1 − (−1)) = 1/2 for −1 ≤ x ≤ 1
b. Consider the variable Y such that Y = 2X² − X. Determine the sample space of Y;
when X = 1, Y = (2*1² − 1) = 1 and when X = −1 then Y = (2*(−1)*(−1) − (−1)) = 3. The
minimum of Y occurs at X = 1/4, where Y = −1/8. So Y ranges from −1/8 to 3.
X Y
-1 3
-0.5 1
0 0
0.5 0
1 1
c. Compute the mean and variance of X
Mean = (−1 + 1)/2 = 0
Variance = (1 − (−1))²/12 = 4/12 = 1/3
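For a uniform random variable on [a, b], the density is 1/(b − a), the mean is (a + b)/2 and the variance is (b − a)²/12. The sketch below applies these standard formulas to the interval [−1, 1] used in this question and scans a grid to find the range of Y = 2X² − X; the grid search is an assumption of the example, chosen for simplicity.

```python
a, b = -1.0, 1.0

density = 1 / (b - a)
mean_x = (a + b) / 2
variance_x = (b - a) ** 2 / 12


def y_of(x):
    # Y = 2X^2 - X
    return 2 * x * x - x


# Evaluate Y on an evenly spaced grid over [a, b] to locate its extremes.
grid = [a + i * (b - a) / 2000 for i in range(2001)]
y_values = [y_of(x) for x in grid]
y_min, y_max = min(y_values), max(y_values)
print(density, mean_x, round(variance_x, 4), y_min, y_max)
```

The minimum of Y is attained inside the interval (at x = 1/4), not at an endpoint, which is why checking only X = ±1 is not enough.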
Q: The closing price of Sport Goods Ltd is uniformly distributed between Rs15 and
Rs33 per share.
What is the probability that the stock price will be:
a. More than Rs 28? =.27778
b. Less than or equal to Rs20? =.27778
3. USEFUL LINKS
http://mathworld.wolfram.com
http://www.stat.purdue.edu/~zhanghao/STAT511/handout/Stt511%20Sec3.5.pdf
http://www.stattrek.com
4. EXERCISES
Q1. In a normal distribution 31% of the observations are under 45 and 8% are above 64.
What are the mean and variance of X?
P( X >64) = P( z > z1) =0.08. from the normal tables z1= +1.405
Solve eq1 and 2 together to get mean = 49.95739 and sd = 9.99474 (sd is standard deviation)
Q2. The average time taken to finish a project by L&T is 11 months with standard deviation = 2.4
months. If the firm has 19 projects in the pipeline, how many can be expected to be
completed in less than 1 year?
E(X)=12.59 so 12 projects.
Q3. The waiting time for a playing court at a local tennis club is uniformly distributed
between 23.5 and 40.5 minutes. If the probability that Harsh has to wait for more than 30
minutes is 60%, he would rather play badminton. Which game will he choose?
Q4. JK tyres claims an average life of 45000 km for its tyres with standard deviation of
2000kms. Bharat buys 4 tyres for his old car. What is the chance that all 4 tyres will last at
least 46000 kms, assuming life of each tyre as independent of all other tyres in the car?
Probability of 1 tyre lasting more than 46000 kms is P(X≥46000) = P(z ≥.5)= .3085
Q5. Let X be normally distributed with mean=30 and variance=49. Find C such that P( (X
-30) <C) =.9545.
So C/7 = 1.69
C= 7*1.69= 11.83
Sem-II
1. Learning outcomes
2. Introduction
3. Covariance
c. Special cases
4. Correlation
5. Appendix
6. Summary
7. Exercises
8. Glossary
9. References
After you have read this chapter, you should be able to:-
1. Define Covariance.
4. Compute correlation.
1. Covariance
When X and Y are two random variables that are not independent, the
covariance between X and Y is
\[ Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] \]
where μ_X is the mean of variable X and μ_Y is the mean of variable Y.
If suppose, X and Y are positively related to each other, then this means that
when X attain large value then corresponding Y value also tend to be larger and
small values of X correspond to small values of Y. Then most of the probability
yielding a negative Cov.( X , Y ) . If they are not related at all, then positive
product values would tend to be cancelled out with negative product values,
In the above figure, '+' and '−' signs mark the areas where plotted X and Y
values give 'positive' or 'negative' products respectively. In panel (a) X
and Y have a positive relationship, in panel (b) X and Y have a negative relationship
and in panel (c) X and Y have no relationship, so covariance would be positive in
case (a), negative in case (b) and around zero in case (c).
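\[ Cov(X, Y) = \sum_{\text{all } x} \sum_{\text{all } y} (x - \mu_X)(y - \mu_Y)\, P_{XY}(x, y) \]
\[ = \sum_{\text{all } x} \sum_{\text{all } y} xy\, P_{XY}(x, y) - \mu_X \mu_Y \]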
                  Y
X          0       100     200     PX(x)
100        0.20    0.10    0.20    0.5
250        0.05    0.15    0.30    0.5
PY(y)      0.25    0.25    0.5     1
\[ \mu_X = E(X) = \sum x\, P_X(x) = 175 \]
\[ \mu_Y = E(Y) = \sum y\, P_Y(y) = 125 \]
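\[ Cov(X, Y) = E[XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y] \]
\[ = E(XY) - \mu_X E(Y) - \mu_Y E(X) + \mu_X \mu_Y \]
\[ = E(XY) - \mu_X \mu_Y - \mu_Y \mu_X + \mu_X \mu_Y \]
\[ = E(XY) - \mu_X \mu_Y \]
For continuous random variables,
\[ Cov(X, Y) = \iint xy\, f_{XY}(x, y)\,dx\,dy - \mu_X \mu_Y = E(XY) - \mu_X \mu_Y \]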
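\[ f_{XY}(x, y) = \begin{cases} 24xy & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\ 0 & \text{otherwise} \end{cases} \]
\[ \text{and marginal PDF is } f_X(x) = \begin{cases} 12x(1 - x)^2 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \]
\[ E(XY) = \int_{0}^{1}\int_{0}^{1-x} xy \cdot 24xy\,dy\,dx = \int_{0}^{1} 24x^2 \left[\int_{0}^{1-x} y^2\,dy\right]dx = \int_{0}^{1} \frac{24}{3} x^2 \left[y^3\right]_{0}^{1-x} dx \]
\[ = 8\int_{0}^{1} x^2 (1 - x)^3\,dx = 8\int_{0}^{1} \left(x^2 - x^5 + 3x^4 - 3x^3\right)dx = \frac{2}{15} \]
\[ \mu_X = \int_{0}^{1} x \cdot 12x(1 - x)^2\,dx = \int_{0}^{1} \left(12x^2 + 12x^4 - 24x^3\right)dx = \left[12\frac{x^3}{3} + 12\frac{x^5}{5} - 24\frac{x^4}{4}\right]_{0}^{1} \]
\[ = 4 + \frac{12}{5} - 6 = \frac{12}{5} - 2 = \frac{2}{5} \]
Also, μ_Y = 2/5 since the marginal PDF is the same.
\[ Cov(X, Y) = E(XY) - \mu_X \mu_Y = \frac{2}{15} - \frac{2}{5} \cdot \frac{2}{5} = \frac{10 - 12}{75} = -\frac{2}{75} \]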
This minus sign shows a negative relationship between X and Y (since x + y ≤ 1,
more of X would mean less of Y).
d) Special Cases
Cov(X, Y) = E(XY) − E(X) E(Y)
= 0
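But the above result must be used with caution, as the reverse is not true. Consider
the sample space {(−2, 4), (−1, 1), (0, 0), (1, 1), (2, 4)}, where each point is
equally likely. Let the random variable X be the first component of the sample point
and Y the second.
\[ P(X = 1, Y = 1) = \frac{1}{5}, \quad P(X = 1) = \frac{1}{5}, \quad P(Y = 1) = \frac{2}{5} \]
So P(X = 1, Y = 1) ≠ P(X = 1) P(Y = 1), and X and Y are not independent. However,
\[ E(XY) = \frac{1}{5}\left[(-8) + (-1) + 0 + 1 + 8\right] = 0 \]
\[ E(X) = \frac{1}{5}\left[(-2) + (-1) + 0 + 1 + 2\right] = 0 \]
\[ E(Y) = \frac{1}{5}\left[4 + 1 + 0 + 1 + 4\right] = 2 \]
\[ Cov(X, Y) = 0 - 0 \times 2 = 0 \]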
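The point of this counterexample, zero covariance without independence, can be confirmed with exact arithmetic. The sketch below uses `fractions.Fraction` from the standard library; the five sample points are those of the example (with Y = X²), and the variable names are assumptions.

```python
from fractions import Fraction

points = [(-2, 4), (-1, 1), (0, 0), (1, 1), (2, 4)]
p = Fraction(1, len(points))  # each point is equally likely

e_x = sum(p * x for x, _ in points)
e_y = sum(p * y for _, y in points)
e_xy = sum(p * x * y for x, y in points)
covariance = e_xy - e_x * e_y

# Check the independence condition at the single point (X = 1, Y = 1).
p_x1 = sum(p for x, _ in points if x == 1)
p_y1 = sum(p for _, y in points if y == 1)
p_joint = sum(p for x, y in points if x == 1 and y == 1)
factorises = p_joint == p_x1 * p_y1
print(covariance, p_joint, p_x1 * p_y1, factorises)
```

One failing point is enough to rule out independence, even though the covariance is exactly zero.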
\[ Cov(X, X) = E(X^2) - E(X)E(X) = E(X^2) - [E(X)]^2 = Var(X) \]
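c) Suppose X and Y are random variables and a and b are constants; then
\[ Var(aX + bY) = a^2 V(X) + b^2 V(Y) + 2ab\, cov(X, Y) \]
The proof is as follows:
\[ Var(aX + bY) = E[(aX + bY)^2] - [E(aX + bY)]^2 \]
\[ = a^2 E(X^2) + b^2 E(Y^2) + 2ab E(XY) - a^2 \mu_X^2 - b^2 \mu_Y^2 - 2ab \mu_X \mu_Y \]
\[ = a^2 [E(X^2) - \mu_X^2] + b^2 [E(Y^2) - \mu_Y^2] + 2ab [E(XY) - \mu_X \mu_Y] \]
\[ = a^2 V(X) + b^2 V(Y) + 2ab\, cov(X, Y) \]
More generally,
\[ var\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2\, var(X_i) + 2 \sum\sum_{i<j} a_i a_j\, cov(X_i, X_j) \]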
If X1 ,..., X n are independent random variables and all ai ' s are equal to 1 then
e) Scaling of Variables
was 1875 and in the second it was −2/75.
Could we conclude that X and Y variables in first example have strong
(positive) relationship and a weaker (negative) relationship emerges in second
example? The Answer to it is No!
To see this, let's scale our random variables by a and check the covariance
of the two new variables aX and aY, comparing it with the covariance of X and Y.
\[ cov(aX, aY) = E(a^2 XY) - aE(X) \cdot aE(Y) \]
\[ = a^2 [E(XY) - E(X)E(Y)] \]
\[ = a^2\, cov(X, Y) \]
2. Correlation
As discussed in the last section, covariance suffers from the defect
that scaling the variables alters its value, so it does not serve
as a measure of the strength of a relationship. A better measure is the
correlation coefficient.
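where \( X^* = (X - \mu_X)/\sigma_X \)
and \( Y^* = (Y - \mu_Y)/\sigma_Y \)
since
\[ cov(X^*, Y^*) = E\left[\frac{X - \mu_X}{\sigma_X} \cdot \frac{Y - \mu_Y}{\sigma_Y}\right] - E\left[\frac{X - \mu_X}{\sigma_X}\right] E\left[\frac{Y - \mu_Y}{\sigma_Y}\right] \]
\[ = \frac{1}{\sigma_X \sigma_Y} E[(X - \mu_X)(Y - \mu_Y)] - \frac{1}{\sigma_X \sigma_Y} E(X - \mu_X)\, E(Y - \mu_Y) \]
\[ = \frac{1}{\sigma_X \sigma_Y}\, cov(X, Y) - 0 \]
Since \( E(X - \mu_X) = E(X) - \mu_X = \mu_X - \mu_X = 0 \)
and \( E(Y - \mu_Y) = E(Y) - \mu_Y = \mu_Y - \mu_Y = 0 \)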
X* and Y* are called standardised variables. So correlation coefficient is
covariance between these standardized variables.
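\[ Var(X^* + Y^*) \ge 0 \]
\[ var\left(\frac{X - \mu_X}{\sigma_X}\right) + 2\,cov(X^*, Y^*) + var\left(\frac{Y - \mu_Y}{\sigma_Y}\right) \ge 0 \]
\[ \frac{1}{\sigma_X^2}[var(X) - 0] + 2\,cov(X^*, Y^*) + \frac{1}{\sigma_Y^2}[var(Y) - 0] \ge 0 \]
[Since variance of constant is zero]
\[ 1 + 2\rho(X, Y) + 1 \ge 0, \quad \text{so } 1 + \rho(X, Y) \ge 0 \]
Similarly,
\[ var(X^* - Y^*) = 2[1 - \rho(X, Y)] \ge 0, \quad \text{so } 1 - \rho(X, Y) \ge 0 \]
Hence
\[ |\rho(X, Y)| \le 1 \]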
X * Y* (say)
X* Y* (linear relationship)
Since the statement is "if and only if", only the first part has been proved. For the second part,
let Y = aX + b with a > 0;
then E(Y) = aE(X) + b and V(Y) = a²V(X), so that
σ_Y = aσ_X
\[ \rho(X, Y) = \frac{E(XY) - E(X)E(Y)}{\sigma_X \sigma_Y} \]
Putting Y = aX + b, E(Y) = aE(X) + b and σ_Y = aσ_X,
\[ \rho(X, Y) = \frac{E(X(aX + b)) - E(X)(aE(X) + b)}{\sigma_X \cdot a\sigma_X} \]
\[ = \frac{E(aX^2 + bX) - a(E(X))^2 - bE(X)}{a\sigma_X^2} \]
\[ = \frac{aE(X^2) - a(E(X))^2 + bE(X) - bE(X)}{a\sigma_X^2} = \frac{a\sigma_X^2}{a\sigma_X^2} = 1 \]
Hence the second part of the statement is also proved. Similar results can be obtained for
ρ(X, Y) = −1 (when a < 0).
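For constants a, b, c, d with a and c of the same sign, ρ(aX + b, cY + d) = ρ(X, Y):
\[ \rho(aX + b, cY + d) = \frac{cov(aX + b, cY + d)}{\sqrt{V(aX + b)\,V(cY + d)}} = \frac{a c\, cov(X, Y)}{\sqrt{a^2 V(X)\, c^2 V(Y)}} = \frac{a c\, cov(X, Y)}{|a c|\, \sigma_X \sigma_Y} \]
\[ \rho(aX + b, cY + d) = \rho(X, Y) \]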
This amounts to saying that scaling variables up or down in the same direction
does not affect the correlation coefficient.
A zero value of covariance as discussed earlier, need not imply that X and
Y are independent and so is for ; 0 does not imply that there is no
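\[ P(x, y) = \frac{1}{4} \quad \text{for } (x, y) = (-4, 1),\ (4, -1),\ (2, 2),\ (-2, -2) \]
\[ \mu_X = \frac{1}{4}(-4) + \frac{1}{4}(4) + \frac{1}{4}(2) + \frac{1}{4}(-2) = 0 \]
\[ \mu_Y = \frac{1}{4}(1) + \frac{1}{4}(-1) + \frac{1}{4}(2) + \frac{1}{4}(-2) = 0 \]
\[ E(XY) = \frac{1}{4}(-4) + \frac{1}{4}(-4) + \frac{1}{4}(4) + \frac{1}{4}(4) = 0 \]
\[ cov(X, Y) = E(XY) - \mu_X \mu_Y = 0, \quad \text{so } \rho_{XY} = 0 \]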
Plotting all pairs of X and Y on the graph shows that two variables are dependent
following graph
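\[ \sigma_X^2 = E(X^2) - (E(X))^2 = 36250 - (175)^2 = 5625, \quad \sigma_X = \sqrt{5625} = 75 \]
\[ \sigma_Y^2 = E(Y^2) - (E(Y))^2 = 6875, \quad \sigma_Y = \sqrt{6875} = 82.92 \]
\[ \rho(X, Y) = \frac{1875}{75 \times 82.92} = 0.301 \]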
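The covariance of 1875 and the correlation of about 0.301 can both be computed directly from the joint probability table of the first example (X taking values 100 and 250, Y taking values 0, 100 and 200). The sketch below stores that table as a dictionary; the variable names are assumptions of the example.

```python
from math import sqrt

# Joint pmf P(X = x, Y = y) from the first covariance example
joint = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

mean_x = sum(x * p for (x, _), p in joint.items())
mean_y = sum(y * p for (_, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
covariance = e_xy - mean_x * mean_y

var_x = sum(x * x * p for (x, _), p in joint.items()) - mean_x ** 2
var_y = sum(y * y * p for (_, y), p in joint.items()) - mean_y ** 2
correlation = covariance / sqrt(var_x * var_y)
print(round(mean_x, 2), round(mean_y, 2), round(covariance, 2), round(correlation, 3))
```

Unlike the covariance, the correlation would not change if X and Y were both rescaled by the same positive factor.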
Example 5:
Risk on Securities
In the last chapter, we calculated expected returns of the two securities but we
did not mention anything of risk. Risk is measured by standard deviation.
Standard deviation is an estimate of the likely divergence of an actual return from
expected return. So standard deviation is useful measure of risk as it weighs the
deviation with possible probability of that outcome.
Let’s revisit the old example of securities A & B. Following are the returns in
different states.
State                            1      2     3     4     5     Total
Returns on Security A, RA        10%    12    8     14    19
Returns on Security B, RB        20%    25    33    27    22
= (.10X(-1.5)2) + (.25X(0.5)2)+(.35X(-3.5)2)+(.20X(3.5)2)+(.10X(8.5)2)
= 0.225+0.0625+4.2875+2.45+7.225=14.25
Similarly,
= 5.476+1.44+10.976+0.032+2.916=20.84
Although the expected return on security B is higher, its risk, measured by
standard deviation, is also higher.
Portfolio
Let us study the same portfolio of A & B with the same weights of 0.25 & 0.75
respectively. V(Rp): Variance of Portfolio = V(0.25RA + 0.75RB)
= (0.1x11.1)+(0.25x(-1.2))+.(35x(-19.6))+(.20x(-1.4))+(.10x(-5.4))
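V(RP) = [(0.25)² x 14.25] + [(0.75)² x 20.84] + [2 x (0.75) x (0.25) x (-7.87)]
= 0.8906 + 11.7225 + (-2.95125)
= 9.66
\[ \rho(R_A, R_B) = \frac{cov(R_A, R_B)}{\sigma_A \sigma_B} = \frac{-7.87}{\sqrt{14.25} \times \sqrt{20.84}} = -0.457 \]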
Exercises:
Q.1 Suppose that two dice are thrown. Let x be the number showing on the first
die and let y be the larger of the two numbers showing. Find cov (X,Y).
Q.2 Show that Cov(ax+b,cy+d) = ac cov (x,y) for any constants a,b,c,& d.
0, elsewhere
Sem-II
1. Learning outcomes
2. Introduction
7. Appendix
8. Summary
9. Exercises
10. Glossary
11. References
After you have read this chapter, you should be able to:-
3. State the relationship between Joint Probability Distribution and Joint Cumulative
Distribution Function.
6. Calculate the marginal and conditional Probability distribution function given Joint
Cumulative Distribution Function.
In the previous chapters, a random variable was defined over some sample space S
with a probability measure, and many different random variables can be defined over
the same sample space S. To illustrate this, suppose there is a random
variable X defined as the total observed when a pair of dice is rolled. This is not the only random
variable that can be studied when a pair of dice is rolled. For example, one may be interested in
the product of, or the difference between, the two numbers observed on the two dice. In this
chapter, we shall study pairs of random variables defined over a joint sample space at the
same time. For example, we may wish to study simultaneously the probabilities of
the events X : score on the 1st die and Y : greater of the two scores.
The first section of this chapter deals with joint PDFs for discrete and continuous random
variables. The second section focuses on the joint cumulative distribution function. The third
section covers the independence of the random variables X and Y. Finally, conditional
probability distributions are discussed.
Suppose S is a discrete sample space on which two random variables, say X and Y, are defined.
The probability that X takes the value x and Y takes the value y is denoted by
\[ P_{X,Y}(x, y) = P(X = x \text{ and } Y = y) \]
Table 1
Sample space X Y
(1, 1) 1 1
(1, 2) 1 2
(1, 3) 1 3
(1, 4) 1 4
(2, 1) 2 2
(2, 2) 2 2
(2, 3) 2 3
(2, 4) 2 4
(3, 1) 3 3
(3, 2) 3 3
(3, 3) 3 3
(3, 4) 3 4
(4, 1) 4 4
(4, 2) 4 4
(4, 3) 4 4
(4, 4) 4 4
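Table 2
Y                        1       2       3       4       PX(x) (Row Total)
X = 1                    1/16    1/16    1/16    1/16    4/16
X = 2                    0       2/16    1/16    1/16    4/16
X = 3                    0       0       3/16    1/16    4/16
X = 4                    0       0       0       4/16    4/16
PY(y) (Column Total)     1/16    3/16    5/16    7/16    1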
From the above table, PX,Y(2, 2) = 2/16, i.e. the score on the 1st tetrahedron is 2 and the
highest of the two scores is 2 when (2, 1) or (2, 2) is the outcome. PX,Y(4, 3) is zero since, if
4 is the score on the 1st tetrahedron, the highest of the two can't be 3 (it has to be four, i.e. the
score observed on the 1st tetrahedron).
Now let's study the last row and last column of the above table. Suppose, given the joint PDF,
we wish to calculate the probability distribution of either X or Y alone. X can take the value 1 when
Y takes the value 1, 2, 3 or 4. So P(X = 1) is obtained by summing horizontally, giving 4/16. Likewise,
for P(Y = 1) the probabilities are added vertically. The last row, labelled PY(y), is known as the
marginal probability distribution function of Y.
For example:
X : P(X = 1) = P(X = 1, Y = 1) + P(X = 1, Y = 2) + P(X = 1, Y = 3) + P(X = 1, Y = 4)
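The joint and marginal distributions for the two tetrahedral dice can be built by enumerating the 16 equally likely outcomes of Table 1. This is a sketch; the dictionary layout and names are assumptions of the example, with X the score on the first tetrahedron and Y the larger of the two scores.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

faces = range(1, 5)  # a tetrahedral die shows 1, 2, 3 or 4

# Joint pmf: accumulate 1/16 for each outcome under its (X, Y) pair.
joint = defaultdict(Fraction)
for first, second in product(faces, repeat=2):
    joint[(first, max(first, second))] += Fraction(1, 16)

# Marginals: sum the joint pmf over the other variable.
marginal_x = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in faces}
marginal_y = {y: sum(p for (_, yy), p in joint.items() if yy == y) for y in faces}

print(joint[(2, 2)], joint.get((4, 3), Fraction(0)))
print(marginal_x, marginal_y)
```

Summing a row of the joint table gives a value of the marginal of X, and summing a column gives a value of the marginal of Y.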
Suppose that PXY ( x, y ) is the joint PDF of the discrete random variables X and Y. Then,
marginal PDF for X is obtained by summing joint PDF over all values of Y i.e.
distribution of X and Y.
possible values of X & Y the probability of occurrence of any of the possible events is surely
1. This can be verified from table 2.
The minimum value of the joint probability was zero and the maximum value was 4/16. Summing all
the probabilities of X over all Y values gives PX(x), and then summing all PX(x) gives 1, and
vice-versa.
Since
\[ \sum_{\text{all } x} \sum_{\text{all } y} f(x, y) = 1 \]
k + 2k + 3k + 2k + 4k + 6k + 3k + 6k + 9k = 1
36k = 1
For k = 1/36, f(x, y) = kxy serves as a probability distribution function.
b) Continuous Probability Density Function
If X and Y are continuous random variables, then the joint PDF is defined as a function that, when
integrated over a range of values of X and Y¹, gives the probability that X and Y take on
values within that range. Suppose there exists a function f_XY(x, y); then for any region R in the
xy-plane,
\[ P[(X, Y) \in R] = \iint_{R} f_{XY}(x, y)\,dx\,dy \]
Example 3. A study showed that the daily number of hours, X, a teenager watches T.V. and the
daily number of hours, Y, a teenager studies are approximated by the joint PDF
\[ f_{XY}(x, y) = xy\,e^{-(x + y)}, \quad x > 0,\ y > 0 \]
Suppose a teenager is chosen at random. We want the probability that he spends at least twice as
much time watching TV as he does working on his studies.
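The region R (in the definition of continuous PDFs) corresponds to the part of the xy-plane where $X \ge 2Y$, so
\[ P(X \ge 2Y) = \int_0^{\infty} \int_0^{x/2} xy\,e^{-(x+y)}\,dy\,dx \]
\[ = \int_0^{\infty} x e^{-x} \left[ \int_0^{x/2} y e^{-y}\,dy \right] dx \]
\[ = \int_0^{\infty} x e^{-x} \left[ 1 - \left( 1 + \frac{x}{2} \right) e^{-x/2} \right] dx \]
\[ = 1 - \frac{16}{54} - \frac{4}{9} = \frac{7}{27} \]
Note 1: In the continuous case $P((X = x), (Y = y)) = 0$, i.e. the probability at any single point is zero. X and Y need to take values over the range R.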
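The value 7/27 can be checked numerically. The following is only a sketch: it integrates the last one-dimensional integrand above with a midpoint rule, and the truncation point 60 and the number of steps are arbitrary choices, not part of the original example.

```python
import math

def integrand(x):
    # x e^{-x} [1 - (1 + x/2) e^{-x/2}]: the inner integral over y in closed form.
    return x * math.exp(-x) * (1.0 - (1.0 + x / 2.0) * math.exp(-x / 2.0))

def midpoint_integral(f, lower, upper, steps):
    # Plain midpoint rule on [lower, upper].
    width = (upper - lower) / steps
    return width * sum(f(lower + (i + 0.5) * width) for i in range(steps))

# The integrand is negligible beyond x = 60, so the infinite range is truncated there.
estimate = midpoint_integral(integrand, 0.0, 60.0, 200_000)
print(estimate, 7 / 27)
```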
(ii) ( x, y)dydx 1
Example 4 : Suppose that joint probability density function for two continuous random
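\[ 1 = c \int_0^1 \left[ \int_0^x xy\,dy \right] dx \]
\[ 1 = c \int_0^1 x \left[ \frac{y^2}{2} \right]_0^x dx \]
\[ 1 = c \int_0^1 \frac{x^3}{2}\,dx \]
\[ 1 = c \left[ \frac{x^4}{8} \right]_0^1 \]
\[ 1 = \frac{c}{8} \]
\[ c = 8. \]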
\[ f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dy \]
and
\[ f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx \]
where $f_X(x)$ and $f_Y(y)$ are the marginal PDFs for X and Y, and $f_{XY}(x, y)$ is the joint PDF for X and Y.
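Example 5: Suppose that the joint PDF for two continuous variables is given by
\[ f_{XY}(x, y) = \frac{1}{6}, \quad 0 \le x \le 3,\ 0 \le y \le 2 \]
Then the marginal PDF of X is given by
\[ f_X(x) = \int_0^2 \frac{1}{6}\,dy = \left[ \frac{y}{6} \right]_0^2 = \frac{1}{3} \quad \text{for } 0 \le x \le 3. \]
So, X is a uniform random variable defined over the interval [0, 3].
\[ f_Y(y) = \int_0^3 \frac{1}{6}\,dx = \left[ \frac{x}{6} \right]_0^3 = \frac{1}{2} \quad \text{for } 0 \le y \le 2 \]
Y is also a uniform random variable defined over [0, 2].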
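\[ F(x, y) = P(X \le x, Y \le y) = \sum_{s \le x} \sum_{t \le y} f(s, t) \]
\[ F(x, y) = \int_{-\infty}^{y} \int_{-\infty}^{x} f(s, t)\,ds\,dt \quad \text{for } -\infty < x < \infty,\ -\infty < y < \infty \]
Then it is clear that $f_{XY}(x, y) = \dfrac{\partial^2}{\partial x\,\partial y} F(x, y)$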
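\[ \frac{\partial^2 F(x, y)}{\partial x\,\partial y} = \frac{\partial^2}{\partial x\,\partial y} (1 - e^{-x})(1 - e^{-y}) \]
\[ = \frac{\partial^2}{\partial x\,\partial y} \left( 1 - e^{-x} - e^{-y} + e^{-(x+y)} \right) \]
\[ = \frac{\partial}{\partial x} \left[ e^{-y} - e^{-(x+y)} \right] \]
\[ = e^{-(x+y)} \quad \text{for } x > 0 \text{ and } y > 0 \]
\[ f_{XY}(x, y) = \begin{cases} e^{-(x+y)} & \text{for } x > 0 \text{ and } y > 0 \\ 0 & \text{elsewhere} \end{cases} \]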
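Suppose there are n discrete variables $X_1, X_2, \ldots, X_n$ defined over the same sample space S. Then,
\[ f(x_1, x_2, \ldots, x_n) = P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) \]
and
\[ F(x_1, x_2, \ldots, x_n) = P(X_1 \le x_1, X_2 \le x_2, \ldots, X_n \le x_n) \]
In the continuous case,
\[ F(x_1, x_2, \ldots, x_n) = \int_{-\infty}^{x_n} \cdots \int_{-\infty}^{x_2} \int_{-\infty}^{x_1} f(t_1, \ldots, t_n)\,dt_1\,dt_2 \cdots dt_n \]
Also,
\[ f(x_1, x_2, \ldots, x_n) = \frac{\partial^n}{\partial x_1\,\partial x_2 \cdots \partial x_n} F(x_1, x_2, x_3, \ldots, x_n) \]
and the marginal pdf of $X_1$ is
\[ g(x_1) = \int \cdots \int f(x_1, \ldots, x_n)\,dx_2 \cdots dx_n \]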
( x y)z
f ( x, y , z ) for x = 1, 2,
63
y = 1, 2, 3,
z = 1, 2
3 4 6
63 63 63
13
63
are independent if and only if there are functions g(x) and h(y) such that
\[ f_{XY}(x, y) = g(x)\,h(y) \]
If the above equation holds, there is a constant k such that $f_X(x) = k\,g(x)$ and $f_Y(y) = \frac{1}{k} h(y)$, where k is set to be $\int h(y)\,dy$.
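\[ f_{XY}(x, y) = 12xy(1 - y), \quad 0 \le x \le 1,\ 0 \le y \le 1. \]
\[ f_{XY}(x, y) = 12x\,[y(1 - y)] = g(x) \cdot h(y) \]
\[ f_Y(y) = \frac{1}{k} h(y) \]
where
\[ k = \int h(y)\,dy = \int_0^1 y(1 - y)\,dy = \left[ \frac{y^2}{2} - \frac{y^3}{3} \right]_0^1 = \frac{1}{6} \]
so
\[ f_X(x) = 12x \cdot \frac{1}{6} = 2x \quad \text{for } 0 \le x \le 1 \]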
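\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)} \]
Suppose these events are X = x and Y = y; then
\[ P(X = x \mid Y = y) = \frac{P(X = x \ \&\ Y = y)}{P(Y = y)} \]
\[ f_{X|Y}(x \mid y) = \frac{f_{XY}(x, y)}{h_Y(y)}, \quad h_Y(y) \ne 0 \]
where $f_{XY}(x, y)$ is the joint PDF of X and Y and $h_Y(y)$ is the value of the marginal distribution of Y at y. The joint and marginal distributions are probability mass functions or density functions depending upon whether X and Y are discrete or continuous random variables respectively.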
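\[ f(x, y) = \begin{cases} 4xy & \text{for } 0 < x < 1,\ 0 < y < 1 \\ 0 & \text{elsewhere} \end{cases} \]
Marginal distribution of X:
\[ g(x) = \int f(x, y)\,dy = \int_0^1 4xy\,dy = 2xy^2 \Big|_0^1 = 2x \]
Marginal distribution of Y:
\[ h(y) = \int f(x, y)\,dx = \int_0^1 4xy\,dx = 2x^2 y \Big|_0^1 = 2y \]
\[ f(x \mid y) = \frac{f(x, y)}{h(y)} = \frac{4xy}{2y} = 2x \quad \text{for } 0 < x < 1 \]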
Generalisation II
f ( x1 , x2 ,..., xn ) f1 ( x1 ) f 2 ( x2 ).... f n ( xn )
f ( x1....xn )
z ( x2 , x3 ,....xn / x1 )
g ( x1 )
Appendix
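The collection of sets (Y = y) for all y forms a partition of S; that is, they are disjoint and $\bigcup_{\text{all } y} (Y = y) = S$. The set
\[ (X = x) = (X = x) \cap S = \bigcup_{\text{all } y} [(X = x) \cap (Y = y)], \]
so
\[ P_X(x) = P(X = x) = P\Big( \bigcup_{\text{all } y} [(X = x) \cap (Y = y)] \Big) \]
\[ = \sum_{\text{all } y} P((X = x), (Y = y)) \]
\[ = \sum_{\text{all } y} P_{XY}(x, y) \]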
II. Proof to the theorem that f X ( x) f XY ( x, y ) dy where f XY ( x, y ) is joint
x
FX ( x) P ( X x ) f XY (t , y ) dtdy
f X ( x) f XY ( x, y ) dy
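III. Proof to the theorem that the continuous random variables X and Y are independent if and only if
\[ f_{XY}(x, y) = g(x)\,h(y) \]
The first part of the proof assumes that X and Y are independent. Then
\[ F_{XY}(x, y) = P(X \le x)\,P(Y \le y) = F_X(x)\,F_Y(y) \]
\[ f_{XY}(x, y) = \frac{\partial^2}{\partial x\,\partial y} F_{XY}(x, y) = \frac{\partial^2}{\partial x\,\partial y} F_X(x)\,F_Y(y) \]
\[ f_{XY}(x, y) = \frac{\partial}{\partial x} F_X(x)\,\frac{\partial}{\partial y} F_Y(y) = f_X(x)\,f_Y(y) \]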
The second part of the proof assumes $f_{XY}(x, y) = g(x)\,h(y)$ and needs to prove that X and Y are independent.
\[ f_X(x) = \int f_{XY}(x, y)\,dy = \int g(x)\,h(y)\,dy = g(x) \int h(y)\,dy \]
Let $k = \int h(y)\,dy$, so $f_X(x) = k\,g(x)$. Similarly,
\[ f_Y(y) = \int g(x)\,h(y)\,dx = h(y) \int g(x)\,dx = h(y)\,\frac{\int g(x)\,dx \int h(y)\,dy}{\int h(y)\,dy} = h(y)\,\frac{\iint g(x)\,h(y)\,dx\,dy}{k} = h(y) \cdot 1 \cdot \frac{1}{k} = \frac{1}{k} h(y) \]
Therefore,
\[ P(X \in A, Y \in B) = \int_A \int_B k\,g(x) \cdot \frac{1}{k} h(y)\,dy\,dx = \int_A f_X(x)\,dx \int_B f_Y(y)\,dy = P(X \in A)\,P(Y \in B) \]
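Let $Y = aX + b$. Then
\[ P_Y(y) = P_X\!\left( \frac{y - b}{a} \right) \]
since
\[ P(Y = y) = P\!\left( X = \frac{y - b}{a} \right) = P_X\!\left( \frac{y - b}{a} \right) \]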
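Let $W = X + Y$, where X and Y are independent. Then,
\[ P_W(w) = \sum_{\text{all } x} P_X(x)\,P_Y(w - x) \]
\[ P_W(w) = P(W = w) = P(X + Y = w) \]
\[ = P\Big( \bigcup_{\text{all } x} (X = x, Y = w - x) \Big) \]
\[ = \sum_{\text{all } x} P(X = x, Y = w - x) \]
\[ = \sum_{\text{all } x} P(X = x)\,P(Y = w - x) \]
\[ = \sum_{\text{all } x} P_X(x)\,P_Y(w - x) \]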
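The convolution sum above can be sketched in a few lines. The example distribution (two fair six-sided dice) and the function name `pmf_of_sum` are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def pmf_of_sum(p_x, p_y):
    """PMF of W = X + Y for independent X and Y given as {value: probability} dicts."""
    p_w = {}
    for x, px in p_x.items():
        for y, py in p_y.items():
            # Each pair (x, y) contributes P_X(x) * P_Y(y) to the total w = x + y.
            p_w[x + y] = p_w.get(x + y, Fraction(0)) + px * py
    return p_w

# Hypothetical example: the total shown by two fair dice.
die = {k: Fraction(1, 6) for k in range(1, 7)}
p_w = pmf_of_sum(die, die)
print(p_w[7], sum(p_w.values()))
```

Looping over pairs (x, y) is equivalent to summing $P_X(x) P_Y(w - x)$ over all x for each w, because y = w − x.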
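\[ f_W(w) = \int_{-\infty}^{\infty} f_X(x)\,f_Y(w - x)\,dx \]
\[ F_W(w) = P(X + Y \le w) \]
\[ F_W(w) = \int_{-\infty}^{\infty} \int_{-\infty}^{w - x} f_X(x)\,f_Y(y)\,dy\,dx = \int_{-\infty}^{\infty} f_X(x) \left[ \int_{-\infty}^{w - x} f_Y(y)\,dy \right] dx \]
\[ f_W(w) = \frac{d}{dw} F_W(w) = \frac{d}{dw} \int_{-\infty}^{\infty} f_X(x)\,F_Y(w - x)\,dx \]
\[ = \int_{-\infty}^{\infty} f_X(x)\,f_Y(w - x)\,dx \]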
Q.1 If pxy (x,y) = cxy at the points (1,1), (2,1),(2,2) and (3,1) , and equals 0, elsewhere.
Find c.
Q.2 Suppose that random variables x and y vary in accordance with the joint pdf,
fxy(x,y)=c(x+y), 0<x<1, 0<y<1. Find c.
Q.3. An advisor looks over the schedules of his fifty students to see how many math and
science courses each has registered for in the coming semester. He summarizes his results
in a table. What is the probability that a student selected at random will have signed up for
more math courses than science courses?
Number of science          Number of math courses, X
courses, Y                 0      1      2
0                          11     6      4
1                          9      10     3
2                          5      0      2
Q.4 suppose that x & y have a bivariate uniform density over the unit square:
& 0, elsewhere
i) find c
Find p (Y<3X)
Q8. Find the joint pdf associated with two random variables X & Y whose joint cdf is
Q9. The four random variables W, X, Y & Z have the multivariate pdf
fWXYZ(w,x,y,z) = 16wxyz
for 0<w<1, 0<x<1, 0<y<1, and 0<z<1. Find the marginal pdf fWX(w,x) and use it to
compute P(0<W<1/2, ½<X<1)
Q10. Suppose fX (x)= x , x 0 and fY(y)= , y0 where X and Y are independent. Find
the pdf of X and Y.
References:
1. Jay L. Devore, Probability and Statistics for Engineers, Cengage Learning, 2010.
2. William G. Cochran, Sampling Techniques, John Wiley, 2007.
3. Richard J. Larsen and Morris. L. Marx, An Introduction to Mathematical
Statistics and its Applications, Prentice Hall, 2011.
Sem-II
1. Learning outcomes
2. Introduction
3. Conditional Expectation
4. Unconditional Expectation
6. Appendix
7. Summary
8. Exercises
9. Glossary
10. References
After you have read this chapter, you should be able to:-
As we saw in last few chapters Expected value in one variable case is the
and E(X ) xf X ( x ) if X is continuous random variable. E(X) is the value that
random variable X and E ( g ( X )) g ( x ) f X ( x) for continuous random
variable X.
This chapter is divided into three sections. The first section covers the conditional
expectation of one variable given a value of another variable, for discrete and
continuous random variables. The second section covers the expected value
of a function of two or more random variables. The last section focuses on some
laws of expectation, followed by their proofs.
1. Conditional Expectation
X for a given value of Y = y; where X and Y are two random variables. Conditional
expectation of X given Y is equal to the mean of the conditional distribution of X
given Y.
\[ E(X \mid Y = y_1) = \sum_{\text{all } x} x\,f_{X|Y}(x \mid y_1) \]
marginal probability.
Example 1: Consider the following joint PDF for random variables X and Y, where
X stands for no. of printers sold and y represents no. of computers sold
X
Y 1 2 3 fY(y)
sold is known to be 2.
\[ E(X \mid Y = 2) = \frac{\sum_{\text{all } x} x\,f_{XY}(x, 2)}{f_Y(2)} = \frac{0.22}{0.10} = 2.2 \]
xf XY ( x, y )dx
R
fY ( y ) dy
Example 2
( + 1) 0 < < 1
f(x| ) =
0 ℎ
E(X| )=∫ ( + 1)
=∫ ( + )
2. Unconditional Expectation
If we wish to find expected value of some single valued function of X & Y, say
of g ( X ,Y ) .
\[ E(g(X, Y)) = \sum_{\text{all } x} \sum_{\text{all } y} g(x, y)\,f_{XY}(x, y) \]
X and Y variables.
Y
X 1 1 fX(x)
1 0 0.2 0.2
2 0.2 0.3 0.5
3 0.1 0.2 0.3
fY(y) 0.3 0.7 1
Then E ( XY ) xy. XY ( x, y)
0.10
then E ( XY ) gives us the expected value that XY could take if X and Y are
randomly chosen.
2 2
Similarly, E ( X Y ) x y XY ( xy )
=2
X 2Y can take values from set A = {1, 4, 9, 1, 4,9}, E ( X 2Y ) = 2,gives the
2
expected value ( X Y ) could take if X and Y are randomly chosen.
Example 4
A nut company sells cans of mixed nuts containing almonds, cashews and
peanuts. Suppose net weight of each can is exactly 1 lb, but the weight of each
nut in the mix is random. Let X = the weight of almonds in a selected can and Y
= weight of cashews. Consider the joint PDF for XY as follows:
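\[ f(x, y) = \begin{cases} 24xy & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\ 0 & \text{otherwise} \end{cases} \]
\[ h(X, Y) = 1(X) + 1.5(Y) + 0.5(1 - X - Y) = 0.5 + 0.5X + Y \]
\[ E(h(X, Y)) = \iint h(x, y)\,f(x, y)\,dx\,dy \]
\[ = \int_0^1 \int_0^{1-x} (0.5 + 0.5x + y)\,24xy\,dy\,dx \]
\[ = \int_0^1 \left[ \int_0^{1-x} \left( 12xy + 12x^2 y + 24xy^2 \right) dy \right] dx \]
\[ = \int_0^1 \left[ 12x\,\frac{y^2}{2} + 12x^2\,\frac{y^2}{2} + 24x\,\frac{y^3}{3} \right]_0^{1-x} dx \]
\[ = \int_0^1 \left( 6x(1 - x)^2 + 6x^2(1 - x)^2 + 8x(1 - x)^3 \right) dx \]
\[ = \int_0^1 \left[ 6x + 6x^3 - 12x^2 + 6x^2 + 6x^4 - 12x^3 + 8x - 8x^4 - 24x^2 + 24x^3 \right] dx \]
\[ = \left[ -\frac{2x^5}{5} + \frac{18x^4}{4} - \frac{30x^3}{3} + \frac{14x^2}{2} \right]_0^1 \]
\[ = -\frac{2}{5} + \frac{18}{4} - 10 + 7 \]
\[ = \frac{-4 + 45 - 30}{10} = \frac{11}{10} = 1.1 \]
So the expected cost of a randomly chosen can of nuts is $1.10.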
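The result can be cross-checked numerically. This is a rough sketch using a midpoint rule over the triangle $x + y \le 1$; the grid size n = 400 and the function names are arbitrary choices made for illustration.

```python
def cost(x, y):
    # h(x, y) = 1*x + 1.5*y + 0.5*(1 - x - y), as in the example.
    return 1.0 * x + 1.5 * y + 0.5 * (1.0 - x - y)

def density(x, y):
    # Joint PDF 24xy; the triangular support is enforced by the integration limits.
    return 24.0 * x * y

n = 400
dx = 1.0 / n
expected_cost = 0.0
for i in range(n):
    x = (i + 0.5) * dx
    dy = (1.0 - x) / n              # y runs from 0 to 1 - x
    for j in range(n):
        y = (j + 0.5) * dy
        expected_cost += cost(x, y) * density(x, y) * dx * dy

print(round(expected_cost, 4))
```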
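\[ E(XY) = E(X) \cdot E(Y) \]
\[ E(XY) = \sum_{\text{all } x} \sum_{\text{all } y} xy\,P_{XY}(x, y) \]
\[ = \sum_{\text{all } x} \sum_{\text{all } y} xy\,P_X(x)\,P_Y(y) \]
\[ = \sum_{\text{all } x} x\,P_X(x) \cdot \sum_{\text{all } y} y\,P_Y(y) \]
\[ = E(X) \cdot E(Y) \]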
then
\[ E(g(X_1, X_2, \ldots, X_n)) = \sum_{\text{all } x_1} \cdots \sum_{\text{all } x_n} g(x_1, x_2, \ldots, x_n)\,P_{X_1 \ldots X_n}(x_1, \ldots, x_n) \]
\[ E(g(X_1, X_2, \ldots, X_n)) = \int \cdots \int g(x_1, \ldots, x_n)\,f_{X_1 \ldots X_n}(x_1, \ldots, x_n)\,dx_1 \cdots dx_n \]
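\[ E(aX + bY) = aE(X) + bE(Y) \]
\[ E(aX + bY) = \sum_{\text{all } x} \sum_{\text{all } y} (ax + by)\,f_{XY}(x, y) \]
\[ = \sum_{\text{all } x} \sum_{\text{all } y} (ax)\,f_{XY}(x, y) + \sum_{\text{all } x} \sum_{\text{all } y} (by)\,f_{XY}(x, y) \]
\[ = a \sum_{\text{all } x} \sum_{\text{all } y} x\,f_{XY}(x, y) + b \sum_{\text{all } x} \sum_{\text{all } y} y\,f_{XY}(x, y) \]
\[ = a \sum_{\text{all } x} x \sum_{\text{all } y} f_{XY}(x, y) + b \sum_{\text{all } y} y \sum_{\text{all } x} f_{XY}(x, y) \]
\[ = a \sum_{\text{all } x} x\,f_X(x) + b \sum_{\text{all } y} y\,f_Y(y) \]
\[ = aE(X) + bE(Y) \]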
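Proof:
\[ E(aX + bY) = \iint (ax + by)\,f_{XY}(x, y)\,dx\,dy \]
\[ = \iint ax\,f_{XY}(x, y)\,dx\,dy + \iint by\,f_{XY}(x, y)\,dx\,dy \]
\[ = a \int x \left[ \int f_{XY}(x, y)\,dy \right] dx + b \int y \left[ \int f_{XY}(x, y)\,dx \right] dy \]
\[ = a \int x\,f_X(x)\,dx + b \int y\,f_Y(y)\,dy \]
\[ = aE(X) + bE(Y) \]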
\[ E(XY) = E(X) \cdot E(Y) \]
The proof of this was given in the second section. Also, if X and Y are
independent random variables, then
\[ E(g(X) \cdot h(Y)) = \sum_{\text{all } y} \sum_{\text{all } x} g(x)\,h(y)\,f_{XY}(x, y) \]
\[ = \sum_{\text{all } y} \sum_{\text{all } x} g(x)\,h(y)\,f_X(x)\,f_Y(y) \]
\[ = \sum_{\text{all } x} g(x)\,f_X(x) \sum_{\text{all } y} h(y)\,f_Y(y) \]
\[ = E(g(X)) \cdot E(h(Y)) \]
Let $X_i$ be the number showing on the ith die for i = 1, 2, ..., 10, with $P_{X_i}(k) = \frac{1}{6}$ for k = 1, 2, 3, 4, 5, 6. The expected value of the number showing on the ith die is
\[ E(X_i) = \sum_{k=1}^{6} k \cdot \frac{1}{6} = \frac{1}{6} \cdot 21 = 3.5 \]
\[ X = X_1 + X_2 + \cdots + X_{10} \]
\[ E(X) = E(X_1) + E(X_2) + \cdots + E(X_{10}) = 10 \times 3.5 = 35. \]
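The same arithmetic, written as a short sketch with exact fractions (the variable names are illustrative):

```python
from fractions import Fraction

# Expected value of one fair die: sum of k * P(k) for k = 1..6.
expected_one_die = sum(k * Fraction(1, 6) for k in range(1, 7))

# Linearity of expectation: E(X1 + ... + X10) = 10 * E(X1).
expected_total = 10 * expected_one_die

print(expected_one_die, expected_total)
```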
Let X and Y denote the seat numbers of the first and second individuals,
respectively. Possible (X,Y) pairs are {(1,2),(1,3), ..., (5,4)}, and the joint pmf of
(X,Y) is
\[ p(x, y) = \begin{cases} \frac{1}{20} & x = 1, \ldots, 5;\ y = 1, \ldots, 5;\ x \ne y \\ 0 & \text{otherwise} \end{cases} \]
The number of seats separating the two individuals is h(X,Y) = |X-Y|-1. The
following table gives h(x,y) for each possible (x,y) pair.
h(x,y)      x = 1    2    3    4    5
y = 1         -      0    1    2    3
y = 2         0      -    0    1    2
y = 3         1      0    -    0    1
y = 4         2      1    0    -    0
y = 5         3      2    1    0    -
\[ E(h(X, Y)) = \sum_{x} \sum_{y} h(x, y) \cdot p(x, y) \]
\[ = \sum_{x} \sum_{y} (|x - y| - 1) \cdot \frac{1}{20} \quad (x \ne y) \]
\[ = 1 \]
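The table and the expected value can be reproduced by enumeration. The sketch assumes every ordered pair of distinct seats is equally likely (probability 1/20 each); that uniform pmf is an assumption made here because the printed pmf is not legible.

```python
from fractions import Fraction
from itertools import permutations

# Assumption: all 20 ordered pairs (x, y) with x != y are equally likely.
pairs = list(permutations(range(1, 6), 2))
p = Fraction(1, len(pairs))

def seats_between(x, y):
    # h(x, y) = |x - y| - 1, the number of seats separating the two individuals.
    return abs(x - y) - 1

expected_gap = sum(seats_between(x, y) * p for x, y in pairs)
print(len(pairs), expected_gap)
```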
Most securities available for investment have uncertain outcomes and are thus
risky. Each investor therefore has to decide in which asset to invest. While
selecting securities, an investor could look at the expected returns on various securities
and choose the securities with higher expected returns.
Suppose there are two securities, A and B. In different states, the returns on
these securities would vary. The likelihood that each of these states prevails (i.e. the
probability of each state) is also given in the following table.
State                          1       2       3       4       5       Total
Returns on Security A, RA      10%     12%     8%      14%     19%
Returns on Security B, RB      20%     25%     33%     27%     22%
Probability, P                 0.10    0.25    0.35    0.20    0.10    1
P * RA                         1       3       2.8     2.8     1.9     11.5
P * RB                         2       6.25    11.55   5.4     2.2     27.4
From the above table, we can read that in state 3, which is the most likely state with a
35% chance, returns on securities A and B would be 8% and 33% respectively.
Expected return on security A, E(RA) = sum of P * RA = 11.5%
Expected return on security B, E(RB) = sum of P * RB = 27.4%
Creating Portfolio
If the investor decides to invest the money not in just one security but in some
mix of the two, then he is creating a portfolio.
Suppose the investor has chosen to invest 75% of his money in security B and 25%
in security A; then the expected return on this portfolio
= (0.25 x 11.5) + (0.75 x 27.4)
= 2.875 + 20.55
= 23.425%
Though the expected return on the portfolio is less than the expected return on security B,
the risk is reduced in the case of the portfolio, as he is no longer putting all his eggs in one
basket. (This is dealt with in the next chapter.)
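The expected returns and the portfolio figure can be recomputed from the table. This is a minimal sketch; the list and function names are illustrative, and the probabilities are read as 0.10, 0.25, 0.35, 0.20, 0.10.

```python
# State probabilities and percentage returns, read from the table above.
prob = [0.10, 0.25, 0.35, 0.20, 0.10]
ret_a = [10, 12, 8, 14, 19]
ret_b = [20, 25, 33, 27, 22]

def expected_return(probabilities, returns):
    # E(R) = sum over states of P(state) * R(state).
    return sum(p * r for p, r in zip(probabilities, returns))

e_a = expected_return(prob, ret_a)
e_b = expected_return(prob, ret_b)

# Portfolio with 25% of the money in A and 75% in B.
e_portfolio = 0.25 * e_a + 0.75 * e_b
print(round(e_a, 3), round(e_b, 3), round(e_portfolio, 3))
```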
Exercises:
Q. 1 Suppose that the daily closing price of stock goes up an eighth of a point
with probability p and down with a probability q, where p>q. After n days how
much gain can we expect the stock to have achieved? Assume that the daily price
fluctuations are independent events.
References:
DC-1
Semester-II
1
Mathematical Methods for Economics: Vectors and Vector Operations
2.2 References
In the present chapter you will learn about the following aspects;
1.1 Introduction
There are many systems in mathematics that are employed to handle problems in
geometry, mechanics and other branches of applied mathematics. Vectors, matrices and
determinants are important parts of these systems and belong to linear algebra.
Linear algebra is the branch of mathematics concerned with the study of vectors; vector
spaces and the linear maps between them are its main structures. Most
economic problems are multidimensional, and economists use mathematical models
to express them as systems of equations. If the equations of the system are linear, the
relevant area of mathematics is linear algebra.
1.2 Linear Equation Systems
In general, a system of equations is linear if it has the form:
a11x1 + a12x2 + a13x3 + …… + a1nxn = b1
a21x1 + a22x2 + a23x3 + …… + a2nxn = b2
a31x1 + a32x2 + a33x3 + …… + a3nxn = b3
--------------------------------------------------------------------
Suppose an economy has three sectors, i.e. agriculture, industry and service. The total
output of a particular sector is consumed by all these sectors as input and final demand of the
sectors.
Input-output process can be explained by the given table;
By the above table, total output of agriculture, industry and service sectors can be
written as;
X1 = X11 + X12 + X13 + F1
X2 = X21 + X22 + X23 + F2
X3 = X31 + X32 + X33 + F3
and L = L1 + L 2 + L 3
In general, we can write as;
\[ X_i = \sum_{j=1}^{n} X_{ij} + F_i \quad \text{and} \quad L = \sum_{i=1}^{n} L_i \]
The above identity states that all the output of particular sector could be utilized either as
an input in one of the producing sectors of the economy or as a final demand.
\[ a_{ij} = \frac{X_{ij}}{X_j} \]
or $X_{ij} = a_{ij} X_j$
Or, $X = AX + F$
$(I - A)X = F$ or $X = (I - A)^{-1} F$
……………………………………………….
…………………………………………………
This is called the Leontief input-output system. The numbers a11, a12, a13 … ann are
called technical (input) coefficients and b1, b2, b3 … bn are the final demands.
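Example: If the technical coefficient matrix is given by $A = \begin{pmatrix} 0.5 & 0.2 \\ 0.1 & 0.4 \end{pmatrix}$ and the final demands of the
goods are 50 and 100, write down the Leontief model.
\[ \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = \begin{pmatrix} 0.5 & 0.2 \\ 0.1 & 0.4 \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} + \begin{pmatrix} 50 \\ 100 \end{pmatrix} \]
Or,
\[ \begin{pmatrix} 0.5 & -0.2 \\ -0.1 & 0.6 \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = \begin{pmatrix} 50 \\ 100 \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = \begin{pmatrix} 0.5 & -0.2 \\ -0.1 & 0.6 \end{pmatrix}^{-1} \begin{pmatrix} 50 \\ 100 \end{pmatrix} \]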
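To see the model produce numbers, the 2×2 system (I − A)X = F can be solved directly. The sketch below uses Cramer's rule; the helper name `solve_2x2` is hypothetical, and the output levels it prints are computed here rather than quoted from the text.

```python
def solve_2x2(m, rhs):
    """Solve m * x = rhs for a 2x2 matrix m using Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return ((rhs[0] * d - b * rhs[1]) / det, (a * rhs[1] - rhs[0] * c) / det)

A = [[0.5, 0.2], [0.1, 0.4]]       # technical coefficients from the example
F = [50.0, 100.0]                  # final demands from the example

# Build I - A and solve (I - A) X = F for the gross outputs X1, X2.
I_minus_A = [[1 - A[0][0], -A[0][1]], [-A[1][0], 1 - A[1][1]]]
x1, x2 = solve_2x2(I_minus_A, F)
print(round(x1, 2), round(x2, 2))
```

For larger systems a general linear solver would be used instead; Cramer's rule is chosen here only because the example is 2×2.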
1.4 Vectors
‘Vectors facilitate analytic study of such physical objects as have direction in addition to
magnitude’. A vector space is a set whose elements can be added together and multiplied by
scalars or numbers.
Let F be a field and a1, a2 ……… an be numbers belonging to F. Then the
ordered set of these numbers is called a vector of order n.
V = (a1, a2, a3 … an)
Where, a1, a2 ……… an are called the components of the vector V and these numbers are
1.4.1 Scalars
Quantities that have only magnitude and no direction are called scalars. For example –
time, population, temperature, power etc.
Zero Vector: A vector whose initial and terminal points coincide is called a zero
vector. The length of a zero vector is zero.
Equal Vectors: Two vectors are said to be equal if they have the same length and
direction.
Unit Vector: A vector 'a' is called a unit vector if its magnitude is one. It is denoted by â.
Column Vector: It is represented by a column, e.g. $a = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}$
Free Vector: A vector whose direction is known but the initial point and the line of
application are not known is called a free vector.
U + 0 = U …(iii)
U + (-U) = 0 …(iv)
(α + β)U = αU + βU …(v)
α(U + V) = αU + αV …(vi)
Solution:
Solution: Given;
Then, 2x – 5 = 3 or x = 4
2y + 10 = 1 or y = -9/2
2z + 15 = 3 or z = -6
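Example: Prove that the vector equation $u \begin{pmatrix} 2 \\ -3 \end{pmatrix} + v \begin{pmatrix} 4 \\ 6 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ represents two equations in two
unknowns u and v, and find the solution.
Solution: Given; $u \begin{pmatrix} 2 \\ -3 \end{pmatrix} + v \begin{pmatrix} 4 \\ 6 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$
i.e. 2u + 4v = 1 …………………..…(1)
-3u + 6v = 0 ………………….…(2)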
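Example: (i) When is the vector b said to be a linear combination of the vectors x, y and z?
(ii) Consider the vectors a = (1, 2, 3) and b = (2, 3, 1). Find k such that
w = (1, k, 4) is a linear combination of 'a' and 'b'.
(ii) Let $\alpha \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \beta \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ k \\ 4 \end{pmatrix}$
Then; α + 2β = 1, 2α + 3β = k and 3α + β = 4. Solving the first and third equations gives α = 7/5 and β = -1/5, so k = 11/5.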
The triangle law of vectors: The figure shows the triangle law of vectors. It is represented
by $\vec{AB} + \vec{BC} = \vec{AC}$ or a + b = (a + b)
Parallelogram law of vectors: The given figure shows the parallelogram law of vectors. It is
represented by $\vec{OA} + \vec{OC} = \vec{OB}$.
Some other geometric interpretation of vectors operations in 2-space and 3-space is given below;
Example: Given u = (5, -1) and v = (-2, 4), compute u + v with the help of geometric vectors
starting at the origin.
Solution:
The scalar product of any two n-vectors u = (u1, u2 … un) and v = (v1, v2 … vn) is defined
as;
\[ u \cdot v = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n = \sum_{i=1}^{n} u_i v_i \]
If the commodity vector is a = (a1, a2 … an) and the price vector is P = (P1, P2 …
Pn), then the scalar product of P and a is called the total value of the entire commodity vector. It is
defined as; p1a1 + p2a2 + … + pnan = p.a
= -2 + 6 + 15 = 19
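If a = (a1, a2, a3 … an) is an n-vector, then the length (norm) of the vector 'a' is given by:
\[ \|a\| = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2} \]
If u = (u1, u2 … un) and v = (v1, v2 … vn) are two n-vectors, then the (Euclidean) distance
between the vectors is given by;
\[ d = \|u - v\| = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \cdots + (u_n - v_n)^2} \]
For points P = (a1, b1) and Q = (a2, b2) in the plane, the horizontal and vertical legs have lengths |a2 - a1| and |b2 - b1|. By the Pythagorean Theorem,
l2 = m2 + n2
Or $PQ = l = \sqrt{(a_1 - a_2)^2 + (b_1 - b_2)^2}$
In particular, if we take y to be zero, then the distance from the point x = (x 1, x2, … xn) to
\[ |u \cdot v| \le \|u\| \cdot \|v\| \]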
Example: If u = (1, 2, -3) and v = (-3, 2, 5) be the two vectors then find lengths of vectors,
distance between vectors and check the Cauchy-Schwarz inequality (CSI).
Solution:
Lengths: $\|u\| = \sqrt{1 + 4 + 9} = \sqrt{14}$, $\|v\| = \sqrt{9 + 4 + 25} = \sqrt{38}$
Distance: $d = \|u - v\| = \sqrt{16 + 0 + 64} = \sqrt{80}$
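The same quantities, together with the Cauchy-Schwarz check asked for in the example, can be computed in a few lines. The helper names `dot`, `norm` and `distance` are illustrative.

```python
import math

u = (1, 2, -3)
v = (-3, 2, 5)

def dot(a, b):
    # Scalar product: sum of component-wise products.
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    # Length of a vector: square root of its scalar product with itself.
    return math.sqrt(dot(a, a))

def distance(a, b):
    # Euclidean distance: norm of the difference vector.
    return norm(tuple(x - y for x, y in zip(a, b)))

print(norm(u) ** 2, norm(v) ** 2, distance(u, v) ** 2)   # about 14, 38 and 80
# Cauchy-Schwarz inequality: |u.v| <= ||u|| * ||v||
print(abs(dot(u, v)), norm(u) * norm(v), abs(dot(u, v)) <= norm(u) * norm(v))
```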
Orthogonality: If the angle between two vectors is 90°, the vectors are said to be orthogonal.
This is denoted by a ⊥ b. So, we can say that two vectors in R2 or R3 are orthogonal if and only if
their scalar product is zero.
a ⊥ b ⟺ a.b = 0
Their dot product is zero, a.b = 0. Both vectors are unit vectors, ‖a‖ = ‖b‖ = 1,
i.e. u.u = 1 and v.v = 1.
r 2 V12 V22 ... Vn2
= r . V proved.
Theorem II: Suppose u and v are two vectors in Rn and Q is the angle between them. Prove
that $u \cdot v = \|u\|\,\|v\| \cos Q$
Proof: Let u = OP and v = OQ be the two vectors and t.v = OR, where v is the vector and t is a
scalar multiple.
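\[ \cos Q = \frac{\|tv\|}{\|u\|} = \frac{t\,\|v\|}{\|u\|} \quad …(1) \]
By the Pythagorean theorem,
\[ \|u\|^2 = \|tv\|^2 + \|u - tv\|^2 \]
\[ \|u\|^2 = t^2 \|v\|^2 + \|u - tv\|^2 \]
\[ \|u\|^2 = t^2 \|v\|^2 + u \cdot u - 2t\,u \cdot v + t^2\,v \cdot v \]
\[ \|u\|^2 = t^2 \|v\|^2 + \|u\|^2 + t^2 \|v\|^2 - 2t\,u \cdot v \]
\[ t = \frac{u \cdot v}{\|v\|^2} \quad …(2) \]
Substituting (2) in (1),
\[ \cos Q = \frac{u \cdot v}{\|u\|\,\|v\|} \quad \text{Proved.} \]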
Theorem III: If u and v are two vectors in Rn, prove that $\|u + v\| \le \|u\| + \|v\|$
Proof: We know that $\cos Q = \dfrac{u \cdot v}{\|u\|\,\|v\|} \le 1$
Now
\[ \|u + v\|^2 = \|u\|^2 + 2\,u \cdot v + \|v\|^2 \]
Or,
\[ \|u\|^2 + 2\,u \cdot v + \|v\|^2 \le \|u\|^2 + 2\,\|u\|\,\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2 \]
\[ \|u + v\|^2 \le (\|u\| + \|v\|)^2 \]
1 3
Example: Given, a 15 , b 5 are two-column vector.
2 1
Solution: (i)
a 1 225 4 230
b 9 25 1 35
(ii)
1 3 1 3k
c 15 k 5 15 5k
2 1 2 k
1 3k 3 0
15 5k .5 0
2 k 1 0
3 9k 75 25k 2 k 0
k2
The vectors a, b and c are called linearly dependent vectors if scalars x, y and z, not all zero, exist
such that:
xa + yb + zc = 0
The vectors a, b and c are called linearly independent vectors if
xa + yb + zc = 0 holds only when x = y = z = 0.
Note:
Example: If 5a + 3b = 2c and a.b = c then show that a and c have the same directions and a
and b have opposite direction. Are the vectors a, b and c linearly independent?
The vectors a, b and c are not linearly independent, since there exists a non-trivial linear combination
of the vectors a, b and c that equals zero, namely 5a + 3b - 2c = 0.
Example: Let $a = -\hat{i} + 3\hat{j}$ and $b = 2\hat{i} + 5\hat{j}$; find a unit vector parallel to the vector a + b.
Now $|a + b| = \sqrt{1 + 64} = \sqrt{65}$
The required unit vector is $\dfrac{1}{\sqrt{65}}\hat{i} + \dfrac{8}{\sqrt{65}}\hat{j}$, using $\hat{a} = \dfrac{a}{|a|}$
A line in Rn
The line L through the vectors a = (a1, a2 … an) and b = (b1, b2 … bn) is the set of all x =
(x1, x2 … xn) satisfying;
x1 = (1 – t) a1 + tb1
x2 = (1 – t) a2 + tb2
……………………
xn = (1 – t) an + tbn
Now, let p = (p1, p2 … pn) be a point in Rn; then the straight line L passing through (p1, p2 …
pn) in the direction of the vector a = (a1, a2 … an) is given by;
x = p + t.a
A hyperplane through the point a = (a1, a2 … an) that is orthogonal to the vector p = (p1, p2
… pn) ≠ 0 is the set of all points x = (x1, x2 … xn) satisfying,
p.(x – a) = 0.
Example: Find the equation for the plane in R3 through the point v = (2, 1, -1) with
P = (-1, 1, 3) as a normal.
Here p.(x – v) = 0 gives -(x1 – 2) + (x2 – 1) + 3(x3 + 1) = 0
Or -x1 + x2 + 3x3 = - 4.
Example: Given a = (1, 2, 1) and b = (-3, 0, -2), find real numbers x1 and x2 such that x1 a +
x2 b = (5, 4, 4)
Now, x1 – 3x2 = 5
2x1 = 4
and x1 – 2x2 = 4
Solving, x1 = 2 and x2 = -1
Example: Find the equation of the line in R3 passing through the points (2, 4, -1) and (5, 0,
7). Where does the line intersect the xy-plane? Use this equation to describe exactly the line
segment joining the two given points.
Then, x1 = (1 – t) . 2 + t.5 = 2 + 3t
x2 = (1 – t).4 + t.0 = 4 – 4t
x3 = (1 – t).(-1) + t.7 = -1 + 8t
The line meets the xy-plane where x3 = 0, i.e. at t = 1/8, which gives
x1 = 19/8, x2 = 7/2, x3 = 0
The line segment joining the two given points corresponds to 0 ≤ t ≤ 1.
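The parametric form x = (1 − t)a + tb lends itself to a direct check. This sketch uses exact fractions; the function name `point_on_line` is illustrative.

```python
from fractions import Fraction

a = (2, 4, -1)
b = (5, 0, 7)

def point_on_line(t):
    # x = (1 - t) * a + t * b, evaluated component by component.
    return tuple((1 - t) * ai + t * bi for ai, bi in zip(a, b))

# The xy-plane is x3 = 0: solve (1 - t) * a3 + t * b3 = 0 for t.
t_plane = Fraction(a[2], a[2] - b[2])
print(t_plane, point_on_line(t_plane))
print(point_on_line(0), point_on_line(1))   # endpoints of the segment 0 <= t <= 1
```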
Solution: We know that three vectors, u, v and w are coplanar if u.x + v.y + wz = 0, where x
+ y + z = 0 and x, y, z are not all zeros.
So, 2x + y = 3 …(i)
x + 3y = 4 …(ii)
x – 5y = -4 …(iii)
Problem: Prove that the two vectors a and b are equal if and only if their components along
the x and y-axes are equal.
Solution: Let $a = a_1\hat{i} + a_2\hat{j}$ and $b = b_1\hat{i} + b_2\hat{j}$ be two vectors where a1, a2 and b1, b2 are the
Necessary Condition:
\[ (a_1 - b_1)\,\hat{i} = (b_2 - a_2)\,\hat{j} \]
It shows that either $(a_1 - b_1)\,\hat{i}$ and $(b_2 - a_2)\,\hat{j}$ are parallel or each is a zero vector. But they
are not parallel.
\[ (a_1 - b_1)\,\hat{i} = (b_2 - a_2)\,\hat{j} = 0 \]
a1 – b1 = 0 and b2 – a2 = 0
Or a1 = b1 and b2 = a2
Sufficient Condition:
a1 = b1 and a2 = b2
a1 – b1 = 0 and b2 – a2 = 0
\[ (a_1 - b_1)\,\hat{i} = 0 = (b_2 - a_2)\,\hat{j} \]
a = b
Problem Set
1. A consumer buys n commodities at prices p1, p2 … pn in quantities Q1, Q2 …. Qn.
Express the total cost of the purchase in vector notation.
2. The input coefficient matrix and final demand of three sector economy is given below:
1 0 4
u 2 , v 1 , w 5
3 4 0
5 12
6. Show that the vectors: x , y t are orthogonal and find length of vector.
4 15
7. Find the vector of unit length that is normal to the plane 3x + y – z = 10.
8. Prove that: $\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2$
9. Can the vectors (1, 0, 0), (0, 1, 0) and (0, 0, 1) span the R3 space?
1 0 1
10. Find the pattern of dependence of 1 1 0 can they span R3?
0 1 1
11. To find the point-normal equation of plane P which contains the points, p = (2, 1, 1), q = (1,
0, -3), r = (0, 1, 7).
5 0
12. Given u and v , find 24 + 3v graphically.
1 3
14. If the sum of two unit vectors is a unit vector, show that the magnitude of their difference is
$\sqrt{3}$.
1. P Q
1 1 P2Q2 ... Pn Qn
1
x1 0.7 0.4 0.2 100
X I A .F x 2 0.2 0.5 40
1
2. 1
x 3 0.1 0.3 0.9 30
3. u = 3, v = -3, w = -4.
4. x = 3u + 4v
5. No
6. $\|x\| = \sqrt{41}$, $\|y\| = \sqrt{369}$.
9. Yes
11. 3x – 7y + z = 0.
REFERENCES
Allen, R.G,D, Mathematical Analysis for Economists, London: Macmillan and Co. Ltd
Knut Sydsaeter and Peter J. Hammond, Mathematics for Economic Analysis, Prentice Hall
Carl P. Simon and Lawrence Blume, Mathematics for Economists, London: W .W. Norton & Co.
Mathematical Methods for Economics: Matrices and Matrix Operations
DC-1
Semester-II
1.8 References
In the present chapter you will learn about the following aspects;
1.1 Introduction
The subject of matrices had its origin in various types of problems. Of these, the solution of
a given system of equations and linear transformations in geometry are extremely interesting. In
1857, the British mathematician Arthur Cayley formulated the general theory of matrices. He
developed the properties of matrices as a pure algebraic structure, though matrices as arrays of
coefficients in homogeneous linear equations were recognized long before. A matrix is a very
useful tool for analysing various problems in different subjects.
1.1.1 Matrices
A matrix is an ordered set of numbers arranged in rectangular form; i.e.
\[ \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \]
OR
pattern of numbers; a determinant, on the other hand, gives us a single number. A matrix is
written $(a_{ij})$, where i denotes the row and j the column; $a_{ij}$ is an element of the matrix.
For examples
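\[ \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \ 2 \times 2 \text{ matrix}, \quad \begin{pmatrix} a_{11} \\ a_{21} \\ a_{31} \\ a_{41} \end{pmatrix} \ 4 \times 1 \text{ matrix}, \quad \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \ 3 \times 3 \text{ matrix} \]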
Square Matrix: If a matrix has ‘n’ rows and ‘n’ columns then we say it is a square
matrix. For example;
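Diagonal Matrix: It is a square matrix in which all non-diagonal elements are zero, such that
\[ A = \begin{pmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{pmatrix}_{3 \times 3} \]
\[ A = \begin{pmatrix} a_{11} \\ a_{21} \\ a_{31} \end{pmatrix}_{3 \times 1} \]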
Zero Matrix: If all the elements of a matrix are zero, it is called a zero matrix.
\[ A = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \]
Opposite Matrix: If all the elements of a matrix are multiplied by -1, we get the
opposite matrix
-A = (-aij)
Transpose Matrix: If we convert the rows of a matrix into columns and the columns into rows, we
get the transpose of the matrix. For example;
Then;
AT = A' = (a'ij), where a'ij = aji
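Example: Given;
\[ A = \begin{pmatrix} 1 & 0 & 2 \\ 2 & 3 & 1 \end{pmatrix}, \text{ find } A^T \]
Solution:
\[ A^T = A' = \begin{pmatrix} 1 & 2 \\ 0 & 3 \\ 2 & 1 \end{pmatrix}_{3 \times 2} \]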
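Transposition is easy to sketch with nested lists. The entries below are taken as they appear in the extracted example (any signs lost in extraction are not restored), and the function name `transpose` is illustrative.

```python
def transpose(matrix):
    # Rows become columns: zip(*matrix) groups the j-th entries of every row.
    return [list(column) for column in zip(*matrix)]

A = [[1, 0, 2],
     [2, 3, 1]]          # 2 x 3, entries as printed in the example

A_T = transpose(A)       # 3 x 2
print(A_T)
print(transpose(A_T) == A)   # transposing twice returns the original matrix
```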
Rules of Transposition
(A')' = A ..................(i)
(A + B)' = A' + B' ..................(ii)
(AB)' = B'A' ........................(iv)
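Identity Matrix: An identity matrix (I) is a diagonal matrix with all diagonal elements equal to 1.
\[ I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}_{3 \times 3}, \quad I_n = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}_{n \times n} \]
Or A In = InA = A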
Given; aij = 2i - j
Then Matrix;
\[ A = \begin{pmatrix} 1 & 0 & -1 \\ 3 & 2 & 1 \\ 5 & 4 & 3 \end{pmatrix}_{3 \times 3} \]
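A matrix defined by a rule for its elements can be generated directly from the rule. The sketch assumes the rule is $a_{ij} = 2i - j$ with rows and columns numbered from 1, which matches the entries shown.

```python
# Assumed rule: a_ij = 2*i - j, with i (row) and j (column) running from 1 to 3.
size = 3
A = [[2 * i - j for j in range(1, size + 1)] for i in range(1, size + 1)]

for row in A:
    print(row)
```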
Equal Matrices: Suppose A = (aij) and B = (bij) are two m × n matrices.
Then A and B are said to be equal matrices, A = B, if aij = bij for all i and j.
Thus, two matrices are equal only if they have the same dimension and the same corresponding elements. Otherwise they are called
unequal matrices, such that A ≠ B.
Example: Given;
3 y 1 z 2 2
2 x 4 3
7
Mathematical Methods for Economics: Matrices and Matrix Operations
3 y 1 z 2 2
2
x 4 3
Z 2 3 or Z 5, y 1 2, y 3 and x 3
Let A = (aij)m x n and B = (bij)m x n be two matrices; then the sum of the matrices A and B is
defined as;
A + B = (aij)m x n + (bij)m x n
A + B = (aij + bij)m x n
Multiplication of a matrix by a scalar α is defined as;
αA = α(aij)m x n = (αaij)m x n
Example : Given,
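\[ A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 2 & -3 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 2 & 1 \end{pmatrix} \]
Compute A + B, and $2A + \frac{1}{2}B$
Solution:
\[ A + B = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 2 & -3 \end{pmatrix} + \begin{pmatrix} 1 & 0 & 2 \\ 0 & 2 & 1 \end{pmatrix} = \begin{pmatrix} 1 + 1 & 2 + 0 & 3 + 2 \\ 4 + 0 & 2 + 2 & -3 + 1 \end{pmatrix} = \begin{pmatrix} 2 & 2 & 5 \\ 4 & 4 & -2 \end{pmatrix} \]
And
\[ 2A + \tfrac{1}{2}B = 2 \begin{pmatrix} 1 & 2 & 3 \\ 4 & 2 & -3 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 1 & 0 & 2 \\ 0 & 2 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 4 & 6 \\ 8 & 4 & -6 \end{pmatrix} + \begin{pmatrix} \tfrac{1}{2} & 0 & 1 \\ 0 & 1 & \tfrac{1}{2} \end{pmatrix} \]
\[ = \begin{pmatrix} 2 + \tfrac{1}{2} & 4 + 0 & 6 + 1 \\ 8 + 0 & 4 + 1 & -6 + \tfrac{1}{2} \end{pmatrix} = \begin{pmatrix} 2\tfrac{1}{2} & 4 & 7 \\ 8 & 5 & -5\tfrac{1}{2} \end{pmatrix} \]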
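Element-wise addition and scalar multiplication can be sketched as follows. The helper names `add` and `scale` are illustrative, and the entry -3 in A follows the sign shown in the extracted example.

```python
from fractions import Fraction

def add(a, b):
    # Matrices of the same size are added element by element.
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def scale(alpha, a):
    # Every element is multiplied by the scalar alpha.
    return [[alpha * x for x in row] for row in a]

A = [[1, 2, 3], [4, 2, -3]]
B = [[1, 0, 2], [0, 2, 1]]

print(add(A, B))
combo = add(scale(2, A), scale(Fraction(1, 2), B))   # 2A + (1/2)B
print(combo)
```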
o (A + B) + C = A + (B + C)
o A + B = B + A
o A + 0 = A
o A + (-A) = 0
o (α + β)A = αA + βA
o α(A + B) = αA + αB
1.3 Matrix Multiplication:
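Example: Given;
\[ A = \begin{pmatrix} 2 & 3 \\ 2 & 0 \\ 1 & 2 \end{pmatrix} \text{ and } B = \begin{pmatrix} 4 \\ 2 \end{pmatrix} \]
Compute AB and BA
Solution:
\[ AB = \begin{pmatrix} 2 \times 4 + 3 \times 2 \\ 2 \times 4 + 0 \times 2 \\ 1 \times 4 + 2 \times 2 \end{pmatrix} = \begin{pmatrix} 14 \\ 8 \\ 8 \end{pmatrix} \]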
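A row-by-column product can be sketched as below. The function name `matmul` is illustrative; the dimension check is included because BA is not defined here (B is 2×1 and A is 3×2), which is an observation added in this sketch rather than a statement from the text.

```python
def matmul(a, b):
    """Row-by-column product of two matrices given as nested lists."""
    if len(a[0]) != len(b):
        raise ValueError("columns of the first matrix must equal rows of the second")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[2, 3], [2, 0], [1, 2]]   # 3 x 2
B = [[4], [2]]                 # 2 x 1

AB = matmul(A, B)
print(AB)

try:
    matmul(B, A)
    ba_defined = True
except ValueError:
    ba_defined = False         # B has 1 column but A has 3 rows
print(ba_defined)
```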
Problem : Given that A = (aij)m×n and B = (bij)n×p, compute the product of the two matrices,
i.e. C = AB
\[ C = AB = (c_{ij})_{m \times p}, \quad c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} \]
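In general, AB ≠ BA.
For example, let $A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$
\[ AB = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 \times 0 + 0 \times 1 & 1 \times 1 + 0 \times 0 \\ 0 \times 0 + (-1) \times 1 & 0 \times 1 + (-1) \times 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \]
And
\[ BA = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 0 \times 1 + 1 \times 0 & 0 \times 0 + 1 \times (-1) \\ 1 \times 1 + 0 \times 0 & 1 \times 0 + 0 \times (-1) \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \]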
A(B + C) = AB + AC
AI = A = IA {Also, I I = I}
If the product of two matrices is a zero matrix, it is possible that neither of them is a
zero matrix, i.e.
AB = 0 does not imply that A = 0 or B = 0
Let
1 1 2 2
A and B
1 1 2 2
Then
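\[ A = \begin{pmatrix} 3 & 4 \\ 2 & 7 \end{pmatrix} \text{ and } B = \begin{pmatrix} 2 & 6 \\ 1 & 5 \end{pmatrix} \]
\[ A + B = \begin{pmatrix} 3 + 2 & 4 + 6 \\ 2 + 1 & 7 + 5 \end{pmatrix} = \begin{pmatrix} 5 & 10 \\ 3 & 12 \end{pmatrix} \]
\[ \text{and } AB = \begin{pmatrix} 3 & 4 \\ 2 & 7 \end{pmatrix} \begin{pmatrix} 2 & 6 \\ 1 & 5 \end{pmatrix} = \begin{pmatrix} 3 \times 2 + 4 \times 1 & 3 \times 6 + 4 \times 5 \\ 2 \times 2 + 7 \times 1 & 2 \times 6 + 7 \times 5 \end{pmatrix} = \begin{pmatrix} 10 & 38 \\ 11 & 47 \end{pmatrix} \]
Now
\[ A^2 = \begin{pmatrix} 3 & 4 \\ 2 & 7 \end{pmatrix} \begin{pmatrix} 3 & 4 \\ 2 & 7 \end{pmatrix} = \begin{pmatrix} 9 + 8 & 12 + 28 \\ 6 + 14 & 8 + 49 \end{pmatrix} = \begin{pmatrix} 17 & 40 \\ 20 & 57 \end{pmatrix} \]
\[ B^2 = \begin{pmatrix} 2 & 6 \\ 1 & 5 \end{pmatrix} \begin{pmatrix} 2 & 6 \\ 1 & 5 \end{pmatrix} = \begin{pmatrix} 4 + 6 & 12 + 30 \\ 2 + 5 & 6 + 25 \end{pmatrix} = \begin{pmatrix} 10 & 42 \\ 7 & 31 \end{pmatrix} \]
\[ A^2 + B^2 + 2AB = \begin{pmatrix} 17 + 10 + 20 & 40 + 42 + 76 \\ 20 + 7 + 22 & 57 + 31 + 94 \end{pmatrix} = \begin{pmatrix} 47 & 158 \\ 49 & 182 \end{pmatrix} \quad …(i) \]
Also,
\[ (A + B)^2 = \begin{pmatrix} 55 & 170 \\ 51 & 174 \end{pmatrix} \quad …(ii) \]
\[ (A + B)^2 \ne A^2 + B^2 + 2AB \]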
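The comparison can be reproduced numerically. The sketch below also checks the expansion A² + AB + BA + B², which is not part of the worked example but explains why the two sides differ: AB and BA are not equal. Helper names are illustrative.

```python
def matmul(a, b):
    # Row-by-column product for compatible nested-list matrices.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def add(*matrices):
    # Element-wise sum of any number of same-sized matrices.
    return [[sum(values) for values in zip(*rows)] for rows in zip(*matrices)]

A = [[3, 4], [2, 7]]
B = [[2, 6], [1, 5]]

AB, BA = matmul(A, B), matmul(B, A)
lhs = matmul(add(A, B), add(A, B))                    # (A + B)^2
naive = add(matmul(A, A), matmul(B, B), AB, AB)       # A^2 + B^2 + 2AB
full = add(matmul(A, A), AB, BA, matmul(B, B))        # A^2 + AB + BA + B^2

print(lhs, naive, full)
```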
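Example: Let $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$; then prove that
\[ A^k = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix} \]
Solution: Given
\[ A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \]
\[ A \cdot A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 + 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \]
\[ \text{Then } A^k = A^{k-1} \cdot A = \begin{pmatrix} 1 & k - 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \]
\[ A^k = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix} \quad \text{Proved} \]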
Idempotent Matrix: Let A be a square matrix. A is called an idempotent matrix if the product of A with itself
equals A. It is defined as;
AA = A, so that AAA = A3 = A
In General An = A
Orthogonal Matrix: Let A is the nn square matrix then A is said to be orthogonal matrix if,
a b
Example: Given; A
b a 2 2
Solution: Given,
a b
A
b a 2 2
a b
Than A
b a 2 2
a b a b
AA
b a b a
a 2 b 2 ab ba
2
ba ab b a
2
a 2 b 2 ab ba
2
ba ab a b
2
a 2 b2
0
0 a b
2 2
Given, a2 + b2 = 1 then
1 0
AA I2 Hence Proved
0 1
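It is defined as;
2x + 3y = 4 --------- (1)
6x - y = 2 --------- (2)
\[ A = \begin{pmatrix} 2 & 3 \\ 6 & -1 \end{pmatrix}, \quad X = \begin{pmatrix} x \\ y \end{pmatrix} \text{ and } b = \begin{pmatrix} 4 \\ 2 \end{pmatrix}, \text{ then;} \]
\[ AX = \begin{pmatrix} 2 & 3 \\ 6 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2x + 3y \\ 6x - y \end{pmatrix} \]
AX = b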
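Once a system is written as AX = b, it can be solved mechanically. The sketch below uses Cramer's rule for the 2×2 case with exact fractions; the function name `solve_2x2` is hypothetical, and the solution it prints is computed here rather than quoted from the text.

```python
from fractions import Fraction

def solve_2x2(a, rhs):
    """Solve a * (x, y) = rhs for a 2x2 integer matrix using Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("matrix is singular, no unique solution")
    x = Fraction(rhs[0] * a[1][1] - a[0][1] * rhs[1], det)
    y = Fraction(a[0][0] * rhs[1] - rhs[0] * a[1][0], det)
    return x, y

A = [[2, 3], [6, -1]]
b = [4, 2]
x, y = solve_2x2(A, b)
print(x, y)
```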
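Example: If $A = \begin{pmatrix} 3 & -4 \\ 1 & -1 \end{pmatrix}$, then prove by mathematical induction that
\[ A^n = \begin{pmatrix} 1 + 2n & -4n \\ n & 1 - 2n \end{pmatrix} \]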
3 4
Solution: Let A
1 1
1 2 2 4 2
1 2 1 2 2
1 2n 4n
A3 A. A
1 2n
proved
n
Example: (i) A matrix P is orthogonal if P'P = 1. Prove that if P is an nn matrix whose columns
are all of length 1 and mutually orthogonal then P is orthogonal .
1 1 1
A 1 3 4
7 5 2
P is orthogonal
(ii) No, A is not orthogonal matrix because columns of A are not of length 1
then;
AA I 3
Example: (i) Let D be the 3×3 diagonal matrix with entries d1, d2 and d3 along the
diagonal and zeros elsewhere. Let A = (aij) be an arbitrary 3×3 matrix. Compute AD and DA.
Show that AD multiplies the ith column of A by the entry di, while DA multiplies the ith row of A by
the entry di.
AD = DA
a11 0 0
(ii) Given; A 0 a22 0
0 0 a33 3 3
2 0 0
D 0 3 0
0 0 4
2 1
Example: For what value of β, D is symmetric?
2 1
Solution: Given,
2 1
A
2 1
A = AT = A1
2 1 2
2
2 1 1 1
2 1 2
or 2 3
AB = BA = In
or AB = I
Problem Set
(1) Given,
1 2 x 1
A , B
1
, find x and y
2 1 y
1 q 1 nq
(2) Let B , the prove B n
0 1 0 1
3 2
If A , find A2 5 A 7 I
1
(3)
5
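(4) If $A = \begin{pmatrix} 1 & 3 & 0 \\ 1 & 1 & 0 \\ 4 & 1 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 5 & 1 \end{pmatrix}$ then prove that $AB \neq BA$
(5) Let $A = \begin{pmatrix} 2 & 3 \\ 4 & 5 \end{pmatrix}$, $B = \begin{pmatrix} 3 & 1 \\ 2 & 5 \end{pmatrix}$, prove that $AB \neq BA$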
(6) Given,
\[ A = \begin{pmatrix} 2 & -2 & -4 \\ -1 & 3 & 4 \\ 1 & -2 & -3 \end{pmatrix} \]
then prove that A is an idempotent matrix, i.e. $A^2 = A$
1
(7) For the following given matrix is orthogonal
2
0
A 0
0 1 0
(8) Show that $A = \begin{pmatrix} 3 & 1 & 2 \\ 1 & 2 & 0 \\ 2 & 0 & 1 \end{pmatrix}$ is a symmetric matrix
(10) If λ is a scalar and A and B are matrices of order 3×4, then show that λ(A + B) = λA + λB
(1) x=1&y=2
11 6
(3) 15 23
2 6 4 2
1 1 1
(9) X , y 1 10 , z 4 6
2 2 2 4 2 3 2
REFERENCES
Allen, R.G.D., Mathematical Analysis for Economists, London: Macmillan and Co. Ltd
Chiang, Alpha C., Fundamental Methods of Mathematical Economics, New York: McGraw Hill
Knut Sydsaeter and Peter J. Hammond, Mathematics for Economic Analysis, Prentice Hall
Carl P. Simon and Lawrence Blume, Mathematics for Economists, London: W .W. Norton & Co.
Determinants and Matrix Inversion
DC-1
Semester-II
In the present chapter you will learn about the following aspects;
1.1 Introduction
This chapter develops the concept of the determinant and its application to matrix
inversion. Determinants are a part of linear algebra that eases the difficulty of
simultaneous equations by providing a means for their presentation and solution.
Every square matrix A of order n×n is associated with a unique number called the
determinant of the matrix. If A = (aij) is an n×n matrix, then the determinant of A is
denoted by |A| or det(A).
If A = (a11) is a 1×1 matrix, then |A| = a11, i.e., the determinant is equal to the element
itself.
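If $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ is a 2×2 matrix then,
\[ |A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11} a_{22} - a_{12} a_{21} \]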
Thus we may represent the determinant in terms of rows and columns as:
LEADING TERM: The diagonal elements in the determinant i.e. b11 and c22 are the leading
term and it always has a positive sign.
Note: A determinant of the second order has two diagonal elements having positive signs
and 2! = 2 terms in its expansion out of which one is positive and other is negative.
We can express this equation in a compact way and solve it by using matrices. Let
B= ,X= ,C=
In simple terms, it can be written as BX = C and thus be solved. This square matrix
we know is non-singular and in this chapter, we use determinants to determine whether a
given square matrix is non-singular/invertible or not. For a matrix to be non-singular, its
determinant value should not be equal to zero.
Example: Compute the value and cofactors of the given below determinant.
|A| =
|A| = 1 -2 +3
= 3 – (-18) + (-30) = -9
We can expand the determinant by any row or column, if we expand it by 1st column,
|A| = 1 -4 +7
Cofactors of determinant;
Sarrus' rule is an alternative way to compute determinants of order 3, and many
people find it convenient. In this method, we write down the determinant twice, side by side,
except that the last column of the second copy is omitted. It is
given below;
\[ \begin{array}{ccc|cc} a_{11} & a_{12} & a_{13} & a_{11} & a_{12} \\ a_{21} & a_{22} & a_{23} & a_{21} & a_{22} \\ a_{31} & a_{32} & a_{33} & a_{31} & a_{32} \end{array} \]
Firstly, multiply along the three lines falling to the right, giving all these products a plus
sign;
\[ a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} \qquad \text{(A)} \]
Secondly, multiply along the three lines rising to the right, giving all these products a minus
sign;
\[ -a_{31}a_{22}a_{13} - a_{32}a_{23}a_{11} - a_{33}a_{21}a_{12} \qquad \text{(B)} \]
The sum of expressions (A) and (B) is exactly equal to the determinant of A, i.e. |A|.
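The rule can be checked mechanically. The following is a minimal Python sketch, not part of the original notes: the function names and the sample matrix are illustrative assumptions. It computes a 3×3 determinant once by Sarrus' rule and once by cofactor expansion along the first row, so the two results can be compared.

```python
def det3_sarrus(m):
    """Determinant of a 3x3 matrix (list of three rows) by Sarrus' rule."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
    # (A): products along the three lines falling to the right (plus sign)
    plus = a11 * a22 * a33 + a12 * a23 * a31 + a13 * a21 * a32
    # (B): products along the three lines rising to the right (minus sign)
    minus = a31 * a22 * a13 + a32 * a23 * a11 + a33 * a21 * a12
    return plus - minus


def det3_first_row(m):
    """The same determinant by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Hypothetical sample matrix, chosen only for illustration
example = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
print(det3_sarrus(example), det3_first_row(example))  # both give 39
```

Sarrus' rule applies only to determinants of order 3; for higher orders the cofactor expansion has to be used.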
If all the elements of any row or column are zero, the value of the determinant is
also zero, i.e. |A| = 0.
If we interchange all the rows of a determinant with its columns and vice versa, the
determinant remains unchanged in value and sign, i.e. the value of a determinant
and that of its transpose are the same: |A| = |A'|.
If we interchange any two rows or two columns of a determinant, its value remains
unchanged numerically but changes in sign.
If all the elements of any one row (or column) of a determinant are multiplied (or
divided) by a constant ‘c’, then the value of the determinant is also multiplied (or
divided) by ‘c’.
The determinant of the product of two n×n matrices A and B is the product of the
determinants of the factors;
\[ |AB| = |A| \cdot |B| \]
If every element of an n×n matrix A is multiplied by a scalar α, the determinant is multiplied by α^n;
\[ |\alpha A| = \alpha^n |A| \]
1.5 Multiplication of Determinants
× =
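Example: Let,
\[ \begin{vmatrix} 4 & 5 \\ 1 & 2 \end{vmatrix} \times \begin{vmatrix} 2 & 1 \\ 3 & 2 \end{vmatrix}
= \begin{vmatrix} 4\cdot 2 + 5\cdot 3 & 4\cdot 1 + 5\cdot 2 \\ 1\cdot 2 + 2\cdot 3 & 1\cdot 1 + 2\cdot 2 \end{vmatrix}
= \begin{vmatrix} 23 & 14 \\ 8 & 5 \end{vmatrix} = 3 \]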
Suppose we have a determinant |A|; its adjoint is represented as |A|’ or Adj A.
It is also known as the adjugate. The elements in |A|’ are the cofactors of the
corresponding elements of |A|, i.e.,
Where B1, C1, D1, …. are the respective cofactors of b1, c1, d1, … of determinant |A|.
Now, B1 = (-1)1+1 = -2
B2 = (-1)2+1 = +1
B3 = (-1)3+1 = +4
Thus, |A’| =
Suppose we have a square matrix A; its inverse is represented as $A^{-1}$. Provided that |A|
≠ 0, the inverse of A is formed by dividing every element of the adjoint of A by
|A|.
indicate the position of its respective element (i.e. b 11 lies in the 1st row and 1st column and
c12 lies in 1st row and 2nd column and so on). The general formula for a determinant is |A|
= where arc are the elements of the determinant. A determinant is said to be
symmetric if arc = acr for all r,c = 1, 2, 3, …., n.
1. If we find the adjoint of a symmetric determinant, we see that its adjoint is also
symmetric.
2. If we square a symmetric determinant, the resultant determinant is also a symmetric
determinant.
1.9 Skew and Skew-Symmetric Determinants
indicate the position of its respective element (i.e. b 11 lies in the 1st row and 1st column
and c12 lies in 1st row and 2nd column and so on). The general formula for a determinant
is |A| = where arc are the elements of the determinant. A determinant is
said to be ‘skew’ if arc = -acr for all r,c = 1, 2, 3, …., n and r≠c.
And if arc = -acr for all r,c = 1, 2, 3, …., n and r ≠ c and arc =0 for all r = c, then the
determinant is known as ‘skew-symmetric’.
Institute of Lifelong Learning, University of Delhi
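For example, if $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ then
\[ A^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}
= \begin{pmatrix} \dfrac{d}{ad-bc} & \dfrac{-b}{ad-bc} \\[2ex] \dfrac{-c}{ad-bc} & \dfrac{a}{ad-bc} \end{pmatrix} \]
Let A = (aij) be an n×n matrix with determinant $\det(A) = |A| \neq 0$. Then it has a unique
inverse $A^{-1}$ such that $A A^{-1} = A^{-1} A = I$, and
\[ A^{-1} = \frac{1}{|A|} \cdot \mathrm{Adj}(A) \]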
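The adjoint formula translates directly into a short program. The sketch below is illustrative only: the helper names `minor`, `det`, `adjugate` and `inverse` are assumptions rather than anything defined in these notes, and it assumes integer entries so that exact fractions can be used. It builds the matrix of cofactors, transposes it to get Adj(A), and divides by |A|.

```python
from fractions import Fraction


def minor(m, r, c):
    """Sub-matrix of m with row r and column c deleted."""
    return [row[:c] + row[c + 1:] for i, row in enumerate(m) if i != r]


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if not m:          # determinant of the empty (0x0) matrix is 1
        return 1
    return sum((-1) ** c * m[0][c] * det(minor(m, 0, c)) for c in range(len(m)))


def adjugate(m):
    """Transpose of the matrix of cofactors."""
    n = len(m)
    cof = [[(-1) ** (r + c) * det(minor(m, r, c)) for c in range(n)] for r in range(n)]
    return [[cof[c][r] for c in range(n)] for r in range(n)]


def inverse(m):
    """A^{-1} = Adj(A) / |A|, valid only when |A| is non-zero."""
    d = det(m)
    if d == 0:
        raise ValueError("matrix is singular: |A| = 0")
    return [[Fraction(x, d) for x in row] for row in adjugate(m)]


print(inverse([[1, 4], [2, 7]]))  # entries -7, 4, 2, -1 as Fractions
```

Cofactor expansion is convenient for small matrices like those in this chapter, but its cost grows very quickly with the order of the matrix.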
Properties of the Inverse: Let A and B be invertible n×n matrices, then;
If A is invertible then $(A^{-1})^{-1} = A$
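[A | I] to [I | A^{-1}]
It can be explained with the help of an example.
Let $A = \begin{pmatrix} 1 & 4 \\ 2 & 7 \end{pmatrix}$, then
\[ [A \mid I] = \left(\begin{array}{cc|cc} 1 & 4 & 1 & 0 \\ 2 & 7 & 0 & 1 \end{array}\right) \]
\[ \xrightarrow{\;-2R_1 + R_2\;} \left(\begin{array}{cc|cc} 1 & 4 & 1 & 0 \\ 0 & -1 & -2 & 1 \end{array}\right) \]
\[ \xrightarrow{\;-R_2\;} \left(\begin{array}{cc|cc} 1 & 4 & 1 & 0 \\ 0 & 1 & 2 & -1 \end{array}\right) \]
\[ \xrightarrow{\;-4R_2 + R_1\;} \left(\begin{array}{cc|cc} 1 & 0 & -7 & 4 \\ 0 & 1 & 2 & -1 \end{array}\right) = [I \mid A^{-1}] \]
\[ A^{-1} = \begin{pmatrix} -7 & 4 \\ 2 & -1 \end{pmatrix} \]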
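The row-reduction procedure above can be sketched in Python as follows. This is a hedged illustration: `gauss_jordan_inverse` is a hypothetical name, and the row swap for a zero pivot is an addition that the 2×2 example does not need. The function augments A with the identity, reduces the left block to I, and reads the inverse off the right block.

```python
from fractions import Fraction


def gauss_jordan_inverse(a):
    """Invert a square matrix by reducing [A | I] to [I | A^{-1}]."""
    n = len(a)
    # Build the augmented matrix [A | I] with exact rational entries
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot becomes 1
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Clear the rest of the column
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


print(gauss_jordan_inverse([[1, 4], [2, 7]]))  # entries -7, 4, 2, -1 as Fractions
```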
Now we can rewrite this equation form in the form of determinant as follows:
× =
Let |B| =
Now,
x . |A| = x. =
= = |B|
Thus, x= .
Similarly we can solve for y and z values.
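For example, let
\[ 3x - 2y = -1, \qquad 5x + 3y = 11 \]
be the simultaneous equations; then
\[ D = \begin{vmatrix} 3 & -2 \\ 5 & 3 \end{vmatrix} = (9 + 10) = 19, \qquad
D_x = \begin{vmatrix} -1 & -2 \\ 11 & 3 \end{vmatrix} = (-3 + 22) = 19, \]
\[ D_y = \begin{vmatrix} 3 & -1 \\ 5 & 11 \end{vmatrix} = (33 + 5) = 38 \]
\[ \text{Now, } x = \frac{D_x}{D} = \frac{19}{19} = 1 \quad \text{and} \quad y = \frac{D_y}{D} = \frac{38}{19} = 2 \]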
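Cramer's rule for two equations in two unknowns can be written as a small function. The sketch below is illustrative (the names `det2` and `cramer_2x2` are assumptions); it is applied to the system 3x − 2y = −1, 5x + 3y = 11, whose solution x = 1, y = 2 can be confirmed by substitution.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 matrix given as a list of two rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def cramer_2x2(a, b):
    """Solve a X = b for a 2x2 integer coefficient matrix a by Cramer's rule."""
    d = det2(a)
    if d == 0:
        raise ValueError("D = 0: Cramer's rule does not apply")
    # Replace the first column by b to get Dx, the second column to get Dy
    dx = det2([[b[0], a[0][1]], [b[1], a[1][1]]])
    dy = det2([[a[0][0], b[0]], [a[1][0], b[1]]])
    return Fraction(dx, d), Fraction(dy, d)


x, y = cramer_2x2([[3, -2], [5, 3]], [-1, 11])
print(x, y)  # 1 2
```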
Let A1 = and A2 =
a= = = = 55/11 = 5
b= = = = 11/11 = 1
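\[ AB(AB)^{-1} = I \]
\[ A^{-1}AB(AB)^{-1} = A^{-1}I = A^{-1}, \quad \text{multiplying by } A^{-1} \]
\[ B^{-1}IB(AB)^{-1} = B^{-1}A^{-1}, \quad \text{using } A^{-1}A = I \text{ and multiplying by } B^{-1} \]
\[ B^{-1}B(AB)^{-1} = B^{-1}A^{-1} \]
\[ (AB)^{-1} = B^{-1}A^{-1} \]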
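\[ Y = C + I_0 + G_0 \quad \text{and} \quad C = a + bY \]
Solution: Given;
\[ Y - C = I_0 + G_0 \]
\[ -bY + C = a \]
\[ D = \begin{vmatrix} 1 & -1 \\ -b & 1 \end{vmatrix} = (1 - b) \]
\[ D_Y = \begin{vmatrix} I_0 + G_0 & -1 \\ a & 1 \end{vmatrix} = (I_0 + G_0) - (-1 \cdot a) = (I_0 + G_0) + a \]
\[ D_C = \begin{vmatrix} 1 & I_0 + G_0 \\ -b & a \end{vmatrix} = a + b(I_0 + G_0) \]
then
\[ Y = \frac{(I_0 + G_0) + a}{(1 - b)}, \qquad C = \frac{a + b(I_0 + G_0)}{(1 - b)} \]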
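A numerical check makes the formulas concrete. In the sketch below the parameter values (a = 50, b = 4/5, I0 = 30, G0 = 20) are hypothetical and chosen only for illustration, as is the function name `income_model`; the code evaluates the Cramer's-rule expressions for Y and C and the result can be verified against the two original equations.

```python
from fractions import Fraction


def income_model(a, b, i0, g0):
    """Equilibrium income Y and consumption C for Y = C + I0 + G0, C = a + bY."""
    d = 1 - b                      # D = 1 - b
    if d == 0:
        raise ValueError("b = 1 makes the coefficient determinant zero")
    d_y = (i0 + g0) + a            # D_Y
    d_c = a + b * (i0 + g0)        # D_C
    return d_y / d, d_c / d


# Hypothetical parameter values, for illustration only
Y, C = income_model(Fraction(50), Fraction(4, 5), Fraction(30), Fraction(20))
print(Y, C)  # 500 450
```

With these values Y = 500 and C = 450, which satisfy Y = C + I0 + G0 = 450 + 50 and C = 50 + 0.8 × 500.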
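Example: Prove that
\[ |A| = \begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix} = (a - b)(b - c)(c - a) \]
Solution: Given;
\[ |A| = \begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix} \]
Subtracting the second column from the first and the third column from the second,
\[ |A| = \begin{vmatrix} 0 & 0 & 1 \\ a - b & b - c & c \\ a^2 - b^2 & b^2 - c^2 & c^2 \end{vmatrix}
= (a - b)(b - c)\begin{vmatrix} 0 & 0 & 1 \\ 1 & 1 & c \\ a + b & b + c & c^2 \end{vmatrix} \]
\[ |A| = (a - b)(b - c)\begin{vmatrix} 1 & 1 \\ a + b & b + c \end{vmatrix} \]
\[ |A| = (a - b)(b - c)(b + c - a - b) = (a - b)(b - c)(c - a) \]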
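\[ A^2 = [I - X(X'X)^{-1}X']^2 \]
\[ = I + X(X'X)^{-1}X'X(X'X)^{-1}X' - 2X(X'X)^{-1}X' \]
\[ = I + X(X'X)^{-1}X' - 2X(X'X)^{-1}X', \quad \text{since } (X'X)^{-1}X'X = I \]
\[ = [I - X(X'X)^{-1}X'] = A \quad \text{Proved} \]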
Example: A monomial square matrix M is one in which there is exactly one non-zero entry
in each row and in each column. Show that any 2×2 monomial matrix is invertible
and describe its inverse.
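Any 2×2 monomial matrix M has the form
\[ \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} 0 & a \\ b & 0 \end{pmatrix} \quad \text{with } a \neq 0 \text{ and } b \neq 0 \]
Its determinant is ab or −ab, which is non-zero, so M is invertible, and
\[ \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}^{-1} = \begin{pmatrix} 1/a & 0 \\ 0 & 1/b \end{pmatrix} \quad \text{and} \quad
\begin{pmatrix} 0 & a \\ b & 0 \end{pmatrix}^{-1} = \begin{pmatrix} 0 & 1/b \\ 1/a & 0 \end{pmatrix} \]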
Example: For what value of µ the following system of equations has non-trivial solutions?
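\[ 5x + 2y + z = \mu x \]
\[ 2x + y = \mu y \]
\[ x + z = \mu z \]
Rearranging,
\[ (5 - \mu)x + 2y + z = 0 \]
\[ 2x + (1 - \mu)y = 0 \]
\[ x + (1 - \mu)z = 0 \]
The above system of equations has a nontrivial solution iff the coefficient matrix is
singular, i.e. the determinant of the coefficient matrix must be zero.
\[ \begin{vmatrix} 5 - \mu & 2 & 1 \\ 2 & 1 - \mu & 0 \\ 1 & 0 & 1 - \mu \end{vmatrix} = 0 \]
\[ \mu(1 - \mu)(\mu - 6) = 0 \]
Hence the system has non-trivial solutions for µ = 0, µ = 1 or µ = 6.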
Problem Set
1. Prove that 2 2
2 ( )( )( )( )
2. Solve the following system using both Cramer’s rule and matrix inverse:
5a 6b 4c 15
2 x 3 y 3
a) b) 7 a 4b 3c 19
4 x y 11 2a b 6c 46
2 x 3 y z 12 0
2 x y 5
d) 3 x 4 y 11z 46 e)
5 y 4 z 5 3x 2 y 3
x + 2y + 3z = 6
f) 2x + 4y + z = 7
3x + 2y + 9z = 14
4. Find the cofactors of the following determinant and prove that |A’| = |A 2|
|A| = .
5 3 5 2
5. For a given matrix A , the transpose is A . A matrix A is called
2 4 3 4
1 2 2
1
orthogonal if AA AA I . Show that the matrix A 2 2 2 is
3
2 2 1
orthogonal.
240 750
1200 1500
6. Given an input coefficient matrix A and the demand matrix
720 450
1200 1500
210
D . Find the output matrix X, such that ( I A) X D
330
2 x 3 y 3
7. Find k so the system has no solution:
kx y 11
8. Show that the following system of equations has no solution;
x 2y z 5
3x y z 2
x 5y z 4
1
9. If A and B are the invertible then prove that BAB A
3 5
10. Find the inverse of the matrix A and verify that A. A1 A1 . A I
7 11
3 5 1 3
11. If A and B , then prove that ( AB ) 1 B 1 A1
2 7 2 4
12. Prove that the homogeneous system of equations
ax + by + cz = 0
bx + cy + az = 0
cx + ay + bz = 0
has a nontrivial solution if and only if $a^3 + b^3 + c^3 - 3abc = 0$
REFERENCES
Allen, R.G.D., Mathematical Analysis for Economists, London: Macmillan and Co. Ltd
Knut Sydsaeter and Peter J. Hammond, Mathematics for Economic Analysis, Prentice
Hall
Michael Hoy, John Livernois, Chris McKenna, Ray Rees, Thanasis Stengos,
Mathematics for Economics, Addison-Wesley Publishers Ltd.
Carl P. Simon and Lawrence Blume, Mathematics for Economists, London: W. W.
Norton & Co.
DC-1
Semester-II
- S. Gudder
1: Learning Outcomes
2: Important terminology
2.1: Minors
2.2: Co-factors
3: Linear Dependency
4: Rank of A Matrix using Determinants
5: Summary
6: Exercise
7: References
8: MCQs
1. Learning Outcomes
- Understand the concept of determinants and will be able to apply it to solve the
mathematical equations.
- Get to know the various properties of determinants and evaluate them.
- Know the steps and tools to calculate and solve determinants.
- Understand the different types of determinants and its various concepts.
- Solve the simultaneous equations using determinants applying Cramer’s Rule.
- Understand the concept of linear dependency and how to calculate rank of a matrix.
2. Important Terminology
2.1 Minors:
b1 - c1 + d1
Similarly, minor of b2 =
Minor of b3 =
Similarly, minor of c1 =
So we can say that the minor of any element in an nth order determinant is an (n – 1)th order
determinant (where n = 2, 3, …)
2.2 Cofactors:
|A| =
\[ f_{rc} = (-1)^{r+c} M_{rc} \]
(where $M_{rc}$ = minor of determinant A formed by eliminating row ‘r’ and
column ‘c’ from A).
Co-factor of c12 = (-1)1+2 = (-) , as it lies in 1st row and 2nd column and
so on for other elements.
For Example, Compute the value and cofactors of the following determinant:
|A| =
|A| = 1 -2 +3
= 3 – (-18) + (-30) = -9
|A| = 1 -4 +7
COFACTORS:
3. Linear Dependency:
Suppose, = and =
Two vectors x and y are said to be linearly independent if a = b = 0 is the only pair of numbers which
satisfies the equation ax + by = 0, i.e.
a +b = 0
a +b =0
a=b=0 is the only solution to this equation, given x and y are two independent vectors.
Two vectors x and y are linearly independent if the matrix with these vectors as
columns has a non-zero determinant. More generally, a set of n vectors, each with n components,
is linearly independent if the determinant of the matrix formed from them is non-zero.
And of course if the determinant is zero, the set is dependent.
For example, the two vectors <9, 2> and <5, 7> are linearly independent since the determinant of
the matrix containing the two vectors as columns is non-zero.
We need to show that the only constants a and b satisfying ax + by = 0 are the trivial solution
a = b = 0.
a +b =
9a + 5b = 0
2a + 7b = 0
|A| = (63-10) = 53 ≠ 0
Linearly Independent
For example, show that the two vectors <2, -3> and <-10, 15> are linearly dependent.
We need to find constants a and b, not both zero, such that ax + by = 0.
a +b =
2a - 10b = 0 or 2(a-5b) = 0
-3a + 15b = 0 or -3(a-5b) = 0
|A| = (30-30) = 0
Linearly dependent
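The determinant test used in the two examples above can be sketched as a small Python helper. The function names are illustrative assumptions, and the test as written applies only to two vectors with two components each.

```python
def det_of_columns(u, v):
    """Determinant of the 2x2 matrix whose columns are the vectors u and v."""
    return u[0] * v[1] - v[0] * u[1]


def linearly_independent(u, v):
    """Two 2-component vectors are independent iff this determinant is non-zero."""
    return det_of_columns(u, v) != 0


print(det_of_columns((9, 2), (5, 7)), linearly_independent((9, 2), (5, 7)))        # 53 True
print(det_of_columns((2, -3), (-10, 15)), linearly_independent((2, -3), (-10, 15)))  # 0 False
```

In the dependent case, a non-trivial relation can be written down explicitly: 5·<2, -3> + 1·<-10, 15> = <0, 0>.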
We need to find out the constants a, b, c such that we get the trivial solution a=b=c=0
Let, a +b +c =
a + b =0
a + c=0
b + c=0
Linearly independent
Alternatively, Let A =
Linearly independent
For example, show that the following vectors are dependent and also find the relation
between them:
Let matrix A =
Linearly dependent
Now, to find the relation between the vectors x, y and z, we need to define the constants a,
b and c such that:
a +b +c =0
=0
Now, through elementary row operation using the GAUSSIAN ELIMINATION we get,
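Thus we get,
2a + 3b + 5c = 0
1b + 3c = 0
From the second equation b = -3c, and substituting in the first gives 2a = 9c - 5c, i.e. a = 2c. Hence
ax + by + cz = c(2x – 3y + z) = 0
2x – 3y + z = 0
-2x + 3y = z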
Using elementary row operations, we reduce the matrix to triangular form, write down the
associated linear equations and then solve them. This is called Gaussian Elimination.
The following are the steps involved in Gaussian Elimination for a system of linear
equations:
2x-3y+2z=21
x+4y-z=1
-x+2y+z=17
- Adding R1 to R3 we get,
- Dividing R3 by 6 we get,
x+4y-z=1
y+4z=55
z=13
Solving the above equations by back substitution we get,
z = 13, y = 55 - 4(13) = 3 and x = 1 - 4(3) + 13 = 2.
The rank of the matrix is 3 as there are three leading ones, one each corresponding to x, y and z.
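The elimination and back-substitution steps can be sketched in Python as below. This is an illustrative sketch under stated assumptions: the function name `gaussian_solve` is hypothetical, exact fractions are used to avoid rounding, and the code handles only square systems with a unique solution. It is applied to the system 2x − 3y + 2z = 21, x + 4y − z = 1, −x + 2y + z = 17.

```python
from fractions import Fraction


def gaussian_solve(a, b):
    """Solve a square system a X = b with a unique solution by Gaussian elimination."""
    n = len(a)
    # Augmented matrix [A | b] with exact rational entries
    aug = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    # Forward elimination to upper-triangular form
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # Back substitution, starting from the last equation
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


solution = gaussian_solve([[2, -3, 2], [1, 4, -1], [-1, 2, 1]], [21, 1, 17])
print(solution)  # x = 2, y = 3, z = 13 (as Fractions)
```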
-x+3z=2
2x+y-4z=-1
x+2y+z=4
x =
- Adding R2 to R1 we get,
- Adding R2 to R3 we get,
x+y-z=3
y+2z=5
Solving it we get,
Z=0
Y=5
X=-2
Since there are two leading ones in the above matrix. Rank of the matrix is 2.
If we have an n×n matrix (an nth order determinant) and the determinant is non-zero, the
rank of the matrix is equal to n; if the determinant is zero, the rank is less than n. In
general, the number ‘r’ is defined as the rank of matrix A if A has at least one (r × r)
square sub-matrix with a non-zero determinant and every square sub-matrix of higher order has
a zero determinant. Thus, to calculate the rank of a matrix, we find the maximum order of the
minors of the matrix which are non-zero.
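The definition of rank as the largest order of a non-zero minor can be applied literally by a brute-force search. The sketch below is illustrative (`det` and `rank_by_minors` are assumed names): it tries square sub-matrices from the largest possible order downwards and stops at the first non-zero determinant. It is practical only for small matrices.

```python
from itertools import combinations


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if not m:          # determinant of the empty (0x0) matrix is 1
        return 1
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))


def rank_by_minors(a):
    """Largest k such that some k x k sub-matrix of a has a non-zero determinant."""
    rows, cols = len(a), len(a[0])
    for k in range(min(rows, cols), 0, -1):
        for rs in combinations(range(rows), k):
            for cs in combinations(range(cols), k):
                sub = [[a[r][c] for c in cs] for r in rs]
                if det(sub) != 0:
                    return k
    return 0  # only the zero matrix has rank 0


# Coefficient matrix of -x + 3z, 2x + y - 4z, x + 2y + z
print(rank_by_minors([[-1, 0, 3], [2, 1, -4], [1, 2, 1]]))  # 2
```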
A=
-NOTE: We can discard the 5th column as all its elements are zero/null.
- NOTE: We can see that column 3 = column 1 + column 2. Thus we can also discard the
3rd column.
Now we have,
A=
, it is a 5X3 matrix
- Now we have to find the maximum order of the minors of the matrix which are non-zero.
If there is at least one non-zero minor of a particular order, the rank of the matrix
is at least that order. We shall start from the lowest order and work upwards.
ORDER 1: every non-zero element of the matrix is a non-zero minor of the 1st order.
So we shall look at higher order determinants.
ORDER 2: =4≠0
Thus, we have a non-zero minor or non-zero square sub-matrix determinant of 2nd order
determinant, so we shall look at higher order determinant.
Thus, we have a non-zero minor or non-zero square sub-matrix determinant of 3rd order
determinant
We do not have a higher order determinant further, so the rank of matrix A = 3 which is the
order of the maximum non-zero square sub-matrix determinant.
NOTE:
- The rank of a matrix cannot exceed the number of its rows or the number of its columns,
whichever is smaller.
Answer: The rows are not independent if one row is a multiple of another, and the
determinant is then zero. In this case the matrix has two identical rows (and two
identical columns), thus the determinant is zero.
= (2-2) = 0
Answer: Now we may consider that 4th row is a multiple of the 1st row. Thus, the rank of
the matrix is 3.
IMPORTANT:
Answer: We can see that the 3rd column is a linear combination of the 1st and 2nd column.
Thus we can eliminate the 3rd column. Thus we get
We can use determinants to find out the largest non-zero square sub-matrix. Here the
largest square sub-matrix is (3X3) matrix. Thus, we check for all 3X3 sub matrix whether its
determinant is a non-zero.
Thus, determinant of all possible 3X3 sub matrix is zero, thus rank is less than 3. Now we
look at 2X2 sub-matrix:
=1≠0
Answer: We can eliminate the 4th column as it is a multiple of the 1st column.
We can use determinants to find out the largest non-zero square sub-matrix. Here the
largest square sub-matrix is (3X3) matrix. Thus, we check for the 3X3 sub matrix whether
its determinant is a non-zero.
Thus, determinant of all possible 3X3 sub matrix is zero, thus rank is less than 3. Now we
look at 2X2 sub-matrix:
= -1 ≠ 0
Answer: We can use determinants to find out the largest non-zero square sub-matrix. Here
the largest square sub-matrix is (4X4) matrix. Thus, we check for the 4X4 sub matrix
whether its determinant is a non-zero.
Thus, r(B) = 4
- We can eliminate the 3rd column as all its elements are zero.
- We can also eliminate the 5th column as it is proportional to the 1st one.
- We can see that c2 = -2 c1 + c4, thus c2 being a linear combination of c1 and c4, we can
eliminate c2.
Thus the resulting matrix is:
We can use determinants to find out the largest non-zero square sub-matrix. Here the
largest square sub-matrix is (2X2) matrix. Thus, we check for all 2X2 sub matrix whether its
determinant is a non-zero.
=1≠0
Thus, r(c)=2
Using Gaussian Elimination, we try to form an upper triangle of the matrix A which
eliminates the linearly dependent row vectors.
There are no more elementary row operations possible. Thus, eliminating the 3rd row, we
can say that the rank of the matrix is 2. Thus r(A) = 2.
Points to note:
A system of linear equations has one of three possible outcomes, i.e. a unique solution, no
solution or infinitely many solutions. (A homogeneous system always has at least the trivial solution.)
Unique Solution- When a system is consistent and the number of variables (unknown) in
the system is equal to the number of non-zero rows, the system has a unique solution.
Infinite Solutions- When a system is consistent and the number of variables (unknown) in
the system is more than the number of non-zero rows, the system has infinite solutions.
x =
- If k= -2, then rank[A|B] = 3 as there will be three leading ones. Thus rank[A|B] ≠
rank[A]. Thus the system has no solution at k= -2.
- If k= 1, then rank[A|B] = 2 as there will be two leading ones. Thus rank[A|B] =
rank[A]. Number of variables=3 and number of non-zero rows =2. Thus the system
has many solutions at k= 1 as number of variables > number of non-zero rows
- If k ≠ 1, -2, then rank[A|B] = 3 = rank [A]. Number of non-zero rows is thus 3.
Thus, the system has a unique solution at k ≠ 1, -2 as number of variables =
number of non-zero rows.
5. Summary
Through this chapter we have been able to write the simultaneous linear equations in a
compact way and we have also used determinants to find the solution to those equations.
- A determinant of a 2X2 matrix is defined as product of two diagonal elements minus the
product of two off-diagonal elements of a matrix. For a matrix to be invertible/non-singular
its determinant should be different from zero.
- We can expand a determinant by any row or column and it will generate the same value of
the determinant every time.
b1 - c1 + d1
frc = (-1)r+cMrc , (where Mrc = minor of determinant A formed by eliminating row ‘r’ and
column ‘c’ from A).
x= ,y= and z = .
- Two vectors x and y are linearly independent if the matrix with these vectors as
columns has a non-zero determinant. More generally, a set of n vectors, each with n components,
is linearly independent if the determinant of the matrix formed from them is non-zero.
And of course if the determinant is zero, the set is dependent.
- The number ‘r’ is defined as the rank of matrix A if A has at least one (r × r) square
sub-matrix with a non-zero determinant and none of higher order. Thus, to calculate the rank of
a matrix, we find the maximum order of the minors of the matrix which are non-zero.
6. Exercise
2. What is the rank of a determinant and what are the applications of determinant?
5. What are the different types of determinants and explain their properties.
x + 2y + 3z = 6
2x + 4y + z = 7
3x + 2y + 9z = 14
Find out the values of x, y and z using Cramer’s rule.
10. Find the cofactors of the following determinant and prove that |A’| = |A2|
|A| = .
7. References
3. http://www.sheir.org/matrices-determinants-mcqs.html
DC-1
Semester-II
Paper-IV: Mathematical methods for Economics-II
Lesson: Geometric Representation of Functions: Graphs and Level
Curves
Lesson Developer: Neha Goel
College/Department: Shyamlal College, University of Delhi
1 Learning Outcomes
2 Introduction
3 Points in Euclidean Spaces
3.1 Number Line
3.2 The Plane
3.3 Three Dimensions
3.4 Surface in A Space
4 Geometric Representation of Functions
4.1 Graphs
4.2 Level Curves
5 Differentiable Functions
6 Exercise
7 References
1. Learning Outcomes
2. Introduction
Thus, A is the point (with 2 and 3 as x and y coordinates respectively) in the plane with
an ordered pair.
We can use these number lines to find the point with a particular triple of numbers (p, q, r). We
plot the points in the same way as in R2. First, ignoring the coordinate p, we plot the coordinates
q and r in the plane of their two axes as we did earlier. Then, from the point (q, r), we move p units in the
direction parallel to the remaining axis: forward of the plane if p is positive, behind the
plane if p is negative, and not at all if p is 0. We finish at the point (p, q, r). Alternatively, we can
ignore q, plot (p, r) in its plane and move q units parallel to the remaining axis, to the right if q is
positive and to the left if q is negative. Similarly, we can ignore r, plot (p, q) in its plane and move r
units up if r is positive and down if it is negative. We
see that whichever method we use, we end up at the same point (p, q, r).
The diagram below shows the three dimensional figure with coordinates p, q and r where
the coordinates q and r are found using 2-space technique. We have already seen how moving
parallel, up, down, right, left, from point (p,q,r) we form a three-dimensional figure with
different coordinates as seen in the figure below.
We have thus seen how to plot points in R1, R2 and R3 using Euclidean space. R1 consists of single
real numbers, represented by a number line. R2 consists of ordered pairs, represented by points
in a plane, and R3 consists of ordered triples, represented by points
in 3-space. In general, Euclidean n-space consists of n-tuples of numbers,
i.e. ordered lists of n numbers, and is written Rn. The number n, called
the dimension of Rn, states how many numbers are required to describe each
location; for example, R3 has three dimensions.
Thus,
Points in three-dimensional space have three coordinates as shown in the figure above.
Similarly, the figures below show the pieces of other two equations:
EXAMPLES
1) SPHERE: let us consider the equation a² + b² + c² = 16. Now,
a² + b² + c² = (a-0)² + (b-0)² + (c-0)² is the square of the distance from the origin (0,0,0)
to the point (a,b,c). Thus, the equation a² + b² + c² = 16 describes those points (a,b,c)
whose distance from the origin is 4. Thus, it represents a sphere with radius = 4 and centre
at (0,0,0).
We generally denote a function from set X to set Y as f:X→Y, which is a rule that
assigns one and only one object in Y, to each object in X.
EXAMPLE: Suppose f(x,y) = x² + y², which defines f:R2→R1. The image of f is the set of all non-
negative real numbers. The target space of f is R1 and the domain of f is all of R2.
EXAMPLE: Suppose f(x) = 1/x. The domain of f(x) is all real numbers except 0. It has the same
image as domain, i.e. R1 – {0}.
Map makers draw contours to show altitude variations on the earth's surface; the
closer the contours, the steeper the slope. For example, they draw contours or level curves
connecting points on the map that represent places on the earth's surface at the same height (e.g. 100
meters) above sea level. We can apply the same concept to the geometric representation of
arbitrary functions Z=f(X,Y). The graph of such a function in three-
dimensional space can be cut by horizontal planes parallel to the XY-plane. The projection of this
intersection onto the XY-plane is known as the level curve of ‘ f ’ at height ‘a’, if the
intersecting plane is z=a. This level curve consists of the points that satisfy the equation:
f(X,Y) = a
EXAMPLES
1) What are the level curves of the function z = f(x,y) = x² + y²? Represent them graphically.
Solution: We know that z is never negative, i.e. z ≥ 0. The equation of the level curve is:
x² + y² = a, with a ≥ 0
From this equation, we may say that the level curves are circles in the XY-plane with radius √a and
centre at the origin.
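A short numerical sketch can confirm that points on a circle of radius √a all lie on the level curve f(x, y) = a. The function names and the choice of eight sample points are illustrative assumptions.

```python
import math


def f(x, y):
    """The function whose level curves are studied: f(x, y) = x^2 + y^2."""
    return x ** 2 + y ** 2


def level_curve_points(a, n=8):
    """n points on the level curve f(x, y) = a; empty when a < 0."""
    if a < 0:
        return []                  # x^2 + y^2 can never be negative
    r = math.sqrt(a)               # the level curve is a circle of radius sqrt(a)
    return [(r * math.cos(2 * math.pi * k / n), r * math.sin(2 * math.pi * k / n))
            for k in range(n)]


for x, y in level_curve_points(9):
    print(round(x, 3), round(y, 3), round(f(x, y), 6))  # last column is always 9.0
```

For a = 0 the level curve shrinks to the single point (0, 0), and for a < 0 it is empty.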
Now, we know that all the level curves are circles. If X=0, then Z=Y², which is a parabola in
the YZ-plane. Similarly, if Y=0, then Z=X², which is a parabola in the XZ-plane. Thus, we get the
graph by rotating the parabola Z=X² around the Z-axis. The surface is a paraboloid, as
shown in the figure below:
Now we know that if X=0, Z=Y², i.e. a parabola in the YZ-plane. Similarly, if Y=0, Z= - X² is
an inverted parabola in the XZ-plane. Plotting the curves for different fixed values of Y in
the XZ-plane and continuing the process, we get a saddle-shaped graph as follows:
Solution: The level curve for the above production function in the KL-plane is called an iso-
quant. Graphically, we can see the iso-quants of the production function Q=K.L as follows:
Thus, value of f(a,b) is a constant 3/5, for all values (a,b) where ab=3. Thus, ab=3 lies on a
level curve for f, at height 3/5.
7) The linear function Z=aX+bY+c has a plane in space for its graph.
We can find the partial derivatives of the function Z but the function is not differentiable at the
point (0,0).
** Thus, we can say that if the partial derivatives of a function exist and are continuous
in a neighbourhood of a point, then the function is differentiable at that point.
6) Suppose, Z=f(X,Y) and
We can find the partial derivatives of the function Z but the function is not differentiable at the
point (0,0). Thus, the function is not differentiable.
7) Find if the function a is differentiable: a = b²c + 5b²c⁵d - bc + 5d
∂a/∂b = 2bc + 10bc⁵d - c
∂a/∂c = b² + 25b²c⁴d - b
∂a/∂d = 5b²c⁵ + 5
Thus, all the partial derivatives exist, and being polynomials they are continuous everywhere. Also the
function a is defined and continuous for all values of b, c and d. Thus, the function is differentiable.
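The three partial derivatives can be checked numerically with central differences. The sketch below is illustrative: `numeric_grad`, the step size h and the test point (1, 2, 0.5) are assumptions made for this check. Note that ∂a/∂c contains the term −b, which comes from differentiating −bc.

```python
def a(b, c, d):
    """The function of the example: a = b^2 c + 5 b^2 c^5 d - b c + 5 d."""
    return b ** 2 * c + 5 * b ** 2 * c ** 5 * d - b * c + 5 * d


def grad_a(b, c, d):
    """Partial derivatives of a with respect to b, c and d, worked out by hand."""
    return (2 * b * c + 10 * b * c ** 5 * d - c,
            b ** 2 + 25 * b ** 2 * c ** 4 * d - b,
            5 * b ** 2 * c ** 5 + 5)


def numeric_grad(func, point, h=1e-6):
    """Central-difference approximation, varying one variable at a time."""
    grads = []
    for i in range(len(point)):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        grads.append((func(*up) - func(*down)) / (2 * h))
    return tuple(grads)


point = (1.0, 2.0, 0.5)          # hypothetical test point
print(grad_a(*point))            # (162.0, 200.0, 165.0)
print(numeric_grad(a, point))    # approximately the same three values
```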
8) Find if the function a is continuous:
a=
The function is not defined where the denominator is zero. We can see that the function a is defined
for all values (X,Y) except the points lying on the circle X² + Y² = 4. Thus, the function is not
continuous at those points.
6 Exercise
1) Prove that the level curve of the function z, where
Z=
has level curves centered at origin. Also show that X2+Y2=6 is a level
curve of the function Z.
[Hint: we can trace it as absolute value functions and by staring at it we see that Z=|r|, thus it is
a cone.]
3) Plot the graph for Z=X2-Y2. [Hint: see section 4.2, example 6.]
4) Plot Z=
Hint: The contours can be plotted as follows:
We know that this is also an indifference curve. Thus we can plot the three-dimensional
figure from it.
(a) XY
(b) Log XY
(c) XY
(d)
7) How would you use the graph of Z=f(X,Y), to draw level curves of f?
8) Geometrically plot the graphs of the following functions:
(a) Z=5-X-Y
(b) Z= - X2 – Y2
7 References
K. Sydsaeter and P. Hammond, Mathematics for Economic Analysis, Pearson Educational Asia,
Delhi, 2002
Carl P. Simon, Lawrence Blume, Mathematics for Economists
Higher Order Differentiation and Its Applications
DC-1
Semester-II
Paper-IV: Mathematical methods for Economics-II
Lesson: Higher Order Differentiation and Its Applications
Lesson Developer: Sarabjeet Kaur
College/Department: Department of Economic, P.G.D.A.V College,
University of Delhi
CONTENTS
1. Learning Outcomes
2. Higher Order Differentiation
3. Partial Derivatives
3.1Higher order partial derivative
3.2Partial derivative with many variables
4. Quadratic Forms
5. Exercise
6. References
1. Learning Outcomes
If f(x) is a differentiable function of x, then f'(x) or dy/dx is the first derivative, or first order
derivative, of y = f(x) w.r.t. ‘x’. Since the derivative of a function is also a function,
another derivative can also be found. The second order derivative, or second derivative, is the
derivative of the first derivative of the function f(x). Other notations are:
or or or f '' (x)
Since f''(x) is also a function, its derivative can also be found, and it is denoted f'''(x).
For higher order derivatives superscripts can be used, i.e. f⁽⁴⁾ = fourth derivative etc.
Example: f(x) = 4x⁵ + 6x³ + 2x + 1
f'(x) = 20x⁴ + 18x² + 2
f''(x) = 80x³ + 36x
f'''(x) = 240x² + 36
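Repeated differentiation of a polynomial is easy to mechanise. In the sketch below a polynomial is stored as a list of coefficients in increasing powers of x; this representation and the function names are assumptions made for illustration only.

```python
def derivative(coeffs):
    """Derivative of a polynomial; coeffs[k] is the coefficient of x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def nth_derivative(coeffs, n):
    """Differentiate n times in succession."""
    for _ in range(n):
        coeffs = derivative(coeffs)
    return coeffs


f = [1, 2, 0, 6, 0, 4]           # f(x) = 1 + 2x + 6x^3 + 4x^5
print(nth_derivative(f, 1))      # [2, 0, 18, 0, 20]  ->  20x^4 + 18x^2 + 2
print(nth_derivative(f, 2))      # [0, 36, 0, 80]     ->  80x^3 + 36x
print(nth_derivative(f, 3))      # [36, 0, 240]       ->  240x^2 + 36
```

Since f has degree 5, the sixth and all higher derivatives are zero, which the function reports as an empty coefficient list.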
3. Partial Derivatives
Given a function y = f(x), the derivative f '(x) represents the rate of change of the function as x
changes. For a function of two variables, such as z = f(x,y), one variable could be changing
faster than the other, so the function may change at a different rate in each direction.
For a function of two independent variables, z = f(x,y), the partial derivative of z with respect to x
is found by the normal rules of differentiation. The only difference is that wherever
the second independent variable ‘y’ appears, it is treated as a constant in every
respect. Likewise, the partial derivative with respect to y is found by treating x as a constant.
Notations of partial differentiation are given below:
Example: Z = x⁴y² – x²y⁶
∂Z/∂x = 4x³y² – 2xy⁶
∂Z/∂y = 2x⁴y – 6x²y⁵
∂z/∂x and ∂z/∂y, called the partial derivatives of z with respect to x and y respectively, are the
derivative of z w.r.t. x keeping y constant and the derivative of z w.r.t. y keeping x constant. All the rules of
differentiation can be applied when partial derivatives are calculated.
\[ f_x = \frac{\partial f}{\partial x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x,\, y) - f(x, y)}{\Delta x} \]
\[ f_y = \frac{\partial f}{\partial y} = \lim_{\Delta y \to 0} \frac{f(x,\, y + \Delta y) - f(x, y)}{\Delta y} \]
∂z/∂x = 3x²y + 2xy² + y + 1
∂z/∂y = x³ + 2x²y + x + 2y
For a function z = f(x, y), fx and fy are the two first order partial derivatives with respect to
x and y respectively. Since fx and fy are themselves functions of x and y,
second order partial derivatives can also be found.
Second order partial derivatives taken with respect to more than one variable are called mixed
partial derivatives; e.g. differentiating a function with respect to ‘x’ first
and then ‘y’ gives a mixed partial derivative. The various notations for partial derivatives are
given in the table:
A function of two variables has four possible second partial derivatives: those obtained by differentiating
the function w.r.t. ‘x’ twice, w.r.t. y twice, w.r.t. x first and then y, and w.r.t. y first and then x. Each derivative
has a sign (+ or -); the interpretation of these signs is as follows.
Since x and y are positive, and a positive number raised to any power is positive, y^0.5 and x^-1.5
are positive. The coefficient -0.25 in the equation shows that the second order derivative of z with respect
to x twice is negative, meaning that the slope in the x direction decreases as x increases while y is held
constant.
= x3 + 2x2y + 2xy2 +y +1
fy= x3 + 2x2y + x + 2y
fyy = 2x2 + 2
∴ fxy = fyx.
The two mixed second order partial derivatives (also called cross partial derivatives) are
always equal when fxy and fyx are continuous. This is stated by the following theorem of
Alexis Clairaut, also known as Young’s theorem.
Theorem: Suppose f is defined on a disk D, which contains the point (a, b). If the partial
derivatives fxy and fyx are both continuous on disk D, then
fxy (a, b) = fyx (a, b).
Solution:-
fx(x,y) = -2 x2 y2
fy(x,y) = -2 x3 y
fxy(x,y) = -2 x2 y -4 x2 y +4 x4y3
= -6 x2 y +4 x4y3
fyx(x,y)= -6 x2 y +4 x4y3
.: fxy = fyx
Hence proved.
is the differentiation of the function w.r.t. xi when all the other variables xj (j≠i ) are held
constant.
i.e =
and = and so on
Suppose there is a function of three variables, v = f(x, y, z). Such a function
has partial derivatives w.r.t. x, y and z. Each partial derivative is taken with respect to
one of x, y and z while the other two independent variables are held constant.
In general, the partial derivative w.r.t. xi is taken when all the other variables xj (j≠i) are held constant:
$f_{x_i} = \frac{\partial f}{\partial x_i} = \lim_{h \to 0} \frac{f(x_1, \ldots, x_i + h, \ldots, x_n) - f(x_1, \ldots, x_n)}{h}$
$f_x = \frac{\partial f}{\partial x} = 2x$
$f_y = 3y^2$
$f_z = 4z^3$
$Z = 3x^2(5x + 7y) = 15x^3 + 21x^2 y$
$Z_x = 45x^2 + 42xy$
$Z_{xx} = 90x + 42y$
$Z_{xxx} = 90$
$Z_{xy} = 0 + 42x$
$= 42x$
$Z_{xyx} = 42$
$Z_y = 21x^2$
$Z_{yy} = 0$
$Z_{yyy} = 0$
$Z_{yx} = 42x$
$Z_{yxy} = 0$
$Z_x = 216x - 30y$
Zxx = 216
Zxyy = 0
Zxxx = 0
Zxy = -30
Zxyy = 0
$Z_y = -30x - 16y$
Zyy = -16
Zyyy =0
Zx =
Zxx= 0
Zy =
= =
Clairaut's theorem (Young's theorem) can be extended to a function of n variables
and its mixed partial derivatives of any order. The only thing to remember is that two mixed partial derivatives are equal when
each variable is differentiated the same number of times in both, whatever the order of differentiation.
There are n partial derivatives of first order. For each first order partial derivative of
the function, there are n second order partial derivatives, i.e.,
$\frac{\partial}{\partial x_j}\left(\frac{\partial f}{\partial x_i}\right) = \frac{\partial^2 f}{\partial x_j \partial x_i} = f_{x_i x_j}$ (i = 1, ..., n; j = 1, ..., n)
So there are $n^2$ elements in total. The n × n matrix of second order partial derivatives is the
Hessian matrix, which is symmetric because $f_{ij} = f_{ji}$ for all i and j (Clairaut's theorem).
Example:
If the two demand functions for the two commodities are given by
x= y=
= =
= =
Example:
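$x = a e^{-pq}$
$y = b e^{p-q}$
$\frac{\partial x}{\partial p} = -aq e^{-pq}$, $\frac{\partial y}{\partial p} = b e^{p-q}$
$\frac{\partial x}{\partial q} = -ap e^{-pq}$, $\frac{\partial y}{\partial q} = -b e^{p-q}$
Because $\frac{\partial x}{\partial q} \le 0$ and $\frac{\partial y}{\partial p} \ge 0$, the given commodities are neither competitive nor
complementary.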
Example
Consider two products, A and B. the demand for good A and B, & described by following two
equations
qa =
Solution
qa = =
= -100
qb =
= =( )
= ( (pa-1/3)
= ( )
= ( pa-1/3-1)
=
pa-4/3 . pb-1
We know that pa and pb are positive because prices can never be negative therefore:
=-( ) = -( )<0
=- =-( )<0
a. Marginal utility is given by : = ; which is greater than zero; because both the
c. ; which is less than zero. Since the second derivative is negative, the
Find out the partial elasticity with respect to labor at (L,K)= (1024,27).
This explains that if capital remains constant at K=27 and at L=24 labour increases with 1
percent, then output will increase by percent.
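The utility function is $U = X^{0.5} Y^{0.5}$.
$MU_X = \frac{\partial U}{\partial X} = 0.5 X^{-0.5} Y^{0.5}$
$MU_Y = \frac{\partial U}{\partial Y} = 0.5 X^{0.5} Y^{-0.5}$
$MRS = \frac{MU_X}{MU_Y} = \frac{0.5 X^{-0.5} Y^{0.5}}{0.5 X^{0.5} Y^{-0.5}} = \frac{Y}{X}$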
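The marginal utilities and the MRS for U = X^0.5 Y^0.5 can be verified numerically. The following is a minimal sketch; the function names and the evaluation point (X, Y) = (4, 9) are illustrative assumptions.

```python
# Marginal utilities by central finite differences and the resulting MRS
# for the utility function U = X^0.5 * Y^0.5 (expected MRS = Y / X).

def utility(x, y):
    return x**0.5 * y**0.5

def marginal_utility(x, y, wrt, h=1e-6):
    """Approximate the partial derivative of U with respect to 'x' or 'y'."""
    if wrt == "x":
        return (utility(x + h, y) - utility(x - h, y)) / (2 * h)
    return (utility(x, y + h) - utility(x, y - h)) / (2 * h)

def mrs(x, y):
    """Marginal rate of substitution MU_X / MU_Y."""
    return marginal_utility(x, y, "x") / marginal_utility(x, y, "y")

if __name__ == "__main__":
    print(mrs(4.0, 9.0))  # close to 9 / 4
```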
=- /
= K1/6L-1/2
= K-5/6L1/2
= −( K1/6L-1/2)/ ( K-5/6L1/2)
=- K1/6 K5/6L-1/2L-1/2
$= -3(K/L)$
Therefore, the slope of the isoquant is -3(K/L).
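Example: Given the demand function Q - 90 + 2P = 0 and the average cost function
$AC = Q^2 - 39.5Q + 120 + 125/Q$
From the demand function, P = 45 - 0.5Q, so $TR = 45Q - 0.5Q^2$ and
$MR = \frac{dTR}{dQ} = 45 - Q = 0$
$Q = 45$
Total cost is $TC = AC \cdot Q = Q^3 - 39.5Q^2 + 120Q + 125$, so $MC = 3Q^2 - 79Q + 120$ and
$\frac{dMC}{dQ} = 6Q - 79 = 0$
$Q = 13.167$
And $\frac{d^2 MC}{dQ^2} = 6 > 0$.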
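With profit $\pi = TR - TC = -Q^3 + 39Q^2 - 75Q - 125$, the first order condition $\frac{d\pi}{dQ} = -3Q^2 + 78Q - 75 = 0$ gives Q = 1 and Q = 25.
$\frac{d^2\pi}{dQ^2} = -6Q + 78$
When Q = 1 then,
$\frac{d^2\pi}{dQ^2} = 72 > 0$.
When Q = 25 then,
$\frac{d^2\pi}{dQ^2} = -72 < 0$.
Hence profit is maximized at Q = 25.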
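The first and second order conditions for this example can be reproduced with a few lines of code. This is a sketch under the assumption that total revenue is 45Q - 0.5Q^2 (from the demand function Q - 90 + 2P = 0) and total cost is AC times Q; the function names are illustrative.

```python
import math

# Profit maximization for demand Q - 90 + 2P = 0 and
# AC = Q^2 - 39.5Q + 120 + 125/Q (so TC = AC * Q).

def total_revenue(q):
    return 45 * q - 0.5 * q**2

def total_cost(q):
    return q**3 - 39.5 * q**2 + 120 * q + 125

def profit(q):
    return total_revenue(q) - total_cost(q)

def second_derivative_of_profit(q):
    return -6 * q + 78

def stationary_points():
    """Solve d(profit)/dQ = -3Q^2 + 78Q - 75 = 0 with the quadratic formula."""
    a, b, c = -3.0, 78.0, -75.0
    root = math.sqrt(b * b - 4 * a * c)
    return sorted([(-b + root) / (2 * a), (-b - root) / (2 * a)])

if __name__ == "__main__":
    for q in stationary_points():
        kind = "maximum" if second_derivative_of_profit(q) < 0 else "minimum"
        print(q, second_derivative_of_profit(q), kind)
```

The sign of the second derivative at each stationary point separates the profit minimum from the profit maximum.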
$MC = \frac{dTC}{dQ} = 10$
$MR_1 = MC$
$210 - 20Q_1 = 10$
$Q_1 = 10$
When $Q_1 = 10$, $P_1 = 210 - 10(10) = 110$.
The demand function in the second market is $Q_2 = 50 - 0.4P_2$,
hence $P_2 = 125 - 2.5Q_2$
$TR_2 = (125 - 2.5Q_2)Q_2 = 125Q_2 - 2.5Q_2^2$
When MR2=MC
125- 5Q2=10
Q2= 23
When Q2= 23, then P2= 125 – 2.5(23)= 67.5
The discriminating monopoly charges a lower price in the second market where the demand is
relatively more elastic, and a higher price in the first market where the demand is relatively less
elastic.
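The two-market calculation can be sketched in code. The inverse demand curves P1 = 210 - 10Q1 and P2 = 125 - 2.5Q2 and the marginal cost of 10 are taken from the example; the helper name `monopoly_market` is a hypothetical illustration, not part of the lesson.

```python
# Third-degree price discrimination with linear inverse demand P = a - b*Q
# in each market and constant marginal cost mc.

def monopoly_market(a, b, mc):
    """Return (quantity, price, price elasticity) where MR = a - 2bQ equals mc."""
    q = (a - mc) / (2 * b)
    p = a - b * q
    elasticity = -(1 / b) * (p / q)  # dQ/dP = -1/b for linear demand
    return q, p, elasticity

if __name__ == "__main__":
    print(monopoly_market(210, 10, 10))   # first market
    print(monopoly_market(125, 2.5, 10))  # second market
```

Comparing the two elasticities at the optimum shows which market has the more elastic demand and hence the lower price.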
Example: A producer is a price-taker both in the markets for the input factors, labor and capital, and
in the market for the end product. The cost of one unit of labor is w = 2, the cost of one unit of
capital is r = 32, and the selling price of the end product is p = 32. The production
function of this producer is $Y(L,K) = L^{1/8} K^{1/2}$. Determine the maximum profit.
Solution:
The revenue function is $R(L,K) = p \cdot Y(L,K) = 32 L^{1/8} K^{1/2}$
The cost function is
$C(L,K) = wL + rK = 2L + 32K$.
Hence, the profit function becomes
$\pi(L,K) = 32 L^{1/8} K^{1/2} - 2L - 32K$
The partial derivatives of π(L,K) are given by:
$\frac{\partial \pi}{\partial L} = 4 L^{-7/8} K^{1/2} - 2$ and
$\frac{\partial \pi}{\partial K} = 16 L^{1/8} K^{-1/2} - 32$
The stationary points of the profit function are the solutions of the following system:
$4 L^{-7/8} K^{1/2} - 2 = 0$
$16 L^{1/8} K^{-1/2} - 32 = 0$
From the first equation, $K^{1/2} = \frac{1}{2} L^{7/8}$ and
therefore $K = \frac{1}{4} L^{14/8}$.
Substituting into the second equation, $L^{1/8} \left(\frac{1}{4} L^{14/8}\right)^{-1/2} = 2$, i.e. $2 L^{-6/8} = 2$,
which gives L = 1 and therefore K = 1/4.
Hence, (L,K) = (1, 1/4) is the only stationary point. Using the criterion function, we
investigate whether or not this point is a maximum.
$\frac{\partial^2 \pi}{\partial L^2} = -\frac{7}{2} L^{-15/8} K^{1/2}$;
$\frac{\partial^2 \pi}{\partial K^2} = -8 L^{1/8} K^{-3/2}$ and
$\frac{\partial^2 \pi}{\partial L \partial K} = 2 L^{-7/8} K^{-1/2}$, which implies that the criterion function is given by
$C(L,K) = \frac{\partial^2 \pi}{\partial L^2} \cdot \frac{\partial^2 \pi}{\partial K^2} - \left(\frac{\partial^2 \pi}{\partial L \partial K}\right)^2$
$= \left(-\frac{7}{2} L^{-15/8} K^{1/2}\right)\left(-8 L^{1/8} K^{-3/2}\right) - \left(2 L^{-7/8} K^{-1/2}\right)^2$
$= 28 L^{-14/8} K^{-1} - 4 L^{-14/8} K^{-1}$
$= 24 L^{-14/8} K^{-1} > 0$
Hence, as C(1, 1/4) > 0 and $\frac{\partial^2 \pi}{\partial L^2}(1, 1/4) < 0$, it follows that π has a maximum
at (L,K) = (1, 1/4), with value π = 6.
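The stationary point and the second order test can be checked numerically. The sketch below evaluates the analytic derivatives of the profit function π(L, K) = 32 L^(1/8) K^(1/2) - 2L - 32K at (L, K) = (1, 1/4); the function names are illustrative assumptions.

```python
# Check of the first and second order conditions for
# profit(L, K) = 32 * L^(1/8) * K^(1/2) - 2L - 32K at (L, K) = (1, 1/4).

def profit(L, K):
    return 32 * L**(1 / 8) * K**0.5 - 2 * L - 32 * K

def gradient(L, K):
    d_L = 4 * L**(-7 / 8) * K**0.5 - 2
    d_K = 16 * L**(1 / 8) * K**(-0.5) - 32
    return d_L, d_K

def second_derivatives(L, K):
    pi_LL = -3.5 * L**(-15 / 8) * K**0.5
    pi_KK = -8 * L**(1 / 8) * K**(-1.5)
    pi_LK = 2 * L**(-7 / 8) * K**(-0.5)
    return pi_LL, pi_KK, pi_LK

def criterion(L, K):
    """pi_LL * pi_KK - pi_LK^2, the determinant of the Hessian of profit."""
    pi_LL, pi_KK, pi_LK = second_derivatives(L, K)
    return pi_LL * pi_KK - pi_LK**2

if __name__ == "__main__":
    print(gradient(1.0, 0.25), criterion(1.0, 0.25), profit(1.0, 0.25))
```

A zero gradient, a positive criterion value and a negative own second derivative together indicate a local maximum.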
4. Quadratic Forms
A quadratic form in two variables is
$f(x,y) = ax^2 + 2bxy + cy^2$,
where a, b and c are constants. Using matrix notation:
$f(x,y) = (x, y) \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$
$f_{xx} = 2a$, $f_{xy} = f_{yx} = 2b$ and $f_{yy} = 2c$ are the second order partial derivatives of the function f(x,y).
Therefore, the Hessian of f is given by
$H = \begin{pmatrix} 2a & 2b \\ 2b & 2c \end{pmatrix}$
The quadratic form is said to be positive definite if f(x,y) > 0 for all (x,y) ≠ (0,0),
and positive semidefinite if f(x,y) ≥ 0 for all values of (x,y). It is
negative definite if f(x,y) < 0 for all (x,y) ≠ (0,0), and negative semidefinite if f(x,y) ≤ 0 for all (x,y).
It is indefinite if there are two pairs (x-, y-) and (x+, y+) such that f(x-, y-) < 0 and f(x+, y+) > 0.
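Example: Express each quadratic form below in matrix form and determine its definiteness:
a) $f(x,y) = 4x^2 + 8xy + 5y^2$
b) $f(x,y) = -x^2 + xy - 3y^2$
a) $f(x,y) = (x, y) \begin{pmatrix} 4 & 4 \\ 4 & 5 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$
Therefore, the symmetric matrix is $\begin{pmatrix} 4 & 4 \\ 4 & 5 \end{pmatrix}$, whose determinant, 20 - 16 = 4, is positive, and a = 4 > 0. Hence, f(x,y) > 0 for all
(x,y) ≠ (0,0), and the quadratic form is positive definite.
b) $f(x,y) = (x, y) \begin{pmatrix} -1 & 1/2 \\ 1/2 & -3 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$
The determinant, 3 - 1/4 = 11/4, is positive and a = -1 < 0, so f(x,y) < 0 for all (x,y) ≠ (0,0). Therefore, the quadratic form is negative definite.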
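The definiteness test for a two-variable quadratic form can be written as a short routine. This is a sketch based on the standard sign conditions on a, c and the determinant ac - b^2 of the symmetric matrix; the function name is a hypothetical illustration.

```python
# Classify f(x, y) = a*x^2 + 2*b*x*y + c*y^2 from its symmetric matrix
# [[a, b], [b, c]] using the signs of a, c and the determinant a*c - b^2.

def classify_quadratic_form(a, b, c):
    det = a * c - b * b
    if a > 0 and det > 0:
        return "positive definite"
    if a < 0 and det > 0:
        return "negative definite"
    if det < 0:
        return "indefinite"
    # Only det == 0 remains, so a and c cannot have opposite signs.
    if a >= 0 and c >= 0:
        return "positive semidefinite"
    return "negative semidefinite"

if __name__ == "__main__":
    print(classify_quadratic_form(4, 4, 5))       # 4x^2 + 8xy + 5y^2
    print(classify_quadratic_form(-1, 0.5, -3))   # -x^2 + xy - 3y^2
```

Note that the middle coefficient of the form is 2b, so the form -x^2 + xy - 3y^2 corresponds to b = 1/2.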
5. Exercise:
1. Find the second – order partial derivatives fxx, fyy and fxy for each of the following
functions:
(a) Z=
d. Find the cross-price elasticity of demand, $\frac{\partial Q_d}{\partial P_c} \cdot \frac{P_c}{Q_d}$, when Y = 10,000, Ps = 5 and Pc = 7.
4. Show that fxz = fzx and fxzz = fzxz = fzzx from the following function:
F(x,y,z) = y
$X_1 = p_1^{-1.7} p_2^{0.8}$
$X_2 = p_1^{0.5} p_2^{-0.2}$
What can you say about the two commodities X1 and X2? Also find all the partial elasticities.
$TC = 20 + 2q + 3q^2$ where $q = q_1 + q_2$. What prices will the firm charge in the two markets to maximize
profit?
Solution:
b.fxx = 294(7x + 3y); fyy = 54(7x + 3y) and fxy=fyx= 126 (7x +3y)
5. Since $\partial X_1 / \partial p_2$ and $\partial X_2 / \partial p_1$ are both greater than zero, the commodities X1 and X2 are
competitive.
6. p1= 3.2 and p2= 3.9; e11= -1.7; e21= 0.8; e22= -.2 and e12 = 0.5.
7. Answers:
a. $\frac{\partial y}{\partial L} = \frac{5K^{1/2}}{L^{1/2}} = \frac{40}{L^{1/2}}$
b. L = 16
c. L = 25; labor demand increases as the wage declines.
d. $\frac{\partial y}{\partial L} = \frac{5K^{1/2}}{L^{1/2}} = \frac{50}{L^{1/2}} = 8$; i.e., L ≈ 39.
e. $2.5 L^{-1/2} K^{-1/2}$
6.References:
1. K. Sydsaeter and P. Hammond, Mathematics for Economic Analysis, Pearson Educational
Asia, Delhi, 2002.
2. M. Hoy et al., Mathematics for Economics, PHI Learning Private Limited, Delhi, Second
Edition, 2001.
3. J.E. Draper and J.S. Klingman, Mathematical Analysis: Business and Economic
Applications, Harper & Row Publishers, New York, 1967.
4. M. Rosser, Basic Mathematics for Economists, Second Edition, London, 2003.
Homogeneous and Homothetic Function
DC-1
Semester-II
Paper-IV: Mathematical methods for Economics-II
Lesson: Homogeneous and Homothetic Function
Lesson Developer: Sarabjeet Kaur
College/Department: P.G.D.A.V College, University of Delhi
Contents
1. Learning Outcomes
2. Tools of Comparative Static Analysis
2.1 Chain Rule
3. Exercise
4. References
1. Learning Outcomes
In economic analysis, a theory specifies relationships between the independent (exogenous) variables
and the dependent (endogenous) variables. It is often hard to solve the equations explicitly, that is, to transform them into ones
that express the endogenous variables as functions of the exogenous
variables. When an exogenous variable changes, the endogenous variables
also change; to determine this change, the method of implicit differentiation is applied. This
technique of finding the rates of change of endogenous variables as exogenous variables change is
known in economics as comparative statics.
2.1Chain Rule:
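One of the most important techniques of differentiation is the chain rule. The chain rule is a rule for
differentiating compositions of functions, that is, functions of one or several variables in which the variables
are themselves functions of other basic variables.
If a function z = f(x, y) depends on two variables that are both functions of a common variable t, then
$\frac{dz}{dt} = \frac{\partial z}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial z}{\partial y} \cdot \frac{dy}{dt}$.
For example, with
$\frac{dx}{dt} = 2t$; $\frac{dy}{dt} = 12t^2$
$\frac{\partial z}{\partial x} = 3$ and $\frac{\partial z}{\partial y} = 5$
$\frac{dz}{dt} = \frac{\partial z}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial z}{\partial y} \cdot \frac{dy}{dt}$
$= 6t + 60t^2$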
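The chain rule result can be compared with a direct numerical derivative. In the sketch below the functions z = 3x + 5y, x = t^2 and y = 4t^3 are assumptions chosen only because they reproduce the derivatives used in the example (dx/dt = 2t, dy/dt = 12t^2, dz/dx = 3, dz/dy = 5).

```python
# Chain rule dz/dt = (dz/dx)(dx/dt) + (dz/dy)(dy/dt) versus a finite difference.
# Assumed functions consistent with the example: z = 3x + 5y, x = t^2, y = 4t^3.

def x_of_t(t):
    return t**2

def y_of_t(t):
    return 4 * t**3

def z_of_xy(x, y):
    return 3 * x + 5 * y

def dz_dt_chain(t):
    """Chain rule value: 3 * 2t + 5 * 12t^2 = 6t + 60t^2."""
    return 3 * (2 * t) + 5 * (12 * t**2)

def dz_dt_numeric(t, h=1e-6):
    """Central difference of the composite function z(x(t), y(t))."""
    forward = z_of_xy(x_of_t(t + h), y_of_t(t + h))
    backward = z_of_xy(x_of_t(t - h), y_of_t(t - h))
    return (forward - backward) / (2 * h)

if __name__ == "__main__":
    print(dz_dt_chain(2.0), dz_dt_numeric(2.0))
```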
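If x and y are both multivariable functions, i.e. x = x(u,v) and y = y(u,v), with first order partial
derivatives at the point (u,v), and z = f(x, y) is differentiable at the point (x(u,v), y(u,v)), then
f(x(u,v), y(u,v)) has first order partial derivatives at (u,v) given by:
$\frac{\partial z}{\partial u} = \frac{\partial z}{\partial x} \cdot \frac{\partial x}{\partial u} + \frac{\partial z}{\partial y} \cdot \frac{\partial y}{\partial u}$
and
$\frac{\partial z}{\partial v} = \frac{\partial z}{\partial x} \cdot \frac{\partial x}{\partial v} + \frac{\partial z}{\partial y} \cdot \frac{\partial y}{\partial v}$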
Example:
;
Let z= where x(u,v) =
= 2xy and = x2
= +
.
= (2xy) ( ) + x2.0
= . 2xy. .
=
. xy. .
= (uv)1/2. . ..
= eu.
= . + .
=(2xy ). . + x2 .
( )
= (xy. - )
= ( – )
= ( –
= (0)
=0
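The chain rule can also be extended to a function z = f(x1, ..., xn) of several variables, each of which is a function of other
variables:
$x_1 = x_1(t_1, \ldots, t_m)$
$x_2 = x_2(t_1, \ldots, t_m)$
$\frac{\partial z}{\partial t_j} = \frac{\partial z}{\partial x_1} \cdot \frac{\partial x_1}{\partial t_j} + \frac{\partial z}{\partial x_2} \cdot \frac{\partial x_2}{\partial t_j} + \cdots + \frac{\partial z}{\partial x_n} \cdot \frac{\partial x_n}{\partial t_j}$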
For a function z = f(x,y), the partial derivative with respect to x gives the rate of change of f in the
x direction and the partial derivative with respect to y gives the rate of change of f in the y
direction. How do we compute the rate of change of f in an arbitrary direction? The rate of
change of a function of several variables in the direction u is called the directional derivative in
the direction u. Here u is assumed to be a unit vector.
Let z = f(x,y) have partial derivatives f'1(x,y) and f'2(x,y), and choose a particular point (x0, y0) in the
domain. Any nonzero vector (h,k) is then a direction in which we can move away from (x0, y0)
in a straight line to points of the form
$(x_0 + th, y_0 + tk)$.
Given the point (x0, y0) and the direction (h,k) ≠ (0,0), define the directional function g by
$g(t) = f(x_0 + th, y_0 + tk)$.
By using the chain rule, the derivative of this directional function can be calculated as
$g'(t) = f'_1(x_0 + th, y_0 + tk) h + f'_2(x_0 + th, y_0 + tk) k$.
If t = 0 then
$g'(0) = f'_1(x_0, y_0) h + f'_2(x_0, y_0) k$.
For the case when the vector (h,k) has length 1, the derivative of f in the direction (h,k) is called
the directional derivative of f in the direction (h,k) at (x0,y0). It is denoted by $D_{h,k} f(x_0, y_0)$. Hence,
the directional derivative of f(x,y) at (x0,y0) in the direction of the unit vector (h,k) (where $h^2 + k^2 = 1$)
is
$D_{h,k} f(x_0, y_0) = h f'_1(x_0, y_0) + k f'_2(x_0, y_0)$
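A unit step from (x0,y0) in the direction (h,k) changes the value of f by approximately $D_{h,k} f(x_0, y_0)$. The vector
(f1(x0,y0), f2(x0,y0)) is called the gradient of the function f(x,y) at (x0,y0). The directional derivative is therefore the
scalar product of the gradient with the vector (h,k).
Differentiating g'(t) again with respect to t, we get the second derivative of the directional function g,
i.e.,
$g''(t) = \frac{d}{dt} f_1(x, y) \cdot h + \frac{d}{dt} f_2(x, y) \cdot k$,
where x = x0 + th and y = y0 + tk. Applying the chain rule again, the above equation becomes:
$g''(t) = f_{11}(x, y) h^2 + 2 f_{12}(x, y) hk + f_{22}(x, y) k^2$,
where again x = x0 + th and y = y0 + tk. Setting t = 0 and assuming (h,k) has length 1, the second directional derivative of f at (x0,y0) in the direction (h,k)
becomes:
$D^2_{h,k} f(x_0, y_0) = f_{11}(x_0, y_0) h^2 + 2 f_{12}(x_0, y_0) hk + f_{22}(x_0, y_0) k^2$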
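First and second directional derivatives can be approximated by differentiating the directional function g(t) = f(x0 + th, y0 + tk) numerically. The sketch below uses a hypothetical function f(x, y) = x^2 + xy and the point (1, 2), neither of which comes from the lesson; they are chosen only so that the result is easy to verify by hand.

```python
import math

# Directional derivatives via the directional function g(t) = f(x0 + t*h, y0 + t*k).
# f is a hypothetical example: f1 = 2x + y, f2 = x, f11 = 2, f12 = 1, f22 = 0.

def f(x, y):
    return x**2 + x * y

def unit_vector(h, k):
    norm = math.hypot(h, k)
    return h / norm, k / norm

def g(t, point, direction):
    (x0, y0), (h, k) = point, direction
    return f(x0 + t * h, y0 + t * k)

def first_directional_derivative(point, direction, step=1e-5):
    """Central difference approximation of g'(0)."""
    return (g(step, point, direction) - g(-step, point, direction)) / (2 * step)

def second_directional_derivative(point, direction, step=1e-4):
    """Second central difference approximation of g''(0)."""
    return (g(step, point, direction) - 2 * g(0.0, point, direction)
            + g(-step, point, direction)) / step**2

if __name__ == "__main__":
    u = unit_vector(3.0, 4.0)  # (0.6, 0.8)
    print(first_directional_derivative((1.0, 2.0), u))
    print(second_directional_derivative((1.0, 2.0), u))
```

At (1, 2) in the direction (0.6, 0.8) the formulas give h f1 + k f2 = 0.6(4) + 0.8(1) = 3.2 and f11 h^2 + 2 f12 hk + f22 k^2 = 0.72 + 0.96 = 1.68.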
Compute the first and second directional derivatives of f at (xo yo) in the directions:
Solution: We have
and
If (h,k) = ( , - ); then
and
f(xo.yo)= = -1
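Suppose that x and y are related by the equation F(x,y) = 0, where y = f(x) is a
differentiable function of x. We find dy/dx by using the chain rule.
Consider the function
$F(x, f(x)) = 0$; differentiating with respect to x gives $F_x + F_y \cdot \frac{dy}{dx} = 0$,
then
$\frac{dy}{dx} = -\frac{F_x}{F_y}$, provided $F_y \neq 0$.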
Example:
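$y^3 + y^2 - 5y - x^2 + 4 = 0$.
Solution:
Define a function
$F(x,y) = y^3 + y^2 - 5y - x^2 + 4$.
$F_x(x,y) = \frac{\partial F}{\partial x} = -2x$.
$F_y(x,y) = \frac{\partial F}{\partial y} = 3y^2 + 2y - 5$
Hence $\frac{dy}{dx} = -\frac{F_x}{F_y} = \frac{2x}{3y^2 + 2y - 5}$.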
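The implicit derivative -F_x/F_y can be compared with a slope obtained by actually solving the equation for y near a point on the curve. The sketch below uses the point (x, y) = (2, 0), which satisfies y^3 + y^2 - 5y - x^2 + 4 = 0; the Newton solver and the function names are illustrative assumptions.

```python
# Implicit differentiation of F(x, y) = y^3 + y^2 - 5y - x^2 + 4 = 0.

def F(x, y):
    return y**3 + y**2 - 5 * y - x**2 + 4

def F_x(x, y):
    return -2 * x

def F_y(x, y):
    return 3 * y**2 + 2 * y - 5

def implicit_slope(x, y):
    """dy/dx = -F_x / F_y, valid where F_y != 0."""
    return -F_x(x, y) / F_y(x, y)

def solve_y(x, y_guess, iterations=50):
    """Solve F(x, y) = 0 for y with Newton's method, starting near y_guess."""
    y = y_guess
    for _ in range(iterations):
        y -= F(x, y) / F_y(x, y)
    return y

def numeric_slope(x, y, h=1e-4):
    """Slope of the implicit curve from nearby solutions of F(x, y) = 0."""
    return (solve_y(x + h, y) - solve_y(x - h, y)) / (2 * h)

if __name__ == "__main__":
    print(implicit_slope(2.0, 0.0), numeric_slope(2.0, 0.0))
```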
Let us assume that there is an implicit function of three variables, i.e. F(x, y, v) = 0.
To get a derivative, hold one of the variables constant. Suppose v is constant; then dv = 0 and
$0 = \frac{\partial F}{\partial x} dx + \frac{\partial F}{\partial y} dy$. Rearranging gives the same expression as in the two
variable case; because one variable was held constant, the derivative is a partial derivative, so the notation has
to change. Therefore,
$\frac{\partial y}{\partial x} = -\frac{\partial F / \partial x}{\partial F / \partial y}$
$\frac{\partial y}{\partial x} = -\frac{F_x}{F_y}$.
Example:
Fx= ∂v/∂x=1, Fz = 2z -3
Fy= -2.
= = .
Where Y= 10,000, P2= 200 and P1= 100. Find the income elasticity of demand and cross
elasticity of demand for first commodity.
Solution:
ey= /
= ( )
= 0.1 ……..(1)
ec = ( )
= 1.5
2.3.b Implicit function theorem (for 2 variables): Let F(x,y) = 0 be an implicit function with
continuous first derivatives, which is satisfied at some point (x0, y0) and is defined in some
neighborhood of this point. If Fy ≠ 0 at this point, then there is a function y = f(x) defined in some
neighborhood of x = x0, corresponding to the relationship defined by F(x,y) = 0, such that:
(i) y0 = f(x0)
(ii) f'(x0) = -Fx(x0, y0)/Fy(x0, y0).
Statement (for n variables): Let F(x1, x2, ..., xn, y) = 0 be an implicit function with continuous first derivatives,
which is satisfied at some point (x1, x2, ..., xn, y) and is defined in some neighborhood of this point.
If Fy ≠ 0 at this point, then there is a function y = f(x1, x2, ..., xn) defined in some neighborhood of
x = x0 = (x1, x2, ..., xn) such that
(i) y0 = f(x0)
(ii) fi(x0) = -Fxi/Fy.
Example: The Cobb-Douglas production function is $50 K^{0.3} L^{0.7} = Q$, where Q is a given level of
output, K is the amount of capital and L is the amount of labor. The isoquant associated with the
function reflects the levels of capital and labor that yield a constant level of output.
a. Use the Implicit Function Theorem to derive an equation for the slope of an isoquant
associated with the production function.
b. When K = 6 and L = 2, what is the slope of a line tangent to this isoquant? What is the
slope of the line when K = 3 and L = 14?
c. Find the MRTS for both examples in part (b).
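Solution:
a. Slope of isoquant $= \frac{dK}{dL} = -\frac{F_L}{F_K} = -\frac{35 K^{0.3} L^{-0.3}}{15 K^{-0.7} L^{0.7}}$
$= -\frac{7K}{3L}$
b. When K = 6 and L = 2: slope of isoquant $= -\frac{7(6)}{3(2)} = -7$
When K = 3 and L = 14: slope $= -\frac{7(3)}{3(14)} = -\frac{1}{2}$.
c. i) MRTS $= |-7| = 7$
ii) MRTS $= \frac{1}{2}$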
Note: The marginal rate of technical substitution (MRTS) is the rate at which the two production
inputs can be substituted if output is held constant. It is the absolute value of the slope of the
isoquant.
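The slope of the isoquant and the MRTS for Q = 50 K^0.3 L^0.7 can be evaluated directly from the marginal products. This is a minimal sketch; the function names are illustrative.

```python
# Isoquant slope dK/dL = -F_L / F_K and MRTS for Q = 50 * K^0.3 * L^0.7.

def isoquant_slope(K, L):
    marginal_product_labor = 35 * K**0.3 * L**(-0.3)    # dQ/dL
    marginal_product_capital = 15 * K**(-0.7) * L**0.7  # dQ/dK
    return -marginal_product_labor / marginal_product_capital

def mrts(K, L):
    """The MRTS is the absolute value of the isoquant slope."""
    return abs(isoquant_slope(K, L))

if __name__ == "__main__":
    print(isoquant_slope(6, 2), isoquant_slope(3, 14))
    print(mrts(6, 2), mrts(3, 14))
```

The ratio simplifies to -(7/3)(K/L), so the slope depends only on the capital-labor ratio.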
σk,L = = =
Example: The implicit function U AB shows what combinations of apples (A) and bananas
(B) provide the levels of utility U. Find the derivative of the implicit function to determine the
MRS of apples for bananas (MRSAB).
Solution: Given the utility function:
U=
Slope of IC = =-
MRS A,B=
Note: The absolute value of the slope of the indifference curve is the marginal rate of substitution
(MRS), which measures the rate at which one good can be substituted for another, while
maintaining the same level of utility.
If the utility function is
u(x, y) = xy,
which is a homogeneous function of degree 2, then the monotonic transformations
$g_1(z) = z + 1$;
$g_2(z) = z^2 + z$;
$g_3(z) = \log z$
generate the following homothetic (but not homogeneous) functions:
$v_1(x, y) = xy + 1$;
$v_2(x, y) = x^2 y^2 + xy$;
$v_3(x, y) = \log x + \log y$.
Example
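For the function $f(x_1, x_2) = A x_1^a x_2^b$, test the homogeneity of the function.
Solution:
$f(tx_1, tx_2) = A (tx_1)^a (tx_2)^b = A t^{a+b} x_1^a x_2^b = t^{a+b} f(x_1, x_2)$, so f is homogeneous of degree a + b.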
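Homogeneity can also be probed numerically: if f(tx) = t^k f(x), then log(f(tx)/f(x)) / log t equals k for every scale t and every point x. The sketch below assumes f is positive at the test points; the parameter values A = 2, a = 0.3, b = 0.5 and the helper names are illustrative.

```python
import math

# Numerical probe of homogeneity: the implied degree must be the same
# for all scales t and all test points (requires f > 0 at those points).

def estimated_degree(f, point, t):
    scaled = [t * v for v in point]
    return math.log(f(*scaled) / f(*point)) / math.log(t)

def homogeneity_degree(f, points, scales=(2.0, 3.0), tol=1e-9):
    """Return the common degree k, or None if the estimates disagree."""
    degrees = [estimated_degree(f, p, t) for p in points for t in scales]
    if max(degrees) - min(degrees) < tol:
        return degrees[0]
    return None

def cobb_douglas(x1, x2, A=2.0, a=0.3, b=0.5):
    return A * x1**a * x2**b

def shifted_product(x, y):
    return x * y + 1  # homothetic, but not homogeneous

if __name__ == "__main__":
    pts = [(1.0, 1.0), (2.0, 5.0)]
    print(homogeneity_degree(cobb_douglas, pts))     # close to a + b = 0.8
    print(homogeneity_degree(shifted_product, pts))  # None
```

Agreement across a few points and scales is only evidence, not a proof, of homogeneity.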
Example: Given the function, check whether function is homogeneous functions or not
= α10 x5 y2 z3
Now differentiate both sides of this equation with respect to xi, to get
$t f'_i(tx_1, \ldots, tx_n) = t^k f'_i(x_1, \ldots, x_n)$,
If the function z = f(x,y) is homogeneous of degree n, then according to Euler's theorem:
$x \cdot \frac{\partial f}{\partial x} + y \cdot \frac{\partial f}{\partial y} = n \cdot f(x,y)$
Example:
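Use Euler's theorem to determine the degree of homogeneity of the function $f(x,y) = 2x^2 + xy - y^2$.
$\frac{\partial f}{\partial x} = f_x(x,y) = 4x + y$
$\frac{\partial f}{\partial y} = f_y(x,y) = x - 2y$
$x \frac{\partial f}{\partial x} + y \frac{\partial f}{\partial y} = n f(x,y)$
$= x(4x + y) + y(x - 2y)$
$= 4x^2 + xy + xy - 2y^2$
$= 4x^2 + 2xy - 2y^2$
$= 2(2x^2 + xy - y^2) = 2 f(x,y)$, so n = 2.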
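Euler's theorem gives a direct way to read off the degree: (x f_x + y f_y) / f should equal n wherever f is nonzero. The sketch below assumes the function is f(x, y) = 2x^2 + xy - y^2, which is the function consistent with the partial derivatives 4x + y and x - 2y in the example.

```python
# Euler's theorem check: (x*f_x + y*f_y) / f equals the degree of homogeneity.
# Assumed function consistent with the example: f = 2x^2 + xy - y^2.

def f(x, y):
    return 2 * x**2 + x * y - y**2

def f_x(x, y):
    return 4 * x + y

def f_y(x, y):
    return x - 2 * y

def euler_degree(x, y):
    """Degree n implied by Euler's theorem at a point where f(x, y) != 0."""
    return (x * f_x(x, y) + y * f_y(x, y)) / f(x, y)

if __name__ == "__main__":
    print(euler_degree(3.0, 1.0), euler_degree(1.5, -2.0))
```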
Example:
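Use Euler's theorem to determine the degree of homogeneity of the following function
$f(L,K) = A L^{\alpha} K^{\beta}$
$\frac{\partial f}{\partial L} = \alpha A L^{\alpha - 1} K^{\beta}$
$\frac{\partial f}{\partial K} = \beta A L^{\alpha} K^{\beta - 1}$
By Euler's theorem
$L \frac{\partial f}{\partial L} + K \frac{\partial f}{\partial K} = n f(L,K)$
$= (\alpha + \beta)(A L^{\alpha} K^{\beta})$
$= (\alpha + \beta) f(L,K)$, so n = α + β.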
Example: Suppose that f (x1, ..., xn) is homogeneous of degree r. Show that each of the
following functions h(x1, ..., xn) is homogeneous, and find the degree of homogeneity.
Solution:
a. Is the function $(x^3 - y^3)/(x^{1/2} + y^{1/2})$ homogeneous of any degree? (If so, which
degree?)
b. Is the function $x^3 y^3 + x^{1/2}$ homogeneous of any degree? (If so, which degree?)
c. A consumer's (differentiable) demand function for some good is f(p1, ..., pn, w),
where pi is the price of the ith good, and w is the consumer's wealth. This function
f is homogeneous of degree 0. Is there any necessary relationship between
$\sum_{i=1}^{n} p_i f'_i(p_1, \ldots, p_n, w)$ and $w f'_{n+1}(p_1, \ldots, p_n, w)$?
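Solution: a. Given the function $f(x,y) = (x^3 - y^3)/(x^{1/2} + y^{1/2})$, we have $f(tx, ty) = ((tx)^3 - (ty)^3)/((tx)^{1/2} + (ty)^{1/2}) = t^{5/2} f(x,y)$, so f is homogeneous of degree 5/2.
c. Given the function f(p1, ..., pn, w), which is homogeneous of degree 0, Euler's theorem gives
$\sum_{i=1}^{n} p_i f'_i(p_1, \ldots, p_n, w) + w f'_{n+1}(p_1, \ldots, p_n, w) = 0$. (Note that f has n + 1 arguments.)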
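$Q = A K^{\alpha} L^{\beta}$
$\frac{\partial Q}{\partial K} = \alpha A K^{\alpha - 1} L^{\beta}$
$\frac{\partial Q}{\partial L} = \beta A K^{\alpha} L^{\beta - 1}$
$K \frac{\partial Q}{\partial K} + L \frac{\partial Q}{\partial L} = (\alpha + \beta)(A K^{\alpha} L^{\beta})$
$= (\alpha + \beta) Q$
a. If (α + β) = 1, then $K \frac{\partial Q}{\partial K} + L \frac{\partial Q}{\partial L} = Q$. If K and L are doubled, i.e. to 2K and 2L, then
output becomes $2^{\alpha + \beta} Q$. With (α + β) = 1, output also doubles, so there are constant returns to scale in production.
If (α + β) > 1, output more than doubles, i.e.
there are increasing returns to scale. If (α + β) < 1, output less than doubles, i.e. there are
decreasing returns to scale in production.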
Note: A proportional increase in all the values of inputs in a production function increases the
scale of production. If there are constant returns to scale, then output will increase equi-
proportionally to the increase in all inputs. If there are increasing returns to scale, an increase in
all inputs will lead to a more than proportionate increase in output. If there are decreasing
returns to scale, then output will increase less than proportionately with an increase in all inputs.
Example: Consider the following Cobb-Douglas production function, which is homogeneous of
degree 1 in capital and labor: $Q = 50 K^{0.4} L^{0.6}$. The value of the output (Q) includes the payment
made to labor, i.e. the wages paid to labor (wL), which is equal to $\frac{\partial Q}{\partial L} \cdot L$ in a competitive
labor market. The value of the output also includes the payment made to the suppliers of capital
(rK), which is equal to $\frac{\partial Q}{\partial K} \cdot K$. Show that the sum of the total factor payments (wL + rK) equals
the value of the output, i.e. wL + rK = Q, such that wL + rK = αQ + (1 - α)Q, where α = 0.6.
Solution: Given the production function
$Q = 50 K^{0.4} L^{0.6}$
$wL = \frac{\partial Q}{\partial L} \cdot L = 30 K^{0.4} L^{-0.4} \cdot L = 0.6Q$
$rK = \frac{\partial Q}{\partial K} \cdot K = 20 K^{-0.6} L^{0.6} \cdot K = 0.4Q$
Therefore,
$wL + rK = 0.6Q + 0.4Q = Q$
Hence proved.
Example: Given the following production function; find out the elasticity of substitution:
Solution: Partially differentiating the function $z = A(aK^{-\rho} + bL^{-\rho})^{-m/\rho}$ with respect to L and K
respectively,
therefore,
MRTSK,L = RK,L=
= (RK,L)1/( ρ+1)
Hence,
σK,L= ElRk,L( = .
Example: Without solving the equation, show that $2x^2 + 5xy + y^2 = 19$ defines an implicit function
y(x) for which y(2) = 1, and find dy/dx when x = 2. Express the answer in geometrical terms.
Solution: Putting x = 2 and y = 1 in the given equation $2x^2 + 5xy + y^2 = 19$, we see that the equation is satisfied, since 8 + 10 + 1 = 19. With $F(x,y) = 2x^2 + 5xy + y^2 - 19$,
$\frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{4x + 5y}{5x + 2y} = -13/12$
when (x,y) = (2,1). In geometrical terms, this means that the slope of the contour $2x^2 + 5xy + y^2 = c$
which passes through the point (2,1) is -13/12 at that point.
= a – a(1)
=0
Example: The twice-differentiable function f(x, y) is homogeneous of degree k, and its second
derivatives are continuous. Show that
$x^2 f''_{11}(x, y) + 2xy f''_{12}(x, y) + y^2 f''_{22}(x, y) = k(k - 1) f(x, y)$ for all (x, y).
Solution: We know that f is homogeneous of degree k, which means that f'1 and f'2 are
homogeneous of degree k - 1. Thus, by Euler's theorem applied to f'1 and to f'2, we have
$x f''_{11}(x, y) + y f''_{12}(x, y) = (k - 1) f'_1(x, y)$ and $x f''_{21}(x, y) + y f''_{22}(x, y) = (k - 1) f'_2(x, y)$.
Multiplying the first equation by x and the second by y, adding, and using $f''_{12} = f''_{21}$ gives
$x^2 f''_{11}(x, y) + 2xy f''_{12}(x, y) + y^2 f''_{22}(x, y) = (k - 1)[x f'_1(x, y) + y f'_2(x, y)]$ for all (x, y).
Finally, the term in brackets on the right-hand side of this equation is equal to k f(x,y) by
Euler's theorem, yielding the required result.
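The second order identity can be spot-checked on a concrete homogeneous function. The sketch below uses the hypothetical function f(x, y) = x^2 y, which is homogeneous of degree k = 3 and is not taken from the lesson; its second derivatives are f11 = 2y, f12 = 2x and f22 = 0.

```python
# Spot check of x^2*f11 + 2xy*f12 + y^2*f22 = k(k - 1)*f
# for the hypothetical function f(x, y) = x^2 * y, homogeneous of degree k = 3.

K_DEGREE = 3

def f(x, y):
    return x**2 * y

def second_partials(x, y):
    f11 = 2 * y
    f12 = 2 * x
    f22 = 0.0
    return f11, f12, f22

def left_hand_side(x, y):
    f11, f12, f22 = second_partials(x, y)
    return x**2 * f11 + 2 * x * y * f12 + y**2 * f22

def right_hand_side(x, y):
    return K_DEGREE * (K_DEGREE - 1) * f(x, y)

if __name__ == "__main__":
    print(left_hand_side(2.0, 3.0), right_hand_side(2.0, 3.0))
```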
Example: A firm uses two inputs to produce a single output. Its production function f is
homogeneous of degree 1. An implication of the homogeneity of f, which you are not asked to
prove, is that the partial derivatives f'x and f'y with respect to the two inputs are homogeneous
of degree zero. Use Euler's theorem to find an expression for the cross partial derivative
f''xy(x, y) in terms of x, y, and f''xx(x, y).
Solution: Applying Euler's theorem to f'x, which is homogeneous of degree zero, gives
x f''xx(x, y) + y f''xy(x, y) = 0,
so that f''xy(x, y) = -(x/y) f''xx(x, y).
3. Exercise:
1. Given Q=440-8P +0.05 Y, where P=15 and Y=12,000. Find the income and price
elasticity of demand.
2. Given Q1= 110-P1+0.75 P2-0.25 P3+0.0075 Y. At P1=10, P2=20, P3=40, and Y=10,000,
Q1=170. Find the different cross elasticities of demand.
f(x,y,w)= 3x2y -
a) z = 10x + 5y
b) $z = x^2 + 5xy + 12y^2$
c) $z = x^{0.3} + y^{0.4}$
d) $z = 10x^5 + 10x^2 y^3 + y^5$
5. Assume the demand for sugar is a function of income (Y), the price of sugar (Ps) and the
price of saccharine (Pc), a sugar substitute, as follows:
$Q_d = f(Y, P_c, P_s) = 0.05Y + 10P_c - 5P_s^2$.
d. Find the cross-price elasticity of demand, $\frac{\partial Q_d}{\partial P_c} \cdot \frac{P_c}{Q_d}$, when Y = 10,000, Ps = 5 and Pc = 7.
6. Consider the production function y = f(x1,x2) = x1 x2 defined over the domain x1 > 0 and
x2 > 0. Also, consider the functions g(y) = ln(y) and $j(y) = y^2$.
Solution:
b. homogenous of degree 2
c. not homogenous
5. Answers:
a. $\frac{\partial Q_d}{\partial Y} = 0.05$; $\frac{\partial Q_d}{\partial P_c} = 10$; $\frac{\partial Q_d}{\partial P_s} = -10P_s$
b. 1.12
c. - 0.56
d. 0.16
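The elasticities listed above can be reproduced in a few lines. The sketch assumes that the sugar demand function reads Qd = 0.05Y + 10Pc - 5Ps^2, which is the reading consistent with these answers, and evaluates it at Y = 10,000, Ps = 5 and Pc = 7; the function names are illustrative.

```python
# Partial elasticities of the assumed demand function
# Qd = 0.05*Y + 10*Pc - 5*Ps^2 at Y = 10000, Ps = 5, Pc = 7.

def quantity_demanded(Y, Pc, Ps):
    return 0.05 * Y + 10 * Pc - 5 * Ps**2

def elasticities(Y, Pc, Ps):
    q = quantity_demanded(Y, Pc, Ps)
    income = 0.05 * Y / q             # (dQd/dY) * (Y / Qd)
    own_price = (-10 * Ps) * Ps / q   # (dQd/dPs) * (Ps / Qd)
    cross_price = 10 * Pc / q         # (dQd/dPc) * (Pc / Qd)
    return income, own_price, cross_price

if __name__ == "__main__":
    print(quantity_demanded(10000, 7, 5))
    print(elasticities(10000, 7, 5))
```

A positive cross-price elasticity is what one expects for a substitute such as saccharine.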
5. Answers:
a. homogeneous of degree 2
b. homothetic; not homogeneous in x1 and x2
c. homothetic; homogeneous of degree 4 in x1 and x2
6. Answers:
d. homogeneous of degree 2
e. homothetic; not homogeneous in x1 and x2
f. homothetic; homogeneous of degree 4 in x1 and x2
7. Answers:
a. homogeneous of degree 7/12 (i.e., k=1/12)
and take the partial derivative of the production function with respect to x2 and show that
f2(x1,x2) is homogeneous of degree -5/12.
c. Apply Euler's Theorem.
4. References:
1. K. Sydsaeter and P. Hammond, Mathematics for Economic Analysis, Pearson Educational
Asia, Delhi, 2002.
2. M. Hoy et al., Mathematics for Economics, PHI Learning Private Limited, Delhi, Second
Edition, 2001.
3. J.E. Draper and J.S. Klingman, Mathematical Analysis: Business and Economic
Applications, Harper & Row Publishers, New York, 1967.
4. M. Rosser, Basic Mathematics for Economists, Second Edition, London, 2003.
DC-1
SEM-II
University of Delhi
1. Learning outcomes
2. Introduction
3. Derivative Test for Concavity and Convexity
4. Second Derivative and Concavity and Convexity
4.1 Total Differential Method
4.2 Definitions
4.3 Use of Hessian Matrix for the determination of Convexity and Concavity
After you have read this chapter, you should be able to:-
2. Introduction
A function f is concave if and only if, for any pair of distinct points p and R in the domain of f
and for 0 < θ < 1,
$f(\theta p + (1 - \theta) R) \ge \theta f(p) + (1 - \theta) f(R)$
where $p = (x_1^0, x_2^0)$ and $R = (x_1^1, x_2^1)$.
The definition can be extended to strict concavity by changing the weak inequality ≥ to
the strict inequality >.
A function f is convex if and only if, for any pair of distinct points p and R in the domain of f and
for 0 < θ < 1,
$f(\theta p + (1 - \theta) R) \le \theta f(p) + (1 - \theta) f(R)$
The right hand side is the height of the line segment AB and the left hand side is the height of
the arc AB.
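The defining inequality for concavity can be tested along a segment by sampling values of θ. The sketch below uses two hypothetical test functions, a dome -(x1^2 + x2^2) and a bowl x1^2 + x2^2; sampling can refute concavity on a segment but cannot prove it in general.

```python
# Sampled check of f(theta*p + (1-theta)*R) >= theta*f(p) + (1-theta)*f(R)
# for theta strictly between 0 and 1, along the segment from p to R.

def is_concave_on_segment(f, p, r, steps=99, tol=1e-12):
    for i in range(1, steps + 1):
        theta = i / (steps + 1)
        point = tuple(theta * a + (1 - theta) * b for a, b in zip(p, r))
        chord_height = theta * f(*p) + (1 - theta) * f(*r)
        if f(*point) < chord_height - tol:
            return False
    return True

def dome(x1, x2):
    return -(x1**2 + x2**2)  # hypothetical concave example

def bowl(x1, x2):
    return x1**2 + x2**2     # hypothetical convex example

if __name__ == "__main__":
    print(is_concave_on_segment(dome, (0.0, 0.0), (2.0, 4.0)))
    print(is_concave_on_segment(bowl, (0.0, 0.0), (2.0, 4.0)))
```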
Figure 1
ƶ = f(x1, x2)
The function f(x,y) is concave (convex) if and only if, for any pair of distinct points A and
B on its graph (a surface), the line segment AB lies either on or below (above) the surface.
Strict concavity requires that the line segment AB lie strictly below the arc
AB except at the points A and B; imagine a dome-shaped surface. The surface of a strictly convex function is typically bowl-shaped.
For non-strictly concave and convex functions the line segment AB is allowed to
lie on the surface itself; some portion of the surface, or even the entire surface, may be
flat rather than curved.
Figure 2
In the case of functions of two or more variables, it becomes difficult to use the
diagrammatic method or the algebraic method to determine the concavity or convexity of a
function, since applying the algebraic definition typically requires a lot of algebraic manipulation.
A way out is to use derivatives, if the function is
differentiable.
A differentiable function f(x) = f(x1, x2, ..., xn) is concave if and only if, for any given point
$p = (x_1^0, x_2^0, \ldots, x_n^0)$ and any other point $R = (x_1^1, x_2^1, \ldots, x_n^1)$ in its convex domain,
$f(R) \le f(p) + \sum_{i=1}^{n} f_i(p)(x_i^1 - x_i^0)$
Geometrically, this means that for a concave function the tangent plane at the point p on the
graph of the function lies on or above the graph of the function.
In the case of a convex function, the graph of the function lies on or above all the tangent
planes (or hyperplanes); for a strictly convex function it lies strictly above them, except at the point of tangency.
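Example
$z = x_1^2 + x_2^2$
The function is convex if, for all X = (x1, x2) and Y = (y1, y2),
$f(Y) - f(X) \ge \frac{\partial f}{\partial x_1}(y_1 - x_1) + \frac{\partial f}{\partial x_2}(y_2 - x_2)$
$y_1^2 + y_2^2 - x_1^2 - x_2^2 \ge 2x_1 y_1 - 2x_1^2 + 2x_2 y_2 - 2x_2^2$ (where $\frac{\partial f}{\partial x_1} = 2x_1$ and $\frac{\partial f}{\partial x_2} = 2x_2$)
$(y_1 - x_1)^2 + (y_2 - x_2)^2 \ge 0$
The left hand side is a sum of squares and so is never negative, whatever the values of (x1, x2) and
(y1, y2). This proves that the function is convex.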
So far we have discussed the curvature properties of a function using algebra
or the first derivative. The concavity and convexity of a function are usually discussed using the
second derivative. The second derivative shows how the function represented by the first
derivative changes. In the case of a function of one variable, we saw that if f'' > 0 the function is convex,
which means that for f' > 0 the function value increases more rapidly as x increases, while for
f' < 0 the function value falls less quickly. For f'' < 0 the function is concave, which means
that for f' > 0 the function value increases less quickly as x increases, while for f' < 0 the
function value falls more quickly.
We cannot use this method directly to determine concavity and convexity for functions of two
variables (or n variables). The second partial derivatives f11 and f22 alone are not enough, because
there are infinitely many paths that one can take from the same point. For example, for
$z = x_1^2 + x_2^2 - 20 x_1 x_2$
although f11 and f22 are positive, the function is not strictly convex in all directions. The cross partial
derivative also plays a role in determining the curvature of the function.
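The role of the cross partial derivative can be seen by computing the curvature in different directions. The sketch below assumes the example function is z = x1^2 + x2^2 - 20 x1 x2, so that f11 = f22 = 2 and f12 = -20; the second directional derivative in the unit direction (h, k) is f11 h^2 + 2 f12 hk + f22 k^2.

```python
import math

# Curvature of the assumed function z = x1^2 + x2^2 - 20*x1*x2 in a given direction.
F11, F12, F22 = 2.0, -20.0, 2.0  # constant second partial derivatives

def directional_curvature(h, k):
    """Second directional derivative along the normalized direction (h, k)."""
    norm = math.hypot(h, k)
    h, k = h / norm, k / norm
    return F11 * h * h + 2 * F12 * h * k + F22 * k * k

if __name__ == "__main__":
    print(directional_curvature(1.0, 0.0))   # along the x1 axis
    print(directional_curvature(1.0, 1.0))   # along the diagonal x1 = x2
    print(directional_curvature(1.0, -1.0))  # along the diagonal x1 = -x2
```

The curvature is positive along each axis but negative along the diagonal x1 = x2, so the function is neither convex nor concave.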
In order to determine the concavity (convexity) of functions of two variables (the
approach can be extended to n variables also), we shall use the method of total
differentials.
Let y = f(x). Then dy = f'(x) dx and
$d^2 y = d(dy) = \frac{d(dy)}{dx} dx$
$= \frac{d[f'(x) dx]}{dx} dx$
$= \left[ f''(x) dx + f'(x) \frac{d(dx)}{dx} \right] dx$
$= f''(x) dx^2$
since dx is treated as a constant, so that d(dx)/dx = 0. This is called the second total differential of f(x). Since the term $dx^2 = (dx)^2$ is strictly
positive for any value of dx ≠ 0, it follows that $d^2 y$ has the same sign as f''(x). Therefore the
determination of convexity and concavity, which relies on the sign of f''(x), can be
presented using the sign of $d^2 y$: a function is convex if f''(x) ≥ 0, i.e. $d^2 y \ge 0$, and concave if f''(x) ≤ 0,
i.e. $d^2 y \le 0$.
The same conditions relating the sign of $d^2 y$ to concavity/convexity apply to functions
of n variables. Here we shall explain this method for two variables.
$y = f(x_1, x_2)$
$dy = \frac{\partial f}{\partial x_1} dx_1 + \frac{\partial f}{\partial x_2} dx_2$
$= f_1 dx_1 + f_2 dx_2$
$d(dy) = d^2 y = \frac{\partial (dy)}{\partial x_1} dx_1 + \frac{\partial (dy)}{\partial x_2} dx_2$
$= f_{11} dx_1^2 + f_{21} dx_1 dx_2 + f_{12} dx_1 dx_2 + f_{22} dx_2^2$
$= f_{11} dx_1^2 + 2 f_{12} dx_1 dx_2 + f_{22} dx_2^2$ (since $f_{12} = f_{21}$)
The expression makes it clear that $d^2 y$ depends on the cross partial derivative f12 as well as on
f11 and f22.
4.2 Definitions
Def: A twice continuously differentiable function y = f(x1, x2) is concave if and only if
$d^2 y$ is everywhere negative semidefinite.
Def: A twice continuously differentiable function y = f(x1, x2) is convex if and only if $d^2 y$
is everywhere positive semidefinite.
If the second order total differential satisfies the strict condition $d^2 y < 0$ ($d^2 y > 0$) for all (dx1, dx2) ≠ (0, 0), then the function is
strictly concave (strictly convex).
Determining the sign of $d^2 y$ directly can involve a lot of algebraic
manipulation even when the function is a function of two variables. In an earlier topic
dealing with maxima and minima we used quadratic forms and their properties to
determine maxima and minima. Here also we can use the same method to determine the
sign of $d^2 y$:
$d^2 y = \begin{pmatrix} dx_1 & dx_2 \end{pmatrix} \begin{pmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{pmatrix} \begin{pmatrix} dx_1 \\ dx_2 \end{pmatrix}$
We know that f12 = f21 from Young's theorem. It follows that the 2×2 matrix is symmetric.
This matrix, whose elements are the second order partial derivatives and cross partial
derivatives, is called the Hessian matrix and is denoted by H. The Hessian matrix can be used
to determine the concavity and convexity of the function.
4.3 Use of Hessian Matrix for the determination of Convexity and Concavity
Def: For any function y = f(x1, x2, ..., xn) = f(X), where X = (x1, x2, ..., xn), which is twice
differentiable with Hessian H, the second order total differential is
$d^2 y = dX^T H \, dX$
The function f is strictly concave on Rn if H is negative
definite for all X in Rn, that is,
$dX^T H \, dX < 0$ for all dX ≠ 0.
The function f is strictly convex on Rn if H is positive definite for all X ∈ Rn,
that is,
$dX^T H \, dX > 0$ for all dX ≠ 0.
The Hessian H is positive definite if and only if all the leading principal minors of the matrix H
are positive.
For example, if y = f(x1, x2, x3) the leading principal minors are
$|H_1| = f_{11}$, $|H_2| = \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix}$, $|H_3| = |H| = \begin{vmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{vmatrix}$
If |H1| > 0, |H2| > 0 and |H3| > 0, then f(x1, x2, x3) is strictly convex (the Hessian
is positive definite).
If |H1| < 0, |H2| > 0 and |H3| < 0, then H is negative definite, and f(x1, x2, x3) is strictly concave.
H is positive definite on Rn if and only if its leading principal minors are all positive: |H1| > 0,
|H2| > 0, ..., |Hn| = |H| > 0 for X ∈ Rn.
H is negative definite on Rn if and only if its leading principal minors alternate in sign,
beginning with a negative value for |H1|:
|H1| < 0, |H2| > 0, ..., with |Hn| = |H| > 0 if n is even and |Hn| = |H| < 0 if n is odd.
Note: A leading principal minor of order r of the Hessian matrix is found by suppressing the
last n - r rows and columns.
Example: In the case of a 3×3 matrix, |H1| is found by suppressing the 2nd and 3rd rows and
columns:
|H1| = f11
The leading principal minor of order 2, |H2|, is found by suppressing the third row and third
column.
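The leading principal minor test can be sketched as follows. The determinant is computed by cofactor expansion, which is adequate for the small matrices used here; the function names and the sample matrices are illustrative assumptions.

```python
# Leading principal minors of a square matrix and the resulting definiteness test.

def determinant(m):
    """Determinant by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * determinant(minor)
    return total

def leading_principal_minors(h):
    """|H1|, |H2|, ..., |Hn|: keep the first r rows and columns for each r."""
    return [determinant([row[:r] for row in h[:r]]) for r in range(1, len(h) + 1)]

def classify_hessian(h):
    minors = leading_principal_minors(h)
    if all(m > 0 for m in minors):
        return "positive definite"
    # Negative definite: minors alternate in sign, starting with |H1| < 0.
    if all((m < 0 if r % 2 == 1 else m > 0) for r, m in enumerate(minors, start=1)):
        return "negative definite"
    return "not definite"

if __name__ == "__main__":
    print(classify_hessian([[2, 0, 0], [0, 2, 0], [0, 0, 2]]))
    print(classify_hessian([[-2, 0, 0], [0, -2, 0], [0, 0, -2]]))
    print(classify_hessian([[2, -20], [-20, 2]]))
```

A result of "not definite" does not distinguish semidefinite from indefinite matrices; that requires checking all principal minors, not only the leading ones.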
So far we have given the conditions for strict concavity and strict convexity. There are
functions which are concave or convex without being strictly so.
A twice differentiable function y = f(x1, x2, ..., xn) is concave if and only if $d^2 y$ is everywhere
negative semidefinite.
In terms of the Hessian matrix, H is negative semidefinite on Rn if and only if all
its principal minors of order k alternate in sign, being negative or zero for odd k (beginning with k = 1) and positive or zero for even k.
A twice differentiable function y = f(x1, x2, ..., xn) is convex if and only if $d^2 y$ is
everywhere positive semidefinite.
In terms of H, this means that all its principal minors are positive or zero.
Note: Principal minors of order k are found by suppressing any n - k rows and the corresponding columns of H.
For n = 2, the principal minors of order 1 are f11 and f22, and the principal minor of order 2 is
$|H| = \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix}$
so the function is concave if and only if
f11 ≤ 0, f22 ≤ 0 and $\begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix} \ge 0$.
If we choose any two points b and d on the line and connect them by a line segment, the segment
connecting b and d also lies on the straight line which we drew in the
beginning: all the points on the segment connecting b and d lie on the original line.
Now look at the circle. Choose any two points x1 and x2 on or inside the circle and connect them by a
straight line. This straight line also lies within the circle.
We are now in a position to define the property shared by the straight line and the circle shown
above.
A set is convex if the line segment joining any two points of the set lies entirely within the set.
The straight line, and the circle together with the area within it, are examples of
convex sets.
Figure 4
In fig. 4, we have drawn the first quadrant of the Euclidean space. If we take two points
a1 and a2 in the figure and connect them by a straight line, then the entire line lies within
the quadrant. For example, the point z = (1/3)a1 + (2/3)a2 lies on the
straight line connecting the two points a1 and a2. Any point on the straight line
connecting points a1 and a2 can be expressed as z = λa1 + (1 – λ)a2, where 0 ≤ λ ≤ 1.
Def: A set S is convex if for every pair of points x1 ∈ S and x2 ∈ S, the point x̄ = λx1 + (1 –
λ)x2 is also an element of S, for every value of λ with 0 ≤ λ ≤ 1.
A set containing only one point is a convex set. The null set is also considered a convex
set.
Figure 5
The circle is not hollow. All the three figures given above are examples of convex sets.
Figure 6
In these figures there is a feature of reentrance (and also a hole). This is the cause of
non-convexity. To qualify as a convex set, the set of points in the figure must contain no
holes, and its boundary must not be reentrant anywhere.
A function which gives rise to a hill over its entire domain is a concave function. A
function which gives rise to a valley over its entire domain is a convex function.
If the hill (or valley) does not contain any flat surface then the function is said to be a
strictly concave (convex) function. A function which gives rise to a hill (or valley)
that also contains a flat surface is a concave function (convex function), but not strictly so.
4.5 Assumption
The domain of the function is a convex set. This assumption is necessary because we
use the combination of x1 and x2 in the domain D to prove whether the f is a concave or
convex function.
In fig. 7, let x ≥ 0 be the domain of the function. This domain is a convex set. Take
two values of x in the domain, x1 and x2. The associated values of the function are f(x1)
and f(x2); connect these two points by a straight line AB. The graph of the function is also
given in fig. 7. The graph of f is shown by the arc AB.
The straight line lies below the arc AB. The value of the function at x̄ between x1 and x2
is f(x̄) = C. This is higher than the point D on line AB directly above x̄. It
is clear that f(x̄) > D. This property can be expressed as strict concavity of the function.
C = f(λx1 + (1 – λ)x2)
In simple words, if we take any two points in the domain, where the domain is a convex
set, then every convex combination of these points is also in the domain of the function.
4.6 Theorem
A function f is strictly concave if C = f(λx1 + (1 – λ)x2) > λf(x1) + (1 – λ)f(x2) for all λ ∈ (0, 1) and all distinct x1, x2 ∈ D, an interval which is a
convex set.
In fig. 8, line AB lies entirely above the graph of the function except at points A and B.
Here f(x) is a convex function. A convex function bends below the line joining the points f(x1)
and f(x2) (AB).
It is a concave function if
A linear function is a convex and concave function because it satisfies the conditions of
both convex and concave function.
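The chord definitions above can be probed numerically. The sketch below is illustrative only: it checks the strict inequality on a finite grid of points and weights, so it can refute strict concavity or convexity but cannot prove it. The test functions and the grid are assumptions chosen for the illustration.

```python
import math


def chord_gap(f, x1, x2, lam):
    """f at the convex combination minus the height of the chord at the same point."""
    x_bar = lam * x1 + (1 - lam) * x2
    return f(x_bar) - (lam * f(x1) + (1 - lam) * f(x2))


def looks_strictly_concave(f, points, lams, tol=1e-12):
    # Graph must lie strictly above every chord at every sampled interior weight.
    return all(chord_gap(f, a, b, lam) > tol
               for a in points for b in points if a != b
               for lam in lams)


def looks_strictly_convex(f, points, lams, tol=1e-12):
    # f is strictly convex exactly when -f is strictly concave.
    return looks_strictly_concave(lambda x: -f(x), points, lams, tol)


points = [0.0, 0.5, 1.0, 2.0, 4.0]      # sample points in the convex domain x >= 0
lams = [0.1, 0.25, 0.5, 0.75, 0.9]      # interior weights, 0 < lambda < 1


def linear(x):
    return 3 * x + 1


print(looks_strictly_concave(math.sqrt, points, lams))
print(looks_strictly_convex(lambda x: x * x, points, lams))
print(looks_strictly_concave(linear, points, lams), looks_strictly_convex(linear, points, lams))
```

The linear function fails both strict tests because its graph coincides with every chord, which is why it is both concave and convex but strictly neither.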
Figure 9a Figure 9b
Figure9c
For a convex function the inequality is reversed. In fig 9(a) the function is convex and in
fig 9(b) the function is concave. In fig 9(a) look at all the points above the graph of the
function and below the straight line k parallel to the x-axis. This set of points satisfies the
above definition and is in fact a convex set. Similarly, the shaded area in fig 9(b) is a
convex set. The function depicted in fig 9(a) is a convex function on the convex domain [a,
b].
Now observe fig 9(c). There is a tangent at point x0. The tangent line h at any point of a
concave function will lie above the graph of the function (except at the point (x0, f(x0))). For a convex function
the tangent line at any point x0 will lie below the graph of the function.
For a differentiable concave function this means f(x) ≤ f(x0) + f'(x0)(x – x0) for all x in the domain.
The right hand side is the equation of the tangent to the function at x0. If we
move slightly away from point x0 on either side of x0, the tangent line at point x0 lies
above the graph of the function.
In the case of a convex function (if the function is differentiable) the tangent line at (x0,
f(x0)) will lie below the graph of the function except at the point (x0, f(x0)).
If the function is differentiable twice then we can use the second derivative to test the
concavity and convexity.
f'(x) > 0 on an interval (a, b) means that the function is increasing on (a, b)
f'(x) < 0 on an interval (a, b) means that the function is decreasing on (a, b)
If f''(x) > 0 it means the slope of the tangent is increasing as we move from left to right
on the graph. In the fig (a) the slope of the tangent is increasing when we move from x1
to x2, where x1 < x2. This happens when the function is convex.
If f''(x) < 0 on (a, b) then the tangent becomes flatter as we move towards x0 from the left. It
means the slope of the tangent is decreasing as we move from left to right on the graph.
In other words, when we move from x1 to x2, where x1 < x2, the slope of the tangent is
decreasing.
We conclude that f is strictly concave on an interval I if f''(x) < 0 for all x in the
interior of I.
Similarly, f is strictly convex on an interval I if f''(x) > 0 for all x in the interior of I.
These conditions are sufficient but not necessary: f(x) = x4 is strictly convex although f''(0) = 0.
Example
Y X
(y + y ) – (x + x ) ≥ (2x1 2x2)
Y X
= 2x1y1 – 2x + 2x2y2 – 2x
y + y – x – x2 + 2x + 2x – 2x1y1 – 2x2y2 ≥ 0
y + y + x + x – 2x1y1 – 2x2y2 ≥ 0
f11 < 0 f22 > 0 f11 f22 – (f12)2 > 0 = 4 0 > 0 proved
While discussing concave and convex functions we saw that if the function is concave
(convex) there is no need to check the second order condition to determine whether the
function achieves a maximum (minimum) or not. When we are dealing with a problem of
constrained optimization, it is again possible to dispense with the second order condition.
In the case of constrained optimization, quasi-concavity of the function obviates the need for the
second order condition for determining the maximum. In a similar manner, a quasi-convex
function removes the need for the second order condition when we are trying to find the
minimum of the function.
4) Choose two distinct points xi and xj (or x1 and x2) such that xi < xj in the convex
domain of the function.
5) The function f(x) forms an arc AB between xi and xj such that f(xi) = A and f(xj) = B.
In fig. 11(a) point B is higher than point A. In other words f(xj) > f(xi). The function
is strictly quasi-concave if all other points on arc AB are higher than point A.
3) The function f(x) forms an arc CD between xi and xj such that f(xi) = C and f(xj) =
D. In fig. (b) f(xj) < f(xi). The function is strictly quasi-convex if all other points on
the arc are lower than f(xi).
Let f be a function of x. Then for any two distinct points xi and xj in the convex domain of
the function such that xi < xj and 0 < θ < 1, the function is a strictly quasi-concave
function if the following inequality is satisfied:
$$f(\theta x_i + (1-\theta)x_j) > \min\{f(x_i),\, f(x_j)\}$$
If we replace the strict inequality with a weak inequality then the function is quasi-
concave.
The weak inequality allows for horizontal straight line segments on
the arc AB.
7. Quasi-convex function
Suppose f is a function of x. Then for any two distinct points xi and xj in the convex domain and for 0 < θ < 1 the
function is strictly quasi-convex if the following inequality is satisfied:
$$f(\theta x_i + (1-\theta)x_j) < \max\{f(x_i),\, f(x_j)\}$$
If we replace the strict inequality with a weak inequality, the function satisfying the condition
is quasi-convex.
Differentiable functions
2) All concave (convex) functions (strict or non-strict) are quasi-concave (quasi-
convex). But the converse is not true.
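Suppose a function z = f(x1, x2, ..., xn) is twice continuously differentiable. The quasi-
concavity and quasi-convexity of the function can be checked with the help of the first and
second partial derivatives of the function arranged as a bordered determinant:
$$|B| = \begin{vmatrix} 0 & f_1 & f_2 & \cdots & f_n \\ f_1 & f_{11} & f_{12} & \cdots & f_{1n} \\ f_2 & f_{21} & f_{22} & \cdots & f_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ f_n & f_{n1} & f_{n2} & \cdots & f_{nn} \end{vmatrix}$$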
Quasi-convex.
Strictly quasiconvex if
11. Exercises
1) Are the following function quasiconcave ? Which of them are also concave
c) f(x, y) = x2 y3
d) f(x, y) = x y2
2) Which of these function defined on are quasiconvex which are also convex
a) +
b) 3 +4
c) 2x1 + 3x2 –
a) z = – (x + x )
is it a concave function ?
d) x4 + x2 + y2 + y4 – 3x – 8y (Convex)
f) x – y – x2
a) (x1 + x2) /
defined on R2++
12. References
Allen, R.G,D, Mathematical Analysis for Economists, London: Macmillan and Co. Ltd
Knut Sydsaeter and Peter J. Hammond, Mathematics for Economic Analysis, Prentice Hall
Carl P. Simon and Lawrence Blume, Mathematics for Economists, London: W .W. Norton & Co.
DC-1
Semester-II
Table of Contents
1. Learning outcomes
2. Introduction
7. Envelope results
8. Exercises
9. References
Learning outcomes:
After you have read this chapter, you should be able to:-
Introduction
In the last chapter, we covered optimization of an objective function of two or more choice variables.
But that optimization was unconstrained. It was unconstrained in the sense that, for example, in the
case of a discriminating monopolist there was no restriction or limit on the level of output to be
produced. But there could have been constraints: given the level of technique and machinery,
the maximum output that could be produced might be, say, 1000 units. In such a case, the
constrained optimum differs from the free optimum whenever the free optimum lies beyond this limit.
In this chapter, we will cover optimization with equality constraints. The new optimum referred
to here is the constrained optimum, which is likely to differ from the free optimum.
This chapter is divided into sections. In the first section, we will analyze geometric properties of
constraint.
The primary purpose of imposing a constraint is to give due cognizance to certain limiting
factors present in the optimization problem under discussion. In the last chapter, we saw hills
and valleys in 2D and bowls and domes in 3D and found relative extrema in all such cases. But
there were no constraints.
Let us take the example of a consumer who consumes only one good, x, and has the utility function:
U(x) = –(x – 2)^2 + 4
Its free optimum (maximum) is at x = 2. But if the government imposes a restriction that no one
can consume more than 1 unit of x, then the constrained optimum is at x = 1. This is shown in fig. 1
below.
Fig.1
Let us consider another example in which the utility of a consumer depends on two goods, x1 and x2, with
the utility function:
U = x1x2 + 2x1
If one finds the partial derivatives of U, one sees that the marginal utilities are positive for positive
quantities and do not decrease as x1 and x2 increase. Hence, an unconstrained optimum would call for
purchasing an infinite amount of the goods. But a consumer faces a constraint known as the budget constraint.
If the price of x1 is Rs. 4, the price of x2 is Rs. 2 and the income of the consumer is Rs. 100, then the constraint
becomes:
4x1 + 2x2 = 100
This constraint narrows down the choice of x1 and x2, and one can find the optimum x1 and x2.
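The text does not work this example out, so the following is only a sketch of how the optimum could be computed. It assumes the Lagrangean conditions x2 + 2 = 4λ and x1 = 2λ together with the budget line; the function names are hypothetical.

```python
from fractions import Fraction


def utility(x1, x2):
    return x1 * x2 + 2 * x1


def solve_consumer(p1, p2, income):
    """Stationary point of U = x1*x2 + 2*x1 on the budget line p1*x1 + p2*x2 = income.

    First-order conditions: x2 + 2 = lam*p1 and x1 = lam*p2,
    hence x2 = (p1/p2)*x1 - 2; substituting into the budget line gives x1.
    """
    p1, p2, income = Fraction(p1), Fraction(p2), Fraction(income)
    x1 = (income + 2 * p2) / (2 * p1)
    x2 = (p1 / p2) * x1 - 2
    lam = x1 / p2
    return x1, x2, lam


x1, x2, lam = solve_consumer(4, 2, 100)
print(x1, x2, lam, utility(x1, x2))
# Nearby bundles on the same budget line, for comparison.
print(utility(12, 26), utility(14, 22))
```

Under these assumptions the bundle is x1 = 13, x2 = 24, and neighbouring bundles on the budget line such as (12, 26) and (14, 22) give lower utility.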
Fig.2
If one considers a general function z = f(x, y) and assumes it appears like a dome in 3D, then the free
extremum is the peak of the dome, but the constrained extremum is at the peak of the inverted U-shaped curve
situated on top of the constraint. In fig. 2, MN is the constraint line indicating that the sum of x and y
cannot go beyond this line. The constrained maximum is then at point B.
All the points C, D, E, F, G and H are feasible (in fact, the entire section of the surface on the right
hand side of the constraint is feasible). In fig. 3, various level curves of the function z = f(x, y)
are drawn. It is a 2D projection of the 3D dome. MN is the constraint on the sum of x and y, say of the
form g(x, y) = c.
Fig.3
A constrained maximum can be expected to have a lower value than the free
maximum. It could also be that the free optimum coincides with the constrained maximum, in
which case the constraint is not binding. If the constraint is binding, which
it generally is, the free maximum is higher than the constrained maximum. In any case the
constrained maximum can never exceed the free maximum.
The condition for an optimum of f(x,y) = z requires that the slope of the level curve f(x,y) = z1 is equal to
the slope of the constraint g(x,y) = c, which can be expressed as follows:
$$\frac{g_x(x, y)}{g_y(x, y)} = \frac{f_x(x, y)}{f_y(x, y)}$$
Example1:
Suppose one wishes to maximize f(x,y) = xy subject to 2x + y = m. The constraint can also be
written as y = m – 2x. Then the objective function becomes f(x, y(x)) = x(m – 2x), which is a
function of one variable. So, for its optimization:
$$\frac{df}{dx} = 0 \;\Rightarrow\; m - 4x = 0 \;\Rightarrow\; 4x = m \;\Rightarrow\; x = \frac{m}{4}$$
$$\text{and}\quad y = m - \frac{2m}{4} = \frac{m}{2}$$
A similar result can also be derived by equating the slopes as follows:
$$\frac{f_x(x, y)}{f_y(x, y)} = \frac{g_x(x, y)}{g_y(x, y)} = \frac{2}{1} \;\Rightarrow\; \frac{y}{x} = \frac{2}{1} \;\Rightarrow\; y = 2x$$
Suppose f(x0, y0) is the optimum value of f(x,y) = z with the constraint g(x,y) = c. Then we know
that:
$$\frac{f_x(x, y)}{f_y(x, y)} = \frac{g_x(x, y)}{g_y(x, y)} \quad\Longleftrightarrow\quad \frac{f_x(x, y)}{g_x(x, y)} = \frac{f_y(x, y)}{g_y(x, y)}$$
At (x0, y0) the above ratios have some common value. The common value of these ratios
is known as the Lagrange multiplier λ, and the above equation then becomes:
$$f_x(x, y) - \lambda\, g_x(x, y) = 0$$
$$f_y(x, y) - \lambda\, g_y(x, y) = 0$$
These are the first order conditions of the Lagrangean function
$$L(x, y) = f(x, y) - \lambda\,(g(x, y) - c)$$
The partial derivatives of L(x, y) with respect to x and y are $L_x(x, y) = f_x(x, y) - \lambda g_x(x, y)$ and
$L_y(x, y) = f_y(x, y) - \lambda g_y(x, y)$, respectively.
Equate these partial derivatives to zero and solve the resulting equations along with the constraint.
The advantage of the Lagrangean function over the slope equality method is that this method
can handle more than two variables and more than one constraint (which we will see
in the coming examples).
Example 1 contd.:-
The Lagrangean is $L(x, y) = xy - \lambda(2x + y - m)$
$$L_x(x, y) = y - 2\lambda = 0 \qquad (1)$$
$$L_y(x, y) = x - \lambda = 0 \qquad (2)$$
$$L_\lambda(x, y) = -(2x + y - m) = 0 \qquad (3)$$
From (1) and (2), y = 2x.
Putting y = 2x in the third equation, we get x = m/4 and y = m/2. The result obtained here is again the
same as that obtained by the previous techniques. Hence, any of these techniques is equally applicable.
Also λ = x from equation (2), so λ = m/4. Notice that x, y and λ are all functions of m. Here m is
referred to as a parameter; the optimal value of f(x,y) is also a function of m, as the optimum value
equals (m/4)(m/2), i.e. m^2/8.
Suppose the objective is to maximize f(x,y) subject to g(x,y) = c. Suppose that x* and
y* are the values of x and y that solve this problem. In general, x* and y* depend on c (the parameter
of this model); we assume x = x*(c) and y = y*(c) are differentiable functions of c. The associated
value f* of f(x,y) is then also a function of c,
$$f^*(c) = f(x^*(c), y^*(c))$$
Here f*(c) is also called the optimal value function. Also, λ is a function of the parameter c. Taking the
differential of the above equation and using the first order conditions $f_x = \lambda g_x$ and $f_y = \lambda g_y$, we get:
$$df^*(c) = df(x^*, y^*) = f_x\,dx^* + f_y\,dy^* = \lambda\,g_x(x^*, y^*)\,dx^* + \lambda\,g_y(x^*, y^*)\,dy^*$$
Taking the differential of the constraint g(x*(c), y*(c)) = c gives
$$g_x(x^*, y^*)\,dx^* + g_y(x^*, y^*)\,dy^* = dc$$
So it implies $df^*(c) = \lambda\,dc$.
$$\text{Also}\quad \lambda(c) = \frac{df^*(c)}{dc}$$
Thus, the Lagrange multiplier is the rate at which the optimal value of the objective function
changes with respect to changes in the parameter c.
In economic applications, c often denotes the available stock of a resource which acts as a constraint
on the utility or profit function f(x,y). λ then becomes the shadow price of the resource, as it
indicates how utility or profit changes as dc more units of the resource are provided.
Consider again that the objective is to maximize f(x,y) subject to g(x,y) = c. Then, we know that
$$\frac{f_x}{g_x} = \frac{f_y}{g_y} = \lambda$$
In other words, at the maximum point the ratio of fi to gi is the same for every choice variable i (x and y
here). The numerators (fi) are the marginal contributions of each choice variable to the function f.
They show the marginal benefit that one more unit of x or y will have for the function to be
maximized, f(x,y). The denominators, the gi's, are the marginal costs of each choice variable. That is, they
reflect the added burden on the constraint of using slightly more of x (or y).
Example 1 contd:-
The objective was to maximize f(x,y) = xy subject to 2x + y = m. As solved earlier, x(m) = m/4,
y(m) = m/2 and λ(m) = m/4. So the value function is f*(m) = (m/4)(m/2) = m^2/8, and
$$\frac{df^*(m)}{dm} = \frac{m}{4} = \lambda(m)$$
Suppose m is 100, so f*(100) = 100^2/8 = 1250. If m increases to 101, the new optimized value would be
f*(101) = 101^2/8 = 1275.125, so that f*(101) – f*(100) = 25.125. By comparison,
$$\frac{df^*(m)}{dm}\bigg|_{m=100} = \lambda(100) = 25$$
which is a good approximation to the actual change in the optimal value function.
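The comparison between the multiplier and the actual change in the optimal value can be reproduced with a few lines of Python. This is an illustrative sketch using the closed forms f*(m) = m^2/8 and λ(m) = m/4 for the problem max xy subject to 2x + y = m; the function names are hypothetical.

```python
def value_function(m):
    """f*(m) = x*(m) * y*(m) with x* = m/4 and y* = m/2."""
    return (m / 4) * (m / 2)


def multiplier(m):
    """Lagrange multiplier lambda(m) = m/4 for this problem."""
    return m / 4


actual_change = value_function(101) - value_function(100)
predicted_change = multiplier(100) * 1   # lambda times dm, with dm = 1

print(value_function(100), value_function(101))
print(actual_change, predicted_change)
```

The gap between the two numbers is the second order term (dm)^2/8, which is why the approximation improves as the change in m becomes smaller.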
Example 2 :
Sufficient Conditions
The conditions that we have studied till now are necessary conditions but not sufficient. To make
this clear, let us consider the following example:
$$\max f(x, y) = 2x + 3y \quad\text{subject to}\quad \sqrt{x} + \sqrt{y} = 5$$
$$L_x(x, y) = 2 - \frac{\lambda}{2\sqrt{x}} = 0$$
$$L_y(x, y) = 3 - \frac{\lambda}{2\sqrt{y}} = 0$$
$$\sqrt{x} + \sqrt{y} = 5$$
Solving the first two equations yields y = 4x/9. Putting this value in the third equation yields x = 9 and
y = 4. This is indicated by point P in Fig. 4. But, as is evident, (9, 4) does not solve the problem
(of maximizing f(x, y)). Rather, the solution to this problem is Q = (0, 25), where the constraint is
satisfied and 2x + 3y attains the value 75 (instead of 30 at point P).
Figure 4
So, this shows that the first order conditions, though necessary, are not
sufficient.
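A quick scan along the constraint makes the point concrete. The sketch below assumes the example max 2x + 3y subject to √x + √y = 5 and evaluates the objective on a grid of feasible points; the grid size is an arbitrary choice.

```python
import math


def objective(x, y):
    return 2 * x + 3 * y


def y_on_constraint(x):
    """Solve sqrt(x) + sqrt(y) = 5 for y, valid for 0 <= x <= 25."""
    return (5 - math.sqrt(x)) ** 2


# Value at the stationary point P = (9, 4) and at the corner Q = (0, 25).
value_at_p = objective(9, y_on_constraint(9))
value_at_q = objective(0, y_on_constraint(0))

# Scan the whole feasible curve on a grid of x values in [0, 25].
grid = [i * 25 / 1000 for i in range(1001)]
values = [objective(x, y_on_constraint(x)) for x in grid]

print(value_at_p, value_at_q)
print(min(values), max(values))
```

On this grid the stationary point P gives the smallest value of the objective along the constraint and the corner Q gives the largest, so the first order conditions have located a constrained minimum rather than the maximum.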
Let there be some stationary point (x0, y0). By the implicit function theorem, the equation
g(x, y) = c defines y as a differentiable function of x in some neighbourhood of (x0, y0). Let this
be denoted by y = h(x); then
$$y' = h'(x) = -\frac{g_x(x, y)}{g_y(x, y)}$$
Then, with z = f(x, h(x)),
$$\frac{dz}{dx} = f_x(x, y) + f_y(x, y)\,y'$$
$$\frac{dz}{dx} = f_x(x, y) - f_y(x, y)\,\frac{g_x(x, y)}{g_y(x, y)}$$
The necessary condition becomes $\frac{dz}{dx} = 0$.
The sufficient condition for a maximum of z is that the second order derivative of z with
respect to x is less than zero.
$$\frac{d^2z}{dx^2} = f_{xx} + f_{xy}\,y' + (f_{yx} + f_{yy}\,y')\,y' - f_y\,\frac{(g_{xx} + g_{xy}\,y')\,g_y - (g_{yx} + g_{yy}\,y')\,g_x}{(g_y)^2}$$
Substituting $y' = -g_x/g_y$ and $\lambda = f_y/g_y$ gives
$$\frac{d^2z}{dx^2} = \frac{1}{(g_y)^2}\left[(f_{xx} - \lambda g_{xx})(g_y)^2 - 2(f_{xy} - \lambda g_{xy})\,g_x g_y + (f_{yy} - \lambda g_{yy})(g_x)^2\right]$$
$$D(x, y) = \begin{vmatrix} 0 & g_x & g_y \\ g_x & f_{xx} - \lambda g_{xx} & f_{xy} - \lambda g_{xy} \\ g_y & f_{xy} - \lambda g_{xy} & f_{yy} - \lambda g_{yy} \end{vmatrix}$$
$$\frac{d^2z}{dx^2} = -\frac{1}{(g_y)^2}\,D(x, y)$$
A sufficient condition for (x0, y0) to solve the constrained problem is that (x0, y0) satisfies the first
order conditions and, moreover, that the bordered Hessian D(x0, y0) given above is > 0 in the
maximization case and < 0 in the minimization case.
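As an illustration of the bordered Hessian test, the sketch below evaluates D for the two examples used earlier in this chapter: max xy subject to 2x + y = m (with m = 100, so λ = 25) and the stationary point (9, 4) of 2x + 3y subject to √x + √y = 5 (where λ = 12). The helper names are hypothetical and the partial derivatives are entered by hand.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def bordered_hessian(gx, gy, fxx, fxy, fyy, gxx, gxy, gyy, lam):
    """Matrix whose determinant is D(x, y) in the sufficient condition."""
    return [
        [0, gx, gy],
        [gx, fxx - lam * gxx, fxy - lam * gxy],
        [gy, fxy - lam * gxy, fyy - lam * gyy],
    ]


# f = xy, g = 2x + y: g is linear, so all second partials of g vanish.
d_product = det3(bordered_hessian(gx=2, gy=1, fxx=0, fxy=1, fyy=0,
                                  gxx=0, gxy=0, gyy=0, lam=25))

# f = 2x + 3y, g = sqrt(x) + sqrt(y) at (9, 4):
# gx = 1/(2*sqrt(x)) = 1/6, gy = 1/4, gxx = -1/(4*x**1.5) = -1/108, gyy = -1/32.
d_sqrt = det3(bordered_hessian(gx=Fraction(1, 6), gy=Fraction(1, 4),
                               fxx=0, fxy=0, fyy=0,
                               gxx=Fraction(-1, 108), gxy=0, gyy=Fraction(-1, 32),
                               lam=12))

print(d_product, d_sqrt)
```

The first determinant is positive, consistent with a constrained maximum at (m/4, m/2); the second is negative, consistent with (9, 4) being a constrained minimum of 2x + 3y.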
Envelope Results
maximization, income is fixed and we find optimal values of xi ' s that maximize utility. But
then in next period if income changes, then optimal solution would also change. To see, how
optimal solution changes when parameters change, we would encounter here Envelope Theorem.
where r = (r1, ..., rk) is vector of parameters and x = (x1, ..., xn) is vector of choice
variables.
The optimization would give solution as x1* (r), ...., xn* (r) and optimal value of
f (x, r) f * (r) , where f * (r ) is optimal value function for this problem.
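Suppose now we wish to study how our optimal value function changes when the h-th parameter rh
changes. One way is to assume the new rh, set up the Lagrangean function again and obtain the new value of f(x, r). To
avoid such a tedious process, we study directly how the optimal value function changes as rh changes:
$$\frac{\partial f^*(\mathbf{r})}{\partial r_h} = \sum_{i=1}^{n} \frac{\partial f(\mathbf{x}^*(\mathbf{r}), \mathbf{r})}{\partial x_i}\cdot\frac{\partial x_i^*(\mathbf{r})}{\partial r_h} + \frac{\partial f(\mathbf{x}^*(\mathbf{r}), \mathbf{r})}{\partial r_h}$$
The above equation implies that f*(r) changes on two accounts: first, a change in rh changes the
vector r and thereby changes f(x, r) directly; second, rh changes all the functions xi*(r) and
hence indirectly changes f(x*(r), r). Let the Lagrangean be
$$L(\mathbf{x}, \mathbf{r}) = f(\mathbf{x}, \mathbf{r}) - \sum_{j=1}^{m} \lambda_j\, g_j(\mathbf{x}, \mathbf{r})$$
where the constraints are taken to be of the form gj(x, r) = 0. The first
order conditions give $\partial f/\partial x_i = \sum_{j=1}^{m} \lambda_j\, \partial g_j/\partial x_i$.
So
$$\frac{\partial f^*(\mathbf{r})}{\partial r_h} = \sum_{i=1}^{n}\sum_{j=1}^{m} \lambda_j\,\frac{\partial g_j}{\partial x_i}\cdot\frac{\partial x_i^*}{\partial r_h} + \frac{\partial f}{\partial r_h}$$
Differentiating each constraint with respect to rh gives
$$\sum_{i=1}^{n} \frac{\partial g_j}{\partial x_i}\cdot\frac{\partial x_i^*}{\partial r_h} + \frac{\partial g_j}{\partial r_h} = 0$$
$$\text{So}\quad \frac{\partial f^*(\mathbf{r})}{\partial r_h} = -\sum_{j=1}^{m} \lambda_j\,\frac{\partial g_j(\mathbf{x}^*(\mathbf{r}), \mathbf{r})}{\partial r_h} + \frac{\partial f(\mathbf{x}^*(\mathbf{r}), \mathbf{r})}{\partial r_h} = \frac{\partial L(\mathbf{x}^*(\mathbf{r}), \mathbf{r})}{\partial r_h}$$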
Lx e x p 0 (1)
Ly e y q 0 (2)
L px qy m 0 (3)
e x p
Solving first two equations yields y
e q
p
e x e y
q
p
x y ln
q
p
x y ln
q
Putting value of x in (3)
p
P y* ln q*y m
q
p
ln m
y*
q
pq
p
ln m
p
x*
q
ln
pq q
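Consider the cost function of a firm that uses capital K and labour L to produce a single output Q, where the
production function is $Q = F(K, L) = K^{1/2} L^{1/4}$. The prices of labour and capital are w and r
respectively. The firm minimizes rK + wL subject to
$$K^{1/2} L^{1/4} = Q$$
$$\mathcal{L}(K, L) = rK + wL - \lambda\,(K^{1/2} L^{1/4} - Q)$$
$$\mathcal{L}_K = r - \tfrac{1}{2}\lambda K^{-1/2} L^{1/4} = 0 \qquad (1)$$
$$\mathcal{L}_L = w - \tfrac{1}{4}\lambda K^{1/2} L^{-3/4} = 0 \qquad (2)$$
Dividing (1) by (2) gives
$$L = \frac{r}{2w}\,K$$
Substituting this in the constraint gives $K^* = 2^{1/3}\, r^{-1/3}\, w^{1/3}\, Q^{4/3}$ and
$$L^* = \frac{r}{2w}\,K^* = 2^{-2/3}\, r^{2/3}\, w^{-2/3}\, Q^{4/3}$$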
Example 4:
Suppose a firm produces TV sets at two different locations and x1 units are produced at first
location and x2 at second location. The joint cost function is given by
If the firm has to supply an order of 1000 units, then with this as the constraint, x1* and x2* would be chosen
so as to minimize cost.
$$L_\lambda = -(x_1 + x_2 - 1000) = 0 \qquad (3)$$
x1* = 400
x2* = 600
So the firm should produce 400 TV sets at the first location and 600 at the second location.
$$D(x_1, x_2) = \begin{vmatrix} 0 & 1 & 1 \\ 1 & 0.2 & 0.2 \\ 1 & 0.2 & 0.4 \end{vmatrix} = -0.2 < 0$$
Hence the stationary point (400, 600) is a minimum and the minimum cost is 2,69,000.
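Differentiating the minimum cost function $C^* = rK^* + wL^* = 3\cdot 2^{-2/3}\, r^{2/3} w^{1/3} Q^{4/3}$ with respect to r:
$$\frac{\partial C^*}{\partial r} = \frac{\partial}{\partial r}\left(3\cdot 2^{-2/3}\, r^{2/3} w^{1/3} Q^{4/3}\right) = \frac{2}{3}\cdot 3\cdot 2^{-2/3}\, r^{-1/3} w^{1/3} Q^{4/3} = K^*$$
Now, if instead the envelope theorem is used, it states that
$$\frac{\partial C^*(r, w, Q)}{\partial r} = \frac{\partial \mathcal{L}(K, L, r, w, Q)}{\partial r} = K$$
Similarly, $\frac{\partial C^*}{\partial w} = L$.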
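The envelope result for the cost function can be checked numerically. The sketch below assumes the production function Q = K^(1/2) L^(1/4) and the factor demands K* = 2^(1/3) r^(-1/3) w^(1/3) Q^(4/3) and L* = 2^(-2/3) r^(2/3) w^(-2/3) Q^(4/3) that follow from the first order conditions; the prices and output level are hypothetical values chosen for the illustration.

```python
def capital_demand(r, w, q):
    return 2 ** (1 / 3) * r ** (-1 / 3) * w ** (1 / 3) * q ** (4 / 3)


def labour_demand(r, w, q):
    return 2 ** (-2 / 3) * r ** (2 / 3) * w ** (-2 / 3) * q ** (4 / 3)


def min_cost(r, w, q):
    """C*(r, w, Q) = r*K* + w*L*."""
    return r * capital_demand(r, w, q) + w * labour_demand(r, w, q)


def central_difference(f, x, h=1e-6):
    """Numerical derivative of a one-variable function."""
    return (f(x + h) - f(x - h)) / (2 * h)


r, w, q = 2.0, 3.0, 5.0   # hypothetical factor prices and output level

dc_dr = central_difference(lambda price: min_cost(price, w, q), r)
dc_dw = central_difference(lambda price: min_cost(r, price, q), w)

print(dc_dr, capital_demand(r, w, q))   # derivative w.r.t. r against K*
print(dc_dw, labour_demand(r, w, q))    # derivative w.r.t. w against L*
```

The numerical derivatives of the minimum cost with respect to r and w agree with K* and L* up to finite-difference error, which is the content of the envelope theorem in this example.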
Exercises
Q1. The utility of a consumer depends on two goods, x and y, and the utility function is given by
U = (x+2)(y+1). If the prices of x and y are Rs. 2 and Rs. 5 respectively and income is Rs. 51, find the
optimal levels of x and y purchased by the consumer and the indirect utility function.
Q2. The production function of a firm is given by X= and prices of capital and labor are
fixed at Rs. r and Rs. w, respectively.
Q3. The incomes of an individual in current and next year are Rs.500 and Rs. 792
respectively.his utility function of two consumption expenditures x and y is U= . If the
market interest rate is 10% p.a. ,determine optimum consumption expenditures and amount
consumer should borrow or lend in current year.
[Hint: constraint is income in two periods but consumption could differ from income of that
period as consumer can borrow or lend in the market.]
Q4. A monopolist has the following demand functions for each of his products X and Y; x=72-
0.5px and y=120-py. the combined cost is C=x2+xy+y2+35 and the maximum product is 40 units.
Find
Q5. Find the optimal input mix and its cost in the case when a producer chooses an output
corresponding to the isoquant k2l = 16 and the respective prices of capital and labor are Re. 1 and Rs. 2.
Also find the expansion path.
[Hint: the expansion path shows the change in optimal values when the output parameter changes]
References
K. Sydsaeter and P. Hammond, Mathematics for Economic Analysis, Pearson Educational Asia, Delhi,
2002
DC-1
Semester-II
Table of Contents
1. Learning outcomes
2. Introduction
a. Geometrical characterstics
5. Quadratic forms
b. Price discrimination
c. Duopoly
7. Exercises
8. References
Learning outcomes:
After you have read this chapter, you should be able to:-
Introduction
When two or more choice variables affect the objective function,
optimisation techniques need to be studied through the lens of the total differential
instead of the derivative technique (used in the case of one choice variable). So, to
create an analogy, we first study optimisation of an objective function of one
choice variable with the help of the differential. The geometric characteristics
of a one-variable objective function are also analysed.
In the first section of this chapter, we analyse the geometric characteristics and
calculus of optimisation in the case of one variable. In the second section, we deal
with optimisation in the two-variable case. In the third section, quadratic forms of the
total differential are covered and sufficient conditions for the n-variable case are
derived. In the last section, applications of optimisation techniques in economics are
discussed.
$$dz = f'(x)\,dx$$
From the derivative form we know that f'(x) must be zero at an extremum: $\frac{dz}{dx} = f'(x) = 0$. This
is equivalent to saying that dz = 0 as x varies.
Fig. 1
Let us calculate $d^2z$:
$$d^2z = d(dz) = f''(x)\,dx^2$$
For a maximum of z : $d^2z < 0$
For a minimum of z : $d^2z > 0$
1. Geometrical Characteristics
Fig. 2 Fig. 3
Fig. 4
Fig. 5. Fig. 6
$$dz = f_x\,dx + f_y\,dy$$
$$f_x = f_y = 0$$
This condition is satisfied by point A' in Fig. 2, B' in Fig. 3, C' in Fig. 5 and
C" in Fig. 6. But at points C' and C" the function does not attain an extreme
value, so this condition, though necessary, is not sufficient.
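As in the one variable case, the second order total differential will determine
whether any point is an extremum or not. For the function z = f(x, y), $d^2z$ is
calculated as follows:
$$d^2z = d(dz) = \frac{\partial(dz)}{\partial x}\,dx + \frac{\partial(dz)}{\partial y}\,dy$$
$$= \frac{\partial(f_x\,dx + f_y\,dy)}{\partial x}\,dx + \frac{\partial(f_x\,dx + f_y\,dy)}{\partial y}\,dy$$
$$= f_{xx}\,dx^2 + f_{yx}\,dy\,dx + f_{xy}\,dx\,dy + f_{yy}\,dy^2$$
By Young's theorem we know that cross partial derivatives are identical, i.e.
$f_{yx} = f_{xy}$. So $d^2z = f_{xx}\,dx^2 + 2f_{xy}\,dx\,dy + f_{yy}\,dy^2$.
$$d^2z \begin{cases} < 0 & \text{if } f_{xx} < 0;\ f_{yy} < 0 \text{ and } f_{xx}f_{yy} > f_{xy}^2 \\ > 0 & \text{if } f_{xx} > 0;\ f_{yy} > 0 \text{ and } f_{xx}f_{yy} > f_{xy}^2 \end{cases}$$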
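$$f_x = 2e^{2x} - 2,\qquad f_{xx} = 4e^{2x}$$
$$f_y = 4y,\qquad f_{yy} = 4,\qquad f_{xy} = 0$$
$$f_x = 0 \;\Rightarrow\; 2e^{2x} - 2 = 0 \;\Rightarrow\; e^{2x} = 1 \;\Rightarrow\; 2x = 0 \;\Rightarrow\; x = 0$$
$$\text{and}\quad f_y = 0 \;\Rightarrow\; 4y = 0 \;\Rightarrow\; y = 0$$
So the stationary point for z is (0, 0). To know whether z attains a minimum or
maximum at (0, 0), let us check the second order conditions.
$$f_{xx}(0,0) = 4,\qquad f_{yy}(0,0) = 4,\qquad f_{xy}(0,0) = 0,\qquad f_{xx}f_{yy} = 16$$
$$f_{xx} > 0;\ f_{yy} > 0 \ \text{and}\ f_{xx}f_{yy} > f_{xy}^2$$
Hence z attains a minimum at (0, 0).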
So that $d^2z = f_{xx}\,dx^2 + 2f_{xy}\,dx\,dy + f_{yy}\,dy^2$
But here we are interested in knowing the sign that $d^2z$ may assume at an
extremum. $d^2z$ needs to be negative for a maximum and positive for a
minimum. The sign of the second order total differential at a point
depends on the values of the second order partial derivatives there, for any values of dx and dy.
Hence, dx and dy can vary, but at an extremum the second order partial
derivatives must satisfy specific sign conditions.
Writing u = dx, v = dy, a = fxx, h = fxy and b = fyy,
$$d^2z = au^2 + 2huv + bv^2;$$
the signs of the first and third terms are independent of the signs of the variables u and
v, since these are squared in the above equation. Thus the signs of these
terms alone depend on the signs of a and b. But the sign of the
middle term could change the sign of $d^2z$.
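Completing the square,
$$d^2z = au^2 + 2huv + \frac{h^2}{a}v^2 + bv^2 - \frac{h^2}{a}v^2$$
$$= a\left(u^2 + \frac{2h}{a}uv + \frac{h^2}{a^2}v^2\right) + \left(b - \frac{h^2}{a}\right)v^2$$
$$= a\left(u + \frac{h}{a}v\right)^2 + \frac{ab - h^2}{a}\,v^2$$
Now, $d^2z$ is positive definite iff a > 0, which makes the first term positive, and
$ab - h^2 > 0$, which makes the second term positive as well.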
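In matrix form,
$$d^2z = \begin{bmatrix} u & v \end{bmatrix} \begin{bmatrix} a & h \\ h & b \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}$$
Let us denote the matrix $\begin{bmatrix} a & h \\ h & b \end{bmatrix}$ by D;
then $d^2z$ is positive definite iff the leading principal minors $|D_1| = a$ and $|D_2| = ab - h^2$ are both positive.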
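For a function of three variables z = f(x, y, w),
$$d^2z = \begin{bmatrix} dx & dy & dw \end{bmatrix} \begin{bmatrix} f_{xx} & f_{xy} & f_{xw} \\ f_{yx} & f_{yy} & f_{yw} \\ f_{wx} & f_{wy} & f_{ww} \end{bmatrix} \begin{bmatrix} dx \\ dy \\ dw \end{bmatrix}$$
$$|D_1| = f_{xx};\qquad |D_2| = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix} \qquad\text{and}\qquad |D_3| = \begin{vmatrix} f_{xx} & f_{xy} & f_{xw} \\ f_{yx} & f_{yy} & f_{yw} \\ f_{wx} & f_{wy} & f_{ww} \end{vmatrix}$$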
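Completing the squares,
$$d^2z = f_{xx}\left(dx + \frac{f_{xy}}{f_{xx}}\,dy + \frac{f_{xw}}{f_{xx}}\,dw\right)^2 + \frac{f_{xx}f_{yy} - f_{xy}^2}{f_{xx}}\left(dy + \frac{f_{xx}f_{yw} - f_{xy}f_{xw}}{f_{xx}f_{yy} - f_{xy}^2}\,dw\right)^2 + \frac{f_{xx}f_{yy}f_{ww} - f_{xx}f_{yw}^2 - f_{yy}f_{xw}^2 - f_{ww}f_{xy}^2 + 2f_{xy}f_{xw}f_{yw}}{f_{xx}f_{yy} - f_{xy}^2}\,(dw)^2$$
this is equivalent to :
$$d^2z = |D_1|\left(dx + \frac{f_{xy}}{f_{xx}}\,dy + \frac{f_{xw}}{f_{xx}}\,dw\right)^2 + \frac{|D_2|}{|D_1|}\left(dy + \frac{f_{xx}f_{yw} - f_{xy}f_{xw}}{f_{xx}f_{yy} - f_{xy}^2}\,dw\right)^2 + \frac{|D_3|}{|D_2|}\,(dw)^2$$
Hence $d^2z$ is positive definite iff
$$|D_1| > 0,\quad |D_2| > 0 \quad\text{and}\quad |D_3| > 0$$
So, for a minimum of z all leading principal minors must be positive. For negative
definiteness of $d^2z$:
$$|D_1| < 0,\quad |D_2| > 0 \quad\text{and}\quad |D_3| < 0$$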
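$$d^2z = \begin{bmatrix} dx_1, & dx_2, & \ldots, & dx_n \end{bmatrix} \begin{bmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ \vdots & \vdots & & \vdots \\ f_{n1} & \cdots & \cdots & f_{nn} \end{bmatrix} \begin{bmatrix} dx_1 \\ dx_2 \\ dx_3 \\ \vdots \\ dx_n \end{bmatrix}$$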
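$$C = F(x_1, x_2).$$
$$\pi = p_1x_1 + p_2x_2 - C$$
$$\frac{\partial\pi}{\partial x_1} = p_1 - \frac{\partial C}{\partial x_1} = 0 \quad\text{or}\quad p_1 = MC_1$$
$$\frac{\partial\pi}{\partial x_2} = p_2 - \frac{\partial C}{\partial x_2} = 0 \quad\text{or}\quad p_2 = MC_2$$
$$\frac{\partial^2\pi}{\partial x_1^2} = -\frac{\partial^2 C}{\partial x_1^2} < 0,\qquad \frac{\partial^2\pi}{\partial x_2^2} = -\frac{\partial^2 C}{\partial x_2^2} < 0 \qquad\text{and}$$
$$\frac{\partial^2\pi}{\partial x_1^2}\cdot\frac{\partial^2\pi}{\partial x_2^2} > \left(\frac{\partial^2\pi}{\partial x_1\,\partial x_2}\right)^2$$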
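$$x_2 = f_2(p_1, p_2).$$
$$\pi = R(x_1, x_2) - C(x_1, x_2)$$
$$\frac{\partial\pi}{\partial x_1} = \frac{\partial R}{\partial x_1} - \frac{\partial C}{\partial x_1} = 0 \;\Rightarrow\; MR_1 = MC_1$$
$$\frac{\partial\pi}{\partial x_2} = \frac{\partial R}{\partial x_2} - \frac{\partial C}{\partial x_2} = 0 \;\Rightarrow\; MR_2 = MC_2$$
$$\frac{\partial^2\pi}{\partial x_1^2} = \frac{\partial^2 R}{\partial x_1^2} - \frac{\partial^2 C}{\partial x_1^2} = \frac{\partial MR_1}{\partial x_1} - \frac{\partial MC_1}{\partial x_1} < 0$$
$$\frac{\partial^2\pi}{\partial x_2^2} = \frac{\partial^2 R}{\partial x_2^2} - \frac{\partial^2 C}{\partial x_2^2} = \frac{\partial MR_2}{\partial x_2} - \frac{\partial MC_2}{\partial x_2} < 0$$
$$\text{and}\quad \frac{\partial^2\pi}{\partial x_1^2}\cdot\frac{\partial^2\pi}{\partial x_2^2} > \left(\frac{\partial^2\pi}{\partial x_1\,\partial x_2}\right)^2$$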
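For maximum profit, $\frac{\partial\pi}{\partial x_1} = 10 - 4x_1 - x_2 = 0$
and $\frac{\partial\pi}{\partial x_2} = 15 - x_1 - 4x_2 = 0$
$$x_1 = \frac{5}{3} \quad\text{and}\quad x_2 = \frac{10}{3}$$
$$\pi_{max} = 10\left(\frac{5}{3}\right) + 15\left(\frac{10}{3}\right) - 2\left(\frac{25}{9}\right) - \left(\frac{5}{3}\right)\left(\frac{10}{3}\right) - 2\left(\frac{100}{9}\right)$$
= Rs. 33.3
$$\frac{\partial^2\pi}{\partial x_1^2} = -4 < 0,\qquad \frac{\partial^2\pi}{\partial x_2^2} = -4 < 0$$
$$\frac{\partial^2\pi}{\partial x_1\,\partial x_2} = -1$$
$$\frac{\partial^2\pi}{\partial x_1^2}\cdot\frac{\partial^2\pi}{\partial x_2^2} > \left(\frac{\partial^2\pi}{\partial x_1\,\partial x_2}\right)^2$$
Hence profit is maximised at $\left(\frac{5}{3}, \frac{10}{3}\right)$.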
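The two first order conditions form a linear system, so the stationary point can be computed exactly. The sketch below assumes the profit function π = 10x1 + 15x2 – 2x1^2 – x1x2 – 2x2^2, which is the function implied by the first order conditions and the profit calculation in this example; the helper names are hypothetical.

```python
from fractions import Fraction


def profit(x1, x2):
    return 10 * x1 + 15 * x2 - 2 * x1 ** 2 - x1 * x2 - 2 * x2 ** 2


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Cramer's rule for a11*x1 + a12*x2 = b1 and a21*x1 + a22*x2 = b2."""
    det = Fraction(a11 * a22 - a12 * a21)
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


# First order conditions: 4*x1 + x2 = 10 and x1 + 4*x2 = 15.
x1, x2 = solve_2x2(4, 1, 10, 1, 4, 15)

# Second order condition: pi_11 < 0, pi_22 < 0 and pi_11*pi_22 > pi_12**2.
pi_11, pi_22, pi_12 = -4, -4, -1
is_maximum = pi_11 < 0 and pi_22 < 0 and pi_11 * pi_22 > pi_12 ** 2

print(x1, x2, profit(x1, x2), float(profit(x1, x2)), is_maximum)
```

Exact fractions are used so that the result can be compared directly with the values 5/3 and 10/3 in the text.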
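For a maximum of π:
$$\frac{\partial\pi}{\partial x_1} = 36 - 8x_1 - 2x_2 = 0$$
$$\frac{\partial\pi}{\partial x_2} = 50 - 16x_2 - 2x_1 = 0$$
Solving the above two equations, we get $x_1 = \frac{119}{31}$ and $x_2 = \frac{82}{31}$
$$P_1 = 36 - 3\left(\frac{119}{31}\right) = \frac{759}{31}$$
$$\text{and}\quad P_2 = 50 - 5\left(\frac{82}{31}\right) = \frac{1140}{31}$$
$$\pi_{max} = 36\left(\frac{119}{31}\right) - 4\,\frac{(119)^2}{(31)^2} + 50\left(\frac{82}{31}\right) - 8\left(\frac{82}{31}\right)^2 - 2\left(\frac{119}{31}\right)\left(\frac{82}{31}\right) = \frac{129952}{(31)^2}$$
= Rs. 135.23
$$\frac{\partial^2\pi}{\partial x_1^2} = -8 < 0;\qquad \frac{\partial^2\pi}{\partial x_2^2} = -16 < 0$$
$$\frac{\partial^2\pi}{\partial x_1\,\partial x_2} = -2$$
$$\frac{\partial^2\pi}{\partial x_1^2}\cdot\frac{\partial^2\pi}{\partial x_2^2} > \left(\frac{\partial^2\pi}{\partial x_1\,\partial x_2}\right)^2$$
Hence profit is maximum at $\left(\frac{119}{31}, \frac{82}{31}\right)$.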
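Total revenue, $TR = 200(x_1 + x_2) - 0.8(x_1 + x_2)^2$
Total cost, TC = C1 + C2
$$\frac{\partial\pi}{\partial x_1} = 200 - 1.6(x_1 + x_2) - 0.6x_1 - 60 = 0$$
$$\frac{\partial\pi}{\partial x_2} = 200 - 1.6(x_1 + x_2) - 1.0x_2 - 30 = 0$$
$$\frac{\partial^2\pi}{\partial x_1^2} = -2.2,\qquad \frac{\partial^2\pi}{\partial x_2^2} = -2.6$$
$$\frac{\partial^2\pi}{\partial x_1^2}\cdot\frac{\partial^2\pi}{\partial x_2^2} > \left(\frac{\partial^2\pi}{\partial x_1\,\partial x_2}\right)^2$$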
2. Price Discrimination
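Consider a monopolist who can sell its produce in two markets and charge
different prices in the two markets. Let the inverse demand functions be
$P_1 = f_1(x_1)$ and $P_2 = f_2(x_2)$. Let the total cost function be C(x), where $x = x_1 + x_2$, so that profit is $\pi = R_1(x_1) + R_2(x_2) - C(x)$.
$$\frac{\partial\pi}{\partial x_1} = \frac{\partial R_1}{\partial x_1} - \frac{\partial C}{\partial x}\cdot\frac{\partial x}{\partial x_1} = R_1' - C' = 0$$
$$\frac{\partial\pi}{\partial x_2} = \frac{\partial R_2}{\partial x_2} - \frac{\partial C}{\partial x}\cdot\frac{\partial x}{\partial x_2} = R_2' - C' = 0$$
$$R_1' = R_2' = C'$$
$$MR_1 = MR_2 = MC$$
$$\frac{\partial^2\pi}{\partial x_1^2} = \frac{\partial^2 R_1}{\partial x_1^2} - \frac{\partial^2 C}{\partial x^2} < 0$$
$$\text{and}\quad \frac{\partial^2\pi}{\partial x_2^2} = \frac{\partial^2 R_2}{\partial x_2^2} - \frac{\partial^2 C}{\partial x^2} < 0$$
$$\text{and}\quad \frac{\partial^2\pi}{\partial x_1^2}\cdot\frac{\partial^2\pi}{\partial x_2^2} > \left(\frac{\partial^2\pi}{\partial x_1\,\partial x_2}\right)^2$$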
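$$\pi = 60x_1 - 5x_1^2 + 160x_2 - 20x_2^2 - 20$$
$$\frac{\partial\pi}{\partial x_1} = 60 - 10x_1 = 0 \;\Rightarrow\; x_1 = 6$$
$$\frac{\partial\pi}{\partial x_2} = 160 - 40x_2 = 0 \;\Rightarrow\; x_2 = 4$$
$$\frac{\partial^2\pi}{\partial x_1^2} = -10 < 0,\qquad \frac{\partial^2\pi}{\partial x_2^2} = -40 < 0$$
$$\text{and}\quad \frac{\partial^2\pi}{\partial x_1\,\partial x_2} = 0$$
$$\frac{\partial^2\pi}{\partial x_1^2}\cdot\frac{\partial^2\pi}{\partial x_2^2} > \left(\frac{\partial^2\pi}{\partial x_1\,\partial x_2}\right)^2$$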
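Elasticity of demand in market I (in absolute value): $|\varepsilon_1| = \left|\frac{\partial x_1}{\partial P_1}\right|\cdot\frac{P_1}{x_1} = 0.2 \times \frac{50}{6} = 1.67$
Elasticity of demand in market II (in absolute value): $|\varepsilon_2| = \left|\frac{\partial x_2}{\partial P_2}\right|\cdot\frac{P_2}{x_2} = 0.05 \times \frac{100}{4} = 1.25$
The monopolist can charge a higher price where the elasticity of demand is lower, and
vice versa. The price charged in a market and the elasticity of demand in that market
are negatively related (you will be asked to derive this in the exercises).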
3. Duopoly
Each duopolist maximises its profit function with respect to its
own output. With market price p = f(x) and total output x = x1 + x2, the profit functions of the duopolists are as follows:
Duopolist I : $\pi_1 = R_1 - C_1 = p\,x_1 - C_1(x_1) = f(x)\,x_1 - C_1(x_1)$
Duopolist II : $\pi_2 = R_2 - C_2 = p\,x_2 - C_2(x_2) = f(x)\,x_2 - C_2(x_2)$
$$\frac{\partial\pi_1}{\partial x_1} = 0 \;\Rightarrow\; f(x) + x_1\,\frac{\partial f(x)}{\partial x}\cdot\frac{\partial x}{\partial x_1} - C_1'(x_1) = 0 \qquad (1)$$
$$\text{and}\quad \frac{\partial\pi_2}{\partial x_2} = 0 \qquad (2)$$
Equation 1 gives the level of output duopolist I would produce for a given level of
output of the second. This is known as the reaction function of duopolist I. Likewise,
equation 2 represents the reaction function of duopolist II. Solving 1 and 2
simultaneously, the values of x1 and x2 can be calculated.
Example 6
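The market demand of a product is given by p = 100 – 4x, where x = x1 + x2 is the total output of the
two duopolists. Given its cost function, the profit of the first duopolist is
$$\pi_1 = (100 - 4x)\,x_1 - x_1^2 - 17x_1 - 40$$
$$\frac{\partial\pi_1}{\partial x_1} = 83 - 10x_1 - 4x_2 = 0$$
So the reaction function of firm 1 is $x_1 = \frac{83 - 4x_2}{10}$.
For maximum profit of firm 2, $\frac{\partial\pi_2}{\partial x_2} = 85 - 4x_1 - 9x_2 = 0$
The reaction function of firm 2 is $x_2 = \frac{85 - 4x_1}{9}$. Solving the reaction functions of the two
duopolists gives the equilibrium outputs, i.e. x1 = 5.5 and x2 = 7.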
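The equilibrium can also be found by letting each firm repeatedly respond to the other's output. This is a sketch only: the iteration scheme is an illustrative device and not the method used in the text, which solves the two reaction functions simultaneously.

```python
def reaction_1(x2):
    """Firm 1's best response, from 83 - 10*x1 - 4*x2 = 0."""
    return (83 - 4 * x2) / 10


def reaction_2(x1):
    """Firm 2's best response, from 85 - 4*x1 - 9*x2 = 0."""
    return (85 - 4 * x1) / 9


# Start from zero output and update both firms simultaneously.
x1, x2 = 0.0, 0.0
for _ in range(200):
    x1, x2 = reaction_1(x2), reaction_2(x1)

price = 100 - 4 * (x1 + x2)
print(round(x1, 6), round(x2, 6), round(price, 6))
```

The iteration converges here because each reaction function has a slope smaller than one in absolute value, so every round shrinks the distance to the intersection of the two reaction curves.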
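Suppose a firm uses not only labour but also capital in production;
then the production function is x = f(K, L), where x is output
dependent on capital (K) and labour (L). Suppose the monopolist purchases labour and
capital at constant prices of Rs. w and Rs. r per unit respectively, and the demand for the
monopolist's output is given by P = F(x); then the profit function becomes:
$$\pi = P\cdot x - (wL + rK)$$
$$\frac{\partial\pi}{\partial L} = 0 \;\Rightarrow\; P\,\frac{\partial x}{\partial L} + x\,\frac{\partial P}{\partial x}\cdot\frac{\partial x}{\partial L} - w = 0$$
$$w = P\,\frac{\partial x}{\partial L} + x\,\frac{\partial P}{\partial x}\cdot\frac{\partial x}{\partial L}$$
$$w = P\left(1 + \frac{x}{P}\cdot\frac{\partial P}{\partial x}\right)\frac{\partial x}{\partial L}$$
$$w = P\left(1 - \frac{1}{\eta}\right) MP_L$$
$$w = MR\cdot MP_L$$
where $P\left(1 - \frac{1}{\eta}\right) = MR$ and η denotes the price elasticity of demand in absolute value.
Similarly, $\frac{\partial\pi}{\partial K} = 0$ gives
$$r = MR\cdot MP_K$$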
Example
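A firm's production function is $X = 12 - \frac{L + K}{LK}$.
Let the prices of labour (L), capital (K) and output (X) be Re. 1, Rs. 4 and Rs. 9
respectively.
$$\pi = P\cdot X - (L + 4K)$$
$$\pi = 9\left(12 - \frac{L + K}{LK}\right) - L - 4K$$
$$\frac{\partial\pi}{\partial L} = 0 \;\Rightarrow\; -9\,\frac{LK - (L + K)K}{L^2K^2} - 1 = 0 \;\Rightarrow\; L^2K^2 = 9K^2 \;\Rightarrow\; L^2 = 9$$
$$\frac{\partial\pi}{\partial K} = 0 \;\Rightarrow\; -9\,\frac{LK - (L + K)L}{L^2K^2} - 4 = 0 \;\Rightarrow\; 9L^2 = 4L^2K^2 \;\Rightarrow\; K^2 = \frac{9}{4} \;\Rightarrow\; K = \frac{3}{2}$$
Output X, when L = 3 and K = 3/2:
$$X = 12 - \frac{3 + 3/2}{9/2} = 12 - \frac{9}{2}\cdot\frac{2}{9}$$
= 11 units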
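The input choice in this example can be cross-checked numerically. The sketch assumes the production function X = 12 – (L + K)/(LK) with output price 9, wage 1 and rental rate 4, and compares profit at (L, K) = (3, 1.5) with profit at a few nearby input bundles; the perturbation size is an arbitrary choice.

```python
def output(labour, capital):
    return 12 - (labour + capital) / (labour * capital)


def profit(labour, capital, price=9, wage=1, rental=4):
    return price * output(labour, capital) - wage * labour - rental * capital


best_l, best_k = 3.0, 1.5
best_profit = profit(best_l, best_k)

# Profit at small perturbations of each input around the candidate optimum.
step = 0.05
neighbours = [
    profit(best_l + dl, best_k + dk)
    for dl in (-step, 0.0, step)
    for dk in (-step, 0.0, step)
    if (dl, dk) != (0.0, 0.0)
]

print(output(best_l, best_k), best_profit, max(neighbours))
```

Every perturbed bundle yields a lower profit than (3, 1.5), which is what the first order conditions L^2 = 9 and K^2 = 9/4 lead one to expect.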
Exercises
2. A monopolist sells two products x and y for which the demands are: x = 50 –
0.5px and y = 76 – py
(i) The profit maximizing levels of output and price for each product.
4. (a) A discriminating monopolist can separate his consumers into two distinct
markets with the following demand functions:
Market I: Q1=16-0.2P1
Assume that the monopolist’s total cost function takes the form TC+20Q-20=0,
Where Q (=Q1+Q2) is the total output. Obtain the total profit function and
determine the prices he would charge in the two markets to maximize profits.
What is the total profit? Do you agree that the price charged in the market with
a higher elasticity of demand would be higher? Show by calculations.
6. the production function of a firm is given by q=12 and the prices (in
Rs) of q,l and K are 9,2 and 4 respectively
References
TABLE OF CONTENTS
Section No. and Heading
Learning Objectives
1 Sample Statistic
1.1 Sampling With and Without Replacement
1.2 Sample Statistic and Sampling Distribution
2 Sampling Techniques
2.1 Simple Random Sampling
2.2 Stratified Random Sampling
2.3 Two-Stage or Multi-Stage Random Sampling
2.4 Systematic Random Sampling
2.5 Purposive Sampling
2.6 Cluster Sampling
3 Sampling and Non-sampling Errors
4 Deriving a Sampling Distribution
5 Analytical Methods for Deriving a Sampling Distribution
5.1 Using Probability Rules
5.2 Simulation Experiments
6 Distribution of the Sample Mean
7 Distribution of Sample Means when Population is Normally Distributed
8 Central Limit Theorem
9 Distribution of Sample Means when the Population is Non-Normal
10 Distribution of the Sum and Difference of Sample Means
Practice Questions
References:
1. Jay L. Devore, Probability and Statistics for Engineering and the Sciences,
8th edition, Cengage Learning
2. Irwin Miller and Marylees Miller, Mathematical Statistics, Seventh Edition,
Pearson.
Learning Objectives:
In this chapter you will learn what is a sample statistic and its sampling
distribution. You will learn how to derive the probability distribution of a sample
statistic and the three alternative methods that can be used for this purpose. The
first method is based on selecting samples from the population. You will learn
about different methods for selecting a representative sample and the difference
between sampling and non-sampling errors. You will study in depth about the
probability distribution of sample means. You will learn about the significance of
the Central Limit Theorem in this context. The chapter ends with the study of
distribution of combinations of two sample means. The chapter is followed by
practice questions so that you can test your understanding of the chapter
contents.
Chapter Outline
1. Sample Statistic
2. Sampling Techniques
3. Sampling and Non-sampling Errors
4. Deriving a Sampling Distribution
5. Analytical Methods for Deriving a Sampling Distribution
6. Distribution of the Sample Mean
7. Distribution of Sample Means when Population is Normally Distributed
8. Central Limit Theorem
9. Distribution of Sample Means when the Population is Non-Normal
10. Distribution of the Sum and Difference of Sample Means
1 SAMPLE STATISTIC
Measures which describe some characteristics of the population are known as
parameters. Examples of population parameters are population mean $\mu$,
population standard deviation $\sigma$, population ratio $p$, population median
$\tilde{\mu}$, etc. These are constants for a population and remain unknown in
the absence of complete population census data.
From a population of size N, the first unit can be drawn in N ways, the second
unit in N-1 ways and so on, when sampling is without replacement. Since the
order in which the sample units are selected is not relevant, the total number of
possible samples of size n from a population of size N is
$\binom{N}{n} = \dfrac{N!}{n!\,(N-n)!}$
When the population from which the sample is drawn is very large in relation to
sample size, i.e., when n < 0.05N, then for all practical purposes we can consider
the population to be infinitely large. If population is infinitely large then number
of samples that can be drawn from the population is also infinitely large,
irrespective of whether sampling is with or without replacement. It is only when
population is finite, sample size n > 0.05N, and sampling is without replacement
that the total number of samples will be $\binom{N}{n}$ and probability of
selection of any one of the equally likely samples is $1 \big/ \binom{N}{n}$.
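The counting rule above can be checked numerically. The following is a minimal sketch, not part of the original notes; the function name `count_samples` and the values N = 10, n = 3 are hypothetical and chosen only for illustration.

```python
from math import comb

def count_samples(N: int, n: int) -> int:
    """Number of distinct unordered samples of size n drawn
    without replacement from a finite population of size N."""
    return comb(N, n)

# Hypothetical population and sample sizes (n > 0.05N holds here).
N, n = 10, 3
total = count_samples(N, n)
prob_each = 1 / total  # every sample is equally likely

print(f"possible samples: {total}, probability of each: {prob_each:.5f}")
```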
Several different functions of sample values could be used to obtain the estimate
of the parameter value. Examples of estimators for population mean μ are the
sample mean $\bar{X}$, the trimmed mean $\bar{X}_{tr}$, the median $\tilde{X}$,
or some weighted average of sample values.
Consider selecting two different samples of the same size n (x1, x2,……xn) from
the same population. It is very unlikely that all the sample values of the first
sample will be repeated in the second sample. Since a sample is only a small
subset of the population and a large number of samples of the same size can be
drawn from the same population, the sample values will be likely to differ from
one sample to the other sample.
Before we obtain sample data, there is uncertainty about the value of each Xi,
since each sample element can be any one of the population units. Because of
this uncertainty, each observation is a random variable Xi before the data
becomes available.
Since sample observations are random variables, the value of any function of the
sample observations (e.g., sample mean $\bar{X}$, sample variance $S^2$, etc.) is
also a random variable which varies from sample to sample. There is uncertainty
about the value of $\bar{X}$, the value of $S^2$, and so on prior to obtaining the
sample observations. The value of the sample statistic will depend on which sample
was selected and the parameter estimate would differ accordingly.
2 SAMPLING TECHNIQUES
Samples selected from a population must be representative of the population. If a
sample is unrepresentative of the population and the sample statistic is used to
estimate the corresponding parameter value, then this will result in an inaccurate
estimate. If the selected sample contains a disproportionately large number of
units from one end of the population distribution then the sample statistic will
provide an underestimate or overestimate of the parameter value. For example, if
the sample observations are atypically larger than most of the population values,
then the sample mean will be an overestimate of the population mean.
The technique used in the collection of sample data should be such that it
minimizes the possibility of such errors. There exist a number of alternative
Simple random sampling ensures that each unit of the population gets an equal
chance of being selected in the sample. Since several different samples can be
selected from any population, this method ensures that each sample of the same
size has the same probability of being selected. This method is useful for
homogeneous populations where there are no extreme values. In case of
homogeneous populations atypical observations are unlikely in the selected
sample and the estimate is unlikely to be biased.
The random variables X1, X2,…….Xn are said to form a simple random sample of
size n if the following two conditions are satisfied:
1. The Xi’s are independent random variables
2. Every Xi has the same probability distribution.
In a simple random sample (SRS), each unit of the sample is then said to be
independently and identically distributed (iid).
as that found in the population. The proportions in the sample from each
subgroup conform to the proportions in the population. However, more
information is required about the population in this method than in SRS.
This method is often used in estimating the timber available in a forest. A tree is
selected at random and then a direction is selected at random. Every i th tree in
the selected direction, starting from the first tree, is then examined.
We see from the listed sampling techniques that there is no unique method for
obtaining a representative sample from a given population. Other sampling
methods exist which combine features of more than one of the above methods.
One such example is Stratified Cluster Sampling. The method adopted when
selecting a sample will depend on the nature of the population, purpose of study,
along with time and expenditure constraints.
It is a matter of pure chance which sample is selected. Hence sampling errors are
due to chance factors. Sampling errors are observed only in a sample survey. They
are completely absent in the census method. Factors which contribute to sampling
errors are:
1. Heterogeneity or variability of the population.
2. Bias in the estimation method if incorrect formula used for the statistic
3. Sometimes, in a properly selected sample, some of the sample units
cannot be observed and these are substituted by other units and the
The total number of equally likely samples is $3^2 = 9$. Following is the list of
samples with their respective means and variances, where
$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$ and $s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$

(x1,x2):    (2,2)  (2,6)  (2,10)  (6,2)  (6,6)  (6,10)  (10,2)  (10,6)  (10,10)
$\bar{x}$:    2      4      6       4      6      8       6       8       10
$s^2$:        0      8      32      8      0      8       32      8       0

Sampling distribution of $\bar{x}$:
$\bar{x}$:       2     4     6     8     10
$P(\bar{x})$:    1/9   2/9   3/9   2/9   1/9

$\mathrm{Var}(\bar{x}) = 4\cdot\frac{1}{9} + 16\cdot\frac{2}{9} + 36\cdot\frac{3}{9} + 64\cdot\frac{2}{9} + 100\cdot\frac{1}{9} - 6^2 = \frac{372}{9} - 36 = 5.33 = \frac{10.667}{2}$
We can similarly derive the distribution of sample variances and obtain the mean
and variance of the sampling distribution.
Sampling distribution of $s^2$:
$s^2$:       0     8     32
$P(s^2)$:    3/9   4/9   2/9

$V(s^2) = 0\cdot\frac{3}{9} + 64\cdot\frac{4}{9} + 1024\cdot\frac{2}{9} - \left(\frac{96}{9}\right)^2 = \frac{2304}{9} - \frac{9216}{81} = 142.22$
If the form of the population distribution is known then the probability of selection
of a sample unit will be the probability of its occurrence in the population. Using
this information, the probability of selection of a particular sample is the joint
probability of all the sample units. If the sample units are independent, this joint
probability is the product of the probabilities of the individual sample units. The
probability associated with a particular sample is also the probability attached to
its mean and its variance: the probability of a sample mean (or of a sample
variance) is the probability of selecting the sample for which it is computed.
The value of mean and variance of each of the 9 samples is equally likely with
probability 1/9. Now we have three samples (2,10), (6,6), and (10,2) which have
same mean value 6. The probability of obtaining a mean value of 6 is the same as
the probability of selecting any one of the three samples: (2,10) or (6,6) or
(10,2). The sum of the probabilities is 3/9, which is the probability of $\bar{x} = 6$.
Thus the sampling distribution of means and variances can be derived from the
probability of selecting a sample that results in specific values of sample mean
and variance.
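The enumeration described above can be reproduced in a few lines. This is a sketch under the assumption that the population consists of the three equally likely values 2, 6 and 10 (inferred from the nine listed samples) and that samples of size 2 are drawn with replacement; the helper names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 6, 10]  # assumed population, inferred from the listed samples
n = 2                    # sample size, drawn with replacement

def sample_mean(s):
    return Fraction(sum(s), len(s))

def sample_var(s):
    m = sample_mean(s)
    return sum((x - m) ** 2 for x in s) / (len(s) - 1)

samples = list(product(population, repeat=n))  # 3^2 = 9 ordered samples
p = Fraction(1, len(samples))                  # each sample has probability 1/9

dist_mean, dist_var = Counter(), Counter()
for s in samples:
    dist_mean[sample_mean(s)] += p  # add up probabilities of samples sharing a mean
    dist_var[sample_var(s)] += p    # same for the sample variance

def moments(dist):
    """Mean and variance of a discrete distribution {value: probability}."""
    mu = sum(v * pr for v, pr in dist.items())
    var = sum(v * v * pr for v, pr in dist.items()) - mu ** 2
    return mu, var

print("x-bar:", dict(dist_mean), moments(dist_mean))
print("s^2  :", dict(dist_var), moments(dist_var))
```

Exact fractions are used so that the probabilities 1/9, 2/9, 3/9 are not blurred by floating-point rounding.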
Then use a computer to obtain k different random samples, each of size n, from
the population distribution.
For each such sample, calculate the value of the statistic. From k replications we
get k samples and k calculated values of the statistic. Now construct a histogram
for the k values. The histogram gives the approximate sampling distribution of
the statistic.
The larger the number of replications (k), the better will be the approximation of
the sampling distribution. In practice, k=500 or 1000 is usually enough. Actual
sampling distribution is obtained when k→∞.
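The simulation procedure can be sketched as follows. The function name `simulate_sampling_distribution`, the choice of population (the values 2, 6, 10 with equal probability), the seed and the text histogram are all assumptions made for illustration, not the notes' own implementation.

```python
import random
from collections import Counter
from statistics import mean

def simulate_sampling_distribution(draw, statistic, n, k, seed=0):
    """Draw k random samples of size n and return the k values of the statistic.

    draw      -- function taking a random.Random and returning one observation
    statistic -- function mapping a list of observations to a number
    """
    rng = random.Random(seed)
    return [statistic([draw(rng) for _ in range(n)]) for _ in range(k)]

# Assumed population: 2, 6 and 10, each with probability 1/3.
values = simulate_sampling_distribution(
    draw=lambda rng: rng.choice([2, 6, 10]),
    statistic=mean,
    n=2,
    k=1000,
)

# Crude text histogram of the k simulated sample means.
counts = Counter(values)
for v in sorted(counts):
    print(f"{v:>4}: {'#' * (counts[v] // 10)}")
```

With a larger k the relative frequencies settle closer to the exact probabilities of the sampling distribution.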
The sample mean X is useful as it can be used to draw conclusions about the
population mean μ. Some of the most frequently used inferential procedures are
Let X1, X2,….Xn be a random sample from a population with mean μ and standard
deviation σ. Since this is a random sample, the Xi’s are independently and
identically distributed.
Since the sample units are drawn at random from the same parent population
distribution, with mean μ and variance σ², any population unit could be selected
as any sample unit. Every sample unit can therefore take any of the population
values with the corresponding probabilities, so it has the same probability
distribution as the population. Hence each observation Xi is independently and
identically distributed with E(Xi) = µ and V(Xi) = σ².
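$\bar{X} = \sum_{i=1}^{n} X_i / n$ is a linear combination of n independent random
variables Xi, each with mean µ and variance σ². Therefore,
$E(\bar{X}) = E\left(\sum_{i=1}^{n} X_i / n\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\cdot n\mu = \mu = \mu_{\bar{X}}$
And,
$V(\bar{X}) = V\left(\sum_{i=1}^{n} X_i / n\right) = \frac{1}{n^2}\sum_{i=1}^{n} V(X_i) = \frac{1}{n^2}\cdot n\sigma^2 = \frac{\sigma^2}{n} = \sigma^2_{\bar{X}}$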
Thus, the mean of sampling distribution is independent of sample size but the
As the sample size increases we obtain more information from the sample and we
can expect the value of sample mean to be closer to the population mean value.
As the sample size increases the sampling distribution becomes narrower and the
sample means become clustered closer to μ. In the limit as sample size increases
indefinitely the sampling distribution collapses to a single point. Each and every
sample mean will be equal to the population mean, i.e.,
$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} \to 0$ as $n \to \infty$.
As long as n > 1, $\sigma_{\bar{X}} < \sigma$. The reason is that for each sample,
the sample mean ($\bar{X}$) averages out the variability of the observations
within the sample.
The sample mean is a central value for each sample. Although the value of the
sample mean is affected by all sample values, by its very nature the mean value
must lie somewhere in the middle of the range of sample values. This is true for
each and every possible sample drawn from the population. Thus the variability in
values of sample means must be less than the variability in the population values
distributed with mean µ and variance $\dfrac{\sigma^2}{n}\cdot\dfrac{N-n}{N-1}$.
However, if the sample size is very small relative to population size,
$\dfrac{N-n}{N-1} \to 1$ and $\sigma^2_{\bar{x}} \to \dfrac{\sigma^2}{n}$ when
n < 0.05N.
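$P(\bar{X} < 10) = P\left(Z < \dfrac{10 - 12.2}{1.2}\right) = P(Z < -1.83) = 0.0336$
If sampling is
(a) with replacement then $\sigma_{\bar{X}} = \dfrac{3}{\sqrt{25}} = 0.6$ and
(b) without replacement then $\sigma_{\bar{X}} = \dfrac{3}{\sqrt{25}}\sqrt{\dfrac{3000 - 25}{3000 - 1}} \approx 0.6$
[We obtain similar results for (a) and (b) since n = 0.0083N]
$P(\bar{X} > 69) = 1 - P(\bar{X} < 69) = 1 - P\left(Z < \dfrac{69 - 68}{0.6}\right) = 1 - \Phi(1.67)$
= 1 - 0.9525 = 0.0475
Number of samples with $\bar{X}$ > 69 = (80)(0.0475) = 3.8 ≈ 4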
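The calculation above can be checked with a short script. This is a sketch only: the helper name `se_mean` is hypothetical, and the values σ = 3, n = 25, N = 3000, μ = 68 and 80 samples are read off the worked solution. Because the script does not round z to two decimals, its probability can differ from the table-based figure in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

def se_mean(sigma, n, N=None):
    """Standard error of the sample mean.

    If a finite population size N is given, the finite population
    correction sqrt((N - n) / (N - 1)) is applied (sampling without replacement).
    """
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return se

se_with = se_mean(3, 25)             # (a) with replacement
se_without = se_mean(3, 25, N=3000)  # (b) without replacement

# P(sample mean > 69) when the sample mean is N(68, se_with^2)
p = 1 - NormalDist(mu=68, sigma=se_with).cdf(69)
expected_count = 80 * p

print(f"SE (a) = {se_with:.4f}, SE (b) = {se_without:.4f}")
print(f"P(mean > 69) = {p:.4f}, expected samples out of 80 = {expected_count:.1f}")
```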
The central limit theorem states that when an infinite number of successive
random samples of the same size, n, are taken from a given population with
mean μ and variance σ², the sample means $\bar{X}$ will be approximately
normally distributed with mean μ and standard deviation $\sigma/\sqrt{n}$,
provided n is sufficiently large, irrespective of the shape of the population
distribution.
The larger the value of n, better is the approximation. Even when the population
distribution is highly non-normal, averaging of sample values while computing
$\bar{X}$ produces a distribution more bell-shaped than the population itself. If
n is large, a suitable normal curve will approximate the actual distribution of
$\bar{X}$. That is why sampling distribution of $\bar{X}$ is said to be
asymptotically normal. This is illustrated in figure 2.
The red curve in figure 2 is the positively skewed population distribution. The
green and blue curves depict two distributions of sample means for different
sample sizes where n1 < n2. The distribution of $\bar{X}$ with sample size n1 is
less skewed than the distribution of the rv X. As sample size is increased suitably
to n2, the distribution of $\bar{X}$ is approximately normal.
How large the sample size must be depends on how far the shape of the population
being sampled departs from a normal distribution. In many cases the sampling
distribution quickly approaches a normal distribution; for example, when the
population has a uniform distribution a sample size of 12 is sufficient. In
some other cases a sample size of 60 or more may be required. There is no hard
and fast rule about the sample size required for the sampling distribution of
means to be normally distributed. In practice, quite satisfactory approximations
can be obtained for n > 30, provided N > 2n where N is population size.
CLT plays an important role in estimation and tests of hypotheses about the mean
as the probability distribution of the population being sampled is often not known.
The central limit theorem enables us to use the normal distribution as an
approximation of the distribution of sample means.
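The effect described by the central limit theorem can be illustrated by simulation. The sketch below assumes a positively skewed exponential population with mean 1 (an arbitrary choice, not taken from the notes) and compares the skewness of the simulated sample means for a small and a larger sample size; all names and the seed are illustrative.

```python
import random
from statistics import mean, pstdev

def skewness(xs):
    """Moment coefficient of skewness (0 for a symmetric distribution)."""
    m, s = mean(xs), pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

def sample_means(n, k, rng):
    """Means of k random samples of size n from an exponential(1) population."""
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(k)]

rng = random.Random(42)
small = sample_means(n=2, k=5000, rng=rng)   # n1: still clearly skewed
large = sample_means(n=40, k=5000, rng=rng)  # n2: close to bell-shaped

print(f"n = 2 : skewness = {skewness(small):.2f}, sd = {pstdev(small):.3f}")
print(f"n = 40: skewness = {skewness(large):.2f}, sd = {pstdev(large):.3f}")
```

The standard deviation of the simulated means should also come out near σ/√n, which is 1/√40 for the larger sample size.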
$\sqrt{\dfrac{(\ldots)^2}{12}} = 6.928$. Based on the CLT, $\bar{X}$ will be approximately
normal with $E(\bar{X}) = 36$ and $\sigma_{\bar{X}} = 0.49$.
Let $\bar{X}_2$ be the mean of a random sample of size n2 drawn from a population with
n1 n2 n1 n2
sample sizes.
If populations are not known to be normal and sample sizes are sufficiently large
(n1 > 30, n2 > 30) then by CLT the distribution of ($\bar{X}_1 + \bar{X}_2$) will be approximately
If population variances $\sigma_1^2$ and $\sigma_2^2$ are not known and are estimated by sample
variances $S_1^2$ and $S_2^2$ respectively, then by CLT, if n1 > 40, n2 > 40, the
standard deviation $\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}$
If the populations are finite of size N1 and N2, and the samples from the two
populations are drawn without replacement, then the finite population correction
factor must be applied to the variance. If the population variances are known and
If the populations are not normal or population variances are unknown and
estimated from sample data, then the sample sizes must be large enough.
If both random samples are independent and from the same population so that
μ1 = μ2 = μ and $\sigma_1^2 = \sigma_2^2 = \sigma^2$, then $E(\bar{X}_1 + \bar{X}_2) = 2\mu$ and $E(\bar{X}_1 - \bar{X}_2) = 0$. The standard
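$\sigma^2_{\bar{X}} = \dfrac{1.8^2}{100} = 0.0324$ and $\sigma^2_{\bar{Y}} = \dfrac{2^2}{100} = 0.04$, so that $\sigma_{\bar{X} - \bar{Y}} = \sqrt{0.0724} = 0.269$
Since sample sizes are large, by CLT, the distribution of ($\bar{X} - \bar{Y}$) is
approximately normal with mean $\mu_{\bar{X} - \bar{Y}} = \mu_X - \mu_Y = 0.4$ hours and
$= 1 - P\left(Z > \dfrac{0 - 0.4}{0.269}\right) = 1 - \Phi(1.487) = 1 - 0.9319 = 0.0681$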
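The arithmetic of this example can be verified with a short script. It is a sketch under stated assumptions: only the difference in means (0.4 hours), the standard deviations (1.8 and 2) and the sample sizes (100 each) appear in the solution, so the individual means passed below (0.4 and 0.0) are hypothetical placeholders that merely reproduce that difference. The event evaluated, a negative difference in sample means, is inferred from the final answer.

```python
from math import sqrt
from statistics import NormalDist

def diff_of_means(mu_x, sd_x, n_x, mu_y, sd_y, n_y):
    """Approximate (CLT) distribution of X-bar minus Y-bar
    for two independent large samples."""
    mean_diff = mu_x - mu_y
    se_diff = sqrt(sd_x ** 2 / n_x + sd_y ** 2 / n_y)
    return NormalDist(mean_diff, se_diff)

# Hypothetical individual means; only their difference (0.4) is given.
d = diff_of_means(mu_x=0.4, sd_x=1.8, n_x=100, mu_y=0.0, sd_y=2.0, n_y=100)

p_negative = d.cdf(0)  # P(X-bar - Y-bar < 0)
print(f"SE of difference = {d.stdev:.3f}, P(difference < 0) = {p_negative:.4f}")
```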
References:
1. Jay L. Devore, Probability and Statistics for Engineering and the Sciences,
8th edition, Cengage Learning
2. Irwin Miller and Marylees Miller, Mathematical Statistics, Seventh Edition,
Pearson.
3. A.L.Nagar and R.K.Das, Basic Statistics, Second Edition, Oxford University
Press
PRACTICE QUESTIONS
4. What properties of the SRS help in deriving the sampling distribution of means?
OR
Why are random samples commonly used to obtain estimates of unknown
parameters?
10. A random sample of size 81 is taken from an infinite population with the
mean μ = 128 and the standard deviation σ = 6.3. With what probability
can we assert that the value we obtain for the sample mean $\bar{X}$ will not fall
between 126.6 and 129.4?
11. The mean production level at a firm is assumed to be 47.3 units per day
with a standard deviation of 12.7. The manager takes a sample of output
for 25 days. If the sample mean exceeds 49 units then the workers are
promised a Diwali bonus. How likely are the employees to get the bonus?
What assumption did you make, if any?
12. Independent random samples of size 400 are taken from each of two
populations having equal means and std deviations σ1 = 20 and σ2 = 30.
What can we assert with a probability of 0.99 about the value of the
difference in sample means?
14. Given that test scores in an entrance examination are normally distributed
with mean of 30 and a std. dev. of 6
(i) What is the probability that a single score drawn at random will be
greater than 34?
(ii) What is the probability that a sample of 9 scores will have a mean
greater than 34?
(iii) Explain the difference in the results obtained in parts (i) and (ii)
Table of Contents
Learning Objectives
6. The F Distribution
Practice Questions
Reference:
Jay L. Devore, Probability and Statistics for Engineering and the Sciences, 8th
edition, Cengage Learning
Learning Objectives:
The chapter begins with a brief introduction to the uses of sampling distributions
in problems of statistical inference. Some special distributions you will need for
statistical inferences will then be discussed. You will also be taught how to use
the tables given for each of the distributions. The chapter is followed by practice
questions so that you can test your understanding of the distributions discussed
in the chapter.
Chapter Outline
When estimating the parameter value θ we may use the value of the statistic as a
point estimate. Or, in case of interval estimation, we can give a range of values of
the statistic within which θ is expected to lie with a specified probability.
1.1 ESTIMATION
Let the unknown parameter be the population mean µ. If the population is
normally distributed then there are several different possible estimators, e.g.,
$\bar{X}$, $\tilde{X}$, $\bar{X}_{tr}$, semi-interquartile range etc. Knowledge of
sampling distribution of alternative estimators helps us choose the best estimator.
If population is normally distributed, then $\tilde{X}$, $\bar{X}_{tr}$, will also
be unbiased estimators. With more than one unbiased estimator we need additional
criterion to select the best estimator from among the unbiased estimators. We
compare the variance of the sampling distributions of the different unbiased
estimators and choose the one with the minimum variance. It can be shown that
$\bar{X}$ is the minimum variance unbiased estimator (MVUE) of µ.
The precision of the point estimate is measured by the standard deviation of the
sampling distribution of the estimator, referred to also as the standard error of
the estimator.
The point estimate does not indicate how close it might be to the true parameter
value. An alternative to point estimation is interval estimation. A confidence
interval is always calculated by first selecting a confidence level, which is a
measure of the degree of reliability of the interval. Most frequently used
confidence levels are 95%, 99% and 90%.
A confidence level of 95% implies that 95% of all samples would give an interval
that includes the parameter value and only 5% of all samples would yield a wrong
interval. The higher the confidence level, the more strongly we believe that the
value of the parameter being estimated lies within the interval. Precision of the
interval estimate is indicated by the width of the interval. The smaller the width,
the greater is the precision. A narrow width combined with a high confidence level
means that the estimate of the parameter value is reasonably precise.
A test of hypothesis is a method for using sample data to decide whether the null
hypothesis should be rejected or not. We select a test statistic on which the
decision to reject or not reject Ho is to be based. Then we set up the rejection
region or critical region, which is the set of all test statistic values for which the
null hypothesis will be rejected.
The normal approximation does not apply if σ is unknown and sample size is small.
In this case, provided the population is normally distributed, the standardized
sample mean follows the t-distribution.
Some of the other statistics of interest may not have normal distributions. The
sample proportion will have a normal distribution for large sample sizes but a
binomial distribution for small samples. Variance and standard deviation will be
described by the chi-square distribution. The F-distribution is useful in making
inferences about variances of two populations. You will learn about the different
distributions in this chapter. You will also learn how to use the tables given for
each of the distributions.
Table A.3 in Devore gives the standard normal curve areas for z values up to the
second decimal place. The table gives cumulative probabilities for z values in the
range of -3.4 to 3.4, denoted by $\Phi(z)$. Thus, $\Phi(z) = \int_{-\infty}^{z} f(u)\,du$,
shown as shaded area in figure 1. If z < 0, $\Phi(z)$ < 0.5000; if z > 0, $\Phi(z)$ > 0.5000.
If z1 < z2, P(z < z1) = $\Phi(z_1)$; P(z2 < z) = 1 - $\Phi(z_2)$; P(z1 < z < z2) = $\Phi(z_2) - \Phi(z_1)$.
[Note that P(z1 < z < z2) = P(z1 ≤ z ≤ z2) since z is a continuous variable.] So, if
x1 < x2, $P(x_1 < X < x_2) = P\left(\dfrac{x_1 - \mu}{\sigma} < Z < \dfrac{x_2 - \mu}{\sigma}\right)$ = P(z1 < z < z2) = $\Phi(z_2) - \Phi(z_1)$.
If x1<μ, then z1< 0. If both x1and x2 are less than μ, then z1< 0 and z2< 0, but
z1< z2.
Exercise 1 The average monthly production costs for a printing facility is $410,
with a standard deviation of $87. The manager promises the owner
that he will keep costs below $300 the next month. If costs are
normally distributed, what is the probability the manager will keep
his promise?
Solution Let X = monthly production costs ($) where X ~ N(410, 87²).
$P(X < 300) = P\left(Z < \dfrac{300 - 410}{87}\right) = P(Z < -1.26) = 0.1038$
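The table lookup in this solution can be cross-checked with Python's standard library. This is a minimal sketch; the variable names are illustrative. The computed probability uses the unrounded z value, so it can differ from the tabulated 0.1038 in the third decimal place.

```python
from statistics import NormalDist

costs = NormalDist(mu=410, sigma=87)   # X ~ N(410, 87^2)

z = (300 - costs.mean) / costs.stdev   # standardized value of x = 300
p = costs.cdf(300)                     # P(X < 300), i.e. Phi(z)

print(f"z = {z:.2f}, P(X < 300) = {p:.4f}")
```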
We can calculate the x-value from a known probability by using the z-distribution.
In other words, we can obtain the percentiles of the non-standard normal
distribution, with known parameter values (μ, σ2), from the percentiles of the z-
distribution. Let P(X < x) = α, shown as shaded area in figure 2 (a).
Fig 2(b) Percentiles for areas under the standard normal curve
From the body of the table of $\Phi(z)$ we can locate α (See Devore table A.3). The
intersection of the row and column of the tabulated z-distribution at α gives us
the corresponding z value for which P(Z < z) = α, the shaded area in figure 2(b).
Substituting this z-value in the formula $Z = \dfrac{X - \mu}{\sigma}$, we obtain the 100αth
percentile of X as x = μ + zσ. If α < 0.5, then z < 0 so that x < μ. Similarly, if
α > 0.5, then z > 0 and x > μ.
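The percentile rule x = μ + zσ can be sketched in code as follows. The function name `percentile` is hypothetical, and the parameter values μ = 410, σ = 87 are simply reused from Exercise 1 for illustration.

```python
from statistics import NormalDist

def percentile(mu, sigma, alpha):
    """100*alpha-th percentile of N(mu, sigma^2), computed as x = mu + z*sigma."""
    z = NormalDist().inv_cdf(alpha)  # z with P(Z < z) = alpha
    return mu + z * sigma

# Illustrative parameters borrowed from Exercise 1.
x10 = percentile(410, 87, 0.10)  # alpha < 0.5, so z < 0 and x < mu
x90 = percentile(410, 87, 0.90)  # alpha > 0.5, so z > 0 and x > mu

print(f"10th percentile = {x10:.1f}, 90th percentile = {x90:.1f}")
```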
Exercise 3 In Exercise 2 above, what must be the area of a storage unit for it
to be smaller than 75 percent of all units?
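$P(x_1 \le X \le x_2) = B(x_2; n, p) - B(x_1 - 1; n, p) \approx \Phi\left(\dfrac{x_2 + 0.5 - np}{\sqrt{npq}}\right) - \Phi\left(\dfrac{x_1 - 0.5 - np}{\sqrt{npq}}\right)$
$P(x_1 < X \le x_2) = B(x_2; n, p) - B(x_1; n, p) \approx \Phi\left(\dfrac{x_2 + 0.5 - np}{\sqrt{npq}}\right) - \Phi\left(\dfrac{x_1 + 0.5 - np}{\sqrt{npq}}\right)$
$P(x_1 < X < x_2) = B(x_2 - 1; n, p) - B(x_1; n, p) \approx \Phi\left(\dfrac{x_2 - 0.5 - np}{\sqrt{npq}}\right) - \Phi\left(\dfrac{x_1 + 0.5 - np}{\sqrt{npq}}\right)$
$\Phi\left(\dfrac{115.5 - 90}{\sqrt{49.5}}\right) - \Phi\left(\dfrac{114.5 - 90}{\sqrt{49.5}}\right)$
= $\Phi(3.62) - \Phi(3.48)$ ≈ 1 - 0.9997 = 0.0003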
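The continuity-corrected approximation can be compared with the exact binomial probability. The sketch below assumes n = 200 and p = 0.45, values inferred from np = 90 and npq = 49.5 in the calculation above rather than stated in the text; the function names are illustrative. Because software does not round Φ to four decimals, the computed approximation can be somewhat smaller than the table-based figure.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(x, n, p):
    """Exact binomial probability P(X = x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def normal_approx(x1, x2, n, p):
    """Normal approximation to P(x1 <= X <= x2) with continuity correction."""
    mu, sd = n * p, sqrt(n * p * (1 - p))
    Z = NormalDist()
    return Z.cdf((x2 + 0.5 - mu) / sd) - Z.cdf((x1 - 0.5 - mu) / sd)

n, p = 200, 0.45                         # assumed: gives np = 90, npq = 49.5
exact = binom_pmf(115, n, p)             # P(X = 115)
approx = normal_approx(115, 115, n, p)   # Phi(25.5/sd) - Phi(24.5/sd)

print(f"exact = {exact:.6f}, normal approximation = {approx:.6f}")
```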
$Z = \dfrac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$ will have a standard normal distribution with mean 0 and variance 1
particular sample got selected and the computed value $\bar{X}$. With an unknown σ,
the variability of the standardized variable now arises from both the numerator
and the denominator. Thus
$\mathrm{Var}\left(T = \dfrac{\bar{X} - \mu_{\bar{X}}}{S/\sqrt{n}}\right) > \mathrm{Var}\left(Z = \dfrac{\bar{X} - \mu_{\bar{X}}}{\sigma/\sqrt{n}}\right)$
and we have a family of probability distributions called t distributions.
If sample size is large (n > 40) then the variability is reduced since standard
error of $\bar{X}$ is inversely related to the sample size. Then, the distribution of the
standardized variable can, with the help of CLT, be approximated by the standard
normal distribution, Z.
However for small sample sizes, when $\bar{X}$ is the mean of a random sample of size
n from a normally distributed population with mean μ, the random variable
$T = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$ has a probability distribution called a t distribution with (n-1)
degrees of freedom.
Fig 3 The standard normal distribution (z) and the t distribution with df
Like the Z-distribution, the t-distribution has a mean of zero, is symmetric about
the mean and ranges from -∞ to +∞. Whereas variance of Z-distribution σ² = 1,
the variance of t-distribution $\sigma^2 = \dfrac{n-1}{n-3} = \dfrac{\nu}{\nu - 2} > 1$. Therefore it is flatter and more
spread out than the Z-distribution. As n increases the variance of t-distribution
decreases. As n → ∞, ν → ∞, and σ² → 1, and the t-distribution approaches the
Z-distribution. Hence the z-distribution is the limit of the t-distribution.
If the sample is small, population σ is unknown and population is not normal then
the t distribution cannot be used for making inferences about population mean.
The only way out is to increase the sample size by taking sufficiently large
samples to generate an approximately normal sampling distribution. This would
enable the use of the z statistic for making inferences.
Since there are an infinite number of t curves for infinite number of df, it would
require a computer programme to find the required probability under a specific
curve. Table A.5 in Devore gives the t values corresponding to areas under the
t-curve in the upper tail of the distribution for selected probabilities, in
combination with df ν from 1 to 30 and then ν = 32, 34, 36, 40, 50, 60, 120, ∞.
Each row of the table corresponds to values of the t statistic for the given df ν
and specified upper tail probabilities. Thus each row is for a different member of
the family of t distributions. Since the distribution is symmetric and centered at
zero, the t values will have a negative sign for same areas in the lower tail of
the distribution for specified df.
The symbol $t_{\alpha,\nu}$ will denote the t-value for which the area under the t-curve to
the right of $t_{\alpha,\nu}$ is α, where degrees of freedom is ν, as illustrated in figure 5.
$\dfrac{x - 200}{12/\sqrt{9}} = -1.397 \;\Rightarrow\; x = 200 - (1.397)(4) = 194.412$ ml
Table A.8 in Devore gives t-curve tail areas for combinations of t-values and df.
Since t is a continuous random variable, there would be an infinite number of
possible t values. The table is restricted to positive values of t from 0.0 to 4.0,
where the t-values are given till the first decimal place. The property of symmetry
allows us to obtain the probabilities for t = 0.0 to t = -4.0. This is explained in the
following figure
Solution $t = \dfrac{47 - 42}{7/\sqrt{25}} = 3.571 \approx 3.6$
$P(\bar{x} > 47) = P(t > 3.6)$ where ν = 24
From Table A.8, $P(t_{24} > 3.6) = 0.001$
5 CHI-SQUARED ($\chi^2$) DISTRIBUTION
other two distributions, the $\chi^2$ random variable can only take non-negative
values. Like the non-standard normal distribution and the t distribution, the $\chi^2$
distribution is an entire family of distributions.
The parameter value which distinguishes one member of the family from another
is the number of degrees of freedom (df) of that particular distribution. Just like
different combinations of values of μ and σ distinguishes one normal distribution
from another, the degrees of freedom distinguishes one t distribution from
The possible values of ν are 1, 2, 3,….. and will depend on the number of
skewed to the right. Smaller the df greater is the extent of skewness. As the
value of ν increases, the distribution becomes progressively less skewed.
For large ν (ν > 40), the $\chi^2$ distribution approaches a normal distribution. So,
as ν → ∞, $\chi^2 \to N(\nu, 2\nu)$
If $X_1, X_2, X_3, \ldots, X_\nu$ are independent random variables, each having a standard
normal distribution, then the sum of the squares of these variables, $\chi^2 = \sum_{i=1}^{\nu} X_i^2$,
under the density curve $\chi^2_\nu$ for different intervals of the $\chi^2$ value on the
measurement axis for specified df ν. Table A.7 in Devore gives values of the
statistic $\chi^2$ for ν = 1, 2, 3, ……..40, for select areas in the upper tail of the
distribution. Each row of the table corresponds to values of the $\chi^2$ statistic for the
given df and specified upper tail probabilities. So that, each row is for a different
The notation $\chi^2_{\alpha,\nu}$ indicates the value of $\chi^2$ when df = ν and area in the upper
Fig 8(a)-(c) Tail areas of the $\chi^2_\nu$ distribution relative to $\chi^2_{\alpha,\nu}$
Exercise 10 What is the probability that $\chi^2_{15}$ < 5.229 or $\chi^2_{15}$ > 30.577?
= P(9.591 < $\chi^2_{20}$ < 34.170)
If a random sample X1, X2,… ,Xn is from a normal population then the sampling
distribution of the random variable
$\dfrac{(n-1)S^2}{\sigma^2} = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma^2}$
has a $\chi^2$ distribution
with ν = n-1 df. This is why the $\chi^2$ distribution is used for inferences about
6 THE F DISTRIBUTION
The F distribution is useful in inferences about variances of two normal
populations. The F distribution is obtained as a ratio. If X1 and X2 are two
independent random variables having chi-squared distributions with $\nu_1$ and
$\nu_2$ degrees of freedom respectively, then the ratio $F = \dfrac{X_1/\nu_1}{X_2/\nu_2}$ is a random
variable having an F distribution, with $\nu_1$ and $\nu_2$ degrees of freedom.
Since the numerator is the ratio of a $\chi^2$ variable and its df $\nu_1$, and both are
Since the two random variables X1 and X2 are independent, the two $\chi^2$
distributions are also independent.
Just like the $\chi^2$ variable has a positively skewed distribution for ν < 40, the F
distribution is also skewed to the right. Figure 9 shows the F density curve with
$\nu_1$ and $\nu_2$ degrees of freedom.
The notation $F_{\alpha,\nu_1,\nu_2}$ is used to denote the F-value on the measurement axis, for
the $F_{\nu_1,\nu_2}$ density curve with $\nu_1$ and $\nu_2$ df, such that the area in the upper tail of
the distribution is α, that has a lower bound of $F_{\alpha,\nu_1,\nu_2}$. This is indicated by the
shaded area in figure 9.
Table A.9 in Devore gives values of the statistic $F_{\nu_1,\nu_2}$ for select values of
$\nu_1$ (numerator df) and $\nu_2$ (denominator df) and four α-values: 0.100, 0.050,
0.010 and 0.001. For example, $F_{0.1,4,3}$ = 5.34. This means the area to the right of
5.34 is 0.1 or 10% when numerator df ($\nu_1$) is 4 and denominator df ($\nu_2$) is 3.
Since the F-table gives $f_{\nu_1,\nu_2}$ values for α in the upper tail of the distribution and
(1-α) in the lower tail, greater the $f_{\nu_1,\nu_2}$ value for given df ($\nu_1$ and $\nu_2$), smaller
will be α and larger will be (1-α). Similarly, smaller (1-α) and higher α will result
There are many instances when we require the $F_{\alpha,\nu_1,\nu_2}$ value on the measurement
axis when α is large, say α = 0.90 so that area in the lower tail of the distribution
is 0.10. This is not available in the given table. Since the F density curve is not
symmetric, it would mean that $F_{\nu_1,\nu_2}$ values would be required to be tabulated for
both upper and lower tails of the distribution. However this is not necessary as we
can use a property of the F distribution to obtain such $F_{\alpha,\nu_1,\nu_2}$ values. The fact is
that we can use the relation: $F_{1-\alpha,\nu_1,\nu_2} = \dfrac{1}{F_{\alpha,\nu_2,\nu_1}}$. Note that on the right hand side
of the equation, in $\dfrac{1}{F_{\alpha,\nu_2,\nu_1}}$, $\nu_2$ is the numerator df and $\nu_1$ is the denominator df.
For example, if α = 0.1 then (1-α) = 0.9. Given $\nu_1$ = 4 and $\nu_2$ = 3, then
$F_{0.9,4,3} = \dfrac{1}{F_{0.1,3,4}} = \dfrac{1}{4.19} = 0.24$ = 10th percentile of the distribution. This is illustrated
in figure 10.
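The reciprocal relation can be checked by simulation using only the standard library. The sketch below builds F variables from their definition as a ratio of chi-squared variables divided by their df, with each chi-squared variable generated as a Gamma(ν/2, scale 2) draw (a standard identity, not something stated in the notes). The function names, seed and number of replications are illustrative, and the estimates are approximate because they come from random draws.

```python
import random

def chi2(nu, rng):
    """One chi-squared draw with nu df, generated as Gamma(shape=nu/2, scale=2)."""
    return rng.gammavariate(nu / 2, 2)

def f_draws(nu1, nu2, k, rng):
    """k sorted draws of F = (X1/nu1) / (X2/nu2) with X1, X2 independent chi-squared."""
    return sorted((chi2(nu1, rng) / nu1) / (chi2(nu2, rng) / nu2) for _ in range(k))

def quantile(sorted_xs, q):
    """Empirical q-quantile of an already sorted list."""
    return sorted_xs[int(q * (len(sorted_xs) - 1))]

rng = random.Random(1)
k = 50_000
lower = quantile(f_draws(4, 3, k, rng), 0.10)  # estimates F_{0.9,4,3} (10th percentile)
upper = quantile(f_draws(3, 4, k, rng), 0.90)  # estimates F_{0.1,3,4} (upper 10% point)

print(f"F(0.9; 4, 3) ~ {lower:.3f}, 1 / F(0.1; 3, 4) ~ {1 / upper:.3f}")
```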
Solution $P(F_{12,15} > 2.02) = 0.100$, and $P(F_{12,15} > 5.81) = 0.001$. Therefore,
Solution $f_{0.95,5,4} = \dfrac{1}{f_{0.05,4,5}} = \dfrac{1}{5.19} = 0.19$
Practice Questions
(c) Explain the reason for the difference in pairs of results in (a) & (b).
(a) $P(2.19 < F_{10,12} < 7.29)$; (b) $P(3.37 < F_{7,5} < 10.46)$
TABLE OF CONTENTS
Learning Objectives
Practice Questions
POINT ESTIMATION
Learning Objectives
Now we will explain that estimators themselves are random variables. Usually we
describe a sample of size n by the values $x_1, x_2, \ldots, x_n$ of the random variables
$X_1, X_2, \ldots, X_n$. If sampling is with replacement, $X_1, X_2, \ldots, X_n$ would be independent,
identically distributed random variables having probability distribution $f(x)$. Their joint
distribution would then be
$P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = f(x_1)\,f(x_2) \cdots f(x_n)$
Now we can use the sample values $x_1, x_2, \ldots, x_n$ to compute some statistic (mean, variance
etc.) and use this as an estimate of population parameter. Algebraically, a statistic for a
sample of size n can be defined as a function of the random variables $X_1, X_2, \ldots, X_n$, i.e.,
$g(X_1, X_2, \ldots, X_n)$. The function $g(X_1, X_2, \ldots, X_n)$, that is any statistic, is another random
variable, whose values can be represented by $g(x_1, x_2, \ldots, x_n)$. The same holds true if
we have more than one sample. Suppose we take two samples of heights of m male
students and n female students at a particular university. We represent sample values by
$x_1, x_2, \ldots, x_m$ and $y_1, y_2, \ldots, y_n$ respectively. The difference between the two sample mean
heights is $\bar{X} - \bar{Y}$, and is the sensible statistic for estimating $\mu_1 - \mu_2$, the difference between
the two population mean heights. Now the statistic $\bar{X} - \bar{Y}$ is a linear combination of two sets of
random variables $X_1, X_2, \ldots, X_m$ and $Y_1, Y_2, \ldots, Y_n$ and so itself is a random variable.
Since estimators are random variables, one of the key problems of point estimation
is to study their sampling distributions to make a comparison among different estimators.
For instance, when we estimate the variance of a population on the basis of a random
sample, we can hardly expect that the value of $S^2$ we get will actually equal $\sigma^2$, but it would
be reassuring, at least, to know whether we can expect it to be close. Similarly, suppose we
draw a random sample of size n from a normal population with mean value $\mu$. Now the sample
arithmetic mean is a natural statistic for estimating $\mu$. However, the median of the population,
the average of the two extreme observations in the population and the k% trimmed mean are also
equal to $\mu$, since normal distributions are symmetric. So we can consider any of the
following estimators for $\mu$:
(a) Estimator = $\bar{X}$ = Arithmetic Mean
(b) Estimator = $\tilde{X}$ = Median
(c) Estimator = $\bar{X}_e$ = Average of the two extreme observations in the sample
(d) Estimator = $\bar{X}_{tr(k)}$ = k% trimmed mean
Which of these estimates is closest to the true value? We cannot answer this without
knowing the true value of $\mu$ (in which case estimation is unnecessary). Questions that can be
answered are, "which estimator, when used on other samples of $X_i$'s, will tend to produce
estimates closest to the true value, which will expose us to the smallest risk, which will give
us the most information at the lowest cost and so forth?" To decide which estimator is most
appropriate in a given situation, various statistical properties of estimators can be used.
Figure 1. The pdf's of a biased estimator and an unbiased estimator for a parameter $\theta$.
One may feel that it is necessary to know the true parameter value to see whether
an estimator is biased or unbiased. This is not usually the case, because unbiasedness is a
general property of the estimator's sampling distribution (where it is centered), which is
typically not dependent on any particular parameter value. The following examples will
illustrate this:
Proof: $E(\hat{p}) = E\left(\dfrac{X}{n}\right) = \dfrac{1}{n}E(X) = \dfrac{1}{n}(np) = p$
Hence the distribution of the estimator will be centered at the true value p.
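Example 3: Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal population with mean
$\mu$ and variance $\sigma^2$. Show that $\bar{X}$ is an unbiased estimator of $\mu$, while
$\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$ is a biased estimator of $\sigma^2$.
Proof: Since $X_1, X_2, \ldots, X_n$ are random variables having the same distribution as the
population, which has mean µ, we have
$E(X_i) = \mu$ for i = 1, 2, ..., n
We have as required
$E(\bar{X}) = \dfrac{1}{n}\sum_{i=1}^{n} E(X_i) = \dfrac{1}{n}(n\mu) = \mu$
For the variance,
$E\left[\dfrac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2\right] = \dfrac{1}{n} E\left[\sum_{i=1}^{n}(X_i - \bar{X})^2\right]$
Since $\sum (X_i - \bar{X})^2 = \sum (X_i - \mu)^2 - n(\bar{X} - \mu)^2$ and $E[(\bar{X} - \mu)^2] = \sigma^2/n$,
it follows that $E\left[\dfrac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2\right] = \dfrac{1}{n}(n-1)\sigma^2 = \dfrac{n-1}{n}\sigma^2$
which is very nearly $\sigma^2$ only for large values of n (say, n ≥ 30). The desired unbiased estimator
is defined by
$S^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$ so that $E(S^2) = \sigma^2$
It can be noted that we have divided the sum of squared deviations by (n-1) instead of n.
The reason for this is that by definition we should have taken deviations from µ rather than
$\bar{x}$. But we do not know the value of µ, so we have to take deviations from $\bar{x}$. Since the $x_i$'s will
always be closer to $\bar{x}$ than to µ, the sum of squared deviations from $\bar{x}$ underestimates the true
sum of squared deviations.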
Proof:
Denote $\sum_{i=1}^{n}(x_i - c)^2$ by L. Now L will be minimised when its first derivative with respect to c is
zero and its second derivative with respect to c is positive. Differentiating with respect to c we
get
$\dfrac{dL}{dc} = 2\sum_{i=1}^{n}(x_i - c)(-1) = 0$
$c = \dfrac{\sum x_i}{n} = \bar{x}$
$\dfrac{d^2L}{dc^2} = 2n > 0$
In order to make a correction for this underestimation we divide by (n-1) rather than
n.
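The effect of dividing by n rather than (n-1) can also be seen by simulation. The following Python sketch is only an illustration: the normal population with mean 0 and standard deviation 2, the sample size 5, the number of replications and the function name `average_variance_estimates` are assumptions chosen for the demonstration and do not come from the text.

```python
import random


def average_variance_estimates(n=5, sigma=2.0, reps=20000, seed=1):
    """Average the divide-by-n and divide-by-(n-1) variance estimates over many samples."""
    rng = random.Random(seed)
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        # Sum of squared deviations taken from the sample mean, not from mu.
        ss = sum((x - xbar) ** 2 for x in sample)
        total_div_n += ss / n
        total_div_n_minus_1 += ss / (n - 1)
    return total_div_n / reps, total_div_n_minus_1 / reps


if __name__ == "__main__":
    biased, unbiased = average_variance_estimates()
    print("true variance: 4.0")
    print("average estimate with divisor n:    ", round(biased, 3))
    print("average estimate with divisor (n-1):", round(unbiased, 3))
```

Under these assumptions the first average should settle near $\frac{n-1}{n}\sigma^2 = 3.2$ and the second near $\sigma^2 = 4$, which is the underestimation that the (n-1) divisor corrects.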
Now we will discuss two basic difficulties associated with the concept of
unbiasedness. The first difficulty is that unbiasedness may not
be retained under functional transformations, i.e. if $\hat{\theta}$ is an unbiased estimator of $\theta$, it does
not necessarily follow that $g(\hat{\theta})$ is an unbiased estimator of $g(\theta)$. For example, although $S^2$ is
an unbiased estimator of $\sigma^2$, S is not an unbiased estimator of $\sigma$. Taking the square root
destroys the property of unbiasedness. The second difficulty is that
unbiased estimators are not necessarily unique. The following example
will illustrate this:
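So for any fixed $x_i$, $Y_i$ is a random variable having mean value $\beta x_i$. That is, we assume that
the mean value of Y is related to x by a line passing through (0,0) but that the observed
value of Y will typically deviate from this line. Now we can consider any of the following
three estimators of $\beta$:
(1) $\hat{\beta}_1 = \dfrac{\sum Y_i}{\sum x_i}$
(2) $\hat{\beta}_2 = \dfrac{\sum x_i Y_i}{\sum x_i^2}$
(3) $\hat{\beta}_3 = \dfrac{1}{n}\sum \dfrac{Y_i}{x_i}$
(1) $E(\hat{\beta}_1) = E\left(\dfrac{\sum Y_i}{\sum x_i}\right) = \dfrac{\sum E(Y_i)}{\sum x_i} = \dfrac{\sum \beta x_i}{\sum x_i} = \dfrac{\beta \sum x_i}{\sum x_i} = \beta$
(2) $E(\hat{\beta}_2) = \dfrac{\sum x_i E(Y_i)}{\sum x_i^2} = \dfrac{\sum x_i (\beta x_i)}{\sum x_i^2} = \dfrac{\beta \sum x_i^2}{\sum x_i^2} = \beta$
(3) $E(\hat{\beta}_3) = \dfrac{1}{n}\sum \dfrac{E(Y_i)}{x_i} = \dfrac{1}{n}\sum \dfrac{\beta x_i}{x_i} = \dfrac{1}{n}\sum \beta = \dfrac{1}{n}(n\beta) = \beta$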
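A small simulation can make the non-uniqueness of unbiased estimators concrete. The sketch below assumes the line-through-the-origin model $Y_i = \beta x_i + \varepsilon_i$ with normal errors and uses the three estimators $\sum Y_i / \sum x_i$, $\sum x_i Y_i / \sum x_i^2$ and $\frac{1}{n}\sum Y_i / x_i$. The value β = 2, the x values, the error standard deviation and the function names are all hypothetical choices made for illustration.

```python
import random


def slope_estimates(xs, ys):
    """Return three estimates of beta for a line through the origin."""
    n = len(xs)
    b1 = sum(ys) / sum(xs)
    b2 = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    b3 = sum(y / x for x, y in zip(xs, ys)) / n
    return b1, b2, b3


def average_estimates(beta=2.0, sigma=1.0, reps=20000, seed=7):
    """Average each estimator over many simulated samples (assumed normal errors)."""
    rng = random.Random(seed)
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    sums = [0.0, 0.0, 0.0]
    for _ in range(reps):
        ys = [beta * x + rng.gauss(0.0, sigma) for x in xs]
        for i, b in enumerate(slope_estimates(xs, ys)):
            sums[i] += b
    return [s / reps for s in sums]


if __name__ == "__main__":
    for name, avg in zip(("sum Y / sum x", "sum xY / sum x^2", "mean of Y/x"),
                         average_estimates()):
        print(name, round(avg, 3))
```

All three averages should lie close to the assumed β, even though the three estimators generally give different values on any single sample.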
Similarly, if $X_1, X_2, \ldots, X_n$ is a random sample from a normal distribution with mean µ, then
$\bar{X}$, $\tilde{X}$ and the trimmed mean with any percentage are all unbiased estimators of µ.
It can be seen that both $\hat{\theta}_1$ and $\hat{\theta}_2$ are unbiased estimators of $\theta$, as the pdf of each is centered at
$\theta$, but $\hat{\theta}_2$ has more spread as compared to $\hat{\theta}_1$. So we select $\hat{\theta}_1$. $\hat{\theta}_1$ is also called the minimum
variance unbiased estimator (MVUE) of $\theta$ if it has the least variance among all unbiased
estimators of $\theta$.
Example 5: For a normal population, the sampling distributions of the mean and median
both have the same mean, namely, the population mean. So both are unbiased estimators.
However, the variance of the sampling distribution of the mean is equal to $\sigma^2/n$, which is smaller
than the variance of the sampling distribution of the median, approximately $\pi\sigma^2/(2n)$ for large n.
Therefore, the mean provides a more efficient estimate than the median and the
efficiency of the median relative to the mean is approximately
$\dfrac{\sigma^2/n}{\pi\sigma^2/(2n)} = \dfrac{2}{\pi} = 0.64$
or about 64%. It means that the mean requires only 64% as many observations as the median
to estimate $\mu$ with the same reliability.
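The 64% figure can be checked approximately by simulation. The following Python sketch is illustrative only: the standard normal population, the sample size 51, the number of replications and the function name `relative_efficiency` are assumptions, and for any finite sample size the simulated ratio will only be near $2/\pi$, not equal to it.

```python
import math
import random
import statistics


def relative_efficiency(n=51, reps=4000, seed=3):
    """Estimate Var(sample mean) / Var(sample median) for normal samples of size n."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(sum(sample) / n)
        medians.append(statistics.median(sample))
    return statistics.pvariance(means) / statistics.pvariance(medians)


if __name__ == "__main__":
    print("simulated efficiency of median relative to mean:",
          round(relative_efficiency(), 3))
    print("large-sample value 2/pi:", round(2 / math.pi, 4))
```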
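Question 1: Show that $\hat{\sigma}^2 = \dfrac{1}{n}\sum (X_i - \bar{X})^2 = \dfrac{n-1}{n}S^2$ is a biased but more efficient estimator of the
population variance $\sigma^2$ of a normal population than $S^2$.
For a normal population, $Var(S^2) = \dfrac{2\sigma^4}{n-1}$
So $Var(\hat{\sigma}^2) = Var\left(\dfrac{n-1}{n}S^2\right) = \dfrac{2(n-1)\sigma^4}{n^2}$
Hence $MSE(\hat{\sigma}^2) = \dfrac{2(n-1)\sigma^4}{n^2} + \dfrac{\sigma^4}{n^2} = \dfrac{(2n-1)\sigma^4}{n^2}$, which is smaller than $MSE(S^2) = \dfrac{2\sigma^4}{n-1}$.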
that when n is sufficiently large, we can be practically certain that the error made with a
consistent estimator will be less than any small pre-assigned positive constant.
Figure 4: variance → 0 as n → ∞
as n → ∞
1) If we draw a random sample from a normal population, then $\bar{X}$ is the best among the
four estimators ($\bar{X}$, $\tilde{X}$, $\bar{X}_e$ and $\bar{X}_{tr(k)}$), since its variance is least among all unbiased
estimators.
2) If we draw a random sample from a Cauchy distribution,
then $\bar{X}$ and $\bar{X}_e$ are bad estimators for $\mu$, while $\tilde{X}$ is reasonably good. $\bar{X}$ is bad as it is very
sensitive to extreme observations, and due to the heavy tails of the Cauchy distribution it is
very likely that a few such observations appear in any sample.
3) If we draw a random sample from a uniform distribution, then $\bar{X}_e$ is the best estimator.
$\bar{X}_e$ is very sensitive to extreme observations, but such observations are unlikely to
appear in any sample as the uniform distribution does not have any tails.
4) The trimmed mean is not best in any of these three situations. However, it is quite good
in all three. Hence $\bar{X}_{tr(k)}$ with a small trimming percentage is called a robust estimator, i.e.
one that performs reasonably well for a wide variety of population distributions.
So both the distribution of the population and the sampling distribution of the estimator are important
to decide which estimator is best for a given situation.
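The four statements above can be explored with a rough Monte Carlo comparison. The sketch below is only illustrative: the three populations are centred at 0 (standard normal, standard Cauchy, uniform on (-1, 1)), the sample size 25, the 10% trimming, the replication count and all function names are assumptions made for the demonstration.

```python
import math
import random
import statistics


def midrange(sample):
    """Average of the two extreme observations in the sample."""
    return (min(sample) + max(sample)) / 2


def trimmed_mean(sample, pct=10):
    """Mean after dropping pct% of the observations from each end."""
    k = int(len(sample) * pct / 100)
    ordered = sorted(sample)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


ESTIMATORS = {
    "mean": lambda s: sum(s) / len(s),
    "median": statistics.median,
    "midrange": midrange,
    "trimmed": trimmed_mean,
}


def draw(dist, rng):
    """Draw one observation from a population centred at 0."""
    if dist == "normal":
        return rng.gauss(0.0, 1.0)
    if dist == "cauchy":
        return math.tan(math.pi * (rng.random() - 0.5))
    if dist == "uniform":
        return rng.uniform(-1.0, 1.0)
    raise ValueError(dist)


def mse_table(dist, n=25, reps=2000, seed=11):
    """Simulated mean squared error of each estimator around the true centre 0."""
    rng = random.Random(seed)
    squared = {name: 0.0 for name in ESTIMATORS}
    for _ in range(reps):
        sample = [draw(dist, rng) for _ in range(n)]
        for name, estimator in ESTIMATORS.items():
            squared[name] += estimator(sample) ** 2
    return {name: total / reps for name, total in squared.items()}


if __name__ == "__main__":
    for dist in ("normal", "cauchy", "uniform"):
        print(dist, {k: round(v, 4) for k, v in mse_table(dist).items()})
```

In runs of this kind the mean tends to have the smallest error for the normal population, the median does far better than the mean for the Cauchy population, and the midrange does best for the uniform population, while the trimmed mean is never far from the best.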
standard error of the relevant estimator which we can denote by $\sigma_{\hat{\theta}}$. It is the size of
Example 6: Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal population, then the
standard error of $\hat{\mu} = \bar{X}$ is given by $\sigma_{\bar{X}} = \sigma/\sqrt{n}$. If we do not know the value of $\sigma$ then we can
We can also use the standard error of the estimator used to convert point estimate
into interval estimate.
As we have seen in this chapter, there can be many different ways (estimators) of
estimating a parameter of a population. Further different estimators have various desirable
properties to varying degrees. Therefore, it would seem desirable to have some general
methods that yield estimators with reasonable desirable properties. Here we will discuss two
such methods, the method of moments, which is historically one of the oldest methods
and the method of maximum likelihood. Although maximum likelihood estimators are
generally preferable to moment estimators because of certain efficiency properties, they
often require significantly more computation than do moment estimators.
Thus the first population moment is $\mu_1' = E(X) = \mu$, and the first sample moment is $m_1' = \dfrac{1}{n}\sum x_i = \bar{x}$.
The method of moments consists of equating the first few moments of a population
to the corresponding moments of a sample, thus getting as many equations as are needed
to solve for the unknown parameters of the population.
Thus the method of moments consists of solving the system of equations
$m_k' = \mu_k'$, k = 1, 2, ..., p
Hence =
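If both n and p are unknown, then the system of equations we shall have to solve is
$m_1' = \mu_1'$ and $m_2' = \mu_2'$
Since $\mu_1' = np$ and $\mu_2' = npq + n^2p^2$ (with q = 1 - p), we get
$m_1' = np$ and $m_2' = npq + n^2p^2$
and solving these two equations for n and p, we find the estimates of the two parameters of
the binomial distribution. Dividing the second equation by the first,
$\dfrac{m_2'}{m_1'} = q + np = (1 - p) + m_1'$
so that
$\hat{p} = 1 - \dfrac{m_2' - (m_1')^2}{m_1'}$
$\hat{n} = \dfrac{m_1'}{\hat{p}} = \dfrac{(m_1')^2}{m_1' + (m_1')^2 - m_2'}$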
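As a hedged illustration of solving the two moment equations for a binomial population with both n and p unknown, the Python sketch below equates the first two sample moments to $np$ and $npq + n^2p^2$. The function name and the tiny data set are hypothetical; the data were chosen so that the arithmetic is easy to follow by hand.

```python
def binomial_moment_estimates(data):
    """Method-of-moments estimates (n_hat, p_hat) for a binomial population
    when both parameters are unknown."""
    count = len(data)
    m1 = sum(data) / count                  # first sample moment
    m2 = sum(x * x for x in data) / count   # second sample moment
    # From m1 = n p and m2 = n p q + n^2 p^2: (m2 - m1^2) / m1 = q = 1 - p.
    p_hat = 1 - (m2 - m1 ** 2) / m1
    n_hat = m1 / p_hat
    return n_hat, p_hat


if __name__ == "__main__":
    # Hypothetical observations: m1 = 1 and m2 = 1.5.
    print(binomial_moment_estimates([0, 1, 1, 2]))
```

For these values the equations give $\hat{p} = 1 - 0.5/1 = 0.5$ and $\hat{n} = 1/0.5 = 2$. Note that the formulas break down when the sample variance is at least as large as the sample mean, since $\hat{p}$ is then zero or negative.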
Question 5: Given a random sample of size n from a uniform population with =1, use the
method of moments to obtain a formula for estimating the parameter .
Example 8: Suppose Mr X receives five letters on some particular day, but unfortunately
one of them gets misplaced before he has a chance to open it. If among the remaining four
letters three contain credit-card billings and the other one does not, what might be a good
estimate of k, the total number of credit-card billings among the five letters received?
Clearly k must be three or four. Assuming that each letter had the same chance of being
misplaced, we find that the probability of the observed data is
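$\dfrac{\binom{3}{3}\binom{2}{1}}{\binom{5}{4}} = \dfrac{2}{5}$ for k=3
and
$\dfrac{\binom{4}{3}\binom{1}{1}}{\binom{5}{4}} = \dfrac{4}{5}$ for k=4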
Therefore, if we choose as our estimate of k the value that maximizes the probability of
getting the observed data, we obtain k=4. We call this estimate a maximum likelihood
estimate and the method by which it was obtained is called the method of maximum
likelihood.
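The reasoning in Example 8 can be reproduced with a few lines of Python. This is a minimal sketch: the function name `likelihood` and its parameter names are hypothetical, and the probability is the hypergeometric probability of finding three billings among the four opened letters when k of the five letters are billings.

```python
from fractions import Fraction
from math import comb


def likelihood(k, total=5, opened=4, billings_seen=3):
    """Probability of seeing `billings_seen` billings among the `opened` letters
    when k of the `total` letters are credit-card billings."""
    others_seen = opened - billings_seen
    return Fraction(comb(k, billings_seen) * comb(total - k, others_seen),
                    comb(total, opened))


if __name__ == "__main__":
    for k in range(0, 6):
        print(k, likelihood(k))
    # The maximum likelihood estimate is the k with the largest probability.
    print("estimate of k:", max(range(0, 6), key=likelihood))
```

Values of k other than 3 and 4 receive probability zero, so the comparison reduces to 2/5 against 4/5, and k = 4 is selected.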
In the general case, if the observed sample values are $x_1, x_2, \ldots, x_n$, we can write in the
discrete case
$P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = f(x_1, x_2, \ldots, x_n; \theta)$
which is just the value of the joint
probability distribution of the random variables $X_1, X_2, \ldots, X_n$ at the sample point
$(x_1, x_2, \ldots, x_n)$. Since the sample values have been observed and are therefore fixed numbers,
we regard $f(x_1, x_2, \ldots, x_n; \theta)$ as the value of a function of the parameter $\theta$, referred to as
the likelihood function $L(\theta)$. A similar definition applies when the random sample comes
from a continuous population, but in that case $f(x_1, x_2, \ldots, x_n; \theta)$ is the value of the joint
probability density at the sample point $(x_1, x_2, \ldots, x_n)$. The method of maximum
likelihood consists of maximizing the likelihood function with respect to $\theta$, and we refer to
the value of $\theta$ which maximizes the likelihood function as the maximum likelihood estimate
of $\theta$. To maximize $L(\theta) = f(x_1, x_2, \ldots, x_n; \theta)$ we take the derivative of $L(\theta)$ with respect
to $\theta$ and set it equal to zero.
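Question 6: Given x "successes" in n trials, find the maximum likelihood estimator of the
parameter $\theta$ of the binomial distribution.
Solution: To find the value of $\theta$ which maximizes
$L(\theta) = b(x; n, \theta) = \binom{n}{x}\theta^x (1-\theta)^{n-x}$, it will be convenient to make use of the fact that the
value of $\theta$ which maximizes $L(\theta)$ will also maximize
$\ln L(\theta) = \ln\binom{n}{x} + x\ln\theta + (n-x)\ln(1-\theta)$
Thus we get $\dfrac{d[\ln L(\theta)]}{d\theta} = \dfrac{x}{\theta} - \dfrac{n-x}{1-\theta}$
and, equating this derivative to 0 and solving for $\theta$, we find that the likelihood function has
a maximum at $\theta = x/n$. Hence the maximum likelihood estimator of the parameter $\theta$ of the
binomial distribution is $\hat{\Theta} = X/n$.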
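The calculus result for the binomial case can be checked numerically by evaluating the log-likelihood on a grid of parameter values. The sketch below is illustrative only: x = 7 successes in n = 20 trials, the grid resolution and the function names are hypothetical choices, not values from the text.

```python
import math


def log_likelihood(theta, x, n):
    """Binomial log-likelihood ln L(theta) for x successes in n trials."""
    return (math.log(math.comb(n, x))
            + x * math.log(theta)
            + (n - x) * math.log(1 - theta))


def grid_mle(x, n, steps=1000):
    """Value of theta on the grid 1/steps, 2/steps, ... that maximizes ln L."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda t: log_likelihood(t, x, n))


if __name__ == "__main__":
    x, n = 7, 20
    print("grid maximizer:", grid_mle(x, n))
    print("x / n:         ", x / n)
```

The grid maximizer coincides with x/n here because 0.35 lies on the grid; with other values it would be the grid point nearest to x/n.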
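Question 7: Suppose that n observations, $x_1, x_2, \ldots, x_n$, are made from a normally
distributed population. Find
(a) the maximum likelihood estimate of the mean if variance is known but mean is unknown
(b) the maximum likelihood estimate of the variance if mean is known but variance is
unknown.
Solution:
(a) Since $f(x_k, \mu) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x_k-\mu)^2/2\sigma^2}$
we have
(1) $L = f(x_1, \mu) \cdots f(x_n, \mu) = (2\pi\sigma^2)^{-n/2}\, e^{-\sum (x_k-\mu)^2/2\sigma^2}$
Therefore,
(2) $\ln L = -\dfrac{n}{2}\ln(2\pi\sigma^2) - \dfrac{1}{2\sigma^2}\sum (x_k-\mu)^2$
(3) $\dfrac{\partial \ln L}{\partial \mu} = \dfrac{1}{\sigma^2}\sum (x_k-\mu)$
Setting $\dfrac{\partial \ln L}{\partial \mu}$ = 0 gives
(4) $\sum (x_k-\mu) = 0$ i.e. $\sum x_k - n\mu = 0$
or
(5) $\hat{\mu} = \dfrac{\sum x_k}{n} = \bar{x}$
(b) Since $f(x_k, \sigma^2) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x_k-\mu)^2/2\sigma^2}$
we have
(1) $L = f(x_1, \sigma^2) \cdots f(x_n, \sigma^2) = (2\pi\sigma^2)^{-n/2}\, e^{-\sum (x_k-\mu)^2/2\sigma^2}$
Therefore,
(2) $\ln L = -\dfrac{n}{2}\ln(2\pi\sigma^2) - \dfrac{1}{2\sigma^2}\sum (x_k-\mu)^2$
(3) $\dfrac{\partial \ln L}{\partial \sigma^2} = -\dfrac{n}{2\sigma^2} + \dfrac{1}{2\sigma^4}\sum (x_k-\mu)^2$
Setting $\dfrac{\partial \ln L}{\partial \sigma^2}$ = 0 gives $\hat{\sigma}^2 = \dfrac{\sum (x_k-\mu)^2}{n}$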
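For the normal case, the closed-form estimates are the sample mean (when the variance is known) and the average squared deviation from the known mean (when the mean is known). The sketch below checks numerically that these values do maximize the log-likelihood; the four data values, the known variance 4, the known mean 6 and the function names are hypothetical and serve only as an illustration.

```python
import math


def normal_log_likelihood(data, mu, var):
    """ln L for independent normal observations with mean mu and variance var."""
    n = len(data)
    return (-0.5 * n * math.log(2 * math.pi * var)
            - sum((x - mu) ** 2 for x in data) / (2 * var))


def mle_mean(data):
    """Maximum likelihood estimate of the mean (variance known)."""
    return sum(data) / len(data)


def mle_variance_known_mean(data, mu):
    """Maximum likelihood estimate of the variance when the mean mu is known."""
    return sum((x - mu) ** 2 for x in data) / len(data)


if __name__ == "__main__":
    data = [4.0, 6.0, 8.0, 10.0]
    mu_hat = mle_mean(data)                       # (a) with known variance 4
    var_hat = mle_variance_known_mean(data, 6.0)  # (b) with known mean 6
    print("mu_hat =", mu_hat, " var_hat =", var_hat)
    # Moving away from either estimate lowers the log-likelihood.
    print(normal_log_likelihood(data, mu_hat, 4.0)
          > normal_log_likelihood(data, mu_hat + 0.5, 4.0))
    print(normal_log_likelihood(data, 6.0, var_hat)
          > normal_log_likelihood(data, 6.0, var_hat + 0.5))
```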
being the sample value. Show also that the estimate is biased.
Solution: Sample of unit size =1
likelihood function L( ) = ( = f(
=- +
= -
- + =0 = = =
When = ,
E( ) = E( )=2 = =
Practice Questions:
Q.1 Assuming that the population is normal, give examples of estimators (or estimates)
which are
(a) unbiased and efficient
(b) unbiased and inefficient
(c) biased and inefficient.
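Q.2 Show that $\bar{X}$ is a minimum variance unbiased estimator of the mean $\mu$ of a normal
population.
Q.3 If $\hat{\theta}$ is an estimator of a parameter $\theta$, its bias is given by $b = E(\hat{\theta}) - \theta$. Show that
$E[(\hat{\theta} - \theta)^2] = V(\hat{\theta}) + b^2$.
Q.4 If $\hat{\theta}_1$ and $\hat{\theta}_2$ are unbiased estimators of the same parameter $\theta$, what condition must be
imposed on the constants $k_1$ and $k_2$ so that $k_1\hat{\theta}_1 + k_2\hat{\theta}_2$ is also an unbiased estimator of $\theta$?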
Q.5 Suppose that we use the largest value of a random sample of size n to estimate the
parameter of the population.
( )=
=0 Otherwise
Check whether this estimator is (a) unbiased and (b) consistent.
Q.6 Show that for a random sample from a normal population, the sample variance is a
Q.7 In estimating the mean of a normal population on the basis of a random sample of
size 2n+1, what is the efficiency of the median relative to the mean?
Q.8 If $x_1, x_2, \ldots, x_n$ are the values of a random sample of size n from a population having
the density
( ; )=
=0 otherwise
find an estimator for by the method of moments.
Q.9 Let $X_1, \ldots, X_n$ be a random sample from a gamma distribution with parameters $\alpha$ and $\beta$.
a. Derive the equations whose solutions yield the maximum likelihood estimators of $\alpha$
and $\beta$. Do you think they can be solved explicitly?
b. Show that the mle of $\mu = \alpha\beta$ is $\hat{\mu} = \bar{X}$.
Q.10 Among N independent random variables having identical binomial distribution with the
parameters $\theta$ and n=2, $n_0$ take on the value zero, $n_1$ take on the value one, and $n_2$ take on
the value two. Find an estimate of $\theta$ using
Institute of Lifelong Learning, University of Delhi
Point Estimates For Population Mean, Variance And Proportions: Single Sample And
Two Samples
TABLE OF CONTENTS
Learning Objectives
1. Basic Concepts
2.1 Methodology
2.2 Solved Examples
3.1 Methodology
3.2 Solved Examples
4.1 Methodology
4.2 Solved Examples
Practice Questions
After completing study of this chapter we will be able to make a reasonably precise
inference about the population parameters like mean, variance and proportion on the basis
of sample data. We will also be able to make an inference about the difference between the
means, variances and proportions of two different population distributions on the basis of
samples collected from each of these populations. We will also be able to have an idea
about the accuracy of the above estimates.
1. Basic Concepts
A point estimate is a single number determined from a sample and is used to estimate
the population value. By implication, the term estimate refers to the actual sample result
which is used to represent the parameter being estimated. If the average age based on a
random sample of size n = 36 is 65 years, the sample mean $\bar{x}$ = 65 years is an estimate of
the parameter $\mu$ and the statistic $\bar{X}$ its estimator.
Clearly, a point estimate is normally different from the actual value of the parameter
for the simple reason that a point estimate is derived from a random sample and the value
of the point estimate varies from sample to sample. So while reporting the value of a point
estimate, we should also give some indication of its precision or error. The best indicator is the
standard error of the estimator used. The standard error of an estimator $\hat{\theta}$ is its standard
deviation, which can be denoted by $\sigma_{\hat{\theta}}$. It is the size of an average deviation between $\hat{\theta}$ and
$\theta$. If we use estimated values of some unknown parameters, then we call it the estimated
standard error and denote it by $\hat{\sigma}_{\hat{\theta}}$ or by $s_{\hat{\theta}}$.
Now we will show the computation of point estimates and their standard error for
population mean, variance and proportion for a single sample. We will also extend these
computation methods to situations involving the means, proportions and variances of two
different population distributions.
2.1 Methodology
Sample arithmetic mean, $\bar{X}$; sample median, $\tilde{X}$; sample k% trimmed mean, $\bar{X}_{tr(k)}$;
and average of the two extreme observations in the sample, $\bar{X}_e$, can all be used as
estimators of population mean $\mu$. However, when there is more than one estimator, the best
estimator is the one which gives an estimate closer to the actual value of $\mu$, which will
depend on the sampling distribution of the estimator. However, the sampling distribution of
the estimator itself depends on the distribution of the population from which the sample is
drawn. In particular,
1) If we draw a random sample from a normal population, then $\bar{X}$ is the best among the
four estimators ($\bar{X}$, $\tilde{X}$, $\bar{X}_e$ and $\bar{X}_{tr(k)}$), since its variance is least among all unbiased
estimators. An estimator $\hat{\theta}$ is called an unbiased estimator of population parameter $\theta$ if
$E(\hat{\theta}) = \theta$.
2) If we draw a random sample from a Cauchy distribution,
then $\bar{X}$ and $\bar{X}_e$ are bad estimators for $\mu$, while $\tilde{X}$ is reasonably good. $\bar{X}$ is bad as it is very
sensitive to extreme observations, and due to the heavy tails of the Cauchy distribution it is
very likely that a few such observations appear in any sample.
3) If we draw a random sample from a uniform distribution, then $\bar{X}_e$ is the best estimator.
$\bar{X}_e$ is very sensitive to extreme observations, but such observations are unlikely to
appear in any sample as the uniform distribution does not have any tails.
4) The trimmed mean is not best in any of these three situations. However, it is quite good
in all three. Hence, $\bar{X}_{tr(k)}$ with a small trimming percentage is called a robust estimator,
i.e. one that performs reasonably well for a wide variety of population distributions.
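Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal population with mean $\mu$ and
variance $\sigma^2$; then $\hat{\mu} = \bar{X}$ is the best estimator of $\mu$. It can be shown that the expected value
of $\bar{X}$ is $\mu$, so $\bar{X}$ is an unbiased estimator of $\mu$.
Hence
$E(\bar{X}) = \dfrac{1}{n}\sum_{i=1}^{n} E(X_i) = \dfrac{1}{n}(n\mu)$ [since $E(X_i) = \mu$ for i = 1, 2, ..., n]
= µ as desired.
Further it can be shown that if the value of $\sigma$ is known, the standard error of $\bar{X}$ is $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.
Proof:
$V(\bar{X}) = V\left(\dfrac{1}{n}\sum_{i=1}^{n} X_i\right) = \dfrac{1}{n^2}\sum_{i=1}^{n} V(X_i) = \dfrac{1}{n^2}(n\sigma^2) = \dfrac{\sigma^2}{n}$
Hence $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.
If we do not know the value of $\sigma$, then we substitute the estimate $\hat{\sigma} = s$ into $\sigma_{\bar{X}}$ and denote
the estimated standard error by $\hat{\sigma}_{\bar{X}} = s_{\bar{X}} = s/\sqrt{n}$.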
Now we extend the above methods to problems which deal with the means of two different
population distributions. For instance, if $\mu_1$ denotes true average Rockwell hardness for
heat-treated steel specimens and $\mu_2$ denotes true average hardness for cold-rolled specimens,
then an investigator might wish to use samples of hardness observations from each type of
steel as a basis for calculating an estimate of $\mu_1 - \mu_2$, the difference between the two true
average hardnesses. Assuming that
(1) $X_1, X_2, \ldots, X_m$ is a random sample from a distribution with mean $\mu_1$ and variance
$\sigma_1^2$,
(2) $Y_1, Y_2, \ldots, Y_n$ is a random sample from a distribution with mean $\mu_2$ and variance $\sigma_2^2$,
and
(3) the X and Y samples are independent of one another,
it can be shown that $\bar{X} - \bar{Y}$, the difference between the two sample means, can be used as a
natural estimator of $\mu_1 - \mu_2$, the difference between the corresponding means of two different
population distributions. The expected value of $\bar{X} - \bar{Y}$ is equal to $\mu_1 - \mu_2$, so $\bar{X} - \bar{Y}$ is an
unbiased estimator of $\mu_1 - \mu_2$.
Proof: $E(\bar{X} - \bar{Y}) = E(\bar{X}) - E(\bar{Y})$
= $\mu_1 - \mu_2$ as desired.
The standard error of $\bar{X} - \bar{Y}$ is $\sigma_{\bar{X}-\bar{Y}} = \sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}$.
Proof: Since the X and Y samples are independent, $\bar{X}$ and $\bar{Y}$ will be independent quantities,
implying that $Cov(\bar{X}, \bar{Y}) = 0$. Hence the variance of the difference between the two sample
means is the sum of $V(\bar{X})$ and $V(\bar{Y})$:
$V(\bar{X} - \bar{Y}) = V(\bar{X}) + V(\bar{Y}) = \dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}$
Example 2.1:
We examine each one of the 150 newly typed pages and record the number of mistakes per
page (the pages are supposed to be free of mistakes). We observe the following data:
Number of mistakes per page    0   1   2   3   4   5   6   7
Observed frequency            18  37  42  30  13   7   2   1
Let X = the number of mistakes on a randomly chosen page. Also assume that X follows a
Poisson distribution with parameter $\lambda$.
a) Find an unbiased estimator of $\lambda$ and compute the estimate for the data.
b) What is the standard error of your estimator? Compute the estimated standard error.
Solution:
a. An unbiased estimator of $\lambda$ is given by the sample mean, $\bar{X}$, since
$E(\bar{X}) = E\left(\dfrac{\sum X_i}{n}\right) = \dfrac{1}{n}\sum E(X_i) = \dfrac{1}{n}(n\lambda)$
= $\lambda$ as desired.
Estimate = $\bar{x} = \dfrac{\sum x f}{\sum f} = \dfrac{317}{150}$ = 2.11
b. Let the standard deviation of our estimator, $\bar{X}$, be denoted by $\sigma_{\bar{X}}$. Since the variance of a
Poisson distribution equals its parameter, $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \sqrt{\dfrac{\lambda}{n}}$.
Substituting the estimated value of $\lambda$, i.e. $\hat{\lambda} = \bar{x} = 2.11$, to compute the estimated standard error, we get
$\hat{\sigma}_{\bar{X}} = \sqrt{\dfrac{2.11}{150}}$ = 0.1186.
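The arithmetic of Example 2.1 can be reproduced directly from the frequency table. In the sketch below the function and variable names are hypothetical. Note that the code carries the unrounded mean 317/150 into the standard error, so its result (about 0.1187) differs slightly from the 0.1186 obtained with the rounded value 2.11.

```python
import math

mistakes = [0, 1, 2, 3, 4, 5, 6, 7]
frequency = [18, 37, 42, 30, 13, 7, 2, 1]


def poisson_estimate(values, freqs):
    """Return (n, lambda_hat, estimated standard error) from a frequency table."""
    n = sum(freqs)
    lam_hat = sum(v * f for v, f in zip(values, freqs)) / n  # sample mean
    se_hat = math.sqrt(lam_hat / n)  # sqrt(lambda/n), since Var(X) = lambda for a Poisson
    return n, lam_hat, se_hat


if __name__ == "__main__":
    n, lam_hat, se_hat = poisson_estimate(mistakes, frequency)
    print("pages:", n)
    print("estimate of lambda:", round(lam_hat, 2))
    print("estimated standard error:", round(se_hat, 4))
```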
Example 2.2:
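If $X_1, X_2, \ldots, X_n$ constitute a random sample from a population with the mean $\mu$, what condition
must be imposed on the constants $a_1, a_2, \ldots, a_n$ so that $a_1X_1 + a_2X_2 + \ldots + a_nX_n$
is an unbiased estimator of $\mu$?
Solution:
$a_1X_1 + a_2X_2 + \ldots + a_nX_n$ is an unbiased estimator of $\mu$ if $E(a_1X_1 + a_2X_2 + \ldots + a_nX_n) = \mu$
Now $E[a_1X_1 + a_2X_2 + \ldots + a_nX_n]$
= $a_1E(X_1) + a_2E(X_2) + \ldots + a_nE(X_n)$
= $a_1\mu + a_2\mu + \ldots + a_n\mu$ [since $E(X_i) = \mu$ for i = 1, 2, ..., n]
= $(a_1 + a_2 + \ldots + a_n)\mu$
= $\mu$ only if $(a_1 + a_2 + \ldots + a_n) = 1$
So $a_1 + a_2 + \ldots + a_n$ should be equal to one for $a_1X_1 + a_2X_2 + \ldots + a_nX_n$ to be an unbiased
estimator of $\mu$.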
Example 2.3:
Independent random samples of size $n_1$ and $n_2$ are taken from a normal population with the
mean $\mu$ and the variance $\sigma^2$. If $n_1$ = 25, $n_2$ = 50, $\bar{x}_1$ = 27.6 and $\bar{x}_2$ = 38.1, find an unbiased
estimator of $\mu$.
Solution:
E( )= E( )+ E( )
= as desired.
Example 2.4:
A sample of 20 measurements each on flexural strength (MPa) for concrete beams of a
certain type and cylinders respectively gave the following results.
Beams: 5.9 7.2 7.3 6.3 8.1 6.8 7.0 7.6 6.8 6.5
7.9 9.0 8.2 8.7 7.8 9.7 7.4 7.7 11.6 11.3
Cylinders: 6.1 5.8 7.8 7.1 7.2 9.2 6.6 8.3 7.0 8.3
7.8 8.1 7.4 8.5 8.9 9.8 9.7 14.1 12.6 11.2
Before obtaining data we denote the beam strengths by $X_1, X_2, \ldots, X_m$ and the cylinder
strengths by $Y_1, Y_2, \ldots, Y_n$. Suppose that the $X_i$'s are drawn from a population with mean $\mu_1$
and standard deviation $\sigma_1$. Similarly, the $Y_i$'s are drawn from another population with mean $\mu_2$ and
standard deviation $\sigma_2$. Also assume that the $X_i$'s are independent of the $Y_i$'s.
Solution:
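a. $\bar{X} - \bar{Y}$ is an unbiased estimator of $\mu_1 - \mu_2$ if $E(\bar{X} - \bar{Y}) = \mu_1 - \mu_2$.
Now $E(\bar{X} - \bar{Y}) = E(\bar{X}) - E(\bar{Y})$
= $\mu_1 - \mu_2$ as desired.
To find an estimate for the given data, we first compute $\bar{x}$ and $\bar{y}$.
Table: Calculations for mean, variance
X       Y       X²        Y²
5.9     6.1     34.81     37.21
7.2     5.8     51.84     33.64
7.3     7.8     53.29     60.84
6.3     7.1     39.69     50.41
8.1     7.2     65.61     51.84
6.8     9.2     46.24     84.64
7.0     6.6     49.00     43.56
7.6     8.3     57.76     68.89
6.8     7.0     46.24     49.00
6.5     8.3     42.25     68.89
7.9     7.8     62.41     60.84
9.0     8.1     81.00     65.61
8.2     7.4     67.24     54.76
8.7     8.5     75.69     72.25
7.8     8.9     60.84     79.21
9.7     9.8     94.09     96.04
7.4     9.7     54.76     94.09
7.7     14.1    59.29     198.81
11.6    12.6    134.56    158.76
11.3    11.2    127.69    125.44
ΣX = 158.8    ΣY = 171.5    ΣX² = 1304.3    ΣY² = 1554.73
$\bar{x} = \dfrac{\sum x}{m} = \dfrac{158.8}{20}$ = 7.94
$\bar{y} = \dfrac{\sum y}{n} = \dfrac{171.5}{20}$ = 8.575
Hence the estimate of $\mu_1 - \mu_2$ is $\bar{x} - \bar{y}$ = 7.94 - 8.575 = -0.635.
So $V(\bar{X} - \bar{Y}) = \dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}$ and hence $\sigma_{\bar{X}-\bar{Y}} = \sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}$
To compute the estimated standard error we will first have to compute the standard deviations,
$s_1 = \sqrt{\dfrac{\sum x^2 - (\sum x)^2/m}{m-1}} = \sqrt{\dfrac{1304.3 - (158.8)^2/20}{19}}$ = 1.512
Similarly $s_2 = \sqrt{\dfrac{\sum y^2 - (\sum y)^2/n}{n-1}}$
= $\sqrt{\dfrac{1554.73 - (171.5)^2/20}{19}}$ = 2.104.
Hence $\hat{\sigma}_{\bar{X}-\bar{Y}} = \sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}$ = 0.579.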
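The calculations of Example 2.4 can be checked with Python's standard library. This is a minimal sketch: the function name `difference_summary` is hypothetical, and `statistics.stdev` is used because it applies the (n-1) divisor, as in the hand calculation.

```python
import math
import statistics

beams = [5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8, 6.5,
         7.9, 9.0, 8.2, 8.7, 7.8, 9.7, 7.4, 7.7, 11.6, 11.3]
cylinders = [6.1, 5.8, 7.8, 7.1, 7.2, 9.2, 6.6, 8.3, 7.0, 8.3,
             7.8, 8.1, 7.4, 8.5, 8.9, 9.8, 9.7, 14.1, 12.6, 11.2]


def difference_summary(x, y):
    """Estimate of mu1 - mu2, the two sample standard deviations,
    and the estimated standard error of the difference in means."""
    diff = statistics.mean(x) - statistics.mean(y)
    s1 = statistics.stdev(x)  # divisor (m - 1)
    s2 = statistics.stdev(y)  # divisor (n - 1)
    se = math.sqrt(s1 ** 2 / len(x) + s2 ** 2 / len(y))
    return diff, s1, s2, se


if __name__ == "__main__":
    diff, s1, s2, se = difference_summary(beams, cylinders)
    print("estimate of mu1 - mu2:", round(diff, 3))
    print("s1:", round(s1, 3), " s2:", round(s2, 3))
    print("estimated standard error:", round(se, 3))
    print("ratio s1/s2:", round(s1 / s2, 3))
```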
problems where comparison of two population variances (or standard deviations) is
required. Now we will discuss methods for computing these estimates.
3.1 Methodology
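If the population is normal then we can use the following result concerning the
sample variance to draw inferences about a population variance: the sample variance
$S^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$ is an unbiased
estimator of $\sigma^2$.
Proof: $E(S^2) = E\left[\dfrac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2\right]$
= $\dfrac{1}{n-1}E\left[\sum_{i=1}^{n}(X_i - \bar{X})^2\right] = \dfrac{1}{n-1}(n-1)\sigma^2 = \sigma^2$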
Assuming that the populations under investigation are normal, can be used as a point
estimator of .
Solution:
Now E { }=
= as desired.
Example 3.2:
Consider a hypothetical normal population comprising only three values 2, 5 and 8. Draw all
possible samples of size 2 and calculate the mean and variance
Solution :
So there are 9 possible samples of size 2 as shown in column (2). The mean $\bar{x} = \dfrac{\sum x}{n}$
and variance $s^2 = \dfrac{\sum (x - \bar{x})^2}{n-1}$ for each sample are shown in columns (4) and
(5) respectively.
To examine whether the statistics are unbiased for the corresponding parameters,
we will first have to calculate the mean and variance for the population.
Population mean $\mu = \dfrac{\sum x}{N} = \dfrac{15}{3}$ = 5
Population variance $\sigma^2 = \dfrac{\sum (x - \mu)^2}{N} = \dfrac{18}{3}$ = 6
$E(\bar{X}) = \dfrac{45}{9} = 5 = \mu$ and $E(S^2) = \dfrac{54}{9} = 6 = \sigma^2$
If instead the divisor n is used in the sample variance, the estimator is biased:
Now $E\left[\dfrac{\sum (x - \bar{x})^2}{n}\right] = \dfrac{27}{9}$ = 3
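Because the population in Example 3.2 is so small, the expected values can be verified by listing all nine samples (drawn with replacement) exactly. The sketch below uses exact fractions; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

population = [2, 5, 8]


def sample_stats(sample):
    """Return (mean, variance with divisor n-1, variance with divisor n)."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    ss = sum((x - mean) ** 2 for x in sample)
    return mean, ss / (n - 1), ss / n


def expected_values(pop, size=2):
    """Average each statistic over all equally likely samples drawn with replacement."""
    samples = list(product(pop, repeat=size))
    stats = [sample_stats(s) for s in samples]
    return tuple(sum(column) / len(samples) for column in zip(*stats))


if __name__ == "__main__":
    e_mean, e_s2, e_biased = expected_values(population)
    print("E(sample mean)        =", e_mean)    # population mean is 5
    print("E(S^2), divisor n-1   =", e_s2)      # population variance is 6
    print("E(variance), divisor n =", e_biased)
```

The enumeration returns 5, 6 and 3, so the sample mean and the (n-1) variance reproduce the population values while the divisor-n variance gives only half of $\sigma^2$ when n = 2.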
Question 3.3:
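Suppose each side of a square plot has length $\mu$. So the area of the plot will be $\mu^2$. Since the value
of $\mu$ is unknown, we take n independent measurements $X_1, X_2, \ldots, X_n$ of the length.
Assume that each $X_i$ has mean $\mu$ and variance $\sigma^2$.
a. Show that $\bar{X}^2$ is a biased estimator for $\mu^2$.
b. What value of k will make the estimator $\bar{X}^2 - kS^2$ unbiased for $\mu^2$, where
$S^2 = \dfrac{1}{n-1}\sum (X_i - \bar{X})^2$?
Solution:
a. $E(\bar{X}^2) = V(\bar{X}) + [E(\bar{X})]^2 = \dfrac{\sigma^2}{n} + \mu^2$, which exceeds $\mu^2$, so $\bar{X}^2$ is biased.
b. $E(\bar{X}^2 - kS^2) = \mu^2 + \dfrac{\sigma^2}{n} - k\sigma^2$
= $\mu^2$ only if $k = \dfrac{1}{n}$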
A sample of 10 television tubes produced by a company showed that the mean lifetime is
1200 hours and the standard deviation is 100 hours.
a. Estimate the mean of the population of all television tubes produced by this company.
b. Estimate the standard deviation of the population of all television tubes produced by this
company.
c. If the same results are obtained for 30, 50 and 100 television tubes, estimate the mean
and the standard deviation of the population.
d. What can you conclude about the relation between sample standard deviation and
estimates of population standard deviation for different sample sizes?
Solution:
a. We can use the sample mean as an estimator of population mean $\mu$. So $\hat{\mu} = \bar{x}$ = 1200 hours.
b. Treating the reported sample standard deviation s = 100 hours as computed with divisor n, the
estimate of the population standard deviation is $\hat{\sigma} = \sqrt{\dfrac{n}{n-1}}\, s = \sqrt{\dfrac{10}{9}}\,(100)$ = 105.4 hours.
c. The estimate for the population mean will remain the same, i.e. 1200 hours, in all cases.
However, the estimate for population standard deviation will differ for different sample
sizes. If sample size is 30 then $\hat{\sigma} = \sqrt{\dfrac{30}{29}}\,(100)$ = 101.7 hours. If sample size is 50 then
$\hat{\sigma} = \sqrt{\dfrac{50}{49}}\,(100)$ = 101 hours. If sample size is 100 then $\hat{\sigma} = \sqrt{\dfrac{100}{99}}\,(100)$ = 100.5 hours.
d. As sample size increases, estimates of population standard deviation come closer and
closer to sample standard deviation.
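The pattern in Example 3.4 can be tabulated with a short function. This sketch assumes, as the quoted figures 101.7, 101 and 100.5 imply, that the reported sample standard deviation of 100 hours was computed with divisor n, so that multiplying by $\sqrt{n/(n-1)}$ converts it to the (n-1) version. The function name is hypothetical.

```python
import math


def corrected_sd(sample_sd, n):
    """Rescale a divide-by-n standard deviation so that its square is
    the divide-by-(n-1) variance estimate."""
    return sample_sd * math.sqrt(n / (n - 1))


if __name__ == "__main__":
    for n in (10, 30, 50, 100):
        print(n, round(corrected_sd(100, n), 1))
```

The correction factor tends to 1 as n grows, which is the conclusion drawn in part d.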
Example 3.5:
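Suppose use of a certain type of pesticide increases average yield per acre by $\mu_1$ with
variance $\sigma^2$, whereas the use of a second type of pesticide increases average yield per acre by
$\mu_2$ with the same variance $\sigma^2$. Let $S_1^2$ and $S_2^2$ denote the unbiased estimators of population
variances of yields based on sample sizes $n_1$ and $n_2$ respectively, of the two pesticides. Show
that the pooled estimator $S_p^2 = \dfrac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}$ is an unbiased estimator of $\sigma^2$.
Solution:
The pooled estimator is an unbiased estimator of $\sigma^2$ if $E(S_p^2) = \sigma^2$
Now $E(S_p^2) = \dfrac{n_1-1}{n_1+n_2-2}E(S_1^2) + \dfrac{n_2-1}{n_1+n_2-2}E(S_2^2)$
= $\dfrac{(n_1-1)\sigma^2 + (n_2-1)\sigma^2}{n_1+n_2-2} = \sigma^2$ as desired.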
Example 3.6:
Using data and calculations of example 2.4 compute a point estimate of the ratio of the
two standard deviations.
Solution:
A point estimate of the ratio of the two standard deviations, $\sigma_1/\sigma_2$, is given by
$\dfrac{\hat{\sigma}_1}{\hat{\sigma}_2} = \dfrac{s_1}{s_2} = \dfrac{1.512}{2.104}$ = 0.719.
Inferences concerning population proportion for specified characteristics are often required
by the policymakers. Similarly sometimes estimates regarding differences in proportions of
two different populations are needed for policy decisions. Now we will discuss methods for
estimating these population parameters and also give an expression for estimating the
reliability of the estimates.
4.1 Methodology
Suppose a random sample of size n is taken from a population and it is found that
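the number of “successes” is X. Now we can use $\hat{p} = X/n$, the sample fraction of “successes”, as
an estimator of p. $E(\hat{p}) = p$ (unbiasedness) and $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$
Proof: $E(\hat{p}) = E\left(\dfrac{X}{n}\right) = \dfrac{1}{n}E(X) = \dfrac{1}{n}(np) = p$ as desired.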
Now we extend these methods to situations involving the proportions of two different
population distributions. Let $p_1$ denote the true proportion of nickel-cadmium cells produced
under current operating conditions that are defective because of internal shorts, and let $p_2$
represent the true proportion of cells with internal shorts produced under modified operating
conditions. If the rationale for the modified conditions is to reduce the proportion of
defective cells, a quality engineer would want to use sample information as a basis for
calculating an estimate of $p_1 - p_2$.
Suppose that a sample of size m is selected from the first population and
independently a sample of size n is selected from the second one. Let X denote the number
of successes in the first sample and Y be the number of successes in the second.
Independence of the two samples implies that X and Y are independent. Provided that the
two sample sizes are much smaller than the corresponding population sizes, X and Y can be
regarded as having binomial distributions. The natural estimator for $p_1 - p_2$, the difference in
population proportions, is the corresponding difference in sample proportions $\hat{p}_1 - \hat{p}_2$ = X/m – Y/n.
$E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2$
$V(\hat{p}_1 - \hat{p}_2) = \dfrac{p_1q_1}{m} + \dfrac{p_2q_2}{n}$ (where $q_i = 1 - p_i$)
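The two formulas above translate into a short calculation once the unknown proportions in the variance are replaced by their sample values. In the sketch below the function name is hypothetical, and the counts (127 successes out of 200 and 176 out of 200) are used purely as illustrative inputs.

```python
import math


def proportion_difference(x, m, y, n):
    """Point estimate of p1 - p2 and its estimated standard error.

    x successes in a sample of size m from the first population,
    y successes in a sample of size n from the second.
    """
    p1_hat = x / m
    p2_hat = y / n
    # Estimated standard error: the unknown p's are replaced by sample proportions.
    se_hat = math.sqrt(p1_hat * (1 - p1_hat) / m + p2_hat * (1 - p2_hat) / n)
    return p1_hat - p2_hat, se_hat


if __name__ == "__main__":
    diff, se = proportion_difference(127, 200, 176, 200)
    print("estimate of p1 - p2:", round(diff, 3))
    print("estimated standard error:", round(se, 4))
```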
Example 4.1:
A sample of 20 students of XYZ College gave the following information on the brand of
calculator used (F = Fiamo, O = Orpat, C = Citizen, S= Sharp):
F F O F C F F S C O
S S F O C F F F O F
a. Estimate the true proportion of all such students who used a Fiamo calculator.
b. Of the 10 students who used a Fiamo calculator, 4 had graphing calculators. Estimate the
proportion of students who do not use a Fiamo graphing calculator.
Solution:
a. An estimate of the true proportion of all such students who used a Fiamo calculator is
given by $\hat{p} = \dfrac{x}{n} = \dfrac{10}{20}$ = 0.5 where x is the number of favourable cases i.e. number of
students who used a Fiamo calculator and n is the total number of cases i.e. total number
of students.
b. Using the same method as above, an estimate of the proportion of students who do not
use a Fiamo graphing calculator is given by $\hat{p} = \dfrac{x}{n} = \dfrac{16}{20}$ = 0.8 where x is the number of
students who do not use a Fiamo graphing calculator.
Example 4.2:
A sample of 80 components is taken from a large factory and it is found that 68 components
are not defective.
a. Estimate the proportion of all such components that are not defective.
b. Suppose now we randomly select two of these components and connect them in series,
as shown here to construct a system.
The system will function if both components are not defective. Give a point estimate of the
proportion of properly working systems?
Solution:
Let p denote the probability that a component works properly and P denote the probability
that the system works properly. Then
a. Estimate of the proportion of all such components that are not defective is given by
$\hat{p} = \dfrac{x}{n} = \dfrac{68}{80}$ = 0.85.
where x is the number of favourable cases i.e. number of components that are not
defective and n is the total number of cases i.e. total number of components sampled.
( )= = = 0.721.
Example 4.3:
A sample of 10 measurements of the weights of female students at xyz university gave the
following results.
Student 1 2 3 4 5 6 7 8 9 10
Weight(kg) 40 45 47.6 48.2 52.8 57 52.5 52 59 49
c. A point estimate of the proportion of all such female students whose weight exceeds
53kg.
17
Institute of Lifelong Learning, University of Delhi
Point Estimates For Population Mean, Variance And Proportions: Single Sample And
Two Samples
Solution:
a. Since population is normally distributed so both sample mean and median are unbiased
estimators of population mean
= = =50.31.
To calculate median our first step is to arrange weights of students either in increasing or
decreasing order as follows.
Since the total number of students is 10, so median weight would be the weight of th
student i.e. the average of the weights of 5th and 6th student which is = 50.5.
To make efficiency comparisons we should compare MSEs of mean and median. Since both
are unbiased so MSE of and is equal to V( ) and V( ) respectively. We know that for a
normal distribution V( )= and V( )= 1.57 .
= = 31.35
c. Since the number of students in the sample whose weight exceeds 53kg is two, a
point estimate for the population proportion of all such students whose weight exceeds 53kg
is given by $\hat{p} = \frac{x}{n} = \frac{2}{10} = 0.2$, where x is the number of favourable cases, i.e. the number of students
whose weight exceeds 53kg, and n is the total number of cases, i.e. the total number of students
sampled.
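The estimates in Example 4.3 can be checked with a short script. This is a minimal sketch using only Python's standard library; the variable names are illustrative and not part of the original solution.

```python
from statistics import mean, median, variance

# Weights (kg) of the 10 sampled students, copied from the table in Example 4.3.
weights = [40, 45, 47.6, 48.2, 52.8, 57, 52.5, 52, 59, 49]

x_bar = mean(weights)      # sample mean, estimates the population mean
x_med = median(weights)    # sample median: average of the 5th and 6th ordered values
s2 = variance(weights)     # sample variance with divisor n - 1
p_hat = sum(w > 53 for w in weights) / len(weights)  # sample proportion above 53 kg

print(round(x_bar, 2), x_med, round(s2, 2), p_hat)
```

Using the divisor n - 1 (as `statistics.variance` does) matches the sample variance used in the solution.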
Example 4.4:
a. Calculate an estimate of the average value of plywood thickness. Which estimator did
you use?
b. Calculate a point estimate of the median of the plywood thickness distribution, and state
which estimator you used.
c. Calculate a point estimate of the value that separates the largest 10% of all values in
the thickness distribution from the remaining 90%, and state which estimator you used.
d. Estimate P(X<1.5) i.e. the proportion of all thickness values less than 1.5.
e. What is the estimated standard error of the estimator that you used in part (b)?
Solution:
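a. A point estimate of the mean value of plywood thickness is the sample mean
$\bar{x} = \frac{\sum x_i}{n} = 1.348$.
Here $\bar{x}$ is the sample mean and s is the sample standard deviation, defined as $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$. Now our next step is to compute
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = 0.114965$, so that s ≈ 0.339.
To estimate P(X<1.5) we standardize as $Z = \frac{1.5 - \bar{x}}{s} = \frac{1.5 - 1.348}{0.339} = 0.448$. Then use this Z value to find the area (probability) under the standard normal curve.
The area under the standard normal curve corresponding to Z=0.448 (rounded to 0.45 in the table) is 0.6736. So
P(X<1.5)= 0.6736.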
e. The estimated standard error of the estimator, $\bar{X}$, that we used in part (b) is given by
$\frac{s}{\sqrt{n}}$ = 0.08475.
Example 4.5:
A random sample of male students of size $n_1$ is taken from XYZ University and it is found
that $X_1$ scored more than 70% in their final exams. Another sample of female students of
size $n_2$ from the same University showed that $X_2$ scored more than 70%. Let $p_1$ denote the
probability that a male student scores more than 70% and $p_2$ denote the probability that a
female student scores more than 70% in their final exams.
a. Show that $\frac{X_1}{n_1} - \frac{X_2}{n_2}$ is an unbiased estimator of $p_1 - p_2$.
b. Find an expression for the standard error of your estimator in part (a).
c. What is the use of the observed values $x_1$ and $x_2$ in estimating the standard error of your estimator?
d. If $n_1 = n_2 = 200$, $x_1 = 127$ and $x_2 = 176$, compute a point estimate for $p_1 - p_2$ and also give
an estimate of its standard error.
Solution:
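a. $E\left(\frac{X_1}{n_1}\right) = \frac{n_1 p_1}{n_1} = p_1$, and similarly $E\left(\frac{X_2}{n_2}\right) = p_2$.
Now $E\left(\frac{X_1}{n_1} - \frac{X_2}{n_2}\right) = E\left(\frac{X_1}{n_1}\right) - E\left(\frac{X_2}{n_2}\right)$
$= p_1 - p_2$,
as desired.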
b. The standard error of the estimator in part (a) is given by
$\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$.
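c. We will use the observed values $x_1$ and $x_2$ to estimate the standard error of our estimator
by using $\hat{p}_1 = x_1/n_1$ for $p_1$ and $\hat{p}_2 = x_2/n_2$ for $p_2$.
d. $\hat{p}_1 - \hat{p}_2 = \frac{127}{200} - \frac{176}{200} = -\frac{49}{200} = -0.245$
Using the relevant values, we get an estimate of the standard error of the estimator as
follows:
$\sqrt{\frac{(0.635)(0.365)}{200} + \frac{(0.88)(0.12)}{200}} = 0.041$.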
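Part (d) of Example 4.5 can be reproduced numerically. The sketch below is only an illustration; the function name `diff_of_proportions` is hypothetical and not taken from the text.

```python
from math import sqrt

def diff_of_proportions(x1, n1, x2, n2):
    """Point estimate of p1 - p2 and its estimated standard error."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    estimate = p1_hat - p2_hat
    # Plug the sample proportions into sqrt(p1(1-p1)/n1 + p2(1-p2)/n2).
    std_error = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return estimate, std_error

# Data of part (d): 127 of 200 male and 176 of 200 female students scored above 70%.
estimate, std_error = diff_of_proportions(127, 200, 176, 200)
print(round(estimate, 3), round(std_error, 3))
```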
PRACTICE QUESTIONS
Q.3 A random sample of size 65 was taken to estimate the mean annual income of 1000
families and the mean and S. D. were found to be Rs. 6300 and Rs. 9.5 respectively. Find
an estimate for the population mean. Also calculate its standard error.
Q.4 A sample of 150 bulbs of brand A showed an average life of 1800 hrs with a standard
deviation of 15 hrs. Another sample of 100 bulbs of brand B showed an average life of 1500
hrs with a standard deviation of 11 hrs. Find an estimate for the difference of the mean life
of the population of A and B brand bulbs. Also calculate the standard error of the estimate.
Q.5 According to the Mendelian law of segregation in genetics, when certain types of peas
are crossed, the probability that the plant yields a yellow pea is 3/4 and that it yields a
green pea is 1/4. For a plant yielding 400 peas, find the standard error of the proportion of
yellow peas.
Q.6 An insecticide of brand A was sprayed into a container holding 150
mosquitoes. It was found that 100 of the mosquitoes were killed. When another container
having 170 mosquitoes of the same type was sprayed with brand B, 130 mosquitoes were
killed. Find an estimate for the difference in the effectiveness of the two brands of
insecticides. Also calculate the standard error of the estimate.
Q.7 The marketing manager of a large company conducted a sample survey in two states,
Bihar and Orissa, taking a sample of 400 salesmen in each case. The main findings of the research
are given in the following table:
Find an estimate for the difference in average per day sales of the salesmen in two states.
Also calculate the standard error of the estimate.
Q.8 The following results were obtained from two samples each drawn from two different
populations A and B;
Population       A          B
Sample           I          II
Sample size      n1 = 16    n2 = 9
Sample S. D.     s1 = 3     s2 = 2
Find an estimate for the ratio of the population variances of A and B, i.e. $\sigma_1^2/\sigma_2^2$.
Q.9 A population consists of numbers 4, 5, 8, 10, 13. Enumerate all possible samples of
size 3 which can be drawn from the population without replacement and show that the
mean of the sampling distribution of the sample means is equal to the population mean.
Calculate the variance of the sampling distribution of the sample mean and show that it is
less than the population variance.
Q.10 A builder is considering two different areas of a large western state as sites for
a primary school. Of 50 households surveyed in one area, the proportion of households having
primary school going children was 0.52. Similarly, of 45 households surveyed in another
area, the proportion of households having primary school going children was 0.48. Find an
estimate for the difference in the proportions of primary school going children in the two
areas of the state. Also calculate the standard error of the estimate.
P-value Tests for the Population Means
TABLE OF CONTENTS
Learning Objectives
1. Introduction
3. Calculation of β
4. Selection of a test
Practice Questions
Content Developer
References
1. Jay L. Devore: Probability and Statistics for Engineering and the Sciences, Cengage
Learning, 8th edition [Chapter 8 and 9].
2. Irwin Miller and Marylees Miller: Mathematical Statistics, Pearson, 7th edition.
3. Allen Webster: Probability and Statistics, 4th edition, Richard D. Irwin/McGraw-Hill, Burr
Ridge, IL, 2010
Learning Objectives
This chapter aims at showing how p-values are calculated during hypothesis testing
procedures. The chapter focuses mainly on the tests of population means in single and two
samples. It will demonstrate how these tests can be carried out using p-values in both small
and large samples. Apart from this, you will learn how to find the probability of a Type II error (β) in a
given scenario and when to use a particular test, and you will learn about the likelihood
ratio principle. The chapter concludes with practice questions which will help you test
your understanding of the concepts covered in this chapter.
1. INTRODUCTION
As already discussed, the p-value is the observed or actual level of significance, i.e. the
smallest level of significance at which we reject the null hypothesis. Here, instead of first
finding the critical value of a test statistic and comparing it with the calculated value to
conclude whether the null hypothesis is rejected or not, we first calculate the value of the
test statistic and then find the smallest level of significance at which the null hypothesis
is rejected; this is the p-value.
The main advantage of using the p-value is that there is no need to find critical values of the test
statistic for each level of significance α separately. We just need to compare the p-value
with α to check at what levels of significance the null hypothesis is rejected and at what levels it
is not. Generally, the smaller the p-value, the stronger the evidence against the null
hypothesis and in favour of the alternative hypothesis. In
particular, we reject the null hypothesis if the p-value is less than or equal to α and do not reject
the null hypothesis if the p-value is greater than α. Now, we move to the tests of population means
in the case of large and small samples using p-values.
Since the sample size is large, the test statistic for the population mean µ will be Z which is
the standard normal variable. Depending upon the alternative hypothesis, the p-value is
calculated for the observed or the calculated value of test statistic Z and then this p-value is
compared with the given level of significance α.
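a) Left-tailed test
Alternative Hypothesis Ha : µ < µ0
p-value = area to the left of calculated z (negative value)= P(Z ≤ -z) = Φ(-z)
Figure 1
b) Right-tailed test
Alternative Hypothesis Ha : µ > µ0
p-value = area to the right of calculated z (positive value)= P(Z ≥ z)= 1 - Φ(z) as shown in
the following Figure 2.
Figure 2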
c) Two-tailed test
Alternative Hypothesis Ha : µ ≠ µ0
p-value = sum of the area to the left of calculated negative z value and to the right of
calculated positive z value = P(Z ≤ -z) + P(Z ≥ z) = [Φ(-z)]+ [1 - Φ(z)] = 2[1 - Φ(z)] as
shown in Figure 3 below.
{since, area to left of -z is same as the area to the right of z due to symmetry of z-curve}
Figure 3
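Once the p-value is calculated, it is compared with α using the decision rule given by
Figure 4, in which the interval from 0 to 1 is split at α into region (a), from 0 to α, and
region (b), from α to 1.
Figure 4
If the p-value lies in (a) then we reject H0, but if the p-value lies in (b) then we do not reject H0. In
other words, we reject H0 if p-value ≤ α and do not reject H0 if p-value > α.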
Now, the following examples illustrate the use of p-value to test population means in case of
large samples.
Example 3.1 The mean lifetime of a sample of 100 batteries manufactured by a
firm is found to be 1680 hours with a standard deviation of 150 hours. Do a two-tailed test
to check whether the true average lifetime of the batteries is 1700 hours for level of significance
(a) 0.05 and (b) 0.01.
Step 1: µ = true average lifetime of the batteries.
Step 2: H0 : µ = 1700
Step 3: Ha : µ ≠ 1700
Step 4: z = (1680 - 1700)/(150/√100) = -20/15 = -1.33
Step 5: p-value = 2[1 - Φ(1.33)] = 2(0.0918) = 0.1836
Step 6: Since p-value is greater than 0.05 and also 0.01, therefore we do not reject the null
hypothesis that the true average lifetime of batteries is 1700 hours at both 0.05 and 0.01
level of significance.
Example 3.2 Consider the previous example and test the hypothesis µ = 1700 against the
alternative hypothesis µ < 1700 for level of significance (a) 0.10 and (b) 0.05.
Step 2: H0 : µ = 1700
Step 3: Ha : µ < 1700
Step 4: z = (1680 - 1700)/(150/√100) = -1.33
Step 5: p-value = Φ(-1.33) = 0.0918
Step 6: Since the p-value is less than 0.10 but greater than 0.05, we reject the
null hypothesis that the true average lifetime of batteries is 1700 hours at α = 0.10 but do
not reject the null hypothesis at α = 0.05.
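The three p-value formulas for a large-sample test of a single mean, and the decisions in Examples 3.1 and 3.2, can be sketched as follows. This is an illustrative helper (the name `z_test_p_value` and the `tail` argument are assumptions, not from the text), using `statistics.NormalDist` from the standard library for Φ. Because it does not round z to two decimals, its p-values differ slightly from table-based ones.

```python
from math import sqrt
from statistics import NormalDist

def z_test_p_value(x_bar, mu0, s, n, tail):
    """Return (z, p-value) for a large-sample z-test of a single mean.

    tail is 'left' (Ha: mu < mu0), 'right' (Ha: mu > mu0) or 'two' (Ha: mu != mu0).
    """
    z = (x_bar - mu0) / (s / sqrt(n))
    phi = NormalDist().cdf          # standard normal CDF, i.e. Phi
    if tail == "left":
        p = phi(z)                  # area to the left of z
    elif tail == "right":
        p = 1 - phi(z)              # area to the right of z
    else:
        p = 2 * (1 - phi(abs(z)))   # both tails, using symmetry
    return z, p

# Battery data: n = 100, sample mean 1680, sample s.d. 150, null value 1700.
z, p_two = z_test_p_value(1680, 1700, 150, 100, "two")
_, p_left = z_test_p_value(1680, 1700, 150, 100, "left")

for alpha in (0.10, 0.05, 0.01):
    print(alpha, "two-tailed reject:", p_two <= alpha, "| left-tailed reject:", p_left <= alpha)
```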
Until now we have tested a single population mean, so we now move to tests of two population
means in large samples. The hypothesis testing in this case is as follows:
Null Hypothesis H0 : µ1 - µ2 = θ0
(where µ1 and µ2 are the means of two different population distributions and θ0 is the
null value of µ1 - µ2).
Test Statistic: $Z = \frac{\bar{x} - \bar{y} - \theta_0}{\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}}$
(where $\bar{x}$ and $\bar{y}$ are the sample means from the corresponding population distributions, s1 and s2
are the respective sample standard deviations and m and n are the sample sizes taken from
the two populations such that m > 40 and n > 40).
Alternative Hypothesis: Ha : µ1 - µ2 < θ0 (left-tailed), Ha : µ1 - µ2 > θ0 (right-tailed) or
Ha : µ1 - µ2 ≠ θ0 (two-tailed), with the p-value found as in the single-sample case.
Example 3.3 In class A of 50 students, the mean height was found to be 62.4 inches with
a standard deviation of 2.25 inches. In another class B of 50 students, the mean height was
61.5 inches whereas standard deviation was 2.5 inches. Test the hypothesis at level of
significance 0.05 and 0.01 that the students in class A are taller than the students in the
class B.
Step 1: µ1 - µ2 = difference between true average height of students in class A and class B.
Step 2: H0 : µ1 - µ2 = 0 i.e. there is no difference in the mean heights of students in the two
classes.
Step 3: Ha : µ1 - µ2 > 0 i.e. students in class A are taller than students in class B on
average.
Step 4: z = (62.4-61.5-0)/$\sqrt{\frac{2.25^2}{50} + \frac{2.5^2}{50}}$
= 0.9/0.476 = 1.89
Step 5: p-value = 1 - Φ(1.89) = 0.0294
Step 6: Since the p-value is smaller than 0.05 but greater than 0.01, we reject the
null hypothesis at α = 0.05 and conclude that students in class A are taller than students in
class B on average; however, we do not reject the null hypothesis that there is no difference in
the mean heights of students in the two classes at α = 0.01.
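The calculation in Example 3.3 can be verified with a few lines of code. This is a sketch under the large-sample assumptions stated above; the function name `two_sample_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(x_bar, y_bar, s1, s2, m, n, theta0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = theta0, with its standard error."""
    std_error = sqrt(s1**2 / m + s2**2 / n)
    z = (x_bar - y_bar - theta0) / std_error
    return z, std_error

# Class A: mean 62.4, s.d. 2.25, 50 students; class B: mean 61.5, s.d. 2.5, 50 students.
z, std_error = two_sample_z(62.4, 61.5, 2.25, 2.5, 50, 50)
p_right = 1 - NormalDist().cdf(z)   # right-tailed test, Ha: mu1 - mu2 > 0

print(round(std_error, 3), round(z, 2), round(p_right, 4))
```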
For large sample tests, the test statistic of population means was Z. However, in the case of
small samples, the test statistic is t with number of degrees of freedom (df) equal to n-1
and the p-value is calculated under the t-distribution curve with n-1 df, assuming that the
population is normally distributed. The three different tests are shown as below for tests of
single population means:
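a) Left-tailed test
Alternative Hypothesis Ha : µ < µ0
p-value = area to the left of calculated t (negative value) for given df = P(T ≤ -t)
Figure 5
b) Right-tailed test
Alternative Hypothesis Ha : µ > µ0
p-value = area to the right of calculated t (positive value) for given df = P(T ≥ t).
Figure 6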
c) Two-tailed test
Alternative Hypothesis Ha : µ ≠ µ0
p-value = sum of the area to the left of calculated negative t value and to the right of
calculated positive t value for given df = P(T ≤ -t) + P(T ≥ t). This is shown in figure 7
given below.
Figure 7
Once the p-value is calculated, it is again compared with α and the following decision rule is
used: reject H0 if p-value ≤ α and do not reject H0 if p-value > α.
Now, the following examples illustrate the use of p-value to test population means in case of
small samples.
Example 3.4 The breaking power of 10 cables manufactured by a firm was tested, which gave
a mean of 6600 lb and a standard deviation of 550 lb. The manufacturer claimed
that the mean breaking power is 7000 lb. Test this claim against the alternative hypothesis that
the breaking power is less than 7000 lb at (a) 5 % and (b) 1 % level of significance,
assuming a normal distribution.
Step 2: H0 : µ = 7000
Step 3: Ha : µ < 7000
Step 4: t = (6600 - 7000)/(550/√10)
= -400/174.1 = -2.30
Step 5: p-value = P(T ≤ -2.30) with df = 10 - 1 = 9
(since the test is left-tailed, we look for the area to the left of -2.30 or to the right of 2.30 in the t
distribution areas table with df = 9)
Step 6: Since the p-value is less than 0.05 but greater than 0.01, we reject the
null hypothesis at α = 0.05 but do not reject the null hypothesis at α = 0.01.
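Where a t table is not at hand, the tail area in Example 3.4 can be approximated numerically. The sketch below is not the method used in the text: it integrates the t density with Simpson's rule using only the standard library, and the helper names `t_pdf` and `t_upper_tail` are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half of the probability lies above 0; subtract the area between 0 and t.
    return 0.5 - total * h / 3

# Cable data: n = 10, sample mean 6600, sample s.d. 550, null value 7000.
t_stat = (6600 - 7000) / (550 / sqrt(10))
p_left = t_upper_tail(abs(t_stat), df=9)   # by symmetry, P(T <= -|t|) = P(T >= |t|)

print(round(t_stat, 2), round(p_left, 4))
```

The approximate p-value falls between 0.01 and 0.05, in line with Step 6.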
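Tests of two population means in small samples are similar to those for large samples, with the
difference that the test statistic is now t and we use the t-distribution, assuming that both
populations are normally distributed, where
$t = \frac{\bar{x} - \bar{y} - \theta_0}{\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}}$
and
$df = \frac{\left(\frac{s_1^2}{m} + \frac{s_2^2}{n}\right)^2}{\frac{(s_1^2/m)^2}{m-1} + \frac{(s_2^2/n)^2}{n-1}}$ (rounded down to the nearest integer).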
Example 3.5 The I.Q's of 20 students in a class showed a mean of 101 along with standard
deviation 11.5 whereas I.Q's of another class containing same number of students showed a
mean of 110 with standard deviation of 15.5. Test whether there is a difference between
I.Q's of the two classes using α = 0.05 and 0.01 assuming that the IQ's in both classes
follow a normal distribution.
Step 3: Ha : µ1 - µ2 ≠ 0 i.e. there is a statistical difference in the mean I.Q's in two classes.
Step 4: t = (101-110-0)/$\sqrt{\frac{11.5^2}{20} + \frac{15.5^2}{20}}$
= -9/4.32 = -2.1
Step 5: p-value = P(T ≤ -2.1) + P(T ≥ 2.1) with df = 35
(since the test is two-tailed, we look for the area to the left of -2.1 and to the right of 2.1 in the t
distribution areas table with df = 35)
Step 6: Since the p-value is smaller than 0.05 but greater than 0.01, we reject the
null hypothesis at α = 0.05; however, we do not reject the null hypothesis that there is no
difference in the mean I.Q's in the two classes at α = 0.01.
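The test statistic and the degrees of freedom used in Example 3.5 can be reproduced as below. This is a minimal sketch; the function name `two_sample_t` is hypothetical, and the degrees of freedom follow the usual unequal-variance formula, rounded down.

```python
from math import floor, sqrt

def two_sample_t(x_bar, y_bar, s1, s2, m, n, theta0=0.0):
    """Two-sample t statistic (variances not assumed equal) and its rounded-down df."""
    v1 = s1**2 / m   # estimated variance of the first sample mean
    v2 = s2**2 / n   # estimated variance of the second sample mean
    t = (x_bar - y_bar - theta0) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (m - 1) + v2**2 / (n - 1))
    return t, floor(df)

# I.Q. data: class 1 mean 101, s.d. 11.5; class 2 mean 110, s.d. 15.5; 20 students each.
t_stat, df = two_sample_t(101, 110, 11.5, 15.5, 20, 20)
print(round(t_stat, 2), df)
```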
The pooled t-test is applied in hypothesis testing procedures whenever the two populations
of interest are normally distributed and have equal population variances, that is
$\sigma_1^2 = \sigma_2^2$. Let $\sigma^2$ be the common variance of both populations. Then, if the test statistic
for the hypothesis test is z, it is given by the formula
$z = \frac{\bar{x} - \bar{y} - \theta_0}{\sqrt{\frac{\sigma^2}{m} + \frac{\sigma^2}{n}}}$
and so
$z = \frac{\bar{x} - \bar{y} - \theta_0}{\sigma\sqrt{\frac{1}{m} + \frac{1}{n}}}$.
Now, if the common population variance $\sigma^2$ is known then the z value can easily be found by
using the above formula. However, if $\sigma^2$ is unknown then we have to estimate it using the
information from the samples. If $s_1^2$ and $s_2^2$ are the sample variances, the pooled estimator of $\sigma^2$ is
$s_p^2 = \frac{(m-1)s_1^2 + (n-1)s_2^2}{m+n-2}$,
where (m-1) degrees of freedom are contributed by the first sample and (n-1)
degrees of freedom are contributed by the second sample to the estimate of $\sigma^2$. The
total degrees of freedom turn out to be (m-1)+(n-1) = m+n-2. From statistical theory,
if $\sigma^2$ is replaced by $s_p^2$, then the test statistic used will be t, which follows a t-distribution
with m+n-2 degrees of freedom. This t-statistic is called pooled t, and the confidence
intervals and tests based on this t variable for the difference between two
population means are called pooled t confidence intervals and the pooled t test respectively.
However, the pooled t procedure is advised only when the null hypothesis that there is no
difference between the two population variances, i.e. H0: $\sigma_1^2 = \sigma_2^2$, does not get rejected. In
practice, it is recommended to use the usual two-sample t procedure unless we have a very
strong reason to believe that the two normally distributed populations have equal variances,
especially when the two sample sizes are not equal.
Example 3.6 Consider the previous Example 3.5 with the additional information that $\sigma_1^2 = \sigma_2^2$,
and suppose that in the first class there are 20 students but in the second class there are 15
students. Now, test whether there is a difference between the I.Q's of the two classes using α =
0.10 and 0.05, assuming that the IQ's in both classes follow a normal distribution.
The first three steps will be the same but step 4 now becomes
Step 4: $s_p^2 = \frac{(19)(11.5)^2 + (14)(15.5)^2}{33} = 178.07$ and
$t = \frac{101 - 110 - 0}{\sqrt{178.07\left(\frac{1}{20} + \frac{1}{15}\right)}} = \frac{-9}{4.56} \approx -2.0$ with df = 20 + 15 - 2 = 33.
Step 5: p-value = P(T ≤ -2.0) + P(T ≥ 2.0) = 2(0.027) = 0.054
(since the test is two-tailed, we look for the area to the left of -2.0 and to the right of 2.0 in the t
distribution areas table. For df = 30 and 35, the area to the right of 2.0 is given as 0.027 and so
for df = 33 we take the area to the right of 2.0 equal to 0.027)
Step 6: Since the p-value is smaller than 0.10 but greater than 0.05, we reject the null
hypothesis at α = 0.10; however, we do not reject the null hypothesis that there is no difference
in the mean I.Q's in the two classes at α = 0.05.
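The pooled computation for this example can be sketched in code. The function name `pooled_t` is hypothetical; the formulas are the pooled variance estimate and the t statistic with m+n-2 degrees of freedom described above.

```python
from math import sqrt

def pooled_t(x_bar, y_bar, s1, s2, m, n, theta0=0.0):
    """Pooled two-sample t statistic, pooled variance and degrees of freedom."""
    df = m + n - 2
    # Each sample variance is weighted by its own degrees of freedom.
    sp2 = ((m - 1) * s1**2 + (n - 1) * s2**2) / df
    t = (x_bar - y_bar - theta0) / sqrt(sp2 * (1 / m + 1 / n))
    return t, sp2, df

# Class 1: 20 students, mean 101, s.d. 11.5; class 2: 15 students, mean 110, s.d. 15.5.
t_stat, sp2, df = pooled_t(101, 110, 11.5, 15.5, 20, 15)
print(round(t_stat, 2), round(sp2, 2), df)
```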
3. CALCULATION OF β
This means that H0 will not get rejected when the sample mean lies in the interval (µ0 - zα/2(s/√n), µ0 + zα/2(s/√n)). Let
µ' denote an alternative value of µ; then β(µ') = P(H0 is not rejected | µ = µ').
Example 3.6 Consider a random sample consisting of 100 students who took a statistics
test marked out of 50. Let µ denote the true average marks obtained. Consider testing H0 : µ =
35 against the alternative µ > 35 with a sample standard deviation of 7.5 at α = 0.05. Find
β(37).
Here the test is right-tailed with zα = z0.05 = 1.645, so
β(37) = Φ(zα + (µ0 - µ')/(s/√n)) = Φ(1.645 + (35 - 37)/(7.5/√100)) = Φ(-1.02) = 0.1539.
This means that 15.39 % of the time the null hypothesis does not get rejected even
though it is false.
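Now, consider a large-sample test of two population means with a two-tailed alternative. In that
case, H0 : µ1 - µ2 = θ0 and Ha : µ1 - µ2 ≠ θ0 . Let θ' be the alternative value of µ1 - µ2; then
the rejection areas are given by z ≤ -zα/2 and z ≥ zα/2, which are equivalent to
$\bar{x} - \bar{y} \le \theta_0 - z_{\alpha/2}S$ and $\bar{x} - \bar{y} \ge \theta_0 + z_{\alpha/2}S$ respectively,
where $S = \sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$.
The area in which H0 is not rejected will lie between $\theta_0 - z_{\alpha/2}S$ and $\theta_0 + z_{\alpha/2}S$, so that
$\beta(\theta') = \Phi\left(z_{\alpha/2} - \frac{\theta' - \theta_0}{S}\right) - \Phi\left(-z_{\alpha/2} - \frac{\theta' - \theta_0}{S}\right)$.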
Example 3.7 Consider Example 3.3 and suppose that the alternative value of µ1 - µ2 is 2. At α =
0.01 find β(2).
Here, we have a right-tailed test with zα = z0.01 = 2.33. We have already calculated
S = $\sqrt{\frac{2.25^2}{50} + \frac{2.5^2}{50}}$ = 0.476, so
β(2) = Φ(zα - (θ' - θ0)/S) = Φ(2.33 - 2/0.476) = Φ(-1.87) = 0.0307.
This implies that 3.07 % of all experiments of this kind would lead to not rejecting the null
hypothesis even though it is false.
In this way β can be calculated given the alternative value of the parameter of interest.
However, the farther the alternative value of the parameter from the null value, the smaller
will be β since there will be a greater chance that the given null hypothesis gets rejected.
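The β calculations for the two right-tailed examples in this section can be sketched as follows. The function name `beta_right_tailed` is hypothetical. It uses exact normal quantiles from `statistics.NormalDist` rather than table values such as 1.645 and 2.33, so the results differ from the hand calculations in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def beta_right_tailed(null_value, alt_value, std_error, alpha):
    """P(H0 is not rejected | true value = alt_value) for a right-tailed large-sample z-test."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)   # critical value with upper-tail area alpha
    return STD_NORMAL.cdf(z_alpha - (alt_value - null_value) / std_error)

# Single mean: H0: mu = 35 vs mu > 35, n = 100, s = 7.5, alternative value 37, alpha = 0.05.
beta_37 = beta_right_tailed(35, 37, 7.5 / sqrt(100), 0.05)

# Two means (Example 3.3 data): H0: mu1 - mu2 = 0, alternative value 2, alpha = 0.01.
S = sqrt(2.25**2 / 50 + 2.5**2 / 50)
beta_2 = beta_right_tailed(0, 2, S, 0.01)

# Moving the alternative value further from the null value makes beta smaller.
beta_39 = beta_right_tailed(35, 39, 7.5 / sqrt(100), 0.05)

print(round(beta_37, 4), round(beta_2, 4), beta_39 < beta_37)
```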
4. SELECTION OF A TEST
(b) Use of an appropriate test statistic along with the corresponding rejection region in
accordance with the alternative hypothesis (left-, right- or two-tailed).
(c) Using particular values of level of significance, the critical values of the test statistic are
found or p-value is calculated.
(d) Depending upon where the calculated value of test statistic falls, either null hypothesis
is rejected or not rejected.
When the population is assumed to be normally distributed with mean µ and a known standard
deviation σ, we use the Z-test to test a given null hypothesis. If
the sample size is large and σ is unknown, we still apply the Z-test because of the
Central Limit Theorem (CLT). Thus, we use the Z-test whenever the sample size is large or the
population is normal with known σ. However, if the sample from a normally distributed
population is small and σ is unknown, then we use the t-test. Depending upon the
value of the test statistic calculated in a given test with given α, we decide whether to
reject or not to reject H0.
For example, consider a normal distribution with mean µ and standard deviation σ. If we are
testing H0 : µ = 500 against Ha : µ > 500 where suppose the true value of µ = 501. Now,
true value of µ does not show any large departure from the null hypothesis. However, if
sample size is large, then the p-value of the test might be very low indicating statistical
significance or rejection of null hypothesis even though true value of µ did not differ much
from the null value 500 practically, thereby indicating little practical significance.
To sum up, we must be very careful in carrying out interpretation of the evidence in the
case of large sample size because any minute departure from the null hypothesis will lead to
rejection of the null hypothesis even though there is little significance of such departure.
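The effect of sample size on statistical significance can be illustrated numerically. The sketch below assumes, purely for illustration, that σ = 10 and that the observed sample mean equals the true mean 501; neither value is given in the text, and the function name `right_tailed_p` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def right_tailed_p(x_bar, mu0, sigma, n):
    """p-value of the z-test of H0: mu = mu0 against Ha: mu > mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(z)

SIGMA = 10      # hypothetical population standard deviation (assumption)
X_BAR = 501     # observed mean taken equal to the true mean (assumption)

# The departure from the null value 500 is the same in every row; only n changes.
p_values = {n: right_tailed_p(X_BAR, 500, SIGMA, n) for n in (25, 100, 400, 1600)}
for n, p in p_values.items():
    print(n, round(p, 5))
```

Under these assumptions the same one-unit departure is far from significant at n = 25 but highly significant at n = 1600, which is the point made above about practical versus statistical significance.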
Now, suppose we are carrying out the following hypothesis test, where ψ0 and ψ1 are two disjoint sets.
H0 : η ∈ ψ0 i.e. η belongs to ψ0
Ha : η ∈ ψ1 i.e. η belongs to ψ1
Step 1: We find the maximum value of the likelihood function by finding the maximum
likelihood estimate of η in ψ0.
Step 2: Then we find the maximum value of the likelihood function by finding the maximum
likelihood estimate of η in ψ1.
Step 3: We form the ratio of the two maxima:
λ(x1,x2,.....xn) = (maximum likelihood for η in ψ0) / (maximum likelihood for η in ψ1)
This ratio λ(x1,x2,.....xn) is known as the likelihood ratio statistic value. In this test, the
null hypothesis gets rejected if this ratio is small compared to a selected constant, say, c. In
other words, we have evidence against the null hypothesis and in favour of the alternative
hypothesis when the denominator of λ(x1,x2,.....xn) is large compared to its numerator.
The choice of c depends upon the desired probability of Type I error. For example, for a test
on the mean of a normal distribution with known σ,
rejecting the null hypothesis when λ(x1,x2,.....xn) ≤ c turns out to be equivalent to rejecting it
when |z| is at least as large as a critical value of z fixed by α. Here, the likelihood ratio test is
the same as the z-test in a single sample.
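The equivalence with the z-test can be made explicit with a short derivation. It assumes, as an illustration not stated in the surviving text, a normal sample with known σ and the hypotheses H0 : µ = µ0 against Ha : µ ≠ µ0, with 0 < c < 1.

```latex
\begin{aligned}
L(\mu) &= (2\pi\sigma^2)^{-n/2}
  \exp\!\Big(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2\Big),\\[4pt]
\lambda(x_1,\dots,x_n)
  &= \frac{L(\mu_0)}{\sup_{\mu\neq\mu_0} L(\mu)}
   = \frac{L(\mu_0)}{L(\bar{x})}
   = \exp\!\Big(-\frac{n(\bar{x}-\mu_0)^2}{2\sigma^2}\Big)
   = \exp\!\Big(-\frac{z^2}{2}\Big),
   \qquad z=\frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}},\\[4pt]
\lambda \le c &\iff |z| \ge \sqrt{-2\ln c}.
\end{aligned}
```

The second line uses the identity $\sum (x_i-\mu_0)^2 - \sum (x_i-\bar{x})^2 = n(\bar{x}-\mu_0)^2$. Choosing c so that $\sqrt{-2\ln c} = z_{\alpha/2}$ then gives a test with Type I error probability α, which is the two-tailed z-test.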
Among its advantages, this test can be applied when X's have different distributions and
also when they are not independent. However, one disadvantage of using this test is that
the functional form of the probability distribution from which the sample is derived must be
known in order to find out the value of λ(x1,x2,.....xn). For example, to get the t-test from the
likelihood ratio test we must assume that X's have a normal distribution or else there will
not be any way to write the joint probability distribution for all sample values.
PRACTICE QUESTIONS
Q1) Given the following p-values, find for which p-values the null hypothesis gets rejected
at level of significance 5 %
(i) 0.195
(ii) 0.005
(iii) 0.065
(iv) 0.049
(v) 0.025
Q2) Consider a large sample test of a single population mean. Suppose the test is right-
tailed, find the p-values associated with the following z-values:
(i) 1.96
(ii) 0.35
(iii) 2.33
(iv) 0.95
(v) 0.05
Q3) The IQ's of 50 students in a certain class, which follow a normal distribution, gave a
mean of 106 with a standard deviation of 9. Test the null hypothesis that µ = 108 against the
alternative µ < 108 using the p-value method at α = 0.10, 0.05 and 0.01. Also test the null
hypothesis that µ = 108 against the alternative µ ≠ 108 for the given levels of significance.
Q4) Two types of land, 1 and 2, each having 60 plots of equal areas were selected to test
effect of a particular pesticide on rice production. The plots of land 1 were given the
treatment of the pesticide while the plots of land 2 were not. The plots of land 1 yielded a
mean output of 150.5 kg with the standard deviation of 10 kg whereas the plots of land 2
gave a mean output of 145.6 kg with the standard deviation of 1.5 kg. Check, using p-value
method, whether there is a significant improvement in the rice output due to the application
of pesticide for given levels of significance 0.01, 0.05 and 0.10.
Q5) The lifetimes of a sample of 16 light bulbs manufactured by a company were tested,
which gave a mean of 1550 hours with a standard deviation of 125 hours. Suppose µ is the
true average lifetime of all bulbs. Test the following hypotheses using the p-value
method, assuming a normal distribution of the lifetime of light bulbs-
Q6) In a particular class 40 students were selected and divided into two groups, A and B,
each having 20 students. The average height of students in group A was found to be 66.8
inches with standard deviation of 2.55 inches whereas the average heights of students in
group B was found to be 65.6 inches with standard deviation of 2.67 inches. Test whether
the true average height of students in group A is more than that of students in group B
using p-value method for α = 0.05 and 0.10 assuming that heights in both classes have a
normal distribution.
Q7) A statistics exam of 100 marks was given to students in class A and class B. There were
20 students in class A and 25 students in class B. In class A, mean marks obtained was 72
with standard deviation of 7.5 whereas in class B, mean marks obtained was 75 with
standard deviation of 7.25. Test whether there is a significant difference between the
performance of the two classes for levels of significance 0.01, 0.05 and 0.10 using p-value
method assuming normal distribution.
Q8) Consider a random sample consisting of 100 cables manufactured by a firm. These
cables were tested for their breaking strength giving a mean of 5700 lb and a standard
deviation of 450 lb. However, the manufacturer of the firm claimed that the true average
breaking strength is 6000 lb. Consider testing this claim against alternative hypothesis that
the breaking strength is less than 6000 lb at α = 0.05. Find the probability of Type II error
given the alternative value of mean as 5800.
Q9) Consider Q4 where two types of land 1 and 2 were given treatment and non-treatment
of pesticide in case of rice production respectively. Suppose the given level of significance is
0.01 and the alternative value of µ1 - µ2 = 5, find the probability of Type II error i.e β(5)
and interpret it.
(ii) Calculate β for the given alternative value of µ for the sample sizes n = 50, 100 and
1600, given α = 0.05.
(iii) Find the p-value if the observed value of the sample mean is $\bar{x}$ = 101 and the sample size is
1600. Check whether there is any statistical significance for the chosen levels of
significance. Explain.
TABLE OF CONTENTS
Learning Objectives
1. Introduction
Practice Questions
Content Developer
References
1. Jay L. Devore: Probability and Statistics for Engineering and the Sciences, Cengage
Learning, 8th edition [Chapter 8 and 9].
2. Irwin Miller and Marylees Miller: Mathematical Statistics, Pearson, 7th edition.
3. Allen Webster: Probability and Statistics, 4th edition, Richard D. Irwin/McGraw-Hill, Burr
Ridge, IL, 2010
1. INTRODUCTION
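If the sample size n is small in comparison to the size of the population then X will follow a
Binomial distribution with mean E(X) = np and variance Var(X) = σ2 = np(1-p). However, if the
sample size is large such that both conditions, np ≥ 10 and n(1-p) ≥ 10, are satisfied, then
X and hence $\hat{p} = X/n$ will both approximately follow a normal distribution.
(i) The mean of $\hat{p}$ is given by $E(\hat{p}) = E\left(\frac{X}{n}\right) = \frac{np}{n} = p$.
(ii) $V(\hat{p}) = V\left(\frac{X}{n}\right) = \frac{1}{n^2}V(X) = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n}$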
Now, the following sections will show the tests concerning population proportion for both
small and large samples.
Since the sample size is large, irrespective of whether σ is known or unknown, the test
statistic for the population proportion p will be Z, which follows a standard normal
distribution. Here, both X and $\hat{p}$ will approximately follow a normal distribution.
Depending upon the alternative hypothesis, either the critical values of Z are found and
compared with the calculated value of the test statistic Z, or the p-value is
calculated for the observed value of Z and compared with the given level of significance α, to
decide whether to reject the null hypothesis or not. However, it must be noted
that these tests are valid only when both conditions, np0 ≥ 10 and n(1-p0) ≥ 10,
are satisfied given that H0 is true.
a) Left-tailed test
Alternative Hypothesis Ha : p < p0
Test Statistic: $Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
p-value = area to the left of calculated z (negative value)= P(Z ≤ -z) = Φ(-z) [where Φ
stands for the cumulative area to the left of z], where we reject H0 if p-value ≤ α and do not
reject H0 if p-value > α.
Figure 1(a) shows the rejection area for the z-test whereas figure 1(b) shows the p-value given
by the shaded area.
b) Right-tailed test
Alternative Hypothesis Ha : p > p0
Test Statistic: $Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
p-value = area to the right of calculated z (positive value)= P(Z ≥ z)= 1 - Φ(z), where we
reject H0 if p-value ≤ α and do not reject H0 if p-value > α. Figure 2(a) shows
the rejection area for the z-test whereas figure 2(b) shows the p-value given by the shaded
area.
c) Two-tailed test
Alternative Hypothesis Ha : p ≠ p0
Test Statistic: $Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
p-value = sum of the area to the left of calculated negative z value and to the right of
calculated positive z value = P(Z ≤ -z) + P(Z ≥ z) = [Φ(-z)]+ [1 - Φ(z)] = 2[1 - Φ(z)] as
shown in figure 3(b).
{since, area to left of -z is same as the area to the right of z due to symmetry of z-curve
and we reject H0 if p-value ≤ α and do not reject H0 if p-value > α}.
Now, the following examples illustrate the test procedures for population proportions in case
of large samples.
Example 1: A firm manufactured a particular medicine for curing a disease. It claimed that
the medicine was 85% effective in curing the disease within a time span of 3 days. A
random sample of 400 people was selected having this disease and it was found that the
medicine cured the disease for 320 people. Check whether the firm claim is true at level of
significance (i) 0.05 and (ii) 0.01.
Step 1: p = the true proportion or probability that the disease is cured using the medicine.
Step 2: H0 : p = 0.85
Step 3: Ha : p < 0.85
Step 4: Here, np0 = 400(0.85) = 340 > 10 and n(1-p0) = 400(0.15) = 60 >10 and so we
can apply the large sample z- test in this case.
Step 5: z = (0.80 - 0.85)/$\sqrt{(0.85)(0.15)/400}$ = -0.05/0.01785 = -2.80 [where $\hat{p}$ = x/n
= 320/400 = 0.8]
Step 6: (a) Critical values of z: For α = 0.05, the value of z such that the area to the left of it equals
0.05 is given by -1.645. Similarly for α = 0.01, the value of z such that the area to the left of it
equals 0.01 is given by -2.33 (since the test is left-tailed and values of z are taken from the
standard normal curve areas table).
(b) p-value = Φ(-2.80) = 0.0026.
Step 7: (a) The calculated value of z = -2.80 is less than both -2.33 and -1.645, implying
that it lies in the rejection area for both α = 0.05 and 0.01, and hence we reject the firm's
claim that the medicine is 85% effective in curing the disease at both levels of significance.
(b) The p-value is less than both 0.01 and 0.05, again implying that we reject the firm's
claim that the medicine is 85% effective in curing the disease at both levels of significance.
Example 2: Consider an experiment where 200 cars were tested for emission of a
particular toxic pollutant. If a car emitted this pollutant more than a certain desired level, it
was considered to be defective. Out of 200 cars, 45 cars were found to be defective.
However, the manufacturer of these cars claimed that the proportion of such defectives was
0.20. Test this claim against p ≠ 0.20 at α = 0.01.
Step 2: H0 : p = 0.20
Step 3: Ha : p ≠ 0.20
Step 4: Here, np0 = 200(0.20) = 40 > 10 and n(1-p0) = 200(0.80) = 160 >10 and so we
can apply the large sample z- test in this case.
Step 5: z = (0.225 - 0.20)/$\sqrt{(0.20)(0.80)/200}$ = 0.025/0.0283 = 0.88 [where $\hat{p}$ = x/n
= 45/200 = 0.225]
Step 6: (a) Critical values of z: For α = 0.01, the value of z such that the area to the left of it equals
0.01/2 = 0.005 is given by -2.58, and the value such that the area to the right of it equals
0.01/2 = 0.005 is given by 2.58, so that the total rejection area equals 0.01 (since the test is
two-tailed and values of z are taken from the standard normal curve areas table).
(b) p-value = 2[1 - Φ(0.88)] = 2(0.1894) = 0.3788.
Step 7: (a) The calculated value of z = 0.88 is greater than -2.58 and less than 2.58,
implying that it lies in the acceptance area at α = 0.01, and hence we do not reject the
manufacturer's claim that the true proportion of defective cars is 20% at α = 0.01.
(b) The p-value is greater than 0.01, again implying that we do not reject the null
hypothesis.
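The large-sample test for a single proportion, including the validity check, can be sketched as below and applied to Examples 1 and 2. The function name `proportion_z_test` and its `tail` argument are hypothetical. Unrounded arithmetic gives z of about -2.80 and 0.88; small differences from hand-rounded values do not change either conclusion.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(x, n, p0, tail):
    """Return (z, p-value) for the large-sample test of H0: p = p0.

    tail is 'left' (Ha: p < p0), 'right' (Ha: p > p0) or 'two' (Ha: p != p0).
    """
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("large-sample conditions n*p0 >= 10 and n*(1-p0) >= 10 are not met")
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standard error uses the null value p0
    phi = NormalDist().cdf
    if tail == "left":
        p_value = phi(z)
    elif tail == "right":
        p_value = 1 - phi(z)
    else:
        p_value = 2 * (1 - phi(abs(z)))
    return z, p_value

# Example 1: 320 cures out of 400, claimed p = 0.85, left-tailed test.
z1, p1 = proportion_z_test(320, 400, 0.85, "left")
# Example 2: 45 defectives out of 200, claimed p = 0.20, two-tailed test.
z2, p2 = proportion_z_test(45, 200, 0.20, "two")

print(round(z1, 2), round(p1, 4), "reject at 0.01:", p1 <= 0.01)
print(round(z2, 2), round(p2, 4), "reject at 0.01:", p2 <= 0.01)
```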
Now, we move to tests concerning population proportions in case of two samples in large
samples since until now we did tests of single population proportion in large samples. The
hypothesis testing procedure in this case is as follows:
Consider two population distributions and let's denote them by Popu1 and Popu2. Let p1 and
p2 be the true fraction of successes in Popu1 and Popu2 respectively. Suppose random
samples of sizes m and n, both large, are selected from Popu1 and Popu2 respectively
independent of one another. Let X1 and X2 denote the sample number of successes for
Popu1 and Popu2 respectively. The X1 and X2 will follow a binomial distribution given that
the sample sizes, m and n, are relatively smaller as compared to the respective population
sizes. Here, the parameter of interest is the difference in the two population proportions i.e.
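p1 - p2. An estimator of p1 - p2 is its sample counterpart, the difference in the sample
proportions:
$\hat{p}_1 - \hat{p}_2 = X_1/m - X_2/n$
In this case, X1 ~ Bin (m, p1) and X2 ~ Bin (n, p2) where X1 and X2 are independent random
variables and E(X1) = mp1, E(X2) = np2, Var(X1)=mp1(1-p1) and Var(X2)=np2(1-p2). The
mean and variance of $\hat{p}_1 - \hat{p}_2$ are given by-
$E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2$
$Var(\hat{p}_1 - \hat{p}_2) = \dfrac{p_1(1-p_1)}{m} + \dfrac{p_2(1-p_2)}{n}$
Since m and n are both large, X1 and X2 will approximately follow a normal distribution
implying that $\hat{p}_1$ and $\hat{p}_2$, and hence also $\hat{p}_1 - \hat{p}_2$, follow a normal distribution approximately. Standardizing gives
$Z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{p_1(1-p_1)/m + p_2(1-p_2)/n}}$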
If null hypothesis is true then p1 = p2 and let's assume p1 = p2 = p which means that we are
assuming two population distributions to have a common parameter p. In this case, we
combine two random samples into one with sample size equal to m+n with total number of
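sample successes equal to X1 + X2. The estimator of p is given by a weighted average of
$\hat{p}_1$ and $\hat{p}_2$ -
$\hat{p} = \dfrac{X_1 + X_2}{m+n} = \dfrac{m}{m+n}\hat{p}_1 + \dfrac{n}{m+n}\hat{p}_2$
Test statistic: $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\,(1/m + 1/n)}}$ [using $\hat{q} = 1 - \hat{p}$]
Decision rule: Reject H0 if p-value ≤ α and do not reject H0 if p-value > α and the test is
valid if $m\hat{p}_1$, $m(1-\hat{p}_1)$, $n\hat{p}_2$ and $n(1-\hat{p}_2)$ are all more than 10.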
Example 3: Two groups of dogs, A and B, were formed to test the impact of a particular
vaccine to cure anemia. Both the groups consisted of 250 dogs but the vaccine was given
only to group A and not to group B. In group A 190 dogs and in group B 160 dogs recovered
from the disease. Test the hypothesis that the vaccine is able to cure the disease using level
of significance (i) 0.05 and (ii) 0.01.
Step 2: H0 : p1 - p2 = 0 i.e. there is no difference or the vaccine is not able to cure anemia.
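Step 3: Ha : p1 - p2 > 0 i.e. the vaccine is able to cure anemia (right-tailed test).
Here, $\hat{p}_1$ = 190/250 = 0.76 and $\hat{p}_2$ = 160/250 = 0.64, so $m\hat{p}_1$ = 250(0.76) = 190, $m(1-\hat{p}_1)$ =
250(0.24) = 60, $n\hat{p}_2$ = 250(0.64) = 160 and $n(1-\hat{p}_2)$ = 250(0.36) = 90 are all more than 10 and
so the large sample z-test can be applied.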
Step 7:(a) Critical values of z: For α = 0.05, the value of z such that area to right of it
equals 0.05 is given by 1.645. Similarly for α = 0.01, the value of z such that area to right
of it equals 0.01 is given by 2.33. (since the test is right-tailed and values of z are taken
from the standard normal curve areas table).
Step 8: (a) The calculated value of z = 2.93 is greater than both 2.33 and 1.645 implying
that it lies in the rejection area for both α = 0.05 and 0.01 and hence we reject the null
hypothesis and conclude that the vaccine is effective at both α = 0.05 and 0.01.
(b) The p-value is less than both 0.01 and 0.05 again implying that we reject the null
hypothesis and conclude that the vaccine is effective at both α = 0.05 and 0.01.
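The pooled two-sample statistic used in Example 3 can be checked with a short Python sketch. The function name `two_sample_prop_z` is an assumption made for illustration; the pooling of the two samples follows the weighted-average estimator of p described above.

```python
import math

def two_sample_prop_z(x1, m, x2, n):
    """Pooled large-sample z statistic for H0: p1 - p2 = 0 (hypothetical helper)."""
    p1_hat, p2_hat = x1 / m, x2 / n
    p_pool = (x1 + x2) / (m + n)  # common estimate of p under H0
    se = math.sqrt(p_pool * (1.0 - p_pool) * (1.0 / m + 1.0 / n))
    return (p1_hat - p2_hat) / se

# Example 3: 190 of 250 vaccinated dogs and 160 of 250 unvaccinated dogs recovered
z = two_sample_prop_z(190, 250, 160, 250)
print(round(z, 2))
print("reject at 0.05:", z >= 1.645, "| reject at 0.01:", z >= 2.33)
```

The critical values 1.645 and 2.33 are the right-tailed table values quoted in Step 7.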
We now move to tests concerning a population proportion in case of small samples. When
the sample size n is small, then the variable X (the number of sample successes) will simply
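follow a Binomial distribution i.e. X ~ Bin(n,p). The null hypothesis, common to all the
tests, is H0 : p = p0 and the test procedures are as follows-
a) Right-tailed test
Alternative Hypothesis: Ha : p > p0
Test statistic: X
Given that H0 is true, X will follow binomial distribution having parameters n and p0 i.e.
X ~ Bin(n,p0). Now,
P(Type I error) = P(X ≥ a when X ~ Bin(n,p0)) = 1 - B(a-1; n,p0)
where B(x; n,p) denotes the cumulative binomial probability P(X ≤ x) and the rejection area is x ≥ a.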
Since X is a discrete random variable, it is usually not possible to find a value of a such that
P(Type I error) = α exactly and therefore we choose the smallest a satisfying the condition
[1 - B(a-1; n,p0)] ≤ α, since this gives the largest rejection area (a,a+1,...,n) whose
probability under H0 does not exceed α. Then we compare this value a with the observed
value of X to conclude whether the null hypothesis is rejected or not.
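b) Left-tailed test
Alternative Hypothesis: Ha : p < p0
Test statistic: X
Given that H0 is true, X will follow binomial distribution having parameters n and p0 i.e.
X ~ Bin(n,p0). Now,
P(Type I error) = P(X ≤ a when X ~ Bin(n,p0)) = B(a; n,p0)
and the rejection area is x ≤ a.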
Again, since X is a discrete random variable, it becomes difficult to find value of a such that
P(Type I error) = α and therefore we use the condition B(a; n,p0) ≤ α to find the critical
value a. Once the value a is found, we compare it with observed value of X to conclude
whether the null hypothesis is rejected or not.
c) Two-tailed test
Alternative Hypothesis: Ha : p ≠ p0
Test statistic: X
Given that H0 is true, X will follow binomial distribution having parameters n and p0 i.e.
X ~ Bin(n,p0). Now,
P(Type I error) = P(X ≤ a1 or X ≥ a2 when X ~ Bin(n,p0)) = B(a1; n,p0) + [1 - B(a2-1; n,p0)]
Again, since X is a discrete random variable, it becomes difficult to find value of a such that
P(Type I error) = α and therefore we use the conditions B(a1; n,p0) ≤ α/2 and 1 - B(a2-1;
n,p0) ≤ α/2 to find the critical values a1 and a2. Comparing these critical values with
observed value of X, we finally conclude whether the null hypothesis is rejected or not.
Now, the following examples illustrate the testing procedures concerning population
proportion in case of small samples.
Example 4: In a particular area, two candidates A and B stood for an election. From that
particular area, it was claimed that 60% of these voters were in favour of candidate A. A
random sample consisting of 25 voters was selected which showed that 13 of these voters
voted for candidate A. Test the hypothesis p=0.60 against p<0.60 at α = 0.05.
Step 2: H0 : p = 0.60
Step 4: Test statistic: X = the number of voters favouring candidate A in the sample (since
sample size is small).
Using the cumulative binomial probabilities table, we have B(10; 25,0.60) = 0.034 ≤ 0.05
whereas B(11; 25,0.60) = 0.078 > 0.05. Therefore the rejection area is given by x ≤ 10.
Step 6: Conclusion- Since the observed value of x is 13, which is greater than 10, it does not
fall in the rejection area and therefore we do not reject the null hypothesis that 60% of the
voters favoured candidate A.
Example 5: A group of scientists developed a new type of batteries claiming that only 10%
of such batteries have a life span of less than 1800 hours. To check this claim, a random
sample of 15 such batteries was selected and it was found that 7 batteries had a life span of
less than 1800 hours. Test this claim at level of significance 0.01.
Step 1: p = the true proportion of batteries having a life span of less than 1800 hours.
Step 2: H0 : p = 0.10 i.e. 10% of such batteries have a life span of less than 1800 hours.
Step 3: Ha : p > 0.10 i.e. more than10% of such batteries have a life span of less than 1800
hours.
Step 4: Test statistic: X = the sample number of batteries having a life span of less than
1800 hours (since sample size is small).
Using the cumulative binomial probabilities table, we have [1-B(5; 15,0.10)] = 0.002 ≤
0.01 whereas [1-B(4; 15,0.10)] = 0.013 > 0.01. Therefore we get (a-1)=5 implying a=6 and so
the rejection area is given by x ≥ 6.
Step 6: Conclusion- Since the observed value of x is 7, which is greater than 6, it falls in
the rejection area and therefore we reject the null hypothesis that 10% of such batteries have a
life span of less than 1800 hours.
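The critical values in Examples 4 and 5 were read from the cumulative binomial table, but they can also be found by direct computation. The following Python sketch assumes nothing beyond the conditions B(a; n,p0) ≤ α and 1 - B(a-1; n,p0) ≤ α stated above; the function names are hypothetical.

```python
from math import comb

def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x) for X ~ Bin(n, p); returns 0 when x < 0."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(0, x + 1))

def lower_critical(n, p0, alpha):
    """Largest a with B(a; n, p0) <= alpha (rejection area x <= a), else None."""
    a = None
    for k in range(n + 1):
        if binom_cdf(k, n, p0) <= alpha:
            a = k
        else:
            break  # the CDF only increases, so no larger k can qualify
    return a

def upper_critical(n, p0, alpha):
    """Smallest a with 1 - B(a-1; n, p0) <= alpha (rejection area x >= a), else None."""
    for a in range(n + 1):
        if 1.0 - binom_cdf(a - 1, n, p0) <= alpha:
            return a
    return None

print(lower_critical(25, 0.60, 0.05))  # Example 4: left-tailed test
print(upper_critical(15, 0.10, 0.01))  # Example 5: right-tailed test
```

For a two-tailed test the same two functions can be called with α/2 in place of α to obtain a1 and a2.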
The small sample tests for the difference between two population proportions are rather
more difficult than their large sample counterparts. One such test is the Fisher-Irwin test,
which is based on the hypergeometric distribution.
This section presents calculation of probability of Type II error, denoted by β, for population
proportion in case of large samples. To find value of β for given values of p, consider large
sample test of a single population proportion where test statistic is z and suppose we have a
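two-tailed test. In that case, the rejection areas are given by z ≤ -zα/2 and z ≥ zα/2 which are
equivalent to -
$\hat{p} \le p_0 - z_{\alpha/2}\sqrt{p_0(1-p_0)/n}$ and $\hat{p} \ge p_0 + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}$ respectively.
This means that H0 will not get rejected in the interval
$\left(p_0 - z_{\alpha/2}\sqrt{p_0(1-p_0)/n},\; p_0 + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}\right)$. Let $p'$ denote the alternative value of p. Then
$\beta(p') = P(\text{not rejecting } H_0 \mid p = p')$
$= P\left(p_0 - z_{\alpha/2}\sqrt{p_0(1-p_0)/n} \le \hat{p} \le p_0 + z_{\alpha/2}\sqrt{p_0(1-p_0)/n} \;\middle|\; p = p'\right)$
$= P\left(\dfrac{p_0 - p' - z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}} \le Z \le \dfrac{p_0 - p' + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right)$
$= \Phi\left(\dfrac{p_0 - p' + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right) - \Phi\left(\dfrac{p_0 - p' - z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right)$.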
Example 6: Consider example 1 where H0 : p = 0.85 and Ha : p < 0.85. Suppose that the
medicine is only 80% effective in curing the disease, find the probability of Type II error at
level of significance 5%.
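In this case, we have a left-tailed test and n = 400, p0 = 0.85, $p'$ = 0.80, zα = z0.05 = 1.645
then,
$\beta(p') = 1 - \Phi\left(\dfrac{p_0 - p' - z_{\alpha}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right) = 1 - \Phi\left(\dfrac{0.85 - 0.80 - 1.645\sqrt{(0.85)(0.15)/400}}{\sqrt{(0.80)(0.20)/400}}\right)$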
This means that, 14.69 % of the times the null hypothesis does not get rejected even
though it is false.
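The Type II error probability of a left-tailed test can also be evaluated numerically. The sketch below is an illustration under the assumption that H0 is rejected when the sample proportion falls at or below p0 - zα√(p0(1-p0)/n); the function names are hypothetical. Evaluated directly for Example 6, it gives a value of about 0.15, close to the 14.69% quoted above; the small gap presumably comes from rounding in the hand calculation.

```python
import math

def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def beta_left_tailed(p0, p_alt, n, z_alpha):
    """P(not rejecting H0: p = p0 | p = p_alt) for Ha: p < p0 (hypothetical helper)."""
    se_null = math.sqrt(p0 * (1.0 - p0) / n)       # standard error under H0
    se_alt = math.sqrt(p_alt * (1.0 - p_alt) / n)  # standard error under the alternative
    cutoff = p0 - z_alpha * se_null                # reject H0 when p_hat <= cutoff
    return 1.0 - phi((cutoff - p_alt) / se_alt)

# Example 6: n = 400, p0 = 0.85, alternative value 0.80, alpha = 0.05
beta = beta_left_tailed(0.85, 0.80, 400, 1.645)
print(round(beta, 3))
```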
Now, consider large sample test but of two population proportion and again a two-tailed
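test. In that case, H0 : p1 - p2 = 0 and Ha : p1 - p2 ≠ 0. The rejection areas are given by
z ≤ -zα/2 and z ≥ zα/2 which are equivalent to -
$\hat{p}_1 - \hat{p}_2 \le -z_{\alpha/2}\sqrt{\hat{p}\hat{q}\,(1/m + 1/n)}$ and $\hat{p}_1 - \hat{p}_2 \ge z_{\alpha/2}\sqrt{\hat{p}\hat{q}\,(1/m + 1/n)}$
Let p1 - p2 be the alternative value then the probability of Type II error will be a function of
p1 and p2 and is given by-
$\beta(p_1, p_2) = P\left[-z_{\alpha/2}\sqrt{\hat{p}\hat{q}\,(1/m + 1/n)} \le \hat{p}_1 - \hat{p}_2 \le z_{\alpha/2}\sqrt{\hat{p}\hat{q}\,(1/m + 1/n)}\right]$
Hence, writing $\sigma = \sqrt{p_1(1-p_1)/m + p_2(1-p_2)/n}$ and using, for large samples,
$\hat{p} = \dfrac{m\hat{p}_1 + n\hat{p}_2}{m+n} \approx \dfrac{mp_1 + np_2}{m+n} = \bar{p}$, with $\bar{q} = 1 - \bar{p}$, we have-
$\beta(p_1, p_2) = \Phi\left(\dfrac{z_{\alpha/2}\sqrt{\bar{p}\bar{q}\,(1/m + 1/n)} - (p_1 - p_2)}{\sigma}\right) - \Phi\left(\dfrac{-z_{\alpha/2}\sqrt{\bar{p}\bar{q}\,(1/m + 1/n)} - (p_1 - p_2)}{\sigma}\right)$.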
Example 7: A test on statistics was given in two classes, A and B, each containing 100
students. Let p1 and p2 be the true proportion of students who passed the test in class A
and B respectively. Suppose p1 = 0.9 and p2 = 0.75. Consider the test where H0: p1-p2 = 0
against Ha: p1-p2 > 0. Find the value of β if it is given that (p1-p2) = 0.1 at α = 0.01.
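In this case, we have a right-tailed test and m = n = 100, p1 = 0.9, p2 = 0.75, zα = z0.01 =
2.33 then
$\bar{p} = \dfrac{mp_1 + np_2}{m+n} = \dfrac{100(0.9) + 100(0.75)}{200}$ = 165/200 = 0.825
$\beta(p_1, p_2) = \Phi\left(\dfrac{z_{\alpha}\sqrt{\bar{p}\bar{q}\,(1/m + 1/n)} - (p_1 - p_2)}{\sigma}\right)$, where $\bar{q} = 1 - \bar{p}$ and $\sigma = \sqrt{p_1(1-p_1)/m + p_2(1-p_2)/n}$
This means that, 68.08 % of the times we end up not rejecting the null hypothesis even
though it is false.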
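A numerical sketch of the right-tailed two-sample case may help in checking Example 7. The function below is hypothetical; it takes the alternative difference `delta` as a separate argument because the example uses (p1-p2) = 0.1 together with p1 = 0.9 and p2 = 0.75. Evaluated this way, β is close to the 68.08% quoted above; the exact figure depends on rounding of the z value.

```python
import math

def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def beta_right_tailed_two_sample(p1, p2, m, n, z_alpha, delta):
    """Approximate P(not rejecting H0: p1 - p2 = 0) for Ha: p1 - p2 > 0."""
    p_bar = (m * p1 + n * p2) / (m + n)  # large-sample value of the pooled estimate
    se_null = math.sqrt(p_bar * (1.0 - p_bar) * (1.0 / m + 1.0 / n))
    sigma = math.sqrt(p1 * (1.0 - p1) / m + p2 * (1.0 - p2) / n)
    return phi((z_alpha * se_null - delta) / sigma)

# Example 7: m = n = 100, p1 = 0.9, p2 = 0.75, alpha = 0.01, (p1 - p2) taken as 0.1
beta = beta_right_tailed_two_sample(0.9, 0.75, 100, 100, 2.33, 0.1)
print(round(beta, 3))
```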
$s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$
where Xi is the ith observation on random variable X in the sample, $\bar{X}$ is the sample mean and n
is the sample size.
The square root of population variance and sample variance are called population standard
deviation (denoted by σ) and sample standard deviation (denoted by s) respectively. The
testing procedure concerning a population variance is based on the null hypothesis that the
population variance has a particular value, say $\sigma_0^2$, where the population has a normal
distribution. In such a case, the test statistic is called chi-squared statistic given by-
$\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2} = \dfrac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{\sigma_0^2}$
which follows a chi-squared distribution with degrees of freedom (d.f) equal to (n-1).
Let $\chi^2_{\alpha,(n-1)}$ and $\chi^2_{(1-\alpha),(n-1)}$ denote the critical values of the chi-squared variable such that the
area to the right of them under the chi-squared distribution with (n-1) d.f is α and (1-α) respectively.
Now, since the chi-squared distribution is not symmetric, $\chi^2_{\alpha,(n-1)}$ will not be the same as
$\chi^2_{(1-\alpha),(n-1)}$, as shown in the following figure 5. In fact, for α < 0.5, $\chi^2_{\alpha,(n-1)}$ will be greater than $\chi^2_{(1-\alpha),(n-1)}$.
For example, using the chi-squared distribution table, we can see that for α = 0.01 and n =
25, $\chi^2_{0.01,24}$ = 42.980 (99th percentile) and $\chi^2_{0.99,24}$ = 10.856 (1st percentile). So, the tests
concerning a single population variance for a normal distribution are as follows:
Following figure 6 shows the rejection areas for the above mentioned tests.
Example 8: A certain firm tested the weights of twenty six randomly selected machines. The
sample mean and variance of weights were found to be 155 kg and 16.6 kg² respectively.
Test the claim that the variance of the weights of machines in the firm is 15 kg² against the
alternative that it is more than 15 kg² at α = 0.05.
Step 2: H0: σ² = 15
Step 3: Ha: σ² > 15 (right-tailed test)
Step 4: Test statistic: $\chi^2 = (n-1)s^2/\sigma_0^2$ = 25(16.6)/15 = 27.67
Step 5: Critical value of χ² = $\chi^2_{\alpha,(n-1)}$ = $\chi^2_{0.05,25}$ = 37.652 [using the chi-squared
distribution table] and rejection area is χ² ≥ $\chi^2_{0.05,25}$.
Step 6: Conclusion: Since the calculated value of χ² = 27.67 < $\chi^2_{0.05,25}$ = 37.652 (the
critical value), it does not fall in the rejection area and hence we do not reject the null
hypothesis that the true variance of the weights of machines in the firm is 15 kg² at α=0.05.
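The arithmetic of Example 8 can be reproduced with a minimal Python sketch. The helper name is an assumption, and the critical value 37.652 is simply the table value quoted in the text, not something the code derives.

```python
def chi_square_variance_stat(n, sample_var, sigma0_sq):
    """Chi-squared statistic (n-1)s^2/sigma0^2 for H0: variance = sigma0_sq."""
    return (n - 1) * sample_var / sigma0_sq

# Example 8: n = 26 machines, s^2 = 16.6 kg^2, claimed variance 15 kg^2
chi2 = chi_square_variance_stat(26, 16.6, 15.0)
critical = 37.652  # table value for alpha = 0.05 with 25 d.f, as quoted in the text
reject = chi2 >= critical  # right-tailed rejection area
print(round(chi2, 2), reject)
```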
Now, let's consider two normal population distributions, Popu1 and Popu2, with variances
$\sigma_1^2$ and $\sigma_2^2$ respectively. Suppose a random sample is selected from each of these populations
and let m and n denote the sample sizes and $s_1^2$ and $s_2^2$ denote the sample variances of Popu1
and Popu2 respectively. The tests concerning two population variances, where population
distributions are normal, are based upon the null hypothesis that the two population
distributions have the same variance. The test statistic in this case is the F-statistic, which is a ratio
of two chi-squared random variables, each divided by its degrees of freedom. Let $\chi_1^2$ and $\chi_2^2$ be two
independent chi-squared random variables with d.f say d.f1 and d.f2, then F-statistic is given by-
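$F = \dfrac{\chi_1^2/d.f_1}{\chi_2^2/d.f_2} \sim F_{d.f_1,\,d.f_2}$
which follows an F-distribution with numerator degrees of freedom d.f1 and denominator
degrees of freedom d.f2.
For the two samples, $(m-1)s_1^2/\sigma_1^2$ and $(n-1)s_2^2/\sigma_2^2$ are independent chi-squared variables
with (m-1) and (n-1) d.f respectively, so that
$F = \dfrac{[(m-1)s_1^2/\sigma_1^2]/(m-1)}{[(n-1)s_2^2/\sigma_2^2]/(n-1)} = \dfrac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$
which follows an F-distribution with numerator degrees of freedom denoted by (m-1) and
denominator degrees of freedom denoted by (n-1).
The lower critical value can be calculated from the upper critical value with the degrees of freedom interchanged, by the following formula-
$F_{(1-\alpha),(m-1),(n-1)} = 1/F_{\alpha,(n-1),(m-1)}$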
We use F-distribution table to find critical values of F. For example, the critical value of F
when α = 0.01 and (m-1) = 25, (n-1) = 30 is F0.01,25,30 = 2.45 whereas F(1-0.01),25,30 =
F0.99,25,30 = 1/ F0.01,30,25 = 1/2.54 = 0.39 using above formula.
The tests concerning two population variances for normal distributions are as follows:
Test statistic: $F = s_1^2/s_2^2$
since under Null Hypothesis $\sigma_1^2 = \sigma_2^2$ and F follows an F-distribution with numerator degrees
of freedom (m-1) and denominator degrees of freedom (n-1).
The figure 8 shows the rejection areas for the above mentioned tests.
The following example illustrates the use of F-distribution in case of testing procedures
concerning two population variances.
Example 9: Consider two classes A and B containing 16 and 25 students respectively. The
mean heights of students in both classes were computed and it was found that there was no
significant difference in their mean heights. However, the sample standard deviations in
class A and B were 9 inches and 12 inches respectively. Check whether class B has a higher
variability in heights than class A for α = 0.01.
Step 1: σ12 - σ22 = the difference between true variances of the heights in two classes A
and B.
Step 2: H0: σ12 = σ22 i.e. both classes have same variances of heights.
Step 3: Ha : σ12 < σ22 (left-tailed test) i.e. class B has a higher variability in heights than
class A.
Step 4: Test statistic: $F = s_1^2/s_2^2$ = 9²/12² = 81/144 = 0.5625
Step 5: Critical value of F = $F_{(1-\alpha),(m-1),(n-1)}$ = $F_{(1-0.01),(16-1),(25-1)}$ = $F_{0.99,15,24}$. We can find this
value using the formula- $F_{0.99,15,24}$ = 1/$F_{0.01,24,15}$ = 1/3.29 = 0.3039 [using the F-
distribution table] and rejection area is F ≤ $F_{0.99,15,24}$.
Step 6: Conclusion: Since the calculated value of the F =0.5625 > F0.99, 15,24 = 0.3039 (the
critical value), it does not fall in the rejection area and hence we do not reject the null
hypothesis that both the classes have same variances of heights at α=0.01.
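Example 9 combines two steps: the variance ratio and the reciprocal rule for the left-tailed critical value. A short Python sketch of both follows; the function name is hypothetical and the value 3.29 is the table entry quoted in Step 5.

```python
def f_statistic(sd1, sd2):
    """Ratio of sample variances s1^2 / s2^2 computed from standard deviations."""
    return sd1**2 / sd2**2

# Example 9: class A (16 students, s = 9), class B (25 students, s = 12)
f_value = f_statistic(9.0, 12.0)

# Left-tailed critical value via the reciprocal rule:
# F(0.99; 15, 24) = 1 / F(0.01; 24, 15), with 3.29 taken from the F table
upper_table_value = 3.29
lower_critical = 1.0 / upper_table_value

reject = f_value <= lower_critical  # left-tailed rejection area
print(f_value, round(lower_critical, 4), reject)
```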
The p-value is the observed level of significance, i.e. the smallest level of significance at
which the null hypothesis would be rejected for the given data. We compare the p-value with α
to check at what levels of significance the null hypothesis is rejected and at what levels it is
not. In other words, we follow this decision rule: reject the null hypothesis if p-value ≤ α and
do not reject the null hypothesis if p-value > α.
Consider two normal population distributions and a right-tailed F-test where the numerator
and denominator degrees of freedom are (m-1) and (n-1) respectively and Ha : σ12 > σ22. In
this case, the p-value is given by the area to the right of calculated value of F under F-
distribution curve. For example, if (m-1) = 10, (n-1) = 11 and calculated F-value = 2.85
then p-value = area to right of 2.85 under F-distribution curve. Using the F-distribution
table for numerator and denominator d.f 10 and 11 respectively, we see that this area is
0.05 and so p-value = 0.05 in this case. However, here the calculated value of F
matched a value in the table. If it does not match, we use the following
procedure.
Considering again a right-tailed F-test. For numerator and denominator d.f 10 and 11
respectively, the F-distribution table gives the critical values of F for different α as below-
α Critical value of F
0.10 2.25
0.05 2.85
0.01 4.54
0.001 7.92
If the calculated F-value is 2.01 then p-value will be area to right of 2.01 and since in the
table the area right of 2.25 is 0.10, this means that area to right to 2.01 will be greater
than 0.10. Hence, p-value > 0.10. Similarly, if calculated F-value is 2.45 which lies between
2.25 and 2.85 implies that the p-value lies between 0.05 and 0.10. Again, if calculated F-
value is 3.50 which lies between 2.85 and 4.54 implies that the p-value lies between 0.01
and 0.05. If calculated F-value is 6.70, which lies between 4.54 and 7.92, then the p-value
lies between 0.001 and 0.01. Lastly, if calculated F-value is 8.88 which is greater than 7.92
then the p-value < 0.001.
After finding p-value, we compare it with given level of α and conclude whether or not to
reject null hypothesis. For example if 0.05< p-value < 0.10 then we reject the null
hypothesis at α = 0.10 but do not reject it when α = 0.05. Similarly if p-value < 0.001 then
we reject H0 at α = 0.01.
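The table-lookup procedure for bounding a right-tailed p-value can be written as a small Python function. This is a sketch under the assumption that the table is supplied as (α, critical value) pairs ordered by increasing critical value; the names `F_TABLE_10_11` and `bracket_p_value` are hypothetical.

```python
# (alpha, critical F) pairs for numerator d.f 10 and denominator d.f 11, from the table above
F_TABLE_10_11 = [(0.10, 2.25), (0.05, 2.85), (0.01, 4.54), (0.001, 7.92)]

def bracket_p_value(f_calc, table=F_TABLE_10_11):
    """Return (low, high) with low < p-value <= high for a right-tailed F-test."""
    low, high = 0.0, 1.0
    for alpha, critical in table:
        if f_calc >= critical:
            high = alpha  # area to the right of f_calc is at most alpha
        else:
            low = alpha   # area to the right of f_calc exceeds alpha
            break
    return low, high

for f in (2.01, 2.45, 3.50, 6.70, 8.88):
    print(f, bracket_p_value(f))
```

For a two-tailed test, both bounds returned for the larger tail would be doubled before comparing with α.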
Consider now a two-tailed F-test with (m-1)=10 and (n-1)=11. In this case the p-value is
twice the area to the right of the calculated F-value if it lies in the upper tail, or twice the
area to the left of it if it lies in the lower tail of the F-distribution curve. So we first find the
one-tailed p-value of the calculated F-value and then multiply it by 2. For example, if calculated F-value is
6.70 then the p-value here will be 2(0.001) < p-value < 2(0.01) which gives 0.002 < p-value
< 0.02 and so the null hypothesis gets rejected at α = 0.05 but we cannot say
whether it will get rejected or not at α = 0.01 since we do not know whether p-value is
more or less than 0.01. However, most statistical software reports the exact p-value for a
given F-test, which settles the decision in such cases.
Lastly, for a left-tailed F-test we find the area to the left of the calculated F-value to get p-
value. However, in this case we have to find the left-tailed critical value of F which has to be
computed using the formula F(1-α), (m-1),(n-1) = 1/ Fα,(n-1),(m-1). After which we can easily find p-
value and compare it with given level of α.
PRACTICE QUESTIONS
Q1) A manufacturer supplied fax machines to a particular industry and claimed that 90% of
such machines were in good conditions. To check his claim, the industry took a sample of
250 fax machines and found that 30 machines were defective. Test the claim at α = 0.05
and 0.01.
Q2) Consider example 2. Test the manufacturer's claim p = 0.20 against the alternative that
p > 0.20 at α = 0.05 and 0.01.
Q3) An examination on English was taken in two classes 1 and 2, each having 150 students.
It was found that 120 students in class 1 and 130 students in class 2 passed the exam. Test
the hypothesis that the performance of students in class 2 was better than that of class 1 at
α = 0.05 and 0.01.
Q4) Consider example 5 and test the given claim against the alternative that p ≠ 0.10 at α
= 0.05 and 0.01.
Q5) A book store purchased 20 copies of a statistics book. It was claimed that 95% of these
books got sold. Upon testing, it was found that 17 copies were sold. Test the claim at α =
0.05 and 0.01.
Q6) In question 1, it was claimed that 90% of fax machines were in good conditions.
Suppose that the alternative value of p is 0.85, find probability of Type II error at α = 0.05
and 0.01.
Q7) Find value of β in case of example 7 by taking the alternative value of (p1-p2) as 0.20 at
α = 0.05 and 0.01.
Q8) The life span of certain light bulbs supplied by a company gave a standard deviation of
100 hours. A random sample of 15 such bulbs were tested and it was revealed that the
standard deviation was 105 hours. Test the hypothesis that the standard deviation is
significantly different from 100 hours at α = 0.05 and 0.01.
Q9) A case study was done to rate the soft drink Coke. Two groups, A and B, of 20 people
each were selected to rate Coke on a scale of 0 to 10. The group A gave a standard
deviation of 10 while group B gave a standard deviation of 8. Test whether group A has a
greater variability in rating than group B at α = 0.05 and 0.01.
Q10) Find p-value for the following cases and check whether H0 gets rejected at α = 0.05
and 0.01.
(i) Left-tailed F-test where numerator d.f = 9, denominator d.f = 12 and calculated F-value
=0.45.
(ii) Right-tailed F-test where numerator d.f = 15, denominator d.f = 7 and calculated F-
value =5.50.
(iii) Two-tailed F-test where numerator d.f = 20, denominator d.f = 15 and calculated F-
value =6.70.
Table of Contents:
1. Introduction
2. Budget Constraint
2.1 Budget Constraint equation
2.2 Drawing Budget Constraint
2.3 Working with slope of budget line algebraically
2.4 What alters budget line?
2.4.1 Budget line pivot
2.4.2 Parallel shift in budget line
2.4.3 Non- parallel shift in budget line
2.4.4 Schemes conditioned on quantity of commodity
3. Summary
4. Exercises
5. Glossary
6. Appendix
7. References
Learning Outcomes:
After studying this chapter, a student should be able to:-
1. Introduction
A consumer goes to the market, faces different prices, chooses among different
alternatives and picks the bundle he likes best. But wait! What makes him a
consumer? Where does he start? He has some money income at hand to spend,
and everything else follows from that.
Two persons may have different incomes; this could alter their tax liability, so their
disposable incomes differ too. If they reside at different locations, they could also face
different prices. A consumer with a lower income could still be better off if
his income commands a greater amount of goods, which is possible when the prices he faces
are relatively lower than those faced by the other person.
Different combinations of prices and income will be examined in this
chapter. The chapter provides you with one of the basic tools of consumer
theory in microeconomics, the ‘budget constraint’. It is divided into four
sections and further into subsections. The first section covers the budget constraint algebraically and
the second analyzes it graphically. In the third section, the slope of the budget line is calculated and
interpreted. In the last section, changes in the budget set are analyzed via price changes,
income changes, taxes, subsidies and rationing.
2. Budget Constraint
“Budget”! This word is often heard when you ask your parents for some expensive toy,
the latest smartphone or a trip. The reply that you get is ‘It is
not in our budget this time!’ So one has a basic understanding that one can’t consume
unlimited amounts of goods. There is a binding constraint and, as the above example
makes clear, it is your parents’ income, beyond which things become unaffordable.
In this section, we will understand the feasible set of goods that can be consumed, given
somebody’s income. Some common notations and concepts will be used, which are
explained below:-
$p_x x + p_y y$
This expenditure must be less than or equal to one’s money income M, so the budget set
becomes:
$p_x x + p_y y \le M$ …………..(1)
If one wishes to spend the entire income on the two goods, then the above condition has to be satisfied
with equality and the budget constraint / line is
$p_x x + p_y y = M$ …………..(2)
$p_x x + p_y y + p_z z = M$
i) How much x can a consumer buy if he spends his entire income on x? Answer: Since
y = 0, x = M/px.
ii) How much y can a consumer buy if he spends his entire income on y? Answer: Since
x = 0, y = M/py.
So, the vertical intercept is M/py and the horizontal intercept is M/px, and the budget line is the line
joining these two points: (0, M/py) and (M/px, 0).
1
See the appendix to this chapter to understand n goods case.
The shaded region in fig.1 is the budget set, including the line, since all consumption bundles like
(x1,y1), (x2,y2) and (x3,y3) are affordable at prices (px, py) with money income M. In the
three goods case, the budget constraint is $p_x x + p_y y + p_z z = M$, which is a budget plane that can
be drawn in 3-D.
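The budget set and the two intercepts can be illustrated with a small Python sketch. The prices and income used here (px = 2, py = 5, M = 100) are hypothetical numbers chosen only for illustration, and the function names are assumptions.

```python
def expenditure(x, y, px, py):
    """Total spending on the bundle (x, y) at prices px and py."""
    return px * x + py * y

def is_affordable(x, y, px, py, M):
    """True if the bundle lies in the budget set, i.e. px*x + py*y <= M."""
    return expenditure(x, y, px, py) <= M

def intercepts(px, py, M):
    """Horizontal intercept M/px and vertical intercept M/py of the budget line."""
    return M / px, M / py

px, py, M = 2.0, 5.0, 100.0       # hypothetical prices and income
print(intercepts(px, py, M))      # endpoints of the budget line
print(is_affordable(10, 10, px, py, M))  # inside the budget set
print(is_affordable(25, 10, px, py, M))  # exactly on the budget line
print(is_affordable(30, 10, px, py, M))  # outside the budget set
```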
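$y = \dfrac{M}{p_y} - \dfrac{p_x}{p_y}\,x$ …………..(3)
So the slope of the budget line is $-p_x/p_y$.2 The negative sign shows the downward slope of the budget
line. Take any bundle (x1, y1) on the budget line:
$p_x x_1 + p_y y_1 = M$ …………..(4)
If the consumer moves to another bundle (x1 + Δx, y1 + Δy) on the same budget line, subtracting (4)
from the new budget equation gives
$p_x \Delta x + p_y \Delta y = 0$
2
See appendix to this chapter for calculus treatment of slope of budget line.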
The negative sign signifies that, to stay on the same budget line, consumption of x and y
must move in opposite directions, i.e. if a
consumer increases consumption of good x then consumption of y must fall (since income is
constant, one can’t afford an increase in both commodities simultaneously).
Slope of the budget line is the price ratio of the goods, or the relative price of one good in terms of
the other:
$\Delta y/\Delta x = -p_x/p_y = -\dfrac{1}{p_y/p_x}$
A consumer has to give up less of y to add one unit of x if either py is higher or px is lower
or both, i.e. if the relative price of x in terms of y is lower. This means that if you give up a unit of some
expensive commodity like a laptop (“y” here), you can add many units of a cheaper
commodity like bread (“x” here). Put the other way, you have to sacrifice only a small
fraction of a laptop (not literally) to add a loaf of bread.
The budget line pivots outward from GH to GH1 (inward to GH2) if the price of good x falls (increases) from
px to a lower (higher) price. This change in the price of x allows for increased (decreased) consumption of x while
keeping the maximum possible consumption of y unchanged. The budget line GH1 (GH2) is
flatter (steeper) than GH.3
The point to remember is that a consumer has a larger budget set when the price of a good falls and
vice versa.
3
To remember slope, assume you are on the x-axis at point H, H1 or H2 and have to drive up to
point G. Ask yourself, when is it most difficult? Your answer must be when you are at point
H2, so that is the steepest road, and the flattest is the x-axis itself.
a) When money income falls (rises) from M to M1 (M2) then the budget line shifts from GH
to G1H1 (G2H2).
b) When a lump sum tax5 of the amount T (=M-M1) is charged from the consumer then the
budget line shifts from GH to G1H1.
When a lump sum subsidy of amount S (=M2-M) is given to a consumer then the budget
line shifts from GH to G2H2.
c) When prices change proportionately, i.e. the ratio px/py is held constant, then also the
budget line shifts parallel. If px and py increase (decrease), then the consumer can afford
less (more) of both goods and the budget line shifts from GH to G1H1 (G2H2).
5
In lump sum case, a fixed amount is taken away from income irrespective of consumption
bundle or prices.
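The parallel-shift cases can be verified numerically: in each of them the intercepts change while the slope -px/py stays the same. The Python sketch below uses hypothetical prices, income and tax (px = 2, py = 5, M = 100, T = 20), and the helper name `budget_line` is an assumption.

```python
def budget_line(px, py, M):
    """Intercepts and slope of the budget line px*x + py*y = M."""
    return {"x_intercept": M / px, "y_intercept": M / py, "slope": -px / py}

base = budget_line(2.0, 5.0, 100.0)            # hypothetical starting line GH
higher_income = budget_line(2.0, 5.0, 150.0)   # case a): income rises
lump_sum_tax = budget_line(2.0, 5.0, 100.0 - 20.0)  # case b): lump sum tax T = 20
prices_doubled = budget_line(4.0, 10.0, 100.0)      # case c): both prices double
income_halved = budget_line(2.0, 5.0, 50.0)

for name, line in [("base", base), ("higher income", higher_income),
                   ("lump sum tax", lump_sum_tax), ("prices doubled", prices_doubled)]:
    print(name, line)

# Doubling both prices is equivalent to halving income
print(prices_doubled == income_halved)
```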
G1H1 being flatter implies a lower slope (in absolute value). Let GH’s slope be -px/py. A lowering of slope means
either px falls or py increases or both.6 Now, let’s analyze the cases in which the budget constraint
shifts rightward in a non-parallel manner:-
a. Px & Py Changes
When px and py both fall but disproportionately, more of both goods can be
consumed. If px/py falls, the change in the budget line to G1H1 is depicted in panel A; if
px/py increases, it is depicted in panel B of Fig.5.
b. Px & M Changes
For a rightward shift, M must increase and for slope to lower down px must fall which is
shown by shift from GH to G1H1 in panel A of Fig 5. Otherwise, if Py falls budget line shift
is as shown in Panel B of Fig.5.
c. Py & M Changes
If M increases along with a fall in py, then the budget constraint shifts outward, indicating
greater quantities of both goods; the additional increase in y, as its price has fallen,
means a steeper budget line shifted outward, as depicted in panel B of fig.5.
6
While comparing slopes, take the absolute value of -px/py and then analyze steepness or flatness.
When we say the income of the consumer is given and he can purchase only out of his
income, we are referring to a ‘hard’ budget constraint. Softening of the budget
constraint means that the consumer can spend more than his income due to the
paternalistic role of the government. This sort of softening is relevant not only for
consumer households but also for private firms, NGOs and other economic
organizations.
Moreover, once such support from the government is expected at all times, the
consumer starts behaving with this support taken into account.
Assume case ‘a’, in which you reach G1H1 from GH in a panel of Fig.5 due to a disproportionate
change in prices (1). Assume then that you receive extra money income, which shifts
G1H1 to G2H2 (2). If instead some money income had been taken back from you to
compensate for the fall in prices, your budget line would shift parallel leftward from G1H1 to G3H3,
which is still a non-parallel rightward shift relative to the budget line GH.
To work out leftward shifts, reverse all the cases: start from G1H1 and arrive at GH.
a) Rationing7 Constraint
A constraint may or may not be binding. A constraint is said to be binding if it alters your
feasible set. Suppose that in an economy good x can’t be consumed in more than quantity x1 by
any individual. Then the constraint is binding if M/px > x1, and the consumer’s budget line is lopped
off, as shown in panel A of fig.6. Otherwise, if M/px < x1, the budget set is unaltered, as
shown in panel B of figure 6.
7
Rationing is any method of allocating a scarce product or service other than by the price
mechanism.
After x1, since x is costlier, you have to give up more of y to get one unit of x. In the case of a
subsidy given up to x1, the change in the budget line is the same, with the
assumption that px1 is the original price and px is the subsidized price.
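A budget line whose price changes after quantity x1 has a kink at x1. The Python sketch below illustrates this under stated assumptions: good x costs px per unit up to x1 and px1 per unit beyond it, and all numbers (px = 2, px1 = 4, py = 1, M = 100, x1 = 20) and the function name are hypothetical.

```python
def max_y(x, px, px1, py, M, x1):
    """Maximum affordable y for a given x when x costs px up to x1 and px1 beyond it.

    A negative result means that this quantity of x is itself unaffordable.
    """
    if x <= x1:
        spent_on_x = px * x
    else:
        spent_on_x = px * x1 + px1 * (x - x1)
    return (M - spent_on_x) / py

px, px1, py, M, x1 = 2.0, 4.0, 1.0, 100.0, 20.0  # hypothetical values
for x in (0, 10, 20, 30, 35):
    print(x, max_y(x, px, px1, py, M, x1))

# Slope before the kink is -px/py, after the kink it is -px1/py
slope_before = max_y(11, px, px1, py, M, x1) - max_y(10, px, px1, py, M, x1)
slope_after = max_y(31, px, px1, py, M, x1) - max_y(30, px, px1, py, M, x1)
print(slope_before, slope_after)
```

The steeper slope after x1 reflects the statement that more of y has to be given up for each extra unit of x once x becomes costlier.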
Summary
Budget set is the set of all consumption bundles that are affordable at the ongoing
prices in the market; given a consumer’s income.
8
It could be of any amount and need not only be equivalent to Px.
Slope of the budget constraint is negative, showing that a consumer has to give up some of
one good to get more of the other. The rate at which the market allows the consumer to substitute
good x for good y is the relative price of good x in terms of good y, and the slope is -px/py.
If the price of either or both goods falls then the consumer can consume more as purchasing
power increases and hence the budget set enlarges. Price changes
alter the slope of the budget constraint unless both goods’ prices change
proportionately.
Similarly, if income rises consumer has enlarged budget set. But change in income
alone does not change the slope of budget constraint.
A tax on a commodity is viewed by the consumer as a price rise and a subsidy as a price fall. So
a tax or subsidy on a commodity affects the budget constraint in the same manner as a price change.
Lump sum tax or subsidy is like decrease or increase in income respectively and
hence alters budget constraint like income change.
Various marketing schemes and government’s rationing schemes could also change
consumer’s budget constraint.
Exercises
Q1. There was a consumer Abel who resided in a country where people only
consumed Pepsi and Burger. The price of Pepsi was $1.5 per bottle and price of
Burger was $ 2 per unit. Abel’s income was $ 30.
Q2. In a food stamp program in a country, food coupons upto 5 kgs. of grain
amounting to Rs.250 are given for Rs. 100. After a consumer consumes this limit,
there are no coupons for him and he pays market price for the grain. Assume all
other goods on the other axis and its price Rs. 1. Draw a person’s budget constraint
before and after the food stamp program, if his income is Rs. 1000.
a. Government announces lump sum tax’ T’, quantity tax on good x of ‘t’ and
quantity subsidy on good y of ‘s’.
b. Price of good x doubles, the price of good y becomes four times larger and
income become eight times larger.
Glossary
Budget set: Budget set is the set of all consumption bundles that are affordable
at the ongoing prices in the market; given a consumer’s income.
Budget constraint: Budget constraint is a line showing (locus of) all
affordable bundles at which entire income is spent.
Budget line pivot: A budget line is called pivoted when slope of budget line
changes & purchasing power stays constant.
Lump sum tax(subsidy): In lump sum case, a fixed amount is taken away
from income irrespective of consumption bundle or prices.
Quantity tax(subsidy): If a tax is levied on the quantity of x consumed then it
is called a quantity tax. Thus the price paid becomes px + t, where ‘t’ is the tax per unit of x.
Ad valorem tax(subsidy): If a tax is levied on the value (price) of good x then
it is called an ad valorem tax.
Rationing : Rationing is any method of allocating a scarce product or service other
than by the price mechanism.
Appendix
Dot product of P & X gives total expenditure which is equated to money income M on budget
constraint:
PX=M
Px Δx+py Δy=0
On rearranging we get,
Δy/Δx=-px/py
References:
Hal R. Varian, Intermediate Microeconomics: A Modern Approach, W.W. Norton and
Company/Affiliated East-West Press (India), 8th edition, 2010.
Table of Contents:
1. Introduction
2. Preferences
2.1 Axioms of Preferences
2.1.1 Complete
2.1.2 Reflexive
2.1.3 Transitivity
3. Utility
3.1 assigning utility
3.2 Positive Monotonic transformation
3.3 total and marginal utilities- one good case
4. Indifference Analysis
4.1 Well behaved Indifference curve
4.2 Properties of Indifference curves
4.2.1 Negatively sloped
4.2.2 Thinner lines
4.2.3 Convexity
4.2.4 Indifference curves don’t intersect
4.3 Well behaved preferences and Indifference curves
4.4 satiation point
4.5 Marginal Rate of Substitution
4.6 Special cases of preferences and their indifference curves
4.6.1 Neutral good
4.6.2 Perfect substitutes
4.6.3 Perfect Complements
4.6.4 Cobb Douglas Preferences
4.6.5 Discrete goods
5. Summary
6. Exercises
7. Glossary
8. Appendix
9. References
Learning Outcomes:
After studying this chapter, a student should be able to:-
1. Introduction
Consumers are rational decision makers in the sense that they know their preferences over
goods and buy the goods that give them maximum satisfaction. In this chapter we will assume
consumers to be rational and study preferences and the utility derived from various bundles
of commodities. The only difference between an actual consumer in the market and our study
is that here we apply calculus to the situations the consumer faces in the market.
This chapter is divided into three sections. The first section covers the axioms of
preferences. In the second section, the utility function is derived. In the last section,
indifference curves are discussed at length.
2. Preferences
Suppose I ask you what you would like to have for dinner: fried rice or chowmein? I would
receive one of three answers: (a) fried rice, (b) chowmein, (c) either of these. Whatever
your answer, I can conclude something about your preferences: either you like fried rice
over chowmein, or chowmein over fried rice, or you like both equally. Just as commodities
are ordered (ranked) in this example, so can bundles of commodities be.
Let us understand the notation used for it. The ‘>’ symbol is used when one bundle is
strictly preferred over the other. The ‘~’ symbol is used when two bundles give an equal
level of satisfaction to the consumer. The ‘≥’ symbol is used when the consumer either
prefers one bundle over the other or is indifferent between the two, i.e. when one bundle
is weakly preferred to the other.
2.1.1 Complete
The axiom of completeness states that any two bundles can be compared. Assume two bundles 1
and 2, both comprising goods x and y. Then either bundle 1 ≥ bundle 2, or bundle 2 ≥
bundle 1, or both (in which case the consumer is indifferent between them).
2.1.2 Reflexive
Preferences are reflexive if any bundle is at least as good as itself. For example, a cold
drink bottle on the left is as good as the one on the right; both bundles contain one cold
drink bottle. For a kid this sort of assumption could seem invalid, but not for an adult, at least.
2.1.3 Transitivity
If bundle 1 is preferred over bundle 2 and bundle 2 is preferred over bundle 3, then bundle 1
is preferred over bundle 3. Suppose you prefer studying in India over the US and the US over
Australia; then, given all three choices, you must choose India.
Example 1
Consider the following binary relations defined over X, where X is a set of human beings.
Check if each of these relations satisfies reflexivity, completeness and transitivity. For
any two persons A and B there are three possible instances: A is taller than B, B is taller
than A, or A and B are of the same height.
‘At least as tall as’: ‘A is at least as tall as B’ can be used in the first & third
instances, and ‘B is at least as tall as A’ can be used in the second & third instances.
Since this relation always helps in comparing A & B, it is complete.
‘Taller than’: This relation is not complete, since in the third case of same height it
cannot be used. It is not reflexive, since A can’t be taller than himself/herself.
If A is taller than B & B is taller than C then it follows that A is taller than C. So this
relation is transitive.
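The three checks in Example 1 can be sketched in code. The snippet below is only an illustration: the names and heights are made up, and the helper functions are hypothetical, not part of the original example.

```python
from itertools import product

# Hypothetical heights in cm; A and C are deliberately the same height.
heights = {"A": 170, "B": 165, "C": 170}
people = list(heights)

def at_least_as_tall(p, q):
    return heights[p] >= heights[q]

def taller_than(p, q):
    return heights[p] > heights[q]

def is_reflexive(rel):
    # Every person must be related to himself/herself.
    return all(rel(p, p) for p in people)

def is_complete(rel):
    # Every pair must be comparable in at least one direction.
    return all(rel(p, q) or rel(q, p) for p, q in product(people, repeat=2))

def is_transitive(rel):
    # Whenever p R q and q R r hold, p R r must hold too.
    return all(
        rel(p, r)
        for p, q, r in product(people, repeat=3)
        if rel(p, q) and rel(q, r)
    )

weak = (is_reflexive(at_least_as_tall), is_complete(at_least_as_tall), is_transitive(at_least_as_tall))
strict = (is_reflexive(taller_than), is_complete(taller_than), is_transitive(taller_than))
print("at least as tall as:", weak)
print("taller than:", strict)
```

On this sample, "at least as tall as" passes all three checks, while "taller than" fails reflexivity and completeness (A and C cannot be ranked) but remains transitive.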
3. Utility
In the last section, consumer preferences were discussed, wherein two consumption bundles at
a time were compared & ordered (ranked). Indifferent, strictly preferred and weakly
preferred are all binary relationships defined over bundles of commodities. It would be
easier if we could assign numeric values to the bundles in such a manner that the ordering
of bundles is preserved.1 Preserving the order of consumption bundles refers to assigning a
higher value to a preferred bundle than to less-preferred bundles.
Did You Know??
Utility is a concept that was introduced by Daniel Bernoulli. He believed that for the
usual person, utility increased with wealth but at a decreasing rate.
U: (x,y) → R, i.e. the utility function assigns a real number to each bundle (x,y).
a) Ordinal Utility
As the name suggests, under ordinal utility only the order matters. Utility can be
assigned to different consumption bundles irrespective of the magnitude, as long as the
ordering of preference is maintained for a particular consumer. For example, if a consumer
prefers bundle 1 to bundle 2, then the utility assigned to bundle 1 can be 1, 10 or 1000,
provided that every time bundle 2 is assigned a smaller number. If bundle 1 has 10,
bundle 2 has 7 and bundle 3 has 90, then it can be inferred that bundle 3 is the most
preferred bundle, next to it is bundle 1 and the least preferred one is bundle 2. But it
cannot be inferred from the above information that bundle 3 is 9 times better than bundle 1.
b) Cardinal Utility
There are economists who hold that the magnitude of utility is of significance. This is
known as cardinal utility assignment to consumption bundles.
If, in the above example, it could be said with precision that the utility of bundle 1 is 10
and of bundle 3 is 90, then it means the consumer likes bundle 3 nine times as much as
bundle 1. Such a conclusion could be drawn if the consumer is ready to pay nine times the
price of bundle 1 for bundle 3.
1
Here axiom of transitivity should be followed for consistency of values assigned.
Example:
Bundle   U1   U2     2U1   -2U1
1        9    3000   18    -18
2        10   4000   20    -20
3        11   5000   22    -22
In column 2 (U1) the bundles are given small utility numbers, and in column 3 (U2) bundles
1-3 are assigned utilities in thousands, but in both the order of preference is bundle 3, 2
and 1. U1 and U2 are ordinal utilities, where only the order of preference is important. In
column 4, a positive monotonic transformation of the U1 function is computed by doubling
U1. This does not alter the order of preference over bundles 1-3. But a negative monotonic
transformation, represented in column 5 by (-2U1), reverses the order of preference. The
least preferred bundle 1 becomes the most preferred, as -18 is greater than -20 and -22.
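The effect of the two transformations in the example above can be checked with a few lines of code. This is a minimal sketch; the function name `ranking` is an illustrative choice, and the utility numbers are the U1 values from the example.

```python
# U1 utilities of bundles 1-3, taken from the example above.
u1 = {1: 9, 2: 10, 3: 11}

def ranking(utility):
    """Return bundle numbers ordered from most to least preferred."""
    return sorted(utility, key=utility.get, reverse=True)

doubled = {bundle: 2 * u for bundle, u in u1.items()}    # positive monotonic: 2*U1
negated = {bundle: -2 * u for bundle, u in u1.items()}   # negative monotonic: -2*U1

print(ranking(u1))       # ranking under U1
print(ranking(doubled))  # same ranking under 2*U1
print(ranking(negated))  # reversed ranking under -2*U1
```

Doubling leaves the ranking of bundles unchanged, whereas multiplying by -2 reverses it, which is why only positive monotonic transformations represent the same preferences.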
When more of a good is consumed, total utility goes on increasing. Ask yourself: would you
prefer two pairs of shoes or one? Obviously, your answer would be two. But when you buy the
first pair its utility is highest, because it is the first one in your wardrobe. Another one
has some utility but obviously less than the first one, and so on. These two arguments are
consistent, since the former relates to total utility and the latter to marginal utility.
Total utility always increases with an increase in the number of units consumed, while the
addition made to this total utility falls.
Later, the marginal-utility theory of value resolved the paradox. Water in total is much
more valuable than diamonds in total because the first few units of water are necessary
for life itself. But, because water is plentiful and diamonds are scarce, the marginal value
of a pound of diamonds exceeds the marginal value of a pound of water. It can be
assumed that for water we are at a point on the MU curve at a higher quantity, while for
diamonds we are still at a point where less of the commodity (on the x-axis) is consumed.
MU = [U(x2) - U(x1)] / (x2 - x1)
If x2 > x1, then U(x2) > U(x1), hence marginal utility is positive. It falls, though, if
consumption increases further from x2 to x3 (see footnote 2); that is, MU’ is smaller than
MU, where MU’ is given by
MU’ = [U(x3) - U(x2)] / (x3 - x2)
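The two difference quotients MU and MU’ can be computed for a concrete case. The sketch below assumes a hypothetical utility function U(x) = √x and the consumption levels x1 = 1, x2 = 4, x3 = 9; neither comes from the text.

```python
import math

def U(x):
    # Assumed one-good utility function, chosen only for illustration.
    return math.sqrt(x)

def marginal_utility(x_low, x_high):
    # Change in total utility per unit change in consumption.
    return (U(x_high) - U(x_low)) / (x_high - x_low)

x1, x2, x3 = 1, 4, 9
mu = marginal_utility(x1, x2)        # (2 - 1) / 3
mu_prime = marginal_utility(x2, x3)  # (3 - 2) / 5

print(mu, mu_prime)
```

Both values are positive, so total utility keeps rising, but MU’ is smaller than MU, which is the diminishing marginal utility described above.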
There could be various consumption bundles amongst which a consumer is indifferent. For
example, two servings of rice with three chapattis could give the same satisfaction as three
servings of rice with two chapattis. All such combinations give the same fixed level of
utility to the consumer.
2
Please see appendix to this chapter for the calculus treatment of it.
Let us assume that the above curve is the locus of all consumption bundles at which the
consumer has some constant utility throughout. There are upward sloping and downward sloping
sections3 of our assumed indifference curve. A positively sloped indifference curve would
mean that more of both commodities, like at point B, and less of both commodities, like at
point C, give the same level of utility to the consumer. But is it consistent? No, strictly
more of both commodities should add utility for the consumer.4
3
BC and DE are upward sloping sections and AB and CD are downward sloping sections of the
indifference curve here.
So, now our indifference curve cannot be upward sloping. We can have only an indifference
curve which is downward sloping. A downward sloping indifference curve means that as we
move along it from point C to point D (in figure 3), we get more of good x with less of
good y. Good y is sacrificed so that the addition made to utility from extra consumption of
x is equivalent to the loss of utility due to the reduction of y. CDE is negatively
sloped 5 but with D as a point of inflexion6.
The CD section of the assumed indifference curve is convex to the origin while the DE
section is concave to the origin.
Next we have to examine whether an indifference curve need be convex or concave or could
be both. Let us examine the concave section DE first. In figure 4, the change in units of x
on the x-axis is assumed to be 1 unit, and the change in y is due to the change in x.
4
Though the additions made would be declining (due to diminishing marginal utility), both add positive utility,
unless either good becomes a ‘bad’; this is discussed in sections to come.
5
You can check that its slope is negative (downward) by drawing tangents at various points.
6
A point of inflexion is a point on a curve at which the curvature of the curve changes.
To increase consumption of x by 1 unit, the change in y (along DE) goes on increasing. But
from our knowledge of marginal utility, the marginal utility of a good is high at lower
levels of consumption and low at higher levels of consumption. So as we move from point 1
to 2 and further towards E, the loss of utility from reduced y is greater than the addition
to utility from increased x. The reason is that y is lost at a greater pace than the
addition made to x. So, the indifference curve cannot be concave7.
The indifference curve assumed in figure 1 had a thicker section GHI, reconstructed here. It
can contain bundles 1 and 2 and many such bundles. Here bundle 2 contains more of both
goods compared to bundle 1 and hence has higher utility.8 So, the assumption that these
points are on the same indifference curve is violated.
7
Concave indifference curve can be observed and will be explained in coming sections.
8
This property is known as assumption of monotonicity.
So, indifference curve sections excluded from our assumed shape of indifference curve are
4.2.3 Convexity:
An indifference curve must be convex. Convexity implies that the fall in y accompanying
each increase in x should get smaller as more of x is added. Look at the following two
diagrams and the explanation that follows.
At point 1 on indifference curve I0, x1 of x and y1 of y are consumed, which yields utility
of level I0. Panel B of figure 6 shows the corresponding utility achieved from these
consumptions. When one moves to point A, consumption of y decreases to y2 and that of x
increases to x2. This leads to an increase in utility from TU to TU1, so MU1 is added. To
compensate, there should be a greater fall in y (more than one unit), since a fall of one
unit of y decreases utility by MU3 (which is less than MU1). So, y must fall by two units
(since MU3 + MU2 = MU1). Further, if x is added then TU3 is achieved and the change is MU2,
but now y must fall by less than a unit, since the fall in utility due to y (MU1) would be
greater than the rise in utility from x (MU2). A change in y should be brought about such
that the loss in utility equals MU2.
The second assumption for well behaved preferences is that averages are preferred to
extremes. Let (x1, y1) & (x2, y2) be on the same indifference curve in fig. 10. Then the
average of these two extreme bundles lies above the curve. And, this average would be in the
weakly preferred set iff the indifference curve is convex. Consider the other (non convex)
cases in panels b & c of fig 10.
Figure 10
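The "averages are preferred to extremes" property can be illustrated numerically. The sketch below uses two assumed utility functions that do not appear in the text: U = xy, whose indifference curves are convex to the origin, and U = x² + y², whose indifference curves are concave to the origin.

```python
def u_convex(x, y):
    # Assumed well behaved utility: indifference curves convex to the origin.
    return x * y

def u_concave(x, y):
    # Assumed non-convex case: indifference curves concave to the origin.
    return x ** 2 + y ** 2

def average(b1, b2):
    # Bundle halfway between two bundles.
    return ((b1[0] + b2[0]) / 2, (b1[1] + b2[1]) / 2)

# Two extreme bundles on the same indifference curve of U = xy (utility 8).
b1, b2 = (1.0, 8.0), (8.0, 1.0)
mid = average(b1, b2)

# Two extreme bundles on the same indifference curve of U = x^2 + y^2 (utility 9).
c1, c2 = (0.0, 3.0), (3.0, 0.0)
mid_c = average(c1, c2)

print(u_convex(*b1), u_convex(*b2), u_convex(*mid))
print(u_concave(*c1), u_concave(*c2), u_concave(*mid_c))
```

With convex indifference curves the average bundle gives more utility than either extreme; with concave ones it gives less, so the average falls outside the weakly preferred set.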
The highest level of utility is achieved at (6,2), which is known as the bliss point (fig.12).
The indifference curve is downward sloping in quadrant ‘I’, showing that one good has to be
sacrificed to add an extra unit of the other good to keep utility constant.
In ‘II’, good x (chapatti) becomes ‘bad’, i.e. adds to disutility (you can think that beyond
this point he can have a digestive disorder). But the good on the y-axis, tablets, is still
a ‘good’. The indifference curve is reproduced again in fig14. The disutility from an
addition of x is to be compensated by utility from good y. Indifference curve I2 denotes
higher utility than I1 and I0. At I2, the y1 level of y is combined with a smaller level of
bad x compared to I0, hence I2 shows greater utility.
For Scooby and Shaggy, facing a monster is a bad. So to keep them at the same utility, Velma
needs to offer Scooby Snax (good).
In ‘III’, the good on the y-axis becomes ‘bad’ & good x is still ‘good’. The indifference
curves are reproduced in fig.15.
Figure 15: y is ‘bad’
In ‘IV’, both goods become ‘bad’. Indifference curves are negatively sloped but the
direction of maximal increase is towards the origin & indifference curves are concave to
the origin as shown in fig 16. The indifference curves are circles with the satiation point
at the centre.
The slope of the indifference curve9 is known as the marginal rate of substitution. The
slope indicates the change in y when x changes by 1 unit. For an indifference curve, it
means that y is substituted for x in a way that utility is held constant. And, as we saw in
earlier sections, at smaller levels of y a smaller sacrifice would be made, as the loss in
utility from sacrificing 1 unit is high, and at greater levels of y a larger sacrifice could
be made. Hence, the marginal rate of substitution is diminishing10 along the indifference
curve (assuming convexity and monotonicity). Figure 17 shows that Δy is declining11 for
each 1 unit addition of good x.
9
The slope of an indifference curve is different at different points of the curve. Only for a straight line is the
slope constant.
10
It is the absolute value that is referred to here.
11
Also see appendix to this chapter for calculus treatment of it.
4.6.2Perfect substitutes
Two goods are perfect substitutes if the consumer is willing to substitute one good for
another at a constant rate. A constant marginal rate of substitution means that
indifference curves are straight lines.
One ‘ten rupee bank note’ is a perfect substitute for two ‘five rupee bank notes’. If the
ten rupee note is denoted by y & the five rupee note by x, then utility U = x + 2y.
Figure19
The trick to remember what the coefficients of x & y in the utility function should be is:
here, 1 unit of y gives twice the utility compared to 1 unit of x. Likewise, if ‘a’ units of
y give the same utility as ‘b’ units of x, then the utility function becomes U = ax + by
and MRS = -a/b.
Indifference curves for perfect complements are hence L-shaped with the kink on the 45˚ line
(because of the 1:1 ratio). The slope of the vertical portion is infinite, as Δx
(denominator) is zero, & the slope of the horizontal portion is zero, as Δy (numerator) is
zero. If two teaspoons of sugar are added to one cup of tea then the indifference curves are
as in panel A of fig 21. For the general case, where ‘a’ units of x are consumed with ‘b’
units of y, the indifference curves are as shown in panel B of fig.21.
Figure 21: (a) 2 tsp of sugar is taken with 1 cup of tea; (b) ‘a’ units of x is taken with ‘b’ units of y
Institute of Lifelong Learning, Delhi University
Preferences And Indifference Curves
U = min{x, y} when the goods are consumed in a 1:1 ratio, & U = min{bx, ay} if ‘a’ units of
x are consumed with ‘b’ units of y.
For Cobb Douglas preferences U = x^a y^b:
MRS = -MUx/MUy = -(a x^(a-1) y^b)/(b x^a y^(b-1)) = -ay/(bx)
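The closed form MRS = -ay/(bx) for U = x^a y^b can be cross-checked against marginal utilities estimated by finite differences. The exponents a = 2, b = 3 and the bundle (4, 5) below are arbitrary assumptions made only for this check.

```python
A, B = 2.0, 3.0  # assumed exponents of x and y

def U(x, y):
    return x ** A * y ** B

def mrs_numeric(x, y, h=1e-6):
    # Central-difference estimates of the two marginal utilities.
    mu_x = (U(x + h, y) - U(x - h, y)) / (2 * h)
    mu_y = (U(x, y + h) - U(x, y - h)) / (2 * h)
    return -mu_x / mu_y

def mrs_formula(x, y):
    # Closed form derived in the text: -ay/(bx).
    return -(A * y) / (B * x)

numeric = mrs_numeric(4.0, 5.0)
formula = mrs_formula(4.0, 5.0)
print(numeric, formula)
```

The two numbers agree to several decimal places, and the formula shows that the MRS depends only on the ratio y/x.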
U(x,y) = v(x) + y. Here utility is equal to the height of the indifference curve along the
y-axis, i.e. when x is zero, utility is equal to the consumption of y.
The indifference curves are just vertically shifted versions of one indifference curve.
The dashed lines connect indifferent bundles (though consumption at points other than the
dots is not possible). The solid lines show the bundles weakly preferred to (x1,y1).
Summary:
(1) Three axioms of preferences, viz. completeness, reflexivity and transitivity, are
assumed about the consistency of consumers’ preferences.
(a) Negatively sloped, (b) convex to the origin, (c) curves are thin, (d) indifference curves
do not intersect and (e) the higher the indifference curve, the higher the utility.
(4) Beyond a certain level of consumption of any good, disutility (negative utility) is
generated, i.e. there is too much of the good. At this level, utility is maximum and such a
point is known as the satiation point.
(5) Well behaved preferences exhibit a declining marginal rate of substitution, i.e. the
sacrifice of y each time x is increased keeps on declining.
(6) Indifference curves that are convex (but do not generate a smooth curve with declining
MRS) arise for perfect substitutes, perfect complements, quasi-linear preferences and
even discrete goods.
(7) Goods can be good, bad or neutral. A good adds to utility when its consumption is
increased, a bad adds to disutility (negative utility), and a neutral adds zero utility.
Exercises
Q1. Sumit has Cobb-Douglas preferences and his utility function is given by U(x,y) = xy.
State true or false about the following statements considering Sumit’s preferences:
a) (10,5) ~ (5,10)
b) (20,5) ≥ (4,25)
c) (15,4) ≥ (7.5,7.5)
Q2. The marginal rate of substitution for Sumit, with utility function U(x,y) = xy, is
given by:
a) x/y
b) y/x
c) x^2/y^2
d) y^2/x^2
i) x+y a) 1.5
ii) x+√y b) 1
iii) x^0.5 y^0.5 c) 6
Q4. If the utility function is U(x,y) = x^2 - y^2, what can you conclude about the nature of
these goods?
Q7. If both goods x and y are bads, draw and explain the consumer’s indifference curves.
i) (10,20) ~ (20,10)
ii) (20,10) > (15,15)
a) U(x,y) = xy
b) U(x,y) = x^2 + y
Glossary
Indifference curve: A curve showing the locus of combinations of the amounts of
two goods such that the consumer is indifferent between any combinations on that curve.
Marginal rate of substitution: It refers to the amount of one good that is required
to compensate the consumer for giving up an amount of another good such that the
consumer has the same level of utility as before.
Perfect complements: Perfect complements are goods that are consumed together
in fixed proportions.
Bad good: A good is said to be bad if an addition to its quantity creates disutility even
at lower levels of consumption.
Neutral good: A good is said to be neutral if any quantity of it can be added &
no change in utility is brought about.
Well behaved preferences: Preferences are well behaved when indifference curves
are negatively sloped and convex to the origin.
Appendix
At various heights, say α, a disc is seen (with point A' on it). Assume this disc drops down
onto the xy plane, where it then appears as in fig.A3. C' is the bliss point where utility
is maximum. C' corresponds to point C in the xy plane, which is the satiation point.
U = f(x,y)
ΔU = (∂U/∂x) Δx + (∂U/∂y) Δy
ΔU = MUx Δx + MUy Δy
Along an indifference curve utility is constant, so ΔU = 0 and MUx Δx + MUy Δy = 0
Or, Δy/Δx = -MUx/MUy
As x increases MUx falls, and as y declines MUy increases. Both of these imply that this
fraction falls in absolute value as x increases. Hence we witness a diminishing marginal
rate of substitution along the indifference curve.
References:
Hal R. Varian, Intermediate Microeconomics: A Modern Approach, W.W. Norton and Company/Affiliated
East-West Press (India), 8th edition, 2010.
www.wikipedia.org
Table of Contents:
1. Introduction
2. Optimization
2.1 Case of well behaved preferences
2.1.1 Diagrammatic treatment
2.1.2 Algebraic expression
2.2 Interior solution and boundary optimums
2.2.1 Kinked preferences
2.2.2 Perfect substitutes
2.2.3 Neutrals and bads
2.2.4 Concave preferences
2.2.5 Cobb Douglas Preferences
3. Demand
3.1 Well behaved preferences
3.2 Perfect substitutes
3.3 Perfect complements
3.4 Giffen goods
4. Engel’s curve
4.1 Well behaved Indifference curve
4.2 Inferior goods
4.3 Perfect substitutes
4.4 Perfect complements
4.5 Cobb Douglas Preferences
4.6 Homothetic Preferences
4.7 Quasi linear preferences
5. Summary
6. Exercises
7. Glossary
8. Appendix
9. References
Learning Outcomes:
After studying this chapter, a student should be able to:-
1. Introduction
Can you recall when you were given pocket money at the age of 7 or 8? You always knew
how to utilize that money. Kids at that age would spend it on cola, ice-cream, or
whatever toys they wanted. But I’m sure you chose whatever brought you joy and
satisfaction. You didn’t know what optimization was, what microeconomic technique was to
be applied or what conditions were to be met. But you were a genius who did optimization
subconsciously, and actually every consumer does.
In this chapter, we will deal with optimization formally. This chapter is divided into three
main sections. The first section covers the conditions for optimization under various
preferences. In the second section, the demand curve of a good for a consumer is derived.
In the last section, the impact of a change in income on the optimal quantity of a good is
analyzed.
2. Optimization
Optimization in the context of utility means maximizing utility given the budget set. Given
any level of income M and prices of goods x & y as px & py, a consumer maximizes his utility
by choosing the consumption bundle that gives him the highest satisfaction. This choice of
bundle depends on the consumer’s preferences. It is natural to expect that if the consumer
is happier consuming good x (relative to good y), then his optimal consumption bundle would
have more of good x. But the consumer’s choice is also affected by the price of good x in
the market and is constrained by his income. Hence, the optimal choice is decided by the
nexus between budget set and preferences.
An indifference curve is the locus of all consumption bundles which yield some constant
level of utility. The budget set and the indifference map have the same x-axis and y-axis
labeling, as x-good and y-good respectively. Let us superimpose indifference curves on the
budget set, as in
Figure1
A few affordable bundles, given some income ‘M’, are marked in the above figure. Points A, B
and C yield utility U0, likewise points F, G and H yield utility U1, & point E yields U2.1
Amongst all such affordable bundles, the point that maximizes utility is point E, which
gives utility U2. There are two remarkable things about point ‘E’:-
-px/py = MRSxy = -MUx/MUy
1
It is assumed here that U2>U1>U0, and the ‘0’, ‘1’ & ‘2’ subscripts create a correspondence between the
indifference curves and their respective utility levels.
The slope of the budget line measures the rate at which the market is willing to substitute
good y for good x. The above equation implies that the rate of substitution in the market
should equal the consumer’s marginal rate of substitution between the two goods.
Max: U(x,y) subject to px x + py y = M
L = U(x,y) - λ(px x + py y - M)
U’x - λ px = 0 …..1
U’y - λ py = 0 ….. 2
-(px x + py y - M) = 0 …… 3
2
optional
3
λ is the Lagrange multiplier and here it becomes the ratio of benefit to cost. The additional benefit from each
good is its MU and the cost is its price. So, the condition implies that the marginal benefit to cost ratio must be
equal for all goods.
Gossen's laws, named for Hermann Heinrich Gossen (1810 – 1858), are three
laws of economics:
Gossen's First Law is the “law” of diminishing marginal utility: that marginal
utilities are diminishing across the ranges relevant to decision-making.
Gossen's Second Law, which presumes that utility is at least weakly quantified,
is that in equilibrium an agent will allocate expenditures so that the ratio of
marginal utility to price (marginal cost of acquisition) is equal across
all goods and services:
(∂U/∂xi)/pi = (∂U/∂xj)/pj for all goods i and j
where
U is utility
xi is quantity of the i-th good or service
pi is the price of the i-th good or service
Dividing equation 1 by equation 2 gives U’x/U’y = px/py, where U’x is the marginal utility
of x and U’y is the marginal utility of y. It is the same equilibrium condition required for
the optimal choice bundle that was attained in the last section. Equation 3 implies that
this choice bundle (x,y) must exhaust the entire income.
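The equal marginal-utility-per-rupee condition can be illustrated by brute force. The sketch below assumes a hypothetical utility function U = xy with px = 2, py = 1 and M = 40 (none of these numbers come from the text), searches along the budget line for the best bundle, and then compares MUx/px with MUy/py there.

```python
# Assumed illustration values, not taken from the text.
PX, PY, M = 2.0, 1.0, 40.0

def utility(x, y):
    return x * y

def best_bundle_on_budget_line(steps=2000):
    """Grid-search the budget line px*x + py*y = M for the highest utility."""
    best = None
    for i in range(steps + 1):
        x = (M / PX) * i / steps   # x runs from 0 to M/px
        y = (M - PX * x) / PY      # the rest of income is spent on y (equation 3)
        if best is None or utility(x, y) > utility(*best):
            best = (x, y)
    return best

x_star, y_star = best_bundle_on_budget_line()

# For U = xy the marginal utilities are MUx = y and MUy = x.
ratio_x = y_star / PX
ratio_y = x_star / PY
print((x_star, y_star), ratio_x, ratio_y)
```

At the bundle found by the search, the two ratios coincide, which is the condition obtained from equations 1 and 2; at any other point on the budget line they would differ.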
act as solution to optimization problem. But technique of calculus is of no use in such cases.
We will analyze them in this section.
Figure 2
When two goods are perfect complements, the indifference curves are L-shaped.
Indifference curves for perfect complements therefore have a kink.
Figure 3
The optimal bundle for perfect complements is where the kink touches the budget line, like
at point E in figure 3. Since the slope cannot be calculated at the kink, there has to be
an alternate method to compute the optimal bundle.
Consider a consumer whose preferences are such that ‘a’ units of x are consumed with ‘b’
units of y. The indifference curves would appear as in figure 4. The kinks would lie on the
line OA, whose slope is b/a. The origin is also one such point, and one indifference curve
is an L at the origin (that is, the x-axis and y-axis themselves form an indifference curve).
Figure 4
The optimal point is hence at the intersection of line OA and the budget line. The budget
line is given by px x + py y = M and line OA’s equation is y = (b/a) x. Solving these two
equations yields:
x* = aM/(a px + b py), y* = bM/(a px + b py)
figure 5
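Solving the budget line px x + py y = M together with y = (b/a) x gives x* = aM/(a px + b py) and y* = bM/(a px + b py). A small sketch of that calculation follows; the function name and the numbers in the example call are assumptions made for illustration.

```python
def perfect_complements_bundle(a, b, px, py, m):
    """Optimal bundle when 'a' units of x are always consumed with 'b' units of y."""
    # Number of (a, b) "packages" the income can buy at the given prices.
    packages = m / (a * px + b * py)
    return a * packages, b * packages

# Assumed example: 2 units of x go with 1 unit of y, px = 1, py = 8, M = 100.
x_star, y_star = perfect_complements_bundle(a=2, b=1, px=1.0, py=8.0, m=100.0)
print(x_star, y_star)
```

The bundle keeps the fixed proportion y/x = b/a and uses up the whole income, so it sits at the kink that touches the budget line.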
Figure 6
This would mean that the consumer is willing to substitute y for good x at a greater pace
than the market requires, i.e. the consumer values good x more than good y. Since the two
goods can be substituted easily (perfectly), the optimal choice would be at point E1 in
panel (1) of figure 6, where the consumer consumes only x and zero units of good y.
(M/px, 0) is the boundary optimum.
This case is just the reverse of the case discussed above. It is depicted in panel (2) of
figure 6, and in such a case the consumer consumes only y and nothing of good x. (0, M/py)
is the boundary optimum.
In such a case, one indifference curve overlaps the budget line and hence all points from
(0, M/py) to (M/px, 0), both included, are optimal.
Let us write down the demand function of x when goods x and y are perfect substitutes as follows:
x = M/px when |MRS| > px/py
x = any quantity between 0 and M/px when |MRS| = px/py
x = 0 when |MRS| < px/py
Figure 7
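The three cases for perfect substitutes can be written as a small function. This is a sketch under assumptions: the function name is hypothetical, the absolute value of the (constant) MRS defaults to 1, i.e. one-for-one substitution, and the result is returned as a (lowest, highest) range so that the tie case can be represented.

```python
def demand_x_perfect_substitutes(px, py, m, mrs_abs=1.0):
    """Return the (lowest, highest) optimal quantity of good x."""
    price_ratio = px / py
    if mrs_abs > price_ratio:
        # Indifference curves steeper than the budget line: all income on x.
        return (m / px, m / px)
    if mrs_abs < price_ratio:
        # Indifference curves flatter than the budget line: all income on y.
        return (0.0, 0.0)
    # Equal slopes: every bundle on the budget line is optimal.
    return (0.0, m / px)

cheap_x = demand_x_perfect_substitutes(px=1.0, py=2.0, m=30.0)
dear_x = demand_x_perfect_substitutes(px=3.0, py=2.0, m=30.0)
tie = demand_x_perfect_substitutes(px=2.0, py=2.0, m=30.0)
print(cheap_x, dear_x, tie)
```

Only in the tie case is the quantity of x not unique; otherwise the consumer is at one of the two boundary optimums.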
Now let good y be a bad and good x a good. Then the consumer has the highest utility when
the bad y is not consumed. This is depicted in figure 8.
Figure 8
In both cases, a boundary optimum is achieved, where all income is spent on the good and
nothing on the bad or neutral commodity.
Figure9
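For Cobb Douglas preferences U(x,y) = x^c y^d:
MUx = ∂(x^c y^d)/∂x = c x^(c-1) y^d
MUy = ∂(x^c y^d)/∂y = d x^c y^(d-1)
MRSxy = -MUx/MUy = -cy/(dx)
At the optimum MRSxy equals the slope of the budget line, -cy/(dx) = -px/py, so that
py y* = (d/c) px x*. Substituting in the budget line:
px x* + py y* = M
((c+d)/c) px x* = M
x* = (c/(c+d)) (M/px)
y* = (d/(c+d)) (M/py)
In the case of Cobb Douglas preferences, the share of income (M) spent on good x (px x) is
equal to c/(c+d). Hence, the fraction of income spent on either good is fixed. The size of
this fraction is determined by the exponent (of the quantity of that good) in the Cobb
Douglas function. In the two goods case, hence, it is convenient to assume that c+d=1. This
assumption makes it clear that income is spent on these goods with weights given by the
respective exponents of the quantities of goods in the utility function.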
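The fixed-share result for U = x^c y^d gives x* = (c/(c+d))(M/px) and y* = (d/(c+d))(M/py). The sketch below computes these demands and compares x* with a brute-force search along the budget line. The exponents, prices and income in the example are assumed values, and the function names are illustrative.

```python
def cobb_douglas_demand(c, d, px, py, m):
    """Closed-form demands: fixed income shares c/(c+d) and d/(c+d)."""
    share_x = c / (c + d)
    share_y = d / (c + d)
    return share_x * m / px, share_y * m / py

def grid_search_x(c, d, px, py, m, steps=5000):
    """Brute-force check: try many points on the budget line, keep the best x."""
    best_x, best_u = 0.0, -1.0
    for i in range(1, steps):
        x = (m / px) * i / steps
        y = (m - px * x) / py
        u = x ** c * y ** d
        if u > best_u:
            best_x, best_u = x, u
    return best_x

# Assumed example: U = x * y^3, px = 2, py = 5, M = 100.
x_star, y_star = cobb_douglas_demand(c=1, d=3, px=2.0, py=5.0, m=100.0)
x_search = grid_search_x(c=1, d=3, px=2.0, py=5.0, m=100.0)
share_on_x = 2.0 * x_star / 100.0
print(x_star, y_star, x_search, share_on_x)
```

In this example a quarter of income goes to x and three quarters to y, matching the exponent weights, and the search lands on the same quantity of x as the formula.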
3. Demand
A demand function shows the relationship between price and quantity demanded. For a
normal good, there exists a negative relationship between price and quantity demanded.
In this section, we will analyze and derive the demand curve for different consumer
preferences.
In panel (i) of Figure 10, when the price of good x falls from P1 to P2 and then to P3
(while holding the price of y constant), the budget line pivots from GH to GH1 to GH2. The
optimal bundles are marked E0, E1 and E2, respectively. With the fall in the price of good x
the consumer has an enlarged budget set and hence more of good x can be consumed.4 The
quantity demanded rises from x1 to x2 to x3, which correspond to prices P1, P2 and P3,
respectively. Connecting all the optimal bundles leads to the construction of the price
offer curve. This curve shows the bundles that would be demanded at different prices of
good x. In panel (ii) of figure 10, we trace down the quantities and plot them against
their respective prices. We get a demand curve which is downward sloping, i.e. the price
and quantity demanded of the good move in opposite directions, ceteris paribus (Py, M and
the consumer’s preferences are held constant).
which means the slope of the indifference curve is greater than the slope of the budget
line, and hence only good x will be demanded. If the price of good x rises above p*x, only
good y will be consumed and zero quantity of good x is demanded. When the price of good x
is equal to p*x, any quantity of good x between zero and M/p*x can be demanded. The optimal
bundles then lie on:
i. Budget line GH1, since all bundles on this budget line are optimal, i.e. when the price
of good x is p*x.
ii. The x-axis, when the price of good x has fallen below p*x and only good x is demanded.
4
A fall in price leads to two effects. First, purchasing the original bundle leaves the consumer with some extra
income in hand, and second, the fall in the price of x makes it cheaper, so the consumer may consume more of it in
place of good y. This will be discussed in chapters to come.
Figure 11
Again, plotting the quantities against the respective prices yields the demand curve in
panel (2) of figure 11. Above p*x, zero units of good x are demanded; at p*x any quantity
between zero and M/p*x can be demanded; & if the price falls further, more of (only) good x
is demanded.
Figure 12
The price offer curve is the line joining all the kinks of the indifference curves,
starting from the origin. For the example of ‘a’ units of x with ‘b’ units of y, we computed
the optimal bundle x* = aM/(a px + b py). Differentiating with respect to px:
Δx*/Δpx = -[aM/(a px + b py)^2].a = -a^2 M/(a px + b py)^2
Or Δx*/Δpx < 0 (since M is always positive and all the remaining terms are squared)
The demand curve for a Giffen good, then, is positively sloped. Goods of this type are an
exception to the law of demand.
4 Engel curves
An Engel curve shows the relationship between the demand for a good and the income of the
consumer. For a normal good, one can argue that there exists a positive relationship between
the two. But there are goods whose demand falls when the income of the consumer goes up.
Such goods are known as inferior goods.
income increases, the entire addition to income is used up to consume good x. So when
income is M1, x1 = M1/px, and when income increases to M2, x2 = M2/px.
x1 and x2 are boundary optimums, shown as bundles E0 and E1 in panel (i) of figure 16. Panel
(ii) of figure 16 depicts the Engel curve. The slope of the Engel curve is calculated as follows:
ΔM/Δx = (M2 - M1)/(x2 - x1) = (M2 - M1)/((M2 - M1)/px) = px
Or ΔM/Δx > 0
Figure 17 Figure 18
x* = (c/(c+d)) (M/px)
Again, differentiating this equation with respect to M and rearranging, we get:
((c+d)/c) px = ΔM/Δx*
Assuming c+d = 1, the reason for which was explained earlier, ΔM/Δx* = px/c, which is the
slope of the Engel curve.
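With Cobb Douglas demand x* = (c/(c+d))(M/px), the Engel curve is a straight line whose slope ΔM/Δx* is px/c when c+d = 1. The sketch below traces a few points of that line; the exponents, the price and the income levels are assumed values chosen only for illustration.

```python
def demand_x(c, d, px, m):
    # Cobb Douglas demand for good x.
    return (c / (c + d)) * m / px

# Assumed values: c + d = 1, px = 2, three income levels.
c, d, px = 0.25, 0.75, 2.0
incomes = [100.0, 200.0, 300.0]
quantities = [demand_x(c, d, px, m) for m in incomes]

# Slope of the Engel curve (income on the vertical axis) between successive points.
slopes = [
    (incomes[i + 1] - incomes[i]) / (quantities[i + 1] - quantities[i])
    for i in range(len(incomes) - 1)
]
print(quantities, slopes)
```

The slope is the same between every pair of points and equals px/c, so the Engel curve is a ray through the origin.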
Figure19
This would mean that the income offer curve is a straight line joining (x1,y1), (2x1,2y1),
(3x1,3y1) and so on, where (x1,y1) is the optimal bundle when income is M, (2x1,2y1) is the
optimal bundle when income doubles, and likewise.
on how much x the consumer has and not on the ratio (y/x). If the consumer’s income is M1
and the optimal bundle is (x1,y1), then if income increases his optimal bundle becomes
(x1,y1+k) for some constant k.
Figure 20
The example of such a good is salt. Even when income rises there is no increase in the
quantity of salt demanded. You spend the addition to income on all goods but salt. Hence
there is a ‘zero income effect’.
Summary
For a solution to the utility maximization problem, the indifference curve must be
tangent to the budget line or, equivalently, the slopes of the two must be equal. When
indifference curves have a kink, the kink should touch the budget line for an optimal solution.
For a normal good, the law of demand operates and quantity demanded moves in the
opposite direction to the price change. Demand curves are negatively
sloped in all cases except Giffen goods.
For a normal good, the change in quantity demanded is positively related to the change
in income of the consumer and hence the Engel curve is positively sloped in all cases
except inferior goods.
Exercises
Q1. a) If a consumer has a utility function U(x,y) = x^1 y^4, what fraction of his income
will he spend on good y?
b) If prices are Px and Py and income is M, what will be the consumer’s optimal choice
bundle?
Q2. Suppose that a consumer always consumes 2 spoons of sugar with 1 cup of tea, their
respective prices are Ps and Pt, and the consumer has m rupees to spend on sugar and tea.
How much of each will he demand?
c) Solve for the optimal choice bundle if prices are Px and Py and the consumer’s income is M.
Q4. Henry is currently consuming only Coke and Pizza. At his current consumption bundle the
marginal utility of Coke is 10 and that of Pizza is 5. Each Coke costs Rs.2 and each Pizza
costs Rs.10. Is he maximizing his utility? Explain. If he is not, how can he increase his
utility while keeping his expenditure constant?
Q5. Assume good x is inferior. Draw the income offer curve. Is it possible that good y is
also inferior? Explain.
Q6. Madhu views Pepsi and Coca-cola as perfect substitutes. The price of a 750 ml bottle of
Pepsi is Rs. 10 and the price of a 750 ml bottle of Coca-cola is Rs.12. What does Madhu’s
Engel curve for Pepsi look like? By how much should her budget increase so that she can
consume one more unit of Pepsi?
Glossary
Optimal choice: A state of affairs is optimum when it is the best attainable, and the
choice which is optimum is called the optimal choice.
Price offer curve: The locus of all consumer equilibria when price changes is known
as the price offer curve.
Demand curve: The demand curve is a curve showing the (negative, for a normal good)
relationship between price and quantity demanded by the consumer.
Giffen good: If the law of demand is violated and a positive relationship between
price and quantity demanded by the consumer is observed for a good, then
that good is called a Giffen good.
Income offer curve: The locus of all consumer equilibria when the income of the consumer
changes is known as the income offer curve.
Engel’s curve: Engel’s curve is a curve showing the (positive, for a normal good)
relationship between income and quantity demanded by the consumer.
Inferior good: If a negative relationship between income and quantity
demanded by the consumer is observed for a good, then that good is called an inferior good.
References:
www.wikipedia.org