
Mean: AVERAGE()

Mode: MODE.SNGL()

Median: MEDIAN()

Quartile: QUARTILE.INC()

Interquartile range: QUARTILE.INC(data array, 3) - QUARTILE.INC(data array, 1)

Covariance: COVARIANCE.S (if data is a sample of the population) / COVARIANCE.P (if data is the total population)
Range: -infinity to +infinity

Correlation: CORREL()
Range: -1 to +1

Normal distribution: =NORM.DIST(x, mean, std, TRUE). It gives the area to the left on the curve, i.e. the probability of being less than a certain value. To get the area to the right of the curve (the probability of being higher than a certain value), use 1 - NORM.DIST(x, mean, std, TRUE), since the total probability is 1.

To calculate the probability between two values X and Y (with X < Y):
Prob(X < Defective < Y) = NORM.DIST(Y, MEAN, SD, TRUE) - NORM.DIST(X, MEAN, SD, TRUE)

Prob(Defective < 75) = less than 75
Prob(Defective ≤ 75) = less than or equal to 75
The two probabilities are the same, because Prob(Defective = 75) = 0.
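The left-tail, right-tail, and between-two-values calculations above can be mirrored outside Excel with the standard library's statistics.NormalDist (Python 3.8+). A minimal sketch; the mean-70 / std-5 "defect weight" distribution is an invented illustration, not a number from these notes:

```python
from statistics import NormalDist

# Hypothetical distribution: mean 70, std 5 (illustrative only).
d = NormalDist(mu=70, sigma=5)

left = d.cdf(75)                 # P(X < 75), like NORM.DIST(75, 70, 5, TRUE)
right = 1 - d.cdf(75)            # P(X > 75): right tail, 1 - left tail
between = d.cdf(75) - d.cdf(65)  # P(65 < X < 75): larger CDF minus smaller
```

Note the same convention as the Excel formulas: the CDF always measures the area to the left, so a right tail is 1 minus the CDF and a between-probability subtracts the CDF at the lower value from the CDF at the upper value.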

NORM.INV: when the probability is known, it gives the value.
= NORM.INV(probability, mean, std)
It gives the value to the left of which (i.e. less than which) the given probability lies.

BINOM.DIST: =BINOM.DIST(x, n, p, FALSE/TRUE)
FALSE gives P(successes = x); TRUE gives P(successes ≤ x)

POISSON.DIST: =POISSON.DIST(x, lambda, FALSE/TRUE)

T.DIST: =T.DIST(x, df, TRUE)

T.INV: =T.INV(probability to the left, df)
=T.INV(α/2, n-1); df = sample size - 1

CONFIDENCE.NORM: =CONFIDENCE.NORM(α, σ, n)

CONFIDENCE.T: =CONFIDENCE.T(α, s, n)

Difference-in-means hypothesis testing: create a pivot table > Data Analysis > t-Test (paired two sample for means / two sample assuming equal or unequal variances)

Standard deviation: STDEV.S / STDEV.P

Rule of thumb: about 95% of the data will be within 2 standard deviations of the mean

Variance: (standard deviation)^2

Chebyshev's rule: at least (1 - 1/k^2) of the data lies within k standard deviations of the mean; for k = 2, at least 75%
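The two spread rules above can be checked side by side in Python. A small sketch using only the standard library: the Chebyshev bound holds for any data, while the 95% figure is specific to the normal distribution:

```python
from statistics import NormalDist

def chebyshev_bound(k):
    # At least this fraction of ANY dataset lies within k std devs of the mean.
    return 1 - 1 / k**2

cheb = chebyshev_bound(2)  # guaranteed minimum for k = 2

# For normally distributed data, the fraction within 2 std devs is higher:
normal_frac = NormalDist().cdf(2) - NormalDist().cdf(-2)
```

For k = 2 Chebyshev guarantees at least 75%, whereas the normal "rule of thumb" gives roughly 95%; the gap is why the rule of thumb only applies to bell-shaped data.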

Measures of association: describe, in a set of numbers, the relationship between two or more variables.

Measures of central tendency

Measures of dispersion

Random variable: a variable that takes on values determined by the outcome of a random experiment.

Continuous distribution: for continuous data, e.g. height, distance traveled, amount of water. There are infinitely many possible values (1 km / 1.1 km / 1.111 km etc.), limited only by the measuring instrument.

Discrete distribution: for discrete data (counts).

Probability mass function / probability density function: a rule that assigns probability values to a random variable.

T distribution: also a symmetric distribution centered at zero, i.e. it has a mean equal to zero. The spread of this distribution depends on a parameter called the degrees of freedom, commonly denoted df. As the degrees of freedom increase, the t distribution becomes closer and closer to the standard normal distribution.
For population mean

For population percentage

Hypothesis testing (difference in means):
Connected by some means: t-test paired
No connection: t-test equal/unequal variance

Sample size calculation: in all cases use p = 50% (the conservative choice)

Hypothesis testing (normal case):
1. Formulate hypothesis
2. Calculate t value (for a population proportion, a z value)
3. Calculate cutoff value
4. Determine what to reject

Linear regression

R-square value: 0.X implies that the regression model is able to explain about X percent of the variation or changes in the unit sales of the toy. The remaining variation goes unexplained. R-square will always increase when new independent variables are added.

Adjusted R-square: will only increase if the added X variable makes the model better. It may go down if the added X variable does not make any sense.

Hypothesis testing of linear regression: from the table created by Data Analysis (upper and lower limits, p value).

Multicollinearity

Mean centering
What are descriptive statistics?

These are a set of numbers that describe data. A dataset may have many observations, and a summary set of numbers that describes those multiple observations is called descriptive statistics.

There are many descriptive statistics. The more important and commonly used among them can be categorized broadly into two categories: measures of central tendency, also known as various averages of the data, and measures of dispersion. The former tries to capture some central aspect of the data, while measures of dispersion summarize how spread out or dispersed the data is.

Measures of central tendency:There are three important averages or measures of central tendency used
to summarize data. The mean, the median, and the mode.

Mean: "mean" most of the time refers to the arithmetic mean, though there are other means, such as the geometric mean or the harmonic mean.

Median: the median of a set of ordered observations is the middle number that divides the data into two parts.

When should you report the median and when should you report the mean?

The mean is influenced to a greater extent by extreme observations (values much higher or lower than most of the data). So if you notice extreme observations in your data, then perhaps the median is a better summary of the data than the mean.

The relationship between the mean and the median relates to the skewness of the data: if the mean is higher than the median, the data is skewed to the right.

Mode: The Excel command to calculate mode is MODE.SNGL. A mode is not a very relevant statistic when the data is essentially continuous. For example, consider the daily exchange rate between the US dollar and Euro in a particular month. The mode is not very relevant because the nature of the data is such that no value occurs more than once. Even if a value occurs more than once, the likelihood of such occurrence would be low. Thus, little information is gained knowing the mode.

Dispersion or spread in the data:  How does one translate dispersion into some meaningful
descriptive statistic?

One way to do so is to calculate the range of data, which simply is the difference between maximum
and minimum values in the data.

Another way is: the inter quartile range, or IQR. This defines the middle 50% of data, leaving 25% of
the data to the right, and 25% to the left. The median, incidentally, is the second quartile. The
minimum number in the range is the zeroth quartile. And the maximum number in the range is the
fourth quartile. Finally, the interquartile range or IQR is the third quartile minus the first quartile.

QUARTILE.INC() takes your data array and the particular quartile (zeroth, first, second, third, or fourth) that you wish to calculate. The interquartile range can then be calculated in Excel as the difference between the third and the first quartiles.

Standard deviation: if we had data which was a sample from some larger population of data (which, by the way, would typically be the case in a majority of business applications), we would use the Excel command STDEV.S to calculate the standard deviation rather than STDEV.P, which is used to calculate the standard deviation of the whole population. The P and S stand for population and sample.

Box plot:

Rule of thumb: It says that approximately 68% of data lie within one standard deviation, and
approximately 95% lie within 2 standard deviations from the mean.

Variance: variance is the square of standard deviation


The covariance and correlation: another set of descriptive measures of data which, rather than looking at a single variable, considers the co-variation between two variables. That is, how do the two variables vary together or co-vary?

These measures are called measures of association; they describe, in a set of numbers, the relationship between two or more variables.

COVARIANCE.S is used to measure the covariance between two variables.

# The two data ranges must be of equal length.

# It doesn't matter which data range you select first.

# The unit is the unit of the first data range times the unit of the second data range.

# A positive answer means a positive relationship; a negative answer means a negative relationship.

# It is affected by the units of the variables. So the measure is fine as long as we only need to know the direction of the relationship (that is, when one variable increases or decreases, what happens to the other variable?), but the covariance cannot be directly interpreted in terms of how strong the relationship is.

# The covariance measure can theoretically vary between negative infinity and positive infinity. A positive value of the covariance indicates a positive relationship between the two variables: if one increases, the other increases; if one decreases, the other decreases. A negative value indicates a relationship where, when one variable increases, the other decreases, and vice versa.

Correlation: the correlation is not affected by a change of units, and is always bounded between -1 and +1, with a positive value of correlation indicating a positive relationship and a negative value indicating a negative relationship. Further, the closer the correlation is to +1 or -1, the stronger the positive or negative relationship between the two variables. Loosely speaking, a correlation in excess of +0.5 is considered a strong positive relationship, and a correlation less than -0.5 is considered a strong negative relationship between the two variables. The Excel function is CORREL().
Causation: the covariance and correlation tell us how two variables vary with respect to each other. That is, if one variable increases or decreases, how is that related to the increase or decrease of the other variable? We looked at the heights and weights of certain Olympic athletes and concluded that there was a strong positive correlation between the weight of an athlete and his or her height. This correlation, however, in no way proves that weight causes height or that height causes weight. This is where causation comes in, which is a concept distinct from correlation. Unfortunately, many times this distinction is glossed over.

Usually, to establish causation we need to have correlation and also a temporal distinction between the two variables. That is, the variable causing the other variable needs to precede, or occur before, the variable that it causes. Further, we need to rule out or control for the many other variables that might be causing the variable in question.

Probability: Probability is a numerical measure of the frequency of occurrence of an event. It is measured on a scale from 0 to 1. An event with probability 0 will definitely not occur. An event with probability 1 will occur with certainty.

Random experiment: simply any situation wherein a process leads to more than one possible outcome.

 A Coin toss

 Roll of a dice

 A company declaring its Earnings

 Closing value of the stock market tomorrow

 Bonus that you get at year end

Random Variable: A random variable is a variable that takes on values determined by the outcome of a
random experiment

Statistical Distributions

Discrete distribution: a statistical distribution used for discrete data. Example: the number of students in a class, the number of patients admitted to a hospital.

Continuous distribution: a statistical distribution used for continuous data. Example: height, distance traveled in a road trip, the amount of water in a bucket. (These are continuous because there are an infinite number of possibilities: a distance can be 1 km / 1.1 km / 1.111 km etc., limited only by the measuring instrument.)

Discrete Data

 number of students in class

 number of patients admitted to a hospital

 number of companies with revenue > 1 b$

Test of Discreteness

 The data is Discrete if between any two realizations a finite number of outcomes can occur

Discrete versus Continuous data:

It is common in business applications to use a continuous distribution, such as the Normal (the bell curve), for discrete data, because the best-understood distribution is the normal distribution.

Normal distribution: one of the most popular statistical distributions is the Normal distribution, which importantly happens to be a continuous distribution. However, as mentioned earlier, it is often acceptable in business applications to approximate even discrete data with a continuous distribution such as the normal.

Probability mass function (PMF): a PMF is a rule that assigns probabilities (values between 0 and 1) to the various possible values that a discrete random variable takes. Example: a coin toss, 0.5 for heads and 0.5 for tails, totaling 1.

PDF: a probability density function is the analogous rule for a continuous random variable; it assigns densities to the possible values, and probabilities are obtained as areas under the density curve over ranges of values.

The probability of a particular outcome for a continuous distribution is always zero:

Height.This is continuous data as discussed in the previous lesson, because if you take any two heights,
for example, 5' and 6', then the possible value of heights that can occur between 5' and 6' are infinite. If
you then ask, what is the probability of someone's height being 5'2"? The answer is 0, because even if
your friend has a height of 5'2", my response will be that you get a better measuring instrument and you
will find that the height is not 5' 2'', but say 5' 2.01''. Such kind of argument can be given for any height
that you come up with, thus implying that the probability of someone having a particular height is
always 0.This is the reason why, when using a continuous distribution, we always consider ranges of
outcomes. For example, what is the probability that someone's height is between 5'2" and 5'5",
probability for a range of outcomes.What is the probability that someone's height is less than 5'
feet.Again, a range of outcomes.

Applications of the Normal Distribution

Demand ~ Normal(mean = 313 sandwiches, std = 57 sandwiches)

Ques: What is the probability that on a particular day the demand for falafel sandwiches is less than 300 at the restaurant?
Ans: 0.4098

Ques: If the restaurant stocks 400 falafel sandwiches for a given day, what is the probability that it will run out of these sandwiches on that day?
Ans: 0.0635

Ques: How many sandwiches must the restaurant stock to be at least 98% sure of not running out on a given day?
Ans: 430.0637, which needs to be rounded up to the next integer, giving an answer of 431 sandwiches to be stocked.
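The three falafel answers can be reproduced with the standard library's statistics.NormalDist, whose cdf and inv_cdf play the roles of NORM.DIST and NORM.INV. A sketch using the numbers given in the example:

```python
from statistics import NormalDist
import math

demand = NormalDist(mu=313, sigma=57)   # Demand ~ Normal(313, 57)

p_less_300 = demand.cdf(300)            # P(demand < 300), like NORM.DIST
p_run_out = 1 - demand.cdf(400)         # P(demand > 400) if 400 are stocked
stock_98 = demand.inv_cdf(0.98)         # stock level covering 98% of days
answer = math.ceil(stock_98)            # round UP to the next whole sandwich
```

Rounding up (not to the nearest integer) matters: stocking 430 would leave the coverage just under 98%, so 431 is the smallest stock meeting the requirement.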

Another Problem…

John can take either of two roads to the airport from his home (Road A or Road B). Owing to varying traffic conditions the travel times on the two roads are not fixed; rather, on a Friday around midday the travel times across these roads can be well approximated by normal distributions as follows:

Road A: mean = 54 minutes, std = 3 minutes
Road B: mean = 60 minutes, std = 10 minutes

Ques: Which road should he choose if on midday Friday he must be at the airport within 50 minutes to pick up his spouse?

Road A: Prob(Time < 50) = NORM.DIST(50,54,3,TRUE) = 0.0912
Road B: Prob(Time < 50) = NORM.DIST(50,60,10,TRUE) = 0.1586

He should choose Road B, which gives the higher probability of making it within 50 minutes.
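The comparison can be verified with the same NormalDist approach; it also makes the counterintuitive part visible in code: Road B wins despite its higher mean, because its larger spread puts more probability mass below the 50-minute deadline.

```python
from statistics import NormalDist

road_a = NormalDist(mu=54, sigma=3)
road_b = NormalDist(mu=60, sigma=10)

p_a = road_a.cdf(50)   # P(Time < 50) on Road A
p_b = road_b.cdf(50)   # P(Time < 50) on Road B
# road_b has the higher chance of beating the deadline, even though
# its average travel time is worse.
```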

The Standard Normal Distribution:

It is a Normal distribution with mean = 0 and std = 1

Population and sample:


The Central Limit Theorem: sample averages are normally distributed irrespective of the distribution of the population the samples came from. Not only are they normally distributed, but more importantly their mean is equal to the population mean.

Each sample mean is normally distributed, and their mean is the population mean.
Bernoulli Process: Multiple Trials of the Bernoulli Process

Game of Dice

Win (if you roll a 6)

Lose (otherwise)

The Binomial Distribution (one of the two most commonly used discrete distributions):

Consider a situation where there are n independent trials, where the probability of success on each trial is p and the probability of failure is 1-p. Define a random variable X to denote the number of successes in n trials. Then this random variable is said to have a Binomial distribution.
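The two modes of BINOM.DIST (FALSE for P(X = x), TRUE for P(X ≤ x)) can be written out from the definition using only math.comb. A sketch; the dice numbers continue the "roll a 6 to win" example with hypothetical counts:

```python
from math import comb

def binom_pmf(x, n, p):
    # P(successes = x): like BINOM.DIST(x, n, p, FALSE)
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    # P(successes <= x): like BINOM.DIST(x, n, p, TRUE)
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# Game of dice: probability of rolling exactly two 6s in 10 rolls (p = 1/6).
p_two_sixes = binom_pmf(2, 10, 1/6)
```

The cumulative version is just the sum of the point probabilities up to x, which is exactly what the TRUE flag asks Excel to do.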

Binomial distribution:
Poisson distribution:
T distribution: the T distribution, just like the standard normal, is also a symmetric distribution centered at zero. That is, it has a mean equal to zero. The spread, or the standard deviation, of this distribution depends on a single parameter called the degrees of freedom, commonly denoted by df. As the degrees of freedom increase, the t distribution becomes closer and closer to the standard normal distribution.

The t distribution, unlike the normal distribution, does not have any stand-alone business applications.
Rather, it is used as a tool for the calculations used in coming up with confidence intervals and
hypothesis testing, which we will be introducing in subsequent lessons.

Degrees of freedom is the parameter of the t distribution that uniquely identifies one t distribution from
another. Just like in the case of a normal distribution, the combination of two parameters, the mean and
the standard deviation, uniquely identify one normal distribution from another.

the degrees of freedom is linked to the size of data being used. So, a t distribution using a larger set of
data would have a greater degree of freedom than a t distribution using a smaller set of data.

So, for example with 11 df, the t value with a given probability to the left is the negative of the t value with the same probability to the right (by symmetry).

Confidence Interval: It is an ‘interval’ with some ‘confidence’ or probability attached to it


[ lower limit < mean < upper limit ]

Example:

US Presidential Election, predicting the proportion of votes for a particular candidate

confidence interval for the ‘population proportion/percentage ’

Example:
Average starting salary of all business students who graduated last year in New York city

confidence interval for the ‘population mean’

The z statistic and the t statistic:

When the population standard deviation is unknown, the sample standard deviation is used in place of the population standard deviation. The statistic then follows a t distribution with (n-1) degrees of freedom.

Confidence interval calculation: a random sample of 20 observations from a population had a mean equal to 70. The standard deviation of the population data is 10. Find an 85% confidence interval for the population mean.

Probability outside the confidence interval is referred to as ‘α’ … and we wish to construct a (1-α)
confidence interval for the population mean
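Since the population standard deviation is known in this example, the interval is z-based; the margin is the same quantity CONFIDENCE.NORM(α, σ, n) returns in Excel. A sketch with the numbers from the example (n = 20, mean = 70, σ = 10, 85% confidence):

```python
from statistics import NormalDist
from math import sqrt

n, xbar, sigma = 20, 70, 10   # sample size, sample mean, known population std
conf = 0.85                   # 85% confidence => alpha = 0.15
alpha = 1 - conf

z = NormalDist().inv_cdf(1 - alpha / 2)   # z cutting off alpha/2 in each tail
margin = z * sigma / sqrt(n)              # half-width of the interval
lower, upper = xbar - margin, xbar + margin
```

The interval works out to roughly [66.78, 73.22]: the probability outside it is α = 0.15, split as α/2 = 0.075 per tail.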

For population proportion/percentage (we have to do this manually):

P(hat) = sample proportion/percentage

How big should be the sample size:


Population std deviation: 0.9 batteries

Sample size calculation for population proportion:

The sample size, n, will be largest, and hence a conservative estimate, when the underlying expression p-hat times (1 minus p-hat) is at its maximum. And for this expression to be maximum, p-hat has to be 0.5 or 50%.
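The conservative-p-hat argument can be sketched as code. The margin of error (0.03) and confidence level (0.95) below are hypothetical inputs chosen for illustration, not from the notes; the formula is n = (z_{α/2}/E)² · p̂(1 - p̂), rounded up:

```python
from statistics import NormalDist
from math import ceil

def sample_size_proportion(margin, conf=0.95, p_hat=0.5):
    # n = (z_{alpha/2} / margin)^2 * p_hat * (1 - p_hat).
    # p_hat = 0.5 maximizes p_hat*(1-p_hat), giving the largest (safest) n.
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z / margin) ** 2 * p_hat * (1 - p_hat))

n_conservative = sample_size_proportion(margin=0.03)           # assumes p = 0.5
n_informed = sample_size_proportion(margin=0.03, p_hat=0.3)    # smaller n
```

With p̂ = 0.5 the calculation gives n = 1068 for a 3% margin at 95% confidence; any other p̂ gives a smaller n, which is why 50% is the safe default when nothing is known about the proportion.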
Hypothesis testing:

 Hypothesis Test is a scientific tool to aid your decision making.


It takes into account…

− size of the sample

− variability in the sample

− level of ‘significance’ you desire in your conclusion


Hypothesis testing:[Example]You are the production manager at a beverage manufacturer and you
receive a bottling unit that has been recently re-adjusted so that it puts 200 milliliter of beverage in
disposable plastic bottles. You need to test that indeed the bottling unit puts in 200 milliliter of
beverage. For that you fill out 10 bottles using the unit at different times so as to obtain a random
sample and very carefully measure the amount of beverage inside each bottle.

{Here 200 is population mean or claimed population mean}


The claim being made is that the population mean is equal to 200 milliliters. So you would reject the claim if you get a sample mean way above 200 milliliters, and you would also reject the claim if the sample mean is way below 200 milliliters. We will shortly, in these steps, lay out what we mean by way above and way below 200 milliliters.
EXAMPLE: A fuel additive manufacturer claims that through the use of its’ fuel additive, automobiles in
the small car category should achieve on average an increase of 3 miles or more per gallon of fuel.

So the null hypothesis is not rejected

EXAMPLE:

We wish to test a claim that the average age of Men MBA students across various MBA programs in the
US is greater than 28 years. For this we collect data on average ages of men MBA students across a
sample of 40 MBA programs in the US.
It will not have any effect if we switch the hypothesis

Hypothesis testing for population proportion: here we use the z-statistic (for the population mean we have been using the t-statistic). All hypothesis tests involving a population proportion use the z-statistic.

P(hat) = sample proportion.

P = the value around which we form the null hypothesis.

Example:

A medium sized university in the US introduces a new lunch facility on campus on a trial basis. The
university operates the Lunch facility for a few months and then decides to survey the student body.
Based on the survey, university would make this facility a permanent fixture or do away with it.
Specifically, if more than or equal to 70% of the student body approves of it then the facility would be
made permanent else it would shut down. The university conducts a survey with 750 randomly selected
students on campus and finds that 510 of these students (or 68% of the sampled students) approve of
the new facility and the remaining 240 students or 32% students do not approve of it. Based on the
criteria set by University should the facility be made permanent?
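The university example above can be worked through with the proportion z-statistic. One common framing (an assumption here, since the notes do not spell out the hypotheses) is H0: p ≥ 0.70 against Ha: p < 0.70, i.e. the facility is shut down only if the data convincingly shows approval below 70%:

```python
from statistics import NormalDist
from math import sqrt

n = 750
p_hat = 510 / 750   # 68% sample approval
p0 = 0.70           # required approval rate

# z-statistic for a population proportion (uses p0 in the standard error):
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

p_value = NormalDist().cdf(z)    # left-tail probability
reject_h0 = p_value < 0.05       # at the 5% significance level
```

The z-statistic is about -1.20 with a p-value around 0.116; at the 5% level we cannot reject the claim that approval is at least 70%, so under this framing the sample does not provide strong evidence against making the facility permanent.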
Error: Type 1 error & type 2 error

Example:

Your friend Sam claims that he can shoot 40 or more baskets in an hour from the 3-point line in a
Basketball court. So, Sam is making a claim about a population parameter, in this case it is his true
shooting ability from the 3-point line in a Basketball court. This can be likened to the population mean
mu. Thus Sam is claiming that the population mean mu of his shooting ability is greater than or equal to
40 baskets in an hour from the 3-point line in a Basketball court.
Example: An empirical study using data on heights of people claimed that the average height of men
aged 18 years to 45 years across the world was 173 cm. This study included men not only from the
sports fraternity but across a wide spectrum of professions and walks of life. One could argue that men
Olympians are likely to be taller than this claimed average height of 173 cm.

We can switch the hypotheses.

X(bar) = average height of athletes

Mu = 173 cm

S = sample standard deviation

N = sample size
The difference in mean hypothesis testing:
These calculations can be done using the Excel Data Analysis tool.

T test:Paired two sample for mean

When it is needed to compare before and after results in a hypothesis test, paired two sample for means is to be used. In this case the difference is taken for each pair, and then the average of the differences is computed.

T test: Two sample assuming equal variance / unequal variance:

An average is computed for each of the two samples (for example, one average for the men and one for the women), and then the results are compared.

T test paired two sample for means and t test two sample assuming equal/unequal variance will give different cutoffs and different results.

In the first case we use the t-test two sample assuming equal variance, as there is no sense of pairing between the data: the data represent the ages of 40 different men and 40 different women, so how could there be any pairing?

In the second case we use the t-test paired two sample, because there is a sense of pairing between the data: both sales occur in the same month.
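The "difference for each pair, then average" mechanics of the paired test can be sketched with the standard library; the before/after sales for four stores are hypothetical numbers for illustration:

```python
from statistics import mean, stdev
from math import sqrt

# Hypothetical before/after sales for the same 4 stores (paired data).
before = [10, 12, 11, 13]
after = [12, 13, 13, 14]

diffs = [a - b for a, b in zip(after, before)]   # one difference per pair
d_bar = mean(diffs)                              # average of the differences
s_d = stdev(diffs)                               # std dev of the differences
n = len(diffs)

t_stat = d_bar / (s_d / sqrt(n))   # t-statistic with n-1 degrees of freedom
```

The pairing is what reduces the problem to a one-sample test on the differences; a two-sample (equal/unequal variance) test would instead compare the two group means with a pooled standard error, which is why the two tests give different cutoffs and results.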

Linear regression:
Interpreting: FOR BETA 1

Firstly, the interpretation is in terms of x1 increasing by one unit, not one percent. So if the unit of x1 is kilograms, then a one unit increase implies that x1 increases by one kilogram; if the unit is 1,000 kilograms, then a one unit increase implies x1 increases by 1,000 kilograms. Secondly, the interpretation says that y increases by beta one units. So, for example, if the y variable is measured in millions of dollars, then beta one units imply beta one million dollars. Thirdly, the last part of the interpretation is important, which says that all other variables are kept at the same level. To interpret the impact of a particular explanatory variable on the y variable, it is important that all other variables are held constant at whatever level they are.

From above: for Beta 1:

All other variables remaining at the same level: implying that if ad expenditure and promotional expenditure are kept at the same level (they are not changed) and only the price is increased by $1 (one unit), then we would expect the unit sales to drop by 5,055.27 units.

FOR BETA 2:

For every one unit increase in ad expenditure, in this case, the unit of ad expenditure is $1,000, because
that is what ad expenditure is measured in our data in $1,000. So, the interpretation is, for every $1,000
increase in advertising expenditure, the unit sales increase by 648.61 units. All other variables remaining
at the same level.

FOR BETA 3:

The value of the coefficient is a positive 1802.65, implying that for every one unit increase in promotional expenditure (once again, the units of measurement for promotional expenditure are $1,000), that is, for every $1,000 increase in promotional expenditure, we would expect unit sales to increase by 1802.65 units. All other variables remaining at the same level.

FOR BETA ZERO:

The interpretation of the beta zero coefficient: the estimate of the beta zero coefficient is the value of the y variable when all x variables are zero. So, in this case, it implies that the value of unit sales would be negative 25096.83 when all the x variables are zero; that is to say, when the price is zero, the ad expenditure is zero, and the promotional expenditure is also zero. Now, this is the technical interpretation of beta zero. Clearly, in this case, this technical interpretation does not have managerial relevance. Why? Because talking of a situation where you're selling a toy for free, pricing it at $0, and then trying to see what the unit sales would be, does not make managerial sense.

Prediction:

Errors,residuals and R-square:

In some cases the residuals are negative, which means that the model is over-predicting.

The model is linear, so the fitted graph is a straight line.

R square value:
R-squared is equal to 0.61899 -- this implies that this regression model is able to explain about 61.9 percent of the variation or changes in the unit sales of the toy. What happens to the remaining variation? It goes unexplained. You may notice from the earlier regression we carried out, that
increasing the number of X variables, increases R-squared. The R-squared was higher in the model when
we also included the advertising expenditure and promotion expenditures. Higher the value of R-
squared, that is, closer it is to one, implies that a greater proportion of variation in the Y variable, is
explained by the regression model. Or in other words, the model fits well to the data. Lower the value of
R-squared, that is closer it is to zero, implies that a lesser proportion of variation in the Y variable is
explained by the regression model. Or in other words, the model does not fit well to the data.
Unfortunately, there is no one value of R-squared, above which you can claim that you have a good
fitting model, and below which you can claim that you have a poor fitting model.
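The always-increases property of R-squared is what the adjusted version corrects for. A sketch of the standard adjustment formula, plugged with the figures quoted in these notes (R² = 0.61899, 24 months of data, 3 X variables: price, ad expenditure, promotional expenditure):

```python
def adjusted_r_squared(r2, n, k):
    # Penalizes R^2 for the number of X variables k, given n observations:
    # adj R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

adj = adjusted_r_squared(0.61899, n=24, k=3)
```

Holding R² fixed, adding another X variable (k = 4 instead of 3) lowers the adjusted value; so a new variable only raises adjusted R² if it raises plain R² by more than the penalty, which is the behavior described in the next section.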

Why do we have errors in the regression model which then lead to these observe residuals?

There could be a multitude of reasons for this. The major reasons tend to be omitted variables and the
functional relationship between the Y and X variables. Omitted variables mean that your model may be
missing some important explanatory or X variables, which may be aggravating these errors. While the
functional relationship means that there may be some non-linearity in the relationship between the Y
variable and the set of X variables, implying that a straight line relationship may not be most
appropriate.

Normality assumptions on the error:

One important assumption about the error is that it has a normal distribution with the mean equal to
0, and some constant standard deviation.Visually what this assumption means is that the vertical red
error bars shown in the scatterplot from the previous lesson tend to be approximately equally
distributed above and below the regression line. So that the average across the positive and negative
errors tend to be approximately 0. Further, the spread of these vertical error bars tend to be similar
across the entire regression line. Another way to think about this is that if you plotted a histogram of all
the error terms using your data you would tend to get a bell-shaped curve centered at 0.

The relationship between the betas and the b's is that the b's are estimates of the betas. Depending on the sample used for estimation, the value of the b's may change. For example, in a toy sales regression, had we
used 36 months of data rather than 24 months we may have gathered slightly different estimates of the
impact of price and other variables on sales. This indicates that the b's themselves can be considered as
random variables, and in turn have a distribution which is a normal distribution centered at the so-called
true value of the betas. For example, b0 follows normal distribution that is centered at the true value of
beta 0. b1 follows a normal distribution that is centered at the true value of beta 1, and so on. The
relationship between the betas and the b's is analogous to the relationship between the population
mean and the sample mean that we studied in course two of the specialization. The population mean is
fixed but unknown. And the sample mean can be thought of as a random variable having a normal
distribution centered at the population mean.

(b1 - beta 1) divided by the standard error of b1 follows the t distribution with n-k-1 degrees of freedom, and so on for the other estimated b's.

Hypothesis testing for linear regression:


3rd way of hypothesis testing:

The automatically generated P value:

When the p value is less than alpha, we reject the null hypothesis.

When the p value is low, the null must go.

So the null hypothesis is rejected.

From this data set we see that the P value for %_commercial is greater than 5%, so we cannot reject the null hypothesis, which signifies that %_commercial has little significance in the model.

Since the value 5000 falls between the upper limit and the lower limit, we cannot reject the claim.

R square and adjusted R square:


R square values always increase with the addition of another variable. But the adjusted R square will only increase if the added X variable makes the model better. It may go down if the added X variable does not make any sense.

Categorical Variables in a Regression: Dummy Variables:


The value of REGA is 1 and the value of REGB is 0. The value of the coefficient indicates that for region A the truck takes 106.84 minutes more than for region B, when all other parameters are the same.

For 3 categorical variables:

The p value of REGB is greater than the alpha value, which indicates that area B does not have any significant difference from area C.
Mean Centering Variables in a Regression Model: when the intercept does not have any managerial significance, we can use mean-centered variables in the regression model. To do this we replace the height column with a column of (height - average of all heights); rerunning the regression then gives a meaningful intercept.

The intercept indicates that when male = 0 (meaning female, because it is a dummy variable) and centered height = 0 (meaning average height), the person weighs 69.62 kg; to get the male weight we add 5.52 kg.
Interaction model:

Beta 2 must be for females, because female means male = 0, which removes the interaction effect from the equation.

Log-log model and semi-log model:
