
Name – Prasanth K S

Roll No - 2129MBA0064

Centre for Distance Education, Anna University, Chennai – 600025.
University College of Engineering, Konam, Nagercoil – 629 004.
First Semester [Regulations 2017] Master of Business Administration [MBA]
DBA 5102 – Statistics for Management, Written Assignment

Faculty: Dr. Senthil Velmurugan N Maximum: 100 marks

ANSWER ALL QUESTIONS
PART A (10 x 02 = 20 marks)

1. The mean of a Binomial distribution is 20 and the standard deviation is 4. Find the parameters of the distribution.
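Answer:
For a binomial distribution, mean = np = 20 and variance = npq = SD² = 16. Dividing variance by mean gives q = 16/20 = 0.8, so p = 1 − q = 0.2 and n = 20/0.2 = 100. A minimal sketch of this arithmetic in plain Python:

```python
# Recover binomial parameters from mean np = 20 and SD = 4 (variance npq = 16).
mean, sd = 20, 4
var = sd ** 2        # npq = 16
q = var / mean       # q = npq / np = 0.8
p = 1 - q            # p = 0.2
n = mean / p         # n = np / p = 100
print(round(n), round(p, 2), round(q, 2))
```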
2. State Bayes' Theorem on Probability.
Answer:

Bayes' theorem, named after the 18th-century British mathematician Thomas Bayes, is a
mathematical formula for determining conditional probability: the likelihood of an
outcome occurring given that a previous outcome has occurred. Formally, for events A and
B with P(B) > 0,

P(A|B) = P(B|A) P(A) / P(B)

Bayes' theorem provides a way to revise existing predictions or theories (update
probabilities) given new or additional evidence. In finance, for example, Bayes' theorem
can be used to rate the risk of lending money to potential borrowers.

Bayes' theorem is also called Bayes' Rule or Bayes' Law and is the foundation of the field
of Bayesian statistics.

o Bayes' theorem allows you to update the predicted probability of an event by
incorporating new information.
o Bayes' theorem is named after the 18th-century mathematician Thomas Bayes.
o It is often employed in finance to update risk evaluations.
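As an illustration of the updating described above, a short sketch with made-up numbers (the prior default rate and flag accuracies below are assumptions for illustration, not figures from the text):

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
# Hypothetical lending example: probability a borrower defaults (A)
# given that a risk model flags them (B).
p_default = 0.05                 # prior P(A), assumed
p_flag_given_default = 0.90      # P(B|A), assumed hit rate of the flag
p_flag_given_ok = 0.10           # P(B|not A), assumed false-positive rate

# Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
p_flag = (p_flag_given_default * p_default
          + p_flag_given_ok * (1 - p_default))

# Posterior probability of default given a flag
p_default_given_flag = p_flag_given_default * p_default / p_flag
print(round(p_default_given_flag, 3))
```

Even with a 90%-accurate flag, the posterior is only about 0.32 because defaults are rare — exactly the kind of revision Bayes' theorem formalizes.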

3. What is Standard Error?


Answer:
The standard error (SE) of a statistic is the approximate standard deviation of its
sampling distribution. It measures, using the standard deviation, how accurately a sample
represents the population. A sample mean will generally deviate from the actual mean of
the population; the typical size of this deviation is the standard error of the mean.
o The standard error is the approximate standard deviation of a sample statistic
across repeated samples.
o The standard error captures the variation between the calculated sample mean and
the population mean that is considered known, or accepted as accurate.
o The more data points involved in calculating the mean, the smaller the
standard error tends to be.
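For the sample mean, the standard error is SE = s/√n, where s is the sample standard deviation and n the sample size. A short sketch with made-up data:

```python
import math

# Standard error of the mean: SE = s / sqrt(n),
# with s the sample standard deviation (divisor n - 1).
data = [12, 15, 11, 14, 13, 16, 12, 15]   # hypothetical sample, n = 8
n = len(data)
mean = sum(data) / n
s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
se = s / math.sqrt(n)
print(round(se, 4))
```

Doubling the sample size shrinks the SE by a factor of √2, which is the "more data points, smaller standard error" point above.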
4. Give any four factors that are to be decided for choosing a sample size.
Answer:
Effect Size, Standard Deviation, Power, and Significance Level
In general, three or four factors must be known or estimated to calculate sample size: (1)
the effect size (usually the difference between 2 groups); (2) the population standard
deviation (for continuous data); (3) the desired power of the experiment to detect the
postulated effect; and (4) the significance level. The first two factors are unique to the
particular experiment whereas the last two are generally fixed by convention. The
magnitude of the effect the investigator wishes to detect must be stated quantitatively, and
an estimate of the population standard deviation of the variable of interest must be
available from a pilot study, from data obtained via a previous experiment in the
investigator’s laboratory, or from the scientific literature. The method of statistical
analysis, such as a two-sample t-test or a comparison of two proportions by a chi-squared
test, is determined by the type of experimental design. Animals are assumed to be
randomly assigned to the various test groups and maintained in the same environment to
avoid bias. The power of an experiment is the probability that the effect will be detected. It
is usually and arbitrarily set to 0.8 or 0.9 (i.e., the investigator seeks an 80 or 90% chance
of finding statistical significance if the specified effect exists). Note that 1 − power,
symbolized as β, is the probability of a false-negative result (i.e., that the experiment
will fail to reject a false null hypothesis and so fail to detect the specified treatment
effect).
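The four factors above combine in the standard approximate sample-size formula for comparing two means: n per group ≈ 2((z₁₋α/2 + z₁₋β)σ/δ)², where δ is the effect size and σ the population standard deviation. A sketch using SciPy (assumed available), with the conventional α = 0.05 and power = 0.80:

```python
import math
from scipy.stats import norm

def two_group_sample_size(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sample comparison of means."""
    z_alpha = norm.ppf(1 - alpha / 2)   # two-sided significance level
    z_beta = norm.ppf(power)            # power = 1 - beta
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)                 # round up to a whole subject

# Effect size equal to one standard deviation (delta = sigma):
print(two_group_sample_size(delta=1.0, sigma=1.0))   # 16 per group
```

Detecting a smaller effect (or using a noisier variable) inflates n quadratically, which is why the effect size and σ estimates matter so much.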
5. Explain the two types of errors with examples
Answer:
In statistical hypothesis testing, a type I error is the mistaken rejection of the null
hypothesis (also known as a "false positive" finding or conclusion; example: "an innocent
person is convicted"), while a type II error is the mistaken acceptance of the null
hypothesis (also known as a "false negative" finding or conclusion; example: "a guilty
person is not convicted").[1] Much of statistical theory revolves around minimizing one
or both of these errors, though completely eliminating either is impossible unless the
outcome is determined by a known, observable causal process. Choosing the significance
threshold (the alpha level) sets the trade-off between the two error rates: lowering alpha
reduces Type I errors but makes Type II errors more likely.[2] Knowledge of Type I and
Type II errors is widely used in medical science, biometrics and computer science.
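The Type I error rate can be checked by simulation: when the null hypothesis is true, a test run at α = 0.05 should reject about 5% of the time. A sketch using NumPy and SciPy (assumed available):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha = 0.05
trials = 2000
rejections = 0

# H0 is true in every trial: the samples really do have mean zero,
# so every rejection below is a false positive (Type I error).
for _ in range(trials):
    sample = rng.normal(loc=0.0, scale=1.0, size=30)
    _, p = stats.ttest_1samp(sample, popmean=0.0)
    if p < alpha:
        rejections += 1

rate = rejections / trials
print(rate)   # close to 0.05
```

A Type II error would be the mirror image: generating samples with a real nonzero mean and counting how often the test fails to reject.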

6. What are the assumptions on which F-test is based?


Answer:
An F-test assumes that the data are normally distributed and that the samples are
independent of one another.
Explanation:
Data may depart from the normal distribution for a few reasons: the data could be skewed,
or the sample size could be too small to approximate a normal distribution. Whatever the
reason, the F-test assumes normality and will give inaccurate results if the data differ
significantly from that distribution.
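As an illustration of one common F-test, the test for equality of two variances, a sketch with made-up data using SciPy's F distribution (assumed available):

```python
from scipy.stats import f

# Two hypothetical independent samples, assumed drawn from normal populations.
a = [3, 4, 5, 6, 7]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9]

def sample_var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

va, vb = sample_var(a), sample_var(b)       # 2.5 and 7.5
# Put the larger variance in the numerator so F >= 1.
F = max(va, vb) / min(va, vb)               # F = 3.0
dfn = (len(b) if vb > va else len(a)) - 1   # numerator degrees of freedom
dfd = (len(a) if vb > va else len(b)) - 1   # denominator degrees of freedom
p = 2 * f.sf(F, dfn, dfd)                   # two-sided p-value
print(F, round(p, 3))
```

If either sample were heavily skewed, this p-value would no longer be trustworthy, which is the point of the normality assumption above.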

7. Define the statistic used in the U-test and give its mean
Answer:
In statistical theory, a U-statistic is a class of statistics that is especially important in
estimation theory; the letter "U" stands for unbiased. In elementary statistics, U-statistics
arise naturally in producing minimum-variance unbiased estimators.

The theory of U-statistics allows a minimum-variance unbiased estimator to be derived
from each unbiased estimator of an estimable parameter (alternatively, statistical
functional) for large classes of probability distributions.[1][2] An estimable parameter is
a measurable function of the population's cumulative probability distribution: for
example, for every probability distribution, the population median is an estimable
parameter. The theory of U-statistics applies to general classes of probability
distributions.

Many statistics originally derived for particular parametric families have been recognized
as U-statistics for general distributions. In non-parametric statistics, the theory of U-
statistics is used to establish the asymptotic normality and the finite-sample variance of
statistical procedures such as estimators and tests. In the Mann–Whitney U test, for
example, the statistic U has mean n1n2/2 under the null hypothesis, where n1 and n2 are
the two sample sizes.
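Assuming the question refers to the Mann–Whitney U test (the usual "U-test"), the statistic for sample 1 is U₁ = R₁ − n₁(n₁+1)/2, where R₁ is the rank sum of sample 1 in the pooled data, and its mean under H₀ is n₁n₂/2. A sketch with made-up, tie-free data in plain Python:

```python
# Mann-Whitney U statistic for two small samples (no ties assumed).
x = [1.2, 3.4, 5.6, 7.8]   # sample 1, n1 = 4
y = [2.1, 4.3, 6.5]        # sample 2, n2 = 3

pooled = sorted(x + y)
# 1-based rank of each observation in the pooled sample (data have no ties)
rank = {v: i + 1 for i, v in enumerate(pooled)}

n1, n2 = len(x), len(y)
R1 = sum(rank[v] for v in x)        # rank sum of sample 1
U1 = R1 - n1 * (n1 + 1) / 2         # U statistic for sample 1
mean_U = n1 * n2 / 2                # mean of U under H0
print(U1, mean_U)
```

A U far from its null mean n₁n₂/2 is evidence that one sample tends to produce larger values than the other.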

8. In 30 tosses of a coin the following sequence of heads (H) and tails (T) is obtained:
H T T H H H T H H T T H T H H T H H T T H T H H T H T
Define the number of runs.
Answer:
A run is a maximal stretch of identical outcomes: each time the sequence switches between
H and T, a new run begins. Here the number of runs is 18.
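Counting mechanically, a new run starts wherever a symbol differs from its predecessor. A sketch over the sequence as printed above:

```python
# Count runs: a run is a maximal block of identical symbols.
seq = "HTTHHHTHHTTHTHHTHHTTHTHHTHT"   # the sequence as printed above

runs = 1                              # the first symbol starts the first run
for prev, cur in zip(seq, seq[1:]):
    if cur != prev:                   # a change of symbol begins a new run
        runs += 1
print(runs)                           # 18
```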
9. Differentiate between correlation and regression.
Answer:

A correlation coefficient is applied to measure the degree of association between
variables; the most common is Pearson's correlation coefficient, named after its
originator. This method is used for linear association problems. Think of the word itself
as "co-relation": a connection between two variables.

Regression can be defined as a method for explaining the relationship between two
separate variables. It treats one variable as dependent: changes in one variable affect
the outcome of the other. In the simplest terms, regression helps identify how variables
affect each other.

Differences are:

• Regression describes the effect that a change in x has on y, and the equation for y on
x differs from that for x on y. With correlation, x and y can be interchanged and the
same result is obtained.
• Correlation is a single summary statistic, whereas regression produces an equation and
is represented by a fitted line.
• Correlation establishes and measures the relationship between two variables, while
regression, on the other hand, describes how one variable affects another.
• Regression models a cause-and-effect pattern when variables change; correlation only
records whether the variables move together, in the same or in opposite directions.
• In correlation, x and y can be interchanged; in regression they cannot.
• Prediction and optimization are possible with regression but not with correlation
analysis.
• Regression attempts to establish cause and effect; correlation does not.
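The link between the two measures can be seen numerically: the regression slope of y on x equals r·s_y/s_x, while r itself is symmetric in x and y. A sketch with made-up data using NumPy (assumed available):

```python
import numpy as np

# Hypothetical paired observations.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

r = np.corrcoef(x, y)[0, 1]                  # Pearson r (symmetric in x, y)
slope = r * y.std(ddof=1) / x.std(ddof=1)    # regression slope of y on x
intercept = y.mean() - slope * x.mean()      # line passes through the means

print(round(r, 4), round(slope, 4), round(intercept, 4))
```

Swapping x and y leaves r unchanged but gives a different slope (r·s_x/s_y), which is exactly the asymmetry the bullet points describe.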
