# Introduction to Statistics

Statistics, Data, & Statistical Thinking

Idham Fahumy

STA001, TT-term 2, 2012 10:30-12:30 13:00-15:00 16:30-18:30 18:30-20:30 STA001 (L), ACIM1,DIB1,DIB3adv,DIB1(E),Audi STA001(T), DIB1(E) B1-04

Sun STA001(T), ACIM1, STA001(T)DIB1(M)B1-04 DIB3-adv B1-04

Consultation STA001

Mon

Lecturer: Idham Fahumy
Phone: 3345 481
Email: idham.fahumy@mnu.edu.mv

STA001-Introduction to Statistics

Faculty of Management and Computing, MNU

Objectives
At the end of this topic, students will be able to:
• Present a broad overview of the subject of statistics and its applications
• Distinguish between descriptive and inferential statistics
• Discuss sources of data
• Discuss types of data

What is statistics?

Introduction
Definition of Statistics:
1. A collection of quantitative data pertaining to a subject or group. Examples are sales, income, and employment statistics.
2. The science that deals with the collection, tabulation, analysis, interpretation, and presentation of quantitative data.

WHAT IS MEANT BY STATISTICS?
• A branch of mathematics that provides techniques to analyze whether or not your data is significant (meaningful)
• Statistical applications are based on probability statements
• Nothing is “proved” with statistics
• Statistics are reported
• Statistics report the probability that similar results would occur if you repeated the experiment

For a layman, “Statistics” means numerical information expressed in quantitative terms. This information may relate to objects, subjects, activities, phenomena, or regions of space.

WHAT IS MEANT BY STATISTICS?
Data analysis:
1. Collecting data (e.g., survey)
2. Presenting data (e.g., charts and tables)
3. Characterizing data (e.g., average)

Why? Decision-making

Data are numerical facts and figures from which conclusions can be drawn. Such conclusions are important to the decision-making processes of many professions and organizations. For example:
• Government officials use conclusions drawn from data on unemployment and inflation to make policy decisions.
• Financial planners use recent trends in stock market prices to make investment decisions.
• Businesses decide which products to develop and market by using data that reveal consumer preferences.
• Production supervisors use manufacturing data to evaluate, control, and improve product quality.
• Politicians rely on data from public opinion polls to formulate legislation and to devise campaign strategies.
• Physicians and hospitals use data on the effectiveness of drugs and surgical procedures to provide patients with the best possible treatment.

Major characteristics of statistics
• Statistics are aggregates of facts. A single figure is not statistics. For example, the national income of a country for a single year is not statistics, but the same figure for two or more years is.
• Statistics are affected by a number of factors. For example, the sale of a product depends on factors such as its price, quality, competition, and the income of consumers.
• Statistics must be reasonably accurate. Wrong figures, if analysed, will lead to erroneous conclusions, so conclusions must be based on accurate figures.
• Statistics must be collected in a systematic manner for a predetermined purpose. Data collected in a haphazard manner will not be reliable and will lead to misleading conclusions.
• Statistics should be placed in relation to each other. Data that are unrelated to each other will be confusing and will not lead to any logical conclusions. Data should be comparable over time and over space.

Statistics deals with numbers
• Need to know the nature of the numbers collected
• Continuous variables: numbers associated with measuring or weighing; they can take any value in a continuous interval of measurement.
  Examples: weight of students, height of plants, time to flowering
• Discrete variables: numbers that are counted or categorical.
  Examples: numbers of boys, girls, insects, plants

Can you figure out…
Which type of numbers (discrete or continuous)?
• Numbers of persons preferring Brand X in 5 different islands
• The weights of high school seniors
• The lengths of banana leaves
• The number of seeds germinating
Answers: the 2nd and 3rd examples are continuous; the others are discrete.

Populations and Samples
• A population includes all members of a group.
  Example: all 12th grade students in CHSE
• A sample is a selection from the population, used to make inferences about large populations.
  Example: gift shops in Majeedhee Magu
• Why the need for statistics?
  Statistics are used to describe samples as estimators of the corresponding population. Finding complete information about a population is often costly and time consuming, so we can use samples to represent a population.

Sample Populations: Avoiding Bias
• Individuals in a sample population must be a fair representation of the entire population.
• Therefore sample members must be randomly selected (to avoid bias).
• Example: if you were looking at strength in students, picking students from the football team would NOT be random.

Is there bias?
• A cage has 1000 rats; you pick the first 20 you can catch for your experiment.
• A public opinion poll is conducted using the telephone directory.
• You are conducting a study of a new diabetes drug; you advertise for participants in the newspaper and on TV.
All are biased:
• Rats: you grab the slower rats.
• Telephone: you call only people with a phone (wealth?) and people who are listed (responsible?).
• Newspaper/TV: you reach only people with a newspaper (wealth/educated?) and a TV (wealth?).
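The rat example can be sketched as a small simulation. This is only an illustration: the speed values, their distribution, and the variable names are assumptions, not data from the course. It compares a sample made of the easiest rats to catch (the slowest) with a simple random sample of the same size.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: running speeds (arbitrary units) for 1000 rats.
speeds = [random.gauss(10.0, 2.0) for _ in range(1000)]

# Biased sample: the 20 slowest rats, i.e. the ones easiest to catch.
easiest_to_catch = sorted(speeds)[:20]

# Simple random sample: every rat has the same chance of being selected.
random_sample = random.sample(speeds, 20)

population_mean = statistics.mean(speeds)
biased_mean = statistics.mean(easiest_to_catch)
random_mean = statistics.mean(random_sample)

print(f"population mean speed: {population_mean:.2f}")
print(f"mean of 20 slowest:    {biased_mean:.2f}")
print(f"mean of random sample: {random_mean:.2f}")
```

Under these assumptions the slowest-rat sample should sit well below the population mean, while the random sample should land much closer to it.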

SOURCES OF STATISTICAL DATA
• Researching problems involving topics such as crime, health, imports and exports, production, and hourly wages generally requires published data. Statistics on these and thousands of other topics can be found in published articles, journals, magazines, and on the World Wide Web.
• Published data are not always available on a given subject. In such cases, information will have to be collected and analyzed. One way of collecting data is through questionnaires.

Primary and Secondary Data
• Secondary data: data that already exist in some form, published or unpublished, in an identifiable secondary source. They are generally available from published sources, though not necessarily in the form actually required.
• Primary data: data that do not already exist in any form and thus have to be collected for the first time from the primary source(s). By their very nature, these data require fresh, first-time collection covering the whole population or a sample drawn from it.
• What information in the NID application form can be used to generate some indicators?
• What information in the application form can be used to generate some indicators?

Two phases of statistics
• Descriptive Statistics: describes the characteristics of a product or process using information collected on it.
• Inferential Statistics: draws conclusions on unknown process parameters based on information contained in a sample. Uses probability.

Descriptive objectives/research questions → Descriptive statistics
Comparative objectives/hypotheses → Inferential statistics

Descriptive Statistics
• Involves: collecting data, presenting data, characterizing data
• Purpose: describe data
[Figure: chart of values ($, axis from 0 to 50) by quarter, Q1 to Q4, with summary statistics X̄ = 30.5 and S² = 113]
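As a rough sketch of "characterizing data", the mean and sample variance can be computed with Python's standard `statistics` module. The four quarterly values below are hypothetical: they were chosen only so that the results match the summary figures on the slide (mean 30.5, variance 113), and they are not the data behind the original chart.

```python
import statistics

# Hypothetical quarterly values (Q1 to Q4), chosen for illustration only.
quarterly_values = [19, 25, 35, 43]

sample_mean = statistics.mean(quarterly_values)          # the mean, X-bar
sample_variance = statistics.variance(quarterly_values)  # S^2, divides by n - 1

print(f"mean = {sample_mean}, sample variance = {sample_variance}")
```

Note that `statistics.variance` is the sample variance (divisor n - 1); `statistics.pvariance` would treat the four values as a whole population instead.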

Descriptive Statistics
EXAMPLE:
• A poll found that 49% of the people in a survey knew the name of the first president of the Maldives. The statistic 49 describes the number out of every 100 persons who knew the answer.

Inferential Statistics
• TV networks constantly monitor the popularity of their programs by hiring research firms and other organizations to sample the preferences of TV viewers.
• The accounting department of a large firm will select a sample of invoices to check the accuracy of all the company's invoices.
• Ice-cream tasters taste a few spoonfuls of ice cream to make a decision about all the ice cream waiting to be released for sale.

Inferential Statistics
• Involves: estimation, hypothesis testing
• Purpose: make decisions
[Figure: Population?]
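Estimation can be illustrated with the invoice example: a sample is checked and the result is used to say something about all invoices. The sketch below uses made-up audit numbers and the normal-approximation interval for a proportion, which is one common textbook method and not necessarily the one used later in this course.

```python
import math
from statistics import NormalDist

# Hypothetical audit: 200 invoices sampled, 12 found to contain an error.
sample_size = 200
errors_found = 12

p_hat = errors_found / sample_size  # point estimate of the error rate

# Normal-approximation 95% confidence interval for a proportion.
z = NormalDist().inv_cdf(0.975)  # about 1.96
standard_error = math.sqrt(p_hat * (1 - p_hat) / sample_size)
lower = p_hat - z * standard_error
upper = p_hat + z * standard_error

print(f"estimated error rate: {p_hat:.3f}")
print(f"approximate 95% interval: ({lower:.3f}, {upper:.3f})")
```

The interval is a probability statement about the whole set of invoices based on the sample alone, which is the sense in which inferential statistics "uses probability".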

TYPES OF VARIABLES
• Qualitative or attribute variable: the characteristic or variable being studied is categorical or non-proportional.
  EXAMPLES: gender (male, female), type of automobile owned, island of birth, eye color, etc.
• Quantitative variable: the variable can be reported numerically (non-categorical or proportional).
  EXAMPLES: balance in your checking account, salaries of faculty members, number of children in a family, etc.

TYPES OF VARIABLES (continued)
• Quantitative variables can be classified as either discrete or continuous.
• Discrete variables: can only assume certain values, and there are usually “gaps” between the values.
  EXAMPLE: the number of bedrooms in a house (1, 2, 3, ..., etc.).
• Continuous variables: can assume any value within a specific range.
  EXAMPLE: the time it took to fly from Male’ to Colombo (Sri Lanka).

SUMMARY OF TYPES OF VARIABLES
Data
• Qualitative or attribute
  Examples: type of car owned, color of pens
• Quantitative or numerical
  • Discrete. Example: number of children
  • Continuous. Example: time taken for an exam
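The classification above can be written down as a small lookup, which may help when tagging the columns of a data set. This is a minimal sketch: the `VariableType` enum and the `is_quantitative` helper are hypothetical names introduced here, and the examples are the ones used on the slides, labelled by hand.

```python
from enum import Enum


class VariableType(Enum):
    QUALITATIVE = "qualitative (attribute)"
    DISCRETE = "quantitative, discrete"
    CONTINUOUS = "quantitative, continuous"


# Examples taken from the slides, labelled by hand.
examples = {
    "type of car owned": VariableType.QUALITATIVE,
    "color of pens": VariableType.QUALITATIVE,
    "number of children in a family": VariableType.DISCRETE,
    "number of bedrooms in a house": VariableType.DISCRETE,
    "time taken for an exam": VariableType.CONTINUOUS,
    "flight time from Male' to Colombo": VariableType.CONTINUOUS,
}


def is_quantitative(variable_type: VariableType) -> bool:
    """Discrete and continuous variables are both quantitative."""
    return variable_type is not VariableType.QUALITATIVE


for name, variable_type in examples.items():
    print(f"{name}: {variable_type.value}")
```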

Example:
• Population statistics are usually collected and presented by social statisticians but are of enormous importance to businesses, for example to identify products suitable for age groups.
• National Income and Expenditure survey is not solely of interest of
• All published statistics are part of management information, generated as raw data and in treated form.
• We may have tables of output, sales, stock levels, etc.; charts of production, machine utilization, productivity, etc.; and ratios of stock turnover, gross profit, net profit, working capital, etc.
• Countless analyses will be made of products, customer trends, sales areas, sales periods, order size, distribution method, maintenance programs, vehicle usage, cash flow, etc.
• All such information requires us to be aware of statistical techniques, familiar with statistical jargon, and appreciative of its uses.

There are three major functions in any business enterprise in which statistical methods are useful.
• The planning of operations: This may relate to either special projects or to the recurring activities of a firm over a specified period.
• The setting up of standards: This may relate to the size of employment, volume of sales, fixation of quality norms for the manufactured product, norms for the daily output, and so forth.
• The function of control: This involves comparison of actual production achieved against the norm or target set earlier. In case the production has fallen short of the target, it suggests remedial measures so that such a deficiency does not occur again.

Precision and Accuracy
Precision
• The precision of a measurement is determined by how reproducible that measurement value is.
• For example, suppose a sample is weighed by a student to be 42.58 g, and then measured by another student five different times with the resulting data: 42.09 g, 42.15 g, 42.1 g, 42.16 g, 42.12 g.
• Then the original measurement is not very precise, since it cannot be reproduced.

Precision and Accuracy
Accuracy
• The accuracy of a measurement is determined by how close a measured value is to its “true” value.
• For example, suppose a sample is known to weigh 3.182 g and is then weighed five different times by a student with the resulting data: 3.200 g, 3.180 g, 3.152 g, 3.168 g, 3.189 g.
• The most accurate measurement would be 3.180 g, because it is closest to the true “weight” of the sample.
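Both examples can be checked with a few lines of Python. The numbers are the ones on the slides; using the standard deviation of the repeat readings as a measure of spread is an illustrative choice made here, not something the slides specify.

```python
import statistics

# Accuracy example from the slide: known weight and five weighings (grams).
true_weight = 3.182
weighings = [3.200, 3.180, 3.152, 3.168, 3.189]

# The most accurate reading is the one closest to the true value.
most_accurate = min(weighings, key=lambda w: abs(w - true_weight))

# Precision example from the slide: one reading versus five repeats (grams).
original_reading = 42.58
repeat_readings = [42.09, 42.15, 42.1, 42.16, 42.12]

repeat_mean = statistics.mean(repeat_readings)
repeat_spread = statistics.stdev(repeat_readings)  # small: the repeats agree
gap = abs(original_reading - repeat_mean)          # large: not reproduced

print(f"most accurate reading: {most_accurate} g")
print(f"repeat mean {repeat_mean:.3f} g, spread {repeat_spread:.3f} g, gap {gap:.3f} g")
```

The repeat readings agree with each other to within a few hundredths of a gram, while the original reading is almost half a gram away from their mean, which is why it is described as not reproducible.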

Application Areas
• Economics: forecasting, demographics
• Engineering: construction, materials
• Sports: individual & team performance
• Consumer preferences, financial trends

LIMITATIONS OF STATISTICS
Statistics has a number of limitations. The pertinent ones are as follows:
• There are certain phenomena or concepts where statistics cannot be used, because they are not amenable to measurement.
  For example, beauty, intelligence, and courage cannot be quantified. Statistics has no place in cases where quantification is not possible.
• Statistics reveal the average behaviour, the normal or the general trend. Applying the 'average' concept to an individual or a particular situation may lead to a wrong conclusion, and sometimes may be disastrous.
  For example, one may be misguided when told that the average household income is Rf 5000, but there may be households with 10000-20000 income while few

LIMITATIONS OF STATISTICS
• Since statistics are collected for a particular purpose, such data may not be relevant or useful in other situations or cases.
  For example, secondary data (i.e., data originally collected by someone else) may not be useful for another person.
• Statistics are not 100 per cent precise, as Mathematics or Accountancy are. Those who use statistics should be aware of this limitation.
• In statistical surveys, sampling is generally used as it is not physically possible to cover all the units or elements comprising the universe. The results may not be appropriate as far as the universe is concerned. Moreover, different surveys based on the same size of sample but

LIMITATIONS OF STATISTICS
• At times, the association or relationship between two or more variables is studied in statistics, but such a relationship does not indicate a 'cause and effect' relationship. It simply shows the similarity or dissimilarity in the movement of the two variables. In such cases, it is the user who has to interpret the results carefully, pointing out the type of relationship obtained.
• A major limitation of statistics is that it does not reveal everything pertaining to a certain phenomenon. The user of statistics has to be well informed and should interpret statistics keeping in mind all other aspects relevant to the given problem.

Misuses of Statistics
• Sources of data not given: In the absence of the source, the reader does not know how far the data are reliable. Further, a reader who wants to refer to the original source is unable to do so.
• Defective data: Defective data may be used knowingly in order to defend one's position or to prove a particular point. For example, in data relating to unemployed persons, the definition may include even those who are employed, though partially. The question here is how far it is justified to include partially employed persons amongst the unemployed.

Misuses of Statistics
• Unrepresentative sample: In conducting surveys we need to choose a sample from the given population or universe.
  • The sample may turn out to be unrepresentative of the universe.
  • One may choose a sample just on the basis of convenience.
• Inadequate sample: Earlier, we have seen that a sample that is unrepresentative of the universe is a major misuse of statistics. This apart, at times one may conduct a survey based on an extremely inadequate sample. For example, in a city we may find that there are 100,000 households. When we have to conduct a household survey, we may take a sample of merely 100 households, comprising only 0.1 per cent of the universe.

Misuses of Statistics
Unfair Comparisons
• An important misuse of statistics is making unfair comparisons from the data collected.
• For instance, one may construct an index of production choosing a base year in which production was much lower, and then compare subsequent years' production against this low base. Such a comparison will undoubtedly give a rosy picture of production, though in reality it is not so.
• Another source of unfair comparisons is making absolute comparisons instead of relative ones. An absolute comparison of two figures, say, of production or export, may show a good increase, but in relative terms it may turn out to be negligible.
• Another example of unfair comparison is when the populations of two cities are different, but a comparison of overall death rates and deaths by a particular disease is attempted. Such a comparison is wrong. Likewise, when data are not properly classified, or when changes in the composition of the population between two years are not taken into consideration, comparisons of such data would be unfair as they would lead to misleading conclusions.
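The difference between absolute and relative comparisons is easy to see with a short calculation. The production figures and the `percent_change` helper below are hypothetical and serve only to illustrate the point.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, as a percentage of old."""
    return (new - old) / old * 100.0


# Hypothetical production figures (units), for illustration only.
small_plant = (1_000, 1_500)          # absolute increase of 500
large_plant = (1_000_000, 1_005_000)  # absolute increase of 5,000

for name, (before, after) in [("small plant", small_plant), ("large plant", large_plant)]:
    absolute = after - before
    relative = percent_change(before, after)
    print(f"{name}: +{absolute} units, {relative:.1f}% change")
```

In this sketch the large plant shows the bigger absolute increase, yet its relative change is a hundred times smaller than the small plant's, so quoting only the absolute figure would give a misleading impression.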

Misuses of Statistics
• Unwarranted conclusions: Another misuse of statistics is drawing unwarranted conclusions. This may be a result of making false assumptions.
  • For example, while making projections of population for the next five years, one may assume a lower rate of growth though the past two years indicate otherwise.
  • Sometimes one may not be sure about changes in the business environment in the near future. In such a case, one may use an assumption that turns out to be wrong.
  • Another source of unwarranted conclusions may be the use of the wrong average. Suppose a series contains extreme values, one too high and the other too low, such as 800 and 50. The arithmetic average may give a wrong idea in such a case; the median or harmonic mean would be more appropriate.
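The effect of extreme values on different averages can be sketched as follows. Only the extremes 800 and 50 come from the slide; the middle values are assumptions added to make a complete series.

```python
import statistics

# Hypothetical series; only the extremes 800 and 50 come from the slide.
series = [50, 200, 210, 220, 800]

arithmetic_mean = statistics.mean(series)
median = statistics.median(series)
harmonic_mean = statistics.harmonic_mean(series)

print(f"arithmetic mean: {arithmetic_mean}")
print(f"median:          {median}")
print(f"harmonic mean:   {harmonic_mean:.1f}")
```

With these values the arithmetic mean is pulled upward by 800 and ends up above every one of the three typical values, while the median stays in the middle of them. The harmonic mean is pulled toward the low value 50 instead, so which average is "proper" still depends on the data and the question being asked.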

Misuses of Statistics
Confusion of correlation and causation
• In statistics, one often has to examine the relationship between two variables.
• A close relationship between two variables may not establish a cause-and-effect relationship, in the sense that one variable is the cause and the other is the effect. It should be taken as a measure of the degree of association rather than as evidence of a causal relationship.
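A correlation coefficient measures only the degree of association. The sketch below computes Pearson's r by hand for two made-up monthly series; the variable names and figures are hypothetical, chosen as a case where both series could plausibly rise with hot weather rather than one causing the other.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(variation_x * variation_y)


# Hypothetical monthly figures, for illustration only.
ice_cream_sales = [20, 25, 30, 35, 40]
sunburn_cases = [2, 3, 3, 5, 5]

r = pearson_r(ice_cream_sales, sunburn_cases)
print(f"r = {r:.2f}")
```

The coefficient here is close to 1, yet nothing in the calculation says that ice-cream sales cause sunburn or the reverse; interpreting the relationship is left to the user.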

Statistical Computer Packages
• Typical software: SAS, SPSS, MINITAB, Excel
• Need statistical understanding: assumptions, limitations

End of Topic