
Introduction to Statistics and Data Analysis


Module 1: Data Analytics
Week 1
Objectives of the Lesson

 Give an overview of statistics and probability.
 Differentiate the two branches of statistics.
 Demonstrate knowledge of statistical terms.
 Identify types of data, sampling techniques, variables, and the level of measurement of
each variable.
 Explain the difference between an observational study and an experimental study.
 Explain how statistics can be used and misused.
 Explain the importance of computers and calculators in statistics.
Overview: Statistical Inference, Samples,
Populations, and the Role of Probability

 Beginning in the 1980s and continuing into the twenty-first century, an inordinate
amount of attention has been focused on the improvement of quality in American
industry.
 Japanese "industrial miracle," which began in the middle of the twentieth century
(the Japanese succeeded in creating an atmosphere that allows the production of
high-quality products).
 Much of the success of the Japanese has been attributed to the use of statistical
methods and statistical thinking among management personnel.
Use of Scientific Data

 Quality may well be defined in relation to closeness to a target density value in
harmony with what portion of the time this closeness criterion is met.
 Sources of Variation
 True Variation (Natural Variation)
 …….. Variation
 It is the natural variation from study to study that must be taken into account in the
decision process.

 The use of statistical methods in manufacturing, development of food products, computer
software, pharmaceuticals, and many other areas involves the gathering of information or
scientific data.
 There is a profound distinction between collection of scientific information and inferential
statistics.
 Statistical methods are designed to contribute to the process of making scientific
judgments in the face of uncertainty and variation.
 Statistical methods are used to analyze data from a process in order to gain
a better sense of where in the process changes may be made to improve the quality of
the process.
The Role of Probability

 Concepts in probability form a major component that supplements statistical
methods and helps gauge the strength of the statistical inference.
 The discipline of probability, then, provides the transition between descriptive
statistics and inferential methods.
 Elements of probability allow the conclusion to be put into the language that
science or engineering practitioners require.
How Do Probability and Statistical Inference Work Together?

 Inductive Reasoning. The sample along with inferential statistics allows us to draw conclusions
about the population, with inferential statistics making clear use of elements of probability.

 Deductive Reasoning. Problems in probability allow us to draw conclusions about characteristics of
hypothetical data taken from the population based on known features of the population.
Sampling Procedures and Collection of Data
Data Collection and Sampling Techniques

 Surveys are the most common method of collecting data. Methods of
surveying are:
 Telephone surveys
 Mailed questionnaire surveys
 Personal interviews
 Online surveys through different social media platforms (recent trend)
Sampling Methods

Take Note!!!
 Investigating the whole population is often impossible due to expense, time, or the size of
the population.
 Using samples saves time and money and, in some cases, enables the researcher to get more
detailed information about a particular subject.
 Samples cannot be selected in haphazard ways because the information obtained may be biased.
 To obtain an unbiased sample, give each subject in the population an
equally likely chance of being selected.

1. Random Sampling. Samples are collected using chance methods or
random methods.

Example:
Number each subject in the population. Then place numbered cards in a bowl, mix
them thoroughly, and select as many cards as needed. The subjects whose numbers are
selected constitute the sample.
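
The bowl-of-cards procedure can be sketched in Python as follows. This is only an illustration: the population size of 20, the sample size of 5, the function name, and the fixed seed are all assumptions made here so the draw is repeatable.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct subject numbers from 1..population_size."""
    rng = random.Random(seed)                  # seed is only for repeatability
    subjects = range(1, population_size + 1)   # number each subject
    # sample() draws without replacement, like pulling cards from the bowl
    return sorted(rng.sample(subjects, sample_size))

# Hypothetical population of 20 subjects, sample of 5
sample = simple_random_sample(population_size=20, sample_size=5, seed=1)
print(sample)
```

Every subject number has the same chance of being drawn, which is the property the card-mixing step is meant to guarantee.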

2. Systematic Sampling. Samples are collected by numbering each subject of
the population and then selecting every kth number.

Example:
Suppose there were 2000 subjects in the population and a sample of 50 subjects was needed.
Since 2000/50 = 40, then k = 40, and every 40th subject would be selected; however, the first
subject (numbered between 1 and 40) would be selected at random. Suppose subject 10 is selected; then the
sample would consist of the subjects whose numbers were 10, 50, 90, 130, etc. until 50 subjects
are collected.
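
The systematic sampling example can be reproduced with a short Python sketch. The function name and its parameters are illustrative assumptions; the numbers (2000 subjects, a sample of 50, starting subject 10) come from the example.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every kth subject, beginning from a start between 1 and k."""
    k = population_size // sample_size          # sampling interval, 2000 // 50 = 40
    if start is None:
        # The first subject is chosen at random from 1..k
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(sample_size)]

# Reproduce the example: the random start happened to be subject 10
sample = systematic_sample(2000, 50, start=10)
print(sample[:4], sample[-1])   # [10, 50, 90, 130] 1970
```

Only the first subject is random here; every later subject is fixed by the interval k.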

3. Stratified Sampling. Samples are collected by dividing the
population into groups according to some characteristic that is
important to the study, then sampling from each group.

Class    # of Students
A        53
B        45
C        37
D        50
E        27

Perform stratified random sampling with n = 30. (Use 4 decimal places.)
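
One possible way to work the stratified sampling exercise is proportional allocation: each class contributes to the sample in proportion to its size. The sketch below is an assumption about the intended method, not the slide's stated solution. In particular, the rule for turning fractional allocations into whole students (largest remainder) is chosen here only so that the counts add up to n = 30.

```python
import math

class_sizes = {"A": 53, "B": 45, "C": 37, "D": 50, "E": 27}
n = 30  # total sample size from the exercise

def proportional_allocation(strata, n):
    """Return exact and whole-number sample sizes per stratum."""
    total = sum(strata.values())
    exact = {name: n * size / total for name, size in strata.items()}
    alloc = {name: math.floor(x) for name, x in exact.items()}
    leftover = n - sum(alloc.values())
    # Assumed rounding rule: give remaining slots to the largest remainders
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return exact, alloc

exact, alloc = proportional_allocation(class_sizes, n)
total = sum(class_sizes.values())
for name, size in class_sizes.items():
    # class, proportion (4 d.p.), exact allocation (4 d.p.), whole-number allocation
    print(name, round(size / total, 4), round(exact[name], 4), alloc[name])
```

With these class sizes the allocation comes to 8, 6, 5, 7, and 4 students for classes A to E. A simple random sample of that size would then be drawn within each class.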

4. Cluster Sampling. Samples are selected using intact groups called
clusters.

Example:
Suppose a researcher wishes to survey apartment dwellers in a large city. If there are 10
apartment buildings in the city, the researcher can randomly select 2 of the 10 buildings
and interview all the residents of these buildings. Cluster sampling is used when the
population is large or when it involves subjects residing in a large geographic area.
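
The apartment example can be sketched in Python as below. The building names, the three residents per building, and the function name are hypothetical placeholders; only the idea of picking 2 of 10 buildings and taking everyone in them comes from the example.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly pick whole clusters and return every member of each."""
    rng = random.Random(seed)                     # seed only for repeatability
    chosen = rng.sample(sorted(clusters), num_clusters)
    # Interview all residents of the chosen buildings, not a subset
    members = [person for name in chosen for person in clusters[name]]
    return chosen, members

# Hypothetical city: 10 buildings with 3 residents each
buildings = {
    f"Building {b}": [f"B{b}-R{r}" for r in range(1, 4)]
    for b in range(1, 11)
}
chosen, residents = cluster_sample(buildings, num_clusters=2, seed=0)
print(chosen)
print(residents)
```

The randomness applies to the choice of buildings, not to individual residents, which is what distinguishes this from simple random sampling.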

5. Convenience Sampling. The researcher uses subjects that are convenient.

Example:
The researcher may interview subjects entering a local mall to determine the nature of
their visit or perhaps what stores they will be patronizing. This sample is probably not
representative of customers in general for several reasons.
Nature of Statistics and Probability
Important Terminologies

Statistics is the science of conducting studies to:
1. Collect
2. Organize
3. Summarize
4. Analyze
5. Draw conclusions from data.

Descriptive Statistics consist of the
 collection
 organization
 summarization and
 presentation of data.

Inferential Statistics consist of
 generalizing from samples to population
 performing estimations
 hypothesis testing
 determining relationships among variables
 and making predictions.
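
The difference between the two branches can be illustrated with a small Python sketch. The ten scores are made-up data, and the interval uses a normal approximation chosen here for simplicity. For a sample this small a t-based interval would usually be preferred, so treat the numbers as illustrative only.

```python
import math
import statistics

# Hypothetical sample of 10 exam scores drawn from a larger population
scores = [72, 85, 90, 66, 78, 88, 95, 70, 82, 74]

# Descriptive statistics: summarize the sample itself
sample_mean = statistics.mean(scores)
sample_sd = statistics.stdev(scores)

# Inferential statistics: estimate the unknown population mean
# (assumption: approximate 95% interval using the normal distribution)
z = statistics.NormalDist().inv_cdf(0.975)
margin = z * sample_sd / math.sqrt(len(scores))
interval = (sample_mean - margin, sample_mean + margin)

print("sample mean:", sample_mean)
print("sample standard deviation:", round(sample_sd, 4))
print("approximate 95% interval:", tuple(round(x, 2) for x in interval))
```

The mean and standard deviation describe only the ten observed scores, while the interval is a statement about the population they were drawn from.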

 Probability is the chance of an event occurring.
 A population consists of all subjects that are being studied.
 A sample is a group of subjects selected from a population.
Variables and Data

In order to gain knowledge about seemingly haphazard events, statisticians
collect information for variables that describe the events.

 A variable is a characteristic or attribute that can assume different values.

 Data are the values that variables can assume.

X = (150 degrees Celsius, 20 students, 4.5 cars/week)



 A data set is a collection of data values.
 Each value in the data set is called a data value or a datum.
 Random variables have values that are determined by chance.
Variables and Types of Data

 Types of Variables:

 Qualitative variables can be placed into distinct categories according
to some characteristic or attribute (ex. colors of cars, gender, religious
preference, ethnic group, nationality).
 Quantitative variables are numerical in nature and can be ordered or
ranked (ex. number of pages in a book, capacity of students in a
classroom, weights of fish caught).

 Types of Quantitative Variables:

 Discrete variables assume values that can be counted (ex. students in
a class, books in the library, bottles in a case).
 Continuous variables can assume all values between any two specific
values and are measurable (ex. length, temperature, mass, height).
Measurement Scales (Level)

 Levels of Measurement:
1. Nominal
2. Ordinal
3. Interval
4. Ratio

 Nominal Scale classifies data into mutually exclusive (nonoverlapping),
exhaustive categories in which no order or ranking can be imposed on
the data.
 Example: Zip Code, Gender, Eye Color, Political Affiliation, Religious
Affiliation, Nationality, Study Field

 Ordinal Scale classifies data into categories that can be ranked;
however, precise differences between the ranks do not exist.

 Example: Grade (A, B, C, D, F), Judging (1st, 2nd, 3rd place, etc.), Rating
Scale (poor, good, excellent)

 Interval Scale ranks data, and precise differences between units of
measure do exist; however, there is no meaningful zero.

 Example: Intelligence Quotient (IQ), Temperature (in degrees Celsius or Fahrenheit)



 Ratio Scale possesses all the characteristics of interval
measurement, and there exists a true zero.

 Example: Height, Weight, Time, Salary, Age
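
The practical difference between the interval and ratio levels can be seen in a short Python sketch. The two temperatures are arbitrary values chosen for illustration. The Kelvin scale is used as the ratio-level counterpart because its zero is a true zero.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius (interval scale) to kelvins (ratio scale)."""
    return celsius + 273.15

a_c, b_c = 10.0, 20.0   # hypothetical readings in degrees Celsius
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are meaningful on both scales and agree
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios are only meaningful where zero is a true zero
ratio_c = b_c / a_c     # 2.0, but 20 C is not "twice as hot" as 10 C
ratio_k = b_k / a_k     # about 1.035, the physically meaningful ratio

print(diff_c, round(diff_k, 2))
print(ratio_c, round(ratio_k, 4))
```

The ratio changes when the zero point moves, so statements such as "twice as much" are safe for ratio-level variables like height or salary but not for interval-level ones.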


Observational and Experimental Studies
Uses and Misuses of Statistics

 Suspect samples
 Very small samples
 Biased sample selection
 Volunteer samples
 Ambiguous averages (mean, median, mode)
 Changing the subject (10% of 1000 instead of 2000)
 Faulty survey questionnaires
 Misleading graphs
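
The "ambiguous averages" item can be illustrated with a small Python sketch. The salary figures are invented for illustration. The point is only that mean, median, and mode can all be reported as "the average" while differing widely.

```python
import statistics

# Hypothetical annual salaries at a small firm, with one very high earner
salaries = [30000, 30000, 35000, 40000, 45000, 50000, 200000]

mean_salary = statistics.mean(salaries)      # pulled upward by the outlier
median_salary = statistics.median(salaries)  # middle value when sorted
mode_salary = statistics.mode(salaries)      # most frequent value

print("mean:", round(mean_salary, 2))
print("median:", median_salary)
print("mode:", mode_salary)
```

Here the mean is roughly twice the mode, so a report that quotes "the average salary" without naming the measure can leave very different impressions.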
Technology for Statistical Calculations

With the advent of calculators and statistical software (Minitab,
Statistica, SPSS, MS Excel, etc.), numerical computations are a lot
easier.
