Jan 21, 2019

Math III - Statistics

© All Rights Reserved

MATHEMATICS III: REGULARITY AND REPETITION

TOPIC 11: DATA COLLECTION

MODULE 3. PROBABILITY AND STATISTICS

INTRODUCTION

In Mexico, in 1983, the National Institute of Statistics, Geography and Informatics (INEGI) was

instituted by presidential decree. Its creation brought the modernization of the rich tradition our country had for

gathering, processing and disseminating information about the nation, the population and the economy.

“If there is no data, it is simply not possible to work with the tools Statistics provides” (Gutiérrez, 2010).

11.1 BASIC CONCEPTS OF STATISTICS

Statistics is the branch of mathematics devoted to the compilation, organization, presentation and analysis of data, as well as their interpretation. It is divided into two branches: descriptive statistics and inferential statistics.

Descriptive statistics is the part that collects (through polls), presents and describes (using tables or graphs) data from a sample.

Inferential statistics is the part that interprets the resulting values, either through estimates or predictions, in order to assist in decision making.

[Diagram: concepts of statistics: population, sample, variable (qualitative and quantitative), data, experiment, parameter, statistic]

CONCEPTS

Population

It is the total of the elements of the study; it is also known as the universe.

Sample

It is the subset or representative part of the population; it is selected by different methods, which are called sampling methods.

CONCEPTS

Variable

It is the characteristic or attribute the sample or

population can have. There are qualitative and

quantitative variables.

Qualitative variable

It classifies or describes an element of the population; qualitative variables can be nominal or ordinal.

Quantitative variable

It numerically describes an element of the population; quantitative variables are classified as discrete or continuous.

CONCEPTS

Data

It is the value compiled for the variable of each

element of the population or sample.

Experiment

It is the operation of observing results and through

which a set of data is obtained.

Parameter

It is the numerical value that describes the data of a

population.

Statistic

It is the numerical value that describes the data of a

sample.

11.2 SAMPLING METHODS

Sampling is the set of techniques used to select the "best" possible sample (the one considered to best represent the population).

To carry out sampling it is necessary to identify appropriate sources (called "frames"), because if these are inadequate the sample will be too, and our estimates or predictions will be wrong.

Non-probability methods: the elements are taken without regard to probability; these methods are sometimes used even though they are not good for generalizing estimates, since there is no certainty that the sample is representative.

Probability methods: statistical work precedes the selection of the elements, as these are chosen according to a probability of occurrence.

PROBABILITY METHODS: SIMPLE RANDOM SAMPLING

It is the simplest and best-known selection method, and even though it is sometimes difficult to use because it requires a sampling frame, it is the basis of other sampling techniques. Every element of the population has the same probability of being selected as any other.
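As a minimal sketch of simple random sampling, the snippet below draws 20 elements from a sampling frame using Python's standard `random` module. The frame of 550 enrollment numbers and the seed value are assumptions made only for illustration.

```python
import random

# Hypothetical sampling frame: enrollment numbers 1..550 (illustrative only).
frame = list(range(1, 551))

rng = random.Random(2019)        # fixed seed so the draw can be repeated
sample = rng.sample(frame, 20)   # 20 distinct elements, chosen without replacement

print(sorted(sample))
```

Because `sample` draws without replacement, no element can appear twice, and every element of the frame has the same chance of being chosen.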

PROBABILITY METHODS: SYSTEMATIC RANDOM SAMPLING

In this method, a first element is selected and then the rest are selected at regular intervals until the desired sample size is reached.

To obtain the interval, divide the population size (P) by the number of desired elements (n); the result (k) is rounded to the nearest integer. The first element is selected randomly and then every k-th element after it is selected.

This method is NOT convenient to use when there is a pattern in the arrangement of the data. (For example, in a

list of students, as these can be arranged alphabetically or by enrollment number either in ascending or

descending form).
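The procedure can be sketched in a few lines of Python. The function name `systematic_sample`, the population of 100 elements and the seed are hypothetical. The interval k is computed here with floor division rather than rounding to the nearest integer, an assumption made so that the selection never runs past the end of the list.

```python
import random

def systematic_sample(frame, n, rng=random):
    """Pick a random start among the first k elements, then take every k-th element."""
    k = len(frame) // n        # sampling interval k = P / n (floored; an assumption)
    start = rng.randrange(k)   # random index between 0 and k - 1
    return [frame[start + i * k] for i in range(n)]

frame = list(range(1, 101))    # hypothetical population of 100 elements
sample = systematic_sample(frame, 10, random.Random(11))
print(sample)
```

With P = 100 and n = 10 the interval is k = 10, so consecutive selected elements are always 10 positions apart.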

PROBABILITY METHODS: STRATIFIED RANDOM SAMPLING

In this technique the population is divided into groups (called strata) formed according to certain characteristics.

This method requires auxiliary information from the sampling frame, because the elements within each stratum should be homogeneous, while the strata should be heterogeneous with respect to one another.
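One possible way to draw a stratified sample is sketched below. The text does not say how many elements to take from each stratum, so proportional allocation (each stratum contributes in proportion to its size) is an assumption here, as are the function name `stratified_sample` and the three strata of students.

```python
import random

def stratified_sample(strata, n, rng):
    """Draw a simple random sample from each stratum, sized in proportion to the stratum."""
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        size = round(n * len(members) / total)   # proportional allocation (an assumption)
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical strata: students grouped by school year.
strata = {
    "first year": [f"A{i}" for i in range(60)],
    "second year": [f"B{i}" for i in range(30)],
    "third year": [f"C{i}" for i in range(10)],
}
chosen = stratified_sample(strata, 20, random.Random(3))
print({name: len(members) for name, members in chosen.items()})
# {'first year': 12, 'second year': 6, 'third year': 2}
```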

PROBABILITY METHODS: CLUSTER SAMPLING

It consists of dividing a population into smaller groups (or clusters), for example the inhabitants of a particular city, the students of a school or even the items in a box of a certain product; elements are then selected at random until the sample is complete.

CONCLUSION

There is no better way to do a study than with 100% of the items, but sometimes this is difficult and also very

expensive, which is why Statistics uses sampling techniques that allow us to save resources as well as to obtain

reliable results.

Various media such as television news or even print media (newspapers, magazines) mention Statistics to show

the opinion of people. Because not everyone is familiar with the basic concepts, many years ago the use of

graphics was implemented, so that we could better understand the data they show.

One of the novel aspects was the use of statistical diagrams, which were intended to help readers understand the message.

QUIZ #11

1. The branch of mathematics devoted to the compilation, organization, presentation and analysis of data, as well as their interpretation, is called…

2. Statistics is divided into 2 branches, which are:

3. The branch of statistics that interprets the values in order to assist in decision making is called…

4. The branch of statistics that collects information and describes the data from a sample is called…

5. Explain the difference between population and sample.

6. Explain the difference between parameter and statistic.

7. The variable that classifies or describes an element is called…

8. The variable that numerically describes an element is called…

9. It is a characteristic the sample or population can have…

10. It is the value compiled for the variable of each element…

11. It is the operation of observing results, through which a set of data is obtained.

12. The 2 types of sampling are:

13. Using the systematic random sampling formula: if you have a population of 550 students and you only want 20 students as your sample, what is the resulting number?

14. Give an example of ordinal, nominal, continuous and discrete variables (different from your exercise in class).

15. Briefly explain stratified random sampling.

TOPIC 12: MEASURES OF CENTRAL TENDENCY

MODULE 3. PROBABILITY AND STATISTICS

INTRODUCTION

Measures of central tendency indicate the center of a dataset that represents an extract of a sample or a population.

The measures of central tendency are the mean, the median and the mode, and they serve as a representative value for a dataset.

12.1 MEAN, MEDIAN AND MODE

Measures of central tendency are calculated for 2 kinds of data: ungrouped data and grouped data.

Ungrouped data: refers to data that have not been summarized in any way.

The measures of central tendency, in ungrouped data are: mean, median, and mode.

Grouped data: refers to data when we have them divided in classes and we only have the frequency of each one of

them, that is, when we have a frequency table. When there is a frequency table, the values obtained for the

measures of central tendency will be approximate.

The measures of central tendency in grouped data sets are: approximate mean, approximate median, and

approximate mode.

MEAN (UNGROUPED DATA)

It is the most used measure of central tendency, and it is also known as the arithmetic mean or simply the average. This measure is very simple to obtain: add up all the data and then divide the sum by the total number of data.
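As a small illustration of the calculation, the sketch below computes the mean by hand and compares it with the `mean` function from Python's standard `statistics` module. The five values are made up for the example.

```python
from statistics import mean

data = [7, 9, 8, 10, 6]           # illustrative values, not taken from the slides

by_hand = sum(data) / len(data)   # add every value, then divide by how many there are
print(by_hand)                    # 8.0
print(mean(data))                 # same value, from the standard library
```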

EXAMPLES

MEDIAN (UNGROUPED DATA)

It is the numerical value found at the middle of the data, once these have been arranged in ascending order, which

is the same as arranging them from smallest to largest.

There are two cases we must take into account:

If the number of data (n) is odd, the median corresponds to the number in the middle.

If the number of data (n) is even, the median will correspond to the average obtained between the 2 central numbers.

EXAMPLES

Find the median among the following values: 6, 5, 10, 8, 10, 4, 7, 6, 9, 11.
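The two cases can be sketched as a short Python function and applied to the values of this example; the helper name `median` is chosen here only for illustration.

```python
def median(values):
    """Middle value of the sorted data, or the average of the 2 central values when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

data = [6, 5, 10, 8, 10, 4, 7, 6, 9, 11]
print(sorted(data))   # [4, 5, 6, 6, 7, 8, 9, 10, 10, 11]
print(median(data))   # 7.5
```

Here n = 10 is even, so the median is the average of the 5th and 6th ordered values, 7 and 8.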

MODE (UNGROUPED DATA)

This is the value that appears most often in a data list. When there are two modes, the data set is called bimodal; when there are more than two modes, it is called multimodal. Depending on the case study, the mode might NOT exist.

Four possibilities:

Unimodal: 1 mode

Bimodal: 2 modes

Multimodal: 3 or more modes

No mode

EXAMPLES

Find the mode through the following information:

In this set of data: 1, 3, 5, 7, 9, 11, 13, 15.
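A possible way to check the four cases in Python is sketched below. The helper `modes` is hypothetical, and it treats a list in which no value repeats as having no mode, which is the convention assumed here.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest count, or [] when no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []   # every value appears once: treated as "no mode"
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([1, 3, 5, 7, 9, 11, 13, 15]))         # []
print(modes([6, 5, 10, 8, 10, 4, 7, 6, 9, 11]))   # [6, 10]
```

The first list is the one from this example and has no mode; the second list, added for contrast, is bimodal because 6 and 10 each appear twice.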

APPROXIMATE MEAN (GROUPED DATA)

It is obtained by adding up the products of each frequency (f) and its class mark (X), and then dividing the sum by the total number of data (n).

Approximate mean = $\frac{\sum f \cdot X}{n}$

APPROXIMATE MEDIAN (GROUPED DATA)

Just like in ungrouped data, it is obtained in a different way depending on whether the total number of data (n) is even or odd.

When the total number of data (n) is odd: the approximate median corresponds to the value found in the median place = $\frac{n+1}{2}$

When the total number of data (n) is even: the approximate median corresponds to the average of the values found in the median places = $\frac{n}{2}$ and $\frac{n}{2}+1$

APPROXIMATE MODE (GROUPED DATA)

It corresponds to the value of the class mark with the highest frequency.

In case there are two modes, it is called a bimodal distribution, when the number of modes is higher than two,

it is called multimodal.

Four possibilities:

Unimodal: 1 mode

Bimodal: 2 modes

Multimodal: 3 or more modes

No mode

MEASURES OF VARIABILITY OR DISPERSION (GROUPED DATA)

Variance: It is the expectation of the squared deviation of a random variable from its mean.

Standard deviation: It is a measure that is used to quantify the amount of variation or dispersion of a set of data values.

EXAMPLE

Calculate the approximate variance and the standard deviation from the following group of ordered pairs.

Xi    f    Cumulative f    f ∙ X       f ∙ X²
28    3    3               84
30    4    7               120
32    2    9               64
34    5    14              170
36    10   24              360
38    7    31              266
40    6    37              240
      n = 37               ∑ = 1304    ∑ =

Approx. mean =

Approx. median =

Approx. mode =

Standard deviation =

EXAMPLE

Calculate the approximate variance and the standard deviation from the following group of ordered pairs.

Xi    f    Cumulative f    f ∙ X     f ∙ X²
1     2    2               2         2
2     3    5               6         12
3     3    8               9         27
4     6    14              24        96
5     5    19              25        125
6     4    23              24        144
7     1    24              7         49
      n = 24               ∑ = 97    ∑ = 455

Approx. mean =

Approx. median =

Approx. mode =

Standard deviation =
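The table above can be checked with a short Python sketch. Two assumptions are made: the row for X = 5 (f = 5) is inferred from the printed totals n = 24, ∑ f ∙ X = 97 and ∑ f ∙ X² = 455, and the variance is computed by dividing by n, since the slides do not show which formula they use.

```python
from math import sqrt

# (class mark X, frequency f); the X = 5 row is inferred from the printed totals.
table = [(1, 2), (2, 3), (3, 3), (4, 6), (5, 5), (6, 4), (7, 1)]

n = sum(f for _, f in table)                 # 24
sum_fx = sum(f * x for x, f in table)        # 97
sum_fx2 = sum(f * x * x for x, f in table)   # 455

approx_mean = sum_fx / n
approx_mode = max(table, key=lambda row: row[1])[0]   # class mark with the highest frequency

def value_at(position):
    """Class mark that holds the given position, found through the cumulative frequency."""
    cumulative = 0
    for x, f in table:
        cumulative += f
        if position <= cumulative:
            return x

# n is even, so average the values found in places n/2 and n/2 + 1.
approx_median = (value_at(n // 2) + value_at(n // 2 + 1)) / 2

variance = sum_fx2 / n - approx_mean ** 2   # divides by n (an assumption)
std_dev = sqrt(variance)

print(round(approx_mean, 2), approx_median, approx_mode)
print(round(variance, 2), round(std_dev, 2))
```

If the sample formula (dividing by n - 1) is intended instead, only the `variance` line needs to change.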

CONCLUSION

By collecting a number of elements, these can be located in tables of classes and frequencies, which will help

identify them or analyze them more quickly. On the other hand, these data can be summarized by the mean, mode

and median, so they help us make a more complete statistical analysis.

It is appropriate to mention that the measures of central tendency help us to locate the center of a dataset, but in

order to form a picture of what this data actually indicates, these measurements cannot be considered sufficient,

since they do not provide all the information we need to understand the distribution of the same data, so it is

also convenient to study measures of variability.

QUIZ #12

Class interval    Class mark (X)    Frequency (f)
5-9               7                 4
10-14             12                6
15-19             17                5
20-24             22                15
25-29             27                13
30-34             32                3
35-39             37                1

TOPIC 13: MEASURES OF VARIABILITY

MODULE 3. PROBABILITY AND STATISTICS

INTRODUCTION

Measures of variability are also known as measures of dispersion. The most common are range, variance and standard deviation.

They represent the variation present in the data in relation to their average.

13.1 RANGE, VARIANCE, AND STANDARD DEVIATION

Range: It is considered the easiest measure of dispersion to obtain, since it is just a matter of subtracting the minimum value of the data from the maximum value.

Variance: It is considered the most important measure of dispersion; it tells us how far from or near to the mean the data are.

Standard deviation: It is the square root of the variance and it is expressed in the same units as the data.

EXAMPLE

Find the variance and standard deviation of the following sample of values: 3, 3, 4, 5, 6, 7, 8, 9, 9, 9.
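A possible check of this example with Python's standard `statistics` module is sketched below. Because the values are described as a sample, the sample formulas (dividing by n - 1) are assumed; the slides do not state which formula they use, so the population versions are shown alongside for comparison.

```python
from statistics import mean, variance, stdev, pvariance, pstdev

data = [3, 3, 4, 5, 6, 7, 8, 9, 9, 9]

data_range = max(data) - min(data)   # maximum minus minimum: 9 - 3 = 6
m = mean(data)                       # 63 / 10 = 6.3

s2 = variance(data)                  # sample variance, divides by n - 1
s = stdev(data)                      # square root of the sample variance

print(data_range, m)
print(round(s2, 2), round(s, 2))
print(round(pvariance(data), 2), round(pstdev(data), 2))   # population versions, divide by n
```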

EXAMPLE

The time range, in minutes, that a sample of high school students spends on Facebook throughout the day.

Time in minutes spent on Facebook    Amount of students    Class mark (X)    f ∙ X    f ∙ X²
(class interval)                     (f)
[0-60)                               17
[60-120)                             3
[120-180)                            4
[180-240)                            8
[240-300)                            3
[300-360)                            4
[360-420)                            2
[420-480)                            1
[480-540)                            1
                                     N =
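The empty columns of this table can be filled in with a short Python sketch. It assumes that the class mark is the midpoint of each interval, which is consistent with the class marks shown in the quiz table of Topic 12; the variable names are chosen only for illustration.

```python
# (lower limit, upper limit, frequency f) for each class interval of the table.
intervals = [
    (0, 60, 17), (60, 120, 3), (120, 180, 4), (180, 240, 8), (240, 300, 3),
    (300, 360, 4), (360, 420, 2), (420, 480, 1), (480, 540, 1),
]

rows = []
for low, high, f in intervals:
    x = (low + high) / 2                  # class mark: midpoint of the interval (assumed)
    rows.append((x, f, f * x, f * x * x))

n = sum(f for _, f, _, _ in rows)         # total number of students
sum_fx = sum(fx for _, _, fx, _ in rows)
sum_fx2 = sum(fx2 for _, _, _, fx2 in rows)

for x, f, fx, fx2 in rows:
    print(x, f, fx, fx2)
print(n, sum_fx, sum_fx2)
```

From these totals, the approximate mean is `sum_fx / n`, and the variance and standard deviation follow in the same way as for any other frequency table.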

CONCLUSION

At present there are several technological resources that will help us to expedite the calculation of measures of

central tendency and dispersion, but it is advisable that you know the procedure by which you will be able to

obtain these measures when you do not have these tools on hand.

We can find both grouped and ungrouped data, so it is very helpful to be familiar with the calculation of measures of central tendency and variability, which help us interpret and analyze a dataset and, in turn, make a decision.

QUIZ #13

TOPIC 14: PROBABILITY

MODULE 3. PROBABILITY AND STATISTICS

14.1 BASIC PRINCIPLES OF PROBABILITY

Probability is the chance that something will happen (the proportion of favorable cases among the total cases).

The probability values are always between zero and one.

Experiment: It is a process that yields results under certain conditions. Experiments can be deterministic, when we obtain the same result as long as we are under the same conditions (for example, F = ma), or random, when the results vary each time the experiment is conducted (for example, heads or tails).

Event: It corresponds to the set of one or more results of an experiment.

Sample space: It is the set of all the possible results of an experiment.

Sample point: It is each one of the elements in the sample space.

Probability of an event = $\frac{\text{favorable cases}}{\text{total cases}} = \frac{\text{event(s)}}{\text{number of outcomes}}$
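As a minimal sketch of this formula, the function below counts favorable cases over total cases and returns an exact fraction. It assumes that every sample point is equally likely; the helper name `probability` and the die example are illustrative and not taken from the slides.

```python
from fractions import Fraction

def probability(event, sample_space):
    """Favorable cases divided by total cases, assuming equally likely outcomes."""
    favorable = [point for point in sample_space if point in event]
    return Fraction(len(favorable), len(sample_space))

die = {1, 2, 3, 4, 5, 6}   # sample space of one roll of a die
even = {2, 4, 6}           # event: the roll is an even number

print(probability(even, die))   # 1/2
```

The result always lies between 0 (no favorable cases) and 1 (every case is favorable).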

EXAMPLE 1

VENN DIAGRAM

Set: a specific, clearly described collection; the objects or people that make up a set are called elements or members of the set.

A set can be described by enumeration (making a list that includes the elements of the set), by comprehension (providing a rule that identifies the elements of the set) or by a Venn diagram (the graphic method in which sets and their relations are represented).

By enumeration: {January, February, March, April, May, June, July, August, September, October, November, December}

By comprehension: {the months of the year}

By Venn diagram:

EXAMPLE 2

By enumeration:

By comprehension:

By Venn diagram:

VENN DIAGRAM

Universal set: It is denoted by the letter U, and it includes all the elements under consideration.

Subset: When all the elements of a set (A) belong to another set (B), it is said that A is a subset of B; this is represented by the symbol ⊆, for example A ⊆ B.

Empty set: It is the set that has no elements, and it is represented by { } or by the symbol ∅.

Venn diagrams are used to graphically show the grouping of elements in sets.

14.2 BASIC SET OPERATIONS

There are four basic set operations: union (∪), intersection (∩), complement ($A^C$) and difference (–).

[Venn diagram: sets A and B inside the universal set U]

BASIC SET OPERATIONS

Union

The union of sets A and B is the set of elements that are in A, in B or in both; it is represented by the symbol ∪, for example A ∪ B.

Intersection

The intersection of A and B is the set of elements that are in A and also in B; it is represented by the symbol ∩, for example A ∩ B.

Complement

The complement of A, written $A^C$, is the set of elements of the universal set U that do not belong to A; therefore $A^C = U - A$.

Difference

The difference of A and B is the set of elements that are in A but not in B; it is written A – B.
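The four operations map directly onto Python's built-in `set` type, as the sketch below shows. The universal set U and the sets A and B are made-up examples.

```python
U = set(range(1, 11))   # hypothetical universal set: the integers 1 to 10
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7}

print(A | B)   # union: elements in A, in B or in both
print(A & B)   # intersection: elements in both A and B
print(U - A)   # complement of A: elements of U that are not in A
print(A - B)   # difference: elements in A but not in B
```

Python has no complement operator of its own, so the complement is written as the difference between the universal set and A, matching the definition above.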

EXAMPLE 3

CONCLUSION

One of the objectives of probability will be to calculate the odds that an event occurs, which will be of great help

for decision-making.

QUIZ #14

What is the probability of drawing a number card from a shuffled 52-card poker deck?
