
Introduction to Statistics

#01
Statistic and Statistics
• Statistic
– a fact or piece of data from a study of a large quantity of numerical
data
– Example: the statistics indicated that the crime rate has increased.

• Statistics
– the practice or science of collecting and analyzing numerical data
in large quantities, esp. for the purpose of inferring proportions in
a whole from those in a representative sample.
– a branch of mathematics used to summarize, analyze, and
interpret what we observe, to make sense or meaning of our
observations.
Source: Dictionary Application (Apple Inc.)
Why should
you study statistics?
• A family counselor may use statistics to describe patient behavior and the
effectiveness of a treatment program

• A social psychologist may use statistics to summarize peer pressure among
teenagers and interpret the causes

• A college professor may give students a survey to summarize and interpret how
much they like (or dislike) the course

• In each case, the counselor, psychologist, and professor make use of statistics to do
their job.

How about you?


The Importance of Statistics

• The reason it is important to study statistics can be described by the words
of Mark Twain: “There are lies, damned lies and statistics.”
• He meant that statistics can be deceiving, and so can interpreting them.
• Statistics are all around you, from your college grade point average (GPA) to
a Newsweek poll predicting which political candidate is likely to win an
election. In each case, statistics are used to inform you.
• Statistics are part of your everyday life, and they are subject to
interpretation. The interpreter, of course, is YOU.
Data vs. Information

Data
• Data is raw, unorganized facts that need to
be processed. Data can be something simple
and seemingly random and useless until it is
organized.
• Example: Each student’s test score is one
piece of data.

Information
• When data is processed, organized,
structured or presented in a given context so
as to make it useful, it is called information.
• The class’s average score is the information
that can be derived from the given data.
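
The step from data to information can be sketched in a few lines of Python. This is only an illustration: the scores below are made-up values, not data from the course.

```python
from statistics import mean

# Hypothetical raw data: one test score per student (each score is one piece of data).
scores = [72, 85, 90, 66, 78, 94, 81]

# Processing the raw scores into a single summary value turns data into information.
class_average = mean(scores)

print(f"Number of scores (data points): {len(scores)}")
print(f"Class average (information):    {class_average:.1f}")
```

The individual scores say little on their own; the average is what a lecturer would actually report about the class.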
The use of statistics in
the academic information system
Other Statistic Examples

Toyota Global Production by Region


Other Statistic Examples

Toyota Global Sales by Region


Indonesia
Car Sales
[Figure: GDP by Sector. Bar chart of percentage value (0% to 30%) by sector for 2001, 2002, 2003*, and 2004**. Sectors: Agriculture, Livestock, Forestry and Fishery; Mining and Quarrying; Manufacturing Industry; Electricity, Gas and Water Supply; Construction; Trade, Hotel and Restaurant; Transport and Communication; Financial, Ownership and Business Services; Services.]
. . . And another one
The role of agriculture in
employment
Main Industry                                                 2001     2002     2003     2004     2005
1. Agriculture, Forestry, Hunting and Fishery                 43.77%   44.34%   46.38%   43.33%   44.04%
2. Mining and Quarrying                                       -        0.69%    0.79%    1.10%    0.85%
3. Manufacturing Industry                                     13.31%   13.21%   12.39%   11.81%   12.27%
4. Electricity, Gas, and Water                                -        0.19%    0.16%    0.24%    0.20%
5. Construction                                               4.23%    4.66%    4.37%    4.84%    4.65%
6. Wholesale Trade, Retail Trade, Restaurants and Hotels      19.24%   19.42%   18.59%   20.40%   19.90%
7. Transportation, Storage, and Communications                4.90%    5.10%    5.32%    5.85%    5.85%
8. Financing, Insurance, Real Estate and Business Services    1.24%    1.08%    1.41%    1.20%    1.10%
9. Community, Social, and Personal Services                   12.12%   11.30%   10.60%   11.22%   11.14%
10. Others                                                    1.20%    -        -        -        -
Total                                                         100.00%  100.00%  100.00%  100.00%  100.00%

Try to compare this data with the data on the previous page. What do you see?
Statistics Classification
Descriptive and Inferential
• Descriptive statistics are procedures used to summarize,
organize, and make sense of a set of scores or observations.
– Descriptive statistics are typically presented graphically,
in tabular form (in tables), or as summary statistics
(single values).

• Inferential statistics are procedures that allow researchers to infer or
generalize observations made with samples to the larger population from
which they were selected.
Descriptive and Inferential
Descriptive Statistics (describing data)
• Organize
• Summarize
• Simplify
• Presentation of data

Inferential Statistics (make predictions, determine conclusions)
• Generalize from samples to populations
• Hypothesis testing
• Relationships among variables
Data Classification
Types of Data
• Based on the source
– Primary
– Secondary

• Based on the characteristics
– Quantitative
• States the quantity of an object and can be expressed on a numerical scale.
• Example: waiting time (in minutes) before a service begins
– Qualitative (categorical)
• Does not carry any quantitative interpretation. It can only be classified.
• Example: field/area of expertise most commonly occupied by the graduates of IE-ITS
• Based on the method used to obtain the data
– Discrete: data with well-defined values, obtained by counting. Example: the number of customers in an hour.
– Continuous: data that can take any value between two other values, obtained by measuring. Examples: indoor temperature, body weight, height.
Scales of Measurements

• Scales of measurement are rules that describe the properties of numbers.


• Scales of measurement are characterized by three properties: order,
differences, and ratios.
• Each property can be described by answering the following questions:
– Order: Does a larger number indicate a greater value than a smaller
number?
– Differences: Does subtracting two numbers represent some
meaningful value?
– Ratio: Does dividing (or taking the ratio of) two numbers represent
some meaningful value?
Scales of Measurement and
Data Properties
Scales of measurements
Nominal Scale (Categorical Scale)
• Nominal scales are measurements where a number is assigned to represent
something or someone.

• Label observations so they fall into different categories, but make no quantitative
distinctions

• E.g. person’s race, gender, nationality, sexual orientation, hair and eye color,
season of birth, marital status, or other demographic or personal information.

• A researcher may code men as 1 and women as 2. They may code the seasons as 1,
2, 3, and 4 for spring, summer, fall, and winter, respectively. These numbers are
used to identify gender or the seasons and nothing more.
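
A small sketch of why nominal codes are labels rather than quantities. The coded observations are invented for illustration; only the coding scheme (1 to 4 for spring to winter) comes from the slide.

```python
from collections import Counter

# Coding scheme from the slide: 1 = spring, 2 = summer, 3 = fall, 4 = winter.
season_labels = {1: "spring", 2: "summer", 3: "fall", 4: "winter"}

# Hypothetical coded observations: season of birth for ten people.
coded = [1, 3, 3, 4, 2, 1, 3, 4, 4, 3]

# Counting category frequencies is meaningful for nominal data.
counts = Counter(season_labels[code] for code in coded)
most_common_season, frequency = counts.most_common(1)[0]

print(counts)
print(f"Most frequent category: {most_common_season} ({frequency} people)")

# Averaging the codes would run, but the result (e.g. "2.8") names no season,
# because the numbers only identify categories.
```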
Ordinal Scale
• Set of categories that are organized into an ordered sequence
• Rank in terms of size and magnitude
• E.g., 1st, 17th; low, medium, high
Scales of measurements
Interval Scale
• Interval scales are measurements where the values are equidistant but have no true zero.
• A true zero describes values where the value 0 truly indicates nothing.
• Interval scales can compare differences in magnitude, but not ratios of magnitude.
• Example: temperature in degrees Celsius or Fahrenheit

Ratio Scale
• Ratio scales are measurements where the values are equidistant and have a true zero.
• Because of the true zero, ratios of magnitude are meaningful and can be compared.
• Common examples of ratio scales include counts and measures of length, height,
weight, time, and age.
• Hence, it is meaningful to state that 60 pounds is twice as heavy as 30 pounds.
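
The difference between interval and ratio scales can be checked numerically. The sketch below assumes Celsius as the interval-scale example and uses the Kelvin conversion (which does have a true zero) for comparison; the temperature values are arbitrary.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius (interval scale) to Kelvin (ratio scale with a true zero)."""
    return celsius + 273.15

# Ratio scale: weight has a true zero, so the ratio is meaningful.
weight_ratio = 60 / 30  # 60 pounds is twice as heavy as 30 pounds

# Interval scale: the difference is meaningful...
temp_difference = 20 - 10  # 20 degrees C is 10 degrees warmer than 10 degrees C

# ...but the ratio is not: 20 degrees C is not "twice as hot" as 10 degrees C.
naive_ratio = 20 / 10
kelvin_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)

print(f"weight ratio:            {weight_ratio}")
print(f"temperature difference:  {temp_difference}")
print(f"naive Celsius ratio:     {naive_ratio}")
print(f"ratio on a true-zero scale: {kelvin_ratio:.3f}")
```

The naive Celsius ratio changes if the same temperatures are expressed in Fahrenheit, which is a sign that it carries no physical meaning.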
Data Sources
Data collection
• Primary
– Observation
– Survey
– Interview
– Experimentation
• Secondary
– Printed
– Electronic
Statistics Methods

Statistical methods
• Descriptive
– Data collection
– Data presentation
– Data classification
– Main purpose is to describe and present data in a more useful and meaningful way.
• Inferential
– Estimation
– Hypothesis testing
– Estimating or making claims about population characteristics based on sample statistics.
First Assignment

Collect at least 3 examples of statistics usage in the media (printed
or electronic). Mention the source and give a brief explanation
of each example. Submit the assignment next week, before
the meeting starts.
Terms Used in
Inferential Statistics

◼ A population (or universe) is the whole collection of things under consideration.

◼ A sample is a portion of the population selected for analysis.

◼ A parameter is a summary measure computed to describe a characteristic of the
population.

◼ A statistic is a summary measure computed to describe a characteristic of the
sample.
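
To make the parameter/statistic distinction concrete, here is a minimal sketch using a simulated population. The population (waiting times in minutes), its size, the sample size, and the seed are all assumptions chosen for illustration.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: waiting times (minutes) for 1,000 customers.
population = [rng.randint(1, 30) for _ in range(1000)]

# Parameter: a summary measure computed from the whole population.
population_mean = mean(population)

# Statistic: the same summary measure computed from a sample of the population.
sample = rng.sample(population, 50)
sample_mean = mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

In practice the parameter is usually unknown, and the statistic is used to estimate it.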
Sample? Why?

• Less time consuming than a census


• Less costly to administer than a census
• Less cumbersome and more practical to administer than a census of the targeted
population
Representative Sample

• A sample is a selected part of the population that is observed in order to
draw inferences about the population. A good inference can only be achieved
if the sample is representative.
• A representative sample can be obtained through a sound selection mechanism
called a sampling technique.
Population Vs. Sample
Population: a b c d e f g h i j k l m n o p q r s t u v w x y z
Sample: b c g i n o r u y
Sampling Techniques

Samples
• Non-probability samples
– Judgement
– Convenience
• Probability samples
– Simple random
– Stratified
– Systematic
– Cluster
Sampling Techniques

• Probability sampling
– Members of the sample are selected according to a certain probability
(calculated in advance).
– A sampling technique that gives every member of the population an equal
chance of being selected as a member of the sample.

• Nonprobability sampling
– A sampling technique that does not give every member of the population an
equal chance of being selected as a member of the sample.
Simple Random Samples

• Every individual or item from the population has an equal chance of
being selected.
• Selection may be with replacement or without replacement.
• Samples can be obtained from a table of random numbers or computer
random number generators.
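
A computer random number generator can draw a simple random sample directly. The sketch below uses Python's standard `random` module; the frame of 20 numbered items, the sample size of 5, and the seed are assumptions for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical sampling frame: 20 items numbered 1..20.
population = list(range(1, 21))

# Without replacement: each item can be selected at most once.
without_replacement = rng.sample(population, k=5)

# With replacement: the same item may be selected more than once.
with_replacement = rng.choices(population, k=5)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```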
Stratified Samples

• Population divided into subgroups (called strata) according to some
common characteristic
• Simple random sample selected from each subgroup
• Samples from subgroups are combined into one

[Figure: population divided into 4 strata, with a sample drawn from each stratum]
Stratified Sample (1)
Proportionate stratified random sampling
Sampling from a population whose members are not homogeneous and are stratified,
with each stratum sampled in proportion to its size.

[Figure: stratified population and the sample drawn from it]

Example: an organization has employees with different educational backgrounds.
Bachelor's degree (S1): 45 people; Master's degree (S2): 30 people;
technical high school (STM): 800 people; economics high school (SMEA): 400 people;
elementary school (SD): 300 people.
The number of samples taken from each education stratum must be proportional
to the size of that stratum.
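
The proportional allocation for the education strata in the example can be worked out as follows. The stratum sizes come from the slide; the total sample size of 315 (20% of the 1,575 employees) is an assumption chosen so that the allocation comes out in whole numbers.

```python
# Stratum sizes from the example (number of employees per education level).
strata = {"S1": 45, "S2": 30, "STM": 800, "SMEA": 400, "SD": 300}

total = sum(strata.values())   # 1575 employees in total
sample_size = 315              # assumed total sample size (20% of the population)

# Each stratum receives a share of the sample proportional to its size.
allocation = {
    name: round(sample_size * size / total)
    for name, size in strata.items()
}

for name, n in allocation.items():
    print(f"{name:>5}: {n} of {strata[name]}")
print("total sampled:", sum(allocation.values()))
```

With other sample sizes the rounded shares may not add up exactly to the target, so the allocation would need a small manual adjustment.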
Stratified Sample (2)

Disproportionate stratified random sampling

• Sampling from a stratified population, but not proportionally.
• Example: the employees of a certain company have the following educational backgrounds:
– Doctoral degree (S3): 3 people
– Master's degree (S2): 4 people
– Bachelor's degree (S1): 90 people
– Senior high school (SLTA): 800 people
– Junior high school (SLTP): 700 people
• For sampling, all 3 S3 holders and all 4 S2 holders are included in the sample,
because these groups are too small compared with the other groups.
Systematic Samples

◼ Decide on sample size: n
◼ Divide frame of N individuals into groups of k individuals: k = N/n
◼ Randomly select one individual from the 1st group
◼ Select every kth individual thereafter

Example: N = 64, n = 8, k = 8 (the first group is the first 8 individuals)
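
The four steps above can be sketched in Python using the slide's numbers (N = 64, n = 8, so k = 8). The helper name `systematic_sample`, the numbered frame, and the seed are hypothetical; the sketch also assumes N is an exact multiple of n.

```python
import random

def systematic_sample(frame, n, rng):
    """Draw a systematic sample of size n from an ordered frame."""
    k = len(frame) // n        # sampling interval k = N / n
    start = rng.randrange(k)   # random position inside the first group of k
    return frame[start::k][:n] # every kth individual after the random start

rng = random.Random(1)         # fixed seed so the run is repeatable
frame = list(range(1, 65))     # N = 64 individuals numbered 1..64

sample = systematic_sample(frame, n=8, rng=rng)
print(sample)
```

Only the starting point is random; once it is chosen, the rest of the sample is fully determined by the interval k.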
Systematic Sampling

▪ This sampling technique is based on the order of the population members, who have
been numbered sequentially.

▪ Example: a population consists of 100 people. The sample can be taken by choosing
only the odd numbers, only the even numbers, or multiples of a certain number,
e.g. multiples of 5, so that the sample consists of no. 5, 10, 15, and so on.
Cluster Samples

• Population is divided into several “clusters,” each representative of the population
• A simple random sample of clusters is selected
– All items in the selected clusters can be used, or items can be chosen from a
cluster using another probability sampling technique

[Figure: population divided into 16 clusters, with randomly selected clusters forming the sample]
Cluster Sample
• Sampling from a population/object whose data source is very broad, e.g. the residents of
a country/province/regency. To decide which residents become the data source, the sample
is drawn based on predefined areas of the population.

• Example: Indonesia has 33 provinces and the sample will use 10 of them, so these
10 provinces are chosen at random. This is usually done in 2 stages: an area sample
(10 of the 33 provinces) and an individual sample (deciding which people in each
selected province will be sampled).

[Figure: population areas A, B, C, D, and E, with areas A, B, and D selected as the sample]
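
The two-stage procedure in the provinces example can be sketched as follows. The 33 provinces and the choice of 10 come from the slide; the province names, the 100 residents per province, the 5 residents drawn per selected province, and the seed are all hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population: 33 provinces, each with 100 numbered residents.
provinces = {
    f"province_{i:02d}": [f"p{i:02d}_resident_{j:03d}" for j in range(100)]
    for i in range(1, 34)
}

# Stage 1 (area sample): randomly select 10 of the 33 provinces.
selected_provinces = rng.sample(sorted(provinces), 10)

# Stage 2 (individual sample): randomly select residents within each selected province.
sample = {name: rng.sample(provinces[name], 5) for name in selected_provinces}

for name, residents in sample.items():
    print(name, residents)
```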
Nonprobability sampling (1)

• Systematic sampling.
• Quota sampling.
– Sampling from a population with certain characteristics until the desired
number (quota) is reached.
– Example: a group of 5 researchers studies grade II employees. The sample size
is set at 100, so each researcher can freely choose 20 people who match the
specified characteristic (grade II).
• Accidental sampling.
– Sampling based on chance: anyone who happens to meet the researcher can be
used as a sample, provided that person is considered suitable as a data source.
Nonprobability sampling (2)

• Purposive sampling
– Sampling based on specific considerations.
– Example: in research on employee discipline, the sample chosen consists only of
people who are experts in personnel matters.
• Saturated sampling
– Sampling that takes all members of the population as the sample.
– Also known as a census.
• Snowball sampling
– Sampling that starts with a small number of people, who are then asked to
choose their acquaintances to join the sample, and so on, so that the sample
keeps growing.
Additional references used
in this presentation

• Kamus Besar Bahasa Indonesia. kkbi.web.id. Accessed 2 September 2013.
• Pusat Pembinaan Sumber Daya Investasi, Badan Pembinaan Konstruksi, Kementrian
Pekerjaan Umum. 2011. http://pusbinsdi.net/semen.php?page=produksi. Accessed
2 September 2013.
• Toyota Motor Corporation. 2013.
http://www.toyota-global.com/company/profile/figures/vehicle_production_sales_and_exports_by_region.html.
Accessed 2 September 2013.
Your study guide
for next meeting
Find and understand the meaning of the following terms:
1. Population
2. Sample
3. Sampling
4. Mean
5. Median
6. Mode
7. Standard Deviation
8. Variance
9. Outlier
Use your own words. You may add examples to explain them.
