Statistical Concepts and Market Returns

POPULATIONS AND SAMPLES

The subset of data used in statistical inference is known as a sample, and the
larger body of data from which it is drawn is known as the population.
- The population is defined as all members of the group in which we are
interested.
[Figure: a sample shown as a subset of the population]
PARAMETERS AND SAMPLE STATISTICS

A population has parameters, and a sample has statistics.
Descriptive statistics that characterize population values are called parameters.
- Examples: mean, median, mode, variance, skewness, kurtosis
Descriptive statistics that characterize samples are known as sample statistics.
- Examples: sample mean, sample median, sample variance
By convention, we often omit the term "sample" in front of sample statistics, a practice
that can lead to confusion when discussing both the sample and the population.
MEASUREMENT SCALES
Statistical inference is affected by the type of data we are trying to analyze.
The four measurement scales, listed from weakest to strongest, are:
Nominal scales categorize data but do not rank them.
- Examples: fund style, country of origin, manager gender
Ordinal scales sort data into categories that are ordered with
respect to the characteristic along which the scale is measured.
- Examples: star rankings, class rank, credit rating
Interval scales provide both the relative position (rank) and
assurance that the differences between scale values are equal.
- Example: temperature
Ratio scales have all the characteristics of interval scales and a
true zero point at the origin.
- Examples: rates of return, corporate profits, bond maturity
HOLDING PERIOD RETURNS

Holding period returns are a fundamental building block of the statistical analysis
of investments.
Holding period returns (HPR) are calculated as the price at the end of the period
plus any cash distribution during the period minus the beginning-of-period price,
all divided by the beginning-of-period price:
$\mathrm{HPR}_t = \dfrac{P_t - P_{t-1} + D_t}{P_{t-1}}$
where $P_t$ is the price at the end of period t and $D_t$ is the cash distribution
during period t.
For this stock, which pays no dividends, the HPRs are:
Time    Price    HPR
0       27.00    n/a
1       25.77    -4.57%
2       24.73    -4.04%
3       24.32    -1.64%
4       24.39    0.28%
5       24.71    1.34%
6       25.30    2.35%
7       25.90    2.38%
8       27.01    4.28%
9       28.20    4.42%
10      29.52    4.68%
11      31.63    7.16%
12      35.25    11.43%
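The holding period return calculation can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `holding_period_returns` and its `distributions` argument are assumptions made for the example. Because the listed prices are rounded to two decimals, the computed returns can differ from the slide's HPRs in the second decimal place.

```python
def holding_period_returns(prices, distributions=None):
    """Return the HPR for each period after the first price.

    prices[t] is the end-of-period price; distributions[t] is the cash
    paid during period t (assumed zero when not supplied).
    """
    if distributions is None:
        distributions = [0.0] * len(prices)
    hprs = []
    for t in range(1, len(prices)):
        begin, end = prices[t - 1], prices[t]
        hprs.append((end + distributions[t] - begin) / begin)
    return hprs


# Prices for times 0 through 12, as listed on the slide.
prices = [27.00, 25.77, 24.73, 24.32, 24.39, 24.71, 25.30,
          25.90, 27.01, 28.20, 29.52, 31.63, 35.25]
hprs = holding_period_returns(prices)
for t, r in enumerate(hprs, start=1):
    print(f"t={t:2d}  HPR={r:7.2%}")
```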
FREQUENCY DISTRIBUTIONS
A tabular display of data summarized into intervals is known as a frequency
distribution.
Constructing a frequency distribution:
1. Sort the data in ascending order.
2. Calculate the range of the data, defined as
   Range = Maximum value - Minimum value.
3. Decide on the number of intervals in the frequency distribution, k.
4. Determine the interval width as Range/k.
5. Determine the intervals by successively adding the interval width to the
   minimum value to determine the ending points of intervals, stopping after
   reaching an interval that includes the maximum value.
6. Count the number of observations falling in each interval.
7. Construct a table of the intervals, listed from smallest to largest, that
   shows the number of observations falling in each interval.
FREQUENCY DISTRIBUTIONS
Focus on: Holding Period Returns
Suppose we have 12 holding period return observations (in percent) from a
non-dividend-paying stock, sorted in ascending order:
-4.57, -4.04, -1.64, 0.28, 1.34, 2.35, 2.38, 4.28, 4.42, 4.68, 7.16, and 11.43.
The range is 11.43 - (-4.57) = 16, so using k = 4, we have intervals with a width of 4.
The resulting frequency distribution is
Interval                         Absolute Frequency
-4.57 ≤ observation < -0.57      3
-0.57 ≤ observation < 3.43       4
3.43 ≤ observation < 7.43        4
7.43 ≤ observation ≤ 11.43       1
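The construction steps can be traced with a short Python sketch. It is an illustration under stated assumptions rather than a reference implementation: the function name `frequency_distribution` is hypothetical, and observations that fall exactly on an interior boundary could land in a neighboring interval because of floating-point rounding.

```python
def frequency_distribution(data, k):
    """Return (intervals, counts) for k equal-width intervals."""
    ordered = sorted(data)                 # Step 1: sort ascending
    lo, hi = ordered[0], ordered[-1]
    width = (hi - lo) / k                  # Steps 2-4: range and width
    counts = [0] * k
    for x in ordered:                      # Step 6: count per interval
        # The last interval is closed on the right so it holds the maximum.
        idx = min(int((x - lo) / width), k - 1)
        counts[idx] += 1
    # Step 5: interval end points built from the minimum value.
    intervals = [(lo + i * width, lo + (i + 1) * width) for i in range(k)]
    return intervals, counts


returns = [-4.57, -4.04, -1.64, 0.28, 1.34, 2.35,
           2.38, 4.28, 4.42, 4.68, 7.16, 11.43]
intervals, counts = frequency_distribution(returns, k=4)
for (a, b), c in zip(intervals, counts):   # Step 7: tabulate
    print(f"{a:6.2f} to {b:6.2f}: {c}")
```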
RELATIVE AND CUMULATIVE FREQUENCY

Relative frequency is the absolute frequency divided by the total number of observations.
Cumulative (relative) frequency is the sum of the relative frequencies of all intervals
up to and including a given interval.
Interval                       Absolute     Relative     Cumulative
                               Frequency    Frequency    Frequency
-4.57 ≤ observation < -0.57    3            0.250        0.250
-0.57 ≤ observation < 3.43     4            0.333        0.583
3.43 ≤ observation < 7.43      4            0.333        0.917
7.43 ≤ observation ≤ 11.43     1            0.083        1.000
For example, the relative frequency of the first interval is 3/12 = 0.250, and the
cumulative frequency of the second interval is 0.250 + 0.333 = 0.583.
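As a small sketch of the two definitions, the absolute frequencies 3, 4, 4, and 1 (the counts of the 12 holding period returns in each interval) can be converted to relative and cumulative frequencies as follows. The variable names are illustrative only.

```python
from itertools import accumulate

counts = [3, 4, 4, 1]            # absolute frequencies per interval
total = sum(counts)              # 12 observations

# Relative frequency: share of all observations in each interval.
relative = [c / total for c in counts]
# Cumulative frequency: running total of counts, divided by the total.
cumulative = [running / total for running in accumulate(counts)]

for rel, cum in zip(relative, cumulative):
    print(f"relative={rel:.3f}  cumulative={cum:.3f}")
```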
HISTOGRAMS
Histograms are the graphical representation of a frequency distribution.
[Figure: histogram with Absolute Frequency on the vertical axis and Holding Period
Return intervals on the horizontal axis]
FREQUENCY POLYGON
Frequency polygons are often used to provide higher visual continuity than
histograms.
[Figure: frequency polygon with Absolute Frequency on the vertical axis and Holding
Period Return intervals on the horizontal axis]
MEASURES OF CENTRAL TENDENCY

These measures describe where the data are centered.
Arithmetic Mean
- The arithmetic mean is the sum of the observations
divided by the number of observations.
- Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$
- Sample mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
- The sample mean is often interpreted as the fulcrum, or center of gravity, for a
given set of data.
- Cross-sectional data occur across different units of observation at one point in time,
and time-series data occur for the same unit of observation across time.

Focus on: Cross-Sectional Sample Mean Return
Country        Return      Country           Return
Austria        -2.97%      Italy             -23.64%
Belgium        -29.71%     Netherlands       -34.27%
Denmark        -29.67%     Norway            -29.73%
Finland        -41.65%     Portugal          -28.29%
France         -33.99%     Spain             -29.47%
Germany        -44.05%     Sweden            -43.07%
Greece         -39.06%     Switzerland       -25.84%
Ireland        -38.97%     United Kingdom    -25.66%
Source: www.msci.com.
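A minimal sketch of the cross-sectional sample mean for the 16 country returns follows. It assumes the returns are negative (the minus signs are implied by the cutoffs and interpolation values used later in the deck), and the dictionary name `returns` is illustrative.

```python
# Country returns in percent, taken from the table above.
returns = {
    "Austria": -2.97, "Belgium": -29.71, "Denmark": -29.67, "Finland": -41.65,
    "France": -33.99, "Germany": -44.05, "Greece": -39.06, "Ireland": -38.97,
    "Italy": -23.64, "Netherlands": -34.27, "Norway": -29.73, "Portugal": -28.29,
    "Spain": -29.47, "Sweden": -43.07, "Switzerland": -25.84,
    "United Kingdom": -25.66,
}

# Sample mean: sum of the observations divided by their number.
sample_mean = sum(returns.values()) / len(returns)
print(f"Sample mean return: {sample_mean:.2f}%")
```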

MEASURES OF CENTRAL TENDENCY
Mean as a center of gravity for the data
[Figure: the 16 country returns placed along a line from Germany (-44.05%) to Austria
(-2.97%), balancing at the sample mean of -31.25%]

These measures also describe where the data are centered.
Weighted Mean
- The sum of each observation multiplied by its weight (its proportional
representation in the sample), where the weights are chosen to meet a statistical or
financial goal. Example: Portfolio return
Geometric Mean
- Represents the growth rate or compounded return on an investment when X is 1 + R
Harmonic Mean
- A weighted mean in which each observation's weight is inversely proportional to its
magnitude. Example: Cost averaging
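The three means can be sketched as small Python functions. The function names and all input numbers below are hypothetical and chosen only to illustrate the definitions; they do not come from the slides.

```python
import math


def weighted_mean(values, weights):
    """Sum of each value times its weight (weights assumed to sum to 1)."""
    return sum(w * x for x, w in zip(values, weights))


def geometric_mean_return(returns):
    """Compound growth rate: the n-th root of the product of (1 + R), minus 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1


def harmonic_mean(values):
    """n divided by the sum of reciprocals (values assumed positive)."""
    return len(values) / sum(1 / x for x in values)


# Hypothetical inputs for illustration.
print(weighted_mean([0.10, 0.04], [0.6, 0.4]))   # two-asset portfolio return
print(geometric_mean_return([0.10, -0.10]))      # +10% then -10%
print(harmonic_mean([10.0, 20.0]))               # average cost per share
```

In the cost-averaging case, investing equal amounts at prices of 10 and 20 buys more shares at the lower price, so the average cost per share is the harmonic mean rather than the arithmetic mean of 15.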

These measures also describe where the data are centered.
The median is the middle observation by rank.
- When we have an odd number of observations, the median is the middle
observation. When we have an even number, the median is the average
of the two middle values.
The mode is the most frequently occurring value in a distribution.
- Distributions are unimodal when there is a single most frequently occurring
value and multimodal when more than one value shares the highest frequency.
- Examples: Bimodal and trimodal
[Figure: a unimodal distribution and a bimodal distribution]
A distribution does not have to have a mode; this will often occur with interval and
ratio data.
The largest advantage associated with the median is its lack of sensitivity to extreme
values (outliers). If you suspect that the extreme values are a result of mismeasurement in
the data or the inclusion of nonrepresentative units of analysis (sample contamination),
then the median is probably a more appropriate measure of centrality than the mean. It will
almost always be more appropriate when you have skewed data.
Rank   Country        Return      Rank   Country           Return
1      Germany        -44.05%     9      Belgium           -29.71%
2      Sweden         -43.07%     10     Denmark           -29.67%
3      Finland        -41.65%     11     Spain             -29.47%
4      Greece         -39.06%     12     Portugal          -28.29%
5      Ireland        -38.97%     13     Switzerland       -25.84%
6      Netherlands    -34.27%     14     United Kingdom    -25.66%
7      France         -33.99%     15     Italy             -23.64%
8      Norway         -29.73%     16     Austria           -2.97%
There is no mode for this distribution. Often, when we are using intervals to describe a
distribution, we may use the modal interval (the interval with the highest number of
observations) instead of the mode. This interval will be the one with the highest bar
in the frequency histogram (or the highest point in the frequency polygon).
We can't use the mean or median with nominal data because the only thing that matters is
the classification of the data. We can, however, use the mode to describe nominal data.
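The median and mode rules can be sketched as follows for the ranked country returns. The helper names are hypothetical, and treating a data set in which no value repeats as having no mode is a convention assumed here to match the statement above.

```python
from collections import Counter


def median(values):
    """Middle value for odd n; average of the two middle values for even n."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def modes(values):
    """Most frequent value(s); an empty list when no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return [v for v, c in counts.items() if c == top]


returns = [-44.05, -43.07, -41.65, -39.06, -38.97, -34.27, -33.99, -29.73,
           -29.71, -29.67, -29.47, -28.29, -25.84, -25.66, -23.64, -2.97]
print(median(returns))   # average of the 8th and 9th ranked returns
print(modes(returns))    # no repeated value, so no mode
```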
INTERVAL LOCATION MEASURES

Quantiles are values at or below which specified proportions of the data lie.
Quartiles, Quintiles, Deciles, and Percentiles
- Quarters, fifths, tenths, and hundredths
- Py = 0.25 or 0.20 or 0.10 or 0.01
Sometimes, we may be able to determine the exact location because the
percentile cutoff corresponds to an exact location in our data.
- Example: The first quartile (25th percentile) of 60 observations is the 15th
observation as rank-ordered.
- Sometimes, the ordering doesn't lead to exact integer divisibility.
- Then, the position of percentile Py, denoted as Ly, is found by
$L_y = (n + 1)\,\frac{y}{100}$
and the value of Py is found by linear interpolation.
Each quantile can be converted to its percentile representation. We interpret
each percentile representation as the cutoff at which p% of the distribution falls
below that point.
INTERVAL LOCATION MEASURES

Focus on: First Quintile
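Rank   Country        Return      Rank   Country           Return
1      Germany        -44.05%     9      Belgium           -29.71%
2      Sweden         -43.07%     10     Denmark           -29.67%
3      Finland        -41.65%     11     Spain             -29.47%
4      Greece         -39.06%     12     Portugal          -28.29%
5      Ireland        -38.97%     13     Switzerland       -25.84%
6      Netherlands    -34.27%     14     United Kingdom    -25.66%
7      France         -33.99%     15     Italy             -23.64%
8      Norway         -29.73%     16     Austria           -2.97%
With n = 16 observations, the first quintile (20th percentile) is located at position 3.4,
which lies between the 3rd observation (Finland) and the 4th observation (Greece).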
Many students may not directly recall how to perform interpolation. Interpolation
finds the point between two neighboring observations (here, the returns for Finland and
Greece) that lies the same proportional distance from the reference observation (here,
Finland) as the location measure (here, 3.4) lies from its reference rank (the rank of
Finland, 3) relative to the next rank (the rank of Greece, 4). The location value, 3.4,
is 40% of the distance between 3 and 4, and so -40.614% is 40% of the distance between
-41.65% and -39.06%.
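The interpolation can be checked with a short sketch. It assumes the location rule L = (n + 1) y / 100, which reproduces the position of 3.4 for the first quintile of 16 observations; the function name `percentile` and the clamping at the two ends of the data are choices made for this example.

```python
def percentile(sorted_values, y):
    """Value of the y-th percentile by linear interpolation.

    Assumes sorted_values is in ascending order and uses the location
    rule L = (n + 1) * y / 100 with 1-based ranks.
    """
    n = len(sorted_values)
    location = (n + 1) * y / 100
    lower_rank = int(location)            # rank just below the location
    fraction = location - lower_rank      # share of the way to the next rank
    if lower_rank < 1:                    # assumption: clamp to the minimum
        return sorted_values[0]
    if lower_rank >= n:                   # assumption: clamp to the maximum
        return sorted_values[-1]
    lower = sorted_values[lower_rank - 1]
    upper = sorted_values[lower_rank]
    return lower + fraction * (upper - lower)


returns = [-44.05, -43.07, -41.65, -39.06, -38.97, -34.27, -33.99, -29.73,
           -29.71, -29.67, -29.47, -28.29, -25.84, -25.66, -23.64, -2.97]
print(round(percentile(returns, 20), 3))  # first quintile
```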
WEIGHTED AVERAGE
Also known as a weighted mean, the most common application of this measure
in investments is the weighted mean return to a portfolio.
Consider again the country-level data. You have constructed a portfolio that has 50%
of its weight in Portugal, Ireland, Greece, and Spain and 50% of its weight in Germany
and the UK. Each of the first four countries is equally weighted within the 50%, as are
Germany and the UK within their 50%. What is the weighted average return to the
portfolio?
Country     Weight     Return      Component Return
Portugal    12.50%     -28.29%     -3.54%
Ireland     12.50%     -23.64%     -2.96%
Greece      12.50%     -39.06%     -4.88%
Spain       12.50%     -29.47%     -3.68%
Germany     25.00%     -44.05%     -11.01%
UK          25.00%     -25.66%     -6.42%
Sum         100%       Weighted Mean = -32.49%
Note that the weights must sum to 1. If we have only long positions, the
weights must also all be positive, but with short positions, the weights
could also be negative.
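The portfolio calculation can be reproduced with a brief sketch. The weights and returns are the ones listed on the slide (with returns taken as negative); the variable names are illustrative. The unrounded result is about -32.485%, which the slide reports as -32.49% after summing rounded component returns.

```python
# (country, weight, return in percent) as listed on the slide.
positions = [
    ("Portugal", 0.125, -28.29),
    ("Ireland", 0.125, -23.64),
    ("Greece", 0.125, -39.06),
    ("Spain", 0.125, -29.47),
    ("Germany", 0.25, -44.05),
    ("UK", 0.25, -25.66),
]

# The weights must sum to 1.
total_weight = sum(w for _, w, _ in positions)
assert abs(total_weight - 1.0) < 1e-12

# Weighted mean: sum of weight times return across positions.
portfolio_return = sum(w * r for _, w, r in positions)
for name, w, r in positions:
    print(f"{name:9s} weight={w:6.2%} component={w * r:7.3f}%")
print(f"Weighted mean return: {portfolio_return:.3f}%")
```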
MEASURES OF DISPERSION
Dispersion measures variability around a measure of central tendency.
If mean return represents reward, then dispersion represents risk.
The range works with interval or ratio data and is easy to compute. But the range
provides only limited information (it uses only two observations), does not have
desirable statistical properties, and is heavily influenced by extremely large or small
outcomes.
The mean absolute deviation (MAD) addresses the problem of negative deviations from the
mean canceling out positive deviations from the mean, while also not being unduly
affected by extremely large or small values. It does not have desirable statistical or
mathematical properties, particularly when compared with variance.
Focus on: Sample Standard Deviation
Country     Return      Squared Deviation from Mean
Germany     -44.05%     0.016384
Sweden      -43.07%     0.013971
Finland     -41.65%     0.010816
Greece      -39.06%     0.00610
...
Austria     -2.97%      0.0800
Sum = 0.1486
s^2 = 0.1486/(16 - 1) = 0.0099
s = 9.95%
Units on squared deviations from the mean are %-squared, which is generally not
meaningful to an audience. Accordingly, we typically report only standard deviations,
whose units are the same as the units on the mean. If these data had been a population
instead of a sample, we would have divided by n instead of n - 1.
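A minimal sketch of the sample variance and standard deviation for the 16 country returns follows. Returns are entered as decimals and assumed negative, consistent with the mean of -31.25% used later; the function name `sample_variance` is illustrative.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Country returns as decimals (for example, -44.05% is -0.4405).
returns = [-0.4405, -0.4307, -0.4165, -0.3906, -0.3897, -0.3427, -0.3399,
           -0.2973, -0.2971, -0.2967, -0.2947, -0.2829, -0.2584, -0.2566,
           -0.2364, -0.0297]

s2 = sample_variance(returns)
s = math.sqrt(s2)
print(f"s^2 = {s2:.4f}, s = {s:.2%}")
```

Dividing by n instead of n - 1 in `sample_variance` would give the population variance for the same data.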
SEMIVARIANCE
We are often concerned with measures of risk that focus on the downside of the
possible outcomes; in other words, the losses.
If the return distributions are symmetrical, then these measures (semivariance and target
semivariance) are a constant proportion of the variance and little value is gained by using
them. Risk rankings may change if the underlying return-generating distributions are not
symmetrical. Both have unattractive statistical properties, but they do have intuitive appeal.
These measures are calculated using the following steps:
i. Calculate the sample mean.
ii. Identify the observations that are smaller than the mean (discarding observations equal
to and greater than the mean); suppose there are n* observations smaller than the mean.
iii. Compute the sum of the squared negative deviations from the mean or target rate (using
the n* observations that are smaller than the mean or the target rate).
iv. Divide the sum of the squared negative deviations from Step iii by n* - 1.
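The four steps can be sketched as a single Python function. This is a sketch under stated assumptions: the function name `semivariance` and its optional `target` argument are hypothetical, and the example data are made up so that the arithmetic is easy to follow.

```python
def semivariance(values, target=None):
    """Sample semivariance below the mean, or below `target` if given."""
    if target is None:
        target = sum(values) / len(values)          # Step i: sample mean
    downside = [x for x in values if x < target]    # Step ii: keep n* values
    n_star = len(downside)
    if n_star < 2:
        raise ValueError("need at least two observations below the target")
    squared = sum((x - target) ** 2 for x in downside)  # Step iii
    return squared / (n_star - 1)                       # Step iv


# Hypothetical returns in percent, for illustration only.
data = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(semivariance(data))              # below the mean of 0
print(semivariance(data, target=1.5))  # below a target rate of 1.5
```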
CHEBYSHEV'S INEQUALITY
This expression gives the minimum proportion of values, p, within k standard
deviations of the mean for any distribution whenever k > 1:
$p \geq 1 - \frac{1}{k^2}$
Interval around the Mean (k standard deviations)    Minimum Proportion, p
1.25                                                0.36
1.50                                                0.56
2.00                                                0.75
2.50                                                0.84
3.00                                                0.89
4.00                                                0.94
The high value of Chebyshev's inequality comes from its generality (it is applicable
to any distribution). Note that this is a lower bound on the proportion; the actual
proportion may be higher. Students familiar with the normal distribution are likely to
recall that 95% of the observations in a normal distribution lie within 1.96
(approximately 2) standard deviations of the mean. Using Chebyshev's inequality, you
can see that a two-standard-deviation interval around the mean must contain at least
75% of the observations. Because 95% is greater than 75%, this agrees with what has
already been covered. In practice, you may observe a higher p than Chebyshev's, but
you can't do worse than the Chebyshev p.
CHEBYSHEV'S INEQUALITY
Focus on: Calculating Proportions Using Chebyshev's Inequality
For our country data, the mean is -31.25% and the sample standard deviation
is 9.95%.
Lower cutoff at 1.25 standard deviations:
-31.25% - 1.25 (9.95%) = -43.6875%
Upper cutoff at 1.25 standard deviations:
-31.25% + 1.25 (9.95%) = -18.8125%
k       Lower Cutoff    Upper Cutoff    Actual p    Chebyshev's p
1.25    -43.69%         -18.81%         0.875       0.36
1.50    -46.18%         -16.32%         0.938       0.56
2.00    -51.16%         -11.34%         0.938       0.75
2.50    -56.13%         -6.37%          0.938       0.84
3.00    -61.11%         -1.39%          1.000       0.89
4.00    -71.07%         8.57%           1.000       0.94
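The comparison between the Chebyshev bound and the observed proportion can be sketched as follows. The function names are hypothetical, and the rounded mean (-31.25) and standard deviation (9.95) are used, so cutoffs may differ from the table in the second decimal place.

```python
def chebyshev_bound(k):
    """Minimum proportion within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is informative only for k > 1")
    return 1 - 1 / k ** 2


def actual_proportion(values, mean, std, k):
    """Share of observations inside mean +/- k standard deviations."""
    lo, hi = mean - k * std, mean + k * std
    inside = sum(1 for x in values if lo <= x <= hi)
    return inside / len(values)


# Country returns in percent, with the rounded mean and standard deviation.
returns = [-44.05, -43.07, -41.65, -39.06, -38.97, -34.27, -33.99, -29.73,
           -29.71, -29.67, -29.47, -28.29, -25.84, -25.66, -23.64, -2.97]
mean, std = -31.25, 9.95

for k in (1.25, 1.5, 2.0, 2.5, 3.0, 4.0):
    bound = chebyshev_bound(k)
    actual = actual_proportion(returns, mean, std, k)
    print(f"k={k:4.2f}  Chebyshev p={bound:.2f}  actual p={actual:.4f}")
```

For every k, the observed proportion should be at least as large as the Chebyshev bound, which is the point of the inequality.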
COMBINING RISK AND RETURN

Measures of relative dispersion are used to compare risk and return across
differing sets of observations.
The coefficient of variation is the ratio of the standard deviation of a set of
observations to their mean value.
- This ratio can be thought of as the units of risk per unit of mean return.
The Sharpe Ratio is the ratio of the mean excess return (mean return minus the mean
risk-free rate) per unit of standard deviation.
- This ratio can be thought of as units of risky return (excess return) per unit of
risk.
- This will also be the slope of a line in expected return/standard deviation
space.
[Figure: line in expected return, E(r), versus standard deviation space, starting at the
risk-free rate, rf, with slope equal to the Sharpe Ratio, Sp]
These measures arise in large part because it is difficult to compare means and
standard deviations across different samples or portfolios. They are both measures
of relative dispersion. Each expresses the magnitude of dispersion with respect to a
common point. In the case of the coefficient of variation, that point is the mean of
the observations. In the case of the Sharpe Ratio, that point is the mean of the
returns above a risk-free return. Both are scale free and thus provide ease
of use in comparing dispersion among datasets with different distributions.
The Sharpe Ratio plays a prominent role in much of investment analysis, including
the optimization of risky asset allocation in modern portfolio theory (more in Chapter
11). It is named after William Sharpe, a Nobel Prize-winning economist, and is often
used as a portfolio performance measurement tool.
Two cautions in using the Sharpe Ratio: Negative Sharpe Ratios have a
counterintuitive interpretation (increasing risk increases the Sharpe Ratio), so
comparisons of negative and positive Sharpe Ratios should be avoided. The
Sharpe Ratio also focuses on only one measure of risk: standard deviation. It will
work well for portfolios with roughly symmetrical returns, but not so well for
portfolios without them, including those with embedded options. Users of the
Sharpe Ratio should ensure that it is an appropriate tool to assess a specific
strategy or manager.
COMBINING RISK AND RETURN

Focus on: Coefficient of Variation and the Sharpe Ratio
Consider a portfolio with a mean return of 25.26% and a standard deviation of
returns of 9.95%.
- The coefficient of variation is $CV = s / \bar{X} = 9.95\% / 25.26\% = 0.394$.
- If the risk-free rate is 3%, then the Sharpe Ratio is
$(\bar{R}_p - \bar{R}_f) / s_p = (25.26\% - 3\%) / 9.95\% = 2.24$.
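Both ratios follow directly from their definitions, as the sketch below shows for the portfolio in this example. The function names are illustrative; inputs are in percent.

```python
def coefficient_of_variation(std, mean):
    """Standard deviation per unit of mean return."""
    return std / mean


def sharpe_ratio(mean_return, risk_free, std):
    """Mean excess return per unit of standard deviation."""
    return (mean_return - risk_free) / std


# Mean return 25.26%, standard deviation 9.95%, risk-free rate 3%.
cv = coefficient_of_variation(9.95, 25.26)
sr = sharpe_ratio(25.26, 3.0, 9.95)
print(f"CV = {cv:.3f}, Sharpe Ratio = {sr:.2f}")
```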
COMBINING CENTRALITY, DISPERSION, AND SYMMETRY
For a symmetrical distribution, the mean, median, and mode (if it exists)
will all be at the same location.
If the distribution is positively skewed, then the mean will be greater than the
median, which will be greater than the mode (if it exists).
If the distribution is negatively skewed, then the mean will be less than the
median, which will be less than the mode (if it exists).
[Figure: example of positive skew, with mode < median < mean]
SKEWNESS
The degree of symmetry in the dispersion of values around the mean is
known as skewness.
If observations are equally dispersed around the mean, the distribution is said
to be symmetrical.
If the distribution has a long tail on one side and a fatter distribution on the
other side, it is said to be skewed in the direction of the long tail.
[Figure: three distributions labeled Skew Right, No Skew, and Skew Left]
KURTOSIS
Kurtosis measures the relative amount of peakedness as compared with the
normal distribution, which has a kurtosis of 3.
- We typically express this measure in terms of excess kurtosis, which is the
observed kurtosis minus 3.
- Distributions are referred to as being
1. Leptokurtic (more peaked than the normal; fatter tails),
2. Platykurtic (less peaked than the normal; thinner tails), or
3. Mesokurtic (equivalent to the normal).
SKEWNESS AND KURTOSIS

Focus on: Sample Skewness
Recall that a distribution with perfect symmetry has skewness of zero.
Because cubing preserves the sign of the original difference between Xi and its
mean, if deviations from the mean are equally distributed on each side of the
mean, they will cancel each other out, leading to skewness of zero.
- If there are some very large values, they become even larger when cubed,
and the skewness measure will then reflect this.
- Large negative values → Negative sample skewness
- Large positive values → Positive sample skewness
SKEWNESS AND KURTOSIS

Focus on: Sample Kurtosis
Kurtosis measures the relative peakedness of the distribution.
- A leptokurtic distribution is more peaked than the normal distribution.
- More observations closer to the mean and out in the tails.
- Often known as having fat tails.
- A mesokurtic distribution has peakedness equal to the normal distribution.
- A platykurtic distribution is less peaked than the normal distribution.
- It is more evenly distributed across the range of possible values.
The kurtosis of the normal distribution is 3; hence, excess kurtosis is sample
kurtosis minus 3.
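The slides do not show the formulas, so the sketch below assumes one common bias-adjusted form of sample skewness and sample excess kurtosis; other texts and software use slightly different adjustments. The function names and example data are hypothetical.

```python
import math


def _mean_and_std(values):
    """Sample mean and sample standard deviation (n - 1 divisor)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return mean, s


def sample_skewness(values):
    """Assumed form: [n / ((n-1)(n-2))] * sum of cubed deviations / s^3."""
    n = len(values)
    mean, s = _mean_and_std(values)
    cubed = sum((x - mean) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed / s ** 3


def sample_excess_kurtosis(values):
    """Assumed form: scaled fourth-power deviations minus a correction term."""
    n = len(values)
    mean, s = _mean_and_std(values)
    fourth = sum((x - mean) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth / s ** 4 - correction


symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]        # cubed deviations cancel
right_tail = [0.0, 0.0, 0.0, 0.0, 10.0]        # one large positive value
print(sample_skewness(symmetric))
print(sample_skewness(right_tail))
print(sample_excess_kurtosis(symmetric))
```

Because cubing preserves the sign of each deviation, the symmetric sample has skewness of zero, while the sample with one large positive value has positive skewness.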
SUMMARY
The underlying foundation of statistically based quantitative analysis lies with
the concepts of a sample versus a population.
- We use sample statistics to describe the sample and to infer information
about its associated population.
- Descriptive statistics for samples and populations include measures of
centrality, location, and dispersion, such as mean, range, and variance,
respectively.
- We can combine traditional measures of return (such as mean) and risk
(such as standard deviation) to measure the combined effects of risk and
return using the coefficient of variation and the Sharpe Ratio.
The normal distribution is of central importance in investments, and as a result,
we often compare statistical properties, such as skewness and kurtosis, with
those of the normal distribution.