Compendium of Regression Analysis in Environmental Studies

Abstracted by: Smno.psl.ppsub.
August 2012
SCOPE OF REGRESSION
Downloaded from: http://en.wikipedia.org/wiki/Regression .. 27/8/2012
Regression can mean:

Regression (psychology), a defensive reaction to some unaccepted impulses

Regression analysis, a statistical technique for estimating the relationships among
variables.
Some types of regression:
1. Linear regression
2. Simple linear regression
3. Logistic regression
4. Nonlinear regression
5. Nonparametric regression
6. Robust regression
7. Stepwise regression.

Regression toward the mean, a common statistical phenomenon
Regression (economics), Ludwig von Mises' theorem that tries to explain why money is
demanded in its own right

Software regression, the appearance of a bug which was absent in a previous revision

Regression testing, a software testing method which seeks to uncover regression bugs
Infinite regress, a problem in epistemology

Marine regression, coastal advance due to falling sea level, the opposite of marine
transgression

Regression (medicine), a characteristic of diseases to express lighter symptoms without
disappearing totally

Age regression in therapy

Past life regression, a process claiming to retrieve memories of previous lives
REGRESSION ANALYSIS
Downloaded from: http://id.wikipedia.org/wiki/Analisis_regresi .. 22/8/2012
Regression analysis is a method for determining the cause-effect relationship
between one variable and one or more other variables.
The "cause" variable goes by several names: explanatory variable, independent
variable, or, loosely, the X variable (because it is usually plotted on the
abscissa, or X axis).
The effect variable is the variable being influenced; it is also called the
dependent variable or the Y variable.
Both variables may be random variables, but the dependent variable must always
be a random variable.
Regression analysis is very popular and widely used. Fields of environmental
study that require cause-effect analysis usually rely on regression analysis
as well.

CAUSE-EFFECT RELATION
A cause-effect relation is a relation between a cause concept and an effect concept.
In main memory, a cause-effect relation is represented by a cause-effect relation
table.
Cause-effect relations are important because:
1. They help to understand what would happen as a result of the current situation,
and so to predict the future of the current context. To find out what would happen,
a strong AI only has to find all effect concepts for the specified concepts.
2. They help to understand what a strong AI can do in order to achieve its goals.
To figure out what to do, a strong AI only has to find the cause concepts for the
specified goal concepts (sub-goals).
Let us imagine that a strong AI wants to find out what the result of the Sun would
be. To figure that out, it would look into its cause-effect relations and find that
the probable results are Heat and SunBurn.

Downloaded from: http://www.dennisgorelik.com/ai/CauseEffectRelation.htm
VARIABLE
Downloaded from: http://id.wikipedia.org/wiki/Variabel .. 22/8/2012
Variable:
1. changing, not fixed;
2. a declaration of something that has varying values;
3. differing.

In programming languages, a variable is a symbol that represents a particular value.
A variable known only inside a subprogram is called a local variable, while one known
throughout the whole program is called a global variable.
Variable: the object of research, or whatever is the focus of a study.
According to F.N. Kerlinger, a variable IS a concept. A variable is a concept that
takes on various values. A concept can be turned into a variable by focusing on a
particular aspect of that concept.
Variables can be divided into quantitative and qualitative variables. Quantitative
variables can be further classified into discrete and continuous variables.

Variable (Peubah)
A variable is a characteristic of the research subject that changes from one subject to another.
A variable can also be understood as a characteristic or property of the object of study that is
observed, measured or counted.
By function, variables can be divided into:
1. Independent Variable. A variable whose change brings about a change in another variable.
2. Dependent Variable. A variable that is determined by, or depends on, other variables.
3. Concomitant Variable. A variable that is not the focus of the study but arises and affects
the variability of the dependent variable, and is neither affected by nor confounded with the
independent variable.
4. Confounding Variable. A variable that is associated with the independent variable and with
the dependent variable but is not an intermediate variable.
5. Intervening (Disturbing) Variable. A variable that is not the focus of the study but arises
in the study and affects the variability of the dependent variable and/or the independent
variable.
6. Control Variable. A variable that is not the focus of the study but affects the variability
of the dependent variable, and whose effect can be controlled, for example by grouping.

FUNCTION
Downloaded from: http://id.wikipedia.org/wiki/Fungsi .. 22/8/2012
A function is a group of activities of the same kind, classified by their
nature or by how they are carried out.

Function may refer to:
1. Diatonic function, a term in music theory
2. Function (biology), something that explains how natural selection occurs
3. Function (computer science), or subroutine, a piece of program code within a
larger program that performs a specific task
4. Function (engineering), relating to a part of a larger system
5. Function (language), in linguistics, a way of achieving a goal by using
language
6. Function (mathematics), an abstract entity that associates an input with an
output according to a fixed, well-defined rule
7. Function model, the functions, activities and processes gathered into a
particular structure
8. Function object, also functor or functionoid, a concept in object-oriented
programming.

Linear Function
A linear function, or first-degree function, is a function in which the highest
power of the variable is one. As the name suggests, every linear equation plots
as a straight line.
The general form of a linear equation is:
y = a + bx
where a is the intercept of the line on the vertical (y) axis and b is the
slope or gradient of the line.

Two straight lines are parallel when the slope of one equals the slope of the
other. Thus the line Y1 = a1 + b1 X is parallel to the line Y2 = a2 + b2 X
if b1 = b2.
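
As a minimal illustration of the intercept, slope and parallel-line condition described above, the sketch below builds two lines with the same slope. The helper names `line` and `are_parallel` and the coefficient values are hypothetical, chosen only for this example.

```python
def line(a, b):
    """Return the linear function y = a + b*x (a: intercept, b: slope)."""
    return lambda x: a + b * x


def are_parallel(b1, b2, tol=1e-12):
    """Two lines are parallel when their slopes are equal (within a tolerance)."""
    return abs(b1 - b2) <= tol


# Hypothetical example lines: Y1 = 2 + 0.5 X and Y2 = -1 + 0.5 X
y1 = line(2.0, 0.5)
y2 = line(-1.0, 0.5)

print(y1(0.0))                  # value at x = 0 is the intercept a1
print(are_parallel(0.5, 0.5))   # same slope, so the lines are parallel
# For parallel lines the vertical gap a1 - a2 is the same at every x.
print([y1(x) - y2(x) for x in (0.0, 1.0, 10.0)])
```

Because the slopes are equal, the two lines never meet; only the intercepts differ.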

Downloaded from: http://setyonugroho09.wordpress.com/2010/04/08/bab-2-fungsi-linier/. 30/8/2012
MATHEMATICAL FUNCTION
Downloaded from: http://id.wikipedia.org/wiki/Fungsi_%28matematika%29 .. 22/8/2012
In mathematics, a function is a mapping of every member of one set (called the domain) to a member
of another set (called the codomain).
This meaning differs from the everyday use of the same word, as in "the tool functions well".
The concept of a function is one of the basic concepts of mathematics and of every quantitative
science. The terms "function", "mapping", "map", "transformation" and "operator" are usually used
as synonyms.
The members of the sets being mapped can be anything (words, people or other objects), but the
discussion usually concerns mathematical quantities such as real numbers. An example of a function
whose domain and codomain are the set of real numbers is f(x) = 2x, which relates a real number to
another real number twice as large. In this case we can write f(5) = 10.

Drying-induced changes in phosphorus status of soils with contrasting soil organic
matter contents: Implications for laboratory approaches
David L. Achat, Laurent Augusto, Anne Gallet-Budynek, Mark R. Bakker.
Geoderma, Volumes 187-188, October 2012, Pages 41-48

Total organic P as a function of soil organic matter in moist and dried soil samples.
Non-linear regressions for all forest floor and surface and deep mineral soil samples.
The residual sum of squares was significantly smaller (P < 0.05) when the non-linear
regression was individually fitted to the data of each soil treatment than when it was
fitted to the grouped data.

WHAT IS REGRESSION ANALYSIS?
Downloaded from: http://www.scielosp.org/scielo.php?pid=S0034-89102005000300018&script=sci_abstract .. 22/8/2012
A statistical analysis that exploits the relationship between two or more
quantitative variables so that one variable can be predicted from the
others.

CORDEIRO, Ricardo; CLEMENTE, Ana Paula Grotti; DINIZ, Cíntia
Ségre and DIAS, Adriano.
OCCUPATIONAL NOISE AS A RISK FACTOR FOR WORK-RELATED INJURIES.
Rev. Saúde Pública [online]. 2005, vol.39, n.3, pp. 461-466. ISSN 0034-8910.

OBJECTIVE: To assess whether exposure to occupational noise is an important risk
factor for work-related injuries.

METHODS: A population-based case-control study was performed. Data collection
was carried out from May 16, 2002 to October 15, 2002 in the city of Botucatu,
southeast Brazil. Cases were defined as workers who had suffered typical
work-related injuries in the 90-day period prior to the study, and who were identified
through systematic random sampling of their households. Controls were non-injured
workers randomly sampled from the same population, matched on a 3:1 ratio according
to sex, age group and census tract.
A multiple logistic regression model was adjusted, where the independent
variable was exposure to occupational noise, controlled for covariates of interest.

RESULTS: A total of 94 cases and 282 controls were analyzed. An adjusted multiple
regression model showed that "work always exposed to high-level noise" and "work
sometimes exposed to high-level noise" were associated to a relative risk for work-
related injuries of about 5.0 (95% CI: 2.8-8.7; p<0.001) and 3.7 (95% CI: 1.8-7.4;
p=0.0003) respectively, when work not exposed to noise was taken as a reference,
controlled for several covariates.

CONCLUSIONS: Based on the study findings, investing in hearing conservation
programs, particularly those for controlling noise emission at its source, is justifiable
aiming at both hearing health maintenance and reduction of work-related injuries.
REGRESSION
Management problems often involve forecasting, estimation, approximation and similar tasks.
One method that can be used for these purposes is regression. The method is most appropriate
when the variable to be forecast is logically "dependent" on other ("independent") variables.
For example, there is a logical dependence between "sales" and "salesmen's travel costs".
When there is only one independent variable the method is called simple regression; when
there is more than one it is called multiple regression.

To apply regression, two questions must be considered: (i) whether there are other variables
that have a logical "prerequisite" relationship with the dependent variable, and (ii) whether
the form of that logical relationship is linear or non-linear. Answering the first question
requires a command of the theory underlying the problem at hand.

The prerequisite logical relationship may be a functional relationship or a cause-effect
relationship. The form of the relationship between two variables can be examined with a
scatter diagram that plots the data points.

CAUSAL RELATIONSHIP
A relationship between one variable and another or others such that a change in one
variable effects a change in the other variable. A cause-and-effect relationship is claimed
where the following conditions are satisfied: the two events occur at the same time and
in the same place; one event immediately precedes the other; the second event appears
unlikely to have happened without the first event having occurred. Many phenomena
exhibit close association, but they may not have a cause-and-effect relationship.

Read more: http://www.answers.com/topic/cause-and-effect-relationship#ixzz25438QyJe

CORRELATION
A general term used to describe the fact that two (or more) variables are related. Galton,
in 1869, was probably the first to use the term in this way (as 'co-relation'). Usually the
relation is not precise. For example, we would expect a tall person to weigh more than a
short person of the same build, but there will be exceptions.
Although the word 'correlation' is used loosely to describe the existence of some general
relationship, it has a more specific meaning in the context of linear relations between
variables.
Read more: http://www.answers.com/topic/correlation#ixzz25444u8Bj
REGRESSION ANALYSIS
Downloaded from: http://www.jonathansarwono.info/regresi/regresi.htm .. 22/8/2012
Definition
Regression analysis is used to measure the size of the effect of independent
variables on a dependent variable and to predict the dependent variable from
the independent variables. Gujarati (2006) defines regression analysis as the
study of the relationship between one variable, called the explained
variable, and one or more explanatory variables. The first is also called the
dependent variable and the second the independent variable.
If there is more than one independent variable, the analysis is called
multiple linear regression; "multiple" because the effects of several
independent variables are applied to the dependent variable.

The purposes of regression analysis are:
1. To estimate the mean and the value of the dependent variable based on the
values of the independent variable.
2. To test hypotheses about the nature of the dependence.
3. To forecast the mean value of the dependent variable based on values of the
independent variable outside the range of the sample.

Simple linear regression rests on assumptions that include the following:
1. The regression model must be linear in the parameters.
2. The independent variable is uncorrelated with the disturbance term (error).
3. The expected value of the disturbance term is 0, in symbols:
E(U | X) = 0.
4. The variance of each error term is constant.
5. There is no autocorrelation.
6. The regression model is correctly specified; there is no specification bias
in the model used in the empirical analysis.
7. If there is more than one independent variable, there is no exact linear
relationship among the independent (explanatory) variables.

REGRESSION ANALYSIS
Requirements for Using a Regression Model
The adequacy of a linear regression model is judged on the following:
1. The regression model is considered adequate if the significance value in
the ANOVA table is < 0.05.
2. The predictor used as the independent variable must be adequate. This is
the case when the Standard Error of Estimate < Standard Deviation (of the
dependent variable).
3. The regression coefficients must be significant. This is checked with the
t test. A coefficient is significant if |t calculated| > t table (the
critical value).
4. There must be no multicollinearity, meaning no very high correlation among
the independent variables. This requirement applies only to multiple
linear regression, with more than one independent variable.
5. There must be no autocorrelation. Autocorrelation is indicated when the
Durbin-Watson (DW) statistic is < 1 or > 3.
6. The goodness of fit of the regression model can be described by r²: the
larger the value, the better the model, and a value close to 1 indicates a
good fit. Among its properties, r² 1) is never negative and 2) has a
maximum of 1. An r² of 1 means a perfect fit, that is, all the variation in
Y is explained by the regression model. Conversely, if r² equals 0 there is
no linear relationship between X and Y.
7. There is a linear relationship between the independent variable (X) and the
dependent variable (Y).
8. The data must be normally distributed (strictly, this applies to the
residuals).
9. The data are on an interval or ratio scale.
10. The relationship between the two variables is one of dependence: one is
the independent variable (also called the predictor) and the other the
dependent variable (also called the response).
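
To make the r² and Standard Error of Estimate criteria above concrete, here is a minimal sketch of an ordinary least-squares fit of Y = a + bX in plain Python. The function name `simple_linear_regression` and the five data points are hypothetical; they are not taken from the source.

```python
from math import sqrt


def simple_linear_regression(x, y):
    """Least-squares fit of y = a + b*x, with r2 and standard error of estimate."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)      # total variation in Y
    b = sxy / sxx                                   # slope
    a = mean_y - b * mean_x                         # intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # residual SS
    return {
        "a": a,
        "b": b,
        "r2": 1 - sse / syy,           # share of Y variation explained
        "see": sqrt(sse / (n - 2)),    # standard error of estimate
        "sd_y": sqrt(syy / (n - 1)),   # sample standard deviation of Y
    }


# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_linear_regression(x, y)
print(fit)
print("SEE < SD of Y:", fit["see"] < fit["sd_y"])
```

For these made-up points r² is close to 1 and the Standard Error of Estimate is well below the standard deviation of Y, so criteria 2 and 6 would both be met.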
REGRESSION ANALYSIS
Downloaded from: http://www.ncbi.nlm.nih.gov/pubmed/22204918 .. 24/8/2012
LINEARITY
There are two kinds of linearity in regression analysis: linearity in the variables and
linearity in the parameters. In the first, the conditional mean of the dependent variable
is a linear function of the independent variable(s). In the second, the conditional mean
is a linear function of the parameters and may be non-linear in the variables.
Environ Res. 2012 Jan;112:199-203. Epub 2011 Dec 26.
AMBIENT LEVELS OF AIR POLLUTION INDUCE CLINICAL
WORSENING OF BLEPHARITIS.
Malerbi F.K., Martins L.C., Saldiva P.H., Braga A.L.

Even though air pollutants exposure is associated with changes in the ocular surface and
tear film, its relationship to the clinical course of blepharitis, a common eyelid disease,
had not yet been investigated. Our objective was to investigate the correlation between
air pollution and acute manifestations of blepharitis.

METHOD:
We recorded all cases of changes in the eyelids and ocular surface, and rated clinical
findings on a scale from zero (normal) to two (severe alterations). Daily values of carbon
monoxide, particulate matter smaller than 10 µm in diameter and nitrogen dioxide
concentrations and meteorological variables (temperature and relative humidity) in the
vicinity of the medical service were obtained.
Specific linear regression models for each outcome were constructed
including pollutants as independent variables (single pollutant
models). Temperature and humidity were included as confounding
variables.

Increases of 28.8 µg/m³ in the concentration of particulate matter and 1.1 ppm in the
concentration of CO were associated with increases in cases of blepharitis on the day of
exposure (5 cases, 95% CI: 1-10 and 6 cases, 95% CI: 1-12, respectively).
Exposure to usual air pollutants concentrations present in large cities affects, in a
consistent manner, the eyes of residents contributing to the increasing incidence of
diseases of the eyelid margin.
REGRESSION ANALYSIS
Hypothesis Testing
Hypothesis testing can be based on two things: the significance level or
probability (α) and the confidence level. The significance level most
commonly used is 0.05; typical values range from 0.01 to 0.1. The
significance level is the probability of committing a type I error, that is,
rejecting the hypothesis when it is in fact true. The confidence level is
usually 95%; it means that, under repeated sampling, 95% of the intervals
built from samples would contain the value for the population from which the
sample was drawn. A hypothesis test involves two hypotheses:
H0 (the null hypothesis) and H1 (the alternative hypothesis).

As an example, suppose the claim is that mean employee productivity equals
10 (μ = 10). The hypotheses then read:
H0: Mean employee productivity equals 10
H1: Mean employee productivity does not equal 10

The statistical hypotheses:
H0: μ = 10
H1: μ > 10 for a one-sided (one-tailed) test, or
H1: μ < 10
H1: μ ≠ 10 for a two-sided (two-tailed) test

Points to note in hypothesis testing:
Hypothesis tests use sample data.
A test has two possible outcomes: it is statistically significant if we
reject H0, and not statistically significant if we fail to reject H0.
When the t value is used, the larger its absolute value (the further from 0),
the more we tend to reject H0; conversely, the closer t is to 0, the more we
tend not to reject H0.
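
The two-tailed test of H0: μ = 10 described above can be sketched as follows. The productivity figures are invented for illustration, and `one_sample_t` is a hypothetical helper name. The critical value 2.262 is the standard two-tailed t value for α = 0.05 with 9 degrees of freedom, hard-coded here as an assumption rather than computed.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """t statistic for H0: population mean equals mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)   # stdev uses the n - 1 divisor
    return (mean(sample) - mu0) / standard_error


# Hypothetical productivity scores for 10 employees (illustration only)
productivity = [11, 12, 9, 13, 12, 10, 14, 11, 12, 13]

t = one_sample_t(productivity, 10)
T_CRIT = 2.262   # two-tailed critical t, alpha = 0.05, df = 9 (from a t table)
reject_h0 = abs(t) > T_CRIT

print(f"t = {t:.3f}, reject H0: {reject_h0}")
```

The decision rule matches the text: the further t lies from 0, the stronger the case for rejecting H0. With a different sample size, the critical value would have to be looked up again for the new degrees of freedom.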

REGRESSION ANALYSIS
Characteristics of a Good Model
According to Gujarati (2006), a model is good if it meets the following criteria:
1. Parsimony: A model can never capture reality perfectly; consequently some
abstraction or simplification is made when building the model.
2. High identifiability: With the data available, the estimated parameters must have
unique values; in other words, there is only one estimate for each parameter.
3. Goodness of fit: The aim of regression analysis is to explain as much as possible
of the variation in the dependent variable using the independent variables in the
model. A model is therefore considered good if its explanatory power, measured by
the adjusted r², is as high as possible.
4. Theoretical consistency: The model should be in line with theory. Measurement
without theory can produce misleading results.
5. Predictive power: The validity of a model is directly proportional to its
predictive ability. Therefore, choose a model whose theoretical predictions are
borne out by empirical experience.
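
The adjusted r² mentioned under goodness of fit penalises r² for the number of independent variables. The sketch below uses the standard formula 1 - (1 - r²)(n - 1)/(n - k - 1); the function name `adjusted_r2` and the example numbers are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted r2 for n observations and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than parameters (n > k + 1)")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Same raw r2 = 0.90 from n = 21 observations, with 2 versus 5 predictors
print(adjusted_r2(0.90, 21, 2))
print(adjusted_r2(0.90, 21, 5))
```

For the same raw r², the model with more predictors gets the lower adjusted value, which is how this measure also rewards parsimony.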

Canopy interactions of rainfall in an off-shore mangrove ecosystem dominated
by Rhizophora mangle (Belize)
Wolfgang Wanek, Julia Hofmann, and Ilka C. Feller.
Journal of Hydrology. Volume 345, Issues 1-2, 20 October 2007, Pages 70-79
Relationship between solute concentrations in Rhizophora mangle leaves and average
net throughfall (mmol event⁻¹ m⁻²) in a mangrove ecosystem, Carrie Bow Cays, Belize.
A curvilinear regression model was fitted to the ionic solutes excluding the DOC and
DON data. Data represent means ± 1 SE (n = 9-15 for foliar concentrations,
n = 19-58 for net throughfall).

Downloaded from: http://www.sciencedirect.com/science/article/pii/S0022169407004374 28/8/2012

REGRESSION ANALYSIS
Regression analysis differs from correlation analysis. Correlation analysis is used to
examine the relationship between two variables, whereas regression analysis is used to
examine the effect of the independent variable on the dependent variable and to predict
the value of the dependent variable from the independent variable.
In regression analysis the independent variable serves to explain (explanatory), while
the dependent variable is the one being explained. The data must be on an interval or
ratio scale, and the relationship between the two variables is one of dependence. Several
requirements must be met before regression analysis can be used.
A probability model for investigating the trend of structural deterioration of
wastewater pipelines
Rizwan Younis and Mark A. Knight.
Tunnelling and Underground Space Technology. Volume 25, Issue 6, December 2010, Pages
670-680.
Modeling flow chart.
Downloaded from:
http://www.sciencedirect.com/science/article/pii/S0886779810000982 28/8/2012
RELATIONSHIP BETWEEN TWO VARIABLES
Downloaded from: http://www.ehjournal.net/content/11/1/22/abstract .. 22/8/2012
The relationship between the two variables described above can be expressed mathematically
as follows:

1. Linear regression model: Y = a + bX
2. Non-linear regression models:

2.1. Quadratic : Y = a + bX + cX²
2.2. Exponential : Y = a·e^(cX) or Y = a·e^(-cX)
2.3. Asymptotic : Y = a - b·e^(-cX)
2.4. Logistic : Y = a / (1 + b·r^X).
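
The model forms listed above can be written as plain Python functions, which makes their behaviour easy to check numerically. This is only an illustrative sketch: the function names and the parameter values in the usage lines are hypothetical.

```python
from math import exp


def linear(x, a, b):
    """Y = a + bX"""
    return a + b * x


def quadratic(x, a, b, c):
    """Y = a + bX + cX^2"""
    return a + b * x + c * x ** 2


def exponential(x, a, c):
    """Y = a * e^(cX); pass a negative c for the decaying form."""
    return a * exp(c * x)


def asymptotic(x, a, b, c):
    """Y = a - b * e^(-cX); approaches a as X grows when c > 0."""
    return a - b * exp(-c * x)


def logistic(x, a, b, r):
    """Y = a / (1 + b * r^X); approaches a as X grows when 0 < r < 1."""
    return a / (1 + b * r ** x)


# Hypothetical parameter values, chosen only to show the shapes
for x in (0, 1, 5, 20):
    print(x, asymptotic(x, 10, 8, 0.5), logistic(x, 10, 9, 0.5))
```

With these values both the asymptotic and the logistic curve level off near a = 10, which is the behaviour that distinguishes them from the linear, quadratic and exponential forms.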
Spatiotemporal air pollution exposure assessment for a Canadian population-
based lung cancer case-control study
Perry Hystad, Paul A Demers, Kenneth C Johnson, Jeff Brook, Aaron van Donkelaar,
Lok Lamsal, Randall Martin and Michael Brauer
Environmental Health 2012, 11:22 doi:10.1186/1476-069X-11-22
Published: 4 April 2012.

Few epidemiological studies of air pollution have used residential histories to develop long-term
retrospective exposure estimates for multiple ambient air pollutants and vehicle and industrial
emissions. National spatial surfaces of ambient air pollution were compiled from recent satellite-
based estimates (for PM2.5 and NO2) and a chemical transport model (for O3). The surfaces were
adjusted with historical annual air pollution monitoring data, using either spatiotemporal
interpolation or linear regression. Model evaluation was conducted using an independent ten
percent subset of monitoring data per year. Proximity to major roads, incorporating a temporal
weighting factor based on Canadian mobile-source emission estimates, was used to estimate
exposure to vehicle emissions. A comprehensive inventory of geocoded industries was used to
estimate proximity to major and minor industrial emissions.
Calibration of the national PM2.5 surface using annual spatiotemporal interpolation predicted
historical PM2.5 measurement data best (R² = 0.51), while linear regression incorporating the
national surfaces, a time-trend and population density best predicted historical concentrations of
NO2 (R² = 0.38) and O3 (R² = 0.56). Applying the models to study participants' residential histories
between 1975 and 1994 resulted in mean PM2.5, NO2 and O3 exposures of 11.3 µg/m³ (SD = 2.6),
17.7 ppb (4.1), and 26.4 ppb (3.4) respectively. On average, individuals lived within 300 m of a
highway for 2.9 years (15% of exposure-years) and within 3 km of a major industrial emitter for 6.4
years (32% of exposure-years). Approximately 50% of individuals were classified into a different
PM2.5, NO2 and O3 exposure quintile when using study entry postal codes and spatial pollution
surfaces, in comparison to exposures derived from residential histories and spatiotemporal air
pollution models. Recall bias was also present for self-reported residential histories prior to 1975,
with cases recalling older residences more often than controls.
MULTIPLE REGRESSION
A regression model that involves more than one independent variable is called a multiple regression model.
A popular example is Multiple Linear Regression.
Important applications of this model are (i) building an equation with several independent
variables (Xi) that can be used to estimate the behaviour of the dependent variable (Y); and (ii)
identifying the independent variables (Xi) that are related to Y, ranking their importance,
and interpreting the relationships found.

The mathematical model is:

Y = a + b1X1 + b2X2 + ... + bnXn

where:
Y = dependent variable
X1 = first independent variable
X2 = second independent variable
Xn = n-th independent variable
a = intercept
b1, b2, ..., bn = regression coefficients.
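
One way to estimate a and the b coefficients of the model above is ordinary least squares through the normal equations. The sketch below does this in plain Python with a small Gaussian-elimination solver; it is an illustration under assumptions, not the source's procedure. The function names and the synthetic data (generated exactly from Y = 1 + 2·X1 + 3·X2) are hypothetical.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system (perfect multicollinearity?)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def multiple_regression(X, y):
    """Return [a, b1, ..., bn] minimising the sum of squared residuals."""
    rows = [[1.0] + list(r) for r in X]             # leading 1 for the intercept
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Synthetic data generated exactly from Y = 1 + 2*X1 + 3*X2 (illustration only)
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]
coef = multiple_regression(X, y)
print(coef)   # should recover values close to [1, 2, 3]
```

The solver raises an error when the independent variables are perfectly collinear, which is the extreme case of the multicollinearity problem. For real data a numerical library would normally be preferred over this hand-written solver.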
Simulating effects of management measures on the improvement of the
environmental performance of construction waste management
Gui Ye, Hongping Yuan, Liyin Shen, Hongxia Wang
Resources, Conservation and Recycling. Volume 62, May 2012, Pages 56-63
Causal loop diagram of the proposed model.
Downloaded from: http://www.sciencedirect.com/science/article/pii/S0921344912000122 . 29/8/2012
REGRESSION - RELATIONSHIPS BETWEEN VARIABLES
Downloaded from: staff.unud.ac.id/~sampurna/wp-content/.../analisis-regresi-korelasi.do..... 22/8/2012
Regression analysis studies the form of the relationship between one or more independent variables (X)
and one dependent variable (Y). In research, the independent variable (X) is usually a variable set
freely by the researcher, for example drug dose, storage time, preservative concentration, age of
livestock and so on.
The independent variable may also itself be a measured response. For example, when measuring the body
length and body weight of cattle, body length is easier to measure, so body length is taken as the
independent variable (X) and body weight as the dependent variable (Y).
The dependent variable (Y) is the response measured as a result of the treatment or independent
variable (X), for example the red blood cell count after treatment with a given dose, the microbial
count in meat after several days of storage, the weight of chickens at a given age, and so on.
Interactions between economic growth and environmental quality in Shenzhen,
China's first special economic zone
Xiaozi Liu, Gerhard K. Heilig, Junmiao Chen, Mikko Heino.
Ecological Economics. Volume 62, Issues 3-4, 15 May 2007, Pages 559-570

Causal loop diagram illustrating consumption-induced emissions (produced with Vensim 3.0).
Downloaded from: http://www.sciencedirect.com/science/article/pii/S0921800906003600 29/8/2012
POLYNOMIAL REGRESSION
The relationship between the independent variable (X) and the dependent variable (Y) may
take the form of a first-degree polynomial (linear), a second-degree polynomial (quadratic),
a third-degree polynomial (cubic) and so on. It may also take other forms, for example
exponential, logarithmic or sigmoid.
In regression-correlation analysis these forms are usually transformed so that they
become polynomial.
In its simplest form, with one independent variable (X) and one dependent variable (Y),
the equation is:
Y = a + bX
where a is the intercept and b is the slope.
Downloaded from: staff.unud.ac.id/~sampurna/wp-content/.../analisis-regresi-korelasi.do..... 22/8/2012
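
As one example of the transformation mentioned above, an exponential relationship Y = a·e^(cX) becomes linear after taking logarithms: ln Y = ln a + cX. The sketch below fits that straight line by least squares and converts the intercept back. The helper names and the noise-free synthetic data are hypothetical, and the approach assumes every Y value is positive.

```python
from math import exp, log


def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    return mean_y - b * mean_x, b


def fit_exponential(x, y):
    """Fit Y = a * e^(cX) by regressing ln(Y) on X; requires all Y > 0."""
    ln_a, c = fit_line(x, [log(yi) for yi in y])
    return exp(ln_a), c


# Synthetic, noise-free data from Y = 2 * e^(0.5 X) (illustration only)
x = [0, 1, 2, 3, 4]
y = [2.0 * exp(0.5 * xi) for xi in x]
a, c = fit_exponential(x, y)
print(a, c)   # should be close to 2.0 and 0.5
```

With noisy data the fit minimises squared error on the log scale, so the estimates can differ from those of a direct non-linear fit.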
Polynomial Regression Models to Characterize Environmental Conditions
Conducive for Leaf Rust Development on Winter Wheat in Mississippi
Muhammad Aslam Khan ; Larry Eugene Trevathan
Pakistan Journal of Biological Sciences. 1999 Volume: 2. Issue:1 pages :113-120

Environmental conditions conducive for leaf rust development were determined at
Starkville, MS, during the 1991-92 and 1992-93 wheat growing seasons. Four wheat
varieties, grown in a randomized complete block design and infected by natural
inoculum, were rated weekly for leaf rust severity.
The relationship of weekly maximum, minimum, and average air temperatures, dew
point, relative humidity, total rainfall, soil temperature, solar radiation and wind
movement to leaf rust severity was determined by polynomial regression. Leaf rust
severity for each of the varieties was different under differing environmental conditions.
In 1992, the relationship between leaf rust severity and weekly air and soil temperatures
and solar radiation was linear for most varieties. In 1992, significantly higher solar
radiation and soil temperature, lower rainfall and less wind movement contributed to
greater leaf rust severity compared to 1993. During two seasons neither quadratic nor
cubic regression models fit the data well for most of the environmental parameters.
During 1992 leaf rust development on all the four varieties in relation to weekly
maximum, minimum and average air temperature and soil temperature was best
explained by linear regression models. During 1993, the relationship of environmental
condition to leaf rust severity recorded only on Pioneer varieties was best explained by
linear regression models. The environmental conditions characterized for maximum leaf
rust severity on four varieties included maximum, minimum and average air temperatures
of 25-27, 15-20 and 21-23 °C respectively, and 85-90 percent relative humidity.

Downloaded from: http://www.doaj.org/doaj?func=abstract&id=590885 29/8/2012
The meaning of simple regression
Written by Riri Melati Thursday, 19 May 2011 14:06
Downloaded from: http://ilerning.com/index.php?option=com_content&view=article&id=248:regresi-sederhana-edit-mar&catid=39:hipotesis&Itemid=70 .. 22/8/2012
Regression is a tool that can also be used to measure whether or not variables are
correlated. When we have two or more variables, it is natural to want to study how
those variables are related or how one can be predicted from another.
Regression analysis is more informative than correlation analysis, because correlation
analysis has difficulty showing the slope (the rate of change of one variable with
respect to another), whereas regression analysis can determine it. Consequently,
regression analysis also predicts the value of the dependent variable from the value
of the independent variable more accurately.
The adequacy of a linear regression model is judged on the following criteria:

1. The regression model is considered adequate if the significance value in the ANOVA is < 0.05.
2. The predictor used as the independent variable must be adequate. This holds if the
Standard Error of Estimate < Standard Deviation.
3. The regression coefficients must be significant. This is tested with the t-test: a coefficient
is significant if the calculated t exceeds the tabulated t (critical value).
4. There must be no multicollinearity, meaning the independent variables must not be very strongly
correlated with one another (a correlation close to +1 or -1). This requirement applies only to
multiple linear regression, with more than one independent variable.
5. There must be no autocorrelation. The Durbin-Watson (DW) statistic ranges from 0 to 4 and is
close to 2 when there is no autocorrelation; a common rule of thumb flags autocorrelation when DW
is below 1 or above 3.
6. The goodness of fit of the regression model is described by r²: the closer the value is to 1,
the better the model. The properties of r² include: 1) it is never negative, 2) its maximum is 1.
An r² of 1 means a perfect fit, that is, all of the variation in Y is explained by the regression
model. Conversely, if r² equals 0, there is no linear relationship between X and Y.
7. There is a linear relationship between the independent variable (X) and the dependent variable (Y).
8. The data must be normally distributed (strictly, it is the residuals that should be
approximately normal).
9. The data are usually on an interval or ratio scale.
10. The two variables stand in a dependency relationship: one is the independent variable (also
called the predictor variable) and the other is the dependent variable (also called the response
variable).
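Several of the criteria above can be checked numerically. The following is a minimal sketch in Python, assuming a simple (one-predictor) regression fitted by least squares; the function name `fit_checks` and the data are hypothetical and only illustrate the calculations.

```python
import math


def fit_checks(x, y):
    """Fit y = a + b*x by least squares and return a few adequacy measures."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)

    b = sxy / sxx                # slope
    a = mean_y - b * mean_x      # intercept
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)

    r2 = 1.0 - sse / syy                    # criterion 6: goodness of fit
    see = math.sqrt(sse / (n - 2))          # criterion 2: standard error of estimate
    sd_y = math.sqrt(syy / (n - 1))         # criterion 2: standard deviation of Y
    t_slope = b / (see / math.sqrt(sxx))    # criterion 3: t statistic of the slope
    # criterion 5: Durbin-Watson statistic (values near 2 suggest no autocorrelation)
    dw = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n)) / sse
    return {"a": a, "b": b, "r2": r2, "see": see, "sd_y": sd_y,
            "t_slope": t_slope, "dw": dw}


# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
result = fit_checks(x, y)
print(result)
```

The calculated t would then be compared with a tabulated critical value (2.447 for a two-sided 5% test with n - 2 = 6 degrees of freedom), and the standard error of estimate with the standard deviation of Y.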

REGRESSION TESTS
Downloaded from: .. 22/8/2012
Regression is tested in two ways, namely:

t-test

The t-test is a test statistic frequently encountered in practical statistical problems.
It belongs to parametric statistics and is used in hypothesis testing. The t-test is used
when the population variance is unknown.

The t-test takes two forms: the t-test for one-sample hypotheses and the t-test for
two-sample hypotheses. For two samples, it is further divided according to the independence
of the samples into the t-test for independent samples and the t-test for paired samples.

For the t-test on two independent samples, one point needs attention: whether the
population variances (note: population variances, not sample variances) are assumed to be
homogeneous (equal). If they are assumed equal, the t-test with the homogeneous-variance
assumption is used; if the population variances of the two samples are not assumed equal,
the t-test with the heterogeneous-variance assumption is more appropriate.

The t-tests with homogeneous and heterogeneous variances use different formulas.
Therefore, before a t-test is applied to two samples, the assumption of homogeneous
population variances should first be tested with an F-test.
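The two variants of the independent two-sample t-test can be sketched as follows. This is an illustrative Python sketch using only the standard library; the function names and the data are hypothetical, the equal-variance version uses the pooled variance, and the unequal-variance version is the form commonly known as Welch's t-test. The variance ratio is the statistic an F-test of homogeneity would compare against an F table.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """t statistic and degrees of freedom assuming equal population variances."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """t statistic and approximate degrees of freedom without assuming equal variances."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def variance_ratio(a, b):
    """Larger sample variance divided by the smaller one (F statistic for homogeneity)."""
    va, vb = variance(a), variance(b)
    return max(va, vb) / min(va, vb)


# Hypothetical measurements for two independent groups
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.1, 4.5, 4.0, 4.8, 4.4]

t_pooled, df_pooled = pooled_t(group_a, group_b)
t_welch, df_welch = welch_t(group_a, group_b)
f_ratio = variance_ratio(group_a, group_b)
print(t_pooled, df_pooled, t_welch, df_welch, f_ratio)
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ; with unequal sizes and unequal variances the two versions can lead to different conclusions.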

Downloaded from: http://ilerning.com/index.php?option=com_content&view=article&id=248:regresi-sederhana-edit-mar&catid=39:hipotesis&Itemid=70 .. 22/8/2012
REGRESSION ANOVA
ANOVA is an extension of the independent t-test to cases with two or more experimental
groups.
ANOVA is commonly used to compare the means of two or more independent sample groups.
ANOVA test - One Way Analysis of Variance.
The assumptions are that subjects are randomly assigned to groups of size n and that the
group means are normally distributed with equal variance.
The sample sizes of the groups need not be equal, but large differences in group sample
size can affect the results of the variance comparison test.

The hypotheses are:
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$ (the means of all groups are equal)
$H_a: \mu_i \neq \mu_j$ for at least one pair $i \neq j$ (the means of two or more groups are not equal)

Downloaded from: http://ilerning.com/index.php?option=com_content&view=article&id=248:regresi-sederhana-edit-mar&catid=39:hipotesis&Itemid=70 .. 22/8/2012
Regression Analysis of Variance Table
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   Calculated F   F Table 5%   F Table 1%
Regression            11.85425         1                    11.85425      11.71022*      5.590        12.25
Residual              7.0861           7                    1.0123        .              .            .
Total                 18.94036         9                    .             .              .            .
* significant at the 5% level
Because the calculated F from the regression analysis of variance is significant, the
regression equation obtained (Y = 50.4166 + 0.4069X) is suitable as an estimating function
for predicting dry matter digestibility (Y) from feed density (X).
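The arithmetic behind the table can be retraced in a few lines. The sketch below, in Python, uses only the sums of squares, degrees of freedom, and tabulated F values given in the table; the variable names are chosen here for illustration.

```python
# Values taken from the regression analysis of variance table
ss_regression, df_regression = 11.85425, 1
ss_residual, df_residual = 7.0861, 7
f_table_5, f_table_1 = 5.590, 12.25

# Mean square = sum of squares / degrees of freedom
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# Calculated F = regression mean square / residual mean square
f_value = ms_regression / ms_residual

significant_5 = f_value > f_table_5   # significant at the 5% level?
significant_1 = f_value > f_table_1   # significant at the 1% level?
print(round(f_value, 3), significant_5, significant_1)
```

The calculated F of about 11.71 exceeds the 5% table value but not the 1% value, which is why the table marks it with a single asterisk.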

Downloaded from: http://kampungonline.com/?p=33 30/8/2012
DEFINITION OF LINEAR REGRESSION
Downloaded from: http://repository.usu.ac.id/bitstream/123456789/26987/4/Chapter%20II.pdf .. 22/8/2012
Regression is a statistical analysis tool that describes the pattern of the relationship (the
model) between two or more variables.
Regression analysis involves two kinds of variables:
1. The response variable (dependent variable), whose value is influenced by other variables;
it is denoted Y.
2. The predictor variable (independent variable), which is free (not influenced by other
variables); it is denoted X.

The relationship between variables can be studied in two forms:
1. Simple regression analysis
2. Multiple regression analysis

Simple regression analysis concerns the relationship between two variables: one independent
variable and one dependent variable.
Multiple regression analysis concerns the relationship among three or more variables: at least
two independent variables and one dependent variable.

The main purpose of regression is to estimate the value of one variable (the dependent variable)
when the values of the other variables related to it have been specified.
Meteorological modes of variability for fine particulate matter (PM2.5) air quality in
the United States: implications for PM2.5 sensitivity to climate change
A. P. K. Tai, L. J. Mickley, D. J. Jacob, E. M. Leibensperger, L. Zhang, J. A. Fisher, and H. O. T. Pye.
Atmos. Chem. Phys., 12, 3131-3145, 2012.
We applied a multiple linear regression model to understand the relationships of PM2.5 with
meteorological variables in the contiguous US and from there to infer the sensitivity of PM2.5 to
climate change. We used 2004-2008 PM2.5 observations from ~1000 sites (~200 sites for PM2.5
components) and compared to results from the GEOS-Chem chemical transport model (CTM). All data
were deseasonalized to focus on synoptic-scale correlations. We find strong positive correlations
of PM2.5 components with temperature in most of the US, except for nitrate in the Southeast where
the correlation is negative. Relative humidity (RH) is generally positively correlated with sulfate
and nitrate but negatively correlated with organic carbon. GEOS-Chem results indicate that most of
the correlations of PM2.5 with temperature and RH do not arise from direct dependence but from
covariation with synoptic transport. We applied principal component analysis and regression to
identify the dominant meteorological modes controlling PM2.5 variability, and show that 20-40% of
the observed PM2.5 day-to-day variability can be explained by a single dominant meteorological
mode: cold frontal passages in the eastern US and maritime inflow in the West.
Our results demonstrate the need for multiple GCM realizations (because of climate chaos) when
diagnosing the effect of climate change on PM2.5, and suggest that analysis of meteorological
modes of variability provides a computationally more affordable approach for this purpose than
coupled GCM-CTM studies.

Downloaded from: http://www.atmos-chem-phys.net/12/3131/2012/acp-12-3131-2012.html.. 29/8/2012
SIMPLE LINEAR REGRESSION ANALYSIS
Simple linear regression is used to obtain a mathematical relationship, in the form of an
equation, between a single dependent variable and a single independent variable.
It involves only one independent variable related to one dependent variable.
The general form of the linear regression equation for the population is: Y = A + BX

Y = dependent variable
X = independent variable
A = intercept parameter
B = regression coefficient parameter of the independent variable

The coefficients a and b, the sample estimates of A and B, can be determined by the method of
least squares, which chooses the coefficients so that the sum of the squared deviations
between the data points and the regression line is as small as possible.
Downloaded from: repository.usu.ac.id/bitstream/123456789/26987/.../Chapter%20II.pdf .. 22/8/2012
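The least squares criterion for the simple linear regression above can be written out as a short worked derivation. The observation index $j$, the sample size $n$, and the sample means $\bar{x}$ and $\bar{y}$ are notation introduced here for illustration; they do not appear in the source.

```latex
% Sum of squared deviations between the data points and the line y = a + b x
S(a, b) = \sum_{j=1}^{n} \left( y_j - a - b x_j \right)^2

% Setting both partial derivatives to zero gives the normal equations
\frac{\partial S}{\partial a} = -2 \sum_{j=1}^{n} \left( y_j - a - b x_j \right) = 0,
\qquad
\frac{\partial S}{\partial b} = -2 \sum_{j=1}^{n} x_j \left( y_j - a - b x_j \right) = 0

% Solving the normal equations for the two coefficients
b = \frac{\sum_{j=1}^{n} (x_j - \bar{x})(y_j - \bar{y})}{\sum_{j=1}^{n} (x_j - \bar{x})^2},
\qquad
a = \bar{y} - b \bar{x}
```

For example, the five points (0, 1), (1, 3), (2, 5), (3, 7), (4, 9) give $\bar{x} = 2$, $\bar{y} = 5$, $b = 20/10 = 2$ and $a = 5 - 2 \cdot 2 = 1$, so the fitted line is y = 1 + 2x.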
Childhood Air Pollutant Exposure and Carotid Artery Intima-Media Thickness in
Young Adults
Carrie V. Breton; Xinhui Wang; Wendy J. Mack; Kiros Berhane ; Milena Lopez; Talat S. Islam; Mei
Feng; Fred Lurmann; Rob McConnell; Howard N. Hodis; Nino Künzli; Ed Avol.

Exposure to ambient air pollutants increases risk for cardiovascular health outcomes in adults. The
contribution of childhood air pollutant exposure to cardiovascular health has not been thoroughly
evaluated.
Testing Responses on Youth study consists of 861 college students recruited from the University of
Southern California in 2007-2009. Participants attended one study visit during which blood pressure,
heart rate and carotid artery intima-media thickness (CIMT) were assessed. Self-administered
questionnaires collected information about health and socio-demographic characteristics and a 12-hr
fasting blood sample was drawn for lipid and biomarker analyses. Residential addresses were geocoded
and used to assign cumulative air pollutant exposure estimates based on data derived from the U.S.
Environmental Protection Agency's Air Quality System (AQS) database. The associations between
CIMT and air pollutants were assessed using linear regression analysis. Mean CIMT was 603 μm
(± 54 SD). A 2 standard deviation (SD) increase in childhood (aged 0-5 years) or elementary school
(aged 6-12) O3 exposure was associated with a 7.8 μm (95% CI -0.3, 15.9) or 10.1 μm (95% CI 1.8,
18.5) higher CIMT, respectively. Lifetime exposure to O3 showed similar but non-significant
associations. No associations were observed for PM2.5, PM10 or NO2, although adjustment for these
pollutants strengthened the childhood O3 associations.
Childhood exposure to O3 may be a novel risk factor for CIMT in a healthy population of college
students. Regulation of air pollutants and efforts that focus on limiting childhood exposures continue to
be important public health goals.
MULTIPLE LINEAR REGRESSION
Downloaded from: repository.usu.ac.id/bitstream/123456789/26987/.../Chapter%20II.pdf .. 22/8/2012
Multiple linear regression is a regression analysis that describes the relationship between a
response variable (dependent variable) and more than one influencing predictor (independent
variable).
It is almost the same as simple linear regression, except that it has more than one independent
(predictor) variable. The purpose of multiple linear regression analysis is to measure the
strength of the relationship between two or more variables and to predict the value of Y from
the values of X.
In general, the multiple linear regression model for the population is:

Y = B0 + B1X1 + B2X2 + ... + BiXi + e

where the B are the coefficients (parameters) of the model and e is the error term.

The population multiple linear regression model above can be estimated from a random sample
of size n.
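Estimating the model from a sample can be sketched as follows. This is a minimal illustration in Python using only the standard library: it builds the least squares normal equations and solves them by Gaussian elimination. The function names and the data are hypothetical, and in practice a numerical library would normally be used instead.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(xs, y):
    """xs: rows [x1, x2, ...] per observation; returns estimates [b0, b1, b2, ...]."""
    design = [[1.0] + list(row) for row in xs]  # leading 1 for the intercept
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical sample generated without error from Y = 1 + 2*X1 + 3*X2
xs = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]
coeffs = fit_multiple_regression(xs, y)
print(coeffs)  # estimates of b0, b1, b2
```

Because the hypothetical data contain no error term, the estimates recover the generating coefficients up to rounding; with real data the residual e would be non-zero.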
Effects of Ionic Strength, Temperature, and pH on Degradation of Selected
Antibiotics
Keith A. Loftin, Craig D. Adams, Michael T. Meyer and Rao Surampalli.
JEQ Vol. 37 No. 2, p. 378-386. Received: May 7, 2007. Published: Mar, 2008

Aqueous degradation rates, which include hydrolysis and epimerization, for chlortetracycline (CTC),
oxytetracycline (OTC), tetracycline (TET), lincomycin (LNC), sulfachlorpyridazine (SCP),
sulfadimethoxine (SDM), sulfathiazole (STZ), trimethoprim (TRM), and tylosin A (TYL) were studied
as a function of ionic strength (0.0015, 0.050, or 0.084 mg/L as Na2HPO4), temperature (7, 22, and
35°C), and pH (2, 5, 7, 9, and 11).
Multiple linear regression revealed that ionic strength did not significantly affect (α = 0.05)
degradation rates for all compounds, but temperature and pH affected rates for CTC, OTC, and TET
significantly (α = 0.05). Degradation also was observed for TYL at pH 2 and 11. No significant
degradation was observed for LNC, SCP, SDM, STZ, TRM, and TYL (pH 5, 7, and 9) under study
conditions. Pseudo first-order rate constants, half-lives, and Arrhenius coefficients were calculated
where appropriate. In general, hydrolysis rates for CTC, OTC, and TET increased as pH and
temperature increased following Arrhenius relationships. Known degradation products were used to
confirm that degradation had occurred, but these products were not quantified. Half-lives ranged from
less than 6 h up to 9.7 wk for the tetracyclines and for TYL (pH 2 and 11), but no degradation of LNC,
the sulfonamides, or TRM was observed during the study period. These results indicate that
tetracyclines and TYL at pH 2 and 11 are prone to pH-mediated transformation and hydrolysis in some
cases, whereas the sulfonamides, LNC, and TRM are not inclined to degrade under study conditions. This
indicates that with the exception of CTC, OTC, and TET, pH-mediated reactions such as hydrolysis
and epimerization are not likely removal mechanisms in surface water, anaerobic swine lagoons,
wastewater, and ground water.
General Format of Observational Data
Quantification of Greenhouse Gas Emissions from Windrow Composting of Garden
Waste
Jacob K. Andersen, Alessio Boldrin, Jerker Samuelsson, Thomas H. Christensen and Charlotte
Scheutz.
JEQ. Vol. 39 No. 2, p. 713-724 Received: Aug 26, 2009. Published: Mar, 2010
Example of a flux measurement using the flux chamber method on material that is 114
d of age. The full lines represent linear regression and the equation for the linear
regression fit and the R² values are shown next to the time series of each gas.
Downloaded from: https://www.agronomy.org/publications/jeq/articles/39/2/713 .. 29/8/2012
Forming the Multiple Linear Regression Equation
In multiple linear regression the dependent variable depends on two or more
independent variables.
The multiple linear regression equation covering two or more independent variables can be
written as: $Y = b_0 + \sum_i b_i X_i + e$

where: i = 1, 2, ... indexes the independent variables
n = sample size
e = error term (residual)
Data-driven prediction model of indoor air quality by the preprocessed recurrent
neural networks
ICCAS-SICE, 2009. Date of Conference: 18-21 Aug. 2009
MinHan Kim ; YongSu Kim ; SuWhan Sung ; ChangKyoo Yoo
Page(s): 1688 - 1692

In this study, data-driven prediction methods based on recurrent neural networks (RNN)
for indoor air quality in a subway station are developed. The RNN can predict the air
pollutant concentrations of PM10 and PM2.5 at a subway station platform by using
the previous day's values of NO, NO2, NOx, CO, CO2, temperature, humidity, PM10,
and PM2.5. For comparison, other prediction models such as neural
networks (NN) and a multiple regression model are used. To optimize the prediction
model, the variable importance in the projection (VIP) of the PLS is used to select key
input variables as a preprocessing step.
Experimental results show that the selected key variables have a positive influence on the
prediction performance.
The RNN model gives better modeling performance and higher
interpretability than the other data-driven prediction modeling methods.

Downloaded from: http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=5335014&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D5335014 . 29/8/2012
COEFFICIENT OF DETERMINATION
The coefficient of determination, denoted R², is used in testing multiple linear regression
involving more than two variables. It gives the proportion of the total variation in the
dependent variable that is explained jointly by the independent variables in the multiple linear
regression model.
The R² value obtained reflects the variation explained by each of the variables remaining in the
regression, so the explained variance of the estimator is attributable only to the variables
that have a significant effect.

In essence, the coefficient of determination (R²) measures how well the model explains the
variation in the dependent variable. Its value lies between zero and one (0 ≤ R² ≤ 1).
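As a worked illustration, the coefficient of determination can be written in terms of sums of squares. The symbols $SS_{reg}$, $SS_{res}$ and $SS_{total}$ (regression, residual and total sum of squares) are notation assumed here; the numbers are the sums of squares from the regression analysis of variance table given earlier in this compendium.

```latex
% Proportion of the total variation explained by the regression
R^2 = \frac{SS_{reg}}{SS_{total}} = 1 - \frac{SS_{res}}{SS_{total}}

% With SS_reg = 11.85425 and SS_total = 18.94036 from the earlier ANOVA table
R^2 = \frac{11.85425}{18.94036} \approx 0.626
```

Read this way, roughly 63% of the variation in the dependent variable of that example is explained by the regression and the remaining 37% is left in the residual.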

Reading the Results of Regression and Correlation Analysis
Posted on June 13, 2012 by adhistyafdj
Regression analysis shows the pattern of influence of the independent variable on the dependent
variable. Correlation analysis shows how strong that influence is. For example: the independent
variable (Xi) is storage time over 4 days (day 0 through day 4, giving 5 points), and the
dependent variable (Y) is the hardness characteristic.
For the fitted linear regression equation, a coefficient of determination (KD) of 86.3% means that
86.3% of the variation in hardness is explained by storage time, while the remaining 13.7%
(100% - 86.3%) is due to factors other than the independent variable.

Downloaded from: http://adhistyafdj.wordpress.com/2012/06/13/pembacaan-hasil-analisis-regresi-dan-korelasi/ 30/8/2012
Multiple Linear Regression Test

A multiple linear regression test is needed to determine whether a group of independent
variables jointly influences the dependent variable.
In essence, the hypothesis about the regression coefficients as a whole (that is, the
regression equation itself) is tested with the F statistic, formulated as follows:

$F = \dfrac{SS_{reg} / k}{SS_{res} / (n - k - 1)}$

where:
1. F is the statistic, which follows the F distribution with $k$ and $n - k - 1$ degrees of
freedom ($k$ is the number of independent variables and $n$ the sample size)
2. $SS_{reg}$ is the regression sum of squares, with $k$ degrees of freedom
3. $SS_{res}$ is the residual sum of squares, with $n - k - 1$ degrees of freedom

Testing the regression equation, in particular testing the hypothesis about the regression
coefficients as a whole, involves the intercept and the explanatory variables as follows:

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$

The estimating equation is:

Y = b0 + b1X1 + b2X2 + ... + bkXk + e

where b is the estimator of the parameter β.
The steps in testing this hypothesis are:
a. Formulate the hypotheses ($H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$, X does not
influence Y; $H_1$: at least one regression coefficient is not zero, X influences Y).
b. Choose the desired significance level α and find the F table value with k and n - k - 1
degrees of freedom.
c. Set the test criteria:
H0 is accepted if calculated F ≤ F table.
H0 is rejected if calculated F > F table.
d. Compute the F statistic.
e. Conclude whether H0 is accepted or rejected.

Downloaded from: repository.usu.ac.id/bitstream/123456789/26987/.../Chapter%20II.pdf ..
22/8/2012
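The testing steps can be sketched in Python as follows. The sketch assumes the standard overall F statistic (regression mean square divided by residual mean square, with k and n - k - 1 degrees of freedom) and fitted values that come from a least squares fit with an intercept; the function name and the data are hypothetical.

```python
def overall_f_test(y, y_hat, k):
    """Return (F, df_regression, df_residual) for observed y and fitted y_hat.

    k is the number of independent variables in the fitted model.
    """
    n = len(y)
    mean_y = sum(y) / n
    ss_reg = sum((f - mean_y) ** 2 for f in y_hat)           # regression sum of squares
    ss_res = sum((yi - f) ** 2 for yi, f in zip(y, y_hat))   # residual sum of squares
    df_reg, df_res = k, n - k - 1
    f_stat = (ss_reg / df_reg) / (ss_res / df_res)
    return f_stat, df_reg, df_res


# Hypothetical example with one predictor: x = 1..5, least squares line y = 2.2 + 0.6x
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

f_stat, df_reg, df_res = overall_f_test(y, y_hat, k=1)
f_table = 10.13                      # tabulated F at the 5% level for (1, 3) degrees of freedom
reject_h0 = f_stat > f_table         # step c: reject H0 only if calculated F > F table
print(f_stat, df_reg, df_res, reject_h0)
```

In this small example the calculated F does not exceed the table value, so H0 would not be rejected at the 5% level.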
DEFINITION OF REGRESSION
Downloaded from: http://suhartoumm.blogspot.com/2009/01/pengertian-regresi.html ..
22/8/2012
Sir Francis Galton (1822-1911) introduced a model for forecasting, estimation, or
prediction, later named regression, in connection with his research on human height.
The study compared the heights of sons with the heights of their fathers.
Regression analysis is used to determine the form of the relationship between variables.

The purpose of regression analysis is to forecast or estimate the value of one variable
in relation to another; the form of this relationship is given by the equation of the
regression line.

Finding the best linear fit between two paired variables is very useful in many geoscience
applications. For example, one might want to estimate the increase in global temperature per
decade by performing a linear regression of the global mean temperature on time. As another
example, by plotting rock permeability versus density for a particular formation we can get a
visual feeling for the possible relationship between these two variables. Performing a least squares
linear regression of density on porosity provides an objective method to quantify the linear
relationship between these measurements. Often using one's subjective judgment to draw a "best
fit" line through the data can also serve as a useful first estimate in the field.

Downloaded from: http://serc.carleton.edu/introgeo/teachingwdata/StatRegression.html.
30/8/2012
LINEAR REGRESSION

The basic idea of any least squares fit, whether it is a linear least squares fit or a
polynomial fit, is to find the curve which minimizes the sum of the squared vertical
distances between all data points and the fitted curve.

REGRESSION THEORY
Downloaded from: http://industri06.wordpress.com/2009/03/18/teori-regresi-dan-korelasi/ .. 22/8/2012
Many statistical analyses aim to determine whether a relationship exists between two or more
variables. If such a relationship can be expressed as a mathematical formula, we can use it
for forecasting. Forecasting problems can be handled by applying a regression equation.
The term regression originates from measurements made by Sir Francis Galton, who compared the
heights of sons with the heights of their fathers.
Galton showed that, over several generations, the heights of sons of tall fathers tended to
regress toward the population mean.

Today the term regression is applied to all kinds of forecasting and need not imply a
regression toward the population mean.
REGRESSION ANALYSIS

Regression analysis is widely used for prediction and forecasting, where its use has substantial
overlap with the field of machine learning.
Regression analysis is also used to understand which among the independent variables are related to
the dependent variable, and to explore the forms of these relationships. In restricted circumstances,
regression analysis can be used to infer causal relationships between the independent and dependent
variables. However, this can lead to illusions or false relationships, so caution is advisable: see
correlation does not imply causation.

Regression models involve the following variables:
The unknown parameters, denoted as β, which may represent a scalar or a vector.
The independent variables, X.
The dependent variable, Y.
In various fields of application, different terminologies are used in place of dependent and
independent variables.
A regression model relates Y to a function of X and β.

Downloaded from: http://en.wikipedia.org/wiki/Regression_analysis..... 30/8/2012
DEFINITION OF REGRESSION

When data consist of two or more variables, it is natural to study how those variables
are related to and influence one another. The relationship found is usually expressed
as a mathematical equation stating the functional relationship between the variables.
The study of this problem is known as regression analysis.

Regression analysis aims, first, to estimate a relationship between economic variables,
for example Y = f(X). Second, it aims to forecast or predict the value of the dependent
variable from the values of the independent variable.

Deciding which variable is independent and which is dependent is in some cases not
easy. Careful study, thorough discussion (with experts), various considerations, the
nature of the problem at hand, and experience will help in determining the two variables.

To determine the equation relating the variables, the steps are as follows:
1. Collect data on the variables needed, for example X as the independent variable
and Y as the dependent variable.
2. Plot the paired points (x, y) in a plane coordinate system. The resulting plot is
called a scatter diagram, from which a smooth curve that fits the data can be
visualized. The scatter diagram helps show whether there is a useful relationship
between the two variables and helps determine the type of equation that describes
the relationship between them.
3. Determine the equation of the regression line by finding the values of the
regression coefficients and the correlation coefficient.

Downloaded from: http://industri06.wordpress.com/2009/03/18/teori-regresi-dan-korelasi/ ..
22/8/2012
CORRELATION THEORY
Downloaded from: http://en.wikipedia.org/wiki/Correlation_and_dependence .. 30/8/2012
Definition of Correlation

Correlation is an analysis technique that examines the tendency of the pattern in one variable
in light of the tendency of the pattern in another. That is, when one variable tends to rise, we
look at whether the other variable also tends to rise, or falls, or behaves erratically. If a
tendency in one variable is consistently accompanied by a tendency in the other, we can say that
the two variables are related, or correlated.
When observational data consist of many variables, the question is how strong the relationships
among those variables are.
The degree of closeness of the relationship between variables needs to be determined. The study
of the degree of relationship between variables is known as CORRELATION.
The measure used to express the degree of relationship, especially for quantitative data, is
called the correlation coefficient.
Downloaded from: http://industri06.wordpress.com/2009/03/18/teori-regresi-dan-korelasi/ .. 22/8/2012
CORRELATION AND DEPENDENCE

In statistics, dependence refers to any statistical relationship between two random variables or
two sets of data.
Correlation refers to any of a broad class of statistical relationships involving dependence.

Familiar examples of dependent phenomena include the correlation between the physical
statures of parents and their offspring, and the correlation between the demand for a product and
its price. Correlations are useful because they can indicate a predictive relationship that can be
exploited in practice. For example, an electrical utility may produce less power on a mild day
based on the correlation between electricity demand and weather. In this example there is a
causal relationship, because extreme weather causes people to use more electricity for heating
or cooling; however, statistical dependence is not sufficient to demonstrate the presence of such
a causal relationship.

Formally, dependence refers to any situation in which random variables do not satisfy a
mathematical condition of probabilistic independence. In loose usage, correlation can refer to
any departure of two or more random variables from independence, but technically it refers to
any of several more specialized types of relationship between mean values.

There are several correlation coefficients, often denoted ρ or r, measuring the degree of
correlation. The most common of these is the Pearson correlation coefficient, which is sensitive
only to a linear relationship between two variables (which may exist even if one is a nonlinear
function of the other). Other correlation coefficients have been developed to be more robust
than the Pearson correlation, that is, more sensitive to nonlinear relationships.

TYPES OF CORRELATION
Correlation, which expresses the degree of relationship between independent and dependent
variables, can be distinguished by the number of independent variables that influence the
value of the dependent variable.

a. Linear Correlation
The number used to describe the degree of this relationship is called the correlation
coefficient, written rxy. The technique most often used to calculate it is the Pearson
Product Moment Correlation. The technique is not in fact restricted to variables
measured on an interval scale, but the results must then be interpreted with care.

The main concepts of product moment correlation are:
1. If an increase in one variable is accompanied by an increase in the other variable,
the two variables are positively correlated.
2. If the increase in one variable equals or nearly equals the increase in the other
variable, measured in SD units, the correlation between the two variables approaches 1.
3. If an increase in one variable is accompanied by a decrease in the other variable,
the two variables are negatively correlated.
4. If the increase in one variable equals or nearly equals the decrease in the other
variable, measured in SD units, the correlation between the two variables approaches -1.
5. If an increase in one variable is accompanied by random increases and decreases in
the other variable, or is not accompanied by any increase or decrease in the other
variable (the other variable stays constant), the two variables are uncorrelated or
have a correlation close to zero.

The correlation coefficient between two variables measures linear association, so r = 0
implies that there is no linear relationship, not that the variables are necessarily
unrelated. The most widely used measure of linear correlation between two variables is the
Pearson product-moment correlation coefficient, or simply the correlation coefficient.
Downloaded from: http://industri06.wordpress.com/2009/03/18/teori-regresi-dan-korelasi/ ..
22/8/2012
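The behaviour of the product-moment coefficient described above can be illustrated with a short Python sketch. The function name `pearson_r` and the three small data sets are hypothetical; they are chosen to show a correlation of +1, a correlation of -1, and a case where r = 0 although the variables are exactly related by a nonlinear rule.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


x = [-2, -1, 0, 1, 2]
r_positive = pearson_r(x, [2 * xi + 1 for xi in x])    # y rises in step with x
r_negative = pearson_r(x, [-3 * xi + 4 for xi in x])   # y falls in step with x
r_parabola = pearson_r(x, [xi ** 2 for xi in x])       # y = x^2: related, but not linearly
print(r_positive, r_negative, r_parabola)
```

The third case is the reason r = 0 should be read as "no linear relationship" and not as "no relationship".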
LOGISTIC REGRESSION
Logistic regression is a multivariate analysis used to predict a dependent variable from
independent variables.

Downloaded from: http://teorionline.wordpress.com/2011/05/15/logistic-regression-chapter-1/#more-966 .. 22/8/2012
Data
In logistic regression the dependent variable is categorical. When the dependent variable has
two categories (dichotomous), binary logistic regression is used; when it has more than two
categories, multinomial logistic regression is used. When the dependent variable is a ranking,
ordinal logistic regression is used.

The Concept of Logistic Regression
Logistic regression is an alternative test when the assumption of a multivariate normal
distribution for the independent variables cannot be met for discriminant analysis. The
assumption fails when the independent variables are a mixture of continuous (metric) and
categorical (non-metric) variables. For example, the probability that a person suffers a heart
attack at a given time can be predicted from age, smoking habits, sex, and other information.

Bioassay Analysis with the Five Parameter Logistic (5-PL) Non-Linear Regression
Curve-Fitting Model
Posted by Allen Liu under MasterPlex QT, MasterPlex ReaderFit
The 5-PL or 5 Parameter Logistic is a nonlinear regression model used for prediction of the probability of
occurrence of an event by fitting data to a logistic curve. It differs from the 4-PL or 4 Parameter Logistic
model in that it is an asymmetric function which is a better fit for immunoassay or bioassay data. As the
name suggests, there are 5 parameters in the 5-PL model equation:
F(x) = D + (A - D)/(1 + (X/C)^B)^E

A is the MFI/RLU value for the minimum asymptote
B is the Hill slope
C is the concentration at the inflection point
D is the MFI/RLU value for the maximum asymptote
E is the asymmetry factor
The 5-PL model equation has the extra E parameter which the 4-PL
model lacks and when E = 1 the 5-PL equation is identical to the 4-PL
equation.
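
The relationship between the two curves can be checked numerically. The sketch below assumes the common parameterization F(x) = D + (A - D)/(1 + (x/C)^B)^E, in which A is approached as x goes to zero and D as x grows large (for B > 0); the parameter values are hypothetical and chosen only for illustration.

```python
def five_pl(x, a, b, c, d, e):
    """Five-parameter logistic curve: D + (A - D) / (1 + (x/C)^B)^E."""
    return d + (a - d) / (1.0 + (x / c) ** b) ** e


def four_pl(x, a, b, c, d):
    """Four-parameter logistic curve: the 5-PL curve with E fixed at 1."""
    return d + (a - d) / (1.0 + (x / c) ** b)


# Hypothetical parameters (not taken from the source):
# A = minimum asymptote, B = Hill slope, C = concentration scale,
# D = maximum asymptote.
A, B, C, D = 50.0, 1.2, 100.0, 20000.0

for conc in (1.0, 100.0, 10000.0):
    symmetric = five_pl(conc, A, B, C, D, 1.0)    # identical to the 4-PL value
    asymmetric = five_pl(conc, A, B, C, D, 0.7)   # E != 1 makes the curve asymmetric
    print(conc, four_pl(conc, A, B, C, D), symmetric, asymmetric)
```

With E = 1 the two functions return the same value at every concentration, and every returned value stays between A and D, which is why readings outside that range cannot be mapped back to a concentration.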

Parameters A (minimum asymptote) and D (maximum asymptote)
are the limits of where you can interpolate or extrapolate your
data. Any MFI/RLU values > D and MFI/RLU values < A
simply cannot be calculated because they are out of the function
range. (http://www.miraibio.com/blog/2009/02/5-pl-logistic-regression/)
LOGISTIC REGRESSION
Downloaded from: http://statistik4life.blogspot.com/2009/12/regresi-logistik.html .. 22/8/2012
Logistic regression is the branch of regression analysis used when the dependent (response) variable is dichotomous. A dichotomous variable takes only two values, usually coded 0 and 1, which represent the absence or occurrence of an event.

Unlike ordinary linear regression, logistic regression does not assume a linear relationship between the independent variables and the dependent variable. It is a nonlinear regression in which the fitted model follows an S-shaped (sigmoid) curve.
The model used in logistic regression is:

$$\ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$$

where p is the probability that Y = 1, X1, X2, ..., Xk are the independent variables, and the β are the regression coefficients.

Logistic regression forms a linear predictor, log(p/(1-p)), which is a linear combination of the independent variables. The value of this linear predictor is then transformed into a probability by the logistic function, which is the inverse of the logit.

Logistic regression also yields an odds ratio for each predictor. The odds of an event are the probability that the event occurs divided by the probability that it does not occur. An odds ratio is one set of odds divided by another. The odds ratio for a predictor is the relative amount by which the odds of the outcome increase (odds ratio > 1) or decrease (odds ratio < 1) when the value of that predictor increases by one unit.
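
The link between a coefficient and its odds ratio can be illustrated with a small sketch. The coefficients b0 and b1 below are hypothetical, not estimates from any data set in this compendium; the point is only that raising a predictor by one unit multiplies the odds by exp(b1), whatever the starting value.

```python
import math


def logistic(z):
    """Inverse logit: maps a linear predictor z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Odds of an event with probability p."""
    return p / (1.0 - p)


# Hypothetical intercept and slope, for illustration only.
b0, b1 = -2.0, 0.5

p_low = logistic(b0 + b1 * 3.0)    # predictor value x = 3
p_high = logistic(b0 + b1 * 4.0)   # predictor value x = 4 (one unit higher)

odds_ratio = odds(p_high) / odds(p_low)
print(odds_ratio, math.exp(b1))    # the two numbers agree up to rounding
```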

The following illustration makes this concrete.

Suppose we want to study purchases of a particular cosmetics brand by a group of women, using as explanatory variables age, income level (low, medium, high), and marital status (M for married, S for single). In these data, purchase is the response variable, coded 1 for a purchase and 0 for no purchase.

LOGISTIC REGRESSION
Downloaded from: http://id.wikipedia.org/wiki/Regresi_logistik .. 22/8/2012
Logistic regression (sometimes called the logistic model or logit model) is used in statistics to predict the probability of an event by fitting data to the logit function of the logistic curve. The method is a generalized linear model used for binomial regression. Like regression analysis in general, it uses several predictor variables, which may be numeric or categorical. For example, the probability that a person suffers a heart attack within a given period can be predicted from age, sex, and body mass index. Logistic regression is also widely used in medicine and the social sciences, and in marketing, for instance to predict a customer's propensity to buy a product or cancel a subscription.
Figure: the logistic function, with z on the horizontal axis and f(z) on the vertical axis.
GLOBAL LOGISTIC REGRESSION
Downloaded from: http://www.cs.cmu.edu/~kdeng/thesis/logistic.pdf .. 22/8/2012
Locally weighted logistic regression can be used to approximate P(yq | Sp, xq). Let's begin with a very simple case with boolean output.
The logistic function, also referred to as the sigmoid function, can be employed here. It is a monotonic, continuous function taking values between 0 and 1. Mathematically, it is defined as:

$$f(z) = \frac{1}{1 + e^{-z}}$$

Figure: (Global) logistic regression for classification.
The efficiency of logistic regression compared to normal discriminant
analysis under class-conditional classification noise
Bi, Yingtao and Jeske, Daniel R.
Journal of Multivariate Analysis.Volume: 101 (2010) . Issue: 7 (August)
Pages: 1622-1637
Downloaded from: http://ideas.repec.org/a/eee/jmvana/v101y2010i7p1622-1637.html .. 22/8/2012
In many real world classification problems, class-conditional classification noise (CCC-Noise)
frequently deteriorates the performance of a classifier that is naively built by
ignoring it. In this paper, we investigate the impact of CCC-Noise on the quality of a
popular generative classifier, normal discriminant analysis (NDA), and its corresponding
discriminative classifier, logistic regression (LR). We consider the problem of two
multivariate normal populations having a common covariance matrix. We compare the
asymptotic distribution of the misclassification error rate of these two classifiers under
CCC-Noise. We show that when the noise level is low, the asymptotic error rates of both
procedures are only slightly affected. We also show that LR is less deteriorated by CCC-
Noise compared to NDA. Under CCC-Noise contexts, the Mahalanobis distance between
the populations plays a vital role in determining the relative performance of these two
procedures. In particular, when this distance is small, LR tends to be more tolerable to
CCC-Noise compared to NDA.
Logistic Regression
Paul Gustafson
Encyclopedia of Environmetrics
Published Online: 15 SEP 2006. DOI: 10.1002/9780470057339.val016. Copyright
2002 John Wiley & Sons, Ltd

Logistic regression is by far the most common approach to modeling the
relationship between some explanatory variables and a binary response
variable. It is applicable when the response variable for each study unit can
be viewed either as the success or failure of a single trial, or as the number
of successes in some fixed number of independent trials.

Downloaded from: http://onlinelibrary.wiley.com/doi/10.1002/9780470057339.val016/abstract?deniedAccessCustomisedMessage=&userIsAuthenticated=false .. 26/8/2012
Assumptions of Logistic Regression
Downloaded from: http://teorionline.wordpress.com/2011/05/15/logistic-regression-chapter-1/#more-966 .. 22/8/2012
Logistic regression does not require a linear relationship between the independent variables and the dependent variable.
Logistic regression can accommodate nonlinear relationships because it applies a nonlinear log transformation to predict the odds ratio. In logistic regression the odds are often expressed as a probability, for example the odds that a company goes bankrupt or succeeds, or the odds that a student passes or fails the National Examination.
The independent variables need not satisfy multivariate normality.
Homoscedasticity is not assumed.
The independent variables need not be converted to metric form (interval or ratio scale).

A discrete time logistic regression model for analyzing censored
survival data
A. Maul
Environmetrics. Volume 5, Issue 2, pages 145-157, June 1994

Consideration is given to survival data analysis by modelling the hazard as a discrete
function of time. This is done for each individual who is examined independently
from the other individuals of the sample observed.
Assuming time has been divided into intervals of the same length, the hazard associated
with any specific time interval is taken to be of the form of a logistic function
including a number of time-dependent covariates which serve to characterize the
individual under consideration.
Asymptotic maximum likelihood results are given for the estimation of both the
regression coefficients in the hazard function and the survivor function
corresponding to a given profile, i.e. the successive values of the different
covariates.
The likelihood ratio statistic for testing the effects of the various covariates in order to
compare several survival curves with respect to longevity is also derived. The
process of model fitting is illustrated by two examples referring to clinical trials on
leukaemia and advanced lung cancer patients, respectively.

Downloaded from: http://onlinelibrary.wiley.com/doi/10.1002/env.3170050205/abstract .. 26/8/2012
CASE EXAMPLE: LOGISTIC REGRESSION
The data below are fictitious and serve only as a statistical exercise.
A doctor wants to estimate the probability that a patient has heart disease based on smoking habit and age.
The data were collected from the medical records of 30 patients who had a health check-up at ABC Hospital.
Coding:
Sick: sick (1), not sick (0)
Smoke: smoker (1), non-smoker (0)
Age: age in years
Sick Smoke Age
1 0 51
1 1 46
1 1 53
1 0 55
1 1 43
1 1 33
1 1 42
1 1 42
1 1 46
1 1 51
1 1 46
1 1 46
1 1 46
1 1 51
1 1 25
0 1 29
0 0 38
0 0 31
0 0 47
0 0 50
0 0 51
0 1 41
0 0 32
0 0 42
0 0 38
0 0 40
0 0 42
0 0 33
0 0 43
0 0 46
RESULTS AND INTERPRETATION
Downloaded from: http://teorionline.wordpress.com/2011/05/15/logistic-regression-chapter-1/#more-966 .. 22/8/2012
Assessing Model Fit

Model fit can be assessed with the -2LogL statistic. For the model containing only the constant, with no predictors, -2LogL is 41.589. When the two predictors are added, -2LogL falls to 16.750, a decrease of 41.589 - 16.750 = 24.839.
The first -2LogL value, 41.589, has df1 = 30 - 1 = 29. It is significant at alpha 5%, so Ho is rejected, meaning the constant-only model does not fit.
The second -2LogL value, 16.750, has df2 = 30 - 3 = 27 and is not significant at alpha 5%, meaning the model fits the data. (The -2LogL statistics are compared with values of the chi-square distribution.)

The -2LogL statistic can also be used to test whether adding the independent variables significantly changes the model. The difference of 24.839 with df = df1 - df2 = 29 - 27 = 2 is significant at alpha 5%. Ho is therefore rejected, and the model with the predictors fits the data.

The Cox and Snell R Square, a measure of the joint effect of the predictors, is 0.563, and the Nagelkerke R Square is 0.751. By the Nagelkerke measure, the independent variables explain about 75.1% of the variation in the response.
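
Several of the figures quoted above can be reproduced by hand, which helps show where they come from. The sketch below assumes only what the example states: 30 patients, 15 of them coded as sick in the data table, and a reported -2LogL of 16.750 for the fitted model. The formulas used for Cox and Snell and Nagelkerke are the standard textbook definitions, not something taken from the source's software output.

```python
import math

n = 30               # number of patients in the example
n_sick = 15          # rows coded Sick = 1 in the data table
dev_full = 16.750    # -2LogL of the model with both predictors, as reported

# Constant-only model: every patient is assigned the sample proportion.
p0 = n_sick / n
dev_null = -2.0 * (n_sick * math.log(p0) + (n - n_sick) * math.log(1.0 - p0))

# Likelihood-ratio statistic for adding the two predictors (df = 2).
# For df = 2 the chi-square upper-tail probability is exactly exp(-x / 2).
lr_stat = dev_null - dev_full
p_value = math.exp(-lr_stat / 2.0)

# Pseudo R-squared measures derived from the same two deviances.
cox_snell = 1.0 - math.exp(-lr_stat / n)
nagelkerke = cox_snell / (1.0 - math.exp(-dev_null / n))

print(round(dev_null, 3), round(lr_stat, 3), p_value)
print(round(cox_snell, 3), round(nagelkerke, 3))
```

The constant-only deviance depends only on the 15/15 split of the outcome, so it can be verified without fitting any model.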

Next, the Hosmer and Lemeshow goodness-of-fit (GoF) test is used to test the hypothesis that the model agrees with the observations. If sig < 0.05, Ho is rejected, meaning there is a significant difference between the model and the observed values. If sig > 0.05, Ho is not rejected, meaning there is no significant difference between the model and the observed values.

The Hosmer and Lemeshow test yields a statistic of 6.475 with a probability of 0.594 (> 0.05), so the model can be said to fit the data.

PARAMETER ESTIMATION AND INTERPRETATION
Downloaded from: http://teorionline.wordpress.com/2011/05/15/logistic-regression-chapter-1/#more-966 .. 22/8/2012

The maximum likelihood estimates of the model parameters can be read from the Variables in the Equation output table. The logistic regression can then be written as:

Ln(P/(1-P)) = -11.506 + 5.348 Smoke + 0.210 Age

The smoking variable is significant with a probability of 0.004 (< 0.05), and the age variable is also significant with a probability of 0.032. From this equation the results can be interpreted as follows:

The log-odds of having heart disease are positively related to smoking. Holding age constant, the log-odds for a smoker are 5.348 higher than for a non-smoker.

Holding smoking constant, the log-odds of having heart disease rise by 0.210 for each additional year of age.

In terms of odds ratios, with smoking held constant the odds of heart disease are multiplied by 1.233 for each additional year of age, while with age held constant the odds for a smoker are 210.286 times those for a non-smoker.

The overall classification rate is 90.0% at a cutoff of 50%.
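
The reported equation can be applied back to the 30 fictitious records as a check. The sketch below takes the coefficients exactly as reported (-11.506, 5.348, 0.210) and the rows of the data table; because the coefficients are rounded, the exponentiated values are close to, but not identical with, the odds ratios quoted in the text (exp(5.348) is roughly 210).

```python
import math

# (sick, smoker, age) rows copied from the fictitious data table.
DATA = [
    (1, 0, 51), (1, 1, 46), (1, 1, 53), (1, 0, 55), (1, 1, 43),
    (1, 1, 33), (1, 1, 42), (1, 1, 42), (1, 1, 46), (1, 1, 51),
    (1, 1, 46), (1, 1, 46), (1, 1, 46), (1, 1, 51), (1, 1, 25),
    (0, 1, 29), (0, 0, 38), (0, 0, 31), (0, 0, 47), (0, 0, 50),
    (0, 0, 51), (0, 1, 41), (0, 0, 32), (0, 0, 42), (0, 0, 38),
    (0, 0, 40), (0, 0, 42), (0, 0, 33), (0, 0, 43), (0, 0, 46),
]

# Coefficients as reported in the fitted equation (rounded values).
B0, B_SMOKE, B_AGE = -11.506, 5.348, 0.210


def predicted_probability(smoker, age):
    """Probability of heart disease implied by the reported equation."""
    log_odds = B0 + B_SMOKE * smoker + B_AGE * age
    return 1.0 / (1.0 + math.exp(-log_odds))


# Odds ratios are the exponentiated coefficients.
odds_ratio_smoke = math.exp(B_SMOKE)
odds_ratio_age = math.exp(B_AGE)

# Classify each patient at the 50% cutoff and count the agreements.
correct = sum(
    1
    for sick, smoker, age in DATA
    if (predicted_probability(smoker, age) >= 0.5) == (sick == 1)
)
accuracy = correct / len(DATA)

print(odds_ratio_smoke, odds_ratio_age, accuracy)
```

Counting agreements at the 50% cutoff gives 27 of 30 patients classified correctly, matching the 90.0% overall classification rate.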

NONLINEAR REGRESSION MODELS AND TESTS FOR DETECTING NONLINEAR RELATIONSHIPS
Azwar Rhosyied (1) 1305 100 054, Saudi Imam Besari (2) 1306 100 046, Arisman Wijaya (3) 1306 100 042
(1) rhosyied54@gmail.com, (2) e_saudi@ymail.com, (3) arin_mathlover@yahoo.co.id

Downloaded from: 54ud1.files.wordpress.com/2010/01/model-regresi-non-linear.doc .. 22/8/2012

Many real-world data sets do not follow a linear pattern, so nonlinear models are appropriate for them. The purpose of this study is to apply nonlinear regression models to three cases using SPSS, SAS and R.

The best model for the first case is Yt = 81.84 + 102.40 exp(-t/203.19) + ε. [Equation missing] is the model for the second case. All three packages give the same parameter estimates for this model. For the third case we use more recent models, the Nelson-Siegel (N-S) and Nelson-Siegel-Svensson (N-S-S) models, with yield curve data.

The fitted models are

YTM = 0.133 - 0.031*exp(-TTM/2.265) - 0.014*((TTM/2.265)*exp(-TTM/2.265))

for the N-S model, and

YTM = 0.647 + 0.4*exp(-TTM/0.601) - 0.087*((TTM/0.601)*exp(-TTM/0.601)) + 0.004*((-TTM/0.545)*exp(-TTM/0.545))

for the N-S-S model.

Events around us can often be modelled with a regression equation. Depending on whether the model is linear in its parameters, a regression model takes one of two forms: linear regression or nonlinear regression.
Phenomena in everyday life frequently follow a nonlinear regression pattern, and this paper therefore discusses nonlinear regression models.
Studies that have used nonlinear regression include Miconnet, Geeraerd, Impe, Roso, and Cornu (2005), who modelled rice production using nonlinear least squares to fit production growth curves.

Consequences of cutting off distal ends of cotyledons of Quercus
robur acorns before sowing
Giertych, Marian J.; Suszka, Jan
Annals of Forest Science Vol. 68 Issue 2. 2011-05-09
Institution(s): Institute of Dendrology, University of Zielona Gora

Richards growth function for cumulative
emergence (%) of pedunculate oak
seedlings. The mean is shown for all
provenances combined for each of the
five experimental treatments (1
untreated control, 2 cutting off the scar
of the pericarp and seed testa (DC), 3
cutting off of 1/5 of the distal end of
acorns, 4 cutting off 1/2 of the distal
end of acorns, 5 cutting off 2/3 of the
distal end of acorns)

Downloaded from: http://www.springerimages.com/Images/RSS/1-10.1007_s13595-011-0038-6-0

Detecting Nonlinearity with Ramsey's RESET Test, the White Test and the Terasvirta Test

Ramsey's RESET test, the White test and the Terasvirta test, which detect whether a model follows a linear or a nonlinear pattern, are all available in R. The Ramsey RESET test statistic is (for a full discussion see Gujarati, 1996):

$$F = \frac{(R^2_{new} - R^2_{old})/p}{(1 - R^2_{new})/(n - k)} \qquad (1)$$

where p is the number of new independent variables, k is the number of parameters in the new model, and n is the number of observations. Ho is rejected when F > F(α, p, n-k).

The White test is a nonlinearity test derived from the neural network model introduced by White (1989). It uses the χ² and F statistics.

The procedure for the χ² statistic is:
1. Regress y_t on 1, x_1, x_2, ..., x_p and compute the residuals u_t.
2. Regress u_t on 1, x_1, x_2, ..., x_p and m additional predictors, then compute the coefficient of determination R² of this regression. In this test the m additional predictors are the values obtained from a principal component transformation.
3. Compute χ² = nR², where n is the number of observations used.

Under the hypothesis of linearity, χ² approximately follows a chi-square distribution; reject Ho if the P-value < α.

The Terasvirta test is a nonlinearity test also derived from the neural network model. It belongs to the family of Lagrange Multiplier (LM) tests and is developed using a Taylor expansion (Terasvirta, 1993).

For all three tests the conclusion can be drawn from the P-value: reject Ho if it is less than α.
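
The Ramsey RESET F statistic compares the fit of the original regression with the fit after extra terms are added. A minimal sketch of that calculation follows; the function name and the input values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def reset_f_statistic(r2_old, r2_new, p, n, k):
    """RESET F statistic from the R-squared of the old and new regressions.

    p: number of new regressors added, n: number of observations,
    k: number of parameters in the new (augmented) model.
    """
    if not 0.0 <= r2_old <= r2_new < 1.0:
        raise ValueError("expected 0 <= r2_old <= r2_new < 1")
    if p <= 0 or n <= k:
        raise ValueError("expected p > 0 and n > k")
    return ((r2_new - r2_old) / p) / ((1.0 - r2_new) / (n - k))


# Hypothetical values: adding 2 terms raises R-squared from 0.80 to 0.85
# in a sample of 50 observations, with 5 parameters in the augmented model.
f_stat = reset_f_statistic(r2_old=0.80, r2_new=0.85, p=2, n=50, k=5)
print(f_stat)  # compare with the critical value F(alpha, p, n - k)
```

A large F value indicates that the added terms explain variation the linear model missed, which is evidence of a nonlinear relationship.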
Parametric Nonlinear Regression Models

According to whether it is linear in its parameters, a regression model can be classified as linear or nonlinear. A regression model is said to be linear if it can be written in the form:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_k x_k + \varepsilon \qquad (2)$$

A model that cannot be written in this form is a nonlinear model. In general, a parametric nonlinear regression model, with Y_ij the response at the j-th replication for each value x_i of the independent variable, can be written as (Ripley, 2002):

$$Y_{ij} = f(x_i, \theta) + \varepsilon_{ij} \qquad (3)$$

where f is the regression function with parameters θ that must be estimated, and ε_ij is an error term distributed N(0, σ²). One method of estimating the parameters of a nonlinear system is Marquardt's compromise.

Marquardt's method is a compromise between the linearization (Taylor series) method and the steepest descent method (Draper & Smith, 1996).

The Nelson-Siegel (N-S) and Nelson-Siegel-Svensson (N-S-S) Models

In 1987, Nelson and Siegel proposed a model whose curves cover the range of shapes that yield curves take. The N-S and N-S-S models are approaches to modelling the yield curve. The N-S model is given by

$$y(m) = \beta_0 + \beta_1 \exp\left(-\frac{m}{\tau}\right) + \beta_2 \left[\frac{m}{\tau}\exp\left(-\frac{m}{\tau}\right)\right] \qquad (4)$$

where y(m) is the yield to maturity (YTM), the yield under the forward rate approach at maturity m, or time to maturity (TTM). The parameter τ is the time constant that governs where the curve bends, β0 is the asymptotic value or constant, and β1 and β2 determine the direction of the curvature.
The N-S-S model below extends the N-S model by adding the parameters β3 and τ2, which make the curve more flexible (Amoako et al., 2005).

$$y(m) = \beta_0 + \beta_1 \exp\left(-\frac{m}{\tau_1}\right) + \beta_2 \left[\frac{m}{\tau_1}\exp\left(-\frac{m}{\tau_1}\right)\right] + \beta_3 \left[\frac{m}{\tau_2}\exp\left(-\frac{m}{\tau_2}\right)\right] \qquad (5)$$
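
The two yield curve models are easy to evaluate directly. The sketch below assumes the forms y(m) = b0 + b1 exp(-m/t1) + b2 (m/t1) exp(-m/t1) for N-S, with the extra term b3 (m/t2) exp(-m/t2) for N-S-S; the parameter values are the starting values that the authors quote for their third case study (b0 = 7.41, b1 = -5.41, b2 = -5.03, b3 = -4.43, t1 = 0.44, t2 = 1.38), not fitted estimates.

```python
import math


def nelson_siegel(m, b0, b1, b2, tau):
    """Nelson-Siegel curve at maturity m."""
    x = m / tau
    return b0 + b1 * math.exp(-x) + b2 * x * math.exp(-x)


def nelson_siegel_svensson(m, b0, b1, b2, b3, tau1, tau2):
    """Nelson-Siegel-Svensson curve: N-S plus a second curvature term."""
    x2 = m / tau2
    return nelson_siegel(m, b0, b1, b2, tau1) + b3 * x2 * math.exp(-x2)


# Starting values quoted by the authors for the third case study.
b0, b1, b2, b3, t1, t2 = 7.41, -5.41, -5.03, -4.43, 0.44, 1.38

for ttm in (0.0, 1.0, 5.0, 30.0):
    print(ttm,
          nelson_siegel(ttm, b0, b1, b2, t1),
          nelson_siegel_svensson(ttm, b0, b1, b2, b3, t1, t2))
```

At m = 0 both curves equal b0 + b1, and for long maturities both approach b0, which is why b0 is read as the asymptotic level of the curve.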
Three data sets are used in this study.
1. The first case concerns a weight-loss programme followed by a male patient, with day (t) as the predictor and body weight in kg (yt) as the response.
2. The second case concerns the Stormer viscometer, with viscosity (v) and fluid weight (w) as predictors and time (T) as the response.
3. The third case concerns trading in government bonds for the period of 6 April 2009, with time to maturity (TTM) as the predictor and yield to maturity (YTM) as the response.

MODELLING GROWTH OF FIVE DIFFERENT COLOUR
TYPES OF MINK
Zong-yue Liu1,2, Fang-yong Ning3, Zhi-heng Du1, Chun-san Yang1,
Jing Fu1, Xing Wang1 & Xiu-juan Bai
South African Journal of Animal Science 2011, 41 (no 2)

Growth is a fundamental property of biological systems and it can be defined as
an increase in body size per unit time. Modelling of growth curves is useful
because it provides means for visualizing growth patterns over time, and the
generated equations can be used to predict the expected weight of a group of
animals at a specific age.

Case 1 is modelled with a nonlinear model and with quadratic and cubic models.

The nonlinear model is (Ripley, 2002):

$$Y_t = \beta_0 + \beta_1 \exp(-t/\theta) + \varepsilon$$

Initial estimates of the parameters β00, β10 and θ0 are obtained as follows:
1. Fit a quadratic regression with day (t) as the predictor and weight in kg (Yt) as the response to obtain fitted values ŷ. The quadratic model is:

$$Y_t = \beta_0^* + \beta_1^* t + \beta_2^* t^2 + \varepsilon$$

2. Choose three consecutive points x0, x1, x2 from the n data points that are equally spaced (spacing δ), and obtain the corresponding fitted values ŷ0, ŷ1 and ŷ2.
3. Compute θ0 from:

$$\theta_0 = \frac{\delta}{\log\left(\dfrac{\hat{y}_0 - \hat{y}_1}{\hat{y}_1 - \hat{y}_2}\right)}$$

4. Obtain β00 and β10 by regressing Yt, as the response, on exp(-t/θ0) as the predictor.

The third case study is modelled with the nonlinear Nelson-Siegel (N-S) and Nelson-Siegel-Svensson (N-S-S) models in the following steps.
1. Set the initial values following Amoako (2002): b0 = 7.41, b1 = -5.41, b2 = -5.03, b3 = -4.43, t1 = 0.44 and t2 = 1.38.
2. Split the data into training and testing sets of 100 and 32 observations respectively, and fit the NS and NSS models from the initial values.
3. Compute the RMSE for the training and testing sets.
4. Fit the NS and NSS models to the full data set from the same initial values.

Forecasting the term structure of government bond yields
Francis X. Diebold and Canlin Li
Journal of Econometrics. Volume 130, Issue 2, February 2006, Pages 337-364.

We use variations on the Nelson-Siegel exponential components framework to model the entire
yield curve, period-by-period, as a three-dimensional parameter evolving dynamically. We show
that the three time-varying parameters may be interpreted as factors corresponding to level, slope
and curvature, and that they may be estimated with high efficiency.
Figure: Actual (data-based) and fitted (model-based) average yield curve. We show the actual average yield curve and the fitted average yield curve obtained by evaluating the Nelson-Siegel function.

Downloaded from: http://www.sciencedirect.com/science/article/pii/S0304407605000795 .. 30/8/2012

Testing for Nonlinear Relationships
The tests for nonlinearity using Ramsey's RESET test, the White test and the Terasvirta test (with the chi-square statistic), run in R, give:

Table 1. Tests for nonlinear relationships

Case study   Ramsey's RESET   White        Terasvirta
1            714.1839         199.7347     210.1889
             2.2e-16*         2.2e-16*     2.2e-16*
2            7.6107           11.2577      91.3249
             0.0004*          0.0036*      2.2e-16*
3            42.3412          55.1889      53.9386
             1.544e-09*       1.037e-12*   1.938e-12*
Note: (*) P-value
Figure: Illustration of a typical 'cause-effect' relationship. The cause-effect curve for a particular toxin is established by measuring growth impairment or death as the effect of increasing concentration of a toxin on aquatic species.

Downloaded from: http://www.qp.org.nz/plan-topics/surface-water-quality.php .. 1/9/2012
Nonlinear Model for the First Case Study

Building the nonlinear model begins with initial estimates of the parameters to be used. From the quadratic equation

$$Y_t = 183.3 - 0.454\,t + 0.001\,t^2 + \varepsilon$$

three consecutive points with spacing δ = 3 are obtained:
x0 = 27, ŷ0 = 171.51
x1 = 30, ŷ1 = 170.26
x2 = 33, ŷ2 = 169.03

This gives θ0 = 428.27, β00 = 17.97 and β10 = 162.13. The nonlinear model is then fitted in SPSS, R and SAS with initial parameters β0 = 17.97 and β1 = 162.13.

Table 2. Nonlinear model in SPSS, R and SAS

Software   Model                                     R²
SPSS       Yt = 81.84 + 102.40 exp(-t/203.19) + ε    99.8%
R          Yt = 81.84 + 102.40 exp(-t/203.19) + ε    -
SAS        Yt = 81.84 + 102.40 exp(-t/203.19) + ε    99.8%
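
The starting value for θ can be recomputed from the three fitted values quoted for the first case study (171.51, 170.26 and 169.03 at days 27, 30 and 33). The sketch below is a check under stated assumptions rather than the authors' own code: it shows that the reported 428.27 is reproduced when the logarithm is taken to base 10, whereas the natural logarithm, which is the base that follows algebraically from a model of the form exp(-t/θ), gives a value near 186. Either serves only as a starting point for the iterative fit.

```python
import math

delta = 3.0                          # spacing between the chosen days
y0, y1, y2 = 171.51, 170.26, 169.03  # fitted values at days 27, 30 and 33

# Ratio of successive differences of the fitted values.
ratio = (y0 - y1) / (y1 - y2)

theta0_log10 = delta / math.log10(ratio)  # base-10 logarithm
theta0_ln = delta / math.log(ratio)       # natural logarithm

print(round(theta0_log10, 2), round(theta0_ln, 2))
```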
REGRESSION MODELS WITH A QUALITATIVE DEPENDENT VARIABLE
By: Lelarospida
Downloaded from: http://www.slideshare.net/efvolutionzunior/ekonometrikmodel-regresi-dengan-vardependen-kualitatif .. 22/8/2012
1. Some topics concern regression of a quantitative dependent variable on quantitative independent variables and on qualitative independent variables (dummy variables).
2. Sometimes it is the dependent variable that is qualitative (dummy, dichotomous, or binary), while the independent variables may be quantitative, dummy, or a combination of both.
3. An example is a study of female labour force participation as a function of family income and education level. Participation is a qualitative, dichotomous dependent variable (1 = in the labour force, 0 = not in the labour force), and the model gives the probability that a woman works given her family income and education.
4. Approaches to estimating a model with a dichotomous dependent variable:
   1. Linear Probability Model
   2. Cumulative Distribution Function Model (Probit Model and Logit Model)
   3. Tobit Model
Linear Probability Model

This model assumes that the probability is linear in the explanatory variable (X).

An illustration: Y_i = β1 + β2 X_i + u_i
X = income
Y = 1 if the person has travelled abroad
Y = 0 if the person has never travelled abroad

E(Y_i | X_i) = 1 · P(Y_i = 1 | X_i) + 0 · P(Y_i = 0 | X_i) = P(Y_i = 1 | X_i)

This expectation is interpreted as the conditional probability that the event (the person has travelled abroad) occurs given X (income), that is, the probability that a person has travelled abroad given his or her income.

Mathematically this can also be written as:

E(Y_i | X_i) = β1 + β2 X_i

under the assumption E(u_i) = 0.

Because it is a probability, E(Y_i | X_i) = β1 + β2 X_i must lie between 0 and 1.

Because the LPM has the same form as a linear regression model, it can be estimated by OLS.

Example: Y = -0.3998 + 0.1137 X

Y = car ownership by the household
X = income

If income rises by 1 million, the probability that a household owns a car rises on average by about 11 percentage points.

If household income is Rp 9 million, the probability that the household owns a car is 0.6235.

Weaknesses of the LPM:
1. The residuals (e_i) are not normally distributed.
2. The errors are heteroscedastic.
3. E(Y_i | X_i) does not always lie between 0 and 1.
4. The R² value is of questionable use.
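
The third weakness is easy to see with the example equation Y = -0.3998 + 0.1137 X. The sketch below evaluates it at three incomes; only the value at 9 million comes from the text, while the incomes of 2 and 13 million are chosen here for illustration.

```python
def lpm_probability(income_millions):
    """Fitted value of the example linear probability model."""
    return -0.3998 + 0.1137 * income_millions


for income in (2, 9, 13):
    # Fitted "probabilities" are not constrained to the interval [0, 1].
    print(income, round(lpm_probability(income), 4))
```

At an income of 9 million the fitted value is the 0.6235 quoted above, but at 2 million it is negative and at 13 million it exceeds 1, neither of which is a valid probability.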
CUMULATIVE DISTRIBUTION FUNCTION MODEL: CDF

In the LPM, the probability P(Y_i = 1 | X_i) increases linearly in X, so as family income keeps rising the probability keeps growing. This violates the requirement that a probability lie between 0 and 1.

A CDF (Cumulative Distribution Function) model guarantees that the probability lies between 0 and 1. Properties of a CDF:
(1) As X_i increases, P(Y_i = 1 | X_i) also increases but never leaves the interval 0-1.
(2) The relationship between P_i and X_i is nonlinear, so the rate of change is not constant: the increase first grows larger and then grows smaller.
(3) As the probability approaches zero its rate of decrease becomes smaller, and as the probability approaches 1 its rate of increase also becomes smaller.

There are two kinds of CDF model, the probit model and the logit model. The probit model uses the normal distribution function; the logit model uses the logistic distribution function.

Probit model illustration: Z_i = β1 + β2 X_i

X_i = income
Decision to own a car: P_i = P(Y_i = 1 | X_i) = P(Z_i* ≤ Z_i) = P(Z_i* ≤ β1 + β2 X_i) = F(β1 + β2 X_i)
Yes, if Z_i ≥ Z*
No, if Z_i < Z*

In the probit model OLS cannot be used because the model is nonlinear, so the Maximum Likelihood (ML) method is used instead. The regression output from ML is presented differently from OLS output. The model can be estimated with software such as EViews.


Points to note about ML estimation of the probit model:
(1) It is a large-sample method, so the standard errors are asymptotic.
(2) As a consequence, the Z statistic is used instead of the t statistic.
(3) The joint (global) effect of the explanatory variables is tested with the likelihood ratio (LR) statistic, which follows a chi-square distribution with df equal to the number of explanatory variables. If the computed chi-square exceeds the tabulated chi-square, the null hypothesis is rejected, meaning the explanatory variables jointly affect the dependent variable.
(4) The coefficient of determination used is the one developed by McFadden, abbreviated R²McF.

Example output for a probit model with one explanatory variable, X = income; Y = Z = car ownership, coded 0 and 1:

Dependent Variable: Y
Method: ML - Binary Probit
Sample: 1 30

Variable   Coefficient   Std. Error   Z-Statistic   Prob.
C          -7.455182     3.460451     -2.154396     0.0312
X           0.923041     0.413465      2.232454     0.0256

LR statistic (1 df)      31.85257
Probability (LR stat)    1.66E-08
McFadden R-squared       0.775872

From this output the probit model is Z = -7.455182 + 0.923041 X (the model is significant under both the LR and Z tests).

If income rises by 1 million, the estimated probit index rises by 0.9230.
If income is 10 million, the probit index (Z value) is -7.455182 + 0.923041(10) = 1.775228. The probability that a household with an income of 10 million owns a car is the area under the standard normal curve to the left of this Z value.
If income is 13 million, Z is 4.544351 (the probability is very close to 1).

If income is 2 million, Z = -5.6091 (the probability is almost 0).
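
Turning a probit index into a probability only requires the standard normal CDF. The sketch below uses the reported probit equation Z = -7.455182 + 0.923041 X and computes the CDF from the error function in the Python standard library; the helper names are illustrative.

```python
import math


def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_index(income_millions):
    """Z value from the reported probit equation."""
    return -7.455182 + 0.923041 * income_millions


for income in (2, 10, 13):
    z = probit_index(income)
    print(income, round(z, 6), standard_normal_cdf(z))
```

For an income of 10 million the index is 1.775228, and the corresponding probability of car ownership is a little above 0.96.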

Probit Regression Models
An alternative to logistic regression analysis is probit analysis. The term "probit" was coined in the
1930's by Chester Bliss and stands for probability unit. These two analyses, logit and probit, are very
similar to one another. As discussed in the previous unit logit analysis is based on log odds while
probit uses the cumulative normal probability distribution. Here is what a cumulative normal
distribution looks like. (http://www.philender.com/courses/categorical/notes3/probit1.html)
Notice the S-shaped curve that runs from zero
to one. It is very similar to the graph of the logit
function. The two procedures are so similar that
they can easily be confused with one another.
The bottom line is that logistic regression and
probit analysis produce predicted probabilities
that are very similar. An example of predicted
probabilities for logit and probit is given below.

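The probit model is defined as
Pr(y=1|x) = Φ(xb)

where Φ is the standard cumulative normal
probability distribution and xb is called the
probit score or index.
The log-likelihood function for probit is:

$$\ln L = \sum_{j \in S} w_j \ln \Phi(x_j b) + \sum_{j \notin S} w_j \ln\left[1 - \Phi(x_j b)\right]$$

where S is the set of observations with y_j = 1 and w_j denotes optional weights.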
Oleh : Lelarospida
Logit Model

The logit model differs little from the probit model: the probit model uses the normal distribution function, whereas the logit model uses the logistic distribution function. The only difference is that the logistic curve approaches 0 and 1 more slowly than the probit curve (it has heavier tails). All other procedures are the same as for the probit model, although ML estimation yields different coefficient values.

Example output for a logit model with one explanatory variable: X = income; Y = car ownership, coded 0 and 1.
The data are the same as in the probit example.
Dependent Variable: Y
Method: ML - Binary Logit
Sample: 1 30

Variable   Coefficient   Std. Error   Z-Statistic   Prob.
C          -13.49109     6.824275     -1.976926     0.0480
X            1.650428    0.803802      2.053276     0.0400

LR statistic (1 df)      31.68133
Probability (LR stat)    1.82E-08
McFadden R-squared       0.771701
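As a rough illustration of how the logit output is used, the sketch below turns the two coefficients into a predicted probability and an odds ratio. It assumes the intercept is -13.49109 (the sign implied by the reported z-statistic of -1.976926, since standard errors are positive). The function names are illustrative only.

```python
import math

# Logit coefficients from the output above. The negative sign on the
# intercept is an assumption inferred from the negative z-statistic.
INTERCEPT = -13.49109
SLOPE = 1.650428

def logit_probability(income_millions: float) -> float:
    """P(Y=1 | X) = 1 / (1 + exp(-(intercept + slope * X)))."""
    index = INTERCEPT + SLOPE * income_millions
    return 1.0 / (1.0 + math.exp(-index))

def odds_ratio_per_unit() -> float:
    """Multiplicative change in the odds of ownership per 1 million of income."""
    return math.exp(SLOPE)

if __name__ == "__main__":
    for income in (6, 8, 10):
        print(income, round(logit_probability(income), 4))
    print("odds ratio per unit of X:", round(odds_ratio_per_unit(), 3))
```

Unlike the probit index, the logit slope has a direct reading: each additional unit of X multiplies the odds of ownership by exp(slope).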
Assessing the Influence of Traffic-related Air Pollution on Risk of Term Low
Birth Weight on the Basis of Land-Use-based Regression Models and
Measures of Air Toxics
Jo Kay C. Ghosh, Michelle Wilhelm, Jason Su, Daniel Goldberg, Myles
Cockburn, Michael Jerrett and Beate Ritz
Am. J. Epidemiol. (2012) doi: 10.1093/aje/kwr469 First published online: May
13, 2012
Downloaded from: http://aje.oxfordjournals.org/content/early/2012/05/13/aje.kwr469.abstract.. 24/8/2012

Few studies have examined associations of birth outcomes with toxic air
pollutants (air toxics) in traffic exhaust.
This study included 8,181 term low birth weight (LBW) children and 370,922
term normal-weight children born between January 1, 1995, and December 31,
2006, to women residing within 5 miles (8 km) of an air toxics monitoring
station in Los Angeles County, California. Additionally, land-use-based
regression (LUR)-modeled estimates of levels of nitric oxide, nitrogen dioxide,
and nitrogen oxides were used to assess the influence of small-area variations in
traffic pollution.
The authors examined associations with term LBW (≥37 weeks' completed
gestation and birth weight <2,500 g) using logistic regression adjusted for
maternal age, race/ethnicity, education, parity, infant gestational age, and
gestational age squared.

Odds of term LBW increased 2%-5% (95% confidence intervals ranged from
1.00 to 1.09) per interquartile-range increase in LUR-modeled estimates and
monitoring-based air toxics exposure estimates in the entire pregnancy, the third
trimester, and the last month of pregnancy. Models stratified by monitoring
station (to investigate air toxics associations based solely on temporal
variations) resulted in 2%-5% increased odds per interquartile-range increase in
third-trimester benzene, toluene, ethyl benzene, and xylene exposures, with
some confidence intervals containing the null value.

This analysis highlights the importance of both spatial and temporal
contributions to air pollution in epidemiologic birth outcome studies.

Exposures to fine particulate air pollution and respiratory outcomes
in adults using two national datasets: a cross-sectional study
Keeve E Nachman and Jennifer D Parker
Environmental Health 2012, 11:25 doi:10.1186/1476-069X-11-25
Published: 10 April 2012
Downloaded from: http://www.ehjournal.net/content/11/1/25/abstract.. 24/8/2012
Relationships between chronic exposures to air pollution and respiratory health outcomes
have yet to be clearly articulated for adults. Recent data from nationally representative
surveys suggest increasing disparity by race/ethnicity regarding asthma-related morbidity
and mortality. The objectives of this study are to evaluate the relationship between
annual average ambient fine particulate matter (PM2.5) concentrations and respiratory
outcomes for adults using modeled air pollution and health outcome data and to examine
PM2.5 sensitivity across race/ethnicity.

Respondents from the 2002-2005 National Health Interview Survey (NHIS) were linked
to annual kriged PM2.5 data from the USEPA AirData system. Logistic regression was
employed to investigate increases in ambient PM2.5 concentrations and self-reported
prevalence of respiratory outcomes including asthma, sinusitis and chronic bronchitis.
Models included health, behavioral, demographic and resource-related covariates.
Stratified analyses were conducted by race/ethnicity.

Of nearly 110,000 adult respondents, approximately 8,000 and 4,000 reported current
asthma and recent attacks, respectively. Overall, odds ratios (OR) for current asthma
(0.97 (95% Confidence Interval: 0.87-1.07)) and recent attacks (0.90 (0.78-1.03)) did not
suggest an association with a 10 μg/m3 increase in PM2.5. Stratified analyses revealed
significant associations for non-Hispanic blacks [OR = 1.73 (1.17-2.56) for current
asthma and OR = 1.76 (1.07-2.91) for recent attacks] but not for Hispanics and
non-Hispanic whites. Significant associations were observed overall (1.18 (1.08-1.30)) and in
non-Hispanic whites (1.31 (1.18-1.46)) for sinusitis, but not for chronic bronchitis.
Non-Hispanic blacks may be at increased sensitivity of asthma outcomes from PM2.5
exposure. Increased chronic PM2.5 exposures in adults may contribute to population
sinusitis burdens.

FUNCTIONAL FORMS OF THE REGRESSION MODEL

Linear model equation:

$Y = b_1 + b_2 X + u$
where:
X is the price of granulated sugar per kg
Y is the quantity demanded.
What is demand when the price of sugar is 0 rupiah?
Can a commodity really have a price of 0 rupiah?
Is it logical that, when the price of sugar per kg is 0, demand is only $b_1$?
To overcome this weakness, we study models that are
functional forms of the regression model.
Downloaded from: xa.yimg.com/kq/groups/23376985/1561225507/name/k5_Model.. 22/8/2012
LINEAR MODEL
In statistics, the term linear model is used in different ways according to the context.
The most common occurrence is in connection with regression models and the term is
often taken as synonymous with linear regression model. However, the term is also
used in time series analysis with a different meaning. In each case, the designation
"linear" is used to identify a subclass of models for which substantial reduction in the
complexity of the related statistical theory is possible.

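For the regression case, the statistical model is as follows. Given a (random) sample
$$(Y_i, X_{i1}, \ldots, X_{ip}), \quad i = 1, \ldots, n$$
the relation between the observations $Y_i$ and the independent variables $X_{ij}$ is formulated
as
$$Y_i = \beta_0 + \beta_1 \phi_1(X_{i1}) + \cdots + \beta_p \phi_p(X_{ip}) + \varepsilon_i, \quad i = 1, \ldots, n$$
where $\phi_1, \ldots, \phi_p$ may be nonlinear functions. In the above, the quantities $\varepsilon_i$ are random variables
representing errors in the relationship.
The "linear" part of the designation relates to the appearance of the regression
coefficients, $\beta_j$, in a linear way in the above relationship. Alternatively, one may say
that the predicted values corresponding to the above model, namely
$$\hat{Y}_i = \beta_0 + \beta_1 \phi_1(X_{i1}) + \cdots + \beta_p \phi_p(X_{ip}), \quad i = 1, \ldots, n$$
are linear functions of the $\beta_j$.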

Downloaded from: http://en.wikipedia.org/wiki/Linear_model
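The point that a model can be nonlinear in the variables yet linear in the coefficients can be checked numerically. The sketch below is an illustration under stated assumptions: the data are simulated, the two transformations (a logarithm and a square) are arbitrary choices, and all names are hypothetical. It uses NumPy's least-squares solver.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: y depends on x through nonlinear functions of x,
# but linearly on the coefficients beta_0, beta_1, beta_2.
n = 500
x = rng.uniform(1.0, 10.0, size=n)
true_beta = np.array([2.0, 1.5, -0.3])
noise = rng.normal(0.0, 0.05, size=n)
y = true_beta[0] + true_beta[1] * np.log(x) + true_beta[2] * x**2 + noise

# Design matrix with columns 1, phi_1(x) = ln x, phi_2(x) = x^2.
design = np.column_stack([np.ones(n), np.log(x), x**2])

# Ordinary least squares: minimise ||y - design @ beta||^2.
beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)

if __name__ == "__main__":
    print("true coefficients:     ", true_beta)
    print("estimated coefficients:", np.round(beta_hat, 3))
```

Because the unknowns enter only as multipliers of known columns, the same solver works whatever transformations are placed in the design matrix.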
Log-log model
Semi-log model
Reciprocal model
Phillips curve
Engel curve
TYPES OF FUNCTIONAL MODELS
In statistics, logistic regression is a type of regression analysis used for predicting the
outcome of a categorical (a variable that can take on a limited number of categories)
criterion variable based on one or more predictor variables.
The probabilities describing the possible outcome of a single trial are modelled, as a
function of explanatory variables, using a logistic function.
Logistic regression measures the relationship between a categorical dependent variable
and usually a continuous independent variable (or several), by converting the dependent
variable to probability scores.

Logistic regression can be bi- or multinomial. Binomial or binary logistic regression refers
to the instance in which the observed outcome can have only two possible types (e.g.,
"dead" vs. "alive", "success" vs. "failure", or "yes" vs. "no").
Multinomial logistic regression refers to cases where the outcome can have three or more
possible types (e.g., "better" vs. "no change" vs. "worse"). Generally, the outcome is coded
as "0" and "1" in binary logistic regression as it leads to the most straightforward
interpretation.

The target group (referred to as a "case") is usually coded as "1" and the
reference group (referred to as a "noncase") as "0".

Downloaded from: http://en.wikipedia.org/wiki/Logistic_regression
This model is also known as the double-log model and the
constant-elasticity model.
According to economic theory, the relationship between the quantity demanded
and the price of a commodity takes the following form:
$$Y = \beta_1 X^{\beta_2} e^{u}$$
Y: quantity
X: price
$\beta_1$, $\beta_2$: parameters
u: error
The model above resembles a production function (the Cobb-Douglas model).
The model is nonlinear in both the variables and the parameters, so it is difficult to estimate directly.
To make estimation easier, the model is transformed.
Log-log model
An equation that specifies a linear relationship among the variables gives an
approximate description of some economic behaviour. An alternative approach is to
consider a linear relationship among log-transformed variables. This is a log-log
model - the dependent variable as well as all explanatory variables are transformed to
logarithms. Since the relationship among the log variables is linear some researchers
call this a log-linear model.
Different functional forms give parameter estimates that have different economic
interpretation. The parameters of the linear model have an interpretation as marginal
effects. The elasticities will vary depending on the data. In contrast the parameters of
the log-log model have an interpretation as elasticities. So the log-log model assumes
a constant elasticity over all values of the data set.
The log transformation is only applicable when all the observations in the data set are
positive. Gujarati [Basic Econometrics, Third Edition, 1995, McGraw-Hill, p.387]
notes that this can be guaranteed by using a transformation like log(X+k) where k is a
positive scalar chosen to ensure positive values. However, users will then need to give
careful thought to the interpretation of the parameter estimates.

Downloaded from: http://shazam.econ.ubc.ca/intro/olslog.htm
$$\ln Y = \ln \beta_1 + \beta_2 \ln X + u$$

The transformation is applied to both sides of the log-log model.

Redefined model:
$$Y^* = \beta_1^* + \beta_2^* X^* + u^*$$
where:
$Y^* = \ln Y$
$X^* = \ln X$
$\beta_1^* = \ln \beta_1$
$\beta_2^* = \beta_2$
$u^* = u$
The redefinition shows that the model is in fact a
linear regression model, so $\beta_1^*$ and $\beta_2^*$ can be estimated by OLS.
Downloaded from: xa.yimg.com/kq/groups/23376985/1561225507/name/k5_Model.. 22/8/2012
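The transformation can be tried on simulated data. The following is a minimal sketch, assuming a hypothetical constant-elasticity demand curve. The parameter values, sample size, and variable names are invented for illustration and do not come from the source.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical constant-elasticity demand: Y = beta1 * X**beta2 * exp(u).
beta1, beta2 = 5.0, -0.8
n = 300
price = rng.uniform(1.0, 20.0, size=n)
u = rng.normal(0.0, 0.05, size=n)
quantity = beta1 * price**beta2 * np.exp(u)

# Redefine Y* = ln Y and X* = ln X, then fit Y* = b1* + b2* X* by OLS.
y_star = np.log(quantity)
x_star = np.log(price)
design = np.column_stack([np.ones(n), x_star])
(b1_star, b2_star), *_ = np.linalg.lstsq(design, y_star, rcond=None)

elasticity = b2_star          # the slope is the elasticity
beta1_hat = np.exp(b1_star)   # back-transform the intercept

if __name__ == "__main__":
    print("estimated elasticity:", round(float(elasticity), 3))
    print("estimated beta1:     ", round(float(beta1_hat), 3))
```

The slope of the transformed regression estimates the elasticity directly, while the original multiplicative constant is recovered by exponentiating the intercept.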
RESULT OF THE LOGARITHMIC TRANSFORMATION:
How do I interpret a regression model when some variables are log
transformed?

How to interpret a regression model when some variables in the model have been log
transformed.
The variables in the data set are writing, reading, and math scores (write, read and
math), the log transformed writing (lgwrite) and log transformed math scores
(lgmath) and female. In the examples below, the variable write or its log transformed
version will be used as the outcome variable.

Both the outcome variable and some predictor variables are log transformed
What happens when both the outcome variable and predictor variables are log
transformed?

Written as an equation, we can describe the model:
$$\log(\text{write}) = \beta_0 + \beta_1 \cdot \text{female} + \beta_2 \cdot \log(\text{math}) + \beta_3 \cdot \text{read}$$

Downloaded from: http://www.ats.ucla.edu/stat/mult_pkg/faq/general/log_transformed_regression.htm
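[Figure: the curve $Y = \beta_1 X^{\beta_2}$ with $\beta_2 < 0$, plotted as Y against X, and the straight line $\ln Y = \ln \beta_1 + \beta_2 \ln X$, plotted as ln Y against ln X.]
What is special about the log-log model?
GEOMETRICALLY: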

1. The slope $\beta_2$ in the log-log model is the elasticity of Y with respect to X, that is,
the percentage change in Y for a given percentage change in X.
In other words, if Y is the quantity demanded and X is
the price of the commodity per unit, then $\beta_2$ is the price elasticity
of demand.
2. The intercept can also be interpreted by returning the model to its original
form: the estimated intercept $\beta_1^* = \ln \beta_1$ is interpreted through $e^{\beta_1^*} = \beta_1$, the demand when X = 1. The model also
shows that when the price of the commodity is very high demand is at its
minimum, and when the price is very low demand is at its maximum.
3. Price never reaches zero, so the problem encountered
with the linear regression is resolved by this function.
Advantages of the log-log model compared with the linear model
WHEN TO USE LOGLINEAR MODELS:
The loglinear model is one of the specialized cases of generalized linear models for
Poisson-distributed data.
Loglinear analysis is an extension of the two-way contingency table where the
conditional relationship between two or more discrete, categorical variables is
analyzed by taking the natural logarithm of the cell frequencies within a contingency
table.
Although loglinear models can be used to analyze the relationship between two
categorical variables (two-way contingency tables), they are more commonly used to
evaluate multiway contingency tables that involve three or more variables.
The variables investigated by log linear models are all treated as response
variables. In other words, no distinction is made between independent and
dependent variables. Therefore, loglinear models only demonstrate association
between variables. If one or more variables are treated as explicitly dependent and
others as independent, then logit or logistic regression should be used instead.
Also, if the variables being investigated are continuous and cannot be broken down
into discrete categories, logit or logistic regression would again be the appropriate
analysis.

Downloaded from: http://www.education.umd.edu/EDMS/fac/Hancock/Course_Materials/EDMS771/readings/LogLinearModels%20reading.pdf
[Figure: demand curve of quantity Q against price P, with the level $e^{\beta_1}$ marked.]
Weakness?
The log-log model cannot be fitted to data that contain zero values,
because ln(0) is undefined.
DEMAND FUNCTION AND PRICE
Limitations to Loglinear Models

1. Interpretation. The inclusion of so many variables in loglinear models often makes
interpretation very difficult.
2. Independence. Only a between subjects design may be analyzed. The frequency
in each cell is independent of frequencies in all other cells.
3. Adequate Sample Size. With loglinear models, you need to have at least 5
times the number of cases as cells in your data. For example, if you have a 2x2x3
table, then you need to have 60 cases. If you do not have the required amount of
cases, then you need to increase the sample size or eliminate one or more of the
variables.
4. Size of Expected Frequencies. For all two-way associations, the expected cell
frequencies should be greater than one, and no more than 20% should be less than
five. Upon failing to meet this requirement, the Type I error rate usually does not
increase, but the power can be reduced to the point where analysis of the data is
worthless. (Downloaded from: http://www.education.umd.edu/EDMS/fac/
Consider two models of the relationship between the price of granulated sugar (X)
and the quantity of sugar consumed (Y).
Linear function:
Y = 2.6911 - 0.4795 X
SE: (0.1216) (0.1140)
$R^2 = 0.6628$
Log-log model:
ln Y = 0.774 - 0.2530 ln X
SE: (0.0152) (0.0494)
$R^2 = 0.7448$
Which model fits best?
ILLUSTRATION OF THE PROBLEM
Categorical Data
Peter B. Imrey, Douglas G. Simpson
Encyclopedia of Environmetrics. Published Online: 15 SEP 2006
DOI: 10.1002/9780470057339.vac011. Copyright 2002 John Wiley & Sons, Ltd

Categorical data refers to counts of events or individuals observed through some defined process and
often allocated to subgroups, or categories, corresponding to levels of one or more attributes.
This article reviews methods for interpreting collections of such counts, when they arise from apparently
random environmental processes and may be treated as dependent variables relative to potentially
explanatory factors or covariates. After introducing basic terminology including measures of relative
frequency and association, we review the Poisson probability distribution. This is followed by the
binomial, multinomial and hypergeometric distributions and products thereof, that result from
conditioning upon sums of independent Poisson counts. These form the basis for modeling the random
variation in observed categorical data.
For modeling structural relationships, generalized linear models are first defined, and Poisson regression,
logistic regression, and log-linear models are each considered within that framework. We then
summarize several methods for analyzing the correlated counts that occur when observing a categorical
dependent variable on the same observational units under several measurement conditions or at multiple
observation times, or on multiple observational units within matched sets. These methods include
weighted least-squares functional regression, conditional logistic regression, Cochran-Mantel-Haenszel
tests, generalized linear mixed models, and analyses using generalized estimating equations.

Finally, we briefly comment on Bayes and empirical Bayes methods, spatial modeling, exact methods,
and add inevitably ephemeral comments on the status of software at the turn of the millennium.

Downloaded from: http://onlinelibrary.Wiley.Com/doi/10.1002/9780470057339.Vac011/abstract. 26/8/2012
Look at $R^2$.
Is the log-log model better?
The $R^2$ values cannot be compared directly, because the dependent variable is on a different
scale in the two models (actual data versus log-transformed data).
The slopes and intercepts of the two model forms differ. Their interpretation:
Linear model
If the price of sugar rises by 1 unit, demand for the commodity
falls by 0.4795 units.
Log-log model
Each 1% rise in the price of sugar lowers the quantity demanded by 0.25%. Equivalently,
the price elasticity is -0.25.
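The two slope interpretations can be made concrete with a short calculation. The sketch below takes the slopes of the two sugar equations as given (with negative signs, consistent with the stated elasticity of -0.25). The function names and the evaluation point used for the linear model are illustrative assumptions.

```python
# Slopes from the two fitted sugar-demand equations.
LINEAR_SLOPE = -0.4795   # change in Y per one-unit change in X
LOGLOG_SLOPE = -0.2530   # elasticity of Y with respect to X

def exact_pct_change_loglog(pct_change_in_x: float,
                            elasticity: float = LOGLOG_SLOPE) -> float:
    """Exact % change in Y implied by ln Y = a + b ln X."""
    return 100.0 * ((1.0 + pct_change_in_x / 100.0) ** elasticity - 1.0)

def point_elasticity_linear(x: float, y: float,
                            slope: float = LINEAR_SLOPE) -> float:
    """Elasticity of the linear model at the point (x, y): slope * x / y."""
    return slope * x / y

if __name__ == "__main__":
    print("1% price rise, log-log model: ", round(exact_pct_change_loglog(1.0), 4), "%")
    print("10% price rise, log-log model:", round(exact_pct_change_loglog(10.0), 4), "%")
    # Hypothetical evaluation point: X = 2 and the linear model's fitted Y there.
    x0 = 2.0
    y0 = 2.6911 + LINEAR_SLOPE * x0
    print("linear-model elasticity at X=2:", round(point_elasticity_linear(x0, y0), 4))
```

The log-log elasticity is the same at every price, whereas the elasticity implied by the linear model depends on the point at which it is evaluated.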

Is the commodity elastic or not? What is the threshold for being elastic?

ANALYSIS - INTERPRETATION
Interpreting Regression Coefficients - level-level, log-level & log-log regression
Interpreting Beta - how to interpret your estimate of beta given a level-level, log-level
& log-log regression.

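Model         Equation                                  Dependent variable   Independent variable   Interpretation of $\beta_1$
Level-level   $y = \beta_0 + \beta_1 x + u$             y                    x                      $\Delta y = \beta_1 \Delta x$
Log-level     $\ln(y) = \beta_0 + \beta_1 x + u$        ln(y)                x                      $\%\Delta y = 100\,\beta_1 \Delta x$
Log-log       $\ln(y) = \beta_0 + \beta_1 \ln(x) + u$   ln(y)                ln(x)                  $\%\Delta y = \beta_1 \%\Delta x$

Level-level: "If you change x by one, we'd expect y to change by $\beta_1$."
Log-level: "If we change x by 1 (unit), we'd expect our y variable to change by $100 \cdot \beta_1$ percent."
Log-log: "If we change x by one percent, we'd expect y to change by $\beta_1$ percent."
Downloaded from: https://sites.google.com/site/curtiskephart/ta/econ113/interpreting-beta ..... 26/8/2012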
The commodity is inelastic, because a change in the price of sugar does not cause
a large swing in the quantity demanded.
In practice:
The log-log model is used because the scatter of the data follows that curve.
It is also used when fitting a linear regression runs into problems.
ANALYSIS
Basic Strategy and Key Concepts:

The basic strategy in loglinear modeling involves fitting models to the observed
frequencies in the cross-tabulation of categoric variables.

The models can then be represented by a set of expected frequencies that may or may
not resemble the observed frequencies.
Models will vary in terms of the marginals they fit, and can be described in terms of
the constraints they place on the associations or interactions that are present in the
data. The pattern of association among variables can be described by a set of odds and
by one or more odds ratios derived from them.
Once expected frequencies are obtained, we then compare models that are hierarchical
to one another and choose a preferred model, which is the most parsimonious model
that fits the data.
It's important to note that a model is not chosen if it bears no resemblance to the
observed data.

The choice of a preferred model is typically based on a formal comparison of
goodness-of-fit statistics associated with models that are related hierarchically (models
containing higher order terms also implicitly include all lower order terms).

Ultimately, the preferred model should distinguish between the pattern of the
variables in the data and sampling variability, thus providing a defensible
interpretation.

Downloaded from: http://www.education.umd.edu/EDMS/fac/Hancock/Course_Materials/EDMS771/readings/LogLinearModels%20reading.pdf
The principle is the same as for the log-log model, namely applying
a logarithmic transformation to the data. The difference is that in the semi-log model
only one of Y or X is transformed.

The semi-log model comes in two types:
Log-lin model
Lin-log model
Semi-log model
SEMILOG MODEL

A semilog model is a model in which only one of the variables (Y or X) is
log-transformed. It takes one of the following forms:

$\ln Y_i = \beta_0 + \beta_1 X_i + u_i$, or

$Y_i = \beta_0 + \beta_1 \ln X_i + u_i$

In the first model, $\beta_1$ measures the relative (percentage) change in Y caused by an absolute change
in X.

The first model is also called the constant-growth model, because it measures
a constant growth rate over time, as in trends of employment,
productivity, and so on.

In the second model, $\beta_1$ measures the absolute change in Y caused
by a relative (percentage) change in X.

Downloaded from: http://junaidichaniago.blogspot.com/2009/04/bentuk-fungsional-regresi-linear-seri_19.html
$\ln Y = \alpha_1 + \alpha_2 X + u$
Interpretation:
$\alpha_2$ is the ratio of the relative change in Y to
the absolute change in X:

$$\alpha_2 = \frac{\text{relative change in } Y}{\text{absolute change in } X}$$
Use:
The variable X denotes a unit of time (year, month, and so on).
Y can denote unemployment, population, profit, sales,
GNP, and so on.

Hence $\alpha_2$ is a measure of growth (growth rate)
when $\alpha_2 > 0$, or a measure of decay when $\alpha_2 < 0$.

For this reason, the model is also called the growth model.

Log-Lin model
Based on data on the growth of Gross National Product (GNP) at
constant prices (real growth) for 1986-2004 in a certain
country, the following model was obtained:
ln GNP = 6.9636 + 0.0796 Year
SE: (0.0151) (0.0017)
$R^2 = 0.9756$
Analysis?
The model gives $\alpha_2 = 0.0796$. This means that
GNP grew by about 7.96% per year over the period 1986-2004.
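The 7.96% figure is the slope multiplied by 100, which is the usual approximation for a log-lin model. The sketch below, with illustrative function names, compares it with the exact year-on-year rate implied by the same slope, exp(slope) - 1.

```python
import math

SLOPE = 0.0796  # coefficient on the year variable in the fitted log-lin equation

def instantaneous_growth_pct(slope: float) -> float:
    """Approximate growth rate per period: 100 * slope."""
    return 100.0 * slope

def compound_growth_pct(slope: float) -> float:
    """Exact period-on-period growth rate: 100 * (exp(slope) - 1)."""
    return 100.0 * (math.exp(slope) - 1.0)

if __name__ == "__main__":
    print("approximate growth:", round(instantaneous_growth_pct(SLOPE), 3), "% per year")
    print("compound growth:   ", round(compound_growth_pct(SLOPE), 3), "% per year")
```

The two figures are close when the slope is small and drift apart as it grows, which is why the approximation is normally reserved for modest growth rates.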
Downloaded from: http://www.unboundmedicine.com/medline/ebm/record/22841879/abstract/Smoking_and_air_pollution_exposure_and_lung_cancer_mortality_in_Zhaoyuan_County_.. 24/8/2012
ILLUSTRATION: Log-Lin model
Li H, Da Li Q, Wang MS, Li FJ, Li QH, Ma XJ, Wang DN
Smoking and air pollution exposure and lung cancer mortality in Zhaoyuan County.
[JOURNAL ARTICLE]
Int J Hyg Environ Health 2012 Jul 27.

Simultaneous exposure to high levels of air pollution and high tobacco consumption at the same place is
rare. The aim of the present study was to evaluate the impact of the two factors on the risk of developing
lung cancer.
Data on the number of deaths due to lung cancer and on population from 1970 to 2009 were obtained from
Zhaoyuan County. Data on the smoking populations were obtained at random sampling survey during the
time in Zhaoyuan. Data on the components of atmospheric surveillance were obtained from the local
environmental protection offices. Logarithmic linear regression and general log-linear Poisson age-period-
cohort (APC) models were used to estimate age, period, cohort, gender, smoking, and air pollution effects on
the risk of lung cancer mortality.
The standardized mortality rates of lung cancer drastically increased from 8.43 in per 100 000 individuals in
the 1970-1974 to 25.67 in per 100 000 individuals in the 2005-2009 death survey. The annual change of lung
cancer mortality was 3.20%. In the log linear regression model, the age, proportion of smokers, gender,
period, and air pollution are significantly associated with lung cancer mortality. The APC analysis shows that
the relative risks (RRs) of gender, smoking, and air pollution are 2.29 (95% confidence interval (CI): 2.16-
2.43), 3.05 (95% CI=2.76-3.36), and 1.42 (95% CI=1.19-1.69), respectively. Compared with the period
1970-1974, high RRs were found during 1995-2009. Compared with the birth cohort 1950-1954, the RRs
increased in the birth cohorts of 1910 to the 1940. Compared the aged 35-59 and 60-84 in the 1980-1984
death survey (not exposed to air pollution) with that in the 2005-2009 death survey (exposed to air
pollution), The two age groups exposed to air pollution, 25 years later, had an increased mortality rates for
lung cancer by 2.27 and 3.55 times for males and by 1.47 and 3.35 times for females.
The mortality rates of lung cancer drastically increased in the past 35 years. The trend of lung cancer
mortality may be in a great extent possibly due to the effects of combined smoking and air pollution
exposure.
Interpretation:
$\beta_2$ is the ratio of the absolute change in Y to
the relative change in X:
$$\beta_2 = \frac{\text{absolute change in } Y}{\text{relative change in } X}$$
The model is used where a relative change in X leads to
an absolute change in Y.
For example:
When a company has a turnover target, we can see the resulting increase in profit.
Downloaded from: http://www.econstor.eu/dspace/bitstream/10419/18116/1/dp356.pdf.. 24/8/2012
Lin-Log model: $Y = \beta_1 + \beta_2 \ln X + u$
Environment and Happiness: Valuation of Air Pollution in Ten
European Countries
Heinz Welsch
Department of Economics, University of Oldenburg. 26111 Oldenburg, Germany. Email:
welsch@uni-oldenburg.de. April 2003.

This paper uses a set of panel data from happiness surveys, jointly with data on per capita
income and pollution, to examine how self-reported well-being varies with prosperity and
environmental conditions.
Using POLLUTION to refer to the various pollutants (NITROGEN, PARTICLES, LEAD)
the equations to be estimated can be written as follows:

The coefficient relating to INCOME is expected to be positive while the POLLUTION coefficients
should be negative. As an alternative to the specification in logarithms we will also consider the
corresponding specification in level variables.
The parameters i and t are country and period dummies. The country dummies are included to control
for time-invariant omitted-variable bias, and the period dummies are included to control for global
shocks, which might affect well-being in any period but are not otherwise captured by the explanatory
variables.
As mentioned above, the dependent variable LIFESAT is the average of self-reported wellbeing taken
across all respondents in a particular country and year. Thus, even though the individual responses are
categorical data - cardinalized on a four-point integer scale -LIFESAT is a continuous variable.
Therefore, estimation techniques for discrete variables are not applicable. Instead, standard continuous-
variable methods will be used.
Consider a model of the relationship between profit and turnover:
Profit = 1040.1105 + 24.9879 ln(Turnover)
SE: (18.8574) (2.0740)
$R^2 = 0.9236$
Interpretation: each 1% rise in turnover raises profit by about 0.25 million rupiah (24.9879/100), because in a lin-log model a 1% change in X changes Y by $\beta_2/100$.
What if the company targets a 5% rise in turnover next year?
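The closing question can be answered from the slope alone. The following sketch assumes the fitted lin-log equation for profit and turnover, with profit measured in million rupiah as in the source. The function names are illustrative.

```python
import math

SLOPE = 24.9879  # coefficient on ln(turnover); profit in million rupiah

def profit_change(pct_rise_in_turnover: float, slope: float = SLOPE) -> float:
    """Exact change in profit: slope * ln(1 + pct / 100)."""
    return slope * math.log(1.0 + pct_rise_in_turnover / 100.0)

def profit_change_approx(pct_rise_in_turnover: float, slope: float = SLOPE) -> float:
    """Rule-of-thumb change in profit: (slope / 100) * pct."""
    return slope / 100.0 * pct_rise_in_turnover

if __name__ == "__main__":
    for pct in (1.0, 5.0):
        print(f"{pct}% rise: exact {profit_change(pct):.4f}, "
              f"approx {profit_change_approx(pct):.4f} million rupiah")
```

For a 5% target the exact and rule-of-thumb answers differ only in the second decimal place, so the simple slope/100 reading is adequate for small percentage changes.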
ILLUSTRATION: Lin-Log model
WHEN TO USE LOGLINEAR MODELS:

Suppose we are interested in the relationship between sex, heart disease and body
weight. We could take a sample of 200 subjects and determine the sex, approximate
body weight, and who does and does not have heart disease. The continuous variable,
body weight, is broken down into two discrete categories: not over weight, and over
weight.
Downloaded from: http://userwww.sfsu.edu/~efc/classes/biol710/loglinear/Log%20Linear%20Models.htm 26/8/2012
Property:
When X is very large, Y approaches $\beta_1$.
$$Y = \beta_1 + \beta_2 \left(\frac{1}{X}\right) + u$$
RECIPROCAL MODEL
Functional form
A functional form refers to the algebraic form of a relationship between a dependent
variable and regressors or explanatory variables. The simplest functional form is the
linear functional form, where the relationship between the dependent variable and an
independent variable is graphically represented by a straight line. Other useful
functional forms in regression analysis include:
1. Semi-log. Either the dependent variable or the independent variables are transformed using the
natural logarithm transformation.
2. Double-log. Variables are transformed using the natural logarithm transformation.
3. Reciprocal. Independent variables (one or more) are represented as the reciprocal (that is, for
variable x, the transformation is 1/x).
These functional forms allow the analyst to represent a wide range of shapes.

Interpretation
The interpretation of coefficients is different in alternative functional forms. In the
following formulations y represents the dependent variable, x the independent variable, a
is the y-intercept, b is the slope coefficient, ln(y) and ln(x) represent the natural
logarithm of y and x, respectively; and e is an error term.
1. Linear: y = a + b x + e
In this functional form b represents the change in y (in units of y) that will occur as
x changes by one unit.
2. Semi-log: ln(y) = a + b x + e
In this functional form b is interpreted as follows. A one-unit change in x will cause
a b(100)% change in y; e.g., if the estimated coefficient is 0.05, a
one-unit increase in x will generate approximately a 5% increase in y.
3. Double-log: ln(y) = a + b ln(x) + e
In this functional form b is the elasticity coefficient. A one percent change in x
will cause a b% change in y; e.g., if the estimated coefficient is -2, a
1% increase in x will generate a 2% decrease in y.

Downloaded from: http://cmapskm.ihmc.us/rid=1052458916298_870839951_7777/Functional%20form.htm. 26/8/2012
Application I ($\beta_1 > 0$, $\beta_2 > 0$): model of the average
fixed cost of a class
Definitions:
Y: average fixed cost
X: number of students per class
The operating costs required fall into
two categories:
Fixed costs, including room rental, lecturers' fees,
and so on.
Variable costs, including meals, snacks, hand-outs,
and so on.
The relationship between Y and X can be written as:
$$Y = \beta_1 + \beta_2 \left(\frac{1}{X}\right) + u; \quad \beta_1 > 0,\ \beta_2 > 0$$
Reciprocal function for $\beta_1 > 0$ and $\beta_2 > 0$
Characteristics of the model:
When the number of students is small (X small), the average fixed cost
is very large. Conversely, when the number of students is very
large (X very large), the average fixed cost approaches $\beta_1$ ($\beta_1 > 0$).
How is the model estimated?
By OLS (ordinary least squares), regressing Y on 1/X.
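Estimating the reciprocal model by OLS only requires transforming the regressor first. The sketch below illustrates this on simulated class data. The coefficient values, class sizes, and noise level are hypothetical, chosen only to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical class data: average fixed cost falls toward beta1 as class size grows.
beta1, beta2 = 50.0, 1200.0
students = rng.integers(5, 101, size=150).astype(float)
noise = rng.normal(0.0, 2.0, size=students.size)
avg_fixed_cost = beta1 + beta2 / students + noise

# The model is linear in the transformed regressor Z = 1/X, so OLS applies directly.
z = 1.0 / students
design = np.column_stack([np.ones_like(z), z])
(b1_hat, b2_hat), *_ = np.linalg.lstsq(design, avg_fixed_cost, rcond=None)

if __name__ == "__main__":
    print("estimated asymptote beta1:", round(float(b1_hat), 2))
    print("estimated beta2:          ", round(float(b2_hat), 2))
```

The estimated intercept is the level that the average fixed cost approaches in very large classes.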
[Figure: reciprocal curve of Y against X, with a horizontal asymptote at $\beta_1$.]
Application II ($\beta_1 < 0$, $\beta_2 > 0$)
Definitions:
X: unemployment rate (%)
Y: rate of change of wages (%)
The relationship between Y and X is shown by the following curve:
[Figure: curve of Y against X, with the natural rate of unemployment and the level $-\beta_1$ marked.]
Phillips curve
Phillips curve: United Kingdom, 1950-1966
Y = -1.4282 + 8.7243 (1/X)
t: (2.0625) (2.8498)
$R^2 = 0.3849$
Observations:
$\beta_1 = -1.43\%$. What does this mean?

The lower bound on the change in wages is -1.43%. That is, when the unemployment rate
is very high, wages fall by no more than 1.43% per
year.
$R^2$ is very low, less than 40%, yet the intercept and slope are both significant.
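Two quantities follow directly from the fitted coefficients: the wage-change floor and the unemployment rate at which the predicted wage change is zero (the natural rate marked on the curve). The sketch below assumes the regressor is 1/X, as in the reciprocal model. The function names are illustrative.

```python
# Fitted reciprocal Phillips curve, assuming the regressor is 1/X:
# Y = B1 + B2 * (1 / X), with X = unemployment rate (%) and Y = wage change (%).
B1 = -1.4282
B2 = 8.7243

def wage_change(unemployment_pct: float) -> float:
    """Predicted % change in wages at a given unemployment rate."""
    return B1 + B2 / unemployment_pct

def natural_rate() -> float:
    """Unemployment rate at which the predicted wage change is zero: -B2 / B1."""
    return -B2 / B1

if __name__ == "__main__":
    for rate in (2.0, 6.0, 20.0):
        print(f"unemployment {rate}% -> wage change {wage_change(rate):.3f}%")
    print("natural rate of unemployment:", round(natural_rate(), 3), "%")
```

As unemployment grows, the predicted wage change approaches B1 from above, which is the floor described in the observations.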
ILLUSTRATION

The Phillips curve shows the relationship between the
unemployment rate and the inflation rate in a
country.

According to the Phillips curve, the two are
negatively related: when inflation
rises, unemployment falls, and when
inflation falls, unemployment rises.

In macroeconomics this trade-off makes for
a difficult policy choice.
Application III ($\beta_1 > 0$, $\beta_2 < 0$)
Defined:
Y: consumption of, or expenditure on, a commodity
X: income
The relationship between a person's income and consumption of a commodity is
depicted by the Engel curve.
Christian Engel:
"Law of Consumption": the income elasticity of staple foods is very small (E < 1);
for clothing and housing E = 1; for recreation, health, and luxury goods E > 1.
The poorer a family or nation, the larger the share of expenditure that goes to
food.

Engel curve: the quantity of a good that can be purchased (vertical axis) as a
function of income (horizontal axis).

The slope of the Engel curve reflects the income elasticity (Ep).
For staple goods, the Engel curve is fairly flat because a change in income has
little effect on consumption of the good.
For meat, for example, the Engel curve is fairly steep.
Goods can be classified as necessities or luxury goods.

Factors that affect individual demand:
1. The price of the good itself, in line with the law of demand.
2. Consumer income: the higher the income, the higher the demand.
3. Tastes: when the taste for a good rises, demand rises.
4. Prices of other goods: substitutes and complements.

Downloaded from: https://sites.google.com/site/kuliahteorimikro1/
PROPERTIES OF THE ENGEL CURVE
There is a threshold level of income. If income is below the threshold, the
commodity is not purchased or consumed. The threshold lies at $X = -\beta_2 / \beta_1$.

There is a satiety level. Even when income becomes very high, consumption of the
commodity does not exceed that level ($\beta_1$).
$Y = \beta_1 + \beta_2 \left( \frac{1}{X} \right) + u$
[Figure: Engel curve of consumption (C) against income (I), crossing the income axis at $-\beta_2 / \beta_1$ and approaching the asymptote $\beta_1$]
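Both properties follow from the reciprocal form $Y = \beta_1 + \beta_2 (1/X)$. A short derivation, assuming $\beta_1 > 0$ and $\beta_2 < 0$ as in Application III and ignoring the error term:

```latex
\begin{align*}
Y = 0 \;&\iff\; \beta_1 + \frac{\beta_2}{X} = 0
       \;\iff\; X = -\frac{\beta_2}{\beta_1} > 0
       && \text{(threshold income)} \\
\lim_{X \to \infty} \left( \beta_1 + \frac{\beta_2}{X} \right) &= \beta_1
       && \text{(satiety level)}
\end{align*}
```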
Double-log Regression Model
Downloaded from: http://junaidichaniago.blogspot.com/2009/04/bentuk-fungsional-regresi-linear-seri.html ..
23/8/2012
The regression models presented earlier are linear in both parameters and
variables. The more general meaning of linear regression, however, is that the
regression is linear in the parameters (or can be made linear intrinsically through
a transformation of variables), while the variables themselves may or may not be
linear.
For example, the equation $Y = \beta_0 + \beta_1 X_i^2$ counts as a linear regression
because its parameter ($\beta_1$) enters linearly, even though its variable ($X_i^2$)
does not.

On this basis, various functional forms of the regression model can be developed.
The first form discussed here is the double-log model:

Consider the model: $Y_i = \beta_0 X_i^{\beta_1} e^{u_i}$

This model looks nonlinear in the parameters, but it can be made linear
intrinsically with the following transformation:

$\ln Y_i = \ln \beta_0 + \beta_1 \ln X_i + u_i$

ln = natural logarithm (logarithm to base e = 2.71828)

If $\alpha = \ln \beta_0$, $Y_i^* = \ln Y_i$ and $X_i^* = \ln X_i$, the equation can be
rewritten as:

$Y_i^* = \alpha + \beta_1 X_i^* + u_i$

This model is called the double-log model.

The point to note about the double-log model is that the coefficient $\beta_1$ can
be interpreted as an elasticity, that is, the percentage change in Y resulting
from a one percent change in X.
Thus, if X is price and Y is demand, the coefficient $\beta_1$ can be interpreted as
the price elasticity.
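A minimal sketch of the transformation and fit, assuming NumPy; the function name `fit_double_log` and the price/demand figures are hypothetical, constructed so that the true elasticity is -2.

```python
import numpy as np

def fit_double_log(x, y):
    """Fit ln(y) = alpha + beta1 * ln(x) by OLS; returns (beta0, beta1)."""
    log_x, log_y = np.log(x), np.log(y)
    design = np.column_stack([np.ones_like(log_x), log_x])
    (alpha, beta1), *_ = np.linalg.lstsq(design, log_y, rcond=None)
    return np.exp(alpha), beta1          # beta0 = exp(alpha)

# Hypothetical, noise-free data generated from Y = 50 * X^(-2)
price = np.array([1.0, 2.0, 4.0, 8.0])
demand = 50.0 * price ** -2.0
beta0, elasticity = fit_double_log(price, demand)
```

The slope of the fitted line in log-log space is read directly as the elasticity, so no further conversion of the coefficient is needed.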

QUALITATIVE CHOICE MODELS
Downloaded from: http://junaidichaniago.blogspot.com/2009/04/model-pilihan-kualitatif-seri-5-model.html ..
23/8/2012
A qualitative choice model is a model whose dependent variable is measured on a
nominal or ordinal scale. In other words, the regression model involves two or
more qualitative choices.

If an independent variable is measured on a nominal or ordinal scale, dummy
variables can be formed and the regression equation estimated by OLS (Ordinary
Least Squares).
If, however, the dependent variable is nominal or ordinal, OLS estimation runs
into violations of the classical assumptions, in particular normally distributed
residuals and homoskedasticity.
The most widely used qualitative choice model is the logit model in its several
variants. Besides being used more often, the logit model is also simpler to
interpret than the alternatives.
Logit models can be distinguished by the measurement scale and the number of
categories of the dependent variable, as follows:

1. Binary Logit Model
Model in which the dependent variable has only two categories. An example is a
model predicting whether or not an individual decides to buy a car. Another
example is a model analysing the effect of socio-economic factors on whether or
not women participate in the labour force.
2. Multinomial Logit Model
Model in which the dependent variable has more than two categories on a nominal
scale. An example is a model predicting how voters choose between a socialist,
a nationalist, or a religion-based party.
3. Ordinal Logit Model
Model in which the dependent variable has more than two categories on an ordinal
scale. An example is a model predicting where consumers decide to shop: a
traditional market, a semi-modern market, or a modern market (supermarket or
hypermarket).
There are several methods for estimating a logit model: maximum likelihood,
noniterative weighted least squares, and discriminant function analysis. The
method generally used in statistical software packages is maximum likelihood.
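To illustrate maximum likelihood estimation of a binary logit, here is a minimal sketch using Newton-Raphson iterations, assuming NumPy. The function name `fit_logit`, the fixed iteration count, and the income and car-purchase data are all hypothetical; statistical packages use more careful convergence checks than this.

```python
import numpy as np

def fit_logit(x, y, iterations=25):
    """Binary logit by maximum likelihood (Newton-Raphson); returns [intercept, slope]."""
    design = np.column_stack([np.ones(len(y)), x])
    beta = np.zeros(design.shape[1])
    for _ in range(iterations):
        p = 1.0 / (1.0 + np.exp(-design @ beta))          # predicted probabilities
        gradient = design.T @ (y - p)                      # score vector
        hessian = design.T @ (design * (p * (1 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)   # Newton step
    return beta

# Hypothetical data: income level and whether the individual bought a car (1) or not (0)
income = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
buys_car = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0])
beta = fit_logit(income, buys_car)
```

The example data are deliberately not perfectly separable by income; with perfectly separated classes the maximum likelihood estimate does not exist and the iterations would not settle.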

REGRESSION WITH DUMMY INDEPENDENT VARIABLES

REGRESSION WITH DUMMY VARIABLES
The regressions studied so far use quantitative data.
Some analyses also require qualitative variables.
Examples:
(1) The effect of gender on salary.
(2) The effect of product quality on sales turnover.
(3) The effect of price on service satisfaction.
(4) The effect of education on age at first marriage.

Examples (1) and (2): qualitative independent variable and quantitative dependent variable.
Example (3): quantitative independent variable and qualitative dependent variable.
Example (4): qualitative independent variable and qualitative dependent variable.

(1) and (2): regression with dummy variables.
(3) and (4): logistic or multinomial model.
Downloaded from: enistat.lecture.ub.ac.id/files/2012/03/Regresi-Dummy.ppt .. 22/8/2012
Linear regression can be used to analyse data when the independent variable (X) is
of nominal type. This technique is known as dummy variable regression.
Suppose a researcher predicts the profit of two kinds of firms (foreign private and
national private) from the amount each firm spends on advertising its products.
(For foreign private firms, the profit observed is that earned from product sales
within Indonesia only.)
A case like this can be handled by regression with a dummy variable. What needs
attention is how the dummy variable is constructed for the regression analysis.
The response variable (Y) is firm profit, the independent variable (X) is
advertising expenditure, and the dummy variable is firm type, namely foreign
private and national private. This means there are 2 types/categories of firm.

To construct the dummy variable, we first need to determine how many dummy
variables to use. The number of dummy variables is the number of categories minus
one.
In the example above, the number of dummy variables is 2 - 1 = 1.
If foreign private firms are coded 1, then national private firms are coded 0.
Downloaded from: http://ineddeni.wordpress.com/2007/08/17/analisis-regresi-dengan-variabel-dummy/..
26/8/2012
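The "categories minus one" rule can be sketched in a few lines of plain Python. The helper `make_dummies` and the four example firms are hypothetical; the reference category is the one coded 0 on every dummy.

```python
def make_dummies(values, reference):
    """Return one 0/1 column per category, excluding the reference category."""
    categories = sorted(set(values) - {reference})
    return {c: [1 if v == c else 0 for v in values] for c in categories}

# Hypothetical sample of four firms
firm_type = ["foreign", "national", "foreign", "national"]
dummies = make_dummies(firm_type, reference="national")
```

With two firm types this yields a single column, equal to 1 for foreign private firms and 0 for national private firms, matching the coding described above.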
1. Qualitative data must take categorical form. They cannot be used in a
regression directly, so dummy variables are needed.
2. Dummy variables are also called indicator, binary, categorical, qualitative,
or dichotomous variables.
3. In principle, a dummy variable expresses a comparison of characteristics.
For example:
A comparison of the condition (size/number) of consumers who are satisfied
with a product against consumers who are not.
A comparison of salaries between men and women.

DUMMY VARIABLES

In statistics and econometrics, particularly in regression analysis, a dummy variable (also known
as an indicator variable) is one that takes the value 0 or 1 to indicate the absence or presence of
some categorical effect that may be expected to shift the outcome.
For example, in time series analysis, a dummy variable may be used to indicate the occurrence of a
war or a strike. It can thus be thought of as a truth value represented as the numerical value 0
or 1.

A dummy variable is a "proxy" variable, a numeric stand-in for a qualitative fact in a regression
model.
In regression analysis, the dependent variable is influenced by quantitative variables (income,
output, prices, etc.) and also by qualitative variables (gender, religion, geographic region, etc.).
A dummy independent variable takes the value 0 or 1, hence it is also called a binary variable.
A dummy with the value 0 makes that variable's coefficient drop out, and a dummy with the value 1
makes the coefficient act as an addition to the intercept in the regression model.
For example, suppose gender is one of the qualitative variables relevant to a regression. Female
and male are then the categories of the variable Gender. If female is assigned the value 1, male
gets the value 0 (or vice versa).
A dummy variable can thus be defined as a qualitative representative variable incorporated into a
regression such that it takes the value 1 whenever the category it represents occurs, and 0
otherwise.
Dummy variables are used as devices to sort data into mutually exclusive categories (such as
male/female, smoker/non-smoker, etc.).

Downloaded from: http://en.wikipedia.org/wiki/Dummy_variable_(statistics). 26/8/2012
Technique for constructing dummy variables and estimation
A dummy takes the value 1 or 0. Why?

Consider the following categorical data:
1. Satisfied consumers
2. Dissatisfied consumers
Can we run a regression using the category codes above, that is, 1 and 2?
Using those codes would mean assigning dissatisfied consumers a value twice that
of satisfied consumers.
If a dummy is created instead, for example:
1. Satisfied consumer = 1
2. Dissatisfied consumer = 0.
MODELING MONTHLY TEMPERATURE DATA IN LISBON AND
PRAGUE
Teresa Alpuim, Abdel El-Shaarawi
Environmetrics. 04/2009; 20(7):835 - 852.

This paper examines monthly average temperature series in two widely separated
European cities, Lisbon (1856-1999) and Prague (1841-2000).

The statistical methodology used begins by fitting a straight line to the temperature
measurements in each month of the year. Hence, the 12 intercepts describe the
seasonal variation of temperature and the 12 slopes correspond to the rise in
temperature in each month of the year. Both cities show large variations in the monthly
slopes.
In view of this, an overall model is constructed to integrate the data of each city.
Sine/cosine waves were included as independent variables to describe the seasonal
pattern of temperature, and sine/cosine waves multiplied by time were used to describe
the increase in temperature corresponding to the different months.

The model also takes into account the autoregressive, AR(1), structure that was found
in the residuals. A test of the significance of the variables that describe the variation of
the increase in temperature shows that both Lisbon and Prague had an increase in
temperature that is different according to the month.
The winter months show a higher increase than the summer months.

Downloaded from: http://www.researchgate.net/profile/Teresa_Alpuim/. 26/8/2012
Technique for constructing dummy variables and estimation
The regression distinguishes the condition in which the consumer is satisfied
(the dummy equals 1, so the dummy is present in the model) from the opposite
condition (the dummy equals 0, so the dummy drops out of the model). The model
therefore shows the condition with and without the dummy.

To make this concrete, consider the following example:
A study of the effect of area of residence, urban or rural, on the prices of
various products.

Model: $Y = \alpha + \beta D + u$

Y = product price
D = area of residence
D = 1; urban
D = 0; rural
u = random error.

Note: the category for which the dummy equals 0 is called the comparison, base,
or reference category.
ILLUSTRATION
From the model above, the mean product price is:
Urban: $E(Y \mid D = 1) = \alpha + \beta$
Rural: $E(Y \mid D = 0) = \alpha$

If $\beta = 0$, there is no price difference between urban and rural areas.

If $\beta \neq 0$, there is a price difference between urban and rural areas.

The model above is an OLS regression model.
Suppose OLS estimation of the model gives:

$\hat{Y} = 9.4 + 16 D$
t: (53.22) (6.245)
$R^2 = 96.54\%$
$\alpha \neq 0$ and $\beta \neq 0$; that is, $\alpha = 9.4$ and $\beta = 16$.

This means that the mean product price in urban areas is 9.4 + 16 = 25.4 thousand
rupiah, and in rural areas 9.4 thousand rupiah. It can therefore be concluded that
products are more expensive in urban areas than in rural areas.
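With a single dummy regressor, the OLS intercept equals the mean of the reference group and the dummy coefficient equals the difference between the two group means. A minimal sketch, assuming NumPy; the six prices are hypothetical values chosen so that the group means are 9.4 and 25.4, and `fit_dummy` is an illustrative name.

```python
import numpy as np

def fit_dummy(y, d):
    """OLS fit of y = alpha + beta * d for a 0/1 dummy d; returns (alpha, beta)."""
    design = np.column_stack([np.ones(len(d)), d])
    (alpha, beta), *_ = np.linalg.lstsq(design, y, rcond=None)
    return alpha, beta

# Hypothetical prices (thousand rupiah): three rural, then three urban observations
price = np.array([9.0, 9.8, 9.4, 25.0, 25.8, 25.4])
urban = np.array([0, 0, 0, 1, 1, 1])
alpha, beta = fit_dummy(price, urban)
rural_mean = price[urban == 0].mean()
urban_mean = price[urban == 1].mean()
```

Here `alpha` reproduces the rural mean and `alpha + beta` the urban mean, which is the reading given to the estimated equation above.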
Model: the independent variables include both a quantitative and a qualitative
variable.
Example: an analysis of lecturer salaries at a private university in Jakarta,
based on gender and length of teaching experience.
Defined:
Y = salary of a lecturer
X = length of teaching experience (years)
G = 1; male lecturer
G = 0; female lecturer

Model:
$Y = \alpha_1 + \alpha_2 G + \beta X + u$

From this model:
Mean salary of female lecturers = $\alpha_1 + \beta X$
Mean salary of male lecturers = $\alpha_1 + \alpha_2 + \beta X$
If $\alpha_2 = 0$, there is no salary discrimination between male and female
lecturers.
If $\alpha_2 \neq 0$, there is salary discrimination between male and female
lecturers.

Suppose male lecturers earn more than female lecturers; geometrically, the model
can then be drawn as follows:
[Figure: salary against teaching experience; two parallel lines, male lecturers above female lecturers, with intercept $\alpha_1$ and vertical gap $\alpha_2$]
What if the definitions of male and female are reversed?
Suppose:
S = 1; female lecturer
S = 0; male lecturer

The model becomes:
$Y = \alpha_1 + \alpha_2 S + \beta X + u$

If $\alpha_2 = 0$, there is no salary discrimination between male and female lecturers.
If $\alpha_2 \neq 0$, there is salary discrimination between male and female lecturers.
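The effect of swapping the reference category can be checked numerically. The sketch below assumes NumPy and uses hypothetical, noise-free salary data; the helper `ols` is an illustrative name, not a library function.

```python
import numpy as np

def ols(columns, y):
    """OLS with a constant; returns coefficients [constant, *columns]."""
    design = np.column_stack([np.ones(len(y))] + columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Hypothetical data: salary = 5 + 2 * male + 0.5 * years of experience
years = np.array([1.0, 3.0, 5.0, 2.0, 4.0, 6.0])
male = np.array([1, 1, 1, 0, 0, 0])
salary = 5.0 + 2.0 * male + 0.5 * years

a1, a2, b = ols([male, years], salary)        # G = male, reference: female
c1, c2, c = ols([1 - male, years], salary)    # S = female, reference: male
```

Under these assumptions the dummy coefficient changes sign (2 becomes -2) and the intercept moves from 5 to 7, while the slope on experience and the implied group means stay the same.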
The effects of particulate air pollution on daily deaths: a multi-city case
crossover analysis
J. Schwartz.
Occup Environ Med 2004;61:956-961 doi:10.1136/oem.2003.008250

Numerous studies have reported that day-to-day changes in particulate air pollution are associated with
day-to-day changes in deaths. Recently, several reports have indicated that the software used to control
for season and weather in some of these studies had deficiencies.

This approach compares the exposure of each case to their exposure on a nearby day, when they did not
die. Hence it controls for seasonal patterns and for all slowly varying covariates (age, smoking, etc) by
matching rather than complex modelling. A key feature is that temperature can also be controlled by
matching. This approach was applied to a study of 14 US cities. Weather and day of the week were
controlled for in the regression.

A 10 µg/m³ increase in PM10 was associated with a 0.36% increase in daily deaths from internal causes
(95% CI 0.22% to 0.50%). Results were little changed if, instead of symmetrical sampling of control
days the time stratified method was applied, when control days were matched on temperature, or when
more lags of winter time temperatures were used. Similar results were found using a Poisson regression,
but the case-crossover method has the advantage of simplicity in modelling, and of combining matched
strata across multiple locations in a single stage analysis.

Despite the considerable differences in analytical design, the previously reported associations of particles
with mortality persisted in this study. The association appeared quite linear. Case-crossover designs
represent an attractive method to control for season and weather by matching.

Downloaded from: http://oem.bmj.com/content/61/12/956.full. 26/8/2012
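REVERSING THE DEFINITION
Suppose male lecturers earn more than female lecturers, so that $\alpha_2$ is
negative; geometrically, the model can then be drawn as follows:

[Figure: salary against teaching experience; two parallel lines, male lecturers above female lecturers, with intercept $\alpha_1$ and vertical gap $\alpha_2$]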
DEFINITION
Note that under the new definition:
Mean salary of female lecturers = $\alpha_1 + \alpha_2 + \beta X$
Mean salary of male lecturers = $\alpha_1 + \beta X$

So, whichever category is used as the reference, the conclusion is the same, even
though the estimated regression coefficients differ.

What if the definitions are:
$D_2$ = 1; male lecturer
$D_2$ = 0; female lecturer
$D_3$ = 1; female lecturer
$D_3$ = 0; male lecturer
so that the model becomes:
$Y = \alpha_1 + \alpha_2 D_2 + \alpha_3 D_3 + \beta X + u$

What happens if this model is estimated by OLS?

Note that there is a linear relationship between $D_2$ and $D_3$, namely
$D_2 = 1 - D_3$ or $D_3 = 1 - D_2$. This is perfect collinearity between $D_2$ and
$D_3$, so OLS cannot be used.

When constructing dummies: if the data have m categories, only m - 1 dummy
variables are needed. In the example above there are only two categories, male
and female, so only one dummy variable is needed.
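The collinearity can be seen directly in the rank of the design matrix. A minimal sketch, assuming NumPy; the five observations are hypothetical.

```python
import numpy as np

# Hypothetical sample of five lecturers
male = np.array([1, 0, 1, 0, 1])            # D2
female = 1 - male                            # D3 = 1 - D2
experience = np.array([2.0, 5.0, 7.0, 1.0, 4.0])
ones = np.ones(len(male))

# Constant plus both dummies: the constant equals D2 + D3, so one column is redundant
trap = np.column_stack([ones, male, female, experience])
# Constant plus m - 1 = 1 dummy: all columns are linearly independent
safe = np.column_stack([ones, male, experience])

rank_trap = np.linalg.matrix_rank(trap)
rank_safe = np.linalg.matrix_rank(safe)
```

The four-column matrix has rank 3, so the normal equations have no unique solution; dropping one dummy restores full column rank.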
VARIABLES WITH MORE THAN TWO CATEGORIES
Suppose:
Education has 3 categories:
1. Did not finish senior high school (SMU)
2. Finished senior high school
3. Finished university.
(3 - 1) = 2 dummy variables are needed.
The two dummy variables, $D_2$ and $D_3$, are defined as follows:
$D_2$ = 1; highest education is senior high school
$D_2$ = 0; otherwise
$D_3$ = 1; highest education is university
$D_3$ = 0; otherwise
Which is the reference category?
The Journal of Transport and Land Use. Vol. 3. No. 2. [Summer 2010] pp. 39-63
Modeling hedonic residential rents for land use and transport
simulation while considering spatial effects
Michael Löchl and Kay W. Axhausen.

The application of UrbanSim requires land or real estate price data for the study area.
These can be difficult to obtain, particularly when tax assessor data and data from
commercial sources are unavailable.
The article discusses an alternative method of data acquisition and applies hedonic
modeling techniques in order to generate the required data. Many studies have
highlighted that ordinary least square (OLS) regression approaches lack the ability to
consider spatial dependency and spatial heterogeneity, consequently leading to biased
and inefficient estimations. Therefore, a comprehensive data set is used for modeling
residential asking rents by applying and comparing OLS, spatial autoregressive, and
geographically weighted regression (GWR) techniques.
The latter technique performed best with regard to model fit, but the issue of correlated
coefficients favored a spatial simultaneous autoregressive model.

Overall, the article reveals that when housing markets are a particular concern in
UrbanSim applications, significant efforts are needed for the price data generation and
modeling. The study concludes with further development potentials for UrbanSim.

ILLUSTRATION
Consider the following model:
$Y = \alpha_1 + \alpha_2 D_2 + \alpha_3 D_3 + \beta X + u$

Y = annual expenditure on health care
X = annual income
$D_2$ = 1; highest education is senior high school (SMU)
$D_2$ = 0; otherwise
$D_3$ = 1; highest education is university (bachelor's degree, S1)
$D_3$ = 0; otherwise
What is a person's mean expenditure by education level?
Did not finish senior high school: $\alpha_1 + \beta X$
Finished senior high school: $\alpha_1 + \alpha_2 + \beta X$
Holds a bachelor's degree (S1): $\alpha_1 + \alpha_3 + \beta X$

Assign each value of a category as a binary dummy variable
We assign each value of Mode as a binary dummy variable. The distance between two
objects is the ratio of the number of unmatched dummy variables to the total number
of dummy variables.
For example, we have two variables: Gender and Mode. Gender has two values: 0 =
male and 1 = female. Mode has three choices of public transport mode to go to school:
Bus, Train and Van. Suppose we have three subjects: Alex (male) uses Bus, Brian
(male) uses Van and Cherry (female) uses Bus.

Let the first coordinate be Gender and the second coordinate be Mode (Bus, Train,
Van).
We have:
Alex = (0, (1, 0, 0))
Brian = (0, (0, 0, 1))
Cherry = (1, (1, 0, 0))

Downloaded from:
http://people.revoledu.com/kardi/tutorial/Similarity/NominalVariables.html#Method1
26/8/2012
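One possible reading of this distance is sketched below in plain Python. It assumes that the Gender and Mode dummies are flattened into a single vector of four 0/1 values and that the ratio is taken over all four; the source tutorial may combine the variables differently, and the function name `dummy_distance` is hypothetical.

```python
def dummy_distance(a, b):
    """Share of dummy positions on which two objects differ."""
    unmatched = sum(1 for u, v in zip(a, b) if u != v)
    return unmatched / len(a)

# Flattened coding: (gender, bus, train, van)
alex = (0, 1, 0, 0)
brian = (0, 0, 0, 1)
cherry = (1, 1, 0, 0)

d_alex_brian = dummy_distance(alex, brian)      # differ on bus and van
d_alex_cherry = dummy_distance(alex, cherry)    # differ on gender only
d_brian_cherry = dummy_distance(brian, cherry)  # differ on gender, bus and van
```

Under this flattened reading, Alex is closer to Cherry (same mode, different gender) than to Brian (same gender, different mode), because a change of mode flips two dummy positions.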
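ILLUSTRATION
Viewed geometrically, the health care expenditure is as follows:
[Figure: three parallel lines against income (X) for university (PT), senior high school (SMU), and did not finish senior high school, with intercept $\alpha_1$ and vertical shifts $\alpha_2$ and $\alpha_3$; the vertical axis is labelled savings (Y) in the source figure]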
A dummy variable with a value of 0 causes the variable's coefficient to drop out,
while a value of 1 causes the coefficient to act as an addition to the intercept in
the model. Because they are easy to set up and their use is easy to justify, dummy
variables are now commonly used in economic forecasting and time series analysis.

Let's say that wages are the dependent variable and the wage is a function of
gender and education, where:
$\text{Wage} = \beta_0 + \delta_0 \, \text{female} + \beta_1 \, \text{education} + e$

Female is 1 while male is 0.
$\delta_0$ is the difference in wages between females and males at the same level
of education.
Downloaded from: http://www.economicswiki.com/economics-tutorials/dummy-variable/.. 26/8/2012
Regression with Several Qualitative Variables
Example:
$Y = \alpha_1 + \alpha_2 D_2 + \alpha_3 D_3 + \beta X + u$

Y = salary
X = experience (years)
$D_2$ = 1; male lecturer
$D_2$ = 0; female lecturer
$D_3$ = 1; Faculty of Engineering
$D_3$ = 0; otherwise

From the model:
Mean salary of female lecturers teaching outside the Faculty of Engineering = $\alpha_1 + \beta X$
Mean salary of male lecturers teaching outside the Faculty of Engineering = $\alpha_1 + \alpha_2 + \beta X$
Mean salary of female lecturers teaching in the Faculty of Engineering = $\alpha_1 + \alpha_3 + \beta X$
Mean salary of male lecturers teaching in the Faculty of Engineering = $\alpha_1 + \alpha_2 + \alpha_3 + \beta X$
Categorical (nominal) variables can be coded for multiple regression analyses in several
ways, three of which we will examine here.
The topic of the multiple regression approach to ANOVA is very complex.
For this reason, we reduce the complexity for purposes of introduction by limiting our
discussion to cases where all values of the categorical variable (all the groups) have the
same number of subjects (n).
There are three types of coding, (1) dummy coding, (2) effect coding, and (3)
orthogonal contrast coding.
We can explore how the multiple regression analysis is run and interpreted for each
type of coding.

In coding nominal variables, the first thing to do is to determine the number of
categories the nominal variable has. For example, it is assumed we have a nominal
variable for zip code, and we will assume our total sample has 10 people from each of 4
zip codes (n=10, N=40).
For all types of categorical coding, the number of categorical variables needed for the
regression analysis is the number of categories or groups minus 1. Since we have 4
categories in our variable, we will need 3 recoded regression variables to represent the
one nominal variable of zip code. For each participant, we have a score of some type
which is the subject of our analysis (the criterion variable), and we have a rating of
some type which is an interval level predictor variable.

Downloaded from: http://www.jamesstacks.com/stat/MR_dummy.htm .. 26/8/2012
ILLUSTRATION
Suppose the following regression equation is obtained:

$\hat{Y} = 7.43 + 0.207 D_2 + 0.164 D_3 + 1.226 X$
$R^2 = 91.22\%$

What does it mean if the t-tests show that $D_2$ and $D_3$ are significant?
What is the mean salary of a female lecturer teaching outside the Faculty of
Engineering with 1 year of experience?
7.43 + 1.226 = Rp 8.656 million.
What is the mean salary of a male lecturer teaching outside the Faculty of
Engineering with 1 year of experience?
7.43 + 0.207 + 1.226 = Rp 8.863 million.
What is the mean salary of a female lecturer teaching in the Faculty of
Engineering with 1 year of experience?
7.43 + 0.164 + 1.226 = Rp 8.820 million.
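The same arithmetic can be wrapped in a small function that evaluates the estimated equation for any combination of the two dummies. The function name `mean_salary` is hypothetical; the coefficients are those of the estimated equation above, in millions of rupiah.

```python
def mean_salary(male, engineering, years):
    """Predicted mean salary (million rupiah) from the estimated equation."""
    return 7.43 + 0.207 * male + 0.164 * engineering + 1.226 * years

female_other = mean_salary(male=0, engineering=0, years=1)
male_other = mean_salary(male=1, engineering=0, years=1)
female_engineering = mean_salary(male=0, engineering=1, years=1)
male_engineering = mean_salary(male=1, engineering=1, years=1)
```

The fourth case, a male lecturer in the Faculty of Engineering, is not worked out in the text; the equation gives 7.43 + 0.207 + 0.164 + 1.226 = 9.027.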

Categorical Regression Models with Optimal Scaling for Predicting Indoor Air
Pollution Concentrations inside Kitchens in Nepalese Households
Srijan Lal Shrestha
Nepal Journal of Science and Technology 10 (2009) 205-211

Indoor air pollution from biomass fuels is considered as a potential environmental risk factor in
developing countries of the world. Exposure to these fuels has been associated with many respiratory
and other ailments such as acute lower respiratory infection, chronic obstructive pulmonary disease,
asthma, lung cancer, cataract, adverse pregnancy outcomes, etc. The use of biomass fuels is found to be
nearly zero in the developed countries but widespread in the developing countries including Nepal.
Women and children are the most vulnerable group since they spend a lot of time inside smoky kitchens
with biomass fuel burning, inefficient stove and poor ventilation particularly in rural households of
Nepal. Measurements of indoor air pollution through monitoring equipment such as high volume
sampler, laser dust monitor, etc are expensive, thus not affordable and practicable to use them
frequently. In this context, it becomes imperative to use statistical models instead for predicting air
pollution concentrations in household kitchens.

The present paper has attempted to contribute in this regard by developing some statistical models
specifically categorical regression models with optimal scaling for predicting indoor particulate air
pollution and carbon monoxide concentrations based upon a cross-sectional survey data of Nepalese
households.
The common factors found significant for prediction are fuel type, ventilation situation and house types.
The highest estimated levels are found to be for those using solid biomass fuels with poor ventilation
and Kachhi houses.
The estimated PM10 and CO levels are found to be 3024 µg/m³ and 24115 µg/m³ inside the kitchen at
cooking time, which are 5.2 and 40.40 times higher than the lowest predicted values for those using
LPG / biogas and living in Pakki houses with improved ventilation, respectively.
OTHER USES OF DUMMY VARIABLES
In analyses using time series data, dummy variables are useful for comparing one
period with another.
For example:
How did PT Astra's production before the economic crisis compare with production
during it?
How did public interest in saving at Islamic (Syariah) banks change after the MUI
issued a fatwa declaring interest haram?
Is it true that the price of the dollar tends to rise every December?
Is it true that Indofood's share price rises every Monday?

In the models above, differences are captured only by the intercept. What if the
slope also differs? This calls for comparing two regressions.
The association between low level exposures to ambient air pollution and term low
birth weight: a retrospective cohort study
Rose Dugandzic, Linda Dodds, David Stieb and Marc Smith-Doiron
Environmental Health: A Global Access Science Source 2006, 5:3

Studies in areas with relatively high levels of air pollution have found some positive
associations between exposures to ambient levels of air pollution and several birth
outcomes including low birth weight (LBW). The purpose of this study was to examine
the association between LBW among term infants and ambient air pollution, by
trimester of exposure, in a region of lower level exposures.

The relationship between LBW and ambient levels of particulate matter up to 10 µm in
diameter (PM10), sulfur dioxide (SO2) and ground-level ozone (O3) was evaluated using
the Nova Scotia Atlee Perinatal Database and ambient air monitoring data from the
Environment Canada National Air Pollution Surveillance Network and the Nova Scotia
Department of Environment. The cohort consisted of live singleton births (≥37 weeks of
gestation) between January 1, 1988 and December 31, 2000. Maternal exposures to air
pollution were assigned to women living within 25 km of a monitoring station at the
time of birth. Air pollution was evaluated as a continuous and categorical variable
(using quartile exposures) for each trimester and relative risks were estimated from
logistic regression, adjusted for confounding variables.

COMPARING TWO REGRESSIONS
Consider the following equation:
Savings (Y) = $\alpha_1 + \alpha_2$ Income (X) + u

Is the relationship always the same before the monetary crisis and during it?

The data are split in two by period, before and during the crisis, giving two
regression models:
Period I, before the crisis:
$Y_i = \alpha_1 + \alpha_2 X_i + u_i$; $i = 1, 2, \ldots, n$

Period II, during the crisis:
$Y_i = \beta_1 + \beta_2 X_i + \varepsilon_i$; $i = n+1, n+2, \ldots, N$
Segmented regression is a method in regression analysis in which the independent variable is
partitioned into intervals and a separate line segment is fit to each interval. Segmented or piecewise
regression analysis can also be performed on multivariate data by partitioning the various
independent variables. Segmented regression is useful when the independent variables, clustered into
different groups, exhibit different relationships between the variables in these regions. The
boundaries between the segments are breakpoints.
Segmented linear regression is segmented regression whereby the relations in the intervals are
obtained by linear regression.
Segmented linear regression with two segments separated by a breakpoint can be useful to quantify
an abrupt change of the response function (Yr) of a varying influential factor (x). The breakpoint can
be interpreted as a critical, safe, or threshold value beyond or below which (un)desired effects occur.
The breakpoint can be important in decision making.
The figures illustrate some of the results and regression types obtainable.
A segmented regression analysis is based on the presence of a set of ( y , x ) data, in which y is the
dependent variable and x the independent variable.
The least squares method applied separately to each segment, by which the two regression lines are
made to fit the data set as closely as possible while minimizing the sum of squares of the differences
(SSD) between observed (y) and calculated (Yr) values of the dependent variable, results in the
following two equations:
Yr = A1.x + K1 for x < BP (breakpoint)
Yr = A2.x + K2 for x > BP (breakpoint)
where:
Yr is the expected (predicted) value of y for a certain value of x; A1 and A2 are regression
coefficients (indicating the slope of the line segments); K1 and K2 are regression constants
(indicating the intercept at the y-axis).

Downloaded from: http://en.wikipedia.org/wiki/Segmented_regression 26/8/2012
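A minimal sketch of two-segment linear regression, assuming NumPy and assuming the breakpoint BP is already known (in practice it is usually estimated as well). The function name `fit_segments` and the data are hypothetical.

```python
import numpy as np

def fit_segments(x, y, breakpoint):
    """Fit Yr = A*x + K separately for x < BP and x > BP; returns [(A1, K1), (A2, K2)]."""
    result = []
    for mask in (x < breakpoint, x > breakpoint):
        slope, intercept = np.polyfit(x[mask], y[mask], 1)   # least-squares line
        result.append((slope, intercept))
    return result

# Hypothetical, noise-free data with a change of regime between x = 4 and x = 5
x = np.arange(10.0)
y = np.where(x < 5, 1.0 + 2.0 * x, 16.0 - x)
(a1, k1), (a2, k2) = fit_segments(x, y, breakpoint=4.5)
```

Each segment is fitted by ordinary least squares on its own subset of the data, which corresponds to applying the least squares method separately to each segment as described above.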
The possible outcomes are:
Case 1: $\alpha_1 = \beta_1$ and $\alpha_2 = \beta_2$ (same model)
Case 2: $\alpha_1 \neq \beta_1$ and $\alpha_2 = \beta_2$
Case 3: $\alpha_1 = \beta_1$ and $\alpha_2 \neq \beta_2$
Case 4: $\alpha_1 \neq \beta_1$ and $\alpha_2 \neq \beta_2$ (shift of the model)

Environmetric modeling of emission sources for dry and wet
precipitation from an urban area
Th. Spanos , V. Simeonov , G. Andreev
Talanta 58 (2002) 367-375

Monitoring data from chemical analysis of rainwater and aerosol samples collected in an
urban area have been interpreted by the use of environmetric approaches. An attempt was
made to compare the data set structures of both types of precipitation and to estimate the
contribution of different anthropogenic and naturally occurring emission sources to the total
mass of the wet and dry precipitation.

It was found that three latent factors explaining over 80% of the total variance of the set are
responsible for the rainwater set structure: sea spray, soil dust, and anthropogenic.

Only two latent factors explained the dominant part of the variance in the case of
aerosol samples: anthropogenic and natural. It is shown that the anthropogenic influence
for aerosol samples is more complex than that of rainwater samples and represents interaction
between typical anthropogenic sources and natural emitters.

Additionally, a source apportioning using multiple regression on absolute principal
component scores is performed in order to obtain qualitative information about the impact of
the different identified emission sources on the urban environment.

Downloaded from: http://144.206.159.178/ft/1000/72315/1235173.pdf.. 26/8/2012

To address the problem above, use a dummy variable.
Model:
$Y_i = \alpha_1 + \alpha_2 D + \beta_1 X_i + \beta_2 D X_i + u_i$

D = 1 for observations in period I (before the crisis)
D = 0 for observations in period II (during the crisis)

The mean savings (Y) in each period is therefore:
Period I: $E(Y_i) = (\alpha_1 + \alpha_2) + (\beta_1 + \beta_2) X_i$
Period II: $E(Y_i) = \alpha_1 + \beta_1 X_i$
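The meaning of the four coefficients in the dummy-variable model can be checked numerically. Because the model includes both D and the interaction D·X, its least-squares estimates coincide with those of two separate regressions, one per period. The sketch below uses that equivalence instead of fitting the four-parameter model directly. The data and the helper `fit_line` are hypothetical; only the coding D = 1 before the crisis and D = 0 during the crisis comes from the text.

```python
def fit_line(xs, ys):
    """OLS for y = intercept + slope * x; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data: (income X, savings Y, dummy D).
data = [
    (10, 3.0, 1), (20, 5.0, 1), (30, 7.0, 1), (40, 9.0, 1),   # period I
    (10, 1.5, 0), (20, 2.5, 0), (30, 3.5, 0), (40, 4.5, 0),   # period II
]
period1 = [(x, y) for x, y, d in data if d == 1]
period2 = [(x, y) for x, y, d in data if d == 0]

int1, slope1 = fit_line(*zip(*period1))
int2, slope2 = fit_line(*zip(*period2))

# Period II (D = 0) is the reference period in the dummy-variable model.
alpha1 = int2             # intercept of period II
beta1 = slope2            # slope of period II
alpha2 = int1 - int2      # differential intercept of period I
beta2 = slope1 - slope2   # differential slope of period I
print(alpha1, alpha2, beta1, beta2)
```

If both differential coefficients are close to zero the two periods share one model; a non-zero `alpha2` or `beta2` points to a shift in intercept or slope.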
Water Quality Environmetric Study of the Struma River Basin,
Bulgaria, Part I: Water Quality Long-Term Trends (1989-1998)
Mihailov G.; Simeonov V.; Nikolov N.; Mirinchev G.
Toxicological and Environmental Chemistry, Volume 83, Numbers 1-4, 2002, pp. 1-12

The present study deals with an estimation of the long-term trends in the water
quality of Struma river by the use of linear regression trend analysis. Nineteen
sampling sites along the main river stream and at different tributaries were included
in the study since they possess complete data sets for the period of observation. The
sites are part of the monitoring net of the region of interest consisting of 31 sites.

Seventeen chemical indicators of the surface water have been measured in the period
1989-1998 in monthly intervals. The trend study was performed by the use of annual
average values.
It is shown that the water quality is relatively stable throughout the monitoring
period, which is indicated by lack of statistically significant trends for many of the
sites and chemical variables. However, an effort is made to detect some specific
"patterns" of the water quality based on the site location and trend significance.

Downloaded from:
http://www.ingentaconnect.com/content/tandf/gtec/2002/00000083/f0040001/art00001 ..
26/8/2012
Thus:
Case 1: if $\alpha_2 = 0$ and $\beta_2 = 0$, Model I = Model II
Case 2: if $\alpha_2 \neq 0$ and $\beta_2 = 0$, the slopes are equal and the intercepts differ
Case 3: if $\alpha_2 = 0$ and $\beta_2 \neq 0$, the intercepts are equal and the slopes differ
Case 4: if $\alpha_2 \neq 0$ and $\beta_2 \neq 0$, both the intercepts and the slopes differ
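[Figure: savings (Tabungan) plotted against income (Pendapatan), with one regression line for the period before the crisis and one for the crisis period; $\alpha_1$ and $\alpha_2$ are labelled on the plot.]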
ILLUSTRATION OF THE MEANING OF REGRESSION COEFFICIENTS
[Figure: response functions for stock firms and for mutual firms. Labels recoverable from the plot: $\beta_0$, $\beta_0 + \beta_2$, $\beta_1$, $\beta_1 + \beta_3$, $\beta_2$.]

STEPWISE REGRESSION
Downloaded from: http://statistik4life.blogspot.com/2009/12/regresi-stepwise.html ..
22/8/2012
The best regression model is sometimes reached through several rounds of selection.
A list of candidate explanatory variables is available, and the task is to decide which
of them should enter the model. The best explanatory variable is entered first,
then the second best, and so on. This procedure is known as
stepwise regression.

Stepwise regression combines two processes: forward selection and
backward elimination. The technique proceeds in stages. At
each stage we decide which variable is the best predictor
to enter into the model.

Variables are chosen with an F-test: a variable is added to the model
as long as its p-value is below the critical value (usually 0.15). Variables
whose p-value exceeds the critical value are then removed. The process
is repeated until no variable meets the criterion
for being added or removed.

The model in stepwise regression is:

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_n X_n$

The null hypothesis used in stepwise regression is:

$H_0: \beta_1, \beta_2, \beta_3 = 0$

with the alternative hypothesis:

$H_a: \beta_1, \beta_2, \beta_3 \neq 0$
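The selection loop can be illustrated with a simplified forward-selection sketch. It departs from the procedure described above in two ways, both of them assumptions made to keep the example dependency-free: it covers only the forward step (no backward elimination), and it admits a variable when its partial F statistic exceeds a fixed threshold `f_enter` rather than when its p-value is below 0.15. All function names and the synthetic data are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(columns, y):
    """Residual sum of squares of an OLS fit of y on an intercept plus the given columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def forward_select(predictors, y, f_enter=4.0):
    """Add, one at a time, the predictor with the largest partial F while F > f_enter."""
    selected, remaining = [], list(range(len(predictors)))
    current = rss([], y)
    n = len(y)
    while remaining:
        best = None
        for j in remaining:
            new = rss([predictors[s] for s in selected + [j]], y)
            df = n - (len(selected) + 1) - 1  # residual degrees of freedom
            f_stat = (current - new) / (new / df)
            if best is None or f_stat > best[1]:
                best = (j, f_stat, new)
        if best[1] <= f_enter:
            break
        selected.append(best[0])
        remaining.remove(best[0])
        current = best[2]
    return selected


# Synthetic data: y depends on x0 and x2 only; x1 is pure noise.
rng = random.Random(0)
n_obs = 60
x0 = [rng.gauss(0, 1) for _ in range(n_obs)]
x1 = [rng.gauss(0, 1) for _ in range(n_obs)]
x2 = [rng.gauss(0, 1) for _ in range(n_obs)]
y = [1 + 3 * u - 2 * w + rng.gauss(0, 0.5) for u, w in zip(x0, x2)]

selected = forward_select([x0, x1, x2], y)
print("selected predictor indices:", selected)
```

A statistics package would normally report the p-value of each partial F test, which is what the 0.15 rule in the text refers to.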

Illustration:

The following data concern managers' salaries at 10 large companies. With stepwise regression we
can select which variables from the list below significantly affect
the size of those salaries:
where:
Y = manager's salary (natural logarithm, ln; see the chapter on normalizing data by transformation)
X1 = length of service (years)
X2 = length of education (years)
X3 = bonus (1 if present, 0 if not)
X4 = number of employees supervised (persons)
X5 = company assets (natural logarithm, ln; see the chapter on normalizing data by transformation)
X6 = board of directors (1 if present, 0 if not)
X7 = age (years)
X8 = company profit (natural logarithm, ln; see the chapter on normalizing data by transformation)
X9 = international responsibility (1 if present, 0 if not)
X10 = total company sales over the last 12 months (in billions)

Hypotheses:
$H_0: \beta_1, \beta_2, \beta_3 = 0$
$H_a: \beta_1, \beta_2, \beta_3 \neq 0$
POURSAFA, Parinaz et al.
ASSOCIATION OF AIR POLLUTION AND HEMATOLOGIC
PARAMETERS IN CHILDREN AND ADOLESCENTS.
J. Pediatr. (Rio J.) [online]. 2011, vol.87, n.4, pp. 350-356. ISSN 0021-7557.
http://dx.doi.org/10.2223/JPED.2115.
Downloaded from: http://www.scielo.br/scielo.php?pid=S0021-75572011000400013&script=sci_abstract .. 22/8/2012

To assess the relationship between air pollution and hematologic parameters in a
population-based sample of children and adolescents.
This cross-sectional study was conducted in 2009-2010 among school students randomly
selected from different areas of Isfahan city, the second largest and most air-polluted city
in Iran. The association of air pollutant levels with hemoglobin, platelets, red and white
blood cells (RBC and WBC, respectively) was determined by multiple linear and logistic
regression analyses, after adjustment for age, gender, anthropometric measures,
meteorological factors, and dietary and physical activity habits.

The study participants consisted of 134 students (48.5% boys) with a mean age of
13.10 ± 2.21 years. While the mean Pollutant Standards Index (PSI) was at moderate
level, the mean particulate matter < 10 µm (PM10) was more than twice the normal level.
Multiple linear regression analysis showed that PSI and most air pollutants, notably
PM10, had significant negative relationship with hemoglobin and RBC count, and
positive significant relationship with WBC and platelet counts. The odds ratio of
elevated WBC increased as the quartiles of PM10, ozone and PSI increased, however
these associations reached significant level only in the highest quartile of PM10 and PSI.
The corresponding figures for hemoglobin and RBC were in opposite direction.

The association of air pollutants with hematologic parameters and a possible
pro-inflammatory state is highlighted. The presence of these associations with PM10 in a
moderate mean PSI level underscores the necessity to re-examine environmental health
policies for the pediatric age group.

BINARY LOGIT REGRESSION
Downloaded from: http://junaidichaniago.com/2010/02/11/regresi-binary-logit-seri-6-model-ekonometrik-dg-spss/ .. 22/8/2012
Continuing the earlier discussion of qualitative choice models, this
part presents an example of a binary logit model and its estimation with
SPSS. As an illustrative example, suppose we want to
predict the effect of age, gender and income on car
purchases. A survey of 48 respondents yielded the
following data:
where:
Y = 1 if the consumer buys a car, 0 if the consumer does not
X1 = age of the respondent in years
X2 = 1 if the consumer is female, 0 if the consumer is male
X3 = 0 if the consumer has a low income, 1 if the consumer has a medium
income, 2 if the consumer has a high income
Estimation stages in SPSS
The SPSS output gives a $\chi^2$ value of 18.131 with a p-value of 0.001.
Because this p-value is far below 10% (when testing at
$\alpha$ = 10%) and far below 5% (when testing at $\alpha$ = 5%),
we can conclude that the logistic regression model as a whole
can explain or predict consumers' decisions to buy a
car.
The third table of the printout gives the estimated model coefficients and the partial
hypothesis tests on them. When reporting, the logistic regression
model can be written as follows:
[equation not reproduced]
This model gives the probability of buying a car, P(xi), as
influenced by age, gender and income.
The model is non-linear in its parameters.
To make the model linear, it is transformed
with the natural logarithm (this transformation is central
to logistic regression and is known as the logit
transformation), giving (for a fuller treatment, see
econometrics textbooks):
[equation not reproduced]

1 - P(xi) is the probability of not buying a car, the complement of P(xi),
the probability of buying one. Hence ln[P(xi)/(1 - P(xi))] is
simply the log of the ratio between the probability of buying a car
and the probability of not buying one. It follows that the coefficients in
this equation show the effect of age, gender and
income on an individual's odds of buying a car, that is, the probability of buying
relative to the probability of not buying.
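The general form of the two expressions discussed above can be written as follows. This is a sketch of the standard binary logit form only: the coefficient symbols $\beta_0, \dots, \beta_{32}$ are assumed notation, the estimated values are not reproduced here, and X3 is assumed to enter as two indicator variables X3_1 (medium income) and X3_2 (high income) with low income as the reference category, as the later interpretation of X3_1 and X3_2 suggests.

```latex
% Probability of buying a car (non-linear in the parameters)
P(x_i) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i}
         + \beta_{31} X_{3\_1,i} + \beta_{32} X_{3\_2,i})}}

% Logit transformation (linear in the parameters)
\ln\!\left[\frac{P(x_i)}{1 - P(x_i)}\right]
  = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i}
  + \beta_{31} X_{3\_1,i} + \beta_{32} X_{3\_2,i}
```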

Next, to test which factors significantly affect the
decision to buy a car, we can test the significance
of each coefficient separately with the Wald statistic,
which is analogous to the t or Z statistic in ordinary linear regression: it is based
on dividing each coefficient by its standard error (SPSS reports the square of this ratio).

The SPSS output displays the Wald values and their p-values. Based on the
p-values (and testing at $\alpha$ = 10%), all
variables except X3_1 have a significant effect (p-value below 10%)
on the decision to buy a car.
How, then, should the logit regression coefficients in the equation above be interpreted?
In a linear regression model, the coefficient $\beta_i$ shows the change in the
dependent variable resulting from a one-unit change in the independent
variable. The same holds in a logit
model, but the result is harder to interpret.
The coefficient in a logit model shows the change in the logit resulting from
a one-unit change in the independent variable. A proper interpretation of the coefficient
therefore depends on being able to attach meaning to the difference between two logits.
For this reason, logit models use a measure known as
the odds ratio (OR). SPSS displays the odds ratio of each variable
in the Exp(B) column of the table above.
The odds ratio is defined as OR = $e^{\beta}$, where e is the number 2.71828 and $\beta$ is
the coefficient of the variable in question. For example, the odds ratio for variable X2 is
$e^{-1.602}$ = 0.201 (see the SPSS output).
For variable X2 (gender, where 1 = female and 0 = male), the odds
ratio of 0.201 means that the odds of a woman buying a car are
0.201 times those of a man of the same age and income. In other words, women
are less likely than men to buy a car.
For variable X1 (age), the odds ratio of 1.153 means that
a consumer who is one year older has odds of buying a car 1.153
times those of the younger consumer, given the same income and
gender. In other words, older people are more
likely to buy a car.

Because age is a ratio-scale variable, take care when
interpreting differences larger than one year. For a difference of
10 years, for example, the odds ratio becomes 4.14, obtained as
$e^{10 \times 0.142}$. That is, the odds of buying a car for a consumer
who is 10 years older are 4.14 times those of a consumer 10 years
younger.

For the income variable, X3_1 has no significant
effect: the odds of buying a car do not differ significantly between medium-income and
low-income consumers. For X3_2, by contrast,
the odds of buying a car for high-income consumers are 6.45 times
those of low-income consumers of the same age and gender.
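The odds-ratio arithmetic above can be reproduced in a few lines. This is a minimal sketch: the helper `odds_ratio` is hypothetical, the age coefficient 0.142 is the one quoted in the text, and the gender coefficient -1.602 is an assumption inferred from the reported odds ratio of 0.201.

```python
import math


def odds_ratio(beta, delta=1.0):
    """Odds ratio for a change of `delta` units in a predictor with logit coefficient `beta`."""
    return math.exp(beta * delta)


or_gender = odds_ratio(-1.602)      # X2: women compared with men (coefficient inferred)
or_age_1 = odds_ratio(0.142)        # X1: one additional year of age
or_age_10 = odds_ratio(0.142, 10)   # X1: ten additional years of age

print(round(or_gender, 3), round(or_age_1, 3), round(or_age_10, 2))
```

Note that the ten-year odds ratio is the one-year odds ratio raised to the tenth power, not ten times it.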
Modeling environmental data by functional principal component
logistic regression
M. Escabias, A. M. Aguilera, M. J. Valderrama
Environmetrics. Volume 16, Issue 1, pages 95-107, February 2005.
Downloaded from: http://onlinelibrary.wiley.com/doi/10.1002/env.696/abstract .. 26/8/2012

In recent years, many studies have dealt with predicting a response variable based
on the information provided by a functional variable.
When the response variable is binary, different problems arise, such as
multicollinearity and high dimensionality, which prejudice the estimation of the
model and the interpretation of its parameters.

In this article we address these problems by using functional logistic
regression and principal component analysis.

In order to obtain a unique solution for the maximum likelihood estimation of the
parameter function, quasi-natural cubic spline interpolation of sample paths on
their discrete time observations is proposed.

We also introduce a new interpretation of the relationship between the response
variable and the functional predictor where the change in the odds of success is
evaluated from the estimated parameter function.

An analysis of climatological data is finally presented to illustrate the practical
performance of the proposed methodologies

REGRESSION WITH AN INTERVENING VARIABLE

Downloaded from: maksi.unsoed.ac.id/wp-content/.../Regresi-Variabel-Intervening1.ppt ..
22/8/2012
INTERVENING VARIABLE
A mediating or intervening variable is an in-between
variable that mediates the relationship between the
independent variable (predictor) and the dependent variable (predictand).
A variable which is postulated to be a predictor of one or more dependent variables,
and simultaneously predicted by one or more independent variables. Synonym :
mediating variable. (1)

A variable (as memory) whose effect occurs between the treatment in a psychological
experiment (as the presentation of a stimulus) and the outcome (as a response), is
difficult to anticipate or is unanticipated, and may confuse the results (2)

According to Tuckman (in Sugiyono, 2007), an intervening variable is a variable that
theoretically turns the relationship between the independent variable and the
dependent variable into an indirect one, and that cannot be observed or measured.
It stands between the independent variable and the
dependent variable, so that the independent variable does not directly bring about the change in,
or the emergence of, the dependent variable.

A mediating variable is one which specifies how (or the mechanism by which) a
given effect occurs between an independent variable (IV) and a dependent variable
(DV). (Holmbeck, 1997, p. 599).

Under these definitions, an intervening variable (mediator) exerts its influence between the IV
and the DV. It can change the outcome, it is synonymous with mediating variable,
and it is difficult to anticipate. Where does it sit? In the middle.

Downloaded from: http://teorionline.wordpress.com/2010/03/15/variabel-intervening-intervening-variable/ .. 25/8/2012
Variables classified by the relationships among them
[Figure: diagram relating the independent variable (variabel bebas), the intervening variable, the moderator variable, and the dependent variable (variabel tergantung).]
INTERVENING VARIABLE
Consider the following explanation (the example variables are taken from Sugiyono, 2007):

Income (IV) -> lifestyle (M) -> life expectancy (Y)

The arrows show that:
1. Income affects lifestyle.
2. Lifestyle affects life expectancy.
3. Because of the lifestyle variable, the relationship between
income (X) and life expectancy (Y) becomes indirect,
being mediated by lifestyle (M).

DIFFERENCES BETWEEN MEDIATOR AND MODERATOR VARIABLES

By definition, mediating (intervening) and moderator variables both
affect the relationship of the independent variable to the dependent variable. Where, then, do they differ?

The following chart explains the variables and the paradigm of their relationships (Sugiyono,
2007: 40-41).
The two models show two fundamental differences:

A mediator lies on the path of the relationship; a moderator lies outside it.
A mediator is affected by the IV and affects the DV; a moderator
mostly is not.

A characteristic of mediator variables (especially in
social and behavioral research) is that they change easily, for example mood, emotion,
satisfaction, hatred and sadness.
Moderators, by contrast, are harder to change, for example personality, age, length of
service and culture.

Paul Jose (2008) describes the similarities and differences between mediator and
moderator variables:
Similarities:
They both involve three variables;
You can use regression to compute both;
You wish to see how a third variable affects a basic relationship (IV to DV).
Differences:
You create a product term in moderation; not in mediation;
You don't have to centre anything in mediation;
Moderation can be used on concurrent or longitudinal data, but mediation is best
used on longitudinal data.
Graphing is critical for moderation; helpful for mediation.
Interpretation of the model above:
1. First, stressors (causes of stress) lead to perceived
stress. The stressor is placed as the cause (independent variable), and
perceived stress as the mediator (M).
2. Second, the relationship between stressors and perceived stress
is strongly influenced by, among other things, personality type (for example
type A). Why is this so?
ILLUSTRATION
Salary directly affects income.
Salary also affects wealth, and the wealth held in turn affects
income.
[Figure: path diagrams of the direct relationship (Salary -> Income) and of the relationship through mediation (Salary -> Wealth -> Income).]
Hypotheses:
H1: X has a positive effect on M.
H2: M has a positive effect on Y.
H3: X has a positive effect on Y.
H4: M mediates the relationship between X and Y.

Regression Analysis of a Mediating Variable with the
Causal Step Method
TEST CRITERIA
M is declared a perfect mediator (perfect mediation) if,
once M is entered, the effect of X on Y falls to zero
(c' = 0, where c' is the coefficient of X with M in the model), or if the effect of X on Y,
significant before M is entered, becomes non-significant after M is entered into
the regression equation.

M is declared a partial mediator (partial mediation) if,
once M is entered, the effect of X on Y falls but does not
reach zero (c' ≠ 0), or if the effect of X on Y, significant
before M is entered, remains significant after M is entered
into the regression equation but with a smaller regression
coefficient.
Mediation Analysis: Regression with a Mediator Variable
Regression seeks to establish whether an independent variable
(predictor) can explain the variation in a dependent variable (criterion).
In the literature, "explain the variation" is variously replaced by
"predict", "influence", or "contribute to the increase or
decrease of". The point is the same: how much of the variation
in criterion scores the predictor can explain.

In a regression model with a mediator there is, besides the independent variable (X),
a mediator variable (M). The independent variable predicts M, and M predicts
Y. Part of the variation in M can be explained by X, and the variation in Y can be
explained by M.
This is called a regression model with a mediator. Some authors
call it complete mediation, meaning that
the predictor (X) does not directly explain variation in the criterion (Y); only the mediator (M)
does.

The main requirement for a variable to be a mediator is that the region of
variation of the mediator overlaps both the predictor and the criterion.

Downloaded from: http://widhiarso.staff.ugm.ac.id/wp/berkenalan-dengan-analisis-mediasi-regresi-dengan-melibatkan-variabel-mediator-bagian-pertama/
STEPS OF THE CAUSAL STEP METHOD
1. Fit the regression of the dependent variable (Y) on the independent variable (X).
2. Fit the regression of the mediating variable (M) on the independent variable (X).
3. Fit the regression of the dependent variable (Y) on the independent variable (X) with
the mediating variable (M) included in the equation.
4. Draw a conclusion using the criteria described above.
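The three regressions of the causal step method can be sketched on a small data set. This is an illustrative sketch only: the data are hypothetical, the helper names are assumptions, and only point estimates are computed (the significance tests that the criteria rely on, as reported in SPSS output, are omitted). The labels follow the usual path notation: c for X -> Y, a for X -> M, b for M -> Y controlling for X, and c' for X -> Y controlling for M.

```python
def mean(v):
    return sum(v) / len(v)


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v))


def simple_slope(x, y):
    """Slope of the OLS regression of y on x."""
    return cross(x, y) / cross(x, x)


def two_predictor_slopes(x, m, y):
    """Slopes (c_prime, b) of the OLS regression of y on x and m."""
    sxx, smm, sxm = cross(x, x), cross(m, m), cross(x, m)
    sxy, smy = cross(x, y), cross(m, y)
    det = sxx * smm - sxm ** 2
    c_prime = (sxy * smm - smy * sxm) / det
    b = (smy * sxx - sxy * sxm) / det
    return c_prime, b


# Hypothetical data: X = independent variable, M = mediator, Y = dependent variable.
x = [1, 2, 3, 4, 5, 6, 7, 8]
m = [1.2, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 8.1]
y = [2.1, 2.4, 3.9, 4.1, 5.6, 6.3, 7.2, 8.0]

c = simple_slope(x, y)                       # step 1: Y on X
a = simple_slope(x, m)                       # step 2: M on X
c_prime, b = two_predictor_slopes(x, m, y)   # step 3: Y on X and M

print("c =", c, "a =", a, "b =", b, "c' =", c_prime)
print("indirect effect a*b =", a * b)
```

For least-squares fits the total effect decomposes exactly as c = c' + a·b, so the drop from c to c' in step 3 equals the indirect effect.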
Regression Analysis with an Intervening Variable
An intervening variable is an in-between or mediating variable. Its function is to mediate
the relationship between the independent variable and the dependent variable. The example
here is the same one used for regression with a moderating
variable: the relationship between Earns and Income is mediated by
Wealth. (Downloaded from: http://blogtutorialspss.blogspot.com/2012/06/analisis-regresi-dengan-variabel_27.html)
Wealth is thus the intervening variable. [Figure: path diagram Earns -> Wealth -> Income, with a direct path Earns -> Income.]
Earns can affect Income directly, but it can also act indirectly,
first through Wealth and then on Income. The logic is that higher Earns raise
Wealth, and higher Wealth in turn affects Income.
The effect of an intervening variable is tested with path analysis. Path analysis
is an extension of multiple linear regression; that is, it uses
regression to estimate causal relationships among variables (a causal model) that have been
specified in advance on the basis of theory.
Path analysis by itself cannot establish cause and effect, nor can it serve as
a substitute for the researcher's reasoning about causality among variables. The causal relationships
are built into the model on theoretical grounds. What path analysis can do
is determine the pattern of relationships among three or more variables; it cannot be used to
confirm or reject an imaginary causal hypothesis.
Regression of the dependent variable (Y) on the independent variable (X).
Coefficients (dependent variable: Loyalty)
Model 1              B       Std. Error   Beta    t       Sig.
(Constant)           2.714   .887                 3.059   .005
Service Quality      .528    .141         .577    3.735   .001
B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.
Regression of the mediating variable (M) on the independent variable (X).
Coefficients (dependent variable: Satisfaction)
Model 1              B       Std. Error   Beta    t       Sig.
(Constant)           1.098   .647                 1.697   .101
Service Quality      .973    .103         .872    9.446   .000
B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.
Regression of the dependent variable (Y) on the independent variable (X) with
the mediating variable (M) included in the equation
Coefficients (dependent variable: Loyalty)
Model 1              B       Std. Error   Beta    t       Sig.
(Constant)           1.949   .817                 2.384   .024
Satisfaction         .697    .227         .850    3.066   .005
Service Quality      -.151   .254         -.165   -.594   .557
B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.
RESULTS OF THE ANALYSIS
1. The independent variable (service quality) affects the
mediating variable (customer satisfaction).
2. The mediating variable (customer satisfaction) affects loyalty.
3. The independent variable (service quality) no longer has a
significant effect on the dependent variable (loyalty) once
the mediating variable (customer satisfaction) is entered.
4. We can therefore conclude that customer satisfaction fully
mediates the relationship between service quality and loyalty.
Regression Analysis of a Mediating Variable with the
Product of Coefficient Method
This method tests the mediating variable by testing the strength of the
indirect effect of the independent variable (X) on the dependent variable (Y)
through the mediating variable (M).
The test is on the significance of the indirect effect ab, the product of the direct effect
of the independent variable on the mediator (a) and the direct effect of the
mediator on the dependent variable (b).
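One common way to test the product ab is the Sobel first-order standard error. The text does not state which formula it uses, so treating this as the Sobel test is an assumption. The sketch below plugs in the coefficients and standard errors from the SPSS tables in this section (a = .973, SE .103 for service quality -> satisfaction; b = .697, SE .227 for satisfaction -> loyalty with service quality in the model).

```python
import math

a, se_a = 0.973, 0.103   # X -> M: service quality -> satisfaction
b, se_b = 0.697, 0.227   # M -> Y, controlling for X: satisfaction -> loyalty

indirect = a * b
# Sobel first-order standard error of the product ab (assumed formula).
se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
z = indirect / se_ab
# Two-sided p-value from the standard normal distribution.
p_value = math.erfc(abs(z) / math.sqrt(2))

print("ab =", round(indirect, 3), "z =", round(z, 2), "p =", round(p_value, 4))
```

A z value whose p-value is below the chosen significance level indicates a significant indirect effect. The normal approximation can be poor in small samples, where bootstrap intervals are often preferred.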
INTERVENING VARIABLE
An intervening variable is a variable that theoretically turns the relationship between
the independent variable and the dependent variable into an indirect one, and that
cannot be observed or measured. It stands between the
independent variable and the dependent variable, so that the independent variable does not directly
bring about the change in, or the emergence of, the dependent variable.
(Downloaded from: http://ariefroean.blogspot.com/2012/02/metode-analisis-kuantitatif_13.html 25/8/2012)
Regression Analysis of a Mediating Variable with the Causal Step Method
Downloaded from: http://teorionline.wordpress.com/2011/07/20/teori-dan-uji-model-mediasi/ 25/8/2012
maksi.unsoed.ac.id/wp-content/.../Regresi-Variabel-Intervening1.ppt .. 22/8/2012
Causal Step
This method sets out a series of requirements that a mediation model must meet,
as described by Baron and Kenny (1986):

1. Step 1: Show that the initial variable is correlated with the outcome. Use Y as the criterion
variable in a regression equation and X as a predictor (estimate and test path c). This step
establishes that there is an effect that may be mediated. (Regress Y on X. This model is
denoted path c.)
2. Step 2: Show that the initial variable is correlated with the mediator. Use M as the criterion
variable in the regression equation and X as a predictor (estimate and test path a). This step
essentially involves treating the mediator as if it were an outcome variable. (Regress M on X.
This model is denoted path a.)
3. Step 3: Show that the mediator affects the outcome variable. Use Y as the criterion variable in a
regression equation and X and M as predictors (estimate and test path b). It is not sufficient just
to correlate the mediator with the outcome; the mediator and the outcome may be correlated
because they are both caused by the initial variable X. Thus, the initial variable must be
controlled in establishing the effect of the mediator on the outcome. (Regress Y on X and M,
which gives the effect of M on Y (path b) and of X on Y (path c').)
4. Step 4: To establish that M completely mediates the X-Y relationship, the effect of X on Y
controlling for M (path c') should be zero. If all four of these steps are met, then the data are
consistent with the hypothesis that variable M completely mediates the X-Y relationship, and if
the first three steps are met but Step 4 is not, then partial mediation is indicated.
Independent variable (IV) = X
Mediator variable = M
Outcome variable = Y

Effect of service quality on customer satisfaction
Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .872   .761       .753                .84130
Predictors: (Constant), Service Quality
Coefficients (dependent variable: Satisfaction)
Model 1              B       Std. Error   Beta    t       Sig.
(Constant)           1.098   .647                 1.697   .101
Service Quality      .973    .103         .872    9.446   .000
B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.
Downloaded from: maksi.unsoed.ac.id/wp-content/.../Regresi-Variabel-Intervening1.ppt .. 22/8/2012
Service Quality analysis (acronym SERVQUAL) is
a descriptive method for describing the level of customer satisfaction. The method
was developed in 1985 by A. Parasuraman, Valarie A. Zeithaml, and Leonard L.
Berry in an article in the Journal of Marketing. They later
revised it in the article "SERVQUAL: A Multiple-Item Scale for Measuring
Consumer Perceptions of Service Quality".
In the analytical model, Expected Service depends on WOM (Word of
Mouth), Personal Needs and Past Experience.

Downloaded from: http://setabasri01.blogspot.com/2011/04/service-quality-akronimnya-servqual.html 25/8/2012
Effect of service quality and satisfaction on customer
loyalty
Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .711   .505       .468                1.01211
Predictors: (Constant), Satisfaction, Service Quality
Coefficients
Model 1              B       Std. Error   Beta    t       Sig.
(Constant)           1.949   .817                 2.384   .024
Service Quality      -.151   .254         -.165   -.594   .557
Satisfaction         .697    .227         .850    3.066   .005
B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.
Downloaded from: maksi.unsoed.ac.id/wp-content/.../Regresi-Variabel-Intervening1.ppt .. 22/8/2012
The Relationship Between Service Quality and Customer Loyalty
Growing competition requires companies, including communication service providers, to keep attending to
customers' needs and wants and to try to meet their expectations more
satisfactorily than competitors do. A company's attention is not limited to the products
it makes but extends to processes, human resources, the environment, and so on (Mazur, 1992,
in Yunani, 2003: 21).
The Caruana model places the dimensions of service quality
as the independent variable (X) and customer loyalty
as the dependent variable (Y), while customer satisfaction
is implicit in the survey.

Downloaded from:
http://www.pemimpinunggul.com/thesis/hal-31-dan-32.html 25/8/2012
Mediation Testing with the Product of Coefficient Method
HYPOTHETICAL CONSTRUCT MODEL
Downloaded from: http://dipanugraha.blog.com/2011/12/27/contoh-proposal-riset/ .. 25/8/2012
SERVICE QUALITY HAS A POSITIVE EFFECT ON
CUSTOMER LOYALTY

There is a positive relationship between service quality and repurchase
intention, recommendation to others, and loyalty in the face of alternatives that
may be better. All of these (repurchase intention, recommendation to others,
and loyalty in the face of possibly better alternatives) are behavioral
intentions and are forms of customer loyalty.
Hypotheses:
H1. Service quality has a positive effect on customer loyalty.
H2. Service quality has a positive effect on trust.
H3. Service quality has a positive effect on corporate image.
H4. Trust has a positive effect on customer loyalty.
H5. Corporate image has a positive effect on customer loyalty.

NON-LINEAR MIXED REGRESSION MODELS
Richard T. Burnett, W. H. Ross, Daniel Krewski
Environmetrics. Volume 6, Issue 1, pages 85-99, January/February 1995.
Downloaded from: http://onlinelibrary.wiley.com/doi/10.1002/env.3170060108/abstract .. 23/8/2012
In this paper we present an estimating equation approach to statistical inference for
non-linear random effects regression models for correlated data. With this approach, the
distribution of the observations and the random effects need not be specified; only their
expectation and covariance structure are required. The variance of the data given the
random effects may depend on the conditional expectation. An approximation to the
conditional expectation about the fitted value of the random effects is used to obtain
closed form expressions for the unconditional mean and covariance of the data.
The proposed methods are illustrated using data from a mouse skin painting experiment.
Comparison of three expert elicitation methods for logistic regression on
predicting the presence of the threatened brush-tailed rock-wallaby Petrogale
penicillata
Rebecca A. O'Leary, Samantha Low Choy, Justine V. Murray, Mary Kynn, Robert Denham,
Tara G. Martin, Kerrie Mengersen.
Environmetrics. Volume 20, Issue 4, pages 379-398, June 2009

Numerous expert elicitation methods have been suggested for generalised linear models (GLMs).
This paper compares three relatively new approaches to eliciting expert knowledge in a form
suitable for Bayesian logistic regression. These methods were trialled on two experts in order to
model the habitat suitability of the threatened Australian brush-tailed rock-wallaby (Petrogale
penicillata). The first elicitation approach is a geographically assisted indirect predictive method
with a geographic information system (GIS) interface. The second approach is a predictive
indirect method which uses an interactive graphical tool. The third method uses a questionnaire
to elicit expert knowledge directly about the impact of a habitat variable on the response. Two
variables (slope and aspect) are used to examine prior and posterior distributions of the three
methods.
The results indicate that there are some similarities and dissimilarities between the expert
informed priors of the two experts formulated from the different approaches. The choice of
elicitation method depends on the statistical knowledge of the expert, their mapping skills, time
constraints, accessibility to experts and funding available.
This trial reveals that expert knowledge can be important when modelling rare event data, such as
threatened species, because experts can provide additional information that may not be
represented in the dataset. However care must be taken with the way in which this information is
elicited and formulated.

Downloaded from: http://onlinelibrary.wiley.com/doi/10.1002/env.935/abstract, 28/8/2012
Regression models for air pollution and daily mortality: analysis of
data from Birmingham, Alabama
Richard L. Smith, Jerry M. Davis, Jerome Sacks, Paul Speckman, Patricia Styer.
Environmetrics. Special Issue: Statistical Analysis of Particulate Matter Air
Pollution. Volume 11, Issue 6, pages 719-743, November/December 2000.
Downloaded from: http://onlinelibrary.wiley.com/doi/10.1002/1099-095X%28200011/12%2911:6%3C719::AID-ENV438%3E3.0.CO;2-U/abstract, 23/8/2012
The purpose of the present paper is to propose a systematic approach to the
regression analyses that are central to this kind of research. We argue that the
results may depend on a number of ad hoc features of the analysis, including
which meteorological variables to adjust for, and the manner in which
different lagged values of particulate matter are combined into a single
exposure measure. We also examine the question of whether the effects are
linear or nonlinear, with particular attention to the possibility of a threshold
effect, i.e. that significant effects occur only above some threshold.
These points are illustrated with a data set from Birmingham, Alabama, first
cited by Schwartz (1993, American Journal of Epidemiology 137: 1136-1147)
and since extensively re-analyzed. For this data set, we find that the
results are sensitive to whether humidity is included along with temperature
as a meteorological variable, and to the definition of the exposure measure.
We also find evidence of a threshold effect, with the greatest increase in
mortality occurring above 50 µg/m³, which is the long-term average level
permitted by the current NAAQS. Thus, on the basis of this data set, the need
for a tighter NAAQS is not established.
Although this particular analysis is focussed just on one data set, the issues it
raises are typical in this area of research. We do not dispute that there is a
reasonable level of evidence linking atmospheric particulate matter with
adverse health outcomes even within the levels permitted by current
regulations. However, the impression has been created by some of the
published literature that such associations are overwhelmingly supported by
epidemiological research. Our viewpoint is that the statistical analyses allow
different interpretations, and that the case for tighter regulations cannot be
based solely on studies of this nature.
Nonparametric Regression Model
Professor Jean D. Opsomer
Published Online: 15 SEP 2006
DOI: 10.1002/9780470057339.van019. Copyright 2002 John Wiley & Sons, Ltd
Downloaded from: http://onlinelibrary.wiley.com/doi/10.1002/9780470057339.van019/abstract, 23/8/2012
Nonparametric regression is a rapidly growing and exciting branch of statistics, both because
of recent theoretical developments and because of more widespread use of fast and
inexpensive computers. Many methods are currently available, including kernel-based
methods, regression splines, smoothing splines and wavelet and Fourier series expansions.
This article introduces the main types of smoothing methods and discusses the usefulness of
nonparametric regression for analyzing datasets, particularly of the types encountered in
environmental statistics.
INTRODUCTION TO NONPARAMETRIC REGRESSION
John Fox (Department of Sociology, McMaster University, Canada)
ESRC Oxford Spring School . May 2005

Nonparametric regression analysis is regression without an assumption of linearity.
The scope of nonparametric regression is very broad, ranging from "smoothing" the
relationship between two variables in a scatterplot to multiple-regression analysis
and generalized regression models (for example, logistic nonparametric regression
for a binary response variable). Unthinkable only a few years ago, methods of
nonparametric-regression analysis have been rendered practical by advances in
statistics and computing, and are now a serious alternative to more traditional
parametric-regression modelling.

Regression analysis traces the average value of a response variable (y)
as a function of one or several predictors (xs).
Suppose that there are two predictors, $x_1$ and $x_2$.
The object of regression analysis is to estimate the population regression
function $\mu \mid x_1, x_2 = f(x_1, x_2)$.
Alternatively, we may focus on some other aspect of the conditional
distribution of y given the xs, such as the median value of y or its
variance.

Downloaded from: http://socserv.mcmaster.ca/jfox/Courses/Oxford-2005/slides-handout.pdf, 28/8/2012
A hierarchical zero-inflated Poisson regression model for stream fish
distribution and abundance
E.L. Boone, B. Stewart-Koster, M.J. Kennard.
Environmetrics. Volume 23, Issue 3, pages 207-218, May 2012.
Downloaded from: http://onlinelibrary.wiley.com/doi/10.1002/env.1145/abstract, 23/8/2012
Ecologists are frequently confronted with the challenge of accurately modelling species
abundance. However, this task requires one to deal with both presence/absence as well as
abundance. Traditional Poisson regression models are not adequate when attempting to
deal with both issues simultaneously. Zero-inflated regression models have been
proposed to deal with this problem with much success. We extend these models to
incorporate both a multilevel hierarchical structure and spatial correlation. The model is
illustrated using a dataset concerning the Hypseleotris galii (Fire-tailed Gudgeon), a
native species to eastern Australia.
Poisson Regression Analysis
When the response variable had a Normal distribution we found that its mean
could be linked to a set of explanatory variables using a linear function like
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k.$$

In the case of binary regression the fact that probability lies between 0-1 imposes
a constraint. The normality assumption of multiple linear regression is lost, and
so also is the assumption of constant variance. Without these assumptions the F
and t tests have no basis. The solution was to use the logistic transformation of
the probability p or logit p, such that
$$\log_e\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n.$$

When the response variable is in the form of a count we face a yet different
constraint. Counts are non-negative integers and for rare events the Poisson
distribution (rather than the Normal) is more appropriate since the Poisson mean
> 0. So the logarithm of the response variable is linked to a linear function of
explanatory variables such that
$$\log_e(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots$$
and so
$$Y = e^{\beta_0} \, e^{\beta_1 X_1} \, e^{\beta_2 X_2} \cdots$$

In other words, the typical Poisson regression model expresses the log outcome
rate as a linear function of a set of predictors.
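
To make the log-linear link concrete, the following is a minimal sketch of fitting a Poisson regression by iteratively reweighted least squares (IRLS) with NumPy. The data are simulated, and the function name `fit_poisson_irls`, the coefficient values, and the sample size are illustrative assumptions, not taken from the source chapter.

```python
import numpy as np

def fit_poisson_irls(X, y, n_iter=25, tol=1e-10):
    """Fit log(E[y]) = X @ beta by iteratively reweighted least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Starting values: ordinary least squares on log(y + 0.5)
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]
    for _ in range(n_iter):
        eta = X @ beta                 # linear predictor
        mu = np.exp(eta)               # fitted mean under the log link
        z = eta + (y - mu) / mu        # working response
        w = mu                         # working weights (Poisson variance = mean)
        new_beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta

# Simulated example (hypothetical coefficients)
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.uniform(0, 1, n), rng.uniform(0, 1, n)])
true_beta = np.array([0.5, 1.0, -0.7])
y = rng.poisson(np.exp(X @ true_beta))

beta_hat = fit_poisson_irls(X, y)
rate_ratios = np.exp(beta_hat)  # multiplicative effect on Y per unit change in X
```

Because Y is a product of exponentials, each `exp(beta)` can be read as the factor by which the expected count changes when the corresponding predictor increases by one unit.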

Downloaded from: http://www.oxfordjournals.org/our_journals/tropej/online/ma_chap13.pdf, 28/8/2012
Nonlinear regression models for correlated count data
R. T. Burnett, J. Shedden, D. Krewski.
Environmetrics. Volume 3, Issue 2, pages 211-222, 1992
Downloaded from: http://onlinelibrary.wiley.com/doi/10.1002/env.3170030206/abstract, 23/8/2012
In this article, nonlinear regression models for correlated count data are examined.
Correlation within clusters is modelled by a multivariate Gaussian mixing process on the
log-expectation scale. The regression parameters and the variance-covariance parameters
of the mixing process are estimated using quasi-likelihood methods. An example
involving temporal trends in hospital admissions for respiratory disease is used to
illustrate the methods proposed.
Nonlinear Regression & Logistic Regression
Logistic regression is a form of nonlinear regression in which the dependent
variable is discrete and follows a binomial distribution, while the independent
variables may be continuous, discrete, dichotomous, or a mixture of these.
Logistic regression comes in two kinds: Binary Logistic Regression and
Multinomial Logistic Regression.
Binary logistic regression is used when the response variable (Y) has only two
possible outcomes, for example buy and not buy. Multinomial logistic regression
is used when the response variable (Y) has more than two categories.

Downloaded from: http://ian-manoppo.blogspot.com/2012/05/regresi-non-linear-regresi-logistik.html, 25/8/2012
Basic concepts:
Regression analysis can be applied to a variety of mathematical model forms,
for example logarithmic, polynomial, power, and exponential functions.
The form used most often is the logarithmic one, either the common logarithm
(log X) or the natural logarithm (ln X = 2.303 log X).
The coefficients obtained from a regression in logarithms (a power function)
directly give the elasticities.
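
As a small illustration of the elasticity reading, the sketch below fits a power function by regressing ln(y) on ln(x). The data are simulated from an assumed power law with exponent 0.8; all numbers and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(1.0, 100.0, 200)
# Assumed power law y = 3 * x^0.8 with multiplicative noise
y = 3.0 * x**0.8 * np.exp(rng.normal(0.0, 0.05, x.size))

# ln y = ln a + b ln x; the slope b is the elasticity of y with respect to x
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
elasticity = slope
a_hat = np.exp(intercept)

# Base-10 logs give the same slope, because ln x = 2.303 log10 x
# rescales both axes by the same constant
slope10, _ = np.polyfit(np.log10(x), np.log10(y), 1)
```

The slope is unchanged by the choice of logarithm base; only the intercept is rescaled.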

Application of negative binomial regression models to the analysis of
quantal bioassays data
A. Maul, A. H. El-Shaarawi, J. F. Ferard.
Environmetrics. Volume 2, Issue 3, pages 253-261, 1991
Downloaded from: http://onlinelibrary.wiley.com/doi/10.1002/env.3770020302/abstract, 23/8/2012
The problem of developing an approach for modelling the response of an organism to
chronic toxicity is discussed in this paper and illustrated by studying the toxic effect
of NaBr on the reproduction process of a population of Daphnia magna.

A general model is given which includes both the negative binomial and Poisson
distributions as special cases depending on the values of a single parameter. The steps
involved in estimating the parameters of the model and testing the goodness-of-fit are
presented. In particular the iterative solution of the estimating equations is described
in detail along with the problem of setting confidence limits for model parameters.
This approach is useful in the analysis of quantal bioassay data.
Negative Binomial Regression for Event Count Dependent
Variables

Use the negative binomial regression if you have a count of events for each
observation of your dependent variable. The negative binomial model is
frequently used to estimate overdispersed event count models.
Negative binomial regression is for modeling count variables, usually for
overdispersed count outcome variables.
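
Overdispersion here means that the variance of the counts exceeds their mean, which a Poisson model cannot accommodate. The sketch below simulates negative binomial counts with NumPy and computes two simple diagnostics. The mean `mu`, size parameter `r`, and the moment estimator are illustrative assumptions and not the estimation procedure used by Zelig.

```python
import numpy as np

rng = np.random.default_rng(6)
mu, r = 4.0, 2.0  # assumed mean and size; variance is mu + mu**2 / r

# NumPy parameterises the negative binomial by (n, p); p = r / (r + mu) gives mean mu
counts = rng.negative_binomial(r, r / (r + mu), size=5000)

mean = counts.mean()
var = counts.var(ddof=1)

dispersion_ratio = var / mean          # close to 1 for Poisson data
alpha_hat = (var - mean) / mean**2     # moment estimate of 1/r (0 for Poisson)
```

A dispersion ratio well above 1 suggests that a negative binomial model may be more suitable than a Poisson model for the same counts.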

Downloaded from: http://cran.r-project.org/web/packages/Zelig/vignettes/negbin.pdf, 27/7/2012
Regression rank scores in nonlinear models
Jana Jurečková
Source: N. Balakrishnan, Edsel A. Peña and Mervyn J. Silvapulle, eds., Beyond Parametrics
in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K. Sen (Beachwood,
Ohio, USA: Institute of Mathematical Statistics, 2008), 173-183.

Downloaded from:
http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.imsc/1207058272, 23/8/2012
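Consider the nonlinear regression model
$$Y_i = g(x_i, \theta) + e_i, \quad i = 1, \dots, n \qquad (1)$$
with $x_i \in \mathbb{R}^k$, $\theta = (\theta_0, \theta_1, \dots, \theta_p)^{\top} \in \Theta$
(compact in $\mathbb{R}^{p+1}$), where
$g(x, \theta) = \theta_0 + \tilde{g}(x, \theta_1, \dots, \theta_p)$
is continuous, twice differentiable in $\theta$ and
monotone in components of $\theta$. Following Gutenbrunner and
Jurečková (1992) and Jurečková and Procházka (1994), we introduce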
regression rank scores for model (1), and prove their asymptotic
properties under some regularity conditions. As an application, we
propose some tests in nonlinear regression models with nuisance
parameters.
Convenient properties of the nonlinear regression rank scores lead
to an idea of their possible application in testing the significance of a
linear regression in the presence of a nonlinear regression with
nuisance parameters, or in testing other hypotheses with nuisance
parameters of nonlinear regression.

For instance, we can compare two sets of observations affected by a
nonlinear regression with unknown parameters.
Nonparametric harmonic regression for estuarine water quality
data
Melanie A. Autin, Don Edwards.
Environmetrics. Volume 21, Issue 6, pages 588-605, September 2010
Periodicity is omnipresent in environmental time series data. For modeling estuarine water
quality variables, harmonic regression analysis has long been the standard for dealing with
periodicity.
Generalized additive models (GAMs) allow more flexibility in the response function.
They permit parametric, semiparametric, and nonparametric regression functions of the
predictor variables. We compare harmonic regression, GAMs with cubic regression splines,
and GAMs with cyclic regression splines in simulations and using water quality data
collected from the National Estuarine Research Reserve System (NERRS).
While the classical harmonic regression model works well for clean, near-sinusoidal data, the
GAMs are competitive and are very promising for more complex data. The generalized
additive models are also more adaptive and require less intervention.

Harmonic regression (aka trigonometric regression, cosinor regression) is a linear
regression model in which the predictor variables are trigonometric functions of a
single variable, usually a time-related variable. Harmonic regression is used in
modelling biological phenomena, which tend to exhibit periodic rhythms.

A simple harmonic regression model is

Y = mu + alpha*cos(2*pi*t/P) + beta*sin(2*pi*t/P) + epsilon,

where pi is 3.1415... and P is the period of the sine and cosine terms.

More general models are

Y = mu + Sum( alpha(h)*cos(2*pi*h*t/P) + beta(h)*sin(2*pi*h*t/P) ) + epsilon,

where the sum runs over h = 1, 2, 3, ..., H.

Harmonic models are much like polynomial models except that instead of using
powers of t, you use the trigonometric functions of t. The terms in the harmonic
model are orthogonal to one another.
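
Since the harmonic model is linear in mu, alpha(h) and beta(h), it can be fitted by ordinary least squares. The following is a minimal sketch with NumPy on simulated monthly data. The helper name `harmonic_design`, the period of 12, and the coefficient values are illustrative assumptions.

```python
import numpy as np

def harmonic_design(t, period, n_harmonics=1):
    """Design matrix with columns 1, cos(2*pi*h*t/P), sin(2*pi*h*t/P) for h = 1..H."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for h in range(1, n_harmonics + 1):
        angle = 2.0 * np.pi * h * t / period
        cols.append(np.cos(angle))
        cols.append(np.sin(angle))
    return np.column_stack(cols)

rng = np.random.default_rng(2)
t = np.arange(48)   # 48 equally spaced observations, i.e. four full periods
P = 12.0
y = (10.0 + 3.0 * np.cos(2 * np.pi * t / P) + 1.5 * np.sin(2 * np.pi * t / P)
     + rng.normal(0.0, 0.3, t.size))

X = harmonic_design(t, P, n_harmonics=2)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
mu_hat, alpha1, beta1 = coef[:3]

amplitude = np.hypot(alpha1, beta1)   # size of the first-harmonic cycle
gram = X.T @ X                        # diagonal when the columns are orthogonal
```

In this sketch the columns are exactly orthogonal because t is equally spaced and covers whole periods; with irregular sampling the orthogonality holds only approximately.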

Downloaded from: http://math.yorku.ca/Who/Faculty/Monette/S-news/0510.html, 27/8/2012
Analyzing wildfire threat counts using a negative binomial
regression model
J. A. Quintanilha, L. L. Ho.
Environmetrics. Special Issue: Special Issue on TIES Conference 2004. Volume 17, Issue 6,
pages 529-538, September 2006
The fire-monitoring program managed by the Instituto Brasileiro do Meio Ambiente e
dos Recursos Naturais Renováveis collected fire pixel counts from 1998 to 2002 and
used them as a measure of wildfire threats for the Amazon region. The objective of the
study was to identify the most relevant explanatory variables related to the frequency of
fire pixel occurrence.
The sample unit was the municipality, the dependent variable was a function of fire pixel
counts, and the explanatory variables were related to land management, census, and
agricultural data. A generalized longitudinal linear model was used. The most relevant
explanatory variables were administrative limits, year, type of region, season,
percentages of deforested area and male population, extent of unpaved road, and density
of cattle. Approximately 95% of the standardized residuals resulting from fitting the
model were in the interval [-2, +2].
BINOMIAL REGRESSION
In statistics, binomial regression is a technique in which the response (often
referred to as Y) is the result of a series of Bernoulli trials, or a series of one of two
possible disjoint outcomes (traditionally denoted "success" or 1, and "failure" or 0).
In binomial regression, the probability of a success is related to explanatory
variables: the corresponding concept in ordinary regression is to relate the mean
value of the unobserved response to explanatory variables.

Binomial regression models are essentially the same as binary choice models, one
type of discrete choice model.

The primary difference is in the theoretical motivation: Discrete choice models are
motivated using utility theory so as to handle various types of correlated and
uncorrelated choices, while binomial regression models are generally described in
terms of the generalized linear model, an attempt to generalize various types of
linear regression models.

Downloaded from: http://en.wikipedia.org/wiki/Binomial_regression, 27/8/2012
Nonparametric methods for spatial regression. An application to seismic
events

Mario Francisco-Fernández, Alejandro Quintela-del-Río, Rubén Fernández-Casal.
Environmetrics. Special Issue: Spatio-Temporal Stochastic Modelling. (METMAV). Volume 23, Issue 1,
pages 85-93, February 2012
Nonparametric regression estimation is a powerful tool to handle multidimensional data.
When a dependent data set is analyzed, classical techniques need to be modified to
provide useful results. In this work, different approximations to take the spatial
dependence into account are exposed. A bandwidth selection technique that adjusts the
generalized cross-validation criterion for the effect of spatial correlation, in the case of
bivariate local polynomial regression, is considered. Moreover, a bootstrap algorithm is
designed to assess the variability of the estimated spatial maps, and also to estimate the
probability of obtaining a response variable larger than or equal to a given threshold, for
a specific point. A simulation study checks the validity of the presented approaches in
practice. The broad applicability of the procedures is demonstrated on a data set of
earthquakes in the Iberian Peninsula.
SPATIAL REGRESSION MODELS

A spatial lag (SL) model assumes that dependencies exist directly among the
levels of the dependent variable. That is, the income at one location is
affected by the income at the nearby locations.

A "lag" term, which is a specification of income at nearby locations, is included
in the regression, and its coefficient and p-value are interpreted as for the
independent variables.
As in OLS regression, we can include independent variables in the model.
Whereas we will see spatial autocorrelation in OLS residuals, the SL
model should account for spatial dependencies, so the SL residuals should
not be autocorrelated.

Hence the SL residuals should not be distinguishable from random noise (i.e.,
they should have no consistent patterns or dependencies in them).

Downloaded from: http://www.bisolutions.us/A-Brief-Introduction-to-Spatial-Regression.php, 27/8/2012
Time-series regression models to study the short-term effects
of environmental factors on health
Aurelio Tobías and Marc Saez.

Downloaded from: http://www3.udg.edu/fcee/economia/n11.pdf, 23/8/2012
Time series regression models are especially suitable in epidemiology for evaluating
short-term effects of time-varying exposures on health. The problem is that potential for
confounding in time series regression is very high. Thus, it is important that trend and
seasonality are properly accounted for. Our paper reviews the statistical models
commonly used in time-series regression, especially those that allow for serial
correlation, which makes them potentially useful for selected epidemiological purposes.

In particular, we discuss the use of time-series regression for counts using a wide range
of Generalised Linear Models as well as Generalised Additive Models. In addition, critical
points in using statistical software for GAMs were recently stressed, and reanalyses of time
series data on air pollution and health were performed in order to update already
published results.
An example on the relationship between asthma emergency admissions and photochemical
air pollutants in Madrid for the period 1995-1998 illustrates how these methods are employed.

In the analysis of epidemiological time series data consisting of counts, the
underlying mechanism being modelled is assumed to be a Poisson process with a
homogeneous risk $\lambda$, i.e. the expected number of counts on day t, for the
underlying population. The probability of $y_t$ occurrences on a given day t is defined by

$$P(y_t) = \frac{e^{-\lambda} \lambda^{y_t}}{y_t!}$$

The Poisson regression model assumes

$$E(y_t) = \lambda_t = \exp(x_t' b)$$

where $x_t$ is the column vector of independent variables on day t with regression
coefficients b and $y_t$ is the dependent variable on day t.
Analyzing of regression model of environmental health quality of
residential in slum areas
Ghasem Abedi (1) *, Farideh Rostami (2), Behzad Nikpor.
International Journal of Collaborative Research on Internal Medicine & Public Health.
Vol. 4 No. 2 (2012)
Downloaded from: http://www.iomcworld.com/ijcrimph/files/v04-n02-06.pdf, 23/8/2012
Background & Objectives:
Studies of urban development point to the disorganized and critical state of
residential health as one of the important policy issues in city development. Slum or
marginalized areas are more vulnerable than other residential areas because of
their high population, lack of organic connection with the city, and confinement to
small spaces. The current study was conducted in three slum areas of the city of Sari,
Iran, to measure and assess the environmental health quality of slum settlements.

Methods:
Hierarchical Multiple Regression Analysis (HMR) and the Analytic Hierarchy
Process (AHP) method were used as the analytical techniques for investigating and
analysing the data.

Results:
The results show that the health quality in the studied residences was at a weak
level (1 < 1.92 < 5) and that there were significant relationships between the criteria
and sub-criteria at different levels and the environmental health quality of the slums.

Conclusion:
Preparation for constructing suitable urban residences as well as providing
conditions to benefit from urban advantages in line with an enriched urban culture is
of utmost importance.

Interactions between Economic Growth and Environmental Quality
in Shenzhen, China's First Special Economic Zone
Xiaozi Liu, Gerhard K. Heilig, Junmiao Chen, Mikko Heino
Interim Reports on work of the International Institute for Applied Systems Analysis.
Approved by
Ulf Dieckmann, Program Leader, Evolution and Ecology Program. September 2006
Downloaded from: http://www.iiasa.ac.at/Admin/PUB/Documents/IR-06-032.pdf, 23/8/2012
The relationship between economic development and environmental quality
is a debated topic. Environmental Kuznets Curve (EKC) is one prominent
hypothesis, positing an inverted U-shaped development-environment
relationship. Here we test this hypothesis using data from Shenzhen, People's
Republic of China. Established in 1980 as the first special economic zone in
China, Shenzhen has developed from a small village into a large urban-
industrial agglomeration with the highest income level in the country. The
enormous expansion of infrastructure, industrial sites and urban settlements
has profoundly changed the local environment. We utilize environmental
monitoring data from Shenzhen on concentration of pollutants in ambient air,
main rivers, and near shore waters from 1989 to 2003. The results show that
production-induced pollutants support EKC while consumption-induced
pollutants do not support it.
EKC is only one of many types of environment-development relationships in
Shenzhen, China. Upward pattern, downward pattern and inverted EKC are three
other featured relationships. The general pattern is that production-induced pollutants
tend to support EKC while consumption-induced pollutants do not. For rivers, the
emergence of EKC is mainly due to the relocation of pollution and direct clean-up
actions, which are driven by the market and by government interventions. For organic
pollutants, the robust upward pattern is due to the scarcity of sewage treatment
systems and other public sanitation facilities. However, the income elasticity of
environmental quality demand is adding pressures for good environmental
governance. Once regulatory regimes are put in place, upward patterns for
consumption pollutants are expected to turn down. As the final note, we emphasize
that the time frame of observations may have decisive effects when interpreting
empirical environment-development relationships.
Ecologic regression analysis and the study of the influence of air quality on
mortality.
S Selvin, D Merrill, L Wong, and S T Sacks.
Environ Health Perspect. 1984 March; 54: 333-340.
Downloaded from: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1568150/, 23/8/2012
This presentation focuses entirely on the use and evaluation of regression analysis
applied to ecologic data as a method to study the effects of ambient air pollution on
mortality rates. Using extensive national data on mortality, air quality and
socio-economic status, regression analyses are used to study the influence of air
quality on mortality.
The analytic methods and data are selected in such a way that direct comparisons can
be made with other ecologic regression studies of mortality and air quality.
Analyses are performed by use of two types of geographic areas, age-specific
mortality of both males and females and three pollutants (total suspended particulates,
sulfur dioxide and nitrogen dioxide).
The overall results indicate no persuasive evidence exists of a link between air quality
and general mortality levels.
Overall, it is concluded that linear regression analysis applied to nationally collected
ecologic data cannot be used to usefully infer a causal relationship between air quality
and mortality which is in direct contradiction to other major published studies.

Biometrics. 2003 Mar;59(1):9-17.
Sensitivity analyses for ecological regression.
Wakefield J.
In many ecological regression studies investigating associations between environmental
exposures and health outcomes, the observed relative risks are in the range 1.0-2.0. The
interpretation of such small relative risks is difficult due to a variety of biases--some of
which are unique to ecological data, since they arise from within-area variability in
exposures/confounders. The potential for residual spatial dependence, due to unmeasured
confounders and/or data anomalies with spatial structure, must also be considered, though
it often will be of secondary importance when compared to the likely effects of
unmeasured confounding and within-area variability in exposures/confounders. Methods
for addressing sensitivity to these issues are described, along with an approach for
assessing the implications of spatial dependence.
An ecological study of the association between myocardial infarction and magnesium is
critically reevaluated to determine potential sources of bias. It is argued that the
sophistication of the statistical analysis should not outweigh the quality of the data, and
that finessing models for spatial dependence will often not be merited in the context of
ecological regression.
Economic Growth and Air Quality in China
Daigee Shaw, Arwin Pang, Ming-Feng Hung, Wei Cen.

Downloaded from: http://www.webmeets.com/files/papers/ERE/WC3/182/EKC_China.pdf, 23/8/2012
This paper investigates the relationship between economic development and air
quality by examining Environmental Kuznets Curves from 1992 to 2001 for Mainland
China. We construct simultaneity models for three air pollutants (SO2, NOx and TSP)
and income. We then compile a panel data set of air quality, production, income and
environmental policy variables for 99 cities for these years to estimate the
simultaneity models. The regression results indicate that the EKC hypothesis is
supported in the cases of SO2 and TSP. The only pollutant that does not reflect
the EKC relationship is NOx, probably because it is the most expensive pollutant
to abate.
In this paper, we specify a simultaneity model to investigate the relationship between
urban air pollution and economic development in China. Since the economy and the
environment are determined in combination, the relationship must thus be estimated
simultaneously.
The basic simultaneity model is as follows:

where subscript i denotes city and t year. Equations (1) are the pollution equations.
$P_{ti}$ denotes air pollution in year t for city i. The three intercepts, each indexed
by t, denote the time-specific effects in the fixed-effects model that we use to analyze
the panel data. X is a vector of exogenous variables. Y is per capita GDP. Equation (1) is a
quadratic model with both a linear term and a squared term of income. If the
coefficients of the lnY term and its squared term are positive and negative,
respectively, then the EKC hypothesis holds.
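
The sign condition on the quadratic terms can be illustrated with a deliberately simplified sketch: a single-equation least-squares fit of ln P on ln Y and (ln Y)^2 using simulated data. This is not the authors' simultaneous fixed-effects model; the coefficients, sample size, and variable names below are assumptions chosen only to show how the turning point follows from the fitted coefficients.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical per capita income, spread evenly on the log scale
ln_y = rng.uniform(np.log(500.0), np.log(20000.0), 300)
# Assumed inverted-U relationship: ln P = -20 + 5 ln Y - 0.3 (ln Y)^2 + noise
ln_p = -20.0 + 5.0 * ln_y - 0.3 * ln_y**2 + rng.normal(0.0, 0.05, ln_y.size)

X = np.column_stack([np.ones_like(ln_y), ln_y, ln_y**2])
b0, b1, b2 = np.linalg.lstsq(X, ln_p, rcond=None)[0]

# EKC pattern: positive linear coefficient, negative squared coefficient
supports_ekc = bool(b1 > 0 and b2 < 0)

# Income at which pollution peaks: d(ln P)/d(ln Y) = b1 + 2*b2*ln Y = 0
turning_income = float(np.exp(-b1 / (2.0 * b2)))
```

With the assumed coefficients the peak lies at exp(5 / 0.6), roughly 4,160 income units. In an applied study the turning point should fall inside the observed income range before an inverted U is claimed.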
Understanding air quality data using nonparametric regression analysis
by Yoon, Heesong, Ph.D.,
UNIVERSITY OF SOUTHERN CALIFORNIA, 2006, 253 pages; 3237738
Downloaded from: http://gradworks.umi.com/32/37/3237738.html, 23/8/2012
This research reports on the application of nonparametric regression analysis to air
quality data. As required by law, government agencies have been collecting large
volumes of ambient air quality data to determine compliance with federal and state air
quality standards. Although these data are a perfect candidate for data mining by the
nonparametric regression, it has not been applied to air quality data analysis previously.
The data used during this research consisted of about two years of one-hour average
concentrations of ultrafine particle number, PM10, CO, NOx, SO2, and O3, collected by
regulatory agencies as part of their routine air quality monitoring. Hourly wind speed and
wind direction were also available and were an important part of the analysis. There were
four monitoring sites: Atascadero is a rural city in a coastal valley of central California
while Long Beach, Glendora, and Upland are all located in the heavily populated Los
Angeles County in southern California.
Three-dimensional nonparametric regression charts showing the effect of wind speed and
direction on pollutant concentrations were especially useful in assessing the impact of
local sources. Pollutant concentrations in Atascadero were expected to be mainly from
local roadway sources and the results from this analysis were consistent with such an
expectation. With regards to Long Beach, the results were different from conventional
expectation. High concentrations of CO, NOx, and PM10 were not related to nearby
heavily used freeways, but were primarily the result of pollution from sources to the
north being transported to the site by late night and early morning drainage winds. This
effect was especially strong during the winter months. The inland sites, Glendora and
Upland, shared many similarities in most results. However, the analysis revealed some
important differences between them. Glendora showed more impact from transported
pollutants whereas Upland was more directly impacted by local businesses and traffic
along adjacent roads.
NONPARAMETRIC REGRESSION
Downloaded from: http://en.wikipedia.org/wiki/Nonparametric_regression, 27/8/2012
Nonparametric regression is a form of regression analysis in which
the predictor does not take a predetermined form but is constructed
according to information derived from the data.
Nonparametric regression requires larger sample sizes than
regression based on parametric models because the data must supply
the model structure as well as the model estimates.

KERNEL REGRESSION
Kernel regression estimates the continuous dependent variable from a
limited set of data points by convolving the data points' locations with
a kernel function - approximately speaking, the kernel function
specifies how to "blur" the influence of the data points so that their
values can be used to predict the value for nearby locations.

Figure: Example of a curve (red line) fit to a small data set (black points) with
nonparametric regression using a Gaussian kernel smoother. The pink shaded area
illustrates the kernel function applied to obtain an estimate of y for a given value
of x. The kernel function defines the weight given to each data point in producing
the estimate for a target point.
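
The weighting idea described above can be sketched as a Gaussian kernel smoother of the Nadaraya-Watson type: the estimate at a target point is the kernel-weighted average of the observed responses. The function name, bandwidth, and simulated data are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel_smoother(x_train, y_train, x_eval, bandwidth):
    """Kernel-weighted average of y_train at each point in x_eval."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    # u[i, j]: scaled distance from target point i to data point j
    u = (x_eval[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * u**2)   # Gaussian kernel: nearby points weigh more
    return (weights @ y_train) / weights.sum(axis=1)

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(0.0, 10.0, 80))
y = np.sin(x) + rng.normal(0.0, 0.2, x.size)

grid = np.linspace(1.0, 9.0, 50)
y_hat = gaussian_kernel_smoother(x, y, grid, bandwidth=0.5)
```

The bandwidth controls how far the influence of each data point is "blurred": a larger value gives a smoother but more biased curve, and in practice it is often chosen by cross-validation.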

NONPARAMETRIC MULTIPLICATIVE REGRESSION
Downloaded from: http://en.wikipedia.org/wiki/Nonparametric_regression .. 27/8/2012
Nonparametric multiplicative regression (NPMR) is a form of nonparametric regression
based on multiplicative kernel estimation. Like other regression methods, the goal is to
estimate a response (dependent variable) based on one or more predictors (independent
variables). NPMR can be a good choice for a regression method if the following are true:
1. The shape of the response surface is unknown.
2. The predictors are likely to interact in producing the response; in other words, the
shape of the response to one predictor is likely to depend on other predictors.
3. The response is either a quantitative or binary (0/1) variable.
This is a smoothing technique that can be cross-validated and applied in a predictive way.
Figure: Two kinds of kernels used with kernel smoothers for nonparametric regression.
Figure: Use of Gaussian kernels for nonparametric multiplicative regression with two
predictors. The weights from the kernel function for each predictor are multiplied to
obtain a weight for a given data point in estimating a response variable (dependent
variable) at a target point in the predictor space.
NPMR BEHAVES LIKE AN ORGANISM
Downloaded from: http://en.wikipedia.org/wiki/Nonparametric_regression .. 27/8/2012
NPMR has been useful for modeling the response of an organism to its environment.
Organismal responses to the environment tend to be nonlinear and to involve complex
interactions among predictors. NPMR models these interactions automatically, in much
the same way that organisms integrate the numerous factors affecting their
performance. [1]

A key biological feature of an NPMR model is that failure of an organism to tolerate any
single dimension of the predictor space results in overall failure of the organism. For
example, assume that a plant needs a certain range of moisture in a particular
temperature range. If either temperature or moisture falls outside the tolerance of the
organism, then the organism dies. If it is too hot, then no amount of moisture can
compensate to result in survival of the plant.

Mathematically this works with NPMR because the product of the weights for the target
point is zero or near zero if any of the weights for individual predictors (moisture or
temperature) are zero or near zero. Note further that in this simple example, the second
condition listed above is probably true: the response of the plant to moisture probably
depends on temperature and vice-versa.
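
As a worked illustration of this product rule, assume Gaussian kernels and two predictors, moisture and temperature. The symbols below are notation chosen for this sketch, not taken from the source: $x_{ij}$ is the value of predictor $j$ at data point $i$, $v_j$ is the target value, and $s_j$ is the tolerance (smoothing parameter) for predictor $j$.

```latex
w_i \;=\; \prod_{j \in \{\text{moisture},\,\text{temperature}\}}
      \exp\!\left[-\frac{1}{2}\left(\frac{x_{ij} - v_j}{s_j}\right)^{2}\right]

% Example: a data point that matches the target moisture exactly
% but lies 4 tolerances away in temperature:
w_i \;=\; \underbrace{e^{0}}_{\text{moisture}} \times
          \underbrace{e^{-\frac{1}{2}\,4^{2}}}_{\text{temperature}}
    \;=\; 1 \times e^{-8} \;\approx\; 0.00034
```

Even a perfect match on moisture cannot rescue the weight, because the temperature factor alone drives the product toward zero.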

Optimizing the selection of predictors and their smoothing parameters in a multiplicative
model is computationally intensive. With a large pool of predictors, the computer must
search through a huge number of candidate models to find the best one. The best
model has the best fit, subject to overfitting constraints or penalties.
Overfitting Controls
Understanding and using controls on overfitting is
essential to effective modeling with nonparametric
regression. Nonparametric regression models can become
overfit either by including too many predictors or by using
smoothing parameters (a.k.a. bandwidths or tolerances) that
are too small. These controls matter most in difficult cases,
such as small data sets or clumped distributions along
predictor variables.
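
One common way to guard against a smoothing parameter that is too small is leave-one-out cross-validation: each point is predicted from all the other points, so a model that merely reproduces its own data gains nothing. The sketch below assumes a single predictor, a Gaussian local mean, and a hypothetical grid of candidate bandwidths; it is an illustration and not the specific control used by any particular NPMR software.

```python
import numpy as np

def loo_cv_error(x, y, bandwidth):
    """Mean squared leave-one-out prediction error of a Gaussian local mean."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = (x[:, None] - x[None, :]) / bandwidth
    w = np.exp(-0.5 * z**2)
    np.fill_diagonal(w, 0.0)  # exclude each point from its own estimate
    denom = w.sum(axis=1)
    if np.any(denom == 0.0):
        # Bandwidth so small that some point has no usable neighbours.
        return np.inf
    pred = (w @ y) / denom
    return float(np.mean((y - pred) ** 2))

# Simulated data (made up for illustration).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 80))
y = np.sin(x) + rng.normal(0, 0.3, 80)

candidates = [0.01, 0.1, 0.3, 1.0, 3.0, 10.0]  # hypothetical bandwidth grid
errors = {h: loo_cv_error(x, y, h) for h in candidates}
best = min(errors, key=errors.get)
```

Very small bandwidths are penalized because each point is predicted almost entirely from its nearest neighbour, and very large ones because the estimate collapses toward the overall mean.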
THE LOCAL MODEL
Downloaded from: http://en.wikipedia.org/wiki/Nonparametric_regression .. 27/8/2012
NPMR can be applied with several different kinds of local models. By "local
model" we mean the way that data points near a target point in the predictor
space are combined to produce an estimate for the target point. The most
common choices for the local models are the local mean estimator, a local
linear estimator, or a local logistic estimator. In each case the weights can be
extended multiplicatively to multiple dimensions.
In words, the estimate of the response is a local estimate (for example a local
mean) of the observed values, each value weighted by its proximity to the
target point in the predictor space, the weights being the product of weights
for individual predictors. The model allows interactions, because weights for
individual predictors are combined by multiplication rather than addition.
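
The verbal description above can be sketched in code. This is a minimal illustration of a local mean estimator with multiplicative weights, assuming Gaussian kernels; the function name, the tolerances, and the moisture and temperature values are hypothetical.

```python
import numpy as np

def npmr_local_mean(X, y, target, tolerances):
    """Local mean at `target` with weights multiplied across predictors."""
    X = np.asarray(X, dtype=float)            # shape (n_points, n_predictors)
    y = np.asarray(y, dtype=float)
    target = np.asarray(target, dtype=float)
    tolerances = np.asarray(tolerances, dtype=float)
    # One Gaussian weight per data point and per predictor.
    per_predictor = np.exp(-0.5 * ((X - target) / tolerances) ** 2)
    # Multiply across predictors: one small factor makes the whole weight small.
    weights = per_predictor.prod(axis=1)
    return float(np.sum(weights * y) / np.sum(weights))

# Columns: moisture (fraction), temperature (degrees C); values are made up.
X = np.array([[0.5, 20.0],
              [0.5, 40.0]])
y = np.array([10.0, 0.0])  # e.g. plant growth; zero where it is too hot
estimate = npmr_local_mean(X, y, target=[0.5, 20.0], tolerances=[0.1, 2.0])
```

Both data points match the target moisture, but the second is ten tolerances away in temperature, so its weight is negligible and the estimate stays close to the response of the first point.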
Figure: Two commonly used forms of a local model used in nonparametric regression,
contrasted with a simple linear model.
SEMIPARAMETRIC REGRESSION
Downloaded from: http://en.wikipedia.org/wiki/Semiparametric_regression .. 27/8/2012
In statistics, semiparametric regression includes regression models that
combine parametric and nonparametric models.

They are often used in situations where the fully nonparametric model may
not perform well or when the researcher wants to use a parametric model but
the functional form with respect to a subset of the regressors or the density of
the errors is not known.

Semiparametric regression models are a particular type of semiparametric
modelling and, since semiparametric models contain a parametric component,
they rely on parametric assumptions and may be misspecified and
inconsistent, just like a fully parametric model.

Partially linear models
A partially linear model is given by:

$$Y_i = X_i'\beta + g(Z_i) + u_i, \quad i = 1, \ldots, n,$$

where $Y_i$ is the dependent variable, $X_i$ and $Z_i$ are vectors of explanatory
variables, $\beta$ is a $p \times 1$ vector of unknown parameters and $Z_i \in \mathbb{R}^q$.

The parametric part of the partially linear model is given by the parameter vector
$\beta$, while the nonparametric part is the unknown function $g(Z_i)$.

The data are assumed to be i.i.d. with

$$E(u_i \mid X_i, Z_i) = 0,$$

and the model allows for a conditionally heteroskedastic error process

$$E(u_i^2 \mid x, z) = \sigma^2(x, z)$$

of unknown form. This type of model was proposed
by Robinson (1988) and extended to handle categorical covariates by Racine and Liu
(2007).
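
A partially linear model of the usual form y = x*beta + g(z) + u can be estimated with a double-residual approach, commonly attributed to Robinson (1988): smooth both y and x on z, then regress the residual of y on the residual of x. The sketch below is a simplified illustration under stated assumptions: scalar x and z, a Gaussian local mean as the smoother, a fixed hypothetical bandwidth, and simulated data. It is not a full implementation of the published estimator.

```python
import numpy as np

def kernel_mean(z_train, v_train, z_query, bandwidth):
    """Gaussian local mean estimate of E[v | z] at each query point."""
    u = (z_query[:, None] - z_train[None, :]) / bandwidth
    w = np.exp(-0.5 * u**2)
    return (w @ v_train) / w.sum(axis=1)

def partially_linear_fit(y, x, z, bandwidth):
    """Double-residual estimate of beta and g in y = x*beta + g(z) + u."""
    y_res = y - kernel_mean(z, y, z, bandwidth)      # remove E[y | z]
    x_res = x - kernel_mean(z, x, z, bandwidth)      # remove E[x | z]
    beta = float(np.sum(x_res * y_res) / np.sum(x_res**2))  # OLS, no intercept
    g_hat = kernel_mean(z, y - beta * x, z, bandwidth)      # nonparametric part
    return beta, g_hat

# Simulated data with a known beta of 2.0 (made up for illustration).
rng = np.random.default_rng(1)
n = 400
z = rng.uniform(0, 1, n)
x = z + rng.normal(0, 1, n)
y = 2.0 * x + np.sin(2 * np.pi * z) + rng.normal(0, 0.3, n)

beta_hat, g_hat = partially_linear_fit(y, x, z, bandwidth=0.1)
```

Subtracting the smoothed values removes g(z) from both sides, so the remaining relationship between the residuals is approximately linear with slope beta.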
POLYNOMIAL REGRESSION
Downloaded from: http://en.wikipedia.org/wiki/File:Polyreg_scheffe.svg .. 27/8/2012
In statistics, polynomial regression is a form of linear regression in which
the relationship between the independent variable x and the dependent
variable y is modelled as an nth order polynomial. Polynomial regression fits
a nonlinear relationship between the value of x and the corresponding
conditional mean of y, denoted E(y|x), and has been used to describe
nonlinear phenomena such as the growth rate of tissues, the distribution of
carbon isotopes in lake sediments, and the progression of disease epidemics.

Although polynomial regression fits a nonlinear model to the data, as a
statistical estimation problem it is linear, in the sense that the regression
function E(y|x) is linear in the unknown parameters that are estimated from
the data.
For this reason, polynomial regression is considered to be a special case of
multiple linear regression.
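
To make the "linear in the parameters" point concrete, the sketch below fits a cubic by ordinary least squares on a design matrix whose columns are 1, x, x^2 and x^3. The function name and the noise-free test data are hypothetical, chosen so that the recovered coefficients can be checked by eye.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least-squares coefficients (b0, b1, ..., b_degree) of a polynomial in x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with columns 1, x, x^2, ..., x^degree.
    design = np.vander(x, degree + 1, increasing=True)
    # Ordinary multiple linear regression on those columns.
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Noise-free data from y = 1 - 2x + 0.5x^3 (made up for illustration).
x = np.linspace(-2, 2, 21)
y = 1.0 - 2.0 * x + 0.5 * x**3
coef = fit_polynomial(x, y, degree=3)
```

The powers of x are simply treated as separate regressors, which is why the usual multiple linear regression machinery applies unchanged.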
Figure: A cubic polynomial regression fit to a simulated data set.
The confidence band is a 95% simultaneous confidence band constructed
using the Scheffé approach.
Downloaded from: www.yorku.ca/.../Polynomial%20regression.p..... 27/8/2012
General formula of the polynomial model:

$$E(Y) = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \cdots + \beta_p X^p$$

One way of choosing which polynomial model to use is a forward selection
procedure, which begins by fitting a linear regression to the data:

$$E(Y) = \beta_0 + \beta_1 X$$
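
A forward selection of the polynomial degree might be sketched as follows. The source only states that the procedure starts from the linear fit; the stopping rule used here is an assumption: add the next power of X while its partial F statistic exceeds a fixed threshold of 4.0, which is roughly the 5% critical value of F with 1 numerator degree of freedom and moderately large residual degrees of freedom. The function names, the threshold, and the simulated data are hypothetical.

```python
import numpy as np

def rss_for_degree(x, y, degree):
    """Residual sum of squares of a least-squares polynomial fit."""
    design = np.vander(x, degree + 1, increasing=True)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

def forward_select_degree(x, y, max_degree=5, f_threshold=4.0):
    """Start at degree 1 and add powers while the added term looks significant."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    degree = 1
    rss_current = rss_for_degree(x, y, degree)
    while degree < max_degree:
        rss_next = rss_for_degree(x, y, degree + 1)
        df_resid = n - (degree + 2)  # the larger model has degree + 2 parameters
        # Partial F statistic for the single added term.
        f_stat = (rss_current - rss_next) / (rss_next / df_resid)
        if f_stat < f_threshold:     # assumed stopping rule, see the note above
            break
        degree += 1
        rss_current = rss_next
    return degree

# Simulated quadratic data (made up for illustration).
rng = np.random.default_rng(2)
x = np.linspace(-3, 3, 60)
y = 1.0 + 0.5 * x + 2.0 * x**2 + rng.normal(0, 0.5, x.size)
chosen = forward_select_degree(x, y)
```

In practice the threshold would be taken from the F distribution for the actual degrees of freedom rather than fixed in advance.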
Diagram: Regression Analysis divides into Linear Regression and Nonlinear Regression;
the model types shown are the Exponential, Polynomial, Power, and Saturation Growth models.
