
UNIVERSITY OF GONDAR

FACULTY OF NATURAL AND COMPUTATIONAL SCIENCES

DEPARTMENT OF STATISTICS

ORDINAL LOGISTIC REGRESSION ANALYSIS


ON
THE PREVALENCE AND RISK FACTORS OF ANEMIA AMONG
REPRODUCTIVE AGE GROUP WOMEN IN ETHIOPIA

BY

HAILE YISMAW ALEMAYEHU

THESIS SUBMITTED TO UNIVERSITY OF GONDAR IN PARTIAL FULFILLMENT


OF THE REQUIREMENTS FOR THE DEGREE OF MASTERS OF SCIENCE IN
STATISTICS (BIOSTATISTICS)

August 2014
Gondar, Ethiopia
ADVISOR: LEAKEMARIAM BERHE (DR.)


Declaration

I, the undersigned, declare that the thesis is my original work, has not been presented for a
degree in any other university and that all sources of material used for the thesis have been duly
acknowledged.

Declared by:

Name: Haile Yismaw

Signature: -------------------------------

Date:------------------- 2014

This thesis has been submitted for examination with my approval as a University Advisor.

Approved by the Advisor:

Name: Leakemariam Berhe (Dr.)

Signature: ---------------------------------

Date: …, …, 2014

Approved by the Board of Examiners

–––––––––––––––––––– ––––––––––––––––––– –––––––––––––––––––

Chairman Signature Date

–––––––––––––––––––– ––––––––––––––––––– –––––––––––––––––––

Examiner Signature Date

–––––––––––––––––––– ––––––––––––––––––– –––––––––––––––––––

Examiner Signature Date

Acknowledgements
First of all, I would like to thank God for bringing me to where I wished to be and for
sustaining my motivation.

My special gratitude goes to my advisor, Dr. Leakemariam Berhe, for his invaluable comments
and suggestions, given with great patience, which contributed to the successful completion of
the study.

Many thanks to my family, especially my brother and mother; I appreciate all your
encouragement, motivation, and financial support, which gave me this opportunity in my life.

ABSTRACT

This study investigated risk factors for anemia among women of reproductive age in Ethiopia,
using data from the 2011 Ethiopian Demographic and Health Survey (EDHS) collected by the
Central Statistical Agency (CSA). An ordinal logistic regression analysis was performed to
identify the potential risk factors of anemia.

The study showed that region, place of residence, educational background, religion, pregnancy
status, marital status, and body mass index are determinant factors of anemia. Women from
Somali and Dire Dawa were 2.802 and 2.347 times more likely, respectively, to be severely
anemic than women of reproductive age from Addis Ababa. It is recommended that areas with
high prevalence be identified, that women be empowered through education and improved
economic status, and that well-planned intervention programs be established to reduce the
severity of anemia among women of reproductive age.

Table of contents
Acknowledgements
Abstract
Table of contents
List of abbreviations
CHAPTER ONE
1. Introduction
   1.1 Background of the study
   1.2 Statement of the problem
   1.3 Objectives of the study
   1.4 Significance of the study
CHAPTER TWO
2. Literature review
   2.1 Background
   2.2 Causes of anemia
CHAPTER THREE
3. Methodology
   3.1 Study area
   3.2 Source of data
   3.3 Variables considered in the study
      3.3.1 Dependent variable
      3.3.2 Independent variables
   3.4 Statistical model
      3.4.1 Logistic regression model
         3.4.1.1 Multiple logistic regression
         3.4.1.2 Assumptions of logistic regression
      3.4.2 Ordinal logistic regression
         3.4.2.1 Proportional odds model
         3.4.2.2 Partial proportional odds model
         3.4.2.3 Continuation ratio model
         3.4.2.4 Parallel lines assumption
         3.4.2.5 Odds ratio
         3.4.2.6 Parameter estimation for ordinal logistic regression
            3.4.2.6.1 Maximum likelihood method
         3.4.2.7 The Wald statistic
      3.4.3 Model evaluation
         3.4.3.1 Goodness of fit test
         3.4.3.2 Test of parallel lines
         3.4.3.3 Likelihood ratio test
         3.4.3.4 Pearson and deviance chi-square statistics
CHAPTER FOUR
4. Statistical analysis and discussion
   4.1 Descriptive statistics
   4.2 Multiple ordinal logistic regression analysis result
   4.3 Evaluation of the model
   4.4 Interpretation of results
   4.5 Discussion of results
CHAPTER FIVE
5. Conclusion and recommendation
   5.1 Conclusion
   5.2 Recommendation
References
Appendix
   Descriptive statistics
   SAS codes


List of Tables
Table 1: Hemoglobin values defining anemia in women
Table 2: Description of covariates
Table 3: Frequency and percentage distribution of anemia status by the levels of covariates
Table 4: Frequency distribution of anemia in women of reproductive age group
Table 5: Ordinal logistic regression analysis result having all covariates
Table 6: Ordinal logistic regression analysis result excluding few covariates
Table 7: Model fit result
Table 8: Percentage distributions of the prevalence of anemia among women of reproductive
         age group (15-49 years) in the category of covariates
Table 9: Analysis of maximum likelihood estimates
Table 10: Odds ratio estimates

LIST OF ABBREVIATIONS
AIP      Anemia in pregnancy
HB       Hemoglobin
OR       Odds ratio
WHO      World Health Organization
EDHS     Ethiopian Demographic and Health Survey
NRERC    National Research Ethics Review Committee
FMOH     Federal Ministry of Health
POM      Proportional odds model
DHS      Demographic and Health Survey
SNNP     Southern Nations, Nationalities, and Peoples
RHS      Reproductive Health Survey
LBW      Low birth weight
CDH      Center of Disease Control and Prevention
GOE      Government of Ethiopia
PASDEP   Plan for Accelerated and Sustained Development to End Poverty
NNS      National Nutrition Strategy
AOR      Adjusted odds ratio

Chapter one
1. Introduction
1.1 Background of the study

Anemia is a condition characterized by a decrease in the concentration of hemoglobin in the
blood. Hemoglobin is necessary for transporting oxygen to tissues and organs in the body. The
reduction in oxygen available to organs and tissues due to a low hemoglobin level (less than
12.0 g/dl) is responsible for many of the symptoms experienced by anemic people (Mishra,
Ahluwalia, Garg, Kar, and Panda, 2012). The effects of anemia include general body weakness,
frequent tiredness, and lowered resistance to disease (Sharmanov, 2000; Bentley and Griffiths,
2003; Gillespie and Johnston, 1998; Stoltzfus, 1997; Allen, 1997).

Anemia is a severe global public health problem with serious consequences for people and for
socio-economic systems in general. Despite health service improvements over the past decade,
anemia remains a significant public health problem, especially for women and children
(WHO, 2001).

Unfavorable pregnancy outcomes have been reported to be more common in anemic women than in
non-anemic women because of the low concentration of hemoglobin (INACG, 1989). Women with
severe anemia can experience difficulty meeting oxygen transport requirements near and at
delivery, especially if significant hemorrhage occurs. This may be an underlying cause of
maternal death and of antenatal and perinatal infant loss (Fleming, 1987).

According to a WHO report (2008) based on studies conducted from 1993 to 2005, the estimated
global prevalence of anemia was 24.8%, affecting 1.62 billion people worldwide. The same
report estimated the prevalence at 41.8% in pregnant women and 30.2% in non-pregnant women;
that is, 56 million pregnant women and 468 million non-pregnant women were estimated to be
anemic. The condition is closely associated with poverty and is therefore particularly
prevalent in the developing world, where the problem is often exacerbated by limited access to
appropriate healthcare and treatment options (WHO, 2008).

Anemia occurs at all stages of life but is particularly prevalent in women of reproductive age
and children (WHO, 2009). The World Health Organization (WHO, 2011) estimates that iron
deficiency is responsible for approximately 50% of all anemia cases. Other significant causes,
the relative contributions of which vary by geographic location, include deficiencies of other
nutrients, malaria, helminth (worm) infections, and a variety of other diseases (WHO, 2008).

According to WHO classifications, anemia is a moderate public health problem in Ethiopia. The
2005 Demographic and Health Survey (DHS) in Ethiopia reported that 30.6 percent of pregnant
women, 29.8 percent of lactating women, 23.9 percent of non-pregnant women, and 24.8 percent
of adolescent girls were anemic. Severe anemia was found in 3.0 percent of pregnant women and
1.3 percent of non-pregnant women of reproductive age.

The causal factors that contribute to anemia are multiple and complex. Therefore, it is
critical to collect accurate information about them to provide evidence for the best
interventions for anemia control. Determining the factors that influence the occurrence of
anemia is fundamental to implementing control measures. In view of this, this work aims to
determine the potential risk factors responsible for the prevalence of anemia among women of
reproductive age in Ethiopia using ordinal logistic regression.

1.2 Statement of the Problem


The consequences of anemia in women include increased fatigue, decreased cognitive ability,
decreased work productivity, and increased morbidity and mortality. In fact, women with severe
anemia in pregnancy have a 3.5 times greater chance of dying from obstetric complications than
non-anemic pregnant women (Bhargava et al., 2001). The consequences of anemia can be quite
severe and are often irreversible, affecting both human life and the socioeconomic situation
of individuals and families. Severe anemia (Hb < 70 g/L) reduces a woman's ability to survive
bleeding during and after childbirth and is considered a major cause of maternal morbidity and
mortality, particularly in developing countries (WHO, 1992). Anemia during pregnancy is also
associated with an increased risk of premature delivery and low birth weight, resulting in an
increase in perinatal mortality (WHO, 2008).

Despite the high prevalence and serious consequences of anemia, there have been few reported
studies in developing countries (Jemal, 2010; WHO, 2008; Bamlaku, 2009). Moreover, although it
is well known that anemia results from multiple causes, there are few reported case studies in
Ethiopia (WHO, 2001; World Bank, 1994). Improving maternal, newborn, and child health is a
high-priority area in Ethiopia's health policy. Reducing child and maternal mortality rates
and accelerating sustained development to end poverty are also major objectives of the
country. To address maternal health problems such as anemia effectively and to meet its health
policy goals:

- Causes of anemia need to be identified based on empirical study.

- Statistical methods used to analyze the data should be chosen based on the nature of the data
considered.

Studies have been conducted to identify factors responsible for the prevalence of anemia among
women of reproductive age (Jemal, 2010; Central Statistical Agency and ORC Macro, 2012).
Several comparative studies, conducted mostly with the support of WHO, have provided world and
regional estimates of anemia; despite their importance, the credibility of some of these
comparisons is doubtful because of differences in methodology (DeMaeyer and Adiels-Tegman,
1985; WHO, 1992). Hence, this study explored factors associated with the prevalence of anemia
among women of reproductive age using ordinal logistic regression, a method of analysis
appropriate to the ordered nature of the outcome.

1.3 Objectives of the Study


General Objective

The general objective of this study is to investigate the risk factors for the prevalence of
anemia among women of reproductive age in Ethiopia using ordinal logistic regression.

Specific objectives

- To fit an ordinal logistic regression model for the prevalence of anemia among women of
reproductive age in Ethiopia.
- To estimate the relative contribution of the significant factors to the prevalence of
anemia among women of reproductive age in Ethiopia.
- To provide information on the prevalence and risk factors of anemia among women of
reproductive age to policy makers and stakeholders.

1.4 Significance of the study

The importance of this study is to provide relevant information to policy makers, health
practitioners, and researchers about the possible risk factors contributing to the prevalence
of anemia among women of reproductive age in Ethiopia. Effective management of anemia requires
identifying the main risk factors in order to design and implement an integrated package of
interventions to address the problem.

This study will help in identifying women at greatest risk of anemia and priority areas for
action, especially when resources are limited. It will also facilitate the monitoring and
assessment of progress towards international goals of preventing and controlling iron
deficiency.

CHAPTER TWO

2. LITERATURE REVIEW

2.1 Background
Mishra et al. (2010) carried out a cross-sectional study of the prevalence of anemia in
females of reproductive age (15-45 years) in Barara village of Ambala district, India, with a
representative sample of 598, using logistic regression analysis. They found that income has a
significant effect on the prevalence of anemia. The study also revealed that women younger
than 30 years were more likely to have moderate or severe anemia (<9 g/dl) than women older
than 30 years (odds ratio = 1.31, 95% CI = 0.88-1.96), while women in the lower income group
were more likely to have moderate or severe anemia than women in the higher income group
(odds ratio = 4.80, 95% CI = 2.96-7.79).

Samson and Fikre (2011) carried out a cross-sectional study in Ethiopia to examine the risk
factors of anemia among women of reproductive age using binary logistic regression. They found
that women who were living in rural areas at the time of the survey were about twice as likely
to have anemia as women in urban areas, with an AOR of 1.99 (95% CI: 1.73-2.30). Also, women
with no education and those with primary-level education had a significantly higher risk of
anemia than women with secondary or higher education, with AORs of 2.59 (95% CI: 1.62-4.14)
and 1.83 (95% CI: 1.13-2.96), respectively. They also found that the religion and marital
status of respondents were not associated with the risk of anemia.

Jemal (2010) conducted a cross-sectional community-based study in Ethiopia to assess the
magnitude of anemia and the factors responsible for anemia among women of reproductive age
(15-49 years). Pearson's chi-square tests and a logistic regression model were applied to test
for significant variables. He identified having more children aged 2-5 years and having no
formal education as risk factors for anemia.

Dey et al. (2010) conducted a study in Meghalaya, India, using 2005-2006 National Family
Health Survey (NFHS-3) data to explore the predictors of the prevalence of anemia. They
examined several variables, such as age, place of residence, nutritional status, number of
children ever born, pregnancy status, educational achievement, and economic status, using
binomial logistic regression. They found that all the predictors except the number of children
ever born were statistically significant for the prevalence of anemia. They also found that
women aged 20-24 years were at high risk of anemia, with an odds ratio of 1.509.

Bentley and Griffiths (2003) carried out a cross-sectional study of the hemoglobin status of
women aged 15-49 years in Andhra Pradesh, a southern Indian state, using ordinal logistic
regression to identify socio-economic, regional, and demographic determinants of anemia. They
found that women from urban areas in the low standard-of-living group are more likely to be
mildly, moderately, or severely anemic (OR=1.76, 95% CI=1.25, 2.46) than women from urban
areas in the high standard-of-living group. Also, women from religious groups other than
Muslim or Hindu are more likely to be anemic (OR=1.59, 95% CI=1.28, 1.98). Women who have
received at least a high school education are significantly less likely to be mildly,
moderately, or severely anemic (OR=0.65, 95% CI=0.45, 0.94); women with a body mass index of
less than 18.5 kg/m2 are significantly more likely to be anemic (OR=1.14, 95% CI=1.00, 1.29)
than those with a normal BMI (18.5-24.9 kg/m2). They also found that overweight respondents
with a BMI ≥25 kg/m2 are significantly less likely to be anemic than those with a normal BMI
(OR=0.76, 95% CI=0.62, 0.93).

Pala and Dundar (2007) conducted a cross-sectional study in Bursa, Turkey, to determine the
prevalence of anemia and its risk factors in women of reproductive age using multiple logistic
regression analysis. They found that the use of more than 2 sanitary pads during menstruation
(OR=3.67, 95% CI 2.30-5.88; P<0.001) and more than five days of menstrual bleeding (OR=3.01,
95% CI 1.94-4.66; P<0.001) are risk factors for anemia, but they found no independent
relationship between anemia and age, education, marital status, job, parity, BMI, regularity
of cycle, or length of cycle.

2.2 Causes of anemia

Certain forms of anemia are hereditary and infants may be affected from the time of birth
(Bhargava et al, 2001; Colomer et al, 1990). Women in the childbearing years are particularly
susceptible to iron-deficiency anemia because of the blood loss from menstruation and the
increased blood supply demands during pregnancy. They also may have a greater risk of
developing anemia because of poor diet and other medical conditions.

There are several types of anemia. All are very different in their causes and treatments. Iron-
deficiency anemia, the most common type, is very treatable with diet changes and iron
supplements. Some forms of anemia, like the anemia that develops during pregnancy, are even
considered normal. However, some types of anemia may present lifelong health problems.

Regarding socio-demographic factors, being from lower economic and education category, and
living in rural areas were identified as predisposing factors to anemia (Tata, 2009). Hence,
empowering women in terms of education and economic status would have positive
contributions to avert the problem.

CHAPTER THREE
3. METHODOLOGY
3.1 Study area

This study was conducted in Ethiopia, which is located in the Horn of Africa and bordered by
Eritrea to the north, Djibouti and Somalia to the east, Sudan and South Sudan to the west, and
Kenya to the south. It is a country with great climatic, geographic, and cultural diversity.

According to the Population Census Commission report (2008), Ethiopia has a population of
78,254,090, is the second most populous country in Sub-Saharan Africa, and has had an average
annual population growth rate of 2.6% since 1994. The urban population constitutes 17.6% of
the whole population, whereas the remaining 82.4% lives in rural areas. Of the total
population, 50.3% are male and 49.7% are female. More than one fifth (23.3%) of the population
are women in the reproductive age group.

3.2 Source of data

The data and other necessary information for the study were taken from the 2011 Ethiopia
Demographic and Health Survey (EDHS), collected by the Ethiopian Central Statistical Agency
(CSA). The 2011 EDHS sample was designed to provide information on population and health
indicators at the national and regional levels. The sample data were collected from eleven
geographic/administrative regions: nine regional states (Tigray, Afar, Amhara, Oromia, Somali,
Benishangul-Gumuz, SNNP, Gambela, and Harari) and two city administrations (Addis Ababa and
Dire Dawa).

Administratively, each of the 11 geographic regions in Ethiopia is divided into zones, and
each zone into lower administrative units called woredas. Each woreda is further subdivided
into the lowest administrative unit, called a kebele. Each kebele was subdivided into
convenient areas called census enumeration areas (EAs), from which samples were drawn (EDHS
Preliminary Report, 2011). The EDHS sample included 624 EAs, of which 187 were in urban areas
and 437 in rural areas. A complete listing of households was made for each of the 624 selected
EAs from September 2010 to January 2011. The listing excluded institutional living
arrangements (e.g., army camps, hospitals, police camps, and boarding schools). A
representative sample of 17,817 households was selected for the 2011 EDHS, of which 17,018
were found to be occupied. Of the 17,018 occupied households, 16,702 were successfully
interviewed, yielding a response rate of 98%. In the interviewed households, a total of 16,515
women aged 15-49 were successfully interviewed. Anemia testing was performed in each household
among eligible women who consented to being tested; 15,568 (about 94%) of the interviewed
women were successfully tested.
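
The response figures quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the unweighted counts reported in this section, and the variable and function names are hypothetical. Published survey rates may differ slightly because of weighting or rounding.

```python
# Counts reported for the 2011 EDHS in the paragraph above.
occupied_households = 17_018
interviewed_households = 16_702
interviewed_women = 16_515
tested_women = 15_568


def rate(numerator: int, denominator: int) -> float:
    """Return numerator / denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


# Share of occupied households that were successfully interviewed.
household_response = rate(interviewed_households, occupied_households)
# Share of interviewed women who were successfully tested for anemia.
anemia_test_coverage = rate(tested_women, interviewed_women)

print(f"Household response rate: {household_response:.1f}%")
print(f"Anemia test coverage:    {anemia_test_coverage:.1f}%")
```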

Table 1: Hemoglobin values (g/dl) defining anemia in women, based on WHO (2001) criteria

                                  Non-anemic   Mild anemia   Moderate anemia   Severe anemia
Pregnant women                    >11.0        10.0-10.9     7.0-9.9           <7.0
Non-pregnant women (15-49 years)  >12.0        10.0-11.9     7.0-9.9           <7.0

3.3 Variables Considered in the study

3.3.1 The Dependent Variable

A dependent variable is a variable whose values are influenced by the values of the
explanatory variables (covariates) (Hosmer and Lemeshow, 2000). For this study, the dependent
variable is the anemia status of women of reproductive age, measured on an ordinal scale. It
has four categories (levels): non-anemic, mild anemia, moderate anemia, and severe anemia. For
this study, however, the categories are collapsed into three (non-anemic, mild or moderate
anemia, and severe anemia) to maintain the assumptions of the ordinal logistic regression
model. The categories are defined based on the hemoglobin reference values (WHO, 1992) as
follows.

Non-anemic
Pregnant women with a blood hemoglobin level greater than 11 g/dl and non-pregnant women with
a blood hemoglobin level greater than 12 g/dl were recorded as non-anemic.

Mild anemia
Mild anemia corresponds to a hemoglobin concentration of 10.0-10.9 g/dl for pregnant women and
10.0-11.9 g/dl for non-pregnant women (Table 1). Women with mild anemia in pregnancy have
decreased work capacity. They may be unable to earn their livelihood if the work involves
manual labor (WHO, 1992).

Moderate anemia
Moderate anemia corresponds to a hemoglobin concentration of 7.0-9.9 g/dl. Women with moderate
anemia have a substantial reduction in work capacity and may find it difficult to cope with
household chores and child care. Available data from India and elsewhere indicate that
maternal morbidity rates are higher in women with Hb below 8 g/dl. These women are more
susceptible to infections, and their recovery from infections may be prolonged. Premature
births are more common in women with moderate anemia. They deliver infants with lower birth
weight, and perinatal mortality is higher in these babies.

Severe anemia
For all of the tested groups, severe anemia (<7.0 g/dl) is the most dangerous category. Severe
anemia indicates that there may be one or more serious nutritional deficiencies or an
underlying medical problem that requires thorough assessment and treatment.
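
The category definitions above can be sketched as a small classification routine. This is a minimal illustration rather than the procedure used in the thesis: the function and constant names are hypothetical, the cut-offs are taken from Table 1, and a hemoglobin value exactly at the non-anemic cut-off (11.0 or 12.0 g/dl) is assumed to count as non-anemic.

```python
def classify_anemia(hb_g_dl: float, pregnant: bool) -> str:
    """Return the four-level anemia status using the Table 1 cut-offs (g/dl)."""
    # Assumption: values at or above the cut-off are treated as non-anemic.
    non_anemic_cutoff = 11.0 if pregnant else 12.0
    if hb_g_dl < 7.0:
        return "severe"
    if hb_g_dl < 10.0:
        return "moderate"
    if hb_g_dl < non_anemic_cutoff:
        return "mild"
    return "non-anemic"


def collapse_to_three(status: str) -> str:
    """Merge mild and moderate anemia into one level, as done in this study."""
    return "mild-moderate" if status in {"mild", "moderate"} else status


# Ordered codes for the three-level response (hypothetical coding).
ORDINAL_CODE = {"non-anemic": 0, "mild-moderate": 1, "severe": 2}

for hb, pregnant in [(12.4, False), (11.5, False), (10.5, True), (6.8, True)]:
    level = collapse_to_three(classify_anemia(hb, pregnant))
    print(hb, pregnant, level, ORDINAL_CODE[level])
```

Collapsing after classification keeps the four-level definition available for descriptive tables while the model uses the three-level response.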

3.3.2 The Independent Variables


The independent variables are variables that are presumed to affect or determine the dependent
variable. The variables considered in this study and expected to be risk factors for the
prevalence of anemia in women of reproductive age include age, number of children ever born,
place of residence, region, religion, marital status, pregnancy status, educational status,
and economic status (Dey and Goswami, 2010; Alem et al., 2013).

Table 2: Description of explanatory variables

Predictor   Variable name and value levels                              Type of variable

agroup      Age in 5-year groups                                        Ordinal categorical
            1 = 15-19, 2 = 20-24, 3 = 25-29, 4 = 30-34,
            5 = 35-39, 6 = 40-44, 7 = 45-49

region      Region                                                      Nominal categorical
            1 = Tigray, 2 = Affar, 3 = Amhara, 4 = Oromiya,
            5 = Somali, 6 = Benishangul-Gumuz, 7 = SNNP,
            8 = Gambela, 9 = Harari, 10 = Addis Ababa,
            11 = Dire Dawa

tplace      Type of place of residence                                  Nominal categorical
            1 = urban, 2 = rural

religion    Religion                                                    Nominal categorical
            0 = orthodox, 1 = catholic, 2 = protestant,
            3 = Muslim, 4 = others

edattain    Educational attainment                                      Ordinal categorical
            0 = no education, 1 = incomplete primary,
            2 = incomplete secondary, 3 = complete secondary,
            4 = complete primary, 5 = higher

windex      Wealth index                                                Ordinal categorical
            1 = poorest, 2 = poorer, 3 = middle,
            4 = richer, 5 = richest

cpregn      Currently pregnant                                          Nominal
            1 = yes, 0 = no or unsure

smcig       Smoking cigarettes                                          Nominal
            0 = no, 1 = yes

marsta      Current marital status                                      Nominal categorical
            0 = never in union, 1 = married,
            2 = living with partner, 3 = widowed, 4 = divorced,
            5 = no longer living together/separated

cwork       Currently working                                           Nominal
            0 = no, 1 = yes

tborn       Total children ever born                                    Ordinal categorical
            1 = 0≤tborn<2, 2 = 2≤tborn<4,
            3 = 4≤tborn<6, 4 = tborn≥6

bmaindex    Body mass index                                             Nominal
            1 = underweight, 2 = normal/healthy weight,
            3 = overweight, 4 = obese

Table 3: Frequency and percentage distribution of anemia status by the levels of covariates

                                            Non-anemic   Mild-moderate          Severe           Total
Covariate and level                          %       n       %       n       %       n       %       n

Age in 5-year groups
  15-19                                  19.42    3023    3.67     571    0.15      23   23.23    3617
  20-24                                  14.91    2321    3.22     502    0.17      27   18.31    2850
  25-29                                  14.96    2329    4.05     631    0.18      28   19.19    2988
  30-34                                   9.84    1532    2.58     402    0.14      22   12.56    1956
  35-39                                   9.33    1453    2.49     387    0.13      21   11.95    1861
  40-44                                   6.23     970    1.71     266    0.06       9       8    1245
  45-49                                   5.38     837    1.34     208    0.04       6    6.75    1051

Region
  Tigray                                  9.49    1477    1.31     204    0.04       7   10.84    1688
  Affar                                   5.15     801    2.81     438    0.13      21    8.09    1260
  Amhara                                 10.51    1636    2.22     345    0.05       8   12.78    1989
  Oromiya                                10.75    1673    2.44     380     0.1      15   13.28    2068
  Somali                                  2.95     459    2.03     316    0.24      38    5.22     813
  Benishangul-Gumuz                       6.28     977    1.48     230    0.04       6    7.79    1213
  SNNP                                   11.05    1720    1.39     216    0.04       7   12.48    1943
  Gambela                                 5.62     875    1.37     213    0.03       4    7.01    1092
  Harari                                  5.06     788    1.19     185    0.04       7    6.29     980
  Addis Ababa                             8.81    1371    0.96     149    0.03       5     9.8    1525
  Dire Dawa                               4.42     688    1.87     291    0.12      18     6.4     997

Place of residence
  Urban                                   26.3    4094    4.29     668    0.12      18   30.70    4780
  Rural                                  53.77    8371   14.77    2299    0.76     118   69.30   10788

Religion
  Orthodox                               36.66    5708    5.54     862     0.1      16    42.3    6586
  Catholic                                0.87     135    0.22      35    0.01       2     1.1     172
  Protestant                             15.18    2363    2.78     433     0.1      15   18.06    2811
  Muslim                                 26.13    4068   10.25    1595    0.65     101   37.02    5764
  Others                                  1.23     191    0.27      67    0.01       2     1.5     235

Educational attainment
  No education                           37.93    5905   12.06    1877    0.69     107   50.67    7889
  Incomplete primary                     26.43    4114    4.93     768    0.15      24   31.51    4906
  Complete primary                        3.66    1877    0.48      75       0       0    4.14     645
  Incomplete secondary                     5.8     768    0.72     112    0.02       3    6.54    1018
  Complete secondary                      1.43     222    0.19      29       0       0    1.61     251
  Higher                                  4.82     751    0.68     106    0.01       2    5.52     859

Wealth index
  Poorest                                16.79    2614    5.95     927    0.35      55    23.1    3596
  Poorer                                 11.79    1835    3.04     474    0.12      18   14.95    2327
  Middle                                 11.15    1736    2.72     424    0.15      24   14.03    2184
  Richer                                 12.48    1943    2.82     439    0.13      20   15.43    2402
  Richest                                27.86    4337    4.52     703    0.12      19    32.5    3596

Currently pregnant
  No or unsure                           74.46   11592   16.95    2639    0.71     111   92.12   14342
  Yes                                     5.61     873    2.11     328    0.16      25    7.88    1226

Smoking cigarettes
  No                                     79.75   12415   18.98    2955    0.87     135    99.6   15505
  Yes                                     0.32      50    0.08      12    0.01       1     0.4      63

Current marital status
  Never in union                         22.53    3508    3.51     546    0.12      19   26.16    4073
  Married                                44.96    6999   12.54    1953    0.66     102   58.16    9054
  Living with partner                     3.56     554    0.72     112    0.03       4     4.3     670
  Widowed                                 2.63     410    0.88     137    0.02       3    3.53     550
  Divorced                                4.48     697       1     155    0.03       5     5.5     857
  No longer living together/separated     1.91     297    0.41      64    0.02       3    2.34     364

Currently working
  No                                     50.31    7832   13.28    2068     0.7     109   64.29   10009
  Yes                                    29.76    4633    5.71     899    0.17      27   35.71    5559

Total children ever born
  0≤tborn<2                              38.15    5939    7.08    1102    0.29      45   45.52    7086
  2≤tborn<4                              15.37    2393    4.06     632    0.22      35   19.66    3060
  4≤tborn<6                              11.41    1777    3.17     493    0.17      26   14.75    2296
  tborn≥6                                15.13    2356    4.75     740    0.19      30   20.08    3126

Body mass index
  Underweight                             21.2    3301    6.24     972    0.31      49   27.76    4322
  Normal/healthy weight                  52.12    8114   11.61    1808    0.54      84   64.27   10006
  Overweight                              5.37     836    0.97     151    0.01       2    6.35     989
  Obese                                   1.37     214    0.23      36    0.01       1    1.61     251
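
The percentages in Table 3 appear to be shares of all 15,568 tested women (for example, 3,023 of 15,568 is 19.42%), so they do not show directly how anemia status is distributed within a single covariate level. The sketch below, offered only as an illustration with hypothetical names, recomputes within-region percentages from the frequencies of three regions in Table 3.

```python
# Frequencies copied from Table 3: (non-anemic, mild-moderate, severe).
region_counts = {
    "Somali": (459, 316, 38),
    "Dire Dawa": (688, 291, 18),
    "Addis Ababa": (1371, 149, 5),
}


def within_group_percent(counts):
    """Convert one row of frequencies into percentages of that row's total."""
    total = sum(counts)
    return tuple(round(100.0 * count / total, 1) for count in counts)


for region, counts in region_counts.items():
    non_anemic, mild_moderate, severe = within_group_percent(counts)
    print(
        f"{region:12s} non-anemic {non_anemic}%  "
        f"mild-moderate {mild_moderate}%  severe {severe}%"
    )
```

These are crude, unadjusted shares; they describe the sample and are not a substitute for the adjusted odds ratios from the fitted model.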

3.4. Statistical model

To address the objectives of this study, an ordinal logistic regression model and the tests
related to it are employed.

3.4.1 Logistic Regression Model

The logistic model is a non-linear regression model and a special case of the generalized
linear model (McCullagh and Nelder, 1989); it is used where the linear regression assumptions
of normally distributed residuals with constant variance are not satisfied. It is a
statistical technique for predicting the probability of an event given a set of predictor
variables, and the procedure is more involved than linear regression. Logistic regression is
used to predict the probability of the dependent variable on the basis of the independent
variables; to determine the effect size of the independent variables on the dependent
variable; to rank the relative importance of the independent variables; to assess interaction
effects; and to understand the impact of covariate control variables (Hosmer and Lemeshow,
2000). The impact of predictor variables is usually explained in terms of odds ratios.

3.4.1.1 Multiple Logistic Regression Model

In a variety of regression applications, the response variable of interest has only two possible
qualitative outcomes and can therefore be represented by a binary indicator variable taking the
values 0 and 1. We denote this response variable by Y; its two possible values, 0 and 1, are
referred to in general terminology as failure and success. Logistic regression allows one to
predict a discrete outcome, such as group membership, from a set of predictor variables that
may be continuous, discrete, dichotomous, or a mixture of any of these (Hosmer D &
Lemeshow S, 2000).

Consider a collection of p independent variables, measured on n independent observations and
denoted by the vector X = (X1, X2, …, Xp). The ratio of the probability of success, P(x_i), to the
probability of failure, 1 − P(x_i),

$$\frac{P(x_i)}{1-P(x_i)}$$

is known as the odds of a success.

In terms of the odds, the logistic model can be written as:

$$\frac{P(x_i)}{1-P(x_i)}=\exp\left(\beta_0+\beta_1X_{i1}+\beta_2X_{i2}+\dots+\beta_pX_{ip}\right),\quad i=1,2,\dots,n$$

where exp(βj), j = 1, 2, …, p, is the factor by which the odds of occurrence of a success change for
a unit increase in the jth independent variable.

In which case,

$$P(x_i)=\frac{e^{\beta_0+\beta_1X_{i1}+\beta_2X_{i2}+\dots+\beta_pX_{ip}}}{1+e^{\beta_0+\beta_1X_{i1}+\beta_2X_{i2}+\dots+\beta_pX_{ip}}}=\frac{e^{X\beta}}{1+e^{X\beta}}=\frac{1}{1+e^{-X\beta}},\quad i=1,2,\dots,n$$

Where:

P(x_i) is the probability that the ith woman is anemic given her values of the p predictor variables, and

β is a vector of unknown coefficients, β = (β0, β1, β2, …, βp)^t.

The relationship between the response probability and the predictors is not linear. A linear
relationship is obtained, however, by applying the logit transformation to P(x_i), which is
given as:

$$\operatorname{logit}\{P(x_i)\}=\log\left\{\frac{P(x_i)}{1-P(x_i)}\right\}=\beta_0+\beta_1X_{i1}+\beta_2X_{i2}+\dots+\beta_pX_{ip},\quad i=1,2,\dots,n$$

Fitting the model requires estimating the parameter vector β = (β0, β1, β2, …, βp)^t.
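The relationship between the linear predictor, the probability and the odds can be illustrated with a short sketch. The coefficient and covariate values below are hypothetical and are not estimates from this study; the function names are chosen for illustration only.

```python
import math

def logistic(eta):
    """Inverse logit: map a linear predictor eta to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def logit(p):
    """Log odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Hypothetical values (assumption): intercept and two slopes.
beta = [-1.5, 0.4, -0.3]
x = [1.0, 1.0, 2.0]  # the leading 1.0 multiplies the intercept

eta = sum(b * xi for b, xi in zip(beta, x))  # beta_0 + beta_1*X_1 + beta_2*X_2
p = logistic(eta)                            # P(x_i)
odds = p / (1.0 - p)                         # equals exp(eta)

# exp(beta_j) is the factor by which the odds change per unit increase in X_j.
odds_ratio_x1 = math.exp(beta[1])
```

Applying `logit` to `p` returns the linear predictor, which is the sense in which the logit transformation makes the model linear in the coefficients.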

3.4.1.2 Assumptions of Logistic Regression


The validity of inferences drawn from modern statistical modeling techniques depends on the
assumptions of the statistical model being satisfied. For valid analysis, the model should satisfy
the following assumptions (Hosmer, Lemeshow, 2000).

i. It does not need a linear relationship between the dependent and independent
variables.
ii. The error terms need to be independent. Logistic regression requires each
observation to be independent; that is, the data should not come from any dependent
samples design. The model should also have little or no multicollinearity.
However, there is the option to include interaction effects of categorical variables in
the analysis and the model.
iii. Logistic regression assumes linearity between the independent variables and the log
odds; that is, the independent variables must be linearly related to the log odds.
Otherwise, the logistic regression underestimates the strength of the relationship
and may fail to detect it, so that the effect appears not significant (the null
hypothesis is not rejected) when it should be significant. A solution to this problem is
the categorization of the independent variables, that is, transforming metric variables
to ordinal level and then including them in the logistic regression model.
iv. Logistic regression requires quite large sample sizes, because with small samples
maximum likelihood estimates are less powerful than ordinary least squares
estimates.

3.4.2 Ordinal logistic regression
Ordinal logistic regression models have been shown to be suitable for analyzing data with an
ordinal response. The choice of the best model depends on the character of the variable, how well
the model meets its assumptions, the quality of the fit, and its capacity to provide a good
explanation with a small number of parameters to be estimated. Since anemia status is measured
on an ordinal scale (non-anemic, mild, moderate, and severe), the ordinal logistic regression
model is appropriate; Agresti (2002) advises using an ordinal logistic regression model when the
data are ordinal. Such models are appropriate tools for analyzing ordinal data and have proven
their potential in other research involving ordinal data (McCullagh, 1980; Anderson, 1984). It is
also recommended to avoid simple procedures, such as dichotomizing the response variable and
ignoring its ordering, which may lose information contained in the data and lead to incorrect or
less appropriate inferences.

Hence, let the response variable Y (anemia level) have k = 3 categories (in this study:
non-anemic, mild or moderate, and severe), coded as 1, 2 and 3, and let X = (x1, x2, …, xp) be the
vector of explanatory variables (covariates). Conditional on the values of the explanatory
variables, the k = 3 categories of Y occur with probabilities p1, p2, …, pk, that is,
pj = Pr(Y = j | x) for j = 1, 2, …, k.
There are several ordinal models, such as the proportional odds, partial proportional odds,
continuation-ratio and stereotype logistic models (Agresti, 1990; Ananth and Kleinbaum, 1997).
For this study the proportional odds model is used to analyze the data.
3.4.2.1 Proportional Odds (PO) Model
The proportional odds model compares the probability of an equal or smaller response, Y ≤ j,
with the probability of a larger response, Y > j. It models the ordinal nature of a dependent
variable through cumulative probabilities rather than through the probability of an individual
event; that is, it considers the probability of an event together with all events that are ordered
before it. When response categories are ordered, logits can directly incorporate the ordering.
The cumulative probability is the probability that the response Y falls in category j or below,
for each possible j. The jth cumulative probability is Pr(Y ≤ j) = p1 + p2 + … + pj.

The proportional odds model assumes that the cumulative logits can be represented as parallel
linear functions of independent variables. That is, for each cumulative logit the parameters of
the models are the same, except for the intercept. Consequently, according to the proportional
odds assumption, odds ratio is the same for all categories of the response variable. The odds
ratio of cumulative probabilities in the expression is called a cumulative odds ratio. The log of
the cumulative odds ratio is proportional to the distance between the values of the explanatory
variables, with the same proportionality constant applying to each cut-point. Because of this
property, this model is called a proportional odds model (McCullagh, 1980; Agresti, 1996).

The proportional odds model is now the most commonly used ordinal logistic regression model
for an ordinal response variable, for the following reasons (Agresti, 1996, 2002,
2007; Armstrong & Sloan, 1989; Long, 1997; Long & Freeze, 2006; McCullagh, 1980; McCullagh
& Nelder, 1989; Powers & Xie, 2000; O’Connell, 2006; and Wiley, 2006).

• When the Y codes are inverted (i.e., Y1 is coded as Yk, Y2 as Yk-1, and so on), only the signs
of the regression coefficients change.
• The regression coefficients do not change when response categories are collapsed or
the category definitions are changed.
• It produces the most easily interpretable regression coefficients, as exp(β) is the
homogeneous odds ratio over all cut-off points, summarizing the effect of the
explanatory factor X on the response Y in one single, frequently used measure.
• Standard statistical software with additional features, such as stepwise variable
selection procedures, is now available for calculation.

Let Y be a categorical response variable with k ordered categories and let x be a vector of p
explanatory variables (covariates). The cumulative probability is then given as

$$\Pr(Y\le j\mid x)=\pi_j(x),\quad j=1,2,\dots,k-1 \tag{1}$$

(Fagerland and Hosmer, 2008).

The cumulative probabilities reflect the ordering, with Pr(Y ≤ 1 | x) ≤ Pr(Y ≤ 2 | x) ≤ … ≤ Pr(Y ≤ k | x) = 1.
The odds of the first k − 1 cumulative probabilities are

$$\operatorname{Odds}\left[\Pr(Y\le j\mid x)\right]=\frac{\Pr(Y\le j\mid x)}{1-\Pr(Y\le j\mid x)}=\frac{\pi_j(x)}{1-\pi_j(x)},\quad j=1,2,\dots,k-1 \tag{2}$$

The proportional odds model models the log odds of the first k − 1 cumulative probabilities as:

$$\operatorname{logit}\left[\Pr(Y\le j\mid x)\right]=\log\left[\frac{\pi_j(x)}{1-\pi_j(x)}\right] \tag{3}$$

And the relationship between the cumulative logits of Y and the category probabilities
pj(x) = Pr(Y = j | x) is:

$$\log\left[\frac{\pi_j(x)}{1-\pi_j(x)}\right]=\log\left[\frac{p_1(x)+p_2(x)+\dots+p_j(x)}{p_{j+1}(x)+p_{j+2}(x)+\dots+p_k(x)}\right],\quad j=1,2,\dots,k-1 \tag{4}$$
Consider a collection of p explanatory variables denoted by X = (x1, x2, …, xp). The cumulative
probability is given by:

$$\pi_j(x)=\frac{\exp\left(\alpha_j+\beta_1x_1+\dots+\beta_px_p\right)}{1+\exp\left(\alpha_j+\beta_1x_1+\dots+\beta_px_p\right)} \tag{5}$$

Then the logit, or log odds, of Pr(Y ≤ j | x) = π_j(x) is modeled as a linear function of the
explanatory variables as:

$$\log\left[\frac{\pi_j(x)}{1-\pi_j(x)}\right]=\alpha_j+\beta_1x_1+\dots+\beta_px_p \tag{6}$$

equivalent with:

$$\log\left[\frac{\pi_j(x)}{1-\pi_j(x)}\right]=\alpha_j+\sum_{i=1}^{p}\beta_ix_i,\quad j=1,2,\dots,k-1 \text{ and } i=1,2,\dots,p \tag{7}$$

Therefore:

$$\operatorname{logit}\left[\Pr(Y\le j\mid x)\right]=\alpha_j+\sum_{i=1}^{p}\beta_ix_i,\quad j=1,2,\dots,k-1 \text{ and } i=1,2,\dots,p \tag{8}$$
Equation (6) is called the proportional odds model, and it simultaneously estimates multiple
equations of cumulative probability: one equation for each category of the dependent
variable except the last one.
In this model each logit has its own αj term, called the threshold value, whose value does not
depend on the values of the independent variables for a particular case. The logistic regression
coefficients indicate the direction and strength of the relationship between an independent
variable and the log odds of the dependent variable. These coefficients are somewhat harder to
interpret than linear regression coefficients, because they represent the effect of a unit change
in the independent variable on the log odds of the dependent variable, that is, the rate of
increase or decrease in the log odds. The coefficients of the independent variables are the same
across the cut-points of the response variable, which is why the model is called the
proportional odds model (McCullagh, 1980).
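As a minimal sketch of how the model turns thresholds and slopes into probabilities, the following computes the cumulative and category probabilities for k = 3 categories. The threshold and slope values are hypothetical (not estimates from this study), and the function name is an assumption made for illustration.

```python
import math

def po_probabilities(alphas, betas, x):
    """Cumulative and category probabilities under a proportional odds model.

    alphas: increasing thresholds alpha_1 < ... < alpha_{k-1}
    betas, x: slope coefficients and covariate values (same length)
    """
    xb = sum(b * v for b, v in zip(betas, x))
    # Pr(Y <= j | x) for j = 1..k-1, then Pr(Y <= k | x) = 1
    cum = [1.0 / (1.0 + math.exp(-(a + xb))) for a in alphas] + [1.0]
    # Pr(Y = j | x) as differences of successive cumulative probabilities
    probs = [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]
    return cum, probs

# Hypothetical values (assumption): two thresholds, one binary covariate.
alphas = [1.2, 4.5]
betas = [-0.4]
cum0, probs0 = po_probabilities(alphas, betas, [0.0])
cum1, probs1 = po_probabilities(alphas, betas, [1.0])

def odds(c):
    return c / (1.0 - c)

# Cumulative odds ratio at each cut-point; both equal exp(beta).
ratios = [odds(cum1[j]) / odds(cum0[j]) for j in range(2)]
```

The two entries of `ratios` coincide, which is the proportional odds property: the same odds ratio applies at every cut-point, while only the thresholds differ.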
3.4.2.2 Partial proportional odds model
The partial proportional odds model relaxes the strong assumption, made in the proportional
odds model, of identical log-odds ratios for the association between the response variable and
the predictor variables. Violation of the assumption of identical log-odds could lead to the
formulation of an incorrect or misspecified model (Peterson, 1990).
Let X1, X2, …, Xp be the complete set of covariates, of which q are known to have
non-proportional odds, with the remaining p − q having proportional odds. The β parameters are
the components of each of the covariate-specific log odds for which proportionality over the
cut-off points can be assumed.

The model is given as:

$$\operatorname{logit}\left[\Pr(Y\le j\mid x)\right]=\ln\left[\frac{\Pr(Y=1\mid x)+\dots+\Pr(Y=j\mid x)}{\Pr(Y=j+1\mid x)+\dots+\Pr(Y=k\mid x)}\right]=\ln\left[\frac{\sum_{m=1}^{j}\Pr(Y=m\mid x)}{\sum_{m=j+1}^{k}\Pr(Y=m\mid x)}\right] \tag{9}$$

$$=\alpha_j+\left[(\beta_1+\gamma_{j1})X_1+\dots+(\beta_q+\gamma_{jq})X_q+\beta_{q+1}X_{q+1}+\dots+\beta_pX_p\right],\quad j=1,2,\dots,k-1 \tag{10}$$

If the gamma parameters γji = 0 for every j and i, the model reduces to the proportional odds model.
In this model, for the first q covariates the coefficient depends on j, meaning that the
relationship between X and Y depends on the category. Consequently, ORs are estimated for
all the comparisons between response variable categories. For the remaining (p − q) covariates,
the coefficients (β) are independent of j, and thus only one OR is estimated.

3.4.2.3 Continuation ratio model


This model compares the probability of a response equal to a given category, Y = j, with the
probability of a greater response, Y > j, for j = 1, 2, …, k − 1 (Wolfe, 1998). An alternative
formulation compares each response with all lower responses, that is, Y = j versus Y < j for
j = 2, …, k (Agresti, 2007). This model has different intercepts and coefficients for each
comparison. It is more suitable when there is an intrinsic interest in a specific category of the
response variable.
The model is given as

$$\operatorname{logit}\left[\Pr(Y=j\mid Y\ge j,x)\right]=\ln\left[\frac{\Pr(Y=j\mid x)}{\Pr(Y=j+1\mid x)+\dots+\Pr(Y=k\mid x)}\right]=\ln\left[\frac{\Pr(Y=j\mid x)}{\sum_{m=j+1}^{k}\Pr(Y=m\mid x)}\right] \tag{12}$$

$$=\alpha_j+\left(\beta_{j1}X_1+\beta_{j2}X_2+\dots+\beta_{jp}X_p\right),\quad j=1,2,\dots,k-1 \tag{13}$$

3.4.2.4 Parallel lines assumption

One of the assumptions underlying ordinal regression is that the relationship between each pair
of outcome groups is the same. This means, ordinal regression assumes that the coefficients
that describe the relationship between, say, the lowest versus all higher categories of the
response variable are the same as those that describe the relationship between the next lowest
category and all higher categories, etc. This is called the proportional odds assumption or the
parallel regression assumption. Because the relationship between all pairs of groups is the
same, there is only one set of coefficients.

3.4.2.5 Odds Ratio

Suppose the response Y has k ordered categories (yj with j = 1, 2, …, k) and that two categories of
an explanatory variable, for example residence coded as rural (1) and urban (2), need to be
compared. For cut-point j, the OR is given by: (ref)

$$OR_j=\frac{\Pr(Y\le j\mid X^{(1)})\,/\,\left[1-\Pr(Y\le j\mid X^{(1)})\right]}{\Pr(Y\le j\mid X^{(2)})\,/\,\left[1-\Pr(Y\le j\mid X^{(2)})\right]}=\frac{\Pr(Y\le j\mid X^{(1)})\,/\,\Pr(Y>j\mid X^{(1)})}{\Pr(Y\le j\mid X^{(2)})\,/\,\Pr(Y>j\mid X^{(2)})}=\frac{\text{odds}(1)}{\text{odds}(2)} \tag{14}$$

This is the ratio between two odds in terms of cumulative probabilities. It measures the
strength of the effect of each independent variable in the model on the odds of the
dependent variable.
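As an illustration of equation (14), the sketch below computes crude cumulative odds ratios for rural versus urban women from the percentage distributions reported in Section 4.1 of this thesis. These are unadjusted ratios computed from descriptive percentages, not the model-based estimates, and the function name is an assumption made for illustration.

```python
def cumulative_odds(probs, j):
    """Odds of Y <= j versus Y > j, given category probabilities and a 1-based cut-point j."""
    below = sum(probs[:j])
    return below / (1.0 - below)

# Proportions non-anemic, mild to moderate, severe (from the descriptive results).
rural = [0.7760, 0.2131, 0.0109]
urban = [0.8565, 0.1397, 0.0038]

# One crude odds ratio per cut-point: j = 1 (non-anemic vs. any anemia)
# and j = 2 (non-anemic or mild/moderate vs. severe).
or_by_cut = [cumulative_odds(rural, j) / cumulative_odds(urban, j) for j in (1, 2)]
```

Both ratios are below 1, meaning that rural women have lower odds than urban women of falling at or below each cut-point, that is, of being in the less severe categories. The proportional odds model summarizes such cut-point-specific ratios with a single adjusted odds ratio.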

3.4.2.6 Method of parameter estimation for ordinal logistic regression

3.4.2.6.1 Maximum likelihood estimation

For logistic regression, the model coefficients are estimated by the maximum likelihood method,
and the likelihood equations are non-linear functions of the unknown parameters. The ordinal
logistic regression model is fitted to the observed responses using the maximum likelihood
approach. In general, the method of maximum likelihood produces the values of the unknown
parameters that best match the predicted and observed probability values. The Fisher scoring
algorithm, which is effective and well known, is usually used to obtain the ML estimates
(Walker and Duncan, 1967; McCullagh, 1980). Let (yi1, yi2, …, yik) be ordinal indicators of the
response for subject i, where yij = 1 if subject i falls in category j and 0 otherwise.
The likelihood function L is viewed as a function of the β and αj parameters. The parameters are
estimated by maximizing the likelihood or, more usually, by maximizing the logarithm of the
likelihood. The likelihood function is given by:
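$$L=\prod_{i=1}^{n}\prod_{j=1}^{k}\left[p_j(x_i)\right]^{y_{ij}}=\prod_{i=1}^{n}\prod_{j=1}^{k}\left[\Pr(Y\le j\mid x_i)-\Pr(Y\le j-1\mid x_i)\right]^{y_{ij}}$$

$$=\prod_{i=1}^{n}\prod_{j=1}^{k}\left[\frac{\exp(\alpha_j+\beta'x_i)}{1+\exp(\alpha_j+\beta'x_i)}-\frac{\exp(\alpha_{j-1}+\beta'x_i)}{1+\exp(\alpha_{j-1}+\beta'x_i)}\right]^{y_{ij}}$$

where pj(xi) = Pr(Y = j | xi) is the probability of category j for subject i, with the conventions
Pr(Y ≤ 0 | xi) = 0 and Pr(Y ≤ k | xi) = 1 (that is, α0 = −∞ and αk = +∞). Written out over the
categories,

$$L(\beta^{*})=\prod_{i=1}^{n}\left[p_1(x_i)^{y_{i1}}\,p_2(x_i)^{y_{i2}}\cdots p_k(x_i)^{y_{ik}}\right]$$

Here β* is used somewhat imprecisely to denote both the slope coefficients and the intercept
coefficients. It follows that the log-likelihood function is:

$$\ln L(\beta^{*})=\sum_{i=1}^{n}\left\{y_{i1}\ln\left[p_1(x_i)\right]+y_{i2}\ln\left[p_2(x_i)\right]+\dots+y_{ik}\ln\left[p_k(x_i)\right]\right\} \tag{15}$$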

The maximum possible value of the likelihood for a given dataset occurs if the model fits the
data exactly, that is, if the predicted counts coincide with the observed counts. Hence, by
maximizing (15) above we can estimate the parameter β*. To find the estimate of β* that
maximizes the log-likelihood, we differentiate it with respect to each component of β* and set
the resulting equations (one for each of the k − 1 intercepts and p slopes) to zero. Solutions are
obtained by iterative algorithms that are programmed in available logistic regression packages
such as SPSS, STATA, and SAS. In this work, we use SAS.
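The log-likelihood that such packages maximize can be sketched directly. The example below is only an illustration with made-up data and is not the SAS procedure used in this work; the function name and the toy responses are assumptions.

```python
import math

def po_log_likelihood(alphas, betas, X, y):
    """Log-likelihood of a proportional odds model.

    alphas: thresholds alpha_1 < ... < alpha_{k-1}
    betas:  slope coefficients
    X:      list of covariate lists, one per subject
    y:      list of observed categories coded 1..k
    """
    total = 0.0
    for xi, yi in zip(X, y):
        xb = sum(b * v for b, v in zip(betas, xi))
        # cum[j] = Pr(Y <= j | x_i), with cum[0] = 0 and cum[k] = 1
        cum = [0.0] + [1.0 / (1.0 + math.exp(-(a + xb))) for a in alphas] + [1.0]
        total += math.log(cum[yi] - cum[yi - 1])  # ln Pr(Y = y_i | x_i)
    return total

# Toy data (assumption): four subjects, one covariate, three categories.
X = [[0.0], [1.0], [0.0], [1.0]]
y = [1, 1, 2, 3]

# Intercept-only fit whose thresholds reproduce the sample proportions 0.5, 0.25, 0.25.
ll_null = po_log_likelihood([0.0, math.log(3.0)], [0.0], X, y)
```

With the slope fixed at zero and thresholds matching the observed cumulative proportions, the function returns the intercept-only log-likelihood; an iterative algorithm such as Fisher scoring would then search over the thresholds and slopes for the values that maximize it.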
3.4.2.7 The Wald Statistic

The Wald test is a way of testing the significance of particular explanatory variables in a
statistical model. The Wald test is one of a number of ways of testing whether the parameters
associated with a group of explanatory variables are zero ( Agresti, 2002).

If for a particular explanatory variable, or group of explanatory variables, the Wald test is
significant, then we can conclude that the parameters of these variables are not zero, so that
the variables should be included in the model.

In order to determine the effect of an individual explanatory variable in ordinal logistic
regression, the Wald statistic is given as:

$$W=\frac{\hat{\beta}_i^{\,2}}{\widehat{\operatorname{var}}(\hat{\beta}_i)},\quad i=1,2,\dots,p \tag{16}$$

Under the null hypothesis H0: βi = 0 (i = 1, 2,..., p), the statistic W is approximately distributed as
chi-square with one degree of freedom.
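A minimal sketch of the single-parameter Wald test follows. The coefficient and standard error are hypothetical values chosen for illustration, and the function name is an assumption; effects with several degrees of freedom require the chi-square distribution with the corresponding degrees of freedom instead.

```python
import math

def wald_test(beta_hat, se):
    """Wald chi-square statistic and p-value for H0: beta = 0 (one degree of freedom)."""
    w = (beta_hat / se) ** 2
    # For one degree of freedom, Pr(chi2 > w) = Pr(|Z| > sqrt(w)) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value

# Hypothetical estimate and standard error (assumption, not from this study).
w, p_value = wald_test(0.5, 0.25)
```

Here the estimate is two standard errors from zero, so the statistic equals 4 and the p-value falls just below 0.05.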

3.4.3 Model Evaluation

3.4.3.1 Goodness of Fit Test


For the selected model, before proceeding to examine the individual coefficients, we should
look at an overall test of the null hypothesis that the location coefficients for all of the variables
in the model are 0. The test is based on the change in −2 log-likelihood when the variables
are added to a model that contains only the intercept. The change in the likelihood function has a
chi-square distribution even when there are cells with small observed and predicted counts.
This value provides a measure of how well the model fits the data.

3.4.3.2 Test of Parallel Lines

The statistical hypotheses can be stated as follows:

H0: The slope coefficients in the model are the same across the response categories.

H1: Not H0.

For a POM to be valid, the assumption that all the logit surfaces are parallel must be tested. A
non-significant test of parallelism is taken as evidence that the logit surfaces are parallel and that
the odds ratios can be interpreted as constant across all possible cut points of the outcome.

3.4.3.3. The Likelihood Ratio Test


The likelihood ratio test statistic (G²) is commonly used for assessing the overall fit of the
logistic regression model (the overall significance of all coefficients in the model). A
likelihood is a probability, specifically the probability of the observed values of the
dependent variable given the observed values of the independent variables and the model
parameters. Like any probability, the likelihood varies from 0 to 1. The log likelihood (LL) is its
logarithm and varies from 0 to minus infinity (it is negative because the log of any number less
than 1 is negative). LL is calculated through iteration, using maximum likelihood estimation
(MLE). The log likelihood is the basis for tests of a logistic model. Because differences in −2LL
between nested models approximately follow a chi-square distribution, −2LL can be used for
assessing the significance of logistic regression, analogous to the use of the sum of squared
errors in OLS regression.

The likelihood-ratio test statistic is given by:

$$G^{2}=-2\log\left(\frac{l_0}{l_1}\right)=-2\left[\log(l_0)-\log(l_1)\right]$$

where l0 is the likelihood of the null (intercept-only) model and l1 is the likelihood of the
fitted model containing the covariates. Under the global null hypothesis,
H0: β1 = β2 = ... = βp = 0, the statistic G² approximately follows a chi-square distribution
with degrees of freedom equal to the number of slope parameters set to zero under H0.

3.4.3.4 Pearson and deviance Chi-Square statistic

Other diagnostics used to assess goodness of fit are the Pearson and deviance chi-square
statistics, computed by covariate pattern. In testing the goodness of fit of a model, the
null and alternative hypotheses can be stated as follows:

H0: The observed data are consistent with the fitted model.

H1: Not H0.

The Pearson chi-square statistic compares the model fit to the actual data and is defined by:

$$X^{2}=\sum_{m=1}^{l}\sum_{i=1}^{c}\frac{\left(O_{mi}-E_{mi}\right)^{2}}{E_{mi}}$$

which is distributed as chi-square with degrees of freedom equal to the number of unique
covariate patterns minus the number of model parameters, where O_mi is the observed frequency
and E_mi is the expected frequency for the (m, i)th cell of the frequency table aggregated over
the unique covariate patterns.

The deviance goodness of fit statistic for ordinal logistic regression has the form:

$$D=2\sum_{m=1}^{l}\sum_{i=1}^{c}O_{mi}\log\left(\frac{O_{mi}}{E_{mi}}\right)$$

which is distributed as chi-square with degrees of freedom equal to the number of unique
covariate patterns minus the number of model parameters.

Both goodness-of-fit statistics should be used only for models that have reasonably large
expected values in each cell. If the model fits well, the observed and expected cell counts are
similar, the value of each statistic is small, and the observed significance level is large. As usual,
large X² values provide evidence of lack of fit. Good models have a non-significant
goodness of fit statistic.
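The two statistics can be sketched as follows for an aggregated table of observed and expected frequencies. The counts are hypothetical and the function name is an assumption; the deviance is written with the conventional factor of 2.

```python
import math

def pearson_and_deviance(observed, expected):
    """Pearson X^2 and deviance D for tables with rows = covariate patterns, columns = categories."""
    x2 = 0.0
    dev = 0.0
    for o_row, e_row in zip(observed, expected):
        for o, e in zip(o_row, e_row):
            x2 += (o - e) ** 2 / e
            if o > 0:  # a cell with zero observed count contributes nothing to D
                dev += 2.0 * o * math.log(o / e)
    return x2, dev

# Hypothetical frequencies (assumption): two covariate patterns, three categories.
observed = [[30, 15, 5], [20, 20, 10]]
expected = [[28.0, 16.0, 6.0], [22.0, 19.0, 9.0]]

x2, dev = pearson_and_deviance(observed, expected)
```

When the observed and expected tables coincide, both statistics are zero; they grow as the fitted frequencies move away from the data.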

Chapter four

4. Statistical analysis and Discussion

4.1 Descriptive statistics

A total of 16515 women of reproductive age were tested for anemia status. Of them,
15568 had complete data on the variables of interest and were included in the analysis. Of
these, 3103 (19.93%) were anemic. Overall, 80.07% were non-anemic, 19.06% were mildly to
moderately anemic and 0.87% were severely anemic.

Table 4. Frequency distribution of anemia in women of reproductive age group.

Anemia status    Frequency    Percent
Non-anemic       12465        80.07
Mild-moderate    2967         19.06
Severe           136          0.87
Total            15568        100
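The percentages in Table 4 follow directly from the frequencies, as the short check below shows. The variable names are chosen for illustration.

```python
# Frequencies from Table 4.
counts = {"Non-anemic": 12465, "Mild-moderate": 2967, "Severe": 136}

total = sum(counts.values())
percent = {status: round(100.0 * n / total, 2) for status, n in counts.items()}

# Overall prevalence of anemia: mild to moderate plus severe.
anemic = counts["Mild-moderate"] + counts["Severe"]
anemic_percent = round(100.0 * anemic / total, 2)
```

The total, the three category percentages and the overall prevalence agree with the figures quoted in the text (15568 women, of whom 3103, or 19.93%, are anemic).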

Table 8 gives the percentage distribution of the prevalence of anemia among women of
reproductive age (15-49 years) by category of each independent variable. Based on that table,
we have the following.

By age group, 77.95%, 21.12% and 0.95% of women aged 25-29 years are non-anemic, mildly to
moderately anemic and severely anemic, respectively; the corresponding figures for women aged
15-19 years are 83.58%, 15.79% and 0.64%.

There are regional differences in anemia prevalence. In Afar, 63.57%, 34.76% and 1.67% of
women are non-anemic, mildly to moderately anemic and severely anemic, respectively; in
Somali, the figures are 56.46%, 38.87% and 4.67%. By place of residence, the distribution is
77.6%, 21.31% and 1.09% in rural areas and 85.65%, 13.97% and 0.38% in urban areas.

By educational attainment, 74.85%, 23.79% and 1.36% of women with no education are
non-anemic, mildly to moderately anemic and severely anemic, respectively, compared with
87.43%, 12.34% and 0.23% of women with higher education.

By religion, the distribution is 70.58%, 27.67% and 1.75% among Muslim women and 86.67%,
13.09% and 0.24% among Orthodox women. Similarly, by wealth status, the distribution is
72.69%, 25.78% and 1.53% in the poorest category and 85.73%, 13.9% and 0.38% in the
richest category.

By pregnancy status, 71.21%, 26.75% and 2.04% of pregnant women are non-anemic, mildly to
moderately anemic and severely anemic, respectively, compared with 80.83%, 18.4% and 0.77%
of non-pregnant women. The prevalence of anemia is thus higher among pregnant women.

By smoking status, the distribution is 80.07%, 19.06% and 0.87% among women who do not
smoke and 79.37%, 19.05% and 1.59% among women who smoke. Here the prevalence of
anemia is almost the same in the two groups.

By marital status, 86.13%, 13.41% and 0.47% of women who were never in union are
non-anemic, mildly to moderately anemic and severely anemic, respectively, compared with
77.3%, 21.57% and 1.13% of married women. By work status, the distribution is 78.25%, 20.66%
and 1.09% among women who are not working and 83.34%, 16.17% and 0.49% among women
who are working.

By total children ever born, the distribution is 83.81%, 15.55% and 0.64% among women with
fewer than two children and 75.37%, 23.67% and 0.96% among women with six or more
children. By body mass index, the distribution is 76.38%, 22.49% and 1.13% in the underweight
category and 84.53%, 15.27% and 0.2% in the overweight category.

4.2 Multiple Ordinal logistic regressions analysis result

The following table shows ordinal logistic regression analysis result containing all predictor
variables.

Table 5. Ordinal logistic regression analysis result having all variables.

Effect      DF    Wald Chi-Square    Pr > ChiSq
agroup      6     4.6982             0.5831
tplace      1     8.4222             0.0037
region      10    309.9030           <.0001
edattain    5     17.0099            0.0045
windex      4     8.6968             0.0691
religion    4     41.8377            <.0001
cpregn      1     27.5937            <.0001
marsta      5     18.6973            0.0022
bmaindex    3     23.7763            <.0001
cuwork      1     2.1839             0.1395
smcig       1     0.1238             0.7250
tborn       3     1.7571             0.6243

The chi-square test is suitable for selecting main effects, since it takes the ordinal nature of
the response variable into account. Normally, a significance level between 20% (0.2) and 25%
(0.25) is used for entering the covariates in the model (Hosmer and Lemeshow, 2000). Based on
the Wald chi-square tests in Table 5, agroup, smcig and tborn have p-values greater than 0.25.
These variables are therefore not included in the final model. After excluding these variables,
the ordinal logistic regression analysis result is as follows.

Table 6. Ordinal logistic regression analysis result excluding few covariates

Effect      DF    Wald Chi-Square    Pr > ChiSq
tplace      1     8.5022             0.0035
region      10    311.2597           <.0001
edattain    5     20.0595            0.0012
windex      4     8.5812             0.0725
religion    4     42.6594            <.0001
cpregn      1     29.1551            <.0001
marsta      5     27.1235            <.0001
bmaindex    3     23.6127            <.0001
cuwork      2     3.2681             0.1951

Table 7. Model fit result

Score Test for the Proportional Odds Assumption

Chi-Square    DF    Pr > ChiSq
49.3123       35    0.0550

Deviance and Pearson Goodness-of-Fit Statistics

Criterion    Value        DF      Value/DF    Pr > ChiSq
Deviance     5486.8873    9905    0.5540      1.0000
Pearson      8279.2085    9905    0.8359      1.0000

Model Fit Statistics

Criterion    Intercept Only    Intercept and Covariates
AIC          16671.717         15583.291
SC           16687.023         15866.451
-2 Log L     16667.717         15509.291

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio    1158.4262     35    <.0001
Score               1195.9417     35    <.0001
Wald                1094.7980     35    <.0001

4.3 Evaluation of the model


Before examining the estimates of the model, goodness-of-fit measures are examined to assess
the appropriateness of the model.

Test of Proportional Odds Assumption

Ordinal logistic regression assumes that the beta estimates are the same for all cut-points of
the response variable. The score test of this assumption is specific to ordinal logistic regression
and is used to assess the appropriateness of the model. It tests the null hypothesis that the
odds are proportional. If the p-value is below the significance level, the null hypothesis is
rejected and the results of the proportional odds model may not be reliable. From Table 7, the
score test of the proportional odds assumption has a p-value (p = 0.0550) greater than the
significance level α = 0.05, so it is not significant.

Hence, it is reasonable not to reject the null hypothesis that the parameters are the same for all
categories for the model considered. This indicates that the proportional odds model is
adequate for the data.
The Pearson chi-square statistic compares the observed frequencies with those expected under
the fitted model. The deviance is a likelihood-ratio statistic used under full maximum likelihood
and can be regarded as a measure of lack of fit between model and data. The null
hypothesis states that the observed data are consistent with the fitted model. Since both tests
are non-significant (p = 1.00 > 0.05), the null hypothesis is not rejected and one concludes that
the observed data are consistent with the values estimated by the fitted model. That is, the
deviance and Pearson chi-square statistics suggest that the model fits the data well.
From Table 7, the test of the global null hypothesis that all beta coefficients are zero
indicates whether any beta estimate is significantly different from zero. It tests the fit of the
model against a null, or intercept-only, model. If the p-value is low, then at least one beta
coefficient is different from zero. This test is highly significant (p < .0001), indicating that at
least one of the covariates has an effect on anemia prevalence.
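The likelihood ratio statistic and the AIC values in Table 7 can be cross-checked from the reported −2 Log L values. The sketch below assumes that the fitted model has two intercepts (for k = 3 response categories) in addition to the 35 slope parameters; this count is inferred from the reported degrees of freedom rather than stated in the output.

```python
# -2 Log L values from Table 7.
neg2ll_intercept_only = 16667.717
neg2ll_with_covariates = 15509.291

# Likelihood ratio statistic: drop in -2 Log L when the covariates are added.
g2 = neg2ll_intercept_only - neg2ll_with_covariates

# Parameter counts (assumption: two thresholds for three response categories).
n_intercepts = 2
n_slopes = 35  # degrees of freedom reported for the global tests

# AIC = -2 Log L + 2 * (number of estimated parameters)
aic_intercept_only = neg2ll_intercept_only + 2 * n_intercepts
aic_with_covariates = neg2ll_with_covariates + 2 * (n_intercepts + n_slopes)
```

Under these assumptions the computed quantities reproduce the reported likelihood ratio chi-square and both AIC values up to rounding.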

4.4 Interpretation of the results

When the proportional odds model is used in the analysis of ordinal data, the coefficients of the
explanatory variables in the model are interpreted as the logarithm of the ratio of the cumulative
odds of the response variable. This means that estimates of this odds ratio, and the
corresponding confidence intervals, can easily be found from the fitted model. The
interpretations of the parameters for the covariates found significant in the final model
are described in the following section, and each comparison is made with the reference category.
A positive slope indicates a tendency for the response level to increase as the variable increases,
and a negative slope indicates a tendency for the response level to decrease as the variable
increases.

The results in Tables 5 and 6 show that region, place of residence, educational
attainment, religion, pregnancy status, marital status and body mass index are statistically
significant covariates for the prevalence of anemia. Based on Table 9 and Table 10, we can state
the following.

The log odds of anemia status for women from Somali are increased by 1.0305, with an
estimated odds ratio of 2.803. This indicates that women from Somali have 2.803 times the
odds of being in a more severe anemia category compared with women from Addis Ababa,
holding other covariates constant. The odds ratio could be as low as 2.187 and as high as 3.591
with 95% confidence.
Also, the log odds of anemia status for women from Afar are increased by 0.5726, with an
estimated odds ratio of 1.782. This indicates that women from Afar have 1.782 times the odds
of being in a more severe anemia category compared with women from Addis Ababa, holding
other covariates constant. The odds ratio could be as low as 1.392 and as high as 2.259 with
95% confidence.
The log odds of anemia status for women from Southern Nations, Nationalities and Peoples are
decreased by 0.6715, with an estimated odds ratio of 0.511. This indicates that the odds of
being in a more severe anemia category (mild to moderate or severe) are 49% lower for these
women than for women from Addis Ababa. The odds ratio could be as low as 0.392 and as high
as 0.666 with 95% confidence.
The log odds of anemia status for women from Dire Dawa are increased by 0.8502, with an
estimated odds ratio of 2.34. This indicates that women from Dire Dawa have 2.34 times the
odds of being in a more severe anemia category compared with women from Addis Ababa,
holding other covariates constant. The odds ratio could be as low as 1.861 and as high as 2.943
with 95% confidence.
The log odds of anemia status for rural women are increased by 0.2654, with an estimated
odds ratio of 1.304. This indicates that rural women have 1.304 times the odds of being in a
more severe anemia category compared with urban women. The odds ratio could be as low
as 1.091 and as high as 1.55 with 95% confidence.
The log odds of more severe anemia for women with incomplete secondary education are lower
by 0.339, with an estimated odds ratio of 0.712. The odds of being severely anemic, and
likewise the odds of being severely or mild-to-moderately anemic, are therefore 29% lower
for these women than for women with no education. The odds ratio could be as low as 0.568
and as high as 0.894 with 95% confidence.
The log odds of more severe anemia for Catholic women are higher by 0.643, with an estimated
odds ratio of 1.902. Catholic women therefore have 1.902 times the odds of being severely
anemic compared with Orthodox women, holding other covariates constant. The odds ratio
could be as low as 1.282 and as high as 2.823 with 95% confidence.
Similarly, the log odds of more severe anemia for Muslim women are higher by 0.4012, with an
estimated odds ratio of 1.494. Muslim women therefore have 1.494 times the odds of being
severely anemic compared with Orthodox women, holding other covariates constant. The odds
ratio could be as low as 1.315 and as high as 1.697 with 95% confidence.

The log odds of more severe anemia for pregnant women are higher by 0.3868, with an
estimated odds ratio of 1.472. Pregnant women therefore have 1.472 times the odds of being
severely anemic compared with women who are not pregnant, holding other covariates
constant. The odds ratio could be as low as 1.279 and as high as 1.694 with 95% confidence.
The log odds of more severe anemia for married, widowed and separated (no longer living
together) women are higher by 0.2177, 0.5406 and 0.3787, with estimated odds ratios of 1.243,
1.717 and 1.460, respectively. Compared with women who have never been in a union, married
women have 1.243 times, widowed women 1.717 times and separated women 1.460 times the
odds of being severely anemic. In all cases the other variables are held constant.
The log odds of more severe anemia for normal-weight women are lower by 0.1795, with an
estimated odds ratio of 0.836, so their odds of being severely anemic are 16.4% lower than
those of underweight women. Likewise, the odds for overweight women (OR = 0.707) are 29%
lower than those of underweight women, holding other covariates constant. The 95%
confidence intervals for these two odds ratios are 0.761 to 0.917 and 0.577 to 0.868.
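
The "percent lower" statements above follow from the odds ratio as 100 × (OR − 1), where OR is the exponential of the coefficient. The short Python sketch below reproduces this arithmetic for the body mass index coefficients in Table 9; it is an illustration rather than part of the original analysis, and the function name `percent_change_in_odds` is hypothetical.

```python
import math

def percent_change_in_odds(estimate):
    """Percentage change in the odds implied by a log-odds coefficient."""
    return (math.exp(estimate) - 1.0) * 100.0

# Coefficients from Table 9; the reference category is underweight
bmi_estimates = {"normal weight": -0.1795, "overweight": -0.3461}

for label, beta in bmi_estimates.items():
    change = percent_change_in_odds(beta)
    print(f"{label}: {change:.1f}% change in odds relative to underweight")
```

A negative value means lower odds than the reference group, so the two results correspond to the reductions of about 16.4% and 29% reported in the text.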

4.5. Discussion of results

This study showed that the prevalence of anemia differs significantly between women in
Somali, Dire Dawa and Afar and women in Addis Ababa, with odds ratios of 2.803, 2.340 and
1.773, respectively. Women living in these regions therefore face a high risk of anemia,
whereas women from Southern Nations, Nationalities and Peoples (SNNP) and Tigray are less
likely to be severely anemic than Addis Ababa women. The results also show a statistically
significant difference in the prevalence of anemia between rural and urban women of
reproductive age: rural women face the higher risk.

The risk of anemia is highest among women of reproductive age with no education. Women
who completed primary school have 32% lower odds of being severely anemic than women
with no education. Similarly, women who went beyond primary school but did not complete
secondary school have 29% lower odds than women with no education. Hence, empowering
women through education would contribute to reducing the problem.
A woman's religion has a statistically significant effect on the prevalence of anemia.
Compared with Orthodox women, Catholic, Muslim and Protestant women have 1.902, 1.494
and 1.311 times the odds of being severely anemic, respectively. Catholic women are therefore
at higher risk of severe anemia than the other groups. Kendie (2012) also reported that the
highest prevalence rates were observed in these religious groups.

The odds of being anemic are nearly 1.5 times as high among pregnant women as among
non-pregnant women. Pregnant women are therefore at higher risk of severe anemia.
Women with severe anemia can experience difficulty meeting oxygen-transport requirements
near and at delivery, especially if significant hemorrhaging occurs. This may be an underlying
cause of maternal death and of prenatal and perinatal infant loss (Fleming, 1987; Thonneau et
al., 1992; Omar et al., 1994). WHO (1992) also reported that unfavorable pregnancy outcomes
are more common in anemic women than in non-anemic women.

The results also showed that underweight women of reproductive age have a higher risk of
anemia than women in any other body mass index category. This agrees with a study
conducted in India (Bentley and Griffiths, 2003).

Chapter five

5. Conclusion and recommendation

5.1 Conclusion
This study set out to identify the determinants of anemia prevalence among women of
reproductive age in Ethiopia. Because the response is ordinal, an ordinal logistic regression
model with 12 covariates was applied, and the appropriateness of the model and the validity
of its assumptions were examined. The highest prevalence of anemia was found in Somali,
Afar and Dire Dawa. According to the analysis, region, place of residence, educational
attainment, religion, pregnancy status, marital status and body mass index are statistically
significant covariates of the prevalence of anemia, whereas age group, cigarette smoking,
wealth index, working status and total children ever born are not.

5.2 Recommendation

• Priority should be given to areas where the prevalence of anemia is high, such as Somali,
  Dire Dawa and Afar.

• Women should be empowered through education and improved economic status, and made
  aware of the problem.

• Well-planned programs that target women in all socioeconomic groups should be established.

• Up-to-date, evidence-based strategies are needed to control anemia among women of
  reproductive age living in both rural and urban areas.

• Programs aiming to control anemia need to be designed according to the specific
  contributing factors in particular areas.

REFERENCES

1. Agresti, A. (2007). An introduction to categorical data analysis, 2nd ed. Wiley, New York.
2. Agresti, A. (2002). Categorical data analysis, 2nd ed. John Wiley & Sons, New York.
3. Agresti, A. (2010). The analysis of ordinal categorical data, 2nd ed. Wiley, New York.
4. Agresti, A. (1996). An introduction to categorical data analysis. John Wiley & Sons,
   New York.
5. Agresti, A. (1989). Tutorial on modeling ordered categorical response data.
   Psychological Bulletin; 105: 290–301.
6. Alem M, Kena T, Baye N, Ahmed R, Tilahun S (2013). Prevalence of Anemia and
   Associated Risk Factors among Adult HIV Patients at the Anti-Retroviral Therapy Clinic
   at the University of Gondar Hospital, Gondar, Northwest Ethiopia. 2: 662 doi
7. Allen, L (1997). Pregnancy and iron deficiency: unresolved issues. Nutr. Rev., 55, 4
8. Ananth CV, Kleinbaum DG (1997). Regression models for ordinal responses: a review
   of methods and applications. International Journal of Epidemiology; 26(6): 1323–33.
9. Anderson JA (1984). Regression and ordered categorical variables (with discussion).
   Journal of Royal Statistical Society Series B; 46: 1–30.
10. Armstrong and Sloan (1989). Ordinal regression models for epidemiologic data. Am J
    Epidemiol; 129: 191–204.
11. Bender R, Grouven U (1998). Using binary logistic regression models for ordinal data
    with non-proportional odds. J Clin Epidemiol; 51(10): 809–816.
12. Bentley ME, Griffiths PL (2003). The burden of anemia among women in India.
    European Journal of Clinical Nutrition; 57: 52–60.
13. Bhargava A, Bouis HE, Scrimshaw NS (2001). Dietary intakes and socioeconomic
    factors are associated with the hemoglobin concentration of Bangladeshi women. J
    Nutr; 131: 758–764.
14. Burnham, K. P. and Anderson, D. R. (2002). Model selection and multi-model inference.
    Springer, New York.
15. Central Statistical Agency [Ethiopia] and ORC Macro (2006). Ethiopia Demographic
    and Health Survey 2005. Addis Ababa, Ethiopia, and Calverton, Maryland, USA:
    Central Statistical Agency and ORC Macro.
16. Central Statistical Agency [Ethiopia] and ORC Macro (2012). Ethiopia Demographic
    and Health Survey 2011. Addis Ababa, Ethiopia, and Calverton, Maryland, USA:
    Central Statistical Agency and ORC Macro.
17. DeMaeyer E, Adiels-Tegman M (1985). The prevalence of anaemia in the world.
    World Health Statistics Quarterly; 38: 302–316.
18. Demaeyer M (1998). Prevention and controlling iron deficiency anaemia through
    primary health care. Geneva.
19. Dey and Goswami (2010). Prevalence of anaemia in women of reproductive age in
    Meghalaya, India: a logistic regression analysis. 40(5): 783–789.
20. Fagerland MW, Hosmer DW, Bofin AM (2008). Multinomial goodness-of-fit tests for
    logistic regression models. Statistics in Medicine; 27: 4238–4253.
21. Fleming, A.F. (1987). Maternal anemia in northern Nigeria: Causes and solutions.
    World Health Forum; 8(3): 339–343.
22. Gillespie, S & Johnston, J (1998). Expert Consultation on Anemia Determinants and
    Interventions. Ottawa: The Micronutrient Initiative.
23. Haidar J, Nekatibeb H, Urga K (1999). Iron Deficiency Anemia in Pregnant and
    Lactating Mothers in Rural Ethiopia. East Africa Med J; 76(11): 618–22.
24. Hinderaker SG, Olsen BE, Bergsjø P, Lie RT, Gasheka P, Kvåle G (2001). Anemia in
    Pregnancy in the Highlands of Tanzania. Acta Obstet Gynecol Scand; 80: 18–26.
25. Hosmer DW, Lemeshow S (2000). Applied Logistic Regression, 2nd ed. Wiley Series
    in Probability and Statistics. Wiley, New York.
26. International Nutritional Anemia Consultative Group (INACG) (1989). Iron deficiency in
    women. International Nutritional Anemia Consultative Group, World Health
    Organization. Geneva, Switzerland.
27. Jemal H. (2010). Prevalence of anemia, deficiencies of iron and folic acid and their
    determinants in Ethiopian women. J Health Popul Nutr; 28(4): 359–368.
28. Kayihan Pala & Nilgun Dundar (2007). Prevalence & risk factors of anaemia among
    women of reproductive age in Bursa, Turkey, pp 282–286.
29. McCullagh P (1980). Regression models for ordinal data (with discussion). Journal of
    Royal Statistical Society Series B; 42: 109–42.
30. McLean E, Cogswell M, Egli I, Wojdyla D, de Benoist B (2009). Worldwide prevalence
    of anaemia, WHO Vitamin and Mineral Nutrition Information System, 1993–2005.
    Public Health Nutr; 12(4): 444–54.
31. Mishra P, Ahluwalia SK, Garg PK, Kar R, Panda GK (2012). The Prevalence of Anaemia
    among Reproductive Age Group (15-45 Yrs) Women in A PHC of Rural Field Practice
    Area of MM Medical College, Ambala, India.
32. Ngnie-Teta I, Kuate-Defo B and Receveur O (2009). Multilevel modeling of socio-
    demographic predictors of various levels of anaemia among women in Mali. Public
    Health Nutr; 12: 1462–69.
33. O'Connell, A.A. (2000). Methods for modeling ordinal outcome variables.
    Measurement and Evaluation in Counseling and Development; 33(3): 170–193.
34. Peterson BL, Harrell FE (1990). Partial proportional odds models for ordinal response
    variables. Applied Statistics; 39(2): 205–17. DOI: 10.2307/2347760
35. Pulkstenis E, Robinson TJ (2004). Goodness-of-fit tests for ordinal response regression
    models. Statistics in Medicine; 23: 999–1014.
36. Sharmanov, A. (2000). Anemia testing manual for population-based surveys. Calverton,
    Maryland: Macro International Inc.
37. Stoltzfus, RJ (1997). Rethinking anemia surveillance. Lancet; 349: 1764–1766.
38. Tadios Y. (1996). Prevalence of anemia and its risk factors among pregnant mothers
    following antenatal in Asendabo health center.
39. United Nations Development Programme (2007). Measuring human development: A
    primer. New York: UNDP.
40. Wolfe, R. (1998). Continuation-ratio models for ordinal response data. Stata
    Technical Bulletin, STB-44, 18–21.
41. World Health Organization (1992). The Prevalence of Anemia in Women: A Tabulation
    of Available Information, 2nd ed.
42. World Health Organization (2008). Centers for Disease Control and Prevention.
    Worldwide Prevalence of Anemia: WHO Global Database of Anemia.
43. World Health Organization/UNICEF (2001). Iron deficiency anemia: assessment,
    prevention and control: a guide for programme managers.

Appendixes

Descriptive statistics

Table 8: Percentage distribution of anemia status among women of reproductive age
(15-49 years), by category of each explanatory variable.

                                             Anemia status (percentage)
Variable and category                   Non-anemic   Mild-moderate    Severe

Age in 5-year groups
  15-19                                      83.58           15.79      0.64
  20-24                                      81.44           17.61      0.95
  25-29                                      77.95           21.12      0.94
  30-34                                      78.32           20.55      1.12
  35-39                                      78.08           20.80      1.13
  40-44                                      77.91           21.37      0.72
  45-49                                      79.64           19.79      0.57

Region
  Tigray                                     87.50           12.09      0.41
  Afar                                       63.57           34.76      1.67
  Amhara                                     82.25           17.35      0.40
  Oromiya                                    80.90           18.38      0.73
  Somali                                     56.46           38.87      4.67
  Benishangul-Gumuz                          80.54           18.96      0.49
  SNNP                                       88.52           11.12      0.36
  Gambela                                    80.13           19.51      0.37
  Harari                                     80.41           18.88      0.71
  Addis Ababa                                89.90            9.77      0.33
  Dire Dawa                                  69.01           29.19      1.81

Place of residence
  Urban                                      85.65           13.97      0.38
  Rural                                      77.60           21.31      1.09

Religion
  Orthodox                                   86.67           13.09      0.24
  Catholic                                   78.49           20.35      1.16
  Protestant                                 84.06           15.40      0.53
  Muslim                                     70.58           27.67      1.75
  Others                                     81.28           17.87      0.85

Educational attainment
  No education                               74.85           23.79      1.36
  Incomplete primary                         83.86           15.65      0.49
  Complete primary                           88.37           11.63      0.00
  Incomplete secondary                       88.70           11.00      0.29
  Complete secondary                         88.45           11.55      0.00
  Higher                                     87.43           12.34      0.23

Wealth index
  Poorest                                    72.69           25.78      1.53
  Poorer                                     78.86           20.37      0.77
  Middle                                     79.49           19.41      1.10
  Richer                                     80.89           18.28      0.83
  Richest                                    85.73           13.90      0.38

Currently pregnant
  No or unsure                               80.83           18.40      0.77
  Yes                                        71.21           26.75      2.04

Smoking cigarettes
  No                                         80.07           19.06      0.87
  Yes                                        79.37           19.05      1.59

Current marital status
  Never in union                             86.13           13.41      0.47
  Married                                    77.30           21.57      1.13
  Living with partner                        82.69           16.72      0.60
  Widowed                                    74.55           24.91      0.55
  Divorced                                   81.33           18.09      0.58
  No longer living together/separated        81.59           17.58      0.82

Currently working
  No                                         78.25           20.66      1.09
  Yes                                        83.34           16.17      0.49

Total children ever born (tborn)
  0 ≤ tborn < 2                              83.81           15.55      0.64
  2 ≤ tborn < 4                              78.20           20.65      1.14
  4 ≤ tborn < 6                              77.40           21.47      1.13
  tborn ≥ 6                                  75.37           23.67      0.96

Body mass index
  Underweight                                76.38           22.49      1.13
  Normal/healthy weight                      81.09           18.07      0.84
  Overweight                                 84.53           15.27      0.20
  Obese                                      85.26           14.34      0.40

Table 9: Analysis of Maximum Likelihood Estimates

                            Standard        Wald                95% Confidence Limits
Parameter     DF  Estimate     Error  Chi-Square  Pr > ChiSq     Lower     Upper

Intercept 3    1   -4.6862    0.2608    322.9530      <.0001   -5.1973   -4.1751
Intercept 2    1   -1.2288    0.2469     24.7648      <.0001   -1.7128   -0.7448
tplace 2       1    0.2654    0.0910      8.5022      0.0035    0.0870    0.4437
region 3       1   -0.0201    0.1218      0.0273      0.8687   -0.2589    0.2186
region 9       1    0.2316    0.1260      3.3806      0.0660   -0.0153    0.4784
region 5       1    1.0305    0.1265     66.3304      <.0001    0.7825    1.2785
region 6       1   -0.1103    0.1282      0.7397      0.3898   -0.3616    0.1410
region 4       1   -0.0463    0.1188      0.1521      0.6965   -0.2791    0.1864
region 7       1   -0.6715    0.1354     24.5804      <.0001   -0.9369   -0.4060
region 1       1   -0.3366    0.1298      6.7232      0.0095   -0.5910   -0.0822
region 2       1    0.5726    0.1235     21.4878      <.0001    0.3305    0.8147
region 8       1   -0.0302    0.1381      0.0478      0.8270   -0.3009    0.2405
region 11      1    0.8502    0.1170     52.7857      <.0001    0.6209    1.0796
edattain 1     1   -0.1872    0.0545     11.7935      0.0006   -0.2941   -0.0804
edattain 2     1   -0.3890    0.1346      8.3555      0.0038   -0.6528   -0.1252
edattain 3     1   -0.3390    0.1159      8.5479      0.0035   -0.5662   -0.1117
edattain 5     1   -0.1149    0.1240      0.8598      0.3538   -0.3579    0.1280
edattain 4     1   -0.2333    0.2118      1.2131      0.2707   -0.6485    0.1819
windex 3       1   0.00935    0.0757      0.0153      0.9016   -0.1390    0.1577
windex 1       1    0.0552    0.0680      0.6602      0.4165   -0.0780    0.1885
windex 5       1   -0.2176    0.1027      4.4922      0.0341   -0.4189   -0.0164
windex 4       1   -0.0679    0.0760      0.7982      0.3716   -0.2168    0.0810
religion 3     1    0.4012    0.0651     37.9973      <.0001    0.2737    0.5288
religion 2     1    0.2705    0.0840     10.3807      0.0013    0.1059    0.4351
religion 4     1    0.2562    0.1817      1.9868      0.1587   -0.1000    0.6124
religion 1     1    0.6430    0.2014     10.1956      0.0014    0.2483    1.0378
cpregn 1       1    0.3868    0.0716     29.1551      <.0001    0.2464    0.5271
marsta 1       1    0.2177    0.0615     12.5330      0.0004    0.0972    0.3382
marsta 4       1    0.2328    0.1057      4.8530      0.0276    0.0257    0.4400
marsta 2       1    0.2603    0.1184      4.8326      0.0279    0.0282    0.4923
marsta 5       1    0.3787    0.1481      6.5343      0.0106    0.0883    0.6690
marsta 3       1    0.5406    0.1165     21.5208      <.0001    0.3122    0.7690
bmaindex 2     1   -0.1795    0.0474     14.3136      0.0002   -0.2725   -0.0865
bmaindex 3     1   -0.3461    0.1042     11.0433      0.0009   -0.5502   -0.1420
bmaindex 4     1   -0.5748    0.1926      8.9066      0.0028   -0.9523   -0.1973
cuwork 0       1    0.0664    0.0473      1.9670      0.1608   -0.0264    0.1592
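
The two intercepts in Table 9 can be turned into fitted category probabilities under the proportional odds model. The Python sketch below is illustrative only and rests on assumptions that the thesis does not state explicitly: that "Intercept 3" is the cut-point for severe anemia and "Intercept 2" the cut-point for mild-moderate or severe anemia (as would follow from the `descending` option with anemia level coded 1 to 3), and that Table 9 lists every term in the fitted model. The function names are hypothetical.

```python
import math

def logistic(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))

# Intercepts as printed in Table 9 (interpretation assumed, see the text above)
INTERCEPT_SEVERE = -4.6862       # "Intercept 3": severe anemia
INTERCEPT_ANY_ANEMIA = -1.2288   # "Intercept 2": mild-moderate or severe anemia

def anemia_probabilities(linear_predictor=0.0):
    """Category probabilities for a given sum of covariate coefficients.

    A linear predictor of 0 corresponds to a woman in the reference
    category of every covariate.
    """
    p_severe = logistic(INTERCEPT_SEVERE + linear_predictor)
    p_any = logistic(INTERCEPT_ANY_ANEMIA + linear_predictor)
    return {
        "non-anemic": 1.0 - p_any,
        "mild-moderate": p_any - p_severe,
        "severe": p_severe,
    }

reference = anemia_probabilities()
# Rural woman from Somali, otherwise at reference levels: region 5 + tplace 2
somali_rural = anemia_probabilities(1.0305 + 0.2654)

for name, probs in [("reference", reference), ("Somali, rural", somali_rural)]:
    print(name, {k: round(v, 3) for k, v in probs.items()})
```

Under these assumptions the all-reference profile has a fitted probability of severe anemia of a little under 1%, and adding the Somali and rural coefficients shifts probability towards the anemic categories, which is the direction the odds ratios describe.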

Table 10: Odds Ratio Estimates

                       Point      95% Wald Confidence Limits
Effect              Estimate         Lower         Upper

tplace 2 vs 1          1.304         1.091         1.55
region 3 vs 10         0.980         0.772         1.244
region 9 vs 10         1.261         0.985         1.614
region 5 vs 10         2.803         2.187         3.591
region 6 vs 10         0.896         0.697         1.151
region 4 vs 10         0.955         0.756         1.205
region 7 vs 10         0.511         0.392         0.666
region 1 vs 10         0.714         0.554         0.921
region 2 vs 10         1.773         1.392         2.259
region 8 vs 10         0.970         0.740         1.272
region 11 vs 10        2.340         1.861         2.943
edattain 1 vs 0        0.829         0.745         0.923
edattain 2 vs 0        0.678         0.521         0.882
edattain 3 vs 0        0.712         0.568         0.894
edattain 5 vs 0        0.891         0.699         1.137
edattain 4 vs 0        0.792         0.523         1.199
windex 3 vs 2          1.009         0.870         1.171
windex 1 vs 2          1.057         0.925         1.207
windex 5 vs 2          0.804         0.658         0.984
windex 4 vs 2          0.934         0.805         1.084
religion 3 vs 0        1.494         1.315         1.697
religion 2 vs 0        1.311         1.112         1.545
religion 4 vs 0        1.292         0.905         1.845
religion 1 vs 0        1.902         1.282         2.823
cpregn 1 vs 0          1.472         1.279         1.694
marsta 1 vs 0          1.243         1.102         1.402
marsta 4 vs 0          1.262         1.026         1.553
marsta 2 vs 0          1.297         1.029         1.636
marsta 5 vs 0          1.460         1.092         1.952
marsta 3 vs 0          1.717         1.366         2.158
bmaindex 2 vs 1        0.836         0.761         0.917
bmaindex 3 vs 1        0.707         0.577         0.868
bmaindex 4 vs 1        0.563         0.386         0.821
cuwork 0 vs 1          1.069         0.974         1.173

SAS code

/* Copy the analysis data set into the WORK library */
data ord;
    set Sasuser.Girma3;
run;

/* List the observations */
proc print data=ord;
    title "ordered data";
run;

/* Frequency table of current pregnancy status */
proc freq data=ord;
    tables cpregn;
    title1 'Simple Frequency Tables';
run;

/* Cumulative logit (proportional odds) model with all 12 covariates */
proc logistic data=ord;
    class agroup(ref='1') tplace(ref='2') region(ref='14')
          edattain(ref='0') windex(ref='2') religion(ref='0')
          cpregn(ref='0') marsta(ref='1') bmaindex(ref='1')
          cuwork(ref='1') smcig(ref='0') tborn(ref='1')
          / order=data param=ref ref=first;
    model anlevel(descending) = agroup tplace region edattain windex
          religion cpregn marsta bmaindex cuwork smcig tborn
          / link=logit clparm=wald rsquare
            aggregate=(agroup tplace region edattain windex religion
                       cpregn marsta bmaindex cuwork smcig tborn)
            scale=none;
run;

/* Model refitted without smcig */
proc logistic data=ord;
    class agroup(ref='1') tplace(ref='1') region(ref='14')
          edattain(ref='0') windex(ref='2') religion(ref='0')
          cpregn(ref='0') marsta(ref='0') bmaindex(ref='1')
          cuwork(ref='1') tborn(ref='1')
          / order=data param=ref ref=first;
    model anlevel(descending) = agroup tplace region edattain windex
          religion cpregn marsta bmaindex cuwork tborn
          / link=logit clparm=wald rsquare
            aggregate=(agroup tplace region edattain windex religion
                       cpregn marsta bmaindex cuwork tborn)
            scale=none;
run;
