

Factors influencing university drop out rates

Article  in  Computers & Education · November 2009


DOI: 10.1016/j.compedu.2009.03.013 · Source: DBLP



Computers & Education 53 (2009) 563–574


Factors influencing university drop out rates


Francisco Araque a, Concepción Roldán b,*, Alberto Salguero a
a Department of Software Engineering, University of Granada, C/Periodista Daniel Saucedo Aranda s/n, E-18071 Granada (Andalucía), Spain
b Department of Statistics and Operations Research, University of Jaén, Las Lagunillas s/n, E-23071 Jaén (Andalucía), Spain

Article info

Article history:
Received 26 November 2008
Received in revised form 12 March 2009
Accepted 17 March 2009

Keywords:
Drop out
Data Warehouse
Evaluation methodologies
Country-specific developments
Innovation

Abstract

This paper develops personalized models for different university degrees to obtain the risk of each student abandoning his degree and analyzes the profile for undergraduates that abandon the degree. In this study three faculties located in Granada, South of Spain, were involved. In Software Engineering three university degrees with 10,844 students, in humanities nineteen university degrees with 39,241 students and in Economic Sciences five university degrees with 25,745 students were considered. Data, corresponding to the period 1992 onwards, are used to obtain a model of logistic regression for each faculty which represents them satisfactorily. These models and the framework data show that certain variables appear repeatedly in the explanation of the drop out in all of the faculties. These variables are, among others, start age, the father’s and mother’s studies, academic performance, success, average mark in the degree and the access form and in some cases also, the number of rounds needed to pass. Students with weak educational strategies and without persistence to achieve their aims in life have low academic performance and low success rates and this implies a high risk of abandoning the degree. The results suggest that each university centre could consider similar models to elaborate a particular action plan to help lower the drop out rate reducing costs and efforts. As concluded in this paper, the profile of the students who tend to abandon their studies is dependent on the subject studied. For this reason, a general methodology based on a Data Warehouse architecture is proposed. This architecture does most of the work automatically and is general enough to be used at any university centre because it only takes into account the usual data the students provide when registered in a course and their grades throughout the years.
© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

Higher education in Spain has recently been undergoing an important process of restructuring because of the need to converge with
other members of the European Union and to introduce information and communication technologies (ICT) in the teaching processes.
These changes demand certain reforms in order to adapt the goals of the institution to the new social needs.
The increasing interest in studying university drop out comes from the increase of cases registered in the Spanish universities together
with the elevated cost that the education of every undergraduate means to Public Administration. According to the statistics of the Spanish
Coordination University Council (National Evaluation’s Plan of the Quality of the Universities, PNECU), presented in December 2002, 26% of
the undergraduates leave their studies or change their degrees. The data provided by the Organization for Cooperation and Economic
Development (OCDE), presented the same year, show that the academic failure in Spain is set at over 50%, related fundamentally to the
rates of drop out. With other data provided by the Spanish Center of Research, Documents and MEC’s Evaluation (CIDE), MEC (Spanish
Department of Education and Science), drop out rates are set between 30% and 50%. This phenomenon began earlier in the rest of Europe than in Spain, reaching 45% in Austria. According to ‘‘The Standards for Educational and Psychological Testing” (1999; see footnote 1), the percentage of students who abandon their studies or change degree increases every year, a result obtained from the analysis of the rates registered in the college. Other studies carried out in Central Europe and the United States show a similar percentage, although these studies were made with minority populations, which may explain the higher level of drop out (see Callejo, 2001; Feldman, 2005; Last & Fulbrook, 2003; Orazem, 2000).

* Corresponding author.
E-mail address: iroldan@ujaen.es (C. Roldán).
1 The Standards for Educational and Psychological Testing is a set of testing standards developed jointly by the American Educational Research Association (AERA), the American Psychological Association (APA), and the National Council on Measurement in Education (NCME).

0360-1315/$ - see front matter Ó 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.compedu.2009.03.013

As far back as 1968, Rubio García Mina carried out one of the first studies on university drop out in Spain, analyzing the cohorts from 1960 to 1966 in the higher technical schools of Madrid. This study and others were approximations to an incipient phenomenon, coinciding with institutional reforms and social changes such as the access of a larger percentage of students to the university, the implementation of the Spanish law for the General Organization of the Educational System (LOGSE), the reform of university curricula and new requirements of higher education (new methodologies, new technologies, company placements), etc. The disconnection between the laws governing compulsory education and the university study programs, and the strong linkage of the latter with the business world, together with other institutional circumstances that did not suit the characteristics of the new students well enough, have led to a great increase in the percentage of drop out, especially in technical degrees. These circumstances, together with overcrowding, also produce drop out in humanities and social sciences. The problem remains unsolved: every year the drop out rates increase at all universities and in all degrees, although the differences among them continue to be significant.

2. Previous work

There have been several attempts to build theoretical models that explain the phenomenon of drop out from university studies. The majority of them reveal a series of common characteristics and centre their analyses on the following groups of variables: the student body, the teaching staff, the institution and the family context. For several authors, including Forbes and Wickens (2005), a student’s decision to change or continue his university education is determined basically by the level of social integration that the student achieves at the university, that is, students will feel more integrated inasmuch as their capacities allow them to cope with the intellectual and other demands of university life. Another kind of factor identified in the student body is a lack of the capacities or abilities needed to face up to the demands of university studies: inadequate previous knowledge, inappropriate attitudes toward learning, low psychological resilience, etc. (see Kirton, 2000; Wasserman, 2001; Landry, 2003; Lightseym, 2006; Saunders, Davis, Williams, & Williams, 2004).
In relation to teaching staff factors, it is pertinent to highlight pedagogic deficiencies (lack of clarity in the presentation of the subject, etc.) and a lack of individualized attention to the student. The association between the student–teacher relationship and dropping out has also been demonstrated in the literature (see Lessard, Fortin, Joly, Royer, & Blaya, 2004). Moreover, conflicts with teachers frequently escalated and ultimately resulted in a pivotal moment that precipitated leaving education (Lessard et al., 2008). Institution-related risk factors associated with dropping out include the absence of clearly defined objectives and restrictions in the offer of certain degrees (see Thomas, 2002), etc. Another large group of causes points to family-related risk factors such as social origin, socioeconomic status and family turmoil (see Fortin, Marcotte, Potvin, Royer, & Joly, 2006). Low affective support, low cohesion among family members and a high rate of conflict fuel the family turmoil and increase the drop out probability (see Lessard et al., 2008).
For all these reasons, drop out has become a worrying theme of great interest at all levels. In addition, all these data show the need for studies that identify the causes of drop out, with the aim of putting into practice actions that decrease the incidence of this problem at the institution.
Drop out is one of the multiple indicators of the quality of an educational system, and it highlights the existence of serious failures in the processes of orientation, transition, adaptation and promotion of the student body. For this reason, within the framework of the European Higher Education Area, many universities include in their strategic plans a primary objective: reducing the student drop out rate.
The objective of this study is to analyze which variables influence drop out and to create a prediction model that measures the risk of drop out of a student using the academic and certain personal data that the university centres hold for every student. We define drop out as the situation in which students have abandoned the university or the degree they started, that is, when they do not register for a subsequent academic year. The starting hypothesis is that there exists a student profile susceptible to abandoning university studies. Therefore, the use of different statistical techniques is proposed to identify this profile using historical data, preferably stored in a Data Warehouse. Logistic regression models (Hosmer & Lemeshow, 2004) are frequently used to analyze factors influencing a response variable of interest in education (see Braak, 2001; Martins, Steil, & Todesco, 2004). Once the profile has been identified, several actions can be included in the strategic plans of the universities. Using these models and the results obtained, individual students enrolled in the faculty who match the profile identified as susceptible to abandonment can be found and directed to a guidance or tutorial system.

3. Materials and methods

There are many alternatives for studying the drop out phenomenon. The easiest and most common is to survey the students from time to time. The problem with this alternative is that it is very intrusive: it requires the participation of the students and the processing of the collected data. For this reason it usually covers only a small part of the student population. Alternatively, we can make use of the personal information the students provide when they register in courses every year, together with all of their grades throughout the courses. This second alternative is less powerful than the first, because surveys can be designed as needed and focused on the drop out issue. Its main advantages are that more data are available and that the process is transparent to the students, because no information is needed other than that provided by them when they register in courses.
We opted for the latter alternative. The information technology department of the university provided us with all the data we needed about the students, anonymized in order to avoid legal issues (the names of the students are confidential and were removed, but the rest of the personal information was kept). After some initial analysis we noticed that the information was not in a suitable form to be analyzed. The data are structured in such a way that university employees can easily manage them: add, edit and delete information. Since the data by themselves are useless, they must be put together and reorganized to produce useful information. In turn, information becomes the basis for decision making.
To facilitate the decision-making process, a technology more sophisticated than the usual database systems has been developed: the Data Warehouse. A Data Warehouse (DW) can be generally described as a decision-support tool that collects its data from operational databases and various external sources, transforms them into information and makes that information available to decision-makers (top managers) in a consolidated and consistent manner (Inmon, 2005; Kimball & Ross, 2002). The persistence of huge amounts of data opens a new perspective for various statistical analysis methods which are essential for tactical decisions.
Inmon (2005) defined a DW as ‘‘a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management’s decision-making process”. A DW is a database that stores a copy of operational data whose structure is optimized for query and analysis. Scope is one of the defining issues of a DW: it covers the entire organization. A DW is usually implemented using relational databases (Rainardi, 2007) that define multidimensional structures.
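As a rough illustration of what a multidimensional structure on top of a relational database can look like, the following sketch builds a very small star schema in an in-memory SQLite database. The table and column names (`dim_student`, `dim_course`, `fact_enrolment`) and the sample rows are hypothetical and are not taken from the system described in this paper.

```python
import sqlite3

def build_star_schema(conn):
    """Create one fact table and two dimension tables (hypothetical names)."""
    conn.executescript("""
        CREATE TABLE dim_student (student_id INTEGER PRIMARY KEY, sex TEXT, birth_year INTEGER);
        CREATE TABLE dim_course  (course_id INTEGER PRIMARY KEY, degree TEXT, credits REAL);
        CREATE TABLE fact_enrolment (
            student_id INTEGER REFERENCES dim_student(student_id),
            course_id  INTEGER REFERENCES dim_course(course_id),
            year       INTEGER,
            passed     INTEGER  -- 1 if the course was passed, 0 otherwise
        );
    """)

def passed_credits_by_degree(conn):
    """Aggregate the fact table along the 'degree' attribute of the course dimension."""
    rows = conn.execute("""
        SELECT c.degree, SUM(c.credits * f.passed)
        FROM fact_enrolment AS f
        JOIN dim_course AS c ON c.course_id = f.course_id
        GROUP BY c.degree
        ORDER BY c.degree
    """).fetchall()
    return dict(rows)

conn = sqlite3.connect(":memory:")
build_star_schema(conn)
# Invented sample data, for illustration only.
conn.executemany("INSERT INTO dim_student VALUES (?, ?, ?)",
                 [(1, "M", 1975), (2, "F", 1976)])
conn.executemany("INSERT INTO dim_course VALUES (?, ?, ?)",
                 [(10, "Degree A", 6.0), (11, "Degree B", 4.5)])
conn.executemany("INSERT INTO fact_enrolment VALUES (?, ?, ?, ?)",
                 [(1, 10, 1993, 1), (2, 10, 1993, 0), (2, 11, 1994, 1)])
result = passed_credits_by_degree(conn)
```

The fact table holds one row per enrolment event, while descriptive attributes live in the dimension tables, so analytical queries reduce to joins followed by aggregation.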
The generic architecture of the developed system is illustrated in Fig. 1. Data sources include basically existing operational databases.
The data are extracted from the sources and then loaded into the DW using an ETL tool (Araque, Salguero, Martínez, Navarro, & Calero,
2007a; Araque et al., 2007b). ETL stands for extract, transform and load, the processes that enable companies to move data from multiple
sources, reformat and cleanse it, and load it into another database or on another operational system to support a business process. This is
usually the most complex task in the process and, in our case, we had to develop an ad hoc application. It basically consists of the following
steps:

– Clean the data. The application first has to clean the data. There are some erroneous and missing values which have to be corrected. Some of those values are recoverable by assumption: the starting age is taken to be the age of the student when the first of his courses is recorded; the country of origin is assumed to be Spain; it is also assumed that the students are single and that they do not work. Other values can be completed using a mode-value imputation process: the population was divided into a number of disjoint classes using the complete variables, and the mode-value imputation method was applied within these classes. All the variables used in the study have a non-response rate below 25% (the maximum value established by the Office of Management and Budget). All variables with more than 25% missing values were removed (Levy & Lemeshow, 2003, 2008).
– Aggregate the data. This is basically the reason why we use a DW-based approach. The data are not analyzable in their original form: they are normalized and there are many relations connecting the facts. The data are aggregated in such a way that the historical information about the courses each student has been enrolled in through the years is summarized in a single record. Some rates, marked with an asterisk in Table 1, are defined for this purpose.
– Generate new measures. The last step of the process consists of determining which students have actually abandoned their studies. This information is the basis of the subsequent analysis and it is, obviously, not defined in the original data. It has to be derived.
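The cleaning and derivation steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the field names (`degree`, `marital`), the choice of the degree as the imputation class, and the rule used to flag abandonment (not graduated and not enrolled in the last year covered by the data) are hypothetical and are not taken from the authors' .NET application.

```python
from collections import Counter, defaultdict

def impute_mode_by_class(records, class_key, target):
    """Replace missing `target` values with the most frequent value in the record's class."""
    observed = defaultdict(list)
    for r in records:
        if r[target] is not None:
            observed[r[class_key]].append(r[target])
    # Ties are resolved by first occurrence, which is how Counter.most_common orders them.
    modes = {c: Counter(v).most_common(1)[0][0] for c, v in observed.items()}
    completed = []
    for r in records:
        r = dict(r)  # copy, so the input records are left untouched
        if r[target] is None:
            r[target] = modes.get(r[class_key])  # stays None if the class has no observed value
        completed.append(r)
    return completed

def derive_abandon(last_enrolled_year, graduated, last_year_in_data):
    """Assumed rule: 1 if the student did not graduate and stopped enrolling, else 0."""
    return int(not graduated and last_enrolled_year < last_year_in_data)

# Invented sample records, for illustration only.
records = [
    {"id": 1, "degree": "A", "marital": "single"},
    {"id": 2, "degree": "A", "marital": None},
    {"id": 3, "degree": "A", "marital": "single"},
    {"id": 4, "degree": "B", "marital": "married"},
    {"id": 5, "degree": "B", "marital": None},
]
imputed = impute_mode_by_class(records, class_key="degree", target="marital")
```

In a real pipeline the classes would be formed from several complete variables at once, as the text describes, rather than from a single field.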

The DW is then used to populate the various subject (or process) oriented Data Marts (DM) and On-Line Analytical Processing (OLAP) servers. Data Marts are subsets of a DW categorized according to functional areas depending on the domain (the problem area being addressed), and OLAP servers are software tools that help a user to prepare data for analysis, query processing, reporting and data mining (Fig. 2).
Several software choices are available to build a data mart such as MS Access, Oracle or any other database management software. MS
Access is selected in this research for its user friendliness and easy availability. It is important to note that OLAP software packages such as
MS OLAP server or SQL server are also available in the market. These software packages allow users to write their queries using any chosen
analytical model. The advantage is that the users can use different graphical wizards to easily write their queries and display results in
different graphical formats. Since our intention is not to develop a commercial software product, we have used MS Access.
Due to the huge quantity of data, the process takes a long time: for instance, 297 h to process the history of 25,745 Economic Sciences students on a modern computer (Core2, 8 GB). It has been implemented using the Microsoft .NET Framework and required more than 1500 lines of code. In order to improve performance, part of the application has been implemented using the .NET native threads library, which makes it possible to parallelize some tasks.

4. Student data collection and statistical analysis

The population under investigation consisted of all the students enrolled in the following three faculties of the University of Granada from 1992 onwards: Software Engineering, Humanities and Economic Sciences. These faculties include 27 degrees and have had 75,830 students enrolled. Every year the students fill in the registration form with personal information (family name, first name, address, city, country, sex, age, date of birth, etc.) and with the subjects they wish to study. This information, together with the academic results, is stored in a database. Therefore the first step of our study was to develop, on the basis of historical data about the drop out of these students, a Data Warehouse subset, or Data Mart, with the variables needed for the aims of this study.

Fig. 1. System architecture.



Table 1
Description of the variables used in the study.

Variable                     Description
Abandon*                     The student leaves the school: yes (1), no (0)
Marital status               Marital status of the student
Student job                  Student job: more (1) or less (2) than 15 h weekly or unemployed (3)
Access mark                  Mark obtained in the degree access
Access year                  Year of the degree access
End degree                   Year of the studies ending
Academic performance rate*   Academic performance rate (passed credits/enrolled credits)
Success rate*                Success rate (passed credits/examined credits)
Average mark*                Average mark (from 1 to 4)
Mode round*                  Mode round (first round, second round, third round, etc.)
Start year                   Year the university studies start
End year                     Last year signed on the degree
Degree                       Degree
Sex                          Sex of the student
Birth                        Date of birth
Family city                  Family city
Country                      Native land
Father’s job                 Father’s job
Mother’s job                 Mother’s job
Father’s education level     Father’s education level
Mother’s education level     Mother’s education level
Access form                  Form of the degree access
Exam round                   Sort of round more used (standard or extraordinary)
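The derived rates in Table 1 follow directly from their definitions (passed credits over enrolled credits, and passed credits over examined credits). A minimal sketch of how one student's enrolment history might be collapsed into these measures is given below; the record layout (`credits`, `examined`, `passed`, `round`) is an assumption made for illustration, and taking the mode round over passed courses only is likewise an assumption.

```python
from collections import Counter

def student_rates(enrolments):
    """Summarize one student's enrolment history into the derived rates of Table 1."""
    enrolled = sum(e["credits"] for e in enrolments)
    examined = sum(e["credits"] for e in enrolments if e["examined"])
    passed = sum(e["credits"] for e in enrolments if e["passed"])
    # Round in which each passed course was passed (assumed interpretation of 'mode round').
    rounds = [e["round"] for e in enrolments if e["passed"]]
    return {
        "academic_performance_rate": passed / enrolled if enrolled else None,
        "success_rate": passed / examined if examined else None,
        "mode_round": Counter(rounds).most_common(1)[0][0] if rounds else None,
    }

# Invented history: 20 credits enrolled, 16 examined, 10 passed.
history = [
    {"credits": 6, "examined": True,  "passed": True,  "round": 1},
    {"credits": 6, "examined": True,  "passed": False, "round": None},
    {"credits": 4, "examined": False, "passed": False, "round": None},
    {"credits": 4, "examined": True,  "passed": True,  "round": 1},
]
rates = student_rates(history)
```

By construction the success rate is never lower than the academic performance rate, because examined credits cannot exceed enrolled credits.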

Fig. 2. The ETL process.

The statistical analysis carried out for each faculty using the previous variables and SPSS (Statistical Package for the Social Sciences) included:

– Descriptive analysis: This study shows the main characteristics of the students of each faculty.
– Studies of the relationships between ‘‘abandon” and the remaining variables.
– Principal components analysis (PCA): This analysis showed the underlying framework of data.
– Logistic regression model for ‘‘abandon”: Logistic regression modeling was used to obtain conclusions about the drop out problem in each
faculty and predict the possible effect of the remaining variables on the dichotomous ‘‘abandon” variable.

This paper discusses the main results obtained in the first three studies briefly and focuses on the logistic regression models obtained.
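For reference, the standard form of the binary logistic regression model, as presented in textbooks such as Hosmer and Lemeshow, can be written as follows. The symbols $p$, $x_j$ and $\beta_j$ are generic notation introduced here for illustration and are not taken from the paper.

```latex
% p is the probability that "abandon" equals 1 given the predictors x_1, ..., x_m
\ln\frac{p}{1-p} = \beta_0 + \sum_{j=1}^{m} \beta_j x_j ,
\qquad
p = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{j=1}^{m} \beta_j x_j\right)}} .

% Increasing x_j by k units, with the other predictors held fixed,
% multiplies the odds p/(1-p) of abandoning by
\mathrm{OR}_j(k) = e^{k\,\beta_j} ,
% and multiplies the odds of NOT abandoning by the reciprocal, 1 / e^{k\,\beta_j}.
```

Under this notation, the Exp(B) column of a regression table corresponds to $\mathrm{OR}_j(1) = e^{\beta_j}$, and a value below 1 indicates a negative coefficient.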

5. Results

In each faculty, the first step after the imputation procedure was to carry out a descriptive analysis of the variables in the study and to analyze the relationships between ‘‘abandon” and the remaining variables. For the latter purpose, chi-square tests and contingency tables were employed. This analysis showed differences between the group of students that abandon the degree and the rest of the students for all the variables considered. Next, the logistic regression models were obtained by the partial method with stepwise variable selection. The respective models were calculated for different cut points, and the model that best classifies the data into the two categories of the variable ‘‘abandon” was chosen. The models obtained contain only a few variables, which is consistent with the results of the principal component analysis, also included below.
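To make the contingency-table step concrete, the sketch below computes Pearson's chi-square statistic for a two-way table of counts. The counts are invented for illustration and do not come from the study; the function is a plain implementation of the textbook formula rather than a reproduction of the SPSS procedure the authors used.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the row and column variables.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof

# Hypothetical counts: rows = exam round (standard, extraordinary),
# columns = abandon (yes, no).
observed = [[30, 70],
            [60, 40]]
statistic, dof = chi_square_statistic(observed)
```

With one degree of freedom, the statistic is compared against the 5% critical value of 3.841; a larger value indicates an association between the variable and ‘‘abandon”.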

5.1. Software Engineering

In general, most students enrolled in these degrees are men: in this study, 81.6% of the students were men. In addition, 96.5% are single and 3.5% are married. Most of them (98.3%) are Spanish and 1.4% are Moroccan. The largest group of fathers (40.5%) have received primary education and 21.4% have a degree, but only 14.5% are classified as ‘‘high level professional” and 21% work in the Public Administration. The largest group of mothers (44.5%) have received primary education and only 6.8% have a degree. The largest group of mothers (38.5%) have never had a paid job and 11.8% are housewives, but a notable share of them (16.7%) own a company with more than 10 workers.
Most students are unemployed (93.5%), 5.4% work more than 15 hours a week and only 1.1% work less than 15 hours a week. Most students (61.5%) are admitted to the university degree through an entrance examination, 27.3% are admitted as undergraduates, 10.5% come from vocational courses for 14 to 18 year olds and 0.6% have been admitted to the university by passing a special entrance examination for those over 25.
The academic performance rate averages 0.3868, the success rate averages 0.61 and the average mark is 1.3169. Most students (97.2%) pass the exam in the first round, 2.3% in the second round, 0.4% in the third round and 0.1% in the fourth round.
In this faculty the drop out rate is very high (49.6%). PCA (see Table 2), with a KMO (Kaiser–Meyer–Olkin) measure of 0.647, shows the need to use seven components to describe the data (these components explain 64.27% of the variability). Drop out is associated with the academic performance rate, the success rate and the average mark. The signs of the coefficients in the rotated component matrix indicate that when the academic performance rate, the success rate and the average mark go up, drop out goes down.
To explain drop out, a logistic regression model has also been obtained after nine iterations of the stepwise method. This model correctly classifies 78.8% of the data, that is, 78% of the students who do not drop out and 85.3% of the students who abandon. In addition, at the 5% significance level the Hosmer and Lemeshow test shows that the model fits the sample data well and the Wald tests show, with a p-value of 0.999, that the model obtained included the variables of Table 3.
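The classification percentages quoted for a fitted model depend on the cut point used to turn predicted probabilities into a yes/no prediction. The sketch below shows one way such percentages can be computed and a cut point chosen. The selection criterion (highest overall share correctly classified) is an assumption, since the text only says that the model which classifies the data better was chosen, and the probabilities and outcomes are invented.

```python
def classification_rates(probabilities, outcomes, cut_point):
    """Return (overall, non_dropout, dropout) shares correctly classified at a cut point."""
    predicted = [int(p >= cut_point) for p in probabilities]
    hits = [int(y_hat == y) for y_hat, y in zip(predicted, outcomes)]

    def share(label):
        group = [h for h, y in zip(hits, outcomes) if y == label]
        return sum(group) / len(group) if group else float("nan")

    return sum(hits) / len(hits), share(0), share(1)

def best_cut_point(probabilities, outcomes, candidates):
    """Pick the candidate cut point with the highest overall share correctly classified."""
    return max(candidates,
               key=lambda c: classification_rates(probabilities, outcomes, c)[0])

# Hypothetical predicted probabilities of abandoning and observed outcomes (1 = abandon).
probabilities = [0.10, 0.40, 0.35, 0.80, 0.65, 0.90]
outcomes = [0, 0, 1, 1, 0, 1]
best = best_cut_point(probabilities, outcomes, [0.3, 0.5, 0.7])
rates = classification_rates(probabilities, outcomes, best)
```

The second and third elements of the returned tuple correspond to the two per-group percentages reported alongside the overall figure.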
Only values with significant odds have been included in Table 3. According to the results of Table 3, the odds ratios show the following
information:

– Student’s average mark: If the student’s average mark is increased by one, the odds that the student does not drop out will be more than doubled (1/0.410 = 2.4).
– Age of the student at the end of his degree: For an increase of one unit the odds ratio is significant, but its value is close to 1. For this reason, an increase of 5 years is considered: if the age of the student at the end of his degree is increased by 5 years, the risk of dropping out will be multiplied by approximately 1.45 ($e^{5 \times 0.074}$).
– Degree: The risk of abandoning the degree is 181 times higher for the students of Management of Technical Software Engineering than for the students of Software Engineering, and 119 times higher for the students of System of Technical Software Engineering than for the students of Software Engineering.

Table 2
Rotated component matrix of the principal component analysis with n = 10,844.

Component
1 2 3 4 5 6 7
Academic performance rate 0.882
Success rate 0.929
Abandon 0.631
Average mark 0.870
Access year 0.784
Birth 0.676
Access form 0.728
Father’s job 0.709
Mother’s job 0.748
Father’s education level 0.644
Mother’s education level 0.726
Marital status 0.672
Student job 0.643
Family city 0.815
Country 0.809
Average round 0.698
Mode round 0.892
Degree 0.677
Sex 0.539
Exam round 0.427

Total variance explained is 64.27%.

Table 3
Results of the stepwise logistic regression (a).

Variable          B        SE       Wald     Sig.    Exp(B)
Average mark     -0.893    0.239    13.892   0.000     0.410
End year          0.074    0.037     4.000   0.046     1.077
Degree                              94.176   0.000
Degree (1)        5.201    0.542    92.209   0.000   181.381
Degree (2)        4.778    0.541    78.025   0.000   118.840
Exam round (1)   -1.605    0.400    16.085   0.000     0.201

(a) n = 10,844.

– Exam round: The risk of abandoning the degree is five (1/0.201) times bigger for those students who take their exams in February and
June than for those who take their exams in September and December.
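The arithmetic behind these interpretations can be reproduced from the coefficients in Table 3. The sketch below assumes that the coefficients of ‘‘Average mark” and ‘‘Exam round (1)” are negative, which is inferred from their Exp(B) values being below 1; the helper name `odds_ratio` is introduced here for illustration.

```python
import math

def odds_ratio(b, units=1.0):
    """Multiplicative change in the odds of abandoning for a `units` increase in a predictor."""
    return math.exp(b * units)

# Coefficients from Table 3; the negative signs are inferred from Exp(B) < 1.
B_AVERAGE_MARK = -0.893
B_END_YEAR = 0.074
B_DEGREE_1 = 5.201
B_EXAM_ROUND_1 = -1.605

# Odds of NOT dropping out when the average mark rises by one point.
not_drop_per_mark = 1 / odds_ratio(B_AVERAGE_MARK)
# Odds of dropping out when the end-year variable rises by five units.
drop_per_five_years = odds_ratio(B_END_YEAR, units=5)
# Odds ratio for the first degree category relative to the reference degree.
degree_1_vs_reference = odds_ratio(B_DEGREE_1)
# Reciprocal of the exam-round odds ratio, used to compare the two exam sessions.
exam_round_reciprocal = 1 / odds_ratio(B_EXAM_ROUND_1)
```

Small differences from the tabulated Exp(B) values are expected, because the coefficients in the table are rounded to three decimals.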

5.2. Humanities

In this faculty, 40.5% of the students were men. In addition, 95.5% are single and 3.7% are married. Most of them (99.6%) are Spanish and only 0.2% are Moroccan. Most fathers (60.5%) have received primary education and 16.9% have a degree. Ten percent are classified as ‘‘high level professional” and most of them (60.6%) work in the Public Administration. Most mothers (63.8%) have received primary education, 9.1% do not have the primary education certificate and only 11.4% have a degree. Most mothers (69.7%) have never had a paid job and 12% are housewives. Only 5.8% are classified as ‘‘high level professional” and 8.2% work in the Public Administration.
Most students are unemployed (95.2%), 3.9% work more than 15 hours a week and only 0.9% work less than 15 hours a week. Most students (51.2%) are admitted to the university degree through an entrance examination, 31.5% are admitted as undergraduates and 14.4% have been admitted by passing a special entrance examination for those over 25.
The academic performance rate averages 0.48, the success rate averages 0.68 and the average mark is 1.4351. Most students (97.4%) pass the exam in the first round, 2.5% in the second round and 0.4% in the third round.
In this faculty the drop out rate is 63.5%, the highest of the three faculties considered. PCA (see Table 4), with a KMO (Kaiser–Meyer–Olkin) measure of 0.636, shows the need to use seven components to describe the data (these components explain 70.51% of the variability). Drop out is again associated with the academic performance rate, the success rate and the average mark. The signs of the coefficients in the rotated component matrix indicate that when the academic performance rate, the success rate and the average mark go up, drop out goes down.
To explain the drop out problem, a logistic regression model has also been obtained after eleven iterations of the stepwise method. This model correctly classifies 83.2% of the data, that is, 88.1% of the students who do not drop out and 79.7% of the students who abandon. In addition, at the 5% significance level the Hosmer and Lemeshow test shows that the model fits the sample data well and the Wald tests show, with a p-value of 0.999, that the model obtained included the variables of Table 5.
Only values with significant odds have been included in Table 5. According to these results the odds ratios show the following
information:

– Mode round: The risk of abandoning is approximately double (1/0.557 = 1.8) for students who are admitted in the first round compared with those admitted in the second round.
– Access form: The risk of abandoning is approximately two times ($e^{0.576} = 1.779$) higher for students who are admitted with a degree than for those admitted by passing the university entrance test.
– Father’s education level: The risk of abandoning is approximately two times higher (1/0.565 = 1.77) for students whose fathers have no studies than for students whose fathers have completed secondary school. It is also approximately two times higher (1/0.529 = 1.89) for students whose fathers have no studies than for students whose fathers are graduates.
– Father’s job: The risk of abandoning is approximately ten times ($e^{2.278} = 9.754$) higher for students whose fathers do not have a qualified job than for students whose fathers work in the Public Administration.
– Marital status: The risk of dropping out of philosophy is approximately three times (1/0.358 = 2.8) higher for single students than for married students.
– Degree: If the student is enrolled in Philosophy the risk of abandoning is lower, that is, this degree has a lower risk of abandonment.

Table 4
Rotated component matrix of the principal component analysis with n = 39,241.

Component
1 2 3 4 5 6 7 8
Academic performance rate 0.872
Success rate 0.937
Average mark 0.881
Average round 0.654
Abandon 0.607
Mother’s job 0.719
Father’s job 0.852
Mother’s job 0.902
Access year 0.732
Marital status 0.735
Student job 0.674
Degree 0.841
Birth 0.758
Mode round 0.876
Father job 0.456
Family city 0.746
Country 0.754
Sex 0.816
Exam round 0.941

Total variance explained is 70.51%.



Table 5
Results of the stepwise logistic regression (a).

Variable                        B        SE       Wald      Sig.     Exp(B)
Academic performance rate      -3.995    0.300    177.018   0.000       0.018
Success rate                   -1.912    0.487     15.414   0.000       0.148
Average mark                    0.469    0.128     13.363   0.000       1.599
Average round                  -3.044    0.317     92.185   0.000       0.048
Mode round                                         32.044   0.000
Mode round (1)                  3.494    0.617     32.044   0.000      32.917
Access year                    -0.197    0.026     56.603   0.000       0.821
Degree                                            131.726   0.000
Degree (1)                      0.646    0.206      9.785   0.002       1.907
Degree (2)                      2.039    0.453     20.253   0.000       7.684
Degree (3)                      1.027    0.263     15.262   0.000       2.792
Degree (4)                      1.213    0.501      5.870   0.015       3.364
Degree (5)                      2.749    0.555     24.569   0.000      15.627
Degree (6)                      2.002    0.366     29.859   0.000       7.405
Degree (7)                      7.101    0.776     83.737   0.000    1213.195
Degree (8)                      1.675    0.424     15.587   0.000       5.341
Degree (9)                      1.069    0.268     15.862   0.000       2.913
Degree (10)                     1.210    0.281     18.607   0.000       3.354
Degree (11)                     2.125    0.560     14.402   0.000       8.376
Degree (12)                     4.523    1.206     14.058   0.000      92.106
Degree (13)                     1.514    0.278     29.580   0.000       4.547
Degree (14)                     1.376    0.261     27.790   0.000       3.961
Degree (18)                     1.046    0.365      8.193   0.004       2.846
Marital status                                      9.977   0.019
Marital status (1)             -1.028    0.338      9.250   0.002       0.358
Father’s job                                       89.804   0.000
Father’s job (22)               2.278    0.710     10.301   0.001       9.754
Father’s education level                           26.687   0.000
Father’s education level (2)   -0.570    0.241      5.593   0.018       0.565
Father’s education level (5)   -0.636    0.269      5.599   0.018       0.529
Access form                                        20.137   0.000
Access form (3)                 0.576    0.155     13.880   0.000       1.779

(a) n = 39,241.

– Access year: If the age of the student at the beginning of his degree is increased by 3 years, the risk of abandoning will be approximately doubled ($1/e^{3 \times (-0.197)} = 1.8$).
– Mode round: The risk of abandoning for students who usually pass in the second round is much greater ($e^{3.494} = 32.917$) than for students who usually pass in the first round.
– Average round: If the average number of rounds needed to pass the subjects is increased by one, the risk of dropping out will be twenty times higher (1/0.048 = 20.8).
– Average mark: If the student’s average mark is increased by one, the odds that the student does not abandon will be multiplied by 1.6.
– Success rate: If the success rate is increased by 0.3, the odds that the student does not abandon will be approximately doubled ($1/e^{0.3 \times (-1.912)} = 1.8$).
– Academic performance rate: If the academic performance rate is increased by 0.3, the odds that the student does not abandon will be approximately tripled ($1/e^{0.3 \times (-3.995)} = 3.3$).
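
The interpretations in this list all rest on the same identity of the logistic model. As a short worked sketch (the symbols $B$, $\Delta$ and $\mathrm{OR}$ are notation introduced here for illustration, and the negative sign of the access year coefficient is inferred from its reported Exp(B) = 0.821 being below one), a change of $\Delta$ units in a variable with coefficient $B$ multiplies the odds by $e^{B\Delta}$, and the reciprocal gives the factor for the opposite change:

```latex
\[
\mathrm{OR}(\Delta) \;=\; \frac{\mathrm{odds}(x+\Delta)}{\mathrm{odds}(x)} \;=\; e^{B\Delta},
\qquad
\frac{1}{\mathrm{OR}(\Delta)} \;=\; e^{-B\Delta}.
\]
\[
\text{Access year, } \Delta = 3:\quad \frac{1}{e^{3 \times (-0.197)}} = e^{0.591} \approx 1.8,
\qquad
\text{Mode round (1), } \Delta = 1:\quad e^{3.494} \approx 32.9 .
\]
```

For a categorical variable, $\Delta = 1$ compares the listed category with the reference category, so Exp(B) can be read directly as the odds ratio.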

5.3. Economic Sciences

In this faculty, 45.9% of the students are men; 96.4% are single and 3.1% are married. Most of them (98.2%) are Spanish and only 0.6% are Moroccan. The largest group of fathers (43.1%) have received primary education, 25.2% have a degree and 9.2% do not hold the primary education certificate. Of the fathers, 12.9% are classified as “high level professional” and the largest share (38.8%) work in the Public Administration. The largest group of mothers (43.1%) have received primary education, 11% do not hold the primary education certificate and 16.7% have a degree. Most mothers (67.3%) have never had a remunerative job, only 4.9% of the mothers are housewives, 7.3% are classified as “high level professional” and 14.6% work in the Public Administration.
Most students are unemployed (92.5%), 6.3% work more than 15 h weekly and only 1.1% work less than 15 h weekly. Most students (67.6%) are admitted to the university degree with an entrance examination, 20.3% of students are admitted as undergraduate and 10% have been admitted by passing a special entrance examination for those over 25.
The academic performance rate averages 0.40, the success rate averages 0.60 and the average mark is 1.27. Most students (97.7%) pass the exam in the first round, 2.1% in the second round and 0.1% in the third round.
In this faculty the drop out rate is 43.6%, the lowest of the three faculties. PCA (see Table 6), with a KMO (Kaiser–Meyer–Olkin) measure of 0.677, shows that six components are needed to describe the data (these components explain 67.24% of the variability). Drop out is again associated with the academic performance rate, the success rate and the average mark. The sign of the coefficient in the rotated component matrix indicates that when the academic performance rate, the success rate and the average mark go up, drop out goes down.
To explain the drop out problem, a logistic regression model has also been obtained after eleven iterations of the stepwise method. This model correctly classifies 73.9% of the data, that is, 68.9% of the students who do not drop out and 81% of the students who abandon. In addition, at the 5% significance level, the Hosmer and Lemeshow test shows that the model fits the sample data well and the Wald tests show, with a p-value of 0.999, that the model obtained includes the variables of Table 7.
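
One way to read these percentages, assuming they come from the model's classification table, is as the share of each observed group that the model assigns to the right class. The sketch below is only illustrative: the function name and the counts for 1,000 students are hypothetical and are not the study's data.

```python
def classification_summary(tn, fp, fn, tp):
    """Summarise a 2x2 classification table.

    tn: non drop out students predicted as non drop out
    fp: non drop out students predicted as drop out
    fn: drop out students predicted as non drop out
    tp: drop out students predicted as drop out
    Returns (share of non drop outs classified correctly,
             share of drop outs classified correctly,
             overall share classified correctly).
    """
    non_dropout_correct = tn / (tn + fp)
    dropout_correct = tp / (tp + fn)
    overall_correct = (tn + tp) / (tn + fp + fn + tp)
    return non_dropout_correct, dropout_correct, overall_correct


# Hypothetical counts for 1,000 students (NOT taken from the paper).
non_dropout_correct, dropout_correct, overall_correct = classification_summary(
    tn=380, fp=180, fn=85, tp=355
)
print(f"non drop out: {non_dropout_correct:.1%}")
print(f"drop out:     {dropout_correct:.1%}")
print(f"overall:      {overall_correct:.1%}")
```

The overall figure is a weighted average of the two group figures, so it depends on how many students fall in each group.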

Table 6
Rotated component matrix of the principal component analysis with n = 25,745.

Component
1 2 3 4 5 6
Academic performance rate 0.869
Success rate 0.905
Average mark 0.822
Abandon 0.683
Father’s job 0.613
Mother’s job 0.703
Father’s education level 0.812
Mother’s education level 0.847
Access year 0.834
Birth 0.815
Marital status 0.510
Student job 0.561
Exam round 0.375
Average round 0.701
Mode round 0.860
Family city 0.761
Country 0.759
Degree 0.696
Sex 0.754

Total variance explained is 67.24%.

Table 7
Results of the stepwise logistic regression with n = 25,745.

Variable                          B        SE       Wald      Sig.     Exp(B)
Father’s education level                            24.405    0.000
Father’s education level (3)     -0.715    0.312    5.253     0.022    0.489
Father’s education level (5)     -1.057    0.321    10.845    0.001    0.348
Mother’s education level                            14.322    0.014
Mother’s education level (5)      0.862    0.364    5.610     0.018    2.368
Academic performance rate        -4.555    0.253    324.936   0.000    0.011
Average round                    -2.814    0.264    113.907   0.000    0.060
Mode round                                          53.371    0.000
Mode round (1)                    3.279    0.466    49.621    0.000    26.557
Mode round (3)                    4.710    1.717    7.521     0.006    110.998
Access year                       0.082    0.027    9.071     0.003    1.086

Only values with significant odds have been included in Table 7. According to these results, the odds ratios show the following information:

– Father’s education level: The risk of abandoning is two times (1/0.489 = 2.04) bigger for students whose fathers have no studies than for students whose fathers have completed secondary school. The risk of abandoning is three times bigger (1/0.348 = 2.9) for students whose fathers have no studies than for students whose fathers have a degree.
– Mother’s education level: The risk of abandoning is approximately 2.4 ($e^{0.862} = 2.368$) times bigger for students whose mothers have no studies than for students whose mothers have a degree.
– Academic performance rate: If the academic performance rate is increased by 0.2, the odds that the student does not abandon will be multiplied by 2.5 ($1/e^{0.2 \times (-4.555)} = 2.49$).
– Mode round: The risk of abandoning for students who usually pass in the second round is 26 times bigger ($e^{3.279} = 26.557$) than for students who usually pass in the first round. The risk of abandoning for students who usually pass in the fourth round is about 111 times bigger ($e^{4.710} = 110.998$) than for students who usually pass in the first round.
– Access year: If the age of the student at the beginning of his degree is increased by 8 years, the risk of abandoning will be approximately doubled ($e^{8 \times 0.082} = 1.93$).
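
The figures quoted in this list can be rechecked from the coefficients. The following is a minimal sketch, not the authors' procedure: it assumes Exp(B) = e^B, takes the B and Exp(B) values reported in Table 7, and infers a negative sign for B wherever the reported Exp(B) is below one. The helper name `odds_ratio` is introduced here for illustration.

```python
import math

# (B, reported Exp(B)) pairs from Table 7. The negative signs are an
# assumption of this sketch, inferred from Exp(B) < 1.
TABLE_7 = {
    "Father's education level (3)": (-0.715, 0.489),
    "Father's education level (5)": (-1.057, 0.348),
    "Mother's education level (5)": (0.862, 2.368),
    "Academic performance rate": (-4.555, 0.011),
    "Average round": (-2.814, 0.060),
    "Mode round (1)": (3.279, 26.557),
    "Mode round (3)": (4.710, 110.998),
    "Access year": (0.082, 1.086),
}


def odds_ratio(b, delta=1.0):
    """Factor by which the odds change when a variable increases by delta."""
    return math.exp(b * delta)


# Compare the recomputed odds ratio with the value printed in the table.
for name, (b, reported) in TABLE_7.items():
    print(f"{name:30s} exp(B) = {odds_ratio(b):9.3f}   reported = {reported}")

# Effects of changes other than one unit, as used in the text.
print(f"Access year, 8 years: {odds_ratio(0.082, 8):.2f}")
print(f"Performance rate, +0.2 (inverse): {1 / odds_ratio(-4.555, 0.2):.2f}")
```

Small differences in the last digit are expected, since the table reports B rounded to three decimals.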

In the Economic Sciences faculty, as the access year of the degree increases, the academic performance decreases and the possibility of abandoning the degree increases. If the father has no studies, the student’s risk of dropping out tends to increase.

6. Discussion

From the observed results, we could verify that the drop out rate in the different disciplines studied is over 40% and even exceeds 60% in the case of Humanities. This corroborates the data of the Spanish Ministry of Education, which puts the drop out rate at Spanish universities at around 40% and rising, with higher rates registered in the humanities and in technical sciences like Software Engineering. Besides, these data coincide with various reports of the University Council on Indicators of Academic Performance that
reveal that the humanities degrees present a low backwardness rate (15%) and the highest rate of drop out (43%), while engineering presents the higher rate of backwardness and drop out (40%).
Turning to public opinion and press coverage, headlines like these are increasingly common: “University studies abandoned in Spain approximately doubles the European average” (the Spanish newspaper ABC, Madrid, March 2006) and “5000 students approximately abandon the university in the islands” (headline news in Canary Islands 7, August 2004). All of this agrees with the drop out results that we have seen in the degrees of our study (Fig. 3).
Our objective was to analyze the main variables that seemed to be behind drop out in different faculties: Software Engineering, Humanities and Economic Sciences. The results obtained suggest that certain variables appear repeatedly in the explanation of drop out in all of the faculties. These variables are, among others, start age, the father’s and mother’s studies, academic performance, success, average mark in the degree, the access form and, in some cases, the number of rounds needed to pass (Fig. 4).
On the one hand, regarding start age, the results show that when start age increases the probability of drop out increases too. In fact, the great majority of the students that abandon a degree began their studies at over 20 years of age, while the great majority of the students that do not abandon began at 18 or 19. In Software Engineering the majority of students that abandon are over 21 years old, and in Humanities they are over 22. This supports previous studies in which many of the students that abandon come from other degrees and are not very clear about what degree to choose for their education. In addition, many students admitted to the university are older than 25 years and, in many instances, the adaptation of an older student to the requirements and learning needs, and of the university to different student profiles, is not optimal, so that a high rate of drop out is triggered. For this reason, the variable access form (selectivity, FP2 or those over 25 years, among others) also relates meaningfully with drop out in the numerous disciplines that have been studied (Figs. 5 and 6).
On the other hand, the father’s and the mother’s studies and job (highly correlated variables in the second or third component of the PCA, see Tables 2, 4 and 6) also have an important influence on drop out for the degrees studied. There can be no doubt that, in nearly all the studies under review, factors of a psychoeducational character have a big influence on drop out at present (Last & Fulbrook, 2003). The relationship between family factors and high school drop out has been well documented (Fortin et al., 2006). We could observe in our study that the parents’ socioeconomic status correlates with drop out, showing that low socioeconomic status and low parental academic performance contribute to increasing the drop out probability. Family economic difficulties or the lack of financial help for studying force some students to study and work simultaneously, which in some cases leads to situations of incompatibility that cause drop out (Sinclair & Dale, 2000). Nowadays a large percentage of students come from broken homes, and the divorce of the parents also contributes to financial hardship (Lessard et al., 2008). Looking for other causes of drop out, numerous studies show that family pressure is a determinant of great influence. When students are making vocational decisions of an academic and professional character, many parents exert such strong pressure that the students cannot cope with it, and it leads them to drop out. In their study carried out in Wisconsin with aspiring university students hoping to obtain a teaching qualification, Root, Rudawski, Taylor, and Rochon (2003) found that family pressures carried great weight in the decision to leave studies, particularly among men. The need to reproduce the professional roles of the parents in order to continue the family business frequently makes parents force their children to study particular degrees or else abandon the one that they began for lack of connection with the family role (Figs. 7 and 8).

[Bar chart: non drop out and drop out percentages for Software Engineering, Humanities and Economic Sciences.]
Fig. 3. Drop out rates in the three faculties involved in the study.

[Bar chart: non drop out and drop out students in Software Engineering, Humanities and Economic Sciences.]
Fig. 4. Age the university studies start.


[Bar chart: non drop out and drop out percentages for Software Engineering, Humanities and Economic Sciences.]
Fig. 5. Statistical results of drop out for students whose fathers have not completed secondary school.

[Bar chart: non drop out and drop out percentages for Software Engineering, Humanities and Economic Sciences.]
Fig. 6. Statistical results of drop out for students whose fathers have completed secondary school.

[Bar chart (vertical axis: percentage): non drop out and drop out students in Software Engineering, Humanities and Economic Sciences.]
Fig. 7. Average academic performance rates.

[Bar chart (vertical axis: percentage): non drop out and drop out students in Software Engineering, Humanities and Economic Sciences.]
Fig. 8. Average success rates.


Regarding academic performance, success and the students’ average mark, we have noticed that these three variables are also strongly associated with drop out: a low level of academic performance and success is connected with a high probability of abandoning. A great majority of studies report that students with high motivation and positive expectations toward academic performance do not consider abandoning, in spite of the fact that many of them face many difficulties, and in the end they often achieve academic success (Landry, 2003). For some students, the adaptation to university life constitutes a challenge and a personal liability that leads them to make an effort and to look for the necessary help to attain the goals that they have set. However, many of them fail in the attempt and stop half way. We therefore found that students with a good psychological profile for overcoming obstacles have greater persistence and, in consequence, a better adaptation. In this sense, Kirton (2000) found that the perception of the university environment and academic self-efficacy had a great influence on drop out in the first year, during the first semester of the course. Another determining factor is academic failure, that is, the non-attainment of success. This situation has been widely studied in Spain recently, and the conclusions point to a weak previous education that affects certain degrees specifically.

7. Conclusions

Summing up, the conclusions that we obtain after analyzing all the variables of our study are the following:

– The drop out rates of our students are currently higher than those reflected in previous studies.
– In many instances, the factors associated with drop out have a multi-causal nature, and they are related as much to psychological, vital and generational characteristics as to the student’s educational characteristics.
– Practically all of the variables considered show a significant relation with the drop out variable. However, the logistic regression model obtained only includes some of them. This happens, as the structural analysis shows, because some variables are highly correlated among themselves and therefore contribute the same information.
– It is possible to obtain a logistic regression model that enables us to calculate the students’ risk of drop out for each university faculty.
– The presence in the student body of characteristics such as little control of learning strategies and a low capacity of persistence in attaining their aims, translated into low rates of academic performance and success, implies a high risk of dropping out of the degree.
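
To illustrate how such a model yields a risk for an individual student, the sketch below applies the logistic function to a linear combination of the student's variables. It is a partial illustration only and not the authors' fitted model: the intercept is hypothetical (none is reported), only two Economic Sciences coefficients from Table 7 are used, the negative sign of the academic performance coefficient is inferred from its Exp(B) being below one, the modelled event is assumed to be drop out, and the variable names and student values are invented.

```python
import math


def dropout_risk(intercept, coefficients, student):
    """Logistic model: probability = 1 / (1 + exp(-z)), z = intercept + sum(B * x)."""
    z = intercept + sum(b * student[name] for name, b in coefficients.items())
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


HYPOTHETICAL_INTERCEPT = 1.0  # assumption: the paper does not report an intercept

# Two coefficients from Table 7; the minus sign is inferred from Exp(B) = 0.011.
COEFFICIENTS = {
    "academic_performance_rate": -4.555,
    "mode_round_second": 3.279,  # 1 if the student usually passes in the second round
}

# Invented students who differ only in their academic performance rate.
student_a = {"academic_performance_rate": 0.6, "mode_round_second": 0}
student_b = {"academic_performance_rate": 0.2, "mode_round_second": 0}

risk_a = dropout_risk(HYPOTHETICAL_INTERCEPT, COEFFICIENTS, student_a)
risk_b = dropout_risk(HYPOTHETICAL_INTERCEPT, COEFFICIENTS, student_b)
print(f"risk A = {risk_a:.3f}, risk B = {risk_b:.3f}")
print(f"odds ratio B vs A = {odds(risk_b) / odds(risk_a):.2f}")
```

The ratio of the two students' odds depends only on the coefficient and the difference in the variable, not on the intercept, which is why the odds ratios can be interpreted without it.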

Many universities have begun to design, implement and evaluate programs and strategies to increase the rates of persistence and to reduce the rates of drop out. In the strategic plan of the University of Granada there are numerous reports that have analyzed the university context. A doctoral thesis carried out at the Department of Psychology of the University of Granada has created an intervention program whose efficiency, efficacy and benefit have been confirmed by the University Council of Coordination. The Tutorial Program between Partners has adopted, as its main intervention strategy, tutoring sessions between students of different ages and academic years, given by those with more knowledge and ability (for example doctoral students and students from previous courses). A very interesting study for the future would be to evaluate students that use this kind of program and students that do not, in order to check whether significant differences in the rates of drop out exist.
From the results obtained in this study, it would be advisable for each university, when drawing up its plans to decrease the drop out rate, to take into account the risk of abandoning (based on a logistic regression model, as we did), or at least the academic performance and success rates, in order to design a more effective program that supervises the students at high risk of drop out.
As stated before, the drop out phenomenon is highly subject-dependent: the profiles of the students vary across faculties in the same university. For this reason, a method which can be applied straightforwardly is needed. We have developed a methodology based on a DW architecture which is general enough to be applied in any university centre. It uses the common data any university has about its students (personal information and grades) without the students having to be involved in the process. The most tedious part is the development of an application capable of transforming all the operational data, i.e. the data used to manage the students and their grades, into a DM focused on drop out. Some of the tasks this tool has to perform are the cleaning of the data, the determination of the drop out status of each student and the summary of the academic history of each student in a single record (defining some rates).
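
As a minimal sketch of the last task only, the code below collapses operational rows (one per enrolled subject) into one record per student. Everything in it is an assumption made for illustration: the row layout and field names are hypothetical, and the rates are taken to mean subjects passed over subjects enrolled (academic performance rate) and subjects passed over subjects sat (success rate). It does not determine drop out status.

```python
from collections import defaultdict
from statistics import mean


def summarise_students(rows):
    """Collapse one row per enrolled subject into one record per student.

    Each row is a dict with hypothetical keys: student_id, subject,
    sat (bool), passed (bool), mark (number or None), round (int or None).
    """
    by_student = defaultdict(list)
    for row in rows:
        # Cleaning step: discard rows that cannot be attributed to a student.
        if row.get("student_id") is None:
            continue
        by_student[row["student_id"]].append(row)

    records = {}
    for student_id, subjects in by_student.items():
        sat = [s for s in subjects if s["sat"]]
        passed = [s for s in sat if s["passed"]]
        records[student_id] = {
            # Assumed definition: passed / enrolled.
            "academic_performance_rate": len(passed) / len(subjects),
            # Assumed definition: passed / sat.
            "success_rate": len(passed) / len(sat) if sat else 0.0,
            "average_mark": mean(s["mark"] for s in sat) if sat else None,
            "average_round": mean(s["round"] for s in passed) if passed else None,
        }
    return records


# Invented example rows.
rows = [
    {"student_id": 1, "subject": "A", "sat": True, "passed": True, "mark": 2.0, "round": 1},
    {"student_id": 1, "subject": "B", "sat": True, "passed": False, "mark": 0.0, "round": None},
    {"student_id": 1, "subject": "C", "sat": False, "passed": False, "mark": None, "round": None},
    {"student_id": 1, "subject": "D", "sat": True, "passed": True, "mark": 1.0, "round": 2},
    {"student_id": None, "subject": "A", "sat": True, "passed": True, "mark": 3.0, "round": 1},
]
records = summarise_students(rows)
print(records[1])
```

Keeping one record per student is what allows the summary to be used directly as input for the PCA and logistic regression steps.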

Acknowledgements

We are grateful to the referees for their constructive comments and to Mª Carmen Aguilera and Mª Belén García for their assistance with
the manuscript. This work has been supported by the Spanish Research Program under projects EA-2007-0228 and TIN2005-09098-C05-03
and by the Research Program under project GR2007/07-2.

References

Araque, F., Salguero, A., Martínez, L., Navarro, E., & Calero, M. D. (2007a). Data warehousing for improving web based learning sites. International Journal of Emerging
Technologies in Learning, 2, 1–8.
Araque, F., Salguero, A., Calero, M. D., Fernández-Parra, A., Jiménez, M. I., Vives, M. C., et al. (2007b). E-learning platform as a teaching support in psychology. Lecture Notes in
Computer Science, 4739.
Braak, J. (2001). Factors influencing the use of computer mediated communication by teachers in secondary schools. Computers & Education, 36, 41–57.
Callejo, J. (2001). A cohort study on UNED students: An approximation to drop-out analysis. Revista Iberoamericana de Educación a Distancia, 4(2), 33–69.
Feldman, R. S. (2005). Improving the first year of college: Research and practice. Mahwah, NJ: Lawrence Erlbaum Associates.
Forbes, A., & Wickens, E. (2005). A good social life helps students to stay the course. Times Higher Education Supplement, 1676, 58–63.
Fortin, L., Marcotte, D., Potvin, P., Royer, E., & Joly, J. (2006). Typology of student at risk of dropping out of school: Description by personal, family and school factors. European
Journal of Psychology of Education, 21(4), 363–383.
Hosmer, D. W., & Lemeshow, S. (2004). Applied logistic regression. Textbook and solutions manual (2nd ed.). New York, USA: John Wiley and Sons.
Inmon, W. H. (2005). Building the data warehouse (3rd ed.). New York, USA: John Wiley and Sons.
Kimball, R., & Ross, M. W. H. (2002). The data warehouse tool kit: The complete guide to dimensional modelling (2nd ed.). John Wiley and Sons.
Kirton, M. J. (2000). Transitional factors influencing the academic persistence of first semester undergraduate freshmen. Dissertation Abstracts International Section A:
Humanities and Social Sciences, 61(2-A), 522.
Landry, C. C. (2003). Self-efficacy, motivation and outcome expectation correlates of college students’ intention certainty. Dissertation Abstracts International Section A:
Humanities and Social Sciences, 64(3-A), 825.
Last, L., & Fulbrook, P. (2003). Why do student nurses leave? Suggestions from a Delphi study. Nurse Education Today, 23(6), 449–458.
Lessard, A., Fortin, L., Joly, J., Royer, É., & Blaya, C. (2004). Students at-risk for dropping out of school: Are there gender differences among personal, family and school factors?
Journal of At-Risk Issues, 10(2), 91–127.
Lessard, A., Butler-Kisber, L., Fortin, L., Marcotte, D., Potvin, P., & Royer, É. (2008). Shades of disengagement: High school dropouts speak out. Social Psychology of Education, 11, 25–42.
Levy, P. S., & Lemeshow, S. (2008). Sampling of populations: Methods and applications (4th ed.). New York, USA: John Wiley and Sons.
Levy, P. S., & Lemeshow, S. (2003). Sampling of populations: Methods and applications. Solutions manual (3rd ed.). New York, USA: John Wiley and Sons.
Lightsey, O. R. (2006). Resilience, meaning and well-being. Counseling Psychologist, 34(1), 96–107.
Martins, C. B. M. J., Steil, A. V., & Todesco, J. L. (2004). Factors influencing the adoption of the internet as a teaching tool at foreign language schools. Computers & Education, 42, 353–374.
Orazem, V. (2000). Understanding why they stay and why they leave: A grounded theory investigation of undecided students at a rural grant institution. Dissertation Abstracts
International Section A: Humanities and Social Sciences, 61(6-A), 2214.
Rainardi, V. (2007). Building a data warehouse: With examples in SQL server. Berkeley, California, USA: Apress.
Root, S., Rudawski, A., Taylor, M., & Rochon, R. (2003). Attrition of Hmong students in teacher education programs. Bilingual Research Journal, 27(1), 137–148.
Saunders, J., Davis, L., Williams, T., & Williams, J. (2004). Gender differences in self-perceptions and academic outcomes: A study of African American high school students.
Journal of Youth and Adolescence, 33(1), 81–90.
Sinclair, H., & Dale, T. (2000). The effect of student tuition fees on the diversity of intake within a Scottish new university. Paper presented at British Educational Research
Association Annual Conference, 7–9 September, 2000, Cardiff University.
Thomas, L. (2002). Student retention in higher education: The role of institutional habitus. Journal of Education Policy, 17(4), 423–442.
Wasserman, K. N. (2000). Psychological and development differences between students who withdraw from college for personal-psychological reasons and continuing
students. Dissertation Abstracts International Section A: Humanities and Social Sciences, 62(3-A), 915.
