
AI tools predicting and preventing tuition delinquency and dropouts
among college students

1
The AI boom in analytics
“In the field of credit scoring, studies have shown that neural networks perform
significantly better than statistical techniques. [1], [5]. ANN have been used in
credit rating and credit scoring quite extensively as illustrated in the following
papers : “Artificial Neural Networks for Corporation Credit Rating Analysis”[7], “Personal Credit Rating Assessment
for the National Student Loans based on Artificial Neural Network”[8], “Personal Credit Rating Using Artificial
Intelligence Technology for the National Student Loans” where a Back Propagation neural network was used [9],
“Research of electronic commercial credit rating based on Neural Network with Principal Component Analysis” [10].”
(Thabiso Peter Mpofu and Macdonald Mukosera, “Credit Scoring Techniques: A Survey”, August 2014).

“…Furthermore, a comparison between different statistical approaches
demonstrates that advanced/sophisticated techniques, such as neural networks and
genetic programming, perform better than more conventional techniques, such as
discriminant analysis and logistic regression, in terms of their higher predictive
ability…” (Hussein A. Abdou and John Pointon, “Credit Scoring, Statistical
Techniques and Evaluation Criteria: A Review of the Literature”, 2011).
2
Investments in AI

Growing fast

3
In score modeling, ANNs have been consistently performing at the top

[Chart: rounded average ranking of each algorithm. The lower, the better.]

3.1 Neural Networks
3.3 SVM RBF LS
4.0 Logit regression
4.9 SVM LF LS
5.2 Linear discriminant analysis
5.3 Bayesian probabilistic networks
7.0 Naive Bayesian networks

Source: Credit Technology Dec/2015 – Serasa Experian
(showing only the top 7 of 17 algorithm types)
4
The AI solutions used by IntelliSearch

✓ As shown in the previous graphic, neural networks outperform
all other approaches used in the market (logistic regression,
Naïve Bayes, SVMs, ...) for scoring applications, in both
maximum and average performance.
✓ Nevertheless, pure neural networks are easily trapped in
overfitting and/or local optima, failing to reach a global
optimum or to generalize well.
✓ That’s why we use a hybrid model, in which the ANN training cycle
is supervised by a genetic algorithm, which prevents overfitting
and leads the search toward the global optimum.
✓ Our solutions can be used on any hardware or software platform
(any DBMS), so they avoid vendor dependence.

5
The quality jump in dropout management

Timely action, avoiding dropouts and payment delinquency before
they take place. Optimized classification to maximize retention
(tailored retention strategies) in the short, medium and long term.

Collection of delinquent or defaulted tuition payments; delayed
reaction to dropouts. Naive classification by grades or payment
delays. Non-specific targeting.
6
Our AI solutions
Ward Systems solutions are of the hybrid type, in which the
neural network (ANN) is supervised, and has its performance
optimized, by a genetic algorithm. Over several generations, the
genetic algorithm selects the best combinations of neural network
parameters, with the following advantages over "unsupervised"
neural networks (those trained without such a supervising algorithm):
✓ It prevents the ANN from overfitting the in-sample data, improving its
generalization capacity when acting on "out-of-sample" data;
✓ The more generations covered during the training cycle, the better the ANN
performs on out-of-sample data;
✓ It prevents the ANN from getting stuck in a local optimum, leading the search
toward a global optimum;
✓ It eliminates the "black box" effect, usually found in unsupervised ANNs, by
showing which variables contribute most to the ANN’s predictive
and/or classification ability.
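
The idea of a genetic algorithm supervising ANN training can be illustrated with a minimal sketch. This is not the Ward Systems algorithm, whose internals are not described here. It assumes a tiny one-hidden-layer network, a genome made of two hypothetical parameters (number of hidden units and learning rate), synthetic data, and a fitness measured only on out-of-sample (validation) rows.

```python
import math
import random

def forward(w1, w2, x):
    """Return hidden activations and the output probability for one input row."""
    h = [math.tanh(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w1]
    z = sum(w * v for w, v in zip(w2, h + [1.0]))
    return h, 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def train_mlp(train, hidden, lr, epochs, rng):
    """Train a one-hidden-layer network with plain SGD on the in-sample data."""
    n_in = len(train[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]
    for _ in range(epochs):
        for x, y in train:
            h, out = forward(w1, w2, x)
            err = out - y  # log-loss gradient at the output pre-activation
            for j in range(hidden):
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                for i, v in enumerate(x + [1.0]):
                    w1[j][i] -= lr * grad_h * v
            for j, v in enumerate(h + [1.0]):
                w2[j] -= lr * err * v
    return w1, w2

def fitness(genome, train, valid, seed):
    """Out-of-sample accuracy of a network built from the genome's parameters."""
    hidden, lr = genome
    model = train_mlp(train, hidden, lr, epochs=20, rng=random.Random(seed))
    hits = sum((forward(*model, x)[1] >= 0.5) == (y == 1) for x, y in valid)
    return hits / len(valid)

def evolve(train, valid, generations=5, pop_size=6, seed=0):
    rng = random.Random(seed)
    cache = {}

    def score(g):
        if g not in cache:
            cache[g] = fitness(g, train, valid, seed)
        return cache[g]

    pop = [(rng.randint(1, 6), rng.uniform(0.01, 0.5)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(pop, key=score, reverse=True)
        history.append(score(ranked[0]))
        elite = ranked[: pop_size // 2]  # elitism: the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = (a[0], b[1])  # crossover: hidden units from a, learning rate from b
            if rng.random() < 0.5:  # mutation: perturb both parameters
                child = (max(1, child[0] + rng.choice([-1, 1])),
                         min(0.5, max(0.01, child[1] * rng.uniform(0.5, 1.5))))
            children.append(child)
        pop = elite + children
    return max(pop, key=score), history

def make_data(n, rng):
    """Synthetic two-feature rows (illustration only, not student data)."""
    rows = []
    for _ in range(n):
        x = [rng.random(), rng.random()]
        rows.append((x, 1 if x[0] + x[1] > 1.0 else 0))
    return rows

data_rng = random.Random(42)
train, valid = make_data(40, data_rng), make_data(20, data_rng)
best, history = evolve(train, valid)
print("best (hidden units, learning rate):", best)
print("best out-of-sample accuracy per generation:", history)
```

Because the best genomes are carried over unchanged, the best out-of-sample score in this sketch can never get worse from one generation to the next. A production tool would evolve many more parameters than the two assumed here.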
7
Our AI solutions

NeuroShell Predictor and Chaos Hunter, for prediction of continuous
variables. E.g.: % of dropouts or of tuition delinquency in a class or
in a given semester; % of financial loss due to tuition delinquency.

NeuroShell Classifier, for classification and clustering of students.
E.g.: clustering potential dropout students around retention strategies;
prediction, at the individual level (each student), of whether she/he
will drop out or become delinquent in tuition fees; classification for
mining (offering new courses to potential dropout students).

Note: The names of software products shown on this slide and in the others
are the property of Ward Systems, our strategic partner in AI.
8
IntelliSearch adds a valuable layer of software and
services to Ward Systems tools
[Diagram components:]
Data capture and pre-processing
Data normalization/treatment and interface layer
Chaos Hunter (*), NeuroShell Predictor (*), NeuroShell Classifier (*)
Multi-module integration
Integration with any application
End-user interface (web and/or mobile)
(*) Trademarks of Ward Systems
9
Example #1: Predicting student dropout
The aim is to predict, for each student, whether he or she is likely to
drop out of the course during the semester. The following (student-specific)
independent variables are used as input for the neural network
training:

✓ Student age;
✓ Count of student no-show occurrences in class;
✓ Number of delayed deliveries of academic work;
✓ % of personal or household income committed to tuition fees;
✓ Marital status;
✓ Number of delinquencies in tuition payments;
✓ Whether the student has a regular job;
✓ Whether the student works full time;
✓ Average grades in tests and academic work;
✓ Previous dropout history.
11
Training data for independent and dependent variables
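Raw data with characteristics of students from previous classes, including the observation
of whether or not they dropped out (evasao).
Columns: idade (age), hist_abs (no-show occurrences), atrasos_ativ (delayed deliveries of
academic work), comprom_renda (share of income committed to tuition), estado_civil (marital
status), atrasos_pagto (tuition payment delinquencies), ativ_prof (has a regular job),
Full_time_empl (full-time worker), grau_medio (average grade), hist_evasao_ant (previous
dropout history), evasao (dropout, the dependent variable).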
id idade hist_abs atrasos_ativ comprom_renda estado_civil atrasos_pagto ativ_prof Full_time_empl grau_medio hist_evasao_ant evasao
1 18,1 4 4 0,3 married with children 0 yes no 3,4 yes no
2 19,5 7 15 0,5 self-sustained single 4 yes yes 9,7 no no
3 20,2 5 8 0,8 married without children 2 yes no 6,4 no no
4 18,8 13 3 0,2 divorced 1 yes no 2,2 no yes
5 23,0 1 4 0,2 family-supporting single 0 no yes 0,6 yes yes
6 21,6 12 2 0,4 self-sustained single 1 no yes 1,5 yes no
7 22,4 15 16 0,4 married with children 0 yes no 0,9 no yes
8 18,9 10 3 0,3 married without children 1 no no 2,7 no no
9 22,8 8 11 0,5 self-sustained single 1 yes no 1,9 no no
10 19,7 6 14 0,3 married with children 0 yes yes 0,8 no yes
11 18,9 15 1 0,6 married without children 3 yes no 6,2 no no
12 22,0 6 13 0,3 married with children 2 yes yes 5,8 yes yes
13 20,0 1 11 0,4 family-supporting single 1 no yes 1,5 no yes
14 18,4 13 6 0,3 self-sustained single 0 yes yes 1,1 no no
15 18,8 12 16 0,5 married with children 2 yes yes 4,0 yes yes
16 19,0 10 1 0,2 married without children 4 yes yes 8,7 yes no
17 19,7 15 9 0,3 married with children 2 no no 5,6 no no
18 23,0 7 10 0,5 divorced 1 yes yes 2,0 no yes
19 19,8 7 13 0,7 married without children 4 no yes 9,6 yes no
20 21,1 15 0 0,1 family-supporting single 1 no no 1,3 no no
21 18,6 10 10 0,6 self-sustained single 2 no yes 5,2 no no
12
Treating the data before training/optimization
Before the data are passed to the training algorithm, they must be
converted and treated so that predictive capacity is maximized. This
is done by a layer that IntelliSearch developed as an add-on to the
Ward Systems tools, based on a methodology thoroughly tested with our
clients. Essentially, this layer:
✓ Converts alphanumeric data into numeric data (typically categorical/discrete);
✓ Normalizes continuous variables (e.g. student's age) before they are
submitted to the algorithm;
✓ Applies domain processing (mapping raw numeric domains into new
domains), distributing real or integer numbers more evenly;
✓ Converts some categorical numeric variables into a superset of binary
variables.
Such treatments also reduce the dimensionality problem, increasing the ANN's
generalization ability and decreasing the need for large amounts of data (table
rows).
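
A minimal sketch of three of these treatments, on a subset of columns from the first three students of the raw table. The marital-status codes are the ones visible when comparing the raw and treated tables. Min-max scaling is an assumption used here for illustration; the IntelliSearch layer applies its own methodology, so this sketch does not reproduce the treated values shown in the slides. The helper names (`min_max`, `one_hot`, `treat`) are hypothetical.

```python
# Codes for estado_civil as they appear in the raw vs. treated tables.
MARITAL_CODES = {
    "self-sustained single": 1,
    "married without children": 2,
    "married with children": 3,
    "family-supporting single": 4,
    "divorced": 5,
}

def min_max(values):
    """Rescale a continuous column to the [0, 1] range (assumed normalization)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

def one_hot(code, n_categories):
    """Expand a categorical code (1..n) into a set of binary variables."""
    return [1 if code == k else 0 for k in range(1, n_categories + 1)]

def treat(rows):
    ages = min_max([r["idade"] for r in rows])
    treated = []
    for r, age in zip(rows, ages):
        code = MARITAL_CODES[r["estado_civil"]]
        treated.append({
            "idade": age,                                  # normalized continuous value
            "estado_civil": code,                          # alphanumeric -> numeric
            "estado_civil_onehot": one_hot(code, len(MARITAL_CODES)),
            "ativ_prof": 1 if r["ativ_prof"] == "yes" else 0,
        })
    return treated

rows = [  # students 1 to 3 of the raw table (subset of columns)
    {"idade": 18.1, "estado_civil": "married with children", "ativ_prof": "yes"},
    {"idade": 19.5, "estado_civil": "self-sustained single", "ativ_prof": "yes"},
    {"idade": 20.2, "estado_civil": "married without children", "ativ_prof": "yes"},
]
treated = treat(rows)
for t in treated:
    print(t)
```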
13
Training data for independent and dependent variables
After treatment
id idade hist_abs atrasos_ativ comprom_renda estado_civil atrasos_pagto ativ_prof Full_time_empl grau_medio hist_evasao_ant evasao
1 0,02 0,03 0,22 0,3 3 0,078458744 1 0 0,355429859 1 0
2 0,98 0,06 0,87 0,5 1 1 1 1 1,002281665 0 0
3 0,62 0,04 0,49 0,8 2 0,631752326 1 0 0,655444708 0 0
4 0,2 0,11 0,18 0,2 5 0,237655146 1 0 0,23163432 0 1
5 0,06 0,01 0,21 0,2 4 0,061718283 0 1 0,066046073 1 1
6 0,1 0,1 0,12 0,4 1 0,163917902 0 1 0,153930913 1 0
7 0,06 0,12 0,92 0,4 3 0,101511923 1 0 0,096963852 0 1
8 0,23 0,08 0,17 0,3 2 0,279999741 0 0 0,275772049 0 0
9 0,14 0,07 0,65 0,5 1 0,194456576 1 0 0,196357348 0 0
10 0,02 0,05 0,81 0,3 3 0,084086631 1 1 0,078127698 0 1
11 0,62 0,12 0,03 0,6 2 0,67916316 1 0 0,643668876 0 0
12 0,54 0,05 0,75 0,3 3 0,601231087 1 1 0,594676548 1 1
13 0,1 0,01 0,67 0,4 4 0,141615822 0 1 0,150376481 0 1
14 0,07 0,11 0,38 0,3 1 0,127859037 1 1 0,110632927 0 0
15 0,41 0,1 0,97 0,5 3 0,451517517 1 1 0,410970845 1 1
16 0,88 0,08 0,03 0,2 2 0,919220948 1 1 0,894408902 1 0
17 0,53 0,12 0,5 0,3 3 0,539974118 0 0 0,574384909 0 0
18 0,17 0,06 0,59 0,5 5 0,188119498 1 1 0,203611975 0 1
19 0,97 0,06 0,78 0,7 2 1 0 1 0,985311297 1 0
20 0,11 0,12 0 0,1 4 0,165787515 0 0 0,129150122 0 0
21 0,53 0,08 0,57 0,6 1 0,553626156 0 1 0,538559781 0 0

14
Training/optimization cycle
(fitting criterion: minimize incorrect classifications in each category)

15
Removing the "black box" effect of conventional neural networks by clearly showing which
variables are most important in predicting whether a student will drop out
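
The slides do not say how the tool computes variable importance. One common, model-agnostic way to obtain such a ranking is permutation importance: shuffle one input column at a time and measure how much the accuracy drops. The sketch below uses that substitute technique, with a hypothetical stand-in model and synthetic data.

```python
import random

def accuracy(model, X, y):
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Average accuracy drop when each input column is shuffled in turn."""
    rng = random.Random(seed)
    base = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [x[col] for x in X]
            rng.shuffle(shuffled)
            X_perm = [x[:col] + [v] + x[col + 1:] for x, v in zip(X, shuffled)]
            drops.append(base - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

def toy_model(x):
    """Hypothetical stand-in for a trained ANN: it only looks at column 0."""
    return 1 if x[0] > 0.5 else 0

data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [toy_model(x) for x in X]
importances = permutation_importance(toy_model, X, y)
print("importance per column:", importances)
```

Column 1 gets an importance of exactly zero because the stand-in model ignores it, while shuffling column 0 visibly hurts accuracy.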

16
The predictive capacity is shown by the classification matrix
(in this case a binary classification yes/no)
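
For readers unfamiliar with the classification matrix, here is a minimal sketch of how one is built for a yes/no prediction, together with the per-category error rate that the fitting criterion aims to minimize. The actual and predicted labels below are made up for illustration.

```python
def classification_matrix(actual, predicted, labels=("yes", "no")):
    """Count how many cases of each actual category fall into each predicted one."""
    matrix = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

def error_rate_per_category(matrix):
    """Share of incorrectly classified cases within each actual category."""
    rates = {}
    for a, row in matrix.items():
        total = sum(row.values())
        rates[a] = (total - row[a]) / total if total else 0.0
    return rates

# Made-up labels, for illustration only.
actual = ["yes", "yes", "yes", "no", "no", "no", "no", "yes"]
predicted = ["yes", "no", "yes", "no", "no", "yes", "no", "yes"]

matrix = classification_matrix(actual, predicted)
rates = error_rate_per_category(matrix)
print(matrix)
print(rates)
```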

17
And graphically, in the ROC curve
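
A minimal sketch of how a ROC curve and the area under it (AUC) can be computed from predicted probabilities. The labels and scores are made up, and the function names are illustrative rather than part of any tool mentioned in this presentation.

```python
def roc_curve(labels, scores):
    """Return (false positive rate, true positive rate) points, one per threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only when the next score differs, so ties share a threshold.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up data: 1 = dropped out, 0 = stayed; scores are predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_curve(labels, scores)
area = auc(points)
print(points)
print(area)
```

An AUC of 1.0 means every dropout is ranked above every non-dropout; 0.5 is no better than chance.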

18
The trained/optimized ANN can then be saved as a run-time program

19
The ANN going live
the run-time (here embedded in a spreadsheet) can be used to predict dropouts
among a brand new set of students
idade hist_abs atrasos_ativ comprom_renda estado_civil atrasos_pagto ativ_prof Full_time_empl grau_medio hist_evasao_ant evasao Probabilidade
22,7 4 7 0,5 married without children 3 yes yes 6,6 yes
20,0 7 12 0,3 married with children 2 no yes 4,4 yes
23,3 4 4 0,8 self-sustained single 3 no yes 7,0 yes
19,1 11 14 0,8 family-supporting single 2 no no 3,8 yes
21,4 7 10 0,5 married without children 3 yes yes 7,0 yes
21,8 15 7 0,8 married without children 0 no yes 1,3 yes
19,2 8 16 0,1 family-supporting single 2 no yes 4,2 no
18,8 4 9 0,3 married without children 1 no no 2,6 no
23,2 5 3 0,5 self-sustained single 2 no yes 5,4 yes
20,7 2 3 0,5 family-supporting single 2 no no 6,1 no
21,6 6 13 0,8 married with children 2 yes no 4,5 no
22,8 7 9 0,4 self-sustained single 0 yes yes 0,6 no
22,7 12 14 0,3 self-sustained single 2 no no 6,0 yes
23,1 6 10 0,4 self-sustained single 4 no yes 9,5 yes
23,5 15 8 0,8 self-sustained single 3 yes no 6,0 yes
22,9 4 17 0,5 self-sustained single 3 yes no 8,6 no
21,5 8 8 0,6 divorced 3 yes no 7,1 no
19,6 4 13 0,7 divorced 3 no yes 7,1 no


20
The ANN going live
the run-time (here embedded in a spreadsheet) can be used to predict dropouts
among a brand new set of students
idade hist_abs atrasos_ativ comprom_renda estado_civil atrasos_pagto ativ_prof Full_time_empl grau_medio hist_evasao_ant evasao Probabilidade
22,7 4 7 0,5 married without children 3 yes yes 6,6 yes yes 83%
20,0 7 12 0,3 married with children 2 no yes 4,4 yes yes 75%
23,3 4 4 0,8 self-sustained single 3 no yes 7,0 yes no 69%
19,1 11 14 0,8 family-supporting single 2 no no 3,8 yes yes 95%
21,4 7 10 0,5 married without children 3 yes yes 7,0 yes yes 88%
21,8 15 7 0,8 married without children 0 no yes 1,3 yes yes 91%
19,2 8 16 0,1 family-supporting single 2 no yes 4,2 no yes 77%
18,8 4 9 0,3 married without children 1 no no 2,6 no no 65%
23,2 5 3 0,5 self-sustained single 2 no yes 5,4 yes no 71%
20,7 2 3 0,5 family-supporting single 2 no no 6,1 no no 99%
21,6 6 13 0,8 married with children 2 yes no 4,5 no no 87%
22,8 7 9 0,4 self-sustained single 0 yes yes 0,6 no no 79%
22,7 12 14 0,3 self-sustained single 2 no no 6,0 yes no 83%
23,1 6 10 0,4 self-sustained single 4 no yes 9,5 yes no 66%
23,5 15 8 0,8 self-sustained single 3 yes no 6,0 yes no 94%
22,9 4 17 0,5 self-sustained single 3 yes no 8,6 no no 81%
21,5 8 8 0,6 divorced 3 yes no 7,1 no no 77%
19,6 4 13 0,7 divorced 3 no yes 7,1 no yes 95%
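
One possible way to act on this output is to queue the students predicted to drop out, most certain first. The sketch below uses the first six rows of the table above. It assumes that Probabilidade is the confidence in the predicted class, and the 80% cut-off is a hypothetical choice.

```python
# (row number in the table above, predicted evasao, Probabilidade)
predictions = [
    (1, "yes", 0.83), (2, "yes", 0.75), (3, "no", 0.69),
    (4, "yes", 0.95), (5, "yes", 0.88), (6, "yes", 0.91),
]

def outreach_queue(preds, min_probability=0.80):
    """Keep predicted dropouts above the cut-off, highest probability first."""
    flagged = [p for p in preds if p[1] == "yes" and p[2] >= min_probability]
    return sorted(flagged, key=lambda p: p[2], reverse=True)

queue = outreach_queue(predictions)
for row, _, probability in queue:
    print(f"row {row}: predicted dropout, probability {probability:.0%}")
```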


21
Example #2: Predicting class or semester level
dropout rates
The aim is to predict the percentage of students who will drop out, per
class or semester. This information is extremely valuable for college
management decisions, such as merging classes to reduce
operational costs. The following variables are used for the ANN
training:
✓ Class average age;
✓ Class average no-show rate;
✓ Average delay in the delivery of academic work;
✓ % of students working in full-time jobs;
✓ Average delinquency (payment delay) per class;
✓ Class average household income;
✓ Observed dropout rates from previous classes/semesters
(the dependent variable to be predicted; all the others are independent).
22
Training data (after treatment)
Data with characteristics of previous classes, including the percentage of observed
dropout in each one
turma media_idade media_absent media_atrasos work_full_time media_atrasos_pagto renda_media evasao
1 0,391357735 0,224327733 0,031559613 0,65 0,0710441 0,65 0,094929
2 0,086396411 0,376082488 0,197229429 0,61 0,184621836 0,61 0,110505
3 0,222590767 0,0182181 0,424122254 1 0,136407118 1 0,026593
4 0,424650937 0,47683103 0,29455022 0,73 0,347669876 0,73 0,162028
5 0,26008554 0,415313765 0,151949199 1 0,040567801 1 0,079589
6 0,200690804 0,103362191 0,347555996 0,34 0,09359077 0,34 0,084091
7 0,391711319 0,01216924 0,158899712 0,64 0,285056424 0,64 0,080618
8 0,056802141 0,097777114 0,052861186 1 0,333237699 1 0,053901
9 0,021801039 0,031210808 0,30524425 0,39 0,071295191 0,39 0,052636
10 0,296386483 0,454613893 0,338115983 1 0,137535737 1 0,102392
11 0,009401441 0,184816639 0,038325055 1 0,378391576 1 0,073633
12 0,053784372 0,12947928 0,278906042 0,77 0,429201769 0,77 0,091338
13 0,013706347 0,017654537 0,311985974 0,99 0,415435019 0,99 0,052359
14 0,478642115 0,348962475 0,253159679 1 0,117140998 1 0,093197
15 0,393037013 0,297852492 0,081568328 0,63 0,191958368 0,63 0,11302
16 0,111522676 0,060749732 0,314134968 1 0,089440297 1 0,021935
17 0,259571915 0,147834363 0,075030586 0,78 0,276100968 0,78 0,082274
18 0,445643746 0,328701656 0,149073859 0,39 0,179925689 0,39 0,140658
19 0,276070718 0,02076518 0,142391687 0,85 0,008264426 0,85 0,021365
23
Optimizing the ANN
(fitting criterion: minimize RMSE)
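
RMSE (root mean squared error) is the square root of the average squared difference between observed and predicted values. A minimal sketch, using the observed dropout rates of classes 1 to 3 from the training table and made-up predictions:

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

observed = [0.094929, 0.110505, 0.026593]   # evasao of classes 1 to 3
predicted = [0.10, 0.10, 0.03]              # hypothetical model output

error = rmse(observed, predicted)
print(f"RMSE = {error:.4f}")
```

The lower the RMSE, the closer the predicted dropout rates are to the observed ones, in the same units as the rate itself.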

24
No black box: Relative importance of independent variables

25
ANN’s prediction ability shown by the scatter plot
(predicted x observed values)

26
The more generations of ANNs accumulate during the optimization
process, the better the predictive ability of the final model

27
Saving the optimized ANN as a run-time

28
The ANN going live
The run-time (here embedded in a spreadsheet) can be used to predict
the % of dropouts in new classes

turma media_idade media_absent media_atrasos work_full_time media_atrasos_pagto renda_media evasao_prevista


132 0,072729075 0,146112733 0,398559988 0,45 0,072499605 0,47
133 0,285524951 0,012172733 0,425478153 0,5 0,304030722 0,60
134 0,445643746 0,328701656 0,149073859 0,33 0,179925689 0,39
135 0,444504997 0,466400695 0,058332441 0,63 0,06290696 0,65
136 0,499918106 0,32385445 0,077462904 0,47 0,482240778 0,63
137 0,111522676 0,060749732 0,314134968 1 0,089440297 0,98
138 0,290065731 0,378206289 0,080910097 0,59 0,297637501 0,69
139 0,363490849 0,422159346 0,334174677 0,31 0,487182482 0,47
140 0,287135398 0,305797347 0,178503325 0,75 0,005172449 0,10
141 0,27245388 0,409999574 0,481773123 0,56 0,016737446 0,57
142 0,286915961 0,316967264 0,171663072 0,35 0,138264496 0,40
143 0,269712875 0,463694832 0,487361228 0,98 0,427602048 1,00
144 0,015463438 0,328328074 0,006803547 0,43 0,160759655 0,48


29
The ANN going live
The run-time (here embedded in a spreadsheet) can be used to predict
the % of dropouts in new classes

turma media_idade media_absent media_atrasos work_full_time media_atrasos_pagto renda_media evasao_prevista


132 0,072729075 0,146112733 0,398559988 0,45 0,072499605 0,47 6,4%
133 0,285524951 0,012172733 0,425478153 0,5 0,304030722 0,60 9,0%
134 0,445643746 0,328701656 0,149073859 0,33 0,179925689 0,39 18,1%
135 0,444504997 0,466400695 0,058332441 0,63 0,06290696 0,65 15,7%
136 0,499918106 0,32385445 0,077462904 0,47 0,482240778 0,63 21,4%
137 0,111522676 0,060749732 0,314134968 1 0,089440297 0,98 0,0%
138 0,290065731 0,378206289 0,080910097 0,59 0,297637501 0,69 15,4%
139 0,363490849 0,422159346 0,334174677 0,31 0,487182482 0,47 27,6%
140 0,287135398 0,305797347 0,178503325 0,75 0,005172449 0,10 20,3%
141 0,27245388 0,409999574 0,481773123 0,56 0,016737446 0,57 16,5%
142 0,286915961 0,316967264 0,171663072 0,35 0,138264496 0,40 14,6%
143 0,269712875 0,463694832 0,487361228 0,98 0,427602048 1,00 22,8%
144 0,015463438 0,328328074 0,006803547 0,43 0,160759655 0,48 8,4%


30
When predicting continuous variables, in addition to NS
Predictor, we can also use Chaos Hunter

31
Delivering similar results, and with the possibility of constructing
models based on linear regression, exponential, logarithmic, logistic,
polynomial, etc.

32
Example #3: Clustering potential dropout students
around retention strategies
For students who have been classified as probable dropouts
(example #1), the next step would be to group them around
retention strategies that have been successfully used in the past.
Basically, the purpose of the predictive model in this example is to
answer the question: For a given student profile, what is the best
retention strategy, among those that worked in the past (listed
below)?

1. On-line reinforcement classes;
2. Switch to a different time shift (e.g. to a night shift);
3. Offer of a partial or full scholarship;
4. Support in obtaining a state funded scholarship.

33
Training data for independent and dependent variables
After treatment
(using the same independent variables as in Example #1)

id idade hist_abs atrasos_ativ comprom_renda estado_civil atrasos_pagto ativ_prof Full_time_empl grau_medio hist_evasao_ant estrategia
1 0,02 0,03 0,22 0,3 3 0,078458744 1 0 0,355429859 1 2
2 0,98 0,06 0,87 0,5 1 1 1 1 1,002281665 0 4
3 0,62 0,04 0,49 0,8 2 0,631752326 1 0 0,655444708 0 4
4 0,2 0,11 0,18 0,2 5 0,237655146 1 0 0,23163432 0 1
5 0,06 0,01 0,21 0,2 4 0,061718283 0 1 0,066046073 1 1
6 0,1 0,1 0,12 0,4 1 0,163917902 0 1 0,153930913 1 2
7 0,06 0,12 0,92 0,4 3 0,101511923 1 0 0,096963852 0 1
8 0,23 0,08 0,17 0,3 2 0,279999741 0 0 0,275772049 0 1
9 0,14 0,07 0,65 0,5 1 0,194456576 1 0 0,196357348 0 2
10 0,02 0,05 0,81 0,3 3 0,084086631 1 1 0,078127698 0 1
11 0,62 0,12 0,03 0,6 2 0,67916316 1 0 0,643668876 0 4
12 0,54 0,05 0,75 0,3 3 0,601231087 1 1 0,594676548 1 3
14 0,07 0,11 0,38 0,3 1 0,127859037 1 1 0,110632927 0 2
15 0,41 0,1 0,97 0,5 3 0,451517517 1 1 0,410970845 1 3
16 0,88 0,08 0,03 0,2 2 0,919220948 1 1 0,894408902 1 4
17 0,53 0,12 0,5 0,3 3 0,539974118 0 0 0,574384909 0 2
18 0,17 0,06 0,59 0,5 5 0,188119498 1 1 0,203611975 0 2
20 0,11 0,12 0 0,1 4 0,165787515 0 0 0,129150122 0 1

34
Training/optimization cycle
(fitting criterion: minimize incorrect classifications in each category)

35
Relative importance of independent variables in the ANN prediction ability

36
The predictive capacity is shown by the classification matrix
(in this case, clustered in 4 categories)

37
And graphically, in the ROC curve

38
The ANN going live
the run-time can be used to group a brand new set of students around
retention strategies

id idade hist_abs atrasos_ativ comprom_renda estado_civil atrasos_pagto ativ_prof Full_time_empl grau_medio hist_evasao_ant evasao_prevista Probabilidade Estratégia de retenção
701 22,7 4 7 0,5 married without children 3 yes yes 6,6 yes yes 83% State funded scholarship
702 20,0 7 12 0,3 married with children 2 yes yes 4,4 yes yes 75% switch to a different shift
703 19,1 11 14 0,8 family-supporting single 2 no no 3,8 yes yes 95% On-line reinforcement classes
704 21,4 7 10 0,5 married without children 3 yes no 7,0 yes yes 88% State funded scholarship
705 21,8 15 7 0,8 married without children 0 yes yes 1,3 yes yes 91% On-line reinforcement classes
706 19,2 8 16 0,1 family-supporting single 2 yes yes 5,3 no yes 77% switch to a different shift
707 18,8 4 9 0,3 married without children 1 yes no 2,6 no yes 65% On-line reinforcement classes
708 19,6 4 13 0,7 divorced 3 no no 7,1 no yes 95% State funded scholarship


Note: the predicted dropout and Probabilidade columns (light blue background on the slide) are outputs from the ANN in Example #1

39
Our AI solutions

✓ Cutting-edge;

✓ Reliable and robust;

✓ Easy and fast to implement – quick payback;

✓ User-friendly.

40
IntelliSearch and Ward Systems

✓ Since 2004, IntelliSearch has been continuously accumulating
experience in the use of Ward Systems tools across several projects.
During this time we have developed additional methodologies and
software layers (interfaces, input and output data normalizers for
neural networks, seeding algorithms, etc.).
✓ Our long-standing cooperation with Ward Systems has also taught us
how to set the best optimization parameters (for training neural
networks), as well as the techniques for selecting training data
in each case.
✓ Our customers enjoy discounts on the standard Ward Systems
price list.

41
IntelliSearch and Ward Systems

✓ For more references, take a look at the Ward Systems website:
http://www.wardsystems.com/index.asp and
http://www.wardsystems.com/apptalk.asp (sections “Financial
applications” and “business predictions”)

42
Thanks for watching
• Contact us for more details on how to develop and
apply the ideas presented here

Av. das Nações Unidas, 12495 – 15º andar

04578-000 – São Paulo – SP – Brasil

+55 11 2844-1871

www.intellisearch.com.br