Shivam Bansal
National University of Singapore
Abstract
Conventional methods for credit risk assessment of SMEs (small and medium
enterprises) use historical credit information along with banking and transactional
data. Most financial institutions and banks rely solely on financial information
to evaluate the creditworthiness of business entities. One major downfall of
this approach is the cold start problem: it is almost impossible to evaluate new
business entities because of the lack of data. Relying only on static banking data also
means that a comprehensive profile of a business entity is never obtained, and a
lot of highly valuable non-banking data, especially data in unstructured form, is
missed. In this paper, we present a novel system that exploits non-banking
unstructured data available on the open web to improve conventional credit
scoring models. The system uses a generic data extraction module that mines
unstructured data at scale, and state-of-the-art deep learning and language
processing architectures such as context-free grammar parsers and Bidirectional
Encoder Representations from Transformers (BERT) to derive structured entities
and sentiments from the text. All the derived information is fed into a statistical
modelling architecture. We show that the use of additional non-banking data
improves the credit profiles of business entities and significantly improves
credit scoring model performance. The results also show an increased relative
feature importance of the derived features.
1. Introduction
Credit scoring has been a standard problem of data mining, data
science, and financial analytics for the past few decades. It has been regarded
as a core evaluation tool used by different institutions and has been widely
investigated in different areas, such as banking, accounting, and even
insurance. Different scoring techniques are being used in areas of
classification and prediction, where statistical techniques have conventionally
been used. A number of research and works have been done in the past that
are aimed to solve this problem. One of the promising results were shown by
Atiya A. when their team of researchers used neural networks to derive
accurate credit scores for consumers [Atiya, A. F. 2001]. They used the
banking data of customers to train the neural network models. In 2009,
Bellotti and Crook showed the use of support vector machines for credit
scoring and discovery of significant features [Bellotti, T., Crook, J. 2009].
Cramer in 2004 showed an excellent case study in which they identified the
cases where scoring bank loans that may go wrong [Cramer, 2004]. The
usage of data mining techniques such as classification, clustering, and
ensembling in the area of research has increased and become the dominant
area in the field.
possibly gets a better score as they provide richer data for assessing
creditworthiness.
In the paper “All data is credit data: Constituting the unbanked”, Rob Aitken
suggested that credit scoring should not be limited to banking data; a variety
of other alternative datasets can be used to supplement the scoring models
[Rob Aitken, 2017]. Relying only on banking data creates several issues and
challenges. One of them is that credit assessment becomes almost impossible
for businesses which do not have comprehensive banking data. This means that
a number of potential customers are often denied loans. And it is not just new
customers: there are instances in which even existing customers have relatively
few past transactions, so the credit scoring models give less accurate results.
For SMEs, financial accounts are not always reliable, since it is up to the owner
to withdraw or retain cash. There are other issues as well: for example, small
companies are affected by the good or bad financial status of their partners, so
monitoring an SME's counterparts is another way of scoring it. Additionally,
small businesses hold a major and growing share of the world economy, so
accurate SME scoring is a major concern. In a whitepaper from Moody's, the
authors discuss seven major challenges in the assessment of SMEs [Moody's 2016].
Past work has been done on alternative data to improve credit scoring models.
Alain Shema in 2019 presented a paper on effective credit scoring using limited
mobile phone data [Shema, 2019]. In other work, Bhattacharya, P. and their
team used network science and analytics for predicting loan defaults in
microfinance using behavioral sequences [Bhattacharya, P., Mehrotra, R., Tan,
T., and Phan, T.Q. (2018)]. In 2016, the same authors also used mobile data
for creditworthiness prediction in microfinance, using a spatio-network
approach to generate the alternative data. [Bhattacharya, P., and Phan, T.Q. (2016)]
In our proposed system, we focus on using unstructured data, mainly the
digital footprints of companies, as the main source of non-banking data. This
data includes meta-information about companies on different webpages, the
companies' own websites, and common information providers such as wiki
pages, Mashable, Bloomberg, Reuters, etc. Historical news data from multiple
known sources is also added as a mainstream data point, along with any other
information obtained from social media and common websites. In similar work,
researchers from the Central University of Finance and Economics, Beijing
used social media data to improve credit scoring, applying limited statistical
machine learning and natural language processing techniques [Yuejin Zhanga 2016].
All of these examples suggest that online information about a company serves
as a proxy for its digital presence. Even though a company may have a good
bank balance or past financial interactions, it may not have a good digital
reflection. Hence, adding such information to credit scoring models can
significantly improve the credit scores of companies. Another hypothesis is
that any relevant digital information that can be found by a human search can
be generated by a web crawling engine, processed by an NLP engine to extract
the core concepts, and then fed into a machine learning algorithm for risk
scoring. This pipeline can help banks understand a company from multiple
angles, as well as the industry as a whole.
Since most of this data is unstructured in nature and cannot be used directly
in statistical machine learning models, we used state-of-the-art natural
language processing and deep learning architectures to perform entity
extraction and text classification. To enrich the dataset, we derived two
types of sentiments associated with companies: entity-specific sentiment and
sentence-level overall sentiment. The details of these architectures are
shared in section 3. In the last part, a machine learning modelling
architecture is explained in which data from the banking side and the
non-banking side are combined, and different statistical models are trained to
give a final risk score. The key intuition is that this risk score is an
improved reflection of the creditworthiness of the company.
2. Literature Review
The major factors in a traditional credit scoring model include basic
information, repayment ability, life stability, credit record, guarantees, and
some other factors. During the 2008 recession, Herzenstein et al. found that
borrowers' financial strength and their effort when listing and publicizing a
loan are more important than demographic attributes for funding success. They
shared these empirical results in their paper titled “The democratization of
personal consumer loans? Determinants of success in online peer-to-peer
lending communities” [Herzenstein et al 2008]. Qiu et al. showed that personal
information, social capital, loan amount, acceptable maximum interest rate,
and the loan period set by borrowers are all significant factors in funding
success, in their paper “Effects of borrower defined conditions in the online
peer-to-peer lending market” (E-life: web-enabled convergence of commerce,
work, and social life) [Qiu et al., 2012]. Riza et al. used the Cox
proportional hazards regression technique to evaluate credit risk and measure
loan performance; they found that credit grade, debt-to-income ratio, FICO
score, and revolving line utilization play an important role in loan defaults,
in their paper “Evaluating credit risk and loan performance in online
Peer-to-Peer (P2P) lending” [Riza et al. 2015]. Everett analyzed the
relationship between social relationships, default risk, and interest rates in
online P2P lending, and concluded that there is a low default rate among group
members who have an actual social relationship on the mutual financing
platform, in the paper titled “Group membership, relationship banking and loan
default risk: the case of online social lending” [Everett 2010]. Collier et
al. made an empirical analysis of the financing behavior between members of
community groups on a P2P lending platform, and confirmed that by combining
individual reputation with the reputation of the group, the group members can
supervise each other, which effectively reduces adverse selection. [Collier et al., 2010]
1996; Malhotra & Malhotra, 2002; West, 2000], and genetic programming
models [Ong, Huang, & Tzeng, 2005]. From the computational results of Tam and
Kiang [Tam and Kiang 1992], the neural network is the most accurate in bank
failure prediction, followed by linear discriminant analysis, logistic
regression, decision trees, and k-nearest neighbor.
Recently, researchers have proposed hybrid data mining approaches in the
design of effective credit scoring models. Hsieh proposed a hybrid system
based on clustering and neural network techniques [Hsieh, 2005]; Lee and
Chen proposed a two-stage hybrid modeling procedure with artificial neural
networks and multivariate adaptive regression splines [Lee and Chen 2005];
Lee, Chiu, Lu, and Chen integrated backpropagation neural networks with the
traditional discriminant analysis approach [Lee, Chiu, Lu, and Chen 2002];
Chen and Huang presented work involving two interesting credit analysis
problems and resolved them by applying neural network and genetic algorithm
techniques. [Chen and Huang 2003]
The first component is used to mine the maximum possible data about SMEs
from the open web. The second component derives structured information
from the data. The role of the third component is to provide a scoring model
in which multiple predictive models are trained. The models are then framed
into a stacking architecture, which gives slightly improved accuracy. The
model results are also evaluated using a validation dataset of unseen data.
1. The total size of the dataset used was 10,000 rows and 16 columns, including
one ID column and one target column.
2. The dataset consists of 9 numerical features related to the bank and
financial data of the companies. These features were mainly related to past
amounts, current balances, average transactions, median spends, etc. A
snapshot of these columns is shown in figure 2:
figure 3: Distribution of Categorical Variables in the dataset
4. The target variable followed a slightly imbalanced distribution: more than
70% of companies were tagged as 0 and the remaining companies were tagged as
1. The target variable is an indication of creditworthiness: if the company
failed to pay the loan installment amount for one of the months in the first
12 months, it was tagged as 1; otherwise it was tagged as 0.
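The labelling rule above can be sketched as follows; the function and data layout are hypothetical illustrations, not the bank's actual schema:

```python
# Hypothetical sketch of the target definition: a company is tagged 1
# if it missed any loan installment in the first 12 months, else 0.

def label_company(installments_paid):
    """installments_paid: list of booleans for months 1..12,
    True if the installment for that month was paid."""
    first_year = installments_paid[:12]
    return 1 if any(not paid for paid in first_year) else 0

print(label_company([True] * 12))                # no missed payment -> 0
print(label_company([True, False] + [True] * 10))  # one miss -> 1
```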
The dataset was relatively clean and did not require preprocessing, because
it was the bank's production-ready data, already used for other services such
as personalization, recommendations, insights, and dashboards. To store the
entire data, a data lake architecture was also developed as part of the system.
5. The dataset was split into two sets: a training set and a validation set.
The training set was used for training the credit scoring model, and the
validation set was used to evaluate model performance on unseen data.
developed to clean the company names and fix issues such as HTML tags and
special characters, and to handle different text encodings.
From the unstructured text data, the system then extracted named entities,
specifically mentions of company names. The NLP engine was used for this task.
Additionally, all text objects in which the input company name was mentioned
were tagged separately. For these sentences, entity-wise sentiment analysis
was performed along with sentence-level sentiment, using state-of-the-art
deep learning architectures.
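As an illustration of the sentence-tagging step, a minimal sketch (assuming a naive regex sentence splitter rather than the actual NLP engine) might look like:

```python
import re

def sentences_mentioning(text, company):
    """Split raw text into sentences (naive split on ., !, ?)
    and keep only those that mention the given company name."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    pattern = re.compile(re.escape(company), re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]

doc = "Acme Ltd reported losses. The market was calm. Regulators fined Acme Ltd."
print(sentences_mentioning(doc, "Acme Ltd"))
```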
In the end, for 10,000 companies, a total of 56,341 webpages were extracted
from the open web. Of these, 19,653 were news links obtained from sources like
Bloomberg, Yahoo Finance, Reuters, Mashable, etc. Within these news links,
there were 4,943 instances where the input company name was mentioned. These
were the relevant sentences for gaining information about negative sentiment
and entity-specific sentiment. Entity-specific sentiment was classified as one
of four types of risk, which became a single feature. Figure 5 shows the data
lake architecture:
figure 5: Data Lake Architecture
3.2 Data Extractor Module
The banking data consists only of static details, and it is enriched with
non-banking attributes. The data extractor module is responsible for
performing automated searches on the open web and obtaining all relevant
information. It works as a pipeline in which the first step is to define an
input query. The input query is prepared by combining the company name and the
operating country in a single string. The linkfinder module performs automated
searches on multiple search engine websites such as Google, Bing, and Yahoo.
These searches are triggered programmatically using the Python programming
language. In the linkfinder module, different methods are incorporated to
avoid inaccuracies and discrepancies in the results. Figure 6 shows the
detailed pipeline that is part of the data extractor module.
figure 6: Workflow of Data Extraction Engine
First, the linkfinder module performs a Google search using the Custom Search
API, which is free to use. If relevant results are obtained, they are saved;
otherwise the engine uses the google-search package from Python to
programmatically search Google. It then also tries scraping Google carefully
and systematically using the requests library. Finally, if all the above steps
fail to generate the desired output, possibly due to blocking limits, a
Selenium wrapper (a web automation tool) is used to mimic a human search
process and obtain the results. The output is a list of URLs, which are saved
in local file storage.
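The fallback chain above can be sketched as follows. The strategy functions here are hypothetical stand-ins; the real system would plug in the Custom Search API call, the google-search package, requests-based scraping, and the Selenium wrapper in that order:

```python
# Sketch of the linkfinder fallback chain: try each search back-end in
# order and return the first non-empty list of URLs.

def build_query(company, country):
    """Combine company name and operating country into a single string."""
    return f"{company} {country}"

def run_link_finder(query, strategies):
    """Try each search strategy in order; return the first
    non-empty list of URLs, or an empty list if all fail."""
    for strategy in strategies:
        try:
            urls = strategy(query)
        except Exception:
            continue  # blocked or failed -> fall through to next method
        if urls:
            return urls
    return []

# Toy strategies standing in for the real search back-ends.
api_search = lambda q: []  # pretend the API quota ran out
package_search = lambda q: [f"https://example.com/{q.split()[0]}"]

print(run_link_finder(build_query("Acme", "Singapore"),
                      [api_search, package_search]))
```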
The next step in the data extractor pipeline is HTML extraction. In this step,
all the links obtained in the previous part are passed to a
link_request_module, which makes programmatic requests to each URL one by one
and obtains its HTML text. Depending on the URL, this module makes a GET or
POST request and stores the response in files. Finally, the HTML is parsed
using Python's BeautifulSoup library in a systematic, step-by-step manner.
This module does not obtain all the text from an HTML page; rather, it obtains
the main portion of the website. The main page extraction is performed by
checking specific HTML tags: some tags, such as body, div, and p, are given
high priority, while others, such as script, style, and title, are ignored.
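A minimal sketch of the tag-priority idea, using only Python's standard-library HTMLParser instead of BeautifulSoup to stay self-contained:

```python
from html.parser import HTMLParser

class MainTextExtractor(HTMLParser):
    """Keep visible text and drop script, style, and title content,
    mirroring the tag-priority rule described above."""
    IGNORE = {"script", "style", "title"}

    def __init__(self):
        super().__init__()
        self.depth_ignored = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.IGNORE:
            self.depth_ignored += 1

    def handle_endtag(self, tag):
        if tag in self.IGNORE and self.depth_ignored:
            self.depth_ignored -= 1

    def handle_data(self, data):
        if self.depth_ignored == 0 and data.strip():
            self.chunks.append(data.strip())

html = ("<html><head><title>skip me</title></head>"
        "<body><p>Acme posts record profit.</p>"
        "<script>var x=1;</script></body></html>")
parser = MainTextExtractor()
parser.feed(html)
print(" ".join(parser.chunks))  # -> Acme posts record profit.
```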
Most of the non-banking data obtained from the web, documents, and databases
is raw and unstructured in nature. Before using it for analysis or modelling
purposes, it is necessary to convert it into a structured form. In this
component of the system, a natural language processing engine is developed.
The NLP engine is used for text cleaning, entity extraction, and text
classification for sentence sentiment and entity sentiment.
a. Text Cleaning: The first step in the NLP engine is text cleaning. A
pipeline is developed to take raw text as input, perform several cleaning
techniques, and produce cleaned text as output. The key techniques are:
removal of HTML entities, removal of special characters, standardization of
text encodings, and removal of unwanted spaces, delimiters, and slang. All of
this noise is captured in raw form from the web during the mining stage. The
text cleaning task removes such noise and produces a cleaned dataset for
analysis and modelling.
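A minimal sketch of such a cleaning pipeline, built only from standard-library tools (the actual engine may differ in its rules):

```python
import html
import re
import unicodedata

def clean_text(raw):
    """Sketch of the cleaning steps listed above: unescape HTML
    entities, standardize the encoding, strip special characters,
    and collapse repeated whitespace."""
    text = html.unescape(raw)                     # &amp; -> &
    text = unicodedata.normalize("NFKC", text)    # standardize encodings
    text = re.sub(r"[^\w\s.,;:!?'-]", " ", text)  # drop special characters
    text = re.sub(r"\s+", " ", text).strip()      # unwanted spaces/delimiters
    return text

print(clean_text("Acme&amp;Co   reported \u00a0 profits!!!"))
```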
The primary goal of performing entity-wise sentiment analysis in our project
is to accurately identify whether the company (the entity extracted from the
text) is associated with any negative sentiment or risk terms. We used two
different approaches for the entity-wise and sentence-level sentiment scores:
one uses pure natural language processing, and the other uses novel deep
learning based architectures. The output of this module is not only a polarity
score for an overall sentence but also an entity-wise sentiment analysis; in
other words, the polarity of a sentence, or the attitude of the source actor
(company) towards the target entity with respect to a context (checked for
negative sentiment) in a sentence.
The text data obtained from news articles and blogs is tokenized into
paragraphs and then into sentences. We filtered for sentences containing at
least one company name obtained from entity extraction, and iterated over them
one by one. First, every input sentence is analysed and its subjective
information, such as subject, object, and verb, is identified and extracted.
This information is obtained by generating grammar trees based on dependency
grammar for every sentence; we used a mix of Stanford CoreNLP and spaCy for
this purpose. The Stanford typed dependencies representation is designed to
provide a simple description of the grammatical relationships in a sentence
that can easily be understood and effectively used by people without
linguistic expertise who want to extract textual relations. In particular,
rather than the phrase structure representations that have long dominated the
computational linguistics community, it represents all sentence relationships
uniformly as typed dependency relations.
For example: “Bell, based in Los Angeles, makes and distributes electronic,
computer and building products”. For this sentence, the Stanford
Dependencies (SD) representation is:
figure 7: Graphical representation of the Stanford
Dependencies for the sentence, Ref [Stanford, 30]
While parsing the graph, each word (as a leaf or root node) is analysed and,
based on the word itself, its part of speech, and its grammatical dependency
relation, it is categorised as part of the source actor, part of the target
actor, or part of the context/verb. The remaining words, which are not part of
the actor groups, are checked against a large bag of words, decided as
positive, negative, or neutral, and assigned a corresponding value (+1, -1,
0). The grammatical dependency relations are also used to decide a “sentiment
factor” for each word; this factor is used to intensify or negate the
calculated scores. For a subtree, each word's value is added and each factor
is multiplied to give an overall score. When all the subtrees are completely
parsed, the output is a subject, object, verb triplet with a sentiment score.
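The subtree scoring rule can be sketched as below; the lexicon values and relation factors are toy assumptions, not the actual bag of words:

```python
# Sketch of the subtree scoring rule: each non-actor word gets a
# polarity value (+1, -1, 0) from a lexicon, and each dependency
# relation contributes a multiplicative "sentiment factor"
# (e.g. negation flips the sign). Lexicon and factors are toy values.

LEXICON = {"fined": -1, "profit": 1, "growth": 1}
FACTORS = {"neg": -1.0, "amod": 1.5}  # negation flips, modifiers intensify

def subtree_score(words):
    """words: list of (token, dependency_relation) pairs.
    Score = (sum of word values) * (product of relation factors)."""
    total, factor = 0, 1.0
    for token, rel in words:
        total += LEXICON.get(token, 0)
        factor *= FACTORS.get(rel, 1.0)
    return total * factor

print(subtree_score([("profit", "dobj"), ("growth", "conj")]))  # 2.0
print(subtree_score([("profit", "dobj"), ("not", "neg")]))      # -1.0
```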
sentence, we also trained and fine-tuned state-of-the-art deep learning
architectures for text classification. Specifically, we focused on the
Bidirectional Encoder Representations from Transformers (BERT) architecture.
The BERT framework, a language representation model from Google AI, uses
pre-training and fine-tuning to create state-of-the-art NLP models. BERT uses
the concepts of Transformers and self-attention, which are explained below.
figure 8: Figure describing role of BERT based architecture for sentiment classification
are passed to the first encoder. These are then transformed and propagated
to the next encoder. The output from the last encoder in the encoder-stack is
passed to all the decoders in the decoder-stack.
The inner workings of self-attention are as follows. First, three key vectors
are created for each input in the first pass of the encoder: a query vector, a
key vector, and a value vector. These vectors are trained and updated during
the training process. Next, the self-attention for every word in the input is
calculated. For a phrase, scores are calculated for all the words in the
phrase with respect to a particular word; this score determines the importance
of the other words when encoding a certain word in the input sequence. The
score for the first word is calculated by taking the dot product of its query
vector (q1) with the key vectors (k1, k2, k3) of all the words. These scores
are then divided by the square root of the dimension of the key vector,
followed by normalization using the softmax. The normalized scores are
multiplied by the value vectors (v1, v2, v3), and the resultant vectors are
summed to arrive at the final vector (z1). This is the output of the
self-attention layer; it is then passed on to the feed-forward network as
input. So z1 is the self-attention vector for the first word of the input
sequence, and the vectors for the rest of the words are obtained similarly.
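The computation above can be sketched in NumPy as scaled dot-product self-attention (toy dimensions, single head, random vectors standing in for learned projections):

```python
import numpy as np

def self_attention(Q, K, V):
    """Scaled dot-product self-attention as described above:
    scores = Q.K^T / sqrt(d_k), softmax-normalized per row,
    then a weighted sum of the value vectors."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # q_i . k_j / sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                              # z_i = sum_j w_ij * v_j

# Three words, embedding dimension 4, toy random values.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))
Z = self_attention(Q, K, V)
print(Z.shape)  # one output vector z_i per input word
```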
BERT: Google released two variants of the model: BERT Base, with 12
Transformer layers and 110M total parameters, and BERT Large, with 24
Transformer layers and 340M total parameters. In our project, we trained BERT
Base. BERT uses a multi-layer bidirectional Transformer encoder, as its
self-attention layer performs self-attention in both directions. BERT achieves
bidirectionality by pre-training on two important tasks — Masked Language
Model and Next Sentence Prediction.
context. Unlike left-to-right language model pre-training, the MLM objective
allows the representation to fuse the left and the right context, which allows
us to pre-train a deep bidirectional Transformer.” The task then becomes to
predict these masked words.
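The masking step of the MLM objective can be sketched as follows (the standard BERT recipe masks about 15% of tokens; the 80/10/10 replacement details are omitted here for brevity):

```python
import random

def mask_tokens(tokens, mask_rate=0.15, seed=1):
    """Sketch of BERT's masked-language-model input preparation:
    randomly select ~15% of positions, replace them with [MASK],
    and keep the originals as prediction targets."""
    rng = random.Random(seed)
    masked, targets = list(tokens), {}
    for i in range(len(tokens)):
        if rng.random() < mask_rate:
            targets[i] = tokens[i]  # the word the model must predict
            masked[i] = "[MASK]"
    return masked, targets

masked, targets = mask_tokens(
    "the company reported strong quarterly profit".split())
print(masked, targets)
```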
In the last part of the system, a scoring pipeline is developed. Figure 11
shows the credit risk scoring architecture, which is built on top of the data
lake architecture, the NLP engine, and the data mining module. The scoring
model uses an ensemble modelling technique in which multiple base and meta
machine learning models are stacked together to give a final output.
figure 11: Complete workflow and process diagram for credit scoring
We used a stacking architecture for predictive modelling. All the base
features and NLP-based features are used in the classification model. The
models used as base learners are Logistic Regression, K-Nearest Neighbour
Classifier, Random Forest Classifier, and Extreme Gradient Boosting.
baseline predictor set. The level-1 models trained are Linear Regression,
Random Forest, and XGBoost. The predictions of the baseline level-1 models
then become features for level 2, where XGBoost is used as the meta-learner.
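A minimal sketch of this two-level stacking in scikit-learn, on synthetic data; XGBoost is swapped for sklearn's GradientBoostingClassifier here purely to keep the example self-contained:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the 14 banking + NLP features.
X, y = make_classification(n_samples=500, n_features=14, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Level-1 base learners; their out-of-fold predictions become
# the features for the level-2 meta-learner.
base = [("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier()),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0))]
stack = StackingClassifier(
    estimators=base,
    final_estimator=GradientBoostingClassifier(random_state=0))
stack.fit(X_train, y_train)
print(round(stack.score(X_val, y_val), 3))
```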
After the model, there is a prediction layer in which all the preprocessing
steps remain the same, but instead of the learning part, trained model weights
are used to make predictions. The final model predictions are generated and
delivered in the form of files, visualizations, and reports, along with their
interpretations. This is because ML models may act as black boxes, but in
problems like credit risk scoring we need to explain why a model is making
certain predictions and what the most important features are. Hence, partial
dependence plots, feature importance, and permutation importance are used to
provide interpretability.
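Permutation importance, one of the interpretability tools mentioned above, can be sketched as follows (synthetic data standing in for the real features):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=300, n_features=6,
                           n_informative=3, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Permutation importance: shuffle one feature at a time and measure
# how much the model's score drops.
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
for i, imp in enumerate(result.importances_mean):
    print(f"feature_{i}: {imp:.3f}")
```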
The final credit risk scoring also makes use of business rules and domain
knowledge, because a purely data-driven approach may not give the desired and
relevant results. Hence, an additional rule-based engine is also developed in
which hard-coded business rules are implemented.
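A hypothetical sketch of such a rule-based override layer; the rules and thresholds below are illustrative, not the actual business rules:

```python
# Sketch of a rule engine layered on top of the model score: hard-coded
# business rules can cap or adjust the purely data-driven output.
# Field names and thresholds are assumptions for illustration.

def apply_business_rules(model_score, profile):
    """model_score: probability of default from the stacked model.
    profile: dict of company attributes."""
    score = model_score
    if profile.get("months_of_history", 0) < 6:
        # thin-file companies never score better than 0.5
        score = max(score, 0.5)
    if profile.get("negative_news_flag"):
        # penalize confirmed negative news, capped at 1.0
        score = min(score + 0.2, 1.0)
    return score

print(apply_business_rules(0.10, {"months_of_history": 3}))       # 0.5
print(apply_business_rules(0.85, {"months_of_history": 24,
                                  "negative_news_flag": True}))   # 1.0
```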
4. Discussion
such new SMEs, there is not sufficient centralized data available. But that
does not mean they cannot avail themselves of credit: new-age alternative
credit scoring companies use other tangible factors, like the digital
footprint, to determine the creditworthiness of a new customer.
This provides benefits at both ends. By extending access to credit, SMEs that
are new to the credit and loan system can still obtain loans despite the lack
of credit scoring data in traditional channels. Banks and financial
institutions can also utilize alternative credit scoring to boost their
penetration in previously unexplored geographies, such as semi-urban and rural
areas, while keeping their risk minimal and checking fraud. At the core of the
alternative credit scoring companies' competencies are three key factors: the
ability, intent, and stability of the customer in repaying the loans advanced
on the basis of these innovative scoring systems.
Most of the data from which useful information can be obtained is present in
unstructured form. With such a heavy focus on structured databases, some would
view the traditional method of calculating a credit score as an outdated tool
in our technologically evolving world. The emergence of social media has
brought about a data revolution, and with it a trove of new information for
businesses to tap into. It is thought that by simply overlaying social data
onto traditional data, the context of financial decisions that on the surface
appear risky or irresponsible will give lenders more confidence when agreeing
to offer loans to borrowers. The technology already exists. News or social
media data can be incorporated to enrich credit scores by delivering deeper
insight into each customer as an individual instead of a number. Unlike
traditional credit scoring, it reveals recorded events and phrases that
businesses can analyse to discover what is going on “behind the scenes”
financially. It echoes the original methods employed by the merchant
associations and small credit bureaus, where part of the credit score looked
at personality and what was actually going on in a borrower's life. Social
data offers that same insight, minus the one-to-one visit, and brings
personalisation into the equation. Affordability assessments will become more
detailed and bespoke than ever before.
For example, lenders would be able to see if a borrower had recently needed
to invest in a new boiler, whether they currently shop at Lidl or Waitrose, if
they have just been on holiday, got engaged, or had a new baby. Social data
reveals these snapshots so businesses can draw their own, well-informed,
financial conclusions. For lenders, there has never been a better time to join
the data revolution and utilise thousands of data points, extracted from
digital footprints, to supplement traditional credit scoring.
Natural language processing techniques such as named entity extraction are
very useful for obtaining relevant information from unstructured data. Entity
extraction, which focuses on extracting semantically meaningful named entities
and their semantic classes from text, serves as an indispensable component for
several downstream natural language processing tasks such as relation
extraction and event extraction. Dependency trees also convey crucial
semantic-level information. These techniques can be applied to gain more
information about consumers before giving them a loan or before generating
their credit scores. We discussed how to utilize the structured information
conveyed by dependency trees and entity extraction, along with state-of-the-art
deep learning models. Such models also help to gain better insights from
unstructured data; for example, one can quickly identify the themes and
sentiments associated with a company from news or social media data. These
additional features can significantly improve the machine learning models and
at the same time give a more credible view of the company's credit score.
Machine learning techniques such as logistic regression, tree-based boosting,
and stacking architectures can be used to model the relationship between
credit scores and the features of SMEs obtained from banking data and
unstructured non-banking data. Through extensive experiments, we show that our
proposed system is a useful choice that can be used by banks and financial
institutions for credit scoring.
We discussed how valuable information can be obtained from unstructured data,
which can heavily impact the performance of credit scoring models. The use of
state-of-the-art deep learning and natural language processing architectures
can greatly help in improving credit scoring profiles. Unstructured data is
not only useful in credit scoring; there are several other implications of the
greater use of alternative data and modeling, such as fair lending and
discrimination, minimising default rates, setting new legal expectations, and
privacy.
The entire theme presented in this paper, i.e. the use of alternative
unstructured datasets for credit scoring of SMEs, can in fact be extended to
other use cases as well: for instance, risk assessment of customers on online
e-commerce websites, risk scoring of potential customers by an insurance
company before giving them loans, or measuring the healthcare risk associated
with patients in a hospital. All these examples share a similar theme and core
problem statement, and the system presented in this paper can be fine-tuned
and customized for each specific use case.
Bibliography
6. Bellotti, T., Crook, J.; Support vector machines for credit scoring and
discovery of significant features; 2009
7. Bhattacharya, P., Mehrotra, R., Tan, T., and Phan, T.Q.; Predicting Loan
Defaults in Microfinance using Behavioral Sequences; 2018
8. C. Everett, Group membership, relationship banking and loan default
risk: the case of online social lending
9. Chen, M. C., & Huang, S. H.; Credit scoring and rejected instances
reassigning through evolutionary computation techniques. Expert; 2003
10. Chen, S. Y., & Liu, X.; The contribution of data mining to
information science; 2004
11. Cramer, J. S.; Scoring bank loans that may go wrong: A case
study. Statistica Neerlandica; 2004
12. Desai, V. S., Crook, J. N., & Overstreet, G. A.; A comparison of
neural networks and linear scoring models in the credit union
environment; 1996
13. Henley, W. E., & Hand, D. J.; A k-nearest neighbor classifier for
assessing consumer credit risk; 1996
14. Henley, W. E.; Statistical aspects of credit scoring. Dissertation;
1995
15. Hsieh, N.-C.; Hybrid mining approach in the design of credit
scoring models; 2005
16. J. Qiu, Z. Lin, and B. Luo, Effects of borrower defined conditions
in the online peer-to-peer lending market.
17. J.E. Stiglitz, A. Weiss; Credit Rationing in Market with Imperfect
Information;
18. Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova,
BERT: Pre-training of Deep Bidirectional Transformers for Language
Understanding; 2018
19. Lipika Dey, Ishan Verma, Arpit Khurdiya, Sameera Bharadwaja
H.; A framework to integrate unstructured and structured data for
enterprise analytics
20. M. Herzenstein, R. Andrews, U. Dholakia, et al; The
democratization of personal consumer loans? Determinants of success in
online peer-to-peer lending communities, Working Paper; 2008
21. Malhotra, R., & Malhotra, D. K.; Differentiating between good
credits and bad credits using neuro-fuzzy systems; 2002
22. Moody’s Seven key challenges in assessing SME credit risk;
23. Ong, C.-S., Huang, J.-J., & Tzeng, G.-H.; Building credit scoring
models using genetic programming. Expert Systems with Applications;
2005
24. Riza Emekter, Yanbin Tu, Benjamas Jirasakuldech & Min Lu,
Evaluating credit risk and loan performance in online Peer-to-Peer (P2P)
lending; 2015
25. Rob Aitken; ‘All data is credit data’: Constituting the unbanked;
26. Sustersic, M., Mramor, D., Zupan J.; Consumer credit scoring
models with limited data; 2009
27. Tam, K. Y., & Kiang, M. Y.; Managerial applications of neural
networks: the case of bank failure prediction; 1992
28. Tan, T., Bhattacharya, P., and Phan, T.Q.; Credit-worthiness
Prediction in Microfinance using Mobile Data: A Spatio-network
Approach; 2016
29. Tobias Berg, Valentin Burg, Ana Gombović, Manju Puri; On the
Rise of the FinTechs—Credit Scoring using Digital Footprints
30. Vaswani, A., et al.; Attention Is All You Need; 2017