Application scoring models are used by lending institutions to evaluate the creditworthiness of potential clients applying for a credit product. The aim of a scoring model is to classify applicants into two groups: those who will not default and those who will default. Application scoring models take into account all relevant information about an applicant that is known at the application date and reported in the application form (i.e. demographic characteristics such as age, education and income, as well as employment, marital and accommodation status). Application scoring models are used in the retail and small business segments because they enable automation of the creditworthiness evaluation process and help make quick and objective credit decisions.
Out of the variety of methods for building scoring models, in this study we focus on two: the logit model approach and the divergence method. The logit model is a widely used parametric statistical model for a binary dependent variable and is supported by statistical tests for verifying the estimated parameters. The divergence method is a kind of optimisation method that is not supported by econometric theory or statistical testing.
The aim of the study is to show how a scoring model can be constructed. In section 2 we present two methods used for building scoring models: the logit model approach and the divergence method. Section 3 provides a detailed description of the data. In section 4 the dependencies between explanatory variables and their association with the dependent variable are examined. Sections 5 and 6 present the models constructed using the logit approach and the divergence method. In section 7 the resulting models are evaluated in terms of their predictive power. Section 8 concludes the report.
2. Theoretical background
The preliminary step in the model-building process is to collect an appropriate data set and to divide it into a base sample (used for model building) and a hold-out sample (used for model validation). An important aspect of scoring model building is the definition of the dependent variable. In most cases the dependent variable is binary and distinguishes between two groups of applicants, defaulted and non-defaulted; however, the definition of default may vary between models. Let us denote by
the dependent dummy variable which equals 0 for non-defaulted applicants and 1 for defaulted ones, and by the value of the variable