Welcome to Scribd!

Sapient Problem Statement

Uploaded by

0% found this document useful (0 votes)

23 views3 pages

This document outlines a challenge to predict loan repayment ability using machine learning models. Participants are given training and test datasets containing loan applicant information and repayment labels for the training set. The goal is to predict repayment probabilities for applicants in the test set. A scoring metric of ROC AUC will be used to evaluate predictions against true targets. Participants are asked to submit source code, an architectural diagram, and a report discussing exploratory data analysis and feature engineering. Scalability and fault tolerance must be considered in the proposed system architecture.

Original Description:

Sapient hackathon

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

23 views3 pages

Sapient Problem Statement

Uploaded by

somil dua

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 3

Search inside document

Financing the un-financed (Social financial lending)

Financing or loans are essential for any community, today Banks are desperate
to provide loans to individuals having good tracking financial/credit history. But
many people are not part of this system, they don't have trackable financial
history and thus Banks are reluctant to offer loans to them. And these people are
often exploited by local untrustworthy lenders

Idea at Sapient AI hub is to predict borrowers' repayment capability by using

other easily available dataset like telco and transactional information.

This challenge is to predict loan pay-ability using various statistical and

machine learning methods

Data is broken into two files for training (with TARGET) and testing (without
TARGET).
Static data for all applications. One row represents one loan in data sample.
Data Set  Download Data Set
File Name Description Format
application_train.csv Train Data Set csv
Sample
sample_submission.csv csv
Submission
application_test.csv Test Data Set csv

Data Dictionary
Here's a brief version of what you'll find in the data description file.
Variable Description
SK_ID_CURR applicant_id
probability of applicant paying
TARGET
back
Submission
Data pre-processing:

1. data in application_train.csv may also be corrupt (some of mandatory

fields are missing), filter out those records
2. Since this data was collected from multiple sources hence it may contain
duplicate records, please clean up duplicate records.
Designing and Architectural considerations:

1. Propose an end to end design from collecting uses' data to processing and
the finally accessing output in real time fashion.
2. This application is expected to process huge number of applications in
real time, hence think about scalability.
3. This application is also backbone of entire loan process, hence it's very
important to keep it up all time and think about fault-tolerance.
Expected output/submission:

 You have to predict weather a borrower's is capable of paying loan back.

Your code should generate output with 2 columns – applicant_id and
probability of applicant paying back.
SK_ID_CURR TARGET
100001 0.5
100005 0.5
100013 0.5
100028 0.5
100038 0.5
100042 0.5
100057 0.5
100065 0.5
100066 0.5
100067 0.5
100074 0.5
100090 0.5
100091 0.5
 Complete solution

This should contain the following:

1. A source file containing bug-free code

2. An architectural diagram proposing end to end solution
3. PDF file containing - EDA

Evaluation Metric
1. Your output file will be evaluated on area under the ROC curve between
the predicted probability and the observed target [45 % weightage]
2. 20 % marks will be awarded on code quality
3. 30 % marks will be awarded for feature engineering and EDA
4. 5% bonus marks will be awarded if problem is solved using any big data
technology (like pyspark)

Data Mining with Microsoft SQL Server 2008
From Everand
Data Mining with Microsoft SQL Server 2008
Jamie MacLennan
Rating: 4 out of 5 stars
4/5 (1)
Microsoft NAV Interview Questions: Unofficial Microsoft Navision Business Solution Certification Review
From Everand
Microsoft NAV Interview Questions: Unofficial Microsoft Navision Business Solution Certification Review
Equity Press
Rating: 1 out of 5 stars
1/5 (1)
Final Project Report - Kelompok 4
Document6 pages
Final Project Report - Kelompok 4
andonifikri
No ratings yet
Case Study - Churn Mdel Prediction
Document77 pages
Case Study - Churn Mdel Prediction
Krishna Kalyan
No ratings yet
Add in Bank
Document5 pages
Add in Bank
aayushkjaat324
No ratings yet
Machine Learning - Customer Segment Project. Approved by UDACITY
Document19 pages
Machine Learning - Customer Segment Project. Approved by UDACITY
Carlos Pimentel
100% (1)
OM v1.0
Document5 pages
OM v1.0
Annu Babu
No ratings yet
Tutorial 1 Project Management Concept
Document13 pages
Tutorial 1 Project Management Concept
JIA HUI NG
No ratings yet
Project Report - Lendingclub - FINAL
Document24 pages
Project Report - Lendingclub - FINAL
Amit
No ratings yet
SENG380: Software Process and Management Assignment 1 - Solutions
Document6 pages
SENG380: Software Process and Management Assignment 1 - Solutions
Afzal Hussain
No ratings yet
MN405 Data and Information Management
Document7 pages
MN405 Data and Information Management
Sambhav Jain
No ratings yet
Minor Project Final Documentation
Document40 pages
Minor Project Final Documentation
saswataroy
No ratings yet
Advanced Statistics - Project Report
Document14 pages
Advanced Statistics - Project Report
Rohan Kanungo
100% (5)
ITNPBD6 Assignment 2018-2 PDF
Document2 pages
ITNPBD6 Assignment 2018-2 PDF
Vaibhav Jain
No ratings yet
Personal Loan Campaign Final
Document12 pages
Personal Loan Campaign Final
Deephika S
No ratings yet
ODI Case Study Financial Data Model Transformation
Document40 pages
ODI Case Study Financial Data Model Transformation
Amit Sharma
100% (1)
Bank Alliance
Document18 pages
Bank Alliance
vishesh
No ratings yet
DM 1107infospheredataarchmodeling1 PDF
Document38 pages
DM 1107infospheredataarchmodeling1 PDF
marhendra135
No ratings yet
Lab Manual 01 CSE 324 Project Proposal
Document6 pages
Lab Manual 01 CSE 324 Project Proposal
Shimul Chandra Das
No ratings yet
Loan Campaign Modeling for Thera Bank
Document21 pages
Loan Campaign Modeling for Thera Bank
Deepali Kumar
No ratings yet
Data Mining Case Study PDF
Document21 pages
Data Mining Case Study PDF
Deepali Kumar
100% (1)
Project Proposal
Document9 pages
Project Proposal
Noman M Hasan
No ratings yet
Learneverythingai
Document14 pages
Learneverythingai
nasby18
No ratings yet
Atm System
Document13 pages
Atm System
Hannah Misganaw
No ratings yet
Zoyeb - SRSdoc CABLE
Document18 pages
Zoyeb - SRSdoc CABLE
Zoyeb Manasiya
No ratings yet
Day13-K-Means Clustering
Document10 pages
Day13-K-Means Clustering
SBS Movies
No ratings yet
Optimizing and Enhancing Performance of MVC Architecture Based On Data Clustering Technique
Document5 pages
Optimizing and Enhancing Performance of MVC Architecture Based On Data Clustering Technique
MdHafizurRahman
No ratings yet
Customer Churn Data - A Project Based On Logistic Regression
Document31 pages
Customer Churn Data - A Project Based On Logistic Regression
Shyam Kishore Tripathi
100% (12)
HCI ScorecardModel PPT
Document9 pages
HCI ScorecardModel PPT
Indira Muetia
No ratings yet
Lab No. 01 Title: Project Assignment, Proposal Preparation and Planning
Document6 pages
Lab No. 01 Title: Project Assignment, Proposal Preparation and Planning
Mufizul islam Nirob
No ratings yet
Draft of Final Report
Document23 pages
Draft of Final Report
vallimeenaavellaiyan
No ratings yet
Thesis Book Suhayb and Khadar
Document44 pages
Thesis Book Suhayb and Khadar
khadar jr films
No ratings yet
Project Management Tutorials
Document9 pages
Project Management Tutorials
LUDWIG CHEW
No ratings yet
Biometric System Project Estimation and Team Building
Document11 pages
Biometric System Project Estimation and Team Building
Kalyan K
No ratings yet
Blood Donation
Document31 pages
Blood Donation
Saniya Attar
No ratings yet
OperatorKeyFigure-SAP IBP
Document58 pages
OperatorKeyFigure-SAP IBP
harry4sap
No ratings yet
Project Report On ATM System
Document56 pages
Project Report On ATM System
Kumar Siva
72% (116)
Reliable Predictions With Limited Data White Paper Hydropower Feb 2022
Document8 pages
Reliable Predictions With Limited Data White Paper Hydropower Feb 2022
laap85
No ratings yet
Problem Statements:: Inferential Statistics
Document5 pages
Problem Statements:: Inferential Statistics
vinutha
No ratings yet
Ciwr 2introandpracticals
Document62 pages
Ciwr 2introandpracticals
Abhishek Das
No ratings yet
ATM Project Details
Document43 pages
ATM Project Details
Srini Vasan kp
No ratings yet
Inventory Management (0IC - C03) Part - 2
Document10 pages
Inventory Management (0IC - C03) Part - 2
Ashwin Kumar
No ratings yet
A.Manmadhakumar Reddy Mobile:-91+9581462390 Mail
Document5 pages
A.Manmadhakumar Reddy Mobile:-91+9581462390 Mail
Kishore Dammavalam
No ratings yet
Janani Prakash Loan Prediction Study
Document97 pages
Janani Prakash Loan Prediction Study
Janani Prakash
No ratings yet
Prepare SPMP and SRS documents for online banking system
Document10 pages
Prepare SPMP and SRS documents for online banking system
Dimo Official
No ratings yet
SM-BILL
Document38 pages
SM-BILL
Nandha Kumar
100% (5)
E Banking
Document33 pages
E Banking
Mayank Kumar
100% (3)
1 CESS3005 CSE112 Major Task Students
Document6 pages
1 CESS3005 CSE112 Major Task Students
Mohamed Alaa
No ratings yet
Broker Management Software
Document32 pages
Broker Management Software
vickram jain
No ratings yet
Crearea Unui Cub de Date Cu SQL Server2005
Document8 pages
Crearea Unui Cub de Date Cu SQL Server2005
Talida Talida
No ratings yet
Using MATLAB To Develop and Deploy Financial Models
Document71 pages
Using MATLAB To Develop and Deploy Financial Models
Daefnate Khan
No ratings yet
How to Answer Interview Questions About Your Java Project
Document12 pages
How to Answer Interview Questions About Your Java Project
Lav Kumar
No ratings yet
SAP Client Concept: Understanding Clients in SAP Systems
Document32 pages
SAP Client Concept: Understanding Clients in SAP Systems
Surut Shah
No ratings yet
Synopsis Sukesh
Document9 pages
Synopsis Sukesh
sukeshbhalke10
No ratings yet
Six Sigma
Document46 pages
Six Sigma
Hemant Chaudhary
No ratings yet
ML Use Cases Ebook
Document53 pages
ML Use Cases Ebook
Sliptnock Martinez
100% (2)
1 SSGB Amity Bsi Intro
Document20 pages
1 SSGB Amity Bsi Intro
HarshitaSingh
No ratings yet
Customer Exit
Document9 pages
Customer Exit
rajeshvaramana
No ratings yet
ML0101EN Clas Logistic Reg Churn Py v1
Document13 pages
ML0101EN Clas Logistic Reg Churn Py v1
banicx
100% (1)
Visual Basic 2010 Coding Briefs Data Access
From Everand
Visual Basic 2010 Coding Briefs Data Access
Kevin Hough
Rating: 5 out of 5 stars
5/5 (1)
Word 2021 Advanced Quick Reference
Document3 pages
Word 2021 Advanced Quick Reference
Davinia Lajay
No ratings yet
Citrus Computers - Company Brochure
Document4 pages
Citrus Computers - Company Brochure
tperricone
No ratings yet
Lecture 01
Document47 pages
Lecture 01
Zrin Zidane
No ratings yet
Web Technology Paper Mids
Document5 pages
Web Technology Paper Mids
Abdullah Bin Rauf
100% (1)
Errfree 803004 09.22
Document5 pages
Errfree 803004 09.22
Mariana Perez
No ratings yet
Estimation of Age and Gender Using Convolutional Neural Network
Document35 pages
Estimation of Age and Gender Using Convolutional Neural Network
Nagesh Singh Chauhan
100% (1)
AP Computer Science - Exploring Boolean Expressions and Conditional Statements
Document4 pages
AP Computer Science - Exploring Boolean Expressions and Conditional Statements
Anish Buddolla
No ratings yet
Chapter 2: Machine Translation: A. Computer Assisted Translation Tools
Document32 pages
Chapter 2: Machine Translation: A. Computer Assisted Translation Tools
Rinaldy Alidin
No ratings yet
ATS Heavy-duty Smart Card Reader Product Data Sheet
Document2 pages
ATS Heavy-duty Smart Card Reader Product Data Sheet
silviu_dj
No ratings yet
LabPlanningManual 02 en
Document12 pages
LabPlanningManual 02 en
Julio Alceram
No ratings yet
Video Content Metrics Benchmark Report PDF
Document11 pages
Video Content Metrics Benchmark Report PDF
Demand Metric
No ratings yet
Top devices and drivers in Windows system report
Document8 pages
Top devices and drivers in Windows system report
Q Sang Pangerandz
No ratings yet
01.performing Hexadecimal Conversions
Document11 pages
01.performing Hexadecimal Conversions
asegunlolu
No ratings yet
Extraction of SSVEP Signals of A Capacitive EEG Helmet For Human
Document4 pages
Extraction of SSVEP Signals of A Capacitive EEG Helmet For Human
sandramaratorresmuller
No ratings yet
Greedy Algorithm
Document20 pages
Greedy Algorithm
Mohammed
No ratings yet
RET - 521 - 2.3 Technical - Reference - Manual PDF
Document455 pages
RET - 521 - 2.3 Technical - Reference - Manual PDF
elo_elo_elo_elo
No ratings yet
Finalpayrollsmsdocu
Document127 pages
Finalpayrollsmsdocu
api-271517277
No ratings yet
Step 3 Vol 01 TCCS
Document191 pages
Step 3 Vol 01 TCCS
Hadi Karyanto
100% (2)
Ergonomic Design For People at Work PDF
Document2 pages
Ergonomic Design For People at Work PDF
Mark
No ratings yet
API Certification Exam Dates & Deadlines 2015
Document2 pages
API Certification Exam Dates & Deadlines 2015
slxanto
No ratings yet
Scopus Submission Paper
Document21 pages
Scopus Submission Paper
Dr. Tarun Kumar Singhal
No ratings yet
PHD Entrance Exam BIT Mesra
Document5 pages
PHD Entrance Exam BIT Mesra
Mohammad Mottahir Alam
No ratings yet
Dae Cit 3nd Year Outline Course Design by Fayaz Tareen
Document50 pages
Dae Cit 3nd Year Outline Course Design by Fayaz Tareen
Noman Asif
No ratings yet
Programming Languages
Document2 pages
Programming Languages
Regiene Divino
No ratings yet
What Is Carpet Area, Built-Up Area and Super Built-Up Area and How To Calculate IT
Document8 pages
What Is Carpet Area, Built-Up Area and Super Built-Up Area and How To Calculate IT
Apoorva
No ratings yet
Esportfinal
Document27 pages
Esportfinal
mks88750
No ratings yet
SE Tools and Practices Lab 2
Document7 pages
SE Tools and Practices Lab 2
Hiziki Tare
No ratings yet
Digital Taylorism
Document3 pages
Digital Taylorism
Nivaldo Silva
No ratings yet
Stretchable Wearable MRI Device
Document2 pages
Stretchable Wearable MRI Device
Folk Narongrit
No ratings yet
Button Delay$1.Smali Compare
Document5 pages
Button Delay$1.Smali Compare
user26453
No ratings yet