Case Study - Delinquency Prediction Of A Home Loan Company

  Mar 22, 2023 17:43:00  |    Joseph C V   #analytics #python #azureML

Company’s Profile and Objective

The client is a home loan company based in India, with branches in major cities and towns across several states. They believe in providing affordable home loans to middle-class families and in improving the quality of, and access to, housing schemes.

In its 25+ years of existence, the company has received various awards. They’re deeply involved in CSR activities such as the education of girls. They also run an NGO that provides shelter to distressed women and children, and they work to raise awareness of women’s health and hygiene, primarily among the underprivileged sections of the country.

Being one of the leading financial companies in the home loan segment, they deal with many delinquencies in repayments by their customers. The client wanted to predict, at the start of each month, which customers were likely to become delinquent so that they could follow up in advance with those who might skip the EMI.

Manual follow-ups made only after a payment had been missed were often in vain, and delayed payments sometimes disrupted the company’s business strategy.

So, they reached out to the Logesys team of data experts to tackle this problem.


Data Volume

As the company is spread across several locations in India, the input data was large and difficult to wrangle. The input we received was an Excel file containing nearly 100K records covering one year. With a dataset of this size, identifying delinquency probabilities was a challenging task.

Data issues

Null values, outliers, and spelling errors compromised the integrity of the records. For instance, “college” appeared as “clg” in several records of the education variable.
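
The kind of clean-up described above can be sketched in a few lines of Python. This is a minimal illustration rather than the project's actual code: the alias table, the sample values, the median fill, and the clipping bounds are all assumptions made for the example.

```python
from statistics import median

# Hypothetical mapping of abbreviated or misspelled labels to one canonical value.
EDUCATION_ALIASES = {"clg": "college", "collage": "college"}

def normalize_education(value):
    """Trim, lower-case, and map known aliases to a canonical label."""
    if value is None:
        return None
    cleaned = value.strip().lower()
    return EDUCATION_ALIASES.get(cleaned, cleaned)

def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def clip_outliers(values, lower, upper):
    """Cap every value to the [lower, upper] range."""
    return [min(max(v, lower), upper) for v in values]

# Made-up sample records, for illustration only.
education = [normalize_education(v) for v in ["College", "clg ", None, "school"]]
incomes = clip_outliers(fill_missing_with_median([30000, None, 45000, 9000000]), 0, 200000)

print(education)  # ['college', 'college', None, 'school']
print(incomes)    # [30000, 45000, 45000, 200000]
```

Whether to fill, clip, or drop a problematic record depends on the variable, so the choices here should be read as placeholders.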

Imbalanced Dataset

About 90% of the input records belonged to the non-delinquent class. With such data, a model learns mostly from borrowers who pay on time rather than from defaulters, so it learns too little about delinquency and its predictions skew towards the majority class.
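
A small, made-up example shows why this matters. With a 90/10 split, a model that never predicts delinquency still scores 90% accuracy while catching no defaulters at all. The labels below are synthetic and only mirror the proportion mentioned above.

```python
# Hypothetical labels: 1 = delinquent, 0 = paid on time (90% / 10% split).
labels = [0] * 90 + [1] * 10

# A degenerate model that always predicts "paid on time".
predictions = [0] * len(labels)

# Accuracy: share of records where the prediction matches the label.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Recall: share of actual delinquents the model managed to flag.
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / sum(labels)

print(accuracy)  # 0.9
print(recall)    # 0.0
```

This is why metrics such as recall on the delinquent class are usually more informative than accuracy for this type of problem.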


The client didn’t have any existing solution for predicting loan repayment probability. The solution designed by the Logesys team is described below.

We used binary logistic regression for this prediction use case.

SMOTE (Synthetic Minority Over-sampling Technique), a statistical technique, was used to add synthetic records for the minority class, that is, the delinquent cases. This made the imbalanced dataset more balanced for learning and training.
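
The core idea of SMOTE can be sketched with NumPy: each synthetic record is placed at a random point on the line between a minority-class record and one of its nearest minority-class neighbours. The function `smote_sketch`, its parameters, and the sample data are hypothetical and written for illustration; a real project would more likely rely on a library implementation such as `SMOTE` from imbalanced-learn.

```python
import numpy as np

def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    # Pairwise Euclidean distances between minority rows.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a row is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    synthetic = np.empty((n_new, minority.shape[1]))
    for i in range(n_new):
        base = rng.integers(len(minority))          # pick a minority row
        neighbour = neighbours[base, rng.integers(k)]  # pick one of its neighbours
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic[i] = minority[base] + gap * (minority[neighbour] - minority[base])
    return synthetic

# Made-up feature vectors for four delinquent customers.
delinquent = np.array([[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 1.8]])
new_rows = smote_sketch(delinquent, n_new=6)
print(new_rows.shape)  # (6, 2)
```

Oversampling of this kind is normally applied to the training split only, so that evaluation data still reflects the real class proportions.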

Both models, logistic regression and Lasso regression, were built in Python and run in Azure ML.
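
As a rough illustration of how the two ideas can combine, the sketch below fits a logistic regression with an L1 (Lasso-style) penalty using proximal gradient descent in NumPy. It assumes that "Lasso" here means an L1 penalty on the logistic model, which the case study does not confirm. The data is synthetic, and the function names and hyperparameters are chosen only for the example.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_l1_logistic(X, y, l1=0.01, lr=0.1, n_iter=2000):
    """Fit logistic regression with an L1 penalty by proximal gradient descent."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)
        # Gradient of the average log-loss with respect to weights and bias.
        grad_w = X.T @ (p - y) / n
        grad_b = np.mean(p - y)
        w = w - lr * grad_w
        # Soft-thresholding: the proximal step for the L1 penalty,
        # which pushes small weights to exactly zero.
        w = np.sign(w) * np.maximum(np.abs(w) - lr * l1, 0.0)
        b -= lr * grad_b  # the bias is left unpenalised
    return w, b

# Synthetic data: only the first of three features drives the outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = (2.0 * X[:, 0] + 0.5 * rng.normal(size=400) > 0).astype(float)

w, b = fit_l1_logistic(X, y, l1=0.05)
probabilities = sigmoid(X @ w + b)
accuracy = np.mean((probabilities >= 0.5) == (y == 1))
print(w, b, accuracy)
```

The L1 penalty tends to shrink the weights of uninformative features towards zero, which is one reason it is often paired with logistic regression when many candidate variables are available.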


The Logesys team completed this complex project in 3 months and delivered the desired results within the stipulated time.

The project proved highly effective for the client. The solution provides near-accurate predictions of delinquency probabilities for the upcoming payment cycle. This helped the client reach out in advance to the customers flagged by the model and nudge them towards on-time payment.

With timely reminders before the due date, the client avoided over half of the non-remittances and streamlined the follow-up process. Their delinquency rate dropped drastically over the next few months.
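
Turning predicted probabilities into a follow-up list could look like the sketch below. The customer IDs, scores, and the 0.5 threshold are invented for illustration; in practice the threshold would be tuned to the follow-up team's capacity and the cost of a missed payment.

```python
def follow_up_list(scores, threshold=0.5):
    """Return customer IDs whose predicted delinquency probability meets the
    threshold, ordered from highest to lowest risk."""
    flagged = [(cid, p) for cid, p in scores.items() if p >= threshold]
    flagged.sort(key=lambda item: item[1], reverse=True)
    return [cid for cid, _ in flagged]

# Hypothetical model output: customer ID -> predicted probability of delinquency.
scores = {"C001": 0.12, "C002": 0.81, "C003": 0.55, "C004": 0.47}

print(follow_up_list(scores))  # ['C002', 'C003']
```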

The client’s team appreciated our efforts and timely delivery and is satisfied with the project’s execution and outcome.