Using AI/ML to enhance the underwriting process

Regardless of the size of the institution, lending is at the core of any bank. In today’s low-interest-rate environment, increasing lending can be a driver of revenue growth. The key, however, is to increase lending while maintaining or even reducing default ratios. Extending credit selectively to thin- and no-file borrowers can be the answer to growing revenue and, more importantly, profits. Growing your lending portfolio requires solving one particularly hard problem: underwriting more customers.

Most underwriting processes currently in use are rules-based and might not necessarily identify complex relationships among the most important variables. For example, a rule might eliminate anyone with a DTI (debt-to-income) ratio of 50% or higher, even though the applicant’s absolute income might have supported the loan, or some of the debt might have been low-interest debt, circumstances that could have mitigated the risk. Traditional underwriting hasn’t really changed since the 1970s. This lack of innovation makes it challenging to identify additional creditworthy borrowers.
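To make the DTI example concrete, here is a minimal sketch of a single-variable rule in Python. The applicant figures, the `Applicant` class, and the function names are all hypothetical and chosen only for illustration; the 50% cutoff comes from the example above. The point is that the rule looks at one ratio and ignores how much income is left over after debt payments.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    monthly_income: float  # gross monthly income (made-up figures below)
    monthly_debt: float    # total monthly debt payments


DTI_CUTOFF = 0.50  # hard cutoff taken from the 50% example in the text


def dti(a: Applicant) -> float:
    """Debt-to-income ratio: monthly debt payments divided by monthly income."""
    return a.monthly_debt / a.monthly_income


def passes_dti_rule(a: Applicant) -> bool:
    """Single-variable rule: reject anyone whose DTI reaches the cutoff."""
    return dti(a) < DTI_CUTOFF


def residual_income(a: Applicant) -> float:
    """Income left over each month after debt payments."""
    return a.monthly_income - a.monthly_debt


# Two hypothetical applicants.
high_earner = Applicant(monthly_income=20_000, monthly_debt=10_400)   # DTI 52%
modest_earner = Applicant(monthly_income=3_000, monthly_debt=1_350)   # DTI 45%

for name, a in [("high_earner", high_earner), ("modest_earner", modest_earner)]:
    print(name, f"DTI={dti(a):.0%}", f"approved={passes_dti_rule(a)}",
          f"residual={residual_income(a):,.0f}")
```

In this made-up case the rule rejects the applicant with far more money left over each month and approves the one with less. A model that considers DTI together with absolute income could, in principle, weigh both.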

As many as 40% of Americans—including tens of millions of millennials—now have thin credit files or no credit file. These applicants—whether they will be good credit risks or not—are neglected because they haven’t amassed the extensive credit histories needed to fuel traditional underwriting models. In addition, many other applicants might not conform to the rules and ratios laid out by the underwriting process. Lastly, the underwriting rules might be letting unworthy borrowers slip through the cracks. Some borrowers might be meeting individual guidelines but, in the aggregate, they might not have the right profile.

This problem is even worse in many emerging markets because the data needed for traditional underwriting doesn’t exist in those markets. The result: Businesses are often reluctant to expand approvals to thin- and no-file borrowers, which in turn can restrict their growth to new markets.

The use of machine learning in the underwriting process enables lenders to identify high-risk borrowers that traditional underwriting misses while lending to unconventional, thin-file and no-file applicants.

How does Machine Learning lower default rates while increasing the borrower pool?

Scoring thin-file applicants effectively requires, not surprisingly, more data than is found in credit bureau files. In fact, there are thousands of pieces of information on applicants, both on the internet and in company internal databases. However, traditional underwriting is unable to handle much more than about 50 data points. How can that hurdle be overcome? Artificial intelligence and machine learning (AI/ML) can be the answer. AI/ML can help lenders confidently increase approval rates in these previously hard-to-score populations by using the same data more efficiently and by using more data to produce even better results.

But AI/ML is not a magic bullet. It’s quite difficult to move away from traditional underwriting methods. Upfront costs, in both time and money, can be prohibitive for acquiring and preparing the necessary data and building the supporting AI/ML infrastructure.

Even if we can obtain the data, there is a dearth of data scientists and data engineers who know the math and computer science that underpin AI/ML. As a result, it’s extremely hard to hire great, experienced talent.


One advantage of using an algorithmic and mechanistic solution in your underwriting is the minimization of bias. Being able to prove that there was no human intervention in the evaluation of creditworthiness can go a long way towards appeasing regulators. However, it is still possible to introduce systematic bias into models if one is not careful to identify variables that are correlated with disallowed features. For example, zip code might be highly correlated with an applicant’s race and for that reason should not be used as an input feature. Models can and must be tuned to identify these indirectly biased features.
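One simple way to screen a candidate feature for this kind of proxy effect is to ask how well the feature alone predicts the protected attribute. The sketch below is an assumption-laden illustration, not a regulatory test: the data is synthetic, the function names are hypothetical, and the measure is computed in-sample. It compares the accuracy of guessing the protected group from the feature value against the accuracy of always guessing the most common group.

```python
from collections import Counter, defaultdict


def baseline_accuracy(protected_values):
    """Accuracy of always guessing the most common protected group."""
    counts = Counter(protected_values)
    return counts.most_common(1)[0][1] / len(protected_values)


def proxy_accuracy(feature_values, protected_values):
    """Accuracy of guessing the protected group from the feature value alone,
    using the majority group within each feature value (in-sample)."""
    groups = defaultdict(list)
    for feature, protected in zip(feature_values, protected_values):
        groups[feature].append(protected)
    correct = sum(Counter(g).most_common(1)[0][1] for g in groups.values())
    return correct / len(protected_values)


# Synthetic data: two zip codes "A" and "B", two protected groups "x" and "y".
zip_codes = ["A", "A", "A", "A", "B", "B", "B", "B"]
group = ["x", "x", "x", "y", "y", "y", "y", "x"]

base = baseline_accuracy(group)            # no information from the feature
proxy = proxy_accuracy(zip_codes, group)   # information leaked by zip code
print(f"baseline={base:.2f} proxy={proxy:.2f} lift={proxy - base:.2f}")
```

A large gap between the two numbers suggests the feature carries information about the protected attribute and deserves closer review. A real review would use held-out data and whatever tests the applicable regulations call for.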


In addition, AI/ML models often function as “black boxes.” One can see the model’s output but can’t explain what drove that output.

This affects the lender and its regulators. The lack of transparency makes it hard for modelers to understand how to iterate and improve their models.

Even more importantly, the black box nature of AI/ML models makes it impossible for lenders to provide legally required information to applicants (such as adverse action notices) and to regulators (such as disparate impact reports).

There are techniques that allow you to overcome the “black box” nature of some AI/ML algorithms.
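One of the simplest such techniques applies to linear scoring models: each feature's contribution to the score is its weight times the applicant's deviation from a reference value, so the features that pulled the score down the most can be reported as reasons. The sketch below illustrates that idea only. The weights, baseline values, and feature names are hypothetical, and nothing here is meant to describe any specific production model.

```python
# Hypothetical weights of an already-trained linear scoring model (assumed values).
WEIGHTS = {"utilization": -2.0, "months_on_file": 0.03, "recent_inquiries": -0.4}

# Hypothetical population averages used as the reference point.
BASELINE = {"utilization": 0.30, "months_on_file": 60, "recent_inquiries": 1}


def contributions(applicant):
    """Per-feature contribution to the score, relative to the baseline applicant."""
    return {
        name: WEIGHTS[name] * (applicant[name] - BASELINE[name])
        for name in WEIGHTS
    }


def adverse_reasons(applicant, top_n=2):
    """Names of the features that lowered the score the most, worst first."""
    negative = [(name, c) for name, c in contributions(applicant).items() if c < 0]
    negative.sort(key=lambda item: item[1])  # most negative contribution first
    return [name for name, _ in negative[:top_n]]


# A made-up applicant with high utilization and a short credit history.
applicant = {"utilization": 0.90, "months_on_file": 12, "recent_inquiries": 2}
reasons = adverse_reasons(applicant)
print(reasons)
```

For non-linear models the same question (which inputs drove this decision?) needs more involved attribution methods, but the output has the same shape: a ranked list of features per applicant.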

We do two types of development in our lab. The first type is coming up with performant algorithms without regard to explainability. However, we know that these models are not very useful to deploy in production, so we also develop other models that have full explainability. These explainable models provide dynamic reports to support modeler iteration, and they allow the lender to provide adverse action notices to applicants in whatever format the lender chooses. Disparate impact reports, using the regulators’ approved recommendations and guidance, are then automatically generated.
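As a rough illustration of the kind of figure a disparate impact report might contain, the sketch below computes an adverse impact ratio: the approval rate of a protected group divided by the approval rate of a reference group. This is not a description of the reports mentioned above. The decisions are synthetic, the function names are hypothetical, and the 0.8 threshold is an assumption borrowed from the commonly cited "four-fifths" rule of thumb rather than a required test.

```python
def approval_rate(decisions):
    """Share of applications approved; decisions is a list of booleans."""
    return sum(decisions) / len(decisions)


def adverse_impact_ratio(protected_decisions, reference_decisions):
    """Approval rate of the protected group relative to the reference group."""
    return approval_rate(protected_decisions) / approval_rate(reference_decisions)


# Synthetic approval decisions (True = approved).
protected = [True, False, False, True, False]  # 2 of 5 approved
reference = [True, True, False, True, False]   # 3 of 5 approved

THRESHOLD = 0.8  # assumed rule-of-thumb cutoff, not a legal standard

ratio = adverse_impact_ratio(protected, reference)
flagged = ratio < THRESHOLD
print(f"ratio={ratio:.2f} flagged={flagged}")
```

A ratio well below 1 does not by itself establish unlawful discrimination; it signals that the model's decisions warrant further analysis.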


The platform we have developed can assist and enhance the underwriting process by scoring customers. Our tools allow lenders to acquire, onboard, and prepare massive amounts of disparate data for modeling. This data can come from internal and external sources. The data is then merged and cleansed, and finally we train, ensemble, and productionize models.
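The merge, cleanse, and ensemble steps can be pictured with a toy example. This is a sketch under stated assumptions, not the platform's implementation: the field names (`avg_balance`, `on_time_rent_payments`), the records, and the two stand-in scoring functions are all hypothetical, and the training step is omitted entirely.

```python
def merge_sources(internal, external):
    """Join two record sets on applicant id, keeping ids present in both."""
    return {
        app_id: {**internal[app_id], **external[app_id]}
        for app_id in internal.keys() & external.keys()
    }


def cleanse(records, required):
    """Drop records that are missing any required field."""
    return {
        app_id: rec
        for app_id, rec in records.items()
        if all(rec.get(field) is not None for field in required)
    }


def ensemble_score(record, models):
    """Unweighted average of the scores produced by several models."""
    return sum(model(record) for model in models) / len(models)


# Stand-in scoring functions; real models would be trained on historical data.
def model_a(rec):
    return 1.0 if rec["on_time_rent_payments"] >= 12 else 0.0


def model_b(rec):
    return min(rec["avg_balance"] / 5000, 1.0)


# Hypothetical internal and external records keyed by applicant id.
internal = {"a1": {"avg_balance": 2500}, "a2": {"avg_balance": None}, "a3": {"avg_balance": 900}}
external = {"a1": {"on_time_rent_payments": 24}, "a2": {"on_time_rent_payments": 6}}

merged = merge_sources(internal, external)                       # a3 has no external record
clean = cleanse(merged, ["avg_balance", "on_time_rent_payments"])  # a2 is missing a balance
scores = {app_id: ensemble_score(rec, [model_a, model_b]) for app_id, rec in clean.items()}
print(scores)
```

Dropping incomplete records is the bluntest possible cleansing choice and is used here only to keep the example short; imputing missing values is a common alternative.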

Machine learning facilitates the inclusion of hundreds of variables not used in traditional underwriting. With this additional data, the AI/ML models can produce more accurate credit decisions even for hard-to-score borrowers, allowing lenders to increase the number of approved borrowers while reducing the number of defaults.