Project · Machine Learning · Finance

Credit Risk Classifier

A live loan default predictor built on a Random Forest classification model and deployed via Django. It takes eight loan-level inputs (term length, loan amount, business location type, new vs. existing company, borrower state, borrower city, employee count, and issuing bank state) and returns a probability of default.

The model was iteratively trained and pruned on a dataset provided by Dr. Brent Albrecht. Note that the dataset is historical and most likely does not reflect current lending conditions.

Run the Model

How It Works

Credit Risk Modeling

Credit risk modeling predicts whether a loan will default from variables describing the loan itself. Data science approaches allow banks to utilise unbiased, probabilistic methods to evaluate potential loan outcomes at scale. Learn more →

Random Forest Classifier

A pre-trained Random Forest model is loaded at runtime via pickle. The model was iteratively trained and pruned down to the eight features that gave the best predictive power on held-out SBA loan data while limiting overfitting.
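As a rough illustration of the loading step, the sketch below unpickles a model once per process and reads the default probability from it. The file path, the function names, and the assumption that label 1 means "default" are all hypothetical; the project's actual code may differ.

```python
import pickle
from functools import lru_cache
from pathlib import Path

# Hypothetical location of the serialized model; the real path is not given.
MODEL_PATH = Path("model/random_forest.pkl")


@lru_cache(maxsize=None)
def load_model(path=MODEL_PATH):
    """Unpickle the model once per process and reuse it across requests."""
    # Only unpickle files you created yourself: pickle can run arbitrary code.
    with open(path, "rb") as fh:
        return pickle.load(fh)


def default_probability(model, feature_row):
    """Return P(default) for one already-encoded feature row."""
    # scikit-learn classifiers expose predict_proba and classes_.
    # Assumption: the positive label 1 means "default".
    proba = model.predict_proba([feature_row])[0]
    positive_index = list(model.classes_).index(1)
    return float(proba[positive_index])
```

Caching the loaded object avoids re-reading the pickle on every request, which matters when the model is served from a web view.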

Eight Input Features

Term length, loan amount, business location type (urban/rural), new vs. existing company, borrower state, borrower city, employee count, and issuing bank state. One-hot encoding is aligned against the production training set at inference time.
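One way to picture the alignment step: each incoming request is one-hot encoded and then laid out in exactly the column order the model saw during training. The sketch below is a dependency-free approximation; the column names and the `encode_row` helper are hypothetical and only mimic the `<field>_<category>` naming that `pandas.get_dummies` produces by default.

```python
def encode_row(raw, training_columns, categorical):
    """Build one feature vector in the training column order."""
    values = {}
    for name, value in raw.items():
        if name in categorical:
            # One-hot: set the "<field>_<category>" column to 1.
            values[f"{name}_{value}"] = 1
        else:
            values[name] = value
    # Columns absent from the request become 0; categories that were
    # never seen in training have no column and are silently dropped.
    return [values.get(col, 0) for col in training_columns]


# Hypothetical subset of training columns, for illustration only.
TRAINING_COLUMNS = [
    "Term", "NoEmp", "State_CA", "State_TX", "UrbanRural_1", "UrbanRural_2",
]

row = encode_row(
    {"Term": 84, "NoEmp": 12, "State": "TX", "UrbanRural": 1},
    TRAINING_COLUMNS,
    categorical={"State", "UrbanRural"},
)
print(row)  # [84, 12, 0, 1, 1, 0]
```

Aligning to the training columns keeps the vector length and order fixed, so an unseen state or city yields all zeros for that field rather than a shape mismatch at prediction time.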

Full Report

The full methodology write-up — covering data preparation, feature selection, model training, and evaluation — is available on Google Drive.

View Report →