Project · Machine Learning · Finance
Credit Risk Classifier
A live loan default predictor built on a Random Forest classification model and deployed via Django. It takes eight loan-level inputs (term length, loan amount, business location type, new vs. existing company, borrower state, borrower city, employee count, and issuing bank state) and returns a probability of default.
The model was iteratively trained and pruned on a dataset provided by Dr. Brent Albrecht. Worth noting: the dataset is historical and most likely does not reflect current lending conditions.
How It Works
Credit Risk Modeling
Credit risk modeling predicts whether a loan will default from variables describing the loan and the borrowing business. Data science approaches let banks apply consistent, probabilistic methods to evaluate potential loan outcomes at scale. Learn more →
Random Forest Classifier
A pre-trained Random Forest model is loaded at runtime via pickle. During development, the model was iteratively trained and pruned to eight features, chosen to maximise predictive power on held-out SBA loan data while minimising overfitting.
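A minimal sketch of the load-and-score step described above. It assumes the pickle file holds a fitted scikit-learn-style classifier exposing `predict_proba` and `classes_`, and that defaults are labelled 1. The function names, file path argument, and label convention are hypothetical and not taken from the project. The model is cached so it is unpickled once per process rather than on every request.

```python
import pickle
from functools import lru_cache


@lru_cache(maxsize=1)
def load_model(path):
    """Unpickle the model once and reuse it for later calls."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def default_probability(model, features, default_label=1):
    """Return the predicted probability of the default class for one loan.

    `features` is a single, already-encoded feature row (a list of numbers).
    `default_label` is an assumption about how defaults were labelled.
    """
    # predict_proba returns one row of class probabilities per input row,
    # with columns ordered as in model.classes_.
    proba = model.predict_proba([features])[0]
    column = list(model.classes_).index(default_label)
    return float(proba[column])
```

Only unpickle files from a trusted source, since `pickle.load` can execute arbitrary code while loading.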
Eight Input Features
Term length, loan amount, business location type (urban/rural), new vs. existing company, borrower state, borrower city, employee count, and issuing bank state. At inference time, categorical inputs are one-hot encoded and aligned to the columns of the production training set, so the model receives the same feature layout it was trained on.
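The alignment step can be illustrated with a small, dependency-free sketch. The field names and column layout below are hypothetical, chosen only for illustration, and the project's actual implementation may differ.

```python
def one_hot_align(record, categorical, training_columns):
    """Encode one input record and align it to the training column order.

    record: dict of raw input values, e.g. {"Term": 84, "State": "UT"}
    categorical: names of the fields to one-hot encode
    training_columns: column names the model was trained on, in order
    """
    encoded = {}
    for name, value in record.items():
        if name in categorical:
            # Same "<field>_<value>" naming that pandas.get_dummies produces.
            encoded[f"{name}_{value}"] = 1
        else:
            encoded[name] = value
    # Columns missing from the input become 0; categories the model never
    # saw during training are dropped because they have no column.
    return [encoded.get(col, 0) for col in training_columns]


# Hypothetical column layout, for illustration only.
columns = ["Term", "NoEmp", "State_CA", "State_UT", "BankState_UT"]
row = one_hot_align(
    {"Term": 84, "NoEmp": 5, "State": "UT", "BankState": "NV"},
    categorical={"State", "BankState"},
    training_columns=columns,
)
print(row)  # [84, 5, 0, 1, 0]
```

With pandas, the same effect is usually achieved with `pandas.get_dummies` followed by `DataFrame.reindex(columns=training_columns, fill_value=0)`.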
Full Report
The full methodology write-up — covering data preparation, feature selection, model training, and evaluation — is available on Google Drive.
View Report →