Credit Risk Modeling with Python (PD, LGD, EAD)

Credit Risk Modeling using Machine Learning with Python
This project applies a full credit risk modeling pipeline to a large consumer loan dataset from Lending Club, using supervised machine learning to estimate PD, LGD, EAD, and the resulting expected loss.
Models Built:
- PD (Probability of Default) – using logistic regression
- LGD (Loss Given Default) – two-stage model
- EAD (Exposure at Default) – linear regression
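As a rough illustration of what a two-stage LGD model can look like, the sketch below uses one common setup: a logistic regression that predicts whether anything is recovered, followed by a linear regression on the recovery rate for loans with a positive recovery. This is an assumption about the general technique, not a copy of the project's notebooks. The data is synthetic and the names `has_recovery`, `recovery_rate`, and the three features are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for defaulted loans: 500 rows, 3 hypothetical features.
X = rng.normal(size=(500, 3))

# Some loans recover nothing; the rest recover a positive fraction.
has_recovery = (X[:, 0] + rng.normal(scale=0.5, size=500)) > 0
recovery_rate = np.where(
    has_recovery,
    np.clip(0.4 + 0.1 * X[:, 1] + rng.normal(scale=0.05, size=500), 0.01, 1.0),
    0.0,
)

# Stage 1: classify whether any amount is recovered.
stage1 = LogisticRegression().fit(X, has_recovery.astype(int))

# Stage 2: regress the recovery rate, using only loans with a positive recovery.
stage2 = LinearRegression().fit(X[has_recovery], recovery_rate[has_recovery])

# Combine: expected recovery = P(recovery > 0) * predicted rate, kept in [0, 1].
p_recovery = stage1.predict_proba(X)[:, 1]
predicted_recovery = np.clip(p_recovery * stage2.predict(X), 0.0, 1.0)

# LGD is the share of the exposure that is not recovered.
lgd = 1.0 - predicted_recovery
```

Multiplying by the stage 1 probability is one design choice; another variant multiplies by the hard 0/1 class prediction instead.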
A FICO-style scorecard was also developed to summarize risk scores for each loan.
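One simple way to build such a scorecard is to rescale the PD model's log-odds of non-default linearly onto a fixed score range. The sketch below assumes a 300 to 850 range and hypothetical minimum and maximum log-odds; the function names are illustrative and the project's actual scaling may differ.

```python
import numpy as np

# Assumed FICO-like score range; not taken from the project.
MIN_SCORE, MAX_SCORE = 300, 850


def log_odds_to_score(log_odds, min_log_odds, max_log_odds):
    """Linearly rescale log-odds of non-default onto [MIN_SCORE, MAX_SCORE]."""
    log_odds = np.asarray(log_odds, dtype=float)
    fraction = (log_odds - min_log_odds) / (max_log_odds - min_log_odds)
    return np.round(MIN_SCORE + fraction * (MAX_SCORE - MIN_SCORE))


def score_to_pd(score, min_log_odds, max_log_odds):
    """Invert the scaling, then apply the logistic function to recover PD."""
    score = np.asarray(score, dtype=float)
    fraction = (score - MIN_SCORE) / (MAX_SCORE - MIN_SCORE)
    log_odds_good = min_log_odds + fraction * (max_log_odds - min_log_odds)
    prob_good = 1.0 / (1.0 + np.exp(-log_odds_good))
    return 1.0 - prob_good


# Hypothetical log-odds for three loans, with the model's extremes at -2 and 3.
scores = log_odds_to_score([-2.0, 0.0, 3.0], min_log_odds=-2.0, max_log_odds=3.0)
pd_at_520 = score_to_pd(520, min_log_odds=-2.0, max_log_odds=3.0)
```

Because the mapping is linear in log-odds, a score can always be converted back to a PD, which keeps the scorecard consistent with the underlying logistic regression.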
📊 Key Features:
- Over 800,000 loans cleaned and processed
- Missing values imputed using correlated features
- Log transformation for skewed variables
- PSI analysis to monitor data drift
- Expected Loss (EL) calculated using PD × LGD × EAD
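The expected loss formula above can be applied per loan and then summed over the portfolio. The following is a minimal sketch with made-up numbers; the column names `pd`, `lgd`, and `ead` are hypothetical and stand in for the outputs of the three models.

```python
import pandas as pd

# Made-up model outputs for three loans (illustration only).
loans = pd.DataFrame(
    {
        "pd": [0.02, 0.10, 0.25],             # probability of default
        "lgd": [0.40, 0.60, 0.80],            # share of exposure lost on default
        "ead": [10_000.0, 5_000.0, 2_000.0],  # exposure at default, in currency units
    }
)

# Loan-level expected loss: EL = PD * LGD * EAD.
loans["expected_loss"] = loans["pd"] * loans["lgd"] * loans["ead"]

# Portfolio totals: absolute expected loss and expected loss as a share of exposure.
portfolio_el = loans["expected_loss"].sum()
portfolio_el_ratio = portfolio_el / loans["ead"].sum()
```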
Tools Used:
- Python (pandas, scikit-learn, matplotlib, plotly)
- Jupyter Notebooks for analysis
- Flask (optional) for model deployment
Business Value:
- Help lenders evaluate borrower risk
- Automate monitoring of model stability
- Provide a scorecard interpretable by business teams
Notebooks Included:
- General preprocessing
- PD model data prep + training
- Scorecard development
- PSI monitoring
- LGD & EAD modeling with expected loss
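For readers unfamiliar with the PSI monitoring step, the sketch below shows one standard way to compute a Population Stability Index between a reference sample and a newer sample of the same variable. It assumes decile bins taken from the reference sample and a small floor on bin shares to avoid dividing by zero; the function name and parameters are illustrative, not the project's.

```python
import numpy as np


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI = sum over bins of (actual% - expected%) * ln(actual% / expected%)."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)

    # Interior bin edges from quantiles of the reference (expected) sample;
    # the first and last bins are open-ended.
    inner_edges = np.quantile(expected, np.linspace(0, 1, n_bins + 1))[1:-1]

    def bin_shares(values):
        bin_index = np.searchsorted(inner_edges, values, side="right")
        counts = np.bincount(bin_index, minlength=n_bins)
        # Floor the shares so empty bins do not produce log(0) or division by zero.
        return np.clip(counts / values.size, eps, None)

    expected_share = bin_shares(expected)
    actual_share = bin_shares(actual)
    return float(np.sum((actual_share - expected_share) * np.log(actual_share / expected_share)))


rng = np.random.default_rng(42)
reference = rng.normal(loc=0.0, scale=1.0, size=10_000)  # e.g. training-period values
shifted = rng.normal(loc=1.0, scale=1.0, size=10_000)    # e.g. a drifted later period

psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, shifted)
```

A commonly quoted rule of thumb treats PSI below 0.1 as stable, 0.1 to 0.25 as a moderate shift, and above 0.25 as a significant shift, though the thresholds used in practice vary.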
View notebook on GitHub:
🔗 Expected Loss Estimation and Credit Risk Analysis
Full project repo:
🔗 GitHub Repository
Data + models:
🔗 Google Drive Folder