Credit Risk Modeling with Python (PD, LGD, EAD)

Credit Risk Modeling using Machine Learning with Python

This project explores a large consumer loan dataset from Lending Club and applies a full risk modeling pipeline using supervised machine learning techniques.

Models Built:

  • PD (Probability of Default) – using logistic regression
  • LGD (Loss Given Default) – using a two-stage model
  • EAD (Exposure at Default) – using linear regression
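
As a rough illustration of what a two-stage LGD model can look like, the sketch below combines the outputs of two hypothetical stages: a classifier that estimates the probability of any recovery, and a regressor that estimates the recovery rate when a recovery occurs. This structure is an assumption based on common practice, not a description of the notebooks' exact method, and the function and argument names are made up for the example.

```python
def two_stage_lgd(p_recovery_positive, recovery_rate_if_positive):
    """Combine two stage outputs into a single LGD estimate.

    p_recovery_positive: stage-1 output, probability that recovery > 0
                         (hypothetical, e.g. from a logistic regression).
    recovery_rate_if_positive: stage-2 output, predicted recovery rate
                         given recovery > 0 (hypothetical, e.g. from a
                         linear regression).
    """
    # A linear model can predict outside [0, 1], so clip the rate first.
    rate = min(max(recovery_rate_if_positive, 0.0), 1.0)
    # Expected recovery rate = P(recovery > 0) * E[rate | recovery > 0].
    expected_recovery = p_recovery_positive * rate
    # LGD is the share of exposure that is not recovered.
    return 1.0 - expected_recovery


# Illustrative inputs only, not taken from the project data.
lgd = two_stage_lgd(p_recovery_positive=0.8, recovery_rate_if_positive=0.25)
print(round(lgd, 4))
```

Splitting the problem this way is often motivated by recovery rates having a large mass at exactly zero, which a single linear regression handles poorly.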

A FICO-style scorecard was also developed to summarize risk scores for each loan.
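
One common way to turn a PD estimate into a scorecard-style number is "points to double the odds" (PDO) scaling. The sketch below shows that convention under assumed parameters; the base score, base odds, and PDO values are hypothetical and are not the ones used in this project, which may scale scores differently.

```python
import math


def pd_to_score(pd, base_score=600.0, base_odds=50.0, pdo=20.0):
    """Map a probability of default to a score using PDO scaling.

    base_score, base_odds and pdo are illustrative assumptions:
    a loan with good:bad odds of base_odds gets base_score, and the
    score rises by pdo points each time the odds double.
    """
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    odds_good = (1.0 - pd) / pd  # odds of not defaulting
    return offset + factor * math.log(odds_good)


# Lower PD gives a higher score.
for pd in (0.20, 0.05, 0.01):
    print(pd, round(pd_to_score(pd)))
```

Because the score is linear in log-odds, each coefficient of a logistic regression PD model can be converted to points per feature category, which is what makes the scorecard readable for non-technical teams.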

📊 Key Features:

  • Over 800,000 loans cleaned and processed
  • Missing value treatment using correlated features
  • Log transformation for skewed variables
  • PSI analysis to monitor data drift
  • Expected Loss (EL) calculated as PD × LGD × EAD
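
The expected loss formula is a per-loan product that can then be summed over a portfolio. A minimal sketch with made-up numbers (the loans below are illustrative, not rows from the Lending Club data):

```python
def expected_loss(pd, lgd, ead):
    """Expected loss for one loan: PD x LGD x EAD."""
    return pd * lgd * ead


# Hypothetical model outputs for two loans.
loans = [
    {"pd": 0.02, "lgd": 0.60, "ead": 10_000.0},
    {"pd": 0.10, "lgd": 0.80, "ead": 5_000.0},
]

per_loan_el = [expected_loss(l["pd"], l["lgd"], l["ead"]) for l in loans]
portfolio_el = sum(per_loan_el)
total_exposure = sum(l["ead"] for l in loans)

print([round(x, 2) for x in per_loan_el])
print(round(portfolio_el, 2))
print(round(portfolio_el / total_exposure, 4))  # EL as a share of exposure
```

Here PD and LGD are fractions between 0 and 1 and EAD is a currency amount, so EL is in the same currency as EAD.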

Tools Used:

  • Python (pandas, scikit-learn, matplotlib, plotly)
  • Jupyter Notebooks for analysis
  • Flask (optional) for model deployment

Business Value:

  • Help lenders evaluate borrower risk
  • Automate monitoring of model stability
  • Provide a scorecard interpretable by business teams

Notebooks Included:

  • General preprocessing
  • PD model data prep + training
  • Scorecard development
  • PSI monitoring
  • LGD & EAD modeling with expected loss
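
For the PSI monitoring step, the Population Stability Index compares how observations are distributed across bins in a reference sample versus a newer sample. The sketch below assumes the data has already been binned into counts; the function name and the small `eps` floor for empty bins are choices made for this example, not details from the notebook.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index over pre-binned counts.

    expected_counts: counts per bin in the reference (e.g. training) data.
    actual_counts:   counts per bin in the newer data, same bin order.
    eps:             floor on bin shares to avoid log(0) for empty bins.
    """
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_pct = max(e / e_total, eps)
        a_pct = max(a / a_total, eps)
        total += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return total


# Hypothetical bin counts for one feature.
reference = [300, 400, 300]
recent = [250, 400, 350]
print(round(psi(reference, recent), 4))
```

A commonly cited rule of thumb treats PSI below 0.1 as stable, 0.1 to 0.25 as a moderate shift, and above 0.25 as a significant shift worth investigating.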

View notebook on GitHub:
🔗 Expected Loss Estimation and Credit Risk Analysis

Full project repo:
🔗 GitHub Repository

Data + models:
🔗 Google Drive Folder

Wing Li
Data Analyst | Aspiring Data Scientist

Passionate about turning data into insights. Experienced in data analytics, e-commerce market research, and risk modeling. Skilled in Python, SQL, Tableau, and machine learning.