SHAP and LIME are not interpretability. They are tools that produce numbers regulators and validators find useful. Treating them as more than that is one of the most common mistakes in regulated ML. This training is the antidote: interpretable-by-design methods first, then post-hoc tools with explicit honesty about what they do and do not tell you, and finally a fairness audit that goes beyond producing a single disparate-impact number.
The training is built for analysts and validators at banks, microfinance institutions, and insurers operating in African and West-African-facing markets. We use real African credit and mobile-money data, with its sparsity, its semi-formal income patterns, and the regulatory constraints (BCEAO, BCBS, EU AI Act for European-facing institutions) that rule some standard tricks out.
Program Overview
Three consecutive days — six half-day sessions — in Cotonou or, for cohorts of 8+ from a single institution, at the client site. The curriculum is built around the question a validator will actually ask after the model lands on their desk: "why this decision, for this person, defensibly enough that I would sign this off?"
Every session is hands-on in Python, using real African credit and mobile-money datasets and reading the regulatory texts together — BCEAO circulars on model risk, BCBS principles, EU AI Act high-risk-AI provisions for European-facing institutions.
Program Structure
- Day 1 — Regulatory framing + interpretable-by-design. What regulators actually ask; the difference between interpretability and explainability. Then GAMs, monotonic gradient boosting, decision lists, sparse linear models — trained on a real default dataset and compared against an unconstrained XGBoost baseline.
- Day 2 — Post-hoc explanations and their failure modes. TreeSHAP, KernelSHAP, partition explainers, LIME, counterfactuals — what they compute, what they assume, where they break. Exercises: produce a confidently wrong explanation, then fix it.
- Day 3 — Fairness audits + final project. Group-fairness definitions on African credit data with semi-formal income, missing protected attributes, small minority subgroups. A short look at neuro-symbolic methods. Afternoon: 30-minute defense of a model card + validation report + fairness audit in front of the instructors.
Faculty
Lead instructors: AIRINA researchers in applied machine learning and financial-systems modeling. Guest lecture from a senior model-validation practitioner at a partner bank or supervisory authority (named two weeks before each cohort).
Certificate
The grade is based on whether the deliverable would survive a real model-validation submission. The certificate, AIRINA Interpretable ML for Finance, is awarded on the basis of a graded final project and defense.
Learning Outcomes
By the end of the program, participants will be able to:
- Articulate the difference between interpretability and explainability, and use the right word in regulatory conversations.
- Build an interpretable-by-design model — a GAM, a monotonic gradient boosting model, or a decision list — that meets a real performance bar.
- Apply SHAP and LIME correctly, and recognize three common failure modes that produce confidently wrong explanations.
- Run a fairness audit on a credit-scoring model under multiple definitions of fairness, and explain to a non-technical stakeholder why the definitions disagree.
- Read a regulatory text (BCEAO circular, BCBS principle, EU AI Act article) and translate its requirements into model-validation tasks.
- Write a model card and a model-validation report that a real validator would accept.
Program Curriculum
BCEAO circulars on model risk; BCBS principles for the management of model risk; the EU AI Act's high-risk-AI provisions for European-facing institutions. The difference between interpretability and explainability, and why it matters when the validator pushes back. Reading a real regulatory text together, identifying the operational requirements.
Generalized additive models (GAMs), monotonic gradient boosting, decision lists, sparse linear models with feature engineering. Hands-on: training each on a credit-default dataset, comparing performance against an unconstrained XGBoost baseline. The cases where the interpretable model wins, and the cases where it loses by enough to matter.
TreeSHAP, KernelSHAP, partition explainers — what they actually compute and what assumptions they require. LIME and its instability problem. Counterfactual explanations and the cost of constructing them. Live-coding each tool on the Day-1 models; reading the outputs critically.
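To make "what they actually compute" concrete, the sketch below evaluates exact Shapley values by enumerating every feature coalition. It is not how the `shap` library works internally: the function `shapley_values`, the toy `predict` function, and the random background sample are all assumptions made for illustration. Absent features are filled in from background rows, which quietly treats features as independent of each other.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley_values(predict, x, background):
    """Exact Shapley values for one row `x`, enumerating every coalition.

    Features outside the coalition keep their background values, so the
    value function is a marginal expectation over `background`.
    """
    n_features = x.shape[0]

    def value(coalition):
        rows = background.copy()
        idx = list(coalition)
        if idx:
            rows[:, idx] = x[idx]   # fix coalition features to the explained row
        return predict(rows).mean()

    phi = np.zeros(n_features)
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(n_features):
            weight = (
                factorial(size) * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


def predict(X):
    # Hypothetical model: one additive term plus one interaction.
    return 2.0 * X[:, 0] + X[:, 1] * X[:, 2]


rng = np.random.default_rng(0)
background = rng.normal(size=(200, 3))
x = np.array([1.0, 2.0, -1.0])

phi = shapley_values(predict, x, background)
baseline = predict(background).mean()
print("attributions:", phi.round(3))
print("baseline + sum of attributions:", round(float(baseline + phi.sum()), 3))
print("model prediction for x:", float(predict(x[None, :])[0]))
```

The enumeration visits every subset of features, so its cost doubles with each added feature; that cost is why practical tools approximate the sum or exploit model structure, and why their assumptions deserve scrutiny.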
Three documented failure modes: feature correlation breaking SHAP's marginal-contribution story; LIME's sampling sensitivity producing different explanations on different runs; counterfactual explanations that are mathematically valid but practically impossible. Exercises: produce a confidently wrong explanation, then fix it.
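The second failure mode can be reproduced without any explanation library. The following is a LIME-style sketch, not the `lime` package itself: `black_box` is a hypothetical scoring function, and `local_surrogate`, the Gaussian perturbation scheme, and the kernel width are assumptions chosen for illustration. It fits a weighted linear surrogate around one point under different random seeds and measures how far the coefficients move.

```python
import numpy as np
from sklearn.linear_model import Ridge


def black_box(X):
    # Hypothetical nonlinear scoring function standing in for a trained model.
    return 1.0 / (1.0 + np.exp(-(np.sin(3.0 * X[:, 0]) + X[:, 1] * X[:, 2])))


def local_surrogate(x, n_samples, seed, kernel_width=0.75):
    """Weighted linear fit on random perturbations of x (LIME-style)."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(0.0, 1.0, size=(n_samples, x.shape[0]))
    distances = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)  # nearer points count more
    surrogate = Ridge(alpha=1.0).fit(Z, black_box(Z), sample_weight=weights)
    return surrogate.coef_


x = np.array([0.2, -0.4, 1.0])

# Same point, same model, twenty different random seeds.
few = np.array([local_surrogate(x, n_samples=100, seed=s) for s in range(20)])
many = np.array([local_surrogate(x, n_samples=20000, seed=s) for s in range(20)])

spread_few = few.std(axis=0)
spread_many = many.std(axis=0)
print("coefficient spread across seeds, 100 samples:  ", spread_few.round(3))
print("coefficient spread across seeds, 20000 samples:", spread_many.round(3))
```

Reporting the spread across seeds alongside the explanation is one simple way to show a validator whether a local explanation is stable enough to rely on.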
Group-fairness definitions (demographic parity, equalized odds, predictive parity) and why they cannot, in general, all hold at once when default rates differ across groups. Disparate-impact analysis under semi-formal income data, missing protected-attribute fields, and small minority subgroups. A short introduction to neuro-symbolic methods for decisioning where pure ML cannot meet the explainability bar.
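A small worked example shows how the three definitions can disagree. The confusion-matrix counts below are made up for illustration, and the helper names `group_rates` and `from_counts` are hypothetical. Both groups receive the same true-positive and false-positive rates, but their base default rates differ (20% versus 40%).

```python
import numpy as np


def group_rates(y_true, y_pred):
    """Selection rate, TPR, FPR and PPV for one group (1 = default / flagged)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_pred & y_true)
    fp = np.sum(y_pred & ~y_true)
    fn = np.sum(~y_pred & y_true)
    tn = np.sum(~y_pred & ~y_true)
    return {
        "selection_rate": (tp + fp) / y_true.size,  # compared by demographic parity
        "tpr": tp / (tp + fn),                      # compared by equalized odds
        "fpr": fp / (fp + tn),                      # compared by equalized odds
        "ppv": tp / (tp + fp),                      # compared by predictive parity
    }


def from_counts(tp, fn, fp, tn):
    """Expand confusion-matrix counts into label and prediction arrays."""
    y_true = np.repeat([1, 1, 0, 0], [tp, fn, fp, tn])
    y_pred = np.repeat([1, 0, 1, 0], [tp, fn, fp, tn])
    return y_true, y_pred


# Made-up counts: TPR 0.8 and FPR 0.1 in both groups, base rates 20% vs 40%.
group_a = group_rates(*from_counts(tp=160, fn=40, fp=80, tn=720))
group_b = group_rates(*from_counts(tp=320, fn=80, fp=60, tn=540))

for name in ("selection_rate", "tpr", "fpr", "ppv"):
    print(f"{name:>15}: A={group_a[name]:.3f}  B={group_b[name]:.3f}")
```

With these counts, equalized odds is satisfied exactly, yet the selection rates (0.240 versus 0.380) and the precision of a flag (0.667 versus 0.842) both differ, so demographic parity and predictive parity fail. With small subgroups, each of these rates also rests on few observations and should be reported with its sample size.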
Participants build a credit-scoring or fraud-detection model on a provided dataset, produce a model card, a validation report, and a fairness audit. 30-minute defense in front of the instructors. Constructive criticism on whether a real validator would sign off, and what is still missing.
Who Should Attend
The training is for the people in the line of fire when a model goes to validation: the builders, the validators, and the supervisors who ask them questions.
- Risk modellers, credit analysts, fraud analysts at banks, MFIs, mobile-money operators, insurers.
- Model-validation teams responsible for sign-off under BCEAO, BCBS, or EU AI Act constraints.
- Analytics leads at fintechs building lending or scoring products.
- Regulators and supervisors who need to ask the right questions of submitted models.
Prerequisites
- Working knowledge of the Python data-science stack: NumPy, pandas, scikit-learn. The training is hands-on.
- Familiarity with at least one classification model — logistic regression, random forests, gradient boosting. We do not teach the basics.
- Some exposure to credit-risk or fraud-detection workflows is helpful but not required.
- A laptop with a working Python environment. Setup instructions are sent two weeks before the cohort starts.
Selection
Cohort capped at ~12 participants; minimum 8 to run. Client-site delivery for cohorts of 8+ from a single institution. Institutional cohorts (a single bank's risk team, an MFI network, a regulator's supervision desk) are prioritized when the topic fit is strong.
Brochure
The detailed program brochure (PDF, EN/FR) is sent on request — including the full day-by-day curriculum, regulatory reading list, project briefs, and the cohort calendar.
To receive the current brochure, write to contact@airina.africa with "Interpretable ML — brochure request" in the subject. The brochure is updated each cohort; we send the version current at the time of your request.