Reportability model monitoring
Generated 2026-06-04 15:40 UTC · benchmark n=900 · current n=3000 · model: stand-in TF-IDF+LogReg
Control decision
REVIEW
Recall floor 0.85 · drift must not be significant
The monitor is flagging a reportability risk: 454 reportable complaints were missed,
recall is 0.76 (below the 0.85 floor), and overall drift is significant.
Treat this as a model-risk review pack, not an auto-approval dashboard.
The classifier decides whether a customer complaint is reportable, meaning that it shows financial or emotional impact. The core control question is whether reportable complaints are being missed.
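The control rule stated above (recall floor 0.85, drift must not be significant) can be sketched as a small gate. This is an illustrative sketch, not the monitor's actual code: the names `ControlInputs` and `control_decision` are hypothetical, and so is the "PASS" label, since this report only shows the REVIEW outcome.

```python
from dataclasses import dataclass

RECALL_FLOOR = 0.85  # taken from the report header


@dataclass(frozen=True)
class ControlInputs:
    recall: float            # reportable-class recall in the current window
    drift_significant: bool  # whether overall drift was judged significant


def control_decision(inputs, recall_floor=RECALL_FLOOR):
    """Return (decision, reasons).

    "PASS" is a hypothetical label for the case where no check fails;
    the report itself only shows "REVIEW".
    """
    reasons = []
    if inputs.recall < recall_floor:
        reasons.append(
            f"recall {inputs.recall:.2f} is below floor {recall_floor:.2f}"
        )
    if inputs.drift_significant:
        reasons.append("overall drift is significant")
    return ("REVIEW" if reasons else "PASS", reasons)


# Values reported for the current window.
decision, reasons = control_decision(
    ControlInputs(recall=0.76, drift_significant=True)
)
print(decision)
for reason in reasons:
    print(" -", reason)
```

Returning the reasons alongside the decision keeps the gate auditable: a reviewer can see which check failed rather than only the final status.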
Missed reportable complaints
454
false negatives in current window
Recall floor check
0.76
Δ -0.198 vs benchmark
Overall drift PSI
0.33
significant
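For readers unfamiliar with the drift figure, the following is a minimal sketch of a Population Stability Index over categorical bins. It assumes the common rule-of-thumb bands (below 0.10 stable, 0.10 to 0.25 moderate, 0.25 and above significant); the report does not state which thresholds or binning the monitor uses, so both are assumptions, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def psi(benchmark, current, eps=1e-6):
    """Population Stability Index between two samples of categorical values.

    PSI = sum over bins of (c - b) * ln(c / b), where b and c are the
    benchmark and current proportions. eps guards against empty bins.
    """
    b_counts, c_counts = Counter(benchmark), Counter(current)
    total_b, total_c = len(benchmark), len(current)
    value = 0.0
    for category in set(b_counts) | set(c_counts):
        b = max(b_counts[category] / total_b, eps)
        c = max(c_counts[category] / total_c, eps)
        value += (c - b) * math.log(c / b)
    return value


def drift_label(value):
    """Map a PSI value to a label using assumed rule-of-thumb bands."""
    if value < 0.10:
        return "stable"
    if value < 0.25:
        return "moderate"
    return "significant"


# Toy data only: a hypothetical 'channel' feature shifting from 50/50 to 80/20.
benchmark = ["phone"] * 50 + ["email"] * 50
current = ["phone"] * 80 + ["email"] * 20
score = psi(benchmark, current)
print(round(score, 3), drift_label(score))
```

Under these assumed bands, the reported PSI of 0.33 would fall in the "significant" range, which agrees with the label shown in the report.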
Evidence map
| Page | Question it answers | Primary risk signal |
| --- | --- | --- |
| Quality metrics | Is the model accurate, and where do errors fall? | False negatives and reportable recall |
| Distribution & bias | Does it under/over-flag? Are confidence scores safe to use? | Prediction bias, calibration, subgroup skew |
| Drift vs benchmark | Has the input or score distribution moved? | PSI, new categories, score shift |
| Assessment | What should an owner do next? | Tuning, retraining, data, and monitoring controls |
Design principle: the monitor consumes a predictions table. The classifier can be a lightweight stand-in today, a production export tomorrow, or a local RoBERTa backend later.
Reproduce: python -m reportability_monitoring.cli run-all --out reports/
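To illustrate the design principle, here is a minimal sketch of how the headline metrics could be computed from a predictions table alone, with no dependence on which classifier produced it. The column names `y_true` and `y_pred` (1 = reportable, 0 = not reportable) and the function `reportable_recall` are assumptions for illustration; the actual schema of the monitor's predictions table is not given in this report.

```python
def reportable_recall(rows):
    """Count missed reportable complaints and compute reportable-class recall.

    rows: iterable of dicts with hypothetical keys 'y_true' and 'y_pred',
    where 1 means reportable and 0 means not reportable.
    """
    rows = list(rows)
    tp = sum(1 for r in rows if r["y_true"] == 1 and r["y_pred"] == 1)
    fn = sum(1 for r in rows if r["y_true"] == 1 and r["y_pred"] == 0)
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return {"false_negatives": fn, "recall": recall}


# Toy predictions table: 4 reportable complaints (1 missed), 2 non-reportable.
predictions = [
    {"y_true": 1, "y_pred": 1},
    {"y_true": 1, "y_pred": 1},
    {"y_true": 1, "y_pred": 1},
    {"y_true": 1, "y_pred": 0},  # a missed reportable complaint
    {"y_true": 0, "y_pred": 0},
    {"y_true": 0, "y_pred": 1},
]
metrics = reportable_recall(predictions)
print(metrics)
```

Because the function only reads labels and predictions, swapping the stand-in model for a production export or another backend would not change the monitoring code, provided the table keeps the same columns. As a consistency check on the reported figures, 454 false negatives at a recall of 0.76 imply roughly 454 / 0.24 ≈ 1,890 reportable complaints in the current window (approximate, since the recall is rounded).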