Reportability model monitoring

Generated 2026-06-04 15:40 UTC · benchmark n=900 · current n=3000 · model: stand-in TF-IDF+LogReg
Control decision: REVIEW
Controls: recall floor 0.85 · drift must not be significant
The monitor flags a reportability risk: the model missed 454 reportable complaints in the current window, recall is 0.76 against a floor of 0.85, and overall drift is significant (PSI 0.33). Treat this as a model-risk review pack, not an auto-approval dashboard.
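
The two controls stated above (a recall floor of 0.85 and no significant drift) can be read as a simple decision rule. The following is a minimal sketch under assumptions, not the monitor's actual code: the names `ControlInputs` and `control_decision` and the "PASS" label are hypothetical, since the report only shows the REVIEW outcome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlInputs:
    recall: float            # reportable-class recall in the current window
    drift_significant: bool  # True when overall drift is flagged as significant


def control_decision(inputs, recall_floor=0.85):
    """Return (decision, reasons). Any breached control forces "REVIEW".

    "PASS" is a hypothetical label for the case where no control is breached.
    """
    reasons = []
    if inputs.recall < recall_floor:
        reasons.append(
            f"recall {inputs.recall:.2f} is below floor {recall_floor:.2f}"
        )
    if inputs.drift_significant:
        reasons.append("overall drift is significant")
    return ("REVIEW" if reasons else "PASS", reasons)


# Values taken from this report: recall 0.76, drift significant.
decision, reasons = control_decision(ControlInputs(recall=0.76, drift_significant=True))
print(decision, reasons)
```

Returning the reasons alongside the decision keeps the output usable as review evidence rather than a bare status flag.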

A complaint is reportable when it shows financial or emotional impact, and the classifier decides whether each customer complaint meets that test. The core control question is whether reportable complaints are being missed.

Missed reportable complaints · 454 · false negatives in current window
Recall floor check · 0.76 · Δ -0.198 vs benchmark
Overall drift PSI · 0.33 · significant
Precision · 0.97 · Δ +0.013
F1 · 0.85 · Δ -0.105
Accuracy · 0.83 · Δ -0.117
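
For readers who want to see how the headline metrics relate to each other, here is a minimal sketch that derives them from true and predicted labels, treating "reportable" as the positive class (1). The function name `reportable_metrics` and the toy labels are illustrative assumptions; they are not the report's data or code.

```python
def reportable_metrics(y_true, y_pred):
    """Compute confusion counts and headline metrics for the reportable class (1)."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1  # a missed reportable complaint
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1

    total = tp + fp + fn + tn
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "false_negatives": fn,
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


# Toy labels for illustration only (1 = reportable, 0 = not reportable).
toy_truth = [1, 1, 1, 1, 0, 0]
toy_preds = [1, 1, 1, 0, 0, 0]
print(reportable_metrics(toy_truth, toy_preds))
```

The pattern in the cards above (high precision, lower recall) corresponds to a model that rarely flags a complaint wrongly but misses a share of the reportable ones, which is why the false-negative count is the primary control figure.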

Evidence map

Page · Question it answers · Primary risk signal
Quality metrics · Is the model accurate, and where do errors fall? · False negatives and reportable recall
Distribution & bias · Does it under/over-flag? Are confidence scores safe to use? · Prediction bias, calibration, subgroup skew
Drift vs benchmark · Has the input or score distribution moved? · PSI, new categories, score shift
Assessment · What should an owner do next? · Tuning, retraining, data, and monitoring controls
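
The drift page relies on PSI (population stability index). As a minimal sketch of the standard formula, PSI sums (actual share - expected share) × ln(actual share / expected share) over matching bins of the benchmark and current distributions. The function names and the 0.10 / 0.25 bands below are assumptions: the bands are a commonly cited rule of thumb, and this report does not state which threshold it uses to call drift significant.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population stability index over matching bins.

    expected_counts: per-bin counts from the benchmark window.
    actual_counts:   per-bin counts from the current window.
    eps floors each share so empty bins do not produce log(0).
    """
    if len(expected_counts) != len(actual_counts):
        raise ValueError("benchmark and current must use the same bins")
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    total = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        e_share = max(expected / expected_total, eps)
        a_share = max(actual / actual_total, eps)
        total += (a_share - e_share) * math.log(a_share / e_share)
    return total


def drift_label(psi_value, moderate_at=0.10, significant_at=0.25):
    """Map a PSI value to a band. Thresholds are an assumed rule of thumb."""
    if psi_value >= significant_at:
        return "significant"
    if psi_value >= moderate_at:
        return "moderate"
    return "stable"


# Under the assumed bands, the report's overall PSI of 0.33 reads as significant.
print(drift_label(0.33))
```

Because PSI compares shares rather than raw counts, the differing window sizes (benchmark n=900, current n=3000) do not by themselves inflate the value.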
Design principle: the monitor consumes a predictions table. The classifier can be a lightweight stand-in today, a production export tomorrow, or a local RoBERTa backend later.
Reproduce: python -m reportability_monitoring.cli run-all --out reports/
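
The design principle above, a monitor that reads only a predictions table, can be illustrated with a small backend-agnostic sketch. Everything named here is hypothetical: the `ScoringBackend` protocol, the `KeywordBackend` toy scorer, and the column names `complaint_id`, `label`, `score`, and `prediction` are assumptions for illustration, not the schema or API of `reportability_monitoring`.

```python
from typing import Protocol, Sequence


class ScoringBackend(Protocol):
    """Any classifier that can turn complaint texts into reportability scores."""

    def score(self, texts: Sequence[str]) -> list[float]: ...


class KeywordBackend:
    """Toy backend for illustration only; not the report's TF-IDF+LogReg stand-in."""

    def score(self, texts):
        return [0.9 if "refund" in text.lower() else 0.1 for text in texts]


def build_predictions_table(backend, ids, texts, labels, threshold=0.5):
    """Produce the rows a monitor could consume, independent of the backend.

    Column names are assumed for this sketch.
    """
    scores = backend.score(texts)
    return [
        {
            "complaint_id": complaint_id,
            "label": label,                        # 1 = reportable, 0 = not
            "score": score,                        # backend confidence
            "prediction": int(score >= threshold),
        }
        for complaint_id, label, score in zip(ids, labels, scores)
    ]


table = build_predictions_table(
    KeywordBackend(),
    ids=["c1", "c2"],
    texts=["I lost money and need a refund", "What are your opening hours?"],
    labels=[1, 0],
)
print(table)
```

Swapping the backend changes only the object passed to `build_predictions_table`; the downstream metric and drift calculations would see the same columns.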