Fairness audits and equalised odds

Part 9, Machine learning for researchers

Learning objectives

Define key fairness criteria: DEMOGRAPHIC PARITY, EQUALIZED ODDS, PREDICTIVE PARITY
Recognise the IMPOSSIBILITY THEOREM (Chouldechova 2017, Kleinberg-Mullainathan-Raghavan 2016): when base rates differ, no imperfect classifier can satisfy the main fairness criteria simultaneously
Audit a deployed model: compute per-group TPR, FPR, PPV, selection rate
Apply MITIGATION techniques: threshold adjustment, post-processing, re-weighting, adversarial debiasing
Recognise fairness as a socio-technical choice, not a purely technical one

When ML models make decisions affecting people (hiring, lending, sentencing, healthcare), they MUST be audited for FAIRNESS across demographic groups. A model with 90% overall accuracy may have 95% accuracy on one group and 80% on another. The disparity matters for trust, legal compliance, and ethical practice. §9.5 develops the modern fairness toolkit and confronts its fundamental limits.

Three formal fairness criteria

The three most widely used criteria, defined for a binary classifier with prediction $\hat{Y}$, true label $Y$, and two groups A and B:

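Demographic parity: $P(\hat{Y} = 1 \mid \text{group} = A) = P(\hat{Y} = 1 \mid \text{group} = B)$. Selection rates are equal across groups.
Equalized odds (Hardt-Price-Srebro 2016): $P(\hat{Y} = 1 \mid Y = y, \text{group} = A) = P(\hat{Y} = 1 \mid Y = y, \text{group} = B)$ for $y \in \{0, 1\}$, where $Y$ is the true label. That is, equal TPR (true positive rate) AND equal FPR (false positive rate) across groups. Symmetry in both error types.
Predictive parity: $P(Y = 1 \mid \hat{Y} = 1, \text{group} = A) = P(Y = 1 \mid \hat{Y} = 1, \text{group} = B)$. That is, equal PPV (positive predictive value, i.e., precision) across groups. Among predicted-positive cases, the probability of being a true positive is equal.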
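As a minimal sketch of how the three criteria translate into numbers, the function below computes selection rate, TPR, FPR and PPV per group from labels and predictions. The function name `group_rates` and the toy data are invented for illustration; they are not part of any library.

```python
import numpy as np

def group_rates(y_true, y_pred, group):
    """Return selection rate, TPR, FPR and PPV for each group value."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    out = {}
    for g in np.unique(group):
        m = group == g
        yt, yp = y_true[m], y_pred[m]
        # Confusion-matrix counts within this group.
        tp = int(np.sum((yp == 1) & (yt == 1)))
        fp = int(np.sum((yp == 1) & (yt == 0)))
        fn = int(np.sum((yp == 0) & (yt == 1)))
        tn = int(np.sum((yp == 0) & (yt == 0)))
        out[str(g)] = {
            "selection_rate": (tp + fp) / len(yt),                  # demographic parity
            "tpr": tp / (tp + fn) if tp + fn else float("nan"),     # equalized odds
            "fpr": fp / (fp + tn) if fp + tn else float("nan"),     # equalized odds
            "ppv": tp / (tp + fp) if tp + fp else float("nan"),     # predictive parity
        }
    return out

# Toy data (invented): 8 cases in group A, then 8 cases in group B.
y_true = [1, 1, 1, 1, 0, 0, 0, 0,   1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0,   1, 0, 0, 0, 0, 0, 0, 0]
group = ["A"] * 8 + ["B"] * 8

rates = group_rates(y_true, y_pred, group)
for g, r in rates.items():
    print(g, r)
```

In this toy data group A has selection rate 0.5, TPR 0.75, FPR 0.25 and PPV 0.75, while group B has selection rate 0.125, TPR 0.5, FPR 0 and PPV 1.0, so all three criteria are violated at once.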

The IMPOSSIBILITY THEOREM

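Chouldechova (2017) showed that when groups have DIFFERENT BASE RATES of the positive class, an imperfect classifier CANNOT have equal FPR, equal TPR, and equal PPV across groups: equalized odds and predictive parity cannot both hold. Kleinberg-Mullainathan-Raghavan (2016) proved a closely related result for risk scores: calibration within groups is incompatible with balanced errors across groups, except when base rates are equal or prediction is perfect. Demographic parity conflicts as well: with differing base rates, equal selection rates can coexist with equal TPR and FPR only if TPR = FPR, i.e. the classifier is uninformative. So no useful model can satisfy equalized odds AND predictive parity AND demographic parity at once. This is a mathematical constraint, not a limitation of current algorithms.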
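The conflict between equalized odds and predictive parity can be checked with a few lines of arithmetic. Writing $p$ for a group's base rate, the definition PPV = TPR·p / (TPR·p + FPR·(1−p)) can be rearranged to give the FPR that is forced once $p$, TPR and PPV are fixed. The sketch below uses that rearrangement; the helper name `implied_fpr` and the values TPR = 0.8, PPV = 0.7 are assumptions chosen for illustration.

```python
def implied_fpr(base_rate, tpr, ppv):
    """FPR forced by a given base rate, TPR and PPV.

    Rearranges PPV = TPR*p / (TPR*p + FPR*(1 - p)) for FPR:
        FPR = TPR * p / (1 - p) * (1 - PPV) / PPV
    """
    return tpr * (base_rate / (1.0 - base_rate)) * ((1.0 - ppv) / ppv)

# Hold TPR and PPV equal across groups; only the base rate differs.
tpr, ppv = 0.8, 0.7
fpr_a = implied_fpr(0.6, tpr, ppv)  # group A, base rate 0.6
fpr_b = implied_fpr(0.3, tpr, ppv)  # group B, base rate 0.3

print(f"FPR_A = {fpr_a:.3f}, FPR_B = {fpr_b:.3f}")
```

With equal TPR and equal PPV, the implied FPR is roughly 0.51 for group A and 0.15 for group B. Equal PPV and equal TPR therefore force unequal FPR whenever base rates differ, which is exactly the trade-off the theorem describes.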

The implication: every applied ML system makes an IMPLICIT or EXPLICIT choice among incompatible fairness criteria. There is no "fair" model in an absolute sense. The right choice depends on the stakes and substantive context:

Healthcare screening: prioritise equal SENSITIVITY (equal TPR), because false negatives can be catastrophic.
Hiring: prioritise equal SELECTION rate (demographic parity), so that historically marginalized groups do not face higher hidden hurdles.
Recidivism prediction: prioritise PREDICTIVE PARITY, so that among defendants flagged as high-risk, the rate of actual recidivism is equal across groups.

The choice is a SOCIO-TECHNICAL one, not a purely technical one. The widely recommended response is transparency: state which criterion you optimise for, why, and what trade-offs that implies.

How fairness audits work

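A typical audit, built on the group-wise error rates of Hardt-Price-Srebro (2016), has four steps: collect a representative test set with demographic labels; compute per-group confusion matrices; compute fairness metrics; flag disparities. The metrics usually reported are: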

Per-group ACCURACY: are predictions equally accurate across groups?
Per-group TPR / FPR: equalized odds.
Per-group PPV: predictive parity.
Per-group selection rate: demographic parity.
Per-group CALIBRATION: are probability outputs equally trustworthy?
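The rate-based checks in this list come straight from per-group confusion matrices. The calibration check needs the model's probability outputs instead. Below is a minimal sketch, assuming equal-width probability bins and using a per-group expected calibration error (ECE) as the summary; the function name `calibration_by_group`, the bin count, and the synthetic data (group B's scores are deliberately overconfident) are all assumptions for illustration.

```python
import numpy as np

def calibration_by_group(y_true, y_prob, group, n_bins=5):
    """Per-group expected calibration error over equal-width bins."""
    y_true, y_prob, group = map(np.asarray, (y_true, y_prob, group))
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    result = {}
    for g in np.unique(group):
        m = group == g
        yt, p = y_true[m], y_prob[m]
        # Digitizing against the interior edges gives a bin index in 0..n_bins-1.
        idx = np.digitize(p, edges[1:-1])
        ece = 0.0
        for b in range(n_bins):
            in_bin = idx == b
            if in_bin.any():
                # Gap between mean predicted probability and observed positive rate,
                # weighted by the share of the group's cases falling in this bin.
                gap = abs(p[in_bin].mean() - yt[in_bin].mean())
                ece += in_bin.mean() * gap
        result[str(g)] = float(ece)
    return result

# Synthetic data: scores are calibrated for A, overconfident for B.
rng = np.random.default_rng(0)
n = 20000
group = np.repeat(["A", "B"], n)
y_prob = rng.uniform(0.0, 1.0, size=2 * n)
true_prob = np.where(group == "A", y_prob, 0.6 * y_prob)
y_true = rng.binomial(1, true_prob)

ece = calibration_by_group(y_true, y_prob, group)
print(ece)
```

A small ECE for group A and a much larger one for group B would indicate that the same probability output means different things for the two groups, even if a thresholded accuracy comparison looked acceptable.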

Open-source toolkits such as AI Fairness 360 (IBM) and Fairlearn (Microsoft) automate these computations for binary classifiers.

Mitigation techniques

Once a disparity is detected, MITIGATE it with one of three families of technique:

Pre-processing: re-weight or transform training data to remove the dependency on protected attributes (Kamiran & Calders 2012; Zemel et al. 2013).
In-processing: add fairness constraints to the training objective (e.g., Zafar et al. 2017). The model is trained to balance accuracy and fairness explicitly.
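Post-processing: adjust the decision threshold per group AFTER training to equalize a chosen criterion (Hardt-Price-Srebro 2016). Simple, and needs no retraining. A single threshold per group can generally equalize one rate (e.g. TPR); matching both TPR and FPR, as full equalized odds requires, may need randomised decisions, as in the Hardt-Price-Srebro construction.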
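The post-processing idea can be sketched in a few lines. The example below picks a separate threshold per group so that both groups reach the same target TPR. Note the assumptions: it equalizes TPR only, not FPR, so it is a simplified variant rather than the full Hardt-Price-Srebro procedure; the helper names, the target of 0.8, and the synthetic scores (group B's positives receive lower scores) are hypothetical.

```python
import numpy as np

def threshold_for_tpr(scores, y_true, target_tpr):
    """Highest threshold t such that TPR (predict 1 when score >= t) >= target_tpr."""
    pos = np.sort(np.asarray(scores)[np.asarray(y_true) == 1])[::-1]  # descending
    k = int(np.ceil(target_tpr * len(pos)))  # positives that must be accepted
    k = min(max(k, 1), len(pos))
    return float(pos[k - 1])

def tpr(scores, y_true, t):
    s, yt = np.asarray(scores), np.asarray(y_true)
    return float(np.mean(s[yt == 1] >= t))

# Synthetic scores: positives in group B are scored 0.8 lower on average.
rng = np.random.default_rng(0)
n = 5000
group = np.repeat(["A", "B"], n)
y = rng.binomial(1, 0.5, size=2 * n)
shift = np.where(group == "B", 0.8, 0.0)
scores = rng.normal(loc=np.where(y == 1, 1.0 - shift, -1.0), scale=1.0)

target = 0.8
thresholds = {g: threshold_for_tpr(scores[group == g], y[group == g], target)
              for g in ("A", "B")}
# TPR with one shared threshold (0.0) versus per-group thresholds.
tpr_single = {g: tpr(scores[group == g], y[group == g], 0.0) for g in ("A", "B")}
tpr_adjusted = {g: tpr(scores[group == g], y[group == g], thresholds[g])
                for g in ("A", "B")}

print("shared threshold:", tpr_single)
print("per-group thresholds:", thresholds, tpr_adjusted)
```

Lowering group B's threshold closes the TPR gap but also admits more of B's negatives, so B's FPR rises. That side effect is the trade-off to report alongside the chosen criterion.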

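Trade-off: mitigation usually costs some overall accuracy in exchange for reduced disparity. In many reported cases the accuracy loss is small relative to the disparity reduction, but the size of the trade-off depends on the data and the criterion chosen, so measure both before and after. Whether the trade-off is acceptable depends on the stakes.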

The case for transparency

Modern ML deployment ethics: REPORT per-group performance metrics in the model documentation. Datasheets (Gebru et al. 2018) and Model Cards (Mitchell et al. 2019) provide standardised documentation templates. Regulation increasingly points the same way: the EU AI Act imposes documentation duties on high-risk systems, and proposed legislation such as the US Algorithmic Accountability Act would require impact assessments.

Best practice: even if you cannot satisfy all fairness criteria, you can AND SHOULD be transparent about which you chose and why.

Try it

Defaults: equal base rates, no hidden bias, threshold 0.5. All three fairness criteria pass. The model is "fair" in this trivial sense.
Increase the bias gap to 0.8 (group B has lower hidden ability). Now TPR_B is lower than TPR_A: group B has lower sensitivity, so equalized odds is violated. The model has "learned" the bias in the data and reflects it in predictions.
Drop base rate B to 0.2 (the positive class is rare in group B). With the same threshold, group B's selection rate is lower than group A's, so demographic parity is violated. This pattern is common when the positive class is rarer in a minority group.
With base rate disparity (rate_A = 0.6, rate_B = 0.3) and no bias gap, try to satisfy all three: change the threshold. You'll see different criteria favour different thresholds. The impossibility theorem in action.
Set bias = 0, equal base rates, threshold = 0.5. All criteria pass. Now add a bias of 0.5: equalized odds breaks. Add a base-rate disparity: demographic parity breaks too. Fairness disparities accumulate from multiple sources.
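The experiments above can also be reproduced offline with a small simulation. The sketch below is an assumed stand-in, not the generative model behind the interactive controls: each case gets a label drawn at the group's base rate, a score equal to the label plus Gaussian noise, and, for group B, a downward shift equal to the bias gap. The function name `simulate` and the noise level of 0.5 are hypothetical choices.

```python
import numpy as np

def simulate(rate_a=0.5, rate_b=0.5, bias_gap=0.0, threshold=0.5, n=20000, seed=0):
    """Simulate two groups and return per-group fairness metrics."""
    rng = np.random.default_rng(seed)
    out = {}
    for g, rate, bias in (("A", rate_a, 0.0), ("B", rate_b, bias_gap)):
        y = rng.binomial(1, rate, size=n)
        # Score: 1 for positives, 0 for negatives, plus noise; bias lowers B's scores.
        score = y + rng.normal(0.0, 0.5, size=n) - bias
        pred = (score >= threshold).astype(int)
        out[g] = {
            "selection_rate": float(pred.mean()),
            "tpr": float(pred[y == 1].mean()),
            "fpr": float(pred[y == 0].mean()),
            "ppv": float(y[pred == 1].mean()),
        }
    return out

baseline = simulate()                          # equal base rates, no bias
disparity = simulate(rate_a=0.6, rate_b=0.3)   # base-rate disparity only
biased = simulate(bias_gap=0.8)                # score bias against group B only

for name, res in (("baseline", baseline), ("disparity", disparity), ("biased", biased)):
    print(name, res)
```

Under these assumptions, the base-rate disparity run keeps TPR and FPR approximately equal across groups while selection rate and PPV diverge, and the biased run lowers group B's TPR. Changing `threshold` moves the metrics but cannot bring all of them into line at once when base rates differ.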

A bank deploys an ML model that has 92% accuracy overall but 95% on majority group A and 80% on minority group B. The fairness audit reveals lower TPR on group B (good loan applicants from B are rejected at higher rates). What three options does the bank have, and what trade-off does each entail?

What you now know

Fairness has multiple incompatible mathematical definitions. The impossibility theorem (Chouldechova 2017; Kleinberg-Mullainathan-Raghavan 2016) shows that, when base rates differ, no imperfect classifier can satisfy equalized odds, demographic parity, and predictive parity simultaneously. Modern practice: choose the criterion that matches the stakes, document the choice explicitly, and mitigate via pre-, in-, or post-processing techniques. Fairness is a socio-technical decision the model designer makes. §9.6 next: causal ML / double ML, using ML to estimate causal effects with valid inference.

References

Hardt, M., Price, E., Srebro, N. (2016). "Equality of opportunity in supervised learning." NeurIPS. (Equalized odds.)
Chouldechova, A. (2017). "Fair prediction with disparate impact: A study of bias in recidivism prediction instruments." Big Data 5(2), 153-163. (Impossibility theorem.)
Kleinberg, J., Mullainathan, S., Raghavan, M. (2016). "Inherent trade-offs in the fair determination of risk scores." arXiv:1609.05807.
Barocas, S., Hardt, M., Narayanan, A. (2023). Fairness and Machine Learning: Limitations and Opportunities. MIT Press. (Comprehensive modern reference.)
Mitchell, M., et al. (2019). "Model cards for model reporting." FAT* (now FAccT).