Bias in the Black Box: Racial Disparities in AI Risk Assessment for Criminal Justice

Abstract
Artificial Intelligence (AI) tools are increasingly used in the criminal justice system to predict
recidivism and guide sentencing. However, evidence suggests that these systems can
perpetuate or amplify racial biases. This paper conducts a critical analysis of widely used risk
assessment algorithms such as COMPAS and PATTERN. Using real-world datasets and
fairness metrics (e.g., demographic parity, equalized odds), we reveal significant disparities in
how individuals of different racial backgrounds are scored. We advocate for transparent
algorithmic governance, better data practices, and legal oversight to ensure the ethical
deployment of AI in justice systems.
1. Introduction
AI and machine learning (ML) technologies are lauded for their efficiency and predictive power,
but when deployed in sensitive domains like criminal justice, their opacity—“the black box”
problem—raises urgent ethical and legal concerns. Risk assessment tools are intended to
provide objective predictions, but they often rely on biased historical data, leading to
disproportionate outcomes for minority communities. This paper explores how such biases
become embedded, what effects they produce, and how they might be mitigated.
2. Background and Related Work
- COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) is widely
used in U.S. courts but has been shown to produce racial disparities, in particular
overpredicting recidivism risk for Black defendants (Angwin et al., 2016).
- PATTERN (Prisoner Assessment Tool Targeting Estimated Risk and Needs), used in the
federal system, faces criticism for lacking transparency.
- Researchers like Barocas & Selbst (2016) argue that fairness in ML cannot be separated from
the quality and neutrality of training data.
3. Methodology
Data Sources: the publicly available Broward County COMPAS dataset and DOJ case studies
of PATTERN.
Metrics Used:
- Demographic Parity (equal rates of high-risk classification across racial groups)
- Equal Opportunity (equal true positive rates across groups)
- Equalized Odds (equal true positive and false positive rates across groups)
- Predictive Parity (equal positive predictive value across groups)
Analysis Tools: Python (scikit-learn), the IBM AI Fairness 360 Toolkit, and statistical
significance testing (p < 0.05); a minimal sketch of the metric computation follows below.
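To make these metrics concrete, the following is a minimal sketch of how the group-wise quantities behind them (selection rate, true positive rate, false positive rate, positive predictive value) can be computed with scikit-learn. The column names and the decile-score threshold in the usage comment are illustrative assumptions rather than a description of the exact Broward County schema.

# Minimal sketch: group-wise fairness metrics for a binary "high risk" label.
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix

def group_fairness_metrics(df, group_col, y_true_col, y_pred_col):
    """Compute selection rate, TPR, FPR, and PPV for each group in the data."""
    rows = {}
    for group, sub in df.groupby(group_col):
        tn, fp, fn, tp = confusion_matrix(
            sub[y_true_col], sub[y_pred_col], labels=[0, 1]
        ).ravel()
        rows[group] = {
            "selection_rate": (tp + fp) / len(sub),           # demographic parity
            "tpr": tp / (tp + fn) if (tp + fn) else np.nan,   # equal opportunity
            "fpr": fp / (fp + tn) if (fp + tn) else np.nan,   # equalized odds (with TPR)
            "ppv": tp / (tp + fp) if (tp + fp) else np.nan,   # predictive parity
        }
    return pd.DataFrame(rows).T

# Hypothetical usage with a COMPAS-style extract (column names and threshold are assumptions):
# df = pd.read_csv("compas-scores-two-years.csv")
# df["high_risk"] = (df["decile_score"] >= 7).astype(int)
# print(group_fairness_metrics(df, "race", "two_year_recid", "high_risk"))

Comparing these per-group rows directly shows which fairness criterion is violated: unequal selection rates break demographic parity, unequal TPR/FPR break equal opportunity and equalized odds, and unequal PPV breaks predictive parity.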
4. Results
- COMPAS:
  - Black defendants labeled “high risk” reoffended at a lower rate (43%) than white
    defendants given the same label (59%).
  - Disparity in false positive rates: 45% for Black defendants vs. 23% for white defendants
    (see the significance-test sketch below).
- PATTERN:
  - Data opacity limited a full metric evaluation, but available indicators showed consistent
    racial imbalance in score distributions.
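To illustrate the significance testing referenced in the methodology (p < 0.05), the sketch below compares two false positive rates with a two-proportion z-test from statsmodels. The group counts are hypothetical placeholders chosen to match the reported 45% and 23% rates; they are not the actual Broward County cohort sizes.

# Minimal sketch: test whether two groups' false positive rates differ significantly.
from statsmodels.stats.proportion import proportions_ztest

# Among defendants who did NOT reoffend, how many were flagged high risk (hypothetical counts).
false_positives = [450, 230]    # Black, white (chosen to mirror the reported 45% and 23% rates)
non_reoffenders = [1000, 1000]  # hypothetical group sizes, not the real cohort counts

stat, p_value = proportions_ztest(count=false_positives, nobs=non_reoffenders)
print(f"z = {stat:.2f}, p = {p_value:.4g}")
if p_value < 0.05:
    print("The difference in false positive rates is statistically significant at the 0.05 level.")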
5. Discussion
- Sources of Bias: Historical data, policing practices, and selective enforcement
disproportionately affect minority populations.
- Impact: Biased scores may lead to longer sentences, denial of parole, or stricter pre-trial
conditions.
- Legal Ramifications: The use of biased AI may violate the Equal Protection Clause if it is not
properly regulated.
6. Recommendations
- Data Auditing: Regular fairness audits should be mandated before deployment (a minimal
audit sketch follows this list).
- Transparency: Open-source algorithms or independent oversight should be implemented.
- Human Oversight: Algorithms should augment—not replace—human judgment.
- Policy Reform: Legal frameworks should evolve to address algorithmic accountability.
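As a minimal sketch of the pre-deployment audit gate envisioned in the Data Auditing recommendation, the check below flags a model whose group selection rates or false positive rates diverge beyond illustrative thresholds (a four-fifths style ratio and a 0.10 gap). The thresholds and the example selection rates are assumptions for illustration; the false positive rates echo the Results section.

# Minimal sketch: a pre-deployment fairness audit gate with illustrative thresholds.
def fairness_audit(selection_rates, fpr_by_group, di_threshold=0.8, fpr_gap_threshold=0.10):
    """Flag a model whose group disparities exceed illustrative audit thresholds."""
    findings = []
    rates = list(selection_rates.values())
    disparate_impact = min(rates) / max(rates)  # four-fifths-rule style ratio
    if disparate_impact < di_threshold:
        findings.append(f"disparate impact ratio {disparate_impact:.2f} below {di_threshold}")
    fprs = list(fpr_by_group.values())
    fpr_gap = max(fprs) - min(fprs)  # equalized-odds style gap on false positive rates
    if fpr_gap > fpr_gap_threshold:
        findings.append(f"false positive rate gap {fpr_gap:.2f} above {fpr_gap_threshold}")
    return ("FAIL", findings) if findings else ("PASS", findings)

# Hypothetical usage: selection rates are illustrative; FPRs mirror the Results section.
status, issues = fairness_audit(
    selection_rates={"Black": 0.58, "white": 0.33},
    fpr_by_group={"Black": 0.45, "white": 0.23},
)
print(status, issues)

An audit gate of this kind would sit in the deployment pipeline, so a failing check blocks rollout and triggers the human review and transparency measures recommended above.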
7. Conclusion
AI holds promise for making justice systems more efficient, but without ethical safeguards, it
risks deepening structural inequality. Transparent, fair, and accountable AI is essential in
contexts with such profound consequences for human lives and liberty.
References
1. Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016). Machine Bias. ProPublica.
2. Barocas, S., & Selbst, A. D. (2016). Big Data's Disparate Impact. California Law Review,
104(3), 671–732.
3. Berk, R., Heidari, H., Jabbari, S., Kearns, M., & Roth, A. (2018). Fairness in Criminal Justice
Risk Assessments. PMLR, 81, 1–9.
4. Eubanks, V. (2018). Automating Inequality. St. Martin’s Press.