
RESEARCH ARTICLE SOCIAL SCIENCES OPEN ACCESS

Eliminating unintended bias in personalized policies using bias-eliminating adapted trees (BEAT)

Eva Ascarza (a,1) and Ayelet Israeli (a,1)

Edited by Elke Weber, Departments of Psychology & Public Affairs, Andlinger Center for Energy and Environment, Princeton University, Princeton, NJ;
received August 19, 2021; accepted December 26, 2021

An inherent risk of algorithmic personalization is disproportionate targeting of individuals from certain groups (or demographic characteristics, such as gender or race), even when the decision maker does not intend to discriminate based on those "protected" attributes. This unintended discrimination is often caused by underlying correlations in the data between protected attributes and other observed characteristics used by the algorithm to create predictions and target individuals optimally. Because these correlations are hidden in high-dimensional data, removing protected attributes from the database does not solve the discrimination problem; instead, removing those attributes often exacerbates the problem by making it undetectable and in some cases, even increases the bias generated by the algorithm. We propose BEAT (bias-eliminating adapted trees) to address these issues. This approach allows decision makers to target individuals based on differences in their predicted behavior—hence, capturing value from personalization—while ensuring a balanced allocation of resources across individuals, guaranteeing both group and individual fairness. Essentially, the method only extracts heterogeneity in the data that is unrelated to protected attributes. To do so, we build on the general random forest (GRF) framework [S. Athey et al., Ann. Stat. 47, 1148–1178 (2019)] and develop a targeting allocation that is "balanced" with respect to protected attributes. We validate BEAT using simulations and an online experiment with N = 3,146 participants. This approach can be applied to any type of allocation decision that is based on prediction algorithms, such as medical treatments, hiring decisions, product recommendations, or dynamic pricing.

algorithmic bias | personalization | targeting | fairness | discrimination

Significance

Decision makers now use algorithmic personalization for resource allocation decisions in many domains (e.g., medical treatments, hiring decisions, product recommendations, or dynamic pricing). An inherent risk of personalization is disproportionate targeting of individuals from certain protected groups. Existing solutions that firms use to avoid this bias often do not eliminate the bias and may even exacerbate it. We propose BEAT (bias-eliminating adapted trees) to ensure balanced allocation of resources across individuals—guaranteeing both group and individual fairness—while still leveraging the value of personalization. We validate our method using simulations as well as an online experiment with N = 3,146 participants. BEAT is easy to implement in practice, has desirable scalability properties, and is applicable to many personalization problems.

Author affiliations: a Marketing Unit, Harvard Business School, Harvard University, Boston, MA 02163

Author contributions: E.A. and A.I. designed research, performed research, contributed new analytic tools, analyzed data, and wrote the paper.

The authors declare no competing interest.

This article is a PNAS Direct Submission.

Copyright © 2022 the Author(s). Published by PNAS. This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).

1 To whom correspondence may be addressed. Email: eascarza@hbs.edu or aisraeli@hbs.edu.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2115293119/-/DCSupplemental.

Published March 8, 2022.

In the era of algorithmic personalization, resources are often allocated based on individual-level predictive models. For example, financial institutions allocate loans based on individuals' expected risk of default, advertisers display ads based on users' likelihood to respond to the ad, hospitals allocate organs to patients based on their chances to survive, and marketers allocate price discounts based on customers' propensity to respond to such promotions. The rationale behind these practices is to leverage differences across individuals, such that a desired outcome can be optimized via personalized or targeted interventions. For example, a financial institution would reduce risk of default by approving loans to individuals with the lowest risk of defaulting, advertisers would increase profits when targeting ads to users who are most likely to respond to those ads, and so forth.

There are, however, individual differences that firms may not want to leverage for personalization, as they might lead to disproportionate allocation to a specific group. These individual differences may include gender, race, sexual orientation, or other protected attributes. In fact, several countries have instituted laws against discrimination based on protected attributes in certain domains (e.g., in voting rights, employment, education, and housing). However, discrimination in other domains is lawful but is often still perceived as unfair or unacceptable (1). For example, it is widely accepted that ride-sharing companies set higher prices during peak hours, but these companies were criticized when their prices were found to be systematically higher in non-White neighborhoods compared with White areas (2).

Intuitively, a potentially attractive solution to this broad concern of protected attributes–based discrimination may be to remove the protected attributes from the data and to generate a personalized allocation policy based on the predictions obtained from models trained using only the unprotected attributes. However, such an approach would not solve the problem as there might be other variables remaining in the dataset that are related to the protected attributes and therefore, will still generate bias. Interestingly, as we show in our empirical section, there are cases in which removing protected attributes from the data can actually increase the degree of discrimination on the protected attributes



(i.e., a firm that chooses to exclude protected attributes from its database might create a greater imbalance). This finding is particularly relevant today because companies are increasingly announcing their plans to stop using protected attributes in fear of engaging in discrimination practices. In our empirical section, we show the conditions under which this finding applies in practice.

Personalized allocation algorithms typically use data as input to a two-stage model. First, the data are used to predict accurate outcomes based on the observed variables in the data (the "inference" stage). Then, these predictions are used to create an optimal targeting policy with a particular objective function in mind (the "allocation" stage). The (typically unintended) biases in the policies might occur because the protected attributes are often correlated with the predicted outcomes. Thus, using either the protected attributes themselves or variables that are correlated with those protected attributes in the inference stage may generate a biased allocation policy.*

This biased personalization problem could in principle be solved using constrained optimization, focusing on the allocation stage of the algorithm (e.g., refs. 3 and 4). Using this approach, a constraint is added to the optimization problem such that individuals who are allocated to receive treatment (the "targeted" group) are not systematically different in their protected attributes from those who do not receive treatment. Although methods for constrained optimization problems often work well in low dimensions, they are sensitive to the curse of dimensionality (e.g., if there are multiple protected attributes).

Another option would be to focus on the data that are fed to the algorithm and "debias" the data before they are used: that is, transform the unprotected variables such that they become independent of the protected attributes and use the resulting data in the two-stage model (e.g., refs. 5 and 6). While doing so guarantees pairwise independence of each variable from the protected attributes, it is difficult to account for underlying dependencies between the protected attributes and interactions of the different variables (6). Most importantly, while these methods are generally effective at achieving group fairness (statistical parity), they often harm individual fairness (7–9).† Finally, debiasing methods require the decision maker to collect protected attributes at all times, both when estimating the optimal policy and when applying that policy to new individuals. A more desirable approach would be to create a mapping between unprotected attributes and policy allocations that not only is fair (both at the group level and at the individual level) but can also be applied without the need to collect protected attributes for new individuals.

In this paper, we depart from those approaches and propose an approach that addresses the potential bias at the inference stage (rather than pre- or postprocessing the data or adding constraints to the allocation). Our focus is to infer an object of interest—"conditional balanced targetability" (CBT)—that measures the adjusted treatment effect predictions, conditional on a set of unprotected variables. Essentially, we create a mapping from unprotected attributes to a continuous targetability score that leads to balanced allocation of resources with respect to the protected attributes. Previous papers that modified the inference stage (e.g., refs. 10–14) are limited in their applicability because they typically require additional assumptions and restrictions and are limited in the type of classifiers they apply to. The benefits of our approach are noteworthy. First, allocating resources based on CBT scores does, by design, achieve both group and individual fairness. Second, we leverage computationally efficient methods for inference that are easy to implement in practice and also have desirable scalability properties. Third, out-of-sample predictions for CBT do not require protected attributes as an input. In other words, firms or institutions seeking allocation decisions that do not discriminate on protected attributes only need to collect the protected attributes when calibrating the model. Once the model is estimated, future allocation decisions can be based on (out-of-sample) predictions, which only require the unprotected attributes of the new individuals.

We propose a practical solution where the decision maker can leverage the value of personalization without the risk of disproportionately targeting individuals based on protected attributes. The solution, which we name BEAT (bias-eliminating adapted trees), generates individual-level predictions that are independent of any preselected protected attributes. Our approach builds on general random forests (GRFs) (15, 16), which are designed to efficiently estimate heterogeneous outcomes. Our method preserves most of the core elements of GRF, including the use of forests as a type of adaptive nearest neighbor estimator and the use of gradient-based approximations to specify the tree-split point. Importantly, we depart from GRF in how we select the optimal split for partitioning. Rather than using divergence between children nodes as the primary objective of any partition, the BEAT algorithm combines two objectives—heterogeneity in the outcome of interest and homogeneity in the protected attributes—when choosing the optimal split. Essentially, the BEAT method only identifies individual differences in the outcome of interest (e.g., heterogeneity in response to price) that are homogeneously distributed in the protected attributes (e.g., race). As a result, not only will the protected attributes be equally distributed across policy allocations (group fairness), but the method will also ensure that individuals with the same unprotected attributes would have the same allocation (individual fairness).

Using a variety of simulated scenarios, we show that our method exhibits promising empirical performance. Specifically, BEAT reduces the unintended bias while leveraging the value of personalized targeting. Further, BEAT allows the decision maker to quantify the trade-off between performance and discrimination. We also examine the conditions under which the intuitive approach of removing protected attributes from the data alleviates or increases the bias. Finally, we apply our solution to a marketing context in which a firm decides which customers to target with a discount coupon. Using an online sample of n = 3,146 participants, we find strong evidence of relationships between "protected" and "unprotected" attributes in real data. Moreover, applying personalized targeting to these data leads to significant bias against a protected group (in our case, older populations) due to these underlying correlations. Finally, we demonstrate that BEAT mitigates the bias, generating a balanced targeting policy that does not discriminate against individuals based on protected attributes.

Our contribution fits broadly into the vast literature on fairness and algorithmic bias (e.g., refs. 2 and 17–22). Most of this literature has focused on uncovering biases and their causes as well as on conceptualizing the algorithmic bias problem and potential solutions for researchers and practitioners. We complement this literature by providing a practical solution that prevents algorithmic bias that is caused by underlying correlations. Our work also builds on the growing literature on treatment personalization (e.g., refs. 23–28). This literature has mainly focused on the estimation of heterogeneous treatment effects and designing targeting rules accordingly, but it has largely ignored any fairness or discrimination considerations in the allocation of treatment.

*Note that algorithmic bias may also occur due to unrepresentative or biased data. This paper focuses on bias that is created due to underlying correlations in the data.
†Statistical parity ensures that the distribution of protected groups is similar between the treated and nontreated groups. Individual fairness ensures that individuals of different protected groups that have the same value of unprotected attributes are treated equally.


Fig. 1. Estimated CATE using GRF including protected and unprotected attributes. A shows the distribution of CATE; B–E show the relationships between
unprotected characteristics X1 through X4 (x axis), respectively, and predicted CATE (y axis); and F shows the distribution of CATE by different values of the
protected attribute Z1 .

Illustrative Example

We begin with a simulated example to illustrate how algorithmic personalization can create unintended biases. Consider a marketing context in which a firm runs an experiment to decide which customers should receive the treatment (e.g., coupon) in the future. That is, the results of the experiment will determine which customers are most likely to be impacted by future treatment.

We recreate this situation by simulating the behavior of N = 10,000 + 5,000 (train and test) individuals. We use Yi ∈ R to denote the outcome (e.g., profitability) of individual i, Zi ∈ R^D to denote the vector of protected attributes (e.g., race, gender), Xi ∈ R^K to denote the set of unprotected variables that the firm has collected about the individual (e.g., past purchases, channel of acquisition), and Wi ∈ {0, 1} to denote whether customer i has received the treatment.‡ We generate individual-level data as follows:

Xik ∼ Normal(0, 1) for k ∈ {1, . . . , 4},
Xik ∼ Bernoulli(0.3) for k ∈ {5, . . . , 10},
Zid ∼ Bernoulli((1 + e^(−6 Xi2))^(−1)) for d = 1,
Zid ∼ Normal(0, 1) for d ∈ {2, 3},
Zid ∼ Bernoulli(0.3) for d = 4,

and we simulate individual behavior following the process

Yi = τi Wi + εi, εi ∼ Normal(0, 1),
τi = max(Xi1, 0) − 0.5 Xi3 + Zi1.

The main patterns to highlight are that the unprotected attribute Xi2 and the protected attribute Zi1 are positively correlated; that individuals respond differently to treatment depending on Xi1, Xi3, and Zi1; and that all other variables have no (direct) impact on the treatment effect (τi) or on the outcome of interest (Yi).§
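The NumPy sketch below reproduces this data-generating process. It is our illustration rather than the authors' replication code: the 50/50 treatment assignment in the pilot experiment is our assumption, and the re-centering and scaling of τi mentioned in the footnotes is omitted for brevity.

```python
import numpy as np

def simulate_population(n, seed=0):
    """Sketch of the data-generating process above (column j holds X_{j+1})."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([
        rng.normal(0.0, 1.0, size=(n, 4)),       # X1..X4 ~ Normal(0, 1)
        rng.binomial(1, 0.3, size=(n, 6)),       # X5..X10 ~ Bernoulli(0.3)
    ]).astype(float)
    p_z1 = 1.0 / (1.0 + np.exp(-6.0 * X[:, 1]))  # Z1 depends on X2, inducing the correlation
    Z = np.column_stack([
        rng.binomial(1, p_z1),                   # Z1 ~ Bernoulli(logistic(6 * X2))
        rng.normal(0.0, 1.0, size=(n, 2)),       # Z2, Z3 ~ Normal(0, 1)
        rng.binomial(1, 0.3, size=n),            # Z4 ~ Bernoulli(0.3)
    ]).astype(float)
    tau = np.maximum(X[:, 0], 0.0) - 0.5 * X[:, 2] + Z[:, 0]   # treatment effect
    # 50/50 random assignment is our assumption; the normalization of tau
    # described in the footnote is omitted here.
    W = rng.binomial(1, 0.5, size=n)
    Y = tau * W + rng.normal(0.0, 1.0, size=n)
    return X, Z, W, Y, tau

X, Z, W, Y, tau = simulate_population(15_000)    # 10,000 train + 5,000 test
```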
Case 1: Maximizing Outcomes. A firm maximizing the outcome (Yi) via allocation of treatment Wi would estimate τi, the conditional average treatment effect (CATE), as a function of the observables Xi and Zi and would allocate treatment to the units with the highest predicted CATE (τ̃i). We use GRF as developed in ref. 16 to estimate the model and present the results (on test data) in Fig. 1. Fig. 1A shows the distribution of τ̃i across the population, showing a rich variation across individuals. Fig. 1 B–E presents the marginal effect of each of the variables X1 through X4 on τ̃i. GRF captures the true relationships (or lack thereof) between the observables and the treatment effect very well. At first glance, the relationship shown in Fig. 1C is surprising given that, by construction, X2 does not impact τ̃i. However, the correlation between Z1 and X2 causes this indirect relationship, which is revealed in Fig. 1.

Evidence of "unintended bias" is presented in Fig. 1F, where we plot the distribution of τ̃i by the different values of the binary protected attribute Z1.¶ It clearly shows the systematic difference in treatment effect between customers with Zi1 = 0 and Zi1 = 1, causing a disproportional representation of customers with Zi1 = 1 in the targeted group. For example, if the firm was to target 50% of its customers in decreasing order of τ̃i, 12 times more individuals with Z = 1 would be targeted compared with Z = 0 individuals (i.e., the ratio is 12:1), and the outcome would be 81% higher than that obtained from an average random policy. Specifically, to compute the relative outcome, we compute the inverse probability score (IPS) estimator of the outcome generated by each allocation policy in the test data if the firm was to target 50% of the population. We then normalize this metric with respect to the outcome generated by a random allocation, such that 0% corresponds to the outcome if the firm did not leverage any personalization. (We chose a fixed proportion of individuals to be targeted [i.e., 50%] to keep the "cost" of the intervention constant across scenarios. Our results are robust to choosing different levels of targeting.#)

Finally, to explore the degree of individual fairness of the policy, we compute the fraction of individuals for whom the allocation would change had their protected attributes been different. This metric, which we denote as Δ Policy, is inspired by ref. 9. Essentially, for each individual, we compare their allocation with that of their "almost identical twin" who is identical in all unprotected attributes but different on the protected attributes.|| We find that Δ Policy = 83%, meaning that 83% of individuals would have been assigned to a different policy allocation had their protected attributes been different.
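As a concrete reference for these evaluation quantities, the sketch below computes an IPS-based lift over a random policy, the targeting-rate ratio by Z1, and Δ Policy. It is a simplified illustration under a 50/50 randomized experiment; the function names and exact normalizations are ours, not the authors' code.

```python
import numpy as np

def targeted_mask(score, share=0.5):
    """Individuals in the top `share` of the targeting score."""
    return score >= np.quantile(score, 1.0 - share)

def policy_value_and_imbalance(score, Y, W, Z1, propensity=0.5, share=0.5):
    """IPS lift over a random policy and the targeting-rate ratio by Z1."""
    targeted = targeted_mask(score, share)
    # Keep an observation only when its realized treatment matches the policy's
    # assignment, reweighted by the probability of that treatment in the experiment.
    match = (W == targeted.astype(int))
    prob = np.where(targeted, propensity, 1.0 - propensity)
    v_policy = np.mean(match * Y / prob)
    v_random = np.mean(Y)   # under a 50/50 experiment, the IPS value of a random 50% policy
    lift = v_policy / v_random - 1.0
    ratio = targeted[Z1 == 1].mean() / targeted[Z1 == 0].mean()
    return lift, ratio

def delta_policy(score, score_flipped, share=0.5):
    """Share of individuals whose allocation changes when they are re-scored with
    flipped protected attributes (1 - Z for binary, +/- 1 SD for continuous)."""
    cutoff = np.quantile(score, 1.0 - share)
    return np.mean((score >= cutoff) != (score_flipped >= cutoff))
```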

‡For simplicity, we consider a single allocation decision: yes/no. This can easily be extended to multiple treatment assignments.
§In generating the data, we normalize τi to be centered at zero and have SD of one across the population. That way, the degree of heterogeneity is comparable across scenarios where we vary the functional form of τi (Table 1).
¶In the interest of space, we do not report treatment effects or allocations for Z2 to Z4 because they do not impact τi or Yi.
#The IPS (29) is a popular metric for policy counterfactual evaluation in computer science and statistics.
||For binary attributes, we simply flip attribute Z by using 1 − Z. For continuous attributes, we first standardize the data, and then, we move it by ±1 SD.




Fig. 2. Estimated CATE using GRF including unprotected attributes only. A shows the distribution of CATE; B–E show the relationships between unprotected
characteristics X1 through X4 (x axis), respectively, and predicted CATE (y axis); and F shows the distribution of CATE by different values of the protected
attribute Z1 .

Case 2: Maximizing Outcomes and Removing Protected Attributes. A firm might try to avoid discrimination by removing protected attributes from the data. Naturally, Δ Policy = 0 in this case because the protected variables are excluded from the whole analysis, and therefore, a change in protected attributes would not change the policy allocation. To recreate that scenario, we replicate the analysis in case 1, but we exclude all Z variables from the policy analysis. The results are presented in Fig. 2 and are remarkably similar to those in which protected attributes are used. We direct attention to Fig. 2F, which shows the persistent systematic differences between individuals with Z1 = 1 and Z1 = 0, even when protected attributes Z were excluded from the estimation. Here, the presence of X2 causes the unbalanced targeting, which in this case translates to a ratio of 5:1 and an increase of 71% with respect to the average random policy.

Case 3: Maximizing Outcomes and Balance Jointly. We propose to pursue both objectives jointly by using BEAT. The key idea of our approach is to be selective about which differences in the data are "usable" for policy allocation. For example, even though, empirically, there is a relationship between τ̃i and Xi2, BEAT will not leverage individual differences on X2 when estimating the targetability scores because that would necessarily lead to unbalanced targeting with respect to Z1. Importantly, our approach does not require the user to know which unprotected variables are related to the protected attributes or the shape and strength of those relationships. Instead, it captures any kind of relationship between them. The only information that the user needs a priori is the identity of the attributes that are meant to be protected.

Fig. 3 shows the results of using BEAT. Starting from Fig. 3F, it is remarkable how the distribution of CBT is almost identical for the different values of Z1, showing BEAT's ability to produce balanced outcomes. The ratio of targeted individuals by Z1 is 1:1, and the outcome is a 44% increase over that of the average random policy. Although Fig. 3 B and D shows a clear relationship between the unprotected attributes and CBT, Fig. 3C highlights that the proposed algorithm does not capture differences by unprotected variables (in this case, X2) that lead to bias in targeting.

Finally, looking at Fig. 3A, the distribution of CBT is narrower than those obtained for CATE in Figs. 1A and 2A. This is a direct result of how BEAT leverages heterogeneity. Because the algorithm only extracts differences that are unrelated to protected attributes, it naturally captures less variability in the outcome of interest than other algorithms that are designed to achieve the single objective of exploring heterogeneity. Accordingly, the outcome in case 3 is lower than in the previous cases.

Regarding individual fairness, even though BEAT incorporates Z in the analysis—to ensure a balanced targetability—the method is designed such that the CBT scores do not depend on the protected attributes and therefore, will always achieve individual fairness (i.e., Δ Policy = 0).

Methods

Overview. The primary goal of BEAT is to identify (observed) heterogeneity across individuals that is unrelated to their protected attributes. That heterogeneity is then used to implement targeted allocations that are, by construction, balanced on the protected attributes. Furthermore, we want our method to be as general as possible, covering a wide variety of personalization tasks. In the illustrative example, we were interested in identifying differences in the effect of a treatment Wi on an outcome Yi. This is commonly the case for personalized medicine, targeted firm interventions, or personalized advertising. In other cases (e.g., approving a loan based on the probability of default), the goal would be to extract heterogeneity on the outcome itself, Yi (e.g., Yi indicates whether the individual will default in the future).


Fig. 3. Estimated CBT using BEAT. A shows the distribution of CBT; B–E show the relationships between unprotected characteristics X1 through X4 (x axis),
respectively, and predicted CBT (y axis); and F shows the distribution of CBT by different values of the protected attribute Z1 .

Fig. 4. Efficiency/balance trade-off for an illustrative example. Efficiency is the proportion increase in outcome compared with random allocation. Imbalance
is the Euclidean distance between the average value of the protected attribute (Z1 ) in treated and nontreated units of each policy. We compare CF-FD, causal
forest (CF) without the protected attributes, and BEAT with 0 ≤ γ ≤ 5. Select values of γ are reported.

We build on GRFs (15, 16), which are designed to efficiently estimate heterogeneous outcomes. Generally speaking, GRF uses forests as a type of adaptive nearest neighbor estimator and achieves efficiency by using gradient-based approximations to specify the tree-split point. (Ref. 16 has details on the method, and https://github.com/grf-labs/grf has details on its implementation.) The key difference between GRF and our method is the objective when partitioning the data to build the trees (and therefore, forests). Rather than maximizing divergence in the object of interest (either treatment effect heterogeneity for the causal forest or the outcome variable for prediction forests) between the children nodes, BEAT maximizes a quantity we define as balanced divergence (BD), which combines the divergence of the outcome of interest (object from GRF) with a penalized distance between the empirical distribution of the protected attributes across the children nodes.

Specifically, in the BEAT algorithm, trees are formed by sequentially finding the split that maximizes the BD object:

BD(C1, C2) = Δ̂(C1, C2) − γ Dist(Z̄_C1 | Xk,s, Z̄_C2 | Xk,s),   [1]

where the first term is the GRF split criterion and the second term is the penalty added by BEAT. C1 and C2 denote the children nodes in each split s; Δ̂(C1, C2) denotes the divergence in the outcome of interest (as defined in equation 9 in ref. 16; for brevity, we refer to the notation and expressions introduced in ref. 16); γ is a nonnegative scalar denoting the balance penalty, which can be adjusted by the researcher (a trade-off discussion is given below); Dist is a distance function (e.g., Euclidean norm); and Xk,s is the splitting point for C1 and C2 at dimension k ∈ K.** When selecting splits to build a tree, ref. 16 optimizes the first term of Eq. 1, indicated as the "GRF split criterion," and BEAT essentially adds a second term to the equation, introducing a penalty when there are differences in the Z attributes between the children nodes. Therefore, maximizing BD when sequentially splitting the data ensures that all resulting trees (and hence, forests) are balanced with respect to the protected attributes Z.
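A minimal sketch of this split rule follows. The exact Δ̂ of equation 9 in ref. 16 is replaced by a size-weighted squared difference in child means of a generic pseudo-outcome `rho`, so the code illustrates the structure of Eq. 1 rather than reproducing GRF's internals; all names are ours.

```python
import numpy as np

def balanced_divergence(rho, Z, left_mask, gamma):
    """BD score of Eq. 1 for one candidate split. `rho` stands in for the
    pseudo-outcomes GRF uses; the size-weighted squared difference in child
    means below is a simplified proxy for the Delta-hat of eq. 9 in ref. 16."""
    right_mask = ~left_mask
    n, n_left, n_right = len(rho), left_mask.sum(), right_mask.sum()
    if n_left == 0 or n_right == 0:
        return -np.inf
    delta_hat = (n_left * n_right / n**2) * (
        rho[left_mask].mean() - rho[right_mask].mean()) ** 2
    # Penalty: Euclidean distance between the children's mean protected attributes.
    dist = np.linalg.norm(Z[left_mask].mean(axis=0) - Z[right_mask].mean(axis=0))
    return delta_hat - gamma * dist

def best_split_on_feature(x_k, rho, Z, gamma, n_grid=50):
    """Greedy choice of the split point on one feature that maximizes BD."""
    candidates = np.quantile(x_k, np.linspace(0.05, 0.95, n_grid))
    scores = [balanced_divergence(rho, Z, x_k <= c, gamma) for c in candidates]
    return candidates[int(np.argmax(scores))]
```

With γ = 0 the penalty vanishes and the rule reduces to the (simplified) divergence criterion alone, which mirrors the special case discussed below.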
Once trees are grown by maximizing Eq. 1, we proceed exactly as GRF and use the trees to calculate the similarity weights used to estimate the object of interest. More formally, indexing the trees by b = 1, . . . , B and defining a leaf Lb(x) as the set of training observations falling in the same leaf as x, we calculate balanced weights α̃bi(x) as the frequency with which the ith observation falls into the same leaf as x, and we calculate the CBT score by combining the weights and the local estimate of the quantity estimated by GRF (i.e., the treatment effect heterogeneity [τi] for causal forest or expected outcome variable [μi] for prediction forests). For example, in the simplest case of a prediction forest, we can express the CBT score as

CBT(x) = Σ_{i=1}^{N} α̃i(x) μ̂i,   [2]

where α̃i(x) = (1/B) Σ_{b=1}^{B} α̃bi(x) and μ̂i is the leave-one-out estimator for E[Yi | Xi = x].

Note that the only difference between the CBT scores provided by BEAT and the estimates provided by GRF is the similarity weights, whereby α̃bi(x) (from BEAT) are balanced with respect to Z and αbi(x) (from GRF) do not depend on Z. For the special case in which variables X and Z are independent, maximizing Eq. 1 corresponds to maximizing Δ̂(C1, C2), exactly as GRF does, and therefore, CBTi(x) from Eq. 2 corresponds to the outcome of GRF, τ̂(x) for causal forests and μ̂(x) for regression forests.
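The weighting step of Eq. 2 can be sketched as follows, assuming the balanced trees have already been grown and each training observation's leaf membership per tree is available; the array layout is our assumption for illustration.

```python
import numpy as np

def cbt_score(train_leaf_ids, query_leaf_ids, mu_hat):
    """Eq. 2 as a weighted average. `train_leaf_ids` has shape (B, N): the leaf
    of each of the N training observations in each of the B balanced trees;
    `query_leaf_ids` (shape (B,)) is the leaf that x falls into in each tree;
    `mu_hat` holds the local (leave-one-out) estimates for the training points."""
    B, N = train_leaf_ids.shape
    alpha = np.zeros(N)
    for b in range(B):
        in_leaf = train_leaf_ids[b] == query_leaf_ids[b]
        if in_leaf.any():
            # Each tree spreads weight 1/B over the training points sharing x's leaf.
            alpha += in_leaf / (B * in_leaf.sum())
    return float(np.dot(alpha, mu_hat))   # CBT(x) = sum_i alpha_i(x) * mu_hat_i
```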
Efficiency/Balance Trade-Off. The penalty parameter γ in Eq. 1 determines how much weight the algorithm gives to balance in protected attributes. Running BEAT while setting γ = 0 would yield exactly the same outcomes as running GRF without protected attributes. As γ increases, the imbalance in the protected attributes would be reduced, but the main outcome to optimize may be lower as well. Naturally, introducing an additional constraint in contexts where bias exists would necessarily reduce efficiency while improving fairness. Because the penalty parameter in BEAT is flexible, one can easily explore the efficiency/fairness trade-off of any particular context by adjusting the value of γ. Fig. 4 illustrates this trade-off using the illustrative example.

The y axis in the figure represents the efficiency of each method, measured as the value obtained by personalization compared with the average random policy as described above, while the x axis captures the imbalance in protected attribute Z1, which is the overallocation proportion of Z1 = 1.

As illustrated by the example, when using the causal forest (CF) with the full data (namely, CF-FD: CF with full data; case 1 above), both the outcome and imbalance are the highest: at 81% over random and 0.69, respectively. Using a causal forest without the protected attributes (case 2 above) yields lower results for both axes: 71% and 0.42, respectively. Finally, the red line captures the performance of BEAT when using different penalty weights γ and so, captures the efficiency/balance trade-off in this particular example. If one sets γ = 0, the outcome is identical to that of case 2. Then, as γ increases (γ is increased by 0.05 intervals in the figure), there is a reduction in both outcome and imbalance, getting to a minimum outcome of 44% when imbalance is reduced all the way to zero.†† Note that the relationship between efficiency and balance in this example is not linear—for values of γ ∈ [1.1, 1.25], BEAT can reduce imbalance significantly without sacrificing too much efficiency. This gives the decision maker a degree of freedom to allow a certain amount of imbalance while ensuring sufficient efficiency.

**To gain efficiency when computing Z̄_Cc | Xk,s in each child c, we preprocess the data. First, we split Xk into 256 equal bins and subtract the average of Z within the bin. Then, we take the average of Z for each splitting point Xk,s and child node c to obtain Z̄_Cc | Xk,s.
††Note that setting γ to values greater than 1.35 does not reduce the efficiency of the method, with 44% being the lowest outcome obtained by the BEAT method.
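One way to trace the curve in Fig. 4 is sketched below. `fit_beat` is a hypothetical callable standing in for a BEAT fit (for example, via the released package); only the sweep-and-evaluate structure is illustrative, and the metric computations mirror the simplified IPS evaluation sketched earlier.

```python
import numpy as np

def gamma_sweep(fit_beat, data_train, data_test, gammas=np.arange(0.0, 5.05, 0.05)):
    """Trace (gamma, efficiency, imbalance) points as in Fig. 4. `fit_beat` is a
    hypothetical callable that fits BEAT on `data_train` with the given gamma
    and returns CBT scores for the test covariates."""
    X_test, Z1_test, W_test, Y_test = data_test
    curve = []
    for gamma in gammas:
        cbt = fit_beat(data_train, X_test, gamma)       # hypothetical interface
        targeted = cbt >= np.quantile(cbt, 0.5)         # target the top 50%
        match = (W_test == targeted.astype(int))
        lift = np.mean(match * Y_test / 0.5) / np.mean(Y_test) - 1.0   # 50/50 experiment
        imbalance = abs(targeted[Z1_test == 1].mean() - targeted[Z1_test == 0].mean())
        curve.append((gamma, lift, imbalance))
    return curve
```

Inspecting the resulting curve lets the decision maker pick a γ in the flat region, where imbalance drops sharply while efficiency is largely preserved.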


Simulation. We validate the effectiveness of BEAT by simulating a large variety of scenarios (SI Appendix, Table S1 shows the full set of results). Table 1 presents a subset of results that illustrate BEAT's performance compared with other personalization methods.

Table 1. Simulated scenarios: Comparing outcomes across methods

Scenario 1 (↑ corr.; τ = f(Z)), %:
  Method      Efficiency   Imbalance   Δ Policy
  CF-FD       81.2         100.0       82.8
  CF-NP       71.1         60.8        0.0
  Debiased    45.8         0.3         13.5
  BEAT*       44.5         0.1         0.0

Scenario 2 (↓ corr.; τ = f(Z)), %:
  Method      Efficiency   Imbalance   Δ Policy
  CF-FD       78.7         100.0       82.6
  CF-NP       50.8         5.6         0.0
  Debiased    43.0         0.1         8.9
  BEAT*       42.1         0.1         0.0

Scenario 3 (↓ corr.; τ = f(Z, X2)), %:
  Method      Efficiency   Imbalance   Δ Policy
  CF-FD       74.6         100.0       28.8
  CF-NP       71.8         30.3        0.0
  Debiased    59.9         0.6         24.3
  BEAT*       44.8         0.1         0.0

Scenario 4 (↓ corr.; τ = f(Z, −X2)), %:
  Method      Efficiency   Imbalance   Δ Policy
  CF-FD       76.3         100.0       1.6
  CF-NP       76.7         112.0       0.0
  Debiased    75.0         0.8         26.7
  BEAT*       45.4         0.5         0.0

Results comparing CF-FD, CF-NP, debiased, and BEAT are reported. Debiased uses regression forests to debias the data. Additional benchmarks are reported in SI Appendix (SI Appendix, Table S1). "corr." indicates correlation, and the arrow sign indicates whether there is high or low correlation between Z and X2. Efficiency is measured as the percentage increase in outcome over random allocation. Imbalance is normalized to 100% for the imbalance obtained with CF-FD (e.g., the imbalance entry for CF-NP in scenario 1 should be read as follows: CF-NP generates 60.8% of the imbalance obtained when using the full data). Δ Policy measures the percentage of individuals for whom the outcome would change if their protected attributes were different.
*Scalar γ is set to a large number, yielding the most conservative estimates for efficiency and imbalance.

Scenario 1 in Table 1 presents the illustrative example. Then, in scenario 2, we use this baseline example and reduce the strength of the relationship between the protected attributes (Z) and the unprotected attributes (X2). In scenarios 3 and 4, we further vary the empirical relationship between (X2, Z) and the variable of interest (τ). Specifically, we allow both (X2, Z) to directly affect τ, either with the same or the opposite sign. The table compares the performance of BEAT and other approaches, including running GRF with and without the protected attributes (cases 1 and 2 above) and running GRF using debiased data X′, with X′ = X − fx(Z), with fx being a (multivariate) random forest with X as outcome variables and Z as features.
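A sketch of this debiasing benchmark is shown below, using scikit-learn's random forest for fx; the hyperparameters are arbitrary choices for illustration, not the authors' settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def debias_features(X, Z, n_estimators=500, random_state=0):
    """Debiased benchmark: X' = X - f_x(Z), with f_x a (multivariate) random
    forest that predicts the unprotected attributes from the protected ones."""
    f_x = RandomForestRegressor(n_estimators=n_estimators, random_state=random_state)
    f_x.fit(Z, X)                    # multioutput regression: Z -> X
    return X - f_x.predict(Z)        # keep only the variation unexplained by Z
```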
These scenarios illustrate several important patterns that highlight the benefits of BEAT compared with existing solutions. First, when the unprotected attributes do not directly impact the variable of interest (scenarios 1 and 2), debiasing the X's works almost as well as BEAT—the imbalance is completely removed. However, debiasing also suffers from individual fairness bias compared with BEAT. Second, comparing scenarios 1 and 2, when the correlation between the protected and unprotected attributes is relatively low, GRF without protected characteristics (which we name CF-NP: a CF with no protected attributes) is almost as effective as BEAT and debiasing in removing the imbalance. Third, once the correlated variables impact the variable of interest directly (scenarios 3 and 4), debiasing is almost as effective as BEAT in removing the imbalance, but the individual fairness bias becomes worse. Further, when both unprotected and protected attributes impact the variable of interest but with opposite signs (scenario 4), removing protected attributes yields more imbalance compared with the baseline approach that uses all the data. From a practical point of view, this insight is relevant because firms and organizations are increasingly eliminating protected attributes from their databases (or are pressured to do so). Our analysis shows that doing so could make things worse as there are cases when removing protected attributes increases—rather than decreases—the imbalance between protected and unprotected groups.
an experiment (or a pilot test) among a subset of users to determine the optimal
Experiment. We demonstrate the real-world applicability of our method using experimental data. As highlighted in Table 1, the imbalance in outcomes depends on the strength of the relationships between protected and unprotected attributes as well as the underlying relationships between observables and outcomes. Therefore, the goal of this analysis is twofold: 1) investigate the extent to which underlying correlations between protected and unprotected characteristics occur in real-world data and 2) explore whether those underlying relationships lead to imbalance against a protected group (if so, demonstrate how BEAT mitigates the imbalance and generates balanced outcomes when those relationships are ingrained in high-dimensional data, as is the case in most real applications).

To that end, we ran an incentive-compatible online experiment where 3,146 participants were randomly assigned to different conditions of a marketing campaign akin to a situation where some participants were offered a $5 discount and others were not (SI Appendix, section 2 has details). Participants were asked to make a subsequent choice of buying from the company that offered the coupon or the competitor. In addition, we collected behavioral and self-reported measures about each participant (e.g., brand preferences, attitudes toward social issues, behavior in social media sites) as well as demographic characteristics, which are generally considered as protected attributes (age, gender, and race), to explore the underlying correlations between unprotected and protected attributes and the relationship between these variables and the choice outcome. All materials were reviewed and approved by Harvard University's Institutional Review Board, and all participants provided informed consent. The experiment materials and results are available in the SI Appendix.

First, we explore the correlations between the protected attributes and other customer variables, for which we run a set of (univariate) regressions of each protected attribute on the (unprotected) variables. To illustrate the correlations, we report this analysis for the "activity" variables.‡‡ Specifically, we use linear regression for age and logistic regressions for gender (coded as nonmale) and race (coded as non-White).
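These univariate screens can be sketched with scikit-learn as follows (our implementation, not the authors' code; the 95% CIs reported in Fig. 5 would additionally require standard errors, e.g., from statsmodels). The large C value approximates an unpenalized logistic fit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def univariate_screens(activity, age, nonmale, nonwhite):
    """One regression per activity variable: linear for age, logistic for the
    binary protected attributes. Returns {column index: (b_age, b_nonmale, b_nonwhite)}."""
    coefs = {}
    for j in range(activity.shape[1]):
        xj = activity[:, [j]]                                  # single activity variable
        b_age = LinearRegression().fit(xj, age).coef_[0]
        b_nonmale = LogisticRegression(C=1e6, max_iter=1000).fit(xj, nonmale).coef_[0, 0]
        b_nonwhite = LogisticRegression(C=1e6, max_iter=1000).fit(xj, nonwhite).coef_[0, 0]
        coefs[j] = (b_age, b_nonmale, b_nonwhite)
    return coefs
```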
Fig. 5 shows the coefficients from those regressions, revealing that many of the unprotected attributes predict the ones that are protected. Note that most of these activity-related variables capture very "seemingly innocent" behaviors, such as how frequently one cooks their own meals or uses a dating app, behaviors that are easily captured by information regularly collected by firms (e.g., past browsing activity). Importantly, the fact that these relationships commonly occur in real life implies that, in cases where personalization leads to imbalance in the protected attributes, simply removing those attributes from the data will not be enough to mitigate such a bias.

Second, we explore what would have happened if the focal company used these data to design personalized marketing policies. Specifically, we are interested in exploring whether a profit-maximizing marketing campaign would lead to unbalanced policies (in the protected attributes) and if so, how much bias would still remain even when those attributes are removed from the data. Furthermore, we want to explore the performance of BEAT on the (real-world) marketing example and demonstrate how BEAT can mitigate that unintended bias. Accordingly, we analyze our experimental data by emulating what a firm designing targeted campaigns would do in practice (26, 27, 30). Specifically, we randomly split the data into train and test samples and use the training data to learn the optimal policy under each approach. This corresponds to a firm running an experiment (or a pilot test) among a subset of users to determine the optimal targeting policy. We then evaluate each policy using the test data (we use a 90/10 split for train and test samples), mimicking the step when the firm applies the chosen policy to a larger population. Because of the relatively low sample size and high variance in the real-world data, we replicate this procedure 1,000 times. We report the average and SD of each of the outcomes of the different methods in Table 2 and additional results in SI Appendix (SI Appendix, Table S4).§§

‡‡Please refer to the "consumer behavior information" section in SI Appendix, section 4 for details about the collection of the activity-related variables and to the "activity frequency" section in SI Appendix, Table S2 for the corresponding summary statistics. SI Appendix includes details about all variables included in the experiment.
§§To calculate the efficiency and imbalance of each policy, we select the predicted top 50% of individuals in the test data to target and compute the resulting outcomes, averaging across the 1,000 draws.
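The repeated-split evaluation can be organized as in the sketch below, where `fit_and_score` is a hypothetical callable that trains one policy on the training split and returns its efficiency, imbalance, and Δ Policy on the test split; the aggregation mirrors the averages and SDs reported in Table 2.

```python
import numpy as np

def replicate_evaluation(fit_and_score, X, Z, W, Y, n_reps=1000, test_frac=0.10, seed=0):
    """Repeat the 90/10 split evaluation described above (our sketch)."""
    rng = np.random.default_rng(seed)
    n, results = len(Y), []
    for _ in range(n_reps):
        test = rng.random(n) < test_frac                      # random 90/10 split
        train = ~test
        results.append(fit_and_score((X[train], Z[train], W[train], Y[train]),
                                     (X[test], Z[test], W[test], Y[test])))
    results = np.asarray(results, dtype=float)
    return results.mean(axis=0), results.std(axis=0)          # averages and SDs
```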

Fig. 5. Experiment results: The relationship between protected and unprotected attributes. The figure shows regression coefficients (and 95% CIs) of the
protected attributes on the activity-related variables. We use OLS (ordinary least squares) regression for age and logistic regression for nonmale and non-White.
Blue bars indicate which regression coefficients are significantly different from zero at the 95% CI.

As can be seen in Table 2, BEAT performs best in removing the imbalance generated when using the CF-FD; however, it is not statistically different from the imbalance achieved by just removing protected attributes (CF-NP). Additionally, unlike the simulated scenarios, the debiased method performs poorly and does not eliminate the imbalance. This is likely due to the real-world nature of the data, which have more variance than our simulated settings. In terms of individual fairness, the debiased method generates especially disparate outcomes with Δ Policy = 42.3%, while the full causal forest (CF-FD) yields a different allocation for 16.7% of individuals. Unlike our simulated scenarios, the efficiencies of the personalization methods do not differ from each other in this real-world setting. [The finding that personalization may yield very little benefit in real settings is consistent with previous work (27, 31).] That is, in this case, a firm concerned about imbalance may be better off by simply using a uniform or random policy rather than personalizing the offers based on individuals' observable information.¶¶ Therefore, another way BEAT can be used is to help a decision maker decide whether to use a uniform or a personalized policy.

Finally, we explore in more depth the attributes that cause the imbalance in the most efficient marketing allocation. We identify the 20 variables with the highest variable importance when estimating the full causal forest,## and then, we examine their relative importance when applying each of the other methods (Fig. 6). (For completeness, we report the top 20 variables identified by each method in SI Appendix, Fig. S1.)
Table 2. Experiment results: Comparing outcomes across methods

  Method          Efficiency        Imbalance         Δ Policy, %
  CF-FD           0.578 (0.050)     0.157 (0.064)     16.7 (4.0)
  CF-NP           0.574 (0.050)     0.057 (0.026)     0
  Debiased        0.569 (0.047)     0.240 (0.088)     42.3 (3.9)
  BEAT (γ = 3)    0.575 (0.050)     0.042 (0.018)     0
  BEAT (γ = 5)    0.573 (0.049)     0.042 (0.018)     0
  BEAT (γ = 8)    0.562 (0.049)     0.041 (0.017)     0

Results comparing CF-FD, CF-NP, debiased, and BEAT are reported. Debiased uses regression forests to debias the data; BEAT was estimated with moderate and high penalties for imbalance. The table reports the average outcomes (on the test data) of 1,000 runs of each method, and the SDs are reported in parentheses. Efficiency is measured as the proportion of users choosing the discounted product (i.e., market share of the focal brand). Imbalance calculates the average distance between standardized protected attributes of targeted and nontargeted individuals in the test data. Δ Policy measures the percentage of individuals for whom the outcome would change if their protected attributes were different. In particular, we change the age of each customer by ±1 SD. (We chose age because that was the most meaningful protected attribute, as can be seen in Fig. 6.)

In the full causal forest, age (a protected attribute) is the most important variable, with participant's location, time spent responding to the experiment survey, and variables related to price sensitivity (e.g., coupon usage, items on the price sensitivity scale, and spending across different domains) also ranking among the high-importance variables. The relative importance of these (unprotected) variables remains very similar when removing the protected attributes (CF-NP)—where age is not used for personalization and therefore, its importance is zero by construction. In contrast, when using the debiased method, where all the variables aside from age are still used (albeit transformed to be independent from protected attributes), their importance varies significantly compared with the previous two methods. Specifically, location and time spent on the survey (duration)—variables likely to correlate with age and other protected attributes—are less important, albeit still relatively important. Finally, results from BEAT reveal that such variables should not be used when training the model (i.e., splitting the trees) because doing so would create an imbalance in personalized policies. In other words, consistent with the results presented in Fig. 5, many of the unprotected variables used by the other methods were deemed to have a relationship with the protected attributes, which then generated unintended imbalance in the other methods. That led BEAT to eliminate these variables as a source of heterogeneity for personalization and use other unprotected attributes instead.

¶¶Note that a uniform policy would yield higher costs compared with the policies presented in the table, which only targeted 50% of individuals.
##Variable importance indicates the relative importance of each attribute in the forest, and it is calculated as a simple weighted sum of how many times each feature was split at each depth when training the forest.
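A sketch of this importance measure is shown below; the specific decay weights and depth cap are our assumptions (similar in spirit to the weighting used in the grf package), not values taken from the paper.

```python
import numpy as np

def split_count_importance(forest_splits, n_features, decay=2.0, max_depth=4):
    """Weighted split-count importance: `forest_splits` is a list of
    (feature_index, depth) pairs collected over all trees in the forest;
    shallower splits receive larger weights."""
    importance = np.zeros(n_features)
    for feature, depth in forest_splits:
        if depth <= max_depth:
            importance[feature] += decay ** (-depth)   # shallow splits count more
    total = importance.sum()
    return importance / total if total > 0 else importance
```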




Fig. 6. Experiment results: Variable importance across methods. The figure plots the variable importance of the top 20 variables for the CF-FD and the
corresponding importance of these variables across the other methods: CF-NP, the debiased method, and BEAT. Details about the experiment and the different
variables appear in SI Appendix (SI Appendix, section 2).

Discussion

As personalized targeted policies become more prevalent, the risk of algorithmic bias and discrimination of particular groups grows. The more data used for personalization, the more likely underlying correlations between protected and unprotected attributes are to exist, thus generating unintended bias and increasing the imbalance between targeted and nontargeted groups. As we have shown, even existing methods of personalization that are supposed to tackle this issue tend to maintain—or even exacerbate—imbalance between groups or threaten individual fairness. This insight is especially important as firms implement policies that remove protected attributes, presuming that doing so will remove unintended biases or eliminate the risk of discrimination in targeted policies.

We introduce BEAT, a method to leverage unprotected attributes for personalization while achieving both group and individual fairness in protected attributes, thus eliminating bias. This flexible method can be applied to multidimensional protected attributes and to both discrete and continuous explanatory and outcome variables, making it implementable in a wide variety of empirical contexts. Through a series of simulations and also using real data, we show that our method indeed reduces the unintended bias while allowing for benefits from personalization. We also demonstrate how our method can be used to evaluate the trade-off between personalization and imbalance and to decide when a uniform policy is preferred over a personalized one.

Regarding the applicability of BEAT in our simulations and experiment, we used examples in which the allocation was based on the treatment effect. These included an initial phase of random assignment into treatment, and we utilized BEAT in the second stage for inference. (Like GRF, BEAT does not require randomized assignment to be present prior to analyzing the data. It can be used with observational data as long as there is unconfoundedness.) It is important to note that BEAT can easily be applied to any type of allocation decision that is based on prediction algorithms (such as medical treatments, hiring decisions, lending decisions, recidivism prediction, product recommendations, or dynamic pricing), even without the random assignment stage. For example, BEAT can be applied to the case of lending when the decision depends on the individual's probability of default. In that case, BEAT is applied to simple regression forests (instead of causal forests) and similarly mitigates potential discrimination in the allocation of loans. We illustrate how BEAT can be applied in prediction tasks in SI Appendix (SI Appendix, section 3).

A natural extension of this research would be to incorporate other measures of "fairness." For example, BEAT uses the distribution of protected attributes in the sample population and assigns equal penalty weights to each attribute. A decision maker might want to apply different weights to different attributes or might be interested in mimicking the distribution of a different population. These are straightforward extensions to the proposed algorithm. Moreover, decision makers may have different fairness or optimization objectives in mind. As has been documented in the literature (32, 33), different fairness definitions often yield different outcomes, and practitioners and researchers should have a clear objective in mind when designing "fair" policies. We acknowledge that BEAT, in its current form, ensures equality of treatment among groups as well as individual parity but might not be effective for other measures of fairness (e.g., equity, equal distribution of prediction errors across protected groups). We look forward to future solutions that allow optimization of multiple goals, beyond the ones already incorporated in this research.

We also acknowledge that the goal of BEAT is to leverage personalization while balancing the allocation of treatments among individuals with respect to protected attributes. In doing so, it naturally impacts the efficiency of the outcome of interest as documented in our simulation results. We hope that future research will develop methods that improve the efficiency levels while still achieving the desired goals.

To conclude, BEAT offers a practical solution to an important problem and allows a decision maker to leverage the value of personalization while ensuring individual fairness and reducing the risk of disproportionately targeting individuals based on protected attributes.

Data Availability. The BEAT package is available at GitHub (https://github.com/ayeletis/beat). The experimental data and the replication files for the simulations have been deposited in Harvard Business School Dataverse (https://doi.org/10.7910/DVN/EWEB0W) (34).

ACKNOWLEDGMENTS. We thank Hengyu Kuang for outstanding research assistance.

1. D. G. Pope, J. R. Sydnor, Implementing anti-discrimination policies in statistical profiling models. Am. Econ. J. Econ. Policy 3, 206–231 (2011).
2. A. Pandey, A. Caliskan, "Disparate impact of artificial intelligence bias in ridehailing economy's price discrimination algorithms" in AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AAAI/ACM AIES 2021) (Association for Computing Machinery, New York, NY, 2021), pp. 822–833.
3. G. Goh, A. Cotter, M. Gupta, M. Friedlander, Satisfying real-world goals with dataset constraints. arXiv [Preprint] (2017). https://arxiv.org/abs/1606.07558v2 (Accessed 6 July 2021).
4. A. Agarwal, A. Beygelzimer, M. Dudik, J. Langford, H. Wallach, "A reductions approach to fair classification" in Proceedings of the 35th International Conference on Machine Learning, J. Dy, A. Krause, Eds. (PMLR, 2018), vol. 80, pp. 60–69.
5. M. Feldman, S. A. Friedler, J. Moeller, C. Scheidegger, S. Venkatasubramanian, "Certifying and removing disparate impact" in Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Association for Computing Machinery, New York, NY, 2015), pp. 259–268.
6. J. E. Johndrow, K. Lum, An algorithm for removing sensitive information: Application to race-independent recidivism prediction. Ann. Appl. Stat. 13, 189–220 (2019).
7. C. Dwork, M. Hardt, T. Pitassi, O. Reingold, R. Zemel, "Fairness through awareness" in Proceedings of the 3rd Innovations in Theoretical Computer Science Conference (Association for Computing Machinery, New York, NY, 2012), pp. 214–226.
8. A. Castelnovo, R. Crupi, G. Greco, D. Regoli, The zoo of fairness metrics in machine learning. arXiv [Preprint] (2021). https://arxiv.org/abs/2106.00467 (Accessed 8 June 2021).
9. P. K. Lohia et al., "Bias mitigation post-processing for individual and group fairness" in ICASSP 2019 – 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE, New York, 2019), pp. 2847–2851.
10. T. Kamishima, S. Akaho, H. Asoh, J. Sakuma, "Fairness-aware classifier with prejudice remover regularizer" in Machine Learning and Knowledge Discovery in Databases, P. A. Flach, T. De Bie, N. Cristianini, Eds. (Springer, Berlin, Germany, 2012), pp. 35–50.
11. B. Woodworth, S. Gunasekar, M. I. Ohannessian, N. Srebro, Learning non-discriminatory predictors. arXiv [Preprint] (2017). https://arxiv.org/abs/1702.06081 (Accessed 9 June 2021).
12. M. B. Zafar, I. Valera, M. G. Rodriguez, K. P. Gummadi, "Fairness constraints: Mechanisms for fair classification" in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), A. Singh, J. Zhu, Eds. (PMLR, 2017), pp. 962–970.
13. B. H. Zhang, B. Lemoine, M. Mitchell, "Mitigating unwanted biases with adversarial learning" in Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society (Association for Computing Machinery, New York, NY, 2018), pp. 335–340.
14. M. Donini, L. Oneto, S. Ben-David, J. Shawe-Taylor, M. Pontil, Empirical risk minimization under fairness constraints. arXiv [Preprint] (2020). https://arxiv.org/abs/1802.08626 (Accessed 9 June 2021).
15. S. Wager, S. Athey, Estimation and inference of heterogeneous treatment effects using random forests. J. Am. Stat. Assoc. 113, 1228–1242 (2018).
16. S. Athey et al., Generalized random forests. Ann. Stat. 47, 1148–1178 (2019).
17. B. A. Williams, C. F. Brooks, Y. Shmargad, How algorithms discriminate based on data they lack: Challenges, solutions, and policy implications. J. Inf. Policy 8, 78–115 (2018).
18. A. Lambrecht, C. E. Tucker, Algorithmic bias? An empirical study of apparent gender-based discrimination in the display of STEM career ads. Manage. Sci. 65, 2966–2981 (2019).
19. B. Cowgill, Bias and productivity in humans and machines. SSRN [Preprint] (2019). http://dx.doi.org/10.2139/ssrn.3433737 (Accessed 1 June 2021).
20. M. Raghavan, S. Barocas, J. Kleinberg, K. Levy, "Mitigating bias in algorithmic hiring: Evaluating claims and practices" in Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (Association for Computing Machinery, New York, NY, 2020), pp. 469–481.
21. J. Kleinberg, J. Ludwig, S. Mullainathan, C. R. Sunstein, Algorithms as discrimination detectors. Proc. Natl. Acad. Sci. U.S.A. 117, 30096–30100 (2020).
22. H. B. Cowgill, C. E. Tucker, Algorithmic fairness and economics. SSRN [Preprint] (2020). http://dx.doi.org/10.2139/ssrn.3361280 (Accessed 1 June 2021).
23. P. R. Hahn et al., Bayesian regression tree models for causal inference: Regularization, confounding, and heterogeneous effects (with discussion). Bayesian Anal. 15, 965–1056 (2020).
24. E. H. Kennedy, Optimal doubly robust estimation of heterogeneous causal effects. arXiv [Preprint] (2020). https://arxiv.org/abs/2004.14497 (Accessed 2 July 2021).
25. X. Nie, E. Brunskill, S. Wager, Learning when-to-treat policies. J. Am. Stat. Assoc. 116, 392–409 (2021).
26. D. Simester, A. Timoshenko, S. I. Zoumpoulis, Efficiently evaluating targeting policies: Improving on champion vs. challenger experiments. Manage. Sci. 66, 3412–3424 (2020).
27. H. Yoganarasimhan, E. Barzegary, A. Pani, Design and evaluation of personalized free trials. arXiv [Preprint] (2020). https://arxiv.org/abs/2006.13420 (Accessed 2 July 2021).
28. S. Athey, S. Wager, Policy learning with observational data. Econometrica 89, 133–161 (2021).
29. D. G. Horvitz, D. J. Thompson, A generalization of sampling without replacement from a finite universe. J. Am. Stat. Assoc. 47, 663–685 (1952).
30. E. Ascarza, Retention futility: Targeting high-risk customers might be ineffective. J. Mark. Res. 55, 80–98 (2018).
31. O. Rafieian, H. Yoganarasimhan, Targeting and privacy in mobile advertising. Mark. Sci. 40, 193–218 (2021).
32. S. A. Friedler, C. Scheidegger, S. Venkatasubramanian, The (im)possibility of fairness: Different value systems require different mechanisms for fair decision making. Commun. ACM 64, 136–143 (2021).
33. T. Miconi, The impossibility of "fairness": A generalized impossibility result for decisions. arXiv [Preprint] (2017). https://arxiv.org/abs/1707.01195 (Accessed 8 July 2021).
34. E. Ascarza, A. Israeli, Replication Data for 'Eliminating Unintended Bias in Personalized Policies Using Bias Eliminating Adapted Trees (BEAT)'. Harvard Dataverse. https://doi.org/10.7910/DVN/EWEB0W. Deposited 25 February 2022.

