Literature Review

Paper Title: FairTest: Discovering Unwarranted Associations in Data-Driven Applications
Paper Link: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7961993
Research Questions: Developing a framework to analyse algorithmic fairness
Year Published: 2017
Description: The paper presents the UA framework, based on unwarranted associations: it considers associations between protected attributes and subgroups only if there is no good explanation for them. Some attributes are deemed explanatory attributes, on which it is considered acceptable to differentiate (depending on the social norms, policies, or legal guidelines governing the application at hand). The UA framework makes no attempt to formalize how this distinction should be made. After integrating explanatory attributes, appropriate metrics are selected. The system then uses its association-guided tree construction algorithm to pinpoint the subgroups most affected by potential bias (toy sketch below).
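Rough sketch of the subgroup-search idea (my own toy code, not FairTest's implementation — the chi-squared score stands in for its pluggable association metrics, and all names below are made up). It only does the one-level "score every candidate subgroup and rank by association strength" core; the real system grows a tree, choosing at each node the split that maximizes association in the children, with statistical corrections before reporting:

```python
import pandas as pd
from scipy.stats import chi2_contingency

def association(df, protected, output):
    """Strength of association between a protected attribute and the
    program output; chi-squared stands in for FairTest's pluggable metrics."""
    table = pd.crosstab(df[protected], df[output])
    if min(table.shape) < 2:
        return 0.0  # no variation within this subgroup
    return float(chi2_contingency(table)[0])

def biased_subgroups(df, protected, output, features, min_size=50):
    """Greedy one-level version of association-guided subgroup search:
    score every subgroup defined by a single feature value, rank them."""
    scored = [("(all users)", association(df, protected, output))]
    for feat in features:
        for value in df[feat].unique():
            sub = df[df[feat] == value]
            if len(sub) >= min_size:
                scored.append((f"{feat}={value}",
                               association(sub, protected, output)))
    return sorted(scored, key=lambda s: -s[1])
```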

Paper Title: How Far Can It Go? On Intrinsic Gender Bias Mitigation for Text Classification
Paper Link: https://aclanthology.org/2023.eacl-main.248/
Research Questions: 1. Do different intrinsic bias metrics respond differently to different bias mitigation techniques? 2. Can intrinsic bias mitigation techniques hide bias instead of resolving it? 3. Do intrinsic bias mitigation techniques in language models improve fairness in downstream tasks?
Year Published: 2023
Description: The paper concludes that intrinsic gender bias mitigation techniques do not eliminate gender bias completely, just hide it, and that they don't impact extrinsic bias. They measure intrinsic bias by analyzing word embeddings using a classifier. They select three techniques: CDA (pre-processing: generating a gender-swapped counterpart of each example in the training corpus; first sketch below), context-debias (in-processing: makes stereotype terms orthogonal to embedding terms), and sent-debias (post-processing: transforms the embeddings by removing their projections onto the gender subspace; second sketch below). To measure intrinsic bias in the pretrained models, they used SEAT and LPBS. They found that current intrinsic bias mitigation techniques don't truly eliminate gender bias, and the bias scores obtained using the SEAT and LPBS metrics differed significantly. For extrinsic bias measurement, they used original, attribute-scrubbed, and attribute-swapped data, and found that downstream data processing significantly improves downstream fairness. They concluded that intrinsic bias mitigation should ideally be combined with other fairness interventions.
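For concreteness, CDA's pre-processing step can be approximated with a word-level swap. The pair list and the function below are my own illustration, not the paper's word lists, and real CDA has to handle ambiguities this toy ignores (e.g. possessive vs. objective "her"):

```python
import re

# Hypothetical, tiny swap list; real CDA dictionaries are much larger.
PAIRS = [("he", "she"), ("his", "her"), ("him", "her"),
         ("man", "woman"), ("father", "mother")]
SWAP = {a: b for a, b in PAIRS}
SWAP.update({b: a for a, b in PAIRS})  # note: "her" maps back ambiguously

def gender_swap(sentence: str) -> str:
    """Return the counterfactual sentence with gendered words swapped."""
    def repl(match):
        word = match.group(0)
        swapped = SWAP.get(word.lower(), word)
        return swapped.capitalize() if word[0].isupper() else swapped
    return re.sub(r"[A-Za-z]+", repl, sentence)

corpus = ["He told his father.", "She is a woman."]
# CDA trains on the union of originals and their counterfactuals.
augmented = corpus + [gender_swap(s) for s in corpus]
# -> adds "She told her mother." and "He is a man."
```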
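The sent-debias projection removal is just linear algebra. Assuming we already have a (k, d) orthonormal basis for the gender subspace (in Sent-Debias it comes from principal components of differences between gender-swapped sentence embeddings; the demo data here is random placeholder):

```python
import numpy as np

def sent_debias(embeddings: np.ndarray, gender_basis: np.ndarray) -> np.ndarray:
    """Remove each embedding's projection onto the gender subspace.

    embeddings:   (n, d) sentence embeddings
    gender_basis: (k, d) orthonormal rows spanning the gender subspace
    """
    projection = embeddings @ gender_basis.T @ gender_basis
    return embeddings - projection

# Toy demo: build a 1-D "gender subspace" from differences between
# embeddings of swapped sentence pairs (random stand-ins here).
rng = np.random.default_rng(0)
emb_orig = rng.normal(size=(50, 768))        # e.g. "he ..." sentences
emb_swap = emb_orig + 0.1 * rng.normal(size=(50, 768))
diffs = emb_orig - emb_swap
_, _, vt = np.linalg.svd(diffs - diffs.mean(axis=0), full_matrices=False)
basis = vt[:1]                               # top principal direction, unit norm
debiased = sent_debias(emb_orig, basis)
```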

Paper Title: Upstream Mitigation Is Not All You Need: Testing the Bias Transfer Hypothesis in Pre-Trained Language Models
Paper Link: https://aclanthology.org/2022.acl-long.247/
Research Questions: Investigating the bias transfer hypothesis: the theory that social biases (such as stereotypes) internalized by large language models during pre-training transfer into harmful task-specific behavior after fine-tuning.
Year Published: 2022
Description: They found that intrinsic and extrinsic metrics are correlated for the typical LLM, but that the correlation is mostly explained by biases in the fine-tuning dataset. They ran two tasks by fine-tuning RoBERTa: occupation classification from online biographies, and toxicity classification. For occupation classification, they fine-tuned on a scrubbed dataset from which pronouns were removed. Downstream bias was calculated using the true positive rate (low for "surgeons"), and upstream bias using pronoun ranking bias (the difference in log probabilities for she/her and he/him; first sketch below). For toxicity classification, the fine-tuning dataset consisted of online comments labeled by humans as toxic/non-toxic; mentions of certain identity groups are more likely to be flagged as toxic. To measure upstream bias, they found the 20 most likely tokens for [MASK] in {identity} {person} is [MASK], and ran sentiment analysis on them (second sketch below). They measured this under 4 interventions: no pre-training, random perturbations, bias mitigation (SentDebias), and re-balancing and scrubbing the fine-tuning dataset. Their results: 1) Upstream variations have little impact on downstream bias. 2) Most downstream bias is explained by the fine-tuning step. 3) Re-sampling and re-scrubbing have little effect on downstream behavior.
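A minimal sketch of the pronoun-ranking-bias idea using Hugging Face transformers with roberta-base — the template, function name, and model choice are my assumptions, not the paper's exact setup:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")
model.eval()

def pronoun_log_prob(template: str, pronoun: str) -> float:
    """Log-probability of `pronoun` at the masked slot of the template."""
    text = template.replace("[PRONOUN]", tokenizer.mask_token)
    inputs = tokenizer(text, return_tensors="pt")
    mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    log_probs = torch.log_softmax(logits, dim=-1)
    # RoBERTa tokens are space-sensitive: " she" is the mid-sentence form.
    token_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(" " + pronoun))[0]
    return log_probs[token_id].item()

bio = "The surgeon finished the operation, and then [PRONOUN] left."
gap = pronoun_log_prob(bio, "he") - pronoun_log_prob(bio, "she")
print(f"log P(he) - log P(she) = {gap:+.3f}")  # >0: 'he' ranked higher
```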
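Similarly, the identity-template probe for the toxicity task can be sketched with two pipelines. The exact template wording ("This {identity} person is <mask>.") and the signed-average scoring are placeholder choices of mine, not the paper's:

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="roberta-base", top_k=20)
classify = pipeline("sentiment-analysis")  # default English sentiment model

def identity_sentiment(identity: str) -> float:
    """Mean signed sentiment over the 20 most likely mask completions."""
    completions = fill(f"This {identity} person is <mask>.")
    words = [c["token_str"].strip() for c in completions]
    scores = classify(words)
    signed = [s["score"] if s["label"] == "POSITIVE" else -s["score"]
              for s in scores]
    return sum(signed) / len(signed)

for group in ["Muslim", "Christian", "gay", "straight"]:
    print(group, round(identity_sentiment(group), 3))
```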

Paper Title: How Gender Debiasing Affects Internal Model Representations, and Why It Matters
Paper Link: https://aclanthology.org/2022.naacl-main.188/
Year Published: 2022
Comments: just skimmed it

Paper Title: Mitigating Gender Bias in Natural Language Processing: Literature Review
Paper Link: https://arxiv.org/ftp/arxiv/papers/1906/1906.08976.pdf
Year Published: 2019
Comments: literature review

Paper Title: Mitigating Gender Stereotypes in Hindi and Marathi
Paper Link: https://aclanthology.org/2022.gebnlp-1.16.pdf
Year Published: 2022
Description: In both Hindi and Marathi, every noun is assigned a grammatical gender (masculine and feminine in Hindi; Marathi additionally has a neuter gender).
