Fairness in Machine Learning


Objective Functions with Fairness Constraints

A binary classifier does not suffer from disparate impact if the probability that the classifier assigns a user to the positive class $\hat y = 1$ is the same for both values of the sensitive attribute $z$:

\begin{equation} P(\hat y = 1 | z = 0) = P(\hat y = 1 | z = 1) \end{equation}
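As a quick sanity check, here is a minimal sketch that compares the two conditional probabilities empirically; the array names `y_hat` and `z` (binary predictions and sensitive attributes) are assumptions for illustration:

```python
import numpy as np

# Hypothetical check of the disparate-impact condition:
# compare P(y_hat = 1 | z = 0) against P(y_hat = 1 | z = 1).
def positive_rates(y_hat, z):
    y_hat, z = np.asarray(y_hat), np.asarray(z)
    p0 = y_hat[z == 0].mean()  # P(y_hat = 1 | z = 0)
    p1 = y_hat[z == 1].mean()  # P(y_hat = 1 | z = 1)
    return p0, p1

p0, p1 = positive_rates([1, 0, 1, 1, 0, 1], [0, 0, 0, 1, 1, 1])
print(p0, p1)  # equal rates (here 2/3 and 2/3) mean no disparate impact
```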

We measure the disparate impact of a decision boundary by means of the covariance between the users' sensitive attributes $z$ and the signed distance from the users' feature vectors to the decision boundary, $d_\theta(x)$:

\begin{equation} Cov_{DI}(z, d_\theta(x)) = E\bigg[ (z-\bar z)d_\theta(x) \bigg] - E\bigg[ (z-\bar z) \bigg]\bar d_\theta(x) \approx \frac{1}{N}\sum_{x,z \in \mathcal D}(z-\bar z) d_\theta(x) \end{equation}

If the decision boundary does not suffer from disparate impact, we will have

\begin{equation} P(d_\theta(x) > 0 | z = 0) = P(d_\theta(x) > 0 | z = 1)
\end{equation}

meaning that the $\text{Cov}_{DI}$ should be approximately zero for a sufficiently large training dataset.
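Below is a rough numerical sketch of this covariance proxy; the linear decision boundary $d_\theta(x) = \theta^\top x$ and the synthetic data are assumptions for illustration only:

```python
import numpy as np

# Empirical covariance proxy (1/N) * sum (z - z_bar) * d_theta(x),
# assuming a linear decision boundary d_theta(x) = theta^T x.
def boundary_covariance(theta, X, z):
    d = X @ theta                       # signed distances to the boundary
    return np.mean((z - z.mean()) * d)  # ~0 when the boundary is fair

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
z = rng.integers(0, 2, size=1000)
theta = np.array([0.5, -1.0, 0.2])
print(boundary_covariance(theta, X, z))  # near zero: z is independent of X here
```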

More specifically, we can formulate our objective function as follows:

\begin{aligned} &\text{minimize } &&L(\theta) \\ &\text{subject to } &&\frac{1}{N}\sum_{x,z \in \mathcal D}(z-\bar z) d_\theta(x) \le c \\ & &&\frac{1}{N}\sum_{x,z \in \mathcal D}(z-\bar z) d_\theta(x) \ge - c \end{aligned}

  • where $c\in \mathbb R^+$ is a given threshold that trades off accuracy against unfairness due to disparate impact.
  • this is a convex problem, and it ensures that the trade-off between the classifier's loss function and the decision boundary covariance is Pareto optimal (see the sketch after this list).
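Here is a minimal sketch of this constrained formulation for a linear logistic classifier, solved with `scipy.optimize.minimize` (SLSQP); the choice of logistic loss, $d_\theta(x) = \theta^\top x$, and the solver are assumptions for illustration rather than a reference implementation:

```python
import numpy as np
from scipy.optimize import minimize

# X: (N, d) feature matrix (intercept column included), y: labels in {-1, +1},
# z: NumPy array of binary sensitive attributes, c: fairness threshold >= 0.

def log_loss(theta, X, y):
    # Average logistic loss L(theta)
    margins = y * (X @ theta)
    return np.mean(np.log1p(np.exp(-margins)))

def boundary_cov(theta, X, z):
    # Empirical covariance proxy (1/N) * sum (z - z_bar) * theta^T x
    return np.mean((z - z.mean()) * (X @ theta))

def fit_fair_classifier(X, y, z, c=0.0):
    constraints = [
        {"type": "ineq", "fun": lambda t: c - boundary_cov(t, X, z)},  # cov <= c
        {"type": "ineq", "fun": lambda t: boundary_cov(t, X, z) + c},  # cov >= -c
    ]
    res = minimize(log_loss, x0=np.zeros(X.shape[1]), args=(X, y),
                   method="SLSQP", constraints=constraints)
    return res.x
```

In practice one can sweep $c$ from the unconstrained covariance down to $0$ to trace out the accuracy–fairness trade-off.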

Distance Function

A distance function $d(\vec x)$ is defined as \begin{equation} d(\vec{x}) = \min_{\vec{x}_I \in \partial \Omega} |\vec{x} - \vec{x}_I| \end{equation}

implying that $d(\vec{x}) = 0$ on the boundary, i.e. when $\vec{x} \in \partial \Omega$.
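As a brute-force sketch, with the boundary $\partial\Omega$ hypothetically sampled as points on the unit circle:

```python
import numpy as np

# Brute-force distance to a sampled boundary: d(x) = min over x_I in dOmega of ||x - x_I||.
def distance_to_boundary(x, boundary_points):
    return np.min(np.linalg.norm(boundary_points - x, axis=1))

# Example: sample the unit circle as the boundary dOmega.
angles = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)
circle = np.column_stack([np.cos(angles), np.sin(angles)])

print(distance_to_boundary(np.array([2.0, 0.0]), circle))  # ~1.0
print(distance_to_boundary(np.array([1.0, 0.0]), circle))  # ~0.0 (on the boundary)
```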

Signed Distance Function

\begin{equation} \phi(\vec x) = \begin{cases} d(\vec x) & \vec x \in \Omega^+ \\ -d(\vec x) & \vec x \in \Omega^- \\ 0 & \vec x \in \partial \Omega \end{cases} \end{equation}
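For a concrete example, assume (as is conventional for level sets) that $\Omega^-$ is the interior of the unit disk and $\Omega^+$ its exterior; then the signed distance is simply $\phi(\vec x) = \lVert \vec x \rVert - 1$:

```python
import numpy as np

# Signed distance to the unit circle, assuming Omega^- is the interior
# of the unit disk and Omega^+ its exterior.
def signed_distance_unit_circle(x):
    return np.linalg.norm(x) - 1.0

print(signed_distance_unit_circle(np.array([2.0, 0.0])))  #  1.0 (in Omega^+)
print(signed_distance_unit_circle(np.array([0.5, 0.0])))  # -0.5 (in Omega^-)
print(signed_distance_unit_circle(np.array([0.0, 1.0])))  #  0.0 (on the boundary)
```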

Convex Problem in Standard Form

\begin{aligned} &\text{minimize } &&f_0(x) \\ &\text{subject to } &&f_i(x) \le 0, && i = 1,\ldots, m \\ & && a_i^Tx = b_i, && i = 1,\ldots, p \end{aligned}

  • the objective function $f_0(x)$ must be convex
  • the inequality constraint functions must be convex
  • the equality constraint functions $h_i(x) = a_i^Tx - b_i = 0$ must be affine
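As a small illustration of this standard form, here is a hedged sketch using `cvxpy`; the particular least-squares objective and simplex-style constraints are made up for the example:

```python
import cvxpy as cp
import numpy as np

# A small convex problem in standard form:
#   minimize    ||A x - b||_2^2        (convex objective f_0)
#   subject to  -x <= 0                (convex inequality constraints f_i <= 0)
#               sum(x) == 1            (affine equality constraint a^T x = b)
rng = np.random.default_rng(0)
A, b = rng.normal(size=(5, 3)), rng.normal(size=5)

x = cp.Variable(3)
problem = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b)),
                     [x >= 0, cp.sum(x) == 1])
problem.solve()
print(problem.value, x.value)
```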

Let’s prove that when $\Omega \subseteq \mathbb{R}^n$ is a nonempty convex set, the function given by $f(x) = d(x;\Omega)$ is convex.

Let us first restate the distance function:

\begin{equation} f(x) = d(x;\Omega) = \inf\bigg\{ ||x-y||: y \in \Omega \bigg\} \end{equation}

We will show that, for any $x_1, x_2 \in \mathbb{R}^n$ and $\lambda \in [0, 1]$,

\begin{equation} f(\lambda x_1 + (1- \lambda)x_2) \le \lambda f(x_1) + (1-\lambda)f(x_2) \end{equation}

Equivalently,

\begin{equation} d(\lambda x_1 + (1-\lambda)x_2;\Omega) \le \lambda d(x_1;\Omega) + (1-\lambda)d(x_2;\Omega) \end{equation}

By definition, for any $\varepsilon > 0$ there exist $y_1\in \Omega$ and $y_2\in \Omega$ such that

\begin{aligned} ||x_1 - y_1|| < d(x_1;\Omega) + \varepsilon \\ ||x_2 - y_2|| < d(x_2;\Omega) + \varepsilon \end{aligned}

Then, since $\Omega$ is convex, $\lambda y_1 + (1-\lambda)y_2 \in \Omega$, and by the triangle inequality

\begin{aligned} d(\lambda x_1 + (1-\lambda)x_2;\Omega) &\le ||\lambda x_1 + (1-\lambda)x_2 - \lambda y_1 - (1-\lambda)y_2|| \\ &\le \lambda ||x_1 - y_1|| + (1-\lambda)||x_2 - y_2|| \\ &< \lambda d(x_1;\Omega) + (1-\lambda)d(x_2;\Omega) + \varepsilon \end{aligned}

Since $\varepsilon > 0$ is arbitrary, the convexity inequality follows, so $f(x) = d(x;\Omega)$ is convex.
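As a quick numerical sanity check of this result, the sketch below evaluates the convexity inequality for the (hypothetically chosen) convex box $\Omega = [0,1]^3$, where $d(x;\Omega)$ is the distance from $x$ to its clipped projection:

```python
import numpy as np

# d(x; Omega) for the convex box Omega = [0, 1]^3, via projection (clipping).
def dist_to_box(x):
    return np.linalg.norm(x - np.clip(x, 0.0, 1.0))

rng = np.random.default_rng(0)
for _ in range(5):
    x1, x2 = rng.normal(size=3) * 3, rng.normal(size=3) * 3
    lam = rng.uniform()
    lhs = dist_to_box(lam * x1 + (1 - lam) * x2)
    rhs = lam * dist_to_box(x1) + (1 - lam) * dist_to_box(x2)
    print(lhs <= rhs + 1e-12)  # always True: f(x) = d(x; Omega) is convex
```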
