Updated: Jun 3, 2022
You have trained your model, tested its performance, and are ready to deploy. But did you validate the model for fairness? Let's discuss this using a binary classification model.
Use case: You are building a classification model that decides whether to hire or reject each applicant. Suppose there are ten male and ten female applicants. Ideally, your model will hire all the qualified applicants and reject all the unqualified ones. However, like every other model, your hiring model is imperfect, i.e. it hires some unqualified applicants and rejects some qualified ones. So, let's create a confusion matrix for this as mentioned below:
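Here, the 'actual' labels are 'qualified' and 'unqualified', and the 'prediction' labels are 'hire' and 'reject'. Across the twenty applicants, the counts are TP = 12, FP = 2, FN = 2 and TN = 4 (the sum of the per-group counts used in the calculations below). As you can see above, I have also calculated the model assessment metrics using the confusion matrix.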
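As a minimal sketch, the overall assessment metrics can be recomputed in a few lines of Python. The counts here are an assumption: they are obtained by summing the per-group figures that appear in the calculations later in this article, and the variable names are purely illustrative.

```python
# Overall confusion-matrix counts for the 20 applicants.
# Assumption: summed from the per-group counts used later in the article.
tp, fp, fn, tn = 12, 2, 2, 4

total = tp + fp + fn + tn
accuracy = (tp + tn) / total    # share of correct hire/reject decisions
precision = tp / (tp + fp)      # hired applicants who were actually qualified
recall = tp / (tp + fn)         # qualified applicants who were hired

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```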
The model accuracy is 80%, which is very good, but is this model fair? Do we have any metrics to tell us? The answer is yes and no. Yes, because we have metrics that help assess fairness. No, because fairness is not purely an engineering problem but a societal one, so research scholars do not recommend relying on these metrics alone to evaluate it. Furthermore, these metrics are not fully developed for problems other than classification. Many metrics are being discussed in the research community, but we will discuss only five of them.
I am limiting the discussion to fairness in subgroups, and we will not discuss individual fairness here. However, before we move further, let's create the confusion matrix for male and female applicants.
As you can see, the accuracy is the same for both subpopulations, i.e. 80%. For males, TP = 7, FP = 2, FN = 0 and TN = 1; for females, TP = 5, FP = 0, FN = 2 and TN = 3. However, let's assess the five fairness metrics as mentioned below:
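To make the per-group comparison concrete, here is a small sketch that recomputes the accuracy for each subpopulation. The counts are taken from the formulas in this article; the dictionary layout and the helper name `accuracy` are illustrative choices, not part of any library.

```python
# Per-group confusion-matrix counts, as used in the calculations in this article.
counts = {
    "male":   {"tp": 7, "fp": 2, "fn": 0, "tn": 1},
    "female": {"tp": 5, "fp": 0, "fn": 2, "tn": 3},
}

def accuracy(c):
    """Fraction of applicants in a group that the model classified correctly."""
    return (c["tp"] + c["tn"]) / sum(c.values())

group_accuracy = {group: accuracy(c) for group, c in counts.items()}
print(group_accuracy)
```

Equal accuracy says nothing about which kinds of errors each group receives, which is what the fairness metrics look at.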
1. Demographic Parity: It is calculated based only on the classifier output and doesn't consider the 'actual' labels. For the above use case, it is the fraction of applicants hired within each group, i.e. males hired out of all males and females hired out of all females.
Demographic Parity Male: (TP+FP)/Total = (7+2) / 10 = 0.90
Demographic Parity Female: (TP+FP)/Total = (5+0) / 10 = 0.50
The expectation is that the values should match or be close to each other. But if you compare the values above, there is a vast gap: the model hired 90% of the male applicants as against only 50% of the female applicants. Hence, the model doesn't satisfy Demographic Parity.
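The Demographic Parity check above can be sketched as follows. The helper name `selection_rate` and the idea of reporting the absolute gap are illustrative assumptions; the counts come from the article.

```python
# Per-group confusion-matrix counts from the article.
counts = {
    "male":   {"tp": 7, "fp": 2, "fn": 0, "tn": 1},
    "female": {"tp": 5, "fp": 0, "fn": 2, "tn": 3},
}

def selection_rate(c):
    """Fraction of a group that the model hires, ignoring the actual labels."""
    return (c["tp"] + c["fp"]) / sum(c.values())

rates = {group: selection_rate(c) for group, c in counts.items()}
gap = abs(rates["male"] - rates["female"])  # 0 would mean perfect parity

print(rates, f"gap={gap:.2f}")
```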
2. Predictive Parity: It is calculated based on both the classifier output and the 'actual' labels. The formula is the same as the precision metric, but we have to compute it for each group.
Predictive Parity male: TP / (TP+FP) = 7 / (7+2) = 7/9 = 0.78
Predictive Parity female: TP / (TP+FP) = 5 / (5+0) = 5/5 = 1
It measures how many of the applicants the model hired were actually qualified. But, again, the values don't match, and hence the model doesn't satisfy Predictive Parity.
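A minimal sketch of the Predictive Parity calculation, using the article's counts. The helper name `precision` and the choice to return NaN when a group has no hires are assumptions made for illustration.

```python
# Per-group confusion-matrix counts from the article.
counts = {
    "male":   {"tp": 7, "fp": 2, "fn": 0, "tn": 1},
    "female": {"tp": 5, "fp": 0, "fn": 2, "tn": 3},
}

def precision(c):
    """Of the applicants hired in a group, the fraction that was qualified."""
    hired = c["tp"] + c["fp"]
    # Assumption: report NaN if nobody in the group was hired.
    return c["tp"] / hired if hired else float("nan")

ppv = {group: precision(c) for group, c in counts.items()}
print({group: round(value, 2) for group, value in ppv.items()})
```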
3. False Positive rate balance: It compares the probability of unqualified applicants being hired across the two subpopulations, i.e. males and females.
False positive rate balance for male: FP / (FP+TN) = 2 / (2+1) = 0.67
False positive rate balance for female: FP / (FP+TN) = 0 / (0+3) = 0
The model should treat all the unqualified applicants equally, regardless of whether they are males or females. However, the model has hired 67% of the unqualified males, whereas it has not hired any unqualified females. Hence the model doesn't satisfy the False Positive rate balance.
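The False Positive rate comparison can be sketched like this, again with the article's counts and an illustrative helper name.

```python
# Per-group confusion-matrix counts from the article.
counts = {
    "male":   {"tp": 7, "fp": 2, "fn": 0, "tn": 1},
    "female": {"tp": 5, "fp": 0, "fn": 2, "tn": 3},
}

def false_positive_rate(c):
    """Fraction of unqualified applicants in a group that was hired anyway."""
    return c["fp"] / (c["fp"] + c["tn"])

fpr = {group: false_positive_rate(c) for group, c in counts.items()}
print({group: round(value, 2) for group, value in fpr.items()})
```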
4. False Negative rate balance: It compares the probability of qualified applicants being rejected across the two subpopulations, i.e. males and females.
False negative rate balance for male: FN / (TP+FN) = 0 / (7+0) = 0
False negative rate balance for female: FN / (TP+FN) = 2 / (5+2) = 0.29
The model should treat all the qualified applicants equally, regardless of whether they are males or females. However, the model has not rejected any qualified males, whereas it has rejected 29% of the qualified females. Hence the model doesn't satisfy the False Negative rate balance.
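The same pattern applies to the False Negative rate. This is a sketch with the article's counts; the helper name is illustrative.

```python
# Per-group confusion-matrix counts from the article.
counts = {
    "male":   {"tp": 7, "fp": 2, "fn": 0, "tn": 1},
    "female": {"tp": 5, "fp": 0, "fn": 2, "tn": 3},
}

def false_negative_rate(c):
    """Fraction of qualified applicants in a group that was rejected."""
    return c["fn"] / (c["tp"] + c["fn"])

fnr = {group: false_negative_rate(c) for group, c in counts.items()}
print({group: round(value, 2) for group, value in fnr.items()})
```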
5. Equalised Odds: It compares both the probability of qualified applicants being hired and the probability of unqualified applicants being rejected across the two subpopulations, i.e. males and females. The formulas are the same as the sensitivity and specificity metrics; however, we have to compute them for each group.
Sensitivity for male: TP / (TP+FN) = 7 / (7+0) = 1
Sensitivity for female: TP / (TP+FN) = 5 / (5+2) = 0.71
Specificity for male: TN / (FP+TN) = 1 / (2+1) = 0.33
Specificity for female: TN / (FP+TN) = 3 / (0+3) = 1
As you can see, the model has hired all the qualified males, but only 71% of the qualified females. Likewise, it rejected all the unqualified females but only 33% of the unqualified males. Hence the model doesn't satisfy Equalised Odds either.
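Equalised Odds combines two per-group comparisons, so a sketch of the check needs both sensitivity and specificity. The function `satisfies_equalised_odds` and its `tolerance` parameter are hypothetical: how close is "close enough" is a judgment call for your use case, not something the metric defines.

```python
# Per-group confusion-matrix counts from the article.
counts = {
    "male":   {"tp": 7, "fp": 2, "fn": 0, "tn": 1},
    "female": {"tp": 5, "fp": 0, "fn": 2, "tn": 3},
}

def sensitivity(c):
    """Fraction of qualified applicants in a group that was hired."""
    return c["tp"] / (c["tp"] + c["fn"])

def specificity(c):
    """Fraction of unqualified applicants in a group that was rejected."""
    return c["tn"] / (c["tn"] + c["fp"])

def satisfies_equalised_odds(counts, tolerance=0.05):
    """True if both rates differ across groups by at most `tolerance`.

    `tolerance` is a hypothetical threshold chosen for illustration.
    """
    sens = [sensitivity(c) for c in counts.values()]
    spec = [specificity(c) for c in counts.values()]
    return (max(sens) - min(sens) <= tolerance
            and max(spec) - min(spec) <= tolerance)

print(satisfies_equalised_odds(counts))
```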
As per the impossibility theorem, an imperfect classifier cannot satisfy all three of the metrics below at once when the groups differ in their base rates (here, the proportion of qualified applicants in each group):
1) 'Predictive Parity', 2) 'False Positive rate balance' and 3) 'False Negative rate balance'.
So, in that situation, any model that satisfies two of these metrics will necessarily fail the third. More generally, it is challenging to strike a balance between model accuracy and fairness, and you will have to make trade-offs between these metrics depending upon your use case.
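One way to see where this constraint comes from is the identity usually cited for the result (Chouldechova, 2017): FPR = p/(1-p) × (1-PPV)/PPV × (1-FNR), where p is the group's base rate and PPV is its precision. The sketch below checks the identity against the male counts from this article and then uses made-up base rates (clearly hypothetical, not from the article) to show that equal PPV and FNR force unequal FPR when p differs.

```python
def fpr_from_identity(p, ppv, fnr):
    """False positive rate implied by base rate, precision and FNR."""
    return p / (1 - p) * (1 - ppv) / ppv * (1 - fnr)

# Check against the male counts from the article.
tp, fp, fn, tn = 7, 2, 0, 1
p = (tp + fn) / (tp + fp + fn + tn)   # base rate of qualified applicants
ppv = tp / (tp + fp)
fnr = fn / (tp + fn)
measured_fpr = fp / (fp + tn)
implied_fpr = fpr_from_identity(p, ppv, fnr)
print(round(measured_fpr, 4), round(implied_fpr, 4))

# Hypothetical groups: same PPV and FNR, different base rates.
fpr_group_a = fpr_from_identity(p=0.3, ppv=0.8, fnr=0.2)
fpr_group_b = fpr_from_identity(p=0.7, ppv=0.8, fnr=0.2)
print(round(fpr_group_a, 4), round(fpr_group_b, 4))
```

In this article's example both groups happen to have the same base rate (7 qualified out of 10), so the gaps seen above come from the model itself rather than being forced by the identity.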
So how do we overcome this unfairness?
Richard Zemel et al. published a research paper that talks about transforming the data into a new representation and using an objective function that minimises the parity loss along with the training loss. As per this research, the approach leaves negligible discrimination between the subpopulations with almost no drop in accuracy.
Also, the open-source "Fairlearn" package provides mitigation algorithms, including "Reduction" and "Post-processing" approaches.
I hope this gives a fair understanding of how to assess fairness in classifier models. "Responsible AI" is still evolving. "Fairness in AI" is not just a technical challenge but a socio-technical one that must be addressed by engaging all stakeholders.
Views are personal