The UK’s independent authority set up to uphold information rights in the public interest, promoting openness by public bodies and data privacy for individuals.

At a glance

  • Once you have extracted the rationale of the underlying logic of your AI model, you will need to take the statistical output and incorporate it into your wider decision-making process.
  • Implementers of the outputs from your AI system will need to recognise the factors that they see as legitimate determinants of the outcome they are considering.
  • For the most part, the AI systems we consider in this guidance will produce statistical outputs that are based on correlation rather than causation. You therefore need to check whether the correlations that the AI model produces make sense in the case you are considering.
  • Decision recipients should be able to easily understand how the statistical result has been applied to their particular case.

Checklist

☐ We have taken the technical explanation delivered by our AI system and translated this into reasons that can be easily understood by the decision recipient.

☐ We have used tools such as textual clarification, visualisation media, graphical representations, summary tables, or a combination, to present information about the logic of the AI system’s output.

☐ We have justified how we have incorporated the statistical inferences from the AI system into our final decision and rationale explanation.

In more detail

Introduction

The non-technical dimension to rationale explanation involves working out how you are going to convey your model’s results in a way that is clear and understandable to users, implementers and decision recipients.

This involves presenting information about the logic of the output as clearly and meaningfully as possible. You could do this through textual clarification, visualisation media, graphical representations, summary tables, or any combination of them. The main thing is to make sure that there is a simple way for the implementer to describe the result to an affected individual.

However, it is important to remember that the technical rationale behind an AI model’s output is only one component of the decision-making and explanation process. It reveals the statistical inferences (correlations) that your implementers must then incorporate into their wider deliberation before they reach their ultimate conclusions and explanations.

Integrating statistical associations into their wider deliberations means implementers should be able to recognise the factors that they see as legitimate determinants of the outcome they are considering. They must be able to pick out, amongst the model’s correlations, those associations that they think reasonably explain the outcome given the specifics of the case. They then need to be able to incorporate these legitimate determining factors into their thinking about the AI-supported decision, and how to explain it.

It is likely they will need training in order to do this. This is outlined in more detail in Task 5.

Understand the statistical rationale

Once you have extracted your explanation, either from an inherently interpretable model or from supplementary tools, you should have a good idea of both the relative feature importance and the significant feature interactions. This is your local explanation, which you should combine with a more global picture of the behaviour of the model across cases. Doing this should help clarify where there is a meaningful relationship between the predictor and response variables.
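
As a rough illustration of combining a local explanation with a global picture, the sketch below assumes a simple linear model, where a feature's local contribution can be taken as its coefficient multiplied by how far the case sits from the average case. The feature names, coefficients and cases are hypothetical and are not drawn from this guidance. Other model types would need their own explanation tools.

```python
from statistics import mean

# Hypothetical linear model: coefficients and cases are illustrative assumptions.
COEFFICIENTS = {"income": 0.8, "years_at_address": 0.3, "missed_payments": -1.5}

TRAINING_CASES = [
    {"income": 2.0, "years_at_address": 1.0, "missed_payments": 0.0},
    {"income": 3.0, "years_at_address": 5.0, "missed_payments": 2.0},
    {"income": 4.0, "years_at_address": 3.0, "missed_payments": 1.0},
]


def feature_means(cases):
    """Average value of each feature across the supplied cases."""
    return {name: mean(case[name] for case in cases) for name in COEFFICIENTS}


def local_contributions(case, means):
    """Local explanation: each feature's contribution to this case's score,
    measured relative to the average case."""
    return {
        name: COEFFICIENTS[name] * (case[name] - means[name])
        for name in COEFFICIENTS
    }


def global_importance(cases, means):
    """Global picture: mean absolute contribution of each feature across cases."""
    return {
        name: mean(abs(local_contributions(case, means)[name]) for case in cases)
        for name in COEFFICIENTS
    }


means = feature_means(TRAINING_CASES)
local = local_contributions(TRAINING_CASES[0], means)
overall = global_importance(TRAINING_CASES, means)

# Show the local contribution next to the global importance for each feature.
for name in COEFFICIENTS:
    print(f"{name}: local {local[name]:+.2f}, global {overall[name]:.2f}")
```

Reading the two columns together shows whether a feature that matters for this individual also matters across cases, or whether this case is unusual.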

Understanding the relevant associations between input variables and an AI model’s result (ie its statistical rationale) is the first step in moving from the model’s mathematical inferences to a meaningful explanation. However, on their own, these statistical inferences are not direct indicators of what determined the outcome, or of significant population-level insights in the real world.

As a general rule, the kinds of AI and machine learning models that we are exploring in this guidance generate statistical outputs that are based on correlational rather than causal inference. In these models, a set of relevant input features, X, is linked to a target or response variable, Y, where there is an established association or correlation between them. While it is justified, then, to say that the components of X are correlated (in some unspecified way) with Y, it is not justified (on the basis of the statistical inference alone) to say that the components of X cause Y, or that X is a direct determinant of Y. This is a version of the phrase ‘correlation does not imply causation’.
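
The distinction can be illustrated with a small simulation. This is a sketch under stated assumptions, not a description of any particular AI system: a hidden variable drives both X and Y, so the two are strongly correlated in observational data even though X has no effect on Y. All names, distributions and parameters are invented for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n=5000, seed=0):
    """Return (observational correlation, correlation when X is set independently)."""
    rng = random.Random(seed)
    # Hidden variable that influences both X and Y (assumed for this sketch).
    hidden = [rng.gauss(0, 1) for _ in range(n)]
    # Observed X and Y each depend on the hidden variable plus noise.
    # Y is never computed from X, so X does not cause Y in this simulated world.
    x_observed = [z + rng.gauss(0, 0.5) for z in hidden]
    y = [z + rng.gauss(0, 0.5) for z in hidden]
    # "Experiment": X is assigned at random, independently of the hidden variable.
    x_assigned = [rng.gauss(0, 1) for _ in range(n)]
    return pearson(x_observed, y), pearson(x_assigned, y)


observed, assigned = simulate()
print(f"Correlation in observational data: {observed:.2f}")
print(f"Correlation when X is assigned independently: {assigned:.2f}")
```

A model trained only on the observational data would treat X as a strong predictor of Y, which is statistically justified, but the second figure shows that changing X would not change Y.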

You need to take further steps to assess the role that these statistical associations should play in a reasonable explanation, given the particulars of the case you are considering.

Sense-check correlations and identify legitimate determining factors in a case-by-case manner

Next, you need to determine which of the statistical associations that the model’s results have identified as important are legitimate and reasonably explanatory in this case. The challenge is that there is no simple technical tool you can use to do this.

The model’s prediction and classification results are observational rather than experimental, and they have been designed to minimise error rather than to be informative about causal structures. This means it is difficult to draw out an explanation.

You will therefore need to interpret and analyse which correlations and associations are consequential for providing a meaningful explanation. You can do this by drawing on your knowledge of the domain you are working in, and the decision recipient’s specific circumstances.

Taking this context-sensitive approach should help you do two things:

  • Sense-checking which correlations are relevant to an explanation. This involves not only ensuring that these correlations are not spurious or caused by hidden variables, but also determining how applicable the statistical generalisations are to the affected individual’s specific circumstances.

    For example, an AI model eliminates a job candidate who has spent several years in a full-time family care role, because it identifies a strong statistical correlation between long periods of unemployment and poor work performance. The reason for the candidate’s time out of paid work suggests that the correlation identified may not reasonably apply in this case. If such an AI-generated recommendation were weighed as part of a decision-support process, or if an automated outcome based on this result were challenged, the model’s implementer or reviewer would have to sense-check whether such a correlation should play a significant role given the decision recipient’s particular circumstances. They would also have to consider how other factors should be weighed in justifying that outcome.
  • Identifying relevant determining factors. Taking a context-sensitive approach should help you pick out the features and interactions that could reasonably make a real-world difference to the outcome, because it focuses specifically on the decision recipient under consideration.

    For example, a model predicts that a patient has a high chance of developing lung cancer in their lifetime. The features and interactions that have significantly contributed to this prediction include family history. The doctor knows that the patient is a non-smoker and has a family history of lung cancer, and concludes that, given risks arising from shared environmental and genetic factors, family history should be considered as a strong determinant in this patient’s case.
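
Because there is no simple technical tool for this step, any code can only record the implementer's judgement, not replace it. The sketch below shows one hypothetical way to log that judgement for each factor the model highlighted, using the job candidate example. The class name, field names and contribution values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FactorReview:
    """One model-identified factor plus the implementer's sense-check of it."""
    feature: str
    contribution: float    # signed contribution reported by the explanation tool
    applies_to_case: bool  # the implementer's judgement, not a computed value
    reason: str            # why the factor does or does not apply in this case


def legitimate_determinants(reviews):
    """Return the factors the implementer accepted, largest contribution first."""
    accepted = [review for review in reviews if review.applies_to_case]
    return sorted(accepted, key=lambda review: abs(review.contribution), reverse=True)


# Hypothetical review of the job candidate example; all values are invented.
reviews = [
    FactorReview("employment_gap_years", -0.6, False,
                 "Gap was a full-time family care role, not unemployment."),
    FactorReview("skills_test_score", 0.2, True,
                 "Directly measures ability to do the job."),
    FactorReview("relevant_experience_years", 0.4, True,
                 "Experience is in the same field as the role."),
]

for review in legitimate_determinants(reviews):
    print(f"{review.feature}: {review.reason}")
```

Keeping the rejected factors and the reasons for rejecting them in the same record gives a reviewer the material they need if the outcome is later challenged.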

Integrate your chosen correlations and outcome determinants into your reasoning

The final step involves integrating the correlations you have identified as most relevant into your reasoning. You should consider how this particular set of factors that influenced the model’s result, combined with the specific context of the decision recipient, can support your overall conclusion on the outcome.

Your implementers should also be able to make their reasoning explicit and intelligible to affected individuals. Decision recipients should be able to easily understand how the statistical result has been applied to their particular case, and why the implementer assessed the outcome as they did. You could do this through a plain-language explanation, or any other format the decision recipient may need to make sense of the decision.