15 March 2023 - This is an old chapter with new additions, including what we need to include in our DPIA.
At a glance
This section is about the accountability principle, which makes you responsible for complying with data protection law and for demonstrating that compliance in any AI system that processes personal data. A data protection impact assessment (DPIA) is an ideal way to demonstrate your compliance. The section will also explain the importance of identifying and understanding controller/processor relationships. Finally, it covers striking the required balance between the right to data protection and other fundamental rights in the context of your AI system.
Who is this section for?
This section is aimed at senior management and those in compliance-focused roles, including DPOs, who are accountable for the governance and data protection risk management of an AI system. There are some terms and techniques described that may require the input of a technical specialist.
How should we approach AI governance and risk management?
If used well, AI has the potential to make organisations more efficient, effective and innovative. However, AI also raises significant risks for the rights and freedoms of individuals, as well as compliance challenges for organisations.
Different technological approaches will either exacerbate or mitigate some of these issues, but many others are much broader than the specific technology. As the rest of this guidance suggests, the data protection implications of AI are heavily dependent on the specific use cases, the population they are deployed on, other overlapping regulatory requirements, as well as social, cultural and political considerations.
While AI increases the importance of embedding data protection by design and default into an organisation’s culture and processes, the technical complexities of AI systems can make this more difficult. Demonstrating how you have addressed these complexities is an important element of accountability.
You cannot delegate these issues to data scientists or engineering teams. Your senior management, including DPOs, are also accountable for understanding and addressing them appropriately and promptly (although overall accountability for data protection compliance lies with the controller, ie your organisation).
To do so, in addition to their own upskilling, your senior management will need diverse, well-resourced teams to support them in carrying out their responsibilities. You also need to align your internal structures, roles and responsibilities maps, training requirements, policies and incentives to your overall AI governance and risk management strategy.
It is important that you do not underestimate the initial and ongoing level of investment of resources and effort that is required. You must be able to demonstrate, on an ongoing basis, how you have addressed data protection by design and default obligations. Your governance and risk management capabilities need to be proportionate to your use of AI. This is particularly true now while AI adoption is still in its initial stages, and the technology itself, as well as the associated laws, regulations, governance and risk management best practices are developing quickly.
The risk-based approach of data protection law requires you to comply with your obligations and implement appropriate measures in the context of your particular circumstances – the nature, scope, context and purposes of the processing you intend to do, and the risks this poses to individuals’ rights and freedoms. That is to say, you need to identify the risks to people’s data protection rights associated with your processing activities. This will help you to determine the measures you need to put in place to ensure your processing complies with your data protection obligations.
Your compliance considerations therefore involve assessing the risks to the rights and freedoms of individuals and judging what is appropriate in those circumstances. In all cases, you need to ensure you comply with data protection requirements.
This applies to the use of AI just as to other technologies that process personal data. In the context of AI, the specific nature of the risks posed and the circumstances of your processing will require you to strike an appropriate balance between competing interests as you go about ensuring data protection compliance. This may in turn impact the outcome of your processing. It is unrealistic to adopt a ‘zero tolerance’ approach to risks to rights and freedoms, and indeed the law does not require you to do so. It is about ensuring that these risks are identified, managed and mitigated. We talk about trade-offs and how you should manage them below and provide examples of some trade-offs throughout the guidance.
To manage the risks to individuals that arise from processing personal data in your AI systems, it is important that you develop a mature understanding of fundamental rights, risks, and how to balance these and other interests. Ultimately, it is necessary for you to:
assess the risks to individual rights that your use of AI poses;
determine how you will address these; and
establish the impact this has on your use of AI.
You should ensure your approach fits both your organisation and the circumstances of your processing. Where appropriate, you should also use risk assessment frameworks.
This is a complex task, which can take time to get right. However, it will give you, as well as the ICO, a fuller and more meaningful view of your risk positions and the adequacy of your compliance and risk management approaches.
The following sections deal with the AI-specific implications of accountability including:
how you should undertake data protection impact assessments for AI systems;
how you can identify whether you are a controller or processor for specific processing operations involved in the development and deployment of AI systems and the resulting implications for your responsibilities;
how you should assess the risks to the rights and freedoms of individuals, and how you should address them when you design, or decide to use, an AI system; and
how you should justify, document and demonstrate the approach you take, including your decision to use AI for the processing in question.
What do we need to consider when undertaking data protection impact assessments for AI?
DPIAs are a key part of data protection law’s focus on accountability and data protection by design.
You should not see DPIAs as simply a box ticking compliance exercise. They can effectively act as roadmaps for you to identify and control the risks to rights and freedoms that using AI can pose. They are also an ideal opportunity for you to consider and demonstrate your accountability for the decisions you make in the design or procurement of AI systems.
Why are DPIAs required under the data protection law?
In the vast majority of cases, the use of AI will involve a type of processing likely to result in a high risk to individuals’ rights and freedoms, and will therefore trigger the legal requirement for you to undertake a DPIA. You will need to make this assessment on a case by case basis. In those cases where you assess that a particular use of AI does not involve high risk processing, you still need to document how you have made this assessment.
If the result of an assessment indicates residual high risk to individuals that you cannot sufficiently reduce, you must consult with the ICO prior to starting the processing.
In addition to conducting a DPIA, you may also be required to undertake other kinds of impact assessments or do so voluntarily. For example, public sector organisations are required to undertake equality impact assessments, while other organisations voluntarily undertake ‘algorithm impact assessments’. Similarly, the machine learning community has proposed ‘model cards’ and ‘datasheets’ which describe how ML models may perform under different conditions, and the context behind the datasets they are trained on, which may help inform an impact assessment. There is no reason why you cannot combine these exercises, so long as the assessment encompasses all the requirements of a DPIA.
The ICO has produced detailed guidance on DPIAs that explains when they are required and how to complete them. This section sets out some of the things you should think about when carrying out a DPIA for the processing of personal data in AI systems.
How do we decide whether to do a DPIA?
We acknowledge that not all uses of AI will involve types of processing that are likely to result in a high risk to rights and freedoms. However, you should note that Article 35(3) of the UK GDPR requires you to undertake a DPIA if your use of AI involves:
systematic and extensive evaluation of personal aspects based on automated processing, including profiling, on which decisions are made that produce legal or similarly significant effects;
large-scale processing of special categories of personal data; or
systematic monitoring of publicly-accessible areas on a large scale.
Beyond this, AI can also involve several processing operations that are themselves likely to result in a high risk, such as use of new technologies or novel application of existing technologies, data matching, invisible processing, and tracking of location or behaviour. When these involve things like evaluation or scoring, systematic monitoring, and large-scale processing, the requirement to do a DPIA is triggered.
In any case, if you have a major project that involves the use of personal data it is also good practice to do a DPIA. Read our list of processing operations ‘likely to result in high risk’ for examples of operations that require a DPIA, and further detail on which criteria are high risk in combination with others.
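The screening step described above can be sketched in code. The following is a minimal Python illustration, not an official tool: the class name, field names and outcome wording are all hypothetical, and it assumes you have already answered each screening question yourself. It treats the three listed triggers as requiring a DPIA, and treats the other indicators (such as data matching or invisible processing) as a prompt to assess them in combination.

```python
from dataclasses import dataclass, field


@dataclass
class ScreeningAnswers:
    """Hypothetical screening answers for one AI use case."""
    automated_evaluation_with_significant_effects: bool = False
    large_scale_special_category_data: bool = False
    large_scale_public_monitoring: bool = False
    # Other indicators named in the text, e.g. "data matching"
    other_indicators: list = field(default_factory=list)


def screen_for_dpia(answers):
    """Return (outcome, reasons). Illustrative only, not a legal test."""
    reasons = []
    if answers.automated_evaluation_with_significant_effects:
        reasons.append("automated evaluation with legal or similarly significant effects")
    if answers.large_scale_special_category_data:
        reasons.append("large-scale special category data")
    if answers.large_scale_public_monitoring:
        reasons.append("large-scale monitoring of publicly accessible areas")
    if reasons:
        return "DPIA required", reasons
    if answers.other_indicators:
        # Assumption: any other indicator prompts a closer, combined assessment
        return "DPIA likely required: assess indicators in combination", list(answers.other_indicators)
    # Even where no trigger applies, the assessment itself should be documented
    return "No trigger found: document this assessment", []


if __name__ == "__main__":
    print(screen_for_dpia(ScreeningAnswers(other_indicators=["data matching"])))
```

A sketch like this can only record and summarise your answers. Deciding whether processing is 'large scale' or has 'similarly significant effects' remains a case-by-case judgement.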
Your DPIA needs to describe the nature, scope, context and purposes of any processing of personal data. It needs to make clear how and why you are going to use AI to process the data. You need to detail:
how you will collect, store and use data;
the volume, variety and sensitivity of the data;
the nature of your relationship with individuals; and
the intended outcomes for individuals or wider society, as well as for you.
Whether a system using AI is generally more or less risky than a system not using AI depends on the specific circumstances. You therefore need to evaluate this based on your own context. Your DPIA should show evidence of your consideration of less risky alternatives, if any, that achieve the same purpose of the processing, and why you didn’t choose them. This consideration is particularly relevant where you are using public task or legitimate interests as a lawful basis. See ‘How do we identify our purposes and lawful basis’.
When considering the impact your processing has on individuals, it is important to consider both allocative harms and representational harms:
Allocative harms are the result of a decision to allocate goods and opportunities among a group. The impact of allocative decisions may be loss of financial opportunity, loss of livelihood, loss of freedom, or in extreme circumstances, loss of life.
Representational harms occur when systems reinforce the subordination of groups along identity lines, for example through stereotyping, under-representation, or denigration (meaning belittling or undermining their human dignity).
Example of allocative harm
An organisation may use an AI system in recruitment that disproportionately classifies applications from male candidates as suitable compared to those from female candidates. The use of this system has implications for the allocation of job opportunities to female candidates and the associated economic outcomes.
Example of representational harm
An individual belonging to an ethnic minority group uploads their holiday photos to an internet platform. The image recognition system operated by the platform assigns labels to their ‘selfie’ photos that are denigrating and reflect racist tropes.
In the context of the AI lifecycle, a DPIA will best serve its purpose if you undertake it at the earliest stages of project development. It should feature, at a minimum, the following key components.
How do we describe the processing?
Your DPIA should include:
a systematic description of the processing activity, including data flows and the stages when AI processes and automated decisions may produce effects on individuals;
a description of the scope and context of the processing, including:
what data you will process;
the number of data subjects involved;
the source of the data; and
to what extent individuals are likely to expect the processing.
Your DPIA should identify and record the degree of any human involvement in the decision-making process and at what stage this takes place. Where automated decisions are subject to human intervention or review, you should implement processes to ensure this is meaningful and also detail the fact that decisions can be overturned.
It can be difficult to describe the processing activity of AI systems, particularly when they involve complex models and data sources. However, such a description is necessary as part of a DPIA. In some cases, although it is not a legal requirement, it may be good practice for you to maintain two versions of an assessment, with:
the first presenting a thorough technical description for specialist audiences; and
the second containing a more high-level description of the processing and explaining the logic of how the personal data inputs relate to the outputs affecting individuals (this may also support you in fulfilling your obligation to explain AI decisions to individuals).
Your DPIA should set out your roles and obligations as a controller and include any processors involved. Where AI systems are partly or wholly outsourced to external providers, both you and any other organisations involved should also assess whether joint controllership exists under Article 26 of the UK GDPR; and if so, collaborate in the DPIA process as appropriate.
If you use a processor, you can illustrate some of the more technical elements of the processing activity in a DPIA by reproducing information from that processor. For example, you could include a flow diagram from a processor’s manual. However, you should generally avoid copying large sections of a processor’s literature into your own assessment.
Do we need to consult anyone?
You must, where appropriate:
seek and document the views of individuals whose data you will be processing during the AI lifecycle, or their representatives, unless there is a good reason not to;
consult all relevant internal stakeholders;
consult with your processor, if you use one; and
consider seeking legal advice or other expertise.
Unless there is a good reason not to do so, you should seek and document the views of individuals whose personal data you process, or their representatives, on the intended processing operation during a DPIA. It is therefore important that you can describe the processing in a way that those you consult can understand. However, if you can demonstrate that consultation would compromise commercial confidentiality, undermine security, or be disproportionate or impracticable, these can be reasons not to consult.
You can help to identify the potential risks of your systems by engaging with:
independent domain experts who have a deep understanding of the context in which your system will be deployed; and
people with lived experience within that context that could also be impacted by the system.
How do we assess necessity and proportionality?
The deployment of an AI system to process personal data needs to be driven by evidence that there is a problem, and a reasoned argument that AI is a sensible solution to that problem, not by the mere availability of the technology. By assessing necessity in a DPIA, you can evidence that you couldn’t accomplish these purposes in a less intrusive way.
A DPIA also allows you to demonstrate that your processing of personal data by an AI system is a proportionate activity. When assessing proportionality, you need to weigh up your interests in using AI against the risks it may pose to the rights and freedoms of individuals. For AI systems, you need to think about any detriment to individuals that could follow from bias or inaccuracy in the algorithms and data sets being used.
Within the proportionality element of a DPIA, you need to assess whether individuals would reasonably expect an AI system to conduct the processing. If AI systems complement or replace human decision-making, you should document in the DPIA how the project might compare human and algorithmic accuracy side-by-side to better justify their use.
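As an illustration of the side-by-side comparison described above, the following minimal Python sketch scores a set of human decisions and a set of AI decisions against the same known outcomes. The decision lists are hypothetical sample data, and plain accuracy is an assumed metric chosen for simplicity. In practice you may need other measures, such as error rates broken down by group.

```python
def accuracy(decisions, outcomes):
    """Share of decisions that match the known outcomes."""
    if not outcomes or len(decisions) != len(outcomes):
        raise ValueError("decisions and outcomes must be non-empty and the same length")
    correct = sum(d == o for d, o in zip(decisions, outcomes))
    return correct / len(outcomes)


def compare_accuracy(human, model, outcomes):
    """Return human accuracy, model accuracy and the difference (model minus human)."""
    human_acc = accuracy(human, outcomes)
    model_acc = accuracy(model, outcomes)
    return {"human": human_acc, "model": model_acc, "difference": model_acc - human_acc}


if __name__ == "__main__":
    # Hypothetical data: 1 = suitable, 0 = not suitable, for the same eight cases
    outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
    human = [1, 0, 0, 1, 0, 1, 1, 0]
    model = [1, 0, 1, 1, 0, 0, 0, 0]
    print(compare_accuracy(human, model, outcomes))
```

A comparison like this is only meaningful if both sets of decisions cover the same cases and the recorded outcomes are themselves reliable.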
You should also describe any trade-offs that are made, for example between statistical accuracy and data minimisation, and document the methodology and rationale for these.
How do we identify and assess risks to individuals?
The DPIA process will help you to objectively identify the relevant risks to individuals’ interests. You should assign a score or level to each risk, measured against the likelihood and the severity of the impact on individuals.
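One simple way to assign a level to each risk is a likelihood and severity matrix. The Python sketch below is a minimal illustration under stated assumptions: the three-point scales, the multiplication of the two values and the thresholds for 'low', 'medium' and 'high' are all hypothetical choices, not a prescribed method. You should adapt them to your own risk assessment framework.

```python
# Assumed three-point scales; the labels and numeric values are illustrative
LIKELIHOOD = {"remote": 1, "possible": 2, "probable": 3}
SEVERITY = {"minimal": 1, "significant": 2, "severe": 3}


def risk_level(likelihood, severity):
    """Return (score, level) for one risk to individuals."""
    score = LIKELIHOOD[likelihood] * SEVERITY[severity]
    # Assumed thresholds: 6 or more is high, 3 to 4 is medium, otherwise low
    if score >= 6:
        return score, "high"
    if score >= 3:
        return score, "medium"
    return score, "low"


if __name__ == "__main__":
    # Hypothetical risks identified for an AI recruitment tool
    risks = {
        "discriminatory scoring from historic data": ("possible", "severe"),
        "inaccurate inference about a candidate": ("probable", "significant"),
        "unexpected reuse of application data": ("remote", "significant"),
    }
    for name, (likelihood, severity) in risks.items():
        print(name, risk_level(likelihood, severity))
```

A numeric score supports consistency between assessments, but it does not replace your judgement about the context and the individuals affected.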
The use of personal data in the development and deployment of AI systems may not just pose risks to individuals’ information rights. When considering sources of risk, your DPIA should consider the potential impact of other material and non-material damage or harm on individuals.
For example, machine learning systems may reproduce discrimination from historic patterns in data, which could fall foul of equalities legislation. Similarly, AI systems that stop content being published based on the analysis of the creator’s personal data could impact their freedom of expression. In these contexts, you should consider the relevant legal frameworks beyond data protection.
How do we identify mitigating measures?
Against each identified risk to individuals’ interests, you should consider options to reduce the level of assessed risk further. Examples of this could be data minimisation techniques or providing opportunities for individuals to opt out of the processing.
You should ask your DPO (if you have one) for advice when considering ways to reduce or avoid these risks, and you should record in your DPIA whether your chosen measure reduces or eliminates the risk in question.
It is important that DPOs, other information governance professionals, or both are involved in AI projects from the earliest stages. There must be clear and open channels of communication between them and the project teams. This will ensure that they can identify and address these risks early in the AI lifecycle.
Data protection should not be an afterthought, and a DPO’s professional opinion should not come as a surprise at the eleventh hour.
You can use a DPIA to document the safeguards you put in place to ensure the individuals responsible for the development, testing, validation, deployment, and monitoring of AI systems are adequately trained and have an understanding of the data protection implications of the processing.
Your DPIA can also evidence the organisational measures you have put in place, such as appropriate training, to mitigate risks associated with human error. You should also document any technical measures designed to reduce risks to the security and accuracy of personal data processed in your AI system.
Once you have introduced measures to mitigate the risks you have identified, the DPIA should document the residual levels of risk posed by the processing.
You are not required to eliminate every risk identified. However, if your assessment indicates a high risk to the data protection rights of individuals that you are unable to sufficiently reduce, you are required to consult the ICO before you can go ahead with the processing.
How do we conclude our DPIA?
You should record:
what additional measures you plan to take;
whether each risk has been eliminated, reduced or accepted;
the overall level of ‘residual risk’ after taking additional measures;
the opinion of your DPO, if you have one; and
whether you need to consult the ICO.
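The items above could be captured in a simple structured record. The following Python sketch is one possible shape for such a record, not a required format. The class names, field names and risk levels are hypothetical, and it assumes that the overall residual risk is the highest residual level among the individual risks and that an overall level of 'high' means you need to consult the ICO.

```python
from dataclasses import dataclass, field

OUTCOMES = ("eliminated", "reduced", "accepted")
LEVELS = ("none", "low", "medium", "high")  # assumed ordering, lowest first


@dataclass
class RiskOutcome:
    description: str
    outcome: str              # one of OUTCOMES
    residual_level: str       # one of LEVELS, after additional measures
    additional_measures: list = field(default_factory=list)

    def __post_init__(self):
        if self.outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {self.outcome}")
        if self.residual_level not in LEVELS:
            raise ValueError(f"unknown residual level: {self.residual_level}")


@dataclass
class DpiaConclusion:
    risks: list
    dpo_opinion: str = ""     # left empty if you have no DPO

    def overall_residual_risk(self):
        # Assumption: the overall level is the highest individual residual level
        if not self.risks:
            return "none"
        return max((r.residual_level for r in self.risks), key=LEVELS.index)

    def must_consult_ico(self):
        return self.overall_residual_risk() == "high"


if __name__ == "__main__":
    conclusion = DpiaConclusion(
        risks=[
            RiskOutcome("bias in training data", "reduced", "medium", ["rebalance training set"]),
            RiskOutcome("excessive data collection", "eliminated", "none", ["drop unused fields"]),
        ],
        dpo_opinion="Proceed, with a review after six months.",
    )
    print(conclusion.overall_residual_risk(), conclusion.must_consult_ico())
```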
What happens next?
Although you must carry out your DPIA before the processing of personal data begins, you should also consider it to be a ‘live’ document. This means reviewing the DPIA regularly and undertaking a reassessment where appropriate (eg if the nature, scope, context or purpose of the processing, and the risks posed to individuals, alter for any reason).
The European Data Protection Board (EDPB), which has replaced the Article 29 Working Party (WP29), includes representatives from the data protection authorities of each EU member state. It adopts guidelines for complying with the requirements of the EU version of the GDPR.
EDPB guidelines are no longer directly relevant to the UK regime and are not binding under the UK regime. However, they may still provide helpful guidance on certain issues.
How should we understand controller / processor relationships in AI?
Why is controllership important for AI systems?
Often, several different organisations will be involved in developing and deploying AI systems which process personal data.
The UK GDPR recognises that not all organisations involved in the processing will have the same degree of control or responsibility. It is important to be able to identify who is acting as a controller, a joint controller or a processor so you understand which UK GDPR obligations apply to which organisation.
How do we determine whether we are a controller or a processor?
You should take the time to assess, and document, the status of each organisation you work with in respect of all the personal data processing activities you carry out.
If you exercise overall control of the purpose and means of the processing of personal data – you decide what data to process, why and how – you are a controller.
If you don’t have any purpose of your own for processing the data and you only act on a client’s instructions, you are likely to be a processor – even if you make some technical decisions about how you process the data.
Organisations that determine the purposes and means of processing will be controllers regardless of how they are described in any contract about processing services.
As AI usually involves processing personal data in several different phases or for several different purposes, it is possible that you may be a controller or joint controller for some phases or purposes, and a processor for others.
What type of decisions mean we are a controller?
Our guidance says that if you make any of the following overarching decisions, you will be a controller:
to collect personal data in the first place;
what types of personal data to collect;
the purpose or purposes the data are to be used for;
which individuals to collect the data about;
how long to retain the data; and
how to respond to requests made in line with individuals’ rights.
What type of decisions can we take as a processor?
Our guidance says that you are likely to be a processor if you don’t have any purpose of your own for processing the data and you only act on a client’s instructions. You may still be able to make some technical decisions as a processor about how the data is processed (the means of the processing). For example, where allowed in the contract, you may use your technical knowledge to decide:
the IT systems and methods you use to process personal data;
how you store the data;
the security measures that will protect it; and
how you retrieve, transfer, delete or dispose of that data.
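The two lists above (overarching decisions that make you a controller, and technical decisions a processor may take) can be sketched as a simple check. This Python example is illustrative only: the short decision labels are hypothetical names for the items in the lists, and the result is a prompt for your documented assessment, not a legal determination. Because your role can differ between phases, the example applies the check to each phase separately.

```python
# Hypothetical labels for the overarching decisions listed in the guidance
CONTROLLER_DECISIONS = {
    "whether_to_collect", "data_types", "purposes",
    "which_individuals", "retention_period", "rights_request_handling",
}
# Hypothetical labels for technical decisions a processor may take under contract
PROCESSOR_DECISIONS = {
    "it_systems_and_methods", "storage_method",
    "security_measures", "retrieval_transfer_deletion",
}


def likely_role(decisions_made):
    """Return (role, overarching_decisions) for one processing phase."""
    made = set(decisions_made)
    unknown = made - CONTROLLER_DECISIONS - PROCESSOR_DECISIONS
    if unknown:
        raise ValueError(f"unrecognised decisions: {sorted(unknown)}")
    overarching = made & CONTROLLER_DECISIONS
    if overarching:
        # Any one overarching decision indicates controllership
        return "controller", sorted(overarching)
    return "likely processor", []


if __name__ == "__main__":
    # Hypothetical AI service provider assessed phase by phase
    phases = {
        "model training": {"purposes", "data_types", "storage_method"},
        "client predictions": {"it_systems_and_methods", "security_measures"},
    }
    for phase, decisions in phases.items():
        print(phase, likely_role(decisions))
```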
How may these issues apply in AI?
When AI systems involve a number of organisations in the processing of personal data, assigning the roles of controller and processor can become complex, for example when some of the processing happens in the cloud. This can raise broader questions outside the scope of this guidance.
For example, questions about the types of scenario that could result in an organisation becoming a controller, which may include when an organisation makes decisions about:
the source and nature of the data used to train an AI model;
the target output of the model (what is being predicted or classified);
the broad kinds of ML algorithms that will be used to create models from the data (eg regression models, decision trees, random forests, neural networks);
feature selection – the features that may be used in each model;
key model parameters (eg how complex a decision tree can be, or how many models will be included in an ensemble);
key evaluation metrics and loss functions, such as the trade-off between false positives and false negatives; and
how any models will be continuously tested and updated: how often, using what kinds of data, and how ongoing performance will be assessed.
We will also consider questions about when an organisation is (depending on the terms of their contract) able to make decisions to support the provision of AI services, and still remain a processor. For example, in areas such as:
the specific implementation of generic ML algorithms, such as the programming language and code libraries they are written in;
how the data and models are stored, such as the formats they are serialised and stored in, and local caching;
measures to optimise learning algorithms and models to minimise their consumption of computing resources (eg by implementing them as parallel processes); and
architectural details of how models will be deployed, such as the choice of virtual machines, microservices, APIs.
We intend to address these issues in more detail in future guidance products, including additional AI-specific material, as well as revisions to our cloud computing guidance. As we undertake this work, we will consult and work closely with key stakeholders, including government, to explore these issues and develop a range of scenarios in which the organisation remains a data processor as it provides AI services.
In our work to date we have developed some indicative example scenarios:
An organisation provides a cloud-based service consisting of a dedicated cloud computing environment with processing and storage, and a suite of common tools for ML. These services enable clients to build and run their own models, with data they have chosen, but using the tools and infrastructure the organisation provides in the cloud. The clients will be controllers, and the provider is likely to be a processor.
The clients are controllers as they take the overarching decisions about what data and models they want to use, the key model parameters, and the processes for evaluating, testing and updating those models.
The provider as a processor could still decide what programming languages and code libraries those tools are written in, the configuration of storage solutions, the graphical user interface, and the cloud architecture.
An organisation provides live AI prediction and classification services to clients. It develops its own AI models, and allows clients to send queries via an API (‘what objects are in this image?’) to get responses (a classification of objects in the image).
First, the prediction service provider decides how to create and train the model that powers its services, and processes data for these purposes. It is likely to be a controller for this element of the processing.
Second, the provider processes data to make predictions and classifications about particular examples for each client. The client is more likely to be the controller for this element of the processing, and the provider is likely to be a processor.
An AI service provider isolates different client-specific models. This enables each client to make overarching decisions about their model, including whether to further process personal data from their own context to improve their own model.
As long as the isolation between different controllers is complete and auditable, the client will be the sole controller and the provider will be a processor.
Further reading outside this guidance
This is a complicated area, and you should refer to our specific guidance for more information:
How should we manage competing interests when assessing AI-related risks?
Your use of AI must comply with the requirements of data protection law. However, there can be a number of different values and interests to consider, and these may at times pull in different directions. These are commonly referred to as ‘trade-offs’, and the risk-based approach of data protection law can help you navigate them. There are several significant examples relating to AI, which we discuss in detail elsewhere:
If you are using AI to process personal data you therefore need to identify and assess these interests, as part of your broader consideration of the risks to the rights and freedoms of individuals and how you will meet your obligations under the law.
The right balance depends on the specific sectoral and social context you operate in, and the impact the processing may have on individuals. However, there are methods you can use to assess and mitigate trade-offs that are relevant to many use cases.
How can we manage these trade-offs?
In most cases, striking the right balance between these multiple trade-offs is a matter of judgement, specific to the use case and the context an AI system is meant to be deployed in.
Whatever choices you make, you need to be accountable for them. Your efforts should be proportionate to the risks that the AI system you are considering deploying poses to individuals. You should:
identify and assess any existing or potential trade-offs, when designing or procuring an AI system, and assess the impact it may have on individuals;
consider available technical approaches to minimise the need for any trade-offs;
consider any techniques which you can implement with a proportionate level of investment and effort;
have clear criteria and lines of accountability about the final trade-off decisions. This should include a robust, risk-based and independent approval process;
where appropriate, take steps to explain any trade-offs to individuals or any human tasked with reviewing AI outputs; and
review trade-offs on a regular basis, taking into account, among other things, the views of individuals whose personal data is likely to be processed by the AI (or their representatives) and any emerging techniques or best practices to reduce them.
You should document these processes and their outcomes to an auditable standard. This will help you to demonstrate that your processing is fair, necessary, proportionate, adequate, relevant and limited. This is part of your responsibility as a controller under Article 24 and your compliance with the accountability principle under Article 5(2). You must also capture them with an appropriate level of detail where required as part of a DPIA or a legitimate interests assessment (LIA) undertaken in connection with a decision to rely on the ‘legitimate interests’ lawful basis for processing personal data.
You should also document:
how you have considered the risks to the individuals that are having their personal data processed;
the methodology for identifying and assessing the trade-offs in scope;
the reasons for adopting or rejecting particular technical approaches (if relevant);
the prioritisation criteria and rationale for your final decision; and
how the final decision fits within your overall risk appetite.
You should also be ready to halt the deployment of any AI systems, if it is not possible to achieve a balance that ensures compliance with data protection requirements.
Outsourcing and third-party AI systems
When you either buy an AI solution from a third party, or outsource it altogether, you need to conduct an independent evaluation of any trade-offs as part of your due diligence process. You are also required to specify your requirements at the procurement stage, rather than addressing trade-offs afterwards.
Recital 78 of the UK GDPR says producers of AI solutions should be encouraged to:
take into account the right to data protection when developing and designing their systems; and
make sure that controllers and processors are able to fulfil their data protection obligations.
You should ensure that any system you procure aligns with what you consider to be the appropriate trade-offs. If you are unable to assess whether the use of a third party solution would be data protection compliant, then you should, as a matter of good practice, opt for a different solution. Since new risks and compliance considerations may arise during the course of the deployment, you should regularly review any outsourced services and be able to modify them or switch to another provider if their use is no longer compliant in your circumstances.
For example, a vendor may offer a CV screening tool which effectively scores promising job candidates but may ostensibly require a lot of information about each candidate to assist with the assessment. If you are procuring such a system, you need to consider whether you can justify collecting so much personal data from candidates, and if not, request the provider modify their system or seek another provider.
Culture, diversity and engagement with stakeholders
You need to make significant judgement calls when determining the appropriate trade-offs. While effective risk management processes are essential, the culture of your organisation also plays a fundamental role.
Undertaking this kind of exercise will require collaboration between different teams within the organisation. Diversity, incentives to work collaboratively, as well as an environment in which staff feel encouraged to voice concerns and propose alternative approaches are all important.
The social acceptability of AI in different contexts, and the best practices in relation to trade-offs, are the subject of ongoing societal debates. Consultation with stakeholders outside your organisation, including those affected by the trade-off, can help you understand the value you should place on different criteria.
What about mathematical approaches to minimise trade-offs?
In some cases, you can precisely quantify elements of the trade-offs. A number of mathematical and computer science techniques known as ‘constrained optimisation’ aim to find the optimal solutions for minimising trade-offs.
For example, the theory of differential privacy provides a framework for quantifying and minimising trade-offs between the knowledge that can be gained from a dataset or statistical model, and the privacy of the people in it. Similarly, various methods exist to create ML models which optimise statistical accuracy while also minimising mathematically defined measures of discrimination.
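To illustrate the kind of constrained optimisation mentioned above, the following minimal Python sketch searches a small set of decision thresholds for the one with the highest accuracy, subject to a limit on the gap in selection rates between two groups. Everything specific here is an assumption made for illustration: the sample records are hypothetical, the selection-rate gap is only one of several possible mathematical measures of discrimination, and a simple search over candidate thresholds stands in for the more sophisticated methods used in practice.

```python
def evaluate(threshold, records):
    """Return (accuracy, selection_rate_gap) for one decision threshold."""
    correct = 0
    selected, totals = {}, {}
    for score, label, group in records:
        prediction = 1 if score >= threshold else 0
        correct += prediction == label
        selected[group] = selected.get(group, 0) + prediction
        totals[group] = totals.get(group, 0) + 1
    rates = [selected[g] / totals[g] for g in totals]
    return correct / len(records), max(rates) - min(rates)


def best_threshold(records, candidates, max_gap):
    """Most accurate candidate whose selection-rate gap is within max_gap."""
    best = None
    for threshold in candidates:
        acc, gap = evaluate(threshold, records)
        if gap <= max_gap and (best is None or acc > best[1]):
            best = (threshold, acc, gap)
    return best  # None if no candidate satisfies the constraint


# Hypothetical records: (model score, true label, group)
RECORDS = [
    (0.9, 1, "A"), (0.8, 1, "A"), (0.6, 0, "A"), (0.4, 1, "A"), (0.2, 0, "A"),
    (0.7, 1, "B"), (0.5, 1, "B"), (0.45, 0, "B"), (0.3, 0, "B"), (0.1, 0, "B"),
]
CANDIDATES = [0.25, 0.35, 0.5, 0.65, 0.75]

if __name__ == "__main__":
    print("no constraint:", best_threshold(RECORDS, CANDIDATES, max_gap=1.0))
    print("gap <= 0.1:  ", best_threshold(RECORDS, CANDIDATES, max_gap=0.1))
```

With this sample data, tightening the constraint changes which threshold is chosen and lowers the accuracy achieved. Deciding how much accuracy to give up is the trade-off you need to assess and justify.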
While these approaches provide theoretical guarantees, it can be hard to meaningfully put them into practice. In many cases, values like privacy and fairness are difficult to meaningfully quantify. For example, differential privacy may be able to measure the likelihood of an individual being uniquely identified from a particular dataset, but not the sensitivity of that identification. These techniques may therefore not always be appropriate. If you do decide to use mathematical and computer science techniques to minimise trade-offs, you should always supplement these methods with a more qualitative and holistic approach. The inability to precisely quantify the values at stake does not mean you can avoid assessing the trade-off altogether; you still need to justify your choices.
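As a concrete illustration of the differential privacy trade-off discussed above, the following Python sketch adds Laplace noise to a simple count. It is a minimal sketch under stated assumptions: it uses the standard Laplace mechanism for a counting query (where one person changes the count by at most 1), the function names and the example count are hypothetical, and it is not suitable for production use. A smaller privacy parameter (epsilon) gives stronger privacy but a less accurate published figure.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(true_count, epsilon, rng):
    """Release a count with Laplace noise of scale 1/epsilon (sensitivity 1)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_absolute_error(true_count, epsilon, trials, seed=0):
    """Average distance between the released and true counts over many releases."""
    rng = random.Random(seed)
    total = sum(abs(private_count(true_count, epsilon, rng) - true_count)
                for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # Hypothetical count of 100 individuals with some attribute
    for epsilon in (0.1, 1.0):
        print(epsilon, round(mean_absolute_error(100, epsilon, trials=2000), 2))
```

The sketch quantifies only the accuracy cost of a given epsilon. It says nothing about how sensitive the underlying data is, which is why a qualitative assessment is still needed.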
In many cases trade-offs are not precisely quantifiable, but this should not lead to arbitrary decisions. You should perform contextual assessments, documenting and justifying your assumptions about the relative value of different requirements for specific AI use cases.