Can we identify an individual indirectly from the information we have (together with other available information)?

In detail

How do we identify someone indirectly?
What kind of information could allow an individual to be indirectly identified?
Can we identify someone from other information we hold?
Can we or someone else identify an individual from information we hold and they hold?
If there is only a very slight possibility that an individual could be indirectly identified, is it still personal data?
What factors should we consider when assessing the possibility of identification?

How do we identify someone indirectly?

It’s important to be aware that information you hold may indirectly identify an individual and therefore can still be personal data. If so, this means that the information is subject to the UK GDPR.

If you cannot identify an individual directly from the information that you are processing (for example, where all identifiers have been removed), an individual may still be identifiable by other means. This may be from information you already hold, or information that you need to obtain from another source. Similarly, a third party could use information you process and combine it with other information available to them.

You must carefully consider all of the means that any party is reasonably likely to use to identify that individual. This is important because you could inadvertently release or disclose information that could be linked with other information and (inappropriately) identify an individual.

What kind of information could allow an individual to be indirectly identified?

The following is a non-exhaustive list of information that could constitute personal data on the basis that it allows for an individual to be singled out from others:

car registration number and/or VIN;
national insurance number;
passport number; or
a combination of significant criteria (eg age, occupation, place of residence).

The key point about indirect identifiability is that information, when combined with other information, can distinguish an individual from others and so allow that individual to be identified.
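The effect of combining criteria can be sketched in a few lines of Python. This is an illustration only: the records, the field names and the `singled_out` helper are hypothetical and are not part of any guidance or library. It shows that a single attribute may leave most people indistinguishable, while a combination of attributes can leave some records unique.

```python
from collections import Counter

# Hypothetical records containing no names or other direct identifiers.
records = [
    {"age": 34, "occupation": "teacher", "town": "Leeds"},
    {"age": 34, "occupation": "teacher", "town": "Leeds"},
    {"age": 34, "occupation": "nurse", "town": "Leeds"},
    {"age": 61, "occupation": "teacher", "town": "York"},
]


def singled_out(rows, keys):
    """Return the rows whose combination of `keys` values is unique in `rows`."""
    def combo(row):
        return tuple(row[k] for k in keys)

    counts = Counter(combo(r) for r in rows)
    return [r for r in rows if counts[combo(r)] == 1]


# Age alone distinguishes only one record in this sample.
by_age = singled_out(records, ["age"])

# Age, occupation and town together distinguish two records.
by_all = singled_out(records, ["age", "occupation", "town"])
```

A record that is unique on the combined criteria is a candidate for being singled out, even though no single field identifies anyone on its own.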

Example

A vehicle’s registration number can be linked to other information held about the registration (eg by the DVLA) to indirectly identify the owner of that vehicle.

Example

If an individual is not known to the operators of an out-of-town shopping centre CCTV system, but the operators are able to distinguish that individual on the basis of physical characteristics, that individual is identified. Therefore, if the operators are tracking a particular individual that they have singled out in some way (perhaps using such physical characteristics) they will be processing ‘personal data’.

Can we identify someone from other information we hold?

You may process information that, by itself, does not permit the direct identification of an individual. However, within your organisation you may also process other information that, when combined, allows a particular individual to be indirectly identified. If the information relates to that identified individual it constitutes personal data. It’s important to recognise this, so that you can comply with your obligations under the UK GDPR.

Example

An individual submits an application for a job.

On receiving the application, the organisation’s HR department removes the first page, which contains the individual’s name, contact details, etc, and saves the remainder of the form in ‘Folder 1’. The application form is saved with a randomly generated application number and sent on to the recruiting manager.

In a restricted-access folder, ‘Folder 2’, the HR department stores the first page of the application, alongside the application number.

The information in Folder 1 does not allow for the identification of any individual. However, when it is combined with the information in Folder 2, the applicant can be identified.
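The job application example can be sketched as two stores that share a key. The sketch below is hypothetical: the application number format, the field names and the `reidentify` function are assumptions made for illustration, not a description of any real HR system.

```python
# Folder 1: what the recruiting manager sees (no name or contact details).
folder_1 = {
    "APP-7F3K": {"qualifications": "BSc Chemistry", "experience_years": 6},
}

# Folder 2: restricted to HR; holds the removed first page under the same
# randomly generated application number.
folder_2 = {
    "APP-7F3K": {"name": "Example Applicant", "email": "applicant@example.com"},
}


def reidentify(application_number, form_store, cover_store):
    """Combine both stores for one application number.

    Returns None when the cover page is not available, which is the
    position of anyone who can only see Folder 1.
    """
    cover = cover_store.get(application_number)
    if cover is None:
        return None
    return {**cover, **form_store.get(application_number, {})}


# With access to both folders, the applicant is identified.
with_both = reidentify("APP-7F3K", folder_1, folder_2)

# With access to Folder 1 only, the lookup yields nothing.
with_folder_1_only = reidentify("APP-7F3K", folder_1, {})
```

Because the organisation holds both stores, the application number works as an indirect identifier within that organisation, however the access rights are split.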

Example

A business uses Wi-Fi analytics data to count the number of visitors per hour across different retail outlets. It is not necessary to know whether an individual has visited a particular store (or multiple stores) before.

This involves the business processing the Media Access Control (MAC) addresses of mobile devices that broadcast probe requests to its public Wi-Fi hotspots. MAC addresses are intended to be unique to the device (although they can be modified or spoofed using software).

If an individual can be identified from that MAC address, or from other information in the possession of the network operator (the business, in this example), then the data is personal data. Additionally, even if the business does not know the name of the individual, using a MAC address (or other unique identifier) to track a device with the purpose of singling out that individual or treating them differently means the data is also personal data.
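The difference between counting visitors and tracking a device can be shown with a short, hypothetical sketch. The log format, the outlet names, the MAC addresses and both functions are invented for illustration. The same observations support an hourly count and, if grouped by MAC address instead, a movement trail for each device.

```python
from collections import defaultdict

# Hypothetical probe-request log: (hour, outlet, MAC address).
observations = [
    (9, "outlet_a", "aa:bb:cc:00:00:01"),
    (9, "outlet_a", "aa:bb:cc:00:00:02"),
    (10, "outlet_b", "aa:bb:cc:00:00:01"),
    (11, "outlet_c", "aa:bb:cc:00:00:01"),
]


def visitors_per_hour(log):
    """Count distinct devices seen per (hour, outlet)."""
    devices = defaultdict(set)
    for hour, outlet, mac in log:
        devices[(hour, outlet)].add(mac)
    return {key: len(macs) for key, macs in devices.items()}


def device_trails(log):
    """Group the same log by MAC address, giving each device's movements."""
    trails = defaultdict(list)
    for hour, outlet, mac in sorted(log):
        trails[mac].append((hour, outlet))
    return dict(trails)


counts = visitors_per_hour(observations)
trails = device_trails(observations)
```

No name appears anywhere in the log, yet `device_trails` singles out one device and follows it across three outlets.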

Can we or someone else identify an individual from information we hold and they hold?

Sometimes, whether someone can be identified may depend on who may have access to the information and any other information that can be combined with it.

It’s important to be aware that you may hold information which, when combined with other information held outside your organisation, could lead to an individual being indirectly identified or identifiable.

Example

An online platform releases statistical data sets about the use of its services for research purposes. This information does not contain the names of the service’s users, but instead contains profile data showing usage patterns. However, a number of those individuals have made public comments about their use of the platform. The information released by the platform can be matched to the public comments to identify those individuals.
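The matching described in this example can be sketched as follows. Everything here is hypothetical: the released fields, the public comment details and the `link` function are assumptions chosen to show the mechanism, not a description of any real data set.

```python
# Hypothetical released data: usage profiles with no names.
released = [
    {"profile": "u1", "first_use": "2021-03", "top_feature": "playlists", "region": "Wales"},
    {"profile": "u2", "first_use": "2019-11", "top_feature": "podcasts", "region": "Wales"},
]

# Hypothetical details a user has disclosed about themselves in public comments.
public_comments = [
    {"author": "Example Person", "first_use": "2019-11", "top_feature": "podcasts", "region": "Wales"},
]


def link(released_rows, public_rows, keys):
    """Pair each public author with a released profile when exactly one
    released row matches on every field in `keys`."""
    matches = []
    for pub in public_rows:
        candidates = [r for r in released_rows if all(r[k] == pub[k] for k in keys)]
        if len(candidates) == 1:
            matches.append((pub["author"], candidates[0]["profile"]))
    return matches


# Region alone matches two profiles, so nobody is singled out.
weak = link(released, public_comments, ["region"])

# Three fields together match exactly one profile.
strong = link(released, public_comments, ["first_use", "top_feature", "region"])
```

The released data contains no names, but once the matching fields are specific enough, a named author is paired with a single profile.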

Example

A public authority releases information about complaints in response to a request under the Freedom of Information Act 2000. It does not reveal the names or addresses of the complainants, but other information is in the public domain that can easily be used to match the identity of those complainants.

If there is only a very slight possibility that an individual could be indirectly identified, is it still personal data?

Sometimes it is not immediately obvious whether an individual can be identified or not, for example, when you hold information from which the names and other identifiers have been removed, or when you process a ‘non-obvious’ identifier. In these cases, Recital 26 of the UK GDPR states that, to determine whether or not the individual is identifiable, you should take into account ‘all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly’.

Therefore, the fact that there is a very slight hypothetical possibility that someone might be able to reconstruct the data in such a way that the individual is identified is not necessarily sufficient to make the individual identifiable. You must consider all the factors at stake.

What factors should we consider when assessing the possibility of identification?

You should consider what means are reasonably likely to be used to identify the individual, taking into account all objective factors, such as:

the costs and amount of time required for identification;
the available technology at the time of the processing; and
likely technological developments.

You should also document this assessment.

Your starting point might be to look at what means are available to identify an individual and the extent to which these are readily available. For example, if searching a public register or reverse directory would enable you to identify an individual from an address or telephone number, and you are likely to use this resource for this purpose, you should consider that the address or telephone number data is capable of identifying an individual.

You should assume that you are looking not only at the means reasonably likely to be used by an ordinary person, but also at those likely to be used by a determined person with a particular reason to want to identify individuals, for example investigative journalists, estranged partners, stalkers or industrial spies.

Means of identifying individuals that are feasible and cost-effective, and are therefore likely to be used, will change over time. If you decide that the data you hold does not allow the identification of individuals, you should review that decision regularly in light of new technology or security developments or changes to the public availability of certain records.

The measures reasonably likely to be taken to identify an individual may vary depending upon the perceived value of the information. For example, if the information is thought to be about a high profile public figure, it is likely that there will be some who are willing to use extreme measures to identify that individual.
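One possible way to document the assessment described in this section is a simple structured record. The sketch below is an assumption, not a prescribed format: the class name, the field names and the default 365-day review interval are all hypothetical. It records the listed factors and flags when the decision is due for review.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class IdentifiabilityAssessment:
    """Hypothetical record of an identifiability assessment."""
    dataset: str
    assessed_on: date
    cost_and_time: str          # cost and time required for identification
    available_technology: str   # technology available at the time of processing
    likely_developments: str    # likely technological developments
    conclusion: str
    review_after_days: int = 365  # assumed interval; choose one that suits the data

    def review_due(self, today: date) -> bool:
        """True once the assumed review interval has elapsed."""
        return today >= self.assessed_on + timedelta(days=self.review_after_days)


assessment = IdentifiabilityAssessment(
    dataset="visitor counts per hour",
    assessed_on=date(2024, 1, 1),
    cost_and_time="high: no lookup source is known for the identifiers held",
    available_technology="no readily available matching service identified",
    likely_developments="monitor changes to publicly available records",
    conclusion="individuals not reasonably likely to be identified",
)
```

Calling `assessment.review_due(date.today())` then indicates whether the conclusion should be revisited in light of new technology or newly available records.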