What do we mean by genomics and genomic data in this report?
Genomic research uses and shares data from the whole of a person’s DNA sequence and structure, rather than an individual gene. Using this definition provides a very clear link to the definition of genetic data as it exists under the UK GDPR, with implications as to how and when this may be considered personal data as well as special category personal data.
Each person’s DNA sequence is represented by some 6.4 billion bases (or letters) in their genome. Genomics refers to the study of the genes within this sequence, grouped into either coding or non-coding genes. 10 Some genes determine traits such as eye colour or blood type. Others contribute to complex traits and characteristics that reflect the interaction of multiple genes (polygenic), variations in a DNA sequence and the impact of environment. Common complex traits are highly polygenic and influenced by thousands of common DNA variants.
Other factors that can alter the instructions our genes provide include mutations that have positive, negative or neutral effects on our characteristics. These may be inherited or acquired during a person’s lifetime. Access to the information encoded in DNA can be shifted or changed through epigenetic factors and quasi-environmental factors. At its largest scale, functional genomics can be considered a study of multiomics, encompassing not only DNA but also its possible modulations, RNA transcripts and modifications, and protein structures, as well as how these interact with their environment.
Genetics under the UK GDPR
While genomic data is not itself referenced in UK data protection law, genetic data is. Under Article 4(13) of the UK GDPR, “genetic data” is defined as:
“personal data relating to the inherited or acquired genetic characteristics of a natural person which give unique information about the physiology or the health of that natural person and which result, in particular, from an analysis of a biological sample from the natural person in question.”
Recital 34 of the UK GDPR provides further context for understanding the definition set out in Article 4(13), noting that organisations should interpret it as including any type of analysis that enables them to obtain equivalent information. For example, this would cover analysis of ribonucleic acid (RNA), which plays an essential part in the coding, decoding, regulation and expression of genes.
For the purposes of this report, we will broadly consider that genomic data can be substituted for genetic data and is therefore always considered special category data under Article 9 of the UK GDPR. However, key issues remain around when genomic data is personal information and when inferences drawn from genomic data may fall into other categories such as biometric, health or simply personal (rather than Article 9 special category) data.
For the purposes of this report, we will define genomic data as:
“personal data relating to the inherited or acquired genomic characteristics of a natural person which can give unique information about the physiology, characteristics or the health of that person and which result, in particular, from an analysis of the interplay of the genetic information of a natural person and their environment.”
This definition has been designed to encompass as broad a spread of uses of genomic information as possible, including:
- epigenetic information; and
- polygenic risk scores.
Some information, such as polygenic risk scores, may be considered second order data, which raises the question of whether it should be treated as special category data or as genomic data. Recital 35 of the UK GDPR suggests that health data should be interpreted as including:
“information derived from the testing or examination of a body part or bodily substance, including from genetic data and biological samples.”
This may cover certain uses of the information being considered. Therefore, our definition of genomic data will allow us to examine a variety of risks and challenges in the context of privacy and data protection.
This report does not try to set out the enormous depth and variety of genomic technologies. 11 Instead, it focuses on the uses of personal information rather than its creation.
10 Completing the human genome sequence
11 Single nucleotide polymorphism arrays: a decade of biological, computational and technological advances