The UK’s independent authority set up to uphold information rights in the public interest, promoting openness by public bodies and data privacy for individuals.

AI development tools A service that allows clients to build and run their own models, with data they have chosen to process, but using the tools and infrastructure provided to them by a third party.
AI prediction as a service A service that provides live prediction and classification services to customers.
Application Programming Interface (API) A computing interface which defines interactions between multiple software intermediaries.
Automation bias Where human users routinely rely on the output generated by a decision-support system and stop using their own judgement or stop questioning whether the output might be wrong.
Black box A system, device or object that can be viewed in terms of its inputs and outputs, without any knowledge of its internal workings.
Black box attack Where an attacker has the ability to query a model and observe the relationships between inputs and outputs but does not have access to the model itself.
Black box problem The problem of explaining a decision made by an AI system in a way that the average person can understand.
Concept/model drift Where the domain in which an AI system is used changes over time in unforeseen ways leading to the outputs becoming less statistically accurate.
Constrained optimisation A number of mathematical and computer science techniques that aim to find the optimal solutions for minimising trade-offs in AI systems.
Deep learning A subset of machine learning where systems ‘learn’ to detect features that are not explicitly labelled in the data.
Differential privacy A system for publicly sharing information about a dataset by describing the patterns of groups within the dataset while withholding information about individuals in the dataset.
False negative (‘type II’) error When an AI system incorrectly labels cases as negative when they are positive.
False positive (‘type I’) error When an AI system incorrectly labels cases as positive when they are negative.
Feature selection The process of selecting a subset of relevant features for use in developing a model.
Federated learning A technique which allows multiple different parties to train models on their own data (‘local’ models). They then combine some of the patterns that those models have identified into a single, more accurate ‘global’ model, without having to share any training data with each other.
‘K-nearest neighbours’ (KNN) models An approach to data classification that estimates how likely a data point is to be a member of one group or the other depending on what group the data points nearest to it are in. KNN models contain some of the training data in the model itself.
Lack of interpretability Where an AI system’s outputs are difficult for a human reviewer to interpret.
Local Interpretable Model-agnostic Explanation (LIME) An approach to low interpretability which provides an explanation of a specific output rather than the model in general.
Machine learning (ML) The set of techniques and tools that allow computers to ‘think’ by creating mathematical algorithms based on accumulated data.
Membership inference attack An attack which allows actors to deduce whether a given individual was present in the training data of a machine learning model.
Model inversion attack An attack where attackers already have access to some personal data belonging to specific individuals in the training data, but can also infer further personal information about those same individuals by observing the inputs and outputs of the machine learning model.
Perturbation Where the values of data points belonging to individuals are changed at random whilst preserving some of the statistical properties of those features in the overall dataset.
Precision The percentage of cases identified as positive that are in fact positive (also called ‘positive predictive value’).
Pre-processing The process of transforming data prior to using it for training a statistical model.
Privacy enhancing technologies (PETs) A broad range of technologies that are designed for supporting privacy and data protection.
Programming language A formal language comprising a set of instructions that produce various kinds of outputs, used in computer programming to implement algorithms.
Query A request for data or information from a database table or combination of tables.
Recall (or sensitivity) The percentage of all cases that are in fact positive that are identified as such.
Statistical accuracy The proportion of answers that an AI system gets correct.
Supervised machine learning A machine learning task of learning a function that maps an input to an output based on examples of correctly labelled input-output pairs.
Support Vector Machines (SVMs) A method of separating out classes by using a line (or hyperplane) to divide a plane into parts, where each class lies on one side or the other.
‘Virtual machines’ or ‘containers’ Emulations of a computer system that run inside, but are isolated from, the rest of an IT system.
‘White box’ attack Where an attacker has complete access to the model itself, and can inspect its underlying code and properties. White box attacks allow additional information to be gathered (such as the type of model and parameters used) which could help an attacker infer personal data from the model.
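
The entries above for false positive and false negative errors, precision, recall and statistical accuracy can all be calculated from the same four counts. The following is a minimal illustrative sketch only, assuming a binary classifier whose labels are coded as 1 (positive) and 0 (negative). The names `ConfusionCounts`, `count_outcomes` and the example labels are hypothetical and chosen for illustration; they do not come from any particular library or from real data.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts of the four possible outcomes for a binary classifier."""
    true_positive: int = 0
    false_positive: int = 0   # 'type I' error: labelled positive, actually negative
    false_negative: int = 0   # 'type II' error: labelled negative, actually positive
    true_negative: int = 0


def count_outcomes(actual, predicted):
    """Compare actual and predicted labels (1 = positive, 0 = negative)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = ConfusionCounts()
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            counts.true_positive += 1
        elif p == 1 and a == 0:
            counts.false_positive += 1
        elif p == 0 and a == 1:
            counts.false_negative += 1
        else:
            counts.true_negative += 1
    return counts


def _ratio(numerator, denominator):
    """Divide, refusing to report a metric that is undefined."""
    if denominator == 0:
        raise ValueError("metric is undefined: no cases in the denominator")
    return numerator / denominator


def precision(c):
    """Share of cases identified as positive that are in fact positive."""
    return _ratio(c.true_positive, c.true_positive + c.false_positive)


def recall(c):
    """Share of all cases that are in fact positive that are identified as such."""
    return _ratio(c.true_positive, c.true_positive + c.false_negative)


def statistical_accuracy(c):
    """Share of all answers that the system gets correct."""
    total = (c.true_positive + c.false_positive
             + c.false_negative + c.true_negative)
    return _ratio(c.true_positive + c.true_negative, total)


# Hypothetical example: ten cases, four of which are actually positive.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

counts = count_outcomes(actual, predicted)
print(counts)
print("precision:", precision(counts))
print("recall:", recall(counts))
print("statistical accuracy:", statistical_accuracy(counts))
```

In this made-up example the system makes two false positive errors and one false negative error, so precision (3 of 5) is lower than recall (3 of 4), while statistical accuracy (7 of 10) sits between them. This shows why a single accuracy figure may hide which type of error a system tends to make.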