Glossary
A
Comparing two versions of something to determine which performs better.
AI designed to act with autonomy, breaking down goals into steps and executing them.
The European regulation classifying AI systems by risk level and setting rules for their use.
An AI system that can plan, use tools, and take actions autonomously to achieve a goal.
Systematic unfairness in AI outputs, often inherited from biased training data.
The policies and oversight ensuring AI systems are used safely and in line with organisational values.
A defined sequence of steps a computer follows to solve a problem or perform a task.
The systematic analysis of data to discover patterns and support decisions.
A set of rules allowing different software systems to communicate and exchange data.
Systems designed to perform tasks that normally require human intelligence, such as reasoning or perception.
B
Datasets so large, fast-moving, or varied that traditional tools cannot process them efficiently.
A model whose internal decision-making process is difficult or impossible to interpret.
Tools and practices that turn data into dashboards, reports, and insights for business users.
C
A program that interacts with users through conversation, often powered by language models.
A task where a model assigns inputs to predefined categories, such as spam or not spam.
Delivering computing services such as storage and processing over the internet, on demand.
Grouping similar data points together without predefined categories.
Adhering to laws, regulations, and internal policies governing data use.
The field of AI enabling machines to interpret and analyse images and videos.
The maximum amount of text a language model can consider at once.
A statistical relationship between two variables, which does not necessarily imply causation.
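As a minimal sketch of the last definition above, the snippet below computes a Pearson correlation coefficient by hand. The figures are invented for illustration only, and the helper name `pearson` is an assumption, not part of any library. The two series move together closely, yet neither causes the other: in this made-up scenario both would plausibly be driven by hot weather.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of the two series around their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical weekly figures, invented for this example
ice_cream_sales = [20, 35, 50, 65, 80]
sunburn_cases = [2, 4, 5, 7, 9]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation: {r:.3f}")  # close to 1, yet ice cream does not cause sunburn
```

A value near 1 or -1 indicates a strong linear relationship, and a value near 0 indicates little or none. Establishing causation needs more than this number, for example a controlled experiment.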
D
A visual interface displaying key metrics and indicators at a glance.
Raw facts, figures, or observations that have not yet been processed to produce meaning.
The process of building a shared data culture and mindset across an organisation.
Removing or altering personal identifiers so individuals can no longer be identified.
An organised inventory of an organisation's data assets, making them easy to find and understand.
The framework of policies, roles, and processes ensuring data is managed responsibly across an organisation.
The process of collecting and importing data from various sources into a storage system.
A storage system holding raw data in its original format until it is needed.
An architecture combining the flexibility of a data lake with the management features of a warehouse.
The documented journey of data, showing where it comes from and how it is transformed.
The ability to read, understand, question, and communicate with data.
A decentralised approach where business domains own and serve their data as products.
Exploring large datasets to discover hidden patterns and relationships.
The person accountable for a data asset, its access rules, and its business value.
An automated sequence of steps that moves and transforms data from source to destination.
A single unit of information, such as one measurement or one field of a customer record.
The protection of personal information and individuals' rights over how their data is used.
The degree to which data is accurate, complete, consistent, and fit for its intended use.
Rules defining how long data is kept before being archived or deleted.
The measures protecting data against unauthorised access, corruption, or theft.
A person responsible for the quality, documentation, and proper use of specific data assets.
Combining data, visuals, and narrative to communicate insights in a compelling way.
Representing data graphically through charts, maps, or diagrams to make it easier to understand.
A central repository storing structured data from multiple sources, optimised for analysis and reporting.
An organised system for storing, managing, and retrieving data electronically.
A structured collection of related data, usually organised in rows and columns.
A type of machine learning using multi-layered neural networks to learn complex patterns.
Analysis that explains what happened, based on historical data.
The integration of digital technologies, including data and AI, to reshape how an organisation operates and delivers value.
E
A modern variant of the extract-transform-load process in which raw data is loaded first and then transformed inside the target system.
A numerical representation of text or other data that captures its meaning for machines.
A process that extracts data, transforms it into a usable format, then loads it into a target system.
The ability to understand and explain how an AI model reaches its decisions.
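The two data-movement definitions in this section differ only in where the transformation happens. The sketch below illustrates that difference using Python's built-in `sqlite3` module as a stand-in target system. The sample rows, the table names, and the function names `transform_then_load` and `load_then_transform` are all assumptions made for this example.

```python
import sqlite3

# Hypothetical raw records from a source system: untidy names, amounts as text
raw_rows = [(" Alice ", "12.50"), ("bob", "7.00"), ("Carol  ", "3.25")]

def transform_then_load(rows):
    """Extract, transform in application code, then load the clean result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    cleaned = [(name.strip().upper(), float(amount)) for name, amount in rows]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", cleaned)
    return conn.execute(
        "SELECT customer, amount FROM sales ORDER BY customer"
    ).fetchall()

def load_then_transform(rows):
    """Load the raw data as-is, then transform it with SQL inside the target."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_sales (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TABLE sales AS "
        "SELECT UPPER(TRIM(customer)) AS customer, "
        "CAST(amount AS REAL) AS amount FROM raw_sales"
    )
    return conn.execute(
        "SELECT customer, amount FROM sales ORDER BY customer"
    ).fetchall()

print(transform_then_load(raw_rows))
print(load_then_transform(raw_rows))
```

Both routes end with the same clean table. The second also keeps the untouched raw table in the target system, so the data can be transformed again later in a different way.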
F
The input variables a model uses to make predictions.
A model performing a task after seeing only a handful of examples in the prompt.
Further training a pre-trained model on specific data to specialise it for a task.
A large model trained on broad data, adaptable to many downstream tasks.
G
The European regulation governing how organisations collect, store, and process personal data.
AI systems that create new content such as text, images, code, or audio.
H
When an AI model generates information that sounds plausible but is false or invented.
Keeping humans involved in AI decisions to validate, correct, or override outputs.
I
Using a trained model to make predictions on new, unseen data.
Data that has been organised and contextualised so it becomes useful for decision-making.
K
A measurable value showing how effectively an objective is being achieved.
L
The correct answer attached to a training example in supervised learning.
A model trained on massive amounts of text to understand and generate human language.
M
A branch of AI where systems learn patterns from data instead of following explicit rules.
The core reference data of an organisation, such as customers, products, or suppliers.
Data that describes other data, such as a file's author, creation date, or format.
The practices and tools for deploying, monitoring, and maintaining machine learning models in production.
The output of training an algorithm on data, capable of making predictions on new inputs.
The degradation of a model's performance over time as real-world data changes.
AI capable of processing several types of input, such as text, images, and audio together.
N
A computing model inspired by the human brain, made of interconnected layers of artificial neurons.
The field of AI enabling machines to understand and generate human language.
O
An AI model whose weights are publicly available for anyone to use or adapt.
When a model memorises training data too closely and performs poorly on new data.
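To make the memorisation problem described above concrete, here is a small sketch with invented numbers. A "memoriser" that simply returns the answer of the nearest training example is compared with a straight-line fit. The data, the underlying rule (y = 2x plus noise), and all function names are assumptions chosen for illustration.

```python
# Hypothetical training data: roughly y = 2x, with some noise added by hand
train_x = [1, 2, 3, 4, 5, 6]
train_y = [2.9, 3.2, 6.8, 7.1, 10.9, 11.2]

# New, unseen inputs with their noise-free answers
test_x = [1.2, 2.2, 3.2, 4.2, 5.2]
test_y = [2 * x for x in test_x]

def memoriser(x):
    """Return the stored answer of the closest training example."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

def fit_line(xs, ys):
    """Ordinary least-squares fit of a straight line y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(train_x, train_y)

def line(x):
    return slope * x + intercept

def mse(model, xs, ys):
    """Mean squared error of a model's predictions."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

for name, model in [("memoriser", memoriser), ("line", line)]:
    print(name, "train:", mse(model, train_x, train_y),
          "test:", mse(model, test_x, test_y))
```

The memoriser scores a perfect zero error on the training data because it has also stored the noise, and it does worse than the simpler line on the new inputs.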
P
Analysis that uses historical data to forecast what is likely to happen.
Analysis that recommends actions to take based on predicted outcomes.
The instruction or question given to a generative AI model to produce a response.
The practice of crafting effective prompts to get better results from AI models.
R
A technique where a model retrieves relevant documents before generating its answer.
A task where a model predicts a continuous numerical value, such as a price.
Training an agent through trial and error, using rewards and penalties.
The regular production of structured documents presenting data on activities and performance.
Developing and using AI in ways that are ethical, fair, transparent, and accountable.
S
Tools enabling business users to explore data and build reports without technical help.
Data with some organisational markers but no rigid schema, such as JSON or XML files.
The standard language used to query and manipulate data in relational databases.
Data organised in a predefined format, such as tables with rows and columns.
Training a model on labelled examples where the correct answer is provided.
Instructions, usually hidden from the user, that define an AI assistant's behaviour, tone, and boundaries.
T
A setting controlling how creative or predictable a model's responses are.
A small unit of text, such as a word or word fragment, that language models process.
The process of teaching a model by exposing it to data and adjusting its parameters.
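The first definition in this section, the setting that trades predictability against creativity, is commonly implemented by dividing a model's raw scores by a temperature value before converting them to probabilities. The sketch below shows that common formulation. The candidate words and their scores are invented, and real systems may apply further sampling rules on top.

```python
import math

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    highest = max(scaled)  # subtracted for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three candidate next words
candidates = ["cat", "dog", "piano"]
scores = [2.0, 1.0, 0.1]

cold = softmax_with_temperature(scores, 0.2)  # predictable: top word dominates
hot = softmax_with_temperature(scores, 2.0)   # varied: probabilities even out

for word, p_cold, p_hot in zip(candidates, cold, hot):
    print(f"{word}: {p_cold:.3f} at low temperature, {p_hot:.3f} at high")
```

At the low setting almost all of the probability goes to the top-scoring word, so responses repeat. At the high setting the less likely words are chosen more often, which reads as more creative but also less reliable.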
U
When a model is too simple to capture the patterns in the data.
Data without a predefined format, such as emails, images, videos, or free text.
Training a model on unlabelled data to discover hidden structures or groups.
Z
A model performing a task without having seen any examples of it.