Glossary Wiki

A

Agent: An AI system that can take actions on its own, such as using tools, browsing, or running code, to complete a goal rather than just answering in text.
AGI (Artificial general intelligence): A hypothetical AI that can learn and perform any intellectual task a human can, rather than being good at just one narrow job.
Alignment: The work of making sure an AI system actually does what people intend and holds to human values, especially as it becomes more capable.
Artificial intelligence (AI): Software that performs tasks we usually associate with human intelligence, like understanding language, recognising images, or making decisions.
Attention: The mechanism inside a transformer that lets a model weigh which earlier tokens matter most when predicting the next one.
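As a rough illustration of the Attention entry, the sketch below implements scaled dot-product attention for a single query in plain Python. The function names and the toy vectors are assumptions made for this example; real transformers apply the same idea to large matrices with learned projections.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector (illustrative only)."""
    scale = math.sqrt(len(query))
    # One score per key: how well does this key match the query?
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy data: three earlier tokens, each with a key and a value vector.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attend([1.0, 0.0], keys, values)
```

Here the first and third keys match the query equally well, so they receive equal weight, and the second key receives less.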

B

Benchmark: A standard test or dataset used to measure and compare how well AI models perform on a task.
Bias: Systematic skew in a model's outputs, usually learned from patterns in its training data, that can make results unfair or inaccurate.

C

Chain of thought: A prompting style, or model behaviour, where the AI works through a problem step by step instead of jumping straight to an answer.
Compute: The raw processing power used to train or run an AI model, often measured in floating-point operations (FLOPs) or GPU-hours.
Context window: The maximum amount of text, measured in tokens, that a model can consider at once, including your prompt and its own reply. Anything that falls outside it is no longer visible to the model.
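One simple way to picture the Context window entry is as a sliding cut-off over the most recent tokens. The sketch below is an assumption-laden toy: it treats whitespace-separated words as tokens and drops the oldest ones first, whereas real applications may summarise or select text differently.

```python
def fit_to_window(tokens, window_size):
    """Keep only the most recent `window_size` tokens (hypothetical helper)."""
    if window_size <= 0:
        return []
    return tokens[-window_size:]

# Toy "tokens": words split on whitespace, not a real tokenizer.
conversation = "the quick brown fox jumps over the lazy dog".split()
visible = fit_to_window(conversation, 4)
print(visible)  # ['over', 'the', 'lazy', 'dog']
```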

D

Diffusion model: A model that creates images, audio, or text by starting from random noise and gradually refining it into a clear result.

E

Embedding: A list of numbers that represents the meaning of a word, sentence, or image so a computer can compare how similar two things are.
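Comparing embeddings is commonly done with cosine similarity, which the sketch below computes in plain Python. The three-number vectors are made up for illustration; real embeddings come from a trained model and usually have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for this example.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]

similar = cosine_similarity(cat, kitten)   # close to 1
different = cosine_similarity(cat, car)    # close to 0
```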

F

Fine-tuning: Taking an already trained model and training it a bit more on a smaller, specific dataset so it is better at a particular task or style.
Foundation model: A large model trained on broad data that can be adapted to many different tasks, serving as a base others build on.

G

GAN (Generative adversarial network): A setup where two networks compete, one creating fake data and one judging it, which pushes the creator to produce realistic results.
GPU (Graphics processing unit): A chip originally built for graphics that turned out to be ideal for the parallel maths behind training and running AI models.
Guardrails: Rules and filters added around a model to stop it producing harmful, unsafe, or off-limits responses.

H

Hallucination: When an AI states something false or made up as if it were true, because it predicts plausible text rather than checking facts.

I

Inference: The act of running a trained model to get an answer. This is what happens every time you send it a prompt.

L

Large language model (LLM): A model trained on huge amounts of text that predicts and generates language, powering tools like chatbots and writing assistants.

M

Machine learning: A branch of AI where systems learn patterns from data and improve with experience, instead of being programmed with fixed rules.
Mixture of experts (MoE): A model design that routes each input to a few specialised sub-networks instead of using the whole model, saving compute.
Multimodal: An AI that can work with more than one kind of input or output, such as text, images, audio, and video together.
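The routing idea in the Mixture of experts entry can be sketched as top-k gating: score every expert, keep the best few, and blend only their outputs. Everything named below (the helper functions, the toy experts, the gate scores) is an assumption for illustration; real MoE layers learn the gate and use neural sub-networks as experts.

```python
import math

def top_k_route(gate_scores, k=2):
    """Pick the k highest-scoring experts and normalise their weights to sum to 1."""
    order = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    chosen = order[:k]
    highest = max(gate_scores[i] for i in chosen)
    exps = {i: math.exp(gate_scores[i] - highest) for i in chosen}
    total = sum(exps.values())
    return {i: exps[i] / total for i in chosen}

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs by routing weight."""
    routing = top_k_route(gate_scores, k)
    return sum(weight * experts[i](x) for i, weight in routing.items())

# Toy experts: plain functions standing in for sub-networks.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2, lambda x: -x]
gate_scores = [0.1, 2.0, 2.0, -1.0]  # hypothetical gate output for one input

routing = top_k_route(gate_scores, k=2)
result = moe_forward(3.0, experts, gate_scores, k=2)
```

Only two of the four experts run for this input, which is where the compute saving comes from.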

N

Neural network: A model loosely inspired by the brain, made of layers of connected units that adjust as they learn from data.

O

Open weights: A model whose trained parameters are released publicly so anyone can download, run, and adapt it, though the training data may not be shared.
Overfitting: When a model memorises its training data too closely and performs well there but poorly on new, unseen examples.
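A toy way to see the Overfitting entry in numbers is to compare a "model" that only memorises its training pairs with a simple straight-line fit. The data, the fallback rule, and the function names below are invented for this sketch; it is an exaggerated illustration rather than how real models overfit.

```python
def mean_squared_error(predict, xs, ys):
    """Average squared gap between predictions and true values."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Made-up data that roughly follows y = 2x.
train_x = [0, 1, 2, 3, 4]
train_y = [0.1, 1.9, 4.2, 5.8, 8.1]
test_x = [5, 6, 7]
test_y = [10.0, 12.1, 13.9]

# "Memoriser": perfect recall of training pairs, falls back to the mean otherwise.
lookup = dict(zip(train_x, train_y))
mean_y = sum(train_y) / len(train_y)

def memoriser(x):
    return lookup.get(x, mean_y)

# Simple alternative: a least-squares straight line.
mean_x = sum(train_x) / len(train_x)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
         / sum((x - mean_x) ** 2 for x in train_x))
intercept = mean_y - slope * mean_x

def line(x):
    return slope * x + intercept

memoriser_train = mean_squared_error(memoriser, train_x, train_y)
memoriser_test = mean_squared_error(memoriser, test_x, test_y)
line_test = mean_squared_error(line, test_x, test_y)
```

The memoriser has zero error on the data it has seen but a large error on the unseen points, while the line generalises well.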

P

Parameters: The internal numbers a model adjusts during training. More parameters can mean more capacity, and a size like 70B means the model has about 70 billion of them.
Pre-training: The first, broad training stage where a model learns general patterns from a very large dataset before any task-specific tuning.
Prompt: The instruction or question you give an AI model to tell it what you want.
Prompt engineering: The craft of wording and structuring prompts to get more useful, accurate, or reliable results from a model.

Q

Quantization: Shrinking a model by storing its numbers at lower precision, so it uses less memory and runs on smaller hardware, with a small quality trade-off.
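The sketch below shows one simple form of quantization, symmetric 8-bit rounding with a single scale factor, plus the back-of-envelope memory arithmetic behind it. The function names and sample weights are assumptions for this example; production schemes are usually more elaborate (per-channel or per-group scales, 4-bit formats, and so on).

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]

def weight_memory_gb(num_parameters, bits_per_weight):
    """Memory for the weights alone, in gigabytes (10^9 bytes)."""
    return num_parameters * bits_per_weight / 8 / 1e9

weights = [0.5, -1.27, 0.003, 1.0]  # made-up values
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# 70 billion parameters: about 140 GB at 16 bits, about 70 GB at 8 bits.
size_16_bit = weight_memory_gb(70e9, 16)
size_8_bit = weight_memory_gb(70e9, 8)
```

The rounding error per weight is at most half the scale factor, which is the small quality trade-off the entry refers to.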

R

RAG (Retrieval-augmented generation): A technique where a model looks up relevant documents and uses them to answer, so it can cite sources and stay current.
Reasoning model: A model trained to spend extra steps thinking through a problem before answering, which improves results on maths, coding, and logic.
Red teaming: Deliberately probing a model to find weaknesses, unsafe outputs, or ways it can be misused, so they can be fixed before release.
Reinforcement learning: Training a model through trial and error, rewarding good outcomes and penalising bad ones, so it learns a useful strategy.
RLHF (Reinforcement learning from human feedback): A training method that uses human ratings of model answers to teach it to be more helpful, honest, and safe.

S

Scaling laws: The observed pattern that model performance improves predictably as you increase data, parameters, and compute together.
Superintelligence: A hypothetical AI that far surpasses the best human minds across essentially all fields.
Synthetic data: Data generated by a model or simulation, rather than collected from the real world, used to train or test other models.

T

Temperature: A setting that controls how random a model's output is. Low values make it focused and predictable; high values make it more varied.
Token: The small chunk of text, often a word piece, that a model reads and generates. Input length, output length, and usage costs are usually counted in tokens.
Tokenization: The step of splitting text into tokens so a model can process it.
Training: The process of feeding data to a model so it adjusts its parameters and learns to perform a task.
Transformer: The neural network architecture behind most modern AI, which uses attention to handle sequences like text efficiently.
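To make the Temperature entry concrete, the sketch below divides a model's raw scores (logits) by the temperature before turning them into probabilities. The logits and the function name are assumptions for illustration; this is the commonly described approach, and individual systems may combine it with other sampling settings.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
focused = softmax_with_temperature(logits, 0.5)  # low temperature
varied = softmax_with_temperature(logits, 2.0)   # high temperature
```

At the low temperature the top candidate takes a larger share of the probability; at the high temperature the three candidates are closer together, so sampling is more varied.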

V

Vector database: A database built to store embeddings and quickly find the items most similar in meaning to a query. Often used with RAG.
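The core operation of a vector database, finding the stored items closest to a query, can be sketched with a brute-force search. The `TinyVectorStore` class and its vectors are hypothetical; real vector databases typically use approximate nearest-neighbour indexes so they do not have to scan every item.

```python
import heapq
import math

def _normalise(vector):
    """Scale a vector to length 1 so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vector]

class TinyVectorStore:
    """Hypothetical in-memory store that scans every item on each search."""

    def __init__(self):
        self._items = []

    def add(self, item_id, vector):
        self._items.append((item_id, _normalise(vector)))

    def search(self, query, k=2):
        q = _normalise(query)
        scored = (
            (sum(a * b for a, b in zip(q, vector)), item_id)
            for item_id, vector in self._items
        )
        # Highest similarity first.
        return [item_id for _, item_id in heapq.nlargest(k, scored)]

store = TinyVectorStore()
store.add("cat", [0.9, 0.1, 0.0])
store.add("kitten", [0.8, 0.2, 0.1])
store.add("car", [0.0, 0.1, 0.9])
nearest = store.search([1.0, 0.0, 0.0], k=2)
print(nearest)  # ['cat', 'kitten']
```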

W

Weights: Another word for a model's learned parameters, the values that determine how it turns an input into an output.