AI Glossary: Artificial Intelligence Terms Explained by TCB


AI Glossary

Clear definitions for the most important terms in artificial intelligence, from foundational concepts to frontier research.

20 terms. Updated: June 21, 2026

Agent (AI): An AI system that perceives its environment, makes decisions, and takes actions to achieve a goal. Unlike a static model that responds to prompts, an AI agent can plan multi-step tasks, use tools, and operate autonomously over time.

Benchmark: A standardized test used to measure and compare AI model performance across tasks such as reasoning, coding, or factual recall. Benchmarks help track progress, but scores can be inflated when test items leak into training data or models are tuned to the test, so real-world evaluation remains essential.

Computer Vision: A branch of AI that enables machines to interpret and act on visual information from images and video. Applications range from medical imaging to autonomous vehicles and real-time object detection.
Context Window: The maximum amount of text, measured in tokens, that a language model can process in a single interaction, typically counting both the prompt and the generated output. Models with larger context windows can analyze longer documents and maintain coherent reasoning across extended conversations.

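Diffusion Model: A generative AI architecture that creates images, audio, or video by learning to reverse a process that gradually adds noise to training data. Generation starts from random noise and removes it step by step. Diffusion models underpin most modern image generators, including Stable Diffusion and recent versions of DALL-E.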
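
The noising half of that process can be sketched in a few lines. This is a minimal illustration, assuming the common formulation in which a clean sample is blended with Gaussian noise according to a signal-retention factor; the names add_noise and alpha_bar are illustrative and do not come from any particular library. A real diffusion model would additionally train a neural network to predict the noise so that it can be removed.

```python
import math
import random

def add_noise(x0, alpha_bar, rng):
    """Blend a clean sample x0 with Gaussian noise.

    alpha_bar is the fraction of signal variance kept:
    1.0 leaves x0 untouched, 0.0 yields pure noise.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    noisy = [
        math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * n
        for x, n in zip(x0, noise)
    ]
    return noisy, noise

rng = random.Random(0)            # fixed seed so runs are repeatable
clean = [1.0, -1.0, 0.5]          # stand-in for pixel values
slightly_noisy, _ = add_noise(clean, 0.9, rng)
mostly_noise, _ = add_noise(clean, 0.1, rng)
```

Training pairs each noisy sample with the noise that produced it, which is why the function returns both.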

Embeddings: Numerical vector representations of text, images, or other data that capture semantic meaning. Because semantically similar items map to nearby vectors, the distance between embeddings can be used to measure similarity between concepts, enabling search, recommendation, and retrieval-augmented generation.
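
As a rough illustration of how similarity between embeddings is measured, the sketch below uses cosine similarity, one common choice. The three-dimensional vectors are made up for the example; real embeddings come from a trained model and usually have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for illustration only.
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.85, 0.15, 0.05],
    "invoice": [0.0, 0.2, 0.95],
}

def most_similar(query, candidates):
    """Pick the candidate whose embedding is closest to the query's."""
    return max(
        candidates,
        key=lambda name: cosine_similarity(embeddings[query], embeddings[name]),
    )

print(most_similar("cat", ["kitten", "invoice"]))  # prints "kitten"
```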

Fine-tuning: The process of further training a pre-trained foundation model on a domain-specific dataset to improve its performance on targeted tasks. Fine-tuning is more efficient than training from scratch but requires curated data.
Foundation Model: A large AI model trained on broad data at scale that can be adapted to a wide range of downstream tasks. GPT, Claude, and Gemini are examples of foundation models underlying most commercial AI products.

Hallucination: When an AI model generates plausible-sounding but factually incorrect or fabricated information. Hallucinations are a known failure mode of language models, so outputs should be verified before use in high-stakes settings.

Inference: The process of running a trained AI model to generate outputs from new inputs. Inference costs scale with model size and token volume, making efficiency a key factor in commercial AI deployment.
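
To make the link between token volume and cost concrete, here is a back-of-the-envelope calculation. It assumes per-token pricing quoted per million tokens with separate input and output rates, which is a common scheme, and the prices used are placeholders rather than any vendor's actual rates.

```python
def inference_cost(input_tokens, output_tokens,
                   price_in_per_million, price_out_per_million):
    """Estimate the cost of a workload under per-million-token pricing."""
    total = (input_tokens * price_in_per_million
             + output_tokens * price_out_per_million)
    return total / 1_000_000

# Placeholder prices (currency units per million tokens), not real rates.
cost = inference_cost(
    input_tokens=500_000,
    output_tokens=250_000,
    price_in_per_million=2.0,
    price_out_per_million=8.0,
)
print(cost)  # prints 3.0
```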

LLM: Large Language Model. A type of AI model trained on massive text corpora that can generate, summarize, translate, and reason about language. LLMs are the core technology behind chatbots, coding assistants, and AI-powered search.

Multimodal: An AI model capable of processing and generating multiple types of data, such as text, images, audio, and video, within a single system. Multimodal models can answer questions about images or generate images from text descriptions.

Neural Network: A computational system loosely inspired by biological brains, composed of layers of connected nodes that learn patterns from data by adjusting connection weights during training.
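
The idea of adjusting connection weights during training can be shown with the smallest possible case: a single node with one weight and one bias, trained by gradient descent on squared error. This is a toy sketch, not how production networks are built; real networks stack many such nodes in layers, add nonlinear activations, and rely on automatic differentiation. The data and hyperparameters are chosen only for illustration.

```python
def train_neuron(samples, learning_rate=0.05, epochs=500):
    """Fit prediction = weight * x + bias by stochastic gradient descent."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            prediction = weight * x + bias
            error = prediction - target
            # Move each parameter against the gradient of 0.5 * error**2.
            weight -= learning_rate * error * x
            bias -= learning_rate * error
    return weight, bias

# Toy data drawn from y = 2x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
weight, bias = train_neuron(samples)
```

After training, weight and bias end up close to 2 and 1, the values that generated the data.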

Prompt Engineering: The practice of crafting inputs to AI models to elicit accurate, useful, or creative outputs. Effective prompt engineering can significantly improve model performance without modifying the underlying model.

RAG: Retrieval-Augmented Generation. A technique that enhances AI responses by retrieving relevant documents from an external knowledge base and supplying them to the model before it generates an answer. RAG can reduce hallucinations and keep model outputs current without retraining.
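
The retrieve-then-generate flow of RAG can be sketched as follows. To stay self-contained, this example ranks documents by simple word overlap, whereas real systems typically rank by embedding similarity, and it stops at building the prompt rather than calling a model. The function names, documents, and prompt wording are all hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Return the top_k documents sharing the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, documents, top_k=1):
    """Place the retrieved text ahead of the question for the model."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

documents = [
    "The refund window is 30 days from delivery.",
    "Support is available on weekdays from 9 to 5.",
]
question = "How long is the refund window?"
prompt = build_prompt(question, documents)
```

The resulting prompt would then be sent to a language model, which answers from the retrieved text rather than from memory alone.
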
RLHF: Reinforcement Learning from Human Feedback. A training method where human raters score model outputs, and those scores are used to fine-tune the model toward more helpful, accurate, and safe responses. RLHF is central to aligning LLMs with human preferences.

Synthetic Data: Artificially generated data used to train or test AI models when real-world data is scarce, sensitive, or biased. Synthetic data lets teams build diverse training sets with reduced privacy risk, though it can carry over flaws from the process that generated it.

Token (AI): The basic unit of text processed by a language model. Tokens roughly correspond to word fragments (about 4 characters of English text on average, though the ratio varies by tokenizer and language). Model costs and context limits are typically measured in tokens.
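
The 4-characters-per-token rule of thumb can be turned into a quick estimate, for example to check whether a document is likely to fit within a context limit. This is only a heuristic sketch; exact counts depend on the model's tokenizer, and the function names here are illustrative.

```python
import math

def estimate_tokens(text, chars_per_token=4):
    """Approximate the token count from character length."""
    return math.ceil(len(text) / chars_per_token)

def fits_in_context(text, context_window_tokens, reserved_for_output=0):
    """Check whether the text, plus room for a reply, fits the limit."""
    needed = estimate_tokens(text) + reserved_for_output
    return needed <= context_window_tokens

document = "a" * 400  # 400 characters, so roughly 100 tokens
print(estimate_tokens(document))  # prints 100
```
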
Transformer: The neural network architecture that underpins most modern large language models. Transformers use attention mechanisms to weigh the relevance of every token in a sequence when processing each token, enabling sophisticated language understanding.
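
The attention step can be written out directly. The sketch below implements scaled dot-product attention in plain Python for readability: each query is scored against every key, the scores are turned into weights with softmax, and the output is the weighted average of the values. A real Transformer adds learned projections, multiple heads, and runs on tensors; the small vectors here are made up.

```python
import math

def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs

# Two tokens with identical keys receive equal weight,
# so the output is the plain average of their values.
result = attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[0.0, 2.0], [4.0, 6.0]],
)
```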

Zero-shot Learning: The ability of an AI model to perform a task it was never explicitly trained on, using only a description of the task in the prompt and no worked examples. Zero-shot capability is a hallmark of powerful foundation models.
