A comprehensive reference of 68+ essential AI terms, concepts, and acronyms. From AGI to Zero-Shot Learning.
A mathematical function applied to a neural network node's output that determines whether it should be activated. Common activation functions include ReLU, sigmoid, and tanh. They introduce non-linearity, allowing neural networks to learn complex patterns.
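As a minimal sketch, the three activation functions named above can be written in plain Python (function names here are illustrative):

```python
import math

def relu(x):
    # Rectified Linear Unit: zero for negative inputs, identity otherwise
    return max(0.0, x)

def sigmoid(x):
    # Squashes any real input into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Squashes input into (-1, 1); zero-centered, unlike sigmoid
    return math.tanh(x)
```

ReLU's simple "clip at zero" behavior is one reason it trains faster than sigmoid or tanh in deep networks: its gradient does not vanish for positive inputs.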
A hypothetical form of AI that possesses the ability to understand, learn, and apply knowledge across any intellectual task that a human can perform. Unlike narrow AI, AGI would exhibit flexible, general-purpose reasoning. It remains a theoretical concept and a subject of significant debate in the AI research community.
The research field focused on ensuring that AI systems behave in accordance with human values, intentions, and goals. Alignment is considered one of the most critical challenges in AI safety, especially as systems become more capable and autonomous.
The branch of ethics that examines the moral implications of developing and deploying artificial intelligence systems. Key concerns include bias, fairness, transparency, accountability, privacy, and the societal impact of automation.
A phenomenon where an AI model generates information that sounds plausible but is factually incorrect, fabricated, or nonsensical. Hallucinations are a known limitation of large language models and highlight the importance of human verification of AI outputs.
The ability to understand, use, evaluate, and communicate about artificial intelligence technologies. AI literacy encompasses knowing what AI can and cannot do, recognizing AI-generated content, and understanding the ethical implications of AI deployment.
An AI safety company founded in 2021 by former OpenAI researchers, known for developing the Claude family of AI assistants. Anthropic focuses on building reliable, interpretable, and steerable AI systems with a strong emphasis on safety research.
A set of protocols and tools that allows different software applications to communicate with each other. In the AI context, APIs enable developers to integrate AI capabilities (like text generation, image recognition, or speech-to-text) into their own applications without building models from scratch.
A component in neural network architectures that allows the model to focus on different parts of the input when producing output. The self-attention mechanism in Transformers enables the model to weigh the relevance of each word in a sentence relative to every other word, which is fundamental to how modern LLMs process language.
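A toy scaled dot-product self-attention, using plain Python lists as vectors to show the weighting idea (production implementations use batched matrix operations and learned projection matrices, omitted here):

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(queries, keys, values):
    # Scaled dot-product attention: each output vector is a weighted
    # average of the value vectors, weighted by query-key similarity.
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Blend the value vectors according to the attention weights
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs
```

Because the weights come from a softmax, each output is a convex combination of the value vectors: the model "attends" mostly to the positions whose keys match the query.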
An AI system that can independently perceive its environment, make decisions, and take actions to achieve specified goals without continuous human intervention. Examples include self-driving cars, robotic process automation bots, and AI agents that can browse the web and complete tasks.
The primary algorithm used to train neural networks. It calculates the gradient of the loss function with respect to each weight in the network, then adjusts weights to minimize error. This process of propagating errors backward through the network is what enables deep learning models to improve over time.
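The core loop, compute the gradient of the loss with respect to a weight, then step against it, can be shown with a single-weight model. This is a sketch of gradient descent on one parameter, not a full multi-layer backpropagation implementation:

```python
def train_single_weight(xs, ys, lr=0.1, epochs=100):
    # Fit y = w * x by gradient descent on mean squared error.
    # dL/dw is computed analytically here; backpropagation applies
    # the same chain-rule calculation layer by layer in a deep network.
    w = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad  # step opposite the gradient to reduce the loss
    return w
```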
Systematic errors in AI outputs that arise from prejudiced assumptions in training data, algorithm design, or deployment context. AI bias can lead to unfair outcomes that disproportionately affect certain groups. Addressing bias requires careful data curation, model auditing, and ongoing monitoring.
A prompt engineering technique that instructs an AI model to break down complex reasoning into intermediate steps before arriving at a final answer. By explicitly asking the model to 'think step by step,' this approach significantly improves performance on math, logic, and multi-step reasoning tasks.
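The simplest form of the technique is just appending the reasoning instruction to the question. A sketch (the exact wording is a common convention, not a fixed API):

```python
def build_cot_prompt(question):
    # Appending an explicit reasoning instruction is the minimal
    # form of chain-of-thought prompting.
    return (
        f"Question: {question}\n"
        "Let's think step by step, then state the final answer "
        "on a line starting with 'Answer:'."
    )
```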
A conversational AI product developed by OpenAI, built on the GPT (Generative Pre-trained Transformer) architecture. Launched in November 2022, ChatGPT popularized the use of large language models for general-purpose conversation, writing, coding, analysis, and creative tasks.
A family of AI assistants developed by Anthropic. Claude models are designed with a focus on being helpful, harmless, and honest. They are known for strong performance in analysis, writing, coding, and following nuanced instructions.
A field of AI that enables machines to interpret and understand visual information from the world, including images and videos. Applications include facial recognition, object detection, medical image analysis, autonomous driving, and quality control in manufacturing.
An approach to AI alignment developed by Anthropic where AI systems are trained to follow a set of principles (a 'constitution') that guide their behavior. The model critiques and revises its own outputs based on these principles, reducing the need for human feedback on every interaction.
The maximum amount of text (measured in tokens) that a language model can process in a single interaction. A larger context window allows the model to consider more information when generating responses. Context windows in modern models range from a few thousand (4K) to over 1 million tokens.
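One practical consequence: applications must trim conversation history to fit the window. A minimal sketch, assuming a caller-supplied `count_tokens` function (real systems use the model's own tokenizer):

```python
def fit_to_context(messages, max_tokens, count_tokens):
    # Keep the most recent messages that fit inside the context window,
    # walking backwards from the newest message.
    kept, used = [], 0
    for msg in reversed(messages):
        t = count_tokens(msg)
        if used + t > max_tokens:
            break  # this message would overflow the window
        kept.append(msg)
        used += t
    return list(reversed(kept))  # restore chronological order
```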
An AI image generation model created by OpenAI that produces images from text descriptions. DALL-E demonstrates the ability of AI to understand and visually represent complex concepts, compositions, and styles described in natural language.
The process of annotating raw data (images, text, audio) with meaningful tags or categories so it can be used to train supervised machine learning models. High-quality labeled data is essential for model accuracy and is often the most time-consuming part of AI development.
A subset of machine learning that uses artificial neural networks with multiple layers (hence 'deep') to learn representations of data at increasing levels of abstraction. Deep learning has driven breakthroughs in image recognition, natural language processing, speech recognition, and generative AI.
A type of generative AI model that creates data (typically images) by learning to reverse a gradual noising process. Starting from random noise, the model iteratively denoises to produce coherent outputs. Stable Diffusion and DALL-E 3 are prominent examples.
A numerical representation of data (words, sentences, images) as vectors in a high-dimensional space. Embeddings capture semantic meaning — similar concepts are placed closer together in the vector space. They are fundamental to how AI models understand relationships between concepts.
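"Closer together" is typically measured with cosine similarity. A minimal implementation:

```python
import math

def cosine_similarity(a, b):
    # Similarity of two embedding vectors: 1.0 means same direction
    # (semantically similar), 0.0 means orthogonal (unrelated),
    # -1.0 means opposite.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)
```

Real embeddings have hundreds or thousands of dimensions, but the same formula applies.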
Capabilities or behaviors that appear in AI models at scale but were not explicitly programmed or anticipated. As language models grow larger, they sometimes develop unexpected abilities like in-context learning, chain-of-thought reasoning, or multilingual translation without specific training for those tasks.
The European Union's comprehensive regulatory framework for artificial intelligence, adopted in 2024. It classifies AI systems by risk level (unacceptable, high, limited, minimal) and imposes requirements accordingly, including transparency obligations, human oversight mandates, and prohibitions on certain AI practices.
A machine learning approach where a model learns to perform a task from only a small number of examples. In the context of LLMs, few-shot prompting involves providing a few examples of the desired input-output pattern within the prompt to guide the model's behavior.
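A few-shot prompt is just the examples laid out in a consistent input/output pattern, ending where the model should continue. A sketch (the `Input:`/`Output:` labels are one common convention among many):

```python
def build_few_shot_prompt(examples, query):
    # examples: list of (input, output) pairs demonstrating the task.
    # The model infers the pattern and completes the final "Output:".
    lines = []
    for inp, out in examples:
        lines.append(f"Input: {inp}\nOutput: {out}")
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)
```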
The process of further training a pre-trained AI model on a specific, smaller dataset to adapt it for a particular task or domain. Fine-tuning allows organizations to customize general-purpose models for specialized applications like medical diagnosis, legal analysis, or customer service.
A large AI model trained on broad data at scale that can be adapted to a wide range of downstream tasks. Examples include GPT-4, Claude, Llama, and Gemini. Foundation models serve as the base upon which specialized applications are built through fine-tuning or prompting.
Google DeepMind's family of multimodal AI models capable of processing and generating text, images, audio, and video. Gemini represents Google's flagship AI offering and is integrated across Google products including Search, Workspace, and Android.
AI systems capable of creating new content — text, images, audio, video, code, or other media — based on patterns learned from training data. Generative AI represents a shift from AI that classifies or predicts to AI that creates, and includes technologies like ChatGPT, Midjourney, and Suno.
A family of large language models developed by OpenAI. GPT models are trained using unsupervised learning on vast text corpora, then fine-tuned for specific tasks. The architecture is based on the Transformer, using self-attention mechanisms to generate coherent, contextually relevant text.
The technique of connecting AI model outputs to verified, factual sources of information. Grounding helps reduce hallucinations by ensuring the model's responses are anchored in real data, documents, or databases rather than relying solely on patterns learned during training.
Safety mechanisms and constraints built into AI systems to prevent harmful, biased, or undesirable outputs. Guardrails can include content filters, output validation rules, topic restrictions, and behavioral guidelines that keep AI systems operating within acceptable boundaries.
An open-source platform and community for machine learning, hosting thousands of pre-trained models, datasets, and tools. Hugging Face has become the de facto hub for sharing and discovering AI models, particularly in natural language processing.
The ability of large language models to learn and adapt their behavior based on examples or instructions provided within the prompt, without any changes to the model's weights. This emergent capability allows LLMs to perform new tasks simply by being shown what to do in the conversation context.
The process of using a trained AI model to make predictions or generate outputs on new, unseen data. Inference is the 'production' phase of AI — after a model is trained, inference is how it's actually used in applications. Inference speed and cost are key considerations for deployment.
Techniques used to bypass the safety guardrails and content restrictions of AI models, causing them to produce outputs they were designed to refuse. Jailbreaking highlights the ongoing challenge of making AI systems robust against adversarial manipulation.
The date after which a language model has no training data, meaning it lacks awareness of events, developments, or information that occurred after that date. Understanding a model's knowledge cutoff is important for evaluating the currency and reliability of its outputs.
An open-source framework for building applications powered by language models. LangChain provides tools for chaining together multiple LLM calls, integrating external data sources, managing memory, and building AI agents that can use tools and make decisions.
An AI model trained on massive amounts of text data that can understand, generate, and manipulate human language. LLMs like GPT-4, Claude, and Llama use billions of parameters to capture patterns in language, enabling them to perform tasks ranging from conversation to code generation to analysis.
An efficient fine-tuning technique that adapts large language models by training only a small number of additional parameters rather than modifying the entire model. LoRA dramatically reduces the computational cost and memory requirements of fine-tuning, making model customization more accessible.
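The mechanism can be sketched in a few lines: the frozen weight matrix W stays untouched, and a low-rank product B·A (with rank r much smaller than the layer dimensions) is added to its output. A toy forward pass with list-based matrices:

```python
def matmul(X, Y):
    # Plain-Python matrix multiply for small nested-list matrices
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_forward(x, W, A, B, scale=1.0):
    # x: 1 x d_in input, W: frozen d_in x d_out weight (not trained).
    # B (d_in x r) and A (r x d_out) are the small trainable matrices;
    # their product is the low-rank update added to W's output.
    base = matmul(x, W)
    update = matmul(matmul(x, B), A)
    return [[b + scale * u for b, u in zip(br, ur)]
            for br, ur in zip(base, update)]
```

With r = 8 and a 4096 x 4096 layer, the trainable parameters drop from ~16.8M to ~65K for that layer, which is where the memory savings come from. In the standard initialization one of the two matrices starts at zero, so fine-tuning begins exactly at the pre-trained model's behavior.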
A subset of artificial intelligence where systems learn patterns from data and improve their performance on tasks without being explicitly programmed. The three main types are supervised learning (labeled data), unsupervised learning (unlabeled data), and reinforcement learning (reward-based).
An AI image generation platform that creates images from text descriptions (prompts). Known for producing highly artistic and aesthetically refined outputs, Midjourney operates primarily through a Discord bot interface and has become one of the most popular tools for AI-generated art.
A neural network architecture where multiple specialized sub-networks (experts) are combined, with a gating mechanism that routes each input to the most relevant experts. MoE allows models to be very large in total parameters while only activating a fraction for each input, improving efficiency.
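A toy top-1 router makes the efficiency argument concrete: only the highest-scoring expert runs for a given input, so most of the model's parameters stay idle on each forward pass. A sketch with experts as plain functions:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, gate_scores):
    # Top-1 routing: only the expert with the highest gate score runs,
    # scaled by its softmaxed gate weight. Real MoE layers often route
    # to the top-2 experts and learn the gate scores from the input.
    weights = softmax(gate_scores)
    top = max(range(len(experts)), key=lambda i: weights[i])
    return weights[top] * experts[top](x)
```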
AI systems that can process, understand, and generate multiple types of data — such as text, images, audio, and video — within a single model. GPT-4o and Gemini are examples of multimodal models that can seamlessly work across different data modalities.
The field of AI focused on enabling computers to understand, interpret, and generate human language. NLP encompasses tasks like sentiment analysis, translation, summarization, question answering, and text generation. Modern NLP is dominated by Transformer-based language models.
A computing system inspired by the biological neural networks in the human brain. It consists of interconnected nodes (neurons) organized in layers that process information. Neural networks are the foundation of deep learning and power most modern AI systems.
A voluntary framework published by the U.S. National Institute of Standards and Technology that provides guidance for managing risks associated with AI systems. It covers governance, risk mapping, measurement, and management across the AI lifecycle, and is widely referenced in AI policy discussions.
AI models and tools whose source code, weights, and/or training data are made publicly available for anyone to use, modify, and distribute. Open source AI promotes transparency, collaboration, and accessibility. Notable examples include Meta's Llama, Stability AI's Stable Diffusion, and Mistral's models.
An AI research and deployment company founded in 2015, known for developing the GPT series of language models, DALL-E image generation, Whisper speech recognition, and the ChatGPT conversational AI product. OpenAI has been instrumental in bringing large language models to mainstream adoption.
A problem in machine learning where a model learns the training data too well — including its noise and outliers — resulting in poor performance on new, unseen data. Overfitting means the model has memorized rather than generalized, and is addressed through techniques like regularization, dropout, and cross-validation.
A variable within a machine learning model that is learned from training data. In neural networks, parameters are the weights and biases that the model adjusts during training. The number of parameters is often used as a rough measure of model size — GPT-4 is estimated to have over 1 trillion parameters.
A metric used to evaluate language models, measuring how well the model predicts a sample of text. Lower perplexity indicates better prediction. Also the name of an AI-powered search engine that provides cited, conversational answers to questions.
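Concretely, perplexity is the exponential of the average negative log-likelihood of the tokens the model was asked to predict:

```python
import math

def perplexity(token_probs):
    # token_probs: the probability the model assigned to each actual
    # next token. Perplexity = exp(mean negative log-likelihood).
    # A perfect model (probability 1.0 everywhere) scores 1.0;
    # uniform guessing over a vocabulary of N scores N.
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)
```

Intuitively, a perplexity of 10 means the model is, on average, as uncertain as if it were choosing uniformly among 10 tokens at each step.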
The practice of crafting effective inputs (prompts) to guide AI models toward producing desired outputs. Prompt engineering encompasses techniques like few-shot examples, chain-of-thought reasoning, role assignment, and structured formatting to maximize the quality and relevance of AI responses.
A technique that enhances language model outputs by first retrieving relevant information from external knowledge sources, then using that information to generate more accurate and grounded responses. RAG reduces hallucinations and allows models to access up-to-date or proprietary information.
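The two stages, retrieve then generate, can be sketched with a toy retriever. This one scores documents by term overlap purely for illustration; real RAG systems rank by embedding similarity:

```python
def retrieve(query_terms, documents, k=2):
    # Toy retriever: score each document by how many query terms it
    # contains and return the top-k matches.
    scored = sorted(
        documents,
        key=lambda doc: sum(t in doc.lower() for t in query_terms),
        reverse=True,
    )
    return scored[:k]

def build_rag_prompt(query, documents):
    # Stuff the retrieved passages into the prompt as grounding context
    context = "\n".join(retrieve(query.lower().split(), documents))
    return (f"Context:\n{context}\n\n"
            f"Answer using only the context above.\nQuestion: {query}")
```

The generated prompt is then sent to the language model, which answers from the supplied context rather than from memory alone.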
The ability of AI models to perform logical deduction, multi-step problem solving, and complex analysis. Recent models like OpenAI's o1 and o3 series demonstrate improved reasoning capabilities through techniques like chain-of-thought processing and extended 'thinking' time before responding.
A training technique where human evaluators rank model outputs by quality, and this feedback is used to train a reward model that guides further optimization. RLHF is a key technique used to align language models with human preferences and make them more helpful, harmless, and honest.
An approach to developing and deploying AI systems that prioritizes ethical considerations, fairness, transparency, accountability, and societal benefit. Responsible AI frameworks guide organizations in building AI that respects human rights, minimizes harm, and operates within legal and ethical boundaries.
An open-source AI image generation model that creates images from text descriptions using a latent diffusion architecture. Its open-source nature has enabled a large ecosystem of tools, extensions, and fine-tuned variants for specialized image generation tasks.
Artificially generated data that mimics the statistical properties of real-world data. Synthetic data is used to train AI models when real data is scarce, sensitive, or expensive to collect. It can help address privacy concerns and data imbalance issues.
Hidden instructions provided to an AI model that define its behavior, personality, capabilities, and constraints for a given interaction. System prompts are set by developers and are not visible to end users. They are a fundamental tool for customizing AI assistant behavior.
A parameter that controls the randomness of an AI model's outputs. Lower temperature (e.g., 0.1) produces more deterministic, focused responses, while higher temperature (e.g., 1.0) produces more creative, varied outputs. Adjusting temperature is a key technique in prompt engineering.
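Under the hood, temperature divides the model's logits before the softmax that produces token probabilities. A minimal sketch shows why low values sharpen the distribution and high values flatten it:

```python
import math

def softmax_with_temperature(logits, temperature):
    # Dividing logits by the temperature before softmax:
    # low temperature -> distribution concentrates on the top token;
    # high temperature -> distribution approaches uniform.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```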
The basic unit of text that language models process. A token can be a word, part of a word, or a punctuation mark. For English text, one token is roughly 3/4 of a word. Tokenization — breaking text into tokens — is the first step in how LLMs process language. Model pricing and context windows are measured in tokens.
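The 3/4-of-a-word rule of thumb gives a quick back-of-the-envelope estimator, i.e. roughly 4/3 tokens per word. This is only an approximation; real tokenizers (byte-pair encoding and similar) give exact counts:

```python
def estimate_tokens(text):
    # Rough heuristic for English: ~4/3 tokens per whitespace-separated
    # word (equivalently, one token is about 3/4 of a word).
    words = len(text.split())
    return round(words * 4 / 3)
```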
A machine learning technique where a model trained on one task is repurposed for a different but related task. Transfer learning is the principle behind foundation models — a model pre-trained on general text can be adapted for specific tasks like sentiment analysis, translation, or code generation.
A neural network architecture introduced in the 2017 paper 'Attention Is All You Need' that revolutionized natural language processing. Transformers use self-attention mechanisms to process all parts of an input simultaneously (rather than sequentially), enabling much faster training and better performance. Nearly all modern LLMs are based on the Transformer architecture.
A test proposed by Alan Turing in 1950 to evaluate a machine's ability to exhibit intelligent behavior indistinguishable from a human. If a human evaluator cannot reliably distinguish between the machine and a human in conversation, the machine is said to have passed the test.
A specialized database designed to store, index, and query high-dimensional vector embeddings efficiently. Vector databases are essential for RAG systems, semantic search, and recommendation engines, enabling fast similarity searches across millions of embedded documents or data points.
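At its core, the query a vector database answers is "which stored vectors are most similar to this one?". A brute-force sketch of that operation (real systems use approximate-nearest-neighbor indexes like HNSW to make this fast at scale):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def nearest(query, index, k=1):
    # index: list of (id, vector) pairs. Rank every stored vector by
    # cosine similarity to the query and return the top-k ids.
    ranked = sorted(index, key=lambda item: cosine(query, item[1]),
                    reverse=True)
    return [item_id for item_id, _ in ranked[:k]]
```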
The ability of an AI model to perform a task it has never been explicitly trained on, without any examples. In the context of LLMs, zero-shot prompting means giving the model a task description without any examples and relying on its general knowledge to produce the correct output.