AI Training

Glossary

Every term, in plain English

A living glossary - each term links to the lesson that introduces it. The same definitions pop up on hover throughout the lessons.

27 terms

Agent

Coming up

An LLM that can plan, use tools, and take multi-step actions toward a goal.

Lesson 10.1

AI slop

Mass-produced, low-effort AI content that looks fine but adds little value.

Lesson 1.3

Artificial Intelligence (AI)

Software that performs tasks we'd normally call intelligent - the broad umbrella over ML, deep learning, and GenAI.

Lesson 1.1

Bias

Skew a model absorbs from human-written training data (stereotypes, skewed defaults).

Lesson 1.3

Chain of Thought (CoT)

Coming up

Prompting a model to reason step by step, which often improves its answers on logic and math problems.

Lesson 2.3

Context window

How much text a model can consider at once; long inputs can get truncated or lost.

Lesson 1.3

Deep Learning

ML using many-layered neural networks that learn features directly from raw data (images, audio, text).

Lesson 1.1

Embedding

Coming up

A numeric vector capturing the meaning of text, so machines can compare similarity.

Lesson 8.1
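
To make "compare similarity" concrete, here is a minimal sketch using cosine similarity, one common way to compare embeddings. The three vectors are hand-made, hypothetical values for illustration only; real embeddings come from a trained model and usually have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings" (made up, not produced by any model).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(cosine_similarity(cat, kitten))   # close to 1: similar meaning
print(cosine_similarity(cat, invoice))  # close to 0: unrelated meaning
```

A score near 1 means the vectors point in nearly the same direction, which is read as "similar meaning"; a score near 0 means they are unrelated.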

Feature engineering

Humans hand-picking the useful columns/inputs a traditional ML model learns from.

Lesson 1.1

Generative AI

Deep learning models trained to create new content (text, images, audio, code), not just classify or predict.

Lesson 1.1

Hallucination

When a model states something false as if it were true, because it generates plausible-sounding text without checking facts.

Lesson 1.3

Knowledge cutoff

The date a model's training data ends; it can't know about newer events without tools or search.

Lesson 1.3

Large Language Model (LLM)

A large neural network that generates text by predicting the next token, trained on huge amounts of text.

Lesson 1.2

Machine Learning (ML)

Algorithms that learn patterns from data instead of following hand-written rules.

Lesson 1.1

MCP

Coming up

Model Context Protocol - a standard way to connect models to tools and data sources.

Lesson 7.3

Next-token prediction

How an LLM generates: it assigns probabilities to possible next tokens and samples one, repeatedly.

Lesson 1.2
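
The "assign probabilities, sample one, repeat" loop can be sketched with a toy lookup table. `TOY_MODEL` and its probabilities are hypothetical, and this toy only looks at the previous token; a real LLM computes the probabilities with a neural network that considers the whole context so far.

```python
import random

# Hypothetical toy "model": for each current token, the probability of each next token.
TOY_MODEL = {
    "<start>": {"the": 1.0},
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "slept": 0.3},
    "dog": {"sat": 0.5, "slept": 0.5},
    "sat": {"<end>": 1.0},
    "slept": {"<end>": 1.0},
}

def generate(rng, max_tokens=10):
    """Sample one token at a time until <end> or the token limit is reached."""
    token = "<start>"
    output = []
    for _ in range(max_tokens):
        probs = TOY_MODEL[token]
        # Pick the next token at random, weighted by its probability.
        token = rng.choices(list(probs), weights=list(probs.values()), k=1)[0]
        if token == "<end>":
            break
        output.append(token)
    return output

print(" ".join(generate(random.Random())))  # e.g. a sentence like "the cat sat"
```

Because each step is a weighted random draw, running it several times can produce different sentences from the same table.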

Parameter

One of the model's billions of adjustable 'dials', set during training.

Lesson 1.2

PII

Personally Identifiable Information - data that can identify a person, such as names, emails, ID numbers, health records, and payment data.

Lesson 1.4

Post-training

Later training (fine-tuning + RLHF) that aims to make a base model helpful, honest, and safe.

Lesson 1.2

Pre-training

The first training phase: predict the next token across internet-scale text (self-supervised) to learn language and knowledge.

Lesson 1.2

Prompt

Coming up

The instruction/input you give an AI model.

Lesson 2.1

RAG

Coming up

Retrieval-Augmented Generation - fetch relevant documents and feed them to the model so it answers from real sources.

Lesson 8.2
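
The "fetch, then feed" flow can be sketched without any model at all. This is a simplified illustration under stated assumptions: the documents in `DOCS` are made up, retrieval is a naive word-overlap score (real systems typically use embeddings and a vector index), and the final prompt would be sent to an LLM, which is not shown here.

```python
def score(query, doc):
    """Count the words the query and the document share (naive relevance score)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query, docs, k=1):
    """Return the k documents with the highest overlap score."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query, docs, k=1):
    """Place the retrieved documents in front of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"

# Hypothetical knowledge base for illustration.
DOCS = [
    "Refunds are processed within 14 days of the return.",
    "Our office is closed on public holidays.",
]

print(build_prompt("How long do refunds take?", DOCS))
```

The model then answers from the supplied sources rather than from memory alone, which is the point of the technique.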

ReAct

Coming up

An agent pattern that interleaves reasoning and actions (tool calls).

Lesson 10.1

RLHF

Reinforcement Learning from Human Feedback - humans rank answers and the model learns what people prefer.

Lesson 1.2

Temperature

A dial for randomness: low = focused/predictable, high = creative/riskier.

Lesson 1.2
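
One common way this dial is applied is to divide the model's raw scores (logits) by the temperature before converting them to probabilities. The sketch below assumes that formulation; the logits are made-up numbers, and real systems may combine temperature with other sampling settings.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; temperature must be greater than 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens

low = softmax_with_temperature(logits, 0.5)   # sharper: the top token dominates
high = softmax_with_temperature(logits, 2.0)  # flatter: more variety when sampling

print([round(p, 2) for p in low])
print([round(p, 2) for p in high])
```

At low temperature most of the probability lands on the top-scoring token, so output is more predictable; at high temperature the probabilities even out, so less likely tokens get picked more often.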

Token

The chunk of text a model reads/writes - on average roughly 4 characters, or about 0.75 of a word, in English.

Lesson 1.2
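
The rule of thumb above can be turned into a quick back-of-the-envelope estimate. `estimate_tokens` is a hypothetical helper, not a real tokenizer: actual counts depend on the model's tokenizer and on the language of the text.

```python
def estimate_tokens(text):
    """Estimate the token count two ways: ~4 characters or ~0.75 words per token."""
    by_chars = len(text) / 4
    by_words = len(text.split()) / 0.75
    return round(by_chars), round(by_words)

chars_estimate, words_estimate = estimate_tokens(
    "The quick brown fox jumps over the lazy dog."
)
print(chars_estimate, words_estimate)
```

The two estimates usually land close to each other for ordinary English prose, which is enough for rough budgeting against a context window.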

Tokenization

Splitting text into tokens before the model processes it.

Lesson 1.2