Tier 01 / Foundations

What AI actually is, in plain English.

Before you touch a prompt, you need a working mental model of what's happening on the other side of the screen. This tier covers the eight ideas that make every other tier click into place. No math. No jargon you can't say out loud.

Approx. 18 min read · No prerequisites · You will leave with vocabulary

01 · The nesting dolls

AI, ML, deep learning, generative AI, LLM. They are not the same thing.

These words get used interchangeably and it's killing your ability to understand anything. They are nested. Each one is a subset of the one before it.

Outermost

AI (Artificial Intelligence)

The widest umbrella. Any computer system that does something we'd call "intelligent" if a human did it. Includes chess engines, GPS routing, spam filters. Most AI in the wild is not new and not generative.

Inside AI

Machine Learning (ML)

A subset of AI where the system learns patterns from data instead of being programmed with explicit rules. Show it ten thousand cat photos, it learns what a cat looks like. Recommendation engines, fraud detection, weather models.

Inside ML

Deep Learning

A flavor of ML that uses neural networks with many layers. The "deep" just means "many layers." This is what unlocked the modern wave of AI starting around 2012, and the chat-style AI tools you use day-to-day are built on it.

Inside deep learning

Generative AI

Deep-learning systems that produce new content (text, images, audio, video, code) instead of just classifying or predicting. ChatGPT writes, Midjourney paints, ElevenLabs speaks. All generative.

Inside generative AI

Large Language Model (LLM)

A generative AI specialized in language. GPT-4, Claude, Gemini, Llama. It reads text and writes text. It's the thing under the chat box. When people say "AI" in 2026, they almost always mean an LLM.

Rule of thumb

If you're reading a headline that says "AI did X," ask yourself: was that an LLM, or was it old-school ML? The answer changes the meaning of the headline. A bank's fraud system is AI. ChatGPT is AI. They have almost nothing in common.

02 · How an LLM works

It is a very, very fancy autocomplete.

This sentence is the most important thing on this page. Internalize it and most "weird AI behavior" stops being weird.

You hand it some text.

Your prompt. Plus, often, a hidden system prompt and any earlier messages in the conversation.

It predicts the next token.

A token is a chunk of text, usually a word or part of a word. The model assigns a probability to every possible next token and picks one. That's the whole trick.

It does this thousands of times in a row.

Token, token, token, until it decides to stop or hits a limit. What looks like a "response" is just this loop running fast.

It does not "know" things.

It has patterns from training data baked into its weights. There is no database of facts it queries. When it sounds confident and wrong, that's not a bug, it's the default behavior of the system.
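The loop above can be sketched as a toy. This is a minimal illustration, not how any real model is implemented: the probability table is invented, and this toy looks only at the last token, whereas a real LLM considers everything in its context.

```python
import random

# Toy "model": for each token, the possible next tokens and their probabilities.
# These numbers are invented for illustration; a real LLM computes them from
# its learned weights instead of reading them from a lookup table.
NEXT_TOKEN_PROBS = {
    "<start>": {"The": 1.0},
    "The": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.5, "ran": 0.5},
    "sat": {"<stop>": 1.0},
    "ran": {"<stop>": 1.0},
}

def generate(max_tokens: int = 10, seed: int = 0) -> list[str]:
    """Predict one token at a time until a stop token or the limit."""
    rng = random.Random(seed)
    tokens = ["<start>"]
    for _ in range(max_tokens):                 # it hits a limit...
        probs = NEXT_TOKEN_PROBS[tokens[-1]]    # simplification: last token only
        choices, weights = zip(*probs.items())
        next_token = rng.choices(choices, weights=weights, k=1)[0]
        if next_token == "<stop>":              # ...or it decides to stop
            break
        tokens.append(next_token)
    return tokens[1:]                           # drop the <start> marker

print(" ".join(generate()))
```

Notice there is no fact lookup anywhere in the loop. The output is whatever the probabilities make likely, which is why a fluent answer and a true answer are not the same thing.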

AiAi Bro

People reach for the word "thinking" and get themselves into trouble. An LLM is not thinking in any sense you'd recognize. It is producing the next likely token given the previous tokens. If you keep that image in your head, you will write better prompts and you will stop being shocked when it makes things up.

03 · Training vs inference

Two completely different events. Don't confuse them.

Training is how the model gets built. Inference is what happens every time you press Enter. Confusing these is the source of half the "but does my data get used?" anxiety.

	Training	Inference
When it happens	Once, before the model is released (then occasionally re-done for new versions)	Every time you send a message
Who does it	The lab (OpenAI, Anthropic, Google) on huge GPU clusters	You, with one click. The model runs on the lab's servers.
Cost & time	Millions of dollars, weeks or months	Fractions of a cent, seconds
Data flow	Massive corpus → model weights (the model "absorbs" patterns)	Your prompt → response. Weights don't change.
What changes	The model's parameters	Nothing about the model. Just the conversation log.

Common misconception

"If I tell ChatGPT a secret, it'll learn it and tell other users." Almost certainly not. Your message is used at inference time to generate your reply, and that's it. Whether the provider stores your conversation for future training is a separate question with a per-product answer, usually controllable in settings. Inference doesn't change the model; training does.
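If it helps to see the split, here is a toy sketch. `TinyModel` is a hypothetical word-pair counter standing in for an LLM; the only point is that `train` changes the weights and `infer` merely reads them.

```python
from collections import Counter, defaultdict

class TinyModel:
    """A toy word-pair counter standing in for an LLM (hypothetical, for illustration)."""

    def __init__(self) -> None:
        # The "weights": how often each word follows another.
        self.weights: dict[str, Counter] = defaultdict(Counter)

    def train(self, corpus: str) -> None:
        """Training: read text and CHANGE the weights."""
        words = corpus.split()
        for prev, nxt in zip(words, words[1:]):
            self.weights[prev][nxt] += 1

    def infer(self, prompt: str) -> str:
        """Inference: READ the weights to pick a next word. Nothing is updated."""
        last = prompt.split()[-1]
        followers = self.weights.get(last)   # .get() so lookups never add entries
        if not followers:
            return "<unknown>"
        return followers.most_common(1)[0][0]

model = TinyModel()
model.train("the cat sat on the mat and the cat slept")

before = {word: dict(counts) for word, counts in model.weights.items()}
reply = model.infer("my secret password is the")
after = {word: dict(counts) for word, counts in model.weights.items()}

print(reply)            # cat
print(before == after)  # True: the prompt left the weights untouched
```

The "secret" in the prompt shaped one reply and then vanished from the model's point of view. Whether a provider logs it somewhere else is the separate, per-product question.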

04 · Tokens & context

The two units that govern almost everything you do.

If you only memorize two technical words from this whole guide, make it these. Tokens are the unit. Context window is the budget.

Token

A chunk of text the model sees as a unit. Roughly 4 characters of English, or about 3/4 of a word. Exact splits depend on the model's tokenizer: in a typical one, "Strawberry" is about 3 tokens and "Hello" is 1. The model never actually sees letters; it sees tokens.

Context window

The maximum number of tokens the model can consider at once: your prompt + the conversation history + its reply. If you exceed it, the oldest stuff typically falls off the front.

Input tokens

Everything you and the system send into the model.

Output tokens

Everything the model writes back. Output tokens cost more than input tokens, usually 3-5x more, because the model reads your input in one pass but has to generate its reply one token at a time.
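A rough way to put numbers on this, using the 4-characters-per-token rule of thumb. The prices below are hypothetical, picked only so that output costs 5x input; real prices vary by provider and model.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate using the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / 4))

# Hypothetical prices in dollars per million tokens (assumptions, not real
# price-list values). Check your provider's pricing page for actual numbers.
INPUT_PRICE_PER_MILLION = 3.00
OUTPUT_PRICE_PER_MILLION = 15.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the hypothetical prices above."""
    return (input_tokens * INPUT_PRICE_PER_MILLION
            + output_tokens * OUTPUT_PRICE_PER_MILLION) / 1_000_000

prompt = "Summarize the attached meeting notes in three bullet points."
print(estimate_tokens(prompt))
print(estimate_cost(input_tokens=2_000, output_tokens=500))
```

In this example the 500 output tokens cost more than the 2,000 input tokens, even though there are a quarter as many of them. A real tokenizer will give somewhat different counts than the character estimate.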

A back-of-napkin scale

Context window	Roughly equals	What that lets you do
8K tokens	~16 pages of text	A long chat or a short document. Older models.
128K tokens	~250 pages	A short book or a long PDF. Modern frontier baseline.
200K tokens	~400 pages	A full novel. Claude's standard window.
1M tokens	~2,000 pages	A whole bookshelf. Gemini's flagship, and Claude's enterprise tier.

Why this matters in practice

"Why did the AI forget what I said earlier?" Almost always: the conversation outgrew the context window, or the model had a smaller window than you assumed. Long chats degrade. Start fresh with a clean summary when you feel a thread getting incoherent.
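Here is a minimal sketch of how "forgetting" can happen, assuming an app that simply drops the oldest messages once the budget is full. Real products may summarize old messages or refuse the request instead; `fit_to_window` and the tiny 30-token window are made up for the demo.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate using the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / 4))

def fit_to_window(messages: list[str], window: int) -> list[str]:
    """Keep the newest messages that fit in `window` tokens; drop the oldest."""
    kept: list[str] = []
    used = 0
    for message in reversed(messages):      # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > window:
            break                           # everything older falls off
        kept.append(message)
        used += cost
    return list(reversed(kept))             # restore chronological order

chat = [
    "My name is Priya and I'm planning a trip to Lisbon.",
    "What should I pack for a week in October?",
    "Also suggest two day trips.",
    "What was my name again?",
]
for message in fit_to_window(chat, window=30):
    print(message)
```

The first message, the only one containing the name, no longer fits. The model is then asked "What was my name again?" with no name anywhere in what it can see.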

05 · Parameters & "size"

What "70 billion parameter model" actually means.

A parameter is one of the numbers inside the model that gets adjusted during training. Modern frontier models are believed to have hundreds of billions to trillions of them, though the labs rarely publish exact counts. You'll see numbers like 7B, 70B, 405B thrown around; the "B" stands for billion.

Bigger ≠ always smarter

For years, bigger more or less did mean smarter. Today, training technique matters at least as much as raw parameter count. A well-trained smaller model can beat a poorly-trained larger one.

Bigger = more expensive

Larger models are slower and cost more per token to run. That's why the frontier labs ship a fleet (e.g. Haiku, Sonnet, Opus): a tier for cheap-and-fast, a tier for smart-and-slow.
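One reason size drives cost: every parameter is a number that has to sit in memory while the model runs. A back-of-napkin sketch, assuming each parameter is stored as a 16-bit number (2 bytes), which is a common choice but not the only one:

```python
def weights_size_gb(parameters: float, bytes_per_parameter: float = 2.0) -> float:
    """Approximate storage for the weights alone, in gigabytes (1 GB = 1e9 bytes).

    Assumes 2 bytes per parameter (16-bit) unless told otherwise.
    """
    return parameters * bytes_per_parameter / 1e9

for name, params in [("7B", 7e9), ("70B", 70e9), ("405B", 405e9)]:
    print(f"{name}: about {weights_size_gb(params):.0f} GB at 16-bit precision")
```

By this estimate a 7B model needs about 14 GB and a 70B model about 140 GB for the weights alone, before any memory for the conversation itself. That gap is the difference between a high-end laptop and a rack of specialized hardware.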

You usually don't pick the size

In the consumer apps, you pick a product (ChatGPT Plus, Claude Pro, Gemini Advanced) and the app picks the model. You only think about parameters if you're using the API or running an open-source model locally.

06 · Hallucinations

It will confidently make things up. This is not optional.

A hallucination is when the model produces something that sounds right but isn't true. Fake citations, fake quotes, fake URLs, fake legal cases, fake people. Built into how the technology works. Mitigated, not eliminated.

Why it happens (in one sentence)

The model is rewarded during training for producing plausible-sounding text, not for being right, so when it doesn't know, it produces something that sounds like the right kind of answer instead of saying "I don't know."

How to defend against it

Verify anything you'd be embarrassed to be wrong about.

Names, dates, numbers, citations, quotes, laws, medical claims. Treat the LLM's answer as a confident draft, not a fact.

Ask for sources, then check them.

The model may invent the source itself. The check is mandatory, not the ask.

Use grounding features when you can.

"Search the web," "use the files I uploaded," ChatGPT Search, Gemini Grounding, Perplexity. These tie answers to real retrieved documents and cut hallucination rates dramatically. Not to zero.

Give it permission to say "I don't know."

Most prompts implicitly punish "I don't know" by demanding a confident answer. Explicitly tell it: "If you don't know, say so. Don't guess." That single line helps.
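The last two defenses can be combined in the prompt itself. The helper below is a hypothetical sketch, not any product's feature: it pastes your own source text into the prompt and adds the permission line from above.

```python
def build_grounded_prompt(question: str, sources: list[str]) -> str:
    """Assemble a prompt that ties the answer to supplied text and permits 'I don't know'."""
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(sources, start=1))
    return (
        "Answer using only the sources below and cite them by number.\n"
        "If you don't know, say so. Don't guess.\n\n"
        f"Sources:\n{numbered}\n\n"
        f"Question: {question}"
    )

print(build_grounded_prompt(
    "When does the lease end?",
    ["The lease term runs from 1 March 2025 to 28 February 2026."],
))
```

Numbering the sources makes the checking step cheaper: when the answer cites [1], you know exactly which passage to read. It lowers the odds of an invented answer; it does not remove the need to check.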

AiAi Bro

Hallucination is the cost of the technology, not a sign you're using it wrong. The skill isn't picking a model that doesn't hallucinate. It's building the habit of verifying anything load-bearing, and structuring your workflow so a wrong answer is caught before it does damage. Trust, but read.

07 · The model landscape

Frontier vs open-source. Five names you should know.

You don't need to track every release. You need the rough shape of the map so you don't get lost when a new headline drops.

The frontier labs

These are the companies racing to build the most capable closed models. You access them through their consumer apps and APIs.

OpenAI GPT family

Models: GPT-4o, GPT-4.1, o-series reasoning models, GPT-5 era
Product: ChatGPT
Strength: broadest capability + ecosystem (Custom GPTs, Sora, DALL-E)

Anthropic Claude family

Models: Claude Haiku, Sonnet, Opus, in numbered versions (4.x as of this guide)
Product: Claude.ai + Claude Code
Strength: writing quality, long context, careful reasoning, coding

Google DeepMind Gemini family

Models: Gemini Flash, Pro, Ultra
Product: Gemini app + Workspace integration
Strength: huge context windows, multimodal-native, Google data integration

The open-source side

Open models you can download and run yourself (or have someone host for you). You will not use these directly as a beginner, but you should know the names because they show up everywhere.

Meta Llama Open weights

The dominant open-source family
Free to download, modify, and run on your own hardware (under Meta's license terms)
Powers many of the cheaper AI products you've heard of

Mistral & others Open weights

Mistral, Qwen, DeepSeek, Gemma, and a long tail
Each has strengths in size, speed, or specific tasks
You'll meet them if you start self-hosting or using developer tools

The 80/20

As a non-technical user, your world is three frontier labs and three consumer products: ChatGPT, Claude, Gemini. That's it. Open-source models matter for what they enable other tools to do cheaply, but you won't typically pick one yourself.

08 · Multimodality

One model. Many input types.

"Multimodal" means the model can take more than just text as input: images, audio, sometimes video. Most modern frontier models are multimodal by default.

Modality	What you can do	Example
Text	Standard chat	Ask a question, draft an email
Image	Describe, transcribe, analyze, critique	"Read this screenshot of an error message," "What's wrong with this chart?"
PDF / file	Summarize, extract, compare	"Pull every dollar figure out of this contract"
Audio (voice in)	Speech-to-text in flight	Voice mode in ChatGPT or Gemini Live
Audio (voice out)	Spoken reply	Same voice modes, replying back to you
Video	Frame-by-frame understanding	Gemini's video features; growing in others

AiAi Bro

The biggest unlock for new users is the screenshot. Once you realize you can paste a screenshot of literally anything (an error, a chart, a contract, a UI you don't understand) and ask the model to read it, your daily workflow changes overnight. You stop typing descriptions of things you could just show.
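Under the hood, a screenshot question is usually just one message with two parts: some text and an image. The sketch below uses a made-up `Part` structure to show the idea; it is not any provider's actual API format, and the image bytes are a stand-in.

```python
import base64
from dataclasses import dataclass

@dataclass
class Part:
    """One piece of a message (hypothetical structure for illustration)."""
    kind: str   # "text" or "image" in this sketch
    data: str   # plain text, or base64-encoded bytes for an image

def text_part(text: str) -> Part:
    return Part(kind="text", data=text)

def image_part(image_bytes: bytes) -> Part:
    # Binary files are commonly base64-encoded so they can travel as text.
    return Part(kind="image", data=base64.b64encode(image_bytes).decode("ascii"))

# Stand-in bytes; in practice you would read a real screenshot file.
fake_screenshot = b"\x89PNG fake image bytes"

message = [
    text_part("Read this screenshot of an error message and explain it."),
    image_part(fake_screenshot),
]
print([part.kind for part in message])  # ['text', 'image']
```

In the consumer apps you never see this; pasting the screenshot builds the equivalent for you.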

09 · Before you climb

Self-check.

If you can't answer each of these in one or two sentences without scrolling up, re-read the section before climbing to Tier 2.

What's the difference between AI, ML, and an LLM?

In one sentence, how does an LLM produce a response?

What's a token? What's a context window?

Why does the AI "forget" things in a long chat?

What's the difference between training and inference?

Why do LLMs hallucinate, and what's the standard defense?

Name the three frontier labs and their products.

What does "multimodal" mean, and what's one thing it lets you do?