AI Training
Level 1 · Generative AI Literacy
Lesson 1.3 · Beginner · 11 min

The Boundaries

Where LLMs shine vs. where they fail, why hallucinations happen, and how to mitigate.

What you’ll be able to do
  • Tell at a glance what LLMs are great at and what they are unreliable at, with an example of each.
  • Explain why hallucinations happen (the mechanism, not just the label).
  • Identify the main limits: hallucination, bias, knowledge cutoff, weak exact math, context limits, slop.
  • Apply the right mitigation for each limit, and remember that confidence is not correctness.

Ask an AI for a source and it may invent one - perfectly formatted, completely fake.

A real case

Lawyers filed a brief citing court cases that an AI had generated. The cases didn’t exist, and the lawyers were sanctioned. The model wasn’t lying on purpose - it was doing exactly what it does: producing plausible-sounding text.

What LLMs can and can’t do

The big picture: LLMs are brilliant at working with language, and shaky whenever an answer has to be exactly, verifiably true.

Green zone - use freely

  • Drafting & rewriting - turn 5 messy bullets into a polished client email.
  • Summarizing - condense a 40-message thread into 5 takeaways.
  • Explaining simply - 'explain a mortgage to a 12-year-old.'
  • Brainstorming - 20 product name ideas.
  • Translating & reformatting - notes into a clean table.
  • Working with text you give it - pull action items from a transcript.

Red zone - verify

  • Exact facts from memory - 'what was our Q3 revenue?'
  • Fresh or real-time info - 'what's in the news today?'
  • Precise math & counting - 'how many r's in strawberry?'
  • Your private/internal data - it doesn't know your wiki or CRM.
  • Strict multi-step logic - a tricky scheduling puzzle in one shot.
  • Knowing its own limits - it answers anyway instead of 'I don't know.'

The pattern: language tasks = use freely (and skim). Exact-truth tasks = verify, or give it the facts.

Why these limits exist

1. Hallucination

Hallucination happens because the model predicts plausible next tokens rather than looking up facts. When it lacks a fact, it fills the gap with something that looks right - there’s no built-in “I checked this” step. Highest risk: niche or recent topics, exact numbers, names, citations.
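A toy sketch of that mechanism in Python. The probability table is invented for illustration (a real model learns its probabilities from text rather than using a lookup table), and `next_token` is a hypothetical helper. The point is only that generation picks a likely continuation; nothing in the code checks whether the result is true.

```python
import random

# Hypothetical, hand-written probabilities for illustration only.
# A real model learns these from text; it does not store verified facts.
NEXT_TOKEN_PROBS = {
    "The case was decided in": {"1998": 0.40, "2003": 0.35, "2011": 0.25},
}

def next_token(prompt: str, rng: random.Random) -> str:
    """Sample a continuation in proportion to its probability."""
    options = NEXT_TOKEN_PROBS[prompt]
    tokens = list(options)
    weights = list(options.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)
prompt = "The case was decided in"
completion = next_token(prompt, rng)

# Whatever year comes out looks equally fluent; no step verifies it.
print(prompt, completion)
```

Every possible output here is a well-formed sentence, which is why a fabricated citation can look just as polished as a real one.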

2. Bias

The model is trained on human-written internet text, so it absorbs human bias and over-represents majority views. Ask it to “describe a nurse and a CEO” and, unprompted, it may default the nurse to “she” and the CEO to “he.”

3. Knowledge cutoff & no live data

It only “knows” what was in its training data up to its knowledge cutoff, and it can’t see today’s news or your private documents unless it is connected to tools or search (covered in Level 8, RAG).

4. Weak at exact math & strict logic

It’s matching language patterns, not running a calculator, so it can fumble arithmetic or “count the letters” tasks - unless it works step by step or uses tools.
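A minimal Python sketch of what "uses tools" means for these two examples: ordinary code computes the answer exactly instead of predicting it. The helper name `count_letter` and the price figure are illustrative assumptions.

```python
from decimal import Decimal

def count_letter(word: str, letter: str) -> int:
    """Count occurrences of a letter exactly, ignoring case."""
    return word.lower().count(letter.lower())

r_count = count_letter("strawberry", "r")
print(r_count)  # 3

# Exact decimal arithmetic, the kind a model may fumble when predicting digits.
total = Decimal("19.99") * 3
print(total)  # 59.97
```

When a chat tool can run code like this behind the scenes, the counting and arithmetic come from the computation, not from pattern matching.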

5. Context limits

It can only hold so much text at once; very long inputs get truncated or key details get lost in the middle. Its working memory is the context window.
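A rough sketch of truncation, assuming a hypothetical window measured in whitespace-separated words. Real models count tokens with their own tokenizer and window sizes vary widely; `fit_to_window` and the limit of 8 are made up for illustration. It shows only the truncation case, not details getting lost in the middle.

```python
def fit_to_window(text: str, max_words: int) -> str:
    """Keep only the most recent words that fit in the window.

    Assumption: words stand in for tokens; real systems use a tokenizer.
    """
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[-max_words:])

long_input = (
    "The deadline is Friday. "
    + "Filler sentence here. " * 5
    + "Please confirm the deadline."
)
visible = fit_to_window(long_input, max_words=8)
print(visible)
print("Friday" in visible)  # False: the early detail fell outside the window
```

The model can only answer from what is still visible, so a detail that was cut off is effectively unknown to it.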

6. AI slop

AI slop is mass-produced, low-effort generated content. Beyond being low value, it pollutes the web - and future training data - creating a feedback loop.

Confidence is not correctness

A model can sound fully confident while the answer is wrong. Read a confident tone as style, not proof.

How to mitigate

  • Verify anything that matters against a trusted source (Lesson 2.6 teaches the technique).
  • Give it the facts - paste the source or use retrieval - instead of trusting its memory.
  • For logic/math, ask for steps or use a reasoning model/tool.
  • Read confident tone as style, not proof.
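A minimal Python sketch of the "give it the facts" mitigation: the source text is pasted into the prompt and the model is told to answer only from it. `build_grounded_prompt`, the wording of the instruction, and the sample revenue figure are all illustrative assumptions, not a prescribed template; the resulting string would be sent to whichever chat tool you use.

```python
def build_grounded_prompt(question: str, source_text: str) -> str:
    """Wrap a question with the source material the answer must come from."""
    return (
        "Answer using only the source below. "
        "If the source does not contain the answer, say 'I don't know.'\n\n"
        f"Source:\n{source_text}\n\n"
        f"Question: {question}"
    )

# Made-up example data for illustration.
source = "Q3 revenue was $4.2M, up 8% from Q2."
prompt = build_grounded_prompt("What was our Q3 revenue?", source)
print(prompt)
```

Because the figure now sits in the prompt, the model is working with text you gave it (a green-zone task) instead of recalling an exact fact from memory (a red-zone task). The answer should still be checked against the source.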


Recap
  • LLMs predict plausible text, so they can hallucinate, carry bias, and miss recent or private facts.
  • By default they're weak at exact math/counting and limited by context size.
  • Verify what matters, supply the facts, and never mistake confidence for correctness.
