AI pillar · Module 2 of 6

How AI systems actually work

You don’t need to understand the maths to work with AI effectively. But knowing the basic process helps you ask better questions and spot bullshit.


2.1 The AI lifecycle (simplified)

Here’s how an AI system goes from idea to production (there’s a small code sketch after this list):

  • Step 1: Get data. Lots of it. A language model might train on hundreds of billions of words. An image recogniser needs millions of labelled photos. Data quality matters hugely—garbage in, garbage out.
  • Step 2: Train the model. The system processes all that data, adjusting millions (or billions) of internal parameters until it can recognise patterns. This takes serious computing power—training GPT-4 reportedly cost over $100 million in compute.
  • Step 3: Evaluate. Test on data the model hasn’t seen. Does it actually work? Is it biased? Does it fail in dangerous ways?
  • Step 4: Deploy. Put it into production where people can use it. Now you’re dealing with real-world edge cases you never anticipated.
  • Step 5: Monitor. Watch for problems. Models can “drift” as the real world changes. They can be gamed. They will definitely surprise you.
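
To make the rhythm concrete, here’s a minimal sketch of steps 1–3 using scikit-learn and one of its small bundled datasets. Everything here is a stand-in for illustration; real pipelines are vastly larger, but the train / hold-out / evaluate pattern is the same.

```python
# A toy pass through the lifecycle, using scikit-learn's bundled digits data.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Step 1: get data (here, ~1,800 labelled images of handwritten digits).
X, y = load_digits(return_X_y=True)

# Hold back 20% of it that the model will never see during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Step 2: train the model, fitting internal parameters to patterns in the data.
model = RandomForestClassifier(random_state=0)
model.fit(X_train, y_train)

# Step 3: evaluate on the unseen held-out data.
print(f"Held-out accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")

# Steps 4 and 5 happen outside a script like this: deploy behind an API,
# then log live predictions and alert when accuracy drifts over time.
```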

2.2 How large language models work

Since LLMs like ChatGPT are what most people interact with, let’s demystify them:

The core trick: Predict the next word. That’s literally it. Given “The cat sat on the...” the model predicts “mat” is likely. Do this over and over across trillions of words, and the model learns grammar, facts, reasoning patterns, and even some semblance of personality.
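
To make the trick concrete, here’s a toy next-word predictor built from word-pair counts over a tiny invented corpus. Real LLMs use neural networks over sub-word tokens rather than raw counts, so treat this as a sketch of the objective, not the mechanism:

```python
# A toy "predict the next word" model: count which word follows which
# in a tiny corpus, then predict the most frequent follower.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

follows = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1

def predict_next(word):
    # Most common word seen after `word`; first-seen wins on ties.
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # -> "cat"
print(predict_next("sat"))  # -> "on"
```

Scale that counting idea up to a neural network with billions of parameters and trillions of words of training text, and you get the capabilities described above.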

Why it works so well: To predict well, you need to understand context, relationships, logic, and facts. Predicting text turns out to require learning a lot about the world.

Why it also fails: The model has no way to know if something is true—it only knows what patterns appeared in training data. It will confidently generate plausible-sounding nonsense (called “hallucinations”) because it’s optimised for plausibility, not truth.
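
You can see “plausibility, not truth” directly in how generation works: the model assigns a probability to every candidate next token and samples one, and nothing in that step consults reality. The prompt, tokens, and weights below are invented for illustration:

```python
# Generation is sampling from a probability distribution over next tokens.
# Nothing in this step checks whether the completion is true.
import random

# Hypothetical model output for the prompt "The first person on Mars was ..."
candidates = {"Neil": 0.35, "Yuri": 0.30, "Buzz": 0.20, "Elon": 0.15}

token = random.choices(list(candidates), weights=list(candidates.values()))[0]
print(token)  # A fluent-sounding name, despite nobody having walked on Mars.
```

Every option above sounds plausible; none is true. That gap is where hallucinations come from.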

⚠️ Important caveat

LLMs don’t “know” things the way you do. They don’t have beliefs or understanding. They generate text that statistically resembles text written by humans who do know things. This distinction matters a lot when you’re deciding whether to trust AI output.

Free resources to go deeper