You Already Know AI – You Just Called It Search

Chapter 3 of 18 Primer · 7 min

In 2017, a team at Google published a paper that most developers never read. Eight authors. Nine pages. A title that sounds like philosophy: "Attention Is All You Need." That paper retired the architecture that had dominated natural language processing for a decade – and replaced it with the transformer. If you have used a language model in the last three years, you have been living inside the consequences of that paper.

// the crux

Modern AI is search's descendant. Tokens, attention, vector retrieval, ranking – you already wrestled with these as a search engineer, under older names. Connect the two and the whole stack stops being magic and starts being engineering.

// in one breath

Why almost everything in "modern AI" is something a search engineer already wrestled with a decade ago – under an older name.
A working search engine taken apart into the very algorithms that, renamed, became AI – autocomplete, edit distance, n-grams – each one you can run yourself, right here.
The one move search never made: generating language it had not already indexed. That is the machine itself, and Chapter 4 takes it apart.

But here is the part almost no one says when explaining AI to developers: you already understand the core concepts. You learned them in a different context, under different names. Search engine development – especially the decade between 2005 and 2015 – was applied machine learning at scale. The vocabulary was different. The problems were identical. And if you can connect the two, the entire AI stack stops being magic and starts being engineering.

before & after

The Year the Architecture Changed

Before 2017, the dominant language models – recurrent networks such as LSTMs – processed language sequentially, reading a sentence word by word and carrying a kind of rolling memory forward. The approach worked. It did not scale. Long documents, complex instructions, multi-step reasoning: all degraded predictably as context grew. The model read everything, but attended to almost nothing.

2017

// The inflection point

The transformer replaced sequential processing with parallel processing and built the whole model around attention – a mechanism, previously bolted onto sequential models, that learns which parts of an input to weight more heavily when producing an output. The model does not step through its input one word at a time. It considers all tokens in its context simultaneously and decides what matters.

The result: better language understanding, larger context windows, generalisation across domains. Everything you call "AI" today – GPT, Claude, Gemini, Mistral, Llama, Codex – runs on some variant of this architecture. The transformer is the Intel 8086 of this era: not the final word, but the foundation everything else is built on.
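
The attention step described above can be sketched in a few lines. This is a minimal illustration for a single query vector, assuming toy hand-written vectors and no learned projections or multiple heads; the function names (`attend`, `softmax`, `dot`) are illustrative, not taken from any library.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values):
    """Scaled dot-product attention for one query over all keys at once."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    # The output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output
```

Seen from the search side, the query is scored against every key, the scores become weights, and the result is a blend of values rather than a ranked list. The scoring is where the resemblance to relevance ranking lies.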

the lineage

Search Was Your First Teacher

The search engine problems solved between 2005 and 2015 were not different problems from the ones AI solves now. They were the same problems, solved with less compute and simpler models. What changed in 2017 was the scale of the solution – not the shape of the problem.

Walk through the concepts side by side. If you built on search infrastructure, or just thought carefully about how search worked, you already have the mental model:

// concept lineage: search → AI

Search (2005–2015)	What it solved	AI equivalent (2026)
Query tokenisation	Split raw text into units the system can operate on	LLM tokenisation
TF-IDF / BM25 ranking	Score documents by relevance to a query	Attention weights
Inverted index	Map terms to documents for fast retrieval	Vector database / embedding index
Query expansion	Retrieve related concepts, not just exact matches	RAG / context retrieval
PageRank / link graph	Score nodes by their connections, not just their content	Graph RAG / knowledge graphs
Personalised ranking	Adapt results to user context and history	Fine-tuning / RLHF / agent memory
Autocomplete	Predict the most likely next token given prior input	Next-token prediction (LLM core)
Session context	Maintain relevance across a multi-query session	Conversation context / in-context memory

If you understand why inverted indexes exist, you understand why vector databases exist. If you understand what query expansion solved, you understand what RAG solves. If you understand PageRank, you understand why Graph RAG retrieves better than flat keyword search for connected data. The problem space did not change. The architecture did. The scale changed dramatically. But the instinct is the same.
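
To make the inverted index / vector index pairing concrete, here is a small sketch that answers the same question both ways. The documents, the two-dimensional "embeddings" and every function name are invented for illustration. A real system would get its vectors from a trained embedding model and use an approximate nearest-neighbour index rather than a brute-force scan.

```python
import math
from collections import defaultdict

DOCS = {1: "basic shirt for men", 2: "running shoes for men", 3: "basic cotton shirt"}
# Hypothetical hand-made vectors: [shirt-ness, shoe-ness].
VECTORS = {1: [0.9, 0.1], 2: [0.1, 0.9], 3: [0.8, 0.2]}

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def keyword_lookup(index, query):
    """Return ids of documents containing every query term."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()

def cosine(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def vector_lookup(vectors, query_vector, k=2):
    """Brute-force nearest neighbours by cosine similarity."""
    ranked = sorted(vectors, key=lambda d: cosine(vectors[d], query_vector), reverse=True)
    return ranked[:k]
```

Both lookups take a query and return candidate document ids. The first matches exact terms; the second matches closeness in a vector space.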

Concepts are one thing; the journey is another. Here is what actually happens between a keystroke and a result, taken from a working search engine: two connected engines, Suggest and Result, with a learning loop underneath that quietly improves both with every query. Where a stage is a named algorithm you can run, open its “Try it live” panel and step through it yourself, right here on the page.

Pipeline stages: Capture → Language → Normalize → Suggestion Dictionaries → Words Dictionary → Results / Inventory → Learning Loop

PHASE A Search Suggest // keystroke → corrected, ranked suggestion

Capture user keywords

trigger: keystroke

A character (or partial word) lands in the search bar. The raw, unprocessed string enters the pipeline exactly as typed – including typos, casing and stray characters.

raw input “sheng…”, “クラム”, “Por”, “toukyo”

Detect language

engine: Apache LangDetect

Identify the script and the language so the right rules apply. Disambiguates languages whose scripts overlap or resemble each other – Japanese vs Chinese, Hindi vs Bengali, Urdu vs Persian – and handles cross-language intent (search in CJK, results in English).

script トッキオ · とうきょう · 東京 → all resolve to “Tokyo”
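
A rough sketch of the script-detection half of this step, using only Python's standard `unicodedata` module. It is not how the detection engine named above works; production detectors rely on statistical character models. The labels and the helper names `scripts_in` and `guess_language` are assumptions made for this example.

```python
import unicodedata

def scripts_in(text):
    """Collect coarse script labels from each character's Unicode name."""
    found = set()
    for ch in text:
        name = unicodedata.name(ch, "")
        if name.startswith("HIRAGANA"):
            found.add("hiragana")
        elif name.startswith("KATAKANA"):
            found.add("katakana")
        elif name.startswith("CJK UNIFIED IDEOGRAPH"):
            found.add("han")
        elif name.startswith("LATIN"):
            found.add("latin")
    return found

def guess_language(text):
    scripts = scripts_in(text)
    if scripts & {"hiragana", "katakana"}:
        return "ja"              # kana is a strong signal for Japanese
    if "han" in scripts:
        return "zh-or-ja"        # Han characters alone are ambiguous
    if "latin" in scripts:
        return "latin-script"    # script known, language still open
    return "unknown"
```

The ambiguous `zh-or-ja` result for 東京 is the point: script alone cannot separate Japanese from Chinese, which is why this stage needs a dedicated engine.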

Auto-correction (if applicable)

engine: Elasticsearch

Fix misspellings using N-Gram / Shingle similarity and Edit-Distance scoring. Each of the four classic error types is repaired by one edit operation: transposition, insertion, deletion or substitution.

transpose Brelin → Berlin

insert Munchen → Muenchen

delete Toukyo → Tokyo

substitute Shenghai → Shanghai
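
The four repairs above are exactly the four operations counted by an edit distance that allows transpositions. Below is a minimal sketch of the optimal string alignment variant (a restricted form of Damerau-Levenshtein). The `correct` helper and its `max_distance` threshold are illustrative assumptions, not the engine's actual configuration.

```python
def edit_distance(a, b):
    """Optimal string alignment distance: insert, delete, substitute, transpose."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # delete a character from a
                d[i][j - 1] + 1,         # insert a character into a
                d[i - 1][j - 1] + cost,  # substitute (or keep a match)
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # swap neighbours
    return d[-1][-1]

def correct(word, vocabulary, max_distance=1):
    """Return the closest vocabulary entry within max_distance, else None."""
    best = min(vocabulary, key=lambda v: edit_distance(word.lower(), v.lower()))
    if edit_distance(word.lower(), best.lower()) <= max_distance:
        return best
    return None
```

Each of the four examples is one edit away from its target, so `correct("Brelin", ["Berlin", "Muenchen", "Tokyo", "Shanghai"])` returns `"Berlin"`.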

▶ Try it live: edit distance · the typo fixer

▶ Try it live: n-grams · predicting the next word
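
As an offline stand-in for the idea, here is a minimal word-level bigram model: it counts which word follows which and predicts the most frequent continuation. The function names are illustrative, and the tiny query log in the usage note is made up.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count which word follows which across the corpus."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def predict_next(following, word, k=3):
    """Return up to k most frequent continuations of word."""
    counts = following.get(word.lower(), Counter())
    return [w for w, _ in counts.most_common(k)]
```

Trained on the hypothetical log `["men shirt basic", "men shirt formal", "men shoes", "men shirt basic"]`, `predict_next(model, "men", k=1)` returns `["shirt"]`. A language model makes the same kind of prediction with a far longer context and learned weights instead of raw counts.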

Detect abuse · root word · stop words

engine: Kafka Streams

Clean and reduce the term: strip stop words (the, to, of, and), flag slang / negative / abusive tokens, then stem to the root word so variants collapse to one canonical form. Irregular forms such as ran need a lookup (lemmatisation) rather than suffix stripping.

stem running · ran · runs → run
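
A toy version of this normalisation step, assuming a four-word stop list and a deliberately crude suffix stripper. Real stemmers such as Porter's have many more rules. The `IRREGULAR` table, the length cut-offs and the function names are assumptions for illustration only.

```python
STOP_WORDS = {"the", "to", "of", "and"}
IRREGULAR = {"ran": "run"}  # forms that suffix stripping cannot reach

def stem(word):
    """Very rough root-word reduction: lookup first, then strip a suffix."""
    word = word.lower()
    if word in IRREGULAR:
        return IRREGULAR[word]
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]
        # Collapse a doubled final consonant: "runn" -> "run".
        if word[-1] == word[-2] and word[-1] not in "aeiou":
            word = word[:-1]
    elif word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]
    return word

def normalise(query):
    """Drop stop words, then reduce each remaining token to its root."""
    return [stem(t) for t in query.lower().split() if t not in STOP_WORDS]
```

With these rules, running, ran and runs all reduce to run, matching the example above.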

Look up the Suggestion Dictionaries

3-tier cascade

The cleaned term is checked against three suggestion dictionaries in priority order. The first tier that has a match wins and returns immediately – personalised history beats regional, which beats global.

TIER 1

User history

What this person has searched & picked before.

scope · individual

TIER 2

Country history

Popular searches within the user’s region.

scope · regional

TIER 3

Global history

System-wide demand, ordered by rank & word frequency.

scope · everyone

HIT Suggestion found

Return the suggestion immediately, ranked by frequency & popularity. Pipeline ends here – fast path. → flows into Phase B.

MISS Not in any suggestion dict

No history match anywhere. Fall through to the authoritative Words Dictionary check below.
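
The cascade can be sketched as a loop over the three tiers that stops at the first hit. This sketch assumes each tier is a plain dict of word to frequency and that "a match" means a prefix match. Both are assumptions, as are the `suggest` name and its return shape.

```python
def suggest(term, user_history, country_history, global_history):
    """Return (tier, suggestions) from the first tier with a prefix match."""
    tiers = (
        ("user", user_history),
        ("country", country_history),
        ("global", global_history),
    )
    for name, frequencies in tiers:
        matches = [w for w in frequencies if w.startswith(term)]
        if matches:
            # Rank within the winning tier by stored frequency, highest first.
            ranked = sorted(matches, key=lambda w: frequencies[w], reverse=True)
            return name, ranked
    return None, []  # MISS: fall through to the Words Dictionary
```

Because the loop returns on the first tier with any match, a single entry in the user's own history outranks a far more popular global term. That is the priority order the cascade describes.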

▶ Try it live: autocomplete · Trie & TST prefix search
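
For readers following along off the page, here is a minimal trie sketch showing why prefix lookup is cheap: the walk down to the prefix costs one step per typed character, regardless of how many words are stored. The class and method names are illustrative. A ternary search tree (TST) is a more compact alternative and is not shown.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_word = False  # True if a stored word ends here

class Trie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def complete(self, prefix, limit=10):
        """Return up to `limit` stored words starting with prefix, alphabetically."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            # Push children in reverse so the smallest character is popped first.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results
```

For example, `Trie(["tokyo", "toronto", "tok", "berlin"]).complete("to")` returns `["tok", "tokyo", "toronto"]`. A production suggester would order completions by frequency rather than alphabetically.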

Fallback → Words Dictionary

authoritative lexicon

Is the cleaned term a real, valid word at all? The Words Dictionary is the source of truth that decides whether this becomes a brand-new suggestion or gets rejected as noise.

VALID Word exists

Promote it – add the term into all three suggestion dictionaries (user, country & global) so it’s instantly available next time, then return it as a suggestion.

INVALID Not a word

No suggestion. The input is treated as garbage or a nonsense string, and the suggest pipeline stops cleanly.
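
The VALID / INVALID branch can be sketched as a single function. It assumes the Words Dictionary is a set, each suggestion dictionary is a dict of word to frequency, and a newly promoted word starts with a frequency of 1. The starting value and the `fallback` name are assumptions.

```python
def fallback(term, words_dictionary, user_history, country_history, global_history):
    """Promote a valid word into every suggestion tier, or reject the input."""
    if term not in words_dictionary:
        return None                   # INVALID: no suggestion is produced
    for tier in (user_history, country_history, global_history):
        tier.setdefault(term, 1)      # seed the word; keep any existing count
    return term                       # VALID: returned as a new suggestion
```

After one successful fallback the term is present in all three tiers, so the next lookup for it takes the fast path.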

SUGGESTION READY ↓ feeds the result engine

PHASE B Search Result // suggestion → tagged inventory → ranked results

Receive keyword from Search Suggest

handoff

The clean, corrected, language-aware keyword arrives from Phase A as the trusted query seed.

seed “men shirt basic”

Fetch linked tag words

tag graph

Map the keyword onto the inventory’s tag vocabulary – the words products are labelled with, e.g. men, shirt, basic.

Match inventory & resolve context

retrieve

Fetch tagged items
Pull every product carrying the tags men + shirt + basic.

Establish context
The intersection defines intent: “basic men’s shirts” inventory context.

Return the result list

render

Emit the matched list with concrete product codes / product links – the visible search results shown to the user.
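
Phase B reduces to a set intersection over a tag index. A minimal sketch, assuming a hypothetical `TAG_INDEX` with made-up product codes and assuming that words not in the tag vocabulary are simply ignored:

```python
TAG_INDEX = {  # hypothetical tag -> product codes
    "men": {"P100", "P101", "P205"},
    "shirt": {"P100", "P101", "P300"},
    "basic": {"P100", "P300", "P410"},
}

def search_results(keyword, tag_index):
    """Intersect the product sets of every known tag in the keyword."""
    tags = [t for t in keyword.lower().split() if t in tag_index]
    if not tags:
        return []
    matched = set.intersection(*(tag_index[t] for t in tags))
    return sorted(matched)  # stable order for rendering
```

With this data, `search_results("men shirt basic", TAG_INDEX)` returns `["P100"]`, the only product carrying all three tags. Dropping "basic" widens the context to `["P100", "P101"]`.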

↻ The Learning Loop // every result makes the system smarter

Search isn’t one-and-done. Each interaction quietly upgrades the dictionaries and tag graph, so the next query for everyone gets better. This is the engine behind the keyword → word → tag promotion ladder.

Increment frequency

Every searched word that returns inventory gets its usage count bumped.

Raise word rank

When frequency crosses a threshold, the word climbs to the next ranking tier.

Record breadcrumb

On a product click, log the trail into user history & link it to that user’s other picks.

Promote word → tag

If a word’s rank reaches “tag” level, attach it to the product’s tags. The vocabulary grows itself.
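
The four steps of the loop can be sketched as one small class. The threshold values, the in-memory storage and the choice to attach a promoted word to the clicked product are all assumptions made for this example. The source specifies only the keyword → word → tag ladder.

```python
from collections import Counter

# Hypothetical cut-offs, listed in ascending order.
TIER_THRESHOLDS = {"keyword": 0, "word": 5, "tag": 20}

class LearningLoop:
    def __init__(self):
        self.frequency = Counter()  # word -> usage count
        self.breadcrumbs = {}       # user id -> list of (word, product) picks
        self.product_tags = {}      # product code -> set of tags

    def rank(self, word):
        """Highest tier whose threshold the word's frequency has reached."""
        count = self.frequency[word]
        reached = [t for t, minimum in TIER_THRESHOLDS.items() if count >= minimum]
        return reached[-1]

    def record_search(self, word, returned_inventory):
        if returned_inventory:      # only count searches that found products
            self.frequency[word] += 1

    def record_click(self, user, word, product):
        self.breadcrumbs.setdefault(user, []).append((word, product))
        if self.rank(word) == "tag":  # promote word -> tag
            self.product_tags.setdefault(product, set()).add(word)
```

Nothing here is trained in the machine-learning sense: the system improves by counting. That is the sense in which the vocabulary grows itself.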

Search-Oriented Architecture · stack behind the pipeline

Elasticsearch: auto-correct · autocomplete · suggest

Apache LangDetect: language detection

Kafka Streams: stop words · stemming · abuse filter

Inverted Index: core retrieval structure

Graph DB: alternate spellings · domain

WordNet: synonyms · semantic relations

Apache UIMA: NLP · context analysis

Postgres FTS: suggestions / recommendations

Algorithms in play

N-Gram / Shingles: fuzzy similarity

Edit Distance: typo correction

Trie / TST: prefix autocomplete

Stemming: root-word reduction

Likelihood Model: ranking suggestions

// the honest version

Most of what gets explained as AI magic is search engineering with better compute and a different name. Start there, and the rest becomes learnable.

That is the search engine, end to end: the lineage from a single keystroke, the two engines that turn it into a result, and the algorithms underneath that you can run for yourself. Every one of them predates the word "AI" by years. The one move search never made was to generate language that was not already indexed somewhere.

// carry forward

You have the lineage: modern AI is search with better compute and a new name. The one move search never made was to generate language it had not already indexed – and that single step is the whole of a language model. Chapter 4 takes it apart the same way.