Edit mode — click text · Ctrl/⌘+S downloads a copy

Lloyds Banking Group interview · Data Scientist

The Author
in the Machine

A practical guide to LLMs: from AI and machine learning to transformers, attention, retrieval, and responsible banking systems.

CONTEXTLLMs

The map

Four questions, one mental model

01What do we mean by AI and machine learning?
02What is a large language model actually doing?
03Why did transformers and attention change the game?
04How do we make LLMs useful, grounded, and safe?
^one banking sentence will follow us through the whole talk.
The Author in the MachineOrientation02 / 17

01 · The landscape

LLMs are one branch — not the whole tree

Artificial intelligence
Machine learning
Deep learning
LLMsthe author
Artificial intelligenceAny system doing something we'd call intelligent.
Machine learningLearns patterns from data, not hand-written rules.
Deep learningLearns its own features through layered networks.
LLMsA deep network for language — predict the next token, at scale.
The Author in the MachineLandscape03 / 17

01 · The road to LLMs

Language models didn't start with ChatGPT

Each era learned more of the language from data — until attention let them finally scale.

1960s–80s Rule-based hand-written language rules 1990s–2000s Statistical n-gram models learn word patterns from data 2013–2016 Neural · RNN/LSTM read text one word at a time — in sequence 2017 Transformer attention: read all words at once 2018 → LLMs scaled up to today's assistants
The Author in the MachineLineage04 / 17

02 · Meet the LLM

An LLM is autocomplete — scaled up beyond recognition

The same instinct as your phone keyboard — guess what comes next — trained on a vast amount of text. It's what sits behind ChatGPT, Copilot and Gemini.

The customer disputed the
paymentmost likely
refundless likely
chargeless likely
^scale it from one word to whole documents and the same trick can summarise, draft, classify and answer.
The Author in the MachineLLMs05 / 17

02 · Tokens

It doesn't read words. It reads tokens.

Our sentence, the way the model actually sees it — common words stay whole, rarer words break into pieces, and punctuation is its own token.

Thecustomerdisputedthepaymentbecauseitwas duplicated1 word → 2 tokens .

So 9 words ≈ 11–13 tokens — the exact split is tokenizer-specific, but tokens are the unit of reading, pricing and context limits.

The Author in the MachineTokens06 / 17

02 · Next-token prediction

All it really does: predict the next token

WritingThe customer disputed the payment because it was
duplicated0.41
unauthorised0.17
declined0.12
incorrect0.08
^pick one, add it, repeat. That loop is how it wrote the sentence above — one token at a time.
The Author in the MachineNext token07 / 17

03 · The transformer moment

Older models read in a line.
The transformer reads everything at once.

Before — one word after another
4 hops apart 12345 Thecustomerdisputedthepayment

It reads in order, so two distant words are many steps apart — the thread between them fades.

Transformer — every word, at once
1 step Thecustomerdisputedthepayment

Every word can look at every other word directly — which is what let LLMs handle long context and scale.

The Author in the MachineTransformers08 / 17

03 · Attention

Attention asks: what should this word look back at?

0.62 The customer disputed the payment because it was duplicated
“it” — the word being resolved “payment” — what it refers to thicker arc = more attention
The Author in the MachineAttention09 / 17

03 · The context window

Think of it as the model's desk

Everything it can see at once must fit on one finite desk: instructions, the conversation, retrieved evidence, and room for the reply.

Context window
System prompt — role, rules, guardrails2k
The conversation so far7k
Retrieved policy and evidence12k
Answer being written600
not on the desk = not reliably available
^this explains memory limits, document limits, and why grounding matters.
The Author in the MachineContext10 / 17

04 · The author

The model isn't the chatbot.
It's the author writing its next line.

SystemYou are a careful banking-controls assistant…
UserMap this regulation to the controls we'd expect to see.
AssistantBased on the evidence provided, the expected controls are

Prompting is setting the scene for the author: role, task, evidence, format.

The Author in the MachineAuthor11 / 17

04 · Reliability

A fluent answer is not the same as a true one

Fluent

It can produce polished, plausible language even when the right evidence is missing.

Grounded

The answer is tied to trusted evidence on the context desk, ideally with citations.

To the model, every question feels like an exam — it would rather guess than say “I don't know.” In a bank, a polished but ungrounded answer is a risk, not a gain.

The Author in the MachineReliability12 / 17

04 · Retrieval — RAG

Turn a closed-book exam into an open-book one

Retrieval-Augmented Generation indexes trusted documents, fetches the relevant passages, places them on the desk, then lets the author write a grounded answer.

Question
“what controls
apply here?”
Retrieve
search trusted
documents
Desk
relevant
evidence
Generate
answer with
citations
^not the whole library — just the right pages, at the right moment.
The Author in the MachineRAG13 / 17

04 · From model to product

The product is the system around the author

Evaluation · governance · UI
Retrieval · tools · APIs
System prompt · memory · policy
at the centre
the model — the author

A ChatGPT-style tool is prompts, retrieval, tools, memory, UI, logs, evaluations, access controls and human workflows wrapped around a model.

Production reliability lives in the wrapper.

The Author in the MachineProduct14 / 17

05 · In a bank

Where it helps — and how we keep it safe

Where LLMs fit

Customer ops — complaint triage, call summaries, colleague assist
Risk & controls — evidence gathering, policy mapping, gap analysis
Knowledge work — regulatory updates, policy Q&A, document search
Data science — code help, write-ups, stakeholder translation

How we keep it safe

Ground answers in trusted sources, with citations
Evaluate on realistic tasks and known failure modes
Keep humans in the loop for high-impact decisions
Log, monitor and audit; protect data and access
The Author in the MachineIn a bank15 / 17

The takeaway

Five things to keep

1AI is the umbrella; machine learning learns patterns from data.
2An LLM is autocomplete at scale — it predicts the next token, over and over.
3Attention is how it weighs context — that's how “it” found “the payment.”
4The model only uses what's in its weights or on its context desk.
5Real value comes from the wrapper: retrieval, tools, evaluation, governance.
^as a data scientist, the work isn't just the model — it's making the system measurable, grounded and safe.
The Author in the MachineTakeaway16 / 17

What I'd bring

Make retrieval the strength, not the ceiling

Most retrieval systems plateau at finding the right context, not writing from it — fast vector search is broad but blunt. So I'd make retrieval two-stage.

Query
question
comes in
Vector search
top ~50
fast but blunt
Reranker
re-scores →
the real top 5
LLM
grounded
answer
^a cross-encoder, fine-tuned on our language and run in-house — the piece I've not had time to build properly yet, and the first thing I'd want to.