Prompt Functions: Treating Prompts as Reusable Functions
A 2023 chat trick that quietly became how we build LLM systems
Some prompt engineering ideas age badly because the models got better. Prompt functions aged well for the opposite reason: the idea was right, and the whole industry quietly rebuilt itself around it. The premise is simple. A prompt with a name, defined inputs, and a rule for processing them is a function. Once you see prompts that way, you stop writing one-off instructions and start building a library you can call and compose.
The original trick
The Prompt Function chapter of the Prompt Engineering Guide describes the chat-window version, and it is worth understanding even though you would not ship it today. You start with a meta prompt that teaches the model the convention:

    I will be using a template to describe the function, input, and instructions
    on what to do with the input:
    function_name: [Function Name]
    input: [Input]
    rule: [Instructions on how to process the input]
    The format is function_name(input). If you understand, just answer one word with ok.

Then you define functions in that template. A translator:

    function_name: [trans_word]
    input: ["text"]
    rule: [I want you to act as an English translator, spelling corrector and
    improver. I will provide you with input forms including "text" in any
    language and you will detect the language, translate it and answer in the
    corrected version of my text, in English.]

Add, say, expand_word (which makes text more literary) and fix_english (which polishes vocabulary and phrasing). Now the interesting part. Because each function has a name, you can call them, and even nest them, right in the chat:

    fix_english(expand_word(trans_word('婆罗摩火山处于享有"千岛之国"美称的印度尼西亚...')))

The model executes the pipeline inside out: translate from Chinese, expand the result, then polish it. Functions can take multiple named parameters too. The guide's password generator example is pg(length = 10, capitalized = 1, lowercase = 5, numbers = 2, special = 1), and the model respects each constraint.
There is real insight packed in here: named encapsulation, explicit inputs, single-responsibility prompts, and composition. That is most of what a good LLM pipeline is.
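One way to make those four properties concrete is to hold each definition as data and render the chat template from it. The sketch below is only an illustration: the `PromptFunction` class and its `definition` and `call` methods are hypothetical names, not something the guide provides, and the rule text is a shortened paraphrase.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PromptFunction:
    """One entry in a personal prompt-function library (hypothetical helper)."""

    name: str
    inputs: Tuple[str, ...]
    rule: str

    def definition(self) -> str:
        """Render the definition in the three-field template from the meta prompt."""
        quoted = ", ".join(f'"{field}"' for field in self.inputs)
        return (
            f"function_name: [{self.name}]\n"
            f"input: [{quoted}]\n"
            f"rule: [{self.rule}]"
        )

    def call(self, *args: str) -> str:
        """Render a call such as trans_word('hola'), checking the argument count."""
        if len(args) != len(self.inputs):
            raise TypeError(
                f"{self.name} expects {len(self.inputs)} argument(s), got {len(args)}"
            )
        return f"{self.name}({', '.join(repr(a) for a in args)})"


trans_word = PromptFunction(
    name="trans_word",
    inputs=("text",),
    rule="Detect the language, translate to English, and return the corrected text.",
)
print(trans_word.definition())  # text to paste into the chat once
print(trans_word.call("hola"))  # text to paste for each call
```

Nothing here talks to a model. It only produces the strings you would paste into a chat, which keeps the name, inputs, and rule of each function in one reviewable place.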
Why the chat-window version does not survive contact with production
I would not build on the meta-prompt version today, for reasons that teach you something about LLM systems generally.
Everything lives in one context window. Every function definition, every call, every intermediate result shares the same conversation. Definitions drift, get overridden by later instructions, and silently degrade as the conversation grows. There is no isolation between calls.
There is no type checking and no error handling. If you call pg(10,1,5,2,1) with arguments in the wrong order, nothing warns you. The model does its best, which is the worst failure mode: wrong answers that look right.
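Once the call is built in code, the argument-order problem can be caught before any model sees it. A minimal sketch, assuming a hypothetical helper named `render_pg_call` and assuming the per-class counts are meant as minimums that must fit within the total length (the guide does not spell out that rule):

```python
def render_pg_call(
    *,  # keyword-only: render_pg_call(10, 1, 5, 2, 1) raises TypeError
    length: int,
    capitalized: int,
    lowercase: int,
    numbers: int,
    special: int,
) -> str:
    """Validate the password constraints, then render the pg(...) call text."""
    counts = {
        "capitalized": capitalized,
        "lowercase": lowercase,
        "numbers": numbers,
        "special": special,
    }
    for name, value in {"length": length, **counts}.items():
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer, got {value!r}")
    # Assumed rule: the required characters have to fit in the password.
    if sum(counts.values()) > length:
        raise ValueError("character-class counts exceed the requested length")
    return (
        f"pg(length = {length}, capitalized = {capitalized}, "
        f"lowercase = {lowercase}, numbers = {numbers}, special = {special})"
    )


print(render_pg_call(length=10, capitalized=1, lowercase=5, numbers=2, special=1))
```

Forcing keyword arguments turns a silent mix-up into a loud `TypeError`, and the range checks reject impossible requests before they cost a model call.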
Composition is uncontrolled. When the model evaluates fix_english(expand_word(trans_word(x))) internally, you never see the intermediate values. If the final output is bad, you cannot tell which stage failed. Compare that with running each stage as its own API call, where every intermediate result is loggable, testable, and cacheable.
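To make the contrast concrete, here is a rough sketch of a runner that keeps every intermediate value. The `run_pipeline` helper is a hypothetical name, and the three stage functions are plain-Python stand-ins for real model calls so the example runs without an API key.

```python
from typing import Callable, List, Tuple

Stage = Callable[[str], str]


def run_pipeline(
    stages: List[Tuple[str, Stage]], text: str
) -> Tuple[str, List[Tuple[str, str]]]:
    """Run the stages in order and record each stage's output by name."""
    trace: List[Tuple[str, str]] = []
    for name, stage in stages:
        text = stage(text)
        trace.append((name, text))  # every intermediate value is kept
    return text, trace


# Stand-ins for model-backed functions; real versions would each make one API call.
def fake_trans_word(text: str) -> str:
    return text.strip()


def fake_expand_word(text: str) -> str:
    return text + " (expanded)"


def fake_fix_english(text: str) -> str:
    return text.capitalize()


final, trace = run_pipeline(
    [
        ("trans_word", fake_trans_word),
        ("expand_word", fake_expand_word),
        ("fix_english", fake_fix_english),
    ],
    "  hello ",
)
for name, output in trace:
    print(f"{name}: {output!r}")
```

When the final output is wrong, the trace shows which stage first produced a bad value, which the nested chat call cannot do.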
What prompt functions look like in 2026
The idea survived by moving out of the chat window and into code. The honest modern implementation is a template plus a dedicated API call per function:

    from anthropic import Anthropic

    client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    def prompt_function(rule: str):
        """Turn a rule into a callable that makes one isolated API call."""

        def call(**inputs) -> str:
            # Render the named inputs as "name: value" lines for the user turn.
            rendered = "\n".join(f"{k}: {v}" for k, v in inputs.items())
            response = client.messages.create(
                model="claude-sonnet-5",
                max_tokens=1024,
                system=rule,  # the rule becomes the system prompt
                messages=[{"role": "user", "content": rendered}],
            )
            return response.content[0].text

        return call

    trans_word = prompt_function(
        "Detect the language of the given text, translate it to English, "
        "and return only the corrected translation."
    )
    fix_english = prompt_function(
        "Improve the text's vocabulary and phrasing so it reads naturally. "
        "Keep the meaning identical. Return only the improved text."
    )

    # source_text is the text to translate; it is assumed to be defined earlier.
    result = fix_english(text=trans_word(text=source_text))

Same mental model as the chat trick, but now each function has an isolated context, and you can unit test each one against a small eval set, swap models per function (a cheap model for translation, a stronger one for polish), and cache results. Each function stays small and single-purpose, which is exactly what makes the composition debuggable.
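Caching and per-function evals can both be thin wrappers around a function that takes keyword inputs and returns a string. The sketch below is one possible shape, not a prescribed one: `cached` and `pass_rate` are hypothetical helpers, and `shout` is a stand-in for a model-backed function so the example runs offline.

```python
import hashlib
import json
from typing import Callable, Dict, List, Tuple

PromptFn = Callable[..., str]


def cached(fn: PromptFn, store: Dict[str, str]) -> PromptFn:
    """Reuse the first response for identical inputs. Use one store per function."""

    def call(**inputs: str) -> str:
        payload = json.dumps(inputs, sort_keys=True, ensure_ascii=False)
        key = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        if key not in store:
            store[key] = fn(**inputs)
        return store[key]

    return call


def pass_rate(
    fn: PromptFn, cases: List[Tuple[Dict[str, str], Callable[[str], bool]]]
) -> float:
    """Fraction of eval cases whose output satisfies that case's check."""
    passed = sum(1 for inputs, check in cases if check(fn(**inputs)))
    return passed / len(cases)


# Stand-in for a model-backed function.
def shout(**inputs: str) -> str:
    return inputs["text"].upper()


shout_cached = cached(shout, store={})
cases = [
    ({"text": "hi"}, lambda out: out == "HI"),
    ({"text": "ok"}, lambda out: out.isupper()),
]
print(pass_rate(shout_cached, cases))
```

Because model output is not deterministic, eval checks tend to work better as properties (is it English, is it non-empty, does it keep the key terms) than as exact string matches.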
Three modern descendants of the same idea:
| Descendant | What it adds |
|---|---|
| Prompt templates in LangChain and friends | Versioning, variable validation, prompt registries |
| Tool calling | The model decides which function to invoke, with schema-validated arguments (see structured outputs and function calling) |
| Skills and agent instructions | Prompt functions with documentation, loaded on demand |
The line between a prompt function and a chained pipeline is thin. Once you are composing three or more functions with logic between them, you are doing prompt chaining, and the design considerations change: where to validate, where to retry, where a cheap model is enough. The guide covers chaining as its own technique, and I treat it as one too.
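As a small illustration of "where to validate, where to retry", the sketch below wraps one stage with a validation check and a bounded retry. The `call_with_retry` helper and the `flaky_translate` stand-in are hypothetical; a real stage would be a model-backed function, and the right validation rule depends on the stage.

```python
from typing import Callable

PromptFn = Callable[..., str]


def call_with_retry(
    fn: PromptFn,
    validate: Callable[[str], bool],
    *,
    attempts: int = 3,
    **inputs: str,
) -> str:
    """Call fn until validate accepts the output, up to `attempts` times."""
    last = ""
    for _ in range(attempts):
        last = fn(**inputs)
        if validate(last):
            return last
    raise ValueError(
        f"no valid output after {attempts} attempts; last output: {last!r}"
    )


# Stand-in for a stage that returns an empty string on its first call.
_responses = iter(["", "Hello, world."])


def flaky_translate(**inputs: str) -> str:
    return next(_responses)


def non_empty(output: str) -> bool:
    return bool(output.strip())


translated = call_with_retry(flaky_translate, non_empty, text="Bonjour le monde.")
print(translated)
```

Raising after the last attempt keeps a bad intermediate value from flowing into the next stage, which is usually cheaper than discovering the problem in the final output.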
When I still use the raw chat version
One place the original trick earns its keep: personal, non-programmatic workflows. If you live in a chat interface and repeat the same transformation daily (rewrite in my voice, summarize to three bullets, translate and polish), defining a few named functions at the top of a project's custom instructions is faster than building anything. The guide suggested keeping your function library in a notes app, and honestly, that aged into what custom instructions and project memory do now.
For anything a user depends on, though: real code, one call per function, logged intermediates, and a small eval per function. The encapsulation idea is the part worth keeping. The single shared context window is the part to leave in 2023.
If you want the fundamentals under this, start with the anatomy of a good prompt: every prompt function is just a well-structured prompt with a name.

Folarin Akinloye is an AI Engineer based in London, UK. He builds production-ready agentic AI systems, multi-agent architectures, and sophisticated RAG implementations, and writes about the engineering decisions behind them.