Building MCP Servers with FastMCP: A Practical Guide
The anatomy of a server, the rules that matter, stdio vs remote, the pitfalls I hit, and how to test and ship it
In the last post I explained what MCP is and how an agent uses it. This one is the build. I have shipped two MCP servers with FastMCP, a UK visa sponsor lookup and an Adzuna job search server, and most of what I know about doing it well I learned from getting them wrong first. So this is the practical version: the anatomy of a server, the rules that actually matter, local versus remote, the pitfalls that cost me time, and how to test and deploy.
FastMCP is the Python framework for building MCP servers. It turns plain functions into MCP tools and handles the protocol plumbing so you do not have to touch JSON-RPC by hand.
The anatomy of a server
A FastMCP server is smaller than you expect. You create a server object, decorate functions as tools, and run it. That is the whole shape.
# server.py
from fastmcp import FastMCP

mcp = FastMCP(name="CalculatorServer")


@mcp.tool
def add(a: int, b: int) -> int:
    """Add two integer numbers together."""
    return a + b


if __name__ == "__main__":
    mcp.run()

That is a complete, working MCP server. Both of my servers are just this shape with more tools and a real API behind them. The @mcp.tool decorator does the heavy lifting: it reads the function and exposes it to any MCP client. Notice there is no schema written by hand, no JSON anywhere. FastMCP builds all of that from the function itself, which leads straight to the first rule.
Rule 1: your type hints and docstring are the API
This is the rule everything else hangs on. When you decorate a function, FastMCP turns the function name into the tool name, the type hints into the input schema, and the docstring into the description the model reads. So your signature is not just Python, it is the contract the model sees.
That means sloppy typing produces a sloppy tool. def search(params: dict) tells the model nothing about what to pass. def search(keywords: str, location: str, salary_min: int) tells it exactly what each argument is and validates it for you. Precise types are not style, they are how the model knows how to call you.
The docstring matters just as much, because it is the only place the model learns when to use the tool. "Search jobs" is a weak description. "Search job listings by keyword, location, and salary; call get_categories first to get valid category tags" tells the model how and in what order to use it. Write docstrings for the model, not for yourself.
Before you add a single feature, get the name, the types, and the docstring right. A well-described tool with a tight schema gets called correctly. A powerful tool with a vague description gets ignored or misused.
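To make that concrete, here is a minimal sketch of the kind of information a framework can read off a function. It is not FastMCP's actual implementation: describe_tool is a hypothetical helper of mine, and search simply reuses the signature from above with a placeholder body.

```python
import inspect
from typing import get_type_hints


def search(keywords: str, location: str, salary_min: int) -> list:
    """Search job listings by keyword, location, and salary.

    Call get_categories first to get valid category tags.
    """
    return []  # placeholder body; a real tool would call the job API


def describe_tool(func) -> dict:
    """Collect name, description and parameter types from a function.

    Illustrative only: a hypothetical helper, not how FastMCP is written.
    """
    hints = get_type_hints(func)
    hints.pop("return", None)  # the return type is not an input parameter
    signature = inspect.signature(func)
    required = [
        name
        for name, param in signature.parameters.items()
        if param.default is inspect.Parameter.empty
    ]
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "parameters": {
            name: getattr(hint, "__name__", str(hint))
            for name, hint in hints.items()
        },
        "required": required,
    }


tool_info = describe_tool(search)
```

Swap the three typed parameters for a single params: dict and the parameters entry collapses to one opaque dict, which is exactly how little the model would learn about the tool.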
Rule 2: design tools, do not dump an API
The temptation when wrapping an API is to dump all of it into one giant tool with thirty parameters. Resist it. My Adzuna server has seven focused tools (search_jobs, get_categories, get_salary_histogram, and so on) instead of one adzuna tool, because the model picks a clear, single-purpose tool far more reliably than it fills in a kitchen-sink one.
A few habits that paid off:
Keep returns lean. An API might hand back a huge JSON blob, but the model pays for every token of it. Return the fields that matter, not the raw response. Bloated tool results are one of the quietest ways to make an agent slow and expensive.
Encode usage rules in the description. Adzuna's categories are country-specific, so the search_jobs description tells the model to call get_categories first. The model cannot guess your API's quirks, so spell them out where it reads.
Be honest about limits. Free-tier APIs have rate limits, and many job listings have no salary. I put those caveats in the tool descriptions so the model sets the right expectations instead of looping on empty results.
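As a rough sketch of the "keep returns lean" habit, a small helper can pick out the handful of fields the model needs before the tool returns. The field names below are assumptions about the shape of a job listing, used only for illustration, and trim_job is a hypothetical helper rather than code lifted from my server.

```python
def trim_job(raw: dict) -> dict:
    """Reduce one raw job listing to the few fields the model needs.

    The keys read from `raw` are assumed for this sketch; adjust them to
    whatever the API actually returns.
    """
    return {
        "title": raw.get("title"),
        "company": (raw.get("company") or {}).get("display_name"),
        "location": (raw.get("location") or {}).get("display_name"),
        # Many listings carry no salary, so None is a normal value here.
        "salary_min": raw.get("salary_min"),
        "url": raw.get("redirect_url"),
    }


raw_listing = {
    "title": "Data Engineer",
    "company": {"display_name": "Acme"},
    "location": {"display_name": "London"},
    "redirect_url": "https://example.com/job/1",
    "description": "x" * 5000,  # the kind of bulk you do not want to return
}
lean_listing = trim_job(raw_listing)
```

Everything not named in the helper, including the long description, never reaches the model's context.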
Secrets and configuration
Both of my servers talk to external APIs, which means credentials, which means do not hardcode them. Read them from the environment, and ship a .env.example so anyone cloning the repo knows what to set.
import os
from fastmcp import FastMCP

mcp = FastMCP(name="AdzunaJobs")

# Raises KeyError at startup if either variable is unset
APP_ID = os.environ["ADZUNA_APP_ID"]
APP_KEY = os.environ["ADZUNA_APP_KEY"]

This is the same lesson from the security section of the last post, applied at the smallest scale. Secrets live in the environment, never in the code, and never in the repo.
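One small refinement, sketched here under my own assumptions: a plain os.environ lookup fails with a bare KeyError, so a tiny helper can stop the server at startup with a message that names the missing variable. require_env is a hypothetical name, not part of FastMCP.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        # Treat unset and empty the same way: both mean "not configured".
        raise RuntimeError(
            f"Missing required environment variable: {name} (see .env.example)"
        )
    return value
```

In the server above this would replace the direct lookups, for example APP_ID = require_env("ADZUNA_APP_ID").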
stdio vs remote: how your server is reached
MCP servers run in one of two modes, and FastMCP makes switching between them a one-line change.
stdio is the default. The client launches your server as a local subprocess and talks to it over standard input and output. No network, no ports, no auth. This is what Claude Desktop and Cursor expect, and it is how you run a server on your own machine.
if __name__ == "__main__":
    mcp.run()  # stdio by default

HTTP (streamable) turns your server into a web service at a URL that many clients can hit at once. This is what you want for anything hosted and shared.
if __name__ == "__main__":
    mcp.run(transport="http", host="127.0.0.1", port=8000)
    # now serving at http://localhost:8000/mcp

One thing worth flagging, because I used the older approach in my first server: MCP also has an SSE transport, but it is legacy now. For any new remote server, use transport="http" (streamable HTTP), not SSE. SSE only exists for backward compatibility with older clients.
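Because the two modes differ only in the arguments passed to run(), one way to keep a single entry point is to build those arguments from the environment. This is a sketch under my own assumptions: MCP_TRANSPORT, MCP_HOST and MCP_PORT are variable names I made up for the example, and run_options is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def run_options(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for mcp.run() from environment settings.

    The variable names are invented for this sketch. An empty result
    means run() is called with no arguments, which is stdio.
    """
    env = os.environ if env is None else env
    if env.get("MCP_TRANSPORT", "stdio") == "http":
        return {
            "transport": "http",
            "host": env.get("MCP_HOST", "127.0.0.1"),
            "port": int(env.get("MCP_PORT", "8000")),
        }
    return {}


# In server.py this would be used as: mcp.run(**run_options())
```

The same file then runs as a local stdio server by default and as an HTTP server when the deployment sets the transport variable.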
For local clients, you wire the server into a config file. Claude Desktop, for example:
{
  "mcpServers": {
    "uk-visa-sponsors": {
      "command": "fastmcp",
      "args": ["run", "/absolute/path/to/server.py"]
    }
  }
}

The pitfalls I actually hit
These are the ones that cost me real time, so you can skip the lesson.
Never print to stdout in a stdio server. This one is brutal the first time. With stdio, stdout is the protocol channel. A stray print() for debugging injects garbage into the JSON-RPC stream and breaks the connection in ways that look like a mystery. Log to stderr instead, or use the client's logging. If your stdio server "just stops working," look for a print.
Watch the size of what you return. My first instinct was to return the whole API response. That ballooned the context and slowed everything down. Trim to what the model needs.
Vague tools get misused. Every time a tool behaved oddly, the fix was almost always the description or the types, not the logic. The model is only as good as what you told it about the tool.
Mind blocking calls. FastMCP servers are async under the hood. A slow synchronous HTTP call blocks the server. Use an async HTTP client, or at least know you are paying for the block.
Too many tools, or overlapping ones. If two tools could plausibly answer the same request, the model dithers. Keep each tool's job distinct.
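Two of these pitfalls have small standard-library remedies, sketched below. Logging is pointed at stderr so stdout stays clean, and a blocking function is pushed onto a worker thread with asyncio.to_thread. The thread approach is a swapped-in alternative to an async HTTP client, for cases where you are stuck with a synchronous library. slow_lookup and the logger name are hypothetical stand-ins.

```python
import asyncio
import logging
import sys
import time

# Send log output to stderr so stdout stays reserved for the protocol stream.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logger = logging.getLogger("my_server")  # hypothetical logger name


def slow_lookup(query: str) -> str:
    """Stand-in for a blocking call such as a synchronous HTTP request."""
    time.sleep(0.05)
    return query.upper()


async def lookup(query: str) -> str:
    """Run the blocking call in a worker thread so the event loop stays free."""
    logger.info("looking up %s", query)
    return await asyncio.to_thread(slow_lookup, query)
```

An async tool body can then await lookup(...) instead of calling the blocking function directly, and any debugging output goes through logger rather than print().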
Testing without a chat client
You do not need to wire the server into Claude to test it. FastMCP can connect a client straight to your server object in memory, which makes for fast, normal pytest tests. This is the tight loop I wish I had used from the start.
import pytest
from fastmcp.client import Client

from my_server import mcp


@pytest.fixture
async def client():
    async with Client(mcp) as c:
        yield c


async def test_tools_are_exposed(client):
    tools = await client.list_tools()
    assert len(tools) == 7


async def test_add(client):
    result = await client.call_tool("add", {"a": 2, "b": 3})
    assert result.data == 5

Set asyncio_mode = "auto" under [tool.pytest.ini_options] in your pyproject.toml so you do not have to decorate every test. That option belongs to the pytest-asyncio plugin, so it needs to be installed. For interactive poking, fastmcp dev server.py launches the MCP Inspector, a browser UI where you can call tools by hand and see exactly what comes back. I use the Inspector to explore and pytest to lock behaviour down.
Deploying it
How you ship depends on who uses it.
For local use, fastmcp run server.py is enough. The CLI finds your server instance automatically and can add dependencies on the fly with --with, which is handy.
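For a hosted server, containerise it. Both of my repos have a Dockerfile, and the pattern is the usual one: build the image, run it exposing the HTTP port. Inside the container the server has to bind to 0.0.0.0 rather than 127.0.0.1, otherwise the published port cannot reach it.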
docker build -t adzuna-job-search-mcp .
docker run -p 8000:8000 adzuna-job-search-mcp

If you want others to install it trivially, publish to PyPI. My Adzuna server is on PyPI as adzuna-mcp, so anyone can run it with uvx adzuna-mcp without cloning anything. Registries like Smithery give it a discovery home too. And when you go to HTTP, add a health check route so your platform can tell the server is alive:
from starlette.requests import Request
from starlette.responses import PlainTextResponse


@mcp.custom_route("/health", methods=["GET"])
async def health_check(request: Request) -> PlainTextResponse:
    return PlainTextResponse("OK")

Wrapping up
A FastMCP server is a handful of decorated functions and a run() call. The craft is in the details: precise types and clear docstrings, because they are your API; focused tools with lean returns; secrets in the environment; stdio for local and streamable HTTP for remote; and a real test loop with the in-memory client. Avoid the stdout trap, keep payloads small, and ship it with Docker or PyPI. If it helps to see finished examples, both of mine are open source: the UK visa sponsor server and the Adzuna job search server.
Next in this series, I turn to something every system here eventually needs and most teams put off too long: evaluating your agents with LangSmith, and why it is the tool I keep reaching for to do it properly.
Folarin Akinloye is an AI Engineer based in London, UK. He builds production-ready agentic AI systems, multi-agent architectures, and sophisticated RAG implementations, and writes about the engineering decisions behind them.