MCP under the hood

Real-time query example
If you have been following AI development, you have probably heard of the Model Context Protocol, or MCP for short. It is a simple, open protocol that lets LLMs call external tools and access external data, extending their capabilities.
In this post, we'll walk through a small example to show how MCP works.
We'll implement a simple weather MCP server that lets users query real-time weather conditions for any location from the Claude Desktop app.
LLMs cannot query real-time data on their own, so if you ask Claude "Current weather in NYC?" without MCP (and with web search turned off), it will return something like this:
And with MCP, it can now answer the query:
How does it work?
From a high-level perspective, there are several components: the Host (the Claude Desktop app), the MCP Client, the MCP Server, and the external weather API.
It's important to note that in this example, the MCP Client and the MCP Server both live locally on the user's desktop. The Claude Desktop app is the MCP host, meaning the MCP Client runs inside the app.
The MCP Server provides two functions that return real-time information:
- get_alerts: Get weather alerts for a US state. It requires one argument, state, a two-letter US state code (e.g., CA, NY).
- get_forecast: Get a weather forecast for a location. It requires two arguments, latitude and longitude, the coordinates of the location.
For each function, the MCP server calls a corresponding weather API endpoint, which is just a normal HTTP endpoint.
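Here is a minimal sketch of what such a server can look like, built with the Python MCP SDK's FastMCP helper. The choice of the US National Weather Service API (api.weather.gov) and the response formatting are assumptions for illustration; a real server would add error handling.

import httpx
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")

NWS_API = "https://api.weather.gov"  # assumed data source; it expects a User-Agent header
HEADERS = {"User-Agent": "weather-mcp-example/1.0", "Accept": "application/geo+json"}

@mcp.tool()
async def get_alerts(state: str) -> str:
    """Get weather alerts for a US state (two-letter code, e.g. CA, NY)."""
    async with httpx.AsyncClient(headers=HEADERS) as client:
        data = (await client.get(f"{NWS_API}/alerts/active/area/{state}")).json()
    alerts = [f["properties"]["headline"] for f in data.get("features", [])]
    return "\n---\n".join(alerts) or "No active alerts."

@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
    """Get weather forecast for a location."""
    async with httpx.AsyncClient(headers=HEADERS) as client:
        # The NWS API first maps coordinates to a gridpoint forecast URL
        points = (await client.get(f"{NWS_API}/points/{latitude},{longitude}")).json()
        forecast = (await client.get(points["properties"]["forecast"])).json()
    periods = forecast["properties"]["periods"][:2]
    return "\n---\n".join(
        f"{p['name']}:\nTemperature: {p['temperature']}°{p['temperatureUnit']}\n"
        f"Wind: {p['windSpeed']} {p['windDirection']}\nForecast: {p['detailedForecast']}"
        for p in periods
    )

if __name__ == "__main__":
    mcp.run(transport="stdio")  # Claude Desktop talks to local servers over stdio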
Under the hood, several steps take place:
- The Claude Desktop app sends your question to the Claude LLM
- The Claude LLM analyzes the available tools and decides which ones to use
- The Claude Desktop app executes the chosen tools through the MCP server
- The results are sent back to the Claude LLM
- The Claude LLM formulates a natural-language response
- The Claude Desktop app displays the final response
A deeper view
To understand how all these components coordinate, we'll use a request diagram. Note that it is simplified and omits some protocol details.
Each request is marked on the timeline on the left, from T0 to T7.
T0 -> T1
Tool calling is natively supported by the Claude LLM; quoting the official docs:
If you include tools in your API request, the model may return tool_use content blocks that represent the model's use of those tools. You can then run those tools using the tool input generated by the model, and then optionally return results back to the model using tool_result content blocks.
In this example, the two functions are provided to the LLM in this format:
[
    {
        'name': 'get_alerts',
        'description': 'Get weather alerts for a US state.',
        'input_schema': {
            'properties': {
                'state': {'title': 'State', 'type': 'string'}
            },
            'required': ['state'],
            'title': 'get_alertsArguments',
            'type': 'object'
        }
    },
    {
        'name': 'get_forecast',
        'description': 'Get weather forecast for a location',
        'input_schema': {
            'properties': {
                'latitude': {'title': 'Latitude', 'type': 'number'},
                'longitude': {'title': 'Longitude', 'type': 'number'}
            },
            'required': ['latitude', 'longitude'],
            'title': 'get_forecastArguments',
            'type': 'object'
        }
    }
]
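A client talking to Claude directly would pass this list via the tools parameter of the Messages API. Here is a sketch with the Anthropic Python SDK, assuming the list above is stored in a variable named tools:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=tools,  # the two tool definitions shown above
    messages=[{"role": "user", "content": "What is the current temp in NYC?"}],
)
print(response.stop_reason)  # 'tool_use' when the model wants a tool to be called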
T2 -> T3
With the provided tools, the LLM knows which is the right tool for a user query. In this example, it understands that for the query "What is the current temp in NYC?" it needs to call get_forecast. Since this tool requires two arguments, latitude and longitude, the LLM knows it has to supply the coordinates of NYC, which are part of its training data.
The LLM will respond with something like:
Message(
    id='msg_***',
    content=[
        TextBlock(citations=None, text="Let me get the weather forecast for New York City. I'll use approximate coordinates for Manhattan.", type='text'),
        ToolUseBlock(id='toolu_***', input={'latitude': 40.7128, 'longitude': -74.006}, name='get_forecast', type='tool_use')
    ],
    model='claude-3-5-sonnet-20241022',
    role='assistant',
    stop_reason='tool_use',
    stop_sequence=None,
    type='message',
    usage=Usage(cache_creation_input_tokens=0, cache_read_input_tokens=0, input_tokens=542, output_tokens=97)
)
Here, the messages array sent to the LLM at T2 contained only the user's question:
[{'role': 'user', 'content': 'What is the current temp in NYC?'}]
T4 -> T5
The Claude Desktop app then uses the protocol to query the weather API, and the response is passed back to the app. By design, both the request and the response travel through the Host (the Claude Desktop app), the MCP Client, and the MCP Server.
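In code, the client side of this hop looks roughly like the following sketch, using the Python MCP SDK's stdio client; the command and file name are hypothetical:

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(command="python", args=["weather.py"])  # hypothetical

async def run_tool():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()  # protocol handshake
            result = await session.call_tool(
                "get_forecast", arguments={"latitude": 40.7128, "longitude": -74.006}
            )
            return result.content

print(asyncio.run(run_tool()))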
The response at T5 is similar to this:
This Afternoon:
Temperature: 72°F
Wind: 9 mph S
Forecast: A chance of rain showers. Partly sunny. High near 72, with temperatures falling to around 68 in the afternoon. South wind around 9 mph. Chance of precipitation is 30%.
---
Tonight:
Temperature: 62°F
Wind: 6 to 9 mph SE
Forecast: A chance of rain showers before 11pm, then showers and thunderstorms. Cloudy. Low around 62, with temperatures rising to around 64 overnight. Southeast wind 6 to 9 mph. Chance of precipitation is 100%. New rainfall amounts between 1 and 2 inches possible.
...
T6 -> T7
The Claude Desktop app will send the original query "Current weather in NYC?" along with the weather forecast from T5 to the Claude LLM. The Claude LLM now has all the information needed to formulate the final response!
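In API terms, the app appends the assistant's tool_use turn and a tool_result block to the conversation and calls the model again. A sketch, reusing client, tools, and response from the earlier snippets, with forecast_text holding the weather text returned at T5:

follow_up = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=tools,
    messages=[
        {"role": "user", "content": "What is the current temp in NYC?"},
        {"role": "assistant", "content": response.content},  # includes the tool_use block
        {
            "role": "user",
            "content": [{
                "type": "tool_result",
                "tool_use_id": "toolu_***",  # must match the id of the tool_use block
                "content": forecast_text,    # the weather text returned at T5
            }],
        },
    ],
)
print(follow_up.content[0].text)  # the natural-language answer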
And that's the whole flow of this simple example. As you can see, MCP is a lightweight mechanism that standardizes and simplifies how an LLM works with external functions and resources.
Why is MCP needed?
MCP helps standardize tools. The Claude Desktop app supports this out of the box; you just need to add the tools to its configuration file. There are many MCP servers ready for use, such as GitHub, Jira, Slack, databases, etc. You can just add them, and your Claude Desktop app gains many new capabilities!
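For example, registering our local weather server looks roughly like this in claude_desktop_config.json (the command and path are hypothetical):

{
  "mcpServers": {
    "weather": {
      "command": "python",
      "args": ["/path/to/weather.py"]
    }
  }
}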
Tools are only one of the capability types MCP defines:
- Resources: File-like data that can be read by clients (like API responses or file contents)
- Tools: Functions that can be called by the LLM (with user approval)
- Prompts: Pre-written templates that help users accomplish specific tasks
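With the same FastMCP helper used earlier, resources and prompts are registered much like tools. A sketch; the URI and the template text are illustrative:

@mcp.resource("weather://supported-states")
def supported_states() -> str:
    """A file-like piece of data the client can read."""
    return "CA, NY, WA, ..."

@mcp.prompt()
def weather_report(city: str) -> str:
    """A reusable prompt template."""
    return f"Write a short weather report for {city} using the available tools."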
In this example, the MCP client and MCP server live on the local desktop. MCP also defines an Authorization mechanism that enables remote implementations.