How to Connect OpenClaw to MCP Servers Without Losing a Weekend
Connecting OpenClaw to an MCP server should take ten minutes. In practice it eats a Saturday, because four configuration steps fail quietly and the errors don't point at the real cause.
The Model Context Protocol works fine once it's wired up. The wiring is the problem. If you've ever watched an MCP server start, advertise tools, then disappear from your assistant five minutes later — or fail to advertise anything at all — one of four issues is the cause. This post names each one and shows the pattern that fixes it.
What MCP is doing under the hood
MCP is a thin protocol that lets a language model call external tools through a separate process or service.
An MCP server exposes capabilities — read a file, query a database, fetch a calendar, send a message. Your OpenClaw client connects to the server, asks what's available, and routes tool calls to it during a conversation. The server runs as its own process, with its own credentials, and the client talks to it over a defined transport.
That separation is what makes MCP useful. It's also what makes it brittle: every piece of the handshake has to agree, and most failures happen at the seams.
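To make the handshake concrete, here is a minimal sketch of the messages a client sends, assuming the JSON-RPC 2.0 framing that MCP uses. The method names (`initialize`, `notifications/initialized`, `tools/list`, `tools/call`) come from the MCP specification. The protocol version string, the client name, and the `read_file` tool are placeholders for illustration, not values OpenClaw is known to send.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched


def request(method, params=None):
    """Build a JSON-RPC 2.0 request, the envelope MCP messages travel in."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# 1. Handshake: the client states which protocol version it speaks.
initialize = request("initialize", {
    "protocolVersion": "2025-03-26",  # placeholder: use a version your server supports
    "capabilities": {},
    "clientInfo": {"name": "example-client", "version": "0.1.0"},  # placeholder
})

# 2. After the server replies, the client confirms with a notification (no id).
initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}

# 3. Discovery: ask what the server offers.
list_tools = request("tools/list")

# 4. Invocation: call one advertised tool by name (hypothetical tool shown).
call_tool = request("tools/call", {
    "name": "read_file",
    "arguments": {"path": "notes.txt"},
})

for message in (initialize, initialized, list_tools, call_tool):
    print(json.dumps(message))
```

Each of the four failures below breaks one of these steps: transport problems stop step 1, lifecycle problems interrupt steps 3 and 4 mid-session, and auth problems surface only at step 4.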
Failure 1: Auth scoping
Auth scoping breaks when the MCP server's credentials don't match the operations the model tries to perform.
Typical symptom: the tool appears in the assistant's available list, the model calls it, and you get a 401 or 403, sometimes wrapped in a generic "tool execution failed" message that hides the real status code.
The usual cause is reusing a personal access token with broad scopes during local development, then deploying with a service token that has narrower scopes. The model still sees a tool description that mentions write operations, but the token can only read.
Pattern that works:
- Generate the production token first, before writing the tool descriptions.
- Walk the tool list and confirm each operation maps to a scope the token actually has.
- If the model needs both read and write, issue two tokens and run two MCP servers — don't grant write to a process that only needs read.
This is the same least-privilege principle covered in the OpenClaw security guide: enforce scope at the credential layer, not at the tool-description layer.
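The "walk the tool list" step is easy to automate. The sketch below assumes you keep a mapping from each tool to the scopes it needs. The tool names and scope strings are invented for illustration, and real scope names depend on the upstream API.

```python
# Hypothetical mapping: tool name -> scopes that tool needs to succeed.
TOOL_SCOPES = {
    "list_issues": {"issues:read"},
    "create_issue": {"issues:write"},
    "close_issue": {"issues:write"},
}


def missing_scopes(tool_scopes, granted):
    """Return {tool: [scopes the token lacks]} for every tool the token cannot serve."""
    granted = set(granted)
    return {
        tool: sorted(needed - granted)
        for tool, needed in tool_scopes.items()
        if not needed <= granted  # subset test: every needed scope must be granted
    }


# A read-only production token leaves both write tools uncovered.
gaps = missing_scopes(TOOL_SCOPES, ["issues:read"])
for tool, scopes in gaps.items():
    print(f"{tool}: token is missing {', '.join(scopes)}")
```

Running a check like this at server start, and refusing to advertise tools that appear in `gaps`, keeps the tool descriptions honest about what the token can do.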
Failure 2: Transport mismatch
Transport mismatch happens when the client and server disagree on how messages move between them.
MCP defines two main transports: stdio (the server is a subprocess of the client) and HTTP/SSE (the server runs separately and the client connects over the network). Each has different lifecycle, credential, and timeout assumptions.
The common mistake is configuring the server for stdio while the client expects an HTTP endpoint, or the reverse. The connection either fails immediately with an unhelpful error or hangs until something times out.
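A mismatch like this can be caught before anything hangs. The sketch below assumes a client config shape where a stdio server entry has a `command` key and an HTTP server entry has a `url` key. That convention is common among MCP clients, but treat it as an assumption and check what your OpenClaw config actually uses.

```python
def client_transport(entry):
    """Infer the transport a client config entry implies (assumed config shape)."""
    has_command = "command" in entry
    has_url = "url" in entry
    if has_command == has_url:  # both present, or neither
        raise ValueError("entry must set exactly one of 'command' or 'url'")
    return "stdio" if has_command else "http"


def check_transport(entry, server_transport):
    """Return a readable error if client and server disagree, else None."""
    expected = client_transport(entry)
    if expected != server_transport:
        return f"mismatch: client expects {expected}, server speaks {server_transport}"
    return None


# The client launches a subprocess, but the server was started in HTTP mode.
entry = {"command": "python", "args": ["server.py"]}
print(check_transport(entry, "http"))
```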
Quick decision table:
| Scenario | Transport | Why |
|---|---|---|
| Local-only tool, single user | stdio | No network surface; auth inherits process env |
| Tool shared across machines or users | HTTP/SSE | Server runs once; multiple clients connect |
| Tool calls a remote API and benefits from connection reuse | HTTP/SSE | Long-lived process can hold connection pools |
| Tool needs filesystem access on the user's machine | stdio | Subprocess inherits the user's permissions and paths |
If you don't know which one you need, start with stdio. It has fewer moving parts and the failure modes are easier to read.
Failure 3: Process lifecycle
Lifecycle failures happen when the MCP server crashes, hangs, or gets killed and nothing notices until the next tool call.
For stdio servers, the client owns the lifecycle: it launches the subprocess and tears it down on exit. That's fine until the subprocess dies on its own — an unhandled exception, an OOM kill, a runaway recursion — and the client doesn't restart it.
For HTTP servers, you own the lifecycle. The server has to survive reboots, log rotations, and the laptop closing. Most people run it under node or python directly in a terminal during development and forget that production needs launchd, systemd, or a hosted equivalent.
The symptom is identical in both cases: the assistant works for a while, then tools stop responding, then the model starts hallucinating that it called them successfully.
Working pattern:
- Run the server under a supervisor (launchd on macOS, systemd on Linux) with automatic restart.
- Log to a known location and rotate the logs.
- Expose a health endpoint and check it at session start if the server runs remotely.
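In production, launchd or systemd should do the restarting. The policy they apply is simple enough to sketch, though, and seeing it spelled out helps when choosing restart settings. This is an illustration of restart-with-backoff, not a replacement for a real supervisor. The `run` and `sleep` parameters exist only so the loop can be exercised without spawning processes.

```python
import subprocess
import time


def supervise(cmd, max_restarts=5, base_delay=1.0, max_delay=60.0,
              run=subprocess.call, sleep=time.sleep):
    """Run cmd, restarting it with exponential backoff when it exits non-zero.

    Returns the number of restarts performed once the command exits cleanly.
    Raises RuntimeError once max_restarts restarts have been used up.
    """
    restarts = 0
    while True:
        code = run(cmd)  # blocks until the server process exits
        if code == 0:
            return restarts
        if restarts >= max_restarts:
            raise RuntimeError(
                f"{cmd!r} exited with {code}; giving up after {restarts} restarts"
            )
        # Back off 1s, 2s, 4s, ... capped at max_delay, so a crash loop
        # does not hammer the upstream API or fill the logs.
        sleep(min(max_delay, base_delay * 2 ** restarts))
        restarts += 1
```

The cap on restarts matters as much as the restart itself: a server that dies instantly on every launch usually has a configuration problem, and restarting it forever only hides that.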
Failure 4: Credential drift
Credential drift is the slow failure mode: tokens rotate, certs expire, OAuth refresh tokens hit their idle limit, and the MCP server keeps trying to use stale values.
This is the failure that's easiest to ship and hardest to debug, because the integration worked when you set it up and the error only appears weeks later, often during a demo.
Three habits that prevent it:
- Read credentials from a secret store at server start, not from a config file. Rotating the secret then only requires a restart.
- Log the credential expiry on startup so you can see at a glance when it'll fail.
- For OAuth-backed integrations, refresh proactively at 75% of the refresh window, not on the first 401.
If you're managing several MCP integrations, this becomes the single biggest source of toil. The same pattern from the assistant-config side is covered in OpenClaw without API keys.
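The three habits fit in a few lines. The sketch below makes some assumptions: an environment variable stands in for the secret store, the variable name `MCP_UPSTREAM_TOKEN` is made up, and the "refresh window" is read as the span between when a credential was issued and when it expires.

```python
import logging
import os
from datetime import datetime, timedelta, timezone

log = logging.getLogger("mcp-server")


def load_token(env_var="MCP_UPSTREAM_TOKEN"):
    """Read the secret at startup. The env var is a stand-in for a secret store."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"{env_var} is not set; refusing to start")
    return token


def log_expiry(expires_at, now=None):
    """Log when the credential expires. Logs the date only, never the secret."""
    now = now or datetime.now(timezone.utc)
    remaining = expires_at - now
    log.info("credential expires %s (in %d days)", expires_at.isoformat(), remaining.days)
    return remaining


def refresh_deadline(issued_at, expires_at, fraction=0.75):
    """Moment at which to refresh: a fixed fraction of the way to expiry."""
    return issued_at + (expires_at - issued_at) * fraction


def should_refresh(issued_at, expires_at, now):
    """True once the proactive refresh point has passed."""
    return now >= refresh_deadline(issued_at, expires_at)


issued = datetime(2025, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=40)
print(refresh_deadline(issued, expires))  # 30 days after issue
```

Refreshing at a fraction of the lifetime rather than on the first 401 means a failed refresh still leaves time to retry before anything user-facing breaks.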
A working setup template
The fastest way to get a new MCP integration running is to assemble it in this order: transport, credentials, tool list, lifecycle, logging.
- Decide transport first. Stdio for local single-user, HTTP for anything shared.
- Generate production credentials before writing tool descriptions. Scope them tight.
- Implement the tool list with the scoped credentials. If a description mentions an operation the credentials can't perform, fix one or the other before continuing.
- Wrap the server in a supervisor. Treat the dev terminal as a debugger, not a runtime.
- Log credential expiry (never the secret itself) on start, health on demand, and errors with the underlying status code.
Following this order means each step's failure modes are isolated. If transport works but credentials don't, you've narrowed the problem to one box.
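For the last item, "errors with the underlying status code" can be as simple as putting the status into the tool result instead of swallowing it. The sketch below assumes the MCP tool-result shape with a `content` list and an `isError` flag. The helper name, the hint wording, and the `create_issue` tool are illustrative.

```python
# Short hints for the statuses that most often come from scoping problems.
HINTS = {
    401: "credential missing, expired, or revoked",
    403: "credential valid but lacks the required scope",
}


def tool_error(tool, status, detail=""):
    """Build a tool result that keeps the upstream HTTP status visible."""
    parts = [f"{tool} failed: upstream returned HTTP {status}"]
    if status in HINTS:
        parts.append(f"({HINTS[status]})")
    if detail:
        parts.append(f"- {detail}")
    return {
        "content": [{"type": "text", "text": " ".join(parts)}],
        "isError": True,  # tells the client the call did not succeed
    }


result = tool_error("create_issue", 403, "token scope is issues:read")
print(result["content"][0]["text"])
```

With the status in the message, a 403 points straight at scoping and a 401 at credential drift, which is exactly the narrowing the ordered setup is meant to give you.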
When managed hosting changes the calculation
Most of the four failures above are operational, not protocol-level — they're about running processes, rotating credentials, and supervising lifecycles, not about MCP itself.
If you're running OpenClaw on a managed host, the supervisor, log rotation, and health-check pieces are already handled. Clowdbot runs each MCP server as a long-lived process under a supervisor, exposes the logs through the dashboard, and restarts on failure. Credentials live in a secret store that the server reads at startup, so rotating a token is a one-line operation instead of a redeploy.
That doesn't fix auth scoping or transport choice for you — those are still your decisions. But it removes the two operational categories (lifecycle and credential drift) that tend to bite weeks after the integration looked done.
FAQ
Do I need to run my own MCP server to use OpenClaw?
No. OpenClaw can connect to public MCP servers and to first-party servers shipped by tool vendors. You only need to run your own when the integration doesn't exist yet or when you want to expose internal systems.
Can one OpenClaw instance talk to multiple MCP servers?
Yes. The client maintains a separate connection per server, and tool names get namespaced by server in the model's view, so collisions don't matter.
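As a rough picture of what namespacing means, the sketch below prefixes each tool with its server name. The `__` separator and the server and tool names are assumptions for illustration; the exact format OpenClaw presents to the model may differ.

```python
def namespaced(servers, sep="__"):
    """Map 'server<sep>tool' -> (server, tool) so same-named tools stay distinct."""
    view = {}
    for server, tools in servers.items():
        for tool in tools:
            view[f"{server}{sep}{tool}"] = (server, tool)
    return view


# Two hypothetical servers that both expose a tool called "search".
view = namespaced({"docs": ["search", "fetch"], "tickets": ["search"]})
print(sorted(view))
```

The mapping back to `(server, tool)` is what lets the client route a call to the right connection.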
How do I debug “tool not found” when the server is clearly running?
Three checks in order: confirm the transport in the client config matches the server's, confirm the server is advertising the tool name you expect (not the function name in your code), and confirm the credentials the server uses haven't expired since the last restart.
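Those three checks can be written down as a small diagnostic, shown here as a sketch. All inputs are values you would collect by hand or from logs: the function does not talk to a server, and the parameter names are made up for this example.

```python
from datetime import datetime, timezone


def diagnose_tool_not_found(client_transport, server_transport,
                            advertised, expected, expires_at, now):
    """Return the first failing check as a message, or None if all three pass."""
    # Check 1: both sides must use the same transport.
    if client_transport != server_transport:
        return (f"transport mismatch: client uses {client_transport}, "
                f"server uses {server_transport}")
    # Check 2: the advertised name is what counts, not the function name in code.
    if expected not in advertised:
        return f"server does not advertise {expected!r}; it advertises {sorted(advertised)}"
    # Check 3: credentials may have expired since the last restart.
    if expires_at <= now:
        return f"credentials expired at {expires_at.isoformat()}"
    return None


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
print(diagnose_tool_not_found(
    "stdio", "stdio",
    advertised={"calendar.list_events"},
    expected="list_events",  # the Python function name, not the advertised name
    expires_at=datetime(2025, 12, 1, tzinfo=timezone.utc),
    now=now,
))
```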
Should I use stdio or HTTP for production?
HTTP for anything shared across machines or users, stdio for anything local to a single user. The most common mistake is sticking with stdio in production because it worked in development, then discovering it doesn't survive the client process restarting on a schedule.
What's the simplest first MCP integration to build?
A read-only filesystem server or a calendar-reader. Both have narrow scope, well-understood transport (stdio), and immediate visible value, which makes the four failure modes easy to spot before you scale up to anything that writes.