Skills, CLI, and MCP: picking the right tool layer for your AI agent

The AI agent tooling landscape has split into three camps: Skills advocates, CLI purists, and MCP enthusiasts. Scroll through any AI engineering thread and you’ll find passionate arguments for each. The problem is that most of these debates treat Skills, CLI, and MCP as competing solutions. They aren’t. They operate at entirely different layers, and the teams getting the best results from their agents are using all three — each where it fits.

What each layer actually does

Before comparing anything, it helps to be precise about what these three things are.

Skills (like Claude’s SKILL.md format) encode reusable domain knowledge, workflows, and procedures directly into an agent’s context. They’re prompt-layer instructions that stay inert until a task makes them relevant. A Skill doesn’t give the model new capabilities; it teaches the model how to use capabilities it already has. Think of them as institutional memory in a file.
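To make "institutional memory in a file" concrete, here is a minimal sketch of reading a Skill file. It assumes the common SKILL.md layout of a short metadata header (name and description) followed by free-form instructions. The skill content and the split_skill helper are hypothetical, and the parser is deliberately naive rather than a full YAML reader.

```python
# Hypothetical skill file content, for illustration only.
SKILL_TEXT = """\
---
name: pr-review
description: Checklist for reviewing pull requests in this repository.
---
# PR review

1. Confirm tests cover the changed code paths.
2. Check that staging verification ran before any production deploy.
"""


def split_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into (metadata, instructions).

    Naive parsing: assumes flat "key: value" header lines between
    the first two '---' markers.
    """
    _, header, body = text.split("---", 2)
    metadata = {}
    for line in header.strip().splitlines():
        key, _, value = line.partition(":")
        metadata[key.strip()] = value.strip()
    return metadata, body.strip()


metadata, instructions = split_skill(SKILL_TEXT)
print(metadata["name"])
print(instructions.splitlines()[0])
```

The short description is what lets an agent decide whether the longer instructions are worth loading for the task at hand.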

CLI gives an agent access to external tools through shell commands. It reuses the same git, gh, npm, curl, and custom scripts that developers already use daily. The agent composes these tools the same way a human would — piping output, chaining commands, reading stdout.
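As a rough sketch of what CLI execution looks like from the agent harness side, the helper below runs a command and hands back the exit code, stdout, and stderr. The run_cli name is hypothetical, and the Python interpreter stands in for a real tool such as git or gh so the example runs anywhere.

```python
import subprocess
import sys


def run_cli(argv: list[str], timeout: float = 30.0) -> tuple[int, str, str]:
    """Run a command without a shell and return (exit code, stdout, stderr)."""
    completed = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout
    )
    return completed.returncode, completed.stdout, completed.stderr


# Stand-in for a real tool invocation such as ["git", "status", "--short"].
code, out, err = run_cli([sys.executable, "-c", "print('hello from a tool')"])
print(code, out.strip())
```

Passing the command as a list avoids shell quoting problems. A harness that wants pipes can run a shell explicitly instead, at the cost of having to vet the command string.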

MCP (Model Context Protocol) is a standardised protocol for AI agents to call external services via structured JSON-RPC 2.0 messages. It provides a tools/list method for dynamic discovery, typed schemas for inputs and outputs, and native support for stateful sessions. Anthropic introduced it, and it’s now adopted by OpenAI, Google, and others.

The key insight: these three sit at different layers of the stack. Skills are the knowledge layer. CLI is the execution layer for existing tooling. MCP is the integration layer for structured service access. Framing them as alternatives is like asking whether you should use documentation, a terminal, or an API — the answer is obviously all three, depending on the task.

Skills vs MCP: different layers, not rivals

Skills and MCP get compared most often, but they have the least actual overlap.

A Skill encodes the analytic logic — “when reviewing a PR, check for these five things” or “our deploy pipeline requires staging verification before production.” It lives in the prompt, costs tokens only when activated, and requires zero infrastructure. An MCP server provides access — “fetch the PR diff from GitHub” or “trigger the staging deploy.” It requires a running server, schema definitions, and protocol overhead.

A practical example: say you want an agent that diagnoses build failures. MCP connects the agent to your CI system to pull logs. A Skill then encodes the diagnostic logic — which error patterns to look for, what the common root causes are for your specific stack, how to check if it’s a flaky test versus a real regression.
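The split in that example can be sketched in a few lines. Here fetch_build_log stands in for the access layer (in practice an MCP tool call or a CLI command), while the RULES table plays the role of the knowledge a Skill would carry. Every rule, pattern, and log line below is hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    label: str
    pattern: re.Pattern
    advice: str


# Knowledge layer: stack-specific diagnostic rules (all hypothetical).
RULES = [
    Rule("flaky-test", re.compile(r"timed out|connection reset", re.I),
         "Re-run once before investigating."),
    Rule("dependency", re.compile(r"ModuleNotFoundError|Cannot find module"),
         "Check the lockfile and the install step."),
    Rule("regression", re.compile(r"AssertionError"),
         "Bisect recent commits touching the failing test."),
]


def fetch_build_log(build_id: str) -> str:
    """Access layer stand-in: returns a canned log instead of calling CI."""
    return "test_checkout ... FAILED\nE   AssertionError: expected 200, got 500\n"


def diagnose(log: str) -> list[Rule]:
    """Return every rule whose pattern appears in the log."""
    return [rule for rule in RULES if rule.pattern.search(log)]


findings = diagnose(fetch_build_log("1234"))
for finding in findings:
    print(f"{finding.label}: {finding.advice}")
```

Swapping the CI system changes only fetch_build_log. Changing how the team triages failures changes only the rules.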

Skills also have a meaningful cost advantage. They activate contextually within the existing prompt window. MCP requires schema discovery overhead on every session — the agent must call tools/list, parse the schemas, and hold them in context even before making a single tool call. For workflows where the agent already has the tools it needs (file reads, shell access, web search), Skills add knowledge without adding protocol tax.
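A back-of-the-envelope way to see the schema cost is to serialise a set of tool definitions and estimate their size. The sketch below assumes roughly four characters per token, which is only a coarse rule of thumb, and uses made-up tool definitions. Real numbers depend on the tokenizer and on how verbose each schema is.

```python
import json
import math


def estimate_tokens(text: str) -> int:
    """Very rough estimate. Assumption: about 4 characters per token."""
    return math.ceil(len(text) / 4)


def make_tool(index: int) -> dict:
    """Build a hypothetical tool definition with a small input schema."""
    return {
        "name": f"tool_{index}",
        "description": "Hypothetical tool used only to size the schema payload.",
        "inputSchema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    }


def schema_overhead(tool_count: int) -> int:
    """Estimated tokens held in context before the first tool call."""
    tools = [make_tool(i) for i in range(tool_count)]
    return estimate_tokens(json.dumps(tools))


for n in (1, 10, 50):
    print(n, schema_overhead(n))
```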

The bottom line: Skills and MCP are complementary. Skills encode the how. MCP provides the reach.

CLI vs MCP: where the real debate lives

This is the comparison that actually generates useful tension. Both CLI and MCP let an agent take actions in the world, and there are real trade-offs between them.

Recent benchmarks from Jannik Reinhard put numbers to what many engineers have felt intuitively:

Dimension	CLI	MCP
Token efficiency	~33% better (TES: 202 vs 152)	Protocol overhead inflates usage, especially with many tools
Debuggability	Excellent — stdout/stderr, human-replayable	Requires MCP-specific inspectors
Stateful sessions	Difficult (requires workarounds)	Native support — great for browser sessions, databases
Tool discovery	Agent must already know what exists	tools/list for dynamic discovery
Setup speed	Fast — reuses existing scripts	Requires SDK boilerplate and schema definitions

A few of these deserve unpacking.

Token efficiency matters more than you think

A 33% token efficiency gap compounds fast. In an agentic loop where the model makes dozens of tool calls per task, that overhead translates directly into latency and cost. CLI commands return plain text that the model can parse naturally. MCP responses carry structured envelopes, and the tool schemas themselves consume context before any work begins.
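The 33% figure follows directly from the two TES values in the table, assuming a higher score is better, as the table implies:

```python
cli_tes = 202  # CLI score from the table
mcp_tes = 152  # MCP score from the table

# Relative advantage of CLI over MCP, using MCP as the baseline.
relative_gap = (cli_tes - mcp_tes) / mcp_tes
print(f"{relative_gap:.1%}")  # 32.9%
```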

The benchmarks also found that CLI completed tasks MCP structurally couldn’t handle — like memory profiling — because CLI allows selective output. An agent can run a command and pipe only the relevant lines back, rather than receiving an entire data structure through a typed schema.
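Selective output is the shell equivalent of piping through grep and head. A minimal sketch of the same idea in a harness follows. The select_lines helper and the profiler-style output are hypothetical and do not reproduce the benchmark's setup.

```python
def select_lines(output: str, keyword: str, limit: int = 20) -> list[str]:
    """Keep at most `limit` lines containing `keyword` (case-insensitive)."""
    needle = keyword.lower()
    matches = [line for line in output.splitlines() if needle in line.lower()]
    return matches[:limit]


# Hypothetical profiler output: 1000 detail lines and one summary line.
raw = "\n".join(f"frame {i}: alloc {i * 8} KiB" for i in range(1000))
raw += "\nTOTAL peak: 7992 KiB"

summary = select_lines(raw, "peak")
print(len(raw.splitlines()), "lines in,", len(summary), "line out")
print(summary[0])
```

Only the summary line reaches the model's context; the other thousand lines never cost a token.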

Debuggability is an underrated advantage

When a CLI-based agent fails, you can copy the exact command it ran, paste it into your own terminal, and see what happens. The debugging loop is identical to debugging any shell script. MCP failures require understanding the protocol layer — inspecting JSON-RPC payloads, checking server logs, verifying schema compatibility. That’s manageable, but it’s a different skill set and a longer feedback loop.
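One cheap way to keep that replay loop available is to log every command in a form that can be pasted straight into a terminal. This sketch uses shlex.join from the Python standard library. The replayable helper and the example git command are illustrative.

```python
import shlex


def replayable(argv: list[str]) -> str:
    """Render an argument list as a copy-pasteable, safely quoted shell line."""
    return shlex.join(argv)


argv = ["git", "log", "--grep", "fix flaky test", "-n", "5"]
line = replayable(argv)
print(line)  # git log --grep 'fix flaky test' -n 5
```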

MCP wins on statefulness

CLI’s weakness is persistent state across calls. If an agent needs to maintain a browser session, hold a database connection, or interact with an API that requires sequential operations on a shared resource, MCP handles this natively. Each MCP server can manage its own lifecycle and expose stateful resources that persist across tool calls. Doing this with CLI requires custom wrapper scripts and temporary files — it works, but it’s brittle.
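The difference is easiest to see with a resource that only exists while a process holds it. The sketch below is not an MCP server. It is a hypothetical DatabaseSession class that shows the property a long-lived server provides: one connection shared across many calls. An in-memory SQLite database disappears when its connection closes, so separate CLI invocations could not share it.

```python
import sqlite3


class DatabaseSession:
    """Holds one connection open across many tool calls (illustrative only)."""

    def __init__(self) -> None:
        # An in-memory database lives only as long as this connection.
        self._conn = sqlite3.connect(":memory:")

    def call(self, sql: str, params: tuple = ()) -> list[tuple]:
        """Run one statement against the shared connection."""
        cursor = self._conn.execute(sql, params)
        rows = cursor.fetchall()
        self._conn.commit()
        return rows

    def close(self) -> None:
        self._conn.close()


session = DatabaseSession()
session.call("CREATE TABLE builds (id INTEGER PRIMARY KEY, status TEXT)")
session.call("INSERT INTO builds (status) VALUES (?)", ("failed",))
rows = session.call("SELECT id, status FROM builds")
print(rows)  # [(1, 'failed')]
session.close()
```

Each call sees the effects of the previous ones because the state lives in the session object rather than in temporary files.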

When to reach for each one

Rather than abstract principles, here’s a decision framework based on how the tool will actually be used.

Use Skills when:

You want the agent to know how your project works — conventions, workflows, domain-specific logic
The knowledge is reusable across sessions (deploy procedures, code review checklists, architecture decisions)
You’re encoding expertise that doesn’t require external API calls
You want to reduce re-explaining the same context every session

Use CLI when:

A human-usable tool already exists for the task (git, gh, docker, curl, jq, custom scripts)
You need fine-grained control over output (piping, grepping, filtering)
Debuggability matters — you want to replay exactly what the agent did
You’re prototyping and want the fastest path to a working integration
The task is stateless or can be made stateless with simple file I/O

Use MCP when:

The service has no CLI (Figma, Notion, Slack, custom internal APIs)
You need stateful sessions (browser automation, database connections, multi-step API flows)
Dynamic tool discovery matters — the agent needs to learn what’s available at runtime
You’re building a multi-agent system where standardised tool interfaces reduce coordination cost
The integration needs typed schemas for safety (preventing malformed API calls to production systems)
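The three checklists above can be condensed into a small routing function. This is a simplification for illustration: the Task fields and the pick_layer name are hypothetical, and real decisions will weigh more factors than four booleans.

```python
from dataclasses import dataclass


@dataclass
class Task:
    is_knowledge_only: bool = False   # conventions, checklists, procedures
    has_cli: bool = False             # a human-usable tool already exists
    needs_state: bool = False         # browser session, open DB connection
    needs_discovery: bool = False     # tools must be learned at runtime


def pick_layer(task: Task) -> str:
    """Return 'skill', 'cli', or 'mcp' following the checklists above."""
    if task.is_knowledge_only:
        return "skill"
    if task.needs_state or task.needs_discovery or not task.has_cli:
        return "mcp"
    return "cli"


print(pick_layer(Task(is_knowledge_only=True)))            # skill
print(pick_layer(Task(has_cli=True)))                      # cli
print(pick_layer(Task(has_cli=True, needs_state=True)))    # mcp
print(pick_layer(Task(has_cli=False)))                     # mcp
```

The order of the checks mirrors the recommended defaults: knowledge first, existing tooling second, protocol integration only when state, discovery, or a missing CLI requires it.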

The practical stack

The recommendation that keeps emerging from engineers who’ve benchmarked all three: start with Skills + CLI as your defaults, and add MCP where its specific strengths are genuinely needed.

Here’s what that looks like in practice:

Skills as the foundation. Encode your project’s conventions, workflows, and domain logic in Skill files. These cost almost nothing, require no infrastructure, and make every other tool more effective because the agent understands context before taking action.
CLI as the primary execution layer. For git operations, file manipulation, running tests, calling APIs with curl, interacting with cloud CLIs — use shell commands. The agent gets the same tools humans use, with the same debuggability.
MCP for structured integrations. When you need to connect to services that don’t have CLIs, maintain stateful sessions, or provide typed tool interfaces to a multi-agent system, spin up MCP servers for those specific integrations.

This layered approach avoids the “pick one” trap. Skills make the agent smarter about your domain. CLI keeps execution simple and debuggable. MCP fills the gaps where neither can reach.

Takeaways

Skills, CLI, and MCP operate at different layers (knowledge, execution, integration) — comparing them head-to-head misses the point
CLI outperforms MCP on token efficiency (~33%), debuggability, and setup speed for most development tasks
MCP earns its place for stateful sessions, services without CLIs, and dynamic tool discovery
Skills are the cheapest, highest-leverage investment — they make every tool call more effective by giving the agent domain context
The winning pattern is Skills + CLI by default, MCP selectively — not a religious commitment to any single approach