Back to Blog
mcpindirect-prompt-injectiontool-poisoningai-pentestappsec

Best tools for testing indirect prompt injection in MCP servers

Compare the best tools for testing indirect prompt injection in MCP servers, with a focus on poisoned tool metadata, malicious tool return values, local-server risk, and regression coverage.

ByClaire Song13 min read
Pen name disclosure: Claire Song is a pen name used by the 0xClaw editorial team for articles on AppSec operations, evidence quality, and remediation workflows. It is a disclosed byline persona rather than a public individual identity.
Quick answer
Infrastructure note

Compare the best tools for testing indirect prompt injection in MCP servers, with a focus on poisoned tool metadata, malicious tool return values, local-server risk, and regression coverage.

Key takeaways
  • Best tools for testing indirect prompt injection in MCP servers should explain infrastructure choices in a way that is easy to quote, compare, and operationalize.
  • Tie architecture explanations back to how local execution, governance, and evidence handling work in practice.
  • Use official docs plus product pages so the page can rank for definitions and support AI citation.
Related next steps

Quick answer: which tool is best for testing indirect prompt injection in MCP servers?

For most teams, Promptfoo is the best starting tool for testing indirect prompt injection in MCP servers because it already models the core pattern: untrusted content lands in a variable, resource, or external content path, and the test should fail if the model follows that injected instruction. For MCP specifically, I would not run it alone. The most practical pair is Promptfoo plus MCP Inspector. Promptfoo is good at finding the bad path. Inspector is good at showing the exact tool schema, resource content, prompt template, transport, and execution result that made the failure possible.

After that, the stack gets more specialized. Agent Scan is useful when you need to inventory local MCP servers and catch tool poisoning or shadowing before a human even connects them. PyRIT is the better choice for custom multi-step attack paths. RAMPART matters when you already found a real bug and want that bug locked into CI. A proxy workflow still helps when you need proof that the injected content actually changed a downstream action.

That may sound like an annoying answer if you wanted one clean winner. It is still the honest one. MCP servers widen the blast radius because the hostile text does not have to sit in the user prompt. It can live in a prompt template, tool description, tool return value, resource body, upstream document, or another connected server. OpenAI's MCP guidance says prompt injection can lead to unintended actions and private data being sent to an external destination. OWASP's MCP cheat sheet makes the same point from the defender side: tool descriptions, schemas, and return values are all injection surfaces. If you want the broader category first, read Best tools for testing prompt injection in MCP servers. If you are comparing wider buying options, the compare hub is the faster route.

Indirect prompt injection testing tools for MCP servers

Why indirect prompt injection is different in MCP servers

Indirect prompt injection in a normal chat app is already messy. Indirect prompt injection in MCP servers is worse because the protocol adds extra places where untrusted content can steer the model.

The official MCP architecture is host, client, and server. The server exposes tools, resources, and prompts. The model sees tool descriptions in context, and often sees content returned by tools or resources too. That creates more than one place for hostile instructions to hide.

OpenAI's current MCP docs are unusually direct about the risk. The page says a prompt injection can trick the model into taking actions the user and developer never intended, including sending private data elsewhere. It also says trusting the MCP developer is not enough if the data exposed through the MCP may contain malicious or untrusted user input. That is the critical shift. With indirect prompt injection, the attacker does not need to control the MCP server itself. They may only need to control content that the server can read.

OWASP gets even more concrete. The MCP cheat sheet lists tool poisoning, shadowing, confused deputy behavior, and data exfiltration through legitimate channels as key MCP risks. It also dedicates whole sections to tool description and schema integrity, supply chain review, and prompt injection via tool return values. That is why testing this class of issue with only chat-box jailbreak prompts is a waste of time. The interesting question is not whether the model will say "HACKED." The interesting question is whether poisoned content changes which MCP tool gets called, what parameters get passed, or what data crosses the boundary after the call.

The best tools for testing indirect prompt injection in MCP servers compared

| Tool | Best fit | What it proves well | Main limitation | | --- | --- | --- | --- | | Promptfoo | Best default for most teams | Indirect attacks through untrusted variables, retrieved content, and MCP-targeted test flows | Needs a second tool for deep protocol inspection | | MCP Inspector | Best verification tool | Raw prompt templates, resource content, tool schemas, execution results, and message flow | Not a full adversarial harness by itself | | Agent Scan | Best for local MCP inventory and poisoned server discovery | Prompt injection, tool poisoning, tool shadowing, toxic flows, risky local startup commands | Scanning MCP configs may execute server commands, so it must be handled carefully | | PyRIT | Best for custom exploit chains | Multi-target attacks, stored hostile content, custom scoring, file and workflow realism | More engineering overhead | | RAMPART | Best for regression after a real finding | Repeatable CI checks around indirect injection scenarios and side effects | Better after discovery than before it | | Proxy and replay tooling | Best for side-effect evidence | Downstream requests, parameter abuse, action proof, and exact exfiltration path | Manual work, not a one-command scanner |

The short version is simple. Promptfoo finds, Inspector explains, Agent Scan inventories, PyRIT customizes, RAMPART preserves, and a proxy proves. That stack is more useful than arguing about a single gold medal.

Promptfoo is the best default because it starts with the right failure model

Promptfoo earns the top spot because it starts with the real shape of indirect prompt injection instead of a simplified version. Its indirect prompt injection plugin tells you to identify the variable that contains untrusted data, then it injects adversarial payloads into that variable and fails the test if the model follows those instructions. The docs call out the common injection points plainly: context, documents, retrieved_chunks, email_body, notes, and similar fields. That maps well to MCP workflows because many MCP-backed assistants assemble prompts from resource reads, ticket bodies, CRM notes, or tool outputs.

Promptfoo's MCP provider matters too. The provider documentation says the MCP server itself can be the target system under test, and it recommends adjacent plugins such as pii, bfla, and bola. I like that because it keeps teams from pretending indirect prompt injection is only a language problem. In MCP systems, the injected instruction often matters because it reaches a real action surface: a tool call, a data object, a read action that was over-scoped, or a write path the user did not mean to trigger.

There is also a subtler reason Promptfoo works well here. It pushes teams toward named entry points. You have to decide which variable, tool input, or content source is untrusted. That sounds boring. It is actually the whole game. Most broken MCP deployments are not broken because the model is mysterious. They are broken because nobody wrote down which inputs were hostile and which capabilities were sensitive.

Use Promptfoo first when:

  • the MCP server is already reachable in a stable local or remote form
  • the dangerous data path can be mapped to specific variables or content sources
  • you want a test artifact engineers can read and own
  • you need coverage for retrieved data poisoning, tool output poisoning, or cross-server content flow

If your team is still early, this is usually the fastest path from suspicion to proof.

MCP Inspector is the tool that keeps you honest

MCP Inspector is not the glamorous answer, but it is the one that saves time once a suspicious result appears. The official docs describe it as an interactive tool for testing and debugging MCP servers. More importantly, the features line up with the exact questions indirect prompt injection raises.

Inspector lets you choose the transport, inspect resources, review prompt templates and arguments, inspect tool schemas and descriptions, execute tools with custom inputs, and look at logs and notifications. That is almost a checklist for MCP injection triage.

Why that matters:

  • A poisoned instruction may sit in a resource body rather than in the main prompt.
  • A tool description may overstate authority or hide a risky side effect.
  • A return schema may make exfiltration look like a harmless text field.
  • A prompt template may be gluing trusted instructions and hostile content together without any separation.

Inspector helps you answer a question that red-team tools alone often cannot: what did the server really expose at the protocol layer? That is the difference between "the model behaved strangely" and "this tool description plus this return value plus this prompt assembly created an exploitable path."

I would use Inspector right after Promptfoo finds anything interesting, and I would also use it before approving unfamiliar community servers. If you are still building your MCP security workflow, keep this article open next to the broader blog library. It helps to have protocol references and testing templates in the same tab set.

Agent Scan is the best way to catch poisoned MCP components before they become normal

This is the part many teams skip because it feels more like operational hygiene than red teaming. It is still one of the highest-value steps for MCP.

OWASP explicitly recommends using mcp-scan or equivalent tooling to detect poisoned descriptions and cross-server shadowing. The tool that now shows up at that old repository path is Agent Scan from Snyk. The current repo says it can auto-discover MCP configurations, detect prompt injection, tool poisoning, tool shadowing, and toxic flows, and inspect tool descriptions by connecting to servers.

That is useful for one practical reason: local MCP sprawl is real. People install a server for GitHub, another for Slack, another from a demo repo, another from a blog post, and six weeks later nobody remembers what is on which laptop. Agent Scan is one of the few tools in this category that starts by asking a basic fleet question: what MCP components are actually installed, and what do they expose?

The caveat matters just as much as the feature list. The repo warns that scanning an MCP config can execute the commands defined in it because the scanner starts stdio MCP servers to retrieve tool descriptions. It also says teams should run scans in a sandbox or disposable environment when evaluating untrusted configs. That warning is not a footnote. It is a reminder that MCP inventory is part of the attack surface. Your security tool may need to launch the thing you are afraid of in order to inspect it.

So I would use Agent Scan for:

  • periodic inventory of MCP servers across developer environments
  • catching prompt injection and tool poisoning in server metadata early
  • surfacing risky local startup commands before a user casually approves them
  • spotting shadowing or toxic-flow problems in multi-server environments

I would not use it as my only proof of exploit. I would use it as the tool that stops poisoned MCP components from blending into the background.

PyRIT and RAMPART cover the parts that generic scanners miss

PyRIT is what I reach for when the attack path stops being neat. The framework docs say a prompt target does not have to be an LLM and note that, for cross-domain prompt injection attacks, the target might be a storage account that a later target references. That is a very MCP-friendly mental model. The malicious content may be stored in one place, then pulled into context through a different tool or server later.

That flexibility makes PyRIT useful for scenarios like:

  • a poisoned document stored by one service and retrieved through another MCP tool
  • a multi-step workflow where hostile content persists across turns
  • custom scoring based on downstream actions rather than response text
  • lab environments where you need to model a real chain instead of a single prompt

RAMPART solves a different problem. Microsoft introduced it on May 20, 2026 as an open-source framework for turning adversarial and benign agent scenarios into repeatable CI tests. Microsoft also says its strongest current coverage is around cross-prompt injection attacks. That makes it a good fit once you already know the dangerous path and want to keep it fixed.

This distinction matters:

  • PyRIT helps you build or replay the weird exploit chain.
  • RAMPART helps you make sure the weird exploit chain stays dead after the fix.

That sequence is especially important in MCP work because indirect prompt injection regressions rarely come back with the exact same payload. They come back through a new tool description, a new resource field, a wider schema, or a new server that inherits the old mistake under a different name.

How to choose the right stack for your MCP deployment

The best tool depends on where your MCP risk really lives.

If you run a small number of remote MCP servers with clear content sources, start with Promptfoo and Inspector. That gives you the best mix of discovery and explanation.

If your bigger problem is local server sprawl on employee machines, add Agent Scan early. The official MCP security guidance spends a lot of time on local-server compromise, and for good reason. A poisoned local server can start with the user's privileges, reach local files, and sit next to valuable tokens.

If your workflows are multi-step, cross-server, or file-heavy, add PyRIT. Indirect prompt injection is often a stored-content problem before it is a text-generation problem.

If you already found a serious issue, stop chasing novelty and add RAMPART. The cheapest bug to fix is the one that never quietly returns.

If you are choosing tooling around a broader rollout rather than a one-off lab exercise, review pricing and download too. The workflow around the tools matters almost as much as the tools themselves. A stack that finds issues but cannot be rerun by engineering is not finished.

FAQ

What is the best open-source tool for testing indirect prompt injection in MCP servers?

For most teams, Promptfoo is the best open-source starting point because it directly models indirect prompt injection and can target MCP systems as the thing under test rather than as background infrastructure.

Why is MCP Inspector important if I already have a red-team tool?

Because indirect prompt injection in MCP often depends on what the server actually exposed: prompt templates, tool descriptions, schemas, resources, and execution results. Inspector is often the fastest way to verify those details.

Is Agent Scan enough by itself?

No. Agent Scan is very useful for inventory, poisoning checks, and local MCP hygiene, but it is not the whole exploit proof. You still need targeted adversarial testing and, for serious findings, action-layer evidence.

When should I use PyRIT instead of Promptfoo?

Use PyRIT when the dangerous path includes stored hostile artifacts, multi-step orchestration, or custom scoring logic that a simpler harness will not express cleanly.

What should a passing MCP indirect prompt injection test prove?

A good passing test should show that hostile content reached the workflow, the model did not follow the injected instruction, no unsafe tool was chosen, no sensitive data crossed the boundary, and repeated runs stayed inside the same control limits.

Bottom line

If I had to name one tool, I would still name Promptfoo.

If I had to name the most useful pair, I would pick Promptfoo and MCP Inspector.

If I were building the full stack for an MCP-heavy team, I would add Agent Scan for inventory and poisoning checks, PyRIT for custom exploit chains, and RAMPART for regression. That is not the tidiest answer. It is the one that lines up with how MCP systems actually fail.

Ready to run your first AI pentest?

Get 0xClaw up and running in under 3 minutes. No infrastructure setup. No cloud dependency.

Continue Reading

More AI Pentest Guides

Continue through the local AI pentesting cluster with related guides on workflow, evidence, comparisons, and remediation.