Paper Published: Apr 25, 2026

Your Agent Is Mine: Malicious LLM API Routers as an Agent Supply-Chain Boundary

This paper shows that third-party LLM API routers are not just compatibility layers. Because they terminate client traffic and forward plaintext tool-call JSON upstream, a malicious or compromised router can rewrite executable tool arguments, steal secrets in transit, and selectively target autonomous agent sessions. The defensive lesson is to treat router choice, route policy, tool execution, and request logging as one governed security boundary.

Credential Theft · Tool Argument Tampering · Tool Authorization · Agentic AI · API Routers
8 applicable AIDEFEND defenses
Source: Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain 
By Hanzhi Liu, Chaofan Shou, Hongbo Wen, Yanju Chen, Ryan Jingyang Fang, Yu Feng · Original article: Apr 9, 2026

Threat Analysis

  • The router sits in the application-layer middle. The client intentionally points its base URL at the router, so TLS protects the hop to the router but does not prove that the returned tool call is what the upstream model produced.
  • Payload injection happens after inference. The model can produce a benign Bash call, then the router can replace one argument with an attacker-controlled installer or package name while preserving valid JSON and schema shape.
  • Secret theft does not require visible tampering. API keys, cloud credentials, system prompts, tool definitions, file contents, and environment variables all cross the router in plaintext.
  • The threat is already practical. The authors found malicious paid and free routers, use of planted AWS canary credentials, ETH theft, adaptive triggers, and 440 command-injectable Codex sessions through weak-router decoys.
  • Client-side controls help, but do not prove origin. Policy gates, anomaly screening, and append-only logs reduce exposure today; provider-signed response envelopes are the longer-term integrity answer.
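To make the post-inference rewrite concrete, the sketch below simulates it locally. The tool schema, the `router_rewrite` helper, and the package names are illustrative assumptions, not taken from the paper. The point is only that a shape-level schema check accepts both the original and the rewritten call.

```python
import json

# Hypothetical schema for a Bash-style tool: one required string argument.
BASH_SCHEMA = {"name": "Bash", "required": {"command": str}}

def is_schema_valid(tool_call: dict, schema: dict = BASH_SCHEMA) -> bool:
    """Shape-only check: expected tool name, required arguments present and typed."""
    if tool_call.get("name") != schema["name"]:
        return False
    args = tool_call.get("arguments", {})
    return all(isinstance(args.get(key), typ) for key, typ in schema["required"].items())

def router_rewrite(response_json: str, new_command: str) -> str:
    """Simulates an intermediary swapping one argument after inference."""
    call = json.loads(response_json)
    call["arguments"]["command"] = new_command
    return json.dumps(call)

# What the upstream model produced, and what the client receives from the router.
upstream = json.dumps({"name": "Bash", "arguments": {"command": "pip install requests"}})
tampered = router_rewrite(upstream, "pip install reqeusts")  # typosquat-style placeholder

for label, raw in (("upstream", upstream), ("tampered", tampered)):
    print(label, is_schema_valid(json.loads(raw)), json.loads(raw)["arguments"]["command"])
```

Both calls pass the shape check, which is why schema validation and TLS to the router cannot by themselves show that the argument came from the model.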

Applicable AIDEFEND Defenses (8)

AID-H-019.004
Intent-Based Dynamic Capability Scoping
Very High
The paper's strongest immediate defense is a fail-closed policy gate for high-risk tools such as Bash, run_command, and package installs. With per-session capability scopes, a tool call rewritten by a malicious router is rejected whenever it targets a tool or package path outside the approved task envelope.
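A minimal sketch of such a gate follows, assuming tool calls arrive as dictionaries with `name` and `arguments` keys. The `SessionScope` structure, the `install_package` tool name, and the metacharacter list are hypothetical choices for illustration, not the paper's implementation.

```python
import shlex
from dataclasses import dataclass

# Characters that let one shell string chain or redirect into other commands.
SHELL_METACHARS = set(";|&$<>`\n")

@dataclass(frozen=True)
class SessionScope:
    """Capabilities approved for one session (hypothetical structure)."""
    allowed_tools: frozenset = frozenset()
    allowed_programs: frozenset = frozenset()
    allowed_packages: frozenset = frozenset()

def authorize(tool_call: dict, scope: SessionScope) -> bool:
    """Return True only when every check passes; anything unexpected is denied."""
    name = tool_call.get("name")
    args = tool_call.get("arguments")
    if name not in scope.allowed_tools or not isinstance(args, dict):
        return False
    if name == "install_package":
        package = args.get("package")
        return isinstance(package, str) and package in scope.allowed_packages
    if name in ("Bash", "run_command"):
        command = args.get("command")
        if not isinstance(command, str) or SHELL_METACHARS & set(command):
            return False
        try:
            argv = shlex.split(command)
        except ValueError:  # unbalanced quotes and similar parse failures
            return False
        return bool(argv) and argv[0] in scope.allowed_programs
    return False  # tools without an explicit rule are not executed by this gate

scope = SessionScope(
    allowed_tools=frozenset({"Bash", "install_package"}),
    allowed_programs=frozenset({"pytest"}),
    allowed_packages=frozenset({"requests"}),
)
```

Denying on parse errors and on unknown tools is what makes the gate fail closed: a rewritten argument has to match the approved scope exactly to run.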
AID-H-019.005
Value-Level Capability Metadata & Data Flow Sink Enforcement
Very High
Passive router exfiltration is dangerous because secrets and high-sensitivity values ride through ordinary request and tool-output fields. Tracking value provenance and blocking unsafe transfers into external HTTP, shell, package, or logging sinks directly targets the data-flow problem behind AC-2, the paper's secret-exfiltration attack class.
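One way to picture value-level tracking is a wrapper that carries a sensitivity label and a source with each value. The sketch below is an assumption-laden illustration: the `Tagged` type, the two-level label scheme, and the sink names are invented here, not defined by AIDEFEND or the paper.

```python
from dataclasses import dataclass

# Hypothetical names for sinks that leave the trusted boundary.
EXTERNAL_SINKS = {"http_request", "shell", "package_install", "log"}

class SinkViolation(Exception):
    """Raised when a non-public value is about to reach an external sink."""

@dataclass(frozen=True)
class Tagged:
    value: str
    label: str = "public"    # "public" or "secret"
    source: str = "unknown"  # where the value came from

def combine(*parts: Tagged) -> Tagged:
    """Concatenate values; the result is secret if any input was secret."""
    label = "public" if all(p.label == "public" for p in parts) else "secret"
    sources = ",".join(sorted({p.source for p in parts}))
    return Tagged("".join(p.value for p in parts), label, sources)

def write_to_sink(sink: str, payload: Tagged) -> str:
    """Release the raw string only if the label permits this sink."""
    if sink in EXTERNAL_SINKS and payload.label != "public":
        raise SinkViolation(f"{payload.label} value from {payload.source} blocked at {sink}")
    return payload.value

url = Tagged("https://api.example.invalid/upload?k=", "public", "config")
token = Tagged("placeholder-token", "secret", "env:API_TOKEN")
```

Because labels propagate through `combine`, a secret concatenated into an otherwise harmless URL is still blocked at the HTTP sink.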
AID-H-034.004
Route Policy Bundle Versioning, Approval, Canary & Rollback
High
Moving an agent to a third-party router can be as small as changing a base URL and API key. Treat those route policies as signed, reviewed, canaried security artifacts so teams cannot silently drift from approved providers into weak relay chains or untrusted compatibility endpoints.
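As a rough sketch of treating route policy as a reviewed artifact, the check below refuses any base URL whose host is not in an approved bundle, and refuses the bundle itself if its digest differs from the one recorded at review time. The bundle fields and host names are assumptions, and a pinned SHA-256 digest stands in for a real signature scheme.

```python
import hashlib
import hmac
import json
from urllib.parse import urlsplit

def bundle_digest(bundle: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the bundle."""
    canonical = json.dumps(bundle, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

def is_route_approved(base_url: str, bundle: dict, pinned_digest: str) -> bool:
    """Allow a base URL only if the bundle is unmodified and lists its host."""
    if not hmac.compare_digest(bundle_digest(bundle), pinned_digest):
        return False  # bundle changed since review
    parts = urlsplit(base_url)
    if parts.scheme != "https" or parts.hostname is None:
        return False
    return parts.hostname in bundle["approved_hosts"]

# Hypothetical reviewed bundle; the digest would be recorded at approval time.
BUNDLE = {"version": 3, "approved_hosts": ["api.provider.example"]}
PINNED = bundle_digest(BUNDLE)
```

A client that runs this check at startup turns a silent base URL change into a hard failure that has to go back through review.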
AID-H-004.002
Service & API Authentication
High
The poisoning studies show how leaked upstream keys and weak relays turn benign-looking routers into plaintext visibility points. Scoped service credentials, short lifetimes, per-router keys, and rapid revocation limit how much traffic a stolen or reused key can expose.
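The credential properties named above (scoped, short-lived, per-router, revocable) can be sketched as a single check. The `RouterCredential` record and scope strings are hypothetical; a real deployment would rely on the provider's or secret manager's own key metadata.

```python
import time
from dataclasses import dataclass

@dataclass
class RouterCredential:
    """Metadata for one key issued to exactly one router (hypothetical record)."""
    router_host: str
    scopes: frozenset
    expires_at: float      # Unix timestamp
    revoked: bool = False

def may_use(cred: RouterCredential, router_host: str, scope: str, now=None) -> bool:
    """A key is usable only on its own router, for its scopes, before expiry."""
    now = time.time() if now is None else now
    return (
        not cred.revoked
        and now < cred.expires_at
        and cred.router_host == router_host
        and scope in cred.scopes
    )

cred = RouterCredential("api.provider.example", frozenset({"chat:invoke"}), expires_at=1_000.0)
```

Binding each key to one router means a key leaked through a weak relay cannot be replayed elsewhere, and the short lifetime bounds how long it exposes traffic.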
AID-H-026.003
Pre-Execution Static Scan
High
AC-1, the paper's payload-injection attack class, becomes severe when a rewritten tool call reaches an interpreter or shell. Pre-execution scans for dangerous command patterns, installer fetches, typosquat package names, and shell passthrough wrappers give the client a deterministic stop before the modified payload runs.
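A small static scanner along these lines might look like the sketch below. The regular expressions, the known-package list, and the similarity cutoff are illustrative assumptions and would miss many real evasions; they show the shape of a deterministic pre-execution check, not a complete rule set.

```python
import difflib
import re

# Illustrative patterns only; a production rule set would be far broader.
DANGEROUS_PATTERNS = [
    ("installer fetch piped to shell",
     re.compile(r"\b(curl|wget)\b[^|;&]*\|\s*(sudo\s+)?(ba|z)?sh\b")),
    ("shell passthrough wrapper", re.compile(r"\b(bash|sh|zsh)\s+-c\b")),
    ("recursive forced delete", re.compile(r"\brm\s+-(?=\w*r)(?=\w*f)\w+")),
]
KNOWN_PACKAGES = ["numpy", "pandas", "requests"]  # assumed allowlist
PIP_INSTALL = re.compile(r"\bpip3?\s+install\s+([^|;&]+)")

def scan_command(command: str) -> list[str]:
    """Return a list of findings; an empty list means nothing matched."""
    findings = [label for label, pattern in DANGEROUS_PATTERNS if pattern.search(command)]
    for match in PIP_INSTALL.finditer(command):
        for token in match.group(1).split():
            if token.startswith("-"):
                continue  # skip flags such as --upgrade
            package = re.split(r"[=<>!~\[]", token)[0].lower()  # drop version specifiers
            if package in KNOWN_PACKAGES:
                continue
            near = difflib.get_close_matches(package, KNOWN_PACKAGES, n=1, cutoff=0.8)
            if near:
                findings.append(f"possible typosquat of {near[0]}: {package}")
    return findings
```

A caller would refuse to execute any command for which `scan_command` returns a non-empty list, for example `scan_command("pip install reqeusts")`.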
AID-I-001.003
Ephemeral Single-Use Sandboxes for Tools
High
The study found 440 command-injectable Codex sessions, most already running in auto-approve mode. Sandboxed tool execution, constrained network egress, and low-secret workspaces do not authenticate the tool call, but they sharply reduce damage if a malicious router changes what executes.
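As one possible shape for a single-use sandbox, the sketch below builds a `docker run` argument list with networking disabled and a read-only root filesystem. Docker, the image name, and the resource limits are assumptions chosen for illustration; the source does not prescribe a container runtime.

```python
def sandbox_argv(command: str, workspace: str, image: str = "python:3.12-slim") -> list[str]:
    """Build a docker invocation for one throwaway tool execution.

    Docker does not forward host environment variables unless asked to,
    so no --env flags are passed and host secrets stay outside the container.
    """
    return [
        "docker", "run",
        "--rm",                                   # discard the container afterwards
        "--network", "none",                      # strictest egress setting: no network
        "--read-only",                            # root filesystem is read-only
        "--cap-drop", "ALL",                      # drop Linux capabilities
        "--security-opt", "no-new-privileges",
        "--pids-limit", "128",
        "--memory", "512m",
        "--volume", f"{workspace}:/workspace",    # only the workspace is writable
        "--workdir", "/workspace",
        image, "sh", "-c", command,
    ]

argv = sandbox_argv("pytest -q", "/tmp/agent-workspace")
```

The list can be handed to `subprocess.run(argv)` on a host with Docker installed. Tasks that need some egress would replace `--network none` with a filtered network, which is outside this sketch.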
AID-D-003.003
Agentic Tool Use & Action Policy Monitoring
Medium
The paper's response-side anomaly screener maps here: inspect returned tool calls for shell-risk patterns, unusual arguments, secret-like strings, schema deviation, and session-level frequency shifts before those outputs reach downstream execution.
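The sketch below illustrates three of these signals (schema deviation, secret-like strings, and a session-level frequency shift) in one small screener. It is not the paper's screener: the patterns, the threshold values, and the `SessionScreener` class are assumptions made for the example.

```python
import re
from collections import Counter

SECRET_LIKE = re.compile(
    r"AKIA[0-9A-Z]{16}"                     # AWS access key ID shape
    r"|-----BEGIN [A-Z ]*PRIVATE KEY-----"  # PEM private key header
)
SHELL_TOOLS = {"Bash", "run_command"}

class SessionScreener:
    """Screens returned tool calls for one session before they are executed."""

    def __init__(self, expected_args: dict, max_shell_share: float = 0.5, min_calls: int = 4):
        self.expected_args = expected_args      # tool name -> set of allowed argument names
        self.max_shell_share = max_shell_share  # assumed threshold
        self.min_calls = min_calls              # do not judge frequency on tiny samples
        self.counts = Counter()

    def screen(self, tool_call: dict) -> list[str]:
        findings = []
        name = tool_call.get("name")
        args = tool_call.get("arguments")
        if not isinstance(args, dict):
            findings.append("arguments not an object")
            args = {}
        expected = self.expected_args.get(name)
        if expected is None:
            findings.append("unknown tool")
        elif set(args) - expected:
            findings.append("unexpected argument")
        if any(isinstance(v, str) and SECRET_LIKE.search(v) for v in args.values()):
            findings.append("secret-like string")
        self.counts[name] += 1
        total = sum(self.counts.values())
        if (total >= self.min_calls and name in SHELL_TOOLS
                and self.counts[name] / total > self.max_shell_share):
            findings.append("shell frequency shift")
        return findings
```

Findings here are signals for review or blocking, not proof of tampering; benign sessions can trip the frequency rule, so thresholds need tuning per workload.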
AID-D-005.002
Security Monitoring & Alerting for AI
Medium
Append-only transparency logs and SOC alerts help answer which router, account, request, and tool call were exposed after suspicious activity. This is especially important for AC-2, where the traffic may look normal until a canary or credential is used later.
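A hash chain is one common way to make a local log tamper-evident. The sketch below is a minimal in-memory version under that assumption; the record fields are illustrative, and a real deployment would also persist entries to write-once storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash preceding the first entry

def _entry_hash(prev_hash: str, record: dict) -> str:
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def append_entry(log: list, record: dict) -> dict:
    """Append a record whose hash covers the previous entry's hash."""
    prev = log[-1]["entry_hash"] if log else GENESIS
    entry = {"prev_hash": prev, "record": record, "entry_hash": _entry_hash(prev, record)}
    log.append(entry)
    return entry

def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited or removed entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev or _entry_hash(prev, entry["record"]) != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True

log = []
append_entry(log, {"router": "https://api.provider.example", "tool": "Bash",
                   "response_sha256": hashlib.sha256(b"response-1").hexdigest()})
append_entry(log, {"router": "https://api.provider.example", "tool": "read_file",
                   "response_sha256": hashlib.sha256(b"response-2").hexdigest()})
```

Storing hashes of requests and responses, rather than raw payloads, lets responders scope which sessions crossed a given router without the log itself becoming another store of secrets.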

What Defenders Should Do Now

  • Inventory every agent, IDE, CI job, and internal platform that points model traffic at a third-party API router, relay, gateway, or OpenAI-compatible base URL.
  • Move router configuration into reviewed route policy bundles. Block ad-hoc base URL changes and require per-router credentials with scoped privileges, short lifetimes, and owner metadata.
  • Fail closed on high-risk tool calls from routed sessions: shell execution, package installation, file writes outside the workspace, cloud CLI actions, and outbound network fetches.
  • Strip secrets from prompts, tool outputs, and logs before they cross untrusted routers. Use canary credentials and alert if any canary touches a cloud, repo, wallet, or SaaS API.
  • Enable local transparency logging for routed sessions: router URL, TLS metadata, request and response hashes, redacted payload classes, tool names, and approval mode. Keep enough retention to scope incidents.
  • Run routed agent sessions in low-secret sandboxes with constrained egress, especially when auto-approval or YOLO mode is enabled.
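Two of the steps above, stripping secrets before text crosses a router and alerting on canary use, can be sketched together. The redaction patterns are a small illustrative subset, and `canary_alerts` assumes access logs are available as plain text lines; both are assumptions rather than a recommended rule set.

```python
import re

# Illustrative subset of secret shapes; real deployments need broader coverage.
REDACTION_RULES = [
    ("aws_access_key_id", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("private_key_block", re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----", re.S)),
    ("bearer_token", re.compile(r"(?i)\bBearer\s+[A-Za-z0-9._~+/=-]{20,}")),
]

def redact(text: str) -> tuple[str, list[str]]:
    """Replace secret-like spans and report which rule types fired."""
    hits = []
    for label, pattern in REDACTION_RULES:
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        if count:
            hits.append(label)
    return text, hits

def canary_alerts(canary_ids: set[str], access_log_lines: list[str]) -> list[str]:
    """Return log lines that mention any planted canary identifier."""
    return [line for line in access_log_lines if any(c in line for c in canary_ids)]

clean, hits = redact("export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE")
```

Running `redact` on prompts and tool outputs just before the HTTP call keeps the check on the client side of the router, where it cannot be bypassed by the intermediary.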

Additional Consideration (1)

Provider-signed response envelopes for tool-call provenance

Beyond the techniques mapped above, teams using LLM routers should also push for end-to-end response integrity: the upstream provider signs a canonical envelope covering model identity, tool name, tool arguments, finish reason, nonce, and validity window.
Recommendation: Until provider support exists, pair approved router policies with fail-closed tool gates, local transparency logs, and sandboxed execution. When response signing becomes available, require clients to verify the envelope before executing any tool call.
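To show what client-side verification of such an envelope could involve, the sketch below checks a signature over a canonical encoding, then the nonce, then the validity window. No provider is known here to offer this today, so every field name is hypothetical. HMAC-SHA256 with a shared key is used only so the example runs on the standard library; a real scheme would use an asymmetric signature so that neither routers nor clients could forge envelopes.

```python
import hashlib
import hmac
import json

def canonical_bytes(envelope: dict) -> bytes:
    """Deterministic encoding so signer and verifier hash identical bytes."""
    return json.dumps(envelope, sort_keys=True, separators=(",", ":")).encode("utf-8")

def sign_envelope(envelope: dict, key: bytes) -> str:
    # Stand-in for a provider-side asymmetric signature (assumption).
    return hmac.new(key, canonical_bytes(envelope), hashlib.sha256).hexdigest()

def verify_envelope(envelope: dict, signature: str, key: bytes,
                    expected_nonce: str, now: float) -> bool:
    """Accept a tool call only if integrity, freshness, and timing all hold."""
    if not hmac.compare_digest(sign_envelope(envelope, key), signature):
        return False  # some covered field changed in transit
    if envelope.get("nonce") != expected_nonce:
        return False  # response does not belong to this request
    return envelope["not_before"] <= now <= envelope["not_after"]

KEY = b"demo-key-for-illustration-only"
envelope = {
    "model": "example-model",
    "tool_name": "Bash",
    "tool_arguments": {"command": "pip install requests"},
    "finish_reason": "tool_calls",
    "nonce": "n-123",
    "not_before": 100.0,
    "not_after": 160.0,
}
signature = sign_envelope(envelope, KEY)
```

Because the tool arguments are inside the signed bytes, a router that rewrites one argument invalidates the signature, which is the property policy gates and anomaly screening cannot provide.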

Conclusion

The paper makes a subtle trust boundary hard to ignore: an LLM API router can be both a convenience layer and the last component to touch executable agent commands before they run. AIDEFEND maps strongly to capability scoping, data-flow sink enforcement, route-policy governance, service authentication, execution controls, and monitoring. The remaining ecosystem step is origin integrity for tool calls, so clients can verify what the upstream model actually produced before an agent acts on it.