ChatGPhish: When a Webpage Makes ChatGPT Render a Phishing Interface
Permiso's ChatGPhish shows that a normal webpage can carry instructions into ChatGPT page summarization and make the assistant render attacker-controlled Markdown as trusted-looking UI. The demonstrated payloads include phishing links, fake account alerts, QR codes, and remote images that leak request telemetry when rendered. This is narrower than generic web indirect prompt injection (IDPI): the dangerous sink is the assistant's own output renderer, where links and images inherit user trust.
Threat Analysis
- The page carries the instruction. An attacker adds Markdown-oriented prompt text to a webpage, README, or HTML page that a user later asks ChatGPT to summarize.
- The model turns content into UI. ChatGPT produces an ordinary summary, then follows the injected formatting instruction and appends a fake security alert, additional resource link, image, or QR code.
- The renderer completes the lure. Links become clickable, images are fetched, and the result appears inside ChatGPT's trusted interface rather than in the attacker's page.
- Remote media adds tracking. Permiso showed image and QR-code variants whose fetch requests can reveal the user's IP address, User-Agent, Referer (where available), and the time at which the answer was rendered.
- The boundary failure is provenance. The user sees an assistant response, but part of that response is attacker-controlled web content that survived summarization into live UI.
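Because the chain above ends at the renderer, a useful first step is to list what an assistant response would turn into live UI. The following is a minimal Python sketch, not Permiso's tooling: the function name `find_active_elements` and the example hosts are hypothetical, and the regular expression covers only simple inline Markdown links and images (no reference-style links, raw HTML, or autolinks).

```python
import re
from urllib.parse import urlparse

# Simplified pattern for inline Markdown: ![alt](url) and [text](url).
# Assumption: reference-style links, raw HTML, and autolinks are out of scope.
MD_ELEMENT = re.compile(r'(!?)\[([^\]]*)\]\(\s*<?([^\s)>]+)>?[^)]*\)')


def find_active_elements(markdown: str) -> list[dict]:
    """Return every inline link or image that a renderer would make live."""
    found = []
    for match in MD_ELEMENT.finditer(markdown):
        is_image = match.group(1) == "!"
        url = match.group(3)
        found.append({
            "kind": "image" if is_image else "link",
            "label": match.group(2),
            "url": url,
            # hostname is None for relative or malformed URLs
            "host": urlparse(url).hostname or "",
        })
    return found


if __name__ == "__main__":
    # Hypothetical assistant output after summarizing a hostile page.
    response = (
        "Here is a summary of the page.\n\n"
        "**Security alert:** [Verify your account](https://evil.example/login)\n\n"
        "![status](https://track.example/p.png?u=1)"
    )
    for element in find_active_elements(response):
        print(element["kind"], element["host"])
```

An inventory like this does not block anything by itself; it only makes visible which parts of a response become clickable or trigger a network fetch.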
Applicable AIDEFEND Defenses (6)
What Defenders Should Do Now
- Inventory every summarize-page, browser assistant, web reader, RAG preview, and copilot path that ingests third-party HTML or Markdown and then renders links or media in the assistant response.
- Demote webpage-derived Markdown before inference. Treat page links, image tags, QR-code references, and formatting instructions as untrusted fields, not as response instructions.
- Disable automatic remote media fetches from untrusted model output, or route them through a safe proxy with canonical URL checks, redirect controls, and no user-identifying headers.
- Require provenance and step-up confirmation for external links, shortened URLs, QR codes, and account-warning style UI generated from summarized pages.
- Add regression tests with fake security alerts, additional-resource links, QR codes, tracking pixels, shorteners, and hidden formatting requirements embedded in ordinary webpages.
- Log renderer decisions: which URLs were suppressed, which media were fetched, which links became active, and which response spans came from third-party web content.
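Several of the steps above (suppressing remote media, gating external links, and logging renderer decisions) can be combined in one pre-render pass. The sketch below is one possible shape in Python under stated assumptions: `gate_untrusted_markdown`, `RenderDecision`, and the allowlisted host `docs.example.com` are all hypothetical, the regular expression handles only simple inline Markdown, and a production renderer would work on a parsed document tree rather than on raw text.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would load this from policy.
TRUSTED_HOSTS = frozenset({"docs.example.com"})

# Simplified inline Markdown pattern: ![alt](url) and [text](url).
MD_ELEMENT = re.compile(r'(!?)\[([^\]]*)\]\(\s*<?([^\s)>]+)>?[^)]*\)')


@dataclass
class RenderDecision:
    kind: str    # "image" or "link"
    url: str
    action: str  # "suppressed", "demoted", or "active"


def gate_untrusted_markdown(markdown: str, trusted_hosts=TRUSTED_HOSTS):
    """Neutralize links and images in text derived from third-party content.

    Returns the rewritten text plus a log of what was done to each URL.
    """
    decisions: list[RenderDecision] = []

    def rewrite(match: re.Match) -> str:
        is_image = match.group(1) == "!"
        label, url = match.group(2), match.group(3)
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower()
        if is_image:
            # Never auto-fetch remote media named by untrusted content.
            decisions.append(RenderDecision("image", url, "suppressed"))
            return f"[image removed: {label or 'untitled'}]"
        if parsed.scheme == "https" and host in trusted_hosts:
            decisions.append(RenderDecision("link", url, "active"))
            return match.group(0)
        # Keep the label readable but drop the clickable target.
        decisions.append(RenderDecision("link", url, "demoted"))
        return f"{label} (external link, not verified: {host or 'unknown host'})"

    return MD_ELEMENT.sub(rewrite, markdown), decisions


if __name__ == "__main__":
    text = (
        "Summary. [Docs](https://docs.example.com/a) "
        "[Verify account](https://evil.example/login) "
        "![qr](https://track.example/qr.png)"
    )
    safe_text, log = gate_untrusted_markdown(text)
    print(safe_text)
    for entry in log:
        print(entry)
```

The same function can back a regression suite: feed it pages seeded with fake alerts, shorteners, and tracking pixels, then assert that no untrusted URL survives as an active link or image. Demoting rather than deleting a link keeps the summary readable while leaving the decision to follow it with the user.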
Additional consideration: assistant-rendered content provenance
Conclusion
ChatGPhish is valuable because it moves the IDPI discussion from model obedience into the output layer. The model is influenced by untrusted web content, but the higher-risk moment is when the assistant renderer turns that content into links, images, QR codes, and trusted-looking alerts. AIDEFEND maps the defense to HTML demotion, value provenance, sink enforcement, URL gating, inference-time validation, detection, and isolation between webpage reading and trusted UI rendering.