Kimi Privacy Leak Report: When a Translation Request Returns Another User's Resume
Daily Economic News reported that a Kimi user asked the model to translate a PPT image and instead received a stranger's real resume, including the person's name, phone number, email address, work history, projects, and performance details. The affected person reportedly confirmed that the resume was real.
This should not be treated as ordinary hallucination. A real document belonging to one user appearing in another user's answer points toward a failure somewhere in the AI service path: data isolation, caching, retrieval binding, temporary-object access control, asynchronous task identity, or log replay.
Threat Analysis
- The failure is privacy leakage, not just wrong text. A hallucination invents content; this report describes a real resume returned to an unrelated user.
- The likely fault line is the whole AI service path. Experts quoted in the article point to possible data isolation failure, cross-user context contamination, RAG binding mistakes, temporary object-storage access-control failures, async task ID mismatch, log replay, and misconfigured sharing or indexing paths.
- Multimodal intake creates more places to lose identity. A PPT-image translation may touch upload, OCR or parsing, task queues, temporary storage, context assembly, output generation, and logging. Every hop has to preserve the user, tenant, session, file, and authorization binding.
- Output data loss prevention (DLP) is the last chance to catch the failure. Even if the wrong resume reaches the model, names, phone numbers, emails, and work-history blocks should trigger a gate before the answer is displayed.
- Incident response needs evidence. Request IDs, file IDs, retrieval IDs, cache keys, queue job IDs, and output records are what tell responders whether this was one request, one tenant, one cache class, or a broader isolation defect.
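The output gate described in the list above can be sketched as follows. This is a minimal illustration, not a production DLP engine: the function `gate_output`, the `DlpVerdict` type, the regular expressions, and the heading list are all assumptions made for this example. The idea is that contact identifiers in the answer which never appeared in the requesting user's own input are treated as a possible cross-user leak and blocked.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only (assumption): real DLP needs locale-aware detectors.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?\d{1,3}[ -]?)?\d{3,4}[ -]?\d{3,4}[ -]?\d{4}(?!\d)"
)
# Hypothetical list of section headings that suggest a resume.
RESUME_HEADINGS = ("work experience", "work history", "education", "projects")


@dataclass(frozen=True)
class DlpVerdict:
    allow: bool
    reasons: tuple


def _digits(value: str) -> str:
    """Normalize a phone number to digits so formatting changes do not matter."""
    return re.sub(r"\D", "", value)


def gate_output(model_output: str, request_input_text: str) -> DlpVerdict:
    """Block answers containing contact data absent from the user's own input."""
    input_emails = {e.lower() for e in EMAIL_RE.findall(request_input_text)}
    input_phones = {_digits(p) for p in PHONE_RE.findall(request_input_text)}

    # Identifiers that appear in the answer but not in what the user supplied.
    new_emails = {e.lower() for e in EMAIL_RE.findall(model_output)} - input_emails
    new_phones = {_digits(p) for p in PHONE_RE.findall(model_output)} - input_phones

    lowered = model_output.lower()
    headings = [h for h in RESUME_HEADINGS if h in lowered]

    reasons = []
    if new_emails:
        reasons.append("email not present in request input")
    if new_phones:
        reasons.append("phone number not present in request input")
    if reasons and headings:
        reasons.append("resume-style sections: " + ", ".join(headings))

    # Fail toward blocking: any unexplained contact identifier stops display.
    return DlpVerdict(allow=not reasons, reasons=tuple(reasons))
```

In this sketch a translation that merely echoes the user's own contact details passes, while an answer that introduces an unrelated person's phone number or email is held back with reasons that can be attached to the incident record.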
What Defenders Should Do Now
- Trace one uploaded-file request end to end: upload ID, parser job, OCR result, temporary object, RAG chunk, cache key, model request, final output, and logs. Confirm every hop carries user, tenant, session, and authorization metadata.
- Review all prompt-response, semantic, prefix, KV, and parser-result caches. Cache keys must include tenant/user/session and authorization context; if that context is missing, disable reuse until fixed.
- Put a fail-closed authorization check before any retrieved document, uploaded file, temporary object, or replayed log line enters model context. The question should be simple: is this asset allowed for this user, this session, and this purpose?
- Add output DLP for resumes and personal records. Names plus phone numbers, emails, work-history sections, project history, or identity-number patterns should block or escalate before display.
- Run seeded cross-user isolation tests. Upload a canary resume under User A, issue unrelated translation and summarization requests under User B, and verify the canary never appears in context, retrieval, cache hits, logs, or output.
- Prepare incident forensics before the next privacy report: preserve request traces, cache decisions, retrieval decisions, and deletion/retention evidence so the response can be specific rather than speculative.
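Three of the steps above (context-scoped cache keys, a fail-closed authorization check before context assembly, and a seeded canary test) can be sketched together. Everything here is hypothetical and for illustration only: `RequestContext`, `Asset`, `scoped_cache_key`, `is_authorized`, `assemble_context`, and `canary_leaks` are names invented for this example, and a real service would bind these checks to its own identity and storage layers.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RequestContext:
    tenant_id: str
    user_id: str
    session_id: str
    purpose: str  # e.g. "translate" (assumed purpose label)


@dataclass(frozen=True)
class Asset:
    asset_id: str
    tenant_id: Optional[str]
    owner_user_id: Optional[str]
    session_id: Optional[str]
    allowed_purposes: frozenset
    text: str


def scoped_cache_key(ctx: RequestContext, content_hash: str) -> str:
    """Build a cache key that cannot collide across tenants, users, or sessions."""
    parts = (ctx.tenant_id, ctx.user_id, ctx.session_id, ctx.purpose)
    if not all(parts):
        # Missing identity context means no reuse at all.
        raise ValueError("incomplete request context; refusing to build cache key")
    raw = "\x1f".join(parts + (content_hash,))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def is_authorized(ctx: RequestContext, asset: Asset) -> bool:
    """Fail closed: a missing or mismatched binding denies access."""
    return (
        bool(asset.tenant_id) and asset.tenant_id == ctx.tenant_id
        and bool(asset.owner_user_id) and asset.owner_user_id == ctx.user_id
        and bool(asset.session_id) and asset.session_id == ctx.session_id
        and ctx.purpose in asset.allowed_purposes
    )


def assemble_context(ctx: RequestContext, candidates: list) -> list:
    """Only authorized assets may enter the model context."""
    return [a.text for a in candidates if is_authorized(ctx, a)]


def canary_leaks(ctx: RequestContext, candidates: list, canary: str) -> bool:
    """Seeded isolation test: does the canary reach this user's context?"""
    return any(canary in text for text in assemble_context(ctx, candidates))


if __name__ == "__main__":
    canary = "CANARY-RESUME-7f3a"
    user_a = RequestContext("tenant-1", "user-a", "sess-a", "translate")
    user_b = RequestContext("tenant-1", "user-b", "sess-b", "translate")
    resume = Asset("file-1", "tenant-1", "user-a", "sess-a",
                   frozenset({"translate"}), "Resume " + canary)
    print("leak to user B:", canary_leaks(user_b, [resume], canary))
```

The design choice worth noting is that both the cache key and the authorization check treat absent metadata as a denial rather than a default match, so an asset that lost its binding somewhere in the pipeline is dropped instead of being served to whoever asks next.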
Additional Consideration
- User notification and incident-scope communication
Conclusion
This Kimi report is a useful reminder that AI privacy incidents often look like model behavior but live in product architecture. The model may be the component that speaks, but the failure can sit in upload handling, OCR, temporary storage, RAG binding, cache reuse, session state, or output controls. AIDEFEND maps this case to isolation, cache integrity, data-use authorization, value-level sink checks, output DLP, and forensic session logging. The practical goal is simple: a user should never receive data unless every layer can prove that data belongs in that user's current request.