Blog Published: May 26, 2026

ChromaToast: ChromaDB Pre-Auth RCE Through Malicious Hugging Face Model Loading

HiddenLayer disclosed CVE-2026-45829, a ChromaDB Python FastAPI server flaw where an unauthenticated collection-creation request can download and execute attacker-controlled Hugging Face model code before authentication runs.

The lesson is direct: model loading is code execution. Vector databases need early authentication, restricted model provenance, unsafe-loader blocking, and isolated first-use execution.

Remote Code Execution, Malicious Models, Runtime Isolation, Vector Database, RAG Security

Source: ChromaToast Served Pre-Auth

Author: Esteban Tonglet (HiddenLayer)

Original article: May 18, 2026

Threat Analysis

The endpoint appears protected. ChromaDB marks collection creation as requiring authentication, but an unauthenticated request can still carry an embedding-function configuration that points to an attacker-controlled Hugging Face model.
The dangerous flag is trust_remote_code. When set through request-controlled kwargs, it tells the model loader to fetch and run Python code from that model repository.
The ordering is the bug. The Python server instantiates the embedding function before the authentication check, so the model is downloaded and executed before the request is rejected.
The failed API call can still compromise the server. The response may look like an error, but the attacker-controlled code has already run inside the ChromaDB process.
The blast radius is the server process. Environment variables, API keys, mounted secrets, local data, and reachable internal services can all become exposed.
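
The ordering problem can be illustrated with a toy model. The sketch below is not ChromaDB's code: ToyServer, its method names, and the config shape are hypothetical, and the "model load" only records that it happened instead of downloading anything. It shows why a request that ends in an authentication error can still have triggered the load, and how swapping two lines removes the side effect.

```python
import hmac
from dataclasses import dataclass, field


class Unauthorized(Exception):
    """Raised when a request carries no valid credential."""


@dataclass
class ToyServer:
    # Hypothetical stand-in for a collection-creation handler.
    valid_token: str
    loaded_models: list = field(default_factory=list)

    def _build_embedding_function(self, config: dict) -> None:
        # In the real flaw this step fetches and runs model code.
        # Here it only records which model would have been loaded.
        self.loaded_models.append(config["model_name"])

    def _authenticate(self, token) -> None:
        if token is None or not hmac.compare_digest(token, self.valid_token):
            raise Unauthorized("invalid or missing token")

    def create_collection_vulnerable(self, token, config: dict) -> str:
        self._build_embedding_function(config)  # side effect happens first
        self._authenticate(token)               # rejection arrives too late
        return "created"

    def create_collection_fixed(self, token, config: dict) -> str:
        self._authenticate(token)               # reject before touching config
        self._build_embedding_function(config)
        return "created"
```

With the vulnerable ordering, an unauthenticated call raises Unauthorized yet leaves an entry in loaded_models. With the fixed ordering, the same call raises Unauthorized and loaded_models stays empty.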

Applicable AIDEFEND Defenses (6)

AID-H-004.002

Service & API Authentication

Very High

This is the clearest break point. ChromaDB did have an authentication check, but it ran after configuration loading and model execution. Service and API authentication must run before any request-controlled model configuration is parsed, fetched, instantiated, or executed. If the requester is not authorized to create a collection, the server should never touch the referenced model.

AID-H-003.006

Model SBOM & Provenance Attestation

Very High

The exploit depends on a public model name and client-controlled loader flags reaching runtime without a provenance decision. A model SBOM and attestation should record the approved model bytes, digest, source, format, tokenizer, loader commit, and flags such as trust_remote_code. The ChromaDB server should load only models whose provenance and policy state were approved before the request arrived.
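
As a minimal sketch of what a provenance check might look like at load time, the snippet below compares a local artifact's SHA-256 digest against an attestation record and refuses anything whose attested loader flags do not explicitly disable remote code. The attestation layout (a dict with sha256 and trust_remote_code keys) and the function names are assumptions made for illustration, not an AIDEFEND or ChromaDB format.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Stream the file so large model artifacts are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, attestation: dict) -> bool:
    """Return True only if the digest matches and remote code is attested as disabled.

    `attestation` is a hypothetical record such as:
        {"sha256": "<hex digest>", "trust_remote_code": False}
    """
    # Fail closed: a missing flag is treated the same as an unsafe one.
    if attestation.get("trust_remote_code") is not False:
        return False
    expected = attestation.get("sha256")
    if not isinstance(expected, str):
        return False
    return hmac.compare_digest(sha256_file(path), expected)
```

A real attestation would also be signed and would cover the tokenizer, format, and loader commit listed above; this sketch checks only the digest and one flag.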

AID-H-026.001

Dangerous Construct Detection & Blocking

Very High

The high-risk construct here is not a shell command typed into the API. It is a loader path that lets trust_remote_code: true and attacker-controlled kwargs flow into AutoModel.from_pretrained(). Detection should fail closed on unsafe model-loading flags, untrusted remote-code execution, unsafe serialization paths, shell callbacks, and other code-execution behavior in model artifacts before the model is instantiated.
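
One way to fail closed on loader flags is to validate request-supplied kwargs against an allowlist before anything reaches the model loader. The sketch below is an illustration under assumptions: the allowed key names are hypothetical placeholders, and the function is not part of ChromaDB or Hugging Face. Using an allowlist instead of a deny-list means trust_remote_code is rejected along with any flag nobody has reviewed, and restricting values to scalars stops a nested dict from smuggling the flag through.

```python
# Hypothetical allowlist; a real deployment would derive this from policy review.
ALLOWED_KWARGS = frozenset({"device", "normalize_embeddings"})
SCALAR_TYPES = (str, int, float, bool)


class UnsafeLoaderConfig(ValueError):
    """Raised when request-controlled loader kwargs violate policy."""


def validate_loader_kwargs(kwargs: dict) -> dict:
    """Return a copy of kwargs if every key and value passes policy, else raise."""
    if "trust_remote_code" in kwargs:
        # Rejected regardless of value: clients do not get to set this flag.
        raise UnsafeLoaderConfig("trust_remote_code may not be set by a request")
    unknown = set(kwargs) - ALLOWED_KWARGS
    if unknown:
        raise UnsafeLoaderConfig(f"unreviewed loader kwargs: {sorted(unknown)}")
    for key, value in kwargs.items():
        if not isinstance(value, SCALAR_TYPES):
            raise UnsafeLoaderConfig(f"non-scalar value for {key!r}")
    return dict(kwargs)
```

A check like this covers only the flag path. Unsafe serialization formats and code embedded in model artifacts need separate scanning before instantiation.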

AID-H-003.002

CI/CD Release Gating, Model Artifact Signing & Secure Distribution

High

Production vector databases should not fetch arbitrary models directly from public Hugging Face namespaces because a client supplied a name. A release gate should promote only scanned, signed, digest-pinned, internally mirrored model artifacts into the runtime allowlist. Collection creation can then choose from approved model identifiers rather than turning each request into a live supply-chain decision.
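
The "approved identifiers" idea can be sketched as a lookup table that the release gate populates and the collection-creation path only reads. Everything named here is hypothetical: the identifier, the mirror path, and the placeholder digest are invented for illustration and do not refer to real artifacts.

```python
from dataclasses import dataclass


class ModelNotApproved(Exception):
    """Raised when a request names a model outside the runtime allowlist."""


@dataclass(frozen=True)
class ApprovedModel:
    mirror_path: str  # location on an internal mirror, never a public namespace
    sha256: str       # digest pinned when the release gate promoted the artifact


# Hypothetical allowlist written by the release gate, read-only at runtime.
APPROVED_MODELS = {
    "embed-small-v1": ApprovedModel(
        mirror_path="/models/embed-small-v1",
        sha256="0" * 64,  # placeholder digest for illustration only
    ),
}


def resolve_model(identifier) -> ApprovedModel:
    """Map a client-supplied identifier to a pinned internal artifact, or raise."""
    if not isinstance(identifier, str) or identifier not in APPROVED_MODELS:
        raise ModelNotApproved(f"model {identifier!r} is not on the allowlist")
    return APPROVED_MODELS[identifier]
```

Because the client picks a key and never a location, a request naming a public repository such as "attacker/model" fails the lookup before any download is attempted.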

AID-I-001.002

MicroVM & Low-Level Sandboxing

High

If a system must evaluate a new or third-party embedding model, the first load should happen in a stronger isolation boundary than the long-running vector database process. A microVM or low-level sandbox with no production secrets and no shared host filesystem reduces the chance that unexpected model code can reach the ChromaDB host, stored vectors, mounted credentials, or internal services.

AID-I-001.004

Sandbox Network Egress Restrictions

High

HiddenLayer's exploit path gives the attacker code execution inside the server process. Default-deny egress for model-loading sandboxes and vector database workloads can block reverse shells, credential exfiltration, and follow-on downloads even if malicious model code starts running.

What Defenders Should Do Now

Inventory every ChromaDB deployment, especially Python FastAPI servers with network-reachable ports. Record version, deployment path, exposed interface, authentication layer, and whether the server can reach public Hugging Face.
Prefer the Rust-based deployment path or a patched release when available. If the Python FastAPI server is still in use, restrict the ChromaDB port to trusted clients only and place it behind network policy, service authentication, and an API gateway or private service path.
In custom forks or compensating controls, move authentication ahead of any collection configuration loading, embedding-function construction, or model download. A rejected request should not be able to instantiate an embedding model as a side effect.
Block request-controlled model names, kwargs, and trust_remote_code from production collection-creation paths. If users need configurable embeddings, offer an allowlist of approved internal model identifiers instead of raw public model references.
Route approved models through an internal registry or mirror with scanning, signatures, digest pinning, and loader-policy checks. Treat a new embedding model the way you would treat a new executable dependency.
Run first-use model loading in a short-lived sandbox with no production secrets, no shared home directory, and default-deny outbound network access.
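
The last item can be partly illustrated in plain Python, with a loud caveat: a child process is not a microVM and provides no filesystem or network isolation. The sketch below demonstrates only three of the listed properties (no inherited secrets in the environment, a throwaway home and working directory, and a hard timeout). The function name and the minimal environment values are assumptions. Real first-use loading would wrap something like this inside a microVM or container with default-deny egress enforced at the network layer.

```python
import subprocess
import sys
import tempfile


def run_first_load(loader_code: str, timeout_s: float = 60.0) -> subprocess.CompletedProcess:
    """Run loader_code in a child interpreter with a scrubbed environment.

    This is NOT a security boundary on its own; it only keeps the parent's
    environment variables and home directory away from the child.
    """
    with tempfile.TemporaryDirectory() as scratch:
        # Nothing from os.environ is passed through, so API keys and tokens
        # held by the parent process are not visible to the child.
        minimal_env = {"HOME": scratch, "PATH": "/usr/bin:/bin"}
        return subprocess.run(
            [sys.executable, "-c", loader_code],
            env=minimal_env,
            cwd=scratch,          # scratch directory is deleted on exit
            capture_output=True,
            text=True,
            timeout=timeout_s,    # raises subprocess.TimeoutExpired on overrun
        )
```

For example, if the parent process holds a variable named PROD_API_KEY, a child started through run_first_load sees no such variable.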

Conclusion

ChromaToast is a clean example of how AI infrastructure can turn configuration into execution. The vulnerable request is not asking the server to run a command; it is asking the server to create a collection with a chosen embedding model. But if that model reference can pull remote code, and the server loads it before authenticating the request, the collection API becomes a pre-auth RCE path. AIDEFEND maps the practical defenses to early service authentication, model provenance, unsafe loader blocking, secure model distribution, runtime isolation, and egress control.