Private by Design: The Architecture Behind Omnix AI Advisor

Learn how Dashlane built a conversational AI security assistant on sensitive credential telemetry without ever compromising zero-knowledge principles.

Co-written with José Arroyo, Pedro Granja, Edouard Lemaitre

Omnix AI Advisor is a natural-language AI security assistant embedded in the Dashlane admin console. It lets security and IT admins query their organization's credential telemetry in plain language and get instant answers and insights.

For example: Who are the riskiest users? Which sites have the most compromised passwords? What needs fixing this week?

When we started designing Omnix AI Advisor, the hardest challenge was how we could stay true to our zero-knowledge architecture commitment and build an AI solution that’s both secure and private by design. Those principles are critical requirements at Dashlane. They define what we’re allowed to build and what we’re not.

This post explains how we built a conversational AI assistant on top of sensitive credential telemetry without compromising those principles, why we had to rule out standard AI solutions, and what we’re still working to resolve before general availability.

Zero-knowledge as a requirement

Dashlane's zero-knowledge architecture means that user data is encrypted client-side, and Dashlane never holds the keys. That same boundary extends to the telemetry that powers Dashlane Omnix™.

Activity logs, phishing detection events, and credential risks are all sensitive. For example, an activity log isn’t a password, but it maps credential behavior across your organization. It reveals which users are clicking on phishing sites, which credentials are compromised, which accounts are dormant. In the wrong hands, that signal map is as valuable to an attacker as the credentials themselves. Thus, the zero-knowledge boundary had to extend to the audit logs for IT and security teams.

The challenge with AI is that useful AI requires data and computation in the clear. A model can’t answer "Who are my riskiest users?" from encrypted logs. At some point, the data has to be decrypted and processed.

The standard industry answer is to send data to a commercial LLM API under a data processing agreement with appropriate confidentiality clauses. That answer isn’t sufficient for us.

A contract tells you what a provider promises to do. It tells you nothing about what’s technically possible for them to do. Data routed through a third-party inference API leaves your perimeter. You can’t verify what happens inside the provider's infrastructure, and you have no cryptographic proof that the data was not retained, logged, or used for training models.

We needed a solution where the privacy guarantee was enforced by hardware, not by contract.

Confidential computing as the answer

We’ve been using confidential computing for many years at Dashlane. A hardware-isolated enclave is a sealed execution environment where even the infrastructure operator can’t access the data being processed inside. The code running in the enclave is verified before execution through a cryptographic attestation mechanism. If the code has been tampered with, the attestation fails and no data enters.

Here is an analogy: Imagine a sealed room where a contractor comes to do analysis work. You can verify the contractor's identity before they enter. Once inside, no one, including the building owner, can observe what happens. The contractor produces a result, hands it back through a slot, and leaves. Nothing persists inside the room after they’re gone.

That’s the architecture we needed. The admin's prompts and the organization's telemetry go in encrypted. The plain-language answer comes out encrypted. Only the admin can see the data through the Dashlane admin console on their local device.

How we built it on AWS

There were two components required in our architecture:

  • The ability to run GPU-accelerated inference so we could benefit from LLMs that were powerful enough and fast enough
  • The isolation properties of a secure enclave to protect the customer data being processed

AWS Nitro Enclaves are the natural solution for confidential computing on AWS. They provide hardware-isolated execution environments with no persistent storage, no operator access, and no external network connectivity, baked in by default.

The problem is that Nitro Enclaves do not support GPUs. Running CPU-only inference on a model of the size we needed would add enough latency to make the product unusable.

Therefore, we took a different path: We leveraged GPU-accelerated EC2 instances, specifically g5 instances with NVIDIA A10G GPUs, with EC2 instance attestation enabled using NitroTPM. This gave us the GPU performance we needed.

But unlike a Nitro Enclave, where isolation is guaranteed by default, an attested EC2 instance is a standard EC2 instance. The security properties don’t come out of the box. We had to build them ourselves.

Concretely, we hardened the attested instance to replicate the isolation properties of an enclave:

  • No direct machine access. The instance accepts no interactive connections. No operator, including Dashlane engineers, can connect to it.
  • Read-only filesystem. The code running inside the instance can’t be modified at runtime. Any tampering causes the instance to terminate immediately.
  • No persistent storage. All data produced during inference—queries, intermediate outputs, responses—exists in RAM only. When the instance stops, it’s gone.
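
The three properties above can be read as invariants to check before an instance is allowed to serve traffic. The following is a minimal sketch under assumptions: the `InstanceProfile` shape and its field names are hypothetical and do not reflect Dashlane's tooling or any AWS configuration schema.

```typescript
// Hypothetical description of an instance; field names are illustrative only.
interface InstanceProfile {
  interactiveAccess: boolean; // SSH, serial console, or any remote shell
  rootFilesystem: "read-only" | "read-write";
  persistentVolumes: number; // writable disks that survive a stop
}

// Returns the violated isolation properties (an empty list means compliant).
function hardeningViolations(profile: InstanceProfile): string[] {
  const violations: string[] = [];
  if (profile.interactiveAccess) violations.push("direct machine access is enabled");
  if (profile.rootFilesystem !== "read-only") violations.push("filesystem is writable");
  if (profile.persistentVolumes > 0) violations.push("persistent storage is attached");
  return violations;
}

const hardened: InstanceProfile = {
  interactiveAccess: false,
  rootFilesystem: "read-only",
  persistentVolumes: 0,
};
console.log(hardeningViolations(hardened)); // []
```

A check like this only documents intent. In the architecture described here, the enforcement comes from the attested image itself, not from a runtime audit.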

This work required close collaboration with AWS to validate the attestation model and confirm that the isolation properties matched our security requirements.

The data flow works as follows:

  1. The query is received by the AI Advisor orchestrator inside a Nitro Enclave.
  2. The orchestrator decorates the prompt with complementary instructions and submits it to the LLM.
  3. The LLM responds with a set of data queries it needs to investigate the question.
  4. The orchestrator executes scoped queries against the organization's data, including activity logs, Password Health score signals, Dark Web Monitoring alerts, and phishing detection events.
  5. The model generates a plain-language response based on the retrieved data.
  6. The response is returned to the admin through an encrypted secure channel.

Diagram showing how Dashlane's Omnix AI Advisor processes data privately. Encrypted customer data (risk detection, phishing alerts, dark web monitoring, and activity logs) flows into an AI orchestrator housed in a Nitro Enclave. The orchestrator communicates with the Dashlane extension through an encrypted secure tunnel carrying prompts and answers, and with an open-weight LLM on an EC2 attested instance with GPU through a second encrypted secure tunnel carrying prompts, tool calls, and answers. A separate key management component connects to the encrypted data store.
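
The data flow can be sketched as a small tool-calling loop. This is an illustration under assumptions, not the production orchestrator: the `Llm` and `DataStore` interfaces, the tool names, the decorated system prompt, and the round limit are all hypothetical, and the calls are synchronous only to keep the example short.

```typescript
// Hypothetical: a tool call is a scoped data query requested by the model.
type ToolName = "activityLogs" | "passwordHealth" | "darkWebAlerts" | "phishingEvents";
type ToolCall = { tool: ToolName; filter: string };
type ModelTurn =
  | { kind: "toolCalls"; calls: ToolCall[] }
  | { kind: "answer"; text: string };

interface Llm {
  // Takes the conversation so far; returns tool calls or a final answer.
  next(messages: string[]): ModelTurn;
}

interface DataStore {
  // Runs one query restricted to a single organization's data.
  query(orgId: string, call: ToolCall): string;
}

function answerQuery(
  orgId: string,
  adminQuery: string,
  llm: Llm,
  store: DataStore,
  maxRounds = 4,
): string {
  // Step 2: decorate the prompt with complementary instructions.
  const messages = [
    "system: answer using only the organization's telemetry",
    `admin: ${adminQuery}`,
  ];
  for (let round = 0; round < maxRounds; round++) {
    const turn = llm.next(messages); // steps 3 and 5
    if (turn.kind === "answer") return turn.text;
    for (const call of turn.calls) {
      // Step 4: each query is scoped to this organization only.
      messages.push(`tool(${call.tool}): ${store.query(orgId, call)}`);
    }
  }
  throw new Error("model did not produce an answer within the round limit");
}
```

The organization ID is passed by the orchestrator and never chosen by the model, so a tool call can narrow a query but cannot widen its scope.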

The model itself is Ministral 3 14B, an open-weight LLM served by llama.cpp on the attested GPU instance. Self-hosting an open-weight model was the only path consistent with the architecture: any call to an external model API would route data outside the attested boundary, which would break the zero-knowledge guarantee.

The GPU EC2 instance sits outside the Nitro Enclave, so before the two ever exchange data, they perform a two-way attestation: Each side cryptographically proves what code it’s running, and each independently verifies the other's proof.

Both the Nitro Enclaves and the EC2 instances can generate a verifiable attestation, ensuring that only trusted code is executed and that no unauthorized changes can occur. No data leaves the Nitro Enclave boundary until both sides match the expected measurements.

That handshake is what establishes the secure channel between the AI orchestrator and the LLM server, and it's the mechanism that makes our architecture private by design.
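
The two-way check can be sketched as follows. This is a simplified model under assumptions: the `AttestationDoc` shape and the pinned measurement values are placeholders, and `signatureValid` stands in for verifying the hardware-rooted signature on a real attestation document.

```typescript
type Role = "orchestrator" | "llm-server";

// Hypothetical shape: a measurement is a digest of the code a peer booted.
interface AttestationDoc {
  role: Role;
  measurement: string;
  signatureValid: boolean; // placeholder for real signature-chain verification
}

// Expected measurements, pinned ahead of time (placeholder values).
const EXPECTED: Record<Role, string> = {
  "orchestrator": "aaaa1111",
  "llm-server": "bbbb2222",
};

function verifyPeer(doc: AttestationDoc, expectedRole: Role): boolean {
  return (
    doc.signatureValid &&
    doc.role === expectedRole &&
    doc.measurement === EXPECTED[expectedRole]
  );
}

// The channel opens only if each side accepts the other's proof.
function openChannel(fromOrchestrator: AttestationDoc, fromLlm: AttestationDoc): "open" | "refused" {
  const llmAcceptsOrchestrator = verifyPeer(fromOrchestrator, "orchestrator");
  const orchestratorAcceptsLlm = verifyPeer(fromLlm, "llm-server");
  return llmAcceptsOrchestrator && orchestratorAcceptsLlm ? "open" : "refused";
}

const orchestratorDoc: AttestationDoc = { role: "orchestrator", measurement: "aaaa1111", signatureValid: true };
const tamperedLlmDoc: AttestationDoc = { role: "llm-server", measurement: "ffff0000", signatureValid: true };
console.log(openChannel(orchestratorDoc, tamperedLlmDoc)); // refused
```

Both checks must pass. A trusted orchestrator talking to an unverified LLM server, or the reverse, leaves the channel closed.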

One secure channel, invisible to everything above it

That attestation handshake doubles as the channel's setup. The orchestrator and the LLM share a single mutually-attested tunnel, and the two-way attestation we just described is how it opens. From then on, every request flows through it encrypted. Establishing the channel and proving both ends trustworthy are one continuous handshake, completed before any application data moves.

For encryption, we leaned on TLS 1.3. The only custom cryptography is the attestation we bind into each TLS session; a hybrid post-quantum key exchange came along for free, keeping the channel safe even against an attacker who records traffic today to decrypt later.
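
One common way to bind an attestation to a TLS session is to have the attester commit, inside the signed document, to a value that is unique to that session, such as a fingerprint of the key it presents. The sketch below assumes that approach. The post does not say which binding Dashlane uses, and the `BoundAttestation` shape and fingerprint strings are hypothetical.

```typescript
// Hypothetical: the signed document carries a value tied to one TLS session.
interface BoundAttestation {
  measurement: string;
  sessionBinding: string; // committed to when the document was signed
}

// Accept the document only for the session it was issued for.
function isBoundToSession(doc: BoundAttestation, sessionFingerprint: string): boolean {
  return doc.sessionBinding.length > 0 && doc.sessionBinding === sessionFingerprint;
}

const doc: BoundAttestation = {
  measurement: "bbbb2222",
  sessionBinding: "session-A-fingerprint",
};
console.log(isBoundToSession(doc, "session-A-fingerprint")); // true
console.log(isBoundToSession(doc, "session-B-fingerprint")); // false
```

Without a binding of this kind, a valid attestation captured from one connection could be replayed on another, so the measurement check alone would not prove who is on the other end of the tunnel.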

The interesting benefit is that almost none of our code requires awareness of the tunnel. The LLM server uses an OpenAI-compatible API, the same interface AI client libraries of all sorts already use. Rather than modify those frameworks to teach them about enclaves and attestation, we simply slipped the tunnel underneath them.

On the orchestrator side, the whole thing—encryption, attestation handshake, measurement checks—hides behind a single custom fetch:

import { createOpenAI } from '@ai-sdk/openai';

// All the TLS + attestation logic lives in `agent` (defined elsewhere); the rest is stock.
const tunnelFetch: typeof fetch = (input, init) =>
  fetch(input, { ...init, dispatcher: agent });

const openai = createOpenAI({ baseURL: 'http://llm/v1', fetch: tunnelFetch });
const model = openai.chat('ministral-3-14b');
// From here on, the AI SDK uses `model` like any other provider.

On the EC2 side, the tunnel terminates into a small proxy that forwards the decrypted request to a local llama-server, equally oblivious to the tunnel around it. One end runs a custom fetch, the other a proxy, with an attested channel between them.

And because it’s just an OpenAI-compatible endpoint, it works with virtually any client that can call one.

What this guarantees, and what it doesn’t

We want to be explicit about what the confidentiality guarantee rests on: the isolation properties of AWS EC2 attested instances. Our threat model doesn’t assume a compromised cloud provider.

AWS controls the underlying hardware, and the security model depends on the integrity of their enclave implementation. We’ve validated this architecture against our threat model and believe it’s the right design, but we don’t claim it’s independent of the infrastructure layer.

What the architecture does guarantee:

  • No query or response data is transmitted to any third-party AI provider.
  • No customer data leaves the enclave boundary during processing.
  • Dashlane can’t read queries or responses.
  • Neither Dashlane nor AWS can tamper with the LLM code to expose customer data.
  • Admin queries are not logged by Dashlane. Feedback is opt-in. A thumbs up or down sends us the rating and who sent it, never the query or response text.

For reference, AI Advisor architecture information is available in our security documentation.

One important scope boundary: Vault credential content—the passwords and secrets users store in Dashlane—is entirely out of scope for AI Advisor. The model operates on activity logs and account metadata visible to company admins only. It has no access to vault contents, and no path to them exists in the architecture.

On training and model updates

The model powering AI Advisor is a pre-trained open-weight model. We don’t train or fine-tune it on customer data. When an admin queries their organization's activity logs, that data is processed in memory inside the enclave to generate a response, then discarded. It doesn’t feed back into the model or influence future responses for any other customer.

Model updates follow the same attestation process as the initial deployment. Before an updated model is released, it’s validated against our internal test suite covering the supported use cases.

The per-response feedback signals we collect—the thumbs up/down and optional free-text—are used to evaluate model quality in aggregate. They carry no query or response text. The signals tell us whether a response was useful, not what was asked.
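
The shape of such a signal can be sketched as a type that has no field for the query or the response. This is an illustration under assumptions: `ResponseContext`, `FeedbackSignal`, and `toFeedbackSignal` are hypothetical names, not Dashlane's actual payload.

```typescript
type Rating = "up" | "down";

// What the admin console holds locally after a response is shown.
interface ResponseContext {
  adminId: string;
  query: string;
  response: string;
}

// What is sent as feedback: no query text, no response text.
interface FeedbackSignal {
  adminId: string;
  rating: Rating;
  comment?: string; // optional free-text written by the admin
}

function toFeedbackSignal(ctx: ResponseContext, rating: Rating, comment?: string): FeedbackSignal {
  // Fields are copied explicitly, so query and response never leave `ctx`.
  const signal: FeedbackSignal = { adminId: ctx.adminId, rating };
  if (comment !== undefined && comment.trim() !== "") {
    signal.comment = comment.trim();
  }
  return signal;
}

const ctx: ResponseContext = {
  adminId: "admin-42",
  query: "Who are my riskiest users?",
  response: "Two users have reused compromised passwords.",
};
console.log(Object.keys(toFeedbackSignal(ctx, "up"))); // [ 'adminId', 'rating' ]
```

Building the payload from an explicit list of fields, instead of spreading the context object and deleting fields afterward, keeps the exclusion from depending on someone remembering to remove sensitive text.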

What we’re learning in beta

You can register for the beta now.

We’re looking for admins who will use it, stress-test the accuracy, and tell us where it fails. The architecture is sound, but getting it right for general availability requires real queries and edge cases, as well as honest feedback.
