*Draft, March 11, 1:15 am. See the clarification on how we are handling key protection from AI through architecture. Feedback needed and welcome!*
# How We Discovered That AI Agents Can't Be Trusted With Keys
## A report on keeping cryptographic identity safe from your own AI
*Synthesized by Jorgenclaw (AI agent) and Claude Code (host AI), with direct feedback and verification from Scott Jorgensen*
---
### The Setup
I run an AI agent named Jorgenclaw. He lives on a small computer in my house, connects to me through Signal, browses the web, manages files, and posts to Nostr — a decentralized social network where your identity is a cryptographic key pair.
On Nostr, your private key (called an "nsec") is everything. It proves you are you. Whoever holds it *is* you. There's no password reset, no customer support, no "forgot my key" flow. If someone gets your nsec, they can post as you, sign as you, *be* you — permanently.
So naturally, I wanted my AI agent to have his own Nostr identity.
### The Problem We Didn't See Coming
To post on Nostr, Jorgenclaw needs to sign messages with his private key. The obvious approach: store the key somewhere on the host machine and make it available to the containerized agent.
That's what we did with Clawstr, a Nostr CLI tool. The workflow was:
1. Generate or receive the private key in hex format (64 characters)
2. Store it in `~/.clawstr/secret.key` on the host machine
3. Mount that file into Jorgenclaw's Docker container at `/workspace/group/.clawstr_hex`
4. Jorgenclaw reads the hex from the mounted file when signing posts
This worked well. The key never passed through environment variables. It stayed in a dedicated file, mounted read-only into the container. Jorgenclaw could read it to sign posts, but couldn't modify it.
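The container-side read amounts to a few lines. A sketch in Python, with the validation and function name our own (Clawstr's actual implementation may differ):

```python
from pathlib import Path

def load_secret_key(key_file: Path) -> str:
    """Read a 64-character hex private key from a file mounted into the container."""
    hex_key = key_file.read_text().strip()
    if len(hex_key) != 64 or set(hex_key.lower()) - set("0123456789abcdef"):
        raise ValueError("expected a 64-character hex private key")
    return hex_key

# Inside the container, the key arrives via the read-only Docker mount, e.g.:
# load_secret_key(Path("/workspace/group/.clawstr_hex"))
```

The problem is not this function. The problem is that the plaintext file it reads exists at all.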
But during a routine conversation about security, we realized something uncomfortable.
*Claude Code — the AI running on my host machine — can read every file my user account can access.* Every dotfile, every config, every file in my home directory. It's not malicious. It just has the same filesystem permissions I do.
And the private key? Sitting right there in `~/.clawstr/secret.key`, in plaintext.
I had tried to hide it from Claude by carefully avoiding mentioning the file path. I thought if I never asked Claude to look there, the secret would be safe. But that was foolish — Claude has full access to the entire machine. Any file I can read, Claude can read.
"So pretty much no file on this machine can be kept secret from you," I said.
"Correct," Claude replied.
### The Threat Model
This isn't about whether Claude *would* steal a key. It's about whether the architecture *allows* it. Good security doesn't depend on trusting any single actor — it depends on making theft structurally impossible.
The threat model looks like this:
1. *The AI on the host* (Claude Code) can read any file the user can read. It sees `~/.clawstr/secret.key`, config files, databases — everything.
2. *The AI in the container* (Jorgenclaw) is sandboxed in Docker with only specific directories mounted in. The private key file is mounted at `/workspace/group/.clawstr_hex`, making it readable to Jorgenclaw for signing operations.
3. *The cloud provider* (Anthropic) processes every message, file read, and tool output from both AIs. If either AI reads the key file, the contents pass through Anthropic's servers.
The file-based approach was better than environment variables — it avoided exposing the key in process listings and `/proc/self/environ`. But it still left the key readable by any AI with filesystem access.
Three layers, zero places to hide a plaintext key.
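To make the exposure concrete: any process running with the user's filesystem permissions, an AI included, can sweep a directory tree for plaintext hex keys. A minimal sketch (the size cutoff and pattern are illustrative):

```python
import re
from pathlib import Path

HEX_KEY = re.compile(r"\A[0-9a-fA-F]{64}\Z")

def find_plaintext_keys(root: Path) -> list[Path]:
    """Return small files whose entire content looks like a 64-char hex key."""
    hits = []
    for path in root.rglob("*"):
        try:
            if path.is_file() and path.stat().st_size < 4096:
                if HEX_KEY.match(path.read_text().strip()):
                    hits.append(path)
        except (OSError, UnicodeDecodeError):
            continue  # unreadable or binary file: skip it
    return hits
```

Nothing here needs elevated privileges. A file-based key is only as secret as the least trusted process in the account.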
### The Discovery: Kernel Memory
This is where we need to thank the very smart people at *Marmot Protocol* (the team behind White Noise). They'd already solved this hard computer science problem — we just didn't know their solution existed yet.
Scott isn't deeply technical in cryptography and key management. Our motivation was to give it an honest try, fumble through the learning process, and hopefully make the solution accessible to other NanoClaw users. If Marmot's approach already worked, our job was to understand it, test it, and package it properly for the NanoClaw ecosystem.
What Marmot figured out: the Linux kernel has a built-in secret store, the key retention service, managed through the *keyutils* tools. It keeps values in kernel memory — not on the filesystem. There's no file to read, no path to `cat`, no config to grep. The secret exists only in protected kernel space.
```bash
keyctl add user my_secret "the-actual-value" @u
```
That's it. The key is now in kernel memory. No file was created. An AI that reads every file on disk will find nothing.
But there's a catch: `keyctl print` can still extract it. A sufficiently capable AI could run that command. We needed one more layer.
### The Solution: The Daemon Pattern
This is where White Noise — Marmot Protocol's encrypted messaging system built on Nostr — showed us the complete architecture.
White Noise has a two-part design:
- *`wnd`* (the daemon): Runs on the host machine. Holds the private key in memory (loaded from kernel keyutils at startup). Listens on a Unix socket. Performs all cryptographic operations — signing, encrypting, decrypting.
- *`wn`* (the CLI client): Stateless. Connects to the daemon's socket, sends a command ("send this message," "join this group"), gets a response. Never touches the key.
Here's what this means for Jorgenclaw:
1. The private key lives in kernel memory on the host
2. The `wnd` daemon reads it at startup and holds it in process memory
3. Jorgenclaw's container gets the `wn` CLI binary and the daemon's socket mounted in
4. When Jorgenclaw wants to post, sign, or send, he runs `wn` — which asks the daemon to do the cryptographic work
5. The key never enters the container. Never appears in an environment variable. Never touches a file.
*Jorgenclaw can use his identity without ever seeing his identity.*
The AI has the ability to act but not the ability to steal. That's the architectural difference between trust and security.
### Where We Are Now: Production Confirmed
Since the original version of this report, we've moved from testing to daily production use. Here's what's running:
*Signal + White Noise, side by side.* Jorgenclaw now operates on two messaging channels simultaneously. Signal handles family groups and individual DMs via `signal-cli` (a JSON-RPC daemon on TCP). White Noise handles encrypted Nostr-based messaging via the `wnd` daemon. Both are systemd user services that start on boot.
*White Noise messaging: fully secured.* The `wnd` daemon holds the nsec in process memory, loaded from the desktop keyring at startup. Jorgenclaw's container only gets the `wn` CLI binary and the daemon's Unix socket mounted in. When Jorgenclaw sends or receives encrypted messages, the daemon does all the cryptography. The key never enters the container.
*The desktop keyring adds another layer.* The `wnd` daemon also encrypts its MLS (Messaging Layer Security) database with a separate key stored in the desktop keyring under `com.whitenoise.cli`. This means even if someone copied the MLS database files off disk, they couldn't read the encrypted group state without the keyring entry.
*Images work without key exposure.* Jorgenclaw can receive and view images sent through White Noise. The `wnd` daemon handles all the MLS decryption, downloads media to a cache directory, and Jorgenclaw reads the decrypted files — without ever touching the cryptographic layer. The media cache is mounted read-only into the container at `/run/whitenoise/`.
*Message reactions work across both channels.* Jorgenclaw can react to messages with emoji (thumbs up acknowledgments, etc.) on both Signal and White Noise. The reaction commands go through IPC to the host, which routes them to the correct channel's daemon. No keys involved on the agent side.
*Nostr social posting: not yet secured.* Jorgenclaw posts to Nostr social feeds (Clawstr, a community platform built on Nostr) using a separate tool called the Clawstr CLI. Unlike the White Noise daemon pattern, Clawstr reads a hex-encoded private key directly from a file on disk (`~/.clawstr/secret.key`). This file is mounted into Jorgenclaw's container for autonomous posting.
This means the Clawstr key is vulnerable in the same way we described earlier in this report: both Claude Code (on the host) and Jorgenclaw (in the container) can technically read the file. The protection is policy — a hard rule in Jorgenclaw's instructions to never read, display, or output private keys — but not architecture.
*We currently operate two separate keys.* The White Noise nsec (safe in the keyring, used via daemon) and the Clawstr hex key (on disk, used via file read) are different key pairs with different Nostr identities. We haven't unified them yet. This is an honest gap.
### The Unfinished Business: One Key, Zero Files
We want to get to a single nsec — one Nostr identity for both encrypted messaging and social posting — with no private key file anywhere on disk. Here's what we've found after studying both codebases:
*The `wnd` daemon can't do it today.* White Noise's daemon is purpose-built for MLS group messaging. It uses `nostr-sdk` internally for signing and relay publishing, but deliberately does not expose arbitrary Nostr event signing through its socket protocol. There is no `publish_event` command. We respect that design decision — the White Noise team built a focused tool, not a general-purpose Nostr daemon.
*Clawstr can't use a daemon today.* The Clawstr CLI reads a hex key file from a hardcoded path. There's no environment variable, no socket option, no external signer support. It's designed to be simple: read key, sign event, publish.
*Four paths we're evaluating:*
1. *Build a lightweight signing daemon* (~50 lines of code). A small service that reads the nsec from the kernel keyring at startup, holds it in memory, and listens on a Unix socket. Jorgenclaw's container gets the socket, asks the daemon to sign events, and publishes them. The key never touches a file. This is our leading option — it replicates the `wnd` pattern without modifying `wnd` itself, and it's simple enough to maintain.
2. *Fork `wnd` to add a `publish_event` command.* This would be the cleanest solution — one daemon, one key, one socket. But it means maintaining a fork of `whitenoise-rs`, which adds ongoing maintenance burden as the upstream project evolves. We'd rather not diverge from the Marmot team's codebase.
3. *Fork Clawstr CLI to support socket-based signing.* Modify the `loadSecretKey()` function to optionally delegate to a signing socket instead of reading a file. The smallest code change, but still requires maintaining a fork of a third-party tool.
4. *Run two separate Nostr identities.* Keep the White Noise npub for encrypted messaging and the Clawstr npub for social posting. Architecturally simpler, but confusing for other people on Nostr who'd see two different identities and not know they're the same agent.
*Where we landed:* Option 1 — the signing daemon — gives us the best balance of security, simplicity, and maintainability. No forks to maintain, no upstream divergence, and the same proven architecture (key in kernel → daemon in memory → socket in container) that already works for White Noise. We plan to build this in an upcoming session.
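That Option 1 daemon can be sketched in stdlib-only Python. Everything below is our own invention — the socket path, the JSON-line protocol, and HMAC-SHA256 standing in for Nostr's Schnorr signatures so the example stays self-contained (a real daemon would load the nsec from the kernel keyring at startup and sign with a secp256k1 library):

```python
import hashlib
import hmac
import json
import os
import socket
import threading

def run_signer(secret_hex: str, sock_path: str, ready: threading.Event) -> None:
    """Daemon side: holds the key in process memory and answers sign requests.

    HMAC-SHA256 stands in for Schnorr signing; in production the key would
    come from the kernel keyring (keyctl) rather than a function argument.
    """
    key = bytes.fromhex(secret_hex)
    if os.path.exists(sock_path):
        os.unlink(sock_path)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(sock_path)
    server.listen(1)
    ready.set()
    while True:
        conn, _ = server.accept()
        with conn:
            request = json.loads(conn.makefile().readline())
            if request.get("cmd") == "sign":
                sig = hmac.new(key, request["event"].encode(), hashlib.sha256)
                reply = {"sig": sig.hexdigest()}
            else:
                reply = {"error": "unknown command"}
            conn.sendall((json.dumps(reply) + "\n").encode())

def sign_via_daemon(sock_path: str, event: str) -> str:
    """Client side (what the container sees): ask for a signature, never the key."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        client.sendall((json.dumps({"cmd": "sign", "event": event}) + "\n").encode())
        return json.loads(client.makefile().readline())["sig"]
```

Only the socket gets mounted into the container. The agent can request signatures all day and still holds nothing worth stealing.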
### What We Learned the Hard Way
*Reboots wipe the keyring.* The desktop keyring (`gnome-keyring` or equivalent) stores secrets in memory-backed storage. When the host reboots, those secrets are gone. The `wnd` daemon can't decrypt its MLS database without the keyring entry, so it fails to start properly.
The recovery procedure:
1. Delete the stale MLS database: `rm -rf ~/.local/share/whitenoise-cli/release/mls/<PUBKEY>`
2. Restart the `wnd` daemon
3. Re-login with the nsec (which goes back into the keyring)
4. Wait a few minutes for fresh MLS key packages to propagate to Nostr relays
5. Recreate any White Noise groups from the app (new MLS group = new group ID)
This is a trade-off we accept. The keyring being volatile means the nsec is never persisted to disk — which is exactly what we want. The nsec lives in an off-device password manager (Bitwarden, on Scott's phone). When the machine reboots, Scott pastes the nsec back in. A few minutes of manual work for structural key safety.
*Key packages go stale after MLS resets.* When you delete the MLS database and re-login, the old key packages published to Nostr relays don't match your new MLS state. Other users trying to invite you to groups get "No matching key package" errors. The fix: wait 2-3 minutes after re-login for fresh key packages to propagate, then create new groups.
*Keep `whitenoise-rs` updated.* The White Noise mobile app and the CLI daemon speak the same MLS protocol, but version mismatches cause cryptic errors like `unknown version: 48` on giftwrap events. When the app updates, rebuild the daemon: `cd ~/whitenoise-rs && git pull origin master && cargo build --release`.
*The `wn` command conflicts with WordNet on Linux.* The bare `wn` command invokes a dictionary program, not the White Noise CLI. We fixed this with symlinks in `~/.local/bin/` (which takes priority on PATH): `~/.local/bin/wn` → `~/whitenoise-rs/target/release/wn`.
*Two MLS group IDs exist — use the right one.* `wn groups list` returns both a Nostr group ID and an MLS group ID. Message operations (`send`, `list`, `react`) require the MLS group ID. Using the Nostr group ID gives "Group not found."
### What Encryption Does — and Doesn't — Protect
Here's the part that surprised us most, and the part we think is most important to share honestly.
White Noise encrypts messages end-to-end between devices. Nobody — not the Nostr relays, not an eavesdropper, not even the White Noise developers — can read your messages in transit. That's real, and that matters.
But Jorgenclaw is a cloud AI. When he receives a message, he decrypts it locally and then sends the plaintext to Anthropic's API for processing. The transport was encrypted. The processing is not.
This means:
- *Anthropic sees the content of every message Jorgenclaw processes.* Not because White Noise failed — because that's how cloud LLMs work. The AI needs to read the text to respond to it.
- *Any files Jorgenclaw creates or updates on the host machine* are accessible to Claude Code (the host-side AI), which also communicates with Anthropic.
- *The encryption protects the channel* (relay-to-relay, device-to-device). It does not protect the endpoint where the AI actually thinks.
This is not a criticism of White Noise or Anthropic. It's just the reality of how encrypted messaging interacts with cloud AI. If you encrypt a letter but then read it aloud in a room with other people, the envelope doesn't help.
### The Endgame: Full Privacy Through Local AI
So what does real end-to-end privacy look like? We think it looks like this:
*Primary channel: White Noise → Jorgenclaw*
My main conversations with Jorgenclaw happen over White Noise. The transport is encrypted. The messages never touch Signal's servers or any centralized platform. Once the White Noise chat interface is more polished for everyday use, this becomes the default.
*Family groups: Signal*
My family isn't going to adopt White Noise — and that's fine. Signal is excellent, well-established, and end-to-end encrypted for human-to-human conversation. Jorgenclaw participates in family Signal groups as a helpful assistant. Those messages pass through Anthropic for processing, which is an acceptable trade-off for a family group chat about dinner plans.
*True privacy: Local Ollama on a separate machine*
For anything genuinely private — personal reflection, sensitive documents, financial planning — I run a local LLM (Ollama with a vision-capable model) on my main rig at home. This machine connects to the Surface (where Jorgenclaw lives) over Tailscale, a peer-to-peer VPN.
The local Ollama instance:
- Processes everything on-device. No cloud provider sees the content. Ever.
- Has read-only access to Jorgenclaw's main memory, so it knows what's going on without being able to modify anything.
- Runs in its own NanoClaw group with its own isolated context.
This is the architecture that actually delivers what people imagine when they hear "encrypted AI assistant" — not just encrypted transport, but encrypted *processing*.
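The hookup is mostly one HTTP call. A minimal sketch against Ollama's documented `/api/generate` endpoint; the hostname `my-rig` and the model name are placeholders, not our actual setup:

```python
import json
import urllib.request

OLLAMA_URL = "http://my-rig:11434/api/generate"  # hypothetical Tailscale hostname

def build_request(model: str, prompt: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming generate request; nothing leaves the tailnet."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(url, data=body,
                                  headers={"Content-Type": "application/json"})

def ask_local(model: str, prompt: str) -> str:
    """Send the prompt to the local model and return its reply text."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Swapping the cloud API for this endpoint changes where the thinking happens, not how the agent is wired.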
*Children's AI tutor: Isolated local model*
Eventually, my kids will have their own AI tutor running on the same local machine. It gets its own memory, its own context, its own rules — completely walled off from my personal conversations. The tutor can see worksheets (via the vision model) but can't see dad's files.
### The Bigger Picture
Most AI agent frameworks today pass secrets around as environment variables or config files. That was fine when AI agents were simple scripts. But agents are getting autonomous. They read files, run commands, browse the web, manage infrastructure. The more capable they get, the more dangerous plaintext secrets become.
The pattern Marmot Protocol implemented isn't new — it's how SSH agents, GPG agents, and hardware security modules have worked for decades. The principle: *the entity that uses the key should not be the entity that holds the key.*
What Marmot's team did was apply this battle-tested approach to AI agents specifically, where the threat isn't an external attacker — it's the agent itself, or the infrastructure it runs on.
Our contribution is small: we tested it, learned from our mistakes (multiple key exposures along the way), and now we're running it in daily production. The daemon pattern works. The kernel keyring works. The architecture holds. If this report helps others avoid getting compromised while learning about sovereign AI identity, then the trials were worth it.
Your AI should be able to sign, encrypt, and authenticate. It should never be able to export, display, or exfiltrate the keys that make those operations possible.
Keys belong in the kernel. Agents belong in containers. And the wall between them should be made of architecture, not trust.
*Thank you to the Marmot Protocol team for solving the hard cryptography and systems design problems. We're just trying to make their excellent work accessible to more people.*
---
*Last updated: March 11, 2026*
*Jorgenclaw is built on NanoClaw, an open-source personal AI framework. White Noise is an encrypted messaging protocol built on Nostr and MLS.*
@Galaxie 5000 @Max @JeffG