Jun 4, 2026
Your data, your key: local encrypted backup without a server
NativeLM keeps everything on your phone — which means losing the phone means losing the data. v0.7 fixes that with a passphrase-encrypted .nlmbak file you fully control: Argon2id → AES-256-GCM, no server, no account, no key we hold.
NativeLM’s privacy model is absolute: no cloud, no accounts, zero telemetry. Your projects, chats, and the documents you’ve indexed live only on your device.
That strength is also a liability. A real prospect — a CEO at a clinic-and-law-firm setup — put it bluntly: “How are you handling phone loss and backup without weakening privacy?” If the data only lives on one phone, then a dropped phone is a deleted practice. The naive fix — “just back up to our cloud” — would quietly demolish the entire promise.
So for v0.7 we shipped backup the only way that’s consistent with the product, under one hard invariant:
No server. No account. No key we hold. The user holds the data and the key; the app holds neither.
The result is local encrypted backup: export everything to a single .nlmbak file you control — your Drive, your firm’s storage, a USB stick — and restore it onto a new phone with a passphrase.
What’s actually in the file
The .nlmbak is a zip container. The interesting decision is what goes in it:
backup.nlmbak (zip container)
├─ manifest.json        # schema version, app version, createdAt, counts, KDF params, IVs
├─ data.json.enc        # AES-256-GCM ciphertext of the entity graph (projects → … → artifacts)
├─ embeddings.bin.enc   # AES-256-GCM ciphertext of packed float32 vectors (chunkId → 100 floats)
└─ files/               # the original source docs, each entry AES-256-GCM encrypted
   ├─ <docId>.enc
   └─ …
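Because the container is an ordinary zip, the layout above can be inspected with standard tooling. The Python sketch below is illustrative only and is not NativeLM's code: it builds a dummy archive with the same entry names, then reads back just the manifest without touching any encrypted payload. The manifest field names (schemaVersion, createdAt, counts) and the entry name doc-1 are assumptions made for the example; the real schema is not spelled out here.

```python
import io
import json
import zipfile


def build_example_backup() -> bytes:
    """Build an in-memory zip shaped like the layout above (payloads are dummy bytes)."""
    manifest = {
        "schemaVersion": 1,  # hypothetical field names throughout
        "createdAt": "2026-06-02T10:00:00Z",
        "counts": {"projects": 3, "sources": 41},
    }
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("manifest.json", json.dumps(manifest))
        zf.writestr("data.json.enc", b"\x00" * 32)  # stand-in for ciphertext
        zf.writestr("embeddings.bin.enc", b"\x00" * 32)
        zf.writestr("files/doc-1.enc", b"\x00" * 32)
    return buf.getvalue()


def preview(backup_bytes: bytes) -> str:
    """Summarize a backup from its manifest alone; no passphrase is needed."""
    with zipfile.ZipFile(io.BytesIO(backup_bytes)) as zf:
        manifest = json.loads(zf.read("manifest.json"))
    day = manifest["createdAt"][:10]
    counts = manifest["counts"]
    return f"Backup from {day}, {counts['projects']} projects, {counts['sources']} sources"
```

Calling preview(build_example_backup()) returns a one-line summary built only from manifest.json; the .enc entries are never opened.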
A few choices worth calling out:
- We back up the embeddings, not just the text. Each document chunk carries a 100-dim USE-Lite vector. We could recompute those on restore, but that would need the embedder loaded and would make restore slow and online-ish. Instead we serialize the raw float32 arrays as a binary blob. Restore is instant and fully offline. They’re the bulk of the size, so they live in their own packed file, not as JSON numbers.
- We back up the original source files. Citations open the real PDF at the cited page, which is useless if the file is gone. So the actual documents ride along (encrypted), and their paths are rewritten on import.
- We do not back up the model. It re-downloads on the new device. Bundling 1–3 GB of weights into every backup would be absurd.
- The manifest is plaintext — and deliberately contains no secrets. That lets us show you “Backup from June 2, 3 projects, 41 sources” before asking for the passphrase.
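To make the "packed float32 vectors" idea concrete, here is a minimal Python sketch of one way to serialize chunkId → 100 floats as a binary blob and read it back. The record layout (a uint32 count, then an int64 chunk ID followed by 100 little-endian float32 values per record) is an assumption for illustration; the actual on-disk format of embeddings.bin is not described in this post, and encryption is left out entirely.

```python
import struct
from typing import Dict, List

DIM = 100  # vector width stated in the post
RECORD = struct.Struct(f"<q{DIM}f")  # hypothetical layout: int64 id + DIM float32
HEADER = struct.Struct("<I")  # hypothetical header: uint32 record count


def pack_embeddings(vectors: Dict[int, List[float]]) -> bytes:
    """Serialize {chunk_id: vector} into one contiguous little-endian blob."""
    parts = [HEADER.pack(len(vectors))]
    for chunk_id, vec in vectors.items():
        if len(vec) != DIM:
            raise ValueError(f"chunk {chunk_id}: expected {DIM} floats, got {len(vec)}")
        parts.append(RECORD.pack(chunk_id, *vec))
    return b"".join(parts)


def unpack_embeddings(blob: bytes) -> Dict[int, List[float]]:
    """Inverse of pack_embeddings; values come back at float32 precision."""
    (count,) = HEADER.unpack_from(blob, 0)
    result: Dict[int, List[float]] = {}
    offset = HEADER.size
    for _ in range(count):
        chunk_id, *vec = RECORD.unpack_from(blob, offset)
        result[chunk_id] = vec
        offset += RECORD.size
    return result
```

In this layout each vector costs exactly 400 bytes plus an 8-byte ID, with no parsing of decimal text on restore, which is the practical argument for a packed file over JSON numbers.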
The crypto (the part that has to be right)
The whole thing hinges on a key that is derived from your passphrase and bound to nothing else:
- Key derivation: Argon2id. A memory-hard KDF, so brute-forcing the passphrase is expensive even with GPUs. We use a random 16-byte salt and store the Argon2 parameters (memory, iterations, parallelism) in the manifest, so a file made today is still openable when defaults change.
- Encryption: AES-256-GCM, applied per payload, each with its own random 12-byte IV stored alongside. GCM is authenticated: a wrong passphrase or a tampered file fails cleanly with an auth error instead of silently returning garbage.
- Explicitly not the Android Keystore. Keystore keys are device-bound and non-exportable — perfect for a token on one phone, fatal for a backup. A Keystore-wrapped backup would be un-restorable on a new device, which defeats the entire feature. The key comes from your passphrase, full stop.
- We hold nothing. No passphrase escrow, no recovery, no reset link. Lose the passphrase and the backup is unreadable — by us too. That’s stated plainly in the UI, because it’s a feature, not a bug.
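The "store the parameters in the manifest" point is easy to get wrong, so here is a small Python sketch of the pattern. One loud caveat: Python's standard library has no Argon2id and no AES-GCM, so this sketch swaps in scrypt (another memory-hard KDF) purely as a stand-in and stops at key derivation. The record field names and the cost values are assumptions, not NativeLM's manifest schema or its actual Argon2 settings.

```python
import hashlib
import secrets

# Stand-in KDF: scrypt from the standard library. The post uses Argon2id;
# only the parameter-storage pattern is being illustrated here.
DEFAULT_KDF = {"name": "scrypt", "n": 2**12, "r": 8, "p": 1, "keyLen": 32}


def new_kdf_record() -> dict:
    """Snapshot the current defaults plus a fresh random 16-byte salt."""
    return {**DEFAULT_KDF, "salt": secrets.token_bytes(16).hex()}


def derive_key(passphrase: str, kdf: dict) -> bytes:
    """Derive the key from the passphrase using ONLY what the record says."""
    return hashlib.scrypt(
        passphrase.encode("utf-8"),
        salt=bytes.fromhex(kdf["salt"]),
        n=kdf["n"],
        r=kdf["r"],
        p=kdf["p"],
        maxmem=64 * 1024 * 1024,
        dklen=kdf["keyLen"],
    )


def new_iv() -> bytes:
    """One fresh random 12-byte IV per encrypted payload, never reused."""
    return secrets.token_bytes(12)
```

Because derive_key reads every cost parameter from the stored record rather than from DEFAULT_KDF, raising the defaults in a later release does not change the key derived for an older file. In the real format that 32-byte key would then feed AES-256-GCM, where a wrong passphrase surfaces as an authentication failure.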
Restore: the boring part that’s secretly the hard part
Export is mostly “read every box, encrypt, zip.” Import is where the real engineering hides, because you’re grafting one device’s data onto another’s.
ObjectBox uses auto-increment IDs, so the backup’s projectId = 3 will collide with whatever is already 3 on the new phone. If you just insert rows as-is, every foreign key — projectId, conversationId, documentId, sourceId — points at the wrong thing, and chats lose their sources.
So import is additive with full ID remapping: every entity is inserted with a fresh ID, and an old→new map rewrites every reference consistently as the graph is rebuilt. Source files are re-materialized into the app’s docs directory and each document’s localPath is rewritten to match. Embeddings are written straight back into their chunks (no re-embedding), and the HNSW vector index rebuilds as the rows land.
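The remapping step can be sketched in a few lines. This is a simplified Python model, not the ObjectBox-based implementation: tables are plain lists of dicts, IDs auto-increment per table, and the four-table schema (projects, documents, conversations, messages) is a hypothetical reduction of the real entity graph. The key idea is to process parents before children and rewrite each foreign key through the old-to-new map of the table it points into.

```python
from typing import Dict, List

# Tables in dependency order (parents first). For each table: foreign-key
# field -> table that field points into. All names here are hypothetical.
SCHEMA = [
    ("projects", {}),
    ("documents", {"projectId": "projects"}),
    ("conversations", {"projectId": "projects"}),
    ("messages", {"conversationId": "conversations", "documentId": "documents"}),
]


def import_additive(backup: Dict[str, List[dict]], db: Dict[str, List[dict]]) -> Dict[str, Dict[int, int]]:
    """Append backup rows to db with fresh IDs; return the old->new ID maps."""
    id_maps: Dict[str, Dict[int, int]] = {table: {} for table, _ in SCHEMA}
    for table, foreign_keys in SCHEMA:
        rows = db.setdefault(table, [])
        # Emulate auto-increment: continue after the highest existing ID.
        next_id = max((r["id"] for r in rows), default=0) + 1
        for row in backup.get(table, []):
            new_row = dict(row)
            new_row["id"] = next_id
            for field, target in foreign_keys.items():
                # Parents were imported earlier, so their new IDs are known.
                new_row[field] = id_maps[target][row[field]]
            id_maps[table][row["id"]] = next_id
            rows.append(new_row)
            next_id += 1
    return id_maps
```

If the device already has a project with ID 3 and the backup also contains a project 3, the restored project receives the next free ID, and every restored document and conversation that referenced the old 3 is rewritten to point at the new one. Existing rows are never modified.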
“Additive” is a deliberate safety default: restoring adds the backup as new projects rather than clobbering what’s already on the device. A “replace everything” mode can come later; not destroying data is the right default for v0.7.
The ID remap is the one place a bug would be invisible until a user taps a citation and gets the wrong page — so it’s unit-tested directly.
The companion: an honest app lock
Backup answers “I lost my phone.” It doesn’t answer “someone grabbed my unlocked phone.” So v0.7 also ships a biometric app lock: a BiometricPrompt gate with BIOMETRIC_STRONG or DEVICE_CREDENTIAL, so it falls back to the device PIN/pattern when no fingerprint is enrolled and never locks you out. It re-locks on cold start and when you return after the app has been backgrounded past a short timeout (tracked via ProcessLifecycleOwner).
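The re-lock rule described above is a small state machine, sketched here in platform-neutral Python rather than Android code. The class and method names and the 30-second timeout in the usage note are assumptions; on the device, the background and foreground events would come from ProcessLifecycleOwner and the unlock from a successful BiometricPrompt result.

```python
from typing import Optional


class AppLock:
    """Tracks whether the UI gate should be shown. Timestamps are in seconds."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout = timeout_seconds
        self.locked = True  # a cold start always begins locked
        self._backgrounded_at: Optional[float] = None

    def on_background(self, now: float) -> None:
        """App left the foreground: remember when."""
        self._backgrounded_at = now

    def on_foreground(self, now: float) -> None:
        """App returned: re-lock only if it was away longer than the timeout."""
        away_since = self._backgrounded_at
        if away_since is not None and now - away_since > self.timeout:
            self.locked = True
        self._backgrounded_at = None

    def on_auth_success(self) -> None:
        """Biometric or device-credential check passed."""
        self.locked = False
```

With AppLock(30), a 10-second trip to another app leaves the session unlocked, while a 60-second absence brings the gate back. Note that a fresh AppLock instance is locked, which models the cold-start case.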
Here’s the part most apps won’t tell you: this is a UI gate, not at-rest encryption. ObjectBox data isn’t encrypted on disk today (only the model-download token is, via SecureStore). The lock stops the opportunistic “picked up your phone” case, which is by far the most common one, but a determined attacker with a rooted device or a forensic image could still read the database file. We market it as exactly that: “App lock,” not “encrypted database.” At-rest DB encryption is a separate, bigger feature on the roadmap, and we won’t imply it’s done before it is.
Being precise about what a security feature doesn’t do is, itself, part of earning the trust the whole product runs on.
NativeLM v0.7 — “your data, your control” — is live. The entire engine is open-source (AGPL-3.0).