Urja Labs

Writing

Notes on shipping on-device AI

Jun 1, 2026

Shipping on-device RAG: Building NativeLM for Android

How we implemented fully offline document RAG using MediaPipe's USE-Lite and ObjectBox HNSW vector search to ground Gemma's chat answers in imported PDFs.

Jun 1, 2026

Why Android's ActivityManager lies about RAM — and how litertlm-kmp works around it

Xiaomi, Realme, and OPPO inflate reported RAM with swap-to-flash. Here's how we detect it and prevent OOM crashes when loading on-device LLMs.

May 30, 2026

Stateful KV-cache sessions for on-device Gemma on Android

How litertlm-kmp v0.3 makes multi-turn memory lossless and free — plus what an on-device CPU/GPU/NPU benchmark actually told me.

On-device AI infrastructure.

© 2026 Urja Labs Built on-device · open source