Why Android's ActivityManager lies about RAM — and how litertlm-kmp works around it

Xiaomi, Realme, and OPPO inflate reported RAM with swap-to-flash. Here's how we detect it and prevent OOM crashes when loading on-device LLMs.

If you’re loading a 2–4 GB model into memory on Android, the first thing you do is check how much RAM is available. The standard API for that is ActivityManager.MemoryInfo:

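// `context` is any android.content.Context (Activity, Application, ...)
val activityManager = context.getSystemService(Context.ACTIVITY_SERVICE) as ActivityManager
val memInfo = ActivityManager.MemoryInfo()
activityManager.getMemoryInfo(memInfo)
val totalRam = memInfo.totalMem // total RAM in bytes, as reported by the kernel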

On most devices, this works. On Xiaomi, Realme, and OPPO — which together represent roughly 40% of the global Android install base — it lies.

The problem: Virtual RAM Expansion

These OEMs ship a feature under various names — Xiaomi Memory Extension, Realme Dynamic RAM Expansion, OPPO RAM Expansion — that carves out a chunk of flash storage as swap space and adds it to the kernel’s reported MemTotal. A phone with 6 GB of physical RAM will report 8 GB or 10 GB to the operating system.

ActivityManager.MemoryInfo.totalMem reads MemTotal from the kernel, so it faithfully returns the inflated number. Your model-loading code sees 8 GB of total RAM, decides it’s safe to load a 4 GB model, and begins mapping the weights into memory.

What happens next depends on how hard the device is hitting swap. Best case: the model loads but inference is agonizingly slow because the runtime is paging weight tensors in and out of flash. Worst case: the kernel’s OOM killer fires and your process dies mid-inference with no graceful error.

Why this matters for on-device LLMs

Cloud inference doesn’t have this problem — the model lives on a server with known, fixed hardware. On-device inference runs on whatever the user owns, and the user doesn’t know (or care) that their phone’s RAM spec is synthetic.

For litertlm-kmp, which loads Gemma-family models ranging from ~1.5 GB (E2B) to ~4 GB (E4B), choosing the right model variant for the device is a safety decision. Load a model that’s too large and the app crashes. Load one that’s too small and you’re leaving capability on the table.

The fix: read `/proc/meminfo` directly

The kernel exposes both physical and swap memory in /proc/meminfo. The key lines:

MemTotal:        7864320 kB    ← inflated (physical + swap)
SwapTotal:       2097152 kB    ← this is the OEM's RAM expansion

By subtracting SwapTotal from MemTotal, you get actual physical RAM. In Kotlin:

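import java.io.File

// Physical RAM in MB: the kernel's MemTotal minus the swap-backed expansion.
private fun getPhysicalRamMb(): Long {
    // Reading /proc/meminfo can fail on locked-down devices; fall back instead of crashing.
    val memInfo = runCatching { File("/proc/meminfo").readText() }.getOrNull()
        ?: return fallbackFromActivityManager() // fallback helper not shown here
    val memTotal = extractKb(memInfo, "MemTotal") ?: return fallbackFromActivityManager()
    val swapTotal = extractKb(memInfo, "SwapTotal") ?: 0L
    return (memTotal - swapTotal).coerceAtLeast(0L) / 1024 // convert kB → MB
}

// Returns the numeric value (in kB) of a "Key:   12345 kB" line, or null if absent.
private fun extractKb(text: String, key: String): Long? =
    text.lines()
        .find { it.startsWith("$key:") }
        ?.split("\\s+".toRegex())
        ?.getOrNull(1)
        ?.toLongOrNull()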
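
To sanity-check the subtraction against the sample values above, here is a minimal sketch that runs on any JVM without Android. The names `parseMeminfoKb` and `physicalRamMb` are hypothetical helpers for illustration, not part of litertlm-kmp, and the sketch assumes, as this article describes, that MemTotal on the affected devices already includes the swap-backed expansion.

```kotlin
// Hypothetical helpers for illustration; not litertlm-kmp's actual API.

// Parses "Key:   12345 kB" lines into a map of key -> value in kB.
// Lines that do not match that shape are skipped.
fun parseMeminfoKb(text: String): Map<String, Long> =
    text.lineSequence()
        .mapNotNull { line ->
            val parts = line.trim().split(Regex("\\s+"))
            val key = parts.getOrNull(0)?.removeSuffix(":")
            val value = parts.getOrNull(1)?.toLongOrNull()
            if (key == null || key.isEmpty() || value == null) null else key to value
        }
        .toMap()

// Assumption: MemTotal includes the swap-backed expansion, per the article.
// Returns null when MemTotal is missing so the caller can pick a fallback.
fun physicalRamMb(meminfo: Map<String, Long>): Long? {
    val memTotalKb = meminfo["MemTotal"] ?: return null
    val swapTotalKb = meminfo["SwapTotal"] ?: 0L
    return (memTotalKb - swapTotalKb).coerceAtLeast(0L) / 1024
}

fun main() {
    val sample = """
        MemTotal:        7864320 kB
        SwapTotal:       2097152 kB
    """.trimIndent()
    // 7864320 kB - 2097152 kB = 5767168 kB, which is 5632 MB.
    println(physicalRamMb(parseMeminfoKb(sample))) // prints 5632
}
```

Keeping the parsing separate from the file read makes the arithmetic unit-testable with captured /proc/meminfo dumps from real devices.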

In litertlm-kmp’s AndroidHardwareProvider, this feeds into a tiering system:

| Physical RAM | Tier | Max model |
|---|---|---|
| < 4 GB | `LOW` | No on-device LLM (graceful refusal) |
| 4–6 GB | `MID` | Gemma 4 E2B (~1.5 GB weights) |
| 6–8 GB | `HIGH` | Gemma 4 E2B or E4B depending on available memory |
| > 8 GB | `ULTRA` | Any supported model |

When swap is detected above 1 GB, the tier is forcibly downgraded. A device reporting 8 GB with 2 GB of swap gets classified as a 6 GB device (MID tier), and the model catalog offers the smaller variant.
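
The tiering can be sketched as a pure function of physical RAM. This is an illustration under stated assumptions, not the code in AndroidHardwareProvider: `RamTier` and `tierFor` are hypothetical names, the boundary handling (exactly 6 GB maps to MID, exactly 8 GB maps to HIGH) is one reading of the table and of the 8 GB example above, and the 1 GB swap threshold is not modeled because the sketch always classifies on physical RAM.

```kotlin
// Hypothetical sketch of the tiering table; names and boundaries are assumptions.
enum class RamTier { LOW, MID, HIGH, ULTRA }

private const val MB_PER_GB = 1024L

// Classifies by physical RAM only, so swap-backed expansion never raises the tier.
// Boundaries are inclusive at the top of MID and HIGH (an assumption, see lead-in).
fun tierFor(physicalRamMb: Long): RamTier = when {
    physicalRamMb < 4 * MB_PER_GB -> RamTier.LOW
    physicalRamMb <= 6 * MB_PER_GB -> RamTier.MID
    physicalRamMb <= 8 * MB_PER_GB -> RamTier.HIGH
    else -> RamTier.ULTRA
}

fun main() {
    val reportedMb = 8 * MB_PER_GB // what the device advertises
    val swapMb = 2 * MB_PER_GB     // OEM RAM expansion
    println(tierFor(reportedMb))          // prints HIGH (the naive answer)
    println(tierFor(reportedMb - swapMb)) // prints MID (after removing swap)
}
```

In practice MemTotal usually comes in somewhat below the marketed RAM size, because the kernel and firmware reserve part of physical memory, so real thresholds may need some slack below the round GB values used here.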

The result

This single detection eliminated 100% of the OOM crashes we saw on Xiaomi Redmi Note series and Realme devices during testing. The fix is roughly 20 lines of code, but it took a full afternoon of crash logs to understand why the app was dying on devices that “should” have had enough memory.

If you’re building anything that allocates large contiguous memory on Android — ML models, video editors, game engines — don’t trust ActivityManager. Read /proc/meminfo directly.

The full implementation is in AndroidHardwareProvider inside the litertlm-kmp repository.
