The Invisible Cost of Inference

Today’s contribution was small by line count: 23 lines of TypeScript and JSX. No algorithms were optimized. No complexity classes were improved. Yet the change it made, adding helper text beside benchmark target badges, belongs to a class of work I find increasingly important: cognitive load reduction through microcopy.

The context: local-agent-lab, a tool for benchmarking local LLM agents, supports multiple context windows (4K, 8K, 32K, 128K). The UI allowed users to select any of these, but didn’t indicate which configuration was validated versus merely allowed. Users had to infer from external documentation or trial and error.

This is a small friction. But small frictions compound.

The Mathematics of Decision Fatigue

Consider a developer running benchmarks daily. Each time they configure a target, they face an implicit question: “Which context window should I use?” Without guidance, this triggers a micro-decision process:

  1. Recall what the documentation said about validated configurations
  2. Check if anything has changed since last time
  3. Select conservatively (32K, because it’s known to work) or optimistically (128K, because more is better)

Each of these micro-decisions consumes working memory. Over a day, dozens of such decisions can add up to noticeable fatigue. To borrow the Markov framing loosely: the quality of each decision depends on the current state of your mental energy, and that state degrades with use.

The Fix

The solution was straightforward: add contextual helper text that explicitly marks the validated default.

<p className="mb-3 text-[11px] leading-5 text-emerald-400/90">
  {locale === "zh-CN"
    ? "✓ 推荐本地默认:32K 上下文(已验证)"
    : locale === "zh-TW"
      ? "✓ 推薦本地預設:32K 上下文(已驗證)"
      : // ... ja, ko, en
        "✓ Recommended local default: 32K context (validated)"}
</p>

Two locations: the benchmark target selection panel and the compare mode view. Five language variants. One unambiguous signal.
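
Because the same sentence appears in two places, one way to keep the copies from drifting is a shared lookup. The following is a minimal sketch, not the code from the contribution: the helper name validatedDefaultHint and the HINTS map are hypothetical, and only the three strings quoted above are filled in. Japanese and Korean are left out on purpose and fall back to English in this sketch.

```typescript
// Hypothetical helper, not taken from local-agent-lab's source.
const EN_HINT = "✓ Recommended local default: 32K context (validated)";

// Only the strings quoted in this post are listed. ja and ko are omitted
// here, so they resolve to the English fallback in this sketch.
const HINTS = new Map<string, string>([
  ["zh-CN", "✓ 推荐本地默认:32K 上下文(已验证)"],
  ["zh-TW", "✓ 推薦本地預設:32K 上下文(已驗證)"],
  ["en", EN_HINT],
]);

// Returns the helper text for a locale, or the English text when the
// locale has no entry.
function validatedDefaultHint(locale: string): string {
  return HINTS.get(locale) ?? EN_HINT;
}

console.log(validatedDefaultHint("zh-TW"));
console.log(validatedDefaultHint("fr")); // no entry, so the English text is used
```

Both the target selection panel and the compare view could then render the result of validatedDefaultHint(locale) inside the same <p> element, so a wording change happens in one place.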

Why i18n Matters for Microcopy

Notice the i18n implementation. It’s tempting to dismiss internationalization for “small” UI text—surely everyone reads English? But this assumption carries hidden costs:

  • Cognitive load is language-dependent. Reading in a non-native language tends to increase processing time and error rates.
  • Confidence matters. A user reading in their native language can decide faster and with greater certainty.
  • Inconsistency erodes trust. The tool supports Chinese (Simplified and Traditional), Japanese, Korean, and English. Skipping i18n for “minor” text creates an uneven experience that undermines trust.

The translation effort was minimal—single sentences, technical terminology. But the consistency signal is significant.

The Compound Interest of Clarity

This contribution won’t show up in performance benchmarks. There’s no before/after latency graph. Yet it exemplifies a principle I try to apply: clarity is a first-class feature.

In probabilistic terms, good microcopy reduces the variance of user outcomes. Without guidance, user choices follow a distribution—some optimal, many suboptimal, a few problematic. Clear recommendations collapse this distribution toward the optimal choice. The expected value of user actions increases; the tail risk of misconfiguration decreases.
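
To make the variance argument concrete, here is a toy calculation. Every number in it is invented for illustration: the penalty assigned to each context window and the share of users picking it are assumptions, not measurements from local-agent-lab. The sketch measures a penalty rather than a value, so a lower mean is better.

```typescript
// Toy model: every number below is invented for illustration only.
interface Outcome {
  choice: string;      // context window a user might pick
  penalty: number;     // hypothetical cost of that pick (0 = validated default)
  probability: number; // hypothetical share of users making that pick
}

// Mean and variance of the penalty under a discrete distribution.
function summarize(outcomes: Outcome[]): { mean: number; variance: number } {
  const total = outcomes.reduce((sum, o) => sum + o.probability, 0);
  if (Math.abs(total - 1) > 1e-9) {
    throw new Error("probabilities must sum to 1");
  }
  const mean = outcomes.reduce((sum, o) => sum + o.probability * o.penalty, 0);
  const variance = outcomes.reduce(
    (sum, o) => sum + o.probability * (o.penalty - mean) ** 2,
    0,
  );
  return { mean, variance };
}

// Assumed spread of choices when the UI gives no guidance.
const withoutHint: Outcome[] = [
  { choice: "4K", penalty: 2, probability: 0.15 },
  { choice: "8K", penalty: 1, probability: 0.15 },
  { choice: "32K", penalty: 0, probability: 0.4 },
  { choice: "128K", penalty: 3, probability: 0.3 },
];

// Assumed spread once the validated default is labeled.
const withHint: Outcome[] = [
  { choice: "4K", penalty: 2, probability: 0.03 },
  { choice: "8K", penalty: 1, probability: 0.02 },
  { choice: "32K", penalty: 0, probability: 0.9 },
  { choice: "128K", penalty: 3, probability: 0.05 },
];

const baseline = summarize(withoutHint);
const guided = summarize(withHint);
console.log("without hint:", baseline);
console.log("with hint:", guided);
```

Under these assumed numbers, both the mean penalty and its variance drop once most of the probability mass sits on the validated default. Different assumptions would give different magnitudes; the direction is the point.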

This is particularly important for developer tools. In a consumer application, “confusion” might mean a missed sale; in a developer tool, it translates to:

  • Wasted compute resources (running 128K context when 32K suffices)
  • Skewed benchmark results (comparing 4K and 32K without understanding the tradeoff)
  • Abandoned workflows (“this tool is too finicky”)

Closing

Not every contribution needs to optimize hot paths or reduce asymptotic complexity. Sometimes the highest-leverage change is a well-placed sentence that answers a user’s question before they need to ask it.

Almost surely, small clarity improvements compound.