LLM Dependency Levers

Caching · Deterministic Algorithms · Pre-computation · Config-Driven Control


Not Everything Needs an LLM Call.

Our architecture is intentionally designed with levers to dial LLM dependency up or down. Wherever an LLM adds no value, deterministic algorithms take over, which reduces latency, cuts cost, and improves reliability.

The LLM is reserved for what it does best: natural language understanding, creative meal planning, and conversational flow. Everything else — ranking, caching, unit conversion, package optimization — is handled by purpose-built, auditable, fast code.
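One way to picture such a lever is a small routing table that decides, per operation, whether a request goes to deterministic code or to the LLM. The sketch below is illustrative only: the operation names come from the text above, but `DEFAULT_LEVERS`, `route`, and both handlers are hypothetical names, and the LLM handler is a placeholder that makes no real model call.

```python
from typing import Callable, Dict


def deterministic_handler(payload: str) -> str:
    # Stand-in for purpose-built code (ranking, unit conversion, ...).
    return f"deterministic:{payload}"


def llm_handler(payload: str) -> str:
    # Placeholder only: a real system would call a model here.
    return f"llm:{payload}"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "deterministic": deterministic_handler,
    "llm": llm_handler,
}

# Hypothetical lever settings; in practice these could live in a config file.
DEFAULT_LEVERS: Dict[str, str] = {
    "ranking": "deterministic",
    "unit_conversion": "deterministic",
    "package_optimization": "deterministic",
    "meal_planning": "llm",
}


def route(operation: str, payload: str,
          levers: Dict[str, str] = DEFAULT_LEVERS) -> str:
    """Dispatch an operation according to the configured lever."""
    # Assumption: unknown operations fall back to the flexible LLM path.
    mode = levers.get(operation, "llm")
    return HANDLERS[mode](payload)
```

Dialing a lever is then a config change rather than a code change: passing `{**DEFAULT_LEVERS, "ranking": "llm"}` sends ranking back to the LLM path.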

The LLM Dependency Spectrum

Every operation falls somewhere on a spectrum from "pure LLM reasoning" to "fully deterministic." We've deliberately shifted most operations toward the deterministic end, for speed and reliability.

Spectrum: Deterministic / Fast (left) to LLM-Dependent / Flexible (right)

Caching

Eliminate redundant LLM and API calls. Stale-while-revalidate ensures users never wait.

<1 ms cache hit vs. 800 ms BQ query
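The stale-while-revalidate idea can be sketched in a few lines: a fresh entry is returned immediately, a stale entry is also returned immediately while a background thread refreshes it, and only a cold miss makes the caller wait. This is a minimal in-process sketch, not the production cache; `SWRCache`, `loader`, `ttl_seconds`, and `wait_for_refreshes` are hypothetical names, and a shared store such as Redis would replace the plain dictionary.

```python
import threading
import time
from typing import Any, Callable, Dict, Tuple


class SWRCache:
    """Tiny in-process stale-while-revalidate cache (illustrative only)."""

    def __init__(self, loader: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader      # the slow call, e.g. a warehouse query
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can control time
        self._entries: Dict[str, Tuple[Any, float]] = {}  # key -> (value, stored_at)
        self._refreshing: Dict[str, threading.Thread] = {}
        self._lock = threading.Lock()

    def _store(self, key: str) -> Any:
        value = self._loader(key)  # runs outside the lock
        with self._lock:
            self._entries[key] = (value, self._clock())
            self._refreshing.pop(key, None)
        return value

    def get(self, key: str) -> Any:
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                value, stored_at = entry
                is_stale = self._clock() - stored_at >= self._ttl
                if is_stale and key not in self._refreshing:
                    # Serve the stale value now; refresh in the background.
                    worker = threading.Thread(
                        target=self._store, args=(key,), daemon=True)
                    self._refreshing[key] = worker
                    worker.start()
                return value
        # Cold miss: nothing to serve yet, so this caller waits once.
        return self._store(key)

    def wait_for_refreshes(self) -> None:
        """Block until in-flight refreshes finish (handy in tests)."""
        with self._lock:
            workers = list(self._refreshing.values())
        for worker in workers:
            worker.join()
```

The sketch allows at most one background refresh per key. Concurrent cold misses may each call the loader, which a real implementation would likely deduplicate.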

Deterministic Algorithms

Replace LLM reasoning with auditable, optimal solutions. Examples include an ILP solver, unit tables, and ranking stages.

Guaranteed optimal vs. ~80% LLM accuracy
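To make "deterministic and optimal" concrete, here is a small sketch of a unit table plus an exact package chooser. It deliberately swaps in dynamic programming for the ILP solver named above, since DP is exact for this simple covering problem and needs no external library. The package sizes and prices are made-up illustration data, and `GRAMS_PER_UNIT`, `to_grams`, and `cheapest_packages` are hypothetical names.

```python
import math
from typing import Dict, List, Tuple

# Unit table: grams per unit (mass units only, for illustration).
GRAMS_PER_UNIT: Dict[str, float] = {
    "g": 1.0,
    "kg": 1000.0,
    "oz": 28.349523125,
    "lb": 453.59237,
}


def to_grams(amount: float, unit: str) -> float:
    """Convert a mass to grams with a plain table lookup."""
    return amount * GRAMS_PER_UNIT[unit]


def cheapest_packages(needed_grams: int,
                      packages: List[Tuple[int, int]]) -> Tuple[int, List[int]]:
    """Return (min cost in cents, package sizes) covering at least needed_grams.

    packages is a list of (size_grams, price_cents); each may be bought
    any number of times.
    """
    if not packages or any(size <= 0 for size, _ in packages):
        raise ValueError("need at least one package with a positive size")

    # best_cost[q] = cheapest way to cover at least q grams.
    best_cost: List[float] = [0] + [math.inf] * needed_grams
    last_pick = [0] * (needed_grams + 1)
    for q in range(1, needed_grams + 1):
        for size, price in packages:
            cost = price + best_cost[max(0, q - size)]
            if cost < best_cost[q]:
                best_cost[q] = cost
                last_pick[q] = size

    # Walk the recorded picks back to recover the chosen packages.
    chosen: List[int] = []
    q = needed_grams
    while q > 0:
        chosen.append(last_pick[q])
        q = max(0, q - last_pick[q])
    return int(best_cost[needed_grams]), chosen
```

For example, with packages of 250 g, 500 g, and 1000 g priced at 180, 300, and 500 cents, covering 1200 g costs 680 cents (one 1000 g and one 250 g package). The same input always produces the same answer, and every step can be inspected.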

Pre-computation

Pre-compute profiles, playbooks, semantic bridges. Context is ready before the LLM sees the query.

5 ms pre-built context vs. 1300 ms real-time fetch
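A rough sketch of the pre-computation pattern: an offline step turns each profile into a ready-made context string, so the request path is a single lookup followed by prompt assembly. The profile fields, the context format, and the names `UserProfile`, `precompute_contexts`, and `prompt_for` are all assumptions for illustration; a dictionary stands in for whatever key-value store actually holds the pre-built context.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class UserProfile:
    # Hypothetical fields; the real profile schema is not specified here.
    user_id: str
    diet: str
    household_size: int
    disliked: List[str]


DEFAULT_CONTEXT = "diet=unknown; household=1; avoid=none"


def build_context(profile: UserProfile) -> str:
    """Render one profile into a compact, deterministic context string."""
    disliked = ", ".join(sorted(profile.disliked)) or "none"
    return (f"diet={profile.diet}; household={profile.household_size}; "
            f"avoid={disliked}")


def precompute_contexts(profiles: Iterable[UserProfile]) -> Dict[str, str]:
    """Offline step: build every context before any query arrives."""
    return {p.user_id: build_context(p) for p in profiles}


def prompt_for(query: str, user_id: str, store: Dict[str, str]) -> str:
    """Request path: one lookup, no real-time fetch."""
    context = store.get(user_id, DEFAULT_CONTEXT)
    return f"[context] {context}\n[query] {query}"
```

The slow work (fetching and summarizing the profile) happens ahead of time, which is where a gap like the one quoted above would come from.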

Why This Matters

Every LLM token costs money. Every LLM call adds latency. By routing deterministic operations away from the LLM, we achieve:

60-70% of operations avoid the LLM entirely
260x faster ranking via Redis cache
$0 LLM cost per cached response
100% auditable deterministic decisions