Delectable AI: Data Journey, ML Models & Graph RAG Architecture

The Data Journey

Your Grocer's raw data is table stakes. The intelligence we extract from it is what makes personalization possible. Follow the transformation from raw → enriched → modeled → contextual.

📦

Stage 0: Raw Data from Your Grocer

GIANT EAGLE PROVIDES

Product Catalog

{
  sku: "00123456789",
  name: "Annie's Cheddar Bunnies",
  price: 4.29,
  dept: "Snacks"
  // No nutrition. No allergens.
  // No dietary flags.
}

Transaction Events

{
  mpid: "M-48210",
  sku: "00123456789",
  action: "purchase",
  qty: 2,
  date: "2025-12-14"
  // Flat event. No context.
}

User Events (mParticle)

{
  mpid: "M-48210",
  event: "product_view",
  session: "s-7721",
  timestamp: "..."
  // No preferences inferred.
  // No persona identified.
}

What's missing: An LLM receiving this raw data has zero dietary awareness, zero personalization signals, and no way to enforce allergen safety. It's just SKU numbers and prices.

Delectable AI Transforms

🧬

Stage 1: Food Intelligence Enrichment

DELECTABLE IP

GTIN matching against USDA + Open Food Facts databases. Every SKU enriched with 100+ nutritional & dietary attributes.

BEFORE (Your Grocer raw)

sku: "00123456789"
name: "Annie's Cheddar Bunnies"
price: 4.29
dept: "Snacks"
// 4 fields. No intelligence.

AFTER (Delectable enriched)

sku: "00123456789"
is_gluten_free: true
is_nut_free: true  // nut-free facility
is_vegan: false
allergens: ["milk"]
calories: 150
protein_g: 5
sodium_mg: 210
nutriscore_grade: "B"
nova_group: 3
health_score: 72
shelf_life_days: 180
identity_embedding: [768d]
// 100+ fields. Full intelligence.

This single enrichment step unlocks every downstream capability: dietary filtering, allergen safety, health-aware ranking, pantry decay modeling, and nutritional reranking. Without it, the AI is flying blind.

📊

Stage 2: Latent Signal Extraction (ML Models)

BQML + DELECTABLE IP

Transaction history + enriched catalog → propensity scores, household personas, consumption velocity, flavor affinity. These signals are invisible in the raw data and can only be derived through ML modeling.

LATENT SIGNALS EXTRACTED PER SHOPPER
propensity_organic: 0.82 // never stated, derived from 12mo purchases
propensity_gluten_free: 0.61 // implicit preference signal
propensity_high_protein: 0.45
propensity_low_sodium: 0.85 // possible medical need
health_consciousness: 0.78
mission_cluster: 6 // "Quick Meal" pattern
household_persona: "Health-Conscious Mom"
pantry.chicken: 0.7 // est. 70% stock remaining
pantry.olive_oil: 0.15 // nearly depleted
purchase_embedding: [768d] // "taste DNA"

Why "Latent"?

The shopper never said "I prefer organic." No form was filled. No preference was stated. The 0.82 organic propensity was inferred from 12 months of purchasing behavior — she consistently picks organic when available, even at a premium. This is a signal that only exists because we enriched products with organic flags, then ran propensity modeling across the purchase history.

The Chain of Dependencies

Without Food Intelligence enrichment: no is_organic flag on products
→ Without organic flags: can't count organic purchases
→ Without organic purchase count: can't compute propensity
→ Without propensity: can't personalize rankings
→ Result: everyone sees the same generic results
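
The dependency chain can be made concrete with a small sketch. It assumes propensity is a smoothed share of organic picks among purchases where an organic option was available; the function name, the record fields, and the Beta(1, 1) prior are hypothetical illustrations, not the production model.

```python
def organic_propensity(purchases, prior_hits=1.0, prior_misses=1.0):
    """Smoothed share of organic picks among purchases where organic was available.

    `purchases` is a list of dicts with hypothetical keys:
      is_organic        - flag that only exists after enrichment
      organic_available - whether an organic alternative was offered
    The Beta(1, 1) prior is an assumption; it keeps sparse histories near 0.5.
    """
    eligible = [p for p in purchases if p["organic_available"]]
    hits = sum(1 for p in eligible if p["is_organic"])
    misses = len(eligible) - hits
    return (hits + prior_hits) / (hits + misses + prior_hits + prior_misses)


history = (
    [{"is_organic": True, "organic_available": True}] * 8
    + [{"is_organic": False, "organic_available": True}] * 2
    + [{"is_organic": False, "organic_available": False}] * 5
)
score = organic_propensity(history)  # (8 + 1) / (10 + 2) = 0.75
```

Without the `is_organic` flag from Stage 1, neither `hits` nor `misses` can be counted, which is the point of the chain.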

🕸️

Stage 3: Graph + Vector Indexing

GRAPH RAG

All enriched data connected into a property graph (BigQuery Graph) and indexed as vectors (768-dim embeddings). This enables multi-hop reasoning and semantic search.

PROPERTY GRAPH (Relationships)

👤 Shopper —PURCHASED→ 📦 Product

👤 Shopper —HAS_ALLERGY→ ⚠️ Allergen ←CONTAINS— 📦 Product

🍳 Recipe —USES→ 🥕 Ingredient —MAPS_TO→ 📦 Product

👤 Shopper —PART_OF→ 🏠 Household ←PART_OF— 👤 Shopper

VECTOR INDEXES (Semantic Search)

Identity Embeddings (70K products)

name + brand + category → 768d vector

Nutrition Embeddings (70K products)

ingredients + nutrients + dietary flags → 768d

Recipe Embeddings (2.1M recipes)

title + ingredients + instructions → 768d

Shopper Embeddings (100K users)

weighted avg of purchased product vectors → 768d "taste DNA"
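
A minimal sketch of the shopper "taste DNA" vector, assuming the weight for each product is its purchase quantity (the source only says "weighted avg", so the weighting scheme here is an assumption). Toy 3-d vectors stand in for the 768-d product embeddings.

```python
def shopper_embedding(purchases):
    """Weighted average of purchased product vectors.

    `purchases` is a list of (vector, weight) pairs; using purchase quantity
    as the weight is an assumption made for this sketch.
    """
    total = sum(weight for _, weight in purchases)
    if total == 0:
        raise ValueError("need at least one purchase with non-zero weight")
    dims = len(purchases[0][0])
    return [
        sum(vec[i] * weight for vec, weight in purchases) / total
        for i in range(dims)
    ]


# Three units of the first product, one unit of the second.
taste_dna = shopper_embedding([([1.0, 0.0, 0.0], 3), ([0.0, 1.0, 0.0], 1)])
```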

🧠

Stage 4: What Gemini Actually Sees

LLM CONTEXT

By the time a shopper's query reaches Gemini, it's accompanied by a rich context window of pre-fetched intelligence. The LLM doesn't search raw databases — it reasons over pre-computed, personalized signals.

GEMINI'S CONTEXT WINDOW (injected before first turn)
// ═══ Shopper Intelligence ═══
Shopper: Sarah M. (Loyalty #30313029952)
Persona: "Health-Conscious Mom" (primary), "Weekend Baker" (secondary)
Propensities: organic=0.82, gluten_free=0.61, low_sodium=0.85, high_protein=0.45
Health Consciousness: 0.78 (nutrient scoring ACTIVE)
Mission Pattern: Cluster 6 — Quick Meal (avg $34, 4 items, 55% prepared)
Top Brands: Nature's Basket (12x), Annie's (8x), Horizon (6x)
Top Departments: Produce (32%), Dairy (18%), Snacks (14%)
// ═══ Pantry State ═══
chicken_breast: stock=0.7 (bought 3 days ago, cycle=10 days)
broccoli: stock=0.1 ⚠ RESTOCK
olive_oil: stock=0.15 ⚠ RESTOCK
quinoa: stock=0.4
// ═══ Household Safety ═══
ALLERGY: child — tree nuts (severity: HIGH)
HARD FILTER: all products from nut-containing facilities EXCLUDED
// ═══ Available Tools ═══
17 function-calling tools for search, recipes, pantry, cart, rerank...

The insight: Gemini is not "searching a database." It's reasoning over a pre-computed intelligence layer that took 4 transformation stages to build. Every signal — propensity scores, pantry levels, allergen exclusions, household personas — was extracted by Delectable's ML models from Your Grocer's raw data.

Food Intelligence: The Foundation

Every capability in Delectable AI depends on this enrichment. It's the single transformation that turns a dumb product catalog into an intelligent one.

The GTIN Matching Pipeline

70K SKUs · Your Grocer Catalog (GTIN/UPC codes)
→ GTIN-14 Match · USDA + Open Food Facts (Nutrition + ingredients)
→ Enriched · 100+ Attributes (Dietary, nutrition, scores)
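
Matching across sources only works if every code is first normalized to the same width. The sketch below pads a UPC-A or EAN-13 code to GTIN-14 and verifies the standard GS1 check digit; the function names are hypothetical and the real pipeline may normalize differently.

```python
def gtin_check_digit(data_digits: str) -> int:
    """GS1 check digit: weights 3, 1, 3, ... starting from the rightmost data digit."""
    total = sum(
        int(d) * (3 if i % 2 == 0 else 1)
        for i, d in enumerate(reversed(data_digits))
    )
    return (10 - total % 10) % 10


def to_gtin14(code: str) -> str:
    """Left-pad a UPC-A / EAN-13 / GTIN-14 code to 14 digits and verify its check digit."""
    digits = code.strip()
    if not digits.isdigit() or len(digits) > 14:
        raise ValueError(f"not a GTIN: {code!r}")
    padded = digits.zfill(14)
    if gtin_check_digit(padded[:-1]) != int(padded[-1]):
        raise ValueError(f"bad check digit: {code!r}")
    return padded
```

Rejecting codes with a bad check digit before the join keeps typos in the catalog from silently matching the wrong nutrition record.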

What Each Enrichment Unlocks

🏷️

Dietary Flags

is_vegan, is_gluten_free, is_dairy_free, is_nut_free, is_keto, is_organic...

→ Dietary Hard Filter → Allergen Safety → Propensity Scoring

Without this enrichment: The raw catalog has no is_gluten_free field. A shopper with celiac disease could be shown products containing gluten. The system would have no way to filter or warn.

With enrichment: Hard-filter enforces dietary safety with AND logic. Ranking pipeline annotates conflicts. Propensity model can count how often each shopper buys gluten-free products to derive implicit preference.

The chain: Raw catalog → GTIN match → dietary flag → propensity calculation → personalized ranking. Remove any link and the chain breaks.
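
A minimal sketch of a hard filter with AND logic, assuming products are plain dicts carrying the enriched flags. The function name is hypothetical, and treating a missing flag as a failure (fail closed) is a design choice of this sketch rather than documented behavior.

```python
def passes_hard_filter(product, required_flags, excluded_allergens):
    """True only if every required flag is set AND no excluded allergen is present."""
    # A missing flag counts as a failure, so un-enriched products never slip through.
    has_all_flags = all(product.get(flag) is True for flag in required_flags)
    allergen_free = not set(product.get("allergens", [])) & set(excluded_allergens)
    return has_all_flags and allergen_free


bunnies = {
    "sku": "00123456789",
    "is_gluten_free": True,
    "is_nut_free": True,
    "is_vegan": False,
    "allergens": ["milk"],
}
ok_for_celiac = passes_hard_filter(bunnies, ["is_gluten_free", "is_nut_free"], ["tree_nuts"])
ok_for_vegan = passes_hard_filter(bunnies, ["is_vegan"], [])
```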

🔬

Nutritional Data

calories, protein_g, sodium_mg, sugars_g, fiber_g, cholesterol_mg...

→ Health Reranking → Nutrition Embeddings → Healthy Swap RMN

Without this enrichment: The LLM can suggest "choose low-sodium options" but has no way to verify which products actually have low sodium. It's generic advice, not actionable intelligence.

With enrichment: Health-aware reranking mathematically scores products: adjusted_score = λ_sodium × (sodium - mean) / std. A shopper with low-sodium propensity of 0.85 sees genuinely low-sodium products ranked first.
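
The formula yields a z-scored sodium term but does not say how it combines with relevance. The sketch below assumes the term is subtracted from a base relevance score and that λ_sodium scales with the shopper's low-sodium propensity; both are assumptions, as are the function name and the sample products.

```python
from statistics import mean, pstdev


def sodium_adjusted_scores(candidates, low_sodium_propensity, strength=1.0):
    """Subtract a z-scored sodium penalty from each candidate's relevance.

    candidates: list of (sku, relevance, sodium_mg) tuples.
    lambda_sodium = strength * propensity is an assumption of this sketch.
    """
    sodium = [s for _, _, s in candidates]
    mu, sigma = mean(sodium), pstdev(sodium)
    lam = strength * low_sodium_propensity
    scored = []
    for sku, relevance, sodium_mg in candidates:
        z = (sodium_mg - mu) / sigma if sigma else 0.0
        scored.append((sku, relevance - lam * z))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = sodium_adjusted_scores(
    [("chips", 0.90, 480), ("crackers", 0.88, 210), ("rice_cakes", 0.85, 60)],
    low_sodium_propensity=0.85,
)
```

With a propensity of 0.85 the lowest-sodium item moves to the top even though its raw relevance is the lowest of the three.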

Revenue unlock: CPG brands can bid on "Healthy Swap" placements. When a shopper is about to buy a high-sodium product, suggest a lower-sodium alternative — backed by real nutritional data, not guesswork.

📊

Health Scores

nutriscore_grade (A-E), nova_group (1-4), food_compass_score (0-100)

→ Product Quality Tier → Health Consciousness

NutriScore (A-E): European nutritional quality grade. "A" = best balance of nutrients per 100g. Combined with NOVA processing group to create a composite health rank.

Formula: internal_health_rank = (health_score × 0.7) + ((5 - nova_group) / 5 × 0.3)

Impact: When two products have equal relevance, the one with a better health score wins. This is invisible to keyword search — it requires nutritional data that only exists after enrichment.
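
As written, the formula mixes a health score with a NOVA term that lies between 0 and 1. The sketch below assumes `health_score` arrives on a 0-100 scale (as in the enriched example, where it is 72) and normalizes it to 0-1 so both terms share a scale; that normalization is an assumption, not something the formula states.

```python
def internal_health_rank(health_score, nova_group):
    """Composite rank: 70% health score, 30% inverse NOVA processing level.

    Assumption: health_score is on a 0-100 scale and is normalized to 0-1 here.
    """
    if nova_group not in (1, 2, 3, 4):
        raise ValueError("nova_group must be 1-4")
    return (health_score / 100) * 0.7 + ((5 - nova_group) / 5) * 0.3


bunnies_rank = internal_health_rank(health_score=72, nova_group=3)  # 0.504 + 0.12
```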

⏱️

Shelf Life Data

shelf_life_days, storage_type, perishability_tier

→ Virtual Pantry Decay → Freshness Reranking → Replenishment Nudges

Virtual Pantry formula: stock_level = 1.0 - (days_since_purchase / typical_cycle_days)

Shelf life data tells the model how fast each product depletes. Milk (7 days) vs. olive oil (180 days) have completely different decay curves. Without this, the pantry model treats all products equally.

Result: "You bought milk 5 days ago — likely running low. You bought olive oil 30 days ago — still fine." This proactive awareness is impossible without enrichment-derived shelf life data.
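
The pantry formula is simple enough to sketch directly. Clamping the result to the range 0 to 1 is an assumption added here so that overdue items read as empty rather than negative.

```python
def stock_level(days_since_purchase, typical_cycle_days):
    """Linear decay: 1.0 at purchase, 0.0 once a full cycle has elapsed."""
    if typical_cycle_days <= 0:
        raise ValueError("typical_cycle_days must be positive")
    remaining = 1.0 - days_since_purchase / typical_cycle_days
    return max(0.0, min(1.0, remaining))  # clamp is an assumption of this sketch


milk = stock_level(5, 7)           # about 0.29: likely running low
olive_oil = stock_level(30, 180)   # about 0.83: still fine
```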

The Enrichment Unlock Chain

Every AI capability in Delectable AI traces back to Food Intelligence enrichment. Remove it, and the entire intelligence layer collapses to generic keyword search.

Raw Catalog (4 fields)
└→ + Food Intelligence (100+ fields)
   ├→ + Propensity Scoring (8 scores per shopper)
   │    └→ Health-Aware Reranking
   │    └→ Household De-Averaging
   ├→ + Virtual Pantry (stock levels per SKU)
   │    └→ Replenishment Nudges
   │    └→ Zero-Waste Meal Planning
   ├→ + Dietary Hard Filters (allergen safety)
   │    └→ Nut-Free Facility Check
   ├→ + Multi-Aspect Embeddings (3 × 768d vectors)
   │    └→ Semantic Product Search
   │    └→ Recipe-Product Bridge
   └→ + Flavor Affinity (PMI ingredient pairing)
        └→ Complementary Discovery

ML Models Drawing Value from Your Grocer's Data

27+ production ML models transform raw transaction data into actionable intelligence.

Embedding Models: 768-dim vectors · 3 providers

Clustering Models: BQML K-Means · Personas

Statistical Models: Bayesian · PMI · Propensity

Vector Indices: IVF · ANN · VECTOR_SEARCH

Graph RAG: Why Search Still Matters

"Can't the LLM just figure it out?" No. Here's why retrieval via knowledge graph + vector search is essential — and what Graph RAG actually means in our architecture.

Why the LLM Needs a Retriever

🧠

LLM Alone (No Retrieval)

• Hallucination: invents products that don't exist at Your Grocer
• Stale data: training cutoff means wrong prices, discontinued SKUs
• No personalization: doesn't know this shopper's allergies or preferences
• No safety: can't enforce allergen filtering — suggests peanut butter to nut-allergic child
• No cart: output is text, not shoppable

🔍

Keyword Search + LLM (Basic RAG)

• Real products, but keyword matching misses semantic intent
• "Healthy dinner" matches "Healthy Choice" brand (wrong intent)
• No relationship reasoning: can't traverse allergy → product exclusion
• No personalization: same results for every shopper
• Single-hop: can answer "what is X?" but not "what should I cook given my pantry, allergies, and preferences?"

🕸️

Graph RAG (Our Approach)

• Semantic search: vector similarity understands intent
• Multi-hop reasoning: User → Allergy → Allergen ← Product (exclusion)
• Personalized retrieval: graph-traversal filtered by propensities
• Recipe → Ingredient → Product path resolved in one graph query
• Co-purchase affinity: "shoppers who bought X also bought Y"

How Graph RAG Works in Delectable AI

GRAPH RAG QUERY FLOW: "Plan gluten-free dinners for my family"
// Step 1: VECTOR SEARCH — Semantic retrieval of relevant recipes
embed("gluten-free dinner") → [768d query vector]
VECTOR_SEARCH(recipe_embeddings, query_vector, top_k=10)
→ 10 semantically similar recipes (including "pain perdu" = french toast)
// Step 2: GRAPH TRAVERSAL — Filter by relationships
MATCH (user:Shopper {mpid: "M-48210"})
  -[:HAS_ALLERGY]-> (a:Allergen {name: "tree_nuts"})
  <-[:CONTAINS_ALLERGEN]- (p:Product)
→ Excluded products: [SKU-list of nut-containing items]
// Step 3: GRAPH TRAVERSAL — Recipe → Ingredient → Product mapping
MATCH (r:Recipe)-[:USES]->(i:Ingredient)-[:MAPS_TO]->(p:Product)
WHERE r.id IN [top_10_recipe_ids]
  AND p.is_gluten_free = TRUE
  AND p.sku NOT IN [excluded_nut_products]
→ Shopping list: ingredients → specific Your Grocer products
// Step 4: PROPENSITY RERANKING — Personalize within each ingredient
For each ingredient group:
  score = semantic_relevance * 0.5 + propensity_match * 0.3 + purchase_history * 0.2
→ Top product per ingredient, personalized to this shopper
// Step 5: AUGMENTED GENERATION — LLM reasons over retrieved context
Gemini receives: recipes + products + pantry + propensities + allergen exclusions
Gemini generates: "Here's your 7-day GF meal plan. I noticed you have chicken
  at home (stock 0.7), so I'm starting with salmon on Monday..."

Graph RAG = Vector Search + Graph Traversal + Augmented Generation. The vector search finds semantically relevant content. The graph traversal resolves multi-hop relationships (allergen exclusion, recipe→product mapping). The LLM reasons over the combined context to generate a personalized, safe, shoppable response.
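
Steps 2 to 4 of the query flow can be sketched with plain dictionaries standing in for the property graph. Every node, edge, and SKU below is invented for illustration; a real deployment would run these traversals as graph queries rather than Python loops.

```python
# Toy in-memory graph; all data here is hypothetical.
HAS_ALLERGY = {"M-48210": {"tree_nuts"}}
CONTAINS = {"sku-trail-mix": {"tree_nuts"}, "sku-gf-pasta": set(), "sku-pasta": set()}
GLUTEN_FREE = {"sku-trail-mix": True, "sku-gf-pasta": True, "sku-pasta": False}
USES = {"recipe-1": ["pasta", "snack topping"]}
MAPS_TO = {"pasta": ["sku-pasta", "sku-gf-pasta"], "snack topping": ["sku-trail-mix"]}


def excluded_products(mpid):
    """Step 2: Shopper -> Allergen <- Product."""
    allergies = HAS_ALLERGY.get(mpid, set())
    return {sku for sku, allergens in CONTAINS.items() if allergens & allergies}


def shopping_list(recipe_id, mpid):
    """Step 3: Recipe -> Ingredient -> Product, keeping only safe gluten-free SKUs."""
    blocked = excluded_products(mpid)
    return {
        ingredient: [s for s in MAPS_TO[ingredient] if GLUTEN_FREE[s] and s not in blocked]
        for ingredient in USES[recipe_id]
    }


def rerank_score(semantic_relevance, propensity_match, purchase_history):
    """Step 4: the 0.5 / 0.3 / 0.2 blend from the query flow."""
    return semantic_relevance * 0.5 + propensity_match * 0.3 + purchase_history * 0.2


cart = shopping_list("recipe-1", "M-48210")
```

An ingredient that maps to an empty list (here "snack topping") signals that no safe product exists, which the agent can surface instead of silently substituting.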

Multi-Hop Reasoning That Keyword Search Can't Do

Allergen Exclusion (3 hops)

Sarah → HAS_ALLERGY → Tree Nuts ← CONTAINS ← Trail Mix

Keyword search can't do this. It would need to: (1) find the user's allergies, (2) find allergen-containing products, (3) exclude them. In a graph that is a single pattern match along one path; in SQL it takes three separate lookups or a multi-way join.

Recipe-to-Cart (3 hops)

Lemon Chicken → USES → Chicken → MAPS_TO → GE Organic Breast

The semantic bridge pre-computed ingredient→product matches using vector similarity. One graph query converts an entire recipe into a Your Grocer shopping cart in 100ms (vs 10+ seconds with sequential API calls).

Household De-Averaging (2 hops)

Card #4821 → HAS_PERSONA → Mom (Health) / Dad (Athlete)

K-Means clustering on purchase patterns (department penetration, basket composition, time-of-day) identifies distinct personas on a single loyalty card. The agent adapts recommendations based on which persona is likely shopping now.

Flavor Pairing (PMI Graph)

Fennel → PMI: 3.2 → Sea Bass (shared: Anethole)

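Pointwise Mutual Information across 2.1M recipes discovers ingredient pairings. This isn't "people who bought X bought Y"; it's "ingredients that recipes combine far more often than chance would predict." PMI itself is a co-occurrence statistic, so chemical complementarity is inferred from it rather than measured. These pairings drive discovery and new product trials.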
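
For reference, PMI for an ingredient pair is log2(p(x, y) / (p(x) · p(y))), where the probabilities are recipe frequencies. The sketch below computes it over a four-recipe toy corpus; the recipes are invented, the base-2 logarithm is a convention chosen here, and the 3.2 figure above is not reproduced.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(recipes):
    """PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ) over recipe co-occurrence."""
    n = len(recipes)
    single = Counter()
    pair = Counter()
    for ingredients in recipes:
        unique = sorted(set(ingredients))      # sorted so each pair has one key
        single.update(unique)
        pair.update(combinations(unique, 2))
    return {
        (x, y): math.log2((count / n) / ((single[x] / n) * (single[y] / n)))
        for (x, y), count in pair.items()
    }


recipes = [
    ["fennel", "sea bass"],
    ["fennel", "sea bass", "lemon"],
    ["lemon", "chicken"],
    ["chicken", "rice"],
]
scores = pmi_table(recipes)
```

A positive score means the pair appears together more often than independence would predict; a score near zero means the pairing carries no signal.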

Three Orthogonal Embedding Dimensions

Each product has 3 different 768-dimensional vectors, each capturing a different semantic aspect. This means the system can search by what something is, what's in it, or how it's described.

Identity Embedding

input: "name: Annie's Cheddar Bunnies |
brand: Annie's | category: Snacks |
sku: 00123456789"

Captures: what the product IS. Useful for "find me Annie's snacks" or "alternatives to this cracker."

Nutrition Embedding

Captures: what's IN the product. Useful for "high-protein snacks" or "low-sodium alternatives."

Description Embedding

input: "Baked cheddar crackers shaped
like bunnies. Made with real cheese.
Great for kids' lunchboxes."

Captures: how it's DESCRIBED. Useful for "fun kids' snacks" or "something for school lunch."

The same query "healthy kid-friendly snack" would match different products across each dimension — the identity vector finds "kids" brands, the nutrition vector finds genuinely healthy options, and the description vector finds products marketed for children. The system can combine or route to the right aspect based on query intent.

Agent Tool Orchestration

Gemini doesn't just answer questions — it orchestrates up to 17 tools across 15 iterations, with guardrails, parallel execution, and search refinement detection.

The Orchestration Loop

// ═══ AGENT ORCHESTRATION: chat_with_tools() ═══
// File: agents/grocery/agent.py (lines 409-1026)
┌─ PREFETCH (parallel, 2 threads) ──────────────────────────────────────┐
│  Thread 1: get_user_profile() → propensities, brands, departments     │
│  Thread 2: get_recent_purchases() → SKU counts, brand affinity        │
│  + Build RankingContext (used by all downstream tools)                 │
└───────────────────────────────────────────────────────────────────────┘
┌─ GEMINI LOOP (max 15 iterations) ─────────────────────────────────────┐
│                                                                       │
│  Send message → Gemini returns: [text | function_calls | thinking]    │
│                                                                       │
│  IF function_calls:                                                   │
│  ┌─ PARALLEL TOOL EXECUTION (up to 6 threads) ───────────────────┐    │
│  │  For each tool call:                                          │    │
│  │    1. Check if tool enabled (per-request filtering)           │    │
│  │    2. Execute handler (BQ query, Vertex search, etc.)         │    │
│  │    3. Post-process:                                           │    │
│  │       - If search_products → run 11-stage ranking pipeline    │    │
│  │       - If refinement detected → supersede old results        │    │
│  │       - Track shown_skus in session                           │    │
│  │       - Truncate to 30KB if needed                            │    │
│  │    4. Build FunctionResponse → send back to Gemini           │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                                                                       │
│  IF no function_calls (text response):                                │
│  ┌─ GUARDRAIL CHECK ───────────────────────────────────────────────┐  │
│  │  Is this a meal planning query AND no search_recipes called?    │  │
│  │  Is this a product query AND no search_products called?         │  │
│  │  → YES: Force catalog retrieval, inject results, REWRITE       │  │
│  │  → NO: Return text response to user                            │  │
│  └──────────────────────────────────────────────────────────────┘    │
│                                                                       │
└───────────────────────────────────────────────────────────────────────┘
┌─ OUTPUT ──────────────────────────────────────────────────────────────┐
│  SSE stream: text + products + recipes + debug_trace + render_hint    │
│  BigQuery logging: session, conversation, trace, search, selection    │
└───────────────────────────────────────────────────────────────────────┘
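
The parallel tool step of the loop can be sketched with the standard library thread pool. The handler signatures, the disabled-tool message, and the exact byte limit are assumptions made for illustration; the stub handlers stand in for the BigQuery and Vertex-backed tools.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_TOOL_THREADS = 6        # "up to 6 threads" in the diagram
MAX_RESULT_BYTES = 30_000   # "Truncate to 30KB"; the exact byte count is an assumption


def run_tool_calls(tool_calls, handlers, enabled):
    """Run one iteration's tool calls in parallel and return results in call order.

    tool_calls: list of (name, kwargs); handlers: name -> callable returning str.
    """
    def run_one(call):
        name, kwargs = call
        if name not in enabled:                      # per-request filtering
            return name, "tool disabled for this request"
        result = handlers[name](**kwargs)
        clipped = result.encode("utf-8")[:MAX_RESULT_BYTES].decode("utf-8", "ignore")
        return name, clipped

    with ThreadPoolExecutor(max_workers=MAX_TOOL_THREADS) as pool:
        return list(pool.map(run_one, tool_calls))   # map preserves input order


handlers = {
    "search_products": lambda query: f"products for {query}",
    "search_recipes": lambda query: f"recipes for {query}",
}
results = run_tool_calls(
    [("search_products", {"query": "organic bananas"}),
     ("search_recipes", {"query": "quick dinner"})],
    handlers,
    enabled={"search_products"},
)
```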

17 Tools Available to Gemini

Each tool connects to a different data source or ML model. Gemini decides which to call based on the query, profile, and conversation context.

Search Refinement Detection

// Bidirectional keyword overlap heuristic
Query 1: "organic bananas" → 5 products
Query 2: "organic fresh bananas"
// overlap = {"organic","bananas"} = 2/2 = 100%
→ REFINEMENT detected
→ Old products superseded, new results replace
// Prevents duplicate products in response

When Gemini narrows a search (e.g., "bananas" → "organic bananas"), the agent detects the overlap and removes the old result set. Without this, users would see duplicate products from both searches.
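
A minimal sketch of the bidirectional overlap heuristic, assuming whitespace tokenization and a threshold of 0.8 (neither is specified in the source). "Bidirectional" is interpreted here as taking the larger of the two overlap ratios.

```python
def is_refinement(previous_query, new_query, threshold=0.8):
    """Bidirectional keyword overlap: shared tokens over the smaller token set.

    Taking the max of both directions and the 0.8 threshold are assumptions.
    """
    prev_tokens = set(previous_query.lower().split())
    new_tokens = set(new_query.lower().split())
    if not prev_tokens or not new_tokens:
        return False
    shared = prev_tokens & new_tokens
    overlap = max(len(shared) / len(prev_tokens), len(shared) / len(new_tokens))
    return overlap >= threshold


narrowed = is_refinement("organic bananas", "organic fresh bananas")  # 2/2 overlap
unrelated = is_refinement("organic bananas", "almond milk")
```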

Retrieval Guardrails

// Gemini says: "Here are 5 dinner ideas..."
⚠ No search_products or search_recipes called!
// Guardrail fires:
1. classify_meal_plan_intent("plan dinners")
   → HEALTH_FOCUSED, days=7, people=4
2. Force search_recipes("healthy dinner")
3. Force search_products("dinner ingredients")
4. Inject: "REWRITE using ONLY these
   real products and recipes"
// Prevents hallucinated product suggestions

If Gemini responds with text without querying the catalog, the guardrail forces a real search, injects the results, and tells Gemini to rewrite. This guarantees every response contains real Your Grocer products.
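
The guardrail's trigger condition reduces to a small predicate. The intent labels and the mapping below are placeholders for this sketch; the source only names the two tools and the two query types.

```python
# Hypothetical mapping from classified intent to the tool that must have run.
REQUIRED_TOOL = {"meal_plan": "search_recipes", "product": "search_products"}


def needs_forced_retrieval(intent, tools_called):
    """True when a text-only answer skipped the catalog tool its intent requires.

    `intent` is assumed to come from an upstream classifier.
    """
    required = REQUIRED_TOOL.get(intent)
    return required is not None and required not in tools_called


fires = needs_forced_retrieval("meal_plan", tools_called=set())
quiet = needs_forced_retrieval("meal_plan", tools_called={"search_recipes"})
```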

4-Layer Caching Architecture

⚡

L1: Tool Cache

In-memory, per-request. Same tool+args → cached result. 0ms.

👤

L2: Profile Cache

Stale-while-revalidate. Serve cached profile instantly, refresh in background.

📊

L3: Ranking Cache

Redis-backed. Affinity matrix + purchase history snapshot. 15min TTL.

🏪

L4: Catalog Cache

Redis-backed product details. Avoid repeated BQ lookups for same SKU.

Uncached end-to-end: ~1.5s · Cached (2nd+ query in session): ~500ms
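
As an illustration of the TTL behavior described for the ranking cache, here is a minimal in-process sketch. It is a stand-in for the Redis-backed layer, not a model of it; the class name and the injectable clock are assumptions that make the expiry logic easy to test.

```python
import time


class TTLCache:
    """Minimal time-to-live cache; a stand-in for the Redis-backed L3 layer."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]                 # fresh entry: skip the expensive call
        value = compute()
        self._store[key] = (now, value)   # store the timestamp alongside the value
        return value


ranking_cache = TTLCache(ttl_seconds=15 * 60)   # 15min TTL from the L3 description
```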

The Brain: Model Interchangeability

The reasoning layer is provider-agnostic by design. Switch the LLM with an environment variable — zero code changes. Currently running Gemini 3.0 Flash Preview in production, with continuous evaluation against Claude, OpenAI, and open-source models.

Unified Provider Abstraction

# ═══ ml/llm/providers.py: UNIFIED LLM CLIENT ═══
# One interface, any model, zero code changes
from enum import Enum

class LLMProvider(Enum):
    GOOGLE    = "google"       # Vertex AI / Gemini
    ANTHROPIC = "anthropic"    # Claude (direct API)
    OPENAI    = "openai"       # OpenAI (direct API)
    AZURE     = "azure-openai" # Azure OpenAI Service
    VLLM      = "vllm"         # Self-hosted open source

class LLMClient:
    """Universal interface across all providers."""
    def chat(self, messages, tools, stream) -> "Response": ...
    def embed(self, text, task_type) -> "Vector": ...   # 768-dim vector
    def function_call(self, messages, declarations) -> "ToolCalls": ...
┌───────────────────────────────────────────────────────────────────┐
│  AGENT CODE                                                       │
│  chat_with_tools() → LLMClient.chat() → tool execution            │
│  ↓                                                                │
│  Doesn't know or care which model is behind LLMClient             │
└──────────────┬────────────────────────────────────────────────────┘
               │ env: GROCERY_AGENT_MODEL
    ┌──────────┼──────────┬──────────────┬──────────────┐
    ▼          ▼          ▼              ▼              ▼
 Gemini    Claude    GPT-4o       Azure OpenAI    vLLM
 3.0 Flash  Sonnet   / o1-mini    (Enterprise)   (Self-hosted)

Why this matters: No vendor lock-in. If Google raises prices, switch to Claude. If OpenAI ships a better model, test it in hours, not weeks. The intelligence layer is portable — the competitive advantage is the data, not the LLM.

Environment-Driven Model Switching

Switching the brain takes one environment variable change. The agent code, tools, ranking pipeline, and data layer remain completely untouched.

Production (GCP)
GROCERY_AGENT_MODEL=gemini-3-flash-preview
GROCERY_AGENT_PROVIDER=google
# Fallback chain:
LLM_FALLBACK_CHAIN=gemini,anthropic,azure-openai
# Extended thinking for complex meal plans:
GEMINI_THINKING_LEVEL=medium

Azure Deployment (Earley)
GROCERY_AGENT_MODEL=claude-sonnet-4-6
GROCERY_AGENT_PROVIDER=anthropic
# Same agent, same tools, same ranking
# Different brain — different cloud

OpenAI Evaluation
GROCERY_AGENT_MODEL=gpt-4o
GROCERY_AGENT_PROVIDER=openai
# Or reasoning models:
# GROCERY_AGENT_MODEL=o1-mini

Self-Hosted Open Source
GROCERY_AGENT_MODEL=qwen-2.5-72b
GROCERY_AGENT_PROVIDER=vllm
# Docker Compose: GPU-accelerated inference
VLLM_ENDPOINT=http://gpu-host:8000
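
A minimal sketch of how the two variables could be resolved at startup. The variable names come from the configurations above; the function name and the choice to fail fast on missing or unknown values are assumptions of this sketch.

```python
import os
from enum import Enum


class LLMProvider(Enum):
    GOOGLE = "google"
    ANTHROPIC = "anthropic"
    OPENAI = "openai"
    AZURE = "azure-openai"
    VLLM = "vllm"


def resolve_model(env=os.environ):
    """Read provider and model from GROCERY_AGENT_PROVIDER / GROCERY_AGENT_MODEL.

    Failing fast on a missing or unknown value is a choice made for this sketch.
    """
    model = env.get("GROCERY_AGENT_MODEL")
    provider = env.get("GROCERY_AGENT_PROVIDER")
    if not model or not provider:
        raise RuntimeError("GROCERY_AGENT_MODEL and GROCERY_AGENT_PROVIDER must be set")
    return LLMProvider(provider), model   # raises ValueError for an unknown provider


provider, model = resolve_model(
    {"GROCERY_AGENT_MODEL": "qwen-2.5-72b", "GROCERY_AGENT_PROVIDER": "vllm"}
)
```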

Supported Model Inventory

17+ models across 5 providers, continuously benchmarked on grocery-specific tasks.

Interactive: Model Comparison

Same query, same tools, same data — different brain. See how each model handles a real grocery scenario.

Evaluation & Testing Framework

Every model is evaluated on grocery-specific dimensions using automated LLM-as-Judge scoring and pairwise head-to-head comparisons.

# ═══ agents/testing/agentic/llm_judge.py ═══
# Automated head-to-head model evaluation
class LLMJudge:
    # Evaluated on 6 grocery-specific dimensions:
    CRITERIA = [
        "product_accuracy",    # Did it use real Your Grocer SKUs?
        "dietary_safety",      # Did it respect allergen constraints?
        "tool_efficiency",     # How many tool calls? Redundant searches?
        "personalization",     # Did it use profile/propensity data?
        "cost_awareness",      # Budget adherence, value optimization
        "response_quality",    # Natural language, formatting, helpfulness
    ]

    def judge_pair(self, response_a, response_b, criteria=CRITERIA) -> "PairScore":
        """Pairwise comparison: which model gave the better answer?"""
        ...

class SessionEvaluator:
    """Scores a full agent session (multi-turn) end-to-end."""
    GROCERY_ASSESSMENT_MODEL: str  # configurable; judge uses a different model than the one tested

    def evaluate(self, session_log) -> "SessionScore": ...

⚖️

Pairwise Judging

Model A vs Model B on the same prompt. A separate "judge" LLM evaluates both responses blindly. Eliminates positional bias with A/B swap.

📊

Automated Test Suites

200+ grocery-specific prompts: allergy queries, meal plans, budget requests, product discovery. Run nightly against all candidate models.

🔄

A/B Production Testing

Route 5% of traffic to a challenger model. Compare conversion rates, cart sizes, and session quality scores in real-time.
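
The A/B swap used in pairwise judging can be sketched as follows. The judge is modeled as any callable that returns "first" or "second"; that interface, and returning "tie" when the two orderings disagree, are assumptions of this sketch rather than the behavior of `LLMJudge`.

```python
def judge_with_swap(judge, prompt, response_a, response_b):
    """Ask the judge twice with positions swapped; accept only a consistent verdict.

    `judge(prompt, first, second)` must return "first" or "second".
    Returning "tie" on disagreement is an assumption of this sketch.
    """
    forward = judge(prompt, response_a, response_b)
    backward = judge(prompt, response_b, response_a)
    if forward == "first" and backward == "second":
        return "A"
    if forward == "second" and backward == "first":
        return "B"
    return "tie"


# Stub judges: one always prefers whatever is shown first (pure positional bias),
# the other prefers the longer response regardless of position.
def biased(prompt, first, second):
    return "first"


def prefers_longer(prompt, first, second):
    return "first" if len(first) >= len(second) else "second"
```

A judge with pure positional bias always ends in a tie under this scheme, so its preference never counts as a win for either model.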

Custom Model Fine-Tuning (DPO)

Beyond provider switching — we can fine-tune open-source models specifically for grocery tasks using Direct Preference Optimization.

Training Pipeline
// DPO: Learn from human preference pairs
Prompt: "healthy snack for my kid"
✓ Preferred: "Based on your purchase history,
  Annie's Cheddar Bunnies ($4.29) — organic,
  low-sodium, your child's top brand."
✗ Rejected: "Here are some healthy snacks:
  apples, granola bars, yogurt..."
  // Generic, no SKUs, no personalization
// Model learns: use real products, cite
// purchase history, include prices

Result: Grocery-Specialized Model
Base: Qwen 2.5 72B (open source)
Fine-tuned: 5,000 preference pairs
from Your Grocer agent sessions
Improvements over base:
  Tool calling accuracy: +18%
  Product hallucination: -73%
  Personalization usage: +42%
  Dietary safety: +15%
// Self-hosted via vLLM on 2× A100 GPUs
// Cost: ~$0.002/query vs $0.01 API

The strategic play: Use Gemini/Claude/GPT for rapid iteration. Collect preference data from production sessions. Fine-tune an open-source model that's grocery-specialized, cheaper, and fully under Your Grocer's control. The best of both worlds.
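
For readers unfamiliar with DPO, the per-pair objective is small enough to show directly. The sketch takes sequence log-probabilities from the policy being trained and from a frozen reference model; `beta=0.1` is a commonly used default, not a value stated in this document, and the numbers in the example are invented.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair, given sequence log-probabilities.

    loss = -log sigmoid(beta * [(policy_chosen - ref_chosen)
                                - (policy_rejected - ref_rejected)])
    """
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    return math.log1p(math.exp(-margin))   # equals -log(sigmoid(margin))


# Policy identical to the reference: margin is 0, loss is log(2).
baseline = dpo_loss(-10.0, -10.0, -10.0, -10.0)
# Policy now favors the preferred answer and disfavors the rejected one.
improved = dpo_loss(-5.0, -12.0, -10.0, -10.0)
```

The loss falls as the policy raises the preferred response's likelihood relative to the reference and lowers the rejected one's.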

Automatic Fallback & Resilience

The system maintains a fallback chain — if the primary model is unavailable or degraded, it automatically promotes the next provider with zero downtime.

┌─ Fallback Chain (GCP Production) ───────┐
│                                          │
│  1. Gemini 3.0 Flash Preview  ← PRIMARY │
│     ↓ timeout / 500 error                │
│  2. Gemini 2.0 Flash         ← FALLBACK │
│     ↓ region outage                      │
│  3. Claude Sonnet (Anthropic) ← BACKUP  │
│     ↓ all API providers down             │
│  4. Azure OpenAI GPT-4o      ← LAST     │
│                                          │
└──────────────────────────────────────────┘
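
The fallback chain amounts to trying providers in order until one answers. The sketch below treats any exception as "unavailable", which is a simplification; the provider names and stub callables are invented, and a production chain would presumably distinguish timeouts, 5xx errors, and region outages as the diagram does.

```python
def call_with_fallback(providers, request):
    """Try each (name, call) pair in order; return the first successful response.

    Treating any exception as "unavailable" is a simplification for this sketch.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(request)
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def primary_down(request):
    raise TimeoutError("timeout")


chain = [
    ("gemini-primary", primary_down),
    ("claude-backup", lambda request: f"answer to {request}"),
]
served_by, answer = call_with_fallback(chain, "plan dinners")
```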

Extended thinking support adapts per provider — Gemini 3.0 uses native thinking levels, Claude uses extended thinking blocks, GPT uses system-prompt chain-of-thought.

// Provider-specific thinking adaptation
Gemini 3.0:
  thinking_config: { level: "medium" }
  // Native structured thinking
Claude Sonnet:
  extended_thinking: true
  budget_tokens: 4096
  // Anthropic thinking blocks
GPT-4o / o1:
  system: "Think step-by-step..."
  // Chain-of-thought prompting

🧠

The Brain Is Interchangeable — The Intelligence Is Not

LLMs are a commodity. Gemini, Claude, GPT — they all improve every quarter. The sustainable competitive advantage is everything around the brain:

🧬

Food Intelligence

100+ attributes per SKU — no LLM provides this

📊

ML Models

Propensity, Pantry, Personas — trained on GE data

🕸️

Knowledge Graph

Graph RAG — relationships no LLM has in weights

🔧

17 Tools

Orchestration layer — the brain just decides what to call

Swapping Gemini for Claude is a config change. Replacing the Food Intelligence enrichment, Graph RAG, and the 27+ ML models? That's 18 months of data engineering. The moat is the data, not the model.

Delectable AI: Technical Deep Dive

From Raw Catalog to
Intelligent Context

The Data Journey

Stage 0: Raw Data from Your Grocer

Stage 1: Food Intelligence Enrichment

Stage 2: Latent Signal Extraction (ML Models)

Stage 3: Graph + Vector Indexing

Stage 4: What Gemini Actually Sees

Food Intelligence: The Foundation

The GTIN Matching Pipeline

What Each Enrichment Unlocks

Dietary Flags

Nutritional Data

Health Scores

Shelf Life Data

The Enrichment Unlock Chain

ML Models Drawing Value from Your Grocer's Data

Graph RAG: Why Search Still Matters

Why the LLM Needs a Retriever

How Graph RAG Works in Delectable AI

Multi-Hop Reasoning That Keyword Search Can't Do

Three Orthogonal Embedding Dimensions

Agent Tool Orchestration

The Orchestration Loop

17 Tools Available to Gemini

4-Layer Caching Architecture

The Brain: Model Interchangeability

Unified Provider Abstraction

Environment-Driven Model Switching

Supported Model Inventory

Interactive: Model Comparison

Evaluation & Testing Framework

Custom Model Fine-Tuning (DPO)

Automatic Fallback & Resilience

The Brain Is Interchangeable — The Intelligence Is Not
