# AI Call Iteration Flow - JSON Merging System
This document describes the iteration flow for handling large JSON responses from AI that may be truncated and need to be merged across multiple iterations.
## Overview
When an AI response is too large, it may be truncated (cut) at an arbitrary point. The iteration system:
1. Detects incomplete JSON
2. Requests continuation from the AI
3. Merges the continuation with the existing JSON
4. Repeats until complete or max failures reached
---
## Key Variables
| Variable | Type | Purpose |
|----------|------|---------|
| `jsonBase` | `str \| None` | The merged JSON string (CUT version for overlap matching) |
| `candidateJson` | `str` | Temporary holder for merged result until validated |
| `lastValidCompletePart` | `str \| None` | Fallback - last successfully parsed CLOSED JSON |
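| `lastOverlapContext` | `str` | Overlap context from the last successful iteration, reused in retry/continuation prompts |
| `lastHierarchyContextForPrompt` | `str` | Budget-limited CUT JSON from the last successful iteration, reused in retry/continuation prompts |
| `mergeFailCount` | `int` | Failure counter shared by merge and parse failures; reset to 0 on a successful continuation (max 3 failures) |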
---
## Key Distinction: hierarchyContext vs completePart
| Field | Description | Use Case |
|-------|-------------|----------|
| `hierarchyContext` | **CUT JSON** - truncated at cut point | Used as `jsonBase` for merging with next AI fragment |
| `completePart` | **CLOSED JSON** - all structures properly closed | Used for validation, parsing, and fallback |
**Why this matters:**
- The next AI fragment starts with an **overlap** that matches the CUT point
- If we used `completePart` (closed), the overlap detection would FAIL
- We must use `hierarchyContext` (cut) so overlap matching works correctly
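To make the distinction concrete, here is a minimal sketch of suffix/prefix overlap matching. `mergeWithOverlap` and its `minOverlap` parameter are hypothetical stand-ins for illustration only, not the actual `mergeJsonStringsWithOverlap()` implementation, and the sample strings are invented.

```python
import json


def mergeWithOverlap(base: str, fragment: str, minOverlap: int = 4) -> tuple[str, bool]:
    """Join base and fragment where a suffix of base equals a prefix of fragment.

    Hypothetical helper for illustration; returns (merged, hasOverlap).
    """
    maxLen = min(len(base), len(fragment))
    # Try the longest possible overlap first, down to minOverlap characters.
    for size in range(maxLen, minOverlap - 1, -1):
        if base.endswith(fragment[:size]):
            return base + fragment[size:], True
    return base, False


# CUT JSON: ends exactly where the AI response was truncated.
cutJson = '{"items": [{"id": 1, "name": "alpha"}, {"id": 2, "na'
# CLOSED JSON: the same response repaired so that it parses.
closedJson = '{"items": [{"id": 1, "name": "alpha"}]}'
# Continuation fragment: starts by repeating the text just before the cut point.
fragment = '{"id": 2, "name": "beta"}]}'

mergedFromCut, okFromCut = mergeWithOverlap(cutJson, fragment)
mergedFromClosed, okFromClosed = mergeWithOverlap(closedJson, fragment)

# The cut version lines up with the fragment and yields parseable JSON.
parsedFromCut = json.loads(mergedFromCut)
```

With the cut string the overlap is found and the merged text parses. With the closed string the added closing brackets hide the cut point, so no overlap is found and the base is returned unchanged.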
---
## Flow Steps
### Step 1: BUILD PROMPT
**Location:** `subAiCallLooping.py` lines 163-212
**Function:** `buildContinuationContext()` from `modules/shared/jsonUtils.py`
- **First iteration:** Use original prompt
- **Continuation:** `buildContinuationContext(allSections, lastRawResponse, ...)`
- Internally calls `getContexts(lastRawResponse)` to get overlap/hierarchy
- Builds continuation prompt with `overlapContext` + `hierarchyContextForPrompt`
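The prompt selection can be sketched as follows. This is an assumption-heavy illustration: `buildPrompt`, `buildContinuationPromptSketch`, and the prompt wording are hypothetical, since this document does not show the real signature or template of `buildContinuationContext()`.

```python
def buildContinuationPromptSketch(
    originalPrompt: str,
    overlapContext: str,
    hierarchyContextForPrompt: str,
) -> str:
    """Hypothetical template; the real prompt wording is not shown in this document."""
    return "\n\n".join([
        originalPrompt,
        "Your previous response was truncated. Structure produced so far:",
        hierarchyContextForPrompt,
        "Continue the JSON. Begin your reply by repeating exactly this text:",
        overlapContext,
    ])


def buildPrompt(
    originalPrompt: str,
    isFirstIteration: bool,
    overlapContext: str = "",
    hierarchyContextForPrompt: str = "",
) -> str:
    """First iteration: original prompt. Later iterations: continuation prompt."""
    if isFirstIteration:
        return originalPrompt
    return buildContinuationPromptSketch(
        originalPrompt, overlapContext, hierarchyContextForPrompt
    )


firstPrompt = buildPrompt("Generate the report as JSON.", True)
nextPrompt = buildPrompt(
    "Generate the report as JSON.",
    False,
    overlapContext='{"id": 2, "na',
    hierarchyContextForPrompt='{"items": [{"id": 1, "name": "alpha"}, {"id": 2, "na',
)
```

Asking the model to repeat the overlap text is what lets the merge step find the join point in the next fragment.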
### Step 2: CALL AI
**Location:** `subAiCallLooping.py` lines 214-299
**Function:** `self.aiService.callAi(request)`
- Returns `response.content` as `result`
- NOTE: Do NOT update `lastRawResponse` yet! (only after successful merge)
### Step 4: MERGE
**Location:** `subAiCallLooping.py` lines 338-396
**Function:** `JsonResponseHandler.mergeJsonStringsWithOverlap()` from `modules/services/serviceAi/subJsonResponseHandling.py`
```
IF first iteration (jsonBase is None):
    → candidateJson = result
ELSE:
    → mergedJsonString, hasOverlap = mergeJsonStringsWithOverlap(jsonBase, result)
    IF hasOverlap = False (MERGE FAILED):
        → mergeFailCount++
        → If mergeFailCount >= 3: return lastValidCompletePart (fallback)
        → Else: continue (retry with unchanged jsonBase AND lastRawResponse!)
    ELSE:
        → candidateJson = mergedJsonString (don't update jsonBase yet!)

→ lastRawResponse = candidateJson (ONLY after first iteration or successful merge!)

TRY DIRECT PARSE of candidateJson:
    IF parse succeeds:
        → jsonBase = candidateJson (commit)
        → FINISHED! Return normalized result
    ELSE:
        → Proceed to Step 5
```
### Step 5: GET CONTEXTS
**Location:** `subAiCallLooping.py` lines 420-427
**Function:** `getContexts()` from `modules/shared/jsonContinuation.py`
```python
contexts = getContexts(candidateJson)
```
Returns `JsonContinuationContexts`:
- `overlapContext`: `""` if JSON is complete (no cut point)
- `hierarchyContext`: CUT JSON (for merging with next fragment)
- `hierarchyContextForPrompt`: CUT JSON with budget limits (for prompts)
- `completePart`: CLOSED JSON (repaired if needed)
- `jsonParsingSuccess`: `True` if completePart is valid JSON
**Enhancement:** If original JSON is already complete → `overlapContext = ""`
This signals "JSON is complete, no more continuation needed"
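The shape of the returned object and the CUT/CLOSED distinction can be sketched like this. Everything here is an assumption for illustration: `ContextsSketch` only mirrors the field names listed above, and `closeOpenStructures`, `getContextsSketch`, `overlapChars`, and `promptBudget` are hypothetical. The toy repair only closes open strings, objects, and arrays; the real `getContexts()` may repair far more.

```python
import json
from dataclasses import dataclass


@dataclass
class ContextsSketch:
    """Hypothetical stand-in mirroring the fields of JsonContinuationContexts."""
    overlapContext: str
    hierarchyContext: str
    hierarchyContextForPrompt: str
    completePart: str
    jsonParsingSuccess: bool


def closeOpenStructures(cutJson: str) -> str:
    """Toy repair: close an open string, then any open objects/arrays."""
    stack = []
    inString = False
    escaped = False
    for ch in cutJson:
        if inString:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                inString = False
        elif ch == '"':
            inString = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    closed = cutJson + ('"' if inString else "")
    return closed + "".join(reversed(stack))


def isValidJson(text: str) -> bool:
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def getContextsSketch(candidateJson: str, overlapChars: int = 20,
                      promptBudget: int = 200) -> ContextsSketch:
    isComplete = isValidJson(candidateJson)
    completePart = candidateJson if isComplete else closeOpenStructures(candidateJson)
    return ContextsSketch(
        # Empty overlap signals "already complete, no continuation needed".
        overlapContext="" if isComplete else candidateJson[-overlapChars:],
        hierarchyContext=candidateJson,                          # CUT version
        hierarchyContextForPrompt=candidateJson[-promptBudget:], # budget-limited
        completePart=completePart,                               # CLOSED version
        jsonParsingSuccess=isValidJson(completePart),
    )


contexts = getContextsSketch('{"items": [{"id": 1}, {"id": 2, "name": "be')
```

In this example `contexts.completePart` parses while `contexts.hierarchyContext` still ends at the cut point, which is the combination Step 6 treats as "continue".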
### Step 6: DECIDE
**Location:** `subAiCallLooping.py` lines 429-528
#### Case A: `jsonParsingSuccess=true` AND `overlapContext=""`
**→ FINISHED**
- JSON is complete (no cut point)
- `jsonBase = contexts.completePart` (use CLOSED version for final result)
- Return `completePart` as result
#### Case B: `jsonParsingSuccess=true` AND `overlapContext!=""`
**→ CONTINUE to next iteration**
- JSON parseable but has cut point
- `jsonBase = contexts.hierarchyContext` (**CUT version for next merge!**)
- `lastValidCompletePart = contexts.completePart` (**CLOSED version for fallback**)
- Store contexts for next prompt
- `mergeFailCount = 0` (reset on success)
- `lastRawResponse = jsonBase`
- Continue to next iteration
#### Case C: `jsonParsingSuccess=false`
**→ RETRY with same prompt**
- Do NOT update `jsonBase` (keep previous valid state)
- `mergeFailCount++`
- If `mergeFailCount >= 3`: return `lastValidCompletePart` (fallback)
- Else: continue (retry with unchanged jsonBase/lastRawResponse)
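The three cases can be sketched as a single decision function. This is an illustrative sketch, not the code in `subAiCallLooping.py`: `LoopState`, `decide`, `MAX_MERGE_FAILS`, and the action strings are hypothetical names, while the field names follow the Key Variables table.

```python
from dataclasses import dataclass
from typing import Optional

MAX_MERGE_FAILS = 3  # assumed constant name; the limit of 3 comes from this document


@dataclass
class LoopState:
    """Hypothetical container for the loop variables."""
    jsonBase: Optional[str] = None
    lastValidCompletePart: Optional[str] = None
    lastRawResponse: Optional[str] = None
    mergeFailCount: int = 0


def decide(state: LoopState, overlapContext: str, hierarchyContext: str,
           completePart: str, jsonParsingSuccess: bool) -> tuple[str, Optional[str]]:
    """Return (action, result); action is 'finished', 'continue', 'retry' or 'fallback'."""
    if jsonParsingSuccess and overlapContext == "":
        # Case A: complete JSON, commit the CLOSED version and stop.
        state.jsonBase = completePart
        return "finished", completePart
    if jsonParsingSuccess:
        # Case B: parseable but cut, keep the CUT version for the next merge.
        state.jsonBase = hierarchyContext
        state.lastValidCompletePart = completePart
        state.mergeFailCount = 0
        state.lastRawResponse = state.jsonBase
        return "continue", None
    # Case C: not parseable, leave jsonBase untouched and count the failure.
    state.mergeFailCount += 1
    if state.mergeFailCount >= MAX_MERGE_FAILS:
        return "fallback", state.lastValidCompletePart
    return "retry", None


state = LoopState()
firstAction, _ = decide(state, '["be', '{"a": ["be', '{"a": ["be"]}', True)
actions = [decide(state, "", "", "", False) for _ in range(3)]
```

In the usage above, one successful continuation followed by three parse failures ends with the fallback returning the last CLOSED JSON, and `jsonBase` still holds the CUT version from the successful iteration.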
---
## Flow Diagram
```
┌───────────────────────────────────────────────────────────────────┐
│                          ITERATION START                          │
└─────────────────────────────────┬─────────────────────────────────┘
┌─────────────────────────────────▼─────────────────────────────────┐
│ STEP 1: BUILD PROMPT                                              │
│ - First: original prompt                                          │
│ - Next: buildContinuationContext(lastRawResponse)                 │
└─────────────────────────────────┬─────────────────────────────────┘
┌─────────────────────────────────▼─────────────────────────────────┐
│ STEP 2: CALL AI → result                                          │
└─────────────────────────────────┬─────────────────────────────────┘
┌─────────────────────────────────▼─────────────────────────────────┐
│ STEP 4: MERGE jsonBase + result → candidateJson                   │
└─────────────────────────────────┬─────────────────────────────────┘
                     ┌────────────▼────────────┐
                     │        Merge OK?        │
                     └────────────┬────────────┘
        ┌─────────────────────────┤
        │ NO                      │ YES
        ▼                         ▼
┌──────────────┐        ┌──────────────────┐
│ fails++      │        │ TRY DIRECT PARSE │
│ if >=3:      │        │ of candidateJson │
│   RETURN     │        └─────────┬────────┘
│   fallback   │                  │
│ else: RETRY  │        ┌─────────▼────────┐
│ (continue)   │        │    Parse OK?     │
└──────────────┘        └─────────┬────────┘
        ┌─────────────────────────┤
        │ YES                     │ NO
        ▼                         ▼
┌──────────────┐  ┌───────────────────────────────┐
│ FINISHED ✓   │  │ STEP 5: getContexts()         │
│ Return       │  │   → jsonParsingSuccess        │
│ normalized   │  │   → overlapContext            │
│ result       │  └───────────────┬───────────────┘
└──────────────┘                  │
                  ┌───────────────▼───────────────┐
                  │        STEP 6: DECIDE         │
                  └───────────────┬───────────────┘
          ┌───────────────────────┼───────────────────────┐
          │                       │                       │
          ▼                       ▼                       ▼
┌───────────────────┐ ┌───────────────────────┐ ┌───────────────────┐
│ success=true      │ │ success=true          │ │ success=false     │
│ overlap=""        │ │ overlap!=""           │ │                   │
│ ─────────────     │ │ ─────────────────     │ │ ─────────────     │
│ FINISHED ✓        │ │ CONTINUE              │ │ RETRY             │
│                   │ │                       │ │                   │
│ jsonBase =        │ │ jsonBase =            │ │ jsonBase unchanged│
│ completePart      │ │ hierarchyContext      │ │ fails++           │
│ (CLOSED)          │ │ (CUT for merge!)      │ │                   │
│                   │ │                       │ │ if >=3: fallback  │
│ Return result     │ │ fallback =            │ │ else: retry       │
│                   │ │ completePart          │ │                   │
│                   │ │ (CLOSED)              │ │                   │
│                   │ │                       │ │                   │
│                   │ │ Next iteration →      │ │                   │
└───────────────────┘ └───────────────────────┘ └───────────────────┘
```
---
## Files Involved
| File | Purpose |
|------|---------|
| `modules/services/serviceAi/subAiCallLooping.py` | Main iteration loop |
| `modules/shared/jsonContinuation.py` | `getContexts()` - context extraction & repair |
| `modules/shared/jsonUtils.py` | `buildContinuationContext()` - prompt building |
| `modules/services/serviceAi/subJsonResponseHandling.py` | `mergeJsonStringsWithOverlap()` |
| `modules/services/serviceAi/subJsonMerger.py` | `ModularJsonMerger` - actual merge logic |
| `modules/datamodels/datamodelAi.py` | `JsonContinuationContexts` model |
---
## Error Handling
### Merge Failures
- Max 3 consecutive failures allowed
- On failure: retry with unchanged `jsonBase` (previous valid state)
- After 3 failures: return `lastValidCompletePart` as fallback
### Parse Failures
- If `getContexts()` cannot produce valid JSON: increment fail counter
- Retry with same prompt (don't update jsonBase)
- After 3 failures: return `lastValidCompletePart` as fallback
### Fallback Strategy
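- `lastValidCompletePart` stores the last successfully parsed CLOSED JSON
- It is set on each successful continuation (Case B), so it stays `None` until at least one iteration has produced parseable JSON
- Once set, it lets the loop return valid JSON even after repeated merge or parse failures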