# JSON String Accumulation Implementation Plan
## Modules to Modify
### 1. ✅ `datamodelAi.py` - COMPLETED
- Added `JsonAccumulationState` class with `lastKpi` field
### 2. `subJsonResponseHandling.py` - NEW FUNCTIONS NEEDED
**Location:** `poweron/gateway/modules/services/serviceAi/subJsonResponseHandling.py`
**Functions to add:**
1. `cleanEncodingIssues(jsonString: str) -> str`
- Clean encoding issues from JSON string
- Generic, works for any JSON structure
2. `mergeJsonStringsWithOverlap(accumulated: str, newFragment: str) -> str`
- Merge two JSON strings with overlap detection
- Find longest common suffix/prefix
- Remove duplicates
- Generic string-based comparison
3. `isJsonComplete(parsedJson: Dict[str, Any]) -> bool`
- Check if parsed JSON is complete
- Recursive validation of all structures
- Generic, no content-type-specific logic
4. `finalizeJson(parsedJson: Dict[str, Any]) -> Dict[str, Any]`
- Add missing closing elements
- Repair corruption
- Generic recursive approach
5. `extractKpiFromResponse(aiResponse: str) -> Optional[int]`
- Extract percentage (0-100) from AI response
- Look for patterns: "45%", "45 percent", "45"
- Validate range
6. `validateKpiProgression(accumulationState: JsonAccumulationState, currentKpi: int) -> bool`
- Validate KPI progression, where increment = `currentKpi` minus `accumulationState.lastKpi`
- increment < 0 → False (went down)
- increment < 1 → False (no progress)
- increment >= 1 → True (progress)
7. `accumulateAndParseJsonFragments(accumulatedJsonString: str, newFragmentString: str, allSections: List[Dict], iteration: int) -> Tuple[str, List[Dict], bool, Optional[Dict]]`
- Main accumulation function
- Clean encoding, merge strings, parse, extract sections
- Return: (accumulatedString, sections, isComplete, parsedResult)
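The overlap merge described for `mergeJsonStringsWithOverlap` could be sketched as follows. This is a minimal illustration, assuming the overlap is an exact match between the end of the accumulated string and the start of the new fragment. The body is not taken from the codebase, and the real function may need a minimum overlap length or whitespace normalization.

```python
def mergeJsonStringsWithOverlap(accumulated: str, newFragment: str) -> str:
    """Append newFragment to accumulated, dropping text duplicated at the seam."""
    maxOverlap = min(len(accumulated), len(newFragment))
    # Try the longest candidate first so the largest duplicated span is removed.
    for size in range(maxOverlap, 0, -1):
        if accumulated.endswith(newFragment[:size]):
            return accumulated + newFragment[size:]
    # No shared suffix/prefix: fall back to plain concatenation.
    return accumulated + newFragment


merged = mergeJsonStringsWithOverlap('{"items": [1, 2, 3', '2, 3, 4]}')
print(merged)
```

Very short overlaps (a single quote or comma) can match by coincidence, so a production version would probably require a minimum overlap length before treating the match as a duplicate.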
### 3. `mainServiceAi.py` - MODIFY EXISTING FUNCTIONS
**Location:** `poweron/gateway/modules/services/serviceAi/mainServiceAi.py`
**Changes needed:**
1. **Import JsonAccumulationState:**
```python
from modules.datamodels.datamodelAi import JsonAccumulationState
```
2. **Modify `_extractSectionsFromResponse()`:**
- Add parameter: `accumulationState: Optional[JsonAccumulationState] = None`
- Change return type to include `Optional[JsonAccumulationState]`
- Add first iteration check:
  - Try to parse
  - If complete → return sections, True, parsed, None
  - If incomplete → create JsonAccumulationState, return [], False, None, state
- For subsequent iterations:
  - If accumulationState exists → call `accumulateAndParseJsonFragments()`
  - Update accumulationState object
  - Return updated state
3. **Modify iteration loop (around lines 200-350):**
- Add: `accumulationState = None` before loop
- Modify `_extractSectionsFromResponse()` call to pass and receive accumulationState
- After AI call, extract KPI from response:
```python
# Only check the KPI while fragments are being accumulated
if accumulationState and accumulationState.isAccumulationMode:
    currentKpi = JsonResponseHandler.extractKpiFromResponse(result)
    if currentKpi is not None:
        # Stop iterating when the reported percentage fails to advance
        if not JsonResponseHandler.validateKpiProgression(accumulationState, currentKpi):
            logger.warning(f"Iteration {iteration}: KPI validation failed, stopping")
            break
        accumulationState.lastKpi = currentKpi
```
- Update continuation context building to include KPI question
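The two KPI helpers called in the loop are still to be written. One possible shape is sketched below, under several assumptions: the `JsonAccumulationState` dataclass here is a stand-in that carries only the `lastKpi` field named in this plan, the regular expressions are illustrative, and treating a missing previous KPI as valid is a guess rather than a stated requirement.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class JsonAccumulationState:
    # Stand-in for the real class in datamodelAi.py; only lastKpi comes from the plan.
    lastKpi: Optional[int] = None


def extractKpiFromResponse(aiResponse: str) -> Optional[int]:
    """Return a percentage in 0-100 found in the response, or None."""
    # Prefer explicit forms ("45%", "45 percent") before falling back to a bare integer.
    patterns = [
        r"(?<!\d)(\d{1,3})\s*%",
        r"(?<!\d)(\d{1,3})\s*percent\b",
        r"\b(\d{1,3})\b",
    ]
    for pattern in patterns:
        match = re.search(pattern, aiResponse, re.IGNORECASE)
        if match:
            value = int(match.group(1))
            # Validate range: anything outside 0-100 is rejected.
            return value if 0 <= value <= 100 else None
    return None


def validateKpiProgression(accumulationState: JsonAccumulationState, currentKpi: int) -> bool:
    """True only when the KPI rose by at least 1 since the previous iteration."""
    if accumulationState.lastKpi is None:
        # Assumption: the first reported KPI has nothing to be compared against.
        return True
    increment = currentKpi - accumulationState.lastKpi
    return increment >= 1
```

The bare-integer fallback is fragile when the response also contains JSON with numeric values, so the real implementation may need to restrict the search to the part of the response that answers the KPI question.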
### 4. `jsonUtils.py` - MODIFY EXISTING FUNCTION
**Location:** `poweron/gateway/modules/shared/jsonUtils.py`
**Changes needed:**
1. **Modify `buildContinuationContext()`:**
- Change truncation from 100+100 to 10+10 items (lines 722-727)
- Add `kpiQuestion` to context dict:
```python
context["kpiQuestion"] = "Based on the delivered data so far, approximately what percentage (%) of the total required content has been delivered? Respond with an integer between 0-100."
```
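For illustration, the 10+10 truncation might look like the sketch below. It assumes that "10+10" means keeping the first 10 and the last 10 items of a list. `truncateHeadTail` is a hypothetical helper name, and the actual logic in `buildContinuationContext()` is not shown in this plan.

```python
from typing import Any, List


def truncateHeadTail(items: List[Any], head: int = 10, tail: int = 10) -> List[Any]:
    """Keep the first `head` and last `tail` items; return a copy if nothing needs cutting."""
    if len(items) <= head + tail:
        return list(items)
    # Slice from an explicit index so tail=0 yields an empty tail rather than the whole list.
    return list(items[:head]) + list(items[len(items) - tail:])


sample = truncateHeadTail(list(range(50)))
print(len(sample), sample[:3], sample[-3:])
```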
### 5. `subPromptBuilderGeneration.py` - MODIFY EXISTING FUNCTION
**Location:** `poweron/gateway/modules/services/serviceGeneration/subPromptBuilderGeneration.py`
**Changes needed:**
1. **Modify `buildGenerationPrompt()`:**
- Check for `kpiQuestion` in continuationContext
- Add KPI question to continuation prompt if present:
```python
if continuationContext and continuationContext.get("kpiQuestion"):
    continuationText += f"\n\n=== PROGRESS INDICATOR ===\n{continuationContext['kpiQuestion']}\n\n⚠ IMPORTANT:\n- If percentage goes DOWN in next iteration → Generation will stop (error detected)\n- If percentage doesn't increase by at least 1% → Generation will stop (no progress)\n- Only continue if percentage increases by 1% or more\n"
```
## Implementation Order
1. ✅ Add JsonAccumulationState to datamodelAi.py (DONE)
2. Add helper functions to subJsonResponseHandling.py
3. Add main function accumulateAndParseJsonFragments to subJsonResponseHandling.py
4. Modify _extractSectionsFromResponse in mainServiceAi.py
5. Modify iteration loop in mainServiceAi.py
6. Update buildContinuationContext in jsonUtils.py
7. Update buildGenerationPrompt in subPromptBuilderGeneration.py
## Testing Considerations
- Test complete JSON on first iteration (should NOT accumulate)
- Test incomplete JSON on first iteration (should start accumulation)
- Test string accumulation with overlaps
- Test encoding cleanup
- Test KPI extraction and validation
- Test repair failure handling (should continue with previous data)
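
The first two test cases could be expressed along these lines. This is a sketch only: `checkFirstIteration` is a hypothetical stand-in for the first-iteration branch of `_extractSectionsFromResponse()`, and it uses a successful `json.loads` as a proxy for completeness, whereas the plan calls for the recursive `isJsonComplete` check.

```python
import json
from typing import Any, Dict, Optional, Tuple


def checkFirstIteration(response: str) -> Tuple[bool, Optional[Dict[str, Any]]]:
    """Return (isComplete, parsed). Hypothetical helper, not part of the codebase."""
    try:
        parsed = json.loads(response)
    except json.JSONDecodeError:
        # Truncated or otherwise invalid JSON: the caller would start accumulation.
        return False, None
    if not isinstance(parsed, dict):
        return False, None
    return True, parsed


# Complete JSON on the first iteration: no accumulation needed.
assert checkFirstIteration('{"sections": []}') == (True, {"sections": []})
# Cut-off JSON on the first iteration: accumulation should start.
assert checkFirstIteration('{"sections": [{"id": 1}, {"id"') == (False, None)
```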