gateway/modules/services/serviceGeneration/paths/ARCHITECTURE_CHANGES.md

# Architecture Changes Summary
## Problem Identified
AI extraction previously ran in two places:
1. **`extractAndPrepareContent`**: Vision AI for images, AI processing for text with extractionPrompt
2. **Section generation**: AI aggregation of contentParts
This design was:
- Redundant (content was processed by AI twice)
- Inconsistent (pre-extracted JSON bypassed AI processing, while regular documents went through it)
- Contrary to the target architecture (documents should become contentParts, just like pre-extracted JSON)
## Solution Implemented
### 1. Removed AI Extraction from `extractAndPrepareContent`
**File**: `gateway/modules/services/serviceAi/subContentExtraction.py`
**Changes**:
- **Removed**: Vision AI extraction for images (lines 186-246)
- **Removed**: AI text processing with extractionPrompt (lines 260-334)
- **Updated**: Images with extract intent are now marked with the `needsVisionExtraction=True` flag instead of being sent to Vision AI
- **Updated**: Regular documents mark their images with `needsVisionExtraction=True` when extract intent is present
**Result**: Documents → contentParts (raw extraction only, no AI)
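The flagging behaviour described above can be sketched as follows. This is a minimal illustration, not the code in `subContentExtraction.py`: the `ContentPart` dataclass and the `prepare_part` helper are hypothetical, and where the real code stores `needsVisionExtraction` (attribute or metadata entry) is an assumption here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentPart:
    # Hypothetical stand-in for the real contentPart structure.
    typeGroup: str                      # e.g. "image" or "text"
    data: str                           # raw payload (text, or an image reference)
    contentFormat: str = "extracted"
    intent: Optional[str] = None        # "extract" or "render"
    needsVisionExtraction: bool = False


def prepare_part(type_group: str, data: str, intent: Optional[str] = None) -> ContentPart:
    """Build a raw contentPart without calling any AI model."""
    part = ContentPart(typeGroup=type_group, data=data, intent=intent)
    if type_group == "image":
        if intent == "extract":
            # Defer the Vision AI call to section generation.
            part.contentFormat = "extracted"
            part.needsVisionExtraction = True
        elif intent == "render":
            # Rendered directly later; never sent to Vision AI.
            part.contentFormat = "object"
    return part
```

The point of the sketch is that this stage only records what should happen later; no model call appears anywhere in it.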
### 2. Added Vision AI Extraction in Section Generation
**File**: `gateway/modules/services/serviceAi/subStructureFilling.py`
**Changes**:
- **Added**: Vision AI extraction logic before aggregation (lines 553-610)
- **Added**: Vision AI extraction logic for single-part processing (lines 1074-1115)
- **Logic**:
  - Checks if `part.typeGroup == "image"` AND `needsVisionExtraction == True` AND `intent == "extract"`
  - Extracts text using Vision AI (`IMAGE_ANALYSE` operation)
  - Replaces image part with text part for further processing
  - Images with `contentFormat == "object"` (render intent) are rendered directly (no extraction)
**Result**: AI extraction happens ONLY during section generation
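A rough sketch of the check-and-replace step, assuming a simplified `ContentPart` type and a caller-supplied `vision_extract` callable standing in for the `IMAGE_ANALYSE` operation (both are hypothetical; the actual code in `subStructureFilling.py` may be structured differently):

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class ContentPart:
    # Hypothetical stand-in for the real contentPart structure.
    typeGroup: str
    data: str
    contentFormat: str = "extracted"
    intent: Optional[str] = None
    needsVisionExtraction: bool = False


def needs_vision(part: ContentPart) -> bool:
    """All three conditions from the logic above must hold."""
    return (
        part.typeGroup == "image"
        and part.needsVisionExtraction
        and part.intent == "extract"
    )


def resolve_vision_parts(
    parts: List[ContentPart],
    vision_extract: Callable[[ContentPart], str],
) -> List[ContentPart]:
    """Replace flagged image parts with text parts; pass everything else through."""
    resolved = []
    for part in parts:
        if needs_vision(part):
            text = vision_extract(part)  # assumed wrapper around the Vision AI call
            resolved.append(
                replace(part, typeGroup="text", data=text, needsVisionExtraction=False)
            )
        else:
            # Render-intent images and text parts are left untouched.
            resolved.append(part)
    return resolved
```

Passing the extractor in as a callable keeps the sketch testable without a model: a test can supply a lambda that returns canned text.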
## Architecture Flow (After Changes)
### Document Input → ContentParts
1. **Regular documents**: `extractContent()` (NON-AI) → Raw contentParts
   - Images with extract intent: `contentFormat="extracted"`, `needsVisionExtraction=True`
   - Images with render intent: `contentFormat="object"` (rendered directly)
   - Text: `contentFormat="extracted"` (raw text, no AI processing)
2. **Pre-extracted JSON**: Direct contentParts (no changes)
### Section Generation → AI Processing
1. **Images with extract intent**: Vision AI extraction → Text part → AI aggregation
2. **Images with render intent**: Rendered directly (no extraction)
3. **Text contentParts**: AI aggregation with extractionPrompt (if provided)
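To illustrate the "if provided" behaviour of `extractionPrompt` in step 3, here is a small sketch of how an aggregation prompt might be assembled. The function name, parameters, and prompt layout are invented for illustration and are not taken from the codebase.

```python
from typing import List, Optional


def build_aggregation_prompt(
    section_title: str,
    texts: List[str],
    extraction_prompt: Optional[str] = None,
) -> str:
    """Assemble one prompt from the text parts of a section (illustrative layout)."""
    lines = [f"Section: {section_title}"]
    if extraction_prompt:
        # Only included when the caller supplied an extractionPrompt.
        lines.append(f"Instructions: {extraction_prompt}")
    for index, text in enumerate(texts, start=1):
        lines.append(f"[part {index}] {text}")
    return "\n".join(lines)
```

Text produced by Vision AI for extract-intent images would enter `texts` alongside ordinary text parts, which is what makes the aggregation step the single place where AI output is combined.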
## Key Benefits
1. **Consistent Architecture**: Documents = raw contentParts (like pre-extracted JSON)
2. **Single Point of AI Processing**: Only in section generation
3. **Clear Separation**: Extraction vs Generation
4. **Intent-Based Logic**:
   - `intent == "extract"` → Vision AI extraction during section generation
   - `intent == "render"` → Direct rendering (no extraction)
   - `contentFormat == "object"` → Embedded/referenced images (no extraction)
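The intent-based rules can be summarised as a single routing decision. The sketch below is an assumption-laden illustration: `Action` and `route_part` are hypothetical names, and raising an error for image combinations this document does not describe is a choice made here, not documented behaviour.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    VISION_EXTRACT = "vision_extract"  # Vision AI, then aggregation
    RENDER = "render"                  # embed or reference the image directly
    AGGREGATE = "aggregate"            # AI aggregation of text


def route_part(
    type_group: str,
    content_format: str,
    intent: Optional[str],
    needs_vision_extraction: bool = False,
) -> Action:
    """Decide how section generation should treat one contentPart."""
    if type_group != "image":
        return Action.AGGREGATE
    if content_format == "object" or intent == "render":
        return Action.RENDER
    if intent == "extract" and needs_vision_extraction:
        return Action.VISION_EXTRACT
    # Assumption: other image combinations are not covered by the rules above.
    raise ValueError(
        f"unhandled image part: contentFormat={content_format!r}, intent={intent!r}"
    )
```

For example, `route_part("image", "extracted", "extract", True)` yields `Action.VISION_EXTRACT`, while `route_part("image", "object", "render")` yields `Action.RENDER`.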
## Testing Checklist
- [ ] Regular documents create contentParts without AI extraction
- [ ] Images with extract intent are marked with `needsVisionExtraction=True`
- [ ] Images with render intent are marked with `contentFormat="object"`
- [ ] Section generation extracts text from images with Vision AI when `needsVisionExtraction=True`
- [ ] Section generation renders images with `contentFormat="object"` directly
- [ ] Text contentParts are processed with AI during section generation
- [ ] Pre-extracted JSON flow still works correctly