# Architecture Changes Summary

## Problem Identified

The architecture performed AI extraction in two places:
1. **`extractAndPrepareContent`**: Vision AI for images, plus AI processing for text with `extractionPrompt`
2. **Section generation**: AI aggregation of contentParts

This was:
- Redundant (double AI processing)
- Inconsistent (pre-extracted JSON bypassed AI, while regular documents went through it)
- Against the desired architecture (documents should become contentParts, like pre-extracted JSON)

## Solution Implemented

### 1. Removed AI Extraction from `extractAndPrepareContent`

**File**: `gateway/modules/services/serviceAi/subContentExtraction.py`

**Changes**:
- **Removed**: Vision AI extraction for images (lines 186-246)
- **Removed**: AI text processing with `extractionPrompt` (lines 260-334)
- **Updated**: Images with extract intent are now marked with the `needsVisionExtraction=True` flag
- **Updated**: Regular documents mark images with `needsVisionExtraction=True` when extract intent is present

**Result**: Documents → contentParts (raw extraction only, no AI)

### 2. Added Vision AI Extraction in Section Generation

**File**: `gateway/modules/services/serviceAi/subStructureFilling.py`

**Changes**:
- **Added**: Vision AI extraction logic before aggregation (lines 553-610)
- **Added**: Vision AI extraction logic for single-part processing (lines 1074-1115)
- **Logic**:
  - Checks whether `part.typeGroup == "image"` AND `needsVisionExtraction == True` AND `intent == "extract"`
  - Extracts text using Vision AI (`IMAGE_ANALYSE` operation)
  - Replaces the image part with a text part for further processing
  - Images with `contentFormat == "object"` (render intent) are rendered directly (no extraction)

**Result**: AI extraction happens ONLY during section generation

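
The check-and-replace logic described in this section can be sketched roughly as follows. This is a minimal illustration, not the code in `subStructureFilling.py`: the `ContentPart` dataclass, the helper names `needs_vision` and `resolve_image_parts`, and the injected `vision_extract` callable (standing in for the `IMAGE_ANALYSE` operation) are all assumptions. Only the field names and the three conditions come from this summary.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass
class ContentPart:
    """Hypothetical stand-in for the real contentPart structure."""
    typeGroup: str                      # e.g. "image" or "text"
    contentFormat: str                  # "extracted" or "object"
    data: str                           # raw text, or a reference to the image
    intent: Optional[str] = None        # "extract" or "render"
    needsVisionExtraction: bool = False


def needs_vision(part: ContentPart) -> bool:
    """True only when all three documented conditions hold."""
    return (
        part.typeGroup == "image"
        and part.needsVisionExtraction
        and part.intent == "extract"
    )


def resolve_image_parts(
    parts: List[ContentPart],
    vision_extract: Callable[[ContentPart], str],
) -> List[ContentPart]:
    """Swap flagged image parts for text parts; pass all other parts through."""
    resolved = []
    for part in parts:
        if needs_vision(part):
            text = vision_extract(part)  # placeholder for the Vision AI call
            # The image part becomes a text part and the flag is cleared.
            part = replace(
                part, typeGroup="text", data=text, needsVisionExtraction=False
            )
        resolved.append(part)
    return resolved
```

Passing the Vision AI call in as a callable keeps the sketch testable without a real AI service; the actual module presumably calls its AI service directly.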
## Architecture Flow (After Changes)

### Document Input → ContentParts
1. **Regular documents**: `extractContent()` (NON-AI) → Raw contentParts
   - Images with extract intent: `contentFormat="extracted"`, `needsVisionExtraction=True`
   - Images with render intent: `contentFormat="object"` (rendered directly)
   - Text: `contentFormat="extracted"` (raw text, no AI processing)

2. **Pre-extracted JSON**: Direct contentParts (no changes)

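
To make the raw extraction step concrete, here is a small sketch of how parts might be built without any AI call. The `ContentPart` dataclass and the factory functions `make_image_part` and `make_text_part` are hypothetical names introduced for illustration; the `contentFormat` values and the `needsVisionExtraction` flag are the ones listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentPart:
    """Hypothetical stand-in for the real contentPart structure."""
    typeGroup: str
    contentFormat: str
    data: str
    intent: Optional[str] = None
    needsVisionExtraction: bool = False


def make_image_part(data: str, intent: str) -> ContentPart:
    """Build a raw image part; no AI service is called here."""
    if intent == "extract":
        # Defer Vision AI: only flag the part so section generation handles it.
        return ContentPart(
            "image", "extracted", data, intent, needsVisionExtraction=True
        )
    # Render intent: keep the image as an object so it is rendered directly.
    return ContentPart("image", "object", data, intent)


def make_text_part(text: str) -> ContentPart:
    """Raw text passes through unchanged (no AI processing)."""
    return ContentPart("text", "extracted", text)
```

The point of the sketch is that the flag records a pending decision instead of performing the extraction, which is what keeps this stage free of AI calls.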
### Section Generation → AI Processing
1. **Images with extract intent**: Vision AI extraction → Text part → AI aggregation
2. **Images with render intent**: Rendered directly (no extraction)
3. **Text contentParts**: AI aggregation with `extractionPrompt` (if provided)

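
The three routes above amount to a small dispatch on part type and intent. The sketch below expresses that routing as a function returning step labels. `plan_steps` and the label strings are illustrative assumptions, as is the choice to raise an error for image parts that match neither documented route.

```python
from types import SimpleNamespace
from typing import List


def plan_steps(part) -> List[str]:
    """Return the ordered processing steps for one contentPart."""
    if part.typeGroup == "image":
        if part.contentFormat == "object":
            # Render intent: the image is embedded as-is.
            return ["render"]
        if part.intent == "extract" and part.needsVisionExtraction:
            # Vision AI produces a text part, which is then aggregated.
            return ["vision_extraction", "ai_aggregation"]
        # Assumption: any other image state is not covered by this summary.
        raise ValueError("image part matches no documented route")
    # Text parts go straight to AI aggregation.
    return ["ai_aggregation"]


# Usage with throwaway objects carrying the documented field names.
flagged_image = SimpleNamespace(
    typeGroup="image", contentFormat="extracted",
    intent="extract", needsVisionExtraction=True,
)
rendered_image = SimpleNamespace(
    typeGroup="image", contentFormat="object",
    intent="render", needsVisionExtraction=False,
)
text_part = SimpleNamespace(
    typeGroup="text", contentFormat="extracted",
    intent=None, needsVisionExtraction=False,
)
for part in (flagged_image, rendered_image, text_part):
    print(part.typeGroup, part.intent, plan_steps(part))
```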
## Key Benefits

1. **Consistent Architecture**: Documents = raw contentParts (like pre-extracted JSON)
2. **Single Point of AI Processing**: Only in section generation
3. **Clear Separation**: Extraction vs Generation
4. **Intent-Based Logic**:
   - `intent == "extract"` → Vision AI extraction during section generation
   - `intent == "render"` → Direct rendering (no extraction)
   - `contentFormat == "object"` → Embedded/referenced images (no extraction)

## Testing Checklist

- [ ] Regular documents create contentParts without AI extraction
- [ ] Images with extract intent are marked with `needsVisionExtraction=True`
- [ ] Images with render intent are marked with `contentFormat="object"`
- [ ] Section generation extracts images with Vision AI when needed
- [ ] Section generation renders images with object format directly
- [ ] Text contentParts are processed with AI during section generation
- [ ] Pre-extracted JSON flow still works correctly