# Architecture Changes Summary

## Problem Identified

AI extraction was happening in two places:
1. **`extractAndPrepareContent`**: Vision AI for images, AI processing for text with extractionPrompt
2. **Section generation**: AI aggregation of contentParts

This design was:
- Redundant: the same content could be processed by AI twice
- Inconsistent: pre-extracted JSON skipped AI at this stage, while regular documents went through it
- Contrary to the desired architecture, in which documents become contentParts just like pre-extracted JSON

## Solution Implemented

### 1. Removed AI Extraction from `extractAndPrepareContent`

**File**: `gateway/modules/services/serviceAi/subContentExtraction.py`

**Changes**:
- **Removed**: Vision AI extraction for images (lines 186-246)
- **Removed**: AI text processing with extractionPrompt (lines 260-334)
- **Updated**: Images with extract intent are now marked with the `needsVisionExtraction=True` flag instead of being sent to Vision AI
- **Updated**: Regular documents mark images with `needsVisionExtraction=True` when extract intent is present

**Result**: Documents → contentParts (raw extraction only, no AI)
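
The marking step described above can be sketched roughly as follows. This is a minimal illustration, not the code in `subContentExtraction.py`: the `ContentPart` dataclass, its field layout, and the `markPart` helper are hypothetical stand-ins, and it assumes the flag and intent live directly on the part.

```python
from dataclasses import dataclass


@dataclass
class ContentPart:
    # Hypothetical stand-in for the real contentPart structure.
    typeGroup: str                      # e.g. "image" or "text"
    data: object                        # raw bytes or raw text
    intent: str = "extract"             # "extract" or "render"
    contentFormat: str = "extracted"
    needsVisionExtraction: bool = False


def markPart(part: ContentPart) -> ContentPart:
    """Tag a raw part according to its intent. No AI call is made here."""
    if part.typeGroup == "image":
        if part.intent == "render":
            # Render intent: keep the image as an object to be rendered directly.
            part.contentFormat = "object"
        elif part.intent == "extract":
            # Extract intent: defer Vision AI to section generation.
            part.contentFormat = "extracted"
            part.needsVisionExtraction = True
    # Text parts pass through unchanged as raw extracted text.
    return part
```

The point of the sketch is that extraction only records what should happen later; the image data itself is left untouched.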

### 2. Added Vision AI Extraction in Section Generation

**File**: `gateway/modules/services/serviceAi/subStructureFilling.py`

**Changes**:
- **Added**: Vision AI extraction logic before aggregation (lines 553-610)
- **Added**: Vision AI extraction logic for single-part processing (lines 1074-1115)
- **Logic**: 
  - Checks if `part.typeGroup == "image"` AND `needsVisionExtraction == True` AND `intent == "extract"`
  - Extracts text using Vision AI (`IMAGE_ANALYSE` operation)
  - Replaces image part with text part for further processing
  - Images with `contentFormat == "object"` (render intent) are rendered directly (no extraction)

**Result**: AI extraction happens ONLY during section generation
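
As a rough sketch of the check-and-replace logic listed above: the `ContentPart` dataclass, the `resolveImageParts` helper, and the injected `analyseImage` callable are all hypothetical names used for illustration. The callable stands in for the `IMAGE_ANALYSE` operation, whose real interface is not shown in this summary.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass
class ContentPart:
    # Hypothetical stand-in for the real contentPart structure.
    typeGroup: str
    data: object
    intent: str = "extract"
    contentFormat: str = "extracted"
    needsVisionExtraction: bool = False


def resolveImageParts(
    parts: List[ContentPart],
    analyseImage: Callable[[object], str],
) -> List[ContentPart]:
    """Replace flagged image parts with text parts before aggregation."""
    resolved = []
    for part in parts:
        shouldExtract = (
            part.typeGroup == "image"
            and part.needsVisionExtraction
            and part.intent == "extract"
        )
        if shouldExtract:
            # Swap the image part for a text part holding the extracted text.
            text = analyseImage(part.data)
            resolved.append(
                replace(part, typeGroup="text", data=text, needsVisionExtraction=False)
            )
        else:
            # Render-intent images and text parts are passed through unchanged.
            resolved.append(part)
    return resolved
```

Passing the Vision AI call in as a parameter keeps the sketch testable with a fake function; the actual module presumably calls the AI service directly.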

## Architecture Flow (After Changes)

### Document Input → ContentParts
1. **Regular documents**: `extractContent()` (NON-AI) → Raw contentParts
   - Images with extract intent: `contentFormat="extracted"`, `needsVisionExtraction=True`
   - Images with render intent: `contentFormat="object"` (rendered directly)
   - Text: `contentFormat="extracted"` (raw text, no AI processing)

2. **Pre-extracted JSON**: Direct contentParts (no changes)

### Section Generation → AI Processing
1. **Images with extract intent**: Vision AI extraction → Text part → AI aggregation
2. **Images with render intent**: Rendered directly (no extraction)
3. **Text contentParts**: AI aggregation with extractionPrompt (if provided)
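
The three cases above amount to a small routing table. The sketch below is illustrative only: `planSteps` and the step names are hypothetical, and the empty result for an image with an unrecognised intent is an assumption, since this summary does not describe that case.

```python
from typing import List


def planSteps(typeGroup: str, intent: str) -> List[str]:
    """Return the processing steps for one contentPart during section generation."""
    if typeGroup == "image":
        if intent == "render":
            return ["render"]                       # rendered directly, no extraction
        if intent == "extract":
            return ["vision_extract", "aggregate"]  # image -> text part -> AI aggregation
        return []                                   # assumption: unknown intent is skipped
    # Text parts go straight to AI aggregation (with extractionPrompt if provided).
    return ["aggregate"]
```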

## Key Benefits

1. **Consistent Architecture**: Documents = raw contentParts (like pre-extracted JSON)
2. **Single Point of AI Processing**: Only in section generation
3. **Clear Separation**: Extraction vs Generation
4. **Intent-Based Logic**: 
   - `intent == "extract"` → Vision AI extraction during section generation
   - `intent == "render"` → Direct rendering (no extraction)
   - `contentFormat == "object"` → Embedded/referenced images (no extraction)

## Testing Checklist

- [ ] Regular documents create contentParts without AI extraction
- [ ] Images with extract intent are marked with `needsVisionExtraction=True`
- [ ] Images with render intent are marked with `contentFormat="object"`
- [ ] Section generation extracts images with Vision AI when needed
- [ ] Section generation renders images with object format directly
- [ ] Text contentParts are processed with AI during section generation
- [ ] Pre-extracted JSON flow still works correctly