# Document Generation Architecture Analysis

## Current Flow

### 1. Document Input → ContentParts (`extractAndPrepareContent`)

**Location**: `gateway/modules/services/serviceAi/subContentExtraction.py`

**Flow**:
- Regular documents → Calls `extractContent()` (NON-AI extraction) → Creates contentParts with raw extracted text
- **However**, some of these contentParts are then AI-processed in the same step:
  - Images with "extract" intent → Calls Vision AI (line 190) → AI extraction
  - Text with "extract" intent + extractionPrompt → Calls AI processing (line 265) → AI extraction
- Pre-extracted JSON → Uses contentParts directly (no AI)

**Result**: ContentParts may already be AI-processed before structure generation
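
The branching above can be summarized as a small decision function. This is only a sketch of the behavior described in the bullets, not the actual code in `subContentExtraction.py`: the `ContentPart` shape, the `kind` and `preExtracted` fields, and the function name `currentProcessingPath` are hypothetical and simplified for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentPart:
    # Hypothetical, simplified shape; the real contentPart likely has more fields.
    kind: str                               # "image", "text", "table", ...
    intent: str = "render"                  # "extract" triggers AI in the current flow
    extractionPrompt: Optional[str] = None
    preExtracted: bool = False              # True for parts loaded from pre-extracted JSON


def currentProcessingPath(part: ContentPart) -> str:
    """Return the path a part takes in the current flow, per the bullets above."""
    if part.preExtracted:
        return "direct"        # used as-is, no AI
    if part.intent == "extract" and part.kind == "image":
        return "vision_ai"     # AI extraction before structure generation
    if part.intent == "extract" and part.kind == "text" and part.extractionPrompt:
        return "text_ai"       # AI extraction before structure generation
    return "raw"               # non-AI extraction only
```

Two of the four paths involve AI before structure generation, which is why downstream steps cannot assume contentParts are raw.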

### 2. Structure Generation

**Location**: `gateway/modules/services/serviceAi/subStructureGeneration.py`

**Flow**:
- Uses contentParts (may already be AI-processed)
- Generates document structure (chapters, sections)

### 3. Section Generation (`_processSingleSection`)

**Location**: `gateway/modules/services/serviceAi/subStructureFilling.py`

**Flow**:
- Uses contentParts (which may already be AI-processed)
- Aggregates "extracted" contentParts with AI (lines 554-682)
- Generates section content using `callAiWithLooping` with `useCaseId="section_content"`

## Issues Identified

### Issue 1: Duplicate AI Processing
- AI extraction happens in `extractAndPrepareContent` (for images/text)
- AI generation happens again in section generation
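- Content that was already AI-extracted in the first step is processed by AI again, so the same material can incur two rounds of AI calls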

### Issue 2: Architecture Inconsistency
- Pre-extracted JSON files → contentParts directly (no AI)
- Regular documents → contentParts plus AI extraction for some parts (inconsistent with the pre-extracted JSON flow)
- User wants: Documents → contentParts (like pre-extracted JSON) → AI only in section generation

### Issue 3: Image Processing
- Images need Vision AI to extract text
- Currently happens in `extractAndPrepareContent`
- Question: Should this happen during section generation instead?

## Proposed Architecture

### Option A: Remove All AI from `extractAndPrepareContent`
- Documents → `extractContent()` → Raw contentParts (text, tables, etc.)
- Images → Keep as image contentParts (no Vision AI extraction)
- Section generation → Handle images with Vision AI when needed

**Pros**:
- Consistent with pre-extracted JSON flow
- Single point of AI processing (section generation)
- Clear separation of concerns

**Cons**:
- Images won't have extracted text until section generation
- May need to handle images differently in section generation
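
A minimal sketch of what the preparation step could look like under Option A, assuming a simplified `ContentPart` shape. The names `prepareContentParts`, `kind`, `data`, and `aiProcessed` are hypothetical; `rawItems` stands in for whatever `extractContent()` or a pre-extracted JSON file actually provides.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ContentPart:
    # Hypothetical, simplified shape used only for this sketch.
    kind: str                  # "text", "table", "image", ...
    data: Any                  # raw text, table rows, or image bytes
    aiProcessed: bool = False  # stays False throughout preparation


def prepareContentParts(rawItems: List[Dict[str, Any]]) -> List[ContentPart]:
    """Wrap already-extracted items as contentParts without calling any model.

    rawItems stands in for either the output of extractContent() or the
    entries of a pre-extracted JSON file; both take the same path here.
    """
    parts = []
    for item in rawItems:
        # Images are kept as binary image parts; no Vision AI call is made.
        parts.append(ContentPart(kind=item["kind"], data=item["data"]))
    return parts


rawItems = [
    {"kind": "text", "data": "Quarterly revenue rose."},
    {"kind": "image", "data": b"\x89PNG..."},
]
parts = prepareContentParts(rawItems)
```

The point of the sketch is that regular documents and pre-extracted JSON produce the same kind of output, and nothing in this step depends on an AI service.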

### Option B: Keep Vision AI for Images Only
- Documents → `extractContent()` → Raw contentParts
- Images → Vision AI extraction → Text contentParts
- Section generation → Uses text contentParts (no additional AI extraction)

**Pros**:
- Images get text extracted early
- Section generation can use text directly

**Cons**:
- Still has AI extraction before structure generation
- Inconsistent with the user's request (AI only in section generation)

## Recommendation

**Follow Option A** - Remove all AI extraction from `extractAndPrepareContent`:

1. **Documents → ContentParts** (like pre-extracted JSON):
   - Call `extractContent()` (NON-AI)
   - Create contentParts with raw extracted content
   - Images remain as image contentParts (no Vision AI)

2. **Section Generation**:
   - Handle images with Vision AI when needed
   - Aggregate all contentParts with AI
   - Single point of AI processing

**Benefits**:
- Clear architecture: Documents = raw contentParts
- Consistent with pre-extracted JSON flow
- AI processing only where needed (section generation)
- Easier to understand and maintain
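
To illustrate the recommendation, the following sketch shows section generation as the single place where images are turned into text. It assumes a simplified `ContentPart` and a hypothetical `describeImage` callable that wraps the Vision AI call; `buildSectionContext` and `fakeDescribeImage` are illustrative names, not existing functions in `subStructureFilling.py`.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class ContentPart:
    # Hypothetical, simplified shape used only for this sketch.
    kind: str                  # "text" or "image"
    data: Union[str, bytes]


def buildSectionContext(
    parts: List[ContentPart],
    describeImage: Callable[[bytes], str],
) -> str:
    """Assemble the text context for one section.

    Image parts are converted to text here, at the point of use,
    instead of during content preparation.
    """
    blocks = []
    for part in parts:
        if part.kind == "image":
            blocks.append(describeImage(part.data))  # Vision AI would be called here
        else:
            blocks.append(part.data)
    return "\n\n".join(blocks)


def fakeDescribeImage(imageBytes: bytes) -> str:
    # Stand-in for a Vision AI call so the sketch runs without a model.
    return f"[image, {len(imageBytes)} bytes]"


context = buildSectionContext(
    [ContentPart("text", "Intro paragraph."), ContentPart("image", b"\x00\x01\x02")],
    fakeDescribeImage,
)
```

In the real flow, the assembled context would presumably feed the existing `callAiWithLooping` call with `useCaseId="section_content"`. Injecting the image handler as a parameter keeps the section logic testable without a model.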

## Questions to Resolve

1. **Image handling**: How should images be processed during section generation?
   - Option 1: Vision AI extraction happens automatically when image contentParts are used
   - Option 2: Images are passed to AI with Vision models during section generation
   - Option 3: Images remain as binary and are rendered directly (no text extraction)

2. **Text with extractionPrompt**: Should text contentParts with extractionPrompt be processed differently?
   - Currently: AI processing in `extractAndPrepareContent`
   - Proposed: Raw text → AI processing during section generation

3. **Performance**: Will deferring image extraction to section generation cause performance issues?
   - Needs testing with documents that contain multiple images
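
One possible mitigation for the performance question, if image extraction is deferred, is to cache extraction results by image content so that an image referenced by several sections is sent to Vision AI only once. The sketch below is an assumption about how this could be done, not an existing component; `CachedImageExtractor` and `fakeVisionExtract` are hypothetical names.

```python
import hashlib
from typing import Callable, Dict


class CachedImageExtractor:
    """Wrap an image-to-text callable and reuse results for identical bytes."""

    def __init__(self, extract: Callable[[bytes], str]) -> None:
        self._extract = extract
        self._cache: Dict[str, str] = {}
        self.calls = 0  # number of real (uncached) extractions performed

    def __call__(self, imageBytes: bytes) -> str:
        # Key on a hash of the content so the same image is recognized
        # regardless of which section or contentPart it arrives from.
        key = hashlib.sha256(imageBytes).hexdigest()
        if key not in self._cache:
            self.calls += 1
            self._cache[key] = self._extract(imageBytes)
        return self._cache[key]


def fakeVisionExtract(imageBytes: bytes) -> str:
    # Stand-in for a Vision AI call so the sketch runs without a model.
    return f"text from {len(imageBytes)}-byte image"


extractor = CachedImageExtractor(fakeVisionExtract)
first = extractor(b"logo-bytes")
second = extractor(b"logo-bytes")   # served from the cache
third = extractor(b"chart-bytes!")  # different image, real extraction
```

The `calls` counter also gives a simple way to measure how many Vision AI requests a multi-image test document actually triggers.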