refactored ai service container (3000 lines) with submodules, and enhanced generation part with dynamic chapters

2025-12-26 00:16:14 +01:00 · 2025-12-26 00:16:14 +01:00 · 0eaaeb3550
commit 0eaaeb3550
parent b2b5761917
1 changed files with 881 additions and 0 deletions
--- a/appdoc/implementation_concept_generation_structure_rev4.md
+++ b/appdoc/implementation_concept_generation_structure_rev4.md
@@ -0,0 +1,881 @@
+# Implementation Concept: Chapter-Based Generation Structure
+
+## Overview
+
+Switch from a section-based to a **chapter-based structure** to solve the following problems:
+
+1. Section generation prompts do not know the standard JSON schema
+2. Mixed element types cannot be processed correctly
+3. Sections are too rigid: a section cannot contain multiple element types
+
+## Critical Analysis: Aggregating Multiple ContentParts
+
+**Problem identified:**
+- Certain `content_type` values (e.g. `table`, `bullet_list`) require aggregating multiple ContentParts
+- Example: 20 expense receipts → one Excel table
+- Currently: each ContentPart is processed individually → no aggregation possible
+
+**Solution implemented:**
+- A generic `_needsAggregation()` function detects when aggregation is needed
+- When aggregation is needed: all ContentParts are passed to `callAi` together
+- Uses `callAi` instead of `callAiPlanning` for ContentParts support
+- Automatic chunking also works for aggregated parts
+
+**Supported aggregation types:**
+- `table`: multiple parts → one table (e.g. Excel list)
+- `bullet_list`: multiple parts → one list
+- Further types can easily be added
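The last bullet can be illustrated with a minimal sketch, assuming the aggregating types are kept in a single set. `AGGREGATION_TYPES`, the standalone `needsAggregation`, and the `numbered_list` type are illustrative assumptions and not part of the concept itself.

```python
from typing import FrozenSet

# Assumption: all aggregating content types live in one set, so supporting
# a new type means adding one entry here.
AGGREGATION_TYPES: FrozenSet[str] = frozenset({"table", "bullet_list"})


def needsAggregation(
    contentType: str,
    contentPartCount: int,
    aggregationTypes: FrozenSet[str] = AGGREGATION_TYPES,
) -> bool:
    """Return True when several ContentParts must be merged into one element."""
    return contentType in aggregationTypes and contentPartCount > 1


# Hypothetical extension: treat a "numbered_list" type like a bullet list.
extendedTypes = AGGREGATION_TYPES | {"numbered_list"}
```

Passing the set as a parameter keeps the decision free of hard-coded branches, so a caller can opt into additional types without touching the function body.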
+
+## Core Principles
+
+1. **Chapter-based structure**: structure generation defines chapters, not sections
+2. **Chapters contain sections**: each chapter can contain multiple sections of different types
+3. **Standard JSON schema in prompts**: chapter generation prompts contain the complete standard JSON schema
+4. **Flexible content processing**: chapters can contain mixed ContentParts
+5. **Hierarchical headings**: chapters are hierarchical headings (level 1, 2, 3, etc.)
+
+---
+
+## Architecture
+
+### Chapters as a Helper Structure
+
+Chapters are an intermediate helper structure used during generation. The final output structure remains unchanged:
+
+**Final output structure:**
+```
+Document
+  └── Sections[]
+      └── content_type
+      └── elements[]
+```
+
+**Chapter structure (helper):**
+```
+ChapterStructure
+  └── Chapters[]
+      └── level, title
+      └── contentPartIds[]
+      └── contentPartInstructions{}
+      └── generationHint
+      └── Sections[] (generated)
+          └── content_type
+          └── elements[]
+```
+
+**Important:**
+- Chapter = container for generating one part of a document, including its sections
+- Each chapter has a predefined heading section (chapter title + level)
+- The final output structure has no chapters, only sections
+- Chapters are flattened into sections for the final output
+
+---
+
+## Workflow Phases
+
+**Important: debug file logging**
+- All AI calls and responses are logged to debug files
+- Prompts: `{operationType}_{identifier}_prompt.txt`
+- Responses: `{operationType}_{identifier}_response.txt`
+- Examples:
+  - Phase 5C: `chapter_structure_generation_prompt.txt` / `chapter_structure_generation_response.txt`
+  - Phase 5D.1: `chapter_structure_{chapterId}_prompt.txt` / `chapter_structure_{chapterId}_response.txt`
+  - Phase 5D.2: `section_content_{sectionId}_prompt.txt` / `section_content_{sectionId}_response.txt`
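The naming pattern above could be captured in a small helper. This is a sketch only: `debugFileNames` is a hypothetical name, and it assumes the `.txt` suffix is part of the final file name as shown in the examples.

```python
from typing import Tuple


def debugFileNames(operationType: str, identifier: str) -> Tuple[str, str]:
    """Build the prompt and response debug file names for one AI call."""
    base = f"{operationType}_{identifier}"
    return f"{base}_prompt.txt", f"{base}_response.txt"


# Phase 5D.2 example: one pair of files per section.
promptFile, responseFile = debugFileNames("section_content", "section_1")
```

Deriving both names from one call keeps the prompt and response of the same AI call paired under the same identifier.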
+
+---
+
+### Phase 5B: Content Extraction
+
+**What happens:**
+- Extracts content based on intents
+- Prepares ContentParts with metadata
+- All extraction happens BEFORE structure generation
+
+**Output:**
+- List of ContentParts with complete metadata
+
+---
+
+### Phase 5C: Chapter Structure Generation
+
+**What happens:**
+- Generates the chapter structure (table of contents)
+- Defines for each chapter:
+  - Level, Title
+  - contentPartIds
+  - contentPartInstructions
+  - generationHint
+
+**Input:**
+- `userPrompt`: user request
+- `contentParts`: all prepared ContentParts (already extracted)
+- `outputFormat`: target format
+
+**Process:**
+```python
+import json
+from typing import Any, Dict, List
+
+async def _generateChapterStructure(
+    self,
+    userPrompt: str,
+    contentParts: List[ContentPart],
+    outputFormat: str,
+    parentOperationId: str
+) -> Dict[str, Any]:
+    structurePrompt = self._buildChapterStructurePrompt(
+        userPrompt=userPrompt,
+        contentParts=contentParts,
+        outputFormat=outputFormat
+    )
+    
+    # Debug: Log Prompt
+    self.services.utils.writeDebugFile(
+        structurePrompt,
+        "chapter_structure_generation_prompt"
+    )
+    
+    aiResponse = await self.services.ai.callAiPlanning(
+        prompt=structurePrompt
+    )
+    
+    # Debug: Log Response
+    self.services.utils.writeDebugFile(
+        aiResponse,
+        "chapter_structure_generation_response"
+    )
+    
+    structure = json.loads(
+        self.services.utils.jsonExtractString(aiResponse)
+    )
+    
+    return structure
+```
+
+**Prompt format:**
+```
+USER REQUEST: {userPrompt}
+
+AVAILABLE CONTENT PARTS:
+{contentPartsIndex}
+
+TASK: Generate the chapter structure for the documents to be generated.
+
+For each chapter:
+- chapter id
+- level (1, 2, 3, etc.)
+- title
+- contentPartIds: [list of ContentPart IDs]
+- contentPartInstructions: {
+    "partId": {
+        "instruction": "How the content should be structured"
+    }
+}
+- generationHint: description of the content
+
+RETURN JSON:
+{
+  "metadata": {...},
+  "documents": [{
+    "chapters": [
+      {
+        "id": "chapter_1",
+        "level": 1,
+        "title": "Introduction",
+        "contentPartIds": ["part_ext_1"],
+        "contentPartInstructions": {...},
+        "generationHint": "...",
+        "sections": []
+      }
+    ]
+  }]
+}
+```
+
+**Output structure:**
+```json
+{
+  "metadata": {"title": "...", "language": "de"},
+  "documents": [{
+    "chapters": [
+      {
+        "id": "chapter_summary",
+        "level": 1,
+        "title": "Summary",
+        "contentPartIds": ["extracted_doc1_part1"],
+        "contentPartInstructions": {
+          "extracted_doc1_part1": {
+            "instruction": "Create a summary paragraph"
+          }
+        },
+        "generationHint": "Create summary",
+        "sections": []
+      }
+    ]
+  }]
+}
+```
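Because this structure comes back from an AI call, the `contentPartIds` it references may not all exist among the extracted ContentParts. A sanity check before Phase 5D could look like the sketch below; it is not part of the concept, and `findUnknownContentPartIds` is a hypothetical helper name.

```python
from typing import Any, Dict, List, Set


def findUnknownContentPartIds(
    structure: Dict[str, Any], knownPartIds: Set[str]
) -> Dict[str, List[str]]:
    """Map chapter id to referenced ContentPart IDs that were never extracted."""
    unknown: Dict[str, List[str]] = {}
    for doc in structure.get("documents", []):
        for chapter in doc.get("chapters", []):
            missing = [
                partId
                for partId in chapter.get("contentPartIds", [])
                if partId not in knownPartIds
            ]
            if missing:
                unknown[chapter.get("id", "unknown")] = missing
    return unknown


# Sample data mirroring the output structure shown above.
sampleStructure = {
    "metadata": {"title": "...", "language": "de"},
    "documents": [{
        "chapters": [{
            "id": "chapter_summary",
            "level": 1,
            "title": "Summary",
            "contentPartIds": ["extracted_doc1_part1"],
            "sections": [],
        }]
    }],
}
```

An empty result means every chapter only references ContentParts that Phase 5B actually produced.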
+
+---
+
+### Phase 5D: Chapter Content Generation
+
+**Two-phase approach:**
+
+#### Phase 5D.1: Generate the Sections Structure
+
+**What happens:**
+- Generates the sections structure for each chapter (without content)
+- Sections contain: content_type, contentPartIds, generationHint, useAiCall
+- The AI sets the `useAiCall` flag directly in the JSON
+
+**useAiCall flag:**
+- `useAiCall = true` if:
+  - `content_type != "paragraph"` (transformation needed)
+  - or contentPartInstructions contains specific instructions (e.g. use only parts of the content)
+- `useAiCall = false` otherwise (insert content directly)
+
+**Process:**
+```python
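+# Note: this function shares its name with the Phase 5C function above.
+# Inside one class the later definition would replace the earlier one,
+# so one of the two needs a distinct name.
+async def _generateChapterStructure(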
+    self,
+    chapterStructure: Dict[str, Any],
+    contentParts: List[ContentPart],
+    userPrompt: str,
+    parentOperationId: str
+) -> Dict[str, Any]:
+    for doc in chapterStructure.get("documents", []):
+        for chapter in doc.get("chapters", []):
+            chapterId = chapter.get("id", "unknown")
+            chapterPrompt = self._buildChapterStructurePrompt(
+                chapter=chapter,
+                contentPartIds=chapter.get("contentPartIds"),
+                contentPartInstructions=chapter.get("contentPartInstructions"),
+                userPrompt=userPrompt
+            )
+            
+            # Debug: Log Prompt
+            self.services.utils.writeDebugFile(
+                chapterPrompt,
+                f"chapter_structure_{chapterId}_prompt"
+            )
+            
+            aiResponse = await self.services.ai.callAiPlanning(
+                prompt=chapterPrompt
+            )
+            
+            # Debug: Log Response
+            self.services.utils.writeDebugFile(
+                aiResponse,
+                f"chapter_structure_{chapterId}_response"
+            )
+            
+            sectionsStructure = json.loads(
+                self.services.utils.jsonExtractString(aiResponse)
+            )
+            
+            chapter["sections"] = sectionsStructure.get("sections", [])
+            
+            # Set the useAiCall flag (if the AI did not set it)
+            contentPartInstructions = chapter.get("contentPartInstructions") or {}
+            for section in chapter["sections"]:
+                if "useAiCall" not in section:
+                    contentType = section.get("content_type", "paragraph")
+                    useAiCall = contentType != "paragraph"
+                    
+                    # Check contentPartInstructions
+                    if not useAiCall:
+                        for partId in section.get("contentPartIds", []):
+                            instruction = contentPartInstructions.get(partId, {}).get("instruction", "")
+                            if instruction and instruction.lower() not in ["include full text", "include all content"]:
+                                useAiCall = True
+                                break
+                    
+                    section["useAiCall"] = useAiCall
+    
+    return chapterStructure
+```
+
+**Prompt format:**
+```
+TASK: Generate Chapter Sections Structure
+
+CHAPTER METADATA:
+- Chapter ID: {chapterId}
+- Chapter Level: {chapterLevel}
+- Chapter Title: {chapterTitle}
+- Generation Hint: {generationHint}
+
+IMPORTANT: The chapter already has a predefined heading section.
+Do NOT generate a heading section for the chapter title!
+
+AVAILABLE CONTENT PARTS:
+{contentPartIds}  # IDs only, NO previews!
+
+For each ContentPart:
+- ContentPart ID: {partId}
+- Format: {contentFormat}
+- Instruction: {contentPartInstructions[partId].instruction}
+
+STANDARD JSON SCHEMA FOR SECTIONS:
+[... Standard JSON Schema ...]
+
+Return JSON:
+{
+  "sections": [
+    {
+      "id": "section_1",
+      "content_type": "paragraph",
+      "contentPartIds": ["part_ext_1"],
+      "generationHint": "...",
+      "useAiCall": false,  # the AI sets the flag directly
+      "elements": []
+    }
+  ]
+}
+```
+
+#### Phase 5D.2: Fill Sections with ContentParts
+
+**What happens:**
+- Fills each section separately with its ContentParts
+- Based on the `useAiCall` flag:
+  - `useAiCall = true`: separate AI call with the ContentPart(s) (chunking for large parts)
+  - `useAiCall = false`: insert content directly
+- Rendering/reference content: always inserted directly, without an AI call
+
+**Aggregating multiple ContentParts:**
+- Certain `content_type` values require aggregating multiple parts:
+  - `table`: multiple parts → one table (e.g. 20 receipts → Excel list)
+  - `bullet_list`: multiple parts → one list
+  - `paragraph`: can also be aggregated (e.g. comparing multiple documents), although `_needsAggregation` does not aggregate paragraphs by default
+- When aggregation is needed: pass all parts to the AI together (not individually)
+- Uses `callAi` instead of `callAiPlanning` for ContentParts support
+
+**Process:**
+```python
+async def _fillChapterSections(
+    self,
+    chapterStructure: Dict[str, Any],
+    contentParts: List[ContentPart],
+    userPrompt: str,
+    parentOperationId: str
+) -> Dict[str, Any]:
+    for doc in chapterStructure.get("documents", []):
+        for chapter in doc.get("chapters", []):
+            for section in chapter.get("sections", []):
+                elements = []
+                useAiCall = section.get("useAiCall", False)
+                contentType = section.get("content_type", "paragraph")
+                contentPartIds = section.get("contentPartIds", [])
+                
+                # Check whether aggregation is needed
+                needsAggregation = self._needsAggregation(
+                    contentType=contentType,
+                    contentPartCount=len(contentPartIds)
+                )
+                
+                if needsAggregation and useAiCall:
+                    # Aggregation: process all parts together
+                    sectionParts = [
+                        self._findContentPartById(pid, contentParts)
+                        for pid in contentPartIds
+                    ]
+                    sectionParts = [p for p in sectionParts if p is not None]
+                    
+                    if sectionParts:
+                        sectionId = section.get("id", "unknown")
+                        sectionPrompt = self._buildSectionContentPrompt(
+                            section=section,
+                            contentParts=sectionParts,  # ALL PARTS!
+                            generationHint=section.get("generationHint"),
+                            userPrompt=userPrompt
+                        )
+                        
+                        # Debug: Log Prompt
+                        self.services.utils.writeDebugFile(
+                            sectionPrompt,
+                            f"section_content_{sectionId}_prompt"
+                        )
+                        
+                        # Use callAi for ContentParts support
+                        request = AiCallRequest(
+                            prompt=sectionPrompt,
+                            contentParts=sectionParts,  # ALL PARTS!
+                            options=AiCallOptions(
+                                operationType=OperationTypeEnum.DATA_ANALYSE,
+                                priority=PriorityEnum.BALANCED,
+                                processingMode=ProcessingModeEnum.DETAILED
+                            )
+                        )
+                        aiResponse = await self.services.ai.callAi(request)
+                        
+                        # Debug: Log Response
+                        self.services.utils.writeDebugFile(
+                            aiResponse.content,
+                            f"section_content_{sectionId}_response"
+                        )
+                        
+                        elements.extend(parseElements(aiResponse.content))
+                
+                else:
+                    # Single processing: each part individually
+                    for partId in contentPartIds:
+                        part = self._findContentPartById(partId, contentParts)
+                        if not part:
+                            continue
+                        
+                        contentFormat = part.metadata.get("contentFormat")
+                        
+                        if contentFormat == "extracted":
+                            if useAiCall:
+                                # AI call with a single ContentPart
+                                sectionId = section.get("id", "unknown")
+                                sectionPrompt = self._buildSectionContentPrompt(
+                                    section=section,
+                                    contentParts=[part],  # ONE PART
+                                    generationHint=section.get("generationHint"),
+                                    userPrompt=userPrompt
+                                )
+                                
+                                # Debug: Log Prompt
+                                self.services.utils.writeDebugFile(
+                                    sectionPrompt,
+                                    f"section_content_{sectionId}_prompt"
+                                )
+                                
+                                request = AiCallRequest(
+                                    prompt=sectionPrompt,
+                                    contentParts=[part],
+                                    options=AiCallOptions(...)
+                                )
+                                aiResponse = await self.services.ai.callAi(request)
+                                
+                                # Debug: Log Response
+                                self.services.utils.writeDebugFile(
+                                    aiResponse.content,
+                                    f"section_content_{sectionId}_response"
+                                )
+                                
+                                elements.extend(parseElements(aiResponse.content))
+                            else:
+                                # Insert content directly
+                                elements.append({
+                                    "type": "paragraph",
+                                    "content": part.data or ""
+                                })
+                        
+                        elif contentFormat == "reference":
+                            elements.append({
+                                "type": "reference",
+                                "documentReference": part.metadata.get("documentReference")
+                            })
+                        
+                        elif contentFormat == "object":
+                            elements.append({
+                                "type": "image",
+                                "base64Data": part.data
+                            })
+                
+                section["elements"] = elements
+    
+    return chapterStructure
+
+def _needsAggregation(
+    self,
+    contentType: str,
+    contentPartCount: int
+) -> bool:
+    """
+    Determines whether multiple ContentParts must be aggregated.
+    
+    Aggregation is needed when:
+    - content_type requires aggregation (table, bullet_list)
+    - AND multiple ContentParts are present (> 1)
+    
+    Args:
+        contentType: Section content_type
+        contentPartCount: Number of ContentParts in this section
+        
+    Returns:
+        True if aggregation is needed, False otherwise
+    """
+    aggregationTypes = ["table", "bullet_list"]
+    
+    if contentType in aggregationTypes and contentPartCount > 1:
+        return True
+    
+    # Optional: also for paragraph when multiple parts are present
+    # (e.g. comparing multiple documents)
+    if contentType == "paragraph" and contentPartCount > 1:
+        # Check generationHint for signs of aggregation
+        # (e.g. "compare", "summary", "list")
+        return False  # Default: no aggregation for paragraph
+    
+    return False
+```
+
+**Prompt format (for the AI call):**
+
+**Single processing (one ContentPart):**
+```
+TASK: Generate Section Content
+
+SECTION METADATA:
+- Section ID: {sectionId}
+- Content Type: {contentType}
+- Generation Hint: {generationHint}
+
+CONTEXT:
+- User Request: {userPrompt}
+- What to generate: {generationHint}
+
+CONTENT PART:
+- ContentPart ID: {partId}
+- Format: extracted
+- ContentPart is passed as a parameter (not in the prompt!)
+- Can be very large (e.g. 200MB) → automatic chunking
+
+STANDARD JSON SCHEMA FOR ELEMENTS:
+[... Standard JSON Schema ...]
+
+Return JSON:
+{
+  "elements": [
+    {"type": "paragraph", "content": "..."},
+    {"type": "table", "headers": [...], "rows": [...]}
+  ]
+}
+```
+
+**Aggregation (multiple ContentParts):**
+```
+TASK: Generate Section Content
+
+SECTION METADATA:
+- Section ID: {sectionId}
+- Content Type: {contentType}  # e.g. "table"
+- Generation Hint: {generationHint}  # e.g. "Create an Excel list of all expense receipts"
+
+CONTEXT:
+- User Request: {userPrompt}
+- What to generate: {generationHint}
+
+CONTENT PARTS (Aggregation):
+- Count: {contentPartCount} ContentParts
+- All ContentParts are passed as parameters (not in the prompt!)
+- Each part can be very large → automatic chunking
+- IMPORTANT: Aggregate ALL parts into one element (e.g. one table)
+
+ContentPart IDs:
+{contentPartIds}  # List of all IDs
+
+STANDARD JSON SCHEMA FOR ELEMENTS:
+[... Standard JSON Schema ...]
+
+Return JSON:
+{
+  "elements": [
+    {
+      "type": "table",
+      "headers": ["Column1", "Column2", ...],
+      "rows": [
+        ["Data from part 1", ...],
+        ["Data from part 2", ...],
+        ...
+      ]
+    }
+  ]
+}
+```
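The concept calls `_buildSectionContentPrompt` without showing its body. As a rough sketch of how the aggregation prompt above might be rendered, the hypothetical `buildAggregationPrompt` below fills the template from plain values. The function name, its parameters, and the exact wording of the rendered lines are assumptions. Only the ContentPart IDs go into the text; the part data itself is passed separately to `callAi`.

```python
from typing import List


def buildAggregationPrompt(
    sectionId: str,
    contentType: str,
    generationHint: str,
    userPrompt: str,
    contentPartIds: List[str],
    elementSchema: str,
) -> str:
    """Render the aggregation prompt; part data is deliberately not included."""
    idLines = "\n".join(f"- {partId}" for partId in contentPartIds)
    return (
        "TASK: Generate Section Content\n\n"
        "SECTION METADATA:\n"
        f"- Section ID: {sectionId}\n"
        f"- Content Type: {contentType}\n"
        f"- Generation Hint: {generationHint}\n\n"
        "CONTEXT:\n"
        f"- User Request: {userPrompt}\n\n"
        "CONTENT PARTS (Aggregation):\n"
        f"- Count: {len(contentPartIds)} ContentParts\n"
        "- IMPORTANT: Aggregate ALL parts into one element\n\n"
        "ContentPart IDs:\n"
        f"{idLines}\n\n"
        "STANDARD JSON SCHEMA FOR ELEMENTS:\n"
        f"{elementSchema}\n"
    )


examplePrompt = buildAggregationPrompt(
    sectionId="section_1",
    contentType="table",
    generationHint="Create an Excel list of all expense receipts",
    userPrompt="Create an Excel list of the expense receipts",
    contentPartIds=["part_1", "part_2"],
    elementSchema="[... Standard JSON Schema ...]",
)
```

Keeping the part data out of the prompt string means the prompt stays small even when the parts are large, which is what allows chunking to act on the parts alone.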
+
+**Main function:**
+```python
+async def _generateChapterContent(
+    self,
+    chapterStructure: Dict[str, Any],
+    contentParts: List[ContentPart],
+    userPrompt: str,
+    parentOperationId: str
+) -> Dict[str, Any]:
+    # Phase 5D.1: generate the sections structure
+    structureWithSections = await self._generateChapterStructure(
+        chapterStructure, contentParts, userPrompt, parentOperationId
+    )
+    
+    # Phase 5D.2: fill sections with ContentParts
+    filledStructure = await self._fillChapterSections(
+        structureWithSections, contentParts, userPrompt, parentOperationId
+    )
+    
+    return filledStructure
+```
+
+---
+
+## Standard JSON Schema
+
+### Supported Section Types
+
+```python
+supportedSectionTypes = [
+    "table",
+    "bullet_list",
+    "heading",
+    "paragraph",
+    "code_block",
+    "image"
+]
+```
+
+### Section Element Types
+
+1. **Standard Elements:**
+   - `heading`, `paragraph`, `table`, `bullet_list`, `code_block`, `image`
+
+2. **Special Elements:**
+   - `extracted_text`: extracted text with source
+   - `reference`: document reference
+
+---
+
+## Flattening: Chapters to Sections
+
+**Important:** The final output structure has no chapters, only sections.
+
+```python
+def flattenChapterStructureToSections(
+    chapterStructure: Dict[str, Any]
+) -> Dict[str, Any]:
+    result = {
+        "metadata": chapterStructure.get("metadata", {}),
+        "documents": []
+    }
+    
+    for doc in chapterStructure.get("documents", []):
+        flattened_doc = {
+            "id": doc.get("id"),
+            "title": doc.get("title"),
+            "filename": doc.get("filename"),
+            "sections": []
+        }
+        
+        for chapter in doc.get("chapters", []):
+            # 1. Predefined heading section
+            heading_section = {
+                "id": f"{chapter['id']}_heading",
+                "content_type": "heading",
+                "elements": [{
+                    "type": "heading",
+                    "content": chapter.get("title"),
+                    "level": chapter.get("level", 1)
+                }]
+            }
+            flattened_doc["sections"].append(heading_section)
+            
+            # 2. Generated sections
+            flattened_doc["sections"].extend(chapter.get("sections", []))
+        
+        result["documents"].append(flattened_doc)
+    
+    return result
+```
+
+---
+
+## Pydantic Models
+
+```python
+from typing import Any, Dict, List
+
+from pydantic import BaseModel, Field
+
+class ContentPartInstruction(BaseModel):
+    instruction: str = Field(
+        description="Instruction on how the already extracted content should be structured"
+    )
+
+class Chapter(BaseModel):
+    id: str
+    level: int = Field(ge=1, le=6)
+    title: str
+    contentPartIds: List[str] = Field(default_factory=list)
+    contentPartInstructions: Dict[str, ContentPartInstruction] = Field(default_factory=dict)
+    generationHint: str
+    sections: List[Dict[str, Any]] = Field(default_factory=list)
+
+class ChapterStructure(BaseModel):
+    metadata: Dict[str, Any]
+    documents: List[Dict[str, Any]]
+    
+    def flattenToSections(self) -> Dict[str, Any]:
+        # Flattening logic
+        ...
+```
+
+---
+
+## Chunking for Large ContentParts
+
+**Important:** ContentParts can be very large (e.g. 200MB). Chunking happens automatically.
+
+**Flow:**
+1. `callAi` with ContentParts → routes to `processContentPartsWithAi`
+2. `processContentPartsWithAi` → `processContentPartWithFallback` for each part
+3. If a part is too large → `chunkContentPartForAi` → chunking happens ONCE
+4. Chunked parts are processed sequentially
+5. `_callWithModel` does no further chunking
+
+**No recursion:**
+- Chunking happens once per ContentPart
+- Chunked parts are processed sequentially (not recursively)
+- `_callWithModel` only calls the model (no chunking)
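The chunk-once, process-sequentially flow can be sketched as follows. This is a simplified illustration, not the behavior of `chunkContentPartForAi`: it assumes plain character-based splitting, and `chunkOnce`, `processPart`, and the `callModel` callback are hypothetical names, with `callModel` standing in for `_callWithModel`.

```python
from typing import Callable, List


def chunkOnce(data: str, maxChunkSize: int) -> List[str]:
    """Split data into consecutive chunks of at most maxChunkSize characters."""
    if maxChunkSize <= 0:
        raise ValueError("maxChunkSize must be positive")
    if len(data) <= maxChunkSize:
        return [data]
    return [data[i:i + maxChunkSize] for i in range(0, len(data), maxChunkSize)]


def processPart(
    data: str, maxChunkSize: int, callModel: Callable[[str], str]
) -> List[str]:
    """Chunk a part exactly once, then call the model per chunk in order."""
    # callModel never chunks again, so there is no recursion.
    return [callModel(chunk) for chunk in chunkOnce(data, maxChunkSize)]


# Usage with a stand-in model call that just upper-cases its input.
results = processPart("abcdefg", 3, str.upper)
```

Because splitting happens in one place and the model call receives already sized chunks, the number of model calls per part is fixed once the chunks exist.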
+
+---
+
+## Implementation Requirements
+
+1. **Phase 5C**: generates chapters instead of sections
+2. **Phase 5D.1**: generates the sections structure with the useAiCall flag
+3. **Phase 5D.2**: fills sections based on the useAiCall flag
+4. **Flattening**: converts chapters to the final section structure
+5. **Pydantic models**: define the ChapterStructure model
+6. **Standard JSON schema**: included in chapter prompts
+7. **Renderers**: remain unchanged (they use the final section structure)
+
+---
+
+## Key Points
+
+1. **ContentParts integration:**
+   - ContentParts come from Phase 5B (already extracted)
+   - Phase 5D.1: only IDs in the prompt (no previews)
+   - Phase 5D.2: ContentParts passed as parameters (not in the prompt)
+
+2. **useAiCall flag:**
+   - The AI sets the flag directly in the JSON
+   - Fallback: set automatically based on content_type and instructions
+   - Generic, language-independent (no keyword checks)
+
+3. **Aggregating multiple ContentParts:**
+   - Certain content_types require aggregation (table, bullet_list)
+   - When multiple parts are present: all are passed to the AI together
+   - Uses `callAi` instead of `callAiPlanning` for ContentParts support
+   - Automatic chunking for large aggregated parts
+
+4. **Chunking:**
+   - Automatic for large ContentParts
+   - Also works when aggregating multiple parts
+   - No recursion possible
+   - Chunks are processed sequentially
+
+5. **Multiple documents:**
+   - The structure supports multiple documents, each with its own chapters
+
+6. **ContentPart instructions:**
+   - ContentParts are already extracted
+   - Instructions provide context for structuring
+   - No "usage" field (the format is clear from contentFormat)
+
+7. **Debug file logging:**
+   - All AI calls and responses are logged to debug files
+   - Prompts: `{operationType}_{identifier}_prompt.txt`
+   - Responses: `{operationType}_{identifier}_response.txt`
+   - Examples:
+     - Phase 5C: `chapter_structure_generation_prompt.txt` / `chapter_structure_generation_response.txt`
+     - Phase 5D.1: `chapter_structure_{chapterId}_prompt.txt` / `chapter_structure_{chapterId}_response.txt`
+     - Phase 5D.2: `section_content_{sectionId}_prompt.txt` / `section_content_{sectionId}_response.txt`
+
+---
+
+## Example Scenarios
+
+### Example 1: Excel List of Expense Receipts
+**User Prompt:** "Create an Excel list of the expense receipts"
+**Input:** 20 PDF documents, each containing a photo of a receipt
+
+**Phase 5B:**
+- 20 PDFs are extracted → 20 ContentParts (contentFormat: "extracted")
+
+**Phase 5C:**
+- Generates a chapter with all 20 contentPartIds
+
+**Phase 5D.1:**
+- Generates a section with:
+  - `content_type: "table"`
+  - `contentPartIds: [part_1, ..., part_20]`
+  - `useAiCall: true`
+
+**Phase 5D.2:**
+- `_needsAggregation("table", 20)` → `True`
+- All 20 ContentParts are passed to `callAi` together
+- The AI generates one table with all receipt data
+
+**Result:** ✅ Works with the aggregation logic
+
+---
+
+### Example 2: Comparing Multiple Documents
+**User Prompt:** "Compare the three contracts"
+**Input:** 3 PDF documents (contracts)
+
+**Phase 5B:**
+- 3 PDFs are extracted → 3 ContentParts
+
+**Phase 5C:**
+- Generates a chapter with all 3 contentPartIds
+
+**Phase 5D.1:**
+- Generates a section with:
+  - `content_type: "table"` (for a comparison table)
+  - `contentPartIds: [part_1, part_2, part_3]`
+  - `useAiCall: true`
+
+**Phase 5D.2:**
+- `_needsAggregation("table", 3)` → `True`
+- All 3 ContentParts are passed to `callAi` together
+- The AI generates a comparison table
+
+**Result:** ✅ Works with the aggregation logic
+
+---
+
+### Example 3: List of Products
+**User Prompt:** "Create a list of all products from the catalogs"
+**Input:** 5 PDF documents (product catalogs)
+
+**Phase 5B:**
+- 5 PDFs are extracted → 5 ContentParts
+
+**Phase 5C:**
+- Generates a chapter with all 5 contentPartIds
+
+**Phase 5D.1:**
+- Generates a section with:
+  - `content_type: "bullet_list"`
+  - `contentPartIds: [part_1, ..., part_5]`
+  - `useAiCall: true`
+
+**Phase 5D.2:**
+- `_needsAggregation("bullet_list", 5)` → `True`
+- All 5 ContentParts are passed to `callAi` together
+- The AI generates one list with all products
+
+**Result:** ✅ Works with the aggregation logic
+
+---
+
+### Example 4: Single Document
+**User Prompt:** "Summary of the document"
+**Input:** 1 PDF document
+
+**Phase 5B:**
+- 1 PDF is extracted → 1 ContentPart
+
+**Phase 5C:**
+- Generates a chapter with 1 contentPartId
+
+**Phase 5D.1:**
+- Generates a section with:
+  - `content_type: "paragraph"`
+  - `contentPartIds: [part_1]`
+  - `useAiCall: true`
+
+**Phase 5D.2:**
+- `_needsAggregation("paragraph", 1)` → `False`
+- Single processing: 1 ContentPart is passed to `callAi`
+- The AI generates the summary
+
+**Result:** ✅ Works (no aggregation needed)