wiki/z-archive/implementation/implementation_taskintentions_done.md


# Task Intentions & Generic Looping System - Refactoring Architecture
## Executive Summary
This document outlines a comprehensive refactoring to enhance the generation system with:
1. **AI Service-Level Intent Detection**: Distinguish document from code generation when a `DATA_GENERATE` operation is called, using an explicit `generationIntent` supplied by the action; the workflow level remains unchanged
2. **Generic Looping System**: Parametrized looping infrastructure supporting different JSON formats and use cases
3. **Multiple Generation Paths**: Document, code, and image generation paths within the generation service, all unified as action result documents
4. **Smart Code Generation**: Multi-file projects with dependency handling, requirements.txt/package.json generation, and proper cross-file references
---
## Part 1: AI Service-Level Intent Detection
### 1.1 Current State
**Problem**:
- `DATA_GENERATE` operation type is used for both document and code generation
- No distinction at AI service level - always routes to document generation pipeline
- Code generation requests treated as document generation
- `IMAGE_GENERATE` already works correctly (no changes needed)
**Current Flow**:
```
User Request
Task Planning (unchanged)
Action Planning (selects ai.process)
ai.process → callAiContent(operationType=DATA_GENERATE)
Document Generation Pipeline (always) ❌ Wrong for code!
```
**Key Insight**:
- **Workflow level (task/action planning)**: Remains unchanged ✅
- **AI Service level**: Need to detect intent when `DATA_GENERATE` is called
- **Operation Types**:
  - `IMAGE_GENERATE` → Already handles images correctly ✅
  - `DATA_GENERATE` → Needs to split: document vs code
**Current Issue with `ai.process`**:
- `ai.process` creates `AiCallOptions(resultFormat=output_format)` - **no operationType set**
- `callAiContent()` defaults to `DATA_GENERATE` if operationType not set (line 623)
- If `resultType="png"` or `"jpg"` → still uses `DATA_GENERATE`, NOT `IMAGE_GENERATE`
- Image generation requests go through document pipeline instead of image pipeline
**Solution**: Detect image generation intent and set `operationType=IMAGE_GENERATE` when appropriate
### 1.2 Proposed Architecture
#### Intent Detection at AI Service Level
**Location**: `gateway/modules/services/serviceAi/mainServiceAi.py` and `callAiContent()`
**Principle**: When a `DATA_GENERATE` operation is called, the explicit `generationIntent` determines whether the request is:
- **Document generation**: Reports, articles, formatted documents (existing behavior)
- **Code generation**: Executable code files (new behavior)
**No changes needed**:
- Task planning (remains unchanged)
- Action planning (remains unchanged)
- `IMAGE_GENERATE` operation (already works)
#### Intent Detection Logic
**NO AUTO-DETECTION**: Intent detection is NOT used in the new architecture.
**Architecture Principle**:
- **NO auto-detection**: Actions must explicitly provide `generationIntent`
- **Clear use cases**: Each action defines its intent explicitly
- **No fallback**: No fallback to old processing or detection logic
- **Fail fast**: If `generationIntent` is missing, raise error immediately
- **Explicit over implicit**: All intent must be explicitly specified - no guessing or inference
- **Format detection vs Intent detection**:
  - **Format detection is acceptable**: Detecting image formats from the explicit `resultType` parameter (e.g., "png", "jpg") is acceptable because it is based on an explicit parameter, not prompt analysis
  - **Intent detection is NOT acceptable**: Detecting intent from prompt content or other inferred sources is not allowed - intent must be explicit
**Implementation**:
- All actions must pass explicit `generationIntent` parameter
- `callAiContent()` requires `generationIntent` for `DATA_GENERATE` operations
- No IntentDetector class needed - intent comes from action definition
- Image generation detection: `ai.process` detects image formats from `resultType` and sets `operationType=IMAGE_GENERATE` automatically (this is format detection based on explicit parameter, not intent detection from prompt)
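The difference between format detection and intent detection can be illustrated with a minimal sketch. The helper names `normalizeResultType` and `isImageFormat` are hypothetical and not part of the existing code base; the point is only that the decision reads the explicit `resultType` parameter and never inspects the prompt.

```python
from typing import Optional

# Image formats named in this document; the set itself is an assumption of this sketch
IMAGE_FORMATS = {"png", "jpg", "jpeg", "gif", "webp"}


def normalizeResultType(resultType: Optional[str], default: str = "txt") -> str:
    """Lower-case the explicit resultType, dropping whitespace and a leading dot."""
    if resultType is None:
        return default
    return str(resultType).strip().lstrip(".").lower() or default


def isImageFormat(resultType: Optional[str]) -> bool:
    """Format detection: looks only at the explicit parameter, never at the prompt."""
    return normalizeResultType(resultType) in IMAGE_FORMATS


print(isImageFormat(".PNG"))  # True
print(isImageFormat("docx"))  # False
print(isImageFormat(None))    # False
```

A missing or empty `resultType` falls back to `"txt"`, which matches the default used by `ai.process` in this document.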
#### AI Service Integration
**Modify**: `mainServiceAi.py` - `callAiContent()` method
```python
async def callAiContent(
    self,
    prompt: str,
    options: Optional[AiCallOptions] = None,
    documentList: Optional[DocumentReferenceList] = None,
    contentParts: Optional[List[ContentPart]] = None,
    outputFormat: Optional[str] = None,
    title: Optional[str] = None,
    parentOperationId: Optional[str] = None,
    generationIntent: Optional[str] = None  # NEW: Explicit intent from action (required for DATA_GENERATE)
) -> AiResponse:
    """
    Unified AI content generation with explicit intent requirement.

    Args:
        generationIntent: REQUIRED explicit intent ("document" | "code" | "image") from action.
            NO auto-detection - actions must explicitly specify intent.
    """
    options = options or AiCallOptions()
    operationType = options.operationType or OperationTypeEnum.DATA_GENERATE

    # Route based on operation type
    if operationType == OperationTypeEnum.IMAGE_GENERATE:
        # Image generation - already works correctly, no changes needed
        return await self._handleImageGeneration(prompt, options, outputFormat)
    elif operationType == OperationTypeEnum.DATA_GENERATE:
        # Data generation - REQUIRES explicit generationIntent
        if not generationIntent:
            raise ValueError(
                "generationIntent is required for DATA_GENERATE operation. "
                "Actions must explicitly specify 'document' or 'code' intent. "
                "No auto-detection - use qualified actions (ai.generateDocument, ai.generateCode)."
            )
        # Route based on explicit intent (no auto-detection, no fallback)
        if generationIntent == "code":
            # Route to code generation path
            return await self._handleCodeGeneration(
                prompt=prompt,
                options=options,
                contentParts=contentParts,
                outputFormat=outputFormat,
                title=title,
                parentOperationId=parentOperationId
            )
        elif generationIntent == "document":
            # Route to document generation path (existing behavior)
            return await self._handleDocumentGeneration(
                prompt=prompt,
                options=options,
                documentList=documentList,
                contentParts=contentParts,
                outputFormat=outputFormat,
                title=title,
                parentOperationId=parentOperationId
            )
        else:
            # Unknown intent: fail fast instead of silently falling back to documents
            raise ValueError(f"Unsupported generationIntent for DATA_GENERATE: {generationIntent!r}")

    # Other operation types (DATA_ANALYSE, DATA_EXTRACT, etc.) - existing logic
    # ...
```
#### Generation Path Handlers
**New Methods in `mainServiceAi.py`**:
```python
async def _handleCodeGeneration(
    self,
    prompt: str,
    options: AiCallOptions,
    contentParts: Optional[List[ContentPart]],
    outputFormat: str,
    title: str,
    parentOperationId: Optional[str]
) -> AiResponse:
    """Handle code generation using code generation path."""
    from modules.services.serviceGeneration.paths.codePath import CodeGenerationPath

    codePath = CodeGenerationPath(self.services)
    return await codePath.generateCode(
        userPrompt=prompt,
        outputFormat=outputFormat,
        contentParts=contentParts
    )


async def _handleDocumentGeneration(
    self,
    prompt: str,
    options: AiCallOptions,
    documentList: Optional[DocumentReferenceList],
    contentParts: Optional[List[ContentPart]],
    outputFormat: str,
    title: str,
    parentOperationId: Optional[str]
) -> AiResponse:
    """Handle document generation using existing document path."""
    # Existing document generation logic (unchanged)
    # ...
```
#### Action Integration
**Enhancement**: Actions pass an explicit `generationIntent`, so no detection step is involved
**1. Enhance `ai.generateDocument` Action**
**Modify**: `generateDocument.py`
```python
async def generateDocument(self, parameters: Dict[str, Any]) -> ActionResult:
    """Generate documents - explicitly sets intent to 'document'."""
    # ... existing code ...
    aiResponse: AiResponse = await self.services.ai.callAiContent(
        prompt=prompt,
        options=options,
        documentList=docRefList,
        outputFormat=resultType,
        title=title,
        parentOperationId=parentOperationId,
        generationIntent="document"  # NEW: Explicit intent, no detection involved
    )
    # ... rest of method ...
```
**2. Create New `ai.generateCode` Action**
**New File**: `generateCode.py`
```python
@action
async def generateCode(self, parameters: Dict[str, Any]) -> ActionResult:
    """
    Generate code files - explicitly sets intent to 'code'.

    Parameters:
    - prompt (str, required): Description of code to generate
    - documentList (list, optional): Reference documents
    - resultType (str, optional): Output format (html, js, py, etc.). Default: based on prompt
    """
    prompt = parameters.get("prompt")
    if not prompt:
        return ActionResult.isFailure(error="prompt is required")
    documentList = parameters.get("documentList", [])
    resultType = parameters.get("resultType")

    # Auto-detect format from prompt if not provided
    if not resultType:
        promptLower = prompt.lower()
        if ".html" in promptLower or "html file" in promptLower:
            resultType = "html"
        elif ".js" in promptLower or "javascript" in promptLower:
            resultType = "js"
        elif ".py" in promptLower or "python" in promptLower:
            resultType = "py"
        else:
            resultType = "txt"  # Default

    # Prepare title
    title = "Generated Code"

    # Call AI service with explicit code intent
    options = AiCallOptions(
        operationType=OperationTypeEnum.DATA_GENERATE,
        priority=PriorityEnum.BALANCED,
        processingMode=ProcessingModeEnum.DETAILED
    )
    # docRefList (built from documentList) and parentOperationId are prepared
    # the same way as in generateDocument; that setup is elided here
    aiResponse: AiResponse = await self.services.ai.callAiContent(
        prompt=prompt,
        options=options,
        documentList=docRefList,
        outputFormat=resultType,
        title=title,
        parentOperationId=parentOperationId,
        generationIntent="code"  # Explicit intent, no detection involved
    )
    # Convert to ActionResult (same as generateDocument)
    # ...
```
**3. Enhance `ai.process` Action**
**Modify**: `process.py` - Detect image generation from resultType, require generationIntent for DATA_GENERATE
**Important**: Image format detection (png, jpg, etc.) is **format detection**, not intent detection. This is acceptable because it's based on explicit `resultType` parameter, not prompt analysis.
```python
async def process(self, parameters: Dict[str, Any]) -> ActionResult:
    """Universal AI document processing action."""
    # ... existing code ...

    # Detect image generation from resultType (format detection, not intent detection)
    # This is acceptable because resultType is an explicit parameter, not inferred from prompt
    resultType = parameters.get("resultType", "txt")
    normalized_result_type = (str(resultType).strip().lstrip('.').lower() or "txt")
    imageFormats = ["png", "jpg", "jpeg", "gif", "webp"]
    isImageGeneration = normalized_result_type in imageFormats

    # Build options with correct operationType
    output_format = normalized_result_type.replace('.', '') or 'txt'
    options = AiCallOptions(
        resultFormat=output_format,
        operationType=OperationTypeEnum.IMAGE_GENERATE if isImageGeneration else OperationTypeEnum.DATA_GENERATE
    )

    # Get generationIntent from parameters (REQUIRED for DATA_GENERATE)
    generationIntent = parameters.get("generationIntent")

    # For DATA_GENERATE, generationIntent is REQUIRED (no auto-detection, no fallback)
    if options.operationType == OperationTypeEnum.DATA_GENERATE and not generationIntent:
        raise ValueError(
            "ai.process called with DATA_GENERATE but no generationIntent. "
            "Use qualified actions (ai.generateDocument, ai.generateCode) instead, "
            "or explicitly pass generationIntent parameter."
        )

    # ... existing code ...

    # Pass generationIntent to callAiContent (REQUIRED for DATA_GENERATE)
    if contentParts:
        aiResponse = await self.services.ai.callAiContent(
            prompt=aiPrompt,
            options=options,
            contentParts=contentParts,
            outputFormat=output_format,
            parentOperationId=operationId,
            generationIntent=generationIntent  # REQUIRED for DATA_GENERATE
        )
    else:
        aiResponse = await self.services.ai.callAiContent(
            prompt=aiPrompt,
            options=options,
            documentList=documentList,
            outputFormat=output_format,
            parentOperationId=operationId,
            generationIntent=generationIntent  # REQUIRED for DATA_GENERATE
        )
    # ... rest of method ...
```
**Behavior**:
- If `resultType` is image format (png, jpg, etc.) → Sets `operationType=IMAGE_GENERATE`
- For `DATA_GENERATE`: `generationIntent` is REQUIRED (no auto-detection, no fallback)
- If `generationIntent` not provided → Raises ValueError (fail fast)
- **Best Practice**: Use qualified actions (`ai.generateDocument`, `ai.generateCode`) instead of `ai.process`
**Rationale**:
- `ai.process` detects image generation from `resultType` and sets correct operationType
- For DATA_GENERATE, explicit intent is required - no auto-detection, no fallback
- Wrapper actions (`translateDocument`, `summarizeDocument`) must pass explicit `generationIntent`
- Clear use cases - no ambiguity
**4. `ai.translateDocument` and `ai.summarizeDocument` Actions**
**Current**: Both wrap `ai.process()` with specific prompts
**Enhancement**: Pass `generationIntent="document"` when calling `process()` internally
**Modify**: `translateDocument.py` and `summarizeDocument.py`
```python
# In translateDocument.py
processParams = {
    "aiPrompt": aiPrompt,
    "documentList": documentList,
    "generationIntent": "document"  # NEW: Explicit intent
}
if resultType:
    processParams["resultType"] = resultType
return await self.process(processParams)

# In summarizeDocument.py
return await self.process({
    "aiPrompt": aiPrompt,
    "documentList": documentList,
    "resultType": resultType,
    "generationIntent": "document"  # NEW: Explicit intent
})
```
**Summary**:
| Action | generationIntent | Behavior |
|--------|------------------|----------|
| `ai.generateDocument` | `"document"` | Explicit intent, no detection needed ✅ |
| `ai.generateCode` | `"code"` | Explicit intent, no detection needed ✅ |
| `ai.translateDocument` | `"document"` | Explicit intent (via process) ✅ |
| `ai.summarizeDocument` | `"document"` | Explicit intent (via process) ✅ |
| `ai.process` | REQUIRED | Must provide `generationIntent` for DATA_GENERATE, raises error if missing ❌ |
**Benefits**:
- **Efficiency**: Qualified actions state their intent directly, so no AI call is spent on classifying the request
- **Clarity**: Intent is explicit in action name
- **No Ambiguity**: Always clear use case - no auto-detection, no fallback
- **Consistency**: All actions must explicitly define intent
**Critical Requirements**:
- **NO auto-detection**: `callAiContent()` requires explicit `generationIntent` for DATA_GENERATE
- **NO fallback**: No fallback to old processing logic - raises error if intent missing
- **Clear use cases**: Always explicit - no ambiguity
- **Use qualified actions**: Prefer `ai.generateDocument`, `ai.generateCode` over generic `ai.process`
- **Fail fast**: Missing `generationIntent` raises ValueError immediately
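The fail-fast requirement can be sketched in isolation. The function `routeDataGenerate` and the handler table below are hypothetical stand-ins for the routing inside `callAiContent()`; they only illustrate that a missing or unknown intent raises immediately instead of falling back to a default path.

```python
from typing import Callable, Dict, Optional


def routeDataGenerate(
    generationIntent: Optional[str],
    handlers: Dict[str, Callable[[], str]],
) -> str:
    """Dispatch on an explicit intent; never guess and never fall back."""
    if not generationIntent:
        raise ValueError("generationIntent is required for DATA_GENERATE")
    handler = handlers.get(generationIntent)
    if handler is None:
        raise ValueError(f"Unsupported generationIntent: {generationIntent!r}")
    return handler()


# Hypothetical handlers standing in for the document and code paths
handlers = {
    "document": lambda: "document path",
    "code": lambda: "code path",
}

print(routeDataGenerate("code", handlers))
try:
    routeDataGenerate(None, handlers)
except ValueError as err:
    print(f"rejected: {err}")
```

A lookup table keeps the set of accepted intents in one place, so adding a future path means adding one entry rather than another branch.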
---
## Part 2: Generic Looping System
### 2.1 Current State
**Current System**: `subAiCallLooping.py`
- Handles different JSON formats through early detection
- Format-specific routing (elements, chapters, sections)
- Continuation context built for sections (not generic)
- No parametrized configuration
**Issues**:
- Hard-coded format detection
- Continuation context mismatch for different formats
- No accumulation support for all formats
- Not easily extensible for new formats
### 2.2 Proposed Generic Looping System
#### Looping Use Case Configuration
**New Class**: `LoopingUseCase`
```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class LoopingUseCase:
    """Configuration for a specific looping use case."""
    # Identification
    useCaseId: str  # "section_content", "chapter_structure", "code_structure", "code_content"

    # JSON Format Detection
    jsonTemplate: Dict[str, Any]  # Expected JSON structure template
    detectionKeys: List[str]  # Keys to check for format detection (e.g., ["elements"], ["chapters"], ["files"])
    detectionPath: str  # JSONPath to check (e.g., "documents[0].chapters", "files[0].content")

    # Prompt Building
    initialPromptBuilder: Callable  # Function to build initial prompt
    continuationPromptBuilder: Callable  # Function to build continuation prompt

    # Accumulation & Merging
    accumulator: Optional[Callable] = None  # Function to accumulate fragments
    merger: Optional[Callable] = None  # Function to merge accumulated data

    # Continuation Context
    continuationContextBuilder: Optional[Callable] = None  # Build continuation context for this format

    # Result Building
    resultBuilder: Optional[Callable] = None  # Build final result from accumulated data

    # Metadata
    supportsAccumulation: bool = True  # Whether this use case supports accumulation
    requiresExtraction: bool = False  # Whether this requires extraction (like sections)
```
#### Use Case Registry
**New Module**: `gateway/modules/services/serviceAi/subLoopingUseCases.py`
```python
class LoopingUseCaseRegistry:
    """Registry of all looping use cases."""

    def __init__(self):
        self.useCases: Dict[str, LoopingUseCase] = {}
        self._registerDefaultUseCases()

    def register(self, useCase: LoopingUseCase):
        """Register a new use case."""
        self.useCases[useCase.useCaseId] = useCase

    def get(self, useCaseId: str) -> Optional[LoopingUseCase]:
        """Get use case by ID."""
        return self.useCases.get(useCaseId)

    def detectUseCase(self, parsedJson: Dict[str, Any]) -> Optional[str]:
        """Detect which use case matches the JSON structure (first registered match wins)."""
        for useCaseId, useCase in self.useCases.items():
            if self._matchesFormat(parsedJson, useCase):
                return useCaseId
        return None

    def _matchesFormat(self, parsedJson: Dict[str, Any], useCase: LoopingUseCase) -> bool:
        """Check if JSON matches use case format."""
        # Check top-level detection keys
        for key in useCase.detectionKeys:
            if key in parsedJson:
                return True
        # Check nested path
        if useCase.detectionPath:
            try:
                from jsonpath_ng import parse
                jsonpathExpr = parse(useCase.detectionPath)
                matches = [match.value for match in jsonpathExpr.find(parsedJson)]
                if matches:
                    return True
            except Exception:
                # A malformed path or a missing jsonpath_ng install counts as "no match"
                pass
        return False
    def _registerDefaultUseCases(self):
        """Register default use cases."""
        # Use Case 1: Section Content Generation
        self.register(LoopingUseCase(
            useCaseId="section_content",
            jsonTemplate={"elements": []},
            detectionKeys=["elements"],
            detectionPath="",
            initialPromptBuilder=buildSectionContentPrompt,
            continuationPromptBuilder=buildSectionContentContinuationPrompt,
            accumulator=None,  # Direct return, no accumulation
            merger=None,
            continuationContextBuilder=buildSectionContinuationContext,
            resultBuilder=None,  # Return JSON directly
            supportsAccumulation=False,
            requiresExtraction=False
        ))

        # Use Case 2: Chapter Structure Generation
        self.register(LoopingUseCase(
            useCaseId="chapter_structure",
            jsonTemplate={"documents": [{"chapters": []}]},
            detectionKeys=["chapters"],
            detectionPath="documents[0].chapters",
            initialPromptBuilder=buildChapterStructurePrompt,
            continuationPromptBuilder=buildChapterStructureContinuationPrompt,
            accumulator=None,  # Direct return, no accumulation
            merger=None,
            continuationContextBuilder=buildChapterContinuationContext,
            resultBuilder=None,  # Return JSON directly
            supportsAccumulation=False,
            requiresExtraction=False
        ))

        # Use Case 3: the opening of this registration (useCaseId, jsonTemplate,
        # detection and prompt-builder fields) is missing from this copy of the
        # document; only the trailing fields survive:
        #     merger=mergeDocumentSections,
        #     continuationContextBuilder=buildDocumentContinuationContext,
        #     resultBuilder=buildDocumentResultFromSections,
        #     supportsAccumulation=True,
        #     requiresExtraction=True
        # ))

        # Use Case 4: Code Structure Generation (NEW)
        self.register(LoopingUseCase(
            useCaseId="code_structure",
            jsonTemplate={
                "metadata": {
                    "language": "",
                    "projectType": "single_file|multi_file",
                    "projectName": ""
                },
                "files": [
                    {
                        "id": "",
                        "filename": "",
                        "fileType": "",
                        "dependencies": [],  # List of file IDs this file depends on
                        "imports": [],  # List of import statements (for dependency extraction)
                        "functions": [],  # Function signatures for cross-file references
                        "classes": []  # Class definitions for cross-file references
                    }
                ]
            },
            detectionKeys=["files"],
            detectionPath="files",
            initialPromptBuilder=buildCodeStructurePrompt,
            continuationPromptBuilder=buildCodeStructureContinuationPrompt,
            accumulator=None,  # Direct return
            merger=None,
            continuationContextBuilder=buildCodeContinuationContext,
            resultBuilder=None,
            supportsAccumulation=False,
            requiresExtraction=False
        ))

        # Use Case 5: Code Content Generation (NEW)
        self.register(LoopingUseCase(
            useCaseId="code_content",
            jsonTemplate={"files": [{"content": "", "functions": []}]},
            detectionKeys=["content", "functions"],
            detectionPath="files[0].content",
            initialPromptBuilder=buildCodeContentPrompt,
            continuationPromptBuilder=buildCodeContentContinuationPrompt,
            accumulator=accumulateCodeContent,
            merger=mergeCodeContent,
            continuationContextBuilder=buildCodeContentContinuationContext,
            resultBuilder=buildCodeResultFromContent,
            supportsAccumulation=True,
            requiresExtraction=False
        ))

        # A trailing fragment of one more registration survives here without its opening:
        #     resultBuilder=None,
        #     supportsAccumulation=False,
        #     requiresExtraction=False
        # ))
```
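Key-based matching in the registry can be tried out with a stripped-down sketch. `MiniUseCase` and `MiniRegistry` are hypothetical reductions of `LoopingUseCase` and `LoopingUseCaseRegistry` that keep only the top-level key check and leave out the JSONPath lookup. Because dictionaries preserve insertion order, the first registered use case that matches is the one returned.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class MiniUseCase:
    """Reduced use case: only the fields needed for key-based detection."""
    useCaseId: str
    detectionKeys: List[str]


class MiniRegistry:
    """Reduced registry: top-level key matching only, no JSONPath."""

    def __init__(self) -> None:
        self.useCases: Dict[str, MiniUseCase] = {}

    def register(self, useCase: MiniUseCase) -> None:
        self.useCases[useCase.useCaseId] = useCase

    def detectUseCase(self, parsedJson: Dict[str, Any]) -> Optional[str]:
        # Dicts keep insertion order, so the first registered match wins
        for useCaseId, useCase in self.useCases.items():
            if any(key in parsedJson for key in useCase.detectionKeys):
                return useCaseId
        return None


registry = MiniRegistry()
registry.register(MiniUseCase("section_content", ["elements"]))
registry.register(MiniUseCase("code_structure", ["files"]))

print(registry.detectUseCase({"elements": []}))  # section_content
print(registry.detectUseCase({"files": []}))     # code_structure
print(registry.detectUseCase({"unknown": 1}))    # None
```

Since `code_structure` and `code_content` both describe JSON with a top-level `files` key, registration order decides which one structural detection reports. Passing an explicit `useCaseId` avoids that ambiguity.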
#### Refactored Looping System
**Refactor**: `subAiCallLooping.py`
```python
class AiCallLooper:
    """Generic looping system with parametrized use cases."""

    def __init__(self, services, aiService, responseParser):
        self.services = services
        self.aiService = aiService
        self.responseParser = responseParser
        self.useCaseRegistry = LoopingUseCaseRegistry()

    async def callAiWithLooping(
        self,
        prompt: str,
        options: AiCallOptions,
        useCaseId: str,  # REQUIRED: Explicit use case ID
        debugPrefix: str = "ai_call",
        promptArgs: Optional[Dict[str, Any]] = None,
        operationId: Optional[str] = None,
        userPrompt: Optional[str] = None,
        contentParts: Optional[List[ContentPart]] = None
    ) -> str:
        """
        Generic looping system with parametrized use case.

        Args:
            useCaseId: REQUIRED explicit use case ID (e.g., "code_structure", "section_content", "chapter_structure")
            promptArgs: Optional arguments for prompt builders
            ... (other args)
        """
        maxIterations = 50
        iteration = 0
        accumulatedData = {}  # Generic accumulation (replaces allSections)
        lastRawResponse = None

        # Get use case (REQUIRED - no auto-detection)
        useCase = self.useCaseRegistry.get(useCaseId)
        if not useCase:
            raise ValueError(f"Use case '{useCaseId}' not found in registry. Available use cases: {list(self.useCaseRegistry.useCases.keys())}")

        while iteration < maxIterations:
            iteration += 1
            # Build prompt using use case
            if iteration == 1:
                # Initial prompt
                currentPrompt = useCase.initialPromptBuilder(
                    prompt=prompt,
                    **(promptArgs or {})
                )
            else:
                # Continuation prompt
                continuationContext = None
                if useCase.continuationContextBuilder:
                    continuationContext = useCase.continuationContextBuilder(
                        accumulatedData,
                        lastRawResponse
                    )
                currentPrompt = useCase.continuationPromptBuilder(
                    prompt=prompt,
                    continuationContext=continuationContext,
                    **(promptArgs or {})
                )

            # Make AI call
            result = await self._makeAiCall(currentPrompt, options, iteration, operationId, debugPrefix)
            lastRawResponse = result

            # Process response based on use case
            processedResult, isComplete, shouldContinue = await self._processUseCaseResponse(
                result,
                useCase,
                accumulatedData,
                iteration,
                debugPrefix
            )
            if not shouldContinue:
                return processedResult

        # Max iterations reached
        logger.warning(f"Max iterations ({maxIterations}) reached")
        return accumulatedData.get("finalResult", lastRawResponse)
    async def _processUseCaseResponse(
        self,
        result: str,
        useCase: LoopingUseCase,
        accumulatedData: Dict[str, Any],
        iteration: int,
        debugPrefix: str
    ) -> Tuple[str, bool, bool]:
        """Process response according to use case configuration.

        Returns (processedResult, isComplete, shouldContinue).
        """
        # Parse JSON
        extractedJson = extractJsonString(result)
        parsedJson, parseError, _ = tryParseJson(extractedJson)
        if parseError:
            # JSON parsing failed - continue
            return result, False, True

        # Check if use case requires extraction
        if useCase.requiresExtraction:
            # Extract data (e.g., sections from document structure)
            extracted = self._extractData(parsedJson, useCase)
            accumulatedData.setdefault("extracted", []).extend(extracted)

        # Check completeness
        isComplete = self._isJsonComplete(parsedJson, useCase)

        # Accumulate if supported
        if useCase.supportsAccumulation and useCase.accumulator:
            # Update in place: rebinding the local name would hide the
            # accumulated state from the caller's dict
            accumulatedData.update(useCase.accumulator(accumulatedData, parsedJson, iteration))

        # Merge if supported
        if useCase.merger and accumulatedData.get("extracted"):
            accumulatedData["merged"] = useCase.merger(accumulatedData["extracted"], iteration)

        # Build result if complete
        if isComplete:
            if useCase.resultBuilder:
                finalResult = useCase.resultBuilder(accumulatedData, useCase)
            else:
                # Direct return
                finalResult = json.dumps(parsedJson, indent=2, ensure_ascii=False)
            accumulatedData["finalResult"] = finalResult
            return finalResult, True, False

        # Not complete - continue
        return result, False, True
```
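The accumulate-and-continue pattern behind the looper can be exercised without any AI backend. In the sketch below, `fakeAiCall`, `loopUntilComplete`, and the `complete` flag are hypothetical: the real system decides completeness through `_isJsonComplete` and builds continuation prompts through the use case's builders.

```python
import asyncio
import json
from typing import Any, Dict


async def fakeAiCall(prompt: str, iteration: int) -> str:
    """Stand-in for the AI call: one file per response, the second response is final."""
    chunks = [
        {"files": [{"filename": "app.py"}], "complete": False},
        {"files": [{"filename": "utils.py"}], "complete": True},
    ]
    # Clamp the index so extra iterations keep returning the final chunk
    return json.dumps(chunks[min(iteration, len(chunks)) - 1])


async def loopUntilComplete(prompt: str, maxIterations: int = 5) -> Dict[str, Any]:
    """Call the model repeatedly, accumulating fragments until one is marked complete."""
    accumulated: Dict[str, Any] = {"files": []}
    for iteration in range(1, maxIterations + 1):
        if iteration == 1:
            currentPrompt = prompt
        else:
            # Continuation prompt: tell the model where the previous fragment stopped
            lastFile = accumulated["files"][-1]["filename"]
            currentPrompt = f"{prompt}\nContinue after: {lastFile}"
        parsed = json.loads(await fakeAiCall(currentPrompt, iteration))
        accumulated["files"].extend(parsed["files"])
        if parsed["complete"]:
            break
    return accumulated


result = asyncio.run(loopUntilComplete("Generate a small Flask project"))
print([f["filename"] for f in result["files"]])  # ['app.py', 'utils.py']
```

The iteration cap plays the same role as `maxIterations` in `callAiWithLooping`: a response stream that never reports completion still terminates.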
---
## Part 3: Multiple Generation Paths
### 3.1 Current State
**Current**: Single document generation path in `serviceGeneration`
**Structure**:
```
serviceGeneration/
├── mainServiceGeneration.py
├── subStructureGeneration.py (chapter structure)
├── subStructureFilling.py (section structure + content)
└── renderers/ (document rendering)
```
### 3.2 Proposed Multi-Path Architecture
#### Enhanced Generation Service Structure
```
serviceGeneration/
├── mainServiceGeneration.py # Main entry point, routes by intent
├── paths/
│ ├── documentPath.py # Document generation path
│ ├── codePath.py # Code generation path (NEW)
│ ├── imagePath.py # Image generation path (NEW)
│ ├── videoPath.py # Video generation path (FUTURE)
│ └── audioPath.py # Audio generation path (FUTURE)
├── shared/
│ ├── subStructureGeneration.py # Shared structure generation (if applicable)
│ ├── subContentGeneration.py # Shared content generation (if applicable)
│ └── subPromptBuilder.py # Shared prompt builders
└── renderers/ # Format-specific renderers
├── document/ # Document renderers (existing)
├── code/ # Code renderers (NEW)
└── image/ # Image renderers (NEW)
```
#### Main Service Entry Point
**Refactor**: `mainServiceGeneration.py`
```python
class GenerationService:
    """Main generation service with multiple paths."""

    def __init__(self, services):
        self.services = services
        self.documentPath = DocumentGenerationPath(services)
        self.codePath = CodeGenerationPath(services)
        self.imagePath = ImageGenerationPath(services)
        # Future: videoPath, audioPath

    async def generate(
        self,
        userPrompt: str,
        generationIntent: str,  # "document" | "code" | "image" (explicit, supplied by the action)
        documentList: Optional[DocumentReferenceList] = None,
        contentParts: Optional[List[ContentPart]] = None,
        outputFormat: Optional[str] = None,
        **kwargs
    ) -> AiResponse:
        """
        Main entry point - routes to appropriate generation path.

        Args:
            generationIntent: Explicit intent passed down from the AI service level ("document" | "code" | "image")

        Returns: AiResponse with documents list (unified format)
        """
        # Route to appropriate path based on generationIntent
        if generationIntent == "code":
            return await self.codePath.generateCode(
                userPrompt=userPrompt,
                contentParts=contentParts,
                outputFormat=outputFormat,
                **kwargs
            )
        elif generationIntent == "image":
            return await self.imagePath.generateImages(
                userPrompt=userPrompt,
                outputFormat=outputFormat,
                **kwargs
            )
        elif generationIntent == "document":
            return await self.documentPath.generateDocument(
                userPrompt=userPrompt,
                documentList=documentList,
                contentParts=contentParts,
                outputFormat=outputFormat,
                **kwargs
            )
        # Future paths...
        else:
            raise ValueError(f"Unsupported generationIntent: {generationIntent}")
```
#### Document Generation Path (Existing, Refactored)
**File**: `paths/documentPath.py`
```python
class DocumentGenerationPath:
    """Document generation path (existing functionality, refactored)."""

    def __init__(self, services):
        self.services = services

    async def generateDocument(
        self,
        userPrompt: str,
        documentList: Optional[DocumentReferenceList] = None,
        contentParts: Optional[List[ContentPart]] = None,  # used in Phase 1 below
        outputFormat: str = "txt",
        **kwargs
    ) -> AiResponse:
        """
        Generate document using existing chapter/section model.

        Returns: AiResponse with documents list
        """
        # Phase 1: Chapter structure generation (with looping)
        chapterStructure = await self._generateChapterStructure(
            userPrompt=userPrompt,
            contentParts=contentParts,
            outputFormat=outputFormat
        )
        # Phase 2: Section structure generation (parallel)
        sectionStructure = await self._generateSectionStructures(chapterStructure)
        # Phase 3: Content generation (with looping, parallel)
        filledStructure = await self._generateContent(sectionStructure)
        # Phase 4: Rendering
        renderedDocuments = await self._renderDocuments(filledStructure, outputFormat)

        # Return unified format
        # Assumption: title and filename arrive via kwargs (they were undefined in the original snippet)
        return AiResponse(
            documents=renderedDocuments,
            content=None,
            metadata=AiResponseMetadata(title=kwargs.get("title"), filename=kwargs.get("filename"))
        )
```
#### Code Generation Path (NEW)
**File**: `paths/codePath.py`
```python
class CodeGenerationPath:
    """Code generation path."""

    def __init__(self, services):
        # Matches the CodeGenerationPath(self.services) call in mainServiceAi.py
        self.services = services

    async def generateCode(
        self,
        userPrompt: str,
        language: Optional[str] = None,
        fileTypes: Optional[List[str]] = None,
        projectType: str = "single_file",
        outputFormat: Optional[str] = None,
        **kwargs
    ) -> AiResponse:
        """
        Generate code files.

        Returns: AiResponse with code files as documents
        """
        # Phase 1: Code structure generation (with looping)
        codeStructure = await self._generateCodeStructure(
            userPrompt=userPrompt,
            language=language,
            fileTypes=fileTypes,
            projectType=projectType
        )
        # Phase 2: Code content generation (with looping, parallel per file)
        codeFiles = await self._generateCodeContent(codeStructure)
        # Phase 3: Code formatting & validation
        formattedFiles = await self._formatAndValidateCode(codeFiles)

        # Convert to unified document format
        documents = []
        for file in formattedFiles:
            documents.append(DocumentData(
                documentName=file["filename"],
                documentData=file["content"].encode('utf-8'),
                mimeType=self._getMimeType(file["fileType"]),
                sourceJson=file
            ))
        return AiResponse(
            documents=documents,
            content=None,
            metadata=AiResponseMetadata(title="Generated Code", filename=None)
        )

    async def _generateCodeStructure(
        self,
        userPrompt: str,
        language: str,
        fileTypes: List[str],
        projectType: str
    ) -> Dict[str, Any]:
        """Generate code structure using looping system."""
        prompt = buildCodeStructurePrompt(
            userPrompt=userPrompt,
            language=language,
            fileTypes=fileTypes,
            projectType=projectType
        )
        # Use generic looping system with code_structure use case
        structureJson = await self.services.ai._callAiWithLooping(
            prompt=prompt,
            options=AiCallOptions(operationType=OperationTypeEnum.DATA_GENERATE),
            useCaseId="code_structure",  # Use parametrized use case
            debugPrefix="code_structure_generation",
            promptArgs={
                "userPrompt": userPrompt,
                "language": language,
                "fileTypes": fileTypes
            }
        )
        return json.loads(structureJson)

    async def _generateCodeContent(
        self,
        codeStructure: Dict[str, Any]
    ) -> List[Dict[str, Any]]:
        """Generate code content for each file with dependency handling."""
        files = codeStructure.get("files", [])
        metadata = codeStructure.get("metadata", {})

        # Step 1: Resolve dependency order
        orderedFiles = self._resolveDependencyOrder(files)

        # Step 2: Generate dependency files first (requirements.txt, package.json, etc.)
        dependencyFiles = await self._generateDependencyFiles(metadata, orderedFiles)

        # Step 3: Generate code files in dependency order (not fully parallel)
        codeFiles = []
        generatedFileContext = {}  # Track what's been generated for cross-file references
        for fileStructure in orderedFiles:
            # Provide context about already-generated files for proper imports
            fileContext = self._buildFileContext(generatedFileContext, fileStructure)
            # Generate this file with context
            fileContent = await self._generateSingleFileContent(
                fileStructure,
                fileContext=fileContext,
                allFilesStructure=orderedFiles
            )
            codeFiles.append(fileContent)
            # Update context with generated file info (for next files)
            generatedFileContext[fileStructure["id"]] = {
                "filename": fileContent.get("filename"),
                "functions": fileContent.get("functions", []),
                "classes": fileContent.get("classes", []),
                "exports": fileContent.get("exports", [])
            }

        # Combine dependency files and code files
        return dependencyFiles + codeFiles

    def _resolveDependencyOrder(self, files: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """Resolve file generation order based on dependencies."""
        # Build dependency graph
        fileMap = {f["id"]: f for f in files}
        dependencies = {}
        for file in files:
            fileId = file["id"]
            deps = file.get("dependencies", [])  # List of file IDs this file depends on
            dependencies[fileId] = deps

        # Topological sort (depth-first): a file is appended only after all its dependencies
        ordered = []
        visited = set()
        tempMark = set()  # Files on the current DFS path, used to spot cycles

        def visit(fileId: str):
            if fileId in tempMark:
                # Circular dependency detected - break it
                logger.warning(f"Circular dependency detected involving {fileId}")
                return
            if fileId in visited:
                return
            tempMark.add(fileId)
            for depId in dependencies.get(fileId, []):
                if depId in fileMap:
                    visit(depId)
            tempMark.remove(fileId)
            visited.add(fileId)
            ordered.append(fileMap[fileId])

        for file in files:
            if file["id"] not in visited:
                visit(file["id"])
        return ordered

    async def _generateDependencyFiles(
        self,
        metadata: Dict[str, Any],
        files: List[Dict[str, Any]]
    ) -> List[Dict[str, Any]]:
        """Generate dependency files (requirements.txt, package.json, etc.)."""
        language = metadata.get("language", "").lower()
        dependencyFiles = []

        # Extract all dependencies from files
        allDependencies = set()
        for file in files:
            fileDeps = file.get("dependencies", [])
            if isinstance(fileDeps, list):
                allDependencies.update(fileDeps)

        # Generate requirements.txt for Python
        if language in ["python", "py"]:
            requirementsContent = await self._generateRequirementsTxt(files, allDependencies)
            if requirementsContent:
                dependencyFiles.append({
                    "filename": "requirements.txt",
                    "content": requirementsContent,
                    "fileType": "txt",
                    "id": "requirements_txt"
                })
        # Generate package.json for JavaScript/TypeScript
        elif language in ["javascript", "typescript", "js", "ts"]:
            packageJson = await self._generatePackageJson(files, allDependencies, metadata)
            if packageJson:
                dependencyFiles.append({
                    "filename": "package.json",
                    "content": json.dumps(packageJson, indent=2),
                    "fileType": "json",
                    "id": "package_json"
                })
        return dependencyFiles
    async def _generateRequirementsTxt(
        self,
        files: List[Dict[str, Any]],
        dependencies: set
    ) -> Optional[str]:
        """Generate requirements.txt content, or None if no packages are found."""
        # Extract Python imports from file structures
        pythonPackages = set()
        for file in files:
            imports = file.get("imports", [])
            if not isinstance(imports, list):
                continue
            for imp in imports:
                if not isinstance(imp, str):
                    continue
                # Simple extraction - can be enhanced.
                # Handles "from flask import Flask" and "import requests as r".
                # Note: stdlib and project-local modules are not filtered out here.
                tokens = imp.split()
                if len(tokens) < 2 or tokens[0] not in ("from", "import"):
                    continue
                module = tokens[1].rstrip(",")
                if module.startswith("."):
                    continue  # Relative import, not an installable package
                # Keep only the top-level package (e.g., "flask.views" -> "flask")
                pythonPackages.add(module.split(".")[0])

        # Generate requirements.txt
        if pythonPackages:
            return "\n".join(sorted(pythonPackages))
        return None

    async def _generatePackageJson(
        self,
        files: List[Dict[str, Any]],
        dependencies: set,
        metadata: Dict[str, Any]
    ) -> Optional[Dict[str, Any]]:
        """Generate package.json content, or None if no packages are found."""
        # Extract npm packages from file structures
        npmPackages = {}
        for file in files:
            imports = file.get("imports", [])
            if not isinstance(imports, list):
                continue
            for imp in imports:
                if not isinstance(imp, str):
                    continue
                # Simple extraction - can be enhanced
                package = None
                if "require(" in imp:
                    # e.g., "const express = require('express')" -> "express"
                    package = imp.split("require(", 1)[1].split(")", 1)[0]
                elif " from " in imp:
                    # e.g., "import express from 'express';" -> "express"
                    package = imp.rsplit(" from ", 1)[1]
                if package:
                    # Drop a trailing semicolon and the surrounding quotes
                    package = package.strip().rstrip(";").strip().strip("'\"")
                if package and not package.startswith("."):
                    npmPackages[package] = "*"  # Default version

        if npmPackages:
            return {
                "name": metadata.get("projectName", "generated-project"),
                "version": "1.0.0",
                "dependencies": npmPackages
            }
        return None

    def _buildFileContext(
        self,
        generatedFileContext: Dict[str, Dict[str, Any]],
        currentFile: Dict[str, Any]
    ) -> Dict[str, Any]:
        """Build context about other files for proper imports/references."""
        context = {
            "availableFiles": [],
            "availableFunctions": {},
            "availableClasses": {}
        }

        # Add info about already-generated files
        for fileId, fileInfo in generatedFileContext.items():
            context["availableFiles"].append({
                "id": fileId,
                "filename": fileInfo["filename"],
                "functions": fileInfo.get("functions", []),
                "classes": fileInfo.get("classes", []),
                "exports": fileInfo.get("exports", [])
            })

            # Build function/class maps for easy lookup
            for func in fileInfo.get("functions", []):
                funcName = func.get("name", "")
                if funcName:
                    context["availableFunctions"][funcName] = {
                        "file": fileInfo["filename"],
                        "signature": func.get("signature", "")
                    }
            for cls in fileInfo.get("classes", []):
                className = cls.get("name", "")
                if className:
                    context["availableClasses"][className] = {
                        "file": fileInfo["filename"]
                    }
        return context

    async def _generateSingleFileContent(
        self,
        fileStructure: Dict[str, Any],
        fileContext: Optional[Dict[str, Any]] = None,
        allFilesStructure: Optional[List[Dict[str, Any]]] = None
    ) -> Dict[str, Any]:
        """Generate code content for a single file with context about other files."""
        # Build prompt with context about other files for proper imports
        prompt = buildCodeContentPrompt(
            fileStructure,
            fileContext=fileContext,
            allFilesStructure=allFilesStructure
        )

        # Use generic looping system with code_content use case
        contentJson = await self.services.ai._callAiWithLooping(
            prompt=prompt,
            options=AiCallOptions(operationType=OperationTypeEnum.DATA_GENERATE),
            useCaseId="code_content",  # Use parametrized use case
            debugPrefix=f"code_content_{fileStructure['id']}",
            promptArgs={
                "fileStructure": fileStructure,
                "fileContext": fileContext,
                "allFilesStructure": allFilesStructure
            }
        )
        parsed = json.loads(contentJson)

        # Extract function/class info for context building.
        # Guard against a missing or empty "files" list (avoids an IndexError).
        firstFile = (parsed.get("files") or [{}])[0]
        parsed["functions"] = firstFile.get("functions", [])
        parsed["classes"] = firstFile.get("classes", [])
        return parsed
```
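To make the ordering behaviour concrete, here is a minimal standalone sketch of the same depth-first topological sort, run on a hypothetical three-file project. The function name `resolve_order` and the sample file IDs are illustrative assumptions, not part of the codebase.

```python
from typing import Any, Dict, List


def resolve_order(files: List[Dict[str, Any]]) -> List[str]:
    """Return file IDs so that each file comes after the files it depends on."""
    file_map = {f["id"]: f for f in files}
    ordered: List[str] = []
    visited = set()
    in_progress = set()  # IDs on the current DFS path, used to detect cycles

    def visit(file_id: str) -> None:
        if file_id in visited or file_id in in_progress:
            return  # already emitted, or a cycle: skip the back edge
        in_progress.add(file_id)
        for dep_id in file_map[file_id].get("dependencies", []):
            if dep_id in file_map:  # ignore references to unknown files
                visit(dep_id)
        in_progress.remove(file_id)
        visited.add(file_id)
        ordered.append(file_id)

    for f in files:
        visit(f["id"])
    return ordered


# Hypothetical project: main imports utils and models; models imports utils
files = [
    {"id": "main", "dependencies": ["utils", "models"]},
    {"id": "models", "dependencies": ["utils"]},
    {"id": "utils", "dependencies": []},
]
print(resolve_order(files))  # ['utils', 'models', 'main']
```

Because `utils` has no dependencies it is generated first, so its functions and classes are already in the context when `models` and `main` are generated. In a cycle, the back edge is skipped, so every file still appears exactly once.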
#### Image Generation Path (NEW)
**File**: `paths/imagePath.py`
```python
import asyncio
import base64
from typing import List, Optional


class ImageGenerationPath:
    """Image generation path."""

    async def generateImages(
        self,
        userPrompt: str,
        count: int = 1,
        style: Optional[str] = None,
        format: str = "png",
        **kwargs
    ) -> AiResponse:
        """
        Generate image files.

        Returns: AiResponse with image files as documents
        """
        # Phase 1: Image prompt generation (if multiple images)
        if count > 1:
            imagePrompts = await self._generateImagePrompts(userPrompt, count, style)
        else:
            imagePrompts = [userPrompt]

        # Phase 2: Generate images (parallel)
        images = await self._generateImagesParallel(imagePrompts, format)

        # "jpg" is a file extension; the registered MIME subtype is "jpeg"
        mimeSubtype = "jpeg" if format.lower() == "jpg" else format.lower()

        # Convert to unified document format
        documents = []
        for i, imageData in enumerate(images):
            documents.append(DocumentData(
                documentName=f"image_{i+1}.{format}",
                documentData=imageData,  # Already bytes
                mimeType=f"image/{mimeSubtype}",
                sourceJson={"prompt": imagePrompts[i], "index": i}
            ))

        return AiResponse(
            documents=documents,
            content=None,
            metadata=AiResponseMetadata(title="Generated Images", filename=None)
        )

    async def _generateImagesParallel(
        self,
        imagePrompts: List[str],
        format: str
    ) -> List[bytes]:
        """Generate multiple images in parallel (results keep the prompt order)."""
        tasks = [self._generateSingleImage(prompt, format) for prompt in imagePrompts]
        return list(await asyncio.gather(*tasks))

    async def _generateSingleImage(
        self,
        prompt: str,
        format: str
    ) -> bytes:
        """Generate a single image."""
        # Use IMAGE_GENERATE operation
        request = AiCallRequest(
            prompt=prompt,
            options=AiCallOptions(
                operationType=OperationTypeEnum.IMAGE_GENERATE,
                resultFormat="base64"
            )
        )
        response = await self.services.ai.callAi(request)

        # Decode base64 to bytes
        return base64.b64decode(response.content)
```
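`asyncio.gather` starts every image request at once, which may run into provider rate limits for larger batches. One possible refinement, sketched here under assumptions, caps the number of concurrent calls with an `asyncio.Semaphore`. The names `generate_bounded` and `fake_generate_image` are hypothetical; the latter only stands in for the real `IMAGE_GENERATE` call.

```python
import asyncio
from typing import Awaitable, Callable, List


async def generate_bounded(
    prompts: List[str],
    generate: Callable[[str], Awaitable[bytes]],
    max_concurrent: int = 3,
) -> List[bytes]:
    """Run `generate` for every prompt, at most `max_concurrent` at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(prompt: str) -> bytes:
        async with semaphore:
            return await generate(prompt)

    # gather preserves input order, so images[i] corresponds to prompts[i]
    return list(await asyncio.gather(*(run_one(p) for p in prompts)))


async def fake_generate_image(prompt: str) -> bytes:
    """Stand-in for the real image call (assumption for this sketch)."""
    await asyncio.sleep(0.01)
    return prompt.encode("utf-8")


images = asyncio.run(generate_bounded(["a cat", "a dog"], fake_generate_image))
```

The order guarantee matters because `generateImages` pairs `images[i]` with `imagePrompts[i]` when it builds `sourceJson`.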
---
## Part 4: Unified Document Output
### 4.1 Current State
**Current State**: ✅ All actions already return unified `ActionResult` format with `ActionDocument` objects
**Note**: The unification needed is at the **AI Service level** (`AiResponse`), not at the action level. Actions already convert `AiResponse` to `ActionResult` consistently.
### 4.2 AI Service Level Format
**Current**: ✅ All AI service paths already return unified `AiResponse` format
**Format** (already exists):
```python
@dataclass
class DocumentData:
    """Unified document data structure (already exists)."""
    documentName: str  # Filename
    documentData: bytes  # File content (bytes)
    mimeType: str  # MIME type (e.g., "text/html", "image/png", "application/pdf")
    sourceJson: Optional[Dict[str, Any]] = None  # Source JSON structure (if applicable)


@dataclass
class AiResponse:
    """Unified AI response format (already exists)."""
    documents: List[DocumentData]  # List of generated documents
    content: Optional[str] = None  # Optional text content
    metadata: Optional[AiResponseMetadata] = None
```
**Requirement**: Ensure all new generation paths (code, image) return `AiResponse` in this format (same as document path)
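
For the code path, this requirement amounts to wrapping each generated file in a `DocumentData`. The following is a minimal sketch, assuming generated files are dicts with `filename` and `content` keys (the shape used for the dependency files). The `DocumentData` class here is a local stand-in, and `code_files_to_documents` is a hypothetical helper name.

```python
import mimetypes
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class DocumentData:
    """Local stand-in mirroring the fields listed above."""
    documentName: str
    documentData: bytes
    mimeType: str
    sourceJson: Optional[Dict[str, Any]] = None


def code_files_to_documents(code_files: List[Dict[str, Any]]) -> List[DocumentData]:
    """Wrap generated files ({"filename", "content"}) as DocumentData objects."""
    documents = []
    for code_file in code_files:
        filename = code_file["filename"]
        guessed, _ = mimetypes.guess_type(filename)
        documents.append(DocumentData(
            documentName=filename,
            documentData=code_file["content"].encode("utf-8"),
            mimeType=guessed or "text/plain",  # fall back for unknown extensions
            sourceJson=code_file,
        ))
    return documents


docs = code_files_to_documents([
    {"filename": "requirements.txt", "content": "flask\n"},
    {"filename": "package.json", "content": "{}"},
])
```

Encoding to bytes at this boundary keeps `documentData` uniform across the document, code, and image paths.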
### 4.3 Action Result Integration
**Current**: ✅ All actions already convert `AiResponse` to `ActionResult` consistently
**Pattern** (already implemented in all actions):
```python
# All actions follow this pattern (existing code):
async def execute(self, parameters: Dict[str, Any]) -> ActionResult:
    # Call AI service - returns AiResponse
    aiResponse = await self.services.ai.callAiContent(...)

    # Convert AiResponse to ActionDocument (unified format)
    documents = []
    for docData in aiResponse.documents:
        documents.append(ActionDocument(
            documentName=docData.documentName,
            documentData=docData.documentData,
            mimeType=docData.mimeType,
            sourceJson=docData.sourceJson
        ))

    return ActionResult.isSuccess(documents=documents)  # ✅ Already unified
```
**Note**:
- ✅ Actions already return unified `ActionResult` format
- ✅ No changes needed at action level
- ✅ Focus: Ensure new AI service paths (code, image) return `AiResponse` consistently
---
## Part 5: Implementation Plan
### Phase 1: Foundation (Weeks 1-2)
1. **Explicit Intent Requirement at AI Service Level**
- **Note**: No `IntentDetector` class needed - intent comes explicitly from actions
- Integrate `generationIntent` parameter into `callAiContent()` method
- Add `_handleCodeGeneration()` and `_handleDocumentGeneration()` methods
- Update `ai.process` to detect image formats from `resultType` (format detection, not intent detection)
- Require explicit `generationIntent` for all `DATA_GENERATE` operations
- Test with various actions (generateDocument, generateCode, process)
- Verify `IMAGE_GENERATE` still works correctly (no changes)
2. **Generic Looping System**
- Create `LoopingUseCase` dataclass
- Create `LoopingUseCaseRegistry`
- Register existing use cases (section_content, chapter_structure, code_structure)
- Refactor `subAiCallLooping.py` to use registry
### Phase 2: Code Generation (Weeks 3-4)
1. **Code Generation Path**
- Create `paths/codePath.py`
- Implement code structure generation
- Implement code content generation
- Register code use cases in looping registry
- Create `ai.generateCode` action
2. **Integration**
- Integrate code path into `mainServiceGeneration.py`
- Test code generation end-to-end
- Validate code output quality
### Phase 3: Image Generation (Weeks 5-6)
1. **Image Generation Path**
- Create `paths/imagePath.py`
- Implement standalone image generation
- Support batch image generation
- Register image use cases in looping registry
- Create `ai.generateImages` action
2. **Integration**
- Integrate image path into `mainServiceGeneration.py`
- Test image generation end-to-end
- Validate image output quality
### Phase 4: Refinement (Weeks 7-8)
1. **Unified Output**
- Ensure all paths return unified `AiResponse` format
- Standardize action result handling
- Test cross-path compatibility
2. **Documentation & Testing**
- Document all use cases
- Add unit tests for looping system
- Add integration tests for each path
- Performance testing
---
## Part 6: Migration Strategy
### Clean Implementation
1. **No Legacy Code**: Remove old prompt builder parameters completely
2. **Clear Use Cases**: All calls must specify explicit `useCaseId`
3. **No Fallback**: Fail fast if use case not found or intent missing
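
The fail-fast rule can be illustrated with a registry lookup that raises instead of returning a default. This is only a sketch: the `LoopingUseCase` shown here is a reduced, hypothetical stand-in whose fields may differ from the real dataclass, and the method names `register` and `get` are assumptions.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class LoopingUseCase:
    """Hypothetical, reduced stand-in for the real use case configuration."""
    useCaseId: str
    description: str = ""


class LoopingUseCaseRegistry:
    """Minimal registry that raises instead of falling back to a default."""

    def __init__(self) -> None:
        self._useCases: Dict[str, LoopingUseCase] = {}

    def register(self, useCase: LoopingUseCase) -> None:
        if useCase.useCaseId in self._useCases:
            raise ValueError(f"Use case already registered: {useCase.useCaseId}")
        self._useCases[useCase.useCaseId] = useCase

    def get(self, useCaseId: str) -> LoopingUseCase:
        try:
            return self._useCases[useCaseId]
        except KeyError:
            known = ", ".join(sorted(self._useCases)) or "<none>"
            raise KeyError(
                f"Unknown useCaseId '{useCaseId}'. Registered: {known}"
            ) from None


registry = LoopingUseCaseRegistry()
registry.register(LoopingUseCase("code_content"))
```

Listing the registered IDs in the error message makes a misspelled `useCaseId` easy to spot during migration.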
### Testing Strategy
1. **Unit Tests**: Test each use case independently
2. **Integration Tests**: Test full generation flows
3. **Use Case Tests**: Test all registered use cases
4. **Performance Tests**: Compare performance before/after
---
## Part 7: Future Extensions
### Video Generation Path (Future)
- Similar structure to image path
- Video structure planning (scenes, transitions)
- Frame-by-frame generation
- Video encoding
### Audio Generation Path (Future)
- Similar structure to image path
- Text-to-speech generation
- Music generation
- Audio file output
### Additional Use Cases
- Easy to add new use cases to registry
- Just register new `LoopingUseCase` configuration
- No changes to core looping system needed
---
## Part 8: Critical Cross-Check
### 8.1 Codebase Verification
**✅ Multiple Files Support**:
- Current system already supports multiple documents via `renderReport()` → returns `List[RenderedDocument]`
- HTML renderer creates multiple files (HTML + images) as separate documents
- Code generation path enhanced to generate multiple code files + dependency files
**✅ Code Generation Intelligence**:
1. **Dependency Handling**:
- ✅ Code structure includes `dependencies` field (list of file IDs)
- ✅ `_resolveDependencyOrder()` implements topological sort for proper generation order
- ✅ Handles circular dependencies gracefully
- ✅ Files generated sequentially based on dependencies (not fully parallel)
2. **Requirements/Dependencies Files**:
- ✅ `_generateDependencyFiles()` generates:
- `requirements.txt` for Python projects (extracts packages from imports)
- `package.json` for JavaScript/TypeScript projects (extracts npm packages)
- ✅ Dependency files generated BEFORE code files
- ✅ Extracts dependencies from file structures' `imports` field
3. **Cross-File References**:
- ✅ `_buildFileContext()` provides context about already-generated files
- ✅ Tracks functions, classes, and exports from each file
- ✅ Context passed to each file generation for proper imports
- ✅ `fileContext` includes:
- Available files and their exports
- Function signatures for proper imports
- Class definitions for proper imports
4. **File Structure Template**:
```json
{
  "metadata": {
    "language": "python|javascript|typescript",
    "projectType": "single_file|multi_file",
    "projectName": "..."
  },
  "files": [
    {
      "id": "file_1",
      "filename": "main.py",
      "fileType": "py",
      "dependencies": ["file_2"],  // File IDs this depends on
      "imports": ["from utils import helper"],  // For dependency extraction
      "functions": [{"name": "main", "signature": "..."}],
      "classes": [{"name": "MyClass", "signature": "..."}]
    }
  ]
}
```
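
Since generation order depends on the `id` and `dependencies` fields, it may be worth checking a structure before generating any code. The sketch below is one way to do that; `validate_structure` is a hypothetical helper and the sample structure is illustrative only.

```python
from typing import Any, Dict, List


def validate_structure(structure: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means no issue was found."""
    problems: List[str] = []
    files = structure.get("files", [])

    # Every file needs a unique, non-empty id
    seen = set()
    for f in files:
        file_id = f.get("id")
        if not file_id:
            problems.append("file without id")
            continue
        if file_id in seen:
            problems.append(f"duplicate id: {file_id}")
        seen.add(file_id)

    # Every dependency must point at another file in the structure
    for f in files:
        for dep_id in f.get("dependencies", []):
            if dep_id not in seen:
                problems.append(f"{f.get('id')}: unknown dependency {dep_id}")
            elif dep_id == f.get("id"):
                problems.append(f"{dep_id}: depends on itself")
    return problems


structure = {
    "metadata": {"language": "python", "projectType": "multi_file"},
    "files": [
        {"id": "file_1", "filename": "main.py", "dependencies": ["file_2", "file_9"]},
        {"id": "file_2", "filename": "utils.py", "dependencies": []},
    ],
}
print(validate_structure(structure))  # ['file_1: unknown dependency file_9']
```

Returning a list of problems rather than raising on the first one lets the caller report every issue in a single pass.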
### 8.2 Architecture Validation
**✅ Smart Enough for Multi-File Projects**:
- ✅ Dependency resolution ensures proper order
- ✅ Requirements.txt/package.json automatically generated
- ✅ Cross-file context enables proper imports/references
- ✅ Function/class tracking enables accurate references
- ✅ Sequential generation with context accumulation
**✅ Current Codebase Compatibility**:
- ✅ Uses existing `List[RenderedDocument]` pattern
- ✅ Follows existing `AiResponse` → `ActionResult` conversion
- ✅ Compatible with existing document processing pipeline
- ✅ No breaking changes to existing document generation
**✅ Potential Enhancements** (Future):
- More sophisticated import parsing (AST-based)
- Support for more dependency file types (Cargo.toml, go.mod, etc.)
- Parallel generation of independent files (files without dependencies)
- Validation of imports against generated files
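
As an illustration of the AST-based parsing mentioned above, the following sketch uses Python's standard `ast` module to collect third-party package names from source code. The function name `extract_third_party_imports` and the `local_modules` parameter are assumptions for this example; `sys.stdlib_module_names` exists only on Python 3.10 and later, so the sketch falls back to an empty set on older versions.

```python
import ast
import sys
from typing import Set


def extract_third_party_imports(source: str, local_modules: Set[str]) -> Set[str]:
    """Collect top-level imported names, skipping stdlib, relative and local modules."""
    stdlib = getattr(sys, "stdlib_module_names", frozenset())  # Python 3.10+
    packages: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]  # level == 0 excludes relative imports
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level not in stdlib and top_level not in local_modules:
                packages.add(top_level)
    return packages


source = """
import os, requests
from flask import Flask
from . import helpers
from utils import helper
"""
print(sorted(extract_third_party_imports(source, local_modules={"utils"})))
```

On Python 3.10 or later this prints `['flask', 'requests']`. Import names do not always match distribution names (for example, `yaml` is installed as `PyYAML`), so a mapping step would still be needed before writing `requirements.txt`.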
---
## Conclusion
This refactoring provides:
1. ✅ **AI Service-Level Intent Detection**: Detect document vs code when `DATA_GENERATE` is called - workflow unchanged
2. ✅ **Generic Looping System**: Parametrized, extensible, supports all JSON formats
3. ✅ **Multiple Generation Paths**: Document, code, image paths (extensible to video/audio)
4. ✅ **Unified Output**: All paths return same format, unified as action result documents
5. ✅ **Smart Code Generation**: Multi-file projects with dependencies, requirements.txt, and proper references
**Benefits**:
- **Minimal Changes**: Workflow level (task/action planning) remains unchanged
- **Correct Level**: Intent detection at AI service level where generation happens
- **Clean Architecture**: Separation of concerns - workflow handles planning, AI service handles generation
- **Easy to Extend**: New intents can be added by registering new use cases
- **Clear Code**: No legacy code, no deprecated parameters, no fallback logic
- **Well-tested Foundation**: Changes isolated to AI service layer
- **Smart Code Generation**: Handles complex multi-file projects with dependencies
**Next Steps**:
1. Review and approve architecture
2. Start Phase 1 implementation
3. Iterate based on feedback