Task Intentions & Generic Looping System - Refactoring Architecture
Executive Summary
This document outlines a comprehensive refactoring to enhance the generation system with:
- AI Service-Level Intent Detection: Detect intent (document vs code) when the DATA_GENERATE operation is called - workflow level remains unchanged
- Generic Looping System: Parametrized looping infrastructure supporting different JSON formats and use cases
- Multiple Generation Paths: Document, code, and image generation paths within the generation service, all unified as action result documents
- Smart Code Generation: Multi-file projects with dependency handling, requirements.txt/package.json generation, and proper cross-file references
Part 1: AI Service-Level Intent Detection
1.1 Current State
Problem:
- DATA_GENERATE operation type is used for both document and code generation
- No distinction at AI service level - always routes to document generation pipeline
- Code generation requests treated as document generation
- IMAGE_GENERATE already works correctly (no changes needed)
Current Flow:
User Request
↓
Task Planning (unchanged)
↓
Action Planning (selects ai.process)
↓
ai.process → callAiContent(operationType=DATA_GENERATE)
↓
Document Generation Pipeline (always) ❌ Wrong for code!
Key Insight:
- Workflow level (task/action planning): Remains unchanged ✅
- AI Service level: Need to detect intent when DATA_GENERATE is called
- Operation Types:
  - IMAGE_GENERATE → Already handles images correctly ✅
  - DATA_GENERATE → Needs to split: document vs code
Current Issue with ai.process:
- ai.process creates AiCallOptions(resultFormat=output_format) - no operationType set
- callAiContent() defaults to DATA_GENERATE if operationType not set (line 623)
- If resultType="png" or "jpg" → still uses DATA_GENERATE, NOT IMAGE_GENERATE ❌
- Image generation requests go through document pipeline instead of image pipeline
Solution: Detect image generation intent and set operationType=IMAGE_GENERATE when appropriate
1.2 Proposed Architecture
Intent Detection at AI Service Level
Location: gateway/modules/services/serviceAi/mainServiceAi.py and callAiContent()
Principle: When DATA_GENERATE operation is called, detect from prompt/content whether it's:
- Document generation: Reports, articles, formatted documents (existing behavior)
- Code generation: Executable code files (new behavior)
No changes needed:
- Task planning (remains unchanged)
- Action planning (remains unchanged)
- IMAGE_GENERATE operation (already works)
Intent Detection Logic
NO AUTO-DETECTION: Intent detection is NOT used in the new architecture.
Architecture Principle:
- NO auto-detection: Actions must explicitly provide generationIntent
- Clear use cases: Each action defines its intent explicitly
- No fallback: No fallback to old processing or detection logic
- Fail fast: If generationIntent is missing, raise error immediately
- Explicit over implicit: All intent must be explicitly specified - no guessing or inference
- Format detection vs Intent detection:
  - ✅ Format detection is acceptable: Detecting image formats from explicit resultType parameter (e.g., "png", "jpg") is acceptable because it's based on an explicit parameter, not prompt analysis
  - ❌ Intent detection is NOT acceptable: Detecting intent from prompt content or other inferred sources is not allowed - intent must be explicit
Implementation:
- All actions must pass explicit generationIntent parameter
- callAiContent() requires generationIntent for DATA_GENERATE operations
- No IntentDetector class needed - intent comes from action definition
- Image generation detection: ai.process detects image formats from resultType and sets operationType=IMAGE_GENERATE automatically (this is format detection based on explicit parameter, not intent detection from prompt)
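The two rules above (format detection from an explicit resultType, and fail-fast validation of generationIntent) can be sketched in isolation. This is a minimal illustration, not the project's code: the OperationTypeEnum here is a stand-in with only two members, and resolveOperationType and requireIntent are hypothetical helper names. The sketch also assumes that an unknown intent should be rejected instead of being routed to the document path.

```python
from enum import Enum
from typing import Optional


class OperationTypeEnum(Enum):
    # Stand-in for the project's enum; only the two members discussed here.
    DATA_GENERATE = "dataGenerate"
    IMAGE_GENERATE = "imageGenerate"


IMAGE_FORMATS = {"png", "jpg", "jpeg", "gif", "webp"}
VALID_INTENTS = {"document", "code"}


def resolveOperationType(resultType: Optional[str]) -> OperationTypeEnum:
    """Format detection: derive the operation type from the explicit resultType."""
    normalized = str(resultType or "txt").strip().lstrip(".").lower()
    if normalized in IMAGE_FORMATS:
        return OperationTypeEnum.IMAGE_GENERATE
    return OperationTypeEnum.DATA_GENERATE


def requireIntent(
    operationType: OperationTypeEnum,
    generationIntent: Optional[str],
) -> Optional[str]:
    """Fail fast: DATA_GENERATE needs an explicit, known intent; nothing is inferred."""
    if operationType != OperationTypeEnum.DATA_GENERATE:
        return generationIntent
    if generationIntent not in VALID_INTENTS:
        raise ValueError(
            f"generationIntent must be one of {sorted(VALID_INTENTS)} "
            f"for DATA_GENERATE, got {generationIntent!r}"
        )
    return generationIntent
```

Neither helper reads the prompt, which is the distinction the principle draws: the only inputs are parameters the caller set explicitly.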
AI Service Integration
Modify: mainServiceAi.py - callAiContent() method
async def callAiContent(
    self,
    prompt: str,
    options: Optional[AiCallOptions] = None,
    documentList: Optional[DocumentReferenceList] = None,
    contentParts: Optional[List[ContentPart]] = None,
    outputFormat: Optional[str] = None,
    title: Optional[str] = None,
    parentOperationId: Optional[str] = None,
    generationIntent: Optional[str] = None  # NEW: Explicit intent from action (required for DATA_GENERATE)
) -> AiResponse:
    """
    Unified AI content generation with explicit intent requirement.

    Args:
        generationIntent: REQUIRED explicit intent ("document" | "code" | "image") from action.
            NO auto-detection - actions must explicitly specify intent.
    """
    options = options or AiCallOptions()
    operationType = options.operationType or OperationTypeEnum.DATA_GENERATE

    # Route based on operation type
    if operationType == OperationTypeEnum.IMAGE_GENERATE:
        # Image generation - already works correctly, no changes needed
        return await self._handleImageGeneration(prompt, options, outputFormat)
    elif operationType == OperationTypeEnum.DATA_GENERATE:
        # Data generation - REQUIRES explicit generationIntent
        if not generationIntent:
            raise ValueError(
                "generationIntent is required for DATA_GENERATE operation. "
                "Actions must explicitly specify 'document' or 'code' intent. "
                "No auto-detection - use qualified actions (ai.generateDocument, ai.generateCode)."
            )

        # Route based on explicit intent (no auto-detection, no fallback)
        if generationIntent == "code":
            # Route to code generation path
            return await self._handleCodeGeneration(
                prompt=prompt,
                options=options,
                contentParts=contentParts,
                outputFormat=outputFormat,
                title=title,
                parentOperationId=parentOperationId
            )
        elif generationIntent == "document":
            # Route to document generation path (existing behavior)
            return await self._handleDocumentGeneration(
                prompt=prompt,
                options=options,
                documentList=documentList,
                contentParts=contentParts,
                outputFormat=outputFormat,
                title=title,
                parentOperationId=parentOperationId
            )
        else:
            # Unknown intent: fail fast instead of silently falling back to documents
            raise ValueError(f"Unsupported generationIntent for DATA_GENERATE: {generationIntent}")

    # Other operation types (DATA_ANALYSE, DATA_EXTRACT, etc.) - existing logic
    # ...
Generation Path Handlers
New Methods in mainServiceAi.py:
async def _handleCodeGeneration(
    self,
    prompt: str,
    options: AiCallOptions,
    contentParts: Optional[List[ContentPart]],
    outputFormat: str,
    title: str,
    parentOperationId: Optional[str]
) -> AiResponse:
    """Handle code generation using code generation path."""
    from modules.services.serviceGeneration.paths.codePath import CodeGenerationPath

    codePath = CodeGenerationPath(self.services)
    return await codePath.generateCode(
        userPrompt=prompt,
        outputFormat=outputFormat,
        contentParts=contentParts
    )

async def _handleDocumentGeneration(
    self,
    prompt: str,
    options: AiCallOptions,
    documentList: Optional[DocumentReferenceList],
    contentParts: Optional[List[ContentPart]],
    outputFormat: str,
    title: str,
    parentOperationId: Optional[str]
) -> AiResponse:
    """Handle document generation using existing document path."""
    # Existing document generation logic (unchanged)
    # ...
Action Integration
Enhancement: Actions pass an explicit generationIntent, so no detection step is needed
1. Enhance ai.generateDocument Action
Modify: generateDocument.py
async def generateDocument(self, parameters: Dict[str, Any]) -> ActionResult:
    """Generate documents - explicitly sets intent to 'document'."""
    # ... existing code ...
    aiResponse: AiResponse = await self.services.ai.callAiContent(
        prompt=prompt,
        options=options,
        documentList=docRefList,
        outputFormat=resultType,
        title=title,
        parentOperationId=parentOperationId,
        generationIntent="document"  # NEW: Explicit intent, skips detection
    )
    # ... rest of method ...
2. Create New ai.generateCode Action
New File: generateCode.py
@action
async def generateCode(self, parameters: Dict[str, Any]) -> ActionResult:
    """
    Generate code files - explicitly sets intent to 'code'.

    Parameters:
    - prompt (str, required): Description of code to generate
    - documentList (list, optional): Reference documents
    - resultType (str, optional): Output format (html, js, py, etc.). Default: based on prompt
    """
    prompt = parameters.get("prompt")
    if not prompt:
        return ActionResult.isFailure(error="prompt is required")

    documentList = parameters.get("documentList", [])
    resultType = parameters.get("resultType")

    # Auto-detect format from prompt if not provided
    if not resultType:
        promptLower = prompt.lower()
        if ".html" in promptLower or "html file" in promptLower:
            resultType = "html"
        elif ".js" in promptLower or "javascript" in promptLower:
            resultType = "js"
        elif ".py" in promptLower or "python" in promptLower:
            resultType = "py"
        else:
            resultType = "txt"  # Default

    # Prepare title
    title = "Generated Code"

    # ... build docRefList from documentList and obtain parentOperationId
    #     (same as generateDocument; both names are used below) ...

    # Call AI service with explicit code intent
    options = AiCallOptions(
        operationType=OperationTypeEnum.DATA_GENERATE,
        priority=PriorityEnum.BALANCED,
        processingMode=ProcessingModeEnum.DETAILED
    )
    aiResponse: AiResponse = await self.services.ai.callAiContent(
        prompt=prompt,
        options=options,
        documentList=docRefList,
        outputFormat=resultType,
        title=title,
        parentOperationId=parentOperationId,
        generationIntent="code"  # Explicit intent, skips detection
    )
    # Convert to ActionResult (same as generateDocument)
    # ...
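The substring checks used for the resultType fallback above are order-sensitive, and ".js" also matches a prompt that mentions a ".json" file. A tighter variant, shown here only as a sketch, uses word-boundary regular expressions. The helper name inferResultType and the pattern table are assumptions made for this illustration, not part of the proposed action.

```python
import re

# Checked in order; each pattern pairs a file extension with a language keyword.
_FORMAT_PATTERNS = [
    ("html", re.compile(r"\.html?\b|\bhtml")),
    ("js", re.compile(r"\.js\b|\bjavascript")),
    ("py", re.compile(r"\.py\b|\bpython")),
]


def inferResultType(prompt: str, default: str = "txt") -> str:
    """Infer an output format from file extensions or language names in the prompt."""
    promptLower = prompt.lower()
    for resultType, pattern in _FORMAT_PATTERNS:
        if pattern.search(promptLower):
            return resultType
    return default


print(inferResultType("Write app.js for the server"))
print(inferResultType("Create config.json with defaults"))
```

The `\b` after the extension is what keeps "config.json" from being classified as JavaScript: there is no word boundary between "s" and "o".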
3. Enhance ai.process Action
Modify: process.py - Detect image generation from resultType, require generationIntent for DATA_GENERATE
Important: Image format detection (png, jpg, etc.) is format detection, not intent detection. This is acceptable because it's based on explicit resultType parameter, not prompt analysis.
async def process(self, parameters: Dict[str, Any]) -> ActionResult:
    """Universal AI document processing action."""
    # ... existing code ...

    # Detect image generation from resultType (format detection, not intent detection)
    # This is acceptable because resultType is an explicit parameter, not inferred from prompt
    resultType = parameters.get("resultType", "txt")
    normalized_result_type = (str(resultType).strip().lstrip('.').lower() or "txt")
    imageFormats = ["png", "jpg", "jpeg", "gif", "webp"]
    isImageGeneration = normalized_result_type in imageFormats

    # Build options with correct operationType
    output_format = normalized_result_type.replace('.', '') or 'txt'
    options = AiCallOptions(
        resultFormat=output_format,
        operationType=OperationTypeEnum.IMAGE_GENERATE if isImageGeneration else OperationTypeEnum.DATA_GENERATE
    )

    # Get generationIntent from parameters (REQUIRED for DATA_GENERATE)
    generationIntent = parameters.get("generationIntent")

    # For DATA_GENERATE, generationIntent is REQUIRED (no auto-detection, no fallback)
    if options.operationType == OperationTypeEnum.DATA_GENERATE and not generationIntent:
        raise ValueError(
            "ai.process called with DATA_GENERATE but no generationIntent. "
            "Use qualified actions (ai.generateDocument, ai.generateCode) instead, "
            "or explicitly pass generationIntent parameter."
        )

    # ... existing code ...

    # Pass generationIntent to callAiContent (REQUIRED for DATA_GENERATE)
    if contentParts:
        aiResponse = await self.services.ai.callAiContent(
            prompt=aiPrompt,
            options=options,
            contentParts=contentParts,
            outputFormat=output_format,
            parentOperationId=operationId,
            generationIntent=generationIntent  # REQUIRED for DATA_GENERATE
        )
    else:
        aiResponse = await self.services.ai.callAiContent(
            prompt=aiPrompt,
            options=options,
            documentList=documentList,
            outputFormat=output_format,
            parentOperationId=operationId,
            generationIntent=generationIntent  # REQUIRED for DATA_GENERATE
        )
    # ... rest of method ...
Behavior:
- If resultType is image format (png, jpg, etc.) → Sets operationType=IMAGE_GENERATE ✅
- For DATA_GENERATE: generationIntent is REQUIRED (no auto-detection, no fallback)
- If generationIntent not provided → Raises ValueError (fail fast)
- Best Practice: Use qualified actions (ai.generateDocument, ai.generateCode) instead of ai.process
Rationale:
- ai.process detects image generation from resultType and sets correct operationType
- For DATA_GENERATE, explicit intent is required - no auto-detection, no fallback
- Wrapper actions (translateDocument, summarizeDocument) must pass explicit generationIntent
- Clear use cases - no ambiguity
4. ai.translateDocument and ai.summarizeDocument Actions
Current: Both wrap ai.process() with specific prompts
Enhancement: Pass generationIntent="document" when calling process() internally
Modify: translateDocument.py and summarizeDocument.py
# In translateDocument.py
processParams = {
    "aiPrompt": aiPrompt,
    "documentList": documentList,
    "generationIntent": "document"  # NEW: Explicit intent
}
if resultType:
    processParams["resultType"] = resultType
return await self.process(processParams)

# In summarizeDocument.py
return await self.process({
    "aiPrompt": aiPrompt,
    "documentList": documentList,
    "resultType": resultType,
    "generationIntent": "document"  # NEW: Explicit intent
})
Summary:
| Action | generationIntent | Behavior |
|---|---|---|
| ai.generateDocument | "document" | Explicit intent, skips detection ✅ |
| ai.generateCode | "code" | Explicit intent, skips detection ✅ |
| ai.translateDocument | "document" | Explicit intent (via process) ✅ |
| ai.summarizeDocument | "document" | Explicit intent (via process) ✅ |
| ai.process | REQUIRED | Must provide generationIntent for DATA_GENERATE, raises error if missing ❌ |
Benefits:
- Efficiency: Qualified actions skip detection (saves AI call)
- Clarity: Intent is explicit in action name
- No Ambiguity: Always clear use case - no auto-detection, no fallback
- Consistency: All actions must explicitly define intent
Critical Requirements:
- NO auto-detection: callAiContent() requires explicit generationIntent for DATA_GENERATE
- NO fallback: No fallback to old processing logic - raises error if intent missing
- Clear use cases: Always explicit - no ambiguity
- Use qualified actions: Prefer ai.generateDocument, ai.generateCode over generic ai.process
- Fail fast: Missing generationIntent raises ValueError immediately
Part 2: Generic Looping System
2.1 Current State
Current System: subAiCallLooping.py
- Handles different JSON formats through early detection
- Format-specific routing (elements, chapters, sections)
- Continuation context built for sections (not generic)
- No parametrized configuration
Issues:
- Hard-coded format detection
- Continuation context mismatch for different formats
- No accumulation support for all formats
- Not easily extensible for new formats
2.2 Proposed Generic Looping System
Looping Use Case Configuration
New Class: LoopingUseCase
@dataclass
class LoopingUseCase:
    """Configuration for a specific looping use case."""

    # Identification
    useCaseId: str  # "section_content", "chapter_structure", "code_structure", "code_content"

    # JSON Format Detection
    jsonTemplate: Dict[str, Any]  # Expected JSON structure template
    detectionKeys: List[str]  # Keys to check for format detection (e.g., ["elements"], ["chapters"], ["files"])
    detectionPath: str  # JSONPath to check (e.g., "documents[0].chapters", "files[0].content")

    # Prompt Building
    initialPromptBuilder: Callable  # Function to build initial prompt
    continuationPromptBuilder: Callable  # Function to build continuation prompt

    # Accumulation & Merging
    accumulator: Optional[Callable] = None  # Function to accumulate fragments
    merger: Optional[Callable] = None  # Function to merge accumulated data

    # Continuation Context
    continuationContextBuilder: Optional[Callable] = None  # Build continuation context for this format

    # Result Building
    resultBuilder: Optional[Callable] = None  # Build final result from accumulated data

    # Metadata
    supportsAccumulation: bool = True  # Whether this use case supports accumulation
    requiresExtraction: bool = False  # Whether this requires extraction (like sections)
Use Case Registry
New Module: gateway/modules/services/serviceAi/subLoopingUseCases.py
class LoopingUseCaseRegistry:
    """Registry of all looping use cases."""

    def __init__(self):
        self.useCases: Dict[str, LoopingUseCase] = {}
        self._registerDefaultUseCases()

    def register(self, useCase: LoopingUseCase):
        """Register a new use case."""
        self.useCases[useCase.useCaseId] = useCase

    def get(self, useCaseId: str) -> Optional[LoopingUseCase]:
        """Get use case by ID."""
        return self.useCases.get(useCaseId)

    def detectUseCase(self, parsedJson: Dict[str, Any]) -> Optional[str]:
        """Detect which use case matches the JSON structure (first match in registration order)."""
        for useCaseId, useCase in self.useCases.items():
            if self._matchesFormat(parsedJson, useCase):
                return useCaseId
        return None

    def _matchesFormat(self, parsedJson: Dict[str, Any], useCase: LoopingUseCase) -> bool:
        """Check if JSON matches use case format."""
        # Top-level key check
        for key in useCase.detectionKeys:
            if key in parsedJson:
                return True
        # Check nested path
        if useCase.detectionPath:
            try:
                from jsonpath_ng import parse
                jsonpath_expr = parse(useCase.detectionPath)
                matches = [match.value for match in jsonpath_expr.find(parsedJson)]
                if matches:
                    return True
            except Exception:
                # Invalid path or missing library: treat as "no match"
                pass
        return False
    def _registerDefaultUseCases(self):
        """Register default use cases."""
        # Use Case 1: Section Content Generation
        self.register(LoopingUseCase(
            useCaseId="section_content",
            jsonTemplate={"elements": []},
            detectionKeys=["elements"],
            detectionPath="",
            initialPromptBuilder=buildSectionContentPrompt,
            continuationPromptBuilder=buildSectionContentContinuationPrompt,
            accumulator=None,  # Direct return, no accumulation
            merger=None,
            continuationContextBuilder=buildSectionContinuationContext,
            resultBuilder=None,  # Return JSON directly
            supportsAccumulation=False,
            requiresExtraction=False
        ))

        # Use Case 2: Chapter Structure Generation
        self.register(LoopingUseCase(
            useCaseId="chapter_structure",
            jsonTemplate={"documents": [{"chapters": []}]},
            detectionKeys=["chapters"],
            detectionPath="documents[0].chapters",
            initialPromptBuilder=buildChapterStructurePrompt,
            continuationPromptBuilder=buildChapterStructureContinuationPrompt,
            accumulator=None,  # Direct return, no accumulation
            merger=None,
            continuationContextBuilder=buildChapterContinuationContext,
            resultBuilder=None,  # Return JSON directly
            supportsAccumulation=False,
            requiresExtraction=False
        ))

        # ...
            merger=mergeDocumentSections,
            continuationContextBuilder=buildDocumentContinuationContext,
            resultBuilder=buildDocumentResultFromSections,
            supportsAccumulation=True,
            requiresExtraction=True
        ))

        # Use Case 4: Code Structure Generation (NEW)
        self.register(LoopingUseCase(
            useCaseId="code_structure",
            jsonTemplate={
                "metadata": {
                    "language": "",
                    "projectType": "single_file|multi_file",
                    "projectName": ""
                },
                "files": [
                    {
                        "id": "",
                        "filename": "",
                        "fileType": "",
                        "dependencies": [],  # List of file IDs this file depends on
                        "imports": [],  # List of import statements (for dependency extraction)
                        "functions": [],  # Function signatures for cross-file references
                        "classes": []  # Class definitions for cross-file references
                    }
                ]
            },
            detectionKeys=["files"],
            detectionPath="files",
            initialPromptBuilder=buildCodeStructurePrompt,
            continuationPromptBuilder=buildCodeStructureContinuationPrompt,
            accumulator=None,  # Direct return
            merger=None,
            continuationContextBuilder=buildCodeContinuationContext,
            resultBuilder=None,
            supportsAccumulation=False,
            requiresExtraction=False
        ))

        # Use Case 5: Code Content Generation (NEW)
        self.register(LoopingUseCase(
            useCaseId="code_content",
            jsonTemplate={"files": [{"content": "", "functions": []}]},
            detectionKeys=["content", "functions"],
            detectionPath="files[0].content",
            initialPromptBuilder=buildCodeContentPrompt,
            continuationPromptBuilder=buildCodeContentContinuationPrompt,
            accumulator=accumulateCodeContent,
            merger=mergeCodeContent,
            continuationContextBuilder=buildCodeContentContinuationContext,
            resultBuilder=buildCodeResultFromContent,
            supportsAccumulation=True,
            requiresExtraction=False
        ))

        # ...
            resultBuilder=None,
            supportsAccumulation=False,
            requiresExtraction=False
        ))
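To make the matching rule concrete, the following sketch reproduces the key-then-path check without the jsonpath_ng dependency. The resolvePath helper and the tuple-based USE_CASES list are assumptions for this illustration; they only handle simple paths of the form used above (dotted keys and numeric indices), not full JSONPath.

```python
import re
from typing import Any, Dict, List, Optional, Tuple

# Matches either a key segment or a bracketed list index such as [0].
_SEGMENT = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def resolvePath(data: Any, path: str) -> Optional[Any]:
    """Walk a path such as 'documents[0].chapters'; return None when a step is missing."""
    current = data
    for key, index in _SEGMENT.findall(path):
        if key:
            if not isinstance(current, dict) or key not in current:
                return None
            current = current[key]
        else:
            position = int(index)
            if not isinstance(current, list) or position >= len(current):
                return None
            current = current[position]
    return current


# (useCaseId, detectionKeys, detectionPath) in the registration order shown above.
USE_CASES: List[Tuple[str, List[str], str]] = [
    ("section_content", ["elements"], ""),
    ("chapter_structure", ["chapters"], "documents[0].chapters"),
    ("code_structure", ["files"], "files"),
    ("code_content", ["content", "functions"], "files[0].content"),
]


def detectUseCase(parsedJson: Dict[str, Any]) -> Optional[str]:
    """Return the first use case whose top-level key or nested path is present."""
    for useCaseId, detectionKeys, detectionPath in USE_CASES:
        if any(key in parsedJson for key in detectionKeys):
            return useCaseId
        if detectionPath and resolvePath(parsedJson, detectionPath) is not None:
            return useCaseId
    return None


print(detectUseCase({"documents": [{"chapters": []}]}))
print(detectUseCase({"files": [{"content": "print('hi')"}]}))
```

Because matching is first-come, a payload shaped like the code_content template is reported as code_structure in this sketch: both share the top-level "files" key and code_structure is registered first. That ambiguity is one argument for passing an explicit useCaseId instead of relying on detection.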
Refactored Looping System
Refactor: subAiCallLooping.py
class AiCallLooper:
    """Generic looping system with parametrized use cases."""

    def __init__(self, services, aiService, responseParser):
        self.services = services
        self.aiService = aiService
        self.responseParser = responseParser
        self.useCaseRegistry = LoopingUseCaseRegistry()

    async def callAiWithLooping(
        self,
        prompt: str,
        options: AiCallOptions,
        useCaseId: str,  # REQUIRED: Explicit use case ID
        debugPrefix: str = "ai_call",
        promptArgs: Optional[Dict[str, Any]] = None,
        operationId: Optional[str] = None,
        userPrompt: Optional[str] = None,
        contentParts: Optional[List[ContentPart]] = None
    ) -> str:
        """
        Generic looping system with parametrized use case.

        Args:
            useCaseId: REQUIRED explicit use case ID (e.g., "code_structure", "section_content", "chapter_structure")
            promptArgs: Optional arguments for prompt builders
            ... (other args)
        """
        maxIterations = 50
        iteration = 0
        accumulatedData = {}  # Generic accumulation (replaces allSections)
        lastRawResponse = None

        # Get use case (REQUIRED - no auto-detection)
        useCase = self.useCaseRegistry.get(useCaseId)
        if not useCase:
            raise ValueError(f"Use case '{useCaseId}' not found in registry. Available use cases: {list(self.useCaseRegistry.useCases.keys())}")

        while iteration < maxIterations:
            iteration += 1

            # Build prompt using use case
            if iteration == 1:
                # Initial prompt
                currentPrompt = useCase.initialPromptBuilder(
                    prompt=prompt,
                    **(promptArgs or {})
                )
            else:
                # Continuation prompt
                continuationContext = None
                if useCase.continuationContextBuilder:
                    continuationContext = useCase.continuationContextBuilder(
                        accumulatedData,
                        lastRawResponse
                    )
                currentPrompt = useCase.continuationPromptBuilder(
                    prompt=prompt,
                    continuationContext=continuationContext,
                    **(promptArgs or {})
                )

            # Make AI call
            result = await self._makeAiCall(currentPrompt, options, iteration, operationId, debugPrefix)
            lastRawResponse = result

            # Process response based on use case
            processedResult, isComplete, shouldContinue = await self._processUseCaseResponse(
                result,
                useCase,
                accumulatedData,
                iteration,
                debugPrefix
            )
            if not shouldContinue:
                return processedResult

        # Max iterations reached
        logger.warning(f"Max iterations ({maxIterations}) reached")
        return accumulatedData.get("finalResult", lastRawResponse)
    async def _processUseCaseResponse(
        self,
        result: str,
        useCase: LoopingUseCase,
        accumulatedData: Dict[str, Any],
        iteration: int,
        debugPrefix: str
    ) -> Tuple[str, bool, bool]:
        """Process response according to use case configuration."""
        # Parse JSON
        extractedJson = extractJsonString(result)
        parsedJson, parseError, _ = tryParseJson(extractedJson)
        if parseError:
            # JSON parsing failed - continue
            return result, False, True

        # Check if use case requires extraction
        if useCase.requiresExtraction:
            # Extract data (e.g., sections from document structure)
            extracted = self._extractData(parsedJson, useCase)
            accumulatedData.setdefault("extracted", []).extend(extracted)

        # Check completeness
        isComplete = self._isJsonComplete(parsedJson, useCase)

        # Accumulate if supported. The caller's dict is updated in place: rebinding the
        # local name would leave callAiWithLooping holding the stale dict.
        if useCase.supportsAccumulation and useCase.accumulator:
            updated = useCase.accumulator(accumulatedData, parsedJson, iteration)
            if updated is not None and updated is not accumulatedData:
                accumulatedData.clear()
                accumulatedData.update(updated)

        # Merge if supported
        if useCase.merger and accumulatedData.get("extracted"):
            accumulatedData["merged"] = useCase.merger(accumulatedData["extracted"], iteration)

        # Build result if complete
        if isComplete:
            if useCase.resultBuilder:
                finalResult = useCase.resultBuilder(accumulatedData, useCase)
            else:
                # Direct return
                finalResult = json.dumps(parsedJson, indent=2, ensure_ascii=False)
            accumulatedData["finalResult"] = finalResult
            return finalResult, True, False

        # Not complete - continue
        return result, False, True
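The control flow of the looper (initial prompt, continuation prompts, tolerant parsing, in-place accumulation, stop on completion) can be exercised without a model. The sketch below is a simplified stand-in under stated assumptions: loopWithAccumulation, makeFakeAi, and the "items"/"done" response shape are invented for this example and are not the formats defined by the use cases above.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable, Dict, List


async def loopWithAccumulation(
    callAi: Callable[[str], Awaitable[str]],
    basePrompt: str,
    maxIterations: int = 10,
) -> Dict[str, Any]:
    """Call the model repeatedly, accumulating 'items' until a response says it is done."""
    accumulated: Dict[str, Any] = {"items": [], "iterations": 0, "complete": False}
    for iteration in range(1, maxIterations + 1):
        accumulated["iterations"] = iteration
        if iteration == 1:
            prompt = basePrompt
        else:
            # Continuation prompt: tell the model how much has already been received.
            prompt = f"{basePrompt}\nContinue after item {len(accumulated['items'])}."
        raw = await callAi(prompt)
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # Unparseable response: ask again on the next iteration
        accumulated["items"].extend(parsed.get("items", []))  # mutate in place
        if parsed.get("done"):
            accumulated["complete"] = True
            break
    return accumulated


def makeFakeAi(responses: List[str]) -> Callable[[str], Awaitable[str]]:
    """Hypothetical stand-in for the AI call: replays canned responses in order."""
    remaining = iter(responses)

    async def fakeAi(prompt: str) -> str:
        return next(remaining)

    return fakeAi


result = asyncio.run(loopWithAccumulation(
    makeFakeAi([
        '{"items": [1, 2], "done": false}',
        "not json",
        '{"items": [3], "done": true}',
    ]),
    "List the items.",
))
print(result)
```

The malformed second response costs one iteration but does not discard what was accumulated, which mirrors the "JSON parsing failed - continue" branch above.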
Part 3: Multiple Generation Paths
3.1 Current State
Current: Single document generation path in serviceGeneration
Structure:
serviceGeneration/
├── mainServiceGeneration.py
├── subStructureGeneration.py (chapter structure)
├── subStructureFilling.py (section structure + content)
└── renderers/ (document rendering)
3.2 Proposed Multi-Path Architecture
Enhanced Generation Service Structure
serviceGeneration/
├── mainServiceGeneration.py # Main entry point, routes by intent
├── paths/
│ ├── documentPath.py # Document generation path
│ ├── codePath.py # Code generation path (NEW)
│ ├── imagePath.py # Image generation path (NEW)
│ ├── videoPath.py # Video generation path (FUTURE)
│ └── audioPath.py # Audio generation path (FUTURE)
├── shared/
│ ├── subStructureGeneration.py # Shared structure generation (if applicable)
│ ├── subContentGeneration.py # Shared content generation (if applicable)
│ └── subPromptBuilder.py # Shared prompt builders
└── renderers/ # Format-specific renderers
├── document/ # Document renderers (existing)
├── code/ # Code renderers (NEW)
└── image/ # Image renderers (NEW)
Main Service Entry Point
Refactor: mainServiceGeneration.py
class GenerationService:
    """Main generation service with multiple paths."""

    def __init__(self, services):
        self.services = services
        self.documentPath = DocumentGenerationPath(services)
        self.codePath = CodeGenerationPath(services)
        self.imagePath = ImageGenerationPath(services)
        # Future: videoPath, audioPath

    async def generate(
        self,
        userPrompt: str,
        generationIntent: str,  # "document" | "code" | "image" (explicit, passed in from the AI service level)
        documentList: Optional[DocumentReferenceList] = None,
        contentParts: Optional[List[ContentPart]] = None,
        outputFormat: Optional[str] = None,
        **kwargs
    ) -> AiResponse:
        """
        Main entry point - routes to appropriate generation path.

        Args:
            generationIntent: Explicit intent passed from the AI service level ("document" | "code" | "image")

        Returns: AiResponse with documents list (unified format)
        """
        # Route to appropriate path based on generationIntent
        if generationIntent == "code":
            return await self.codePath.generateCode(
                userPrompt=userPrompt,
                contentParts=contentParts,
                outputFormat=outputFormat,
                **kwargs
            )
        elif generationIntent == "image":
            return await self.imagePath.generateImages(
                userPrompt=userPrompt,
                outputFormat=outputFormat,
                **kwargs
            )
        elif generationIntent == "document":
            return await self.documentPath.generateDocument(
                userPrompt=userPrompt,
                documentList=documentList,
                contentParts=contentParts,
                outputFormat=outputFormat,
                **kwargs
            )
        # Future paths...
        else:
            raise ValueError(f"Unsupported generationIntent: {generationIntent}")
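Since further paths (video, audio) are anticipated, one possible alternative to the if/elif chain is a handler table keyed by intent. The sketch below is a design illustration only: IntentRouter and the stub handlers are hypothetical names, and the handlers return strings where the real paths would return AiResponse objects.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Handler = Callable[..., Awaitable[str]]


class IntentRouter:
    """Routes a request to the handler registered for its explicit intent."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, generationIntent: str, handler: Handler) -> None:
        self._handlers[generationIntent] = handler

    async def generate(self, generationIntent: str, **kwargs: Any) -> str:
        handler = self._handlers.get(generationIntent)
        if handler is None:
            # Same fail-fast behavior as the else branch above
            raise ValueError(f"Unsupported generationIntent: {generationIntent}")
        return await handler(**kwargs)


async def generateCodeStub(userPrompt: str, **kwargs: Any) -> str:
    return f"code:{userPrompt}"


async def generateDocumentStub(userPrompt: str, **kwargs: Any) -> str:
    return f"document:{userPrompt}"


router = IntentRouter()
router.register("code", generateCodeStub)
router.register("document", generateDocumentStub)
print(asyncio.run(router.generate("code", userPrompt="parse a CSV file")))
```

With this shape, adding a path is a single register call and the routing method itself does not change. The explicit if/elif version keeps the per-path argument lists visible, which may matter more while only three paths exist.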
Document Generation Path (Existing, Refactored)
File: paths/documentPath.py
class DocumentGenerationPath:
    """Document generation path (existing functionality, refactored)."""

    async def generateDocument(
        self,
        userPrompt: str,
        documentList: Optional[DocumentReferenceList] = None,
        contentParts: Optional[List[ContentPart]] = None,  # passed by GenerationService.generate()
        outputFormat: str = "txt",
        **kwargs
    ) -> AiResponse:
        """
        Generate document using existing chapter/section model.

        Returns: AiResponse with documents list
        """
        # Assumption: title and filename arrive via kwargs; the original snippet
        # uses both names without defining them.
        title = kwargs.get("title")
        filename = kwargs.get("filename")

        # Phase 1: Chapter structure generation (with looping)
        chapterStructure = await self._generateChapterStructure(
            userPrompt=userPrompt,
            contentParts=contentParts,
            outputFormat=outputFormat
        )

        # Phase 2: Section structure generation (parallel)
        sectionStructure = await self._generateSectionStructures(chapterStructure)

        # Phase 3: Content generation (with looping, parallel)
        filledStructure = await self._generateContent(sectionStructure)

        # Phase 4: Rendering
        renderedDocuments = await self._renderDocuments(filledStructure, outputFormat)

        # Return unified format
        return AiResponse(
            documents=renderedDocuments,
            content=None,
            metadata=AiResponseMetadata(title=title, filename=filename)
        )
Code Generation Path (NEW)
File: paths/codePath.py
class CodeGenerationPath:
    """Code generation path."""

    async def generateCode(
        self,
        userPrompt: str,
        language: str = None,
        fileTypes: List[str] = None,
        projectType: str = "single_file",
        outputFormat: str = None,
        **kwargs
    ) -> AiResponse:
        """
        Generate code files.

        Returns: AiResponse with code files as documents
        """
        # Phase 1: Code structure generation (with looping)
        codeStructure = await self._generateCodeStructure(
            userPrompt=userPrompt,
            language=language,
            fileTypes=fileTypes,
            projectType=projectType
        )

        # Phase 2: Code content generation (with looping, parallel per file)
        codeFiles = await self._generateCodeContent(codeStructure)

        # Phase 3: Code formatting & validation
        formattedFiles = await self._formatAndValidateCode(codeFiles)

        # Convert to unified document format
        documents = []
        for file in formattedFiles:
            documents.append(DocumentData(
                documentName=file["filename"],
                documentData=file["content"].encode('utf-8'),
                mimeType=self._getMimeType(file["fileType"]),
                sourceJson=file
            ))

        return AiResponse(
            documents=documents,
            content=None,
            metadata=AiResponseMetadata(title="Generated Code", filename=None)
        )
    async def _generateCodeStructure(
        self,
        userPrompt: str,
        language: str,
        fileTypes: List[str],
        projectType: str
    ) -> Dict[str, Any]:
        """Generate code structure using looping system."""
        prompt = buildCodeStructurePrompt(
            userPrompt=userPrompt,
            language=language,
            fileTypes=fileTypes,
            projectType=projectType
        )

        # Use generic looping system with code_structure use case
        structureJson = await self.services.ai._callAiWithLooping(
            prompt=prompt,
            options=AiCallOptions(operationType=OperationTypeEnum.DATA_GENERATE),
            useCaseId="code_structure",  # Use parametrized use case
            debugPrefix="code_structure_generation",
            promptArgs={
                "userPrompt": userPrompt,
                "language": language,
                "fileTypes": fileTypes
            }
        )
        return json.loads(structureJson)

    async def _generateCodeContent(
        self,
        codeStructure: Dict[str, Any]
    ) -> List[Dict[str, Any]]:
        """Generate code content for each file with dependency handling."""
        files = codeStructure.get("files", [])
        metadata = codeStructure.get("metadata", {})

        # Step 1: Resolve dependency order
        orderedFiles = self._resolveDependencyOrder(files)

        # Step 2: Generate dependency files first (requirements.txt, package.json, etc.)
        dependencyFiles = await self._generateDependencyFiles(metadata, orderedFiles)

        # Step 3: Generate code files in dependency order (not fully parallel)
        codeFiles = []
        generatedFileContext = {}  # Track what's been generated for cross-file references
        for fileStructure in orderedFiles:
            # Provide context about already-generated files for proper imports
            fileContext = self._buildFileContext(generatedFileContext, fileStructure)

            # Generate this file with context
            fileContent = await self._generateSingleFileContent(
                fileStructure,
                fileContext=fileContext,
                allFilesStructure=orderedFiles
            )
            codeFiles.append(fileContent)

            # Update context with generated file info (for next files)
            generatedFileContext[fileStructure["id"]] = {
                "filename": fileContent.get("filename"),
                "functions": fileContent.get("functions", []),
                "classes": fileContent.get("classes", []),
                "exports": fileContent.get("exports", [])
            }

        # Combine dependency files and code files
        return dependencyFiles + codeFiles
    def _resolveDependencyOrder(self, files: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """Resolve file generation order based on dependencies."""
        # Build dependency graph
        fileMap = {f["id"]: f for f in files}
        dependencies = {}
        for file in files:
            fileId = file["id"]
            deps = file.get("dependencies", [])  # List of file IDs this file depends on
            dependencies[fileId] = deps

        # Topological sort (depth-first)
        ordered = []
        visited = set()
        tempMark = set()

        def visit(fileId: str):
            if fileId in tempMark:
                # Circular dependency detected - break it
                logger.warning(f"Circular dependency detected involving {fileId}")
                return
            if fileId in visited:
                return
            tempMark.add(fileId)
            for depId in dependencies.get(fileId, []):
                if depId in fileMap:
                    visit(depId)
            tempMark.remove(fileId)
            visited.add(fileId)
            ordered.append(fileMap[fileId])

        for file in files:
            if file["id"] not in visited:
                visit(file["id"])
        return ordered
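As a quick check of the ordering rule, here is a standalone version of the same depth-first topological sort run on sample data. The file IDs and filenames are invented for this example; the function mirrors the method above but drops the class and the logger so it can run on its own.

```python
from typing import Any, Dict, List, Set


def resolveDependencyOrder(files: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Order files so that each one comes after the files it depends on."""
    fileMap = {f["id"]: f for f in files}
    ordered: List[Dict[str, Any]] = []
    visited: Set[str] = set()
    inProgress: Set[str] = set()

    def visit(fileId: str) -> None:
        if fileId in visited or fileId in inProgress:
            return  # already placed, or a cycle that is broken at this edge
        inProgress.add(fileId)
        for depId in fileMap[fileId].get("dependencies", []):
            if depId in fileMap:  # ignore references to unknown files
                visit(depId)
        inProgress.discard(fileId)
        visited.add(fileId)
        ordered.append(fileMap[fileId])

    for f in files:
        visit(f["id"])
    return ordered


sampleFiles = [
    {"id": "app", "filename": "app.py", "dependencies": ["models", "utils"]},
    {"id": "models", "filename": "models.py", "dependencies": ["utils"]},
    {"id": "utils", "filename": "utils.py", "dependencies": []},
]
print([f["filename"] for f in resolveDependencyOrder(sampleFiles)])
# → ['utils.py', 'models.py', 'app.py']
```

Generating utils.py first means its function and class signatures are already in the context when models.py and app.py are generated. Python 3.9+ also ships graphlib.TopologicalSorter, which raises CycleError on circular dependencies instead of breaking them silently; whether that stricter behavior is wanted here is a design decision.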
    async def _generateDependencyFiles(
        self,
        metadata: Dict[str, Any],
        files: List[Dict[str, Any]]
    ) -> List[Dict[str, Any]]:
        """Generate dependency files (requirements.txt, package.json, etc.)."""
        language = metadata.get("language", "").lower()
        dependencyFiles = []

        # Extract all dependencies from files
        allDependencies = set()
        for file in files:
            fileDeps = file.get("dependencies", [])
            if isinstance(fileDeps, list):
                allDependencies.update(fileDeps)

        # Generate requirements.txt for Python
        if language in ["python", "py"]:
            requirementsContent = await self._generateRequirementsTxt(files, allDependencies)
            if requirementsContent:
                dependencyFiles.append({
                    "filename": "requirements.txt",
                    "content": requirementsContent,
                    "fileType": "txt",
                    "id": "requirements_txt"
                })
        # Generate package.json for JavaScript/TypeScript
        elif language in ["javascript", "typescript", "js", "ts"]:
            packageJson = await self._generatePackageJson(files, allDependencies, metadata)
            if packageJson:
                dependencyFiles.append({
                    "filename": "package.json",
                    "content": json.dumps(packageJson, indent=2),
                    "fileType": "json",
                    "id": "package_json"
                })

        return dependencyFiles
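    async def _generateRequirementsTxt(
        self,
        files: List[Dict[str, Any]],
        dependencies: set
    ) -> Optional[str]:
        """Generate requirements.txt content, or None if no packages are found."""
        # Modules provided by the generated files themselves are not packages
        localModules = {f.get("filename", "").rsplit(".", 1)[0] for f in files}

        # Extract Python imports from file structures
        pythonPackages = set()
        for file in files:
            imports = file.get("imports", [])
            if not isinstance(imports, list):
                continue
            for imp in imports:
                if not isinstance(imp, str):
                    continue
                # Simple extraction - can be enhanced
                # "from flask import Flask" -> "flask"; "import os.path" -> "os"
                tokens = imp.split()
                if len(tokens) < 2 or tokens[0] not in ("from", "import"):
                    continue
                module = tokens[1].rstrip(",")
                if module.startswith("."):
                    continue  # Relative import
                package = module.split(".")[0]
                if package and package not in localModules:
                    pythonPackages.add(package)

        # Generate requirements.txt
        # Note: standard-library modules are not filtered out, and import names
        # that differ from the published package name are not mapped
        if pythonPackages:
            return "\n".join(sorted(pythonPackages))
        return None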
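    async def _generatePackageJson(
        self,
        files: List[Dict[str, Any]],
        dependencies: set,
        metadata: Dict[str, Any]
    ) -> Optional[Dict[str, Any]]:
        """Generate package.json content, or None if no packages are found."""
        import re  # Local import keeps this listing self-contained

        # Matches the module specifier in "... from 'pkg'" and "require('pkg')"
        specifierPattern = re.compile(r"""(?:\bfrom|\brequire\()\s*['"]([^'"]+)['"]""")

        # Extract npm packages from file structures
        npmPackages = {}
        for file in files:
            imports = file.get("imports", [])
            if not isinstance(imports, list):
                continue
            for imp in imports:
                if not isinstance(imp, str):
                    continue
                # Simple extraction - can be enhanced
                # "import express from 'express';" -> "express"
                match = specifierPattern.search(imp)
                if not match:
                    continue
                specifier = match.group(1)
                if specifier.startswith((".", "/")):
                    continue  # Relative or absolute path, not a package
                # "lodash/fp" -> "lodash"; "@scope/pkg/sub" -> "@scope/pkg"
                parts = specifier.split("/")
                package = "/".join(parts[:2]) if specifier.startswith("@") else parts[0]
                npmPackages[package] = "*"  # Default version (Node built-ins are not filtered)

        if npmPackages:
            return {
                "name": metadata.get("projectName", "generated-project"),
                "version": "1.0.0",
                "dependencies": npmPackages
            }
        return None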
    def _buildFileContext(
        self,
        generatedFileContext: Dict[str, Dict[str, Any]],
        currentFile: Dict[str, Any]
    ) -> Dict[str, Any]:
        """Build context about other files for proper imports/references."""
        context = {
            "availableFiles": [],
            "availableFunctions": {},
            "availableClasses": {}
        }

        # Add info about already-generated files
        for fileId, fileInfo in generatedFileContext.items():
            context["availableFiles"].append({
                "id": fileId,
                "filename": fileInfo["filename"],
                "functions": fileInfo.get("functions", []),
                "classes": fileInfo.get("classes", []),
                "exports": fileInfo.get("exports", [])
            })

            # Build function/class maps for easy lookup
            for func in fileInfo.get("functions", []):
                funcName = func.get("name", "")
                if funcName:
                    context["availableFunctions"][funcName] = {
                        "file": fileInfo["filename"],
                        "signature": func.get("signature", "")
                    }
            for cls in fileInfo.get("classes", []):
                className = cls.get("name", "")
                if className:
                    context["availableClasses"][className] = {
                        "file": fileInfo["filename"]
                    }

        return context
    async def _generateSingleFileContent(
        self,
        fileStructure: Dict[str, Any],
        fileContext: Optional[Dict[str, Any]] = None,
        allFilesStructure: Optional[List[Dict[str, Any]]] = None
    ) -> Dict[str, Any]:
        """Generate code content for a single file with context about other files."""
        # Build prompt with context about other files for proper imports
        prompt = buildCodeContentPrompt(
            fileStructure,
            fileContext=fileContext,
            allFilesStructure=allFilesStructure
        )

        # Use generic looping system with code_content use case
        contentJson = await self.services.ai._callAiWithLooping(
            prompt=prompt,
            options=AiCallOptions(operationType=OperationTypeEnum.DATA_GENERATE),
            useCaseId="code_content",  # Use parametrized use case
            debugPrefix=f"code_content_{fileStructure['id']}",
            promptArgs={
                "fileStructure": fileStructure,
                "fileContext": fileContext,
                "allFilesStructure": allFilesStructure
            }
        )
        parsed = json.loads(contentJson)

        # Extract function/class info for context building
        # ("or [{}]" also covers an empty "files" list, which would raise IndexError)
        firstFile = (parsed.get("files") or [{}])[0]
        parsed["functions"] = firstFile.get("functions", [])
        parsed["classes"] = firstFile.get("classes", [])
        return parsed
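The ordering behaviour of _resolveDependencyOrder() can be checked in isolation. The following is a minimal standalone sketch of the same depth-first approach, not the production method: the function name resolve_order and the sample file entries are hypothetical, and logging is omitted.

```python
from typing import Any, Dict, List


def resolve_order(files: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return files so that each file appears after the files it depends on."""
    file_map = {f["id"]: f for f in files}
    ordered: List[Dict[str, Any]] = []
    done = set()         # file IDs that are already placed in `ordered`
    in_progress = set()  # file IDs on the current depth-first path

    def visit(file_id: str) -> None:
        if file_id in done or file_id in in_progress:
            # Already placed, or a cycle: skip the edge instead of failing.
            return
        in_progress.add(file_id)
        for dep_id in file_map[file_id].get("dependencies", []):
            if dep_id in file_map:  # ignore references to unknown files
                visit(dep_id)
        in_progress.discard(file_id)
        done.add(file_id)
        ordered.append(file_map[file_id])

    for f in files:
        visit(f["id"])
    return ordered


files = [
    {"id": "main", "dependencies": ["utils", "models"]},
    {"id": "models", "dependencies": ["utils"]},
    {"id": "utils", "dependencies": []},
]
print([f["id"] for f in resolve_order(files)])  # ['utils', 'models', 'main']
```

With a cycle such as a → b → a, the edge that closes the cycle is skipped, so both files are still emitted once each.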
Image Generation Path (NEW)
File: paths/imagePath.py
import asyncio
import base64
from typing import List, Optional


class ImageGenerationPath:
    """Image generation path."""

    async def generateImages(
        self,
        userPrompt: str,
        count: int = 1,
        style: Optional[str] = None,
        format: str = "png",
        **kwargs
    ) -> AiResponse:
        """
        Generate image files.

        Returns: AiResponse with image files as documents
        """
        # Phase 1: Image prompt generation (if multiple images)
        if count > 1:
            imagePrompts = await self._generateImagePrompts(userPrompt, count, style)
        else:
            imagePrompts = [userPrompt]

        # Phase 2: Generate images (parallel)
        images = await self._generateImagesParallel(imagePrompts, format)

        # Convert to unified document format
        # "jpg" is only a file extension; the registered MIME subtype is "jpeg"
        mimeSubtype = "jpeg" if format.lower() == "jpg" else format.lower()
        documents = []
        for i, imageData in enumerate(images):
            documents.append(DocumentData(
                documentName=f"image_{i+1}.{format}",
                documentData=imageData,  # Already bytes
                mimeType=f"image/{mimeSubtype}",
                sourceJson={"prompt": imagePrompts[i], "index": i}
            ))

        return AiResponse(
            documents=documents,
            content=None,
            metadata=AiResponseMetadata(title="Generated Images", filename=None)
        )

    async def _generateImagesParallel(
        self,
        imagePrompts: List[str],
        format: str
    ) -> List[bytes]:
        """Generate multiple images in parallel."""
        # asyncio.gather preserves input order, so images[i] matches imagePrompts[i]
        tasks = [self._generateSingleImage(prompt, format) for prompt in imagePrompts]
        return list(await asyncio.gather(*tasks))

    async def _generateSingleImage(
        self,
        prompt: str,
        format: str
    ) -> bytes:
        """Generate a single image."""
        # Use IMAGE_GENERATE operation
        request = AiCallRequest(
            prompt=prompt,
            options=AiCallOptions(
                operationType=OperationTypeEnum.IMAGE_GENERATE,
                resultFormat="base64"
            )
        )
        response = await self.services.ai.callAi(request)

        # Decode base64 to bytes
        return base64.b64decode(response.content)
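The parallel phase depends on asyncio.gather() returning results in the order the awaitables were passed in, which is what keeps each image aligned with its prompt. The sketch below demonstrates that with a stub in place of the real IMAGE_GENERATE call. The names fake_generate_image and generate_all are hypothetical, and the use of return_exceptions=True is an assumption about how partial failures might be handled, not part of the design above.

```python
import asyncio
import base64
from typing import List, Union


async def fake_generate_image(prompt: str) -> str:
    """Stand-in for the AI call: returns base64 text, or fails on an empty prompt."""
    await asyncio.sleep(0.01 * len(prompt))  # longer prompts finish later
    if not prompt:
        raise ValueError("empty prompt")
    return base64.b64encode(prompt.encode("utf-8")).decode("ascii")


async def generate_all(prompts: List[str]) -> List[Union[bytes, Exception]]:
    """Run all generations concurrently; results keep the order of `prompts`."""
    results = await asyncio.gather(
        *(fake_generate_image(p) for p in prompts),
        return_exceptions=True,  # collect failures instead of aborting the batch
    )
    return [r if isinstance(r, Exception) else base64.b64decode(r) for r in results]


images = asyncio.run(generate_all(["a long prompt", "", "short"]))
for index, image in enumerate(images):
    status = "failed" if isinstance(image, Exception) else f"{len(image)} bytes"
    print(index, status)
```

Without return_exceptions=True, the first failing task raises out of gather() and the caller loses the images that did succeed; whether that is acceptable is a design decision for the image path.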
Part 4: Unified Document Output
4.1 Current State
Current State: ✅ All actions already return unified ActionResult format with ActionDocument objects
Note: The unification needed is at the AI Service level (AiResponse), not at the action level. Actions already convert AiResponse to ActionResult consistently.
4.2 AI Service Level Format
Current: ✅ All AI service paths already return unified AiResponse format
Format (already exists):
@dataclass
class DocumentData:
    """Unified document data structure (already exists)."""
    documentName: str  # Filename
    documentData: bytes  # File content (bytes)
    mimeType: str  # MIME type (e.g., "text/html", "image/png", "application/pdf")
    sourceJson: Optional[Dict[str, Any]] = None  # Source JSON structure (if applicable)


@dataclass
class AiResponse:
    """Unified AI response format (already exists)."""
    documents: List[DocumentData]  # List of generated documents
    content: Optional[str] = None  # Optional text content
    metadata: Optional[AiResponseMetadata] = None
Requirement: Ensure all new generation paths (code, image) return AiResponse in this format (same as document path)
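As an illustration of this requirement for the code path, the sketch below wraps generated file entries (the filename/content/fileType/id dictionaries produced by the code path) into the unified format. The two dataclasses are local stand-ins that mirror the listing above, with metadata left out; code_files_to_response and MIME_BY_FILE_TYPE are hypothetical names, and the MIME mapping is an assumed minimal table.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class DocumentData:  # stand-in mirroring the existing structure
    documentName: str
    documentData: bytes
    mimeType: str
    sourceJson: Optional[Dict[str, Any]] = None


@dataclass
class AiResponse:  # stand-in; metadata omitted for brevity
    documents: List[DocumentData]
    content: Optional[str] = None


# Hypothetical mapping; a real implementation would likely need a fuller table.
MIME_BY_FILE_TYPE = {
    "py": "text/x-python",
    "json": "application/json",
    "txt": "text/plain",
}


def code_files_to_response(files: List[Dict[str, Any]]) -> AiResponse:
    """Wrap generated code files as documents, encoding text content as UTF-8 bytes."""
    documents = [
        DocumentData(
            documentName=f["filename"],
            documentData=f["content"].encode("utf-8"),
            mimeType=MIME_BY_FILE_TYPE.get(f.get("fileType", ""), "text/plain"),
            sourceJson={"id": f.get("id")},
        )
        for f in files
    ]
    return AiResponse(documents=documents)


response = code_files_to_response([
    {"id": "requirements_txt", "filename": "requirements.txt", "content": "flask", "fileType": "txt"},
    {"id": "file_1", "filename": "main.py", "content": "print('hi')\n", "fileType": "py"},
])
print([(d.documentName, d.mimeType) for d in response.documents])
```

Because documentData is always bytes, downstream action code can treat code files, documents, and images identically.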
4.3 Action Result Integration
Current: ✅ All actions already convert AiResponse to ActionResult consistently
Pattern (already implemented in all actions):
# All actions follow this pattern (existing code):
async def execute(self, parameters: Dict[str, Any]) -> ActionResult:
    # Call AI service - returns AiResponse
    aiResponse = await self.services.ai.callAiContent(...)

    # Convert AiResponse to ActionDocument (unified format)
    documents = []
    for docData in aiResponse.documents:
        documents.append(ActionDocument(
            documentName=docData.documentName,
            documentData=docData.documentData,
            mimeType=docData.mimeType,
            sourceJson=docData.sourceJson
        ))

    return ActionResult.isSuccess(documents=documents)  # ✅ Already unified
Note:
- ✅ Actions already return unified ActionResult format
- ✅ No changes needed at action level
- ✅ Focus: Ensure new AI service paths (code, image) return AiResponse consistently
Part 5: Implementation Plan
Phase 1: Foundation (Weeks 1-2)
- Explicit Intent Requirement at AI Service Level
  - Note: No IntentDetector class needed - intent comes explicitly from actions
  - Integrate generationIntent parameter into callAiContent() method
  - Add _handleCodeGeneration() and _handleDocumentGeneration() methods
  - Update ai.process to detect image formats from resultType (format detection, not intent detection)
  - Require explicit generationIntent for all DATA_GENERATE operations
  - Test with various actions (generateDocument, generateCode, process)
  - Verify IMAGE_GENERATE still works correctly (no changes)
- Generic Looping System
  - Create LoopingUseCase dataclass
  - Create LoopingUseCaseRegistry
  - Register existing use cases (section_content, chapter_structure, code_structure)
  - Refactor subAiCallLooping.py to use registry
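The Phase 1 item on ai.process (format detection from resultType) could look roughly like the sketch below. It is an assumption-laden illustration: OperationTypeEnum is a local stand-in with made-up values, and select_operation_type and IMAGE_FORMATS are hypothetical names. The point is only that the decision uses the requested output format and nothing else.

```python
from enum import Enum
from typing import Optional


class OperationTypeEnum(Enum):  # stand-in for the project's enum; values are invented
    DATA_GENERATE = "dataGenerate"
    IMAGE_GENERATE = "imageGenerate"


# Assumed set of image formats; adjust to what the image models can produce.
IMAGE_FORMATS = {"png", "jpg", "jpeg", "webp", "gif"}


def select_operation_type(result_type: Optional[str]) -> OperationTypeEnum:
    """Pick the operation type from the requested output format alone."""
    normalized = (result_type or "").strip().lower().lstrip(".")
    if normalized in IMAGE_FORMATS:
        return OperationTypeEnum.IMAGE_GENERATE
    return OperationTypeEnum.DATA_GENERATE


for requested in ("png", ".JPG", "docx", None):
    print(requested, "->", select_operation_type(requested).name)
```

Anything that is not a known image format stays on DATA_GENERATE, where the explicit generationIntent then selects the document or code path.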
Phase 2: Code Generation (Weeks 3-4)
- Code Generation Path
  - Create paths/codePath.py
  - Implement code structure generation
  - Implement code content generation
  - Register code use cases in looping registry
  - Create ai.generateCode action
- Integration
  - Integrate code path into mainServiceGeneration.py
  - Test code generation end-to-end
  - Validate code output quality
Phase 3: Image Generation (Weeks 5-6)
- Image Generation Path
  - Create paths/imagePath.py
  - Implement standalone image generation
  - Support batch image generation
  - Register image use cases in looping registry
  - Create ai.generateImages action
- Integration
  - Integrate image path into mainServiceGeneration.py
  - Test image generation end-to-end
  - Validate image output quality
Phase 4: Refinement (Weeks 7-8)
- Unified Output
  - Ensure all paths return unified AiResponse format
  - Standardize action result handling
  - Test cross-path compatibility
- Documentation & Testing
  - Document all use cases
  - Add unit tests for looping system
  - Add integration tests for each path
  - Performance testing
Part 6: Migration Strategy
Clean Implementation
- No Legacy Code: Remove old prompt builder parameters completely
- Clear Use Cases: All calls must specify an explicit useCaseId
- No Fallback: Fail fast if use case not found or intent missing
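A minimal sketch of what "fail fast" could mean in practice is shown below. The field jsonRootKey, the method names, and the error messages are illustrative assumptions; the actual LoopingUseCase dataclass will carry whatever configuration the looping system needs. The use case IDs and the two intents (document, code) are the ones named in this document.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class LoopingUseCase:  # hypothetical minimal shape
    useCaseId: str
    jsonRootKey: str


class LoopingUseCaseRegistry:
    def __init__(self) -> None:
        self._useCases: Dict[str, LoopingUseCase] = {}

    def register(self, useCase: LoopingUseCase) -> None:
        if useCase.useCaseId in self._useCases:
            raise ValueError(f"Use case already registered: {useCase.useCaseId}")
        self._useCases[useCase.useCaseId] = useCase

    def get(self, useCaseId: str) -> LoopingUseCase:
        """Look up a use case; raise immediately instead of falling back."""
        if useCaseId not in self._useCases:
            known = ", ".join(sorted(self._useCases)) or "<none>"
            raise KeyError(f"Unknown useCaseId '{useCaseId}'. Registered: {known}")
        return self._useCases[useCaseId]


VALID_INTENTS = ("document", "code")


def require_intent(generationIntent: Optional[str]) -> str:
    """Reject DATA_GENERATE calls that do not state their intent explicitly."""
    if generationIntent not in VALID_INTENTS:
        raise ValueError(
            f"generationIntent must be one of {VALID_INTENTS}, got {generationIntent!r}"
        )
    return generationIntent


registry = LoopingUseCaseRegistry()
registry.register(LoopingUseCase("code_structure", jsonRootKey="files"))
registry.register(LoopingUseCase("code_content", jsonRootKey="files"))

print(registry.get("code_content"))
try:
    require_intent(None)
except ValueError as error:
    print("rejected:", error)
```

Raising at lookup time surfaces a misspelled useCaseId or a missing intent at the call site, rather than as malformed output several steps later.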
Testing Strategy
- Unit Tests: Test each use case independently
- Integration Tests: Test full generation flows
- Use Case Tests: Test all registered use cases
- Performance Tests: Compare performance before/after
Part 7: Future Extensions
Video Generation Path (Future)
- Similar structure to image path
- Video structure planning (scenes, transitions)
- Frame-by-frame generation
- Video encoding
Audio Generation Path (Future)
- Similar structure to image path
- Text-to-speech generation
- Music generation
- Audio file output
Additional Use Cases
- Easy to add new use cases to registry
- Just register new LoopingUseCase configuration
- No changes to core looping system needed
Part 8: Critical Cross-Check
8.1 Codebase Verification
✅ Multiple Files Support:
- Current system already supports multiple documents via renderReport() → returns List[RenderedDocument]
- HTML renderer creates multiple files (HTML + images) as separate documents
- Code generation path enhanced to generate multiple code files + dependency files
✅ Code Generation Intelligence:
- Dependency Handling:
  - ✅ Code structure includes dependencies field (list of file IDs)
  - ✅ _resolveDependencyOrder() implements topological sort for proper generation order
  - ✅ Handles circular dependencies without failing (logs a warning and skips the edge that closes the cycle)
  - ✅ Files generated sequentially based on dependencies (not fully parallel)
- Requirements/Dependencies Files:
  - ✅ _generateDependencyFiles() generates:
    - requirements.txt for Python projects (extracts packages from imports)
    - package.json for JavaScript/TypeScript projects (extracts npm packages)
  - ✅ Dependency files generated BEFORE code files
  - ✅ Extracts dependencies from file structures' imports field
- Cross-File References:
  - ✅ _buildFileContext() provides context about already-generated files
  - ✅ Tracks functions, classes, and exports from each file
  - ✅ Context passed to each file generation for proper imports
  - ✅ fileContext includes:
    - Available files and their exports
    - Function signatures for proper imports
    - Class definitions for proper imports
- File Structure Template:
{
  "metadata": {
    "language": "python|javascript|typescript",
    "projectType": "single_file|multi_file",
    "projectName": "..."
  },
  "files": [
    {
      "id": "file_1",
      "filename": "main.py",
      "fileType": "py",
      "dependencies": ["file_2"],  // File IDs this depends on
      "imports": ["from utils import helper"],  // For dependency extraction
      "functions": [{"name": "main", "signature": "..."}],
      "classes": [{"name": "MyClass", "signature": "..."}]
    }
  ]
}
8.2 Architecture Validation
✅ Smart Enough for Multi-File Projects:
- ✅ Dependency resolution ensures proper order
- ✅ Requirements.txt/package.json automatically generated
- ✅ Cross-file context enables proper imports/references
- ✅ Function/class tracking enables accurate references
- ✅ Sequential generation with context accumulation
✅ Current Codebase Compatibility:
- ✅ Uses existing List[RenderedDocument] pattern
- ✅ Follows existing AiResponse → ActionResult conversion
- ✅ Compatible with existing document processing pipeline
- ✅ No breaking changes to existing document generation
✅ Potential Enhancements (Future):
- More sophisticated import parsing (AST-based)
- Support for more dependency file types (Cargo.toml, go.mod, etc.)
- Parallel generation of independent files (files without dependencies)
- Validation of imports against generated files
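For the first enhancement, AST-based parsing could replace the string splitting used for requirements.txt. The sketch below uses Python's standard ast module on generated source text. extract_packages and the sample source are hypothetical, and the standard-library filter relies on sys.stdlib_module_names, which only exists on Python 3.10 and later (on older versions nothing is filtered).

```python
import ast
import sys
from typing import Iterable, Set

# Available on Python 3.10+; on older versions the filter is empty.
STDLIB_MODULES = getattr(sys, "stdlib_module_names", frozenset())


def extract_packages(source: str, local_modules: Iterable[str] = ()) -> Set[str]:
    """Return top-level third-party module names imported by `source`."""
    local = set(local_modules)
    found: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]  # level > 0 would be a relative import
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level not in STDLIB_MODULES and top_level not in local:
                found.add(top_level)
    return found


source = """
import os, json
import numpy as np
from flask.views import MethodView
from . import sibling
from utils import helper
"""
print(sorted(extract_packages(source, local_modules={"utils"})))
```

This still yields import names rather than distribution names, so a mapping step would be needed where the two differ (for example, the yaml module is published as PyYAML).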
Conclusion
This refactoring provides:
- ✅ AI Service-Level Intent Detection: Detect document vs code when DATA_GENERATE is called - workflow unchanged
- ✅ Generic Looping System: Parametrized, extensible, supports all JSON formats
- ✅ Multiple Generation Paths: Document, code, image paths (extensible to video/audio)
- ✅ Unified Output: All paths return same format, unified as action result documents
- ✅ Smart Code Generation: Multi-file projects with dependencies, requirements.txt, and proper references
Benefits:
- Minimal Changes: Workflow level (task/action planning) remains unchanged
- Correct Level: Intent detection at AI service level where generation happens
- Clean Architecture: Separation of concerns - workflow handles planning, AI service handles generation
- Easy to Extend: New intents can be added by registering new use cases
- Clear Code: No legacy code, no deprecated parameters, no fallback logic
- Well-tested Foundation: Changes isolated to AI service layer
- Smart Code Generation: Handles complex multi-file projects with dependencies
Next Steps:
- Review and approve architecture
- Start Phase 1 implementation
- Iterate based on feedback