Multi-File Output Refactoring Concept

Overview

This document outlines a comprehensive refactoring approach to enable multi-file output generation while preserving the existing modular architecture. The solution extends the current system to support prompts like "Deliver one file for each customer data" or "split the data into meaningful pieces" without breaking existing single-file functionality.

Current System Analysis

Strengths to Preserve

  • Modular Service Architecture: Clean separation between ExtractionService, AiService, GenerationService, and WorkflowService
  • Renderer Engine: Robust format-specific renderers (DOCX, PDF, HTML, etc.) with registry pattern
  • Extraction Pipeline: Sophisticated content extraction and chunking system
  • JSON Schema Structure: Well-defined document structure for AI processing
  • Backward Compatibility: All existing single-file functionality must remain intact

Current Limitations

  • Single File Output: All renderers return (content, mime_type) - one file per call
  • Monolithic JSON Schema: Designed for single document structure
  • No Split Logic: No mechanism to detect or handle multi-file requests
  • Fixed Document Array: AI service always returns single document in documents array

Refactoring Architecture

Core Design Principles

  1. Backward Compatibility First: All existing single-file functionality remains unchanged
  2. Minimal Core Changes: Extend existing patterns rather than replace them
  3. Renderer Preservation: Keep all existing renderers unchanged
  4. AI-Powered Detection: Use AI to analyze prompts in any language for multi-file requests
  5. Generic Functions: All new functions are generic and language-agnostic
  6. Graceful Fallback: Fall back to single-file if multi-file processing fails
  7. Performance Conscious: Efficient processing without duplicating work

Enhanced Processing Flow

flowchart TD
    start[User Prompt Analysis] --> detect{Multi-File Request?}
    detect -->|No| single[Single-File Path]
    detect -->|Yes| multi[Multi-File Path]
    
    single --> extract[Extraction Service]
    extract --> ai[AI Service - Single]
    ai --> render[Generation Service - Single]
    render --> result[Single Document]
    
    multi --> extractMulti[Extraction Service]
    extractMulti --> aiMulti[AI Service - Multi]
    aiMulti --> split[Document Splitter]
    split --> renderMulti[Multi-File Generator]
    renderMulti --> resultMulti[Multiple Documents]

Implementation Strategy

Phase 1: Enhanced JSON Schema

1.1 Multi-Document Schema Extension

File: gateway/modules/services/serviceGeneration/subJsonSchema.py

def get_multi_document_subJsonSchema() -> Dict[str, Any]:
    """Get the JSON schema for multi-document generation."""
    return {
        "type": "object",
        "required": ["metadata", "documents"],
        "properties": {
            "metadata": {
                "type": "object",
                "required": ["title", "splitStrategy"],
                "properties": {
                    "title": {"type": "string"},
                    "splitStrategy": {
                        "type": "string",
                        "enum": ["per_entity", "by_section", "by_criteria", "by_data_type", "custom"],
                        "description": "Strategy for splitting content into multiple files"
                    },
                    "splitCriteria": {
                        "type": "object",
                        "description": "Custom criteria for splitting (e.g., customer_id, category, etc.)"
                    },
                    "fileNamingPattern": {
                        "type": "string",
                        "description": "Pattern for generating filenames (e.g., '{customer_name}_data.docx')"
                    }
                }
            },
            "documents": {
                "type": "array",
                "description": "Array of individual documents to generate",
                "items": {
                    "type": "object",
                    "required": ["id", "title", "sections", "filename"],
                    "properties": {
                        "id": {"type": "string"},
                        "title": {"type": "string"},
                        "filename": {"type": "string"},
                        "sections": {
                            "type": "array",
                            "items": {"$ref": "#/definitions/section"}
                        },
                        "metadata": {
                            "type": "object",
                            "description": "Document-specific metadata"
                        }
                    }
                }
            }
        }
    }
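To make the schema concrete, the sketch below shows a payload that would satisfy its required keys, together with a small checker. Everything here is illustrative: the customer names, the empty `sections` lists, and the `missing_required_keys` helper are assumptions rather than existing code. Note also that the `$ref` to `#/definitions/section` only resolves if the section definition from the existing schema is supplied alongside this one.

```python
from typing import Any, Dict, List

# Hypothetical example payload; names and values are illustrative only.
sample_response: Dict[str, Any] = {
    "metadata": {
        "title": "Customer Data Export",
        "splitStrategy": "by_criteria",
        "splitCriteria": {"field": "customer_id"},
        "fileNamingPattern": "{customer_name}_data.docx",
    },
    "documents": [
        {"id": "doc-1", "title": "Acme Corp", "filename": "acme_corp_data.docx", "sections": []},
        {"id": "doc-2", "title": "Globex", "filename": "globex_data.docx", "sections": []},
    ],
}


def missing_required_keys(payload: Dict[str, Any]) -> List[str]:
    """Return dotted paths of required keys that are absent from the payload."""
    missing = [key for key in ("metadata", "documents") if key not in payload]
    metadata = payload.get("metadata", {})
    missing += [f"metadata.{key}" for key in ("title", "splitStrategy") if key not in metadata]
    for index, doc in enumerate(payload.get("documents", [])):
        missing += [
            f"documents[{index}].{key}"
            for key in ("id", "title", "sections", "filename")
            if key not in doc
        ]
    return missing
```

A full implementation would more likely validate against the schema itself with a JSON Schema library; the hand-written check only mirrors the `required` lists above.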

1.2 Backward Compatibility

def get_document_subJsonSchema() -> Dict[str, Any]:
    """Get the JSON schema for single document generation (existing)."""
    # Keep existing schema unchanged for backward compatibility
    return existing_schema

def get_adaptive_json_schema(prompt_analysis: Dict[str, Any]) -> Dict[str, Any]:
    """Select the schema from the AI prompt analysis produced in section 2.1."""
    # Detection is done once by the AI service; this function only maps its result to a schema
    if prompt_analysis.get("is_multi_file", False):
        return get_multi_document_subJsonSchema()
    return get_document_subJsonSchema()

Phase 2: Enhanced AI Service

2.1 AI-Powered Multi-File Detection

File: gateway/modules/services/serviceAi/mainServiceAi.py

class AiService:
    async def _analyzePromptIntent(self, prompt: str, ai_service=None) -> Dict[str, Any]:
        """Use AI to analyze user prompt and determine processing requirements."""
        if not ai_service:
            return {"is_multi_file": False, "strategy": "single", "criteria": None}
        
        try:
            analysis_prompt = f"""
Analyze this user request and determine if it requires multiple file output or single file output.

User request: "{prompt}"

Respond with JSON only in this exact format:
{{
    "is_multi_file": true/false,
    "strategy": "single|per_entity|by_section|by_criteria|custom",
    "criteria": "description of how to split content",
    "file_naming_pattern": "suggested pattern for filenames",
    "reasoning": "brief explanation of the analysis"
}}

Consider:
- Does the user want separate files for different entities (customers, products, etc.)?
- Does the user want to split content into multiple documents?
- What would be the most logical way to organize the content?
- What language is the request in? (analyze in the original language)

Return only the JSON response.
"""
            
            from modules.datamodels.datamodelAi import AiCallRequest, AiCallOptions, OperationType
            request_options = AiCallOptions()
            request_options.operationType = OperationType.GENERAL
            
            request = AiCallRequest(prompt=analysis_prompt, context="", options=request_options)
            response = await ai_service.aiObjects.call(request)
            
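            if response and response.content:
                import json
                import re
                
                # Extract the outermost {...} span in case the model wrapped the JSON in extra text
                result = response.content.strip()
                json_match = re.search(r'\{.*\}', result, re.DOTALL)
                if json_match:
                    result = json_match.group(0)
                
                analysis = json.loads(result)
                # Require a JSON object with a real boolean, so a string like "false" is never treated as truthy
                if not isinstance(analysis, dict) or not isinstance(analysis.get("is_multi_file"), bool):
                    raise ValueError("unexpected analysis format")
                return analysis
            else:
                return {"is_multi_file": False, "strategy": "single", "criteria": None}
                
        except Exception as e:
            # Covers AI call failures, invalid JSON, and the format check above
            self.logger.warning(f"AI prompt analysis failed: {str(e)}, defaulting to single file")
            return {"is_multi_file": False, "strategy": "single", "criteria": None}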
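The JSON extraction step inside `_analyzePromptIntent` can be looked at in isolation. The sketch below is a standalone version under the assumption that the model may surround the JSON with extra prose; the helper name `extract_json_object` is hypothetical and not part of the existing code.

```python
import json
import re
from typing import Any, Dict, Optional


def extract_json_object(text: str) -> Optional[Dict[str, Any]]:
    """Return the JSON object embedded in text, or None if none can be parsed."""
    # Greedy match: from the first "{" to the last "}" in the text
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        return None
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    # Arrays and scalars are valid JSON but not a usable analysis
    return parsed if isinstance(parsed, dict) else None


reply = 'Here is the analysis: {"is_multi_file": true, "strategy": "per_entity"} Hope that helps.'
analysis = extract_json_object(reply)
```

Because the pattern is greedy, a reply that contains two separate JSON objects is matched as one span and fails to parse, so the caller falls back to the single-file default. That seems an acceptable trade-off for a detection step, but it is a limitation worth covering in tests.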

2.2 Enhanced Document Generation

async def _callAiWithDocumentGeneration(
    self,
    prompt: str,
    documents: Optional[List[ChatDocument]],
    options: AiCallOptions,
    outputFormat: str,
    title: Optional[str]
) -> Dict[str, Any]:
    """Enhanced document generation with AI-powered multi-file detection."""
    
    # Use AI to analyze prompt intent
    prompt_analysis = await self._analyzePromptIntent(prompt, self)
    
    if prompt_analysis.get("is_multi_file", False):
        return await self._callAiWithMultiFileGeneration(
            prompt, documents, options, outputFormat, title, prompt_analysis
        )
    else:
        # Use existing single-file logic
        return await self._callAiWithSingleFileGeneration(
            prompt, documents, options, outputFormat, title
        )

async def _callAiWithMultiFileGeneration(
    self,
    prompt: str,
    documents: Optional[List[ChatDocument]],
    options: AiCallOptions,
    outputFormat: str,
    title: Optional[str],
    prompt_analysis: Dict[str, Any]
) -> Dict[str, Any]:
    """Handle multi-file document generation using AI analysis."""
    
    # Get multi-file extraction prompt based on AI analysis
    generation_service = GenerationService(self.services)
    extraction_prompt = await generation_service.getAdaptiveExtractionPrompt(
        outputFormat=outputFormat,
        userPrompt=prompt,
        title=title,
        promptAnalysis=prompt_analysis,
        aiService=self
    )
    
    # Process with adaptive JSON schema
    ai_response = await self._callAiJson(extraction_prompt, documents, options)
    
    # Validate response structure
    if not self._validateResponseStructure(ai_response, prompt_analysis):
        # Fallback to single-file if multi-file fails
        self.logger.warning("Multi-file processing failed, falling back to single-file")
        return await self._callAiWithSingleFileGeneration(
            prompt, documents, options, outputFormat, title
        )
    
    # Process multiple documents
    generated_documents = []
    for doc_data in ai_response.get("documents", []):
        rendered_content, mime_type = await generation_service.renderReport(
            extractedContent={"sections": doc_data["sections"]},
            outputFormat=outputFormat,
            title=doc_data["title"],
            userPrompt=prompt,
            aiService=self
        )
        
        generated_documents.append({
            "documentName": doc_data["filename"],
            "documentData": rendered_content,
            "mimeType": mime_type
        })
    
    return {
        "success": True,
        "content": ai_response,
        "rendered_content": None,  # Not applicable for multi-file
        "mime_type": None,  # Not applicable for multi-file
        "filename": None,  # Not applicable for multi-file
        "format": outputFormat,
        "title": title,
        "documents": generated_documents,
        "is_multi_file": True,
        "split_strategy": prompt_analysis.get("strategy", "custom")
    }

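def _validateResponseStructure(self, response: Dict[str, Any], prompt_analysis: Dict[str, Any]) -> bool:
    """Validate that AI response matches the expected structure."""
    if not isinstance(response, dict):
        return False
    
    # Check for multi-file structure
    if prompt_analysis.get("is_multi_file", False):
        documents = response.get("documents")
        # An empty list would otherwise be reported as a successful multi-file result
        if not isinstance(documents, list) or not documents:
            return False
        # Every entry needs the keys the rendering loop reads
        return all(
            isinstance(doc, dict)
            and isinstance(doc.get("sections"), list)
            and all(key in doc for key in ("title", "filename"))
            for doc in documents
        )
    return isinstance(response.get("sections"), list)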

Phase 3: Enhanced Generation Service

3.1 Adaptive Prompt Builder

File: gateway/modules/services/serviceGeneration/subPromptBuilder.py

async def buildAdaptiveExtractionPrompt(
    outputFormat: str,
    userPrompt: str,
    title: str,
    promptAnalysis: Dict[str, Any],
    aiService=None,
    services=None
) -> str:
    """Build adaptive extraction prompt based on AI analysis."""
    import json  # json.dumps is used below to embed the schema in the prompt
    
    # Get appropriate JSON schema based on analysis
    if promptAnalysis.get("is_multi_file", False):
        from .subJsonSchema import get_multi_document_subJsonSchema
        json_schema = get_multi_document_subJsonSchema()
        schema_type = "multi-document"
    else:
        from .subJsonSchema import get_document_subJsonSchema
        json_schema = get_document_subJsonSchema()
        schema_type = "single-document"
    
    # Build adaptive prompt using AI analysis
    adaptive_prompt = f"""
{userPrompt}

You are extracting structured content from documents and must respond with valid JSON only.

IMPORTANT: You must respond with valid JSON only. No additional text, explanations, or formatting outside the JSON structure.

Processing Requirements:
- Output Type: {schema_type}
- Split Strategy: {promptAnalysis.get('strategy', 'single')}
- Split Criteria: {promptAnalysis.get('criteria', 'N/A')}
- File Naming Pattern: {promptAnalysis.get('file_naming_pattern', 'auto-generated')}

Extract the actual data from the source documents and structure it as JSON with this format:
{json.dumps(json_schema, indent=2)}

Requirements:
- Preserve all original data - do not summarize or interpret
- Follow the split strategy: {promptAnalysis.get('criteria', 'organize logically')}
- Use meaningful filenames that reflect the content
- Ensure each document is complete and self-contained
- Maintain data integrity and structure

Return only the JSON structure with actual data from the documents. Do not include any text before or after the JSON.
"""
    
    return adaptive_prompt

async def buildGenericExtractionPrompt(
    outputFormat: str,
    userPrompt: str,
    title: str,
    aiService=None,
    services=None
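) -> str:
    """Build generic extraction prompt that works for both single and multi-file."""
    # Import at function scope: the fallback prompt at the end also calls json.dumps,
    # and it must work when aiService is None or the analysis fails
    import json
    import re
    
    # Use AI to determine the best approach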
    if aiService:
        try:
            analysis_prompt = f"""
Analyze this user request and determine the best JSON structure for document extraction.

User request: "{userPrompt}"

Respond with JSON only:
{{
    "requires_multi_file": true/false,
    "recommended_schema": "single_document|multi_document",
    "split_approach": "description of how to organize content",
    "file_naming": "suggested naming pattern"
}}

Consider the user's intent and the most logical way to organize the extracted content.
"""
            
            from modules.datamodels.datamodelAi import AiCallRequest, AiCallOptions, OperationType
            request_options = AiCallOptions()
            request_options.operationType = OperationType.GENERAL
            
            request = AiCallRequest(prompt=analysis_prompt, context="", options=request_options)
            response = await aiService.aiObjects.call(request)
            
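            if response and response.content:
                import json
                import re
                
                result = response.content.strip()
                json_match = re.search(r'\{.*\}', result, re.DOTALL)
                if json_match:
                    result = json_match.group(0)
                
                raw = json.loads(result)
                
                # Map this prompt's keys onto the ones buildAdaptiveExtractionPrompt reads
                is_multi = raw.get("requires_multi_file") is True
                analysis = {"is_multi_file": is_multi, "strategy": "custom" if is_multi else "single"}
                if raw.get("split_approach"):
                    analysis["criteria"] = raw["split_approach"]
                if raw.get("file_naming"):
                    analysis["file_naming_pattern"] = raw["file_naming"]
                
                # Use analysis to build appropriate prompt
                return await buildAdaptiveExtractionPrompt(
                    outputFormat, userPrompt, title, analysis, aiService, services
                )
        except Exception as e:
            # services defaults to None, so guard before logging
            if services:
                services.utils.debugLogToFile(f"Generic prompt analysis failed: {str(e)}", "PROMPT_BUILDER")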
    
    # Fallback to single-file prompt
    from .subJsonSchema import get_document_subJsonSchema
    json_schema = get_document_subJsonSchema()
    
    return f"""
{userPrompt}

You are extracting structured content from documents and must respond with valid JSON only.

IMPORTANT: You must respond with valid JSON only. No additional text, explanations, or formatting outside the JSON structure.

Extract the actual data from the source documents and structure it as JSON with this format:
{json.dumps(json_schema, indent=2)}

Requirements:
- Preserve all original data - do not summarize or interpret
- Use the exact JSON schema provided
- Maintain data integrity and structure

Return only the JSON structure with actual data from the documents. Do not include any text before or after the JSON.
"""

3.2 Adaptive Generation Service

File: gateway/modules/services/serviceGeneration/mainServiceGeneration.py

class GenerationService:
    async def getAdaptiveExtractionPrompt(
        self,
        outputFormat: str,
        userPrompt: str,
        title: str,
        promptAnalysis: Dict[str, Any],
        aiService=None
    ) -> str:
        """Get adaptive extraction prompt based on AI analysis."""
        from .subPromptBuilder import buildAdaptiveExtractionPrompt
        return await buildAdaptiveExtractionPrompt(
            outputFormat=outputFormat,
            userPrompt=userPrompt,
            title=title,
            promptAnalysis=promptAnalysis,
            aiService=aiService,
            services=self.services
        )
    
    async def getGenericExtractionPrompt(
        self,
        outputFormat: str,
        userPrompt: str,
        title: str,
        aiService=None
    ) -> str:
        """Get generic extraction prompt that works for both single and multi-file."""
        from .subPromptBuilder import buildGenericExtractionPrompt
        return await buildGenericExtractionPrompt(
            outputFormat=outputFormat,
            userPrompt=userPrompt,
            title=title,
            aiService=aiService,
            services=self.services
        )
    
    async def renderAdaptiveReport(
        self,
        extractedContent: Dict[str, Any],
        outputFormat: str,
        title: str,
        userPrompt: str = None,
        aiService=None,
        isMultiFile: bool = False
    ) -> Union[Tuple[str, str], List[Dict[str, Any]]]:
        """Render report adaptively based on content structure."""
        
        if isMultiFile and "documents" in extractedContent:
            return await self._renderMultiFileReport(
                extractedContent, outputFormat, title, userPrompt, aiService
            )
        else:
            return await self._renderSingleFileReport(
                extractedContent, outputFormat, title, userPrompt, aiService
            )
    
    async def _renderMultiFileReport(
        self,
        extractedContent: Dict[str, Any],
        outputFormat: str,
        title: str,
        userPrompt: str = None,
        aiService=None
    ) -> List[Dict[str, Any]]:
        """Render multiple documents from extracted content."""
        
        # Resolve the renderer once; it depends only on outputFormat
        renderer = self._getFormatRenderer(outputFormat)
        if not renderer:
            # Fail loudly instead of silently returning an empty list
            raise ValueError(f"No renderer available for output format: {outputFormat}")
        
        generated_documents = []
        
        for doc_data in extractedContent.get("documents", []):
            # Render each document with the existing single-file renderer
            rendered_content, mime_type = await renderer.render(
                extractedContent={"sections": doc_data["sections"]},
                title=doc_data["title"],
                userPrompt=userPrompt,
                aiService=aiService
            )
            
            generated_documents.append({
                "filename": doc_data["filename"],
                "content": rendered_content,
                "mime_type": mime_type,
                "title": doc_data["title"]
            })
        
        return generated_documents
    
    async def _renderSingleFileReport(
        self,
        extractedContent: Dict[str, Any],
        outputFormat: str,
        title: str,
        userPrompt: str = None,
        aiService=None
    ) -> Tuple[str, str]:
        """Render single file report (existing functionality)."""
        # Use existing renderReport method
        return await self.renderReport(
            extractedContent, outputFormat, title, userPrompt, aiService
        )
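Since `renderAdaptiveReport` returns either a `(content, mime_type)` tuple or a list of file dictionaries, callers have to branch on the result type. One way to keep that branching in a single place is a small normalizer, sketched here; the helper `normalize_render_result` and its `fallback_filename` parameter are assumptions for illustration.

```python
from typing import Any, Dict, List, Tuple, Union

RenderResult = Union[Tuple[str, str], List[Dict[str, Any]]]


def normalize_render_result(
    result: RenderResult, fallback_filename: str, title: str
) -> List[Dict[str, Any]]:
    """Convert either return shape of renderAdaptiveReport into a list of file dicts."""
    if isinstance(result, tuple):
        # Single-file path: wrap the one rendered file in the multi-file shape
        content, mime_type = result
        return [
            {
                "filename": fallback_filename,
                "content": content,
                "mime_type": mime_type,
                "title": title,
            }
        ]
    # Multi-file path: already a list of file dicts
    return list(result)
```

With this, downstream code can always iterate over a list, and the single-file case is simply a list of length one.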

Phase 4: Workflow Integration

4.1 Enhanced Workflow Service

File: gateway/modules/services/serviceWorkflow/mainServiceWorkflow.py

class WorkflowService:
    async def processAiActionWithMultiFileSupport(
        self,
        action,
        workflow,
        message_id: str = None
    ) -> Dict[str, Any]:
        """Process AI action with multi-file support."""
        
        # Call AI service
        ai_result = await self.services.ai.callAi(
            prompt=action.prompt,
            documents=action.documents,
            outputFormat=action.outputFormat,
            title=action.title
        )
        
        # Check if multi-file result
        if ai_result.get("is_multi_file", False):
            # Process multiple documents
            created_documents = []
            for doc_info in ai_result.get("documents", []):
                document = self.services.generation.createDocument(
                    fileName=doc_info["documentName"],
                    mimeType=doc_info["mimeType"],
                    content=doc_info["documentData"],
                    messageId=message_id
                )
                if document:
                    created_documents.append(document)
            
            return {
                "success": True,
                "documents": created_documents,
                "is_multi_file": True,
                "file_count": len(created_documents)
            }
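        else:
            # Use existing single-file processing.
            # Note: if _processSingleFileAction issues its own AI call, reuse ai_result here
            # instead, otherwise the single-file path pays for the AI call twice.
            return await self._processSingleFileAction(action, workflow, message_id)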

Key Improvements Made

AI-Powered Multi-Language Support

  • No Hardcoded Patterns: Removed all language-specific pattern matching
  • Generic AI Analysis: Uses AI to analyze prompts in any language
  • Language Agnostic: Works with English, German, French, Spanish, etc.
  • Context Understanding: AI understands intent regardless of language

Generic Function Design

  • _analyzePromptIntent(): Generic AI-powered prompt analysis
  • buildAdaptiveExtractionPrompt(): Adapts to any language and intent
  • buildGenericExtractionPrompt(): Fallback for any prompt type
  • renderAdaptiveReport(): Handles both single and multi-file generically

Enhanced Error Handling

  • Graceful Fallback: Always falls back to single-file if multi-file fails
  • Parsing Recovery: Extracts the JSON object from AI responses that wrap it in extra text, and defaults to single-file when parsing fails
  • Generic Validation: Validates responses regardless of language
  • Backward Compatibility: 100% compatibility with existing functionality

Implementation Phases

Phase 1: Foundation (Week 1-2)

  • Extend JSON schema for multi-document support
  • Add multi-file detection logic to AI service
  • Create multi-file prompt builder
  • Add backward compatibility tests

Phase 2: Core Processing (Week 3-4)

  • Implement multi-file AI processing
  • Add multi-file generation service methods
  • Create document splitter logic
  • Add error handling and fallback mechanisms

Phase 3: Integration (Week 5-6)

  • Integrate with workflow service
  • Add multi-file support to action processing
  • Create comprehensive tests
  • Performance optimization

Phase 4: Enhancement (Week 7-8)

  • Add advanced split strategies
  • Implement custom file naming patterns
  • Add multi-file validation
  • Documentation and examples
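For the custom file naming patterns, a possible starting point is sketched below. It fills a pattern such as `{customer_name}_data.docx`, replaces characters that are unsafe in filenames, and avoids duplicate names within one batch. The function `build_filename` and its parameters are hypothetical. Because both the pattern and the values may come from the AI response, the sketch uses a plain regex substitution rather than `str.format`.

```python
import re
from typing import Dict, Set


def build_filename(pattern: str, values: Dict[str, str], used: Set[str]) -> str:
    """Fill a naming pattern, strip unsafe characters, and avoid duplicate names."""

    def fill(match: "re.Match[str]") -> str:
        # Unknown placeholders are left as-is instead of raising
        return values.get(match.group(1), match.group(0))

    name = re.sub(r"\{(\w+)\}", fill, pattern)
    # Replace path separators, whitespace, and other unsafe characters
    name = re.sub(r'[\\/:*?"<>|\s]+', "_", name).strip("._") or "document"

    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = name, ""

    # Append _2, _3, ... until the name is unique within this batch
    candidate, counter = name, 1
    while candidate in used:
        counter += 1
        candidate = f"{stem}_{counter}{dot}{ext}"
    used.add(candidate)
    return candidate
```

Passing the same `used` set for every document in a batch is what prevents two customers with the same name from overwriting each other's file.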

Testing Strategy

Unit Tests

  • Multi-file detection accuracy
  • JSON schema validation
  • Document splitting logic
  • Error handling and fallbacks
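As an illustration of the fallback tests, the sketch below models the detection step with a stubbed AI call and checks that every failure mode yields the single-file default. `StubAi` and `analyze_prompt_intent` are simplified, hypothetical stand-ins; a real test would exercise `AiService._analyzePromptIntent` with the project's own request and response types.

```python
import asyncio
import json
from typing import Any, Dict

SINGLE_FILE_DEFAULT: Dict[str, Any] = {
    "is_multi_file": False,
    "strategy": "single",
    "criteria": None,
}


class StubAi:
    """Hypothetical stand-in for the AI call: returns a canned reply or raises."""

    def __init__(self, reply: str = "", fail: bool = False) -> None:
        self.reply, self.fail = reply, fail

    async def call(self, prompt: str) -> str:
        if self.fail:
            raise RuntimeError("simulated AI outage")
        return self.reply


async def analyze_prompt_intent(prompt: str, ai: StubAi) -> Dict[str, Any]:
    """Simplified model of the detection step: any failure yields the default."""
    try:
        analysis = json.loads(await ai.call(prompt))
        if not isinstance(analysis, dict) or not isinstance(analysis.get("is_multi_file"), bool):
            raise ValueError("unexpected analysis format")
        return analysis
    except Exception:
        return dict(SINGLE_FILE_DEFAULT)


def test_detection_falls_back_to_single_file() -> None:
    multi = asyncio.run(
        analyze_prompt_intent("one file per customer", StubAi('{"is_multi_file": true}'))
    )
    assert multi["is_multi_file"] is True
    # Outage, non-JSON reply, and JSON of the wrong shape must all fall back
    for broken in (StubAi(fail=True), StubAi("not json"), StubAi('["a list"]')):
        assert asyncio.run(analyze_prompt_intent("anything", broken)) == SINGLE_FILE_DEFAULT


test_detection_falls_back_to_single_file()
```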

Integration Tests

  • End-to-end multi-file generation
  • Backward compatibility with single-file
  • Performance with large documents
  • Various split strategies

User Acceptance Tests

  • "One file for each customer" scenarios
  • "Split data into meaningful pieces" scenarios
  • Complex multi-file requests
  • Error recovery and fallbacks

Performance Considerations

Optimization Strategies

  1. Parallel Processing: Process multiple documents simultaneously
  2. Caching: Cache renderer instances and common operations
  3. Memory Management: Efficient handling of multiple large documents
  4. Error Recovery: Graceful fallback to single-file if multi-file fails
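The parallel processing item could be approached with `asyncio.gather`, as in the sketch below. It assumes that rendering is awaitable and mostly I/O-bound; CPU-bound renderers would not benefit without an executor. The names `render_all`, `render_one`, and `max_concurrent` are hypothetical, and `fake_render` only stands in for a real renderer.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Tuple

RenderFn = Callable[[Dict[str, Any]], Awaitable[Tuple[str, str]]]


async def render_all(
    documents: List[Dict[str, Any]], render_one: RenderFn, max_concurrent: int = 4
) -> List[Dict[str, Any]]:
    """Render documents concurrently, preserving input order and bounding concurrency."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def render_guarded(doc: Dict[str, Any]) -> Dict[str, Any]:
        # The semaphore caps how many renders run at the same time
        async with semaphore:
            content, mime_type = await render_one(doc)
        return {"documentName": doc["filename"], "documentData": content, "mimeType": mime_type}

    # gather returns results in the order the awaitables were passed in
    return await asyncio.gather(*(render_guarded(doc) for doc in documents))


async def fake_render(doc: Dict[str, Any]) -> Tuple[str, str]:
    """Stand-in renderer used only for this example."""
    await asyncio.sleep(0.01)
    return f"<h1>{doc['title']}</h1>", "text/html"


docs = [{"title": "A", "filename": "a.html"}, {"title": "B", "filename": "b.html"}]
rendered = asyncio.run(render_all(docs, fake_render))
```

The concurrency bound also addresses the memory management item: it limits how many large rendered documents are in flight at once.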

Monitoring

  • Track multi-file vs single-file usage patterns
  • Monitor performance impact
  • Measure success rates for different split strategies
  • User satisfaction metrics

Migration Strategy

Backward Compatibility

  • All existing single-file functionality remains unchanged
  • No breaking changes to existing APIs
  • Gradual rollout with feature flags
  • Comprehensive testing before production
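A feature flag for the gradual rollout could be as simple as the sketch below, which forces the single-file path while the flag is off. The environment variable name `MULTI_FILE_OUTPUT_ENABLED` and both helper functions are assumptions; the project may already have its own configuration mechanism that should be used instead.

```python
import os
from typing import Any, Dict, Mapping


def multi_file_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Read the hypothetical MULTI_FILE_OUTPUT_ENABLED flag; off unless explicitly enabled."""
    value = env.get("MULTI_FILE_OUTPUT_ENABLED", "")
    return value.strip().lower() in {"1", "true", "yes", "on"}


def effective_analysis(
    prompt_analysis: Dict[str, Any], env: Mapping[str, str] = os.environ
) -> Dict[str, Any]:
    """Force the single-file path while the flag is off, whatever the AI detected."""
    if prompt_analysis.get("is_multi_file") and not multi_file_enabled(env):
        return {**prompt_analysis, "is_multi_file": False, "strategy": "single"}
    return prompt_analysis
```

Applying `effective_analysis` directly after prompt analysis means the rest of the pipeline needs no flag checks of its own.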

Rollout Plan

  1. Internal Testing: Test with development team
  2. Beta Testing: Limited user group testing
  3. Gradual Rollout: Feature flag controlled release
  4. Full Deployment: Complete rollout after validation

Success Metrics

Technical Metrics

  • Multi-file detection accuracy > 95%
  • Processing time increase < 20% for multi-file
  • Error rate < 1% for multi-file processing
  • Backward compatibility maintained 100%

User Metrics

  • User satisfaction with multi-file output
  • Reduction in manual file splitting tasks
  • Increased usage of complex document processing
  • Support ticket reduction for file splitting requests

Conclusion

This refactoring approach adds multi-file output generation while preserving the strengths of the existing architecture. The phased implementation limits risk and keeps existing behavior compatible, and the testing strategy is designed to verify reliability and performance before rollout.

The solution addresses the core requirements:

  • Generic multi-file requests (any entity type: customers, products, sections, etc.)
  • Data splitting requests ("split into meaningful pieces", "organize by category", etc.)
  • Preserves existing single-file functionality
  • Maintains current module structure
  • Extends rather than replaces existing systems