doc updates
parent 804a00a196
commit 4e38e2abf5
10 changed files with 5214 additions and 8919 deletions

@@ -0,0 +1,716 @@
# Multi-File Output Refactoring Concept

## Overview

This document outlines a refactoring approach that enables multi-file output generation while preserving the existing modular architecture. The solution extends the current system to handle prompts such as "Deliver one file for each customer data" or "split the data into meaningful pieces" without breaking existing single-file functionality.

## Current System Analysis

### Strengths to Preserve
- **Modular Service Architecture**: Clean separation between `ExtractionService`, `AiService`, `GenerationService`, and `WorkflowService`
- **Renderer Engine**: Format-specific renderers (DOCX, PDF, HTML, etc.) managed through a registry pattern
- **Extraction Pipeline**: Content extraction and chunking system
- **JSON Schema Structure**: Well-defined document structure for AI processing
- **Backward Compatibility**: All existing single-file functionality must remain intact

### Current Limitations
- **Single File Output**: All renderers return `(content, mime_type)`, i.e. one file per call
- **Monolithic JSON Schema**: Designed for a single document structure
- **No Split Logic**: No mechanism to detect or handle multi-file requests
- **Fixed Document Array**: The AI service always returns a single document in the `documents` array

## Refactoring Architecture

### Core Design Principles

1. **Backward Compatibility First**: All existing single-file functionality remains unchanged
2. **Minimal Core Changes**: Extend existing patterns rather than replace them
3. **Renderer Preservation**: Keep all existing renderers unchanged
4. **AI-Powered Detection**: Use AI to analyze prompts in any language for multi-file requests
5. **Generic Functions**: All new functions are generic and language-agnostic
6. **Graceful Fallback**: Fall back to single-file if multi-file processing fails
7. **Performance Conscious**: Efficient processing without duplicating work

### Enhanced Processing Flow

```mermaid
flowchart TD
    start[User Prompt Analysis]
    detect{Multi-File Request?}
    single[Single-File Path]
    multi[Multi-File Path]

    start --> detect
    detect -->|No| single
    detect -->|Yes| multi

    single --> extract[Extraction Service]
    extract --> ai[AI Service - Single]
    ai --> render[Generation Service - Single]
    render --> result[Single Document]

    multi --> extractMulti[Extraction Service]
    extractMulti --> aiMulti[AI Service - Multi]
    aiMulti --> split[Document Splitter]
    split --> renderMulti[Multi-File Generator]
    renderMulti --> resultMulti[Multiple Documents]
```

## Implementation Strategy

### Phase 1: Enhanced JSON Schema

#### 1.1 Multi-Document Schema Extension

**File**: `gateway/modules/services/serviceGeneration/subJsonSchema.py`

```python
from typing import Any, Dict


def get_multi_document_subJsonSchema() -> Dict[str, Any]:
    """Get the JSON schema for multi-document generation."""
    return {
        "type": "object",
        "required": ["metadata", "documents"],
        "properties": {
            "metadata": {
                "type": "object",
                "required": ["title", "splitStrategy"],
                "properties": {
                    "title": {"type": "string"},
                    "splitStrategy": {
                        "type": "string",
                        # "per_entity" matches the strategy name used by the analysis prompt in 2.1
                        "enum": ["per_entity", "by_section", "by_criteria", "by_data_type", "custom"],
                        "description": "Strategy for splitting content into multiple files"
                    },
                    "splitCriteria": {
                        "type": "object",
                        "description": "Custom criteria for splitting (e.g., customer_id, category, etc.)"
                    },
                    "fileNamingPattern": {
                        "type": "string",
                        "description": "Pattern for generating filenames (e.g., '{customer_name}_data.docx')"
                    }
                }
            },
            "documents": {
                "type": "array",
                "description": "Array of individual documents to generate",
                "items": {
                    "type": "object",
                    "required": ["id", "title", "sections", "filename"],
                    "properties": {
                        "id": {"type": "string"},
                        "title": {"type": "string"},
                        "filename": {"type": "string"},
                        "sections": {
                            "type": "array",
                            # "#/definitions/section" is not defined in this schema; it must be
                            # supplied, e.g. reused from the existing single-document schema
                            "items": {"$ref": "#/definitions/section"}
                        },
                        "metadata": {
                            "type": "object",
                            "description": "Document-specific metadata"
                        }
                    }
                }
            }
        }
    }
```
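
To make the schema concrete, the sketch below builds a small payload of the shape the schema describes and checks the required keys with plain Python. The helper `check_multi_document_payload`, the sample titles and filenames, and the `heading`/`content` section fields are illustrative assumptions, not part of the existing code base; a real implementation would more likely validate against the schema with a JSON Schema library.

```python
from typing import Any, Dict, List

# Required keys copied from the multi-document schema above.
REQUIRED_TOP_LEVEL = ("metadata", "documents")
REQUIRED_METADATA = ("title", "splitStrategy")
REQUIRED_DOCUMENT = ("id", "title", "sections", "filename")


def check_multi_document_payload(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the shape is acceptable."""
    problems: List[str] = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in payload:
            problems.append(f"missing top-level key: {key}")

    metadata = payload.get("metadata", {})
    if isinstance(metadata, dict):
        for key in REQUIRED_METADATA:
            if key not in metadata:
                problems.append(f"missing metadata key: {key}")
    else:
        problems.append("metadata is not an object")

    documents = payload.get("documents", [])
    if not isinstance(documents, list):
        problems.append("documents is not an array")
        return problems
    for index, doc in enumerate(documents):
        if not isinstance(doc, dict):
            problems.append(f"documents[{index}] is not an object")
            continue
        for key in REQUIRED_DOCUMENT:
            if key not in doc:
                problems.append(f"documents[{index}] missing key: {key}")
    return problems


# Hypothetical payload: one document per customer.
sample_payload = {
    "metadata": {"title": "Customer export", "splitStrategy": "custom"},
    "documents": [
        {
            "id": "doc-1",
            "title": "Customer A",
            "filename": "customer_a_data.docx",
            "sections": [{"heading": "Orders", "content": []}],
        },
        # The filename is left out on purpose to show a reported problem.
        {"id": "doc-2", "title": "Customer B", "sections": []},
    ],
}

problems = check_multi_document_payload(sample_payload)
print(problems)
```

A check of this kind could run before rendering, so that an incomplete AI response triggers the single-file fallback instead of failing halfway through the loop over `documents`.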

#### 1.2 Backward Compatibility

```python
def get_document_subJsonSchema() -> Dict[str, Any]:
    """Get the JSON schema for single document generation (existing)."""
    # Keep existing schema unchanged for backward compatibility.
    # `existing_schema` is a placeholder for the schema this function already returns.
    return existing_schema


def get_adaptive_json_schema(user_prompt: str) -> Dict[str, Any]:
    """Automatically select appropriate schema based on user prompt analysis."""
    # `_is_multi_file_request` is not defined in this document; it stands for the
    # multi-file detection introduced in Phase 2.
    if _is_multi_file_request(user_prompt):
        return get_multi_document_subJsonSchema()
    return get_document_subJsonSchema()
```

### Phase 2: Enhanced AI Service

#### 2.1 AI-Powered Multi-File Detection

**File**: `gateway/modules/services/serviceAi/mainServiceAi.py`

```python
import json
import re
from typing import Any, Dict

from modules.datamodels.datamodelAi import AiCallRequest, AiCallOptions, OperationType


class AiService:
    async def _analyzePromptIntent(self, prompt: str, ai_service=None) -> Dict[str, Any]:
        """Use AI to analyze user prompt and determine processing requirements."""
        # Single-file is the safe default whenever the analysis cannot be completed
        default_analysis = {"is_multi_file": False, "strategy": "single", "criteria": None}
        if not ai_service:
            return default_analysis

        try:
            analysis_prompt = f"""
            Analyze this user request and determine if it requires multiple file output or single file output.

            User request: "{prompt}"

            Respond with JSON only in this exact format:
            {{
                "is_multi_file": true/false,
                "strategy": "single|per_entity|by_section|by_criteria|custom",
                "criteria": "description of how to split content",
                "file_naming_pattern": "suggested pattern for filenames",
                "reasoning": "brief explanation of the analysis"
            }}

            Consider:
            - Does the user want separate files for different entities (customers, products, etc.)?
            - Does the user want to split content into multiple documents?
            - What would be the most logical way to organize the content?
            - What language is the request in? (analyze in the original language)

            Return only the JSON response.
            """

            request_options = AiCallOptions()
            request_options.operationType = OperationType.GENERAL

            request = AiCallRequest(prompt=analysis_prompt, context="", options=request_options)
            response = await ai_service.aiObjects.call(request)

            if response and response.content:
                # Extract the JSON object from the response
                result = response.content.strip()
                json_match = re.search(r'\{.*\}', result, re.DOTALL)
                if json_match:
                    result = json_match.group(0)

                analysis = json.loads(result)
                # Reject replies that parse but are not a JSON object
                if isinstance(analysis, dict):
                    return analysis
            return default_analysis

        except Exception as e:
            self.logger.warning(f"AI prompt analysis failed: {str(e)}, defaulting to single file")
            return default_analysis
```
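
The greedy pattern `\{.*\}` spans from the first `{` to the last `}` in the reply, so it captures too much when the model adds text containing braces after the JSON object; `json.loads` then fails and the request silently drops to single-file. A minimal sketch of an alternative, using the standard library's `json.JSONDecoder.raw_decode` to parse the first decodable object and then filling in defaults. The names `parse_prompt_analysis` and `DEFAULT_ANALYSIS` are hypothetical and introduced only for illustration:

```python
import json
from typing import Any, Dict

DEFAULT_ANALYSIS: Dict[str, Any] = {
    "is_multi_file": False,
    "strategy": "single",
    "criteria": None,
}


def parse_prompt_analysis(raw_reply: str) -> Dict[str, Any]:
    """Parse the first JSON object in a model reply, else return single-file defaults."""
    decoder = json.JSONDecoder()
    position = raw_reply.find("{")
    while position != -1:
        try:
            # raw_decode parses one JSON value starting at `position` and ignores the rest
            candidate, _ = decoder.raw_decode(raw_reply, position)
        except json.JSONDecodeError:
            candidate = None
        if isinstance(candidate, dict):
            analysis = {**DEFAULT_ANALYSIS, **candidate}
            # Only a real boolean True enables the multi-file path
            analysis["is_multi_file"] = analysis["is_multi_file"] is True
            return analysis
        position = raw_reply.find("{", position + 1)
    return dict(DEFAULT_ANALYSIS)


reply = 'Sure! {"is_multi_file": true, "strategy": "per_entity"} Hope this helps {smile}.'
analysis = parse_prompt_analysis(reply)
print(analysis)
```

Merging the parsed object over the defaults also means callers can rely on `strategy` and `criteria` being present even when the model omits them.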

#### 2.2 Enhanced Document Generation

```python
class AiService:  # continued from 2.1
    async def _callAiWithDocumentGeneration(
        self,
        prompt: str,
        documents: Optional[List[ChatDocument]],
        options: AiCallOptions,
        outputFormat: str,
        title: Optional[str]
    ) -> Dict[str, Any]:
        """Enhanced document generation with AI-powered multi-file detection."""

        # Use AI to analyze prompt intent
        prompt_analysis = await self._analyzePromptIntent(prompt, self)

        if prompt_analysis.get("is_multi_file", False):
            return await self._callAiWithMultiFileGeneration(
                prompt, documents, options, outputFormat, title, prompt_analysis
            )
        else:
            # Use existing single-file logic
            return await self._callAiWithSingleFileGeneration(
                prompt, documents, options, outputFormat, title
            )

    async def _callAiWithMultiFileGeneration(
        self,
        prompt: str,
        documents: Optional[List[ChatDocument]],
        options: AiCallOptions,
        outputFormat: str,
        title: Optional[str],
        prompt_analysis: Dict[str, Any]
    ) -> Dict[str, Any]:
        """Handle multi-file document generation using AI analysis."""

        # Get multi-file extraction prompt based on AI analysis
        generation_service = GenerationService(self.services)
        extraction_prompt = await generation_service.getAdaptiveExtractionPrompt(
            outputFormat=outputFormat,
            userPrompt=prompt,
            title=title,
            promptAnalysis=prompt_analysis,
            aiService=self
        )

        # Process with adaptive JSON schema
        ai_response = await self._callAiJson(extraction_prompt, documents, options)

        # Validate response structure
        if not self._validateResponseStructure(ai_response, prompt_analysis):
            # Fallback to single-file if multi-file fails
            self.logger.warning("Multi-file processing failed, falling back to single-file")
            return await self._callAiWithSingleFileGeneration(
                prompt, documents, options, outputFormat, title
            )

        # Process multiple documents
        generated_documents = []
        for doc_data in ai_response.get("documents", []):
            rendered_content, mime_type = await generation_service.renderReport(
                extractedContent={"sections": doc_data["sections"]},
                outputFormat=outputFormat,
                title=doc_data["title"],
                userPrompt=prompt,
                aiService=self
            )

            generated_documents.append({
                "documentName": doc_data["filename"],
                "documentData": rendered_content,
                "mimeType": mime_type
            })

        return {
            "success": True,
            "content": ai_response,
            "rendered_content": None,  # Not applicable for multi-file
            "mime_type": None,  # Not applicable for multi-file
            "filename": None,  # Not applicable for multi-file
            "format": outputFormat,
            "title": title,
            "documents": generated_documents,
            "is_multi_file": True,
            "split_strategy": prompt_analysis.get("strategy", "custom")
        }

    def _validateResponseStructure(self, response: Dict[str, Any], prompt_analysis: Dict[str, Any]) -> bool:
        """Validate that AI response matches the expected structure."""
        try:
            if not isinstance(response, dict):
                return False

            # Check for multi-file structure
            if prompt_analysis.get("is_multi_file", False):
                documents = response.get("documents")
                if not isinstance(documents, list) or not documents:
                    return False
                # Every document must carry the keys the rendering loop reads
                return all(
                    isinstance(doc, dict)
                    and isinstance(doc.get("sections"), list)
                    and "title" in doc
                    and "filename" in doc
                    for doc in documents
                )
            else:
                return "sections" in response and isinstance(response["sections"], list)
        except Exception:
            return False
```
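
The `filename` values come straight from the AI response, so they may contain path separators, characters that are invalid on some file systems, or duplicates. A minimal sketch of a sanitizing step that could run before the documents are created; `sanitize_filenames` and its rules (ASCII-only names, numeric suffixes for duplicates) are assumptions for illustration, not existing helpers:

```python
import re
from typing import List


def sanitize_filenames(raw_names: List[str], extension: str) -> List[str]:
    """Return safe, unique filenames in the same order as the input."""
    seen = set()
    result = []
    for index, raw in enumerate(raw_names, start=1):
        # Drop any directory part, then replace runs of unsafe characters
        base = raw.replace("\\", "/").split("/")[-1]
        base = re.sub(r"[^A-Za-z0-9._-]+", "_", base).strip("._")
        # Remove the extension suggested by the model; the output format decides it
        stem = base.rsplit(".", 1)[0] if "." in base else base
        if not stem:
            stem = f"document_{index}"
        candidate = f"{stem}.{extension}"
        counter = 2
        # Compare case-insensitively so names stay unique on all file systems
        while candidate.lower() in seen:
            candidate = f"{stem}_{counter}.{extension}"
            counter += 1
        seen.add(candidate.lower())
        result.append(candidate)
    return result


names = sanitize_filenames(
    ["../etc/Acme & Co.docx", "report.docx", "report.docx", ""], "docx"
)
print(names)
```

The ASCII-only rule is deliberately conservative and lossy for non-English names; a production version would need a policy that matches the target storage.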

### Phase 3: Enhanced Generation Service

#### 3.1 Adaptive Prompt Builder

**File**: `gateway/modules/services/serviceGeneration/subPromptBuilder.py`

```python
import json
import re
from typing import Any, Dict

from modules.datamodels.datamodelAi import AiCallRequest, AiCallOptions, OperationType

from .subJsonSchema import get_document_subJsonSchema, get_multi_document_subJsonSchema


async def buildAdaptiveExtractionPrompt(
    outputFormat: str,
    userPrompt: str,
    title: str,
    promptAnalysis: Dict[str, Any],
    aiService=None,
    services=None
) -> str:
    """Build adaptive extraction prompt based on AI analysis."""

    # Get appropriate JSON schema based on analysis
    if promptAnalysis.get("is_multi_file", False):
        json_schema = get_multi_document_subJsonSchema()
        schema_type = "multi-document"
    else:
        json_schema = get_document_subJsonSchema()
        schema_type = "single-document"

    # Build adaptive prompt using AI analysis
    adaptive_prompt = f"""
    {userPrompt}

    You are extracting structured content from documents and must respond with valid JSON only.

    IMPORTANT: You must respond with valid JSON only. No additional text, explanations, or formatting outside the JSON structure.

    Processing Requirements:
    - Output Type: {schema_type}
    - Split Strategy: {promptAnalysis.get('strategy', 'single')}
    - Split Criteria: {promptAnalysis.get('criteria', 'N/A')}
    - File Naming Pattern: {promptAnalysis.get('file_naming_pattern', 'auto-generated')}

    Extract the actual data from the source documents and structure it as JSON with this format:
    {json.dumps(json_schema, indent=2)}

    Requirements:
    - Preserve all original data - do not summarize or interpret
    - Follow the split strategy: {promptAnalysis.get('criteria', 'organize logically')}
    - Use meaningful filenames that reflect the content
    - Ensure each document is complete and self-contained
    - Maintain data integrity and structure

    Return only the JSON structure with actual data from the documents. Do not include any text before or after the JSON.
    """

    return adaptive_prompt


async def buildGenericExtractionPrompt(
    outputFormat: str,
    userPrompt: str,
    title: str,
    aiService=None,
    services=None
) -> str:
    """Build generic extraction prompt that works for both single and multi-file."""

    # Use AI to determine the best approach
    if aiService:
        try:
            analysis_prompt = f"""
            Analyze this user request and determine the best JSON structure for document extraction.

            User request: "{userPrompt}"

            Respond with JSON only:
            {{
                "requires_multi_file": true/false,
                "recommended_schema": "single_document|multi_document",
                "split_approach": "description of how to organize content",
                "file_naming": "suggested naming pattern"
            }}

            Consider the user's intent and the most logical way to organize the extracted content.
            """

            request_options = AiCallOptions()
            request_options.operationType = OperationType.GENERAL

            request = AiCallRequest(prompt=analysis_prompt, context="", options=request_options)
            response = await aiService.aiObjects.call(request)

            if response and response.content:
                result = response.content.strip()
                json_match = re.search(r'\{.*\}', result, re.DOTALL)
                if json_match:
                    result = json_match.group(0)

                analysis = json.loads(result)

                # This prompt uses different keys than buildAdaptiveExtractionPrompt reads,
                # so map them before delegating
                is_multi = analysis.get("requires_multi_file") is True
                promptAnalysis = {
                    "is_multi_file": is_multi,
                    "strategy": "custom" if is_multi else "single",
                    "criteria": analysis.get("split_approach"),
                    "file_naming_pattern": analysis.get("file_naming"),
                }

                # Use analysis to build appropriate prompt
                return await buildAdaptiveExtractionPrompt(
                    outputFormat, userPrompt, title, promptAnalysis, aiService, services
                )
        except Exception as e:
            # `services` is optional, so guard the debug logging
            if services:
                services.utils.debugLogToFile(f"Generic prompt analysis failed: {str(e)}", "PROMPT_BUILDER")

    # Fallback to single-file prompt
    json_schema = get_document_subJsonSchema()

    return f"""
    {userPrompt}

    You are extracting structured content from documents and must respond with valid JSON only.

    IMPORTANT: You must respond with valid JSON only. No additional text, explanations, or formatting outside the JSON structure.

    Extract the actual data from the source documents and structure it as JSON with this format:
    {json.dumps(json_schema, indent=2)}

    Requirements:
    - Preserve all original data - do not summarize or interpret
    - Use the exact JSON schema provided
    - Maintain data integrity and structure

    Return only the JSON structure with actual data from the documents. Do not include any text before or after the JSON.
    """
```

#### 3.2 Adaptive Generation Service

**File**: `gateway/modules/services/serviceGeneration/mainServiceGeneration.py`

```python
from typing import Any, Dict, List, Optional, Tuple, Union


class GenerationService:
    async def getAdaptiveExtractionPrompt(
        self,
        outputFormat: str,
        userPrompt: str,
        title: str,
        promptAnalysis: Dict[str, Any],
        aiService=None
    ) -> str:
        """Get adaptive extraction prompt based on AI analysis."""
        from .subPromptBuilder import buildAdaptiveExtractionPrompt
        return await buildAdaptiveExtractionPrompt(
            outputFormat=outputFormat,
            userPrompt=userPrompt,
            title=title,
            promptAnalysis=promptAnalysis,
            aiService=aiService,
            services=self.services
        )

    async def getGenericExtractionPrompt(
        self,
        outputFormat: str,
        userPrompt: str,
        title: str,
        aiService=None
    ) -> str:
        """Get generic extraction prompt that works for both single and multi-file."""
        from .subPromptBuilder import buildGenericExtractionPrompt
        return await buildGenericExtractionPrompt(
            outputFormat=outputFormat,
            userPrompt=userPrompt,
            title=title,
            aiService=aiService,
            services=self.services
        )

    async def renderAdaptiveReport(
        self,
        extractedContent: Dict[str, Any],
        outputFormat: str,
        title: str,
        userPrompt: Optional[str] = None,
        aiService=None,
        isMultiFile: bool = False
    ) -> Union[Tuple[str, str], List[Dict[str, Any]]]:
        """Render report adaptively based on content structure."""

        if isMultiFile and "documents" in extractedContent:
            return await self._renderMultiFileReport(
                extractedContent, outputFormat, title, userPrompt, aiService
            )
        else:
            return await self._renderSingleFileReport(
                extractedContent, outputFormat, title, userPrompt, aiService
            )

    async def _renderMultiFileReport(
        self,
        extractedContent: Dict[str, Any],
        outputFormat: str,
        title: str,
        userPrompt: Optional[str] = None,
        aiService=None
    ) -> List[Dict[str, Any]]:
        """Render multiple documents from extracted content."""

        generated_documents = []

        # The output format is the same for every document, so look the renderer up once.
        # Without a renderer nothing can be rendered and the result stays empty.
        renderer = self._getFormatRenderer(outputFormat)
        if not renderer:
            return generated_documents

        for doc_data in extractedContent.get("documents", []):
            # Use existing single-file renderer for each document
            rendered_content, mime_type = await renderer.render(
                extractedContent={"sections": doc_data["sections"]},
                title=doc_data["title"],
                userPrompt=userPrompt,
                aiService=aiService
            )

            generated_documents.append({
                "filename": doc_data["filename"],
                "content": rendered_content,
                "mime_type": mime_type,
                "title": doc_data["title"]
            })

        return generated_documents

    async def _renderSingleFileReport(
        self,
        extractedContent: Dict[str, Any],
        outputFormat: str,
        title: str,
        userPrompt: Optional[str] = None,
        aiService=None
    ) -> Tuple[str, str]:
        """Render single file report (existing functionality)."""
        # Use existing renderReport method
        return await self.renderReport(
            extractedContent, outputFormat, title, userPrompt, aiService
        )
```

### Phase 4: Workflow Integration

#### 4.1 Enhanced Workflow Service

**File**: `gateway/modules/services/serviceWorkflow/mainServiceWorkflow.py`

```python
from typing import Any, Dict, Optional


class WorkflowService:
    async def processAiActionWithMultiFileSupport(
        self,
        action,
        workflow,
        message_id: Optional[str] = None
    ) -> Dict[str, Any]:
        """Process AI action with multi-file support."""

        # Call AI service
        ai_result = await self.services.ai.callAi(
            prompt=action.prompt,
            documents=action.documents,
            outputFormat=action.outputFormat,
            title=action.title
        )

        # Check if multi-file result
        if ai_result.get("is_multi_file", False):
            # Process multiple documents
            created_documents = []
            for doc_info in ai_result.get("documents", []):
                document = self.services.generation.createDocument(
                    fileName=doc_info["documentName"],
                    mimeType=doc_info["mimeType"],
                    content=doc_info["documentData"],
                    messageId=message_id
                )
                # Documents that could not be created are skipped
                if document:
                    created_documents.append(document)

            return {
                "success": True,
                "documents": created_documents,
                "is_multi_file": True,
                "file_count": len(created_documents)
            }
        else:
            # Use existing single-file processing
            return await self._processSingleFileAction(action, workflow, message_id)
```

## Key Improvements Made

### AI-Powered Multi-Language Support
- **No Hardcoded Patterns**: Removed all language-specific pattern matching
- **Generic AI Analysis**: Uses AI to analyze prompts in any language
- **Language Agnostic**: Intended to work with English, German, French, Spanish, etc.
- **Context Understanding**: The AI infers intent regardless of language

### Generic Function Design
- **`_analyzePromptIntent()`**: Generic AI-powered prompt analysis
- **`buildAdaptiveExtractionPrompt()`**: Adapts to any language and intent
- **`buildGenericExtractionPrompt()`**: Fallback for any prompt type
- **`renderAdaptiveReport()`**: Handles both single and multi-file generically

### Enhanced Error Handling
- **Graceful Fallback**: Falls back to single-file whenever prompt analysis or response validation fails
- **Parsing Recovery**: Extracts the JSON object from the AI response with a regular expression; unparsable responses default to single-file
- **Generic Validation**: Validates the response structure regardless of language
- **Backward Compatibility**: Existing single-file functionality stays unchanged

## Implementation Phases

### Phase 1: Foundation (Weeks 1-2)
- [ ] Extend JSON schema for multi-document support
- [ ] Add multi-file detection logic to AI service
- [ ] Create multi-file prompt builder
- [ ] Add backward compatibility tests

### Phase 2: Core Processing (Weeks 3-4)
- [ ] Implement multi-file AI processing
- [ ] Add multi-file generation service methods
- [ ] Create document splitter logic
- [ ] Add error handling and fallback mechanisms

### Phase 3: Integration (Weeks 5-6)
- [ ] Integrate with workflow service
- [ ] Add multi-file support to action processing
- [ ] Create comprehensive tests
- [ ] Performance optimization

### Phase 4: Enhancement (Weeks 7-8)
- [ ] Add advanced split strategies
- [ ] Implement custom file naming patterns
- [ ] Add multi-file validation
- [ ] Documentation and examples

## Testing Strategy

### Unit Tests
- Multi-file detection accuracy
- JSON schema validation
- Document splitting logic
- Error handling and fallbacks

### Integration Tests
- End-to-end multi-file generation
- Backward compatibility with single-file
- Performance with large documents
- Various split strategies

### User Acceptance Tests
- "One file for each customer" scenarios
- "Split data into meaningful pieces" scenarios
- Complex multi-file requests
- Error recovery and fallbacks

## Performance Considerations

### Optimization Strategies
1. **Parallel Processing**: Process multiple documents simultaneously
2. **Caching**: Cache renderer instances and common operations
3. **Memory Management**: Efficient handling of multiple large documents
4. **Error Recovery**: Graceful fallback to single-file if multi-file fails
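
The parallel-processing item can be sketched with `asyncio.gather` and a semaphore that caps concurrency. The names `render_documents_concurrently`, `render_one`, and `fake_render` are hypothetical; `fake_render` only stands in for an existing renderer, and the sketch assumes the real renderers are safe to call concurrently, which this document does not confirm.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Tuple

RenderFn = Callable[[Dict[str, Any]], Awaitable[Tuple[str, str]]]


async def render_documents_concurrently(
    documents: List[Dict[str, Any]],
    render_one: RenderFn,
    max_parallel: int = 4,
) -> List[Dict[str, Any]]:
    """Render documents concurrently, keeping input order and skipping failures."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def guarded(doc: Dict[str, Any]) -> Tuple[str, str]:
        # At most `max_parallel` renders run at the same time
        async with semaphore:
            return await render_one(doc)

    # return_exceptions=True keeps one failing document from cancelling the rest
    outcomes = await asyncio.gather(
        *(guarded(doc) for doc in documents), return_exceptions=True
    )

    rendered = []
    for doc, outcome in zip(documents, outcomes):
        if isinstance(outcome, Exception):
            continue  # a real implementation would log the failure here
        content, mime_type = outcome
        rendered.append(
            {"filename": doc["filename"], "content": content, "mime_type": mime_type}
        )
    return rendered


async def fake_render(doc: Dict[str, Any]) -> Tuple[str, str]:
    """Stand-in renderer used only for this example."""
    await asyncio.sleep(0)
    if not doc.get("sections"):
        raise ValueError("nothing to render")
    return f"<h1>{doc['title']}</h1>", "text/html"


docs = [
    {"title": "A", "filename": "a.html", "sections": [{}]},
    {"title": "B", "filename": "b.html", "sections": []},
    {"title": "C", "filename": "c.html", "sections": [{}]},
]
rendered = asyncio.run(render_documents_concurrently(docs, fake_render))
print([item["filename"] for item in rendered])
```

Capping concurrency also addresses the memory-management item, since only a bounded number of rendered documents are in flight at once.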

### Monitoring
- Track multi-file vs single-file usage patterns
- Monitor performance impact
- Measure success rates for different split strategies
- User satisfaction metrics

## Migration Strategy

### Backward Compatibility
- All existing single-file functionality remains unchanged
- No breaking changes to existing APIs
- Gradual rollout with feature flags
- Comprehensive testing before production
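
A feature-flag gate for the gradual rollout could look like the sketch below. The environment variable `MULTI_FILE_OUTPUT_ENABLED` and the functions `multi_file_enabled` and `choose_path` are assumptions made for illustration; the project may already have its own configuration mechanism.

```python
import os
from typing import Any, Dict, Mapping, Optional

TRUTHY = {"1", "true", "yes", "on"}


def multi_file_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Read the hypothetical MULTI_FILE_OUTPUT_ENABLED flag; the default is off."""
    env = os.environ if environ is None else environ
    return env.get("MULTI_FILE_OUTPUT_ENABLED", "false").strip().lower() in TRUTHY


def choose_path(analysis: Dict[str, Any], environ: Optional[Mapping[str, str]] = None) -> str:
    """Return "multi" only when the flag is on and the analysis asks for it."""
    if multi_file_enabled(environ) and analysis.get("is_multi_file") is True:
        return "multi"
    return "single"


# With the flag off, even a multi-file request takes the existing single-file path
print(choose_path({"is_multi_file": True}, {}))
print(choose_path({"is_multi_file": True}, {"MULTI_FILE_OUTPUT_ENABLED": "true"}))
```

Defaulting to "single" keeps the existing behavior for every deployment that has not opted in.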

### Rollout Plan
1. **Internal Testing**: Test with development team
2. **Beta Testing**: Limited user group testing
3. **Gradual Rollout**: Feature flag controlled release
4. **Full Deployment**: Complete rollout after validation

## Success Metrics

### Technical Metrics
- Multi-file detection accuracy > 95%
- Processing time increase < 20% for multi-file
- Error rate < 1% for multi-file processing
- Backward compatibility maintained 100%
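
The detection-accuracy target needs a labeled prompt set to be measured against. A minimal sketch, assuming a hand-labeled list of prompts; `detection_accuracy`, the sample prompts, and `keyword_detector` are hypothetical, and the keyword detector is only a stand-in so the example runs without an AI call:

```python
from typing import Callable, List, Tuple


def detection_accuracy(
    labeled_prompts: List[Tuple[str, bool]],
    detector: Callable[[str], bool],
) -> float:
    """Fraction of prompts for which the detector matches the expected label."""
    if not labeled_prompts:
        raise ValueError("need at least one labeled prompt")
    correct = sum(
        1 for prompt, expected in labeled_prompts if detector(prompt) == expected
    )
    return correct / len(labeled_prompts)


# Hypothetical evaluation set: (prompt, expects multi-file output)
labeled = [
    ("Deliver one file for each customer", True),
    ("Summarize this report", False),
    ("Split the data into meaningful pieces", True),
    ("Translate the document to German", False),
    ("Create a file per product", True),
]


def keyword_detector(prompt: str) -> bool:
    """Stand-in for the AI analysis; misses phrasings without these keywords."""
    text = prompt.lower()
    return "each" in text or "split" in text


accuracy = detection_accuracy(labeled, keyword_detector)
print(f"accuracy={accuracy:.2f}, target met: {accuracy > 0.95}")
```

In practice the detector would wrap the AI-based analysis, and the evaluation set would include prompts in every language the feature is meant to support.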

### User Metrics
- User satisfaction with multi-file output
- Reduction in manual file splitting tasks
- Increased usage of complex document processing
- Support ticket reduction for file splitting requests

## Conclusion

This refactoring approach adds multi-file output generation while preserving the strengths of the existing architecture. The phased implementation limits risk and keeps existing behavior compatible, and the testing strategy is intended to verify reliability and performance.

The solution addresses the core requirements:
- ✅ Generic multi-file requests (any entity type: customers, products, sections, etc.)
- ✅ Data splitting requests ("split into meaningful pieces", "organize by category", etc.)
- ✅ Preserves existing single-file functionality
- ✅ Maintains current module structure
- ✅ Extends rather than replaces existing systems

2249
poweron/testdata-wait/mail_full_uc.json
Normal file
File diff suppressed because one or more lines are too long
BIN
poweron/testdata/PowerOn NDA 2025.docx
vendored
Binary file not shown.
BIN
poweron/testdata/SILF_AI_VO_PM.pptx
vendored
Binary file not shown.
BIN
poweron/testdata/auszug_liste_positionen.pdf
vendored
Binary file not shown.
8808
poweron/testdata/data_full.csv
vendored
File diff suppressed because it is too large
BIN
poweron/testdata/diagramm_komponenten.pdf
vendored
Binary file not shown.
Binary file not shown. (Before: size 51 KiB)
111
poweron/testdata/future path.html
vendored
@@ -1,111 +0,0 @@
<!DOCTYPE html>
<html lang="de">
<head>
  <meta charset="UTF-8">
  <title>Opportunities > Future-Fit – Blue Ocean</title>
  <style>
    body {
      font-family: 'Segoe UI', Arial, sans-serif;
      background: #fff;
      color: #1a2747;
      margin: 0;
      padding: 2cm 2.5cm 2cm 2.5cm;
      box-sizing: border-box;
    }
    h1 {
      color: #1a2747;
      font-size: 2.1em;
      margin-bottom: 0.2em;
    }
    h2 {
      color: #2e4a7d;
      font-size: 1.2em;
      margin-top: 1.5em;
      margin-bottom: 0.5em;
      letter-spacing: 0.01em;
    }
    .subtitle {
      color: #4b5c7d;
      font-size: 1.1em;
      margin-bottom: 1.2em;
    }
    .grid {
      display: grid;
      grid-template-columns: 1fr 1fr;
      gap: 1.5em 2.5em;
      margin-top: 1.5em;
    }
    .section {
      margin-bottom: 0.5em;
    }
    ul {
      margin: 0.2em 0 0.8em 1.2em;
      padding: 0;
      color: #1a2747;
      font-size: 1em;
    }
    li {
      margin-bottom: 0.3em;
      line-height: 1.5;
    }
    @media print {
      body { padding: 0.5cm 1cm; }
    }
  </style>
</head>
<body>
  <h1>Opportunities > Future-Fit</h1>
  <div class="subtitle">"Blue Ocean"</div>
  <div class="grid">
    <div class="section">
      <h2>Fokus</h2>
      <ul>
        <li>Demokratisierung von KI: KI-gestuetzte Workflows und Beratung fuer alle Mitarbeitenden, nicht nur fuer IT oder Management</li>
        <li>Integration von E-Mail, Daten und Dokumenten in einer Plattform</li>
        <li>Self-Service & Automatisierung: Kunden befaehigen, selbst Innovationen und Optimierungen umzusetzen</li>
      </ul>
    </div>
    <div class="section">
      <h2>Shape / Re-Think</h2>
      <ul>
        <li>Beratung als Plattform: Von klassischer Beratung zu "Consulting-as-a-Platform"</li>
        <li>Kollaborative KI-Workflows: Teams arbeiten gemeinsam mit KI an Projekten</li>
        <li>Proaktive Innovation: Die Plattform erkennt und schlaegt Innovationspotenziale vor</li>
      </ul>
    </div>
    <div class="section">
      <h2>Create</h2>
      <ul>
        <li>Digitaler Co-Pilot fuer jede Rolle im Unternehmen</li>
        <li>Plug & Play Consulting-Module und KI-Agenten</li>
        <li>Unternehmensweites Wissensnetzwerk statt Silos</li>
        <li>Beratungsergebnisse als digitale, wiederverwendbare Assets (Workflows, Agenten)</li>
      </ul>
    </div>
    <div class="section">
      <h2>Eliminate</h2>
      <ul>
        <li>IT-Huerden und langwierige Integrationsprojekte</li>
        <li>Abhaengigkeit von externen Beratern fuer jede Optimierung</li>
        <li>Wissenssilos und Intransparenz</li>
      </ul>
    </div>
    <div class="section">
      <h2>Opportunities</h2>
      <ul>
        <li>Erschliessung neuer Kundensegmente (KMU, Non-Profits, etc.), die bisher keinen Zugang zu KI-Beratung hatten</li>
        <li>Aufbau eines Oekosystems fuer digitale Beratungs- und KI-Module</li>
        <li>Positionierung als Innovationsfuehrer im Beratungsmarkt</li>
      </ul>
    </div>
    <div class="section">
      <h2>Threats</h2>
      <ul>
        <li>Schnelle Nachahmung durch Wettbewerber</li>
        <li>Datenschutz- und Compliance-Anforderungen</li>
        <li>Technologische Ueberforderung bei Kunden ohne Change Management</li>
      </ul>
    </div>
  </div>
</body>
</html>
2249
poweron/testdata/mail_full_uc.json
vendored
Normal file
File diff suppressed because one or more lines are too long