gateway/test_ai_calls.md
2025-09-02 18:58:30 +02:00

AI Call Functions Test and Content Size Analysis

Overview

This file documents the ServiceCenter AI functions that risk sending oversized content to the AI model, along with their usage patterns and potential size issues.

High-Risk AI Functions

1. summarizeChat() -> callAiTextBasic()

Location: gateway/modules/chat/handling/promptFactory.py:122
Risk Level: MEDIUM
Content: Entire workflow message history
Usage:

messageSummary = await service.summarizeChat(context.workflow.messages) if context.workflow else ""

Potential Issues:

  • Long conversations can generate very large summaries
  • Includes all previous messages in workflow
  • No size limits or truncation
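
One way to bound the summarization input is to keep only the most recent messages that fit a character budget. The sketch below is illustrative only: `trim_messages`, `max_chars`, and the plain-string message representation are assumptions and not part of ServiceCenter.

```python
from typing import List


def trim_messages(messages: List[str], max_chars: int = 20_000) -> List[str]:
    """Keep the newest messages whose combined length fits within max_chars.

    Hypothetical helper: messages are modelled as plain strings here; the real
    workflow message objects are likely richer.
    """
    kept: List[str] = []
    used = 0
    # Walk from newest to oldest so the most recent context survives.
    for text in reversed(messages):
        if used + len(text) > max_chars:
            break
        kept.append(text)
        used += len(text)
    kept.reverse()  # restore chronological order
    return kept
```

The trimmed list could then be passed to the summarizer in place of the full history, at the cost of losing older context.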

2. callAiTextAdvanced() -> interfaceAiCalls.callAiTextAdvanced()

Risk Level: HIGH
Multiple Usage Points:

A. Task Planning (handlingTasks.py:116)

prompt = await self.service.callAiTextAdvanced(task_planning_prompt)

Content: User input + document context + connection context + previous results
Risk: VERY HIGH - includes all available documents and context

B. Action Definition (handlingTasks.py:388)

prompt = await self.service.callAiTextAdvanced(action_prompt)

Content: Task context + available documents + connections + previous results
Risk: HIGH - comprehensive context for action planning

C. Result Review (handlingTasks.py:894)

response = await self.service.callAiTextAdvanced(prompt)

Content: Action results + success criteria + context
Risk: MEDIUM-HIGH - depends on result size

D. Email Composition (methodOutlook.py:1609)

composed_email = await self.service.interfaceAiCalls.callAiTextAdvanced(ai_prompt)

Content: Document content + email requirements
Risk: MEDIUM - depends on document size

E. AI Processing (methodAi.py:175)

result = await self.service.callAiTextAdvanced(enhanced_prompt, context)

Content: User prompt + extracted document content
Risk: HIGH - includes full document content

3. callAiTextBasic() -> interfaceAiCalls.callAiTextBasic()

Risk Level: MEDIUM
Multiple Usage Points:

A. Document Format Conversion (methodDocument.py:429)

formatted_content = await self.service.callAiTextBasic(ai_prompt, content)

Content: Document content + format requirements
Risk: MEDIUM - depends on document size

B. HTML Report Generation (methodDocument.py:642)

aiReport = await self.service.callAiTextBasic(aiPrompt, combinedContent)

Content: Combined content from multiple documents
Risk: HIGH - combines multiple documents

C. AI Processing Fallback (methodAi.py:177)

result = await self.service.callAiTextBasic(enhanced_prompt, context)

Content: User prompt + document context
Risk: MEDIUM - includes document content

D. Document Content Processing (documentExtraction.py:1459)

processedContent = await self._serviceCenter.callAiTextBasic(aiPrompt, contentToProcess)

Content: Document chunks + AI prompt
Risk: MEDIUM - processes document chunks

4. extractContentFromDocument() -> documentProcessor.processFileData()

Risk Level: HIGH
Multiple Usage Points:

A. Document Content Extraction (methodDocument.py:74)

extracted_content = await self.service.extractContentFromDocument(
    prompt=aiPrompt,
    document=chatDocument
)

Content: Full document + extraction prompt
Risk: HIGH - processes entire documents

B. HTML Report Generation (methodDocument.py:581)

extracted_content = await self.service.extractContentFromDocument(
    prompt="Extract readable text content for HTML report generation", 
    document=doc
)

Content: Full document content
Risk: HIGH - processes documents for reports

C. Email Composition (methodOutlook.py:1510)

extracted_content = await self.service.extractContentFromDocument(
    prompt="Extract readable text content for email composition", 
    document=doc
)

Content: Full document content
Risk: HIGH - processes documents for emails

D. AI Processing (methodAi.py:94)

extracted_content = await self.service.extractContentFromDocument(
    prompt=extraction_prompt.strip(), 
    document=doc
)

Content: Full document content
Risk: HIGH - processes documents for AI analysis

Risk Assessment Summary

CRITICAL RISK (Immediate Attention Required)

  1. Task Planning (handlingTasks.py:116) - Entire workflow context
  2. Action Definition (handlingTasks.py:388) - Comprehensive context
  3. Document Processing (all extractContentFromDocument calls) - Full documents
  4. AI Method Processing (methodAi.py:175) - Document content + context
  5. Report Generation (methodDocument.py:642) - Multiple documents combined

HIGH RISK (Monitor Closely)

  1. Chat Summarization (promptFactory.py:122) - Message history
  2. Document Format Conversion (methodDocument.py:429) - Single documents
  3. Email Composition (methodOutlook.py:1609) - Document content

Potential Issues

Content Size Problems

  • Large documents (PDFs, Word docs, Excel files) can exceed AI model limits
  • Combined document content in reports can be massive
  • Long conversation histories in chat summarization
  • Full workflow context in task planning

Performance Issues

  • Timeout errors for large content
  • Memory issues with large document processing
  • API rate limiting with large requests
  • Cost implications for large AI calls

Error Scenarios

  • OpenAI API 400 errors (content too large)
  • Timeout errors (processing too slow)
  • Memory exhaustion (large document processing)
  • Incomplete processing (truncated content)

Recommendations

1. Content Size Limits

  • Implement maximum content size checks before AI calls
  • Truncate large content with appropriate warnings
  • Split large documents into chunks
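
A size check with truncation could look roughly like the following. This is a minimal sketch: `enforce_size_limit`, `MAX_CONTENT_CHARS`, and the notice text are hypothetical names, and the character budget is an assumed placeholder that would need tuning to the model actually in use.

```python
import logging

logger = logging.getLogger(__name__)

# Assumed budget in characters; tune to the real model's context window.
MAX_CONTENT_CHARS = 100_000
TRUNCATION_NOTICE = "\n[content truncated]"


def enforce_size_limit(content: str, max_chars: int = MAX_CONTENT_CHARS) -> str:
    """Return content unchanged if it fits, else a truncated copy with a notice."""
    if len(content) <= max_chars:
        return content
    logger.warning(
        "Content truncated from %d to %d characters", len(content), max_chars
    )
    # Reserve room for the notice so the result never exceeds max_chars.
    keep = max(0, max_chars - len(TRUNCATION_NOTICE))
    return (content[:keep] + TRUNCATION_NOTICE)[:max_chars]
```

Appending a visible notice lets the model and the user see that the input was cut, rather than silently processing partial content.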

2. Content Filtering

  • Remove unnecessary context from prompts
  • Filter out large binary content
  • Use document summaries instead of full content
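
The filtering ideas above can be sketched as a context builder that omits binary documents and prefers a summary when one exists. The dictionary shape (`name`, `mime_type`, `content`, optional `summary`) and the function names are assumptions made for illustration; the real document objects will differ.

```python
from typing import Dict, List

# Assumed allow-list of MIME types that are safe to embed as text.
TEXT_MIME_PREFIXES = ("text/",)
TEXT_MIME_TYPES = {"application/json", "application/xml"}


def is_textual(mime_type: str) -> bool:
    """Return True for MIME types treated as plain text in this sketch."""
    return mime_type.startswith(TEXT_MIME_PREFIXES) or mime_type in TEXT_MIME_TYPES


def build_context(documents: List[Dict[str, str]], summary_threshold: int = 5_000) -> str:
    """Assemble prompt context, omitting binaries and preferring summaries."""
    parts = []
    for doc in documents:
        name = doc["name"]
        if not is_textual(doc["mime_type"]):
            # Binary payloads add size but no usable information to a text prompt.
            parts.append(f"[{name}: binary content omitted]")
        elif len(doc["content"]) > summary_threshold and doc.get("summary"):
            parts.append(f"[{name} (summary)]\n{doc['summary']}")
        else:
            parts.append(f"[{name}]\n{doc['content']}")
    return "\n\n".join(parts)
```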

3. Chunking Strategy

  • Process large documents in smaller chunks
  • Implement progressive processing
  • Use streaming for large responses
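
As an illustration of chunked processing, the generator below splits text into fixed-size pieces with a small overlap so that sentences cut at a boundary still appear whole in one chunk. The function name and default sizes are assumptions; a production version might split on paragraph or token boundaries instead of raw characters.

```python
from typing import Iterator


def chunk_text(text: str, chunk_size: int = 8_000, overlap: int = 200) -> Iterator[str]:
    """Yield consecutive chunks of at most chunk_size characters.

    Adjacent chunks share `overlap` characters of context.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        yield text[start:start + chunk_size]
        # Stop once this chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
```

Each chunk could then be sent through its own AI call and the partial results merged afterwards.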

4. Caching and Optimization

  • Cache processed document content
  • Reuse extracted content across operations
  • Implement smart content selection
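
A content-addressed cache is one possible way to reuse extracted content across operations. In this sketch, `ExtractionCache` and the `compute` callback are hypothetical; the cache is in-memory and synchronous for brevity, whereas the real extraction call is awaited.

```python
import hashlib
from typing import Callable, Dict


class ExtractionCache:
    """In-memory cache keyed by a hash of the document bytes and the prompt."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    @staticmethod
    def _key(data: bytes, prompt: str) -> str:
        digest = hashlib.sha256()
        digest.update(prompt.encode("utf-8"))
        digest.update(b"\x00")  # separator so prompt and data cannot blur together
        digest.update(data)
        return digest.hexdigest()

    def get_or_compute(
        self, data: bytes, prompt: str, compute: Callable[[bytes, str], str]
    ) -> str:
        """Return the cached result, calling compute only on a cache miss."""
        key = self._key(data, prompt)
        if key not in self._store:
            self._store[key] = compute(data, prompt)
        return self._store[key]
```

Keying on both the document bytes and the prompt avoids returning an extraction that was produced for a different purpose.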

5. Error Handling

  • Graceful degradation for oversized content
  • Fallback strategies for failed AI calls
  • User notifications for content size issues
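
Graceful degradation could be sketched as a wrapper that retries once with shortened content and then falls back to a user-facing message. The names `call_with_fallback` and `AiCall`, the timeout, and the message text are assumptions; real code should catch the specific exceptions raised by the AI client rather than `Exception`.

```python
import asyncio
from typing import Awaitable, Callable

# Assumed shape of an AI call: (prompt, content) -> awaited response text.
AiCall = Callable[[str, str], Awaitable[str]]

FALLBACK_MESSAGE = "The content was too large to process. Please try a smaller document."


async def call_with_fallback(
    ai_call: AiCall,
    prompt: str,
    content: str,
    fallback_chars: int = 10_000,
    timeout_s: float = 60.0,
) -> str:
    """Try the full content first, then a shortened copy, then give up politely."""
    try:
        return await asyncio.wait_for(ai_call(prompt, content), timeout=timeout_s)
    except Exception:
        # Broad catch for the sketch only; narrow this in real code.
        pass
    shortened = content[:fallback_chars]
    try:
        return await asyncio.wait_for(ai_call(prompt, shortened), timeout=timeout_s)
    except Exception:
        return FALLBACK_MESSAGE
```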

Test Scenarios

Test Case 1: Large Document Processing

  • Upload a 10MB PDF document
  • Try to extract content for AI processing
  • Monitor for size limit errors

Test Case 2: Multiple Document Reports

  • Upload 5+ large documents
  • Generate HTML report
  • Check for combined content size issues

Test Case 3: Long Conversation History

  • Create workflow with 50+ messages
  • Test chat summarization
  • Monitor for context size limits

Test Case 4: Task Planning with Large Context

  • Create workflow with many documents
  • Test task planning functionality
  • Check for prompt size limits
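
Synthetic inputs for these scenarios can be generated in code instead of uploaded by hand. The helpers below are a sketch under stated assumptions: the function names are hypothetical, and the four-characters-per-token ratio is only a rough heuristic for English text, not a real tokenizer.

```python
import math
from typing import List


def make_messages(count: int, chars_each: int) -> List[str]:
    """Build a synthetic conversation of `count` messages with filler text."""
    return [f"message {i}: " + "x" * chars_each for i in range(count)]


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; use the model's tokenizer for real limits."""
    return math.ceil(len(text) / chars_per_token)


def exceeds_budget(messages: List[str], max_tokens: int) -> bool:
    """Return True if the joined messages are estimated to exceed max_tokens."""
    return estimate_tokens("\n".join(messages)) > max_tokens
```

For example, `exceeds_budget(make_messages(50, 2000), max_tokens=8000)` flags a 50-message history as oversized for an assumed 8,000-token budget.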

Monitoring Recommendations

  1. Log Content Sizes: Track the size of content sent to AI functions
  2. Monitor API Errors: Watch for 400 errors indicating content too large
  3. Performance Metrics: Track processing times for large content
  4. User Feedback: Monitor for incomplete or failed operations
  5. Cost Tracking: Monitor AI API costs for large requests
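
Logging content sizes and processing times could be done with a decorator around the async AI functions. This is a sketch, not the project's logging setup: `log_content_size` and the logger name are hypothetical, and only string arguments are counted.

```python
import asyncio
import functools
import logging
import time

logger = logging.getLogger("ai_calls")  # assumed logger name


def log_content_size(func):
    """Log the total size of string arguments and the elapsed time of an async call."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        size = sum(len(a) for a in args if isinstance(a, str))
        size += sum(len(v) for v in kwargs.values() if isinstance(v, str))
        started = time.perf_counter()
        try:
            return await func(*args, **kwargs)
        finally:
            # Runs on success and on failure, so failed calls are logged too.
            elapsed = time.perf_counter() - started
            logger.info("%s: %d chars sent, %.2fs elapsed", func.__name__, size, elapsed)

    return wrapper


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)

    @log_content_size
    async def fake_ai_call(prompt: str, content: str = "") -> str:
        return "ok"

    asyncio.run(fake_ai_call("Summarize this", content="x" * 1000))
```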

Implementation Priority

  1. Immediate: Add content size checks to extractContentFromDocument
  2. High: Implement chunking for large document processing
  3. Medium: Add content filtering to task planning prompts
  4. Low: Implement caching for processed content

This analysis should help identify and mitigate the risks of sending oversized content to AI functions.