gateway/modules/workflows/processing/shared/RENDERING_ISSUE_ANALYSIS.md

Rendering Issue Analysis

Why HTML Documents Are Being Rendered as Text

Date: 2025-12-22
Issue: Documents requested as HTML are being output as text/plain


Root Cause Analysis

Issue 1: resultType Not Extracted from Task Objective CRITICAL

Problem:

  • The task objective states: "Generate a complete, well-structured HTML document"
  • Validation reports: EXPECTED FORMATS: ['html']
  • The action was nevertheless called with empty parameters: ai.generateDocument {}
  • As a result, resultType falls back to its default, "docx", instead of "html"

Location:

  • generateDocument.py line 44: resultType = parameters.get("resultType", "docx")
  • No parameter extraction from task objective/prompt

Impact: CRITICAL - Wrong format is used even though task clearly requests HTML

Fix Needed:

  • Extract resultType from task objective/prompt before calling action
  • Or enhance generateDocument to detect format from prompt if not provided

Issue 2: HTML Not in Action Definition Options CRITICAL

Problem:

  • Action definition in methodAi.py line 357 only lists: ["docx", "pdf", "txt", "md"]
  • "html" is NOT in the allowed options
  • But docstring says HTML is supported: "resultType (str, optional): Output format (docx, pdf, txt, md, html, etc.)"

Location:

  • methodAi.py line 357: frontendOptions=["docx", "pdf", "txt", "md"]

Impact: CRITICAL - Even if HTML is requested, it might be rejected or not recognized

Fix Needed:

  • Add "html" to frontendOptions list

Issue 3: Renderer Fallback to Text CRITICAL

Problem:

  • With empty parameters, resultType="docx" is used (the default)
  • If the docx renderer fails or is not found, the system falls back to the text renderer (lines 403-404 of mainServiceGeneration.py)
  • This fallback explains why the output is text/plain instead of HTML

Location:

  • mainServiceGeneration.py lines 393-409: _getFormatRenderer() method
  • Line 403: logger.warning(f"No renderer found for format {output_format}, falling back to text")

Impact: CRITICAL - Wrong format is rendered

Fix Needed:

  • Fix docx renderer if it's failing
  • Or better: Extract correct format from prompt

Issue 4: Missing Parameter Extraction HIGH PRIORITY

Problem:

  • Task objective contains format information ("HTML document")
  • But no parameter extraction step extracts resultType from prompt
  • Action is called with empty parameters {}

Location:

  • Workflow execution - parameter extraction phase
  • Should extract resultType: "html" from task objective

Impact: HIGH - System can't infer format from user intent

Fix Needed:

  • Add parameter extraction that detects format from prompt
  • Or enhance generateDocument to auto-detect format from prompt

Flow Analysis

Expected Flow:

1. Task Objective: "Generate HTML document..."
2. Parameter Extraction: Extract resultType="html" from objective
3. Action Call: ai.generateDocument({resultType: "html", prompt: "..."})
4. Content Generation: Generate sections with content
5. Integration: Merge sections into complete structure
6. Rendering: Call renderReport(outputFormat="html")
7. HTML Renderer: Render to HTML
8. Output: document.html (text/html)

Actual Flow (Broken):

1. Task Objective: "Generate HTML document..."
2. Parameter Extraction: ❌ MISSING - no extraction
3. Action Call: ai.generateDocument({}) ❌ Empty parameters
4. Content Generation: ✅ Generate sections with content
5. Integration: ✅ Merge sections into complete structure
6. Rendering: Call renderReport(outputFormat="docx") ❌ Wrong format
7. Docx Renderer: ❌ Fails or not found
8. Fallback: Text renderer ❌ Wrong renderer
9. Output: document.text (text/plain) ❌ Wrong format
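
The broken path can be reproduced in a few lines. This is a minimal sketch, not the project's code: RENDERERS, get_renderer, and generate_document are hypothetical stand-ins, and the registry deliberately has no docx entry to mimic the "fails or not found" case.

```python
# Hypothetical stand-in for the renderer registry: format -> (extension, MIME type).
# "docx" is deliberately absent to mimic a renderer that fails or is not found.
RENDERERS = {
    "html": ("html", "text/html"),
    "text": ("text", "text/plain"),
}


def get_renderer(output_format):
    """Return the registry entry for a format, falling back to text if it is missing."""
    renderer = RENDERERS.get(output_format)
    if renderer is None:
        # Silent fallback: the caller cannot tell that the format changed.
        renderer = RENDERERS["text"]
    return renderer


def generate_document(parameters):
    """Mimic the action: default to docx when resultType is missing."""
    result_type = parameters.get("resultType", "docx")
    extension, mime_type = get_renderer(result_type)
    return f"document.{extension}", mime_type


# Empty parameters: docx default, no docx renderer, text fallback.
broken = generate_document({})
# Explicit resultType: the HTML entry is selected directly.
fixed = generate_document({"resultType": "html"})
print(broken)
print(fixed)
```

With empty parameters the sketch ends at the text renderer, matching steps 6 to 9 above. Passing resultType explicitly selects the HTML entry without touching the fallback.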

Fixes Required

Fix 1: Add HTML to Action Definition Options EASY

File: gateway/modules/workflows/methods/methodAi/methodAi.py
Line: 357

Change:

frontendOptions=["docx", "pdf", "txt", "md", "html"],  # Added "html"
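
A small consistency check could catch this kind of drift between the option list and the formats the docstring names. This is a sketch only: find_missing_options is a hypothetical helper, and it assumes both lists are available as plain Python lists.

```python
def find_missing_options(frontend_options, documented_formats):
    """Return documented formats that the action definition does not offer, sorted."""
    return sorted(set(documented_formats) - set(frontend_options))


# Option list as it stands before Fix 1.
options_before = ["docx", "pdf", "txt", "md"]
# Option list after Fix 1.
options_after = ["docx", "pdf", "txt", "md", "html"]
# Formats the docstring names explicitly (its trailing "etc." is ignored here).
documented_formats = ["docx", "pdf", "txt", "md", "html"]

print(find_missing_options(options_before, documented_formats))
print(find_missing_options(options_after, documented_formats))
```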

Fix 2: Extract resultType from Prompt MEDIUM

Option A: Enhance generateDocument to detect format from prompt

File: gateway/modules/workflows/methods/methodAi/actions/generateDocument.py
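At line 44 (replaces the existing default and adds detection after it):

import re  # add to the module imports

# Use None as the sentinel so an explicit "docx" request is never overridden
resultType = parameters.get("resultType")

# Auto-detect format from prompt if not provided
if resultType is None and prompt:
    promptLower = prompt.lower()
    # Match whole words so "md" does not hit "cmd" and "txt" does not hit other words
    if re.search(r"\bhtml5?\b", promptLower):
        resultType = "html"
    elif re.search(r"\bpdf\b", promptLower):
        resultType = "pdf"
    elif re.search(r"\b(markdown|md)\b", promptLower):
        resultType = "md"
    # A bare "text" is too common in prompts, so only "plain text" or "txt" counts
    elif re.search(r"\b(plain text|txt)\b", promptLower):
        resultType = "txt"

# Keep the existing default when nothing was requested or detected
resultType = resultType or "docx"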

Option B: Extract in parameter planning phase (better, but requires workflow changes)
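
Option B could look roughly like the following. This is a sketch under assumptions: extract_result_type, plan_parameters, and FORMAT_PATTERNS are hypothetical names, and where this step would hook into the workflow's planning phase is not shown. An explicitly supplied resultType always wins over detection.

```python
import re

# Hypothetical keyword table: format -> regex matched against the task objective.
# Word boundaries keep "md" from matching "cmd" and "txt" from matching other words.
FORMAT_PATTERNS = [
    ("html", re.compile(r"\bhtml5?\b", re.IGNORECASE)),
    ("pdf", re.compile(r"\bpdf\b", re.IGNORECASE)),
    ("md", re.compile(r"\b(markdown|md)\b", re.IGNORECASE)),
    ("txt", re.compile(r"\b(plain text|txt)\b", re.IGNORECASE)),
]


def extract_result_type(objective):
    """Return the first format named in the objective, or None if none is found."""
    for result_type, pattern in FORMAT_PATTERNS:
        if pattern.search(objective):
            return result_type
    return None


def plan_parameters(objective, parameters):
    """Fill in resultType from the objective unless the caller already set it."""
    planned = dict(parameters)
    if "resultType" not in planned:
        detected = extract_result_type(objective)
        if detected is not None:
            planned["resultType"] = detected
    return planned


objective = "Generate a complete, well-structured HTML document"
print(plan_parameters(objective, {}))
print(plan_parameters(objective, {"resultType": "pdf"}))
```

Doing the extraction before the action call keeps generateDocument free of prompt parsing, at the cost of the workflow changes noted above.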


Fix 3: Improve Renderer Error Handling MEDIUM

File: gateway/modules/services/serviceGeneration/mainServiceGeneration.py
Lines: 393-409

Enhance: Better error messages and logging when renderer not found

def _getFormatRenderer(self, output_format: str):
    """Get the appropriate renderer for the specified format using auto-discovery."""
    try:
        from .renderers.registry import getRenderer
        renderer = getRenderer(output_format, services=self.services)
        
        if renderer:
            return renderer
        
        # Log available formats for debugging
        from .renderers.registry import getSupportedFormats
        availableFormats = getSupportedFormats()
        logger.error(
            f"No renderer found for format '{output_format}'. "
            f"Available formats: {availableFormats}"
        )
        
        # Fallback to text renderer if no specific renderer found
        logger.warning(f"Falling back to text renderer for format {output_format}")
        fallbackRenderer = getRenderer('text', services=self.services)
        if fallbackRenderer:
            return fallbackRenderer
        
        logger.error("Even text renderer fallback failed")
        return None
        
    except Exception as e:
        logger.error(f"Error getting renderer for '{output_format}': {e}")
        return None

Verification Steps

After fixes:

  1. Test HTML Generation:

    • Task: "Generate HTML document about AI"
    • Expected: resultType="html" extracted or detected
    • Expected: HTML renderer used
    • Expected: Output is document.html with text/html MIME type
  2. Test Format Detection:

    • Task: "Generate PDF report"
    • Expected: resultType="pdf" detected
    • Expected: PDF renderer used
  3. Test Explicit Parameter:

    • Action: ai.generateDocument({resultType: "html", prompt: "..."})
    • Expected: HTML renderer used (no fallback)
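
The checks above can be turned into a small helper. This is a sketch, assuming the rendered output is available as a filename plus MIME type; check_output and the EXPECTED_MIME table are hypothetical and not part of the codebase.

```python
# Hypothetical table of the MIME type each requested format should produce.
EXPECTED_MIME = {
    "html": "text/html",
    "pdf": "application/pdf",
}


def check_output(filename, mime_type, expected_format):
    """Return a list of mismatches between a rendered output and the requested format."""
    problems = []
    extension = filename.rsplit(".", 1)[-1].lower()
    if extension != expected_format:
        problems.append(f"extension '{extension}' != '{expected_format}'")
    expected_mime = EXPECTED_MIME[expected_format]
    if mime_type != expected_mime:
        problems.append(f"MIME type '{mime_type}' != '{expected_mime}'")
    return problems


# Output of the broken flow: flagged on both extension and MIME type.
print(check_output("document.text", "text/plain", "html"))
# Output expected after the fixes: no problems reported.
print(check_output("document.html", "text/html", "html"))
```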

Summary

Root Causes:

  1. resultType not extracted from task objective
  2. HTML not in action definition options
  3. Renderer fallback to text when docx fails
  4. No format auto-detection from prompt

Priority: CRITICAL - System cannot produce HTML documents as requested

Estimated Fix Time:

  • Fix 1: 5 minutes
  • Fix 2: 30 minutes
  • Fix 3: 15 minutes
  • Total: ~1 hour

Analysis Complete: 2025-12-22