wiki/appdoc/loop_plan.md
2025-12-03 23:02:58 +01:00

Refactoring Plan: Integrate Intent Analysis into Existing Prompts and Simplify AI Loop Mode

Overview

Integrate IntentAnalyzer logic into existing prompts (userintention, taskplan, dynamic) instead of making separate AI calls (saves 3 AI calls). Simplify AI loop logic to: Complete JSON = Stop, Cut-off JSON = Continue. Remove Definition of Done (DoD) logic entirely.

Key Change: Keep intent analysis logic (dataType, expectedFormats, qualityRequirements, etc.) but merge it into existing prompts instead of separate calls.

Overlap Analysis:

  • UserIntention prompt already does: language detection, normalization, intent extraction
  • Intent Analysis prompt does: primaryGoal, dataType, expectedFormats, qualityRequirements, successCriteria, language detection
  • Overlap: Language detection (both do it)
  • Solution: Merge intent analysis fields into userintention prompt (one call instead of two)

Changes Required

Phase 1: Integrate Intent Analysis into Existing Prompts

1. Update UserIntention Prompt (Integrate Intent Analysis)

  • File: gateway/modules/workflows/workflowManager.py
  • Lines: 353-378
  • Changes:
    • Merge intent analysis fields (from intentAnalyzer.py lines 2-31) into userintention prompt
    • Add fields: primaryGoal, dataType, expectedFormats, qualityRequirements, successCriteria
    • Keep existing fields: detectedLanguage, normalizedRequest, intent, contextItems
    • Integration: Combine both prompts into one - userintention already does language detection and normalization, now also does full intent analysis
    • Note: Adding 5 fields is expected to keep prompt complexity acceptable for the model; the Quality Tests in the Testing Checklist verify this
    • CRITICAL: Intent check should be different on workflow, task, and action levels (keep separate)
    • New Prompt Structure:
      analyzerPrompt = (
          "You are an input analyzer. From the user's message, perform ALL of the following in one pass:\n"
          "1) detectedLanguage: detect ISO 639-1 language code (e.g., de, en).\n"
          "2) normalizedRequest: full, explicit restatement of the user's request in the detected language; do NOT summarize; preserve ALL constraints and details.\n"
          "3) intent: concise single-paragraph core request in the detected language for high-level routing.\n"
          "4) contextItems: supportive data blocks to attach as separate documents if significantly larger than the intent.\n"
          "5) primaryGoal: The main objective the user wants to achieve.\n"
          "6) dataType: What type of data/content they want (numbers|text|documents|analysis|code|unknown).\n"
          "7) expectedFormats: What file format(s) they expect - provide matching file format extensions list (e.g., [\"xlsx\", \"pdf\"]). If format is unclear or not specified, use empty list [].\n"
          "8) qualityRequirements: Quality requirements they have (accuracy, completeness) as {accuracyThreshold: 0.0-1.0, completenessThreshold: 0.0-1.0}.\n"
          "9) successCriteria: Specific success criteria that define completion (array of strings).\n\n"
          "Rules:\n"
          "- If total content (intent + data) is < 10% of model max tokens, do not extract; return empty contextItems and keep intent compact and self-contained.\n"
          "- If content exceeds that threshold, move bulky parts into contextItems; keep intent short and clear.\n"
          "- Preserve critical references (URLs, filenames) in intent.\n"
          "- Normalize to the primary detected language if mixed-language.\n\n"
          "Return ONLY JSON (no markdown) with this shape:\n"
          "{\n"
          "  \"detectedLanguage\": \"de|en|fr|it|...\",\n"
          "  \"normalizedRequest\": \"Full explicit instruction in detected language\",\n"
          "  \"intent\": \"Concise normalized request...\",\n"
          "  \"contextItems\": [...],\n"
          "  \"primaryGoal\": \"The main objective the user wants to achieve\",\n"
          "  \"dataType\": \"numbers|text|documents|analysis|code|unknown\",\n"
          "  \"expectedFormats\": [\"pdf\", \"docx\", \"xlsx\", ...],\n"
          "  \"qualityRequirements\": {\n"
          "    \"accuracyThreshold\": 0.0-1.0,\n"
          "    \"completenessThreshold\": 0.0-1.0\n"
          "  },\n"
          "  \"successCriteria\": [\"specific criterion 1\", \"specific criterion 2\"]\n"
          "}\n\n"
          f"User message:\n{self.services.utils.sanitizePromptContent(userInput.prompt, 'userinput')}"
      )
      
    • Update parsing (lines 397-402): Extract new fields and store in workflow object
    • Store as workflow._workflowIntent for reuse
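
The parsing update could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `parseWorkflowIntent`, the fallback values, and the clamping of thresholds are hypothetical and not taken from workflowManager.py.

```python
import json
from typing import Any, Dict

# Allowed values as listed in the prompt's dataType field.
ALLOWED_DATA_TYPES = {"numbers", "text", "documents", "analysis", "code", "unknown"}
# Assumed fallback when the model omits or garbles a threshold (not from the plan).
DEFAULT_THRESHOLD = 0.8


def _clamp01(value: Any, default: float) -> float:
    """Coerce a threshold to a float in [0.0, 1.0]; use the default on bad input."""
    try:
        return min(1.0, max(0.0, float(value)))
    except (TypeError, ValueError):
        return default


def parseWorkflowIntent(rawJson: str) -> Dict[str, Any]:
    """Parse the combined analyzer response into one intent dict (hypothetical helper)."""
    data = json.loads(rawJson)
    quality = data.get("qualityRequirements")
    if not isinstance(quality, dict):
        quality = {}
    dataType = data.get("dataType", "unknown")
    return {
        # Existing fields
        "detectedLanguage": data.get("detectedLanguage", "en"),
        "normalizedRequest": data.get("normalizedRequest", ""),
        "intent": data.get("intent", ""),
        "contextItems": data.get("contextItems") or [],
        # Fields merged in from the former intent analysis prompt
        "primaryGoal": data.get("primaryGoal", ""),
        "dataType": dataType if dataType in ALLOWED_DATA_TYPES else "unknown",
        "expectedFormats": data.get("expectedFormats") or [],
        "qualityRequirements": {
            "accuracyThreshold": _clamp01(quality.get("accuracyThreshold"), DEFAULT_THRESHOLD),
            "completenessThreshold": _clamp01(quality.get("completenessThreshold"), DEFAULT_THRESHOLD),
        },
        "successCriteria": data.get("successCriteria") or [],
    }
```

The returned dict is what would be stored as workflow._workflowIntent; unknown dataType values fall back to "unknown" so downstream code only sees the values the prompt allows.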

2. Remove IntentAnalyzer Class and Calls

  • File: gateway/modules/workflows/processing/adaptive/intentAnalyzer.py

  • Action: Delete entire file (logic now integrated into prompts)

  • File: gateway/modules/workflows/processing/adaptive/__init__.py

  • Action: Remove IntentAnalyzer from exports

  • File: gateway/modules/workflows/processing/core/taskPlanner.py

    • Line 56: Remove workflowIntent = await intentAnalyzer.analyzeUserIntent(actualUserPrompt, None)
    • Line 60: Use workflowIntent from workflow object (set in workflowManager)
    • Lines 167-173: Use workflowIntent from workflow object for dataType/expectedFormats/qualityRequirements
  • File: gateway/modules/workflows/processing/modes/modeDynamic.py

    • Line 36: Remove self.intentAnalyzer = IntentAnalyzer(services)
    • Line 66: Use workflow._workflowIntent from workflow object (already set)
    • Line 72: Remove self.taskIntent = await self.intentAnalyzer.analyzeUserIntent(taskStep.objective, context)
    • Line 362: Remove actionIntent = await self.intentAnalyzer.analyzeUserIntent(actionObjective, context)
    • Lines 60-67: Use existing workflowIntent from workflow object
    • Lines 359-373: Remove actionIntent creation logic (will be integrated into dynamic prompts)

3. Remove IntentAnalyzer Imports

  • File: taskPlanner.py (line 12): Remove from modules.workflows.processing.adaptive import IntentAnalyzer
  • File: modeDynamic.py (line 25): Remove IntentAnalyzer from imports

Phase 2: Integrate Information Gathering into Existing Prompts

4. Update Taskplan Prompt (Use Workflow Intent from UserIntention, Allow Override)

  • File: gateway/modules/workflows/processing/shared/promptGenerationTaskplan.py
  • Changes:
    • Use workflowIntent from workflow object (already set in workflowManager from userintention analysis)
    • Pass workflowIntent fields as context to taskplan prompt via placeholders
    • CRITICAL: Allow taskplan to override workflow intent if task-specific needs differ
    • Example: Workflow wants PDF, but task needs CSV for intermediate step
    • Taskplan prompt can reference workflow intent but can override with task-specific values
    • If taskplan needs task-specific intent analysis, add fields to task JSON:
      {
          "overview": "...",
          "userMessage": "...",
          "tasks": [
              {
                  "id": "task_1",
                  "objective": "...",
                  "dataType": "numbers|text|documents|analysis|code|unknown",  // Inherit from workflow or task-specific
                  "expectedFormats": ["pdf", "docx", ...],  // Inherit from workflow or task-specific
                  "qualityRequirements": {...},  // Inherit from workflow or task-specific
                  ...
              }
          ]
      }
      
    • Update taskPlanner.py to use workflowIntent from workflow object (line 60)
    • Extract task-specific fields from task plan response if provided
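
The override rule (task-specific values win, everything else is inherited from the workflow intent) can be sketched as follows. The helper name `resolveTaskIntent` and the treatment of empty values as "not provided" are assumptions made for illustration.

```python
from typing import Any, Dict

# Intent fields a task may override, as listed in the task JSON of this step.
INTENT_FIELDS = ("dataType", "expectedFormats", "qualityRequirements")


def resolveTaskIntent(workflowIntent: Dict[str, Any], task: Dict[str, Any]) -> Dict[str, Any]:
    """Return the effective intent for one task (hypothetical helper).

    A field set on the task overrides the workflow value; a missing or
    empty field (None, "", [], {}) is inherited from the workflow intent.
    """
    resolved: Dict[str, Any] = {}
    for field in INTENT_FIELDS:
        taskValue = task.get(field)
        if taskValue in (None, "", [], {}):
            resolved[field] = workflowIntent.get(field)
        else:
            resolved[field] = taskValue
    return resolved


# Example from the plan: the workflow wants PDF, an intermediate task needs CSV.
workflowIntent = {
    "dataType": "documents",
    "expectedFormats": ["pdf"],
    "qualityRequirements": {"accuracyThreshold": 0.9, "completenessThreshold": 0.9},
}
task = {"id": "task_1", "objective": "...", "expectedFormats": ["csv"]}
taskIntent = resolveTaskIntent(workflowIntent, task)
```

Here `taskIntent["expectedFormats"]` is `["csv"]` while dataType and qualityRequirements come from the workflow intent.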

5. Update Dynamic Plan Selection Prompt (Integrate Action Intent Analysis)

  • File: gateway/modules/workflows/processing/shared/promptGenerationActionsDynamic.py
  • Function: generateDynamicPlanSelectionPrompt
  • Changes:
    • Integrate intent analysis into action selection prompt
    • Add intent analysis instructions to prompt template
    • Add to JSON response structure:
      {
          "action": "...",
          "actionObjective": "...",
          "dataType": "numbers|text|documents|analysis|code|unknown",  // Analyze from actionObjective
          "expectedFormats": ["pdf", "docx", "xlsx", ...],  // Analyze from actionObjective
          "qualityRequirements": {
              "accuracyThreshold": 0.0-1.0,
              "completenessThreshold": 0.0-1.0
          },  // Analyze from actionObjective
          "successCriteria": ["specific criterion 1", ...],  // Analyze from actionObjective
          "userMessage": "...",
          "learnings": [...],
          "requiredInputDocuments": [...],
          "requiredConnection": "...",
          "parametersContext": "..."
      }
      
    • Update prompt instructions to analyze actionObjective for these fields
    • Extract these fields in modeDynamic.py when processing selection
    • Store as workflow._actionIntent for use in AI loop (but without DoD)

6. Update Dynamic Parameters Prompt

  • File: gateway/modules/workflows/processing/shared/promptGenerationActionsDynamic.py
  • Function: generateDynamicParametersPrompt
  • Changes:
    • Add completion criteria description (natural language, not DoD metrics)
    • Ask AI to describe what "complete" means for this action
    • Example: "This action is complete when: [description]"
    • Use natural language completion criteria instead of DoD metrics

Phase 3: Simplify AI Loop Logic

7. Simplify _shouldContinueGeneration

  • File: gateway/modules/services/serviceAi/mainServiceAi.py
  • Lines: 904-943
  • Changes:
    • Remove DoD/KPI checking logic
    • Remove workflowIntent parameter usage for DoD
    • Remove the _analyzeTaskCompletion call, if any (verified: the method is not called anywhere)
    • CRITICAL: JSON completeness is determined by parsing, NOT by checking last character!
    • The last-character check (line 860) is wrong: a trailing } or ] may only close a nested structure while the outer JSON is still incomplete
    • New Logic:
      def _shouldContinueGeneration(
          self,
          allSections: List[Dict[str, Any]],
          iteration: int,
          wasJsonComplete: bool,
          rawResponse: Optional[str] = None
      ) -> bool:
          """
          Determine if AI generation loop should continue.
      
          Simple logic:
          - If JSON parsing failed or incomplete → continue (needs more content)
          - If JSON parses successfully and is complete → stop (all content delivered)
          - Loop detection prevents infinite loops
      
          CRITICAL: JSON completeness is determined by parsing, NOT by last character check!
          Returns True if we should continue, False if AI Loop is done.
          """
          if len(allSections) == 0:
              return True  # No sections yet, continue
      
          # CRITERION 1: Loop detection runs first; otherwise a response that never parses would continue forever
          if self._isStuckInLoop(allSections, iteration):
              logger.warning(f"Iteration {iteration}: Detected potential infinite loop - stopping AI loop")
              return False
      
          # CRITERION 2: If JSON was incomplete/broken (parsing failed or incomplete) - continue to repair/complete
          if not wasJsonComplete:
              logger.info(f"Iteration {iteration}: JSON incomplete/broken - continuing to complete")
              return True
      
          # JSON is complete (parsed successfully) and not stuck in loop - done
          logger.info(f"Iteration {iteration}: JSON complete - AI loop done")
          return False
      
    • Remove userPrompt and workflowIntent parameters (no longer needed)
    • Update _extractSectionsFromResponse: Remove last character check (line 860), rely only on JSON parsing
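
To illustrate why the last-character check is unreliable, the sketch below compares it with a parsing-based check. Both function names are hypothetical; only `json.loads` and `json.JSONDecodeError` come from the Python standard library.

```python
import json


def endsWithClosingBracket(raw: str) -> bool:
    """The check being removed: looks only at the last non-whitespace character."""
    return raw.rstrip().endswith(("}", "]"))


def isJsonComplete(raw: str) -> bool:
    """Parsing-based check: the response is complete only if it parses as JSON."""
    try:
        json.loads(raw)
        return True
    except json.JSONDecodeError:
        return False


# Cut off right after an inner object: ends with "}", but the outer list and
# object are still open, so the document is incomplete.
truncated = '{"sections": [{"id": "s1", "elements": [{"text": "a"}]}'
complete = truncated + "]}"

lastCharSaysComplete = endsWithClosingBracket(truncated)  # True (wrong)
parserSaysComplete = isJsonComplete(truncated)            # False (correct)
```

Mid-string and mid-number cuts such as `{"value": 1234` are also caught, because the enclosing object is never closed. A bare top-level number cut short would still parse, so the check relies on the response being a JSON object or array.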

8. Remove _analyzeTaskCompletion and Check DoD Usage

  • File: gateway/modules/services/serviceAi/mainServiceAi.py
  • Lines: 945-1090
  • Action: Delete entire method (verified: not called anywhere in codebase)
  • Action: Remove all references to this method
  • CRITICAL: Review in depth how the code handles completeness checks:
    • _refineDecide (modeDynamic.py line 693) uses content validation and analysis
    • ContentValidator checks content quality and requirements
    • ProgressTracker tracks progress state
    • Verify: DoD checking happens in refinement/validation phase, NOT in AI loop
    • Action: Ensure validation/refinement phase still checks requirements after DoD removal

9. Revise buildContinuationContext

  • File: gateway/modules/shared/jsonUtils.py

  • Lines: 448-1016

  • Changes:

    A. New Summary Format (per-section counts):

    Following data has already been delivered:
    
    - heading "id" <list of elements.level . elements.text>
    - paragraph with <count> texts
    - bullet_list with <count> items
    - table "id" with <count> rows
    - code_block "id" with <count> code lines
    
    Check if with already delivered data and the last delivered data part the full response is delivered. 
    If not, deliver the remaining part.
    

    Rules:

    • If section has no ID, omit it from summary (don't show "unknown")
    • If summary is too long (more than 200 items), truncate: show first 100 items and last 100 items (remove middle)

    B. New Extraction Algorithm:

    1. Loop over all sections of the JSON (as it is, cut off), until a section is not complete
    2. CRITICAL: There is always only one section incomplete (JSON cut-off point)
    3. In the cut off section, loop through all elements, until an element is cut off
    4. Edge case: If cut-off is in first element, just show cut-off element (no element before exists)
    5. Normal case: Return cut-off element AND the element before it to give to the next iteration prompt
    6. CRITICAL: In 99% of cases, JSON is cut off mid-string or mid-number - deliver the cut-off part as-is (don't try to "complete" it)
    7. Performance: No problem - we only parse one AI response, not all accumulated sections
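
    One way to locate the cut-off point in the raw response is a single pass that tracks string state and open brackets. This is a sketch only: `scanCutOff` is a hypothetical helper, not one of the existing jsonUtils.py functions, and it reports where the cut happened without trying to repair it.

```python
from typing import Dict, Union


def scanCutOff(raw: str) -> Dict[str, Union[bool, str]]:
    """Scan a possibly truncated JSON text once (hypothetical helper).

    Returns whether the text ends inside a string and which brackets are
    still open, outermost first. Brackets inside strings are ignored.
    """
    openBrackets = []
    inString = False
    escaped = False
    for ch in raw:
        if inString:
            if escaped:
                escaped = False      # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                inString = False     # closing quote
            continue
        if ch == '"':
            inString = True
        elif ch in "{[":
            openBrackets.append(ch)
        elif ch in "}]" and openBrackets:
            openBrackets.pop()
    return {"inString": inString, "openBrackets": "".join(openBrackets)}


# Cut off mid-string inside the first element of the first section.
state = scanCutOff('{"sections": [{"id": "s1", "elements": [{"text": "incompl')
```

    For this input the scan reports a cut inside a string with five structures still open (root object, sections list, section, elements list, element), which matches the "only one incomplete section" assumption of the algorithm.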

    C. Implementation:

    def buildContinuationContext(allSections: List[Dict[str, Any]], lastRawResponse: Optional[str] = None) -> Dict[str, Any]:
        """
        Build context information from accumulated sections for continuation prompt.
    
        Returns summary of delivered data and cut-off point for continuation.
        """
        context = {
            "section_count": len(allSections),
        }
    
        # Build summary of delivered data (per-section counts)
        summary_lines = []
        summary_lines.append("Following data has already been delivered:\n")
    
        for section in allSections:
            section_id = section.get("id")
            # CRITICAL: If section has no ID, omit it from summary
            if not section_id:
                continue
    
            content_type = section.get("content_type", "")
            elements = section.get("elements", [])
    
            if isinstance(elements, list) and elements:
                elem = elements[-1] if elements else {}
            else:
                elem = elements if isinstance(elements, dict) else {}
    
            if isinstance(elem, dict):
                if content_type == "heading":
                    level = elem.get("level", "")
                    text = elem.get("text", "")
                    summary_lines.append(f'- heading "{section_id}" level {level}: {text}')
    
                elif content_type == "paragraph":
                    # Count text elements
                    text_count = sum(1 for e in (elements if isinstance(elements, list) else [elem]) 
                                    if isinstance(e, dict) and e.get("text"))
                    summary_lines.append(f'- paragraph with {text_count} text(s)')
    
                elif content_type in ["bullet_list", "numbered_list"]:
                    items = elem.get("items", [])
                    item_count = len(items) if isinstance(items, list) else 0
                    summary_lines.append(f'- bullet_list with {item_count} items')
    
                elif content_type == "table":
                    rows = elem.get("rows", [])
                    row_count = len(rows) if isinstance(rows, list) else 0
                    summary_lines.append(f'- table "{section_id}" with {row_count} rows')
    
                elif content_type == "code_block":
                    code = elem.get("code", "")
                    if code:
                        lines = [l for l in code.split('\n') if l.strip()]
                        line_count = len(lines)
                        summary_lines.append(f'- code_block "{section_id}" with {line_count} code lines')
    
        # CRITICAL: If summary is too long, truncate: show first 100 and last 100 items
        summary_text = "\n".join(summary_lines)
        if len(summary_lines) > 200:  # More than 200 lines
            first_100 = "\n".join(summary_lines[:100])
            last_100 = "\n".join(summary_lines[-100:])
            summary_text = f"{first_100}\n... (truncated {len(summary_lines) - 200} items) ...\n{last_100}"
    
        context["delivered_summary"] = summary_text
    
        # Extract cut-off point using new algorithm
        # 1. Loop over all sections until finding incomplete section
        incomplete_section = None
        incomplete_section_index = -1
        for i, section in enumerate(allSections):
            # buildContinuationContext is a module-level function, so helpers are called without self
            if _isSectionIncomplete(section):
                incomplete_section = section
                incomplete_section_index = i
                break
    
        # 2. In incomplete section, loop through elements until finding cut-off element
        # CRITICAL: There is always only ONE section incomplete (JSON cut-off point)
        cut_off_element = None
        element_before_cutoff = None
    
        if incomplete_section:
            elements = incomplete_section.get("elements", [])
            if isinstance(elements, list):
                for i, elem in enumerate(elements):
                    if _isElementIncomplete(elem):
                        cut_off_element = elem
                        # Edge case: If cut-off is in first element, no element before exists
                        if i > 0:
                            element_before_cutoff = elements[i-1]
                        break
    
        # 3. Extract from lastRawResponse if available
        # CRITICAL: In 99% of cases, JSON is cut off mid-string or mid-number
        # Deliver the cut-off part AS-IS (don't try to "complete" it)
        if lastRawResponse and not cut_off_element:
            # Try to extract cut-off element from raw response
            # Extract exactly as-is, even if mid-string/mid-number
            cut_off_element = _extractCutOffElementFromRaw(lastRawResponse, incomplete_section)
    
        context["element_before_cutoff"] = element_before_cutoff
        context["cut_off_element"] = cut_off_element
    
        return context
    

    D. Keep Existing Merge Logic:

    • Keep _mergeSectionsIntelligently (works well)
    • Keep _mergeSectionContent (works well)
    • Keep _mergeCodeBlocks (works well)

10. Update buildGenerationPrompt

  • File: gateway/modules/services/serviceGeneration/subPromptBuilderGeneration.py
  • Lines: 48-171
  • Changes:
    • Remove DoD references (line 78)
    • Update continuation prompt to use new summary format:
      if hasContinuation:
          delivered_summary = continuationContext.get("delivered_summary", "")
          element_before_cutoff = continuationContext.get("element_before_cutoff")
          cut_off_element = continuationContext.get("cut_off_element")
      
          continuationText = f"""{delivered_summary}
      
      Check if with already delivered data and the last delivered data part the full response is delivered. If not, deliver the remaining part.
      
      Last complete element before cut-off: {element_before_cutoff}
      Cut-off element (incomplete): {cut_off_element}
      
      Continue from the incomplete element above - complete it first, then add NEW items."""

  • Remove progress stats based on DoD
  • Use simple section counts instead

11. Clean Up _callAiWithLooping

  • File: gateway/modules/services/serviceAi/mainServiceAi.py
  • Lines: 162-365
  • Changes:
    • Remove workflowIntent parameter usage for DoD (line 223-224)
    • Remove _analyzeTaskCompletion call (if any)
    • Simplify _shouldContinueGeneration call (remove workflowIntent parameter)
    • Update continuation context building:
      continuationContext = buildContinuationContext(allSections, lastRawResponse)
      # Remove taskIntent from continuationContext - no longer needed
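
Taken together, the simplified loop reduces to "call, try to parse, continue only if parsing fails". The sketch below shows that control flow in isolation. It is a deliberate simplification under stated assumptions: `runGenerationLoop` and the `callAi` callable are hypothetical, raw chunks are concatenated instead of being merged section by section as in _callAiWithLooping, and `maxIterations` stands in for the _isStuckInLoop check.

```python
import json
from typing import Any, Callable, Optional


def runGenerationLoop(callAi: Callable[[str], str], maxIterations: int = 10) -> Optional[Any]:
    """Call the model until the accumulated text parses as JSON (hypothetical sketch).

    callAi receives the text delivered so far and returns the next chunk.
    Returns the parsed JSON, or None if the iteration cap is reached first.
    """
    received = ""
    for _ in range(maxIterations):
        received += callAi(received)
        try:
            return json.loads(received)   # Complete JSON = stop
        except json.JSONDecodeError:
            continue                      # Cut-off JSON = continue
    return None


# Simulated model that delivers one document in two chunks, cut mid-string.
chunks = iter(['{"sections": [{"id": "s1", "elem', 'ents": []}]}'])
result = runGenerationLoop(lambda soFar: next(chunks))
```

No DoD or KPI input appears anywhere in the loop; the only stop conditions are a successful parse and the iteration cap.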
      

Phase 4: Clean Up References

12. Remove DoD References

  • File: gateway/modules/workflows/processing/modes/modeDynamic.py

    • Remove DoD extraction from intents (line 365)
    • Remove workflowIntent/taskIntent/actionIntent DoD usage
  • File: gateway/modules/services/serviceGeneration/subPromptBuilderGeneration.py

    • Remove DoD from continuation context (line 78)
  • File: gateway/modules/services/serviceAi/mainServiceAi.py

    • Remove all DoD-related code

13. Update Workflow Intent Storage

  • File: modeDynamic.py
  • Changes:
    • Steps 1 and 5 still store workflow._workflowIntent and workflow._actionIntent, so these cannot simply be removed: keep them but remove the DoD fields
    • Remove workflow._taskIntent storage, or keep it without DoD fields if task-specific values from the task plan (Step 4) are stored there

Implementation Order

  1. Remove IntentAnalyzer class and all calls (Phase 1)
  2. Update prompts to gather info directly (Phase 2)
    • Update UserIntention prompt (integrate intent analysis)
    • Update Taskplan prompt (allow workflow intent override)
    • Update Dynamic prompts (integrate action-level intent)
  3. Fix JSON completeness check (Phase 3, Step 7)
    • Remove last character check (line 860)
    • Use JSON parsing only (json.loads())
  4. Simplify _shouldContinueGeneration (Phase 3, Step 7)
  5. Remove _analyzeTaskCompletion (Phase 3, Step 8)
    • Verify DoD checking in refinement phase (_refineDecide, ContentValidator)
  6. Revise buildContinuationContext (Phase 3, Step 9)
    • Implement new summary format
    • Implement new extraction algorithm
    • Handle edge cases (first element, mid-string/number cuts)
    • Omit sections without ID
    • Truncate summary if too long (first 100 + last 100)
  7. Update buildGenerationPrompt (Phase 3, Step 10)
  8. Clean up _callAiWithLooping (Phase 3, Step 11)
  9. Remove all DoD references (Phase 4)
  10. Testing and validation (Post-implementation)
    • Test UserIntention prompt quality (verify all fields extracted correctly)
    • Test extraction edge cases
    • Test task-level intent override
    • Add comprehensive unit tests
  11. Documentation (Post-implementation)
    • Update docs to explain new loop behavior
    • Document JSON completeness check (parsing-based)
    • Document continuation summary format

Expected Benefits

  • Saves 3 AI calls per workflow (workflow intent, task intent, action intent - now integrated into existing calls)
  • Simpler loop logic (JSON completeness only, no DoD checking)
  • Clearer continuation prompts (section counts instead of DoD metrics)
  • More precise cut-off detection (element-level detection)
  • Better performance (fewer AI calls = faster execution)
  • Same information gathered (intent analysis logic preserved, just integrated)

Testing Checklist

Functional Tests

  • Task planning works without separate IntentAnalyzer calls
  • Dynamic mode action selection works without separate IntentAnalyzer calls
  • Dynamic mode parameter generation works without separate IntentAnalyzer calls
  • AI loop stops when JSON is complete (parsing-based check)
  • AI loop continues when JSON is cut off (parsing fails)
  • Continuation prompts show correct section counts
  • Cut-off element extraction works correctly
  • Merge logic still works correctly
  • No DoD references remain in codebase

Edge Case Tests (from Critical Analysis)

  • JSON completeness: Nested structures that don't parse correctly
  • JSON completeness: Mid-string cuts (e.g., "text": "incomplete)
  • JSON completeness: Mid-number cuts (e.g., "value": 1234)
  • Extraction: Cut-off in first element (no element before)
  • Extraction: Only one incomplete section (not multiple)
  • Summary: Sections without ID are omitted
  • Summary: Truncation works (first 100 + last 100 if > 200 items)
  • Task-level intent override works (task can override workflow intent)

Quality Tests

  • UserIntention prompt: All 9 fields extracted correctly
  • UserIntention prompt: Quality maintained after adding 5 fields
  • Taskplan prompt: Can override workflow intent when needed
  • Dynamic prompts: Action-level intent analysis works correctly

Notes

Implementation Notes

  • Keep existing merge logic - it works well and doesn't need changes
  • New extraction algorithm focuses on finding the cut-off point more precisely
  • Summary format provides clear progress without DoD thresholds
  • Intent analysis logic preserved - just integrated into existing prompts instead of separate calls
  • UserIntention prompt already does language detection and normalization - now also does full intent analysis
  • Taskplan prompt can use workflowIntent from userintention (no re-analysis needed), but can override if task-specific needs differ
  • Dynamic prompts integrate action-level intent analysis (no separate call needed)
  • DoD removed - but dataType/expectedFormats/qualityRequirements still gathered for other purposes

Critical Requirements (from Critical Analysis)

  • JSON completeness: Use parsing (json.loads()), NOT last character check
  • Intent levels: Keep separate checks for workflow, task, and action levels
  • Task-level override: Critical requirement - allow taskplan to override workflow intent
  • Sections without ID: Omit from summary (don't show "unknown")
  • Summary truncation: Show first 100 and last 100 items if > 200 items
  • Extraction edge cases: Handle first element, only one incomplete section, deliver mid-string/number as-is
  • DoD checking: Verified in refinement phase (_refineDecide, ContentValidator), not in AI loop