wiki/z-archive/implementation/spec-workflow-implementation.md


Title: PlanActObserveRefine Implementation Specification (PowerOn)

Scope

  • This document specifies concrete, stepwise changes to adopt the iterative plan-act-observe-refine loop across the data models, workflow engine, state machine, prompts, and handlers.
  1. Data Models (modules/interfaces/interfaceChatModel.py)
  • Add minimal schemas for the two-step protocol:
    • ActionSelection: { method: str, name: str }
    • ActionParameters: { parameters: dict }
  • Add Observation model returned to the model after each action: Observation: { success: bool, resultLabel: str, documentsCount: int, previews: [{name,mime,snippet}], notes: [str] }
  • Extend TaskWorkflow context with: maxSteps (int), reactMode (bool).
  • No breaking changes to ActionResult/ActionDocument; keep stable.
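The schemas above could be sketched as plain dataclasses. This is only an illustration: the project may use a different modelling library in interfaceChatModel.py, and the `Preview` class name is an assumption introduced here for the `{name,mime,snippet}` entries.

```python
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class ActionSelection:
    method: str  # e.g. "web"
    name: str    # e.g. "scrape"


@dataclass
class ActionParameters:
    parameters: dict[str, Any] = field(default_factory=dict)


@dataclass
class Preview:  # hypothetical name for one {name, mime, snippet} entry
    name: str
    mime: str
    snippet: str


@dataclass
class Observation:
    success: bool
    resultLabel: str
    documentsCount: int
    previews: list[Preview] = field(default_factory=list)
    notes: list[str] = field(default_factory=list)


obs = Observation(
    success=True,
    resultLabel="round1_task1_action1_results",
    documentsCount=1,
    previews=[Preview(name="report.json", mime="application/json", snippet="{...}")],
)
payload = asdict(obs)  # nested plain dict, ready for json.dumps
```

Keeping `previews` and `notes` as defaulted lists means an action with no documents still yields a well-formed Observation.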
  2. Workflow Engine (gateway/modules/chat/managerChat.py)
  • Entry point remains executeUnifiedWorkflow.
  • Per task, switch to the plan-act-observe-refine loop instead of generating a full action list:
    • For each iteration (<= maxSteps):
      1. Call createActionSelectionPrompt → get {action:{method,name}}.
      2. Validate that the selected method is allowed; if it is not, request another selection (at most one retry).
      3. Call createActionParameterPrompt with the selected action → get {parameters:{...}}.
      4. Validate the parameters and fill defaults in code, then call executeSingleAction.
      5. Build Observation (compact) and feed to createRefinementPrompt; if “stop”, break.
  • Preserve current messaging (start/step/complete) but at iteration granularity; keep document routing the same.
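The iteration described above can be sketched as follows. This is a minimal sketch, not the managerChat.py implementation: the four callables stand in for createActionSelectionPrompt, createActionParameterPrompt, executeSingleAction, and createRefinementPrompt plus the model call behind each, and `run_iterative_loop` is a hypothetical name. The spec does not say what happens when the retry also returns a disallowed method; ending the loop with a failed observation is an assumption.

```python
from typing import Callable


def run_iterative_loop(
    select_action: Callable[[dict], dict],
    get_parameters: Callable[[dict, dict], dict],
    execute_action: Callable[[dict, dict], dict],
    refine: Callable[[dict, dict], dict],
    allowed_methods: set[str],
    context: dict,
    max_steps: int,
) -> list[dict]:
    """Run at most max_steps select -> parameterize -> execute -> refine rounds."""
    observations: list[dict] = []
    for _ in range(max_steps):
        action = select_action(context)["action"]
        if action["method"] not in allowed_methods:
            action = select_action(context)["action"]  # the single retry
        if action["method"] not in allowed_methods:
            # Assumption: give up on this task after a failed retry.
            observations.append({
                "success": False, "resultLabel": "", "documentsCount": 0,
                "previews": [], "notes": [f"method not allowed: {action['method']}"],
            })
            break
        params = get_parameters(context, action)["parameters"]
        observation = execute_action(action, params)
        observations.append(observation)
        if refine(context, observation)["decision"] == "stop":
            break
    return observations


# Stubbed run: the refinement step says "continue" once, then "stop".
decisions = iter(["continue", "stop"])
observations = run_iterative_loop(
    select_action=lambda ctx: {"action": {"method": "web", "name": "scrape"}},
    get_parameters=lambda ctx, action: {"parameters": {"query": "example"}},
    execute_action=lambda action, params: {
        "success": True, "resultLabel": "demo", "documentsCount": 0,
        "previews": [], "notes": [],
    },
    refine=lambda ctx, obs: {"decision": next(decisions)},
    allowed_methods={"web"},
    context={},
    max_steps=5,
)
```

Passing the prompt and execution steps in as callables keeps the loop testable without a model or network access.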
  3. State Machine (gateway/modules/chat/handling/executionState.py)
  • Track iteration count (current_step) and max_steps.
  • Track criteria progress as today, but update it after each iteration.
  • Provide helper: should_continue(observation, review) -> bool.
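One possible shape for the helper is sketched below. Several things here are assumptions rather than part of the spec: `review` is taken to be the refinement decision dict, the helper is called once per iteration and advances `current_step` itself, and the `max_consecutive_failures` guard is a hypothetical addition.

```python
from dataclasses import dataclass


@dataclass
class ExecutionState:
    max_steps: int
    current_step: int = 0
    consecutive_failures: int = 0
    max_consecutive_failures: int = 2  # hypothetical guard, not in the spec

    def should_continue(self, observation: dict, review: dict) -> bool:
        """Call once per iteration, after the action ran and was reviewed."""
        self.current_step += 1
        if observation.get("success"):
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

        if self.current_step >= self.max_steps:
            return False  # step budget used up
        if self.consecutive_failures >= self.max_consecutive_failures:
            return False  # repeated failures: stop instead of looping
        return review.get("decision") == "continue"


state = ExecutionState(max_steps=2)
first = state.should_continue({"success": True}, {"decision": "continue"})
second = state.should_continue({"success": True}, {"decision": "continue"})
```

With `max_steps=2`, `first` is True and `second` is False, because the second call reaches the step limit even though the review asked to continue.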
  4. Prompts (gateway/modules/chat/handling/promptFactory.py)
  • New compact prompts:
    • a) createActionSelectionPrompt(context): returns only one action selection.
    • b) createActionParameterPrompt(context, selected_action): returns only parameters for that action.
    • c) createRefinementPrompt(context, observation): returns a decision {"decision":"continue|stop","reason":"..."} and, if continue, optionally a hint for the next subgoal.
  • Keep existing task planning and review prompts; they remain for high-level plan and QA.
  • Replace the long “available methods” JSON with a tiny catalog (names + parameter names only).
  • Embed a short (5-line) set of business rules; drop provider-specific details.
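To illustrate the tiny catalog (names plus parameter names only), the sketch below reduces a verbose method description to a compact JSON string. The shape of the verbose input and the function name `build_tiny_catalog` are assumptions; the real "available methods" JSON may be structured differently.

```python
import json


def build_tiny_catalog(methods: dict) -> str:
    """Keep only 'method.action' names and their parameter names."""
    tiny = {}
    for method_name, actions in methods.items():
        for action_name, spec in actions.items():
            tiny[f"{method_name}.{action_name}"] = sorted(spec.get("parameters", {}))
    # Compact separators keep the prompt footprint small.
    return json.dumps(tiny, separators=(",", ":"))


# Hypothetical verbose entry; descriptions and types are dropped in the output.
verbose = {
    "web": {
        "scrape": {
            "description": "Fetch and extract pages for a query.",
            "parameters": {
                "query": {"type": "str", "required": True},
                "maxResults": {"type": "int", "required": False},
            },
        }
    }
}
catalog = build_tiny_catalog(verbose)
print(catalog)  # {"web.scrape":["maxResults","query"]}
```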
  5. HandlingTasks (gateway/modules/chat/handling/handlingTasks.py)
  • New flow inside executeTask:
    • Initialize iteration=0; while iteration<max_steps:
      • selection_prompt = createActionSelectionPrompt(...)
      • param_prompt = createActionParameterPrompt(...)
      • Execute action with validated parameters; create message and documents as today.
      • Build Observation (compact) from ActionResult.
      • refinement_prompt = createRefinementPrompt(...)
      • If stop -> run existing reviewTaskCompletion to finalize; else continue.
  • Keep createTaskAction for backward compatibility (batch plans) but mark as legacy.
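The "Build Observation (compact) from ActionResult" step might look like the sketch below. The ActionResult layout used here (a dict with `success`, `resultLabel`, `error`, and `documents` entries carrying `name`, `mime`, `content`) is an assumption for illustration, as are the function name and the preview and snippet limits.

```python
def build_observation(action_result: dict, max_previews: int = 3,
                      snippet_chars: int = 200) -> dict:
    """Condense an action result into the compact Observation shape."""
    documents = action_result.get("documents", [])
    previews = [
        {
            "name": doc.get("name", ""),
            "mime": doc.get("mime", ""),
            # Truncate content so the observation stays small in the prompt.
            "snippet": str(doc.get("content", ""))[:snippet_chars],
        }
        for doc in documents[:max_previews]
    ]
    notes = []
    if len(documents) > max_previews:
        notes.append(f"{len(documents) - max_previews} more document(s) not previewed")
    if not action_result.get("success", False) and action_result.get("error"):
        notes.append(str(action_result["error"]))
    return {
        "success": bool(action_result.get("success", False)),
        "resultLabel": action_result.get("resultLabel", ""),
        "documentsCount": len(documents),
        "previews": previews,
        "notes": notes,
    }


result = {
    "success": True,
    "resultLabel": "round1_task1_action1_results",
    "documents": [
        {"name": f"doc{i}.txt", "mime": "text/plain", "content": "x" * 500}
        for i in range(4)
    ],
}
observation = build_observation(result)
```

Here `documentsCount` reports all four documents while only three truncated previews are passed on, which keeps the refinement prompt short.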
  6. Defaults & Validation (code, not prompts)
  • Derive userLanguage from workflow/user; set defaults for web.* depth/time if implied by criteria.
  • ai.process defaults to markdown narrative unless expectedDocumentFormats requires JSON/CSV.
  • Strict validation: reject missing required params; soft-fill optional ones.
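A minimal sketch of "reject missing required params; soft-fill optional ones" is shown below. The function name, the error class, and the default values are illustrative assumptions; the defaults reuse the values from the appendix example only for demonstration.

```python
from typing import Any


class MissingParameterError(ValueError):
    """Raised when a required action parameter is absent or empty."""


def validate_parameters(params: dict[str, Any], required: list[str],
                        optional_defaults: dict[str, Any]) -> dict[str, Any]:
    # Strict part: every required parameter must be present and non-empty.
    missing = [name for name in required if params.get(name) in (None, "")]
    if missing:
        raise MissingParameterError(f"missing required parameters: {missing}")
    # Soft part: start from defaults, then apply explicitly provided values.
    provided = {key: value for key, value in params.items() if value is not None}
    return {**optional_defaults, **provided}


filled = validate_parameters(
    {"query": "quarterly results"},
    required=["query"],
    optional_defaults={"maxResults": 10, "language": "de"},
)

try:
    validate_parameters({}, required=["query"], optional_defaults={})
    rejected = False
except MissingParameterError:
    rejected = True
```

Doing this in code rather than in the prompt means the model only has to supply what it actually knows, and the loop fails fast on incomplete selections.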
  7. Web Accuracy Enhancements (optional but recommended)
  • In methodWeb.scrape, keep enrichment + reranking (already added) behind a config flag.
  • Observation should surface relevance_score, key_points, entities for top-k results.
  8. Telemetry and Limits
  • Track per-iteration tokens/time; stop early if budget exceeded.
  • Expose step count and decision reasons in logs for auditability.
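Per-iteration tracking with an early stop could be sketched like this. The class name, field names, and log format are assumptions; how token counts are obtained from the model provider is outside this sketch.

```python
import logging
import time
from dataclasses import dataclass, field

logger = logging.getLogger("workflow.telemetry")  # hypothetical logger name


@dataclass
class IterationBudget:
    max_tokens: int
    max_seconds: float
    tokens_used: int = 0
    started_at: float = field(default_factory=time.monotonic)
    steps: list = field(default_factory=list)

    def record(self, step: int, tokens: int, decision: str, reason: str) -> None:
        """Accumulate usage and log the decision for auditability."""
        self.tokens_used += tokens
        elapsed = time.monotonic() - self.started_at
        self.steps.append({"step": step, "tokens": tokens, "elapsed": elapsed,
                           "decision": decision, "reason": reason})
        logger.info("step=%d tokens=%d elapsed=%.2fs decision=%s reason=%s",
                    step, tokens, elapsed, decision, reason)

    def exceeded(self) -> bool:
        elapsed = time.monotonic() - self.started_at
        return self.tokens_used > self.max_tokens or elapsed > self.max_seconds


budget = IterationBudget(max_tokens=100, max_seconds=60.0)
budget.record(step=1, tokens=60, decision="continue", reason="need more sources")
within_budget = not budget.exceeded()
budget.record(step=2, tokens=50, decision="continue", reason="still incomplete")
over_budget = budget.exceeded()
```

The loop would check `exceeded()` after each iteration and break when it returns True, in addition to the step limit.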

Migration Plan

Step A (Non-breaking):

  • Add Observation model/types; add compact prompt creators (unused initially).
  • Gate enrichment in methodWeb.scrape via config.

Step B (Switch path):

  • Add reactMode flag; if true, executeTask uses iterative loop; else keep legacy full action plan.

Step C (Tighten prompts):

  • Replace verbose catalog with tiny catalog when reactMode.
  • Add 5-line rules and schema caps (one action per step; minimal params).

Step D (Clean up):

  • Deprecate bulk action plan path after validation; keep fallback.

Testing Strategy

  • Unit: selection and parameter validators; application of defaults; observation builder.
  • Integration: run a research task in reactMode and verify iterations, labels, and final output.
  • Regression: legacy flow must still pass existing tests when reactMode=false.

Security Considerations

  • Whitelist allowed methods per task type; sanitize parameters (URLs, file refs).
  • Limit step count and token budget to avoid runaways.
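The allowlist and URL sanitization could be sketched as below. The task type names and the mapping are hypothetical; only the `web` and `ai` method names come from this document. The URL check uses the standard library's `urllib.parse.urlparse` and accepts only http and https URLs with a host, which is one reasonable policy rather than the required one.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: task type -> methods the model may select.
ALLOWED_METHODS = {
    "research": {"web", "ai"},
    "summarize": {"ai"},
}


def is_method_allowed(task_type: str, method: str) -> bool:
    # Unknown task types get an empty allowlist (deny by default).
    return method in ALLOWED_METHODS.get(task_type, set())


def sanitize_url(url: str) -> str:
    """Accept only http(s) URLs with a host; reject everything else."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"URL not allowed: {url!r}")
    return parsed.geturl()


clean = sanitize_url("  https://example.com/report?q=1 ")

try:
    sanitize_url("file:///etc/passwd")
    file_url_rejected = False
except ValueError:
    file_url_rejected = True
```

Denying unknown task types by default keeps a newly added task type from silently inheriting access to every method.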

Appendix: Minimal JSON Schemas

  • Selection: {"action":{"method":"web","name":"scrape"}}
  • Parameters: {"parameters":{"query":"...","maxResults":10,"language":"de"}}
  • Observation: {"success":true,"resultLabel":"round1_task1_action1_results","documentsCount":3,"previews":[{"name":"...","mime":"application/json","snippet":"..."}]}
  • Refinement decision: {"decision":"continue","reason":"Need more sources on Q3 2024"}
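Model replies should be checked against these minimal schemas before the loop acts on them. The sketch below validates a refinement decision with the standard `json` module; the function name is hypothetical, and treating a missing `reason` as an empty string is an assumption.

```python
import json


def parse_refinement_decision(raw: str) -> dict:
    """Parse and validate a refinement decision; raise ValueError if malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("decision must be a JSON object")
    decision = data.get("decision")
    if decision not in ("continue", "stop"):
        raise ValueError(f"unexpected decision: {decision!r}")
    reason = data.get("reason", "")  # assumption: reason may be omitted
    if not isinstance(reason, str):
        raise ValueError("reason must be a string")
    return {"decision": decision, "reason": reason}


parsed = parse_refinement_decision(
    '{"decision":"continue","reason":"Need more sources on Q3 2024"}'
)
```

The same pattern (parse, type-check, restrict to known values) applies to the Selection and Parameters replies.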