wiki/z-archive/implementation/spec-workflow-implementation.md

Title: Plan–Act–Observe–Refine Implementation Specification (PowerOn)

Scope
- This document specifies concrete, stepwise changes for adopting the iterative plan–act–observe–refine loop across the data models, workflow engine, state machine, prompts, and handlers.

1) Data Models (modules/interfaces/interfaceChatModel.py)
- Add minimal schemas for the two-step protocol:
  - ActionSelection: { method: str, name: str }
  - ActionParameters: { parameters: dict }
- Add an Observation schema that is fed back to the language model after each action:
  - Observation: { success: bool, resultLabel: str, documentsCount: int, previews: [{name,mime,snippet}], notes: [str] }
- Extend TaskWorkflow context with: maxSteps (int), reactMode (bool).
- No breaking changes to ActionResult/ActionDocument; keep stable.
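
A minimal sketch of the three schemas above, assuming plain standard-library dataclasses. The spec does not say which modelling library interfaceChatModel.py uses, so the `Preview` helper class and the choice of `dataclasses` are illustrative assumptions; only the field names come from this document.

```python
from dataclasses import dataclass, field, asdict
from typing import Any


@dataclass
class ActionSelection:
    method: str  # e.g. "web"
    name: str    # e.g. "scrape"


@dataclass
class ActionParameters:
    parameters: dict[str, Any] = field(default_factory=dict)


@dataclass
class Preview:
    # Hypothetical helper type for the {name,mime,snippet} entries.
    name: str
    mime: str
    snippet: str


@dataclass
class Observation:
    success: bool
    resultLabel: str
    documentsCount: int
    previews: list[Preview] = field(default_factory=list)
    notes: list[str] = field(default_factory=list)


obs = Observation(
    success=True,
    resultLabel="round1_task1_action1_results",
    documentsCount=1,
    previews=[Preview("report.json", "application/json", "first characters of the file")],
)
# asdict() converts nested dataclasses too, which gives a JSON-ready dict.
print(asdict(obs))
```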

2) Workflow Engine (gateway/modules/chat/managerChat.py)
- Entry point remains executeUnifiedWorkflow.
- Per task, switch to a plan–act loop instead of generating a full action list up front:
  - For each iteration (<= maxSteps):
    1) Call createActionSelectionPrompt → get {action:{method,name}}.
    2) Validate that the selected method is allowed; if not, request another selection (one retry only).
    3) Call createActionParameterPrompt with the selected action → get {parameters:{...}}.
    4) Validate parameters and fill defaults in code, then call executeSingleAction.
    5) Build a compact Observation and feed it to createRefinementPrompt; if the decision is "stop", break.
- Preserve current messaging (start/step/complete) but at iteration granularity; keep document routing the same.
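
The per-iteration steps could be wired together roughly as follows. This is a sketch only: the four callables stand in for the prompt calls and executeSingleAction, their signatures are assumptions, and ending the loop when the retry also yields a disallowed action is one possible policy that the spec leaves open.

```python
from typing import Callable


def run_task_loop(
    select: Callable[[], dict],
    parameterize: Callable[[dict], dict],
    execute: Callable[[dict, dict], dict],
    refine: Callable[[dict], dict],
    allowed: set[tuple[str, str]],
    max_steps: int,
) -> list[dict]:
    """Run up to max_steps select/parameterize/execute/refine iterations."""
    observations: list[dict] = []
    for _ in range(max_steps):
        action = select()["action"]
        if (action["method"], action["name"]) not in allowed:
            action = select()["action"]  # one retry
            if (action["method"], action["name"]) not in allowed:
                break  # assumption: give up after the single retry
        params = parameterize(action)["parameters"]
        observation = execute(action, params)
        observations.append(observation)
        if refine(observation)["decision"] == "stop":
            break
    return observations


# Stubbed usage: the refinement step says "continue" once, then "stop".
decisions = iter(["continue", "stop"])
observations = run_task_loop(
    select=lambda: {"action": {"method": "web", "name": "scrape"}},
    parameterize=lambda action: {"parameters": {"query": "example"}},
    execute=lambda action, params: {"success": True, "resultLabel": "demo"},
    refine=lambda obs: {"decision": next(decisions)},
    allowed={("web", "scrape")},
    max_steps=5,
)
print(len(observations))  # 2
```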

3) State Machine (gateway/modules/chat/handling/executionState.py)
- Track iteration count (current_step) and max_steps.
- Keep tracking criteria progress as today, but update it after every iteration rather than once per task.
- Provide helper: should_continue(observation, review) -> bool.
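
One way the step tracking and helper might look, as a hedged sketch. The `ExecutionState` class name and `record_step` method are hypothetical, and `review` is assumed to be the refinement decision dict; the spec does not define how `observation` influences the result, so here it is accepted only for signature parity.

```python
from dataclasses import dataclass


@dataclass
class ExecutionState:
    max_steps: int
    current_step: int = 0

    def record_step(self) -> None:
        """Count one finished iteration."""
        self.current_step += 1

    def should_continue(self, observation: dict, review: dict) -> bool:
        # Hard cap first: never exceed the step budget.
        if self.current_step >= self.max_steps:
            return False
        # Otherwise follow the refinement decision; anything other than
        # an explicit "continue" ends the loop.
        return review.get("decision") == "continue"


state = ExecutionState(max_steps=2)
state.record_step()
print(state.should_continue({"success": True}, {"decision": "continue"}))  # True
state.record_step()
print(state.should_continue({"success": True}, {"decision": "continue"}))  # False
```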

4) Prompts (gateway/modules/chat/handling/promptFactory.py)
- New compact prompts:
  a) createActionSelectionPrompt(context): returns only one action selection.
  b) createActionParameterPrompt(context, selected_action): returns only parameters for that action.
  c) createRefinementPrompt(context, observation): returns a decision of the form {"decision":"continue|stop","reason":"..."} and, when the decision is "continue", optionally a hint for the next subgoal.
- Keep existing task planning and review prompts; they remain for high-level plan and QA.
- Replace the long “available methods” JSON with a tiny catalog (names + parameter names only).
- Embed a 5-line set of business rules; drop provider-specific details.
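
To illustrate the tiny catalog, the sketch below reduces a verbose method description to action names plus parameter names. The shape of the verbose input (method → action → "parameters") is an assumption made for the example; the real catalog format is not shown in this document.

```python
def build_tiny_catalog(methods: dict) -> dict[str, list[str]]:
    """Keep only 'method.name' keys and their parameter names."""
    tiny: dict[str, list[str]] = {}
    for method, actions in methods.items():
        for name, spec in actions.items():
            # Descriptions, types and examples are dropped on purpose.
            tiny[f"{method}.{name}"] = sorted(spec.get("parameters", {}))
    return tiny


# Hypothetical verbose catalog used only for demonstration.
verbose = {
    "web": {
        "scrape": {
            "description": "Search and scrape pages",
            "parameters": {"query": {"type": "str"}, "maxResults": {"type": "int"}},
        }
    },
    "ai": {
        "process": {
            "description": "Process documents with a model",
            "parameters": {"prompt": {"type": "str"}},
        }
    },
}
print(build_tiny_catalog(verbose))
# {'web.scrape': ['maxResults', 'query'], 'ai.process': ['prompt']}
```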

5) HandlingTasks (gateway/modules/chat/handling/handlingTasks.py)
- New flow inside executeTask:
  - Initialize iteration=0; while iteration<max_steps:
    - selection_prompt = createActionSelectionPrompt(...)
    - param_prompt = createActionParameterPrompt(...)
    - Execute action with validated parameters; create message and documents as today.
    - Build Observation (compact) from ActionResult.
    - refinement_prompt = createRefinementPrompt(...)
    - If stop -> run existing reviewTaskCompletion to finalize; else continue.
- Keep createTaskAction for backward compatibility (batch plans) but mark as legacy.
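
The "Build Observation (compact) from ActionResult" step might be sketched like this. The ActionResult fields read here (`success`, `resultLabel`, `documents`, `error`, and per-document `name`, `mime`, `content`) are assumptions, as are the preview and snippet caps; adjust them to the real model.

```python
def build_observation(result: dict, max_previews: int = 3, snippet_chars: int = 200) -> dict:
    """Condense an action result into the compact Observation shape."""
    documents = result.get("documents", [])
    previews = [
        {
            "name": doc.get("name", ""),
            "mime": doc.get("mime", "application/octet-stream"),
            # Only the first snippet_chars characters go back to the model.
            "snippet": str(doc.get("content", ""))[:snippet_chars],
        }
        for doc in documents[:max_previews]
    ]
    notes = []
    if len(documents) > max_previews:
        notes.append(f"{len(documents) - max_previews} more document(s) not previewed")
    if result.get("error"):
        notes.append(str(result["error"]))
    return {
        "success": bool(result.get("success")),
        "resultLabel": result.get("resultLabel", ""),
        "documentsCount": len(documents),
        "previews": previews,
        "notes": notes,
    }


demo = build_observation(
    {
        "success": True,
        "resultLabel": "round1_task1_action1_results",
        "documents": [{"name": f"doc{i}.txt", "mime": "text/plain", "content": "x" * 500} for i in range(4)],
    }
)
print(demo["documentsCount"], len(demo["previews"]), demo["notes"])
```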

6) Defaults & Validation (code, not prompts)
- Derive userLanguage from workflow/user; set defaults for web.* depth/time if implied by criteria.
- ai.process defaults to markdown narrative unless expectedDocumentFormats requires JSON/CSV.
- Strict validation: reject missing required params; soft-fill optional ones.
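
A minimal sketch of "reject missing required params; soft-fill optional ones", assuming each action has a spec with `required` and `optional` name lists. That spec shape and the `validate_parameters` name are hypothetical.

```python
def validate_parameters(spec: dict, params: dict, defaults: dict) -> dict:
    """Reject missing required parameters and soft-fill optional ones."""
    missing = [key for key in spec.get("required", []) if key not in params]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    filled = dict(params)  # never mutate the model's reply in place
    for key in spec.get("optional", []):
        if key not in filled and key in defaults:
            filled[key] = defaults[key]
    return filled


spec = {"required": ["query"], "optional": ["language", "maxResults"]}
# In the real flow the language default would come from the workflow/user.
defaults = {"language": "de", "maxResults": 10}
print(validate_parameters(spec, {"query": "quarterly report"}, defaults))
# {'query': 'quarterly report', 'language': 'de', 'maxResults': 10}
```

Values supplied by the model always win over defaults, so a default can never silently override an explicit choice.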

7) Web Accuracy Enhancements (optional but recommended)
- In methodWeb.scrape, keep enrichment + reranking (already added) behind a config flag.
- Observation should surface relevance_score, key_points, entities for top-k results.

8) Telemetry and Limits
- Track per-iteration tokens/time; stop early if budget exceeded.
- Expose step count and decision reasons in logs for auditability.
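
Per-iteration budget tracking could look roughly like the following. The `StepBudget` class and its fields are hypothetical; token counts and elapsed seconds are assumed to be measured by the caller and passed in.

```python
from dataclasses import dataclass, field


@dataclass
class StepBudget:
    max_tokens: int
    max_seconds: float
    tokens_used: int = 0
    seconds_used: float = 0.0
    steps: list[dict] = field(default_factory=list)

    def record(self, tokens: int, seconds: float, reason: str) -> None:
        """Add one iteration's cost and keep the decision reason for audit logs."""
        self.tokens_used += tokens
        self.seconds_used += seconds
        self.steps.append(
            {"step": len(self.steps) + 1, "tokens": tokens, "seconds": seconds, "reason": reason}
        )

    def exceeded(self) -> bool:
        return self.tokens_used > self.max_tokens or self.seconds_used > self.max_seconds


budget = StepBudget(max_tokens=1000, max_seconds=30.0)
budget.record(tokens=600, seconds=4.2, reason="Need more sources")
print(budget.exceeded())  # False
budget.record(tokens=500, seconds=3.1, reason="Enough material collected")
print(budget.exceeded())  # True: 1100 tokens > 1000
```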

Migration Plan
Step A (Non-breaking):
  - Add Observation model/types; add compact prompt creators (unused initially).
  - Gate enrichment in methodWeb.scrape via config.
Step B (Switch path):
  - Add reactMode flag; if true, executeTask uses iterative loop; else keep legacy full action plan.
Step C (Tighten prompts):
  - Replace verbose catalog with tiny catalog when reactMode.
  - Add 5-line rules and schema caps (one action per step; minimal params).
Step D (Clean up):
  - Deprecate bulk action plan path after validation; keep fallback.
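
Step B amounts to a small dispatch on the flag. In this sketch the two runner callables are placeholders for the iterative loop and the legacy batch plan, and treating a missing `reactMode` as false is an assumption chosen to keep the legacy path as the default.

```python
from typing import Any, Callable


def execute_task(
    task: dict,
    context: dict,
    run_iterative: Callable[[dict, dict], Any],
    run_legacy: Callable[[dict, dict], Any],
) -> Any:
    """Route a task to the iterative loop or the legacy full action plan."""
    if context.get("reactMode", False):
        return run_iterative(task, context)
    return run_legacy(task, context)


result = execute_task(
    {"id": "task1"},
    {"reactMode": True, "maxSteps": 5},
    run_iterative=lambda task, ctx: "iterative",
    run_legacy=lambda task, ctx: "legacy",
)
print(result)  # iterative
```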

Testing Strategy
- Unit: selection and parameter validators; application of defaults; observation builder.
- Integration: run a research task in reactMode and verify iterations, labels, and final output.
- Regression: legacy flow must still pass existing tests when reactMode=false.

Security Considerations
- Whitelist allowed methods per task type; sanitize parameters (URLs, file refs).
- Limit step count and token budget to avoid runaways.
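
A sketch of the whitelist and URL sanitization, using only the standard library. The task types and the `ALLOWED_METHODS` table are invented for illustration, and restricting URLs to http/https is an assumed policy rather than something this spec states.

```python
from urllib.parse import urlparse

# Hypothetical whitelist: task type -> allowed "method.name" actions.
ALLOWED_METHODS = {
    "research": {"web.scrape", "ai.process"},
    "summarize": {"ai.process"},
}


def is_allowed(task_type: str, method: str, name: str) -> bool:
    """Unknown task types get an empty whitelist, so everything is denied."""
    return f"{method}.{name}" in ALLOWED_METHODS.get(task_type, set())


def sanitize_url(url: str) -> str:
    """Accept only absolute http(s) URLs with a host; raise otherwise."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"rejected URL: {url!r}")
    return parsed.geturl()


print(is_allowed("summarize", "web", "scrape"))       # False
print(sanitize_url(" https://example.com/a?b=1 "))    # https://example.com/a?b=1
```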

Appendix: Minimal JSON Schemas
- Selection: {"action":{"method":"web","name":"scrape"}}
- Parameters: {"parameters":{"query":"...","maxResults":10,"language":"de"}}
- Observation: {"success":true,"resultLabel":"round1_task1_action1_results","documentsCount":3,"previews":[{"name":"...","mime":"application/json","snippet":"..."}],"notes":[]}
- Refinement decision: {"decision":"continue","reason":"Need more sources on Q3 2024"}
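
Model replies in these shapes still need defensive parsing before they drive the loop. The sketch below checks the selection and refinement shapes with plain `json`; the function names are hypothetical, and raising `ValueError` on any deviation is an assumed policy.

```python
import json


def parse_selection(raw: str) -> dict:
    """Return {"method": ..., "name": ...} or raise ValueError."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    action = data.get("action") if isinstance(data, dict) else None
    if not isinstance(action, dict):
        raise ValueError("selection must contain an 'action' object")
    method, name = action.get("method"), action.get("name")
    if not isinstance(method, str) or not isinstance(name, str):
        raise ValueError("'method' and 'name' must be strings")
    return {"method": method, "name": name}


def parse_refinement(raw: str) -> dict:
    """Return {"decision": "continue"|"stop", "reason": str} or raise ValueError."""
    data = json.loads(raw)
    if not isinstance(data, dict) or data.get("decision") not in ("continue", "stop"):
        raise ValueError("decision must be 'continue' or 'stop'")
    return {"decision": data["decision"], "reason": str(data.get("reason", ""))}


print(parse_selection('{"action":{"method":"web","name":"scrape"}}'))
print(parse_refinement('{"decision":"continue","reason":"Need more sources on Q3 2024"}'))
```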