wiki/z-archive/implementation/spec-workflow-implementation.md

Title: Plan-Act-Observe-Refine Implementation Specification (PowerOn)
Scope
- This document specifies concrete, stepwise changes to adopt the iterative loop across models, workflow engine, state machine, prompts, and handlers.
1) Data Models (modules/interfaces/interfaceChatModel.py)
- Add minimal schemas for the two-step protocol:
  - ActionSelection: { method: str, name: str }
  - ActionParameters: { parameters: dict }
- Add an Observation schema that is fed back to the model after each action:
  - Observation: { success: bool, resultLabel: str, documentsCount: int, previews: [{name,mime,snippet}], notes: [str] }
- Extend TaskWorkflow context with: maxSteps (int), reactMode (bool).
- No breaking changes to ActionResult/ActionDocument; keep stable.
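The schemas above could be sketched as plain dataclasses. This is only an illustration: the real interfaceChatModel.py may use a different base class, and DocumentPreview is a hypothetical name for the {name,mime,snippet} preview entries.

```python
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class ActionSelection:
    method: str  # e.g. "web"
    name: str    # e.g. "scrape"


@dataclass
class ActionParameters:
    parameters: dict[str, Any] = field(default_factory=dict)


@dataclass
class DocumentPreview:  # hypothetical name for one {name,mime,snippet} entry
    name: str
    mime: str
    snippet: str


@dataclass
class Observation:
    success: bool
    resultLabel: str
    documentsCount: int
    previews: list[DocumentPreview] = field(default_factory=list)
    notes: list[str] = field(default_factory=list)


# Build one observation and serialize it to the compact dict fed back to the model.
example = Observation(
    success=True,
    resultLabel="round1_task1_action1_results",
    documentsCount=1,
    previews=[DocumentPreview("result.json", "application/json", "{...}")],
)
payload = asdict(example)
```

Keeping previews and notes optional with empty defaults lets a failed action produce a valid Observation without extra branching.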
2) Workflow Engine (gateway/modules/chat/managerChat.py)
- Entry point remains executeUnifiedWorkflow.
- Per task, switch to a plan-act loop instead of generating a full action list:
  - For each iteration (<= maxSteps):
    1) Call createActionSelectionPrompt → get {action:{method,name}}.
    2) Validate allowed method; if invalid, request another selection (one retry).
    3) Call createActionParameterPrompt with the selected action → get {parameters:{...}}.
    4) Validate/fill defaults in code; executeSingleAction.
    5) Build Observation (compact) and feed to createRefinementPrompt; if “stop”, break.
- Preserve current messaging (start/step/complete) but at iteration granularity; keep document routing the same.
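Step 2 of the loop (validate the method, allow one retry) might look like the sketch below. The function name select_action and the request_selection callable are hypothetical stand-ins for whatever sends the selection prompt to the model and parses its reply.

```python
from typing import Callable, Optional


def select_action(
    request_selection: Callable[[], dict],
    allowed_methods: set[str],
    max_retries: int = 1,
) -> Optional[dict]:
    """Return a validated {"method", "name"} dict, or None if every attempt is invalid."""
    for _ in range(1 + max_retries):
        reply = request_selection()
        action = reply.get("action") if isinstance(reply, dict) else None
        if not isinstance(action, dict):
            continue
        method, name = action.get("method"), action.get("name")
        # Accept only string fields and a method on the allow-list.
        if isinstance(method, str) and isinstance(name, str) and method in allowed_methods:
            return {"method": method, "name": name}
    return None
```

Returning None instead of raising leaves it to the caller to decide whether an exhausted retry ends the task or falls back to the legacy path.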
3) State Machine (gateway/modules/chat/handling/executionState.py)
- Track iteration count (current_step) and max_steps.
- Maintain criteria progress across steps as today, but per-iteration.
- Provide helper: should_continue(observation, review) -> bool.
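A minimal sketch of the state tracking and the should_continue helper follows. The class name ExecutionState and the advance method are assumptions. The observation argument is accepted to match the signature above, but this sketch decides only on the step budget and the refinement decision.

```python
from dataclasses import dataclass


@dataclass
class ExecutionState:  # hypothetical container; the real executionState.py may differ
    max_steps: int
    current_step: int = 0

    def advance(self) -> None:
        """Record that one iteration has completed."""
        self.current_step += 1

    def should_continue(self, observation: dict, review: dict) -> bool:
        # Stop when the step budget is used up.
        if self.current_step >= self.max_steps:
            return False
        # Continue only on an explicit "continue"; anything else ends the loop.
        return review.get("decision") == "continue"
```

Treating a missing or unknown decision as a stop is a conservative choice for this sketch, so that a malformed reply cannot keep the loop running.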
4) Prompts (gateway/modules/chat/handling/promptFactory.py)
- New compact prompts:
  a) createActionSelectionPrompt(context): returns only one action selection.
  b) createActionParameterPrompt(context, selected_action): returns only parameters for that action.
  c) createRefinementPrompt(context, observation): returns decision: {"decision":"continue|stop","reason":"..."} and, if continue, optionally a hint for next subgoal.
- Keep existing task planning and review prompts; they remain for high-level plan and QA.
- Replace the long “available methods” JSON with a tiny catalog (names + parameter names only).
- Embed 5-line business rules; drop provider specifics.
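The tiny catalog could be derived from the verbose method description along these lines. The input shape (method → action → {"parameters": {...}}) is an assumption made for illustration, and build_tiny_catalog is a hypothetical name.

```python
import json


def build_tiny_catalog(methods: dict) -> str:
    """Reduce a verbose method description to action names and parameter names only."""
    catalog = {
        method: {
            action: sorted(spec.get("parameters", {}))
            for action, spec in actions.items()
        }
        for method, actions in methods.items()
    }
    # Compact separators keep the prompt payload as small as possible.
    return json.dumps(catalog, separators=(",", ":"))


verbose = {
    "web": {
        "scrape": {
            "description": "A long, provider-specific description...",
            "parameters": {"query": {"type": "str"}, "maxResults": {"type": "int"}},
        }
    }
}
tiny = build_tiny_catalog(verbose)
```

Descriptions, types, and provider details are dropped on purpose; parameter semantics stay in the separate parameter prompt.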
5) HandlingTasks (gateway/modules/chat/handling/handlingTasks.py)
- New flow inside executeTask:
  - Initialize iteration=0; while iteration<max_steps:
    - selection_prompt = createActionSelectionPrompt(...)
    - param_prompt = createActionParameterPrompt(...)
    - Execute action with validated parameters; create message and documents as today.
    - Build Observation (compact) from ActionResult.
    - refinement_prompt = createRefinementPrompt(...)
    - If stop -> run existing reviewTaskCompletion to finalize; else increment iteration and continue.
- Keep createTaskAction for backward compatibility (batch plans) but mark as legacy.
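The flow inside executeTask can be sketched with the model calls injected as callables. All names here (run_iterative_task, build_observation, the callable parameters) are hypothetical, and the ActionResult field names (success, resultLabel, documents, data) are assumed. The sketch also assumes finalization runs when the step budget is exhausted, which the spec does not state.

```python
from typing import Callable


def build_observation(result: dict, max_previews: int = 3, snippet_len: int = 200) -> dict:
    """Compress an action result (assumed dict shape) into a compact observation."""
    docs = result.get("documents", [])
    return {
        "success": bool(result.get("success")),
        "resultLabel": result.get("resultLabel", ""),
        "documentsCount": len(docs),
        "previews": [
            {
                "name": d.get("name", ""),
                "mime": d.get("mime", ""),
                "snippet": str(d.get("data", ""))[:snippet_len],
            }
            for d in docs[:max_previews]
        ],
        "notes": [],
    }


def run_iterative_task(
    select: Callable[[dict], dict],
    parameterize: Callable[[dict, dict], dict],
    execute: Callable[[dict, dict], dict],
    refine: Callable[[dict, dict], dict],
    finalize: Callable[[dict, list], dict],
    context: dict,
    max_steps: int,
) -> dict:
    observations = []
    iteration = 0
    while iteration < max_steps:
        action = select(context)["action"]                    # step 1: pick one action
        params = parameterize(context, action)["parameters"]  # step 2: fill its parameters
        observation = build_observation(execute(action, params))
        observations.append(observation)
        iteration += 1
        if refine(context, observation).get("decision") == "stop":
            break
    # Assumption: finalize on "stop" and also when max_steps is reached.
    return finalize(context, observations)
```

Injecting the five callables keeps the loop testable without a live model; in the real handler they would wrap the prompt creators, the action executor, and reviewTaskCompletion.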
6) Defaults & Validation (code, not prompts)
- Derive userLanguage from workflow/user; set defaults for web.* depth/time if implied by criteria.
- ai.process defaults to markdown narrative unless expectedDocumentFormats requires JSON/CSV.
- Strict validation: reject missing required params; soft-fill optional ones.
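Strict validation with soft-filled optionals could be as small as the sketch below. The function name and the way required names and defaults are passed in are assumptions. The default values in the usage line are taken from the appendix example.

```python
def validate_parameters(params: dict, required: set[str], defaults: dict) -> dict:
    """Reject missing required parameters; fill optional ones from defaults."""
    missing = sorted(required - set(params))
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    # Values supplied by the model override the code-side defaults.
    return {**defaults, **params}


filled = validate_parameters(
    {"query": "quarterly revenue"},
    required={"query"},
    defaults={"maxResults": 10, "language": "de"},
)
```

Doing this in code rather than in the prompt keeps the parameter prompt short and makes the defaults unit-testable.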
7) Web Accuracy Enhancements (optional but recommended)
- In methodWeb.scrape, keep enrichment + reranking (already added) behind a config flag.
- Observation should surface relevance_score, key_points, entities for top-k results.
8) Telemetry and Limits
- Track per-iteration tokens/time; stop early if budget exceeded.
- Expose step count and decision reasons in logs for auditability.
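Per-iteration accounting and the early stop might be sketched like this. StepBudget and its fields are hypothetical. How token counts are obtained depends on the model provider and is left to the caller.

```python
import time
from dataclasses import dataclass, field


@dataclass
class StepBudget:  # hypothetical helper, not an existing project class
    max_tokens: int
    max_seconds: float
    tokens_used: int = 0
    started_at: float = field(default_factory=time.monotonic)
    log: list = field(default_factory=list)

    def record(self, step: int, tokens: int, seconds: float, reason: str) -> None:
        """Store one iteration's cost and the refinement reason for auditing."""
        self.tokens_used += tokens
        self.log.append({"step": step, "tokens": tokens, "seconds": seconds, "reason": reason})

    def exceeded(self) -> bool:
        """True once either the token budget or the wall-clock budget is spent."""
        elapsed = time.monotonic() - self.started_at
        return self.tokens_used > self.max_tokens or elapsed > self.max_seconds
```

The loop would call record after each iteration and break when exceeded returns True; the log list can then be written to the existing logger.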
Migration Plan
Step A (Non-breaking):
- Add Observation model/types; add compact prompt creators (unused initially).
- Gate enrichment in methodWeb.scrape via config.
Step B (Switch path):
- Add reactMode flag; if true, executeTask uses iterative loop; else keep legacy full action plan.
Step C (Tighten prompts):
- Replace verbose catalog with tiny catalog when reactMode.
- Add 5-line rules and schema caps (one action per step; minimal params).
Step D (Clean up):
- Deprecate bulk action plan path after validation; keep fallback.
Testing Strategy
- Unit: selection -> parameters validators; defaults application; observation builder.
- Integration: run a research task in reactMode and verify iterations, labels, and final output.
- Regression: legacy flow must still pass existing tests when reactMode=false.
Security Considerations
- Whitelist allowed methods per task type; sanitize parameters (URLs, file refs).
- Limit step count and token budget to avoid runaways.
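The whitelist and URL sanitization could be approached as follows. The task type "research" and the ALLOWED_METHODS mapping are made-up examples, and accepting only http/https URLs is an assumption about what "sanitize" should mean here.

```python
from urllib.parse import urlsplit

# Hypothetical mapping of task type to permitted methods.
ALLOWED_METHODS = {"research": {"web", "ai"}}


def is_method_allowed(task_type: str, method: str) -> bool:
    """Unknown task types get an empty allow-list."""
    return method in ALLOWED_METHODS.get(task_type, set())


def sanitize_url(url: str) -> str:
    """Accept only absolute http(s) URLs with a host; reject everything else."""
    parts = urlsplit(url.strip())
    if parts.scheme not in {"http", "https"} or not parts.hostname:
        raise ValueError(f"rejected URL: {url!r}")
    return parts.geturl()
```

Rejecting by scheme blocks file: and javascript: references that a model might otherwise place into a url parameter.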
Appendix: Minimal JSON Schemas
- Selection: {"action":{"method":"web","name":"scrape"}}
- Parameters: {"parameters":{"query":"...","maxResults":10,"language":"de"}}
- Observation: {"success":true,"resultLabel":"round1_task1_action1_results","documentsCount":3,"previews":[{"name":"...","mime":"application/json","snippet":"..."}]}
- Refinement decision: {"decision":"continue","reason":"Need more sources on Q3 2024"}
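A reply in the refinement-decision shape above could be parsed defensively as sketched here. parse_refinement_decision is a hypothetical name, and falling back to "stop" on malformed replies is an assumption chosen to avoid runaway loops.

```python
import json


def parse_refinement_decision(raw: str) -> dict:
    """Parse the model's reply; fall back to "stop" if it is not a valid decision."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = None
    if not isinstance(data, dict) or data.get("decision") not in {"continue", "stop"}:
        return {"decision": "stop", "reason": "unparseable refinement reply"}
    return {"decision": data["decision"], "reason": str(data.get("reason", ""))}


decision = parse_refinement_decision(
    '{"decision":"continue","reason":"Need more sources on Q3 2024"}'
)
```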