Refactor full workflow engine 3.0

ValueOn AG 2025-09-23 22:48:00 +02:00
parent 71933e6e9e
commit cb8d337cf9
5 changed files with 241 additions and 1 deletions


@ -0,0 +1,79 @@
## React Mode (Plan-Act-Observe-Refine)
This introduces a compact iterative workflow that replaces bulk action plans with a tight loop:
- Plan (select): Model selects exactly one action
- Act (execute): Host requests only parameters for that action and executes
- Observe (summarize): Host returns a compact observation object
- Refine (decide): Model decides whether to continue or stop
### How to enable
Set the mode on `ChatWorkflow` (persisted in DB / passed through the API):
- `workflowMode: string`: "Actionplan" (legacy) or "React" (iterative)
- `maxSteps: number`: maximum iterations per task in React mode (default 5)
When `workflowMode="Actionplan"`, the legacy batch action planning path is used.
### Data models
Defined in `gateway/modules/interfaces/interfaceChatModel.py`:
- `ActionSelection`: `{ method: str, name: str }`
- `ActionParameters`: `{ parameters: dict }`
- `Observation`: `{ success: bool, resultLabel: str, documentsCount: int, previews: [{name,mime,snippet}], notes: [str] }`
- `TaskContext` additions: `reactMode: bool`, `maxSteps: int`
- `ChatWorkflow` additions: `workflowMode: str`, `maxSteps: int`
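The shapes above can be sketched as plain dataclasses. This is only an illustration: the real definitions in `interfaceChatModel.py` may use a different base class, and the `Preview` class name is an assumption introduced here for one entry of `previews`.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class ActionSelection:
    method: str  # e.g. "web"
    name: str    # e.g. "search"


@dataclass
class ActionParameters:
    parameters: dict = field(default_factory=dict)


@dataclass
class Preview:  # hypothetical name for one entry of Observation.previews
    name: str
    mime: str
    snippet: str


@dataclass
class Observation:
    success: bool
    resultLabel: str
    documentsCount: int
    previews: list[Preview] = field(default_factory=list)
    notes: list[str] = field(default_factory=list)


obs = Observation(True, "round1_task1_action1_results", 1,
                  [Preview("a.md", "text/markdown", "first lines")])
payload = asdict(obs)  # nested dataclasses become plain dicts, ready for JSON
```

Using `default_factory` keeps each instance's `previews` and `notes` lists independent of each other.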
### Prompts
Defined in `gateway/modules/chat/handling/promptFactory.py`:
- `createActionSelectionPrompt(context)` → returns one action
- `createActionParameterPrompt(context, selected_action)` → returns parameters only
- `createRefinementPrompt(context, observation)` → returns `{ decision: continue|stop, reason }`
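A minimal sketch of how the host might parse the reply to the refinement prompt. The function name `parse_refinement` and the choice to fall back to "stop" on unusable output are assumptions, not taken from the commit.

```python
import json

VALID_DECISIONS = {"continue", "stop"}


def parse_refinement(raw: str) -> dict:
    """Parse a refinement reply; fall back to "stop" when it is unusable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"decision": "stop", "reason": "unparseable refinement response"}
    if not isinstance(data, dict):
        return {"decision": "stop", "reason": "refinement response is not an object"}
    decision = str(data.get("decision", "")).strip().lower()
    if decision not in VALID_DECISIONS:
        return {"decision": "stop", "reason": f"unknown decision: {decision!r}"}
    return {"decision": decision, "reason": str(data.get("reason", ""))}
```

Defaulting to "stop" errs on the side of ending the loop rather than iterating on a reply the host cannot interpret.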
### Execution flow
Implemented in `gateway/modules/chat/handling/handlingTasks.py`:
- `plan_select(context)` → `{ action: { method, name } }`
- `act_execute(context, selection, task_step, workflow, step)` → executes one action and returns `ActionResult`
- `observe_build(action_result)` → builds `Observation`
- `refine_decide(context, observation)` → `{ decision, reason }`
- Integrated loop lives inside `executeTask(...)` when `context.reactMode` is true
Iteration control helper in `gateway/modules/chat/handling/executionState.py`:
- `should_continue(observation, review_dict, current_step, max_steps)`
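One plausible reading of the helper's signature, as a sketch only. The semantics are assumed: `review_dict` is taken to carry the refinement `decision`, and `current_step` is taken to count completed iterations.

```python
def should_continue(observation: dict, review_dict: dict,
                    current_step: int, max_steps: int) -> bool:
    """Return True if another iteration should run (assumed semantics)."""
    # `observation` is accepted for signature parity; this sketch does not
    # inspect it, so a failed action alone does not end the loop.
    if current_step >= max_steps:  # step budget exhausted
        return False
    if review_dict.get("decision") == "stop":  # model asked to stop
        return False
    return True
```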
### Observation format
Compact, machine-friendly, with small previews only:
```
{
success: boolean,
resultLabel: string,
documentsCount: number,
previews: [ { name, mime, snippet } ],
notes: [ string ]
}
```
### Telemetry
Each iteration logs its duration (in seconds) to the workflow logs. Token metrics are not required.
### Backward compatibility
- The legacy planning/execution path is unchanged and is used whenever `workflowMode="Actionplan"`.
- No breaking changes to action or document structures.
### Notes
- Document routing uses deterministic `resultLabel`: `round{r}_task{t}_action{a}_...`
- Previews are capped at 5 items; each keeps only `name`, `mime`, and a short `snippet`.
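A sketch of an observation builder that applies the preview cap. The input shape is hypothetical: the real `ActionResult` is not shown in this commit view, so the `documents` and `content` keys and the snippet length are assumptions.

```python
MAX_PREVIEWS = 5      # cap stated in the notes above
SNIPPET_CHARS = 200   # assumed snippet length; the source does not state one


def observe_build(action_result: dict) -> dict:
    """Build a compact observation from a dict-shaped action result."""
    docs = action_result.get("documents", [])
    previews = [
        {
            "name": doc.get("name", ""),
            "mime": doc.get("mime", ""),
            "snippet": str(doc.get("content", ""))[:SNIPPET_CHARS],
        }
        for doc in docs[:MAX_PREVIEWS]  # only the first few documents are previewed
    ]
    return {
        "success": bool(action_result.get("success", False)),
        "resultLabel": action_result.get("resultLabel", ""),
        "documentsCount": len(docs),  # full count, even when previews are capped
        "previews": previews,
        "notes": [],
    }
```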

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long


@ -0,0 +1,69 @@
Title: Plan-Act-Observe-Refine Workflow Architecture (PowerOn)
Overview
- Objective: Replace bulk “task plan → full action plan → execute later” with a compact iterative loop: plan (select one action) → act (execute with minimal params) → observe (summarize results) → refine (decide next step or stop).
- Benefits: Lower token usage, higher accuracy via tight feedback, less overplanning, clearer document routing, and better failure recovery.
Core Loop
1) Plan (Select)
- Input: objective, success criteria, tiny tool catalog (names + parameter names only), minimal available documents/connections, and short rules.
- Output (JSON): {"action": {"method": "<method>", "name": "<action>"}}
- Constraints: exactly one action per iteration.
2) Act (Specify + Execute)
- Input: selected action name from Plan.
- Model returns only required parameters for that action: {"parameters": {...}}.
- Host validates and applies in-code defaults (user language, depth, recency) then executes.
3) Observe (Summarize Results)
- Host returns a compact observation object, not raw payloads:
{
"success": true|false,
"resultLabel": "roundX_taskY_actionZ_label",
"documentsCount": N,
"previews": [{"name":"..","mime":"..","snippet":".."}],
"notes": ["short fact"]
}
- For web results, include per-URL relevance, key points, entities if available.
4) Refine (Decide Next or Stop)
- Model decides: stop with final answer or propose next single action (return to Plan).
- Stop criteria: all success_criteria met; or no further actions can improve score; or max steps reached.
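The four phases above can be sketched as a loop over injected callables. The name `run_loop` and its parameters are hypothetical; in the real engine the model calls and action execution would replace the lambdas used here.

```python
from typing import Callable


def run_loop(select: Callable[[], dict],
             specify: Callable[[dict], dict],
             execute: Callable[[dict, dict], dict],
             refine: Callable[[dict], dict],
             max_steps: int = 5) -> list[dict]:
    """Run plan -> act -> observe -> refine until "stop" or max_steps."""
    observations = []
    for _ in range(max_steps):
        action = select()["action"]              # Plan: exactly one action
        params = specify(action)["parameters"]   # Act: parameters for that action only
        observation = execute(action, params)    # Observe: compact result
        observations.append(observation)
        if refine(observation)["decision"] == "stop":  # Refine: continue or stop
            break
    return observations


# Stubbed usage: the second refinement answers "stop".
decisions = iter(["continue", "stop"])
history = run_loop(
    select=lambda: {"action": {"method": "web", "name": "search"}},
    specify=lambda action: {"parameters": {"query": "example"}},
    execute=lambda action, params: {"success": True, "documentsCount": 1},
    refine=lambda obs: {"decision": next(decisions)},
)
```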
Minimal Tool Catalog (names + param names only)
- web.search(query,maxResults,searchDepth,timeRange,topic,includeDomains,excludeDomains,language,includeAnswer,includeRawContent)
- web.scrape(query,maxResults,searchDepth,timeRange,topic,includeDomains,excludeDomains,language,includeAnswer,includeRawContent,extractDepth,format)
- web.crawl(documentList,extractDepth,format)
- ai.process(documentList,aiPrompt,processingMode,includeMetadata,customInstructions,expectedDocumentFormats)
- document.extract(documentList,aiPrompt)
- document.generateReport(documentList,title)
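A sketch of how such a catalog could be held in code and rendered into the prompt. Only a subset of the actions listed above is shown, and the names `CATALOG` and `render_catalog` are assumptions.

```python
# Subset of the catalog above: "method.action" -> parameter names only.
CATALOG = {
    "web.crawl": ["documentList", "extractDepth", "format"],
    "document.extract": ["documentList", "aiPrompt"],
    "document.generateReport": ["documentList", "title"],
}


def render_catalog(catalog: dict[str, list[str]]) -> str:
    """Render one compact line per action, without descriptions or types."""
    return "\n".join(
        f"- {name}({','.join(params)})" for name, params in catalog.items()
    )


prompt_block = render_catalog(CATALOG)
```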
Business Rules (Prompt-Level, ≤7 lines)
- Pick exactly one action per step; then specify only its parameters.
- Derive parameters from objective + criteria; use user language; add recency only if freshness is implied.
- Only request machine-readable formats when explicitly required; otherwise narrative text/markdown.
- Keep parameters minimal; avoid connector-specific knobs; chain outputs so next step is consumable.
- Stop when criteria are met; otherwise iterate with a new single action.
Defaults (Code-Level, not Prompt)
- Language default from user profile; depth advanced for analysis, basic for lookups.
- Time window only when criteria imply freshness.
- ai.process defaults to markdown narrative unless expectedDocumentFormats explicitly requests structured data.
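A minimal sketch of code-level defaults for the language and depth rules. The function name and the `is_analysis` flag are hypothetical; `language` and `searchDepth` are parameter names from the catalog above. No time window is added here, since that applies only when the criteria imply freshness.

```python
def apply_defaults(params: dict, user_language: str, is_analysis: bool) -> dict:
    """Fill defaults without overriding anything the model supplied."""
    filled = dict(params)  # copy, so the model's output is not mutated
    filled.setdefault("language", user_language)
    filled.setdefault("searchDepth", "advanced" if is_analysis else "basic")
    return filled
```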
State and Routing
- Workflow context tracks (currentRound, currentTask, currentAction).
- Each action execution attaches documents with a deterministic resultLabel: round{r}_task{t}_action{a}_{label}.
- Observation objects reference only labels and previews to limit prompt size.
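The label format can be sketched as a small helper. The function name and the sanitizing of the free-text label are assumptions; only the `round{r}_task{t}_action{a}_{label}` pattern comes from the text above.

```python
import re


def build_result_label(round_no: int, task_no: int, action_no: int, label: str) -> str:
    """Build a deterministic label of the form round{r}_task{t}_action{a}_{label}."""
    # Assumed sanitizing step: collapse anything non-alphanumeric to "_".
    safe = re.sub(r"[^A-Za-z0-9]+", "_", label).strip("_")
    return f"round{round_no}_task{task_no}_action{action_no}_{safe}"
```

Because the label depends only on the counters and the sanitized name, running the same step again yields the same label.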
Failure Handling
- After observe, a lightweight review step classifies: success | retry | failed, with improvements.
- Retry increments a small counter; criteria progress tracked across attempts.
Security & Limits
- Allowed methods per task type; deny-list for risky methods.
- Max steps per task; token budget guard; truncate observations to safe size.
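One way to truncate observations to a safe size, as a sketch: drop previews from the end until the serialized observation fits. The function name and the 2000-character default are assumptions.

```python
import json


def truncate_observation(observation: dict, max_chars: int = 2000) -> dict:
    """Return a copy whose JSON form fits max_chars by dropping trailing previews."""
    trimmed = dict(observation)
    previews = list(trimmed.get("previews", []))  # copy; the input stays untouched
    trimmed["previews"] = previews
    while previews and len(json.dumps(trimmed)) > max_chars:
        previews.pop()  # remove the last preview and re-measure
    return trimmed
```

If the observation is still too large with no previews left, this sketch returns it as-is; a real guard would also need to shorten `notes`.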
Final Output
- When stopping, model produces a concise final message and, if applicable, invokes a last formatting action (e.g., ai.process → report md) before ending.


@ -0,0 +1,91 @@
Title: Plan-Act-Observe-Refine Implementation Specification (PowerOn)
Scope
- This document specifies concrete, stepwise changes to adopt the iterative loop across models, workflow engine, state machine, prompts, and handlers.
1) Data Models (modules/interfaces/interfaceChatModel.py)
- Add minimal schemas for the two-step protocol:
- ActionSelection: { method: str, name: str }
- ActionParameters: { parameters: dict }
- Add Observation model returned to the model after each action:
Observation: { success: bool, resultLabel: str, documentsCount: int, previews: [{name,mime,snippet}], notes: [str] }
- Extend TaskWorkflow context with: maxSteps (int), reactMode (bool).
- No breaking changes to ActionResult/ActionDocument; keep stable.
2) Workflow Engine (gateway/modules/chat/managerChat.py)
- Entry point remains executeUnifiedWorkflow.
- Per task, switch to the plan-act loop instead of generating a full action list:
- For each iteration (<= maxSteps):
1) Call createActionSelectionPrompt → get {action:{method,name}}.
2) Validate allowed method; if invalid, request another selection (one retry).
3) Call createActionParameterPrompt with the selected action → get {parameters:{...}}.
4) Validate/fill defaults in code; executeSingleAction.
5) Build Observation (compact) and feed to createRefinementPrompt; if “stop”, break.
- Preserve current messaging (start/step/complete) but at iteration granularity; keep document routing the same.
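Step 2 of the iteration (validate the method, allow one retry) could look like the sketch below. The function name, the `request_selection` callable, and returning `None` after the retry is used up are assumptions.

```python
from typing import Callable, Optional


def select_allowed_action(request_selection: Callable[[], dict],
                          allowed_methods: set[str],
                          retries: int = 1) -> Optional[dict]:
    """Ask for a selection; re-ask up to `retries` times if the method is not allowed."""
    for _ in range(retries + 1):
        action = request_selection().get("action", {})
        if action.get("method") in allowed_methods:
            return action
    return None  # caller decides how to fail the step


# Stubbed usage: the first reply picks a disallowed method, the retry succeeds.
replies = iter([
    {"action": {"method": "shell", "name": "run"}},
    {"action": {"method": "web", "name": "search"}},
])
chosen = select_allowed_action(lambda: next(replies), {"web", "ai", "document"})
```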
3) State Machine (gateway/modules/chat/handling/executionState.py)
- Track iteration count (current_step) and max_steps.
- Maintain criteria progress across steps as today, but per-iteration.
- Provide helper: should_continue(observation, review) -> bool.
4) Prompts (gateway/modules/chat/handling/promptFactory.py)
- New compact prompts:
a) createActionSelectionPrompt(context): returns only one action selection.
b) createActionParameterPrompt(context, selected_action): returns only parameters for that action.
c) createRefinementPrompt(context, observation): returns decision: {"decision":"continue|stop","reason":"..."} and, if continue, optionally a hint for next subgoal.
- Keep existing task planning and review prompts; they remain for high-level plan and QA.
- Replace the long “available methods” JSON with a tiny catalog (names + parameter names only).
- Embed 5-line business rules; drop provider specifics.
5) HandlingTasks (gateway/modules/chat/handling/handlingTasks.py)
- New flow inside executeTask:
- Initialize iteration=0; while iteration<max_steps:
- selection_prompt = createActionSelectionPrompt(...)
- param_prompt = createActionParameterPrompt(...)
- Execute action with validated parameters; create message and documents as today.
- Build Observation (compact) from ActionResult.
- refinement_prompt = createRefinementPrompt(...)
- If stop -> run existing reviewTaskCompletion to finalize; else continue.
- Keep createTaskAction for backward compatibility (batch plans) but mark as legacy.
6) Defaults & Validation (code, not prompts)
- Derive userLanguage from workflow/user; set defaults for web.* depth/time if implied by criteria.
- ai.process defaults to markdown narrative unless expectedDocumentFormats requires JSON/CSV.
- Strict validation: reject missing required params; soft-fill optional ones.
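The strict/soft split can be sketched as follows. The function name and the way required and optional parameters are described (a list and a dict of defaults) are assumptions made for illustration.

```python
def validate_parameters(params: dict, required: list[str],
                        optional_defaults: dict) -> dict:
    """Reject missing required parameters; fill optional ones from defaults."""
    missing = [name for name in required if name not in params]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    # Values supplied by the model win over defaults.
    return {**optional_defaults, **params}
```

For example, `validate_parameters({"query": "q"}, ["query"], {"maxResults": 10})` returns both keys, while omitting `query` raises `ValueError`.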
7) Web Accuracy Enhancements (optional but recommended)
- In methodWeb.scrape, keep enrichment + reranking (already added) behind a config flag.
- Observation should surface relevance_score, key_points, entities for top-k results.
8) Telemetry and Limits
- Track per-iteration tokens/time; stop early if budget exceeded.
- Expose step count and decision reasons in logs for auditability.
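A sketch of per-iteration tracking with an early-stop check. The class name `StepBudget` and its fields are hypothetical; token counts would come from whatever the model client reports.

```python
import time
from dataclasses import dataclass, field


@dataclass
class StepBudget:
    max_tokens: int
    used_tokens: int = 0
    durations: list[float] = field(default_factory=list)

    def record(self, tokens: int, started_at: float) -> None:
        """Record one iteration's token use and elapsed seconds."""
        self.used_tokens += tokens
        self.durations.append(time.perf_counter() - started_at)

    @property
    def exceeded(self) -> bool:
        return self.used_tokens > self.max_tokens


budget = StepBudget(max_tokens=100)
start = time.perf_counter()
budget.record(tokens=60, started_at=start)  # first iteration stays within budget
```

The loop would check `budget.exceeded` after each iteration and stop early when it is true.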
Migration Plan
Step A (Non-breaking):
- Add Observation model/types; add compact prompt creators (unused initially).
- Gate enrichment in methodWeb.scrape via config.
Step B (Switch path):
- Add reactMode flag; if true, executeTask uses iterative loop; else keep legacy full action plan.
Step C (Tighten prompts):
- Replace verbose catalog with tiny catalog when reactMode.
- Add 5-line rules and schema caps (one action per step; minimal params).
Step D (Clean up):
- Deprecate bulk action plan path after validation; keep fallback.
Testing Strategy
- Unit: selection -> parameters validators; defaults application; observation builder.
- Integration: run a research task in reactMode and verify iterations, labels, and final output.
- Regression: legacy flow must still pass existing tests when reactMode=false.
Security Considerations
- Whitelist allowed methods per task type; sanitize parameters (URLs, file refs).
- Limit step count and token budget to avoid runaways.
Appendix: Minimal JSON Schemas
- Selection: {"action":{"method":"web","name":"scrape"}}
- Parameters: {"parameters":{"query":"...","maxResults":10,"language":"de"}}
- Observation: {"success":true,"resultLabel":"round1_task1_action1_results","documentsCount":3,"previews":[{"name":"...","mime":"application/json","snippet":"..."}]}
- Refinement decision: {"decision":"continue","reason":"Need more sources on Q3 2024"}