wiki/implementation/implementation_workflow_processing_refactoring.md

# Refactoring Recommendations for Workflow Processing
## Current Issues
### 1. File Size Problem
- **handlingTasks.py**: 2,192 lines - too large for maintainability
- **Mixed responsibilities**: Task planning, execution, messaging, validation all in one class
- **Complex conditional logic**: Mode switching creates nested conditions
### 2. Code Duplication
- **Message creation**: Similar logic in `createActionMessage()` and `createReactActionMessage()`
- **AI calls**: Repeated patterns for different prompt types
- **Validation**: Similar validation logic across modes
### 3. Tight Coupling
- **Mode-specific logic**: Mixed with common functionality
- **Hard to test**: Large methods with multiple responsibilities
- **Hard to extend**: Adding new modes requires modifying existing code
## Proposed Refactoring Structure
```
gateway/modules/workflows/processing/
├── core/ # Common functionality
│ ├── __init__.py
│ ├── taskPlanner.py # Task planning (common)
│ ├── actionExecutor.py # Action execution (common)
│ ├── messageCreator.py # Generic message creation (all workflow phases)
│ ├── validator.py # Validation logic (common)
│ └── workflowCoordinator.py # Main workflow coordination
├── modes/ # Mode-specific implementations
│ ├── __init__.py
│ ├── baseMode.py # Abstract base class
│ ├── actionplanMode.py # Actionplan mode implementation
│ └── reactMode.py # React mode implementation
├── shared/ # Shared utilities
│ ├── __init__.py
│ ├── executionState.py # State management
│ ├── promptFactory.py # Prompt generation (reusable functions)
│ └── promptFactoryPlaceholders.py
└── workflowProcessor.py # Main processor (renamed from handlingTasks.py)
```
## Detailed Refactoring Plan
### Phase 1: Extract Common Functionality
#### 1.1 Create Core Modules
**taskPlanner.py**
```python
class TaskPlanner:
    async def generateTaskPlan(self, userInput: str, workflow) -> TaskPlan:
        # Move from handlingTasks.py:87-267
        # Common task planning logic
        pass
```
**actionExecutor.py**
```python
class ActionExecutor:
    async def executeSingleAction(self, action, workflow, task_step, ...):
        # Move from handlingTasks.py:1562-1710
        # Common action execution logic
        pass
```
**messageCreator.py**
```python
class MessageCreator:
    async def createTaskPlanMessage(self, task_plan, workflow):
        # Move from handlingTasks.py:269-308
        # Generic task plan messaging
        pass

    async def createTaskStartMessage(self, task_step, workflow, task_index, total_tasks):
        # Move from handlingTasks.py:882-910
        # Generic task start messaging
        pass

    async def createActionMessage(self, action, result, workflow, ...):
        # Move from handlingTasks.py:1712-1789
        # Generic action messaging (success/failure)
        pass

    async def createTaskCompletionMessage(self, task_step, workflow, task_index, total_tasks, review_result):
        # Move from handlingTasks.py:1069-1103
        # Generic task completion messaging
        pass

    async def createRetryMessage(self, task_step, workflow, task_index, review_result):
        # Move from handlingTasks.py:1176-1195
        # Generic retry messaging
        pass

    async def createErrorMessage(self, task_step, workflow, task_index, error_details):
        # Move from handlingTasks.py:1201-1252
        # Generic error messaging
        pass
```
**validator.py**
```python
from typing import Any, Dict, List


class WorkflowValidator:
    def validateTask(self, task_plan: Dict[str, Any]) -> bool:
        # Move from handlingTasks.py:1793-1850
        # Renamed from _validateTaskPlan, removed underscore
        pass

    def validateAction(self, actions: List[Dict[str, Any]], context) -> bool:
        # Move from handlingTasks.py:1906-1938
        # Renamed from _validateActions, removed underscore, singular form
        pass
```
### Phase 2: Create Mode-Specific Implementations
#### 2.1 Base Mode Class
**baseMode.py**
```python
from abc import ABC, abstractmethod
from typing import List


class BaseMode(ABC):
    def __init__(self, services, workflow):
        self.services = services
        self.workflow = workflow
        # Common components shared by every mode
        self.taskPlanner = TaskPlanner(services)
        self.actionExecutor = ActionExecutor(services)
        self.messageCreator = MessageCreator(services)
        self.validator = WorkflowValidator(services)

    @abstractmethod
    async def executeTask(self, task_step, workflow, context, ...) -> TaskResult:
        pass

    @abstractmethod
    async def generateTaskActions(self, task_step, workflow, ...) -> List[TaskAction]:
        pass
```
#### 2.2 Actionplan Mode
**actionplanMode.py**
```python
from typing import List


class ActionplanMode(BaseMode):
    async def executeTask(self, task_step, workflow, context, ...) -> TaskResult:
        # Move from handlingTasks.py:972-1261
        # Actionplan-specific execution logic
        pass

    async def generateTaskActions(self, task_step, workflow, ...) -> List[TaskAction]:
        # Move from handlingTasks.py:445-643
        # Batch action generation
        pass
```
#### 2.3 React Mode
**reactMode.py**
```python
from typing import Any, Dict


class ReactMode(BaseMode):
    async def executeTask(self, task_step, workflow, context, ...) -> TaskResult:
        # Move from handlingTasks.py:916-970
        # React-specific execution logic
        pass

    async def plan_select(self, context: TaskContext) -> Dict[str, Any]:
        # Move from handlingTasks.py:647-692
        pass

    async def act_execute(self, context: TaskContext, selection: Dict[str, Any], ...):
        # Move from handlingTasks.py:694-773
        pass

    def observe_build(self, action_result: ActionResult) -> Dict[str, Any]:
        # Move from handlingTasks.py:775-822
        pass

    async def refine_decide(self, context: TaskContext, observation: Dict[str, Any]) -> Dict[str, Any]:
        # Move from handlingTasks.py:824-863
        pass
```
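The four React steps above suggest a plan, act, observe, refine loop. The following is a minimal sketch of how `executeTask` might chain them, not the existing implementation: the `max_steps` cap, the `"continue"` key in the decision dict, and the dict-based context are all assumptions, and every step body is a stub standing in for the logic moved out of `handlingTasks.py`.

```python
import asyncio
from typing import Any, Dict, List


class ReactLoopSketch:
    """Illustrative stand-in for ReactMode; every step body is a stub."""

    def __init__(self, max_steps: int = 3):
        self.max_steps = max_steps  # hypothetical safety cap on iterations

    async def plan_select(self, context: Dict[str, Any]) -> Dict[str, Any]:
        # Stub: choose the next action from the current context.
        return {"action": f"step-{len(context['observations']) + 1}"}

    async def act_execute(self, context: Dict[str, Any], selection: Dict[str, Any]) -> str:
        # Stub: run the selected action and return its raw result.
        return f"result of {selection['action']}"

    def observe_build(self, action_result: str) -> Dict[str, Any]:
        # Stub: turn the raw result into an observation.
        return {"summary": action_result}

    async def refine_decide(self, context: Dict[str, Any], observation: Dict[str, Any]) -> Dict[str, Any]:
        # Stub: stop once two observations have been collected.
        return {"continue": len(context["observations"]) < 2}

    async def executeTask(self, context: Dict[str, Any]) -> List[Dict[str, Any]]:
        for _ in range(self.max_steps):
            selection = await self.plan_select(context)
            result = await self.act_execute(context, selection)
            observation = self.observe_build(result)
            context["observations"].append(observation)
            decision = await self.refine_decide(context, observation)
            if not decision["continue"]:
                break
        return context["observations"]


observations = asyncio.run(ReactLoopSketch().executeTask({"observations": []}))
print(observations)
```

Keeping the loop in `executeTask` and the steps in separate methods means each step can be tested or replaced without touching the loop itself.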
### Phase 3: Simplify Main Coordinator
**workflowProcessor.py** (Simplified)
```python
from typing import Optional


class WorkflowProcessor:
    def __init__(self, services, workflow=None):
        self.services = services
        self.workflow = workflow
        # workflow may be None (the default), so read workflowMode defensively
        self.mode = self._createMode(getattr(workflow, "workflowMode", None))

    def _createMode(self, workflowMode: Optional[str]) -> BaseMode:
        if workflowMode == "React":
            return ReactMode(self.services, self.workflow)
        else:
            return ActionplanMode(self.services, self.workflow)

    async def generateTaskPlan(self, userInput: str, workflow) -> TaskPlan:
        return await self.mode.taskPlanner.generateTaskPlan(userInput, workflow)

    async def executeTask(self, task_step, workflow, context, ...) -> TaskResult:
        return await self.mode.executeTask(task_step, workflow, context, ...)

    # Delegate other methods to appropriate components
```
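To make the delegation idea concrete, here is a small runnable sketch with stub classes. The `*Stub` and `ProcessorSketch` names and the single-argument `executeTask` are hypothetical simplifications; only the selection rule ("React" versus everything else) is taken from `_createMode` above.

```python
import asyncio


class BaseModeStub:
    """Hypothetical stand-in for BaseMode with a single delegated method."""

    async def executeTask(self, task_step: str) -> str:
        raise NotImplementedError


class ReactModeStub(BaseModeStub):
    async def executeTask(self, task_step: str) -> str:
        return f"react:{task_step}"


class ActionplanModeStub(BaseModeStub):
    async def executeTask(self, task_step: str) -> str:
        return f"actionplan:{task_step}"


class ProcessorSketch:
    def __init__(self, workflowMode: str):
        # Same rule as _createMode: "React" selects the React mode,
        # anything else falls back to Actionplan.
        self.mode = ReactModeStub() if workflowMode == "React" else ActionplanModeStub()

    async def executeTask(self, task_step: str) -> str:
        # No mode checks here; the selected mode object carries the behaviour.
        return await self.mode.executeTask(task_step)


react_result = asyncio.run(ProcessorSketch("React").executeTask("summarize"))
default_result = asyncio.run(ProcessorSketch("Actionplan").executeTask("summarize"))
print(react_result, default_result)
```

The mode is chosen once in the constructor, so the nested mode conditionals listed under Current Issues do not reappear in the processor's methods.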
### Phase 4: Create Shared Utilities
#### 4.1 Enhanced Prompt Factory (Reusable Functions)
**promptFactory.py** (Enhanced)
```python
from typing import Any, Dict


class PromptFactory:
    def __init__(self, services):
        self.services = services

    def createPrompt(self, template_type: str, context: Any) -> str:
        # Centralized prompt creation
        # Move from promptFactory.py and promptFactoryPlaceholders.py
        pass

    def extractPlaceholders(self, context: Any) -> Dict[str, str]:
        # Centralized placeholder extraction
        pass

    # Reusable prompt element functions (called by both modes)
    def getAvailableDocuments(self, context: Any) -> str:
        # Move from promptFactory.py:_getAvailableDocuments()
        pass

    def getWorkflowHistory(self, services, context: Any) -> str:
        # Move from promptFactory.py:_getPreviousRoundContext()
        pass

    def getAvailableMethods(self, services) -> str:
        # Move from promptFactory.py:getMethodsList()
        pass

    def getEnhancedDocumentContext(self, services) -> str:
        # Move from promptFactory.py:getEnhancedDocumentContext()
        pass

    def getConnectionReferenceList(self, services) -> str:
        # Move from promptFactory.py:_getConnectionReferenceList()
        pass
```
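One way the reusable element functions could feed `createPrompt` is through a placeholder table, sketched below. This is an illustration under assumptions, not the existing `promptFactory.py` logic: the `$name` placeholder syntax (Python's `string.Template`), the `PLACEHOLDERS` and `TEMPLATES` tables, the dict-shaped context, and the template texts are all hypothetical.

```python
from string import Template
from typing import Callable, Dict


def getAvailableDocuments(context: dict) -> str:
    # Stub element function: list document names from the context.
    return ", ".join(context.get("documents", [])) or "none"


def getWorkflowHistory(context: dict) -> str:
    # Stub element function: join earlier round summaries.
    return " | ".join(context.get("history", [])) or "no previous rounds"


# Hypothetical registry mapping placeholder names to element functions.
PLACEHOLDERS: Dict[str, Callable[[dict], str]] = {
    "documents": getAvailableDocuments,
    "history": getWorkflowHistory,
}

# Hypothetical templates; both modes draw on the same element functions.
TEMPLATES = {
    "actionplan": Template("Plan all actions.\nDocuments: $documents\nHistory: $history"),
    "react": Template("Pick the next action.\nDocuments: $documents"),
}


def createPrompt(template_type: str, context: dict) -> str:
    values = {name: build(context) for name, build in PLACEHOLDERS.items()}
    # substitute() ignores unused keys, so each template uses what it needs.
    return TEMPLATES[template_type].substitute(values)


context = {"documents": ["report.pdf", "notes.md"], "history": []}
react_prompt = createPrompt("react", context)
plan_prompt = createPrompt("actionplan", context)
print(react_prompt)
```

Because both templates resolve `$documents` through the same function, a change to how documents are listed lands in one place.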
## Benefits of Refactoring
### 1. Maintainability
- **Smaller files**: Each file has a single responsibility
- **Clear separation**: Mode-specific vs common logic
- **Easier debugging**: Isolated functionality
### 2. Testability
- **Unit testing**: Each component can be tested independently
- **Mocking**: Easier to mock dependencies
- **Integration testing**: Clear interfaces between components
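As a sketch of the mocking point, the test below checks that the processor forwards `generateTaskPlan` to the planner without running any real planning. It assumes the mode is injected into the processor, which differs from the plan (there the processor builds its own mode); `ProcessorUnderTest` is a hypothetical slice of `WorkflowProcessor` used only for illustration.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


class ProcessorUnderTest:
    """Hypothetical slice of WorkflowProcessor with the mode injected."""

    def __init__(self, mode):
        self.mode = mode

    async def generateTaskPlan(self, userInput, workflow):
        return await self.mode.taskPlanner.generateTaskPlan(userInput, workflow)


def test_generateTaskPlan_delegates_to_planner():
    # MagicMock stands in for the mode; AsyncMock makes the planner awaitable.
    mode = MagicMock()
    mode.taskPlanner.generateTaskPlan = AsyncMock(return_value={"tasks": ["t1"]})
    processor = ProcessorUnderTest(mode)

    plan = asyncio.run(processor.generateTaskPlan("do something", None))

    assert plan == {"tasks": ["t1"]}
    mode.taskPlanner.generateTaskPlan.assert_awaited_once_with("do something", None)


test_generateTaskPlan_delegates_to_planner()
```

The same pattern applies to `executeTask`: mock the mode, await the processor method, and assert on the forwarded arguments.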
### 3. Extensibility
- **New modes**: Add a new mode by subclassing `BaseMode`
- **New features**: Add to appropriate component
- **Configuration**: Mode-specific configuration
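The plan selects modes with an `if`/`else` in `_createMode`, so a third mode would still mean editing the processor. One possible variation, shown here purely as a sketch, is a registry that modes add themselves to. `MODE_REGISTRY`, `registerMode`, `describe()`, and `ReviewMode` are hypothetical names introduced for illustration; `BaseMode` is trimmed to keep the example self-contained.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class BaseMode(ABC):
    """Trimmed stand-in for the BaseMode shown earlier."""

    @abstractmethod
    def describe(self) -> str:  # hypothetical method, for illustration only
        ...


MODE_REGISTRY: Dict[str, Type[BaseMode]] = {}  # hypothetical registry


def registerMode(name: str):
    """Class decorator that records a mode under its workflowMode name."""
    def decorator(cls: Type[BaseMode]) -> Type[BaseMode]:
        MODE_REGISTRY[name] = cls
        return cls
    return decorator


@registerMode("Actionplan")
class ActionplanMode(BaseMode):
    def describe(self) -> str:
        return "batch action generation"


@registerMode("React")
class ReactMode(BaseMode):
    def describe(self) -> str:
        return "iterative plan/act/observe/refine"


@registerMode("Review")  # hypothetical third mode; no existing code touched
class ReviewMode(BaseMode):
    def describe(self) -> str:
        return "review only"


def createMode(workflowMode: str) -> BaseMode:
    # Unknown names fall back to Actionplan, matching _createMode's else branch.
    return MODE_REGISTRY.get(workflowMode, ActionplanMode)()


print(sorted(MODE_REGISTRY), createMode("Review").describe())
```

Whether the extra indirection is worth it depends on how many modes are expected; with only two, the plain `if`/`else` may be easier to read.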
### 4. Code Reuse
- **Common logic**: Shared across modes
- **Reduced duplication**: Single implementation of common functionality
- **Consistent behavior**: Same logic for same operations
## Migration Strategy
### Step 1: Create New Structure
1. Create new directory structure
2. Create base classes and interfaces
3. Move common functionality to core modules
### Step 2: Extract Mode-Specific Logic
1. Create ActionplanMode class
2. Create ReactMode class
3. Move mode-specific methods
### Step 3: Update Main Coordinator
1. Simplify the `HandlingTasks` class (to become `WorkflowProcessor`)
2. Use composition instead of inheritance: the processor holds a mode object and does not subclass it
3. Delegate to appropriate components
### Step 4: Testing and Validation
1. Create unit tests for each component
2. Integration tests for each mode
3. End-to-end workflow tests
### Step 5: Cleanup
1. Remove old code
2. Update imports
3. Update documentation
## File Size Reduction
### Current State
- **handlingTasks.py**: 2,192 lines
### After Refactoring
- **workflowProcessor.py**: ~200 lines (main processor)
- **core/taskPlanner.py**: ~300 lines
- **core/actionExecutor.py**: ~400 lines
- **core/messageCreator.py**: ~400 lines (generic messages for all workflow phases)
- **modes/actionplanMode.py**: ~400 lines
- **modes/reactMode.py**: ~350 lines
- **shared/promptFactory.py**: ~300 lines (reusable prompt element functions)
- **shared/executionState.py**: ~100 lines
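**Total**: ~2,450 lines, about 260 more than the current file; the increase is the overhead of splitting into modules and of generic message handling. `core/validator.py`, `core/workflowCoordinator.py`, and `modes/baseMode.py` are not included in this estimate.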
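The total can be checked by summing the per-file estimates listed above. The numbers are the ones from this section; only the variable names are introduced here.

```python
# Estimated line counts per file, as listed under "After Refactoring".
estimated_lines = {
    "workflowProcessor.py": 200,
    "core/taskPlanner.py": 300,
    "core/actionExecutor.py": 400,
    "core/messageCreator.py": 400,
    "modes/actionplanMode.py": 400,
    "modes/reactMode.py": 350,
    "shared/promptFactory.py": 300,
    "shared/executionState.py": 100,
}
current_lines = 2192  # handlingTasks.py today

total = sum(estimated_lines.values())
growth = total - current_lines
largest = max(estimated_lines.values())

print(total, growth, f"{growth / current_lines:.1%}", largest)
# → 2450 258 11.8% 400
```

The overall count grows by roughly 12%, while the largest single file shrinks from 2,192 lines to about 400.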
## Implementation Priority
1. **High Priority**: Extract common functionality (Phase 1)
2. **Medium Priority**: Create mode-specific implementations (Phase 2)
3. **Low Priority**: Create shared utilities (Phase 4)
4. **Final**: Simplify main coordinator (Phase 3)
This refactoring will make the codebase more maintainable, testable, and extensible while preserving all existing functionality.
## Key Changes Based on Review
### ✅ Prompt Elements as Reusable Functions
- **Before**: Duplicated prompt content in each mode
- **After**: Reusable functions like `getAvailableDocuments()`, `getWorkflowHistory()`, etc.
- **Benefit**: Single source of truth, easier maintenance
### ✅ Group Imports with `__init__.py`
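- **Pattern**: `import core`, then call `core.taskPlanner.xxx()`; this works only if `core/__init__.py` imports its submodules (for example `from . import taskPlanner`)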
- **Benefit**: Clear calling references, better maintainability
- **Files**: All modules have `__init__.py` for group imports
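For the `import core` pattern to resolve `core.taskPlanner`, the package's `__init__.py` has to import the submodule itself. The sketch below builds a throwaway `core/` package in a temporary directory to demonstrate this; the `describe()` function and the file contents are hypothetical placeholders, not the real module contents.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the proposed core/ package so the sketch runs anywhere.
root = Path(tempfile.mkdtemp())
pkg = root / "core"
pkg.mkdir()

# taskPlanner.py: a stub module with one hypothetical function.
(pkg / "taskPlanner.py").write_text(
    "def describe():\n    return 'task planning (common)'\n"
)

# __init__.py: without this explicit import, `import core` alone would not
# make `core.taskPlanner` available as an attribute.
(pkg / "__init__.py").write_text("from . import taskPlanner\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()
core = importlib.import_module("core")  # equivalent to `import core`
print(core.taskPlanner.describe())
```

In the real tree, each of `core/__init__.py`, `modes/__init__.py`, and `shared/__init__.py` would list its own submodules in the same way.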
### ✅ Generic MessageCreator
- **Before**: Mode-specific messages (`createReactActionMessage`)
- **After**: Generic messages for all workflow phases
- **Messages**: Task plan, task start/end, action start/end, retry, error
- **Benefit**: Consistent messaging across modes
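A minimal sketch of what a single generic action message could look like once `createReactActionMessage` is folded into `createActionMessage`. The `ActionOutcome` dataclass, the message dict keys, and the wording of the text are assumptions made for illustration; the real `ActionResult` and message format may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActionOutcome:
    """Hypothetical result shape; the real ActionResult fields may differ."""
    actionName: str
    success: bool
    error: Optional[str] = None


def createActionMessage(outcome: ActionOutcome, workflowMode: str) -> dict:
    # One code path for both modes: the mode is data on the message,
    # not a reason to branch into a second function.
    status = "success" if outcome.success else "failure"
    text = f"Action '{outcome.actionName}' finished with {status}"
    if outcome.error:
        text += f": {outcome.error}"
    return {"phase": "action_end", "mode": workflowMode, "status": status, "text": text}


ok = createActionMessage(ActionOutcome("extractText", True), "React")
failed = createActionMessage(ActionOutcome("sendMail", False, "timeout"), "Actionplan")
print(ok["text"])
print(failed["text"])
```

Both modes get identically structured messages, which is the consistency benefit named above.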
### ✅ Clean Function Names
- **Before**: `_validateTaskPlan()`, `_validateActions()`
- **After**: `validateTask()`, `validateAction()`
- **Benefit**: No underscores, singular forms, clearer naming
### ✅ No Unnecessary AICallManager
- **Removed**: AICallManager wrapper
- **Reason**: AI calls are already well parameterized through the centralized service
- **Benefit**: Avoid over-engineering, keep existing good patterns
### ✅ Enhanced PromptFactory
- **Focus**: Reusable prompt element functions
- **Functions**: `getAvailableDocuments()`, `getWorkflowHistory()`, `getAvailableMethods()`, etc.
- **Benefit**: Both modes use same prompt building logic
### ✅ Better Naming: workflowProcessor.py
- **Before**: `handlingTasks.py` (unclear role)
- **After**: `workflowProcessor.py` (clear processing role)
- **Class**: `HandlingTasks` → `WorkflowProcessor`
- **Benefit**: Clear relationship with `workflowManager.py`