frontend_nyla/docs/DASHBOARD_LOG_POLLING_DOCUMENTATION.md


# Dashboard Log Polling and Rendering Documentation
## Overview
This documentation explains the complete flow of how dashboard messages (logs with `operationId`) are polled, processed, sorted, and rendered in the workflow dashboard. The system uses a hierarchical tree structure to display operations and their progress, with real-time updates through polling.
## Architecture Flow
The system follows this flow:
1. **Polling Controller** (`workflowPollingController.js`) - Manages polling intervals and scheduling
2. **Data Layer** (`workflowData.js`) - Fetches data from API and routes logs to appropriate handlers
3. **Dashboard Processor** (`workflowUiRendererDashboard.js`) - Processes logs with `operationId` and builds hierarchical tree
4. **Dashboard Renderer** (`workflowUiRendererDashboard.js`) - Renders the hierarchical tree structure
## Key Files
- `workflowPollingController.js` - Centralized polling controller
- `workflowData.js` - API communication and data routing
- `workflowUiRendererDashboard.js` - Dashboard log processing and rendering
- `workflowCoordination.js` - State management coordination
## Implementation Details
### 1. Polling Mechanism
**File**: `frontend_agents/public/js/modules/workflowPollingController.js`
The polling controller uses a recursive `setTimeout` approach to create an infinite polling chain. This ensures continuous updates while preventing race conditions and rate limiting issues.
#### Configuration
- **Base interval**: 5 seconds (`baseInterval = 5000`)
- **Maximum interval**: 10 seconds (`maxInterval = 10000`)
- **Exponential backoff multiplier**: 1.5
- **Concurrency prevention**: Uses `isPollInProgress` flag to prevent multiple simultaneous polls
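
As a rough illustration of how these settings interact, the sketch below computes the delay before the next poll from the number of consecutive failures. The helper name `nextPollInterval` and the exact formula (base interval times 1.5 per failure, capped at the maximum) are assumptions; the source only states the three constants.

```javascript
// Constants taken from the configuration listed above.
const BASE_INTERVAL_MS = 5000;
const MAX_INTERVAL_MS = 10000;
const BACKOFF_MULTIPLIER = 1.5;

// Hypothetical helper: delay before the next poll after `failureCount`
// consecutive failed polls (0 means the previous poll succeeded).
function nextPollInterval(failureCount) {
  const raw = BASE_INTERVAL_MS * Math.pow(BACKOFF_MULTIPLIER, failureCount);
  return Math.min(Math.round(raw), MAX_INTERVAL_MS);
}
```

Under these assumptions the delay is 5000 ms with no failures, 7500 ms after one failure, and stays at 10000 ms from the second failure onward.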
#### Key Methods
**`startPolling(workflowId)`**
- Starts polling for a specific workflow
- Stops any existing polling before starting new one
- Sets `activeWorkflowId` and `isPolling` flag
- Executes immediate first poll (no delay)
- Validates workflow ID before starting
**`doPolling()`**
- Executes one poll cycle asynchronously
- Prevents concurrent execution using `isPollInProgress` flag
- Calls `pollWorkflowData()` from `workflowData.js`
- Handles errors and implements exponential backoff on failures
- Self-schedules next poll using recursive `setTimeout`
- Validates workflow is still valid before scheduling next poll
**`stopPolling()`**
- Stops all polling operations immediately
- Clears all scheduled timeouts
- Resets all state flags (`isPolling`, `isPollInProgress`, `activeWorkflowId`)
- Resets failure count
**`pausePolling()` / `resumePolling()`**
- Temporarily pauses polling (e.g., during user interactions)
- Resumes polling after pause
#### Polling Flow
```
startPolling(workflowId)
  → doPolling() [immediate first poll]
    → pollWorkflowData(workflowId) [async API call]
    → setTimeout(() => doPolling(), interval) [schedule next poll]
      → [recursive loop continues until stopped]
```
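
The flow above could be sketched roughly as follows. This is not the controller's actual code: `createPoller`, the injected `pollFn` and `timers` parameters, and the `generation` counter are hypothetical and exist only so the loop can be exercised without a network or real timers.

```javascript
// Minimal sketch of a self-scheduling poller (illustrative, not the real module).
function createPoller(
  pollFn,
  intervalMs = 5000,
  timers = { set: (fn, ms) => setTimeout(fn, ms), clear: (id) => clearTimeout(id) }
) {
  const state = { isPolling: false, isPollInProgress: false, activeWorkflowId: null };
  let timeoutId = null;
  let generation = 0; // bumped on every stop, so a stale poll cannot reschedule

  async function doPolling() {
    if (!state.isPolling || state.isPollInProgress) return; // no concurrent polls
    const myGeneration = generation;
    state.isPollInProgress = true;
    try {
      await pollFn(state.activeWorkflowId);
    } catch (err) {
      console.error('Poll failed:', err); // a real controller would also back off
    }
    if (myGeneration !== generation) return; // stopped or restarted meanwhile
    state.isPollInProgress = false;
    timeoutId = timers.set(doPolling, intervalMs); // schedule the next cycle
  }

  function stopPolling() {
    if (timeoutId !== null) timers.clear(timeoutId);
    timeoutId = null;
    generation += 1;
    Object.assign(state, { isPolling: false, isPollInProgress: false, activeWorkflowId: null });
  }

  function startPolling(workflowId) {
    if (!workflowId) return false; // validate the workflow ID
    stopPolling();                 // stop any existing chain first
    Object.assign(state, { isPolling: true, activeWorkflowId: workflowId });
    doPolling();                   // immediate first poll, no delay
    return true;
  }

  return { state, startPolling, stopPolling };
}
```

Because each cycle schedules the next one only after the current poll has settled, two polls never overlap, which is the main reason to prefer this pattern over `setInterval`.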
#### Error Handling
- **Rate limiting (429 errors)**: Increases backoff more aggressively, stops polling after 5 consecutive rate limit errors
- **Network errors**: Logged but don't immediately stop polling (allows retry)
- **Workflow validation**: Checks if workflow is still valid before each poll cycle
- **Poll failures**: Exponential backoff increases interval up to `maxInterval`
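
One possible way to keep track of these rules is a small counter update per failed poll, sketched below. The helper `recordPollFailure`, the `counters` shape, and reading the HTTP status from `error.status` are all assumptions for illustration; only the limit of 5 consecutive rate limit errors comes from the description above.

```javascript
const MAX_RATE_LIMIT_ERRORS = 5; // from the description above

// Hypothetical bookkeeping for one failed poll. Returns updated counters
// plus a flag telling the caller whether polling should stop.
function recordPollFailure(counters, error) {
  const isRateLimit = Boolean(error) && error.status === 429;
  const failureCount = counters.failureCount + 1;
  // "Consecutive" is modelled by resetting the counter on any other error.
  const rateLimitCount = isRateLimit ? counters.rateLimitCount + 1 : 0;
  return {
    failureCount,
    rateLimitCount,
    shouldStop: rateLimitCount >= MAX_RATE_LIMIT_ERRORS,
  };
}
```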
### 2. Data Fetching
**File**: `frontend_agents/public/js/modules/workflowData.js`
The `pollWorkflowData()` function orchestrates the data fetching process.
#### API Calls
The function makes two parallel API calls:
1. **`api.getWorkflow(workflowId)`** - Fetches workflow status and metadata
2. **`api.getWorkflowChatData(workflowId, afterTimestamp)`** - Fetches unified chat data (messages, logs, stats)
#### Incremental Polling
- **First poll**: `afterTimestamp = null` → Fetches ALL historical data
- **Subsequent polls**: `afterTimestamp = workflowState.lastRenderedTimestamp` → Fetches only new items since last render
- **Timestamp tracking**: Uses `createdAt` timestamp from each item to track what's been rendered
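
A minimal sketch of one fetch cycle, assuming the `api` object and the workflow state are passed in as parameters (the real function presumably reaches them through module imports). The helper name `fetchPollData` and the returned object shape are hypothetical.

```javascript
// Sketch: fetch workflow status and chat data in parallel.
// On the first poll `lastRenderedTimestamp` is null, so all history is fetched.
async function fetchPollData(api, workflowId, workflowState) {
  const afterTimestamp = workflowState.lastRenderedTimestamp ?? null;
  const [workflow, chatData] = await Promise.all([
    api.getWorkflow(workflowId),
    api.getWorkflowChatData(workflowId, afterTimestamp),
  ]);
  return { workflow, chatData, afterTimestamp };
}
```

`Promise.all` rejects as soon as either call fails, so in this sketch a failed status request also discards the chat data of that cycle; the next poll would fetch it again because the timestamp cursor has not moved.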
#### Data Processing
The `processUnifiedChatData()` function processes items in chronological order:
1. Routes each item based on `type` field:
   - `'message'` → `processUnifiedMessage()`
   - `'log'` → `processUnifiedLog()`
   - `'stat'` → `processUnifiedStat()`
2. Updates `lastRenderedTimestamp` after processing each item (ensures accurate incremental polling)
3. Processes items sequentially to maintain chronological order
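
The three steps above might look roughly like this. The function name `routeUnifiedItems` and the `handlers` and `state` parameters are hypothetical stand-ins; sorting by `createdAt` inside the function is a defensive assumption, since the source does not say whether the API already returns items in order.

```javascript
// Sketch: dispatch unified chat items to per-type handlers, oldest first.
function routeUnifiedItems(items, handlers, state) {
  const ordered = [...items].sort((a, b) => a.createdAt - b.createdAt);
  for (const entry of ordered) {
    const handler = handlers[entry.type];
    if (handler) handler(entry.item); // unknown types are skipped
    // Advance the cursor after every item, not once per batch.
    state.lastRenderedTimestamp = entry.createdAt;
  }
  return state.lastRenderedTimestamp;
}
```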
#### Workflow Status Updates
- Monitors workflow status changes
- Updates UI buttons and controls when status changes
- Handles special case: Ignores 'completed' status if workflow is in Round 2+ (prevents premature stopping)
#### Polling Continuation Logic
Polling continues based on workflow status:
- **'running'**: Continues polling
- **'completed'**: Continues polling temporarily to get final messages, then stops
- **'failed' / 'stopped'**: Stops polling immediately
- **Other statuses**: Stops polling
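
These rules, together with the Round 2+ special case described under Workflow Status Updates, can be condensed into a small decision helper. The name `pollingDecision`, its return values, and the reading of "Round 2+" as `currentRound >= 2` are assumptions made for this sketch.

```javascript
// Hypothetical helper. Returns:
//   'continue' - keep polling normally
//   'drain'    - poll a little longer for final messages, then stop
//   'stop'     - stop polling immediately
function pollingDecision(status, currentRound = 1) {
  if (status === 'running') return 'continue';
  if (status === 'completed') {
    // In round 2 or later a 'completed' status is ignored.
    return currentRound >= 2 ? 'continue' : 'drain';
  }
  return 'stop'; // 'failed', 'stopped', and any other status
}
```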
### 3. Log Routing
**File**: `frontend_agents/public/js/modules/workflowData.js` - `processUnifiedLog()`
Logs are routed to different rendering areas based on the presence of `operationId`:
#### Routing Logic
```javascript
if (log.operationId) {
  // Logs WITH operationId → Dashboard
  processDashboardLogs([frontendLog]);
} else {
  // Logs WITHOUT operationId → Unified Content Area
  WorkflowCoordination.addLogEntry(frontendLog.message, frontendLog.type, frontendLog);
}
```
#### Log Format Conversion
Backend `ChatLog` format is converted to frontend format:
```javascript
{
  id: log.id,
  message: log.message,
  type: log.type || 'info',
  timestamp: log.timestamp,
  status: log.status || 'running',
  progress: log.progress !== undefined && log.progress !== null ? log.progress : undefined,
  performance: log.performance,
  operationId: log.operationId || null,
  parentId: log.parentId || null
}
```
#### Key Points
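- **All logs are processed**: Routing never skips a log as a duplicate, because repeated logs for the same operation may carry progress updates; de-duplication by log ID happens later, in the dashboard processor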
- **Progress tracking**: Logs with `operationId` typically contain progress information
- **Hierarchical structure**: `parentId` field enables parent-child relationships between operations
### 4. Dashboard Log Processing
**File**: `frontend_agents/public/js/modules/workflowUiRendererDashboard.js` - `processDashboardLogs()`
This function processes logs with `operationId` and builds the hierarchical tree structure.
#### Processing Steps
1. **Group by operationId**
   - Creates or updates operation groups in `dashboardLogTree.operations` Map
   - Each operation stores logs in a Map keyed by `logId` (ensures uniqueness)
2. **Update operation metadata**
   - Updates `parentId` if not set yet (from first log entry)
   - Updates `latestProgress` when log contains progress value
   - Updates `latestStatus` when log contains status value
3. **Generate unique log IDs**
   - Uses provided `log.id` if available
   - Otherwise generates: `log_${Date.now()}_${Math.random().toString(36).substring(2, 9)}`
   - Ensures all progress updates are stored, even with same progress value
4. **Build root operations list**
   - Filters operations without `parentId`
   - Stores in `dashboardLogTree.rootOperations` array
5. **Trigger rendering**
   - Calls `renderDashboard()` after processing all logs
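
Steps 1 to 4 could be sketched as below. This is a simplified stand-in, not the module's code: `createDashboardLogTree` and `addDashboardLogs` are hypothetical names, the tree is passed in as a parameter instead of living in module state, the default `expanded: true` is an assumption, and the rendering call from step 5 is left out so the snippet runs on its own.

```javascript
function createDashboardLogTree() {
  return { operations: new Map(), rootOperations: [], logExpandedStates: new Map(), currentRound: null };
}

function addDashboardLogs(tree, logs) {
  for (const log of logs) {
    if (!log.operationId) continue; // only logs with an operationId belong here

    // Step 1: create or look up the operation group.
    let op = tree.operations.get(log.operationId);
    if (!op) {
      op = { logs: new Map(), parentId: null, expanded: true, latestProgress: null, latestStatus: null };
      tree.operations.set(log.operationId, op);
    }

    // Step 2: update operation metadata.
    if (op.parentId === null && log.parentId) op.parentId = log.parentId;
    if (log.progress !== undefined && log.progress !== null) op.latestProgress = log.progress;
    if (log.status) op.latestStatus = log.status;

    // Step 3: store under the provided ID, or generate one.
    const logId = log.id || `log_${Date.now()}_${Math.random().toString(36).substring(2, 9)}`;
    op.logs.set(logId, log);
  }

  // Step 4: operations without a parent are roots.
  tree.rootOperations = [...tree.operations.entries()]
    .filter(([, op]) => !op.parentId)
    .map(([opId]) => opId);
  return tree;
}
```

Because the inner collection is a `Map`, a log that arrives twice with the same `id` overwrites its earlier entry, while logs without an `id` always get a fresh key and are therefore always kept.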
#### Data Structure
```javascript
dashboardLogTree = {
  operations: Map<operationId, {
    logs: Map<logId, log>,          // All logs for this operation
    parentId: string | null,        // Parent operation ID (if nested)
    expanded: boolean,              // UI expanded/collapsed state
    latestProgress: number | null,  // Most recent progress value
    latestStatus: string | null     // Most recent status value
  }>,
  rootOperations: string[],               // Operation IDs without parent
  logExpandedStates: Map<logId, boolean>, // Individual log expanded states
  currentRound: number | null             // Current workflow round
}
```
#### Important Behaviors
- **All logs stored**: Every log with same `operationId` is stored (represents progress updates)
- **Latest values tracked**: `latestProgress` and `latestStatus` always reflect most recent state
- **Parent-child relationships**: Operations can nest via `parentId` field
### 5. Sorting
**File**: `frontend_agents/public/js/modules/workflowUiRendererDashboard.js`
Multiple sorting mechanisms ensure consistent display order:
#### Operation-Level Log Sorting
**Location**: `renderOperationNode()` function, lines 169-173
Logs within an operation are sorted by timestamp in ascending order:
```javascript
const logsArray = Array.from(operation.logs.values()).sort((a, b) => {
  const tsA = a.timestamp || 0;
  const tsB = b.timestamp || 0;
  return tsA - tsB; // Ascending order (oldest first)
});
```
**Purpose**: Ensures logs are displayed in chronological order within each operation.
#### Child Operations Sorting
**Location**: `getChildOperations()` function, line 453
Child operations are sorted alphabetically by `operationId`:
```javascript
return Array.from(dashboardLogTree.operations.entries())
  .filter(([opId, op]) => op.parentId === parentId)
  .map(([opId]) => opId)
  .sort(); // Alphabetical sort for consistent ordering
```
**Purpose**: Provides consistent, predictable ordering of sibling operations.
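
One consequence worth knowing: `Array.prototype.sort()` without a comparator compares elements as strings by UTF-16 code units, so IDs with numeric suffixes do not sort numerically. The snippet below shows this with made-up IDs and, as an alternative that the dashboard does not use according to the code above, a natural ordering via `localeCompare` with the `numeric` option.

```javascript
// Example IDs are invented for illustration.
const ids = ['op-10', 'op-2', 'op-1'];

// Default sort, as used by getChildOperations(): plain string comparison.
const lexicographic = [...ids].sort();
// → ['op-1', 'op-10', 'op-2']

// Alternative: natural ordering of embedded numbers.
const natural = [...ids].sort((a, b) =>
  a.localeCompare(b, undefined, { numeric: true })
);
// → ['op-1', 'op-2', 'op-10']
```

The default ordering is still deterministic, which is all the stated purpose requires; natural ordering only matters if operation IDs encode a sequence that users expect to read in order.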
#### Timeline Sorting (Unified Content)
**Location**: `workflowUiRenderer.js` - `renderUnifiedContent()` function
Logs without `operationId` are combined with messages and sorted by timestamp:
```javascript
timeline.sort((a, b) => a.timestamp - b.timestamp);
```
**Purpose**: Creates a unified chronological timeline of all non-dashboard content.
#### Sorting Summary
| Context | Sort Key | Order | Purpose |
|---------|----------|-------|---------|
| Logs within operation | `timestamp` | Ascending | Chronological display |
| Child operations | `operationId` | Alphabetical | Consistent ordering |
| Unified timeline | `timestamp` | Ascending | Chronological timeline |
### 6. Rendering
**File**: `frontend_agents/public/js/modules/workflowUiRendererDashboard.js` - `renderDashboard()`
The rendering system creates a hierarchical tree structure with collapsible nodes and progress indicators.
#### Hierarchical Structure
- **Root operations**: Operations without `parentId` are rendered first
- **Child operations**: Operations with `parentId` matching a parent's `operationId` are nested
- **Single line per operation**: Each operation shows ONE line that updates with latest status/progress
- **All logs represented**: All logs with same `operationId` are represented by this single updating line
#### Rendering Process
**Step 1: `renderDashboard()`**
- Builds HTML from `dashboardLogTree` structure
- Handles empty state (no operations)
- Sets up event handlers for collapse/expand functionality
**Step 2: `renderOperationNode(operationId, depth)`** (Recursive)
- Renders a single operation node
- Calculates indentation based on depth (8px per level)
- Determines if operation has child operations
- Gets latest log entry for operation name and type
- Calculates progress percentage (forces 100% when status is 'completed')
- Builds HTML for:
  - Expand/collapse button (if has children)
  - Operation icon (based on log type)
  - Operation name (from latest log message)
  - Status and progress percentage
  - Progress bar (if progress available)
- Recursively renders child operations if expanded
#### Visual Elements
**Operation Header**
- Expand/collapse button (chevron icon)
- Operation icon (info/success/error/warning)
- Operation name (from latest log message)
- Status badge (running/completed/failed/etc.)
- Progress percentage (if available)
**Progress Bar**
- Visual progress indicator
- Width based on progress percentage (0-100%)
- "completed" class when progress >= 100%
- Hidden if no progress value
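
The progress rules can be sketched as a small pure function. `progressPercent` is a hypothetical helper name; it assumes progress values in the range 0-1 as given in the data structure definitions, and clamping out-of-range values is an added safeguard rather than documented behavior.

```javascript
// Hypothetical helper: convert a 0-1 progress value into a whole percentage
// for the progress bar width. Returns null when the bar should be hidden.
function progressPercent(latestProgress, latestStatus) {
  if (latestStatus === 'completed') return 100; // forced to 100% on completion
  if (latestProgress === null || latestProgress === undefined) return null;
  const clamped = Math.min(Math.max(latestProgress, 0), 1);
  return Math.round(clamped * 100);
}
```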
**Indentation**
- Root level (depth 0): No indentation
- Child levels: Indented via parent container padding (8px per level)
- Creates visual hierarchy
#### State Management
**Expanded/Collapsed State**
- Stored in `operation.expanded` boolean
- Toggled via `toggleOperationExpanded(operationId)`
- Persists during re-renders
- Controls visibility of child operations container
**Event Handlers**
- `setupCollapseExpandHandlers()`: Sets up click handlers for expand buttons
- `setupLogCollapseExpandHandlers()`: Sets up handlers for log entry expansion
- Click handlers toggle expanded state and re-render dashboard
#### Rendering Flow
```
renderDashboard()
  [For each root operation]
    renderOperationNode(operationId, 0)
      [Build operation header HTML]
      [If has children and expanded]
        [For each child operation]
          renderOperationNode(childOperationId, depth)
            [Recursive rendering continues...]
  [Set innerHTML of dashboard container]
  [Setup event handlers]
```
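
A stripped-down version of the recursive step, producing an HTML string, might look like this. It is a simplified stand-in: the functions take the tree as a parameter, the class names and `data-depth` attribute are invented, passing `depth + 1` to children is an assumption, and icons, status badges, and progress bars are omitted.

```javascript
const INDENT_PX = 8; // padding per nesting level, as described above

function escapeHtml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

function getChildOperations(tree, parentId) {
  return [...tree.operations.entries()]
    .filter(([, op]) => op.parentId === parentId)
    .map(([opId]) => opId)
    .sort();
}

function renderOperationNode(tree, operationId, depth) {
  const op = tree.operations.get(operationId);
  if (!op) return '';

  // The latest log (by timestamp) supplies the operation name.
  const logs = [...op.logs.values()].sort((a, b) => (a.timestamp || 0) - (b.timestamp || 0));
  const latest = logs[logs.length - 1];
  const name = escapeHtml(latest ? latest.message : operationId);

  // Children are rendered only when the node is expanded; the wrapping
  // container carries the indentation for the whole subtree.
  const children = getChildOperations(tree, operationId);
  const childHtml = op.expanded && children.length > 0
    ? `<div class="op-children" style="padding-left:${INDENT_PX}px">` +
      children.map((id) => renderOperationNode(tree, id, depth + 1)).join('') +
      '</div>'
    : '';

  return `<div class="op" data-depth="${depth}"><span class="op-name">${name}</span>${childHtml}</div>`;
}
```

Escaping the log message matters here because the result is assigned through `innerHTML`; a message containing markup would otherwise be interpreted as HTML.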
#### Key Rendering Features
1. **Progress Updates**: Operation line updates in-place as new logs arrive
2. **Status Changes**: Status badge updates when operation status changes
3. **Collapsible Tree**: Users can expand/collapse operation groups
4. **Visual Hierarchy**: Indentation shows parent-child relationships
5. **Latest State**: Always shows most recent log message, progress, and status
## Data Structures
### Dashboard Log Tree
```javascript
{
  operations: Map<operationId, {
    logs: Map<logId, log>,          // All logs for this operation
    parentId: string | null,        // Parent operation ID
    expanded: boolean,              // UI expanded state
    latestProgress: number | null,  // Most recent progress (0-1)
    latestStatus: string | null     // Most recent status
  }>,
  rootOperations: string[],               // Operation IDs without parent
  logExpandedStates: Map<logId, boolean>, // Individual log expanded states
  currentRound: number | null             // Current workflow round
}
```
### Log Entry Format
```javascript
{
  id: string,                 // Unique log ID
  message: string,            // Log message text
  type: 'info' | 'success' | 'error' | 'warning',
  timestamp: number,          // Unix timestamp (seconds)
  status: string,             // Operation status
  progress: number | null,    // Progress value (0-1) or null
  operationId: string | null, // Operation ID (null = unified content)
  parentId: string | null     // Parent operation ID (for nesting)
}
```
### Unified Chat Data Item
```javascript
{
  type: 'message' | 'log' | 'stat', // Item type
  item: { /* message/log/stat data */ },
  createdAt: number                 // Timestamp for sorting
}
```
## Key Features
### 1. Incremental Polling
- Uses `lastRenderedTimestamp` to fetch only new items
- First poll loads all historical data (`afterTimestamp = null`)
- Subsequent polls fetch incrementally (`afterTimestamp = lastRenderedTimestamp`)
- Reduces API load and improves performance
### 2. Hierarchical Display
- Operations can have parent-child relationships via `parentId`
- Visual indentation shows hierarchy
- Collapsible tree structure for better UX
- Supports unlimited nesting depth
### 3. Progress Tracking
- Shows progress bars for operations with progress values
- Updates in real-time as new logs arrive
- Forces 100% progress when status is 'completed'
- Displays status badges (running/completed/failed/etc.)
### 4. Collapsible Tree
- Users can expand/collapse operation groups
- Expand/collapse state persists during re-renders
- Click handlers on operation headers and expand buttons
- Smooth visual transitions
### 5. Round Detection
- Tracks current workflow round in `dashboardLogTree.currentRound`
- Clears dashboard when round changes (via `updateProgressFromMessage()`)
- Prevents mixing data from different workflow rounds
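
In outline, round detection amounts to comparing the incoming round with the stored one and resetting the tree on a change. The helper `applyRound` below is hypothetical and only illustrates that idea; how `updateProgressFromMessage()` actually extracts the round from a message is not covered by the source.

```javascript
// Hypothetical helper: reset the tree when a different round starts.
function applyRound(tree, round) {
  const roundChanged = tree.currentRound !== null && round !== tree.currentRound;
  if (roundChanged) {
    tree.operations.clear();
    tree.rootOperations = [];
    tree.logExpandedStates.clear();
  }
  tree.currentRound = round;
  return roundChanged;
}
```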
### 6. Duplicate Prevention
- Uses Map with `logId` keys to prevent duplicate entries
- Same log ID updates in place rather than creating duplicates
- Ensures unique log entries even with same progress value
## Error Handling
### Rate Limiting (429 Errors)
- Detected in `pollWorkflowData()` and `doPolling()`
- Triggers exponential backoff with increased multiplier
- Stops polling after 5 consecutive rate limit errors
- Prevents API abuse
### Network Errors
- Logged but don't immediately stop polling
- Allows retry on transient network issues
- Controller handles backoff automatically
- Polling continues for recoverable errors
### Rendering Errors
- Don't stop polling (UI issue, not data issue)
- Logged for debugging
- Polling continues to get workflow status updates
- UI can recover on next successful render
### Workflow Validation
- `isWorkflowValid()` checks before each poll cycle
- Validates workflow state exists and matches active workflow
- Checks if polling is still enabled (`pollActive` flag)
- Stops polling if workflow is invalid
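
The checks listed above might be expressed as follows. The parameter list and the `workflowId` property on the state object are assumptions for illustration; only the `pollActive` flag is named in the source.

```javascript
// Sketch of the validation performed before each poll cycle.
function isWorkflowValid(workflowState, activeWorkflowId) {
  if (!workflowState || !activeWorkflowId) return false;          // state must exist
  if (workflowState.workflowId !== activeWorkflowId) return false; // must match active workflow
  return workflowState.pollActive === true;                        // polling must still be enabled
}
```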
## Performance Considerations
### Polling Intervals
- Base interval: 5 seconds (balanced between responsiveness and server load)
- Maximum interval: 10 seconds (prevents excessive backoff)
- Exponential backoff: Prevents overwhelming server during errors
### Data Processing
- Processes items sequentially to maintain chronological order
- Uses Maps for O(1) lookups when grouping operations
- Incremental polling reduces data transfer
- Timestamp-based filtering at API level
### Rendering Optimization
- Full re-render on each update (simplifies state management)
- Event handlers re-attached after each render
- HTML generation is efficient (string concatenation)
- Minimal DOM manipulation (innerHTML replacement)
## Usage Examples
### Starting Polling
```javascript
import pollingController from './workflowPollingController.js';
// Start polling for a workflow
pollingController.startPolling('workflow-123');
```
### Stopping Polling
```javascript
// Stop polling
pollingController.stopPolling();
```
### Processing Dashboard Logs
```javascript
import { processDashboardLogs } from './workflowUiRendererDashboard.js';
// Process logs with operationId
const logs = [
  {
    id: 'log-1',
    message: 'Processing file...',
    type: 'info',
    timestamp: 1234567890,
    status: 'running',
    progress: 0.5,
    operationId: 'op-123',
    parentId: null
  }
];
processDashboardLogs(logs);
```
### Clearing Dashboard
```javascript
import { clearDashboard } from './workflowUiRendererDashboard.js';
// Clear dashboard (e.g., on workflow reset)
clearDashboard(true); // true = reset round tracking
```
## Related Documentation
- `FRONTEND_ARCHITECTURE.md` - Overall frontend architecture
- `workflowCoordination.js` - State management coordination
- `workflowUiRenderer.js` - Unified content rendering
## Conclusion
The dashboard log polling and rendering system provides a hierarchical display of workflow operations with near-real-time updates through polling. It fetches data incrementally, sorts logs chronologically and sibling operations by `operationId`, and renders a collapsible tree structure for workflows with multiple nested operations.