# DB-backed Object Consistency Refactor

Date: 2025-10-16

Owner: Workflow/Data Layer

Goal: Eliminate divergence between in-memory objects and database records by centralizing create/update/delete (CUD) logic in the service/DB layer and prohibiting direct list mutations in workflow objects.

---

## Inputs / Constraints

- No caching subsystem: we use the in-memory workflow objects directly (no separate cache layer to reconcile).
- It must remain possible to create ChatDocuments that exist in memory only (transient) when needed; this capability must be preserved.

Implication: Service methods must support both DB-backed and in-memory-only flows, with clear, explicit steps for when to persist.

---

## 1) Affected Pydantic Models (with DB representation)

Source: `gateway/modules/interfaces/interfaceDbChatObjects.py`

- ChatWorkflow (table: workflows)
- ChatMessage (table: messages)
- ChatDocument (table: documents)
- ChatLog (table: logs)
- ChatStat (table: stats)

Notes:

- `ChatWorkflow` has references to: `messages`, `logs`, `stats` (loaded via read operations only)
- `ChatMessage` has references to: `documents`, `stats`

---

## 2) Where CUD is performed today (by module/function)

- Workflow (ChatWorkflow)
  - createWorkflow/updateWorkflow/deleteWorkflow: `interfaceDbChatObjects.ChatObjects`
  - updateWorkflowStats: `serviceWorkflow.WorkflowService`
- Message (ChatMessage)
  - createMessage/updateMessage/deleteMessage: `interfaceDbChatObjects.ChatObjects`
  - createMessage (write-through + local append attempt): `serviceWorkflow.WorkflowService.createMessage`
  - MessageCreator paths (indirect, see forbidden appends below): `workflows/processing/core/messageCreator.py`
  - workflowManager paths (indirect, see forbidden appends below): `workflows/workflowManager.py`
- Document (ChatDocument)
  - createDocument/getDocuments: `interfaceDbChatObjects.ChatObjects`
  - deleteFileFromMessage: `interfaceDbChatObjects.ChatObjects`
- Log (ChatLog)
  - createLog/getLogs: `interfaceDbChatObjects.ChatObjects`
- Stat (ChatStat)
  - getMessageStats/getWorkflowStats: `interfaceDbChatObjects.ChatObjects`
  - updateWorkflowStats: `serviceWorkflow.WorkflowService`

---

## 3) Forbidden in-memory `.append(...)` sites (must be removed or guarded)

- `gateway/modules/services/serviceWorkflow/mainServiceWorkflow.py`
  - createMessage: `workflow.messages.append(message)` (has an id-based guard; keep this as the only append site, so that all other code calls `self.services.workflow.createMessage(...)` instead)
  - (Two occurrences) `workflow.logs.append(log_entry)`
- `gateway/modules/workflows/processing/core/messageCreator.py`
  - Multiple occurrences of `workflow.messages.append(message)`
- `gateway/modules/workflows/workflowManager.py`
  - Multiple occurrences of `workflow.messages.append(message)` across handling steps
- `gateway/modules/workflows/processing/modes/modeReact.py`
  - `workflow.messages.append(message)`
- `gateway/modules/workflows/processing/modes/modeActionplan.py`
  - `workflow.messages.append(message)`

Action: Apart from the single guarded append in `createMessage`, replace all of the above with calls to the `serviceWorkflow.createMessage/updateMessage` methods. No other direct appends.

---

## 4) Cascading CUD that should NOT happen

Observed in: `interfaceDbChatObjects.ChatObjects.updateWorkflow` (lines updating `logs`, `messages`, `stats` inside workflow update).

Issue: The parent-level update attempts to write child tables ("cascade write") during a workflow update. This causes ordering issues and potential duplicates.

Action:

- Remove cascade CUD from `updateWorkflow`. A workflow update should only modify workflow fields.
- CUD for `messages`, `logs`, `stats`, `documents` must be done via their own dedicated service methods.
- Keep cascade on READ ONLY (e.g., `getWorkflow` loads logs/messages/stats/documents).
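
The "no cascade write" rule in section 4 could be enforced roughly as in the sketch below. This is a minimal illustration, not the actual `ChatObjects.updateWorkflow`: the dict-based `table`, the helper `splitWorkflowUpdate`, and the `CHILD_COLLECTIONS` constant are all hypothetical stand-ins. Raising on child collections is also an assumption; logging and ignoring them may suit the migration phase better.

```python
# Hypothetical sketch: a dict stands in for the `workflows` table.
CHILD_COLLECTIONS = {"messages", "logs", "stats", "documents"}


def splitWorkflowUpdate(updateData: dict) -> tuple[dict, dict]:
    """Separate workflow-level fields from child collections."""
    ownFields = {k: v for k, v in updateData.items() if k not in CHILD_COLLECTIONS}
    rejected = {k: v for k, v in updateData.items() if k in CHILD_COLLECTIONS}
    return ownFields, rejected


def updateWorkflow(table: dict, workflowId: str, updateData: dict) -> dict:
    """Update only the workflow row; never write child tables."""
    ownFields, rejected = splitWorkflowUpdate(updateData)
    if rejected:
        # Child objects must go through their own dedicated service methods.
        raise ValueError(f"cascade write not allowed for: {sorted(rejected)}")
    table[workflowId].update(ownFields)
    return table[workflowId]
```

With this shape, a caller that still passes `messages` inside a workflow update fails loudly instead of silently writing child rows.
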

---

## 5) Proposed Operating Model (write-through, DB as source of truth)

- For each DB-backed object (workflow, message, document, log, stat):
  1) All create/update/delete goes through the dedicated methods of `interfaceDbChatObjects.ChatObjects` or `serviceWorkflow.WorkflowService`.
  2) After DB success, update the in-memory workflow object (replace by id or remove by id). No direct `.append()` other than in the single service function immediately after the DB write (and guarded by an id check).
  3) All other modules must not mutate lists directly. Replace with service calls.
- Reads: May hydrate the workflow with referenced objects (messages/logs/stats/documents). No writes in read functions.

---

## 6) Tasklist: Interface changes in `interfaceDbChatObjects.py`

- [ ] Remove cascade writes in `updateWorkflow` for `logs`, `messages`, `stats`.
- [ ] Ensure `createMessage` returns the persisted `ChatMessage` and does not derive `sequenceNr` from the in-memory list; optionally compute it from the DB if required, or drop it if unused.
- [ ] Ensure `updateMessage` only updates the message table; related tables are updated only when explicitly requested via dedicated methods.
- [ ] Ensure `createDocument` and `deleteFileFromMessage` are the only paths to manage documents.
- [ ] Add helper `syncWorkflowInMemory(workflowId)` to refresh `services.currentWorkflow` from DB when needed.

---

## 7) Tasklist: Module/function updates

- `services/serviceWorkflow/mainServiceWorkflow.py`
  - [ ] createMessage: keep DB write-through; keep single guarded in-memory append (or drop append and rely on refresh); add optional `syncWorkflowInMemory` call.
  - [ ] Remove any direct `workflow.logs.append(...)` writes; replace with `createLog` and in-memory refresh.
- `workflows/processing/core/messageCreator.py`
  - [ ] Replace all `workflow.messages.append(message)` with `services.workflow.createMessage(...)` (or `updateMessage(...)` if appropriate).
- `workflows/workflowManager.py`
  - [ ] Replace all direct appends with service calls.
- `workflows/processing/modes/modeReact.py`
  - [ ] Replace direct append with service calls.
- `workflows/processing/modes/modeActionplan.py`
  - [ ] Replace direct append with service calls.

---

## 8) Diagnostics to add (temporary)

- Before any in-memory list mutation (temporary while refactoring):
  - Log caller, object type, id, len(list) before/after.
  - This will surface any remaining illegal appends.

---

## 10) Service-level Transactions (must update DB and in-memory references)

All transactions below are implemented at the service layer (not interface) and must perform both the database write and the in-memory reference updates to keep the live `workflow` object consistent.

### A) Store message with documents to workflow

Function: `services.workflow.storeMessageWithDocuments(workflow, messageData, documents)`

Required behavior:

1) Ensure ChatMessage exists in DB (create if not exists) with proper `workflowId` (source of truth id coming from parameter).
2) Persist each ChatDocument in DB referencing the created message (set `messageId`) and the same `workflowId` when applicable.
3) Rehydrate/construct a `ChatMessage` object that contains the newly created `ChatDocument` objects.
4) Update in-memory workflow object:
   - Append/replace the message in `services.currentWorkflow.messages` by id (replace if exists, else append).
   - Ensure the in-memory `ChatMessage.documents` points to the same list of in-memory `ChatDocument` instances.
5) Return the in-memory `ChatMessage` (with documents) as confirmation.

Notes:

- If `documents` include in-memory-only `ChatDocument` instances (no DB ids yet), step (2) must persist them and update their ids before binding them back to the message in-memory object.
- No other modules are allowed to mutate `workflow.messages` directly; this transaction is the single entry point for this behavior.
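
The five steps of transaction A might look roughly like the sketch below. It is an illustration under stated assumptions rather than the real service code: the dataclasses stand in for the Pydantic models, `FakeDb` and its `upsertMessage`/`createDocument` methods are hypothetical substitutes for `interfaceDbChatObjects.ChatObjects`, the `fileName` and `content` fields are invented for the example, and the extra `db` parameter replaces whatever handle the service layer actually holds.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChatDocument:
    fileName: str                      # hypothetical field
    id: Optional[str] = None           # None means in-memory only (transient)
    messageId: Optional[str] = None
    workflowId: Optional[str] = None


@dataclass
class ChatMessage:
    id: str
    workflowId: str
    content: str = ""                  # hypothetical field
    documents: list = field(default_factory=list)


@dataclass
class ChatWorkflow:
    id: str
    messages: list = field(default_factory=list)


class FakeDb:
    """Hypothetical stand-in for the DB interface; stores rows in dicts."""

    def __init__(self):
        self.messages = {}
        self.documents = {}

    def upsertMessage(self, data: dict) -> dict:
        self.messages[data["id"]] = dict(data)
        return self.messages[data["id"]]

    def createDocument(self, doc: ChatDocument) -> str:
        docId = doc.id or str(uuid.uuid4())
        self.documents[docId] = {"id": docId, "fileName": doc.fileName,
                                 "messageId": doc.messageId,
                                 "workflowId": doc.workflowId}
        return docId


def storeMessageWithDocuments(db, workflow, messageData, documents):
    # 1) Ensure the message exists in the DB; workflowId comes from the parameter.
    messageId = messageData.get("id") or str(uuid.uuid4())
    row = db.upsertMessage({**messageData, "id": messageId,
                            "workflowId": workflow.id})
    # 2) Persist each document; transient ones receive their id here.
    for doc in documents:
        doc.messageId = messageId
        doc.workflowId = workflow.id
        doc.id = db.createDocument(doc)
    # 3) Build the in-memory message around the same document instances.
    message = ChatMessage(id=row["id"], workflowId=row["workflowId"],
                          content=row.get("content", ""),
                          documents=list(documents))
    # 4) Replace by id if present, otherwise append (the single allowed append).
    for index, existing in enumerate(workflow.messages):
        if existing.id == message.id:
            workflow.messages[index] = message
            break
    else:
        workflow.messages.append(message)
    # 5) Return the in-memory message as confirmation.
    return message
```

Calling the function twice with the same message id leaves exactly one entry in `workflow.messages`, which is the property the id guard is meant to provide.
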

Future transactions (to be specified similarly):

- Update message with added/removed documents
- Remove document from message
- Delete message (+ cascade delete of its documents in DB) and remove from in-memory workflow

### B) Store log entry to workflow

Function: `services.workflow.storeLog(workflow, logData)`

Required behavior:

1) Persist `ChatLog` in DB with `workflowId`.
2) Update in-memory workflow object: append/replace the `ChatLog` in `services.currentWorkflow.logs` by id (if logs are held in-memory), or trigger a lightweight refresh.
3) Return the in-memory `ChatLog`.

Notes:

- No direct `workflow.logs.append(...)` outside this service method.

### C) Store stat entry to workflow (workflow-level stat)

Function: `services.workflow.storeWorkflowStat(workflow, statData)`

Required behavior:

1) Persist `ChatStat` in DB with `workflowId` (no `messageId`).
2) Update in-memory workflow object: set/replace workflow-level `stats` on `services.currentWorkflow` as applicable.
3) Return the in-memory `ChatStat`.

Notes:

- No direct in-memory mutation outside this service method.

### D) Store stat entry to message (message-level stat)

Function: `services.workflow.storeMessageStat(workflow, messageId, statData)`

Required behavior:

1) Persist `ChatStat` in DB with both `workflowId` and `messageId`.
2) Update in-memory workflow object: locate message by `messageId` in `services.currentWorkflow.messages` and set/replace its `stats`.
3) Return the in-memory `ChatStat`.

Notes:

- If the target message is not present in memory, optionally refresh the workflow messages or require caller to provide the message object; no direct appends outside service method.
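
As a rough illustration of transaction D, including the "message not in memory" case from its note, consider the sketch below. Several things here are assumptions: `stats` on a message is treated as a single object rather than a list, the `tokens` field and the dict-based `statTable` are invented for the example, and the optional `refresh` callback is a hypothetical hook standing in for something like `syncWorkflowInMemory`.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ChatStat:
    id: str
    workflowId: str
    messageId: Optional[str] = None
    tokens: int = 0                    # hypothetical field


@dataclass
class ChatMessage:
    id: str
    stats: Optional[ChatStat] = None   # assumed to hold a single stat object


@dataclass
class ChatWorkflow:
    id: str
    messages: list = field(default_factory=list)


def findMessage(workflow: ChatWorkflow, messageId: str) -> Optional[ChatMessage]:
    """Locate a message by id in the in-memory workflow, or return None."""
    return next((m for m in workflow.messages if m.id == messageId), None)


def storeMessageStat(statTable: dict, workflow: ChatWorkflow, messageId: str,
                     statData: dict,
                     refresh: Optional[Callable[[ChatWorkflow], None]] = None) -> ChatStat:
    # 1) Persist the stat with both workflowId and messageId.
    stat = ChatStat(id=statData.get("id") or str(uuid.uuid4()),
                    workflowId=workflow.id, messageId=messageId,
                    tokens=statData.get("tokens", 0))
    statTable[stat.id] = stat
    # 2) Locate the message in memory; refresh once if it is missing.
    target = findMessage(workflow, messageId)
    if target is None and refresh is not None:
        refresh(workflow)
        target = findMessage(workflow, messageId)
    if target is None:
        raise LookupError(f"message {messageId} not present in memory")
    target.stats = stat
    # 3) Return the in-memory stat.
    return stat
```

Whether a missing message should trigger a refresh or an error is left open in the text above; the sketch supports both so the choice stays with the caller.
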