# DB-backed Object Consistency Refactor
Date: 2025-10-16
Owner: Workflow/Data Layer
Goal: Eliminate divergence between in-memory objects and database records by centralizing create/update/delete (CUD) logic in service/DB layer, and prohibiting direct list mutations in workflow objects.
---
## Inputs / Constraints
- No caching subsystem: we use the in-memory workflow objects directly (no separate cache layer to reconcile).
- ChatDocuments must be able to be produced and exist in memory only (transient) when needed; this capability must be preserved.
Implication: Service methods must support both DB-backed and in-memory-only flows, with clear, explicit steps for when to persist.
---
## 1) Affected Pydantic Models (with DB representation)
Source: `gateway/modules/interfaces/interfaceDbChatObjects.py`
- ChatWorkflow (table: workflows)
- ChatMessage (table: messages)
- ChatDocument (table: documents)
- ChatLog (table: logs)
- ChatStat (table: stats)
Notes:
- `ChatWorkflow` has references to: `messages`, `logs`, `stats` (loaded via read operations only)
- `ChatMessage` has references to: `documents`, `stats`
---
## 2) Where CUD is performed today (by module/function)
- Workflow (ChatWorkflow)
  - createWorkflow/updateWorkflow/deleteWorkflow: `interfaceDbChatObjects.ChatObjects`
  - updateWorkflowStats: `serviceWorkflow.WorkflowService`
- Message (ChatMessage)
  - createMessage/updateMessage/deleteMessage: `interfaceDbChatObjects.ChatObjects`
  - createMessage (write-through + local append attempt): `serviceWorkflow.WorkflowService.createMessage`
  - MessageCreator paths (indirect, see forbidden appends below): `workflows/processing/core/messageCreator.py`
  - workflowManager paths (indirect, see forbidden appends below): `workflows/workflowManager.py`
- Document (ChatDocument)
  - createDocument/getDocuments: `interfaceDbChatObjects.ChatObjects`
  - deleteFileFromMessage: `interfaceDbChatObjects.ChatObjects`
- Log (ChatLog)
  - createLog/getLogs: `interfaceDbChatObjects.ChatObjects`
- Stat (ChatStat)
  - getMessageStats/getWorkflowStats: `interfaceDbChatObjects.ChatObjects`
  - updateWorkflowStats: `serviceWorkflow.WorkflowService`
---
## 3) Forbidden in-memory `.append(...)` sites (must be removed or guarded)
- `gateway/modules/services/serviceWorkflow/mainServiceWorkflow.py`
  - createMessage: `workflow.messages.append(message)` (already guarded by an id check; keep this as the only permitted append site, so that all other code calls `self.services.workflow.createMessage(...)` instead)
  - (Two occurrences) `workflow.logs.append(log_entry)`
- `gateway/modules/workflows/processing/core/messageCreator.py`
  - Multiple occurrences of `workflow.messages.append(message)`
- `gateway/modules/workflows/workflowManager.py`
  - Multiple occurrences of `workflow.messages.append(message)` across handling steps
- `gateway/modules/workflows/processing/modes/modeReact.py`
  - `workflow.messages.append(message)`
- `gateway/modules/workflows/processing/modes/modeActionplan.py`
  - `workflow.messages.append(message)`
Action: Except for the guarded append inside `createMessage`, replace all of the above with service calls: `serviceWorkflow.createMessage`/`updateMessage` for messages and `createLog` for logs. No other direct appends.
---
## 4) Cascading CUD that should NOT happen
Observed in: `interfaceDbChatObjects.ChatObjects.updateWorkflow` (lines updating `logs`, `messages`, `stats` inside workflow update).
Issue: Parent-level update attempts to write child tables (cascade write) during a workflow update. This causes ordering issues and potential duplicates.
Action:
- Remove cascade CUD from `updateWorkflow`. Workflow update should only modify workflow fields.
- CUD for `messages`, `logs`, `stats`, `documents` must be done via their own dedicated service methods.
- Keep cascade on READ ONLY (e.g., `getWorkflow` loads logs/messages/stats/documents).
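
A minimal sketch of a non-cascading `updateWorkflow`, assuming the child collections are named as in the list above. The dict-based `table`, the `splitWorkflowUpdate` helper, and the `name`/`status` fields are hypothetical stand-ins, not the real interface. Rejecting child fields outright is one possible policy; silently dropping them with a warning would be another.

```python
from typing import Any, Dict, List, Tuple

# Child collections that must never be written through a workflow update.
CHILD_FIELDS = frozenset({"messages", "logs", "stats", "documents"})


def splitWorkflowUpdate(update: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Separate workflow-owned fields from child collections in an update."""
    own = {key: value for key, value in update.items() if key not in CHILD_FIELDS}
    rejected = sorted(key for key in update if key in CHILD_FIELDS)
    return own, rejected


def updateWorkflow(
    table: Dict[str, Dict[str, Any]], workflowId: str, update: Dict[str, Any]
) -> Dict[str, Any]:
    """Update workflow fields only; `table` stands in for the workflows table."""
    own, rejected = splitWorkflowUpdate(update)
    if rejected:
        # Surface the caller that still relies on cascade writes.
        raise ValueError(f"child fields need their own service methods: {rejected}")
    table[workflowId].update(own)
    return table[workflowId]
```

Because the check runs before any write, a rejected update leaves the stored row unchanged.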
---
## 5) Proposed Operating Model (write-through, DB as source of truth)
- For each DB-backed object (workflow, message, document, log, stat):
  1) All create/update/delete goes through `interfaceDbChatObjects.ChatObjects` or `serviceWorkflow.WorkflowService` dedicated methods.
  2) After DB success, update the in-memory workflow object (replace by id or remove by id). No direct `.append()` other than in the single service function immediately after the DB write (and guarded by an id check).
  3) All other modules must not mutate lists directly. Replace with service calls.
- Reads: May hydrate workflow with referenced objects (messages/logs/stats/documents). No writes in read functions.
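
The write-through order above (DB first, then an id-guarded in-memory update) could look roughly like this. It is a sketch under assumptions: plain dataclasses stand in for the Pydantic models, `FakeDb` is a hypothetical in-memory substitute for the messages table, and `upsertById` is an illustrative helper name.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
import uuid


@dataclass
class ChatMessage:
    workflowId: str
    content: str
    id: Optional[str] = None


@dataclass
class ChatWorkflow:
    id: str
    messages: List[ChatMessage] = field(default_factory=list)


class FakeDb:
    """Hypothetical in-memory stand-in for the messages table."""

    def __init__(self) -> None:
        self.messages: Dict[str, dict] = {}

    def createMessage(self, data: dict) -> dict:
        # Assign an id if the caller did not supply one, then store the row.
        record = dict(data, id=data.get("id") or str(uuid.uuid4()))
        self.messages[record["id"]] = record
        return record


def upsertById(items: list, item) -> None:
    """Replace the element with the same id, or append if none exists."""
    for index, existing in enumerate(items):
        if existing.id == item.id:
            items[index] = item
            return
    items.append(item)


class WorkflowService:
    def __init__(self, db: FakeDb) -> None:
        self.db = db

    def createMessage(self, workflow: ChatWorkflow, data: dict) -> ChatMessage:
        # 1) DB write first; if it raises, the in-memory list is untouched.
        record = self.db.createMessage(dict(data, workflowId=workflow.id))
        message = ChatMessage(**record)
        # 2) Only after DB success, mirror the record into memory by id.
        upsertById(workflow.messages, message)
        return message
```

Calling `createMessage` twice with the same id leaves a single entry in `workflow.messages`, which is the duplicate case the id guard is meant to prevent.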
---
## 6) Tasklist: Interface changes in `interfaceDbChatObjects.py`
- [ ] Remove cascade writes in `updateWorkflow` for `logs`, `messages`, `stats`.
- [ ] Ensure `createMessage` returns the persisted `ChatMessage` and does not derive `sequenceNr` from the in-memory list; compute it from the DB if it is required, or drop it if unused.
- [ ] Ensure `updateMessage` updates only the message table; related tables are written only through their own dedicated methods.
- [ ] Ensure `createDocument` and `deleteFileFromMessage` are the only paths to manage documents.
- [ ] Add helper `syncWorkflowInMemory(workflowId)` to refresh `services.currentWorkflow` from DB when needed.
---
## 7) Tasklist: Module/function updates
- `services/serviceWorkflow/mainServiceWorkflow.py`
  - [ ] createMessage: keep DB write-through; keep single guarded in-memory append (or drop append and rely on refresh); add optional `syncWorkflowInMemory` call.
  - [ ] Remove any direct `workflow.logs.append(...)` writes; replace with `createLog` and in-memory refresh.
- `workflows/processing/core/messageCreator.py`
  - [ ] Replace all `workflow.messages.append(message)` with `services.workflow.createMessage(...)` (or `updateMessage(...)` if appropriate).
- `workflows/workflowManager.py`
  - [ ] Replace all direct appends with service calls.
- `workflows/processing/modes/modeReact.py`
  - [ ] Replace direct append with service calls.
- `workflows/processing/modes/modeActionplan.py`
  - [ ] Replace direct append with service calls.
---
## 8) Diagnostics to add (temporary)
- Before any in-memory list mutation (temporary while refactoring):
  - Log caller, object type, id, and `len(list)` before/after.
  - This will surface any remaining illegal appends.
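
One way to capture these diagnostics without touching every call site is a temporary `list` subclass that logs on `append`. The sketch below is an assumption, not existing project code: `TracedList` and the logger name are hypothetical, and only `append` is traced (`extend` and `insert` would need the same treatment). Pydantic validation may copy a list subclass into a plain `list`, so how the traced list gets installed on a live workflow object is left open here.

```python
import inspect
import logging

logger = logging.getLogger("workflow.diagnostics")  # hypothetical logger name


class TracedList(list):
    """Temporary list subclass that logs every append together with its caller."""

    def append(self, item):
        caller = inspect.stack()[1]  # frame of the code that called append()
        before = len(self)
        super().append(item)
        logger.warning(
            "in-memory append: caller=%s:%d %s() type=%s id=%s len=%d->%d",
            caller.filename, caller.lineno, caller.function,
            type(item).__name__, getattr(item, "id", None), before, len(self),
        )
```

Each log line names the calling function and file, so any remaining direct append outside the service layer shows up with its location.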
---
## 9) Service-level Transactions (must update DB and in-memory references)
All transactions below are implemented at the service layer (not interface) and must perform both the database write and the in-memory reference updates to keep the live `workflow` object consistent.
### A) Store message with documents to workflow
Function: `services.workflow.storeMessageWithDocuments(workflow, messageData, documents)`
Required behavior:
1) Ensure the ChatMessage exists in DB (create it if it does not) with the proper `workflowId`; the `workflow` parameter is the source of truth for this id.
2) Persist each ChatDocument in DB referencing the created message (set `messageId`) and the same `workflowId` when applicable.
3) Rehydrate/construct a `ChatMessage` object that contains the newly created `ChatDocument` objects.
4) Update in-memory workflow object:
   - Append/replace the message in `services.currentWorkflow.messages` by id (replace if exists, else append).
   - Ensure the in-memory `ChatMessage.documents` points to the same list of in-memory `ChatDocument` instances.
5) Return the in-memory `ChatMessage` (with documents) as confirmation.
Notes:
- If `documents` include in-memory-only `ChatDocument` instances (no DB ids yet), step (2) must persist them and update their ids before binding them back to the message in-memory object.
- No other modules are allowed to mutate `workflow.messages` directly; this transaction is the single entry point for this behavior.
Future transactions (to be specified similarly):
- Update message with added/removed documents
- Remove document from message
- Delete message (+ cascade delete of its documents in DB) and remove from in-memory workflow
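
The required behavior of transaction A can be sketched as follows. This is illustrative only: dataclasses stand in for the Pydantic models, `FakeDb` and its `newId` helper are hypothetical, the `content` and `fileName` fields are assumed, and the function takes the workflow as a parameter rather than going through `services.currentWorkflow`. A real implementation would presumably wrap steps 1 and 2 in a single DB transaction.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
import itertools


@dataclass
class ChatDocument:
    fileName: str
    id: Optional[str] = None  # None marks a transient, in-memory-only document
    messageId: Optional[str] = None
    workflowId: Optional[str] = None


@dataclass
class ChatMessage:
    id: str
    workflowId: str
    content: str = ""
    documents: List[ChatDocument] = field(default_factory=list)


@dataclass
class ChatWorkflow:
    id: str
    messages: List[ChatMessage] = field(default_factory=list)


class FakeDb:
    """Hypothetical stand-in for the messages and documents tables."""

    def __init__(self) -> None:
        self.messages: Dict[str, dict] = {}
        self.documents: Dict[str, dict] = {}
        self._ids = itertools.count(1)

    def newId(self, prefix: str) -> str:
        return f"{prefix}-{next(self._ids)}"


def storeMessageWithDocuments(db, workflow, messageData, documents):
    # 1) Ensure the message row exists; workflowId always comes from the parameter.
    messageId = messageData.get("id") or db.newId("msg")
    row = db.messages.setdefault(messageId, {"id": messageId})
    row.update(messageData, id=messageId, workflowId=workflow.id)
    # 2) Persist each document; transient ones receive their id here.
    for doc in documents:
        doc.id = doc.id or db.newId("doc")
        doc.messageId = messageId
        doc.workflowId = workflow.id
        db.documents[doc.id] = {
            "id": doc.id, "fileName": doc.fileName,
            "messageId": messageId, "workflowId": workflow.id,
        }
    # 3) Build the message around the same in-memory document instances.
    message = ChatMessage(
        id=messageId, workflowId=workflow.id,
        content=row.get("content", ""), documents=list(documents),
    )
    # 4) Replace by id if present, else append (the single permitted append).
    for index, existing in enumerate(workflow.messages):
        if existing.id == messageId:
            workflow.messages[index] = message
            break
    else:
        workflow.messages.append(message)
    # 5) Return the in-memory message as confirmation.
    return message
```

A transient `ChatDocument` passed in with `id=None` comes back with its id, `messageId`, and `workflowId` filled in, and the returned message holds that same instance rather than a copy.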
### B) Store log entry to workflow
Function: `services.workflow.storeLog(workflow, logData)`
Required behavior:
1) Persist `ChatLog` in DB with `workflowId`.
2) Update in-memory workflow object: append/replace the `ChatLog` in `services.currentWorkflow.logs` by id (if logs are held in-memory), or trigger a lightweight refresh.
3) Return the in-memory `ChatLog`.
Notes:
- No direct `workflow.logs.append(...)` outside this service method.
### C) Store stat entry to workflow (workflow-level stat)
Function: `services.workflow.storeWorkflowStat(workflow, statData)`
Required behavior:
1) Persist `ChatStat` in DB with `workflowId` (no `messageId`).
2) Update in-memory workflow object: set/replace workflow-level `stats` on `services.currentWorkflow` as applicable.
3) Return the in-memory `ChatStat`.
Notes:
- No direct in-memory mutation outside this service method.
### D) Store stat entry to message (message-level stat)
Function: `services.workflow.storeMessageStat(workflow, messageId, statData)`
Required behavior:
1) Persist `ChatStat` in DB with both `workflowId` and `messageId`.
2) Update in-memory workflow object: locate message by `messageId` in `services.currentWorkflow.messages` and set/replace its `stats`.
3) Return the in-memory `ChatStat`.
Notes:
- If the target message is not present in memory, optionally refresh the workflow messages or require caller to provide the message object; no direct appends outside service method.
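
A possible shape for transaction D, including the case where the target message is not in memory. Everything here is a sketch under assumptions: dataclasses replace the Pydantic models, `statsTable` is a plain dict standing in for the stats table, `loadMessages` is a hypothetical callback that re-reads the workflow's messages from the DB, and `stats` is treated as a single optional object per message.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional
import uuid


@dataclass
class ChatStat:
    id: str
    workflowId: str
    messageId: Optional[str]
    values: dict


@dataclass
class ChatMessage:
    id: str
    workflowId: str
    stats: Optional[ChatStat] = None


@dataclass
class ChatWorkflow:
    id: str
    messages: List[ChatMessage] = field(default_factory=list)


def findMessage(workflow: ChatWorkflow, messageId: str) -> Optional[ChatMessage]:
    """Return the in-memory message with this id, or None."""
    return next((m for m in workflow.messages if m.id == messageId), None)


def storeMessageStat(
    statsTable: Dict[str, ChatStat],
    workflow: ChatWorkflow,
    messageId: str,
    statData: dict,
    loadMessages: Optional[Callable[[str], List[ChatMessage]]] = None,
) -> ChatStat:
    # 1) Persist first, with both workflowId and messageId.
    stat = ChatStat(
        id=str(uuid.uuid4()), workflowId=workflow.id,
        messageId=messageId, values=dict(statData),
    )
    statsTable[stat.id] = stat
    # 2) Locate the message in memory; refresh once if it is missing.
    message = findMessage(workflow, messageId)
    if message is None and loadMessages is not None:
        workflow.messages = loadMessages(workflow.id)
        message = findMessage(workflow, messageId)
    if message is not None:
        message.stats = stat
    # 3) If the message is still absent, memory stays untouched and the DB
    #    remains the source of truth; the stat is returned either way.
    return stat
```

Returning the stat even when the message cannot be found keeps the DB write and the in-memory update decoupled; raising an error instead would be an equally defensible choice.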