From b7f8bfe3e43f39af8714b9a14bddd5130916df1a Mon Sep 17 00:00:00 2001
From: ValueOn AG
Date: Mon, 29 Dec 2025 23:54:13 +0100
Subject: [PATCH] new features

---
 ..._progress_reporting_generation_proposal.md |  319 ++++
 appdoc/doc_userauth_process_concept_review.md |  334 +++++
 .../doc_workflow_actions_rbac_concept_done.md | 1119 ++++++++++++++
 ...orkflow_method_refactoring_concept_done.md | 1313 +++++++++++++++++
 4 files changed, 3085 insertions(+)
 create mode 100644 appdoc/doc_progress_reporting_generation_proposal.md
 create mode 100644 appdoc/doc_userauth_process_concept_review.md
 create mode 100644 appdoc/doc_workflow_actions_rbac_concept_done.md
 create mode 100644 appdoc/doc_workflow_method_refactoring_concept_done.md

diff --git a/appdoc/doc_progress_reporting_generation_proposal.md b/appdoc/doc_progress_reporting_generation_proposal.md
new file mode 100644
index 0000000..e6e696c
--- /dev/null
+++ b/appdoc/doc_progress_reporting_generation_proposal.md
@@ -0,0 +1,319 @@
+# Progress Reporting for Document Generation - Analysis & Proposal
+
+## Current State Analysis
+
+### Existing Progress Reporting System
+
+**ProgressLogger** (`gateway/modules/shared/progressLogger.py`):
+- Centralized progress logging with hierarchical support
+- Methods: `startOperation()`, `updateOperation()`, `finishOperation()`
+- Supports parent-child relationships via `parentOperationId`
+- Creates `ChatLog` entries with progress (0.0-1.0), status, and hierarchical structure
+
+**ChatLog Model** (`gateway/modules/datamodels/datamodelChat.py`):
+- Fields: `progress` (0.0-1.0), `status`, `parentId`, `operationId`
+- Hierarchical display support via `parentId` (references parent's `operationId`)
+
+### Current Generation Progress Reporting
+
+**Chapter Structure Generation** (`subStructureGeneration.py`):
+- ✅ Has progress logging: `startOperation()` and `finishOperation()`
+- ❌ No progress updates during generation
+- ❌ No chapter-level granularity
+
+**Structure Filling** (`subStructureFilling.py`):
+- ✅ Overall operation: `startOperation()` and `finishOperation()` for "Chapter Content Generation"
+- ✅ Individual sections: `startOperation()` and `finishOperation()` for each section
+- ❌ **Missing**: Chapter-level progress operations
+- ❌ **Missing**: Section progress updates (only start/finish, no intermediate progress)
+- ❌ **Missing**: Overall progress calculation showing "Chapter X/Y, Section Z/W"
+
+**Content Generation** (`mainServiceGeneration.py`):
+- ✅ Has `contentProgressCallback` that reports section-level progress
+- ✅ Maps section progress to overall progress (30-90%)
+- ❌ But only at section level, not chapter level
+- ❌ No hierarchical display of chapters and sections
+
+## Gap Analysis
+
+### What Users Currently See:
+```
+[Progress] Chapter Content Generation - Filling (0%)
+[Progress] Section Generation - Section (0%)
+[Progress] Section Generation - Section (0%)
+[Progress] Section Generation - Section (0%)
+...
+[Progress] Chapter Content Generation - Filling (100%)
+```
+
+### What Users Should See:
+```
+[Progress] Chapter Content Generation - Filling (0%)
+  [Progress] Chapter 1/5: Introduction (0%)
+    [Progress] Section 1/3: Overview (0%)
+    [Progress] Section 1/3: Overview (50%) - Generating content...
+    [Progress] Section 1/3: Overview (100%)
+    [Progress] Section 2/3: Background (0%)
+    ...
+  [Progress] Chapter 1/5: Introduction (100%)
+  [Progress] Chapter 2/5: Analysis (0%)
+  ...
+```
+
+## Proposed Integration
+
+### 1. Chapter-Level Progress Operations
+
+**Location**: `subStructureFilling.py` → `_fillChapterSections()`
+
+**Changes**:
+- Create chapter-level operation IDs
+- Start chapter operation before processing sections
+- Update chapter progress as sections complete
+- Finish chapter operation when all sections done
+
+**Implementation**:
+```python
+# Before processing chapters
+totalChapters = sum(len(doc.get("chapters", [])) for doc in chapterStructure.get("documents", []))
+chapterIndex = 0
+
+for doc in chapterStructure.get("documents", []):
+    for chapter in doc.get("chapters", []):
+        chapterIndex += 1
+        chapterId = chapter.get("id", "unknown")
+        chapterTitle = chapter.get("title", "Untitled")
+
+        # Start chapter operation
+        chapterOperationId = f"{fillOperationId}_chapter_{chapterId}"
+        self.services.chat.progressLogStart(
+            chapterOperationId,
+            "Chapter Generation",
+            f"Chapter {chapterIndex}/{totalChapters}",
+            f"{chapterTitle}",
+            parentOperationId=fillOperationId
+        )
+
+        # Process sections within chapter
+        sections = chapter.get("sections", [])
+        totalSections = len(sections)
+
+        for sectionIndex, section in enumerate(sections):
+            # ... existing section processing ...
+
+            # Update chapter progress after each section
+            chapterProgress = (sectionIndex + 1) / totalSections if totalSections > 0 else 1.0
+            self.services.chat.progressLogUpdate(
+                chapterOperationId,
+                chapterProgress,
+                f"Section {sectionIndex + 1}/{totalSections} completed"
+            )
+
+        # Finish chapter operation
+        self.services.chat.progressLogFinish(chapterOperationId, True)
+```
+
+### 2. Section Progress Updates
+
+**Location**: `subStructureFilling.py` → `_fillChapterSections()`
+
+**Changes**:
+- Add progress updates during section processing
+- Report progress at key stages:
+  - 0.2: Building prompt
+  - 0.4: Calling AI
+  - 0.6: Processing response
+  - 0.8: Validating content
+  - 1.0: Complete
+
+**Implementation**:
+```python
+# Start section operation (existing)
+sectionOperationId = f"{fillOperationId}_section_{sectionId}"
+self.services.chat.progressLogStart(
+    sectionOperationId,
+    "Section Generation",
+    "Section",
+    f"Generating section {sectionId}",
+    parentOperationId=chapterOperationId  # Parent is chapter, not fillOperationId
+)
+
+try:
+    # Update: Building prompt
+    self.services.chat.progressLogUpdate(sectionOperationId, 0.2, "Building generation prompt")
+
+    generationPrompt = self._buildSectionGenerationPrompt(...)
+
+    # Update: Calling AI
+    self.services.chat.progressLogUpdate(sectionOperationId, 0.4, "Calling AI for content generation")
+
+    aiResponse = await self.aiService.callAi(...)
+
+    # Update: Processing response
+    self.services.chat.progressLogUpdate(sectionOperationId, 0.6, "Processing AI response")
+
+    # Parse and validate
+    generatedElements = json.loads(...)
+
+    # Update: Validating content
+    self.services.chat.progressLogUpdate(sectionOperationId, 0.8, "Validating generated content")
+
+    elements.extend(generatedElements)
+
+    # Finish section (existing)
+    self.services.chat.progressLogFinish(sectionOperationId, True)
+
+except Exception as e:
+    self.services.chat.progressLogFinish(sectionOperationId, False)
+    # ... error handling ...
+```
+
+### 3. Overall Progress Calculation
+
+**Location**: `subStructureFilling.py` → `_fillChapterSections()`
+
+**Changes**:
+- Calculate overall progress based on chapters and sections
+- Update parent operation (`fillOperationId`) with overall progress
+
+**Implementation**:
+```python
+# Calculate overall progress
+def calculateOverallProgress(chapterIndex, totalChapters, sectionIndex, totalSections):
+    """Calculate overall progress: 0.0 to 1.0"""
+    if totalChapters == 0:
+        return 1.0
+
+    # Progress from completed chapters
+    completedChaptersProgress = chapterIndex / totalChapters
+
+    # Progress from current chapter
+    currentChapterProgress = (sectionIndex / totalSections) / totalChapters if totalSections > 0 else 0
+
+    return completedChaptersProgress + currentChapterProgress
+
+# Update overall progress after each section
+overallProgress = calculateOverallProgress(
+    chapterIndex - 1,  # -1 because we're processing current chapter
+    totalChapters,
+    sectionIndex + 1,  # +1 because this section has just completed
+    totalSections
+)
+self.services.chat.progressLogUpdate(
+    fillOperationId,
+    overallProgress,
+    f"Chapter {chapterIndex}/{totalChapters}, Section {sectionIndex + 1}/{totalSections}"
+)
+```
+
+### 4. Chapter Structure Generation Progress
+
+**Location**: `subStructureFilling.py` → `_generateChapterSectionsStructure()`
+
+**Changes**:
+- Add progress reporting for chapter structure generation
+- Report progress per chapter
+
+**Implementation**:
+```python
+# Count total chapters
+totalChapters = sum(len(doc.get("chapters", [])) for doc in chapterStructure.get("documents", []))
+chapterIndex = 0
+
+for doc in chapterStructure.get("documents", []):
+    for chapter in doc.get("chapters", []):
+        chapterIndex += 1
+        chapterId = chapter.get("id", "unknown")
+        chapterTitle = chapter.get("title", "Untitled")
+
+        # Update progress
+        progress = chapterIndex / totalChapters if totalChapters > 0 else 1.0
+        self.services.chat.progressLogUpdate(
+            parentOperationId,  # Use parent operation (structure generation)
+            progress,
+            f"Generating sections for Chapter {chapterIndex}/{totalChapters}: {chapterTitle}"
+        )
+
+        # ... existing chapter structure generation ...
+```
+
+## Implementation Plan
+
+### Phase 1: Chapter-Level Progress (High Priority)
+1. Add chapter operation tracking in `_fillChapterSections()`
+2. Create chapter-level operations with proper parent hierarchy
+3. Update chapter progress as sections complete
+
+### Phase 2: Section Progress Updates (High Priority)
+1. Add progress updates during section processing
+2. Report progress at key stages (prompt building, AI call, response processing)
+3. Update section parent to be chapter operation (not fill operation)
+
+### Phase 3: Overall Progress Calculation (Medium Priority)
+1. Implement overall progress calculation function
+2. Update parent operation with overall progress
+3. Include chapter/section counts in status messages
+
+### Phase 4: Chapter Structure Generation Progress (Low Priority)
+1. Add progress updates during chapter structure generation
+2. Report progress per chapter
+
+## Benefits
+
+1. **User Visibility**: Users can see exactly which chapter and section is being processed
+2. **Better UX**: Hierarchical progress display shows document structure
+3. **Debugging**: Easier to identify where generation is stuck
+4. **Performance Monitoring**: Can track time per chapter/section
+5. **Consistency**: Uses existing ProgressLogger infrastructure
+
+## Example Progress Display
+
+**Before**:
+```
+[0%] Chapter Content Generation - Filling
+[0%] Section Generation - Section
+[100%] Section Generation - Section
+[0%] Section Generation - Section
+[100%] Section Generation - Section
+...
+[100%] Chapter Content Generation - Filling
+```
+
+**After**:
+```
+[0%] Chapter Content Generation - Filling
+  [0%] Chapter 1/5: Introduction
+    [0%] Section 1/3: Overview
+    [20%] Section 1/3: Overview - Building prompt
+    [40%] Section 1/3: Overview - Calling AI
+    [60%] Section 1/3: Overview - Processing response
+    [80%] Section 1/3: Overview - Validating content
+    [100%] Section 1/3: Overview
+  [33%] Chapter 1/5: Introduction - Section 1/3 completed
+    [0%] Section 2/3: Background
+    ...
+  [100%] Chapter 1/5: Introduction
+[20%] Chapter Content Generation - Chapter 1/5 completed
+  [0%] Chapter 2/5: Analysis
+  ...
+```
+
+## Files to Modify
+
+1. `gateway/modules/services/serviceAi/subStructureFilling.py`
+   - `_fillChapterSections()`: Add chapter and section progress
+   - `_generateChapterSectionsStructure()`: Add chapter structure progress
+
+2. No changes needed to:
+   - `progressLogger.py` (already supports hierarchy)
+   - `datamodelChat.py` (already supports parentId)
+   - `mainServiceChat.py` (wrapper methods already exist)
+
+## Testing Considerations
+
+1. Test with documents containing multiple chapters
+2. Test with chapters containing multiple sections
+3. Verify hierarchical display in frontend
+4. Test error handling (failed sections should not break progress)
+5. Test with single chapter/section documents
+
+
diff --git a/appdoc/doc_userauth_process_concept_review.md b/appdoc/doc_userauth_process_concept_review.md
new file mode 100644
index 0000000..a5617b2
--- /dev/null
+++ b/appdoc/doc_userauth_process_concept_review.md
@@ -0,0 +1,334 @@
+# Critical Review: User Authentication Process Concept
+
+## Issues Found
+
+### 1. **CRITICAL: Data Model Inconsistency**
+
+**Location**: Line 277 in concept document
+**Issue**: `resetTokenExpires` is shown as `Optional[str]` but should be `Optional[float]`
+```python
+# WRONG (line 277):
+resetTokenExpires: Optional[str] = Field(None, description="Reset token expiration (ISO datetime)")
+
+# CORRECT:
+resetTokenExpires: Optional[float] = Field(None, description="Reset token expiration (UTC timestamp in seconds)")
+```
+
+**Impact**: Implementation confusion, type errors
+
+### 2. **CRITICAL: Config Value Inconsistency**
+
+**Location**: Line 429 in concept document
+**Issue**: Config shows `48` hours but requirement is `24` hours
+```ini
+# WRONG (line 429):
+Auth_RESET_TOKEN_EXPIRY_HOURS = 48
+
+# CORRECT:
+Auth_RESET_TOKEN_EXPIRY_HOURS = 24
+```
+
+**Impact**: Wrong expiration time in implementation
+
+### 3. **SECURITY: Missing Rate Limiting**
+
+**Issue**: Password reset request endpoint needs stricter rate limiting than registration
+**Current**: Registration has `@limiter.limit("10/minute")`
+**Needed**: Password reset request should have stricter limits to prevent:
+- Email enumeration attacks
+- Spam/DoS attacks
+- Brute force attempts
+
+**Recommendation**:
+- Password reset request: `@limiter.limit("5/minute")` or `@limiter.limit("3/minute")`
+- Password reset (token usage): `@limiter.limit("10/minute")` (less critical, token is single-use)
+
+### 4. **SECURITY: Token Invalidation Strategy** ✅ CLARIFIED
+
+**Issue**: What happens when user requests password reset multiple times?
+**Current**: Concept doesn't specify behavior
+**Clarification**:
+- Token is stored in the `UserInDB` record (single field: `resetToken`)
+- When new reset token is generated, **old token is overwritten** in the database
+- **No risk**: Only one token can exist per user at any time (the last one generated)
+- The old token is automatically invalidated because it no longer exists in the database
+
+**Status**: ✅ No risk - token overwriting ensures only one valid token exists
+
+### 5. **SECURITY: Token Reuse Prevention** ✅ CLARIFIED
+
+**Issue**: What prevents token from being used multiple times?
+**Current**: Concept says "Tokens are single-use (cleared after password reset)" but doesn't specify validation
+**Clarification**:
+- Token is **cleared immediately after successful password reset** (atomic operation)
+- Once cleared, token no longer exists in database
+- **No risk**: If token is used again, validation will fail because token doesn't exist
+- Token reuse is prevented by the fact that token is removed from database after use
+
+**Status**: ✅ No risk - token clearing after use prevents reuse automatically
+
+### 6. **SECURITY: Email Case Sensitivity**
+
+**Issue**: Email uniqueness check - are emails case-sensitive?
+**Current**: Concept doesn't specify
+**Risk**:
+- `User@Example.com` and `user@example.com` could be treated as different
+- Email enumeration could reveal accounts
+
+**Recommendation**:
+- Normalize emails to lowercase before storage and comparison
+- Use `email.lower().strip()` in all email operations
+- Document this in the concept
+
+### 7. **SECURITY: Email Enumeration Prevention** ✅ APPROACH SELECTED
+
+**Issue**: Registration endpoint reveals if email exists
+**Current**: Registration rejects if email exists → reveals email is registered
+**Risk**: Attacker can enumerate registered emails
+
+**Selected Approach**:
+- ✅ **If email exists for LOCAL auth user**: Send password reset email to existing user, don't create duplicate account
+- ✅ **Return generic success message**: "Registration successful. Please check your email to set your password."
+- ✅ **Don't reveal**: Whether email already existed or new account was created
+- This prevents email enumeration while providing good UX
+
+**Status**: ✅ Approach selected - send reset email to existing user if email exists
+
+### 8. **EDGE CASE: Multiple Reset Requests** ✅ CLARIFIED
+
+**Issue**: User requests reset multiple times before using first token
+**Current**: Concept doesn't specify behavior
+**Clarification**:
+- Token is stored in `UserInDB` record (single field)
+- **Only one token can exist at a time** - the last one generated
+- When new reset request comes in, **old token is overwritten** in database
+- **Only the last token is valid** - previous tokens are automatically invalidated
+- No special handling needed - database structure ensures single token
+
+**Status**: ✅ No risk - database structure prevents multiple tokens
+
+### 9. **EDGE CASE: User State Changes During Reset**
+
+**Issue**: What happens if user state changes while reset token is active?
+**Scenarios**:
+- User is disabled (`enabled=False`) while reset token is active
+- User changes email while reset token is active
+- User account is deleted while reset token is active
+- User sets password via admin while reset token is active
+
+**Recommendation**:
+- On password reset, check user still exists and is enabled
+- If user is disabled, allow reset but keep `enabled=False` (admin must activate)
+- If email changed, token should still work (token is tied to user, not email)
+- If user deleted, token validation should fail (user not found)
+
+### 10. **EDGE CASE: Expired Token Cleanup** ✅ APPROACH SELECTED
+
+**Issue**: Expired tokens remain in database indefinitely
+**Current**: Concept doesn't mention cleanup
+**Risk**: Database bloat, potential security issues
+
+**Selected Approach**:
+- ✅ **Check token expiration during user authentication** (on each login attempt)
+- ✅ **Clear expired token if found** during authentication check
+- ✅ **Also check during token validation** (password reset endpoint) and clear if expired
+- ✅ **Simplest approach**: No separate cleanup job needed, tokens cleaned on access
+- This ensures expired tokens are cleaned automatically without additional jobs
+
+**Status**: ✅ Approach selected - check and clean expired tokens during auth/validation
+
+### 11. **IMPLEMENTATION: Cross-Mandate Search**
+
+**Issue**: How exactly to search across all mandates?
+**Current**: Concept says "use root interface" but doesn't specify implementation
+**Questions**:
+- Does `getRecordsetWithRBAC` work with root interface?
+- Should we bypass RBAC for this search?
+- How to ensure we search ALL mandates?
+
+**Recommendation**:
+- Use root interface with `mandateId=None` or empty filter
+- Use direct database query if RBAC filtering interferes
+- Document exact implementation approach
+
+### 12. **IMPLEMENTATION: Email Search Method**
+
+**Issue**: No existing method to find user by email across mandates
+**Current**: Concept mentions `findUserByEmailLocalAuth()` but doesn't specify how it works
+**Questions**:
+- Does database support case-insensitive email search?
+- Should we use exact match or LIKE?
+- How to handle NULL emails?
+
+**Recommendation**:
+- Implement method that:
+  1. Uses root interface
+  2. Searches with `recordFilter={"email": email.lower(), "authenticationAuthority": AuthAuthority.LOCAL}`
+  3. Returns first match or None
+  4. Handles NULL emails gracefully
+
+### 13. **USER EXPERIENCE: Email Delivery Failure** ✅ MESSAGE TO ADD
+
+**Issue**: What if user never receives email?
+**Current**: Concept doesn't address this
+**Scenarios**:
+- Email goes to spam
+- Email server is down
+- User enters wrong email
+- Email delivery is delayed
+
+**Selected Approach**:
+- ✅ **Show message about spam folder**: "If you don't receive an email within 5 minutes, please check your spam folder"
+- ✅ Display this message after registration and password reset request
+- Future enhancement: Consider "Resend email" functionality (with rate limiting)
+
+**Status**: ✅ Message to add - show spam folder reminder
+
+### 14. **USER EXPERIENCE: Password Reset After Registration** ✅ NO SPECIAL HANDLING
+
+**Issue**: User registers, sets password, then immediately forgets it
+**Current**: User must use password reset flow
+**Decision**:
+- ✅ **No special handling required** - use standard password reset flow
+- ✅ User can request password reset like any other user
+- ✅ Same token expiration (24 hours) applies to all reset tokens
+
+**Status**: ✅ No special handling needed
+
+### 15. **MISSING: Token Validation Order**
+
+**Issue**: Order of validation checks not specified
+**Current**: Concept lists validations but not order
+**Risk**: Inefficient validation, potential security issues
+
+**Recommendation**:
+- Validate in this order:
+  1. Token format (UUID format check)
+  2. Token exists in database
+  3. User exists and is not deleted
+  4. Token matches user's resetToken
+  5. Token not expired (compare timestamps)
+  6. Password strength
+  7. Set password and clear token (atomic operation)
+
+### 16. **MISSING: Atomic Operations** ✅ CONFIRMED
+
+**Issue**: Password reset involves multiple database operations
+**Current**: Concept doesn't specify transaction handling
+**Risk**:
+- Partial updates if operation fails
+- Token cleared but password not set
+- Race conditions
+
+**Confirmation**:
+- ✅ **Use database transaction** for password reset operation
+- ✅ **Atomic operation** ensures all-or-nothing:
+  - Setting passwordHash
+  - Clearing resetToken and resetTokenExpires
+  - Setting enabled=True
+- ✅ All operations succeed or all fail together
+
+**Status**: ✅ Confirmed - atomic operations required
+
+### 17. **MISSING: Audit Logging**
+
+**Issue**: No mention of audit logging for security events
+**Current**: Concept doesn't specify logging
+**Risk**: No audit trail for security incidents
+
+**Recommendation**:
+- Log all password reset requests (with email hash for privacy)
+- Log successful password resets
+- Log failed token validations
+- Log token generation events
+
+### 18. **MISSING: Error Messages**
+
+**Issue**: Error messages could reveal too much information
+**Current**: Some error messages are specified, but not all
+**Risk**: Information leakage
+
+**Recommendation**:
+- Use generic error messages:
+  - "Invalid or expired reset token" (don't specify which)
+  - "Registration failed" (don't specify why)
+  - "Password reset request processed" (always same message)
+- Log detailed errors server-side only
+
+### 19. **MISSING: Frontend Token Validation**
+
+**Issue**: Frontend should validate token format before API call
+**Current**: Concept doesn't specify frontend validation
+**Risk**: Unnecessary API calls, poor UX
+
+**Recommendation**:
+- Validate UUID format in frontend before calling API
+- Show error immediately if token format is invalid
+- Extract token from URL and validate on page load
+
+### 20. **MISSING: Magic Link Base URL** ✅ LOCATION SPECIFIED
+
+**Issue**: Magic link generation needs frontend base URL
+**Current**: Concept mentions "Should use frontend base URL from config" but doesn't specify
+**Selected Approach**:
+- ✅ **Add frontend URL to frontend env files**: `frontend_agents/public/config/env_*.env`
+- ✅ **Files to update**:
+  - `env_int.env` - Integration environment URL
+  - `env_prod.env` - Production environment URL
+  - (Add `env_dev.env` if exists)
+- ✅ **Variable name**: `APP_FRONTEND_URL` or `Frontend_BASE_URL`
+- ✅ **Usage**: Backend reads from frontend config or gets from request origin
+- ✅ **Environment-specific**: Each env file has its own URL
+
+**Status**: ✅ Location specified - add to frontend env files
+
+## Recommendations Summary
+
+### High Priority (Security & Correctness)
+1. ✅ Fix data model inconsistency (resetTokenExpires type)
+2. ✅ Fix config value (24 hours, not 48)
+3. ✅ Add rate limiting to password reset endpoints
+4. ✅ Implement token invalidation on new reset request
+5. ✅ Normalize emails to lowercase
+6. ✅ Prevent token reuse (clear immediately after use)
+7. ✅ Add atomic transaction for password reset
+
+### Medium Priority (Edge Cases)
+8. ✅ Handle user state changes during reset
+9. ✅ Implement expired token cleanup
+10. ✅ Add email delivery failure handling
+11. ✅ Specify cross-mandate search implementation
+12. ✅ Add audit logging
+
+### Low Priority (UX & Polish)
+13. ✅ Add frontend token validation
+14. ✅ Specify magic link base URL configuration
+15. ✅ Improve error messages (generic, secure)
+16. ✅ Add resend email functionality (future enhancement)
+
+## Security Checklist
+
+- [ ] Rate limiting on all endpoints
+- [ ] Token invalidation strategy implemented
+- [ ] Token reuse prevention
+- [ ] Email normalization (lowercase)
+- [ ] Email enumeration prevention
+- [ ] Atomic database operations
+- [ ] Audit logging
+- [ ] Generic error messages
+- [ ] Token expiration validation
+- [ ] User state validation during reset
+- [ ] Cross-mandate search security (RBAC bypass only for root operations)
+
+## Testing Checklist (Additional)
+
+- [ ] Multiple reset requests (old token invalidated)
+- [ ] Token reuse attempt (should fail)
+- [ ] Expired token usage (should fail)
+- [ ] User disabled during reset (should handle gracefully)
+- [ ] Email case variations (User@Example.com vs user@example.com)
+- [ ] Concurrent reset requests (race condition handling)
+- [ ] Email delivery failure scenarios
+- [ ] Invalid token format handling
+- [ ] Cross-mandate email search accuracy
+- [ ] Atomic operation rollback on failure
diff --git a/appdoc/doc_workflow_actions_rbac_concept_done.md b/appdoc/doc_workflow_actions_rbac_concept_done.md
new file mode 100644
index 0000000..7ea4c0d
--- /dev/null
+++ b/appdoc/doc_workflow_actions_rbac_concept_done.md
@@ -0,0 +1,1119 @@
+# Workflow Actions RBAC Integration Concept
+
+## Overview
+
+This document describes the concept for restructuring the workflow actions in order to:
+1. Enable **RBAC integration** (protecting actions via the RESOURCE context)
+2. Use **structured parameter definitions** instead of docstrings
+3. Define **UI rendering types** for parameters
+4. Have **no duplication** of parameter definitions
+5. Retain **plug-and-play** functionality
+
+**IMPORTANT**:
+- All actions MUST be defined in the `_actions` dictionary
+- No backward compatibility - actions without an `_actions` definition are not available
+- The RBAC service is REQUIRED (no fallback without RBAC)
+
+## Architecture Concept
+
+### Core Principle: Declarative Action Definition with the Refactored Structure
+
+After the refactoring, actions are organized in separate files inside `actions/` folders. The action definitions are declared in the main class and reference the execute functions from the separate action files.
+
+**New folder structure** (after refactoring):
+```
+methodOutlook/
+├── __init__.py
+├── methodOutlook.py (main class with action definitions)
+├── helpers/
+│   ├── connection.py
+│   ├── emailProcessing.py
+│   └── folderManagement.py
+└── actions/
+    ├── readEmails.py (execute function)
+    ├── searchEmails.py
+    └── ...
+```
+
+**Action definition in the main class** (`methodOutlook/methodOutlook.py`):
+```python
+from modules.datamodels.datamodelWorkflowActions import WorkflowActionDefinition, WorkflowActionParameter
+from modules.shared.frontendTypes import FrontendType
+from .actions.readEmails import readEmails  # Import the execute function
+
+class MethodOutlook(MethodBase):
+    def __init__(self, services):
+        super().__init__(services)
+        self.name = "outlook"
+        self.description = "Handle Microsoft Outlook email operations"
+
+        # Initialize helper modules
+        self.connection = ConnectionHelper(self)
+        self.emailProcessing = EmailProcessingHelper(self)
+        self.folderManagement = FolderManagementHelper(self)
+
+        # Actions are defined declaratively
+        # Execute functions are imported from separate action files
+        self._actions = {
+            "readEmails": WorkflowActionDefinition(
+                actionId="outlook.readEmails",  # For RBAC: RESOURCE context
+                description="Read emails and metadata from a mailbox folder",
+                parameters={
+                    "connectionReference": WorkflowActionParameter(
+                        name="connectionReference",
+                        type="str",
+                        frontendType=FrontendType.USER_CONNECTION,
+                        required=True,
+                        description="Microsoft connection label"
+                    ),
+                    "folder": WorkflowActionParameter(
+                        name="folder",
+                        type="str",
+                        frontendType=FrontendType.SELECT,
+                        frontendOptions="outlook.folder",
+                        required=False,
+                        default="Inbox",
+                        description="Folder to read from"
+                    ),
+                    "limit": WorkflowActionParameter(
+                        name="limit",
+                        type="int",
+                        frontendType=FrontendType.NUMBER,
+                        required=False,
+                        default=10,
+                        description="Maximum items to return",
+                        validation={"min": 1, "max": 1000}
+                    ),
+                    "filter": WorkflowActionParameter(
+                        name="filter",
+                        type="str",
+                        frontendType=FrontendType.TEXT,
+                        required=False,
+                        description="Sender, query operators, or subject text"
+                    ),
+                    "outputMimeType": WorkflowActionParameter(
+                        name="outputMimeType",
+                        type="str",
+                        frontendType=FrontendType.SELECT,
+                        frontendOptions=["application/json", "text/plain", "text/csv"],
+                        required=False,
+                        default="application/json",
+                        description="MIME type for output file"
+                    )
+                },
+                execute=readEmails.__get__(self, self.__class__)  # Reference to the execute function
+            ),
+            "searchEmails": WorkflowActionDefinition(
+                actionId="outlook.searchEmails",
+                description="Search emails by query and return matching items",
+                parameters={...},
+                execute=searchEmails.__get__(self, self.__class__)
+            )
+        }
+
+        # Register actions as methods (optional, for direct access)
+        # MethodBase loads actions primarily from the _actions dictionary
+        self.readEmails = readEmails.__get__(self, self.__class__)
+        self.searchEmails = searchEmails.__get__(self, self.__class__)
+```
+
+**Execute function in a separate file** (`methodOutlook/actions/readEmails.py`):
+```python
+from modules.workflows.methods.methodBase import action
+from modules.datamodels.datamodelChat import ActionResult
+
+@action
+async def readEmails(self, parameters: Dict[str, Any]) -> ActionResult:
+    """
+    Execute function - the parameter definition now lives in WorkflowActionDefinition.
+    This function contains only the implementation.
+    """
+    # Implementation stays the same...
+    connectionReference = parameters.get("connectionReference")
+    folder = parameters.get("folder", "Inbox")
+    # ... rest of implementation
+```
+
+## Global Frontend Type Definition
+
+**IMPORTANT**: Frontend types are defined centrally in `modules/shared/frontendTypes.py`, not redundantly per action.
+
+The global `FrontendType` enum contains:
+- **Standard types**: `text`, `textarea`, `number`, `select`, `multiselect`, `checkbox`, `date`, `datetime`, `email`, `timestamp`, `json`, `multilingual`, `file`
+- **Custom types for actions**: `userConnection`, `documentReference`, `workflowAction`
+
+Custom types support dynamic option lists via API endpoints:
+- `userConnection` → `/api/options/user.connection` (connections of the current user)
+- `documentReference` → `/api/options/workflow.documentReference` (document references from the workflow context)
+- `workflowAction` → `/api/options/workflow.action` (available actions from the workflow context)
+
+## Data Models
+
+### WorkflowActionParameter
+
+**IMPORTANT**:
+- Frontend types are defined globally in `modules/shared/frontendTypes.py` and not redundantly in actions
+- This class is named `WorkflowActionParameter` (not `ActionParameter`) to avoid conflicts with `ActionParameters` from `datamodelChat.py`
+
+```python
+from typing import Optional, Any, Union, List, Dict
+from pydantic import BaseModel, Field
+from modules.shared.frontendTypes import FrontendType  # Global definition
+
+class WorkflowActionParameter(BaseModel):
+    """
+    Parameter schema definition for a workflow action.
+
+    This defines the structure and UI rendering for a single action parameter,
+    NOT the actual parameter values (those are in ActionDefinition.parameters).
+    """
+    name: str = Field(description="Parameter name")
+    type: str = Field(description="Python type as string (e.g., 'str', 'int', 'bool', 'List[str]')")
+    frontendType: FrontendType = Field(description="UI rendering type (from global FrontendType enum)")
+    frontendOptions: Optional[Union[str, List[Dict[str, Any]]]] = Field(
+        None,
+        description="Options for select/multiselect/custom types. String reference (e.g., 'user.connection') or static list. For custom types like userConnection, this is automatically set to the API endpoint."
+    )
+    required: bool = Field(False, description="Whether parameter is required")
+    default: Optional[Any] = Field(None, description="Default value")
+    description: str = Field("", description="Parameter description")
+    validation: Optional[Dict[str, Any]] = Field(
+        None,
+        description="Validation rules (e.g., {'min': 1, 'max': 100})"
+    )
+```
+
+**Custom Frontend Types**:
+- `FrontendType.USER_CONNECTION`: User connection selector - dynamic options from `/api/options/user.connection`
+- `FrontendType.DOCUMENT_REFERENCE`: Document reference selector - dynamic options from the workflow context
+- `FrontendType.WORKFLOW_ACTION`: Workflow action selector - dynamic options from the available actions
+
+For custom types, `frontendOptions` is automatically set to the corresponding API endpoint (e.g. `"user.connection"`).
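Editor's sketch (not part of the patch): one way the automatic `frontendOptions` assignment for custom types could look. It uses only the standard library, redeclares a cut-down `FrontendType` as a stand-in for the project's enum, and the names `CUSTOM_TYPE_OPTIONS`, `resolveFrontendOptions`, and `optionsEndpoint` are hypothetical. It also assumes that a custom type always overrides an explicitly passed value, which the concept does not state.

```python
from enum import Enum
from typing import Any, Dict, Optional


class FrontendType(str, Enum):
    """Stand-in for the project's global enum; only a few members are shown."""
    TEXT = "text"
    SELECT = "select"
    USER_CONNECTION = "userConnection"
    DOCUMENT_REFERENCE = "documentReference"
    WORKFLOW_ACTION = "workflowAction"


# Option references per custom type, taken from the endpoint list in the concept.
CUSTOM_TYPE_OPTIONS: Dict[FrontendType, str] = {
    FrontendType.USER_CONNECTION: "user.connection",
    FrontendType.DOCUMENT_REFERENCE: "workflow.documentReference",
    FrontendType.WORKFLOW_ACTION: "workflow.action",
}


def resolveFrontendOptions(frontendType: FrontendType, frontendOptions: Optional[Any] = None) -> Optional[Any]:
    """Hypothetical helper: custom types get their option reference automatically."""
    if frontendType in CUSTOM_TYPE_OPTIONS:
        # Assumption: the custom-type reference wins over any explicit value.
        return CUSTOM_TYPE_OPTIONS[frontendType]
    # Standard types keep whatever the action definition supplied (may be None).
    return frontendOptions


def optionsEndpoint(reference: str) -> str:
    """Build the options URL for a string reference such as 'user.connection'."""
    return f"/api/options/{reference}"
```

With this shape, `connectionReference` in the `readEmails` definition would need no `frontendOptions` at all, while `folder` keeps its explicit `"outlook.folder"` reference.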
+ +### WorkflowActionDefinition + +**IMPORTANT**: This class is named `WorkflowActionDefinition` (not `ActionDefinition`) to avoid conflicts with the existing `ActionDefinition` from `datamodelWorkflow.py`: +- **Existing `ActionDefinition`**: for workflow execution planning (holds concrete values: `action`, `actionObjective`, `parameters` with values) +- **New `WorkflowActionDefinition`**: for action schema definitions (holds metadata: `actionId`, `description`, `parameters` as schemas) + +```python +from typing import Any, Dict, Callable, Awaitable, Optional, List +from pydantic import BaseModel, Field +from modules.datamodels.datamodelChat import ActionResult + +class WorkflowActionDefinition(BaseModel): + """ + Complete schema definition of a workflow action. + + This defines the metadata, parameters, and execution function for an action. + This is different from datamodelWorkflow.ActionDefinition which contains + actual execution values (action, actionObjective, parameters with values). + + This class defines the ACTION SCHEMA, not the execution plan. + """ + actionId: str = Field( + description="Unique action identifier for RBAC (format: 'module.actionName', e.g., 'outlook.readEmails')" + ) + description: str = Field(description="Action description") + parameters: Dict[str, WorkflowActionParameter] = Field( + default_factory=dict, + description="Parameter schema definitions" + ) + execute: Optional[Callable[[Dict[str, Any]], Awaitable[ActionResult]]] = Field( + None, + description="Execution function - async function that takes parameters dict and returns ActionResult. Set dynamically."
+ ) + category: Optional[str] = Field(None, description="Action category for grouping") + tags: List[str] = Field(default_factory=list, description="Tags for search/filtering") +``` + +## Integration mit Refactored Structure + +### Kompatibilität mit Folder-basierter Struktur + +Nach dem Refactoring sind alle Methods in Folder-Strukturen organisiert: +- Actions in `actions/` Unterordnern +- Helper-Funktionen in `helpers/` Unterordnern +- Hauptklasse minimal gehalten + +**RBAC-Integration funktioniert nahtlos mit dieser Struktur**: + +1. **Action-Definitionen** werden in der **Hauptklasse** (`methodOutlook.py`) im `_actions` Dictionary definiert +2. **Execute-Funktionen** bleiben in **separaten Action-Dateien** (`actions/readEmails.py`) +3. **Helper-Klassen** werden in der Hauptklasse initialisiert und von Actions verwendet + +**Vorteile**: +- Zentrale Verwaltung aller Action-Definitionen (inkl. RBAC-IDs) +- Actions bleiben modular und testbar +- Helper-Funktionen bleiben wiederverwendbar +- Einfache Migration: Schrittweise `_actions` Dictionary hinzufügen + +### Beispiel: Vollständige Integration + +**Struktur**: +``` +methodOutlook/ +├── __init__.py +├── methodOutlook.py (Hauptklasse mit _actions Dictionary) +├── helpers/ +│ ├── connection.py +│ ├── emailProcessing.py +│ └── folderManagement.py +└── actions/ + ├── readEmails.py (Execute-Funktion) + ├── searchEmails.py + └── ... 
+``` + +**Hauptklasse** (`methodOutlook/methodOutlook.py`): +```python +class MethodOutlook(MethodBase): + def __init__(self, services): + super().__init__(services) + self.name = "outlook" + self.description = "Handle Microsoft Outlook email operations" + + # Initialize helper modules + self.connection = ConnectionHelper(self) + self.emailProcessing = EmailProcessingHelper(self) + self.folderManagement = FolderManagementHelper(self) + + # RBAC-Integration: Action-Definitionen mit actionId + self._actions = { + "readEmails": WorkflowActionDefinition( + actionId="outlook.readEmails", # RBAC-ID + description="Read emails and metadata from a mailbox folder", + parameters={...}, + execute=readEmails.__get__(self, self.__class__) + ), + # ... weitere Actions + } + + # Actions als Methoden registrieren (optional, für direkten Zugriff) + self.readEmails = readEmails.__get__(self, self.__class__) +``` + +**Action-Datei** (`methodOutlook/actions/readEmails.py`): +```python +@action +async def readEmails(self, parameters: Dict[str, Any]) -> ActionResult: + """Execute function - verwendet Helper-Klassen""" + connection = self.connection.getMicrosoftConnection(...) + params = self.emailProcessing.buildSearchParameters(...) + # ... 
implementation +``` + +## MethodBase Erweiterung + +### Neue MethodBase Struktur + +```python +class MethodBase: + """Base class for all methods""" + + def __init__(self, services: Any): + self.services = services + self.name: str + self.description: str + self.logger = logging.getLogger(f"{__name__}.{self.__class__.__name__}") + + # Actions MÜSSEN als Dictionary definiert sein + # Jede Method-Klasse muss _actions Dictionary in __init__ definieren + self._actions: Dict[str, WorkflowActionDefinition] = {} + + # Nach Initialisierung: Actions validieren + self._validateActions() + + def _validateActions(self): + """Validate that _actions dictionary is properly defined""" + if not hasattr(self, '_actions') or not isinstance(self._actions, dict): + raise ValueError(f"Method {self.name} must define _actions dictionary in __init__") + + for actionName, actionDef in self._actions.items(): + if not isinstance(actionDef, WorkflowActionDefinition): + raise ValueError(f"Action '{actionName}' in {self.name} must be WorkflowActionDefinition instance") + + if not actionDef.actionId: + raise ValueError(f"Action '{actionName}' in {self.name} must have actionId") + + if not actionDef.execute: + raise ValueError(f"Action '{actionName}' in {self.name} must have execute function") + + @property + def actions(self) -> Dict[str, Dict[str, Any]]: + """ + Dynamically collect all actions from _actions dictionary. + Returns format for API/UI consumption. + + REQUIREMENT: Alle Actions müssen in _actions Dictionary definiert sein. + Actions ohne _actions Definition sind nicht verfügbar. + """ + result = {} + + # Actions müssen in _actions Dictionary definiert sein + if not hasattr(self, '_actions') or not self._actions: + logger.error(f"Method {self.name} has no _actions dictionary defined. 
Actions will not be available.") + return result + + for actionName, actionDef in self._actions.items(): + # RBAC-Check: Prüfe ob Action für aktuellen User verfügbar ist + if not self._checkActionPermission(actionDef.actionId): + continue # Skip if user doesn't have permission + + # Konvertiere WorkflowActionDefinition zu System-Format + result[actionName] = { + 'description': actionDef.description, + 'parameters': self._convertParametersToSystemFormat(actionDef.parameters), + 'method': self._createActionWrapper(actionDef) + } + + return result + + def _checkActionPermission(self, actionId: str) -> bool: + """ + Check if current user has permission to execute this action. + Uses RBAC RESOURCE context. + + REQUIREMENT: RBAC-Service muss verfügbar sein. + """ + if not hasattr(self.services, 'rbac') or not self.services.rbac: + logger.error(f"RBAC service not available. Action {actionId} will be denied.") + return False + + currentUser = self.services.chat.getCurrentUser() + if not currentUser: + logger.warning(f"No current user found. 
Action {actionId} will be denied.") + return False + + # RBAC-Check: RESOURCE context, item = actionId + permissions = self.services.rbac.getUserPermissions( + user=currentUser, + context=AccessRuleContext.RESOURCE, + item=actionId + ) + + return permissions.view + + def _convertParametersToSystemFormat(self, parameters: Dict[str, WorkflowActionParameter]) -> Dict[str, Dict[str, Any]]: + """Convert WorkflowActionParameter dict to system format for API/UI consumption""" + result = {} + for paramName, param in parameters.items(): + result[paramName] = { + 'type': param.type, + 'required': param.required, + 'description': param.description, + 'default': param.default, + 'frontendType': param.frontendType.value, + 'frontendOptions': param.frontendOptions, + 'validation': param.validation + } + return result + + def _createActionWrapper(self, actionDef: WorkflowActionDefinition): + """Create wrapper function for action execution with parameter validation""" + async def wrapper(parameters: Dict[str, Any], *args, **kwargs): + # Parameter-Validierung basierend auf WorkflowActionParameter definitions + validatedParams = self._validateParameters(parameters, actionDef.parameters) + + # Execute action + return await actionDef.execute(validatedParams, *args, **kwargs) + + wrapper.is_action = True + return wrapper + + def _validateParameters(self, parameters: Dict[str, Any], paramDefs: Dict[str, WorkflowActionParameter]) -> Dict[str, Any]: + """Validate parameters against definitions""" + validated = {} + + for paramName, paramDef in paramDefs.items(): + value = parameters.get(paramName) + + # Check required + if paramDef.required and value is None: + raise ValueError(f"Required parameter '{paramName}' is missing") + + # Use default if not provided + if value is None and paramDef.default is not None: + value = paramDef.default + + # Type validation + if value is not None: + value = self._validateType(value, paramDef.type) + + # Custom validation rules + if paramDef.validation 
and value is not None: + self._applyValidationRules(value, paramDef.validation) + + validated[paramName] = value + + return validated + + def _validateType(self, value: Any, expectedType: str) -> Any: + """Validate and convert value to expected type (type name as string, see WorkflowActionParameter.type)""" + # Type validation logic... + if expectedType == 'int': + return int(value) + elif expectedType == 'str': + return str(value) + # ... further types + return value + + def _applyValidationRules(self, value: Any, rules: Dict[str, Any]): + """Apply custom validation rules""" + if 'min' in rules and value < rules['min']: + raise ValueError(f"Value must be >= {rules['min']}") + if 'max' in rules and value > rules['max']: + raise ValueError(f"Value must be <= {rules['max']}") + # ... further validation rules +``` + +## Migration Strategy + +### Step 1: Create New Data Models + +**IMPORTANT**: The existing classes `ActionDefinition` (in `datamodelWorkflow.py`) and `ActionParameters` (in `datamodelChat.py`) serve a different purpose: +- `ActionDefinition` (existing): for workflow execution planning (holds concrete values) +- `ActionParameters` (existing): simple parameter wrapper + +**Solution**: Create new classes with clear names for action schema definitions. + +**File**: `gateway/modules/datamodels/datamodelWorkflowActions.py` + +```python +from typing import Optional, Any, Union, List, Dict, Callable, Awaitable +from pydantic import BaseModel, Field +from modules.datamodels.datamodelChat import ActionResult +from modules.shared.frontendTypes import FrontendType # Use the global definition +from modules.shared.attributeUtils import registerModelLabels + +class WorkflowActionParameter(BaseModel): + """ + Parameter schema definition for a workflow action. + + This defines the structure and UI rendering for a single action parameter, + NOT the actual parameter values (those are in ActionDefinition.parameters).
+ """ + name: str = Field(description="Parameter name") + type: str = Field(description="Python type as string: 'str', 'int', 'bool', 'List[str]', etc.") + frontendType: FrontendType = Field(description="UI rendering type (from global FrontendType enum)") + frontendOptions: Optional[Union[str, List[Dict[str, Any]]]] = Field( + None, + description="Options for select/multiselect/custom types. String reference (e.g., 'user.connection') or static list. For custom types, this is automatically set to the API endpoint." + ) + required: bool = Field(False, description="Whether parameter is required") + default: Optional[Any] = Field(None, description="Default value") + description: str = Field("", description="Parameter description") + validation: Optional[Dict[str, Any]] = Field( + None, + description="Validation rules (e.g., {'min': 1, 'max': 100})" + ) + +class WorkflowActionDefinition(BaseModel): + """ + Complete schema definition of a workflow action. + + This defines the metadata, parameters, and execution function for an action. + This is different from datamodelWorkflow.ActionDefinition which contains + actual execution values (action, actionObjective, parameters with values). + + This class defines the ACTION SCHEMA, not the execution plan. + """ + actionId: str = Field( + description="Unique action identifier for RBAC (format: 'module.actionName', e.g., 'outlook.readEmails')" + ) + description: str = Field(description="Action description") + parameters: Dict[str, WorkflowActionParameter] = Field( + default_factory=dict, + description="Parameter schema definitions" + ) + execute: Optional[Callable] = Field( + None, + description="Execution function - async function that takes parameters dict and returns ActionResult. Set dynamically." 
+ ) + category: Optional[str] = Field(None, description="Action category for grouping") + tags: List[str] = Field(default_factory=list, description="Tags for search/filtering") + +# Register model labels for UI +registerModelLabels( + "WorkflowActionDefinition", + {"en": "Workflow Action Definition", "fr": "Définition d'action de workflow"}, + { + "actionId": {"en": "Action ID", "fr": "ID d'action"}, + "description": {"en": "Description", "fr": "Description"}, + "parameters": {"en": "Parameters", "fr": "Paramètres"}, + "category": {"en": "Category", "fr": "Catégorie"}, + "tags": {"en": "Tags", "fr": "Étiquettes"}, + }, +) + +registerModelLabels( + "WorkflowActionParameter", + {"en": "Workflow Action Parameter", "fr": "Paramètre d'action de workflow"}, + { + "name": {"en": "Name", "fr": "Nom"}, + "type": {"en": "Type", "fr": "Type"}, + "frontendType": {"en": "Frontend Type", "fr": "Type frontend"}, + "frontendOptions": {"en": "Frontend Options", "fr": "Options frontend"}, + "required": {"en": "Required", "fr": "Requis"}, + "default": {"en": "Default", "fr": "Par défaut"}, + "description": {"en": "Description", "fr": "Description"}, + "validation": {"en": "Validation", "fr": "Validation"}, + }, +) +``` + +### Schritt 2: MethodBase erweitern + +**Datei**: `gateway/modules/workflows/methods/methodBase.py` + +- Neue `_actions` Dictionary Property (REQUIRED) +- RBAC-Check Integration (REQUIRED) +- Parameter-Validierung +- Unterstützung für Refactored Structure (Actions in separaten Dateien) + +**WICHTIG**: +- MethodBase unterstützt NUR noch die `_actions` Dictionary-Struktur +- Alle Actions MÜSSEN in `_actions` Dictionary definiert sein +- RBAC-Service ist REQUIRED (kein Fallback ohne RBAC) +- `@action` Decorator wird weiterhin verwendet, aber nur für Execute-Funktionen (nicht für Discovery) + +### Schritt 3: Beispiel-Migration mit Refactored Structure + +**Vorher** (monolithische `methodOutlook.py`): +```python +@action +async def readEmails(self, parameters: Dict[str, 
Any]) -> ActionResult: + """ + GENERAL: + - Purpose: Read emails from Outlook mailbox + + Parameters: + - connectionReference (str, required): Microsoft connection label. + - query (str, optional): Search query for emails. + - folder (str, optional): Folder name. + - limit (int, optional): Maximum number of emails. Default: 50. + """ + # Implementation... +``` + +**Nachher** (Refactored Structure): + +**1. Hauptklasse** (`methodOutlook/methodOutlook.py`): +```python +from modules.datamodels.datamodelWorkflowActions import WorkflowActionDefinition, WorkflowActionParameter +from modules.shared.frontendTypes import FrontendType +from .actions.readEmails import readEmails +from .helpers.connection import ConnectionHelper +from .helpers.emailProcessing import EmailProcessingHelper +from .helpers.folderManagement import FolderManagementHelper + +class MethodOutlook(MethodBase): + def __init__(self, services): + super().__init__(services) + self.name = "outlook" + self.description = "Handle Microsoft Outlook email operations" + + # Initialize helper modules + self.connection = ConnectionHelper(self) + self.emailProcessing = EmailProcessingHelper(self) + self.folderManagement = FolderManagementHelper(self) + + # Actions werden deklarativ definiert + self._actions = { + "readEmails": WorkflowActionDefinition( + actionId="outlook.readEmails", + description="Read emails and metadata from a mailbox folder", + parameters={ + "connectionReference": WorkflowActionParameter( + name="connectionReference", + type="str", + frontendType=FrontendType.USER_CONNECTION, + required=True, + description="Microsoft connection label" + ), + "folder": WorkflowActionParameter( + name="folder", + type="str", + frontendType=FrontendType.SELECT, + frontendOptions="outlook.folder", + required=False, + default="Inbox", + description="Folder to read from" + ), + "limit": WorkflowActionParameter( + name="limit", + type="int", + frontendType=FrontendType.NUMBER, + required=False, + default=10, + 
description="Maximum items to return", + validation={"min": 1, "max": 1000} + ), + "filter": WorkflowActionParameter( + name="filter", + type="str", + frontendType=FrontendType.TEXT, + required=False, + description="Sender, query operators, or subject text" + ), + "outputMimeType": WorkflowActionParameter( + name="outputMimeType", + type="str", + frontendType=FrontendType.SELECT, + frontendOptions=["application/json", "text/plain", "text/csv"], + required=False, + default="application/json", + description="MIME type for output file" + ) + }, + execute=readEmails.__get__(self, self.__class__) # Reference to the execute function + ) + } + + # Register actions as methods (optional, for direct access) + self.readEmails = readEmails.__get__(self, self.__class__) +``` + +**2. Action file** (`methodOutlook/actions/readEmails.py`): +```python +from typing import Any, Dict +from modules.workflows.methods.methodBase import action +from modules.datamodels.datamodelChat import ActionResult + +@action +async def readEmails(self, parameters: Dict[str, Any]) -> ActionResult: + """ + Execute function - the parameter definition now lives in WorkflowActionDefinition. + This function only contains the implementation. + """ + # Implementation stays the same... + connectionReference = parameters.get("connectionReference") + folder = parameters.get("folder", "Inbox") + # ... rest of implementation using self.connection, self.emailProcessing, etc.
+``` + +## RBAC-Integration + +### Action-IDs Format + +Actions werden im RBAC-System als RESOURCE-Context Items behandelt: + +- **Format**: `{moduleName}.{actionName}` +- **Beispiele**: + - `outlook.readEmails` + - `outlook.sendEmail` + - `sharepoint.uploadDocument` + - `ai.process` + +### RBAC-Regeln für Actions + +```json +{ + "roleLabel": "user", + "context": "RESOURCE", + "item": "outlook.readEmails", + "view": true +} +``` + +```json +{ + "roleLabel": "admin", + "context": "RESOURCE", + "item": "outlook", + "view": true +} +``` + +**Hierarchie**: Spezifische Action-Regeln überschreiben generische Module-Regeln. + +### Bootstrap: Default RBAC Rules für Actions + +In `interfaceBootstrap.py`: + +```python +def initRbacRules(db: DatabaseConnector) -> None: + # ... existing rules ... + + # Action Rules (RESOURCE context) + createActionRules(db) + +def createActionRules(db: DatabaseConnector): + """Create default RBAC rules for workflow actions""" + + # SysAdmin: Access to all actions + db.recordCreate(AccessRule( + roleLabel="sysadmin", + context=AccessRuleContext.RESOURCE, + item=None, # All resources + view=True + )) + + # Admin: Access to all actions + db.recordCreate(AccessRule( + roleLabel="admin", + context=AccessRuleContext.RESOURCE, + item=None, + view=True + )) + + # User: Access to specific actions only + userActions = [ + "outlook.readEmails", + "outlook.sendEmail", + "sharepoint.readDocuments", + "ai.process" + ] + + for actionId in userActions: + db.recordCreate(AccessRule( + roleLabel="user", + context=AccessRuleContext.RESOURCE, + item=actionId, + view=True + )) + + # Viewer: Read-only actions + viewerActions = [ + "outlook.readEmails", + "sharepoint.readDocuments" + ] + + for actionId in viewerActions: + db.recordCreate(AccessRule( + roleLabel="viewer", + context=AccessRuleContext.RESOURCE, + item=actionId, + view=True + )) +``` + +## Organisation der Action-Definitionen + +### Zentrale Definition in Hauptklasse + +**Prinzip**: Alle 
Action-Definitionen werden zentral in der Hauptklasse (`methodOutlook.py`) im `_actions` Dictionary definiert. + +**Vorteile**: +- Übersichtliche Verwaltung aller Actions einer Method +- Einfache RBAC-Integration (alle Action-IDs an einem Ort) +- Einfache API-Discovery (MethodBase kann alle Actions sammeln) +- Type-Safety durch Pydantic Models + +### Execute-Funktionen bleiben in separaten Dateien + +**Prinzip**: Die Execute-Funktionen bleiben in den separaten Action-Dateien (`actions/readEmails.py`). + +**Vorteile**: +- Modulare Struktur (eine Datei pro Action) +- Einfache Wartung und Tests +- Parallele Entwicklung möglich +- Helper-Klassen bleiben zugänglich über `self` + +### Beispiel-Struktur + +``` +methodOutlook/ +├── methodOutlook.py +│ └── _actions = { +│ "readEmails": WorkflowActionDefinition(...), +│ "searchEmails": WorkflowActionDefinition(...) +│ } +└── actions/ + ├── readEmails.py + │ └── async def readEmails(self, parameters) -> ActionResult + └── searchEmails.py + └── async def searchEmails(self, parameters) -> ActionResult +``` + +### Migration-Strategie + +1. **Schritt 1**: Action-Definitionen in `_actions` Dictionary hinzufügen +2. **Schritt 2**: Execute-Funktionen aus Action-Dateien referenzieren +3. **Schritt 3**: RBAC-Regeln in Bootstrap erstellen +4. **Schritt 4**: Tests und Validierung + +**Wichtig**: Execute-Funktionen müssen nicht geändert werden - sie bleiben identisch! + +## Vorteile + +### 1. Keine Duplikation +- Parameter werden nur einmal definiert (in `WorkflowActionDefinition`) +- Keine Docstring-Parsing mehr nötig +- Type-Safety durch Pydantic Models +- Zentrale Verwaltung in Hauptklasse + +### 2. RBAC-Integration +- Jede Action hat eine eindeutige ID für RBAC +- Granulare Kontrolle pro Action möglich +- Hierarchische Regeln (Module → Action) +- Action-IDs zentral in `_actions` Dictionary verwaltet + +### 3. 
UI Rendering +- Frontend types are defined explicitly +- Option references for dynamic options +- Validation at the backend level +- Structured parameter definitions for the frontend + +### 4. Plug-and-Play +- Actions remain in separate method classes +- Easy to extend with new method classes +- The refactored structure is preserved +- Clear requirement: all actions must be defined in the `_actions` dictionary + +### 5. Type Safety +- Pydantic models for validation +- Type hints for better IDE support +- Runtime validation +- Central definitions for better maintainability + +### 6. Compatibility with the Refactored Structure +- Works seamlessly with the folder-based structure +- Execute functions stay in separate files +- Helper classes remain reusable +- Simple migration without code changes in the action files + +## Migration Timeline + +### Phase 1: Foundation (Week 1) +- ✅ Create data models (`datamodelWorkflowActions.py`) +- ✅ Extend MethodBase +- ✅ RBAC integration in MethodBase +- ✅ Refactoring of all methods completed (folder structure) + +### Phase 2: Example Migration (Week 2) +- 📝 Migrate one example method (e.g. `methodOutlook` with RBAC definitions) +- 📝 Add action definitions to the main class (`_actions` dictionary) +- 📝 Execute functions in the action files remain unchanged +- 📝 Write tests +- 📝 Update documentation + +**Note**: Because all methods are already refactored (actions in separate files), the migration is simpler: only the `_actions` dictionary has to be added to the main class; the execute functions remain unchanged.
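The execute functions can stay unchanged because the main class binds them with `readEmails.__get__(self, self.__class__)`, which is Python's standard descriptor mechanism for turning a plain function into a bound method. The following is a minimal sketch of that binding only; `MethodOutlookSketch` is a hypothetical stand-in without `MethodBase`, RBAC, or `WorkflowActionDefinition`, and the action returns a plain dict instead of an `ActionResult`.

```python
import asyncio
from typing import Any, Dict


# Stand-in for a function in actions/readEmails.py: written like a method
# (first argument `self`) but defined at module level.
async def readEmails(self, parameters: Dict[str, Any]) -> Dict[str, Any]:
    folder = parameters.get("folder", "Inbox")
    return {"method": self.name, "folder": folder}


class MethodOutlookSketch:
    """Hypothetical, stripped-down main class for illustration only."""

    def __init__(self) -> None:
        self.name = "outlook"
        # function.__get__(instance, owner) returns a bound method, so the
        # action receives this instance as `self` when it is called later.
        self._actions = {"readEmails": readEmails.__get__(self, self.__class__)}


method = MethodOutlookSketch()
result = asyncio.run(method._actions["readEmails"]({"folder": "Sent"}))
```

Because the binding happens in the main class, the action file needs no knowledge of `_actions`, which is why the migration does not touch it.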
+ +### Phase 3: Vollständige Migration (Woche 3-4) +- 📝 Alle Methods migrieren (Action-Definitionen hinzufügen) +- 📝 RBAC-Regeln in Bootstrap erstellen +- 📝 Frontend-Integration +- 📝 API-Endpunkte für Action-Discovery implementieren + +### Phase 4: Testing & Cleanup (Woche 5) +- 📝 Unit Tests +- 📝 Integration Tests +- 📝 Performance Tests +- 📝 Alte Docstring-Parsing-Logik entfernen (nicht mehr benötigt) +- 📝 Sicherstellen, dass alle Methods `_actions` Dictionary haben + +## Praktische Umsetzung + +### Schritt-für-Schritt: Action-Definition hinzufügen + +**Voraussetzung**: Method ist bereits refactored (Actions in separaten Dateien). + +**Schritt 1**: Importiere benötigte Klassen in Hauptklasse +```python +# In methodOutlook/methodOutlook.py +from modules.datamodels.datamodelWorkflowActions import WorkflowActionDefinition, WorkflowActionParameter +from modules.shared.frontendTypes import FrontendType +``` + +**Schritt 2**: Erstelle `_actions` Dictionary in `__init__` +```python +def __init__(self, services): + super().__init__(services) + # ... existing code ... + + # RBAC-Integration: Action-Definitionen + self._actions = { + "readEmails": WorkflowActionDefinition( + actionId="outlook.readEmails", + description="Read emails and metadata from a mailbox folder", + parameters={ + "connectionReference": WorkflowActionParameter( + name="connectionReference", + type="str", + frontendType=FrontendType.USER_CONNECTION, + required=True, + description="Microsoft connection label" + ), + # ... weitere Parameter + }, + execute=readEmails.__get__(self, self.__class__) + ) + } +``` + +**Schritt 3**: Execute-Funktion bleibt unverändert +```python +# In methodOutlook/actions/readEmails.py +# Keine Änderungen nötig - Funktion bleibt identisch +@action +async def readEmails(self, parameters: Dict[str, Any]) -> ActionResult: + # Implementation bleibt gleich... 
+``` + +**Step 4**: Add the RBAC rule to the bootstrap +```python +# In interfaceBootstrap.py +db.recordCreate(AccessRule( + roleLabel="user", + context=AccessRuleContext.RESOURCE, + item="outlook.readEmails", + view=True +)) +``` + +### Migration Checklist per Method + +- [ ] Create the `_actions` dictionary in the main class +- [ ] Define all actions with `WorkflowActionDefinition` +- [ ] Define parameters with `WorkflowActionParameter` +- [ ] Use frontend types from the global `FrontendType` enum +- [ ] Reference the execute functions from the action files +- [ ] Add RBAC rules to the bootstrap +- [ ] Run tests + +## Open Questions + +1. **Backward Compatibility**: Are old actions without an `_actions` dictionary supported? + - **Answer**: No. All actions MUST be defined in the `_actions` dictionary. There is no fallback to the `@action` decorator. Actions without an `_actions` definition are not available. + +2. **Parameter Validation**: Should validation be strict or tolerant? + - **Answer**: Configurable per action + +3. **Action Discovery**: Should it be possible to register actions at runtime? + - **Answer**: Yes, via a `_registerActions()` method + +4. **Frontend Integration**: How are actions displayed in the frontend? + - **Answer**: The API endpoint `/api/workflows/actions` returns structured action definitions from the `_actions` dictionary (filtered by RBAC) + +5. **Action definitions in separate files**: Should action definitions also live in the action files? + - **Answer**: No, action definitions stay in the main class (`_actions` dictionary). The action files contain only the execute functions. This allows central management and simple RBAC integration. + +6. **Migration Order**: Should all actions of a method be migrated at the same time? + - **Answer**: Recommended: step by step per action, to minimize risk. A complete migration per method is also possible.
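The RBAC filtering mentioned in the answers above relies on the rule hierarchy described earlier: a rule for a specific action overrides a rule for its module, and `item=None` covers all resources. One possible reading of that hierarchy is sketched below. `AccessRuleSketch` and `canViewAction` are hypothetical names, the real `getUserPermissions()` may resolve rules differently, and denying by default when no rule matches is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AccessRuleSketch:
    """Simplified stand-in for AccessRule, limited to the RESOURCE context."""
    roleLabel: str
    item: Optional[str]  # None = all resources, "outlook" = module, "outlook.readEmails" = action
    view: bool


def canViewAction(rules: List[AccessRuleSketch], roleLabel: str, actionId: str) -> bool:
    """Most specific matching rule wins: action first, then module, then wildcard."""
    moduleName = actionId.split(".", 1)[0]
    for candidate in (actionId, moduleName, None):
        for rule in rules:
            if rule.roleLabel == roleLabel and rule.item == candidate:
                return rule.view
    return False  # assumption: no matching rule means the action is denied


rules = [
    AccessRuleSketch("admin", None, True),
    AccessRuleSketch("user", "outlook", True),
    AccessRuleSketch("user", "outlook.sendEmail", False),
]
```

In this example the `user` role may view every `outlook` action except `outlook.sendEmail`, because the action-level rule takes precedence over the module-level rule.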
+ +## API Endpoints + +**IMPORTANT**: These API endpoints refer to **action definitions** (schema), not to executable workflows or templates. + +### GET /api/workflows/actions +Returns all actions available to the current user (filtered by RBAC). For custom types, `frontendOptions` is set automatically. + +**Purpose**: Action discovery for the workflow editor and dynamic workflows +**Usage**: +- Workflow editor: shows available actions in the toolbox +- Dynamic workflows: shows available actions for AI selection + +```json +{ + "actions": [ + { + "module": "outlook", + "actionId": "outlook.readEmails", + "name": "readEmails", + "description": "Read emails from Outlook mailbox", + "parameters": { + "connectionReference": { + "type": "str", + "frontendType": "userConnection", + "frontendOptions": "user.connection", + "required": true, + "description": "Microsoft connection label" + }, + "documentList": { + "type": "List[str]", + "frontendType": "documentReference", + "frontendOptions": "workflow.documentReference", + "required": false, + "description": "Document list reference(s) from previous actions" + }, + ... + } + }, + ... + ] +} +``` + +### GET /api/workflows/actions/{module} +Returns the actions of a specific module. + +### POST /api/workflows/actions/{module}/{action}/execute +Executes an action (with RBAC check). + +## Custom Frontend Types for Actions + +### Available Custom Types + +1. **`FrontendType.USER_CONNECTION`** + - **API endpoint**: `/api/options/user.connection` + - **Description**: Shows all active connections of the current user + - **Usage**: For parameters such as `connectionReference` in Outlook/SharePoint actions + - **Example**: + ```python + WorkflowActionParameter( + name="connectionReference", + type="str", + frontendType=FrontendType.USER_CONNECTION, + required=True, + description="Microsoft connection label" + ) + ``` + +2.
**`FrontendType.DOCUMENT_REFERENCE`** + - **API endpoint**: `/api/options/workflow.documentReference` (to be implemented) + - **Description**: Shows available document references from the current workflow context + - **Usage**: For parameters such as `documentList` in actions that refer to the results of previous actions + - **Example**: + ```python + WorkflowActionParameter( + name="documentList", + type="List[str]", + frontendType=FrontendType.DOCUMENT_REFERENCE, + required=False, + description="Document list reference(s) from previous actions" + ) + ``` + +3. **`FrontendType.WORKFLOW_ACTION`** + - **API endpoint**: `/api/options/workflow.action` (to be implemented) + - **Description**: Shows available actions from the workflow context + - **Usage**: For parameters that refer to other actions + +### Extending Custom Types + +New custom types can be registered via `frontendTypes.py`: + +```python +from modules.shared.frontendTypes import FrontendType, registerCustomType + +# Add the new custom type: declare SHAREPOINT_FOLDER = "sharepointFolder" as a member of the FrontendType enum in frontendTypes.py +# (assigning FrontendType.SHAREPOINT_FOLDER = ... after class creation would not create an enum member) + +# Register it +registerCustomType( + frontendType=FrontendType.SHAREPOINT_FOLDER, + optionsApiEndpoint="sharepoint.folder", + description={ + "en": "SharePoint Folder", + "fr": "Dossier SharePoint", + "de": "SharePoint-Ordner" + } +) +``` + +### Frontend Integration + +The frontend must: +1. Recognize custom types (e.g. `frontendType === "userConnection"`) +2. Automatically load options from `/api/options/{optionsName}` +3.
Render the options as a select/multiselect + +**Example frontend logic**: +```typescript +if (param.frontendType === 'userConnection') { + // Load options automatically from /api/options/user.connection + const response = await fetch(`/api/options/${param.frontendOptions}`); + const options = await response.json(); // assumes the endpoint returns JSON + // Render as select +} +``` + diff --git a/appdoc/doc_workflow_method_refactoring_concept_done.md b/appdoc/doc_workflow_method_refactoring_concept_done.md new file mode 100644 index 0000000..bd696f1 --- /dev/null +++ b/appdoc/doc_workflow_method_refactoring_concept_done.md @@ -0,0 +1,1313 @@ +# Method Files Refactoring Concept + +## Overview + +This document describes the **standard refactoring concept** for all method files in the `gateway/modules/workflows/methods/` directory. + +**Goal**: All methods are reorganized into the **same folder-based structure** in order to: +- improve maintainability +- enable parallel development +- increase testability +- ensure scalability + +**Standard structure**: Each method is split into its own folder with `helpers/` and `actions/` subfolders.
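The standard structure can be illustrated in a single file, with comments marking where each piece would live after the split. This is a hedged sketch only: `MethodSharepointSketch`, `describe()` and the returned dict are hypothetical, and the real helpers wrap Microsoft Graph calls rather than string formatting.

```python
from typing import Any, Dict


class ConnectionHelper:
    """Would live in helpers/connection.py; holds logic shared by several actions."""

    def __init__(self, method: Any) -> None:
        self.method = method  # back-reference to the owning method instance

    def describe(self, connectionReference: str) -> str:
        # Hypothetical helper function used only for this illustration.
        return f"{self.method.name}:{connectionReference}"


def listDocuments(self, parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Would live in actions/listDocuments.py; reaches helpers through `self`."""
    return {"connection": self.connection.describe(parameters["connectionReference"])}


class MethodSharepointSketch:
    """Would live in methodSharepoint/methodSharepoint.py and stay minimal."""

    def __init__(self) -> None:
        self.name = "sharepoint"
        self.connection = ConnectionHelper(self)
        # Bind the module-level action so that it can use self.connection.
        self.listDocuments = listDocuments.__get__(self, self.__class__)


output = MethodSharepointSketch().listDocuments({"connectionReference": "msft-main"})
```

The main class only wires helpers and actions together, which keeps each action file small and independently testable.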
+
+**Affected methods**:
+- `methodSharepoint.py` (2840 lines → folder structure)
+- `methodOutlook.py` (1905 lines → folder structure)
+- `methodJira.py` (1102 lines → folder structure)
+- `methodAi.py` (743 lines → folder structure)
+- `methodContext.py` (461 lines → folder structure)
+
+## Problem statement
+
+The method files have grown very long:
+- `methodSharepoint.py`: **2840 lines** (9 actions, ~16 helper functions)
+- `methodOutlook.py`: **1905 lines** (4 actions, ~8 helper functions)
+- `methodJira.py`: **1102 lines** (8 actions)
+- `methodAi.py`: **743 lines** (8 actions)
+
+**Problems**:
+- Hard to maintain and navigate
+- High complexity per file
+- Actions are largely independent of each other but share helper functions
+- Hard for several developers to work in parallel
+
+## Analysis of the current structure
+
+### MethodSharepoint.py structure
+
+```
+methodSharepoint.py (2840 lines)
+├── __init__() - Initialization
+├── Helper functions (16):
+│   ├── _format_timestamp_for_filename()
+│   ├── _getMicrosoftConnection()
+│   ├── _discoverSharePointSites()
+│   ├── _extractHostnameFromWebUrl()
+│   ├── _extractSiteFromStandardPath()
+│   ├── _getSiteByStandardPath()
+│   ├── _filterSitesByHint()
+│   ├── _parseSearchQuery()
+│   ├── _resolvePathQuery()
+│   ├── _parseSiteUrl()
+│   ├── _cleanSearchQuery()
+│   ├── _makeGraphApiCall()
+│   ├── _getSiteId()
+│   ├── _parseDocumentListForFoundDocuments()
+│   ├── _resolveSitesFromPathQuery()
+│   └── _parseDocumentListForFolder() (presumably)
+├── Actions (9):
+│   ├── findDocumentPath() (~480 lines)
+│   ├── readDocuments() (~275 lines)
+│   ├── uploadDocument() (~270 lines)
+│   ├── listDocuments() (~270 lines)
+│   ├── analyzeFolderUsage() (~320 lines)
+│   ├── findSiteByUrl() (~70 lines)
+│   ├── downloadFileByPath() (~100 lines)
+│   ├── copyFile() (~150 lines)
+│   └── uploadFile() (~130 lines)
+```
+
+### Helper function categories
+
+**Connection & Authentication**:
+- `_getMicrosoftConnection()` - 
Used by ALL actions
+
+**Site Discovery & Resolution**:
+- `_discoverSharePointSites()` - Used by several actions
+- `_getSiteByStandardPath()` - Used by several actions
+- `_filterSitesByHint()` - Used by several actions
+- `_resolveSitesFromPathQuery()` - Used by several actions
+- `_getSiteId()` - Used by several actions
+
+**Document Parsing**:
+- `_parseDocumentListForFoundDocuments()` - Used by readDocuments, uploadDocument
+- `_parseDocumentListForFolder()` - Used by uploadDocument, listDocuments
+
+**Path & Query Processing**:
+- `_parseSearchQuery()` - Used by findDocumentPath
+- `_resolvePathQuery()` - Used by several actions
+- `_extractSiteFromStandardPath()` - Used by several actions
+- `_extractHostnameFromWebUrl()` - Used by several actions
+- `_parseSiteUrl()` - Used by several actions
+- `_cleanSearchQuery()` - Used by findDocumentPath
+
+**API Communication**:
+- `_makeGraphApiCall()` - Used by several actions
+
+**Utilities**:
+- `_format_timestamp_for_filename()` - Used by several actions
+
+## Refactoring concept: folder-based structure
+
+**Standard structure for all methods**:
+
+Each method is split into its own folder with the following structure:
+- `methodName/` - Main folder
+  - `__init__.py` - Exports the method class
+  - `methodName.py` - Main class (minimal, ~50-150 lines)
+  - `helpers/` - Helper modules grouped by functionality
+  - `actions/` - One module per action
+
+### Complete structure for all methods
+
+```
+gateway/modules/workflows/methods/
+├── methodBase.py (unchanged)
+│
+├── methodSharepoint/
+│   ├── __init__.py
+│   ├── methodSharepoint.py
+│   ├── helpers/
+│   │   ├── __init__.py
+│   │   ├── connection.py
+│   │   ├── siteDiscovery.py
+│   │   ├── documentParsing.py
+│   │   ├── pathProcessing.py
+│   │   └── apiClient.py
+│   └── 
actions/
+│       ├── __init__.py
+│       ├── findDocumentPath.py
+│       ├── readDocuments.py
+│       ├── uploadDocument.py
+│       ├── listDocuments.py
+│       ├── analyzeFolderUsage.py
+│       ├── findSiteByUrl.py
+│       ├── downloadFileByPath.py
+│       ├── copyFile.py
+│       └── uploadFile.py
+│
+├── methodOutlook/
+│   ├── __init__.py
+│   ├── methodOutlook.py
+│   ├── helpers/
+│   │   ├── __init__.py
+│   │   ├── connection.py
+│   │   ├── emailProcessing.py
+│   │   └── folderManagement.py
+│   └── actions/
+│       ├── __init__.py
+│       ├── readEmails.py
+│       ├── searchEmails.py
+│       ├── composeAndDraftEmailWithContext.py
+│       └── sendDraftEmail.py
+│
+├── methodJira/
+│   ├── __init__.py
+│   ├── methodJira.py
+│   ├── helpers/
+│   │   ├── __init__.py
+│   │   ├── connection.py
+│   │   ├── adfConverter.py (ADF to Text)
+│   │   └── documentParsing.py
+│   └── actions/
+│       ├── __init__.py
+│       ├── connectJira.py
+│       ├── exportTicketsAsJson.py
+│       ├── importTicketsFromJson.py
+│       ├── mergeTicketData.py
+│       ├── parseCsvContent.py
+│       ├── parseExcelContent.py
+│       ├── createCsvContent.py
+│       └── createExcelContent.py
+│
+├── methodAi/
+│   ├── __init__.py
+│   ├── methodAi.py
+│   ├── helpers/
+│   │   ├── __init__.py
+│   │   └── csvProcessing.py
+│   └── actions/
+│       ├── __init__.py
+│       ├── process.py
+│       ├── webResearch.py
+│       ├── summarizeDocument.py
+│       ├── translateDocument.py
+│       ├── convert.py
+│       ├── convertDocument.py
+│       ├── extractData.py
+│       └── generateDocument.py
+│
+└── methodContext/
+    ├── __init__.py
+    ├── methodContext.py
+    ├── helpers/
+    │   ├── __init__.py
+    │   ├── documentIndex.py
+    │   └── formatting.py
+    └── actions/
+        ├── __init__.py
+        ├── getDocumentIndex.py
+        ├── extractContent.py
+        └── triggerPreprocessingServer.py
+```
+
+### Detailed structure: methodSharepoint/
+
+#### methodSharepoint/__init__.py
+```python
+# Copyright (c) 2025 Patrick Motsch
+# All rights reserved.
+ +from .methodSharepoint import MethodSharepoint + +__all__ = ['MethodSharepoint'] +``` + +#### methodSharepoint/methodSharepoint.py (Hauptklasse) +```python +# Copyright (c) 2025 Patrick Motsch +# All rights reserved. + +import logging +from typing import Dict, Any +from modules.workflows.methods.methodBase import MethodBase + +# Import helpers +from .helpers.connection import ConnectionHelper +from .helpers.siteDiscovery import SiteDiscoveryHelper +from .helpers.documentParsing import DocumentParsingHelper +from .helpers.pathProcessing import PathProcessingHelper +from .helpers.apiClient import ApiClientHelper + +# Import actions +from .actions.findDocumentPath import findDocumentPath +from .actions.readDocuments import readDocuments +from .actions.uploadDocument import uploadDocument +from .actions.listDocuments import listDocuments +from .actions.analyzeFolderUsage import analyzeFolderUsage +from .actions.findSiteByUrl import findSiteByUrl +from .actions.downloadFileByPath import downloadFileByPath +from .actions.copyFile import copyFile +from .actions.uploadFile import uploadFile + +logger = logging.getLogger(__name__) + +class MethodSharepoint(MethodBase): + """SharePoint operations methods.""" + + def __init__(self, services): + super().__init__(services) + self.name = "sharepoint" + self.description = "SharePoint operations methods" + + # Initialize helper modules + self.connection = ConnectionHelper(self) + self.siteDiscovery = SiteDiscoveryHelper(self) + self.documentParsing = DocumentParsingHelper(self) + self.pathProcessing = PathProcessingHelper(self) + self.apiClient = ApiClientHelper(self) + + # Register actions + self.findDocumentPath = findDocumentPath.__get__(self, self.__class__) + self.readDocuments = readDocuments.__get__(self, self.__class__) + self.uploadDocument = uploadDocument.__get__(self, self.__class__) + self.listDocuments = listDocuments.__get__(self, self.__class__) + self.analyzeFolderUsage = analyzeFolderUsage.__get__(self, 
self.__class__) + self.findSiteByUrl = findSiteByUrl.__get__(self, self.__class__) + self.downloadFileByPath = downloadFileByPath.__get__(self, self.__class__) + self.copyFile = copyFile.__get__(self, self.__class__) + self.uploadFile = uploadFile.__get__(self, self.__class__) +``` + +#### methodSharepoint/helpers/connection.py +```python +# Copyright (c) 2025 Patrick Motsch +# All rights reserved. + +import logging +from typing import Dict, Any, Optional + +logger = logging.getLogger(__name__) + +class ConnectionHelper: + """Helper for Microsoft connection management""" + + def __init__(self, methodInstance): + self.method = methodInstance + self.services = methodInstance.services + + def getMicrosoftConnection(self, connectionReference: str) -> Optional[Dict[str, Any]]: + """Get Microsoft connection from connection reference and configure SharePoint service""" + try: + userConnection = self.services.chat.getUserConnectionFromConnectionReference(connectionReference) + if not userConnection: + logger.warning(f"No user connection found for reference: {connectionReference}") + return None + + if userConnection.authority.value != "msft": + logger.warning(f"Connection {userConnection.id} is not Microsoft (authority: {userConnection.authority.value})") + return None + + # Check if connection is active or pending + if userConnection.status.value not in ["active", "pending"]: + logger.warning(f"Connection {userConnection.id} status is not active/pending: {userConnection.status.value}") + return None + + # Configure SharePoint service + if not self.services.sharepoint.setAccessTokenFromConnection(userConnection): + logger.warning(f"Failed to configure SharePoint service with connection {userConnection.id}") + return None + + logger.info(f"Successfully configured SharePoint service with Microsoft connection: {userConnection.id}") + + return { + "id": userConnection.id, + "userConnection": userConnection, + "scopes": ["Sites.ReadWrite.All", "Files.ReadWrite.All", 
"User.Read"] + } + except Exception as e: + logger.error(f"Error getting Microsoft connection: {str(e)}") + return None +``` + +#### methodSharepoint/helpers/siteDiscovery.py +```python +# Copyright (c) 2025 Patrick Motsch +# All rights reserved. + +import logging +from typing import Dict, Any, List, Optional + +logger = logging.getLogger(__name__) + +class SiteDiscoveryHelper: + """Helper for SharePoint site discovery and resolution""" + + def __init__(self, methodInstance): + self.method = methodInstance + self.services = methodInstance.services + + async def discoverSharePointSites(self, limit: Optional[int] = None) -> List[Dict[str, Any]]: + """Discover SharePoint sites accessible to the user via Microsoft Graph API""" + # ... Implementation ... + pass + + def filterSitesByHint(self, sites: List[Dict[str, Any]], siteHint: str) -> List[Dict[str, Any]]: + """Filter sites by hint""" + # ... Implementation ... + pass + + async def getSiteByStandardPath(self, sitePath: str) -> Optional[Dict[str, Any]]: + """Get site by standard path""" + # ... Implementation ... + pass + + async def getSiteId(self, hostname: str, sitePath: str) -> str: + """Get site ID from hostname and path""" + # ... Implementation ... + pass + + async def resolveSitesFromPathQuery(self, pathQuery: str) -> tuple[List[Dict[str, Any]], Optional[str]]: + """Resolve sites from pathQuery""" + # ... Implementation ... + pass +``` + +#### methodSharepoint/actions/findDocumentPath.py +```python +# Copyright (c) 2025 Patrick Motsch +# All rights reserved. + +import logging +import time +from typing import Dict, Any +from modules.workflows.methods.methodBase import action +from modules.datamodels.datamodelChat import ActionResult, ActionDocument + +logger = logging.getLogger(__name__) + +@action +async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult: + """ + GENERAL: + - Purpose: Find documents and folders by name/path across sites. 
+ - Input requirements: connectionReference (required); searchQuery (required); optional site, maxResults. + - Output format: JSON with found items and paths. + + Parameters: + - connectionReference (str, required): Microsoft connection label. + - site (str, optional): Site hint. + - searchQuery (str, required): Search terms or path. + - maxResults (int, optional): Maximum items to return. Default: 1000. + """ + operationId = None + try: + # Init progress logger + workflowId = self.services.workflow.id if self.services.workflow else f"no-workflow-{int(time.time())}" + operationId = f"sharepoint_find_{workflowId}_{int(time.time())}" + + # Start progress tracking + parentOperationId = parameters.get('parentOperationId') + self.services.chat.progressLogStart( + operationId, + "Find Document Path", + "SharePoint Search", + f"Query: {parameters.get('searchQuery', '*')}", + parentOperationId=parentOperationId + ) + + connectionReference = parameters.get("connectionReference") + searchQuery = parameters.get("searchQuery") + siteHint = parameters.get("site") + maxResults = parameters.get("maxResults", 1000) + + if not connectionReference: + if operationId: + self.services.chat.progressLogFinish(operationId, False) + return ActionResult.isFailure(error="Connection reference is required") + + if not searchQuery: + if operationId: + self.services.chat.progressLogFinish(operationId, False) + return ActionResult.isFailure(error="Search query is required") + + # Get Microsoft connection + self.services.chat.progressLogUpdate(operationId, 0.2, "Getting Microsoft connection") + connection = self.connection.getMicrosoftConnection(connectionReference) + if not connection: + if operationId: + self.services.chat.progressLogFinish(operationId, False) + return ActionResult.isFailure(error="No valid Microsoft connection found") + + # Parse search query + self.services.chat.progressLogUpdate(operationId, 0.3, "Parsing search query") + siteHintFromQuery, pathQuery, searchText, 
searchOptions = self.pathProcessing.parseSearchQuery(searchQuery)
+
+        # Use site hint from parameter or query
+        finalSiteHint = siteHint or siteHintFromQuery
+
+        # Discover sites
+        self.services.chat.progressLogUpdate(operationId, 0.4, "Discovering SharePoint sites")
+        allSites = await self.siteDiscovery.discoverSharePointSites()
+
+        # Filter sites by hint if provided
+        if finalSiteHint:
+            allSites = self.siteDiscovery.filterSitesByHint(allSites, finalSiteHint)
+
+        # ... rest of implementation using helpers ...
+
+        self.services.chat.progressLogFinish(operationId, True)
+        return ActionResult.isSuccess(documents=[...])
+
+    except Exception as e:
+        errorMsg = f"Error finding document path: {str(e)}"
+        logger.error(errorMsg)
+        if operationId:
+            self.services.chat.progressLogFinish(operationId, False)
+        return ActionResult.isFailure(error=errorMsg)
+```
+
+## Benefits of the folder-based structure
+
+### Benefits
+
+1. **Clear organization**:
+   - Helper functions grouped by functionality
+   - Actions isolated in their own modules
+   - Easy to navigate
+
+2. **Maintainability**:
+   - Each action in its own module (~100-500 lines)
+   - Helper modules focused on specific tasks
+   - Easier to test
+
+3. **Scalability**:
+   - New actions are easy to add
+   - Helper functions are reusable
+   - Parallel development becomes possible
+
+4. **Compatibility**:
+   - `methodDiscovery.py` keeps working (it finds the `MethodSharepoint` class)
+   - No changes to the existing discovery logic required
+   - Actions are still recognized through the `@action` decorator
+
+### Migration strategy
+
+#### Phase 1: Create helper modules
+1. Move helper functions into separate modules
+2. Create helper classes (ConnectionHelper, SiteDiscoveryHelper, etc.)
+3. Adapt the main class to use the helpers
+4. Write tests
+
+#### Phase 2: Extract actions
+1. Move one action after another into a separate module
+2. Define each action as a standalone function
+3. 
Register it as a method in the main class
+4. Tests for each action
+
+#### Phase 3: Folder structure
+1. Create the `methodSharepoint/` folder
+2. Move the files into the folder
+3. Create `__init__.py`
+4. Adjust imports
+
+#### Phase 4: Cleanup
+1. Remove old files
+2. Update tests
+3. Update documentation
+
+## Technical details
+
+### Action registration
+
+**Problem**: Actions must be available as methods of the class so that the `@action` decorator keeps working.
+
+**Solution**: Define actions as standalone functions and bind them to the instance through the function's descriptor method `__get__` (this assumes `@action` returns a plain function):
+
+```python
+# In action module
+@action
+async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult:
+    # Implementation
+    pass
+
+# In main class
+class MethodSharepoint(MethodBase):
+    def __init__(self, services):
+        # ...
+        # Register action as method
+        self.findDocumentPath = findDocumentPath.__get__(self, self.__class__)
+```
+
+**Alternative**: Wrap each action in its own class and expose its `execute` method. Note that `self` inside `execute` is then the `ActionFindDocumentPath` instance rather than the `MethodSharepoint` instance, so helpers are not reachable through `self`:
+
+```python
+# In action module
+class ActionFindDocumentPath:
+    @action
+    async def execute(self, parameters: Dict[str, Any]) -> ActionResult:
+        # Implementation
+        pass
+
+# In main class
+from .actions.findDocumentPath import ActionFindDocumentPath
+
+class MethodSharepoint(MethodBase):
+    def __init__(self, services):
+        # ...
+        self._actionFindDocumentPath = ActionFindDocumentPath()
+        self.findDocumentPath = self._actionFindDocumentPath.execute
+```
+
+**Recommended**: The first approach, standalone functions with a `self` parameter, because `self` is the `MethodSharepoint` instance and gives direct access to the helpers:
+
+```python
+# In action module
+from modules.workflows.methods.methodBase import action
+
+@action
+async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult:
+    """Action implementation"""
+    # self is the MethodSharepoint instance
+    connection = self.connection.getMicrosoftConnection(...)
+    # ...
+```
+
+### Helper access
+
+Helpers are made available as instance variables:
+
+```python
+class MethodSharepoint(MethodBase):
+    def __init__(self, services):
+        super().__init__(services)
+        # ...
+        self.connection = ConnectionHelper(self)
+        self.siteDiscovery = SiteDiscoveryHelper(self)
+        # ...
+```
+
+Actions can then access the helpers:
+
+```python
+@action
+async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult:
+    # Access helpers through self
+    connection = self.connection.getMicrosoftConnection(...)
+    sites = await self.siteDiscovery.discoverSharePointSites()
+    # ...
+```
+
+### methodDiscovery.py compatibility
+
+The import path used by discovery stays the same; the only prerequisite is that `methodDiscovery.py` accepts packages as well as modules:
+
+```python
+# methodDiscovery.py looks for:
+# 1. Modules whose name starts with "method"
+# 2. Classes that inherit from MethodBase
+
+# With the folder structure:
+# methodSharepoint/__init__.py exports MethodSharepoint
+# → importlib.import_module('modules.workflows.methods.methodSharepoint')
+# → finds the MethodSharepoint class
+# → works as before
+```
+
+## Example: methodSharepoint/ structure
+
+### methodSharepoint/methodSharepoint.py (complete)
+
+```python
+# Copyright (c) 2025 Patrick Motsch
+# All rights reserved.
+ +import logging +from modules.workflows.methods.methodBase import MethodBase + +# Import helpers +from .helpers.connection import ConnectionHelper +from .helpers.siteDiscovery import SiteDiscoveryHelper +from .helpers.documentParsing import DocumentParsingHelper +from .helpers.pathProcessing import PathProcessingHelper +from .helpers.apiClient import ApiClientHelper + +# Import actions +from .actions import findDocumentPath +from .actions import readDocuments +from .actions import uploadDocument +from .actions import listDocuments +from .actions import analyzeFolderUsage +from .actions import findSiteByUrl +from .actions import downloadFileByPath +from .actions import copyFile +from .actions import uploadFile + +logger = logging.getLogger(__name__) + +class MethodSharepoint(MethodBase): + """SharePoint operations methods.""" + + def __init__(self, services): + super().__init__(services) + self.name = "sharepoint" + self.description = "SharePoint operations methods" + + # Initialize helper modules + self.connection = ConnectionHelper(self) + self.siteDiscovery = SiteDiscoveryHelper(self) + self.documentParsing = DocumentParsingHelper(self) + self.pathProcessing = PathProcessingHelper(self) + self.apiClient = ApiClientHelper(self) + + # Register actions as methods + # Actions werden als Methoden registriert, damit @action Decorator funktioniert + self.findDocumentPath = findDocumentPath.__get__(self, self.__class__) + self.readDocuments = readDocuments.__get__(self, self.__class__) + self.uploadDocument = uploadDocument.__get__(self, self.__class__) + self.listDocuments = listDocuments.__get__(self, self.__class__) + self.analyzeFolderUsage = analyzeFolderUsage.__get__(self, self.__class__) + self.findSiteByUrl = findSiteByUrl.__get__(self, self.__class__) + self.downloadFileByPath = downloadFileByPath.__get__(self, self.__class__) + self.copyFile = copyFile.__get__(self, self.__class__) + self.uploadFile = uploadFile.__get__(self, self.__class__) +``` + +### 
methodSharepoint/actions/__init__.py + +```python +# Copyright (c) 2025 Patrick Motsch +# All rights reserved. + +# Export all actions +from .findDocumentPath import findDocumentPath +from .readDocuments import readDocuments +from .uploadDocument import uploadDocument +from .listDocuments import listDocuments +from .analyzeFolderUsage import analyzeFolderUsage +from .findSiteByUrl import findSiteByUrl +from .downloadFileByPath import downloadFileByPath +from .copyFile import copyFile +from .uploadFile import uploadFile + +__all__ = [ + 'findDocumentPath', + 'readDocuments', + 'uploadDocument', + 'listDocuments', + 'analyzeFolderUsage', + 'findSiteByUrl', + 'downloadFileByPath', + 'copyFile', + 'uploadFile', +] +``` + +### methodSharepoint/actions/findDocumentPath.py (Beispiel) + +```python +# Copyright (c) 2025 Patrick Motsch +# All rights reserved. + +import logging +import time +import json +from typing import Dict, Any +from modules.workflows.methods.methodBase import action +from modules.datamodels.datamodelChat import ActionResult, ActionDocument + +logger = logging.getLogger(__name__) + +@action +async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult: + """ + GENERAL: + - Purpose: Find documents and folders by name/path across sites. + - Input requirements: connectionReference (required); searchQuery (required); optional site, maxResults. + - Output format: JSON with found items and paths. + + Parameters: + - connectionReference (str, required): Microsoft connection label. + - site (str, optional): Site hint. + - searchQuery (str, required): Search terms or path. + - maxResults (int, optional): Maximum items to return. Default: 1000. 
+ """ + operationId = None + try: + # Init progress logger + workflowId = self.services.workflow.id if self.services.workflow else f"no-workflow-{int(time.time())}" + operationId = f"sharepoint_find_{workflowId}_{int(time.time())}" + + # Start progress tracking + parentOperationId = parameters.get('parentOperationId') + self.services.chat.progressLogStart( + operationId, + "Find Document Path", + "SharePoint Search", + f"Query: {parameters.get('searchQuery', '*')}", + parentOperationId=parentOperationId + ) + + connectionReference = parameters.get("connectionReference") + searchQuery = parameters.get("searchQuery") + siteHint = parameters.get("site") + maxResults = parameters.get("maxResults", 1000) + + if not connectionReference: + if operationId: + self.services.chat.progressLogFinish(operationId, False) + return ActionResult.isFailure(error="Connection reference is required") + + if not searchQuery: + if operationId: + self.services.chat.progressLogFinish(operationId, False) + return ActionResult.isFailure(error="Search query is required") + + # Get Microsoft connection using helper + self.services.chat.progressLogUpdate(operationId, 0.2, "Getting Microsoft connection") + connection = self.connection.getMicrosoftConnection(connectionReference) + if not connection: + if operationId: + self.services.chat.progressLogFinish(operationId, False) + return ActionResult.isFailure(error="No valid Microsoft connection found") + + # Parse search query using helper + self.services.chat.progressLogUpdate(operationId, 0.3, "Parsing search query") + siteHintFromQuery, pathQuery, searchText, searchOptions = self.pathProcessing.parseSearchQuery(searchQuery) + + # Use site hint from parameter or query + finalSiteHint = siteHint or siteHintFromQuery + + # Discover sites using helper + self.services.chat.progressLogUpdate(operationId, 0.4, "Discovering SharePoint sites") + allSites = await self.siteDiscovery.discoverSharePointSites() + + # Filter sites by hint if provided + if 
finalSiteHint: + allSites = self.siteDiscovery.filterSitesByHint(allSites, finalSiteHint) + + # ... rest of implementation ... + + # Generate result + workflowContext = self.services.chat.getWorkflowContext() if hasattr(self.services, 'chat') else None + filename = self._generateMeaningfulFileName( + "sharepoint_find_result", + "json", + workflowContext, + "findDocumentPath" + ) + + result = { + "foundDocuments": foundDocuments, + "sites": sites, + "totalCount": len(foundDocuments) + } + + validationMetadata = self._createValidationMetadata( + "findDocumentPath", + connectionReference=connectionReference, + searchQuery=searchQuery, + siteHint=finalSiteHint, + resultCount=len(foundDocuments) + ) + + document = ActionDocument( + documentName=filename, + documentData=json.dumps(result, indent=2), + mimeType="application/json", + validationMetadata=validationMetadata + ) + + self.services.chat.progressLogFinish(operationId, True) + return ActionResult.isSuccess(documents=[document]) + + except Exception as e: + errorMsg = f"Error finding document path: {str(e)}" + logger.error(errorMsg) + if operationId: + self.services.chat.progressLogFinish(operationId, False) + return ActionResult.isFailure(error=errorMsg) +``` + +## Helper-Module Struktur + +### methodSharepoint/helpers/connection.py + +```python +# Copyright (c) 2025 Patrick Motsch +# All rights reserved. + +import logging +from typing import Dict, Any, Optional + +logger = logging.getLogger(__name__) + +class ConnectionHelper: + """Helper for Microsoft connection management in SharePoint operations""" + + def __init__(self, methodInstance): + """ + Initialize connection helper. 
+ + Args: + methodInstance: Instance of MethodSharepoint (for access to services) + """ + self.method = methodInstance + self.services = methodInstance.services + + def getMicrosoftConnection(self, connectionReference: str) -> Optional[Dict[str, Any]]: + """ + Get Microsoft connection from connection reference and configure SharePoint service. + + Args: + connectionReference: Connection reference string + + Returns: + Dict with connection info or None if failed + """ + try: + userConnection = self.services.chat.getUserConnectionFromConnectionReference(connectionReference) + if not userConnection: + logger.warning(f"No user connection found for reference: {connectionReference}") + return None + + if userConnection.authority.value != "msft": + logger.warning(f"Connection {userConnection.id} is not Microsoft (authority: {userConnection.authority.value})") + return None + + # Check if connection is active or pending + if userConnection.status.value not in ["active", "pending"]: + logger.warning(f"Connection {userConnection.id} status is not active/pending: {userConnection.status.value}") + return None + + # Configure SharePoint service + if not self.services.sharepoint.setAccessTokenFromConnection(userConnection): + logger.warning(f"Failed to configure SharePoint service with connection {userConnection.id}") + return None + + logger.info(f"Successfully configured SharePoint service with Microsoft connection: {userConnection.id}") + + return { + "id": userConnection.id, + "userConnection": userConnection, + "scopes": ["Sites.ReadWrite.All", "Files.ReadWrite.All", "User.Read"] + } + except Exception as e: + logger.error(f"Error getting Microsoft connection: {str(e)}") + return None +``` + +### methodSharepoint/helpers/siteDiscovery.py + +```python +# Copyright (c) 2025 Patrick Motsch +# All rights reserved. 
+ +import logging +from typing import Dict, Any, List, Optional + +logger = logging.getLogger(__name__) + +class SiteDiscoveryHelper: + """Helper for SharePoint site discovery and resolution""" + + def __init__(self, methodInstance): + self.method = methodInstance + self.services = methodInstance.services + + async def discoverSharePointSites(self, limit: Optional[int] = None) -> List[Dict[str, Any]]: + """ + Discover SharePoint sites accessible to the user via Microsoft Graph API. + + Args: + limit: Optional limit on number of sites to return + + Returns: + List of site information dictionaries + """ + try: + endpoint = "sites?search=*" + if limit: + endpoint += f"&$top={limit}" + + result = await self.method.apiClient.makeGraphApiCall(endpoint) + + if "error" in result: + logger.error(f"Error discovering SharePoint sites: {result['error']}") + return [] + + sites = result.get("value", []) + if limit: + sites = sites[:limit] + + logger.info(f"Discovered {len(sites)} SharePoint sites" + (f" (limited to {limit})" if limit else "")) + + # Process and return site information + processedSites = [] + for site in sites: + siteInfo = { + "id": site.get("id"), + "displayName": site.get("displayName"), + "webUrl": site.get("webUrl"), + "name": site.get("name"), + "description": site.get("description") + } + processedSites.append(siteInfo) + + return processedSites + except Exception as e: + logger.error(f"Error discovering SharePoint sites: {str(e)}") + return [] + + def filterSitesByHint(self, sites: List[Dict[str, Any]], siteHint: str) -> List[Dict[str, Any]]: + """ + Filter sites by hint (name, displayName, or webUrl contains hint). 
+ + Args: + sites: List of site dictionaries + siteHint: Hint string to match against + + Returns: + Filtered list of sites + """ + if not siteHint: + return sites + + hintLower = siteHint.lower() + filtered = [] + + for site in sites: + displayName = site.get("displayName", "").lower() + name = site.get("name", "").lower() + webUrl = site.get("webUrl", "").lower() + + if hintLower in displayName or hintLower in name or hintLower in webUrl: + filtered.append(site) + + return filtered + + async def getSiteByStandardPath(self, sitePath: str) -> Optional[Dict[str, Any]]: + """ + Get site by standard path format (hostname/path or /sites/path). + + Args: + sitePath: Site path string + + Returns: + Site dictionary or None if not found + """ + # Implementation... + pass + + async def getSiteId(self, hostname: str, sitePath: str) -> str: + """ + Get site ID from hostname and site path. + + Args: + hostname: SharePoint hostname + sitePath: Site path + + Returns: + Site ID string + """ + # Implementation... + pass + + async def resolveSitesFromPathQuery(self, pathQuery: str) -> tuple[List[Dict[str, Any]], Optional[str]]: + """ + Resolve sites from pathQuery using SharePoint service helper methods. + + Args: + pathQuery: Path query string + + Returns: + Tuple of (sites list, error message) + """ + try: + isValid, errorMsg = self.services.sharepoint.validatePathQuery(pathQuery) + if not isValid: + return [], errorMsg + + sites = await self.services.sharepoint.resolveSitesFromPathQuery(pathQuery) + if not sites: + return [], "No SharePoint sites found or accessible" + + return sites, None + except Exception as e: + logger.error(f"Error resolving sites from pathQuery '{pathQuery}': {str(e)}") + return [], f"Error resolving sites from pathQuery: {str(e)}" +``` + +## Migration-Plan + +### Schritt 1: Helper-Module erstellen (Woche 1) + +1. 
**Identify helper categories**:
+   - Connection & Authentication
+   - Site Discovery & Resolution
+   - Document Parsing
+   - Path & Query Processing
+   - API Communication
+
+2. **Create helper classes**:
+   - `methodSharepoint/helpers/connection.py`
+   - `methodSharepoint/helpers/siteDiscovery.py`
+   - `methodSharepoint/helpers/documentParsing.py`
+   - `methodSharepoint/helpers/pathProcessing.py`
+   - `methodSharepoint/helpers/apiClient.py`
+
+3. **Migrate helper functions**:
+   - Move functions from `methodSharepoint.py` into the matching helper classes
+   - Inside helpers, replace references to the method instance (`self`) with `self.method`
+   - Write tests
+
+### Step 2: Extract actions (weeks 2-3)
+
+1. **Create action modules**:
+   - `methodSharepoint/actions/findDocumentPath.py`
+   - `methodSharepoint/actions/readDocuments.py`
+   - `methodSharepoint/actions/uploadDocument.py`
+   - etc.
+
+2. **Migrate actions**:
+   - Move each action function into a separate module
+   - Switch helper access to `self.helperName`
+   - Write tests
+
+3. **Adapt the main class**:
+   - Initialize helpers
+   - Register actions
+   - Remove the old implementation
+
+### Step 3: Folder structure (week 4)
+
+1. **Create folders**:
+   - Create the `methodSharepoint/` folder
+   - Create the `helpers/` and `actions/` subfolders
+
+2. **Move files**:
+   - Helper modules into `helpers/`
+   - Action modules into `actions/`
+   - The main class stays in `methodSharepoint/`
+
+3. **Adjust imports**:
+   - Create `__init__.py` files
+   - Adjust relative imports
+
+### Step 4: Testing & cleanup (week 5)
+
+1. **Tests**:
+   - Unit tests for helper modules
+   - Integration tests for actions
+   - End-to-end tests for workflows
+
+2. **Documentation**:
+   - README for methodSharepoint/
+   - Helper documentation
+   - Action documentation
+
+3. 
**Cleanup**:
+   - Remove the old `methodSharepoint.py`
+   - Remove unused imports
+   - Code review
+
+## Benefits of the New Structure
+
+### Before (2840 Lines in One File)
+```
+methodSharepoint.py
+├── 16 helper functions (scattered)
+├── 9 actions (200-500 lines each)
+└── Hard to navigate, hard to test
+```
+
+### After (Split Up)
+```
+methodSharepoint/
+├── methodSharepoint.py          (~100 lines)
+├── helpers/
+│   ├── connection.py            (~100 lines)
+│   ├── siteDiscovery.py         (~200 lines)
+│   ├── documentParsing.py       (~150 lines)
+│   ├── pathProcessing.py        (~200 lines)
+│   └── apiClient.py             (~100 lines)
+└── actions/
+    ├── findDocumentPath.py      (~300 lines)
+    ├── readDocuments.py         (~200 lines)
+    ├── uploadDocument.py        (~200 lines)
+    └── ... (further actions, ~100-300 lines each)
+```
+
+**Benefits**:
+- ✅ Every file at most ~300 lines (most under 200)
+- ✅ Clear separation of concerns
+- ✅ Easy to test
+- ✅ Parallel development possible
+- ✅ Reusable helper modules
+
+## Compatibility
+
+### methodDiscovery.py
+
+The existing discovery logic continues to work:
+
+```python
+# methodDiscovery.py looks for:
+for _, name, isPkg in pkgutil.iter_modules(methodsPackage.__path__):
+    if not isPkg and name.startswith('method'):
+        # Imports: modules.workflows.methods.methodSharepoint
+        # → methodSharepoint/__init__.py is loaded
+        # → MethodSharepoint class is found
+        # → Works as before!
+```
+
+**Adjustment needed**: `methodDiscovery.py` must also recognize packages (folders):
+
+```python
+# In methodDiscovery.py - discoverMethods() function
+for _, name, isPkg in pkgutil.iter_modules(methodsPackage.__path__):
+    if name.startswith('method'):
+        try:
+            # import_module treats both cases the same way:
+            # a package (folder) loads its __init__.py,
+            # a module (file) loads as before (backward compatibility)
+            module = importlib.import_module(f'modules.workflows.methods.{name}')
+
+            # Find all classes in the module that inherit from MethodBase
+            for itemName, item in inspect.getmembers(module):
+                if (inspect.isclass(item) and
+                    issubclass(item, MethodBase) and
+                    item is not MethodBase):
+                    # ... rest of discovery logic ...
+                    pass
+        except Exception as e:
+            logger.error(f"Error discovering method {name}: {str(e)}")
+            continue
+```
+
+**Important**: This change is backward compatible; existing method files continue to work.
+
+## Structure Details per Method
+
+### methodSharepoint (9 Actions, ~16 Helper Functions)
+
+**Helper categories**:
+- `connection.py` - Microsoft Connection Handling
+- `siteDiscovery.py` - SharePoint Site Discovery & Resolution
+- `documentParsing.py` - Document List Parsing
+- `pathProcessing.py` - Path & Query Processing
+- `apiClient.py` - Microsoft Graph API Calls
+
+**Actions**: findDocumentPath, readDocuments, uploadDocument, listDocuments, analyzeFolderUsage, findSiteByUrl, downloadFileByPath, copyFile, uploadFile
+
+### methodOutlook (4 Actions, ~8 Helper Functions)
+
+**Helper categories**:
+- `connection.py` - Microsoft Connection Handling
+- `emailProcessing.py` - Email Search, Filtering, Processing
+- `folderManagement.py` - Folder Operations
+
+**Actions**: readEmails, searchEmails, composeAndDraftEmailWithContext, sendDraftEmail
+
+### methodJira (8 Actions, ~4 Helper Functions)
+
+**Helper categories**:
+- `connection.py` - JIRA Connection Handling
+- `adfConverter.py` - 
Atlassian Document Format to Text Conversion
+- `documentParsing.py` - Document Reference Parsing
+
+**Actions**: connectJira, exportTicketsAsJson, importTicketsFromJson, mergeTicketData, parseCsvContent, parseExcelContent, createCsvContent, createExcelContent
+
+### methodAi (8 Actions, ~1 Helper Function)
+
+**Helper categories**:
+- `csvProcessing.py` - CSV Options Processing
+
+**Actions**: process, webResearch, summarizeDocument, translateDocument, convert, convertDocument, extractData, generateDocument
+
+### methodContext (3 Actions, ~3 Helper Functions)
+
+**Helper categories**:
+- `documentIndex.py` - Document Index Parsing
+- `formatting.py` - Markdown/Text Formatting
+
+**Actions**: getDocumentIndex, extractContent, triggerPreprocessingServer
+
+## Migration Plan for All Methods
+
+### Phase 1: methodSharepoint (Pilot) - Weeks 1-2
+1. Create and migrate helper modules
+2. Extract actions
+3. Create folder structure
+4. Write tests
+5. Documentation
+
+### Phase 2: methodOutlook - Weeks 3-4
+1. Same structure as methodSharepoint
+2. Create helper modules
+3. Extract actions
+4. Write tests
+
+### Phase 3: methodJira - Weeks 5-6
+1. Create helper modules
+2. Extract actions
+3. Write tests
+
+### Phase 4: methodAi - Week 7
+1. Create helper modules (minimal)
+2. Extract actions
+3. Write tests
+
+### Phase 5: methodContext - Week 8
+1. Create helper modules
+2. Extract actions
+3. Write tests
+
+### Phase 6: Cleanup & Documentation - Week 9
+1. Remove old files
+2. Adapt methodDiscovery.py (package support)
+3. Update the overall documentation
+4. 
Code review
+
+## Shared Helper Functions
+
+### Analysis: Duplicated Helper Functions
+
+**Duplications found**:
+- `_format_timestamp_for_filename()`: duplicated in `methodOutlook`, `methodSharepoint`, `methodAi`
+- `_getMicrosoftConnection()`: duplicated in `methodOutlook` and `methodSharepoint` (but implemented differently)
+- `_createValidationMetadata()`: already in `MethodBase` (good!)
+- `_generateMeaningfulFileName()`: already in `MethodBase` (good!)
+
+### Solution: Shared Helper Modules
+
+**Structure for shared helpers**:
+```
+gateway/modules/workflows/methods/
+├── methodBase.py
+├── shared/                 # NEW: shared helpers for all methods
+│   ├── __init__.py
+│   ├── connection.py       # Microsoft Connection Helper (if identical)
+│   └── utils.py            # Shared utilities (_format_timestamp, etc.)
+├── methodSharepoint/
+│   └── ...
+└── methodOutlook/
+    └── ...
+```
+
+**Option 1: Shared helpers in `shared/`**
+- If the implementation is identical → shared module
+- If it differs → method-specific helper
+
+**Option 2: Helpers in MethodBase**
+- For truly universal helpers (such as `_createValidationMetadata`)
+- Not for method-specific logic
+
+**Recommendation**:
+- `_format_timestamp_for_filename()` → move into `MethodBase` (the implementations are identical)
+- `_getMicrosoftConnection()` → keep method-specific (the implementations differ)
+
+## Open Questions
+
+1. **Action registration**: Should the descriptor method (`__get__`) be used, or a different solution?
+   - **Recommendation**: The descriptor method is the cleanest option and works with the `@action` decorator
+
+2. **Helper sharing**: Should helpers be shared between methods?
+   - **Recommendation**: Only if the implementation is identical. Otherwise keep them method-specific.
+
+3. **Testing**: How should helper modules be tested?
+   - **Recommendation**: Unit tests with a mock `methodInstance` for helper modules, integration tests for actions
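
The testing recommendation above can be sketched roughly as follows. This is a minimal illustration, not the project's actual test setup: the class name `SiteDiscoveryHelper`, its constructor signature, and the factory `makeMockMethod` are hypothetical, and the helper is assumed to reach services through the method instance it receives. Only `filterSitesByHint` and `resolveSitesFromPathQuery` mirror the helper code shown earlier in this document (the `try`/`except` logging is left out for brevity).

```python
import asyncio
from typing import Any, Dict, List, Optional, Tuple
from unittest.mock import AsyncMock, MagicMock


class SiteDiscoveryHelper:
    """Hypothetical stand-in for a helper class in helpers/siteDiscovery.py."""

    def __init__(self, methodInstance: Any) -> None:
        # Assumption: the helper keeps a reference to the owning method
        # and accesses services through it.
        self.method = methodInstance
        self.services = methodInstance.services

    def filterSitesByHint(self, sites: List[Dict[str, Any]], siteHint: str) -> List[Dict[str, Any]]:
        # Case-insensitive substring match on displayName, name and webUrl.
        if not siteHint:
            return sites
        hintLower = siteHint.lower()
        return [
            site for site in sites
            if any(hintLower in site.get(key, "").lower()
                   for key in ("displayName", "name", "webUrl"))
        ]

    async def resolveSitesFromPathQuery(self, pathQuery: str) -> Tuple[List[Dict[str, Any]], Optional[str]]:
        # Validate first, then delegate to the SharePoint service.
        isValid, errorMsg = self.services.sharepoint.validatePathQuery(pathQuery)
        if not isValid:
            return [], errorMsg
        sites = await self.services.sharepoint.resolveSitesFromPathQuery(pathQuery)
        if not sites:
            return [], "No SharePoint sites found or accessible"
        return sites, None


def makeMockMethod(sites: List[Dict[str, Any]], isValid: bool = True) -> MagicMock:
    """Build a mock method instance whose SharePoint service returns canned data."""
    method = MagicMock()
    sharepoint = method.services.sharepoint
    sharepoint.validatePathQuery.return_value = (isValid, None if isValid else "Invalid pathQuery")
    sharepoint.resolveSitesFromPathQuery = AsyncMock(return_value=sites)
    return method


def runHelperTests() -> None:
    sites = [
        {"displayName": "Finance Team", "name": "finance", "webUrl": "https://example.invalid/sites/finance"},
        {"displayName": "HR", "name": "hr", "webUrl": "https://example.invalid/sites/hr"},
    ]

    # Pure filtering logic: no service access needed.
    helper = SiteDiscoveryHelper(makeMockMethod(sites))
    assert helper.filterSitesByHint(sites, "FIN") == [sites[0]]
    assert helper.filterSitesByHint(sites, "") == sites

    # Happy path: the mocked service returns the sites.
    assert asyncio.run(helper.resolveSitesFromPathQuery("/sites/finance")) == (sites, None)

    # Invalid pathQuery: the service must not be queried at all.
    invalidMethod = makeMockMethod(sites, isValid=False)
    invalidHelper = SiteDiscoveryHelper(invalidMethod)
    assert asyncio.run(invalidHelper.resolveSitesFromPathQuery("???")) == ([], "Invalid pathQuery")
    invalidMethod.services.sharepoint.resolveSitesFromPathQuery.assert_not_awaited()

    # No sites found: the helper reports the standard error message.
    emptyHelper = SiteDiscoveryHelper(makeMockMethod([]))
    assert asyncio.run(emptyHelper.resolveSitesFromPathQuery("/sites/none")) == (
        [], "No SharePoint sites found or accessible")


if __name__ == "__main__":
    runHelperTests()
```

Because the helper only talks to `methodInstance.services`, a `MagicMock` with an `AsyncMock` for the awaited call is enough to exercise it without a Microsoft connection. Integration tests for actions would still need a real or recorded service behind them.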