# Method Files Refactoring Concept

## Overview

This document describes the **standard refactoring concept** for all method files in the `gateway/modules/workflows/methods/` directory.

**Goal**: All methods are reorganized into the **same folder-based structure** in order to:
- improve maintainability
- enable parallel development
- increase testability
- ensure scalability

**Standard structure**: Each method is split into its own folder with `helpers/` and `actions/` subfolders.

**Affected methods**:
- `methodSharepoint.py` (2840 lines → folder structure)
- `methodOutlook.py` (1905 lines → folder structure)
- `methodJira.py` (1102 lines → folder structure)
- `methodAi.py` (743 lines → folder structure)
- `methodContext.py` (461 lines → folder structure)

## Problem Statement

The method files have grown very long:
- `methodSharepoint.py`: **2840 lines** (9 actions, ~16 helper functions)
- `methodOutlook.py`: **1905 lines** (4 actions, ~8 helper functions)
- `methodJira.py`: **1102 lines** (8 actions)
- `methodAi.py`: **743 lines** (8 actions)

**Problems**:
- Hard to maintain and navigate
- High complexity per file
- Actions are largely independent of each other but share helper functions
- Difficult for several developers to work in parallel

## Analysis of the Current Structure

### methodSharepoint.py structure

```
methodSharepoint.py (2840 lines)
├── __init__() - initialization
├── Helper functions (16):
│   ├── _format_timestamp_for_filename()
│   ├── _getMicrosoftConnection()
│   ├── _discoverSharePointSites()
│   ├── _extractHostnameFromWebUrl()
│   ├── _extractSiteFromStandardPath()
│   ├── _getSiteByStandardPath()
│   ├── _filterSitesByHint()
│   ├── _parseSearchQuery()
│   ├── _resolvePathQuery()
│   ├── _parseSiteUrl()
│   ├── _cleanSearchQuery()
│   ├── _makeGraphApiCall()
│   ├── _getSiteId()
│   ├── _parseDocumentListForFoundDocuments()
│   ├── _resolveSitesFromPathQuery()
│   └── _parseDocumentListForFolder() (presumably)
├── Actions (9):
│   ├── findDocumentPath() (~480 lines)
│   ├── readDocuments() (~275 lines)
│   ├── uploadDocument() (~270 lines)
│   ├── listDocuments() (~270 lines)
│   ├── analyzeFolderUsage() (~320 lines)
│   ├── findSiteByUrl() (~70 lines)
│   ├── downloadFileByPath() (~100 lines)
│   ├── copyFile() (~150 lines)
│   └── uploadFile() (~130 lines)
```

### Helper function categorization

**Connection & Authentication**:
- `_getMicrosoftConnection()` - used by ALL actions

**Site Discovery & Resolution**:
- `_discoverSharePointSites()` - used by several actions
- `_getSiteByStandardPath()` - used by several actions
- `_filterSitesByHint()` - used by several actions
- `_resolveSitesFromPathQuery()` - used by several actions
- `_getSiteId()` - used by several actions

**Document Parsing**:
- `_parseDocumentListForFoundDocuments()` - used by readDocuments, uploadDocument
- `_parseDocumentListForFolder()` - used by uploadDocument, listDocuments

**Path & Query Processing**:
- `_parseSearchQuery()` - used by findDocumentPath
- `_resolvePathQuery()` - used by several actions
- `_extractSiteFromStandardPath()` - used by several actions
- `_extractHostnameFromWebUrl()` - used by several actions
- `_parseSiteUrl()` - used by several actions
- `_cleanSearchQuery()` - used by findDocumentPath

**API Communication**:
- `_makeGraphApiCall()` - used by several actions

**Utilities**:
- `_format_timestamp_for_filename()` - used by several actions

## Refactoring Concept: Folder-Based Structure

**Standard structure for all methods**: Each method is split into its own folder with the following layout:
- `methodName/` - main folder
  - `__init__.py` - exports the method class
  - `methodName.py` - main class (minimal, ~50-150 lines)
  - `helpers/` - helper modules grouped by
    functionality
  - `actions/` - each action in its own module

### Complete structure for all methods

```
gateway/modules/workflows/methods/
├── methodBase.py (unchanged)
│
├── methodSharepoint/
│   ├── __init__.py
│   ├── methodSharepoint.py
│   ├── helpers/
│   │   ├── __init__.py
│   │   ├── connection.py
│   │   ├── siteDiscovery.py
│   │   ├── documentParsing.py
│   │   ├── pathProcessing.py
│   │   └── apiClient.py
│   └── actions/
│       ├── __init__.py
│       ├── findDocumentPath.py
│       ├── readDocuments.py
│       ├── uploadDocument.py
│       ├── listDocuments.py
│       ├── analyzeFolderUsage.py
│       ├── findSiteByUrl.py
│       ├── downloadFileByPath.py
│       ├── copyFile.py
│       └── uploadFile.py
│
├── methodOutlook/
│   ├── __init__.py
│   ├── methodOutlook.py
│   ├── helpers/
│   │   ├── __init__.py
│   │   ├── connection.py
│   │   ├── emailProcessing.py
│   │   └── folderManagement.py
│   └── actions/
│       ├── __init__.py
│       ├── readEmails.py
│       ├── searchEmails.py
│       ├── composeAndDraftEmailWithContext.py
│       └── sendDraftEmail.py
│
├── methodJira/
│   ├── __init__.py
│   ├── methodJira.py
│   ├── helpers/
│   │   ├── __init__.py
│   │   ├── connection.py
│   │   ├── adfConverter.py (ADF to Text)
│   │   └── documentParsing.py
│   └── actions/
│       ├── __init__.py
│       ├── connectJira.py
│       ├── exportTicketsAsJson.py
│       ├── importTicketsFromJson.py
│       ├── mergeTicketData.py
│       ├── parseCsvContent.py
│       ├── parseExcelContent.py
│       ├── createCsvContent.py
│       └── createExcelContent.py
│
├── methodAi/
│   ├── __init__.py
│   ├── methodAi.py
│   ├── helpers/
│   │   ├── __init__.py
│   │   └── csvProcessing.py
│   └── actions/
│       ├── __init__.py
│       ├── process.py
│       ├── webResearch.py
│       ├── summarizeDocument.py
│       ├── translateDocument.py
│       ├── convert.py
│       ├── convertDocument.py
│       ├── extractData.py
│       └── generateDocument.py
│
└── methodContext/
    ├── __init__.py
    ├── methodContext.py
    ├── helpers/
    │   ├── __init__.py
    │   ├── documentIndex.py
    │   └── formatting.py
    └── actions/
        ├── __init__.py
        ├── getDocumentIndex.py
        ├── extractContent.py
        └── triggerPreprocessingServer.py
```

### Detailed structure: methodSharepoint/

#### methodSharepoint/__init__.py

```python
# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

from .methodSharepoint import MethodSharepoint

__all__ = ['MethodSharepoint']
```

#### methodSharepoint/methodSharepoint.py (main class)

```python
# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
from typing import Dict, Any

from modules.workflows.methods.methodBase import MethodBase

# Import helpers
from .helpers.connection import ConnectionHelper
from .helpers.siteDiscovery import SiteDiscoveryHelper
from .helpers.documentParsing import DocumentParsingHelper
from .helpers.pathProcessing import PathProcessingHelper
from .helpers.apiClient import ApiClientHelper

# Import actions
from .actions.findDocumentPath import findDocumentPath
from .actions.readDocuments import readDocuments
from .actions.uploadDocument import uploadDocument
from .actions.listDocuments import listDocuments
from .actions.analyzeFolderUsage import analyzeFolderUsage
from .actions.findSiteByUrl import findSiteByUrl
from .actions.downloadFileByPath import downloadFileByPath
from .actions.copyFile import copyFile
from .actions.uploadFile import uploadFile

logger = logging.getLogger(__name__)


class MethodSharepoint(MethodBase):
    """SharePoint operations methods."""

    def __init__(self, services):
        super().__init__(services)
        self.name = "sharepoint"
        self.description = "SharePoint operations methods"

        # Initialize helper modules
        self.connection = ConnectionHelper(self)
        self.siteDiscovery = SiteDiscoveryHelper(self)
        self.documentParsing = DocumentParsingHelper(self)
        self.pathProcessing = PathProcessingHelper(self)
        self.apiClient = ApiClientHelper(self)

        # Register actions (bind each standalone function to this instance)
        self.findDocumentPath = findDocumentPath.__get__(self, self.__class__)
        self.readDocuments = readDocuments.__get__(self, self.__class__)
        self.uploadDocument = uploadDocument.__get__(self, self.__class__)
        self.listDocuments = listDocuments.__get__(self, self.__class__)
        self.analyzeFolderUsage = analyzeFolderUsage.__get__(self, self.__class__)
        self.findSiteByUrl = findSiteByUrl.__get__(self, self.__class__)
        self.downloadFileByPath = downloadFileByPath.__get__(self, self.__class__)
        self.copyFile = copyFile.__get__(self, self.__class__)
        self.uploadFile = uploadFile.__get__(self, self.__class__)
```

#### methodSharepoint/helpers/connection.py

```python
# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
from typing import Dict, Any, Optional

logger = logging.getLogger(__name__)


class ConnectionHelper:
    """Helper for Microsoft connection management"""

    def __init__(self, methodInstance):
        self.method = methodInstance
        self.services = methodInstance.services

    def getMicrosoftConnection(self, connectionReference: str) -> Optional[Dict[str, Any]]:
        """Get Microsoft connection from connection reference and configure SharePoint service"""
        try:
            userConnection = self.services.chat.getUserConnectionFromConnectionReference(connectionReference)
            if not userConnection:
                logger.warning(f"No user connection found for reference: {connectionReference}")
                return None

            if userConnection.authority.value != "msft":
                logger.warning(f"Connection {userConnection.id} is not Microsoft (authority: {userConnection.authority.value})")
                return None

            # Check if connection is active or pending
            if userConnection.status.value not in ["active", "pending"]:
                logger.warning(f"Connection {userConnection.id} status is not active/pending: {userConnection.status.value}")
                return None

            # Configure SharePoint service
            if not self.services.sharepoint.setAccessTokenFromConnection(userConnection):
                logger.warning(f"Failed to configure SharePoint service with connection {userConnection.id}")
                return None

            logger.info(f"Successfully configured SharePoint service with Microsoft connection: {userConnection.id}")

            return {
                "id": userConnection.id,
                "userConnection": userConnection,
                "scopes": ["Sites.ReadWrite.All", "Files.ReadWrite.All", "User.Read"]
            }
        except Exception as e:
            logger.error(f"Error getting Microsoft connection: {str(e)}")
            return None
```

#### methodSharepoint/helpers/siteDiscovery.py

```python
# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
from typing import Dict, Any, List, Optional

logger = logging.getLogger(__name__)


class SiteDiscoveryHelper:
    """Helper for SharePoint site discovery and resolution"""

    def __init__(self, methodInstance):
        self.method = methodInstance
        self.services = methodInstance.services

    async def discoverSharePointSites(self, limit: Optional[int] = None) -> List[Dict[str, Any]]:
        """Discover SharePoint sites accessible to the user via Microsoft Graph API"""
        # ... Implementation ...
        pass

    def filterSitesByHint(self, sites: List[Dict[str, Any]], siteHint: str) -> List[Dict[str, Any]]:
        """Filter sites by hint"""
        # ... Implementation ...
        pass

    async def getSiteByStandardPath(self, sitePath: str) -> Optional[Dict[str, Any]]:
        """Get site by standard path"""
        # ... Implementation ...
        pass

    async def getSiteId(self, hostname: str, sitePath: str) -> str:
        """Get site ID from hostname and path"""
        # ... Implementation ...
        pass

    async def resolveSitesFromPathQuery(self, pathQuery: str) -> tuple[List[Dict[str, Any]], Optional[str]]:
        """Resolve sites from pathQuery"""
        # ... Implementation ...
        pass
```

#### methodSharepoint/actions/findDocumentPath.py

```python
# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
import time
from typing import Dict, Any

from modules.workflows.methods.methodBase import action
from modules.datamodels.datamodelChat import ActionResult, ActionDocument

logger = logging.getLogger(__name__)


@action
async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult:
    """
    GENERAL:
    - Purpose: Find documents and folders by name/path across sites.
    - Input requirements: connectionReference (required); searchQuery (required); optional site, maxResults.
    - Output format: JSON with found items and paths.
    Parameters:
    - connectionReference (str, required): Microsoft connection label.
    - site (str, optional): Site hint.
    - searchQuery (str, required): Search terms or path.
    - maxResults (int, optional): Maximum items to return. Default: 1000.
    """
    operationId = None
    try:
        # Init progress logger
        workflowId = self.services.workflow.id if self.services.workflow else f"no-workflow-{int(time.time())}"
        operationId = f"sharepoint_find_{workflowId}_{int(time.time())}"

        # Start progress tracking
        parentOperationId = parameters.get('parentOperationId')
        self.services.chat.progressLogStart(
            operationId,
            "Find Document Path",
            "SharePoint Search",
            f"Query: {parameters.get('searchQuery', '*')}",
            parentOperationId=parentOperationId
        )

        connectionReference = parameters.get("connectionReference")
        searchQuery = parameters.get("searchQuery")
        siteHint = parameters.get("site")
        maxResults = parameters.get("maxResults", 1000)

        if not connectionReference:
            if operationId:
                self.services.chat.progressLogFinish(operationId, False)
            return ActionResult.isFailure(error="Connection reference is required")

        if not searchQuery:
            if operationId:
                self.services.chat.progressLogFinish(operationId, False)
            return ActionResult.isFailure(error="Search query is required")

        # Get Microsoft connection
        self.services.chat.progressLogUpdate(operationId, 0.2, "Getting Microsoft connection")
        connection = self.connection.getMicrosoftConnection(connectionReference)
        if not connection:
            if operationId:
                self.services.chat.progressLogFinish(operationId, False)
            return ActionResult.isFailure(error="No valid Microsoft connection found")

        # Parse search query
        self.services.chat.progressLogUpdate(operationId, 0.3, "Parsing search query")
        siteHintFromQuery, pathQuery, searchText, searchOptions = self.pathProcessing.parseSearchQuery(searchQuery)

        # Use site hint from parameter or query
        finalSiteHint = siteHint or siteHintFromQuery

        # Discover sites
        self.services.chat.progressLogUpdate(operationId, 0.4, "Discovering SharePoint sites")
        allSites = await self.siteDiscovery.discoverSharePointSites()

        # Filter sites by hint if provided
        if finalSiteHint:
            allSites = self.siteDiscovery.filterSitesByHint(allSites, finalSiteHint)

        # ... rest of implementation using helpers ...

        # Return result
        return ActionResult.isSuccess(documents=[...])

    except Exception as e:
        errorMsg = f"Error finding document path: {str(e)}"
        logger.error(errorMsg)
        if operationId:
            self.services.chat.progressLogFinish(operationId, False)
        return ActionResult.isFailure(error=errorMsg)
```

## Advantages of the Folder-Based Structure

### Advantages

1. **Clear organization**:
   - Helper functions are grouped by functionality
   - Actions are isolated in their own modules
   - Easy to navigate

2. **Maintainability**:
   - Each action lives in its own module (~100-500 lines)
   - Helper modules focus on specific tasks
   - Easier to test

3. **Scalability**:
   - New actions are easy to add
   - Helper functions are reusable
   - Parallel development becomes possible

4. **Compatibility**:
   - `methodDiscovery.py` still finds the `MethodSharepoint` class once it also scans packages
   - Only a small, backward-compatible change to the discovery logic is needed (see the compatibility section)
   - Actions are still recognized through the `@action` decorator

### Migration strategy

#### Phase 1: Create helper modules

1. Move helper functions into separate modules
2. Create helper classes (ConnectionHelper, SiteDiscoveryHelper, etc.)
3. Adapt the main class to use the helpers
4. Write tests

#### Phase 2: Extract actions

1. Move one action after another into its own module
2. Define each action as a standalone function
3. Register it as a method in the main class
4. Write tests for each action

#### Phase 3: Folder structure

1. Create the `methodSharepoint/` folder
2. Move the files into the folder
3. Create `__init__.py`
4. Adjust imports

#### Phase 4: Cleanup

1. Remove old files
2. Update tests
3. Update documentation

## Technical Details

### Action registration

**Problem**: Actions must be available as methods on the method class so that the `@action` decorator keeps working.

**Solution**: Define actions as standalone functions and bind them to the instance through the descriptor protocol (`__get__`):

```python
# In action module
@action
async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult:
    # Implementation
    pass


# In main class
class MethodSharepoint(MethodBase):
    def __init__(self, services):
        # ...
        # Register action as method
        self.findDocumentPath = findDocumentPath.__get__(self, self.__class__)
```

**Alternative**: Wrap each action in its own class and expose its `execute` method. Note that `self` inside `execute` is then the action object, not the `MethodSharepoint` instance, so helpers would have to be passed in separately:

```python
# In action module
class ActionFindDocumentPath:
    @action
    async def execute(self, parameters: Dict[str, Any]) -> ActionResult:
        # Implementation
        pass


# In main class
from .actions.findDocumentPath import ActionFindDocumentPath

class MethodSharepoint(MethodBase):
    def __init__(self, services):
        # ...
        self._actionFindDocumentPath = ActionFindDocumentPath()
        self.findDocumentPath = self._actionFindDocumentPath.execute
```

**Recommended solution**: Actions as standalone functions with a `self` parameter:

```python
# In action module
from modules.workflows.methods.methodBase import action

@action
async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult:
    """Action implementation"""
    # self is the MethodSharepoint instance
    connection = self.connection.getMicrosoftConnection(...)
    # ...
```

### Helper access

Helpers are exposed as instance attributes:

```python
class MethodSharepoint(MethodBase):
    def __init__(self, services):
        super().__init__(services)
        # ...
        self.connection = ConnectionHelper(self)
        self.siteDiscovery = SiteDiscoveryHelper(self)
        # ...
```

Actions can then reach the helpers through `self`:

```python
@action
async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult:
    # Access helpers via self
    connection = self.connection.getMicrosoftConnection(...)
    sites = await self.siteDiscovery.discoverSharePointSites()
    # ...
```

### methodDiscovery.py compatibility

The import path and the class lookup stay the same with the folder structure; only the package check in `methodDiscovery.py` has to be relaxed (see the compatibility section):

```python
# methodDiscovery.py looks for:
# 1. Modules whose name starts with "method"
# 2. Classes that inherit from MethodBase

# With the folder structure:
# methodSharepoint/__init__.py exports MethodSharepoint
# → importlib.import_module('modules.workflows.methods.methodSharepoint')
# → finds the MethodSharepoint class
# → works as before
```

## Example: methodSharepoint/ Structure

### methodSharepoint/methodSharepoint.py (complete)

```python
# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging

from modules.workflows.methods.methodBase import MethodBase

# Import helpers
from .helpers.connection import ConnectionHelper
from .helpers.siteDiscovery import SiteDiscoveryHelper
from .helpers.documentParsing import DocumentParsingHelper
from .helpers.pathProcessing import PathProcessingHelper
from .helpers.apiClient import ApiClientHelper

# Import actions (actions/__init__.py re-exports the functions)
from .actions import findDocumentPath
from .actions import readDocuments
from .actions import uploadDocument
from .actions import listDocuments
from .actions import analyzeFolderUsage
from .actions import findSiteByUrl
from .actions import downloadFileByPath
from .actions import copyFile
from .actions import uploadFile

logger = logging.getLogger(__name__)


class MethodSharepoint(MethodBase):
    """SharePoint operations methods."""

    def __init__(self, services):
        super().__init__(services)
        self.name = "sharepoint"
        self.description = "SharePoint operations methods"

        # Initialize helper modules
        self.connection = ConnectionHelper(self)
        self.siteDiscovery = SiteDiscoveryHelper(self)
        self.documentParsing = DocumentParsingHelper(self)
        self.pathProcessing = PathProcessingHelper(self)
        self.apiClient = ApiClientHelper(self)

        # Register actions as methods
        # Actions are bound as methods so that the @action decorator keeps working
        self.findDocumentPath = findDocumentPath.__get__(self, self.__class__)
        self.readDocuments = readDocuments.__get__(self, self.__class__)
        self.uploadDocument = uploadDocument.__get__(self, self.__class__)
        self.listDocuments = listDocuments.__get__(self, self.__class__)
        self.analyzeFolderUsage = analyzeFolderUsage.__get__(self, self.__class__)
        self.findSiteByUrl = findSiteByUrl.__get__(self, self.__class__)
        self.downloadFileByPath = downloadFileByPath.__get__(self, self.__class__)
        self.copyFile = copyFile.__get__(self, self.__class__)
        self.uploadFile = uploadFile.__get__(self, self.__class__)
```

### methodSharepoint/actions/__init__.py

```python
# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

# Export all actions
from .findDocumentPath import findDocumentPath
from .readDocuments import readDocuments
from .uploadDocument import uploadDocument
from .listDocuments import listDocuments
from .analyzeFolderUsage import analyzeFolderUsage
from .findSiteByUrl import findSiteByUrl
from .downloadFileByPath import downloadFileByPath
from .copyFile import copyFile
from .uploadFile import uploadFile

__all__ = [
    'findDocumentPath',
    'readDocuments',
    'uploadDocument',
    'listDocuments',
    'analyzeFolderUsage',
    'findSiteByUrl',
    'downloadFileByPath',
    'copyFile',
    'uploadFile',
]
```

### methodSharepoint/actions/findDocumentPath.py (example)

```python
# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
import time
import json
from typing import Dict, Any

from modules.workflows.methods.methodBase import action
from modules.datamodels.datamodelChat import ActionResult, ActionDocument

logger = logging.getLogger(__name__)


@action
async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult:
    """
    GENERAL:
    - Purpose: Find documents and folders by name/path across sites.
    - Input requirements: connectionReference (required); searchQuery (required); optional site, maxResults.
    - Output format: JSON with found items and paths.
    Parameters:
    - connectionReference (str, required): Microsoft connection label.
    - site (str, optional): Site hint.
    - searchQuery (str, required): Search terms or path.
    - maxResults (int, optional): Maximum items to return. Default: 1000.
    """
    operationId = None
    try:
        # Init progress logger
        workflowId = self.services.workflow.id if self.services.workflow else f"no-workflow-{int(time.time())}"
        operationId = f"sharepoint_find_{workflowId}_{int(time.time())}"

        # Start progress tracking
        parentOperationId = parameters.get('parentOperationId')
        self.services.chat.progressLogStart(
            operationId,
            "Find Document Path",
            "SharePoint Search",
            f"Query: {parameters.get('searchQuery', '*')}",
            parentOperationId=parentOperationId
        )

        connectionReference = parameters.get("connectionReference")
        searchQuery = parameters.get("searchQuery")
        siteHint = parameters.get("site")
        maxResults = parameters.get("maxResults", 1000)

        if not connectionReference:
            if operationId:
                self.services.chat.progressLogFinish(operationId, False)
            return ActionResult.isFailure(error="Connection reference is required")

        if not searchQuery:
            if operationId:
                self.services.chat.progressLogFinish(operationId, False)
            return ActionResult.isFailure(error="Search query is required")

        # Get Microsoft connection using helper
        self.services.chat.progressLogUpdate(operationId, 0.2, "Getting Microsoft connection")
        connection = self.connection.getMicrosoftConnection(connectionReference)
        if not connection:
            if operationId:
                self.services.chat.progressLogFinish(operationId, False)
            return ActionResult.isFailure(error="No valid Microsoft connection found")

        # Parse search query using helper
        self.services.chat.progressLogUpdate(operationId, 0.3, "Parsing search query")
        siteHintFromQuery, pathQuery, searchText, searchOptions = self.pathProcessing.parseSearchQuery(searchQuery)

        # Use site hint from parameter or query
        finalSiteHint = siteHint or siteHintFromQuery

        # Discover sites using helper
        self.services.chat.progressLogUpdate(operationId, 0.4, "Discovering SharePoint sites")
        allSites = await self.siteDiscovery.discoverSharePointSites()

        # Filter sites by hint if provided
        if finalSiteHint:
            allSites = self.siteDiscovery.filterSitesByHint(allSites, finalSiteHint)

        # ... rest of implementation (elided; it populates foundDocuments and sites) ...

        # Generate result
        workflowContext = self.services.chat.getWorkflowContext() if hasattr(self.services, 'chat') else None
        filename = self._generateMeaningfulFileName(
            "sharepoint_find_result",
            "json",
            workflowContext,
            "findDocumentPath"
        )

        result = {
            "foundDocuments": foundDocuments,
            "sites": sites,
            "totalCount": len(foundDocuments)
        }

        validationMetadata = self._createValidationMetadata(
            "findDocumentPath",
            connectionReference=connectionReference,
            searchQuery=searchQuery,
            siteHint=finalSiteHint,
            resultCount=len(foundDocuments)
        )

        document = ActionDocument(
            documentName=filename,
            documentData=json.dumps(result, indent=2),
            mimeType="application/json",
            validationMetadata=validationMetadata
        )

        self.services.chat.progressLogFinish(operationId, True)
        return ActionResult.isSuccess(documents=[document])

    except Exception as e:
        errorMsg = f"Error finding document path: {str(e)}"
        logger.error(errorMsg)
        if operationId:
            self.services.chat.progressLogFinish(operationId, False)
        return ActionResult.isFailure(error=errorMsg)
```

## Helper Module Structure

### methodSharepoint/helpers/connection.py

```python
# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
from typing import Dict, Any, Optional

logger = logging.getLogger(__name__)


class ConnectionHelper:
    """Helper for Microsoft connection management in SharePoint operations"""

    def __init__(self, methodInstance):
        """
        Initialize connection helper.

        Args:
            methodInstance: Instance of MethodSharepoint (for access to services)
        """
        self.method = methodInstance
        self.services = methodInstance.services

    def getMicrosoftConnection(self, connectionReference: str) -> Optional[Dict[str, Any]]:
        """
        Get Microsoft connection from connection reference and configure SharePoint service.
        Args:
            connectionReference: Connection reference string

        Returns:
            Dict with connection info or None if failed
        """
        try:
            userConnection = self.services.chat.getUserConnectionFromConnectionReference(connectionReference)
            if not userConnection:
                logger.warning(f"No user connection found for reference: {connectionReference}")
                return None

            if userConnection.authority.value != "msft":
                logger.warning(f"Connection {userConnection.id} is not Microsoft (authority: {userConnection.authority.value})")
                return None

            # Check if connection is active or pending
            if userConnection.status.value not in ["active", "pending"]:
                logger.warning(f"Connection {userConnection.id} status is not active/pending: {userConnection.status.value}")
                return None

            # Configure SharePoint service
            if not self.services.sharepoint.setAccessTokenFromConnection(userConnection):
                logger.warning(f"Failed to configure SharePoint service with connection {userConnection.id}")
                return None

            logger.info(f"Successfully configured SharePoint service with Microsoft connection: {userConnection.id}")

            return {
                "id": userConnection.id,
                "userConnection": userConnection,
                "scopes": ["Sites.ReadWrite.All", "Files.ReadWrite.All", "User.Read"]
            }
        except Exception as e:
            logger.error(f"Error getting Microsoft connection: {str(e)}")
            return None
```

### methodSharepoint/helpers/siteDiscovery.py

```python
# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
from typing import Dict, Any, List, Optional

logger = logging.getLogger(__name__)


class SiteDiscoveryHelper:
    """Helper for SharePoint site discovery and resolution"""

    def __init__(self, methodInstance):
        self.method = methodInstance
        self.services = methodInstance.services

    async def discoverSharePointSites(self, limit: Optional[int] = None) -> List[Dict[str, Any]]:
        """
        Discover SharePoint sites accessible to the user via Microsoft Graph API.
        Args:
            limit: Optional limit on number of sites to return

        Returns:
            List of site information dictionaries
        """
        try:
            endpoint = "sites?search=*"
            if limit:
                endpoint += f"&$top={limit}"

            result = await self.method.apiClient.makeGraphApiCall(endpoint)

            if "error" in result:
                logger.error(f"Error discovering SharePoint sites: {result['error']}")
                return []

            sites = result.get("value", [])
            if limit:
                sites = sites[:limit]

            logger.info(f"Discovered {len(sites)} SharePoint sites" + (f" (limited to {limit})" if limit else ""))

            # Process and return site information
            processedSites = []
            for site in sites:
                siteInfo = {
                    "id": site.get("id"),
                    "displayName": site.get("displayName"),
                    "webUrl": site.get("webUrl"),
                    "name": site.get("name"),
                    "description": site.get("description")
                }
                processedSites.append(siteInfo)

            return processedSites

        except Exception as e:
            logger.error(f"Error discovering SharePoint sites: {str(e)}")
            return []

    def filterSitesByHint(self, sites: List[Dict[str, Any]], siteHint: str) -> List[Dict[str, Any]]:
        """
        Filter sites by hint (name, displayName, or webUrl contains hint).

        Args:
            sites: List of site dictionaries
            siteHint: Hint string to match against

        Returns:
            Filtered list of sites
        """
        if not siteHint:
            return sites

        hintLower = siteHint.lower()
        filtered = []

        for site in sites:
            # Values may be None (discoverSharePointSites stores site.get(...) as-is),
            # so fall back to an empty string before lowercasing
            displayName = (site.get("displayName") or "").lower()
            name = (site.get("name") or "").lower()
            webUrl = (site.get("webUrl") or "").lower()

            if hintLower in displayName or hintLower in name or hintLower in webUrl:
                filtered.append(site)

        return filtered

    async def getSiteByStandardPath(self, sitePath: str) -> Optional[Dict[str, Any]]:
        """
        Get site by standard path format (hostname/path or /sites/path).

        Args:
            sitePath: Site path string

        Returns:
            Site dictionary or None if not found
        """
        # Implementation...
        pass

    async def getSiteId(self, hostname: str, sitePath: str) -> str:
        """
        Get site ID from hostname and site path.

        Args:
            hostname: SharePoint hostname
            sitePath: Site path

        Returns:
            Site ID string
        """
        # Implementation...
        pass

    async def resolveSitesFromPathQuery(self, pathQuery: str) -> tuple[List[Dict[str, Any]], Optional[str]]:
        """
        Resolve sites from pathQuery using SharePoint service helper methods.

        Args:
            pathQuery: Path query string

        Returns:
            Tuple of (sites list, error message)
        """
        try:
            isValid, errorMsg = self.services.sharepoint.validatePathQuery(pathQuery)
            if not isValid:
                return [], errorMsg

            sites = await self.services.sharepoint.resolveSitesFromPathQuery(pathQuery)
            if not sites:
                return [], "No SharePoint sites found or accessible"

            return sites, None
        except Exception as e:
            logger.error(f"Error resolving sites from pathQuery '{pathQuery}': {str(e)}")
            return [], f"Error resolving sites from pathQuery: {str(e)}"
```

## Migration Plan

### Step 1: Create helper modules (week 1)

1. **Identify helper categories**:
   - Connection & Authentication
   - Site Discovery & Resolution
   - Document Parsing
   - Path & Query Processing
   - API Communication

2. **Create helper classes**:
   - `methodSharepoint/helpers/connection.py`
   - `methodSharepoint/helpers/siteDiscovery.py`
   - `methodSharepoint/helpers/documentParsing.py`
   - `methodSharepoint/helpers/pathProcessing.py`
   - `methodSharepoint/helpers/apiClient.py`

3. **Migrate helper functions**:
   - Move functions from `methodSharepoint.py` into the matching helper classes
   - Inside a helper, reach the method instance through `self.method` instead of `self` (services are available as `self.services`)
   - Write tests

### Step 2: Extract actions (weeks 2-3)

1. **Create action modules**:
   - `methodSharepoint/actions/findDocumentPath.py`
   - `methodSharepoint/actions/readDocuments.py`
   - `methodSharepoint/actions/uploadDocument.py`
   - etc.

2. **Migrate actions**:
   - Move each action function into its own module
   - Switch helper access to `self.helperName`
   - Write tests

3. **Adapt the main class**:
   - Initialize helpers
   - Register actions
   - Remove the old implementation

### Step 3: Folder structure (week 4)

1. **Create folders**:
   - Create the `methodSharepoint/` folder
   - Create the `helpers/` and `actions/` subfolders

2. **Move files**:
   - Helper modules into `helpers/`
   - Action modules into `actions/`
   - The main class stays in `methodSharepoint/`

3. **Adjust imports**:
   - Create the `__init__.py` files
   - Adjust relative imports

### Step 4: Testing & cleanup (week 5)

1. **Tests**:
   - Unit tests for helper modules
   - Integration tests for actions
   - End-to-end tests for workflows

2. **Documentation**:
   - README for methodSharepoint/
   - Helper documentation
   - Action documentation

3. **Cleanup**:
   - Remove the old `methodSharepoint.py`
   - Remove unused imports
   - Code review

## Advantages of the New Structure

### Before (2840 lines in one file)

```
methodSharepoint.py
├── 16 helper functions (scattered)
├── 9 actions (~70-480 lines each)
└── Hard to navigate, hard to test
```

### After (split up)

```
methodSharepoint/
├── methodSharepoint.py (~100 lines)
├── helpers/
│   ├── connection.py (~100 lines)
│   ├── siteDiscovery.py (~200 lines)
│   ├── documentParsing.py (~150 lines)
│   ├── pathProcessing.py (~200 lines)
│   └── apiClient.py (~100 lines)
└── actions/
    ├── findDocumentPath.py (~300 lines)
    ├── readDocuments.py (~200 lines)
    ├── uploadDocument.py (~200 lines)
    └── ... (further actions, ~100-300 lines each)
```

**Advantages**:
- ✅ Each file stays at roughly 300 lines or fewer (most under 200)
- ✅ Clear separation of concerns
- ✅ Easy to test
- ✅ Parallel development possible
- ✅ Reusable helper modules

## Compatibility

### methodDiscovery.py

The existing discovery logic only considers plain module files, so folder-based methods are not picked up until the check is relaxed:

```python
# methodDiscovery.py currently iterates like this:
for _, name, isPkg in pkgutil.iter_modules(methodsPackage.__path__):
    if not isPkg and name.startswith('method'):
        # Imports modules.workflows.methods.<name> for plain module files.
        # A package such as methodSharepoint/ is reported with isPkg == True
        # and is skipped by this condition. Once packages are allowed:
        # → methodSharepoint/__init__.py is loaded
        # → the MethodSharepoint class is found
        # → everything else works as before
        pass
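
# --- Illustrative sketch, not taken from methodDiscovery.py ---
# Assumption: a throwaway directory stands in for the methods/ package, and
# the helper name listMethodCandidates is hypothetical. The sketch shows how
# pkgutil.iter_modules reports a folder containing __init__.py (isPkg True)
# versus a plain .py file (isPkg False), which is why a `not isPkg` filter
# would skip a folder-based method.
import pkgutil
import tempfile
from pathlib import Path


def listMethodCandidates(searchPath):
    """Return {name: isPkg} for every entry whose name starts with 'method'."""
    return {
        name: isPkg
        for _, name, isPkg in pkgutil.iter_modules([str(searchPath)])
        if name.startswith('method')
    }


with tempfile.TemporaryDirectory() as tmpDir:
    root = Path(tmpDir)
    (root / 'methodJira.py').write_text('')        # legacy single-file method
    (root / 'methodSharepoint').mkdir()            # folder-based method
    (root / 'methodSharepoint' / '__init__.py').write_text('')
    (root / 'notAMethod.py').write_text('')        # dropped by the prefix check
    candidates = listMethodCandidates(root)

# Folder-based entries carry isPkg == True, file-based entries isPkg == False
print(candidates)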
``` **Anpassung nötig**: `methodDiscovery.py` muss auch Packages (Ordner) erkennen: ```python # In methodDiscovery.py - discoverMethods() Funktion for _, name, isPkg in pkgutil.iter_modules(methodsPackage.__path__): if name.startswith('method'): try: if isPkg: # Package (Ordner) - importiere __init__.py module = importlib.import_module(f'modules.workflows.methods.{name}') else: # Modul (Datei) - wie bisher (für Rückwärtskompatibilität) module = importlib.import_module(f'modules.workflows.methods.{name}') # Find all classes in the module that inherit from MethodBase for itemName, item in inspect.getmembers(module): if (inspect.isclass(item) and issubclass(item, MethodBase) and item != MethodBase): # ... rest of discovery logic ... except Exception as e: logger.error(f"Error discovering method {name}: {str(e)}") continue ``` **Wichtig**: Diese Änderung ist rückwärtskompatibel - bestehende Method-Dateien funktionieren weiterhin. ## Struktur-Details pro Method ### methodSharepoint (9 Actions, ~16 Helper-Funktionen) **Helper-Kategorien**: - `connection.py` - Microsoft Connection Handling - `siteDiscovery.py` - SharePoint Site Discovery & Resolution - `documentParsing.py` - Document List Parsing - `pathProcessing.py` - Path & Query Processing - `apiClient.py` - Microsoft Graph API Calls **Actions**: findDocumentPath, readDocuments, uploadDocument, listDocuments, analyzeFolderUsage, findSiteByUrl, downloadFileByPath, copyFile, uploadFile ### methodOutlook (4 Actions, ~8 Helper-Funktionen) **Helper-Kategorien**: - `connection.py` - Microsoft Connection Handling - `emailProcessing.py` - Email Search, Filtering, Processing - `folderManagement.py` - Folder Operations **Actions**: readEmails, searchEmails, composeAndDraftEmailWithContext, sendDraftEmail ### methodJira (8 Actions, ~4 Helper-Funktionen) **Helper-Kategorien**: - `connection.py` - JIRA Connection Handling - `adfConverter.py` - Atlassian Document Format to Text Conversion - `documentParsing.py` - Document Reference 
Parsing **Actions**: connectJira, exportTicketsAsJson, importTicketsFromJson, mergeTicketData, parseCsvContent, parseExcelContent, createCsvContent, createExcelContent ### methodAi (8 Actions, ~1 Helper-Funktion) **Helper-Kategorien**: - `csvProcessing.py` - CSV Options Processing **Actions**: process, webResearch, summarizeDocument, translateDocument, convert, convertDocument, extractData, generateDocument ### methodContext (3 Actions, ~3 Helper-Funktionen) **Helper-Kategorien**: - `documentIndex.py` - Document Index Parsing - `formatting.py` - Markdown/Text Formatting **Actions**: getDocumentIndex, extractContent, triggerPreprocessingServer ## Migration-Plan für alle Methods ### Phase 1: methodSharepoint (Pilot) - Woche 1-2 1. Helper-Module erstellen und migrieren 2. Actions extrahieren 3. Folder-Struktur erstellen 4. Tests schreiben 5. Dokumentation ### Phase 2: methodOutlook - Woche 3-4 1. Gleiche Struktur wie methodSharepoint 2. Helper-Module erstellen 3. Actions extrahieren 4. Tests schreiben ### Phase 3: methodJira - Woche 5-6 1. Helper-Module erstellen 2. Actions extrahieren 3. Tests schreiben ### Phase 4: methodAi - Woche 7 1. Helper-Module erstellen (minimal) 2. Actions extrahieren 3. Tests schreiben ### Phase 5: methodContext - Woche 8 1. Helper-Module erstellen 2. Actions extrahieren 3. Tests schreiben ### Phase 6: Cleanup & Dokumentation - Woche 9 1. Alte Dateien entfernen 2. methodDiscovery.py anpassen (Package-Support) 3. Gesamtdokumentation aktualisieren 4. Code-Review ## Gemeinsame Helper-Funktionen ### Analyse: Duplizierte Helper-Funktionen **Gefundene Duplikationen**: - `_format_timestamp_for_filename()`: In `methodOutlook`, `methodSharepoint`, `methodAi` dupliziert - `_getMicrosoftConnection()`: In `methodOutlook` und `methodSharepoint` dupliziert (aber unterschiedlich implementiert) - `_createValidationMetadata()`: Bereits in `MethodBase` (gut!) - `_generateMeaningfulFileName()`: Bereits in `MethodBase` (gut!) 
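A duplication list like the one above can also be produced mechanically. The following is only a minimal sketch, not the tooling used for this analysis: `findDuplicateHelpers` and its report format are hypothetical names introduced here. It parses each source with the standard `ast` module, collects private functions (leading single underscore) per file, and flags whether the definitions are structurally identical, which is the distinction that matters for `_format_timestamp_for_filename()` versus `_getMicrosoftConnection()`.

```python
import ast
from collections import defaultdict
from typing import Any, Dict


def findDuplicateHelpers(sources: Dict[str, str]) -> Dict[str, Dict[str, Any]]:
    """Hypothetical helper: report private functions defined in more than one file.

    `sources` maps a file name to its Python source text.
    """
    # helper name -> {file name: structural dump of the function definition}
    occurrences: Dict[str, Dict[str, str]] = defaultdict(dict)
    for fileName, source in sources.items():
        for node in ast.walk(ast.parse(source)):
            isFunction = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
            isPrivate = isFunction and node.name.startswith("_") and not node.name.startswith("__")
            if isPrivate:
                # ast.dump omits line numbers by default, so equal code yields equal dumps
                occurrences[node.name][fileName] = ast.dump(node)

    report: Dict[str, Dict[str, Any]] = {}
    for name, perFile in occurrences.items():
        if len(perFile) > 1:
            report[name] = {
                "files": sorted(perFile),
                "identical": len(set(perFile.values())) == 1,
            }
    return report
```

To run it against the real files, the sources could be loaded with something like `{p.name: p.read_text(encoding="utf-8") for p in pathlib.Path(methodsDir).glob("method*.py")}`, where `methodsDir` is an assumed path to the methods directory. Helpers reported with `identical: False` are candidates for staying method-specific.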
### Solution: Shared helper modules

**Structure for shared helpers**:

```
gateway/modules/workflows/methods/
├── methodBase.py
├── shared/                  # NEW: shared helpers for all methods
│   ├── __init__.py
│   ├── connection.py        # Microsoft connection helper (if identical)
│   └── utils.py             # Shared utilities (_format_timestamp, etc.)
├── methodSharepoint/
│   └── ...
└── methodOutlook/
    └── ...
```

**Option 1: Shared helpers in `shared/`**
- If the implementation is identical → shared module
- If it differs → method-specific helper

**Option 2: Helpers in MethodBase**
- For truly universal helpers (such as `_createValidationMetadata`)
- Not for method-specific logic

**Recommendation**:
- `_format_timestamp_for_filename()` → move into `MethodBase` (the implementations are identical)
- `_getMicrosoftConnection()` → keep method-specific (the implementations differ)

## Open Questions

1. **Action registration**: Should the descriptor method (`__get__`) be used, or a different solution?
   - **Recommendation**: The descriptor method is the cleanest option and works with the `@action` decorator

2. **Helper sharing**: Should helpers be shared between methods?
   - **Recommendation**: Only if the implementation is identical. Otherwise keep them method-specific.

3. **Testing**: How should helper modules be tested?
   - **Recommendation**: Unit tests with a mock `methodInstance` for helper modules, integration tests for actions
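The testing recommendation can be illustrated with a short sketch. The `SiteDiscoveryHelper` class below is a hypothetical stand-in for a migrated helper; it assumes the helper receives the method instance in its constructor and reaches services through `self.method`, as described in the migration steps. The mock is built with the standard library's `unittest.mock`, so no SharePoint connection is needed.

```python
import asyncio
from typing import Any, Dict, List, Optional, Tuple
from unittest.mock import AsyncMock, MagicMock


class SiteDiscoveryHelper:
    """Hypothetical migrated helper; keeps the owning method as `self.method`."""

    def __init__(self, methodInstance: Any) -> None:
        self.method = methodInstance

    async def resolveSitesFromPathQuery(
        self, pathQuery: str
    ) -> Tuple[List[Dict[str, Any]], Optional[str]]:
        sharepoint = self.method.services.sharepoint
        isValid, errorMsg = sharepoint.validatePathQuery(pathQuery)
        if not isValid:
            return [], errorMsg
        sites = await sharepoint.resolveSitesFromPathQuery(pathQuery)
        if not sites:
            return [], "No SharePoint sites found or accessible"
        return sites, None


def makeMockMethod(isValid: bool, errorMsg: Optional[str], sites: list) -> MagicMock:
    """Build a mock methodInstance exposing only what the helper touches."""
    method = MagicMock()
    method.services.sharepoint.validatePathQuery.return_value = (isValid, errorMsg)
    method.services.sharepoint.resolveSitesFromPathQuery = AsyncMock(return_value=sites)
    return method


def testInvalidQueryShortCircuits() -> None:
    method = makeMockMethod(False, "Invalid pathQuery", [])
    helper = SiteDiscoveryHelper(method)
    sites, error = asyncio.run(helper.resolveSitesFromPathQuery("bad"))
    assert sites == []
    assert error == "Invalid pathQuery"
    # The service must not be called when validation fails
    method.services.sharepoint.resolveSitesFromPathQuery.assert_not_awaited()


def testSitesAreReturned() -> None:
    method = makeMockMethod(True, None, [{"id": "site-1"}])
    helper = SiteDiscoveryHelper(method)
    sites, error = asyncio.run(helper.resolveSitesFromPathQuery("/sites/demo"))
    assert sites == [{"id": "site-1"}]
    assert error is None
```

Because the helper only depends on the attributes it reads from `self.method`, each helper module can be tested in isolation, while the integration tests for actions cover the wiring between helpers and the real method class.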