wiki/appdoc/doc_workflow_method_refactoring_concept_done.md
2025-12-29 23:54:13 +01:00

Method Files Refactoring Concept

Overview

This document describes the standard refactoring concept for all method files in the gateway/modules/workflows/methods/ directory.

Goal: All methods are reorganized into the same folder-based structure in order to:

  • Improve maintainability
  • Enable parallel development
  • Increase testability
  • Ensure scalability

Standard structure: Each method is split into its own folder with helpers/ and actions/ subfolders.

Affected methods:

  • methodSharepoint.py (2840 lines → folder structure)
  • methodOutlook.py (1905 lines → folder structure)
  • methodJira.py (1102 lines → folder structure)
  • methodAi.py (743 lines → folder structure)
  • methodContext.py (461 lines → folder structure)
Problem Statement

The method files have grown very long:

  • methodSharepoint.py: 2840 lines (9 actions, ~16 helper functions)
  • methodOutlook.py: 1905 lines (4 actions, ~8 helper functions)
  • methodJira.py: 1102 lines (8 actions)
  • methodAi.py: 743 lines (8 actions)

Problems:

  • Hard to maintain and navigate
  • High complexity per file
  • Actions are largely independent of each other but share helper functions
  • Difficult for several developers to work in parallel

Analysis of the Current Structure

methodSharepoint.py structure

methodSharepoint.py (2840 lines)
├── __init__() - initialization
├── Helper functions (16):
│   ├── _format_timestamp_for_filename()
│   ├── _getMicrosoftConnection()
│   ├── _discoverSharePointSites()
│   ├── _extractHostnameFromWebUrl()
│   ├── _extractSiteFromStandardPath()
│   ├── _getSiteByStandardPath()
│   ├── _filterSitesByHint()
│   ├── _parseSearchQuery()
│   ├── _resolvePathQuery()
│   ├── _parseSiteUrl()
│   ├── _cleanSearchQuery()
│   ├── _makeGraphApiCall()
│   ├── _getSiteId()
│   ├── _parseDocumentListForFoundDocuments()
│   ├── _resolveSitesFromPathQuery()
│   └── _parseDocumentListForFolder() (presumably)
├── Actions (9):
│   ├── findDocumentPath() (~480 lines)
│   ├── readDocuments() (~275 lines)
│   ├── uploadDocument() (~270 lines)
│   ├── listDocuments() (~270 lines)
│   ├── analyzeFolderUsage() (~320 lines)
│   ├── findSiteByUrl() (~70 lines)
│   ├── downloadFileByPath() (~100 lines)
│   ├── copyFile() (~150 lines)
│   └── uploadFile() (~130 lines)

Helper Function Categorization

Connection & Authentication:

  • _getMicrosoftConnection() - used by ALL actions

Site Discovery & Resolution:

  • _discoverSharePointSites() - used by several actions
  • _getSiteByStandardPath() - used by several actions
  • _filterSitesByHint() - used by several actions
  • _resolveSitesFromPathQuery() - used by several actions
  • _getSiteId() - used by several actions

Document Parsing:

  • _parseDocumentListForFoundDocuments() - used by readDocuments, uploadDocument
  • _parseDocumentListForFolder() - used by uploadDocument, listDocuments

Path & Query Processing:

  • _parseSearchQuery() - used by findDocumentPath
  • _resolvePathQuery() - used by several actions
  • _extractSiteFromStandardPath() - used by several actions
  • _extractHostnameFromWebUrl() - used by several actions
  • _parseSiteUrl() - used by several actions
  • _cleanSearchQuery() - used by findDocumentPath

API Communication:

  • _makeGraphApiCall() - used by several actions

Utilities:

  • _format_timestamp_for_filename() - used by several actions
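
The categorization above can be written down as a lookup table that tells an action which helper attribute replaces an old private function. This is only a sketch: the attribute names follow the helper modules used later in this document, the "utilities" entry is an assumption (no target module is named for that category), and newCallPath is a hypothetical helper introduced for illustration.

```python
from typing import Dict, List

# Helper attribute on the method instance -> old private function names.
# "utilities" is a placeholder assumption; the document does not name a module for it.
HELPER_MODULES: Dict[str, List[str]] = {
    "connection": ["_getMicrosoftConnection"],
    "siteDiscovery": [
        "_discoverSharePointSites",
        "_getSiteByStandardPath",
        "_filterSitesByHint",
        "_resolveSitesFromPathQuery",
        "_getSiteId",
    ],
    "documentParsing": [
        "_parseDocumentListForFoundDocuments",
        "_parseDocumentListForFolder",
    ],
    "pathProcessing": [
        "_parseSearchQuery",
        "_resolvePathQuery",
        "_extractSiteFromStandardPath",
        "_extractHostnameFromWebUrl",
        "_parseSiteUrl",
        "_cleanSearchQuery",
    ],
    "apiClient": ["_makeGraphApiCall"],
    "utilities": ["_format_timestamp_for_filename"],
}


def newCallPath(oldName: str) -> str:
    """Return the attribute path an action would use after the refactoring."""
    for module, names in HELPER_MODULES.items():
        if oldName in names:
            # Helper methods drop the leading underscore of the old private function.
            return f"self.{module}.{oldName.lstrip('_')}"
    raise KeyError(f"Unknown helper: {oldName}")


print(newCallPath("_parseSearchQuery"))
```

A table like this can double as a checklist while moving functions: every old helper name should appear exactly once.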

Refactoring Concept: Folder-Based Structure

Standard structure for all methods:

Each method is split into its own folder with the following structure:

  • methodName/ - main folder
    • __init__.py - exports the method class
    • methodName.py - main class (minimal, ~50-150 lines)
    • helpers/ - helper modules grouped by functionality
    • actions/ - each action in its own module
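
Because every method follows the same layout, the empty skeleton could be generated rather than created by hand. The following is a minimal sketch using only the standard library; scaffoldMethodPackage is a hypothetical function name, and the helper and action names passed in the usage line are taken from the methodContext layout in this document.

```python
from pathlib import Path
from typing import Iterable, List


def scaffoldMethodPackage(root: Path, methodName: str,
                          helpers: Iterable[str], actions: Iterable[str]) -> List[Path]:
    """Create the empty folder layout for one method and return the created files."""
    package = root / methodName
    files = [package / "__init__.py", package / f"{methodName}.py"]
    files += [package / "helpers" / "__init__.py"]
    files += [package / "helpers" / f"{name}.py" for name in helpers]
    files += [package / "actions" / "__init__.py"]
    files += [package / "actions" / f"{name}.py" for name in actions]
    for file in files:
        # Existing folders and files are left untouched.
        file.parent.mkdir(parents=True, exist_ok=True)
        file.touch(exist_ok=True)
    return files


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        created = scaffoldMethodPackage(
            Path(tmp), "methodContext",
            helpers=["documentIndex", "formatting"],
            actions=["getDocumentIndex", "extractContent", "triggerPreprocessingServer"],
        )
        for path in created:
            print(path.relative_to(tmp))
```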

Complete structure for all methods

gateway/modules/workflows/methods/
├── methodBase.py (unchanged)
│
├── methodSharepoint/
│   ├── __init__.py
│   ├── methodSharepoint.py
│   ├── helpers/
│   │   ├── __init__.py
│   │   ├── connection.py
│   │   ├── siteDiscovery.py
│   │   ├── documentParsing.py
│   │   ├── pathProcessing.py
│   │   └── apiClient.py
│   └── actions/
│       ├── __init__.py
│       ├── findDocumentPath.py
│       ├── readDocuments.py
│       ├── uploadDocument.py
│       ├── listDocuments.py
│       ├── analyzeFolderUsage.py
│       ├── findSiteByUrl.py
│       ├── downloadFileByPath.py
│       ├── copyFile.py
│       └── uploadFile.py
│
├── methodOutlook/
│   ├── __init__.py
│   ├── methodOutlook.py
│   ├── helpers/
│   │   ├── __init__.py
│   │   ├── connection.py
│   │   ├── emailProcessing.py
│   │   └── folderManagement.py
│   └── actions/
│       ├── __init__.py
│       ├── readEmails.py
│       ├── searchEmails.py
│       ├── composeAndDraftEmailWithContext.py
│       └── sendDraftEmail.py
│
├── methodJira/
│   ├── __init__.py
│   ├── methodJira.py
│   ├── helpers/
│   │   ├── __init__.py
│   │   ├── connection.py
│   │   ├── adfConverter.py (ADF to Text)
│   │   └── documentParsing.py
│   └── actions/
│       ├── __init__.py
│       ├── connectJira.py
│       ├── exportTicketsAsJson.py
│       ├── importTicketsFromJson.py
│       ├── mergeTicketData.py
│       ├── parseCsvContent.py
│       ├── parseExcelContent.py
│       ├── createCsvContent.py
│       └── createExcelContent.py
│
├── methodAi/
│   ├── __init__.py
│   ├── methodAi.py
│   ├── helpers/
│   │   ├── __init__.py
│   │   └── csvProcessing.py
│   └── actions/
│       ├── __init__.py
│       ├── process.py
│       ├── webResearch.py
│       ├── summarizeDocument.py
│       ├── translateDocument.py
│       ├── convert.py
│       ├── convertDocument.py
│       ├── extractData.py
│       └── generateDocument.py
│
└── methodContext/
    ├── __init__.py
    ├── methodContext.py
    ├── helpers/
    │   ├── __init__.py
    │   ├── documentIndex.py
    │   └── formatting.py
    └── actions/
        ├── __init__.py
        ├── getDocumentIndex.py
        ├── extractContent.py
        └── triggerPreprocessingServer.py

Detailed structure: methodSharepoint/

methodSharepoint/__init__.py

# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

from .methodSharepoint import MethodSharepoint

__all__ = ['MethodSharepoint']

methodSharepoint/methodSharepoint.py (main class)

# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
from typing import Dict, Any
from modules.workflows.methods.methodBase import MethodBase

# Import helpers
from .helpers.connection import ConnectionHelper
from .helpers.siteDiscovery import SiteDiscoveryHelper
from .helpers.documentParsing import DocumentParsingHelper
from .helpers.pathProcessing import PathProcessingHelper
from .helpers.apiClient import ApiClientHelper

# Import actions
from .actions.findDocumentPath import findDocumentPath
from .actions.readDocuments import readDocuments
from .actions.uploadDocument import uploadDocument
from .actions.listDocuments import listDocuments
from .actions.analyzeFolderUsage import analyzeFolderUsage
from .actions.findSiteByUrl import findSiteByUrl
from .actions.downloadFileByPath import downloadFileByPath
from .actions.copyFile import copyFile
from .actions.uploadFile import uploadFile

logger = logging.getLogger(__name__)

class MethodSharepoint(MethodBase):
    """SharePoint operations methods."""
    
    def __init__(self, services):
        super().__init__(services)
        self.name = "sharepoint"
        self.description = "SharePoint operations methods"
        
        # Initialize helper modules
        self.connection = ConnectionHelper(self)
        self.siteDiscovery = SiteDiscoveryHelper(self)
        self.documentParsing = DocumentParsingHelper(self)
        self.pathProcessing = PathProcessingHelper(self)
        self.apiClient = ApiClientHelper(self)
        
        # Register actions
        self.findDocumentPath = findDocumentPath.__get__(self, self.__class__)
        self.readDocuments = readDocuments.__get__(self, self.__class__)
        self.uploadDocument = uploadDocument.__get__(self, self.__class__)
        self.listDocuments = listDocuments.__get__(self, self.__class__)
        self.analyzeFolderUsage = analyzeFolderUsage.__get__(self, self.__class__)
        self.findSiteByUrl = findSiteByUrl.__get__(self, self.__class__)
        self.downloadFileByPath = downloadFileByPath.__get__(self, self.__class__)
        self.copyFile = copyFile.__get__(self, self.__class__)
        self.uploadFile = uploadFile.__get__(self, self.__class__)

methodSharepoint/helpers/connection.py

# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
from typing import Dict, Any, Optional

logger = logging.getLogger(__name__)

class ConnectionHelper:
    """Helper for Microsoft connection management"""
    
    def __init__(self, methodInstance):
        self.method = methodInstance
        self.services = methodInstance.services
    
    def getMicrosoftConnection(self, connectionReference: str) -> Optional[Dict[str, Any]]:
        """Get Microsoft connection from connection reference and configure SharePoint service"""
        try:
            userConnection = self.services.chat.getUserConnectionFromConnectionReference(connectionReference)
            if not userConnection:
                logger.warning(f"No user connection found for reference: {connectionReference}")
                return None
                
            if userConnection.authority.value != "msft":
                logger.warning(f"Connection {userConnection.id} is not Microsoft (authority: {userConnection.authority.value})")
                return None
            
            # Check if connection is active or pending
            if userConnection.status.value not in ["active", "pending"]:
                logger.warning(f"Connection {userConnection.id} status is not active/pending: {userConnection.status.value}")
                return None
            
            # Configure SharePoint service
            if not self.services.sharepoint.setAccessTokenFromConnection(userConnection):
                logger.warning(f"Failed to configure SharePoint service with connection {userConnection.id}")
                return None
            
            logger.info(f"Successfully configured SharePoint service with Microsoft connection: {userConnection.id}")
            
            return {
                "id": userConnection.id,
                "userConnection": userConnection,
                "scopes": ["Sites.ReadWrite.All", "Files.ReadWrite.All", "User.Read"]
            }
        except Exception as e:
            logger.error(f"Error getting Microsoft connection: {str(e)}")
            return None

methodSharepoint/helpers/siteDiscovery.py

# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
from typing import Dict, Any, List, Optional

logger = logging.getLogger(__name__)

class SiteDiscoveryHelper:
    """Helper for SharePoint site discovery and resolution"""
    
    def __init__(self, methodInstance):
        self.method = methodInstance
        self.services = methodInstance.services
    
    async def discoverSharePointSites(self, limit: Optional[int] = None) -> List[Dict[str, Any]]:
        """Discover SharePoint sites accessible to the user via Microsoft Graph API"""
        # ... Implementation ...
        pass
    
    def filterSitesByHint(self, sites: List[Dict[str, Any]], siteHint: str) -> List[Dict[str, Any]]:
        """Filter sites by hint"""
        # ... Implementation ...
        pass
    
    async def getSiteByStandardPath(self, sitePath: str) -> Optional[Dict[str, Any]]:
        """Get site by standard path"""
        # ... Implementation ...
        pass
    
    async def getSiteId(self, hostname: str, sitePath: str) -> str:
        """Get site ID from hostname and path"""
        # ... Implementation ...
        pass
    
    async def resolveSitesFromPathQuery(self, pathQuery: str) -> tuple[List[Dict[str, Any]], Optional[str]]:
        """Resolve sites from pathQuery"""
        # ... Implementation ...
        pass

methodSharepoint/actions/findDocumentPath.py

# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
import time
from typing import Dict, Any
from modules.workflows.methods.methodBase import action
from modules.datamodels.datamodelChat import ActionResult, ActionDocument

logger = logging.getLogger(__name__)

@action
async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult:
    """
    GENERAL:
    - Purpose: Find documents and folders by name/path across sites.
    - Input requirements: connectionReference (required); searchQuery (required); optional site, maxResults.
    - Output format: JSON with found items and paths.

    Parameters:
    - connectionReference (str, required): Microsoft connection label.
    - site (str, optional): Site hint.
    - searchQuery (str, required): Search terms or path.
    - maxResults (int, optional): Maximum items to return. Default: 1000.
    """
    operationId = None
    try:
        # Init progress logger
        workflowId = self.services.workflow.id if self.services.workflow else f"no-workflow-{int(time.time())}"
        operationId = f"sharepoint_find_{workflowId}_{int(time.time())}"
        
        # Start progress tracking
        parentOperationId = parameters.get('parentOperationId')
        self.services.chat.progressLogStart(
            operationId,
            "Find Document Path",
            "SharePoint Search",
            f"Query: {parameters.get('searchQuery', '*')}",
            parentOperationId=parentOperationId
        )
        
        connectionReference = parameters.get("connectionReference")
        searchQuery = parameters.get("searchQuery")
        siteHint = parameters.get("site")
        maxResults = parameters.get("maxResults", 1000)
        
        if not connectionReference:
            if operationId:
                self.services.chat.progressLogFinish(operationId, False)
            return ActionResult.isFailure(error="Connection reference is required")
        
        if not searchQuery:
            if operationId:
                self.services.chat.progressLogFinish(operationId, False)
            return ActionResult.isFailure(error="Search query is required")
        
        # Get Microsoft connection
        self.services.chat.progressLogUpdate(operationId, 0.2, "Getting Microsoft connection")
        connection = self.connection.getMicrosoftConnection(connectionReference)
        if not connection:
            if operationId:
                self.services.chat.progressLogFinish(operationId, False)
            return ActionResult.isFailure(error="No valid Microsoft connection found")
        
        # Parse search query
        self.services.chat.progressLogUpdate(operationId, 0.3, "Parsing search query")
        siteHintFromQuery, pathQuery, searchText, searchOptions = self.pathProcessing.parseSearchQuery(searchQuery)
        
        # Use site hint from parameter or query
        finalSiteHint = siteHint or siteHintFromQuery
        
        # Discover sites
        self.services.chat.progressLogUpdate(operationId, 0.4, "Discovering SharePoint sites")
        allSites = await self.siteDiscovery.discoverSharePointSites()
        
        # Filter sites by hint if provided
        if finalSiteHint:
            allSites = self.siteDiscovery.filterSitesByHint(allSites, finalSiteHint)
        
        # ... rest of implementation using helpers ...
        
        # Return result
        return ActionResult.isSuccess(documents=[...])
        
    except Exception as e:
        errorMsg = f"Error finding document path: {str(e)}"
        logger.error(errorMsg)
        if operationId:
            self.services.chat.progressLogFinish(operationId, False)
        return ActionResult.isFailure(error=errorMsg)

Advantages of the Folder-Based Structure

  1. Clear organization:

    • Helper functions grouped by functionality
    • Actions isolated in their own modules
    • Easy to navigate
  2. Maintainability:

    • Each action in its own module (~100-500 lines)
    • Helper modules focused on specific tasks
    • Easier to test
  3. Scalability:

    • New actions are easy to add
    • Helper functions are reusable
    • Parallel development is possible
  4. Compatibility:

    • methodDiscovery.py continues to work (it finds the MethodSharepoint class)
    • No changes to the existing discovery logic are required
    • Actions are still recognized via the @action decorator

Migration Strategy

Phase 1: Create helper modules

  1. Move helper functions into separate modules
  2. Create helper classes (ConnectionHelper, SiteDiscoveryHelper, etc.)
  3. Adapt the main class to use the helpers
  4. Write tests

Phase 2: Extract actions

  1. Move one action at a time into its own module
  2. Define the action as a standalone function
  3. Register it as a method in the main class
  4. Add tests for each action

Phase 3: Folder structure

  1. Create the methodSharepoint/ folder
  2. Move the files into the folder
  3. Create __init__.py
  4. Adjust imports

Phase 4: Cleanup

  1. Remove old files
  2. Update tests
  3. Update documentation
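
After each phase, a small smoke check can confirm that no action was lost while code was being moved. The sketch below is an assumption about how such a check might look, not part of the existing test suite: missingActions and MigratedStub are hypothetical names, and the expected list is the nine SharePoint actions named in this document.

```python
from typing import Iterable, List

# The nine actions of methodSharepoint.py as listed in this document.
EXPECTED_SHAREPOINT_ACTIONS = [
    "findDocumentPath", "readDocuments", "uploadDocument", "listDocuments",
    "analyzeFolderUsage", "findSiteByUrl", "downloadFileByPath", "copyFile",
    "uploadFile",
]


def missingActions(methodInstance: object, expected: Iterable[str]) -> List[str]:
    """Return the expected action names that are not callable on the instance."""
    return [
        name for name in expected
        if not callable(getattr(methodInstance, name, None))
    ]


class MigratedStub:
    """Stand-in for a partially migrated method class (illustration only)."""

    async def copyFile(self, parameters):
        return None

    async def uploadFile(self, parameters):
        return None


# Seven of the nine actions are still missing on the stub.
print(missingActions(MigratedStub(), EXPECTED_SHAREPOINT_ACTIONS))
```

Run against the real MethodSharepoint instance, an empty list would indicate that every action survived the move.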

Technical Details

Action registration

Problem: Actions must be available as methods of the class so that the @action decorator works.

Solution: Define actions as standalone functions and bind them to the instance through the descriptor protocol:

# In action module
@action
async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult:
    # Implementation
    pass

# In main class
class MethodSharepoint(MethodBase):
    def __init__(self, services):
        # ...
        # Register action as method
        self.findDocumentPath = findDocumentPath.__get__(self, self.__class__)
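
The binding step can be tried in isolation. The sketch below uses a stand-in class without MethodBase, services, or the @action decorator (all of which are assumptions about the real code base) and shows only that function.__get__(instance, owner) yields a bound method whose self is the instance.

```python
import asyncio
from typing import Any, Dict


async def findDocumentPath(self, parameters: Dict[str, Any]) -> str:
    """Stand-in action: a plain module-level coroutine function taking self."""
    return f"{self.name}: {parameters['searchQuery']}"


class MethodStub:
    """Hypothetical stand-in for MethodSharepoint."""

    def __init__(self) -> None:
        self.name = "sharepoint"
        # Same result as normal attribute lookup would produce if the
        # function had been defined inside the class body.
        self.findDocumentPath = findDocumentPath.__get__(self, self.__class__)


method = MethodStub()
result = asyncio.run(method.findDocumentPath({"searchQuery": "report.docx"}))
print(result)
```

This relies on the decorated object still being a plain function. If @action returned some other kind of object, it might not provide __get__, and the binding would have to be done differently. Note also that the bound methods live on the instance, so any logic that inspects the class rather than the instance would not see them.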

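Alternative: wrap each action in its own class and expose its execute method. Note that self inside execute is then the action object, not the MethodSharepoint instance, so helpers are not reachable through self: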

# In action module
class ActionFindDocumentPath:
    @action
    async def execute(self, parameters: Dict[str, Any]) -> ActionResult:
        # Implementation
        pass

# In main class
from .actions.findDocumentPath import ActionFindDocumentPath

class MethodSharepoint(MethodBase):
    def __init__(self, services):
        # ...
        self._actionFindDocumentPath = ActionFindDocumentPath()
        self.findDocumentPath = self._actionFindDocumentPath.execute

Preferred solution: actions as standalone functions with a self parameter, bound to the instance as in the first variant:

# In action module
from typing import Any, Dict

from modules.datamodels.datamodelChat import ActionResult
from modules.workflows.methods.methodBase import action

@action
async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult:
    """Action implementation"""
    # self is the MethodSharepoint instance
    connection = self.connection.getMicrosoftConnection(...)
    # ...

Helper access

Helpers are made available as instance attributes:

class MethodSharepoint(MethodBase):
    def __init__(self, services):
        super().__init__(services)
        # ...
        self.connection = ConnectionHelper(self)
        self.siteDiscovery = SiteDiscoveryHelper(self)
        # ...

Actions can then access the helpers:

@action
async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult:
    # Access helpers through self
    connection = self.connection.getMicrosoftConnection(...)
    sites = await self.siteDiscovery.discoverSharePointSites()
    # ...
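
Since an action reaches its helpers only through self, it can be exercised without a real MethodSharepoint instance by passing in a stub. The sketch below illustrates this; listSiteNames is a hypothetical action, and the stub helpers return made-up data instead of calling Microsoft Graph.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, Dict, List


async def listSiteNames(self, parameters: Dict[str, Any]) -> List[str]:
    """Hypothetical action that reaches helpers only through self."""
    if not self.connection.getMicrosoftConnection(parameters["connectionReference"]):
        return []
    sites = await self.siteDiscovery.discoverSharePointSites()
    return [site["displayName"] for site in sites]


async def fakeDiscover() -> List[Dict[str, Any]]:
    """Stub for SiteDiscoveryHelper.discoverSharePointSites (no network access)."""
    return [{"displayName": "Intranet"}, {"displayName": "Projects"}]


# The stub only needs the attributes the action actually touches.
stub = SimpleNamespace(
    connection=SimpleNamespace(getMicrosoftConnection=lambda ref: {"id": ref}),
    siteDiscovery=SimpleNamespace(discoverSharePointSites=fakeDiscover),
)

names = asyncio.run(listSiteNames(stub, {"connectionReference": "conn-1"}))
print(names)
```

This is the testability gain the refactoring aims for: each action module can be tested against small stubs of the helpers it uses.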

methodDiscovery.py compatibility

The existing discovery logic continues to work:

# methodDiscovery.py looks for:
# 1. Modules whose names start with "method"
# 2. Classes that inherit from MethodBase

# With the folder structure:
# methodSharepoint/__init__.py exports MethodSharepoint
# → importlib.import_module('modules.workflows.methods.methodSharepoint')
# → finds the MethodSharepoint class
# → works as before!
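
The reason this keeps working is that importing a package runs its __init__.py, so anything re-exported there becomes an attribute of the imported object, just as it was for the single-file module. The sketch below simulates that with a synthetic module; methodDiscovery.py itself is not shown in this document, so findMethodClasses is an assumed approximation of its class lookup, and MethodBase here is a stand-in.

```python
import inspect
import types
from typing import List, Type


class MethodBase:
    """Stand-in for modules.workflows.methods.methodBase.MethodBase."""


def findMethodClasses(module: types.ModuleType) -> List[Type[MethodBase]]:
    """Return all MethodBase subclasses exposed as attributes of the module."""
    return [
        obj for _, obj in inspect.getmembers(module, inspect.isclass)
        if issubclass(obj, MethodBase) and obj is not MethodBase
    ]


# Simulate the imported package: names re-exported by methodSharepoint/__init__.py
# end up as attributes of the package object.
package = types.ModuleType("methodSharepoint")
package.MethodSharepoint = type("MethodSharepoint", (MethodBase,), {})
package.MethodBase = MethodBase  # imported base class must not be reported

found = findMethodClasses(package)
print([cls.__name__ for cls in found])
```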

Example: methodSharepoint/ structure

methodSharepoint/methodSharepoint.py (complete)

# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
from modules.workflows.methods.methodBase import MethodBase

# Import helpers
from .helpers.connection import ConnectionHelper
from .helpers.siteDiscovery import SiteDiscoveryHelper
from .helpers.documentParsing import DocumentParsingHelper
from .helpers.pathProcessing import PathProcessingHelper
from .helpers.apiClient import ApiClientHelper

# Import actions
from .actions import findDocumentPath
from .actions import readDocuments
from .actions import uploadDocument
from .actions import listDocuments
from .actions import analyzeFolderUsage
from .actions import findSiteByUrl
from .actions import downloadFileByPath
from .actions import copyFile
from .actions import uploadFile

logger = logging.getLogger(__name__)

class MethodSharepoint(MethodBase):
    """SharePoint operations methods."""
    
    def __init__(self, services):
        super().__init__(services)
        self.name = "sharepoint"
        self.description = "SharePoint operations methods"
        
        # Initialize helper modules
        self.connection = ConnectionHelper(self)
        self.siteDiscovery = SiteDiscoveryHelper(self)
        self.documentParsing = DocumentParsingHelper(self)
        self.pathProcessing = PathProcessingHelper(self)
        self.apiClient = ApiClientHelper(self)
        
        # Register actions as bound methods so that the @action decorator works
        self.findDocumentPath = findDocumentPath.__get__(self, self.__class__)
        self.readDocuments = readDocuments.__get__(self, self.__class__)
        self.uploadDocument = uploadDocument.__get__(self, self.__class__)
        self.listDocuments = listDocuments.__get__(self, self.__class__)
        self.analyzeFolderUsage = analyzeFolderUsage.__get__(self, self.__class__)
        self.findSiteByUrl = findSiteByUrl.__get__(self, self.__class__)
        self.downloadFileByPath = downloadFileByPath.__get__(self, self.__class__)
        self.copyFile = copyFile.__get__(self, self.__class__)
        self.uploadFile = uploadFile.__get__(self, self.__class__)

methodSharepoint/actions/__init__.py

# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

# Export all actions
from .findDocumentPath import findDocumentPath
from .readDocuments import readDocuments
from .uploadDocument import uploadDocument
from .listDocuments import listDocuments
from .analyzeFolderUsage import analyzeFolderUsage
from .findSiteByUrl import findSiteByUrl
from .downloadFileByPath import downloadFileByPath
from .copyFile import copyFile
from .uploadFile import uploadFile

__all__ = [
    'findDocumentPath',
    'readDocuments',
    'uploadDocument',
    'listDocuments',
    'analyzeFolderUsage',
    'findSiteByUrl',
    'downloadFileByPath',
    'copyFile',
    'uploadFile',
]

methodSharepoint/actions/findDocumentPath.py (example)

# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
import time
import json
from typing import Dict, Any
from modules.workflows.methods.methodBase import action
from modules.datamodels.datamodelChat import ActionResult, ActionDocument

logger = logging.getLogger(__name__)

@action
async def findDocumentPath(self, parameters: Dict[str, Any]) -> ActionResult:
    """
    GENERAL:
    - Purpose: Find documents and folders by name/path across sites.
    - Input requirements: connectionReference (required); searchQuery (required); optional site, maxResults.
    - Output format: JSON with found items and paths.

    Parameters:
    - connectionReference (str, required): Microsoft connection label.
    - site (str, optional): Site hint.
    - searchQuery (str, required): Search terms or path.
    - maxResults (int, optional): Maximum items to return. Default: 1000.
    """
    operationId = None
    try:
        # Init progress logger
        workflowId = self.services.workflow.id if self.services.workflow else f"no-workflow-{int(time.time())}"
        operationId = f"sharepoint_find_{workflowId}_{int(time.time())}"
        
        # Start progress tracking
        parentOperationId = parameters.get('parentOperationId')
        self.services.chat.progressLogStart(
            operationId,
            "Find Document Path",
            "SharePoint Search",
            f"Query: {parameters.get('searchQuery', '*')}",
            parentOperationId=parentOperationId
        )
        
        connectionReference = parameters.get("connectionReference")
        searchQuery = parameters.get("searchQuery")
        siteHint = parameters.get("site")
        maxResults = parameters.get("maxResults", 1000)
        
        if not connectionReference:
            if operationId:
                self.services.chat.progressLogFinish(operationId, False)
            return ActionResult.isFailure(error="Connection reference is required")
        
        if not searchQuery:
            if operationId:
                self.services.chat.progressLogFinish(operationId, False)
            return ActionResult.isFailure(error="Search query is required")
        
        # Get Microsoft connection using helper
        self.services.chat.progressLogUpdate(operationId, 0.2, "Getting Microsoft connection")
        connection = self.connection.getMicrosoftConnection(connectionReference)
        if not connection:
            if operationId:
                self.services.chat.progressLogFinish(operationId, False)
            return ActionResult.isFailure(error="No valid Microsoft connection found")
        
        # Parse search query using helper
        self.services.chat.progressLogUpdate(operationId, 0.3, "Parsing search query")
        siteHintFromQuery, pathQuery, searchText, searchOptions = self.pathProcessing.parseSearchQuery(searchQuery)
        
        # Use site hint from parameter or query
        finalSiteHint = siteHint or siteHintFromQuery
        
        # Discover sites using helper
        self.services.chat.progressLogUpdate(operationId, 0.4, "Discovering SharePoint sites")
        allSites = await self.siteDiscovery.discoverSharePointSites()
        
        # Filter sites by hint if provided
        if finalSiteHint:
            allSites = self.siteDiscovery.filterSitesByHint(allSites, finalSiteHint)
        
        # ... rest of implementation (must define foundDocuments and sites used below) ...
        
        # Generate result
        workflowContext = self.services.chat.getWorkflowContext() if hasattr(self.services, 'chat') else None
        filename = self._generateMeaningfulFileName(
            "sharepoint_find_result",
            "json",
            workflowContext,
            "findDocumentPath"
        )
        
        result = {
            "foundDocuments": foundDocuments,
            "sites": sites,
            "totalCount": len(foundDocuments)
        }
        
        validationMetadata = self._createValidationMetadata(
            "findDocumentPath",
            connectionReference=connectionReference,
            searchQuery=searchQuery,
            siteHint=finalSiteHint,
            resultCount=len(foundDocuments)
        )
        
        document = ActionDocument(
            documentName=filename,
            documentData=json.dumps(result, indent=2),
            mimeType="application/json",
            validationMetadata=validationMetadata
        )
        
        self.services.chat.progressLogFinish(operationId, True)
        return ActionResult.isSuccess(documents=[document])
        
    except Exception as e:
        errorMsg = f"Error finding document path: {str(e)}"
        logger.error(errorMsg)
        if operationId:
            self.services.chat.progressLogFinish(operationId, False)
        return ActionResult.isFailure(error=errorMsg)

Helper module structure

methodSharepoint/helpers/connection.py

# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
from typing import Dict, Any, Optional

logger = logging.getLogger(__name__)

class ConnectionHelper:
    """Helper for Microsoft connection management in SharePoint operations"""
    
    def __init__(self, methodInstance):
        """
        Initialize connection helper.
        
        Args:
            methodInstance: Instance of MethodSharepoint (for access to services)
        """
        self.method = methodInstance
        self.services = methodInstance.services
    
    def getMicrosoftConnection(self, connectionReference: str) -> Optional[Dict[str, Any]]:
        """
        Get Microsoft connection from connection reference and configure SharePoint service.
        
        Args:
            connectionReference: Connection reference string
            
        Returns:
            Dict with connection info or None if failed
        """
        try:
            userConnection = self.services.chat.getUserConnectionFromConnectionReference(connectionReference)
            if not userConnection:
                logger.warning(f"No user connection found for reference: {connectionReference}")
                return None
                
            if userConnection.authority.value != "msft":
                logger.warning(f"Connection {userConnection.id} is not Microsoft (authority: {userConnection.authority.value})")
                return None
            
            # Check if connection is active or pending
            if userConnection.status.value not in ["active", "pending"]:
                logger.warning(f"Connection {userConnection.id} status is not active/pending: {userConnection.status.value}")
                return None
            
            # Configure SharePoint service
            if not self.services.sharepoint.setAccessTokenFromConnection(userConnection):
                logger.warning(f"Failed to configure SharePoint service with connection {userConnection.id}")
                return None
            
            logger.info(f"Successfully configured SharePoint service with Microsoft connection: {userConnection.id}")
            
            return {
                "id": userConnection.id,
                "userConnection": userConnection,
                "scopes": ["Sites.ReadWrite.All", "Files.ReadWrite.All", "User.Read"]
            }
        except Exception as e:
            logger.error(f"Error getting Microsoft connection: {str(e)}")
            return None

methodSharepoint/helpers/siteDiscovery.py

# Copyright (c) 2025 Patrick Motsch
# All rights reserved.

import logging
from typing import Dict, Any, List, Optional

logger = logging.getLogger(__name__)

class SiteDiscoveryHelper:
    """Helper for SharePoint site discovery and resolution"""
    
    def __init__(self, methodInstance):
        self.method = methodInstance
        self.services = methodInstance.services
    
    async def discoverSharePointSites(self, limit: Optional[int] = None) -> List[Dict[str, Any]]:
        """
        Discover SharePoint sites accessible to the user via Microsoft Graph API.
        
        Args:
            limit: Optional limit on number of sites to return
            
        Returns:
            List of site information dictionaries
        """
        try:
            endpoint = "sites?search=*"
            if limit:
                endpoint += f"&$top={limit}"
            
            result = await self.method.apiClient.makeGraphApiCall(endpoint)
            
            if "error" in result:
                logger.error(f"Error discovering SharePoint sites: {result['error']}")
                return []
            
            sites = result.get("value", [])
            if limit:
                sites = sites[:limit]
            
            logger.info(f"Discovered {len(sites)} SharePoint sites" + (f" (limited to {limit})" if limit else ""))
            
            # Process and return site information
            processedSites = []
            for site in sites:
                siteInfo = {
                    "id": site.get("id"),
                    "displayName": site.get("displayName"),
                    "webUrl": site.get("webUrl"),
                    "name": site.get("name"),
                    "description": site.get("description")
                }
                processedSites.append(siteInfo)
            
            return processedSites
        except Exception as e:
            logger.error(f"Error discovering SharePoint sites: {str(e)}")
            return []
    
    def filterSitesByHint(self, sites: List[Dict[str, Any]], siteHint: str) -> List[Dict[str, Any]]:
        """
        Filter sites by hint (name, displayName, or webUrl contains hint).
        
        Args:
            sites: List of site dictionaries
            siteHint: Hint string to match against
            
        Returns:
            Filtered list of sites
        """
        if not siteHint:
            return sites
        
        hintLower = siteHint.lower()
        filtered = []
        
        for site in sites:
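            # discoverSharePointSites() always sets these keys, possibly to None,
            # so a .get() default would not protect .lower() from a None value
            displayName = (site.get("displayName") or "").lower()
            name = (site.get("name") or "").lower()
            webUrl = (site.get("webUrl") or "").lower()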
            
            if hintLower in displayName or hintLower in name or hintLower in webUrl:
                filtered.append(site)
        
        return filtered
    
    async def getSiteByStandardPath(self, sitePath: str) -> Optional[Dict[str, Any]]:
        """
        Get site by standard path format (hostname/path or /sites/path).
        
        Args:
            sitePath: Site path string
            
        Returns:
            Site dictionary or None if not found
        """
        # Implementation...
        pass
    
    async def getSiteId(self, hostname: str, sitePath: str) -> str:
        """
        Get site ID from hostname and site path.
        
        Args:
            hostname: SharePoint hostname
            sitePath: Site path
            
        Returns:
            Site ID string
        """
        # Implementation...
        pass
    
    async def resolveSitesFromPathQuery(self, pathQuery: str) -> tuple[List[Dict[str, Any]], Optional[str]]:
        """
        Resolve sites from pathQuery using SharePoint service helper methods.
        
        Args:
            pathQuery: Path query string
            
        Returns:
            Tuple of (sites list, error message)
        """
        try:
            isValid, errorMsg = self.services.sharepoint.validatePathQuery(pathQuery)
            if not isValid:
                return [], errorMsg
            
            sites = await self.services.sharepoint.resolveSitesFromPathQuery(pathQuery)
            if not sites:
                return [], "No SharePoint sites found or accessible"
            
            return sites, None
        except Exception as e:
            logger.error(f"Error resolving sites from pathQuery '{pathQuery}': {str(e)}")
            return [], f"Error resolving sites from pathQuery: {str(e)}"

Migration Plan

Step 1: Create helper modules (week 1)

  1. Identify helper categories:

    • Connection & Authentication
    • Site Discovery & Resolution
    • Document Parsing
    • Path & Query Processing
    • API Communication
  2. Create helper classes:

    • methodSharepoint/helpers/connection.py
    • methodSharepoint/helpers/siteDiscovery.py
    • methodSharepoint/helpers/documentParsing.py
    • methodSharepoint/helpers/pathProcessing.py
    • methodSharepoint/helpers/apiClient.py
  3. Migrate helper functions:

    • Move functions from methodSharepoint.py into the matching helper classes
    • Replace accesses to the method's state via self with self.method
    • Write tests
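
The migration of a single helper function can be sketched as follows. ExampleMethod, ExamplePathHelper, normalizeQuery and defaultQuery are hypothetical names chosen for illustration; only the constructor pattern (methodInstance stored as self.method) is taken from the helper classes shown in this document.

```python
from typing import Any


class ExamplePathHelper:
    """Hypothetical helper class following the SiteDiscoveryHelper constructor pattern."""

    def __init__(self, methodInstance: Any):
        # The helper keeps a reference to the owning method instance
        self.method = methodInstance
        self.services = methodInstance.services

    def normalizeQuery(self, query: str) -> str:
        # Before the migration this was a private method on the method class
        # and read self.defaultQuery; inside the helper the same state is
        # reached through self.method.
        cleaned = " ".join(query.split())
        return cleaned or self.method.defaultQuery


class ExampleMethod:
    """Hypothetical stand-in for the main method class."""

    def __init__(self, services: Any = None):
        self.services = services
        self.defaultQuery = "*"
        # Helpers are created once and receive the method instance
        self.pathProcessing = ExamplePathHelper(self)


method = ExampleMethod()
result = method.pathProcessing.normalizeQuery("  quarterly   report ")
```

Callers inside the method class change from self._someHelper(...) to self.pathProcessing.someHelper(...), which is the only edit most call sites should need.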

Step 2: Extract actions (weeks 2-3)

  1. Create action modules:

    • methodSharepoint/actions/findDocumentPath.py
    • methodSharepoint/actions/readDocuments.py
    • methodSharepoint/actions/uploadDocument.py
    • etc.
  2. Migrate actions:

    • Move each action function into its own module
    • Adjust helper access to go through self.helperName
    • Write tests
  3. Adjust the main class:

    • Initialize helpers
    • Register actions
    • Remove the old implementation
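
A minimal sketch of how the main class could wire helpers and actions together after this step. It assumes actions are plain functions whose first parameter is the method instance and relies on Python's built-in function descriptor for binding; the project's @action decorator is left out, and ExampleMethodSharepoint as well as the action body are hypothetical.

```python
import asyncio
from typing import Any, Dict, List


# --- would live in actions/findDocumentPath.py (hypothetical body) ---
async def findDocumentPath(self, parameters: Dict[str, Any]) -> Dict[str, Any]:
    # "self" is the method instance, so helpers are reachable as self.<helperName>
    sites = self.siteDiscovery.filterSitesByHint(
        parameters.get("sites", []), parameters.get("siteHint", "")
    )
    return {"success": True, "sites": sites}


# --- would live in helpers/siteDiscovery.py (reduced to one method) ---
class SiteDiscoveryHelper:
    def __init__(self, methodInstance: Any):
        self.method = methodInstance

    def filterSitesByHint(self, sites: List[Dict[str, Any]], siteHint: str) -> List[Dict[str, Any]]:
        hint = siteHint.lower()
        return [s for s in sites if hint in (s.get("displayName") or "").lower()]


# --- would live in methodSharepoint.py ---
class ExampleMethodSharepoint:
    def __init__(self, services: Any = None):
        self.services = services
        # Initialize helpers
        self.siteDiscovery = SiteDiscoveryHelper(self)

    # Register actions: a function stored on the class is a descriptor, so
    # accessing instance.findDocumentPath binds the instance as "self".
    findDocumentPath = findDocumentPath


method = ExampleMethodSharepoint()
outcome = asyncio.run(method.findDocumentPath(
    {"sites": [{"displayName": "HR Portal"}, {"displayName": "Finance"}], "siteHint": "hr"}
))
```

With this layout the main class contains only wiring, which is what keeps it near the ~100 lines targeted in this concept.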

Step 3: Folder structure (week 4)

  1. Create folders:

    • Create the methodSharepoint/ folder
    • Create the helpers/ and actions/ subfolders
  2. Move files:

    • Helper modules into helpers/
    • Action modules into actions/
    • The main class stays in methodSharepoint/
  3. Adjust imports:

    • Create __init__.py files
    • Adjust relative imports

Step 4: Testing & Cleanup (week 5)

  1. Tests:

    • Unit tests for helper modules
    • Integration tests for actions
    • End-to-end tests for workflows
  2. Documentation:

    • README for methodSharepoint/
    • Helper documentation
    • Action documentation
  3. Cleanup:

    • Remove the old methodSharepoint.py
    • Remove unused imports
    • Code review

Advantages of the new structure

Before (2840 lines in one file)

methodSharepoint.py
├── 16 helper functions (scattered)
├── 9 actions (200-500 lines each)
└── Hard to navigate, hard to test

After (split up)

methodSharepoint/
├── methodSharepoint.py (~100 lines)
├── helpers/
│   ├── connection.py (~100 lines)
│   ├── siteDiscovery.py (~200 lines)
│   ├── documentParsing.py (~150 lines)
│   ├── pathProcessing.py (~200 lines)
│   └── apiClient.py (~100 lines)
└── actions/
    ├── findDocumentPath.py (~300 lines)
    ├── readDocuments.py (~200 lines)
    ├── uploadDocument.py (~200 lines)
    └── ... (further actions, ~100-300 lines each)

Advantages:

  • Each file stays at or below roughly 300 lines (most around 200 or fewer)
  • Clear separation of concerns
  • Easy to test
  • Parallel development becomes possible
  • Reusable helper modules

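Compatibility

methodDiscovery.py

The import path does not change: modules.workflows.methods.methodSharepoint now resolves to methodSharepoint/__init__.py, which has to expose the MethodSharepoint class. The existing discovery loop, however, only considers plain modules:

# methodDiscovery.py currently looks for:
for _, name, isPkg in pkgutil.iter_modules(methodsPackage.__path__):
    if not isPkg and name.startswith('method'):
        # Imports: modules.workflows.methods.methodSharepoint
        # → for a package, methodSharepoint/__init__.py would be loaded
        # → the MethodSharepoint class would be found as before
        # → but "not isPkg" skips packages, so folders are never imported here
        pass

Adjustment needed: methodDiscovery.py must also recognize packages (folders):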

# In methodDiscovery.py - discoverMethods() function
for _, name, isPkg in pkgutil.iter_modules(methodsPackage.__path__):
    if name.startswith('method'):
        try:
            # The import call is identical for a package (folder, loads
            # __init__.py) and a plain module (file, kept for backward
            # compatibility), so isPkg no longer needs to be checked.
            module = importlib.import_module(f'modules.workflows.methods.{name}')
            
            # Find all classes in the module that inherit from MethodBase
            for itemName, item in inspect.getmembers(module):
                if (inspect.isclass(item) and 
                    issubclass(item, MethodBase) and 
                    item != MethodBase):
                    # ... rest of discovery logic ...
                    pass
        except Exception as e:
            logger.error(f"Error discovering method {name}: {str(e)}")
            continue

Important: This change is backward compatible; existing single-file methods continue to work.
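
The behaviour of pkgutil.iter_modules for files and folders can be checked in isolation. The sketch below builds a throwaway package named demo_methods (a hypothetical stand-in for modules.workflows.methods) with one single-file method and one folder-based method, then runs the same loop without the isPkg filter.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path

# Build a temporary package tree that mimics the methods/ directory
root = Path(tempfile.mkdtemp())
methodsDir = root / "demo_methods"
methodsDir.mkdir()
(methodsDir / "__init__.py").write_text("")

# Old style: a single file
(methodsDir / "methodLegacy.py").write_text("class MethodLegacy:\n    pass\n")

# New style: a folder whose __init__.py exposes the class
folder = methodsDir / "methodFolder"
folder.mkdir()
(folder / "__init__.py").write_text("class MethodFolder:\n    pass\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()
methodsPackage = importlib.import_module("demo_methods")

discovered = {}
for _, name, isPkg in pkgutil.iter_modules(methodsPackage.__path__):
    if name.startswith("method"):
        # Same import call for files and folders
        module = importlib.import_module(f"demo_methods.{name}")
        discovered[name] = (isPkg, module)
```

Both entries are reported by iter_modules; only the isPkg flag differs, so removing the filter is enough to pick up folder-based methods as long as __init__.py exposes the method class.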

Structure details per method

methodSharepoint (9 actions, ~16 helper functions)

Helper categories:

  • connection.py - Microsoft Connection Handling
  • siteDiscovery.py - SharePoint Site Discovery & Resolution
  • documentParsing.py - Document List Parsing
  • pathProcessing.py - Path & Query Processing
  • apiClient.py - Microsoft Graph API Calls

Actions: findDocumentPath, readDocuments, uploadDocument, listDocuments, analyzeFolderUsage, findSiteByUrl, downloadFileByPath, copyFile, uploadFile

methodOutlook (4 actions, ~8 helper functions)

Helper categories:

  • connection.py - Microsoft Connection Handling
  • emailProcessing.py - Email Search, Filtering, Processing
  • folderManagement.py - Folder Operations

Actions: readEmails, searchEmails, composeAndDraftEmailWithContext, sendDraftEmail

methodJira (8 actions, ~4 helper functions)

Helper categories:

  • connection.py - JIRA Connection Handling
  • adfConverter.py - Atlassian Document Format to Text Conversion
  • documentParsing.py - Document Reference Parsing

Actions: connectJira, exportTicketsAsJson, importTicketsFromJson, mergeTicketData, parseCsvContent, parseExcelContent, createCsvContent, createExcelContent

methodAi (8 actions, ~1 helper function)

Helper categories:

  • csvProcessing.py - CSV Options Processing

Actions: process, webResearch, summarizeDocument, translateDocument, convert, convertDocument, extractData, generateDocument

methodContext (3 actions, ~3 helper functions)

Helper categories:

  • documentIndex.py - Document Index Parsing
  • formatting.py - Markdown/Text Formatting

Actions: getDocumentIndex, extractContent, triggerPreprocessingServer

Migration plan for all methods

Phase 1: methodSharepoint (pilot) - weeks 1-2

  1. Create and migrate helper modules
  2. Extract actions
  3. Create the folder structure
  4. Write tests
  5. Documentation

Phase 2: methodOutlook - weeks 3-4

  1. Same structure as methodSharepoint
  2. Create helper modules
  3. Extract actions
  4. Write tests

Phase 3: methodJira - weeks 5-6

  1. Create helper modules
  2. Extract actions
  3. Write tests

Phase 4: methodAi - week 7

  1. Create helper modules (minimal)
  2. Extract actions
  3. Write tests

Phase 5: methodContext - week 8

  1. Create helper modules
  2. Extract actions
  3. Write tests

Phase 6: Cleanup & documentation - week 9

  1. Remove old files
  2. Adjust methodDiscovery.py (package support)
  3. Update the overall documentation
  4. Code review

Shared helper functions

Analysis: duplicated helper functions

Duplications found:

  • _format_timestamp_for_filename(): duplicated in methodOutlook, methodSharepoint and methodAi
  • _getMicrosoftConnection(): duplicated in methodOutlook and methodSharepoint (but implemented differently)
  • _createValidationMetadata(): already in MethodBase (good!)
  • _generateMeaningfulFileName(): already in MethodBase (good!)

Solution: shared helper modules

Structure for shared helpers:

gateway/modules/workflows/methods/
├── methodBase.py
├── shared/  # NEW: shared helpers for all methods
│   ├── __init__.py
│   ├── connection.py  # Microsoft connection helper (if identical)
│   └── utils.py  # Shared utilities (_format_timestamp, etc.)
├── methodSharepoint/
│   └── ...
└── methodOutlook/
    └── ...

Option 1: Shared helpers in shared/

  • If the implementation is identical → shared module
  • If it differs → method-specific helper

Option 2: Helpers in MethodBase

  • For truly universal helpers (such as _createValidationMetadata)
  • Not for method-specific logic

Recommendation:

  • _format_timestamp_for_filename() → move into MethodBase (the implementations are identical)
  • _getMicrosoftConnection() → keep method-specific (the implementations differ)
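
The first recommendation can be illustrated with a small sketch. The class names below are hypothetical stand-ins, and the timestamp format is an assumption made for the example; the actual format used by the existing _format_timestamp_for_filename() implementations is not shown in this document.

```python
from datetime import datetime, timezone
from typing import Optional


class ExampleMethodBase:
    """Hypothetical stand-in for MethodBase."""

    def _format_timestamp_for_filename(self, moment: Optional[datetime] = None) -> str:
        # Assumed format (YYYYMMDD-HHMMSS); the real helper may use another one
        moment = moment or datetime.now(timezone.utc)
        return moment.strftime("%Y%m%d-%H%M%S")


class ExampleMethodOutlook(ExampleMethodBase):
    """No local copy needed; the helper is inherited."""


class ExampleMethodSharepoint(ExampleMethodBase):
    """No local copy needed; the helper is inherited."""


fixed = datetime(2025, 12, 29, 23, 54, 13, tzinfo=timezone.utc)
outlookStamp = ExampleMethodOutlook()._format_timestamp_for_filename(fixed)
sharepointStamp = ExampleMethodSharepoint()._format_timestamp_for_filename(fixed)
```

Because the helper is inherited, existing call sites of the form self._format_timestamp_for_filename() keep working once the per-method copies are deleted.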

Open questions

  1. Action registration: Should the descriptor approach (__get__) be used, or a different solution?

    • Recommendation: The descriptor approach is the cleanest and works with the @action decorator
  2. Helper sharing: Should helpers be shared between methods?

    • Recommendation: Only if the implementation is identical. Otherwise keep them method-specific.
  3. Testing: How should helper modules be tested?

    • Recommendation: Unit tests with a mock methodInstance for helper modules, integration tests for actions
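
The testing recommendation could look roughly like the sketch below. It uses only the standard library (unittest.mock) and a trimmed copy of SiteDiscoveryHelper so that the example runs on its own; in the real test suite the helper would be imported from methodSharepoint/helpers/siteDiscovery.py, and the test function names are hypothetical.

```python
import asyncio
from typing import Any, Dict, List
from unittest.mock import AsyncMock, MagicMock


class SiteDiscoveryHelper:
    """Trimmed copy of the helper, reduced to what the tests exercise."""

    def __init__(self, methodInstance: Any):
        self.method = methodInstance
        self.services = methodInstance.services

    async def discoverSharePointSites(self) -> List[Dict[str, Any]]:
        result = await self.method.apiClient.makeGraphApiCall("sites?search=*")
        if "error" in result:
            return []
        return [
            {"id": site.get("id"), "displayName": site.get("displayName")}
            for site in result.get("value", [])
        ]


def buildHelper(graphResponse: Dict[str, Any]):
    # The mock methodInstance replaces the real method class and its API client
    methodInstance = MagicMock()
    methodInstance.apiClient.makeGraphApiCall = AsyncMock(return_value=graphResponse)
    return SiteDiscoveryHelper(methodInstance), methodInstance


def testDiscoverReturnsSites():
    helper, methodInstance = buildHelper({"value": [{"id": "1", "displayName": "HR"}]})
    sites = asyncio.run(helper.discoverSharePointSites())
    assert sites == [{"id": "1", "displayName": "HR"}]
    methodInstance.apiClient.makeGraphApiCall.assert_awaited_once_with("sites?search=*")


def testDiscoverReturnsEmptyListOnError():
    helper, _ = buildHelper({"error": "forbidden"})
    assert asyncio.run(helper.discoverSharePointSites()) == []


testDiscoverReturnsSites()
testDiscoverReturnsEmptyListOnError()
```

No Graph API access or real connection is needed, which is the main benefit of passing methodInstance into the helper constructor.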