MCP (Model Context Protocol) Architecture Analysis

MCP Overview

Model Context Protocol (MCP) is an open standard introduced by Anthropic (November 2024) that provides a standardized way for AI systems to interact with external tools, data sources, and systems.

Core MCP Concepts

  1. MCP Server: Provides capabilities (tools, resources, prompts) to AI clients
  2. MCP Client: AI system that uses MCP servers to access external capabilities
  3. Tools: Executable functions that the AI can call (similar to actions)
  4. Resources: Readable data sources (files, databases, APIs)
  5. Prompts: Pre-defined prompt templates with placeholders

MCP Communication Model

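  • Protocol: JSON-RPC 2.0 over stdio or HTTP (Server-Sent Events / Streamable HTTP); other channels such as WebSocket are possible only as custom transports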
  • Request/Response: Client sends requests, server responds
  • Discovery: Client discovers available tools/resources/prompts from server
  • Execution: Client calls tools, reads resources, uses prompts
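
The discovery and execution steps above can be sketched as plain JSON-RPC 2.0 messages. This is an illustration of the message framing only, not an MCP client: the helper `makeRequest` is a hypothetical name, and `read_file` is just a sample tool. The method names `tools/list` and `tools/call` are the ones MCP uses for discovery and execution.

```python
import itertools
import json
from typing import Any, Dict, Optional

_ids = itertools.count(1)  # every JSON-RPC request carries its own id


def makeRequest(method: str, params: Optional[Dict[str, Any]] = None) -> str:
    """Serialize a JSON-RPC 2.0 request (hypothetical helper, not an SDK call)."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Discovery: ask the server which tools it offers
discoveryRequest = makeRequest("tools/list")

# Execution: call one of the discovered tools by name
callRequest = makeRequest(
    "tools/call",
    {"name": "read_file", "arguments": {"path": "/tmp/file.txt"}},
)
```

The server answers each request with a response that echoes the same `id`, which is how the client matches results to calls.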

MCP is Model-Agnostic

Critical Point: MCP is NOT limited to Anthropic's Claude models. It is designed as an open, model-agnostic standard that works with:

  • Anthropic Claude (Claude 3.5 Sonnet, Opus, Haiku)
  • OpenAI GPT (GPT-4, GPT-4 Turbo, GPT-3.5)
  • Google Gemini (Gemini Pro, Gemini Ultra)
  • Other LLM Providers (any provider that supports function calling/tool use)
  • Multi-Model Systems (systems that dynamically select models)

How It Works:

  • MCP servers are independent of the AI model
  • MCP clients can be implemented for any AI provider
  • The protocol standardizes tool/resource interfaces, not model-specific APIs
  • Your dynamic model selection can run alongside MCP; MCP does not replace it

Your Architecture Compatibility:

  • Dynamic Model Selection: MCP works with your per-call model selection
  • Failover Mechanism: MCP servers don't care which model calls them
  • Model-Aware Chunking: MCP tools receive content that is already chunked; model selection and chunking happen before the MCP call
  • Operation Type Selection: MCP tools can be selected based on operationType (same as current actions)

Current Architecture vs MCP

Current Architecture

Structure:

User Request
  ↓
WorkflowProcessor
  ↓
Mode (Dynamic/Actionplan)
  ↓
ActionExecutor
  ↓
Methods (methodAi, methodOutlook, methodSharepoint)
  ↓
Actions (process, readEmails, uploadFiles)
  ↓
Services (aiService, chatService, generationService)

Key Characteristics:

  • Action-based: Actions are discovered dynamically via @action decorator
  • Service-oriented: Services provide capabilities (AI, chat, generation, extraction)
  • Workflow-driven: Sequential execution with state management
  • Type-safe: Pydantic models for all parameters/returns
  • Two-stage planning: Stage 1 (action selection) + Stage 2 (parameter generation)

MCP Architecture

Structure:

AI Client (Any LLM: Claude, GPT-4, Gemini, etc.)
  ↓
MCP Client Library (model-agnostic)
  ↓
MCP Server (provides tools/resources/prompts)
  ↓
External System (database, API, file system, etc.)

Key Characteristics:

  • Model-Agnostic: Works with any AI provider (not just Anthropic)
  • Tool-based: Tools are registered with MCP server
  • Resource-based: Resources provide read-only data access
  • Prompt-based: Pre-defined prompts with placeholders
  • Discovery-driven: Client discovers capabilities at runtime
  • Standardized: JSON-RPC protocol for all communication

Compatibility Analysis

Compatible Aspects

  1. Tool/Action Equivalence:

    • MCP Tools ↔ Current Actions
    • Both are executable functions with parameters
    • Both support discovery and execution
    • Both can return results
  2. Resource/Document Equivalence:

    • MCP Resources ↔ Current Document References
    • Both provide read-only data access
    • Both support discovery and reading
    • Both can be referenced by URI/identifier
  3. Type Safety:

    • MCP: Uses JSON Schema for tool parameters
    • Current: Uses Pydantic models for action parameters
    • Compatibility: Both provide type validation
  4. Modularity:

    • MCP: Servers are modular and composable
    • Current: Methods are modular and composable
    • Compatibility: Both support modular architecture
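
The type-safety point above comes down to describing action parameters as a JSON Schema object. The sketch below uses a standard-library dataclass as a stand-in for a Pydantic model so that it runs without third-party packages; with Pydantic v2 the built-in `model_json_schema()` produces the schema directly. `ReadEmailsParameters` and `toInputSchema` are hypothetical names, and only flat fields of primitive types are handled.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Any, Dict, get_type_hints

# Simplified mapping from Python types to JSON Schema type names
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass
class ReadEmailsParameters:  # hypothetical parameter model
    folder: str
    limit: int = 10


def toInputSchema(model: type) -> Dict[str, Any]:
    """Build an MCP-style inputSchema from a dataclass (flat fields only)."""
    hints = get_type_hints(model)
    properties = {f.name: {"type": _JSON_TYPES[hints[f.name]]} for f in fields(model)}
    # Fields without a default must be supplied by the caller
    required = [
        f.name
        for f in fields(model)
        if f.default is MISSING and f.default_factory is MISSING
    ]
    return {"type": "object", "properties": properties, "required": required}


schema = toInputSchema(ReadEmailsParameters)
```

Here `schema` lists `folder` and `limit` as properties and marks only `folder` as required, which is the shape an MCP tool definition expects in `inputSchema`.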

⚠️ Incompatible Aspects

  1. Execution Model:

    • MCP: Client-driven (AI decides which tools to call)
    • Current: Workflow-driven (predefined action sequence)
    • Conflict: MCP assumes AI autonomy, current system uses planning
  2. Planning vs Execution:

    • MCP: AI directly calls tools based on user request
    • Current: Two-stage planning (select action → generate parameters → execute)
    • Conflict: MCP doesn't have planning phase
  3. State Management:

    • MCP: Tool calls are self-contained (the protocol keeps a session, but no workflow state)
    • Current: Stateful workflow (rounds, tasks, actions)
    • Conflict: MCP doesn't track workflow state
  4. Service Dependencies:

    • MCP: Servers are independent
    • Current: Services have dependencies (chatService → aiService)
    • Conflict: MCP assumes flat server structure

Integration Possibilities

Option 1: MCP as Action Backend (Recommended)

Concept: Expose current actions as MCP tools, but keep workflow planning

Architecture:

User Request
  ↓
WorkflowProcessor (planning phase - unchanged)
  ↓
ActionExecutor
  ↓
Model Selection (dynamic - based on operationType, priority, etc.)
  ├─> Select Model: GPT-4, Claude, Gemini, etc. (unchanged)
  ↓
MCP Client (model-agnostic - works with any selected model)
  ↓
MCP Servers (wrapping current methods as tools)
  ├─> MCP Server: "ai" (provides ai.process, ai.webResearch, etc.)
  ├─> MCP Server: "outlook" (provides outlook.readEmails, etc.)
  └─> MCP Server: "sharepoint" (provides sharepoint.uploadFiles, etc.)
  ↓
Methods (unchanged implementation)

Benefits:

  • Keep existing workflow planning
  • Keep dynamic model selection (MCP is model-agnostic)
  • Standardize action interface (MCP tools)
  • Enable external MCP servers (third-party tools)
  • Maintain type safety (MCP JSON Schema ↔ Pydantic)
  • Model selection happens before MCP call (MCP doesn't care which model)

Implementation:

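from typing import Any, Dict, List

# MCP Server wrapper for methods
# (MethodBase, McpTool, OperationTypeEnum and AiCallOptions are assumed to be defined elsewhere)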
class MethodMcpServer:
    """MCP server that exposes method actions as tools (model-agnostic)"""
    
    def __init__(self, method: MethodBase):
        self.method = method
        self.tools = self._discoverTools()
    
    def _discoverTools(self) -> List[McpTool]:
        """Convert @action methods to MCP tools"""
        tools = []
        for actionName, actionInfo in self.method.actions.items():
            tool = McpTool(
                name=f"{self.method.name}.{actionName}",
                description=actionInfo['description'],
                inputSchema=self._pydanticToJsonSchema(actionInfo['parameters'])
            )
            tools.append(tool)
        return tools
    
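    async def callTool(self, name: str, arguments: Dict[str, Any]) -> Any:
        """Execute action via MCP tool call (model-agnostic)"""
        # MCP server doesn't know/care which AI model called it
        methodName, actionName = name.split('.', 1)
        # Reject tools that belong to another method or do not exist
        if methodName != self.method.name or actionName not in self.method.actions:
            raise ValueError(f"Unknown tool: {name}")
        return await self.method.actions[actionName]['method'](arguments)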

# MCP Client with dynamic model selection
class McpClientWithModelSelection:
    """MCP client that supports dynamic model selection"""
    
    def __init__(self, aiObjects: Any, mcpServers: List[MethodMcpServer]):
        self.aiObjects = aiObjects  # Your existing aiObjects (handles model selection)
        self.mcpServers = mcpServers
    
    async def callTool(
        self, 
        toolName: str, 
        arguments: Dict[str, Any],
        operationType: OperationTypeEnum,
        options: AiCallOptions
    ) -> Any:
        """Call MCP tool with dynamic model selection"""
        # 1. Select model (your existing logic - unchanged)
        selectedModel = self.aiObjects.selectModel(
            operationType=operationType,
            priority=options.priority,
            contentType=options.contentType
        )
        
        # 2. Call MCP tool (model-agnostic - works with any model)
        server = self._findServerForTool(toolName)
        result = await server.callTool(toolName, arguments)
        
        # 3. Model selection and failover handled by aiObjects (unchanged)
        return result
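
To make the wrapper idea concrete without the project types, here is a self-contained miniature under stated assumptions. The `action` decorator below is a stand-in for the project's `@action` (it only sets a marker attribute), and `MethodOutlook`, `MiniMethodServer`, and `listTools` are hypothetical names. It shows discovery by decorator and dispatch by `<method>.<action>` name, not the real MCP wire protocol.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


def action(func: Callable[..., Awaitable[Any]]) -> Callable[..., Awaitable[Any]]:
    """Stand-in for the project's @action decorator: marks a coroutine as an action."""
    func.isAction = True
    return func


class MethodOutlook:  # hypothetical, heavily simplified method class
    name = "outlook"

    @action
    async def readEmails(self, arguments: Dict[str, Any]) -> List[str]:
        """Read emails from a folder"""
        return [f"mail from {arguments['folder']}"]


class MiniMethodServer:
    """Exposes every @action coroutine of a method as '<method>.<action>'."""

    def __init__(self, method: Any):
        self.method = method
        # Discover actions by looking for the marker set by the decorator
        self.actions = {
            attrName: getattr(method, attrName)
            for attrName in dir(method)
            if getattr(getattr(method, attrName), "isAction", False)
        }

    def listTools(self) -> List[Dict[str, str]]:
        return [
            {"name": f"{self.method.name}.{actionName}", "description": fn.__doc__ or ""}
            for actionName, fn in self.actions.items()
        ]

    async def callTool(self, name: str, arguments: Dict[str, Any]) -> Any:
        methodName, actionName = name.split(".", 1)
        if methodName != self.method.name or actionName not in self.actions:
            raise ValueError(f"Unknown tool: {name}")
        return await self.actions[actionName](arguments)


server = MiniMethodServer(MethodOutlook())
tools = server.listTools()
result = asyncio.run(server.callTool("outlook.readEmails", {"folder": "inbox"}))
```

Nothing in `callTool` refers to a model, which is the point of the model-agnostic claim: model selection stays in the caller.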

Option 2: MCP as External Tool Integration

Concept: Use MCP to integrate external tools, keep internal actions as-is

Architecture:

User Request
  ↓
WorkflowProcessor
  ↓
ActionExecutor
  ├─> Internal Actions (unchanged - methodAi, methodOutlook, etc.)
  └─> External MCP Tools (new - via MCP client)
      ├─> MCP Server: "slack" (external)
      ├─> MCP Server: "github" (external)
      └─> MCP Server: "database" (external)

Benefits:

  • Keep existing architecture unchanged
  • Add external tools via MCP
  • Standardized interface for external integrations

Implementation:

# Add MCP client to ActionExecutor
class ActionExecutor:
    def __init__(self, services, methods: Dict[str, MethodBase], mcpClients: List[McpClient]):
        self.services = services
        self.methods = methods  # Internal methods keyed by name ("ai", "outlook", ...)
        self.mcpClients = mcpClients  # External MCP servers
    
    async def executeAction(self, action: str, parameters: Dict):
        # Check if action is internal (the prefix before the first '.' names an internal method)
        if '.' in action and action.split('.', 1)[0] in self.methods:
            return await self._executeInternalAction(action, parameters)
        
        # Check if action is external MCP tool
        for mcpClient in self.mcpClients:
            if mcpClient.hasTool(action):
                return await mcpClient.callTool(action, parameters)
        
        raise ValueError(f"Unknown action: {action}")
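
The routing rule in this option can be tried in isolation. The sketch below assumes a `FakeMcpClient` stand-in (hypothetical, with the same `hasTool`/`callTool` shape as above) and a free function `route` instead of the full `ActionExecutor`; the tool name `slack.postMessage` is also made up for the example.

```python
import asyncio
from typing import Any, Dict, List


class FakeMcpClient:
    """Stand-in for a real MCP client; holds a fixed set of tool names."""

    def __init__(self, toolNames: List[str]):
        self.toolNames = set(toolNames)

    def hasTool(self, name: str) -> bool:
        return name in self.toolNames

    async def callTool(self, name: str, arguments: Dict[str, Any]) -> str:
        return f"external:{name}"


async def route(
    action: str, internalMethods: Dict[str, Any], mcpClients: List[FakeMcpClient]
) -> str:
    """Prefer internal actions; fall back to the first MCP client that has the tool."""
    prefix = action.split(".", 1)[0]
    if "." in action and prefix in internalMethods:
        return f"internal:{action}"
    for client in mcpClients:
        if client.hasTool(action):
            return await client.callTool(action, {})
    raise ValueError(f"Unknown action: {action}")


clients = [FakeMcpClient(["slack.postMessage"])]
internal = {"ai": object(), "outlook": object()}
routed = [
    asyncio.run(route("ai.process", internal, clients)),
    asyncio.run(route("slack.postMessage", internal, clients)),
]
```

One consequence of checking internal methods first: an external server whose tool names start with an internal method prefix would be shadowed, so external tool names may need a namespace of their own.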

Option 3: Hybrid Approach (Best of Both)

Concept: Internal actions remain direct, external tools via MCP, unified interface

Architecture:

User Request
  ↓
WorkflowProcessor
  ↓
ActionExecutor
  ├─> Internal Actions (direct method calls - fast, type-safe)
  └─> External Tools (MCP client - standardized, extensible)

Benefits:

  • Keep internal actions fast (no MCP overhead)
  • Standardize external tool integration
  • Unified action interface for planning

Detailed Comparison

Tool/Action Discovery

MCP:

{
  "tools": [
    {
      "name": "read_file",
      "description": "Read a file",
      "inputSchema": {
        "type": "object",
        "properties": {
          "path": {"type": "string"}
        }
      }
    }
  ]
}

Current:

@action
async def process(parameters: AiProcessParameters) -> ActionResult:
    """AI processing action"""
    # Action discovered via @action decorator
    # Parameters: Pydantic model (AiProcessParameters)

Compatibility: High - Both support discovery and type validation


Tool/Action Execution

MCP:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "read_file",
    "arguments": {"path": "/tmp/file.txt"}
  }
}

Current:

result = await executeAction(
    methodName="ai",
    actionName="process",
    selection=ActionDefinition(
        action="ai.process",
        parameters={"aiPrompt": "...", "contentParts": [...]}
    )
)

Compatibility: High - Both execute functions with parameters


Resource/Document Access

MCP:

{
  "resources": [
    {
      "uri": "file:///tmp/document.pdf",
      "name": "Document PDF",
      "mimeType": "application/pdf"
    }
  ]
}

Current:

documentList = DocumentReferenceList([
    DocumentListReference(label="task1_results"),
    DocumentItemReference(documentId="doc_123")
])

Compatibility: ⚠️ Medium - Different reference models, but both provide data access
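
One way to bridge the two reference models is to give each internal document a URI. The sketch below assumes a made-up `doc://item/<documentId>` scheme; neither the scheme nor the helper names come from MCP or from the current codebase. Label-based references such as `DocumentListReference(label="task1_results")` would need a separate mapping and are not covered here.

```python
from typing import Dict
from urllib.parse import urlparse


def documentToResource(documentId: str, name: str, mimeType: str) -> Dict[str, str]:
    """Describe an internal document as an MCP-style resource entry."""
    return {"uri": f"doc://item/{documentId}", "name": name, "mimeType": mimeType}


def resourceToDocumentId(uri: str) -> str:
    """Recover the internal document id from a doc:// URI (hypothetical scheme)."""
    parsed = urlparse(uri)
    if parsed.scheme != "doc" or parsed.netloc != "item":
        raise ValueError(f"Not a document URI: {uri}")
    return parsed.path.lstrip("/")


resource = documentToResource("doc_123", "Document PDF", "application/pdf")
documentId = resourceToDocumentId(resource["uri"])
```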


Planning vs Direct Execution

MCP:

User: "Read file X and summarize it"
  ↓
AI: [Discovers tools] → [Calls read_file] → [Calls summarize]
  ↓
Result: Summary

Current:

User: "Read file X and summarize it"
  ↓
Planning: [Stage 1: Select action] → [Stage 2: Generate parameters]
  ↓
Execution: [Execute action with parameters]
  ↓
Result: Summary

Compatibility: ⚠️ Low - Different execution models
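
The gap narrows if planning keeps deciding what to call and MCP is used only to carry out each call. A minimal sketch of that split, assuming the planner output is a list of `{"action", "parameters"}` steps; `executePlan`, `fakeCallTool`, and the `sharepoint.readFile` action are hypothetical names used for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

ToolCaller = Callable[[str, Dict[str, Any]], Awaitable[Any]]


async def executePlan(plan: List[Dict[str, Any]], callTool: ToolCaller) -> List[Any]:
    """Run pre-planned steps in order; the plan, not the model, picks each tool."""
    results = []
    for step in plan:
        results.append(await callTool(step["action"], step["parameters"]))
    return results


async def fakeCallTool(name: str, arguments: Dict[str, Any]) -> str:
    """Stand-in for an MCP tools/call round trip."""
    return f"{name}({sorted(arguments)})"


# Output of the (unchanged) two-stage planner, written by hand for this sketch
plan = [
    {"action": "sharepoint.readFile", "parameters": {"path": "X"}},
    {"action": "ai.process", "parameters": {"aiPrompt": "Summarize"}},
]
results = asyncio.run(executePlan(plan, fakeCallTool))
```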


Integration Strategy

Recommended: Option 1 (MCP as Action Backend)

Why:

  1. Standardization: Actions become MCP tools (standard interface)
  2. Extensibility: Can add external MCP servers easily
  3. Compatibility: Keep existing workflow planning
  4. Type Safety: MCP JSON Schema ↔ Pydantic models

Implementation Steps:

  1. Create MCP Server Wrapper:

    class MethodMcpServer:
        """Wraps method actions as MCP tools"""
        def __init__(self, method: MethodBase):
            self.method = method
    
        def getTools(self) -> List[McpTool]:
            """Convert @action methods to MCP tools"""
            # Discover actions via @action decorator
            # Convert Pydantic models to JSON Schema
            # Return MCP tool definitions
    
  2. Create MCP Client in ActionExecutor:

    class ActionExecutor:
        def __init__(self, services, mcpServers: List[MethodMcpServer]):
            self.services = services
            self.mcpServers = mcpServers
    
        async def executeAction(self, action: str, parameters: Dict):
            # Find MCP server that provides this tool
            server = self._findServerForAction(action)
            return await server.callTool(action, parameters)
    
  3. Convert Pydantic to JSON Schema:

    def pydanticToJsonSchema(model: Type[BaseModel]) -> Dict:
        """Convert Pydantic model to JSON Schema for MCP"""
        # Pydantic exports JSON Schema natively (v2: model_json_schema(), v1: schema())
        return model.model_json_schema()
    
  4. Keep Workflow Planning:

    • Stage 1/Stage 2 planning remains unchanged
    • Actions are discovered via MCP tool discovery
    • Parameters validated via MCP JSON Schema
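
Validating parameters against a tool's `inputSchema` can be illustrated with a deliberately small checker. This is a sketch under assumptions, not a JSON Schema implementation: it only checks `required` and primitive `type` keywords, and it rejects undeclared parameters, which is stricter than JSON Schema's default. A full validator (for example the `jsonschema` package) would be the usual choice in practice. The `maxBytes` property is invented for the example.

```python
from typing import Any, Dict, List

_PY_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validateArguments(schema: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments pass."""
    errors = []
    for name in schema.get("required", []):
        if name not in arguments:
            errors.append(f"missing required parameter: {name}")
    for name, value in arguments.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            errors.append(f"unexpected parameter: {name}")
            continue
        expected = _PY_TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it for numeric types explicitly
        isBoolMismatch = isinstance(value, bool) and spec.get("type") in ("integer", "number")
        if expected and (not isinstance(value, expected) or isBoolMismatch):
            errors.append(f"{name}: expected {spec['type']}")
    return errors


schema = {
    "type": "object",
    "properties": {"path": {"type": "string"}, "maxBytes": {"type": "integer"}},
    "required": ["path"],
}
ok = validateArguments(schema, {"path": "/tmp/file.txt"})
problems = validateArguments(schema, {"maxBytes": "10"})
```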

Benefits of MCP Integration

1. Standardization

  • Actions become standard MCP tools
  • Consistent interface across all actions
  • Standardized error handling

2. Extensibility

  • Easy to add external MCP servers
  • Third-party tools can be integrated
  • Less custom integration code per tool (one MCP client covers all MCP servers)

3. Interoperability

  • Compatible with other MCP clients
  • Can be used by external AI systems
  • Standard protocol (JSON-RPC)

4. Tool Discovery

  • Dynamic tool discovery
  • Runtime capability detection
  • No hardcoded action lists

Challenges and Considerations

1. Planning vs Direct Execution

  • Challenge: MCP assumes AI directly calls tools, current system uses planning
  • Solution: Keep planning phase, use MCP for execution only

2. State Management

  • Challenge: MCP is stateless, current system is stateful
  • Solution: State management remains in WorkflowProcessor, MCP tools are stateless
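
A small sketch of that split, assuming a simplified `WorkflowState` (the real workflow tracks rounds, tasks, and actions) and a stand-in `statelessTool`; all names here are hypothetical. The tool sees only its arguments, while the workflow layer records what happened around each call.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class WorkflowState:  # hypothetical, simplified workflow state
    round: int = 0
    history: List[Dict[str, Any]] = field(default_factory=list)


async def statelessTool(arguments: Dict[str, Any]) -> str:
    """A tool sees only its arguments, never the workflow state."""
    return arguments["text"].upper()


async def runStep(state: WorkflowState, arguments: Dict[str, Any]) -> str:
    """The workflow layer records state around an otherwise stateless call."""
    state.round += 1
    result = await statelessTool(arguments)
    state.history.append({"round": state.round, "arguments": arguments, "result": result})
    return result


state = WorkflowState()
asyncio.run(runStep(state, {"text": "first"}))
asyncio.run(runStep(state, {"text": "second"}))
```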

3. Type Conversion

  • Challenge: Pydantic models ↔ JSON Schema conversion
  • Solution: Use Pydantic's built-in schema export (model_json_schema() in v2, schema() in v1)

4. Performance

  • Challenge: MCP adds JSON-RPC overhead
  • Solution: Option 3 (hybrid) - internal actions direct, external via MCP

Dynamic Model Selection with MCP

How MCP Works with Multiple Models

Key Insight: MCP is completely model-agnostic. The protocol standardizes the interface between AI systems and tools, not the AI model itself.

Your Current Architecture:

# Current: Dynamic model selection per call
selectedModel = aiObjects.selectModel(
    operationType=OperationTypeEnum.DATA_EXTRACT,
    priority=options.priority,
    contentType=options.contentType
)
result = await aiObjects.call(request, selectedModel)

With MCP:

# MCP: Model selection happens BEFORE MCP call
selectedModel = aiObjects.selectModel(
    operationType=OperationTypeEnum.DATA_EXTRACT,
    priority=options.priority,
    contentType=options.contentType
)

# MCP tool call (model-agnostic - doesn't care which model)
mcpResult = await mcpClient.callTool(
    toolName="ai.process",
    arguments={"aiPrompt": "...", "contentParts": [...]},
    selectedModel=selectedModel  # Passed for logging/tracking, not protocol requirement
)

Important Points:

  1. MCP servers don't know which model calls them - they're model-agnostic
  2. Model selection happens in your system - before MCP tool call
  3. Failover works the same way - if model fails, select next model, retry MCP call
  4. Model-aware chunking - happens before MCP call (chunking is your system's concern)
  5. MCP just standardizes tool interface - doesn't care about model selection logic

Model Selection Flow with MCP

User Request
  ↓
WorkflowProcessor
  ↓
ActionExecutor
  ↓
[Model Selection Logic - YOUR SYSTEM]
  ├─> Select model based on:
  │   - operationType (DATA_EXTRACT, DOCUMENT_GENERATE, etc.)
  │   - priority (high, medium, low)
  │   - contentType (text, image, etc.)
  │   - Model capabilities (contextLength, maxTokens)
  │   - Failover chain (if first model fails)
  ↓
[MCP Tool Call - MODEL-AGNOSTIC]
  ├─> Call MCP tool with selected model
  ├─> MCP server executes tool (doesn't care which model)
  └─> Return result
  ↓
[If Model Fails]
  ├─> Select next model from failover chain
  └─> Retry MCP tool call with new model
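
The failover branch of this flow can be sketched as a retry loop over the failover chain. The names `callWithFailover`, `ModelError`, and `flakyTool`, and the model identifiers `model-a` and `model-b`, are placeholders for illustration; the existing failover logic in `aiObjects` may differ.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


class ModelError(Exception):
    """Raised when a model call fails (hypothetical error type)."""


async def callWithFailover(
    failoverChain: List[str],
    callTool: Callable[[str, Dict[str, Any]], Awaitable[Any]],
    arguments: Dict[str, Any],
) -> Any:
    """Try each model in order; the tool arguments stay the same across retries."""
    lastError = None
    for model in failoverChain:
        try:
            return await callTool(model, arguments)
        except ModelError as error:
            lastError = error  # fall through to the next model in the chain
    raise RuntimeError("All models in the failover chain failed") from lastError


attempts: List[str] = []


async def flakyTool(model: str, arguments: Dict[str, Any]) -> str:
    """Stand-in for 'ai.process': fails for the first model, succeeds for the second."""
    attempts.append(model)
    if model == "model-a":
        raise ModelError("rate limited")
    return f"{model} handled {arguments['aiPrompt']}"


outcome = asyncio.run(
    callWithFailover(["model-a", "model-b"], flakyTool, {"aiPrompt": "summarize"})
)
```

Only the model changes between attempts; the tool name and arguments are reused as-is, which matches the claim that the MCP side is unaffected by failover.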

Conclusion

Compatibility Assessment

Overall: Highly Compatible with some architectural adaptations

Key Findings:

  1. Actions ≈ MCP Tools: Direct mapping possible
  2. Documents ≈ MCP Resources: Similar concepts
  3. Dynamic Model Selection: MCP is model-agnostic - works with any model
  4. Multi-Provider Support: MCP works with OpenAI, Anthropic, Google, etc.
  5. ⚠️ Planning Phase: MCP has no planning phase, but the existing planning can be kept
  6. ⚠️ State Management: MCP is stateless, but state can remain in workflow

Option 1: MCP as Action Backend (Recommended)

  • Expose actions as MCP tools
  • Keep workflow planning unchanged
  • Enable external MCP server integration
  • Maintain type safety

Benefits:

  • Standardized action interface
  • Easy external tool integration
  • Compatible with MCP ecosystem
  • Minimal changes to existing architecture

Next Steps:

  1. Create MCP server wrapper for methods
  2. Convert Pydantic models to JSON Schema
  3. Add MCP client to ActionExecutor
  4. Test with external MCP servers