cursor-mode-integration-mvp

2026-02-23 17:13:26 +01:00 · 2026-02-23 17:13:26 +01:00 · 040067f6af
commit 040067f6af
parent d8c0ba9237
3 changed files with 1736 additions and 0 deletions
--- a/cursor-doc/doc_cursor_ai_agent_architecture.md
+++ b/cursor-doc/doc_cursor_ai_agent_architecture.md
@ -0,0 +1,633 @@
+# Cursor AI Agent Architecture - Agent & Planning Mode
+
+**As of**: February 2026
+**Purpose**: Technical documentation of the AI agent integration in Cursor (a VS Code fork), as the basis for developing a standalone chat component with API access
+
+---
+
+## 1. Overview
+
+Cursor is a fork of Visual Studio Code, extended with deep AI capabilities. Unlike VS Code extensions, which are limited by the extension API, Cursor as a true fork has full access to the rendering pipeline, the AST level, and the language server infrastructure.
+
+The AI integration consists of two main components:
+
+1. **Client (Body)**: The Electron/VS Code-based desktop application with custom UI elements (chat sidebar, Composer panel, inline completions)
+2. **Backend (Brain)**: Cloud services for LLM orchestration, prompt construction, codebase indexing, and embedding storage
+
+```
+User Interaction
+  ↓
+Cursor Client (Electron / VS Code Fork)
+  ├─> Chat Panel UI (Custom Webview)
+  ├─> Composer Panel (Multi-File Editor)
+  ├─> Inline Completions (Tab)
+  └─> Cmd/Ctrl+K (Inline Edit)
+        ↓
+  Context Assembly Layer
+  ├─> Open Files / Cursor Position
+  ├─> Semantic Search (RAG)
+  ├─> AST / Language Server Data
+  └─> Conversation History
+        ↓
+  Cursor Backend (Cloud)
+  ├─> Prompt Construction
+  ├─> Model Routing (GPT / Claude / Gemini / Custom)
+  ├─> Tool Orchestration
+  └─> Response Streaming
+        ↓
+  Response Processing (Client)
+  ├─> Text Rendering (Markdown)
+  ├─> Code Diff Application
+  ├─> Tool Call Execution
+  └─> Plan/Todo Management
+```
+
+---
+
+## 2. Interaction Modes
+
+Cursor offers four modes, which can be switched via `Ctrl+.`:
+
+### 2.1 Agent Mode (default execution mode)
+
+**Purpose**: Autonomous implementation, refactoring, bug fixing
+
+**Capabilities**:
+- Multi-file exploration and editing
+- Terminal command execution (sandboxed)
+- Codebase search (semantic + grep)
+- Iterative refinement with compiler/linter feedback
+- Parallel tool calls for speed
+
+**Flow**:
+```
+User Prompt
+  ↓
+Context Assembly
+  ├─> user_info (OS, Shell, Workspace)
+  ├─> project_layout (File Tree Snapshot)
+  ├─> open_files / cursor_position
+  └─> edit_history / linter_errors
+        ↓
+System Prompt Injection
+  ├─> Role Definition ("AI coding assistant, pair programmer")
+  ├─> Communication Guidelines (Markdown, Code Citations)
+  ├─> Tool Calling Rules (15+ Tools)
+  ├─> Context Strategy (Parallel Execution, Thorough Exploration)
+  └─> Code Change Guidelines (Read before Edit, Fix Lints)
+        ↓
+LLM Request (POST /v1/chat/completions)
+  ├─> model: "claude-4.6-opus" | "gpt-5.2" | ...
+  ├─> temperature: 0
+  ├─> messages: [system, user_context, user_query]
+  ├─> tools: [15+ tool definitions]
+  ├─> tool_choice: "auto"
+  └─> stream: true
+        ↓
+Streaming Response
+  ├─> Text Chunks (Markdown)
+  ├─> Tool Calls (JSON)
+  │     ├─> Tool Execution (Client-Side)
+  │     └─> Tool Results → Next LLM Turn
+  └─> Completion
+```
+
+### 2.2 Plan Mode
+
+**Purpose**: Break down complex features, analyze the codebase, create implementation plans
+
+**Activation**: `Shift+Tab` in the agent input or `Ctrl+.` → Plan
+
+**Key characteristic**: Read-only; no changes are made to the code
+
+**System reminder in Plan Mode**:
+```
+Plan mode is active. The user indicated that they do not want you to execute
+yet -- you MUST NOT make any edits, run any non-readonly tools (including
+changing configs or making commits), or otherwise make any changes to the
+system.
+```
+
+**Plan Mode workflow**:
+```
+User describes a complex task
+  ↓
+Codebase Research (Read-Only)
+  ├─> Semantic search for relevant files
+  ├─> File reading to understand context
+  └─> Grep for specific symbols
+        ↓
+Clarifying questions to the user
+  ├─> Refine requirements
+  ├─> Clarify implementation options
+  └─> Narrow the scope
+        ↓
+Plan creation (create_plan tool)
+  ├─> Markdown document with file references
+  ├─> Code examples and snippets
+  ├─> Structured todos with IDs and dependencies
+  └─> Overview (1-2 sentences)
+        ↓
+User Review & Inline Editing
+  ├─> Plan can be edited directly in the editor
+  ├─> Add/remove todos
+  └─> When satisfied: start execution
+```
+
+**Plan data model** (`create_plan` tool):
+```json
+{
+  "name": "Short plan name (3-4 words)",
+  "overview": "1-2 sentence summary",
+  "plan": "# Plan Title\n\nMarkdown content with file references...",
+  "todos": [
+    {
+      "id": "setup-auth",
+      "content": "Implement auth middleware",
+      "dependencies": []
+    },
+    {
+      "id": "implement-ui",
+      "content": "Create login form",
+      "dependencies": ["setup-auth"]
+    }
+  ]
+}
+```
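The `dependencies` field implies an execution order for the todos. A minimal sketch of how a client might derive that order, assuming the todo shape shown above; the helper name `execution_order` is hypothetical and not part of Cursor's tooling:

```python
from graphlib import TopologicalSorter


def execution_order(todos):
    """Return todo IDs so that every todo appears after its dependencies."""
    # Map each todo ID to the set of IDs it depends on.
    graph = {todo["id"]: set(todo.get("dependencies", [])) for todo in todos}
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(graph).static_order())


todos = [
    {"id": "implement-ui", "content": "Create login form", "dependencies": ["setup-auth"]},
    {"id": "setup-auth", "content": "Implement auth middleware", "dependencies": []},
]
order = execution_order(todos)
print(order)
```

A plan with circular dependencies fails loudly here instead of being executed in an arbitrary order.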
+
+### 2.3 Ask Mode
+
+**Purpose**: Understand code and answer questions without making changes
+**Restriction**: Read-only, no write tools available
+
+### 2.4 Debug Mode
+
+**Purpose**: Systematic error analysis with hypotheses, log instrumentation, runtime analysis
+
+---
+
+## 3. Communication Protocol
+
+### 3.1 Transport Layer
+
+Cursor communicates with the backend via **ConnectRPC** (a gRPC-Web variant) with **Protocol Buffers (Protobuf)** as the serialization format.
+
+**Primary endpoint**: `https://api2.cursor.sh`
+
+**Encoding**: HTTP/2 with binary Protobuf encoding in an envelope format:
+```
+[type:1 byte][length:4 bytes Big-Endian][payload:N bytes]
+```
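The envelope layout above can be illustrated with a short framing sketch. It only assumes the three fields shown (one type byte, a 4-byte big-endian length, then the payload). The function names are hypothetical, and the meaning of individual type values is not specified in this document:

```python
import struct

# 1-byte type followed by a 4-byte big-endian unsigned length (5 bytes total).
HEADER = struct.Struct(">BI")


def encode_envelope(payload: bytes, type_byte: int = 0) -> bytes:
    """Prefix a serialized message with the envelope header."""
    return HEADER.pack(type_byte, len(payload)) + payload


def decode_envelopes(buffer: bytes):
    """Split a byte buffer into complete (type, payload) frames.

    Returns the decoded frames plus any trailing bytes that do not yet
    form a complete frame (useful when reading from a stream).
    """
    frames = []
    offset = 0
    while offset + HEADER.size <= len(buffer):
        type_byte, length = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(buffer):
            break  # payload not fully received yet
        frames.append((type_byte, buffer[start:end]))
        offset = end
    return frames, buffer[offset:]


stream = encode_envelope(b"hello") + encode_envelope(b"", 2) + b"\x00\x00"
frames, rest = decode_envelopes(stream)
```

In a real client the payload would be a serialized Protobuf message. The sketch treats it as opaque bytes.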
+
+### 3.2 Protobuf Schema (Reverse-Engineered)
+
+**Service**: `aiserver.v1.ChatService`
+
+**Core messages**:
+```protobuf
+message GetChatRequest {
+  ModelDetails modelDetails = 1;
+  repeated ConversationMessage conversation = 2;
+}
+
+message ModelDetails {
+  optional string modelName = 1;
+}
+
+message ConversationMessage {
+  string text = 1;
+  MessageType type = 2;
+
+  enum MessageType {
+    MESSAGE_TYPE_HUMAN = 0;
+    MESSAGE_TYPE_AI = 1;
+  }
+}
+```
+
+**Streaming method**: `StreamChat()` or `StreamUnifiedChatWithTools()`
+- Responses arrive as an iterative stream
+- Each chunk contains a `text` field with partial content
+
+### 3.3 OpenAI-Compatible Chat Completions API
+
+In parallel with the Protobuf protocol, Cursor internally uses the **OpenAI Chat Completions format**:
+
+```
+POST /v1/chat/completions
+```
+
+**Request structure**:
+```json
+{
+  "model": "claude-4.6-opus",
+  "temperature": 0,
+  "user": "github|user_ID",
+  "messages": [
+    {
+      "role": "system",
+      "content": "[System prompt with all instructions]"
+    },
+    {
+      "role": "user",
+      "content": "<user_info>...</user_info>\n<project_layout>...</project_layout>"
+    },
+    {
+      "role": "user",
+      "content": "<user_query>...</user_query>\n<system_reminder>...</system_reminder>"
+    }
+  ],
+  "tools": [ /* 15+ tool definitions */ ],
+  "tool_choice": "auto",
+  "stream": true
+}
+```
+
+**Context injection in the user message**:
+
+| Tag | Content |
+|---|---|
+| `<user_info>` | OS, date, shell, workspace path |
+| `<project_layout>` | File tree snapshot (once at the start of the conversation) |
+| `<user_query>` | The actual user prompt |
+| `<system_reminder>` | Dynamic mode instructions (e.g. Plan Mode active) |
+| `<open_and_recently_viewed_files>` | Open and recently viewed files |
+
+---
+
+## 4. System Prompt Architecture
+
+The system prompt is the core of the AI control. It is assembled by the Cursor backend and consists of the following blocks:
+
+### 4.1 Prompt Blocks
+
+```
+System Prompt Assembly
+  ├─> Role Definition
+  │     "You are an AI coding assistant, powered by [model]. You operate in Cursor."
+  │     "You are pair programming with a USER to solve their coding task."
+  │
+  ├─> <communication>
+  │     Markdown formatting, code citations, math notation
+  │
+  ├─> <tool_calling>
+  │     Rules for tool use: do not mention tool names, use the standard format
+  │
+  ├─> <maximize_parallel_tool_calls>
+  │     Run independent tool calls in parallel
+  │
+  ├─> <maximize_context_understanding>
+  │     Thorough exploration, symbol tracing, semantic search strategy
+  │
+  ├─> <making_code_changes>
+  │     Read before Edit, Dependency Files, Linter Fixes
+  │
+  ├─> <citing_code>
+  │     CODE REFERENCES (startLine:endLine:filepath) vs MARKDOWN CODE BLOCKS
+  │
+  ├─> <task_management>
+  │     todo_write tool for complex tasks
+  │
+  ├─> Workspace Rules (.cursor/rules/)
+  │     Project-specific coding conventions
+  │
+  ├─> User Rules
+  │     User-defined preferences
+  │
+  └─> MCP Instructions
+        Configured MCP servers and their usage notes
+```
+
+### 4.2 Mode-Specific Adjustments
+
+The system prompt is modified depending on the active mode:
+
+| Mode | Additional instructions |
+|---|---|
+| **Agent** | Full tool palette, edit permission, terminal access |
+| **Plan** | Read-only reminder, `create_plan` tool available, no edits |
+| **Ask** | Read-only, restricted tools, focus on explanation |
+| **Debug** | Hypothesis-driven analysis, log instrumentation |
+
+---
+
+## 5. Tool System
+
+Cursor provides the LLM with 15+ specialized tools, defined in the OpenAI function calling format.
+
+### 5.1 Tool Categories
+
+**Code understanding**:
+- `codebase_search` / `SemanticSearch` - Semantic search by meaning
+- `grep` / `Grep` - Exact text/regex search (based on ripgrep)
+- `read_file` / `Read` - Read files with optional offset/limit
+- `glob_file_search` / `Glob` - Find files by pattern
+- `list_dir` - List directory contents
+
+**Code changes**:
+- `edit_file` - Intelligent code editing with `// ... existing code ...` markers
+- `StrReplace` - Exact string replacements
+- `Write` - Create new files / overwrite existing ones
+- `delete_file` / `Delete` - Delete files
+- `edit_notebook` / `EditNotebook` - Edit Jupyter notebook cells
+
+**Execution**:
+- `run_terminal_cmd` / `Shell` - Sandboxed terminal commands
+- `web_search` / `WebSearch` - Web research
+- `WebFetch` - Fetch URL contents
+
+**Project management**:
+- `todo_write` / `TodoWrite` - Task tracking with status
+- `create_plan` - Structured planning documents
+
+**Linting**:
+- `read_lints` / `ReadLints` - Read linter errors
+
+**Subagents**:
+- `Task` - Start specialized subagents (generalPurpose, explore, browser-use)
+- `SwitchMode` - Switch modes
+
+### 5.2 Tool Definition Format
+
+Each tool is specified as an OpenAI function definition:
+
+```json
+{
+  "type": "function",
+  "function": {
+    "name": "codebase_search",
+    "description": "semantic search that finds code by meaning...",
+    "parameters": {
+      "type": "object",
+      "properties": {
+        "query": {
+          "type": "string",
+          "description": "A complete question about what you want to understand"
+        },
+        "target_directories": {
+          "type": "array",
+          "items": { "type": "string" }
+        },
+        "explanation": {
+          "type": "string",
+          "description": "Why this tool is being used"
+        }
+      },
+      "required": ["query", "target_directories", "explanation"]
+    }
+  }
+}
+```
+
+### 5.3 Tool Call and Response Cycle
+
+```
+LLM generates a tool call
+  ↓
+{
+  "tool_calls": [{
+    "id": "call_abc123",
+    "type": "function",
+    "function": {
+      "name": "read_file",
+      "arguments": "{\"target_file\": \"src/main.py\"}"
+    }
+  }]
+}
+  ↓
+Client executes the tool (locally)
+  ↓
+Tool result is inserted as a new message
+  ↓
+{
+  "role": "tool",
+  "tool_call_id": "call_abc123",
+  "content": "1|import os\n2|import sys\n..."
+}
+  ↓
+Next LLM turn with the tool result in context
+  ↓
+LLM generates a text response or further tool calls
+```
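The cycle above can be sketched as a small client-side dispatcher. This is an illustration under assumptions, not Cursor's implementation: the in-memory `FILES` dict stands in for the workspace, and `TOOL_HANDLERS` and `run_tool_calls` are hypothetical names. Only the message shapes (`tool_calls`, `role: "tool"`, `tool_call_id`) follow the OpenAI format shown above.

```python
import json

# Stand-in for the workspace so the sketch runs without touching the disk.
FILES = {"src/main.py": "import os\nimport sys\n"}


def read_file(target_file):
    """Return the file content with 'N|' line-number prefixes, as in the example."""
    lines = FILES[target_file].splitlines()
    return "\n".join(f"{number}|{line}" for number, line in enumerate(lines, start=1))


TOOL_HANDLERS = {"read_file": read_file}


def run_tool_calls(assistant_message):
    """Execute every tool call in an assistant message and build the tool messages."""
    results = []
    for call in assistant_message.get("tool_calls", []):
        name = call["function"]["name"]
        # Arguments arrive as a JSON-encoded string, not as an object.
        arguments = json.loads(call["function"]["arguments"])
        handler = TOOL_HANDLERS.get(name)
        content = handler(**arguments) if handler else f"Unknown tool: {name}"
        results.append({"role": "tool", "tool_call_id": call["id"], "content": content})
    return results


assistant_message = {
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "read_file", "arguments": "{\"target_file\": \"src/main.py\"}"},
    }]
}
tool_messages = run_tool_calls(assistant_message)
```

The returned messages would be appended to the conversation before the next LLM turn.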
+
+### 5.4 Parallel Tool Execution
+
+Cursor optimizes for parallel tool calls. When there are no dependencies, multiple tools are called simultaneously:
+
+```
+Example: "Add authentication"
+  ↓
+Parallel:
+  ├─> codebase_search("How does authentication work?")
+  ├─> grep("auth|login|session")
+  ├─> read_file("src/app.py")
+  └─> web_search("best practices authentication 2026")
+        ↓
+Sequential (dependent on results):
+  ├─> read_file("src/middleware/auth.py")  [found via search]
+  └─> edit_file("src/routes/login.py")    [based on context]
+```
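One possible way to run independent tool calls concurrently in an own implementation is `asyncio.gather`. The sketch below uses a hypothetical `fake_tool` coroutine with artificial delays in place of real tools. It shows only the scheduling pattern, not how Cursor executes tools internally.

```python
import asyncio


async def fake_tool(name, argument, delay):
    """Simulate a tool call that takes `delay` seconds (placeholder for real I/O)."""
    await asyncio.sleep(delay)
    return f"{name}({argument!r}) done"


async def explore():
    # Independent calls are started together; gather() returns the results
    # in the order the calls were passed in, regardless of completion order.
    return await asyncio.gather(
        fake_tool("codebase_search", "How does authentication work?", 0.02),
        fake_tool("grep", "auth|login|session", 0.01),
        fake_tool("read_file", "src/app.py", 0.01),
    )


results = asyncio.run(explore())
```

Dependent calls, such as editing a file found by the search, would be awaited afterwards, once these results are available.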
+
+---
+
+## 6. Codebase Indexing and RAG
+
+### 6.1 Indexing Pipeline
+
+```
+Open project
+  ↓
+Background Scanning
+  ├─> Split files into chunks (tree-sitter for AST-based splitting)
+  ├─> Chunks: roughly a few hundred tokens, at logical boundaries (functions, classes)
+  └─> Respect .gitignore / .cursorignore
+        ↓
+Embedding computation
+  ├─> OpenAI embedding model or custom model
+  └─> Vector representation of the semantic content
+        ↓
+Vector database (backend)
+  ├─> Embeddings with metadata (file, lines, chunk ID)
+  └─> Nearest-neighbor search for queries
+```
+
+### 6.2 Retrieval-Augmented Generation (RAG)
+
+```
+User query
+  ↓
+Query → embedding vector
+  ↓
+Nearest-neighbor search in the vector DB
+  ↓
+Top-K relevant code chunks
+  ├─> File name + line numbers
+  └─> Code content
+        ↓
+Embedded in the LLM prompt as context
+  ↓
+LLM generates a response based on project context
+```
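The retrieval step can be illustrated with a brute-force nearest-neighbor search over toy vectors. The three-dimensional embeddings, the chunk records, and the function names are made up for illustration. A real system would use a model-generated embedding and a vector database instead of a linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector, chunks, k=2):
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vector, chunk["embedding"]),
        reverse=True,
    )
    return ranked[:k]


# Toy index: each chunk carries the metadata named in the pipeline above.
chunks = [
    {"file": "src/auth.py", "lines": (1, 20), "embedding": [0.9, 0.1, 0.0]},
    {"file": "src/db.py", "lines": (5, 40), "embedding": [0.0, 1.0, 0.0]},
    {"file": "src/login.py", "lines": (10, 30), "embedding": [0.8, 0.2, 0.1]},
]
hits = top_k([1.0, 0.0, 0.0], chunks, k=2)
```

The selected chunks, with their file names and line ranges, are what would be embedded into the prompt as context.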
+
+---
+
+## 7. Shadow Workspace
+
+Cursor implements a **Shadow Workspace** concept: a hidden Electron window that mirrors the project.
+
+**Purpose**:
+- Safely test AI-generated code changes before they are presented to the user
+- Apply language server feedback (type checking, linting) to proposed changes
+- Self-improvement loop: errors are fed back to the LLM
+
+**Flow**:
+```
+AI generates a code change
+  ↓
+Shadow Workspace (hidden window)
+  ├─> Apply changes (not to the real files)
+  ├─> Check with language servers (TypeScript, Python, etc.)
+  └─> Collect diagnostics
+        ↓
+[If errors] → Feedback to LLM → New attempt
+[If OK]     → Present the change to the user
+```
+
+---
+
+## 8. Subagent Architecture
+
+Since January 2026, Cursor has supported launching specialized subagents:
+
+### 8.1 Subagent Types
+
+| Type | Purpose | Capabilities |
+|---|---|---|
+| `generalPurpose` | Complex, multi-step tasks | Full tool access, code search, execution |
+| `explore` | Fast codebase exploration | Glob, Grep, Read - optimized for speed |
+| `browser-use` | Browser-based testing | Navigation, interaction, screenshots |
+
+### 8.2 Subagent Communication
+
+```
+Parent Agent
+  ├─> Task(prompt="...", subagent_type="explore")
+  │     ├─> Subagent receives its own context
+  │     ├─> Subagent executes tools
+  │     └─> Subagent returns its result (single message)
+  │
+  └─> Up to 4 subagents in parallel
+```
+
+**Important**: Subagents have no access to the parent agent's chat history. The prompt must contain all necessary information.
+
+---
+
+## 9. Background Agent API
+
+Since 2026, Cursor has offered a **Background Agent API** for programmatic access:
+
+### 9.1 Authentication
+
+Via `WorkosCursorSessionToken` (cookie/environment variable)
+
+### 9.2 Endpoints
+
+| Endpoint | Method | Description |
+|---|---|---|
+| Create composer | POST | Start a new Background Composer with a task description |
+| List composers | GET | Retrieve all running Background Composers |
+| Composer details | GET | Status and details of a specific composer |
+| Settings | GET | Retrieve user settings |
+
+### 9.3 Programmatic Access (TypeScript)
+
+```typescript
+import { CursorAPIClient } from 'cursor-api-client';
+
+const client = new CursorAPIClient('session-token');
+
+const result = await client.createBackgroundComposer({
+  taskDescription: 'Add user authentication',
+  repositoryUrl: 'https://github.com/user/repo.git',
+  branch: 'main',
+  model: 'claude-4-sonnet-thinking'
+});
+
+const composers = await client.listComposers();
+```
+
+### 9.4 Direct RPC Access (Go)
+
+```go
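+// Fragment: package imports for the cursor-rpc client are omitted, as in the original.
+creds, err := cursor.GetDefaultCredentials()
+if err != nil {
+    log.Fatal(err)
+}
+aiService := cursor.NewAiServiceClient()
+
+model := "gpt-4"
+resp, err := aiService.StreamChat(context.TODO(), cursor.NewRequest(creds, &aiserverv1.GetChatRequest{
+    ModelDetails: &aiserverv1.ModelDetails{
+        ModelName: &model,
+    },
+    Conversation: []*aiserverv1.ConversationMessage{
+        {
+            Text: "Hello, who are you?",
+            Type: aiserverv1.ConversationMessage_MESSAGE_TYPE_HUMAN,
+        },
+    },
+}))
+if err != nil {
+    log.Fatal(err)
+}
+
+// Print each streamed chunk as it arrives. fmt.Print avoids treating
+// model output as a format string.
+for resp.Receive() {
+    fmt.Print(resp.Msg().Text)
+}
+// Receive returns false on both end-of-stream and error, so check Err afterwards.
+if err := resp.Err(); err != nil {
+    log.Fatal(err)
+}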
+```
+
+---
+
+## 10. Model Routing
+
+Cursor uses different models depending on the task type:
+
+| Task | Model type | Examples |
+|---|---|---|
+| Chat / complex analysis | Frontier LLM | Claude 4.6 Opus, GPT-5.2/5.3, Gemini 3 Pro |
+| Tab Completion | Custom Fast Model | "Copilot++" (based on Llama 70B) |
+| Fast Apply (Diffs) | Specialized fine-tuned model | Optimized for code diffs, >1000 tokens/s |
+| Embeddings | Embedding Model | OpenAI embedding or custom |
+
+**Speculative Decoding**: A smaller "draft" model generates several tokens ahead, which the main model verifies in parallel, resulting in significantly faster inference.
+
+---
+
+## 11. Security Model
+
+### 11.1 Terminal Sandbox
+
+Commands run in a sandbox by default:
+- Allowed: writing in the workspace, reading from the file system
+- Blocked: network, Git writes, ignored files, USB
+- Extensible: `network`, `git_write`, `all` permissions
+
+### 11.2 Privacy Mode
+
+- When enabled: no code is stored after the request has been fulfilled
+- When disabled: anonymized telemetry, no long-term code storage
+
+### 11.3 Rules System
+
+- `.cursor/rules/` - Project-specific AI behavior rules
+- User Rules - Global preferences
+- `<system_reminder>` tags - Dynamic instructions (never visible to the user)
+
+---
+
+## Sources and References
+
+- Cursor official documentation: https://docs.cursor.com
+- Cursor Blog - Plan Mode: https://cursor.com/blog/plan-mode
+- Cursor Blog - Dynamic Context Discovery: https://cursor.com/blog/dynamic-context-discovery
+- Cursor Blog - Codex Model Harness: https://cursor.com/blog/codex-model-harness
+- Reverse-Engineered RPC Client: https://github.com/everestmz/cursor-rpc
+- Background Agent API Client: https://github.com/mjdierkes/cursor-background-agent-api
+- Architecture Analysis: https://adityarohilla.com/2025/05/08/how-cursor-works-internally/
+- System Prompt Analysis: https://medium.com/@lakkannawalikar/cursor-ai-architecture-system-prompts-and-tools-deep-dive-77f44cb1c6b0
--- a/cursor-doc/doc_cursor_chat_api_integration_concept.md
+++ b/cursor-doc/doc_cursor_chat_api_integration_concept.md
@ -0,0 +1,377 @@
+# Cursor Chat Component as an API - Integration Concept
+
+**As of**: February 2026
+**Purpose**: Concept for extracting Cursor's chat functionality as a standalone, API-driven component
+
+---
+
+## 1. Objective
+
+Cursor's chat functionality should be usable as a standalone component via an API, without putting Cursor itself at risk. The goals:
+
+- Use the user prompt from chat windows as a component via the API
+- Receive responses as a JSON object in which the parts of the response are captured as individual documents
+- Test new features in isolation first, before they are integrated into Cursor
+
+---
+
+## 2. Architecture Options
+
+### Option A: Address the Cursor Backend Directly (Recommended)
+
+Cursor's backend is reachable via two protocols:
+
+**Variant 1: ConnectRPC / Protobuf**
+```
+Own app → ConnectRPC Client → api2.cursor.sh → LLM Response Stream
+```
+
+- Uses `aiserver.v1.ChatService`
+- Method: `StreamChat()` or `StreamUnifiedChatWithTools()`
+- Authentication: Cursor session token
+- Advantage: Direct access to all models and features
+- Disadvantage: Dependence on Cursor's backend; the Protobuf schema can change
+
+**Variant 2: Background Agent API**
+```
+Own app → REST/HTTP → Cursor Background Composer API → Agent Response
+```
+
+- Uses the official Background Agent endpoint
+- Authentication: `WorkosCursorSessionToken`
+- Advantage: Officially supported, more stable
+- Disadvantage: Less control, asynchronous
+
+### Option B: Own LLM Orchestration Layer (Recommended for Independence)
+
+Rebuild the Cursor architecture, but as a separate component:
+
+```
+Own app
+  ↓
+Chat API Gateway (own service)
+  ├─> Prompt Assembly
+  │     ├─> System prompt (configurable)
+  │     ├─> Context injection (files, project layout)
+  │     └─> Tool definitions
+  ├─> LLM Provider (OpenAI / Anthropic / own)
+  │     ├─> POST /v1/chat/completions (OpenAI-compatible)
+  │     ├─> Streaming via SSE
+  │     └─> Tool Calling
+  ├─> Response Parser
+  │     ├─> Extract text segments
+  │     ├─> Identify tool calls
+  │     ├─> Parse code blocks
+  │     └─> Detect plan/todo structures
+  └─> JSON Response Builder
+        └─> Structured JSON with a documents array
+```
+
+**Advantages**:
+- No dependence on Cursor's backend
+- Own models and providers can be chosen
+- Full control over the prompt structure
+- Testable without a Cursor installation
+- Extensible with own tools
+
+---
+
+## 3. API Design
+
+### 3.1 Chat Endpoint
+
+```
+POST /api/v1/chat/completions
+Content-Type: application/json
+Authorization: Bearer <token>
+```
+
+**Request**:
+```json
+{
+  "model": "claude-4.6-opus",
+  "mode": "agent",
+  "context": {
+    "workspacePath": "/path/to/project",
+    "openFiles": ["src/main.py", "src/utils.py"],
+    "projectLayout": "src/\n  main.py\n  utils.py\nREADME.md",
+    "rules": [
+      "camelCase for variables",
+      "Internal functions with _ prefix"
+    ]
+  },
+  "messages": [
+    {
+      "role": "user",
+      "content": "Implement an authentication middleware"
+    }
+  ],
+  "tools": ["codebase_search", "read_file", "edit_file", "grep"],
+  "stream": false
+}
+```
+
+**Response** (see `doc_cursor_response_json_structure.md` for details):
+```json
+{
+  "id": "chat_abc123",
+  "model": "claude-4.6-opus",
+  "mode": "agent",
+  "created": "2026-02-23T10:30:00Z",
+  "documents": [
+    {
+      "id": "doc_001",
+      "type": "text",
+      "content": "I will implement an authentication middleware..."
+    },
+    {
+      "id": "doc_002",
+      "type": "tool_call",
+      "toolName": "read_file",
+      "arguments": { "target_file": "src/main.py" },
+      "result": "import os\nimport sys..."
+    },
+    {
+      "id": "doc_003",
+      "type": "code_reference",
+      "filePath": "src/main.py",
+      "startLine": 12,
+      "endLine": 25,
+      "content": "def main():\n    app = Flask(__name__)..."
+    },
+    {
+      "id": "doc_004",
+      "type": "code_block",
+      "language": "python",
+      "content": "def authMiddleware(f):\n    @wraps(f)\n    def decorated(*args):\n        ..."
+    },
+    {
+      "id": "doc_005",
+      "type": "file_edit",
+      "filePath": "src/middleware/auth.py",
+      "operation": "create",
+      "content": "from functools import wraps\n..."
+    },
+    {
+      "id": "doc_006",
+      "type": "plan",
+      "title": "Auth Middleware Plan",
+      "todos": [
+        { "id": "setup-auth", "content": "Create middleware function", "status": "completed" },
+        { "id": "add-routes", "content": "Protect routes", "status": "pending" }
+      ]
+    }
+  ],
+  "usage": {
+    "promptTokens": 2450,
+    "completionTokens": 890,
+    "totalTokens": 3340
+  }
+}
+```
+
+### 3.2 Streaming Endpoint
+
+```
+POST /api/v1/chat/completions
+Content-Type: application/json
+Accept: text/event-stream
+```
+
+Same request as above, but with `"stream": true`.
+
+**SSE Response**:
+```
+data: {"type": "document_start", "document": {"id": "doc_001", "type": "text"}}
+
+data: {"type": "content_delta", "documentId": "doc_001", "delta": "I will "}
+
+data: {"type": "content_delta", "documentId": "doc_001", "delta": "implement an authentication "}
+
+data: {"type": "content_delta", "documentId": "doc_001", "delta": "middleware..."}
+
+data: {"type": "document_end", "documentId": "doc_001"}
+
+data: {"type": "document_start", "document": {"id": "doc_002", "type": "tool_call", "toolName": "read_file"}}
+
+data: {"type": "tool_call_arguments", "documentId": "doc_002", "arguments": {"target_file": "src/main.py"}}
+
+data: {"type": "tool_result", "documentId": "doc_002", "result": "import os\n..."}
+
+data: {"type": "document_end", "documentId": "doc_002"}
+
+data: {"type": "done", "usage": {"promptTokens": 2450, "completionTokens": 890}}
+
+data: [DONE]
+```
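A client consuming this stream has to reassemble the documents from the individual events. A minimal sketch of that reassembly, assuming exactly the event types shown above; the function name `assemble_documents` is hypothetical, and the input is taken as an iterable of already-received text lines:

```python
import json


def assemble_documents(sse_lines):
    """Rebuild the documents list (and usage) from a sequence of SSE lines."""
    documents = {}  # insertion-ordered: document id -> document
    usage = None
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank separator lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        event = json.loads(payload)
        kind = event["type"]
        if kind == "document_start":
            document = dict(event["document"])
            documents[document["id"]] = document
        elif kind == "content_delta":
            document = documents[event["documentId"]]
            document["content"] = document.get("content", "") + event["delta"]
        elif kind == "tool_call_arguments":
            documents[event["documentId"]]["arguments"] = event["arguments"]
        elif kind == "tool_result":
            documents[event["documentId"]]["result"] = event["result"]
        elif kind == "done":
            usage = event.get("usage")
        # "document_end" needs no action in this sketch.
    return list(documents.values()), usage


sample = [
    'data: {"type": "document_start", "document": {"id": "doc_001", "type": "text"}}',
    "",
    'data: {"type": "content_delta", "documentId": "doc_001", "delta": "Hello, "}',
    'data: {"type": "content_delta", "documentId": "doc_001", "delta": "world"}',
    'data: {"type": "document_end", "documentId": "doc_001"}',
    'data: {"type": "done", "usage": {"promptTokens": 10, "completionTokens": 2}}',
    "data: [DONE]",
]
documents, usage = assemble_documents(sample)
```

Keying the documents by ID keeps the reassembly correct even if deltas for different documents were interleaved.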
+
+### 3.3 Plan Mode Endpoint
+
+```
+POST /api/v1/chat/plan
+```
+
+Same structure, but:
+- Only read-only tools are available
+- The response contains a `type: "plan"` document with structured Markdown
+- Clarifying questions are returned as `type: "clarification"` documents
+
+---
+
+## 4. Prompt Assembly Service
+
+The core of the integration is the **Prompt Assembly Service**, which implements the same logic as Cursor:
+
+### 4.1 System Prompt Builder
+
+```
+buildSystemPrompt(config)
+  ├─> Base Role Prompt
+  │     "You are an AI coding assistant..."
+  ├─> Communication Rules
+  │     Markdown, Code Citations
+  ├─> Tool Calling Rules
+  │     Parallel Execution, Standard Format
+  ├─> Context Strategy
+  │     Thorough Exploration, Semantic Search
+  ├─> Code Change Rules
+  │     Read before Edit, Fix Lints
+  ├─> Mode-Specific Rules
+  │     Agent: Full tools | Plan: Read-only | Ask: Explain only
+  ├─> Workspace Rules
+  │     From .cursor/rules/ or own configuration
+  └─> User Rules
+        User-defined preferences
+```
+
+### 4.2 Context Assembly
+
+```
+assembleContext(request)
+  ├─> User Info
+  │     { os, date, shell, workspacePath }
+  ├─> Project Layout
+  │     File tree (generated once)
+  ├─> Open Files
+  │     Currently open files with their contents
+  ├─> Recent Files
+  │     Recently viewed files
+  ├─> Git Status
+  │     Branch, Staged/Unstaged Changes
+  └─> Linter Errors
+        Current diagnostics
+```
+
+### 4.3 Tool Registry
+
+Tools are registered dynamically and can be configured per request:
+
+```json
+{
+  "availableTools": {
+    "codebase_search": { "enabled": true, "scope": "workspace" },
+    "read_file": { "enabled": true, "allowedPaths": ["src/", "tests/"] },
+    "edit_file": { "enabled": true, "requireApproval": true },
+    "run_terminal_cmd": { "enabled": false },
+    "web_search": { "enabled": true }
+  }
+}
+```
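How such a registry might be enforced can be sketched as follows. The configuration keys (`enabled`, `allowedPaths`) come from the example above. The helper names and the exact path policy (relative paths only, no `..` segments) are assumptions for illustration.

```python
from pathlib import PurePosixPath

registry = {
    "availableTools": {
        "codebase_search": {"enabled": True, "scope": "workspace"},
        "read_file": {"enabled": True, "allowedPaths": ["src/", "tests/"]},
        "edit_file": {"enabled": True, "requireApproval": True},
        "run_terminal_cmd": {"enabled": False},
        "web_search": {"enabled": True},
    }
}


def enabled_tools(config):
    """Names of all tools that may be offered to the LLM for this request."""
    return [name for name, cfg in config["availableTools"].items() if cfg.get("enabled")]


def is_path_allowed(config, tool, path):
    """Check a workspace-relative path against the tool's allowedPaths, if any."""
    cfg = config["availableTools"].get(tool, {})
    if not cfg.get("enabled"):
        return False
    allowed = cfg.get("allowedPaths")
    if allowed is None:
        return True  # no path restriction configured for this tool
    candidate = PurePosixPath(path)
    # Reject absolute paths and parent-directory traversal outright.
    if candidate.is_absolute() or ".." in candidate.parts:
        return False
    return any(candidate.is_relative_to(PurePosixPath(prefix)) for prefix in allowed)
```

Checking paths in the gateway rather than in the prompt means a tool call that ignores its instructions still cannot read outside the allowed directories.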
+
+---
+
+## 5. Response Parser
+
+The response parser converts the LLM stream into structured documents:
+
+### 5.1 Parsing Pipeline
+
+```
+LLM Streaming Response
+  ↓
+Token Accumulator
+  ↓
+Segment Detector
+  ├─> Text segment (everything outside code/tools)
+  ├─> Code block (``` ... ```)
+  │     ├─> Code Reference (startLine:endLine:filepath)
+  │     └─> Markdown Code Block (language tag)
+  ├─> Tool Call (function call JSON)
+  ├─> Plan (create_plan tool call)
+  └─> Todo (todo_write tool call)
+        ↓
+Document Builder
+  ├─> Assign a document ID to each segment
+  ├─> Classify the type
+  ├─> Extract metadata (file, lines, language)
+  └─> Insert into the documents[] array
+        ↓
+JSON Response
+```
+
+### 5.2 Segment Detection
+
+| Pattern | Document type | Extracted metadata |
+|---|---|---|
+| Free text | `text` | content |
+| ` ```startLine:endLine:filepath ` | `code_reference` | filePath, startLine, endLine, content |
+| ` ```python ` | `code_block` | language, content |
+| Tool Call: `edit_file` | `file_edit` | filePath, operation, content |
+| Tool Call: `read_file` | `tool_call` | toolName, arguments, result |
+| Tool Call: `create_plan` | `plan` | title, overview, todos |
+| Tool Call: `todo_write` | `todo_update` | todos with id/status |
+| Tool Call: `run_terminal_cmd` | `terminal_command` | command, output, exitCode |
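The distinction between the two fenced-block rows in the table can be made from the fence's info string alone. A minimal sketch, assuming the info string has already been separated from the block body; `classify_fence` is a hypothetical helper name:

```python
import re

# Matches the code-reference info string "startLine:endLine:filepath".
CODE_REFERENCE = re.compile(r"^(\d+):(\d+):(.+)$")


def classify_fence(info_string, body):
    """Turn a fenced block into a code_reference or code_block document."""
    info = info_string.strip()
    match = CODE_REFERENCE.match(info)
    if match:
        start_line, end_line, file_path = match.groups()
        return {
            "type": "code_reference",
            "filePath": file_path,
            "startLine": int(start_line),
            "endLine": int(end_line),
            "content": body,
        }
    # Anything else is treated as a language tag (None when the tag is empty).
    return {"type": "code_block", "language": info or None, "content": body}


reference = classify_fence("12:25:src/main.py", "def main():\n    ...")
block = classify_fence("python", "print('hi')")
```

Because the pattern requires two leading integers, a language tag can never be mistaken for a reference.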
+
+---
+
+## 6. Implementation Steps
+
+### Phase 1: Chat API Core
+
+1. Build an **LLM Gateway** with an OpenAI-compatible interface
+2. Implement the **System Prompt Builder** (configurable)
+3. **Context Assembly** for workspace information
+4. **Streaming Response** via SSE
+5. **Response Parser** for document extraction
+
+### Phase 2: Tool Integration
+
+1. **Tool Registry** with dynamic configuration
+2. **File System Tools** (read, edit, grep, glob)
+3. **Tool Execution Engine** with sandbox
+4. **Multi-Turn Conversation** with tool results
+
+### Phase 3: Plan Mode
+
+1. **Plan Mode Flag** in the request
+2. **Read-Only Enforcement** (read-only tools only)
+3. **create_plan Response Parsing**
+4. **Clarifying questions** as a separate document type
+
+### Phase 4: Subagents
+
+1. **Task Delegation** to subagents running in parallel
+2. **Result Aggregation** in the parent response
+3. **Background Agent** for long-running tasks
+
+---
+
+## 7. Technology Recommendation
+
+| Component | Recommendation | Rationale |
+|---|---|---|
+| API Framework | FastAPI (Python) or Express (Node.js) | SSE support, async, fast |
+| LLM Client | `openai` SDK or `anthropic` SDK | Standard, well maintained |
+| Streaming | Server-Sent Events (SSE) | Browser-compatible, simple |
+| Prompt Templates | Jinja2 (Python) or Handlebars (JS) | Flexible, testable |
+| File System Access | Local access or MCP server | Security, modularity |
+| Codebase Search | tree-sitter + vector DB | Like Cursor, AST-based |
+
+---
+
+## References
+
+- Cursor architecture: `doc_cursor_ai_agent_architecture.md`
+- JSON response structure: `doc_cursor_response_json_structure.md`
+- OpenAI Chat Completions API: https://platform.openai.com/docs/api-reference/chat
+- Server-Sent Events: https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events
--- a/cursor-doc/doc_cursor_response_json_structure.md
+++ b/cursor-doc/doc_cursor_response_json_structure.md
@ -0,0 +1,726 @@
+# Cursor Response JSON Structure - Specification
+
+**As of**: February 2026
+**Purpose**: Definition of the JSON response structure for the chat API component, in which parts of the response are captured as individual documents
+
+---
+
+## 1. Design Principle
+
+An LLM response typically consists of several kinds of segments: explanatory text, code references, new code blocks, tool calls, file changes, and plans. These segments are captured as individual **documents** in an array so that they can be processed, rendered, and referenced individually.
+
+---
+
+## 2. Top-Level Response
+
+```json
+{
+  "id": "chat_<uuid>",
+  "conversationId": "conv_<uuid>",
+  "model": "claude-4.6-opus",
+  "mode": "agent | plan | ask | debug",
+  "created": "2026-02-23T10:30:00Z",
+  "status": "completed | streaming | error",
+  "documents": [ /* Array of document objects */ ],
+  "usage": {
+    "promptTokens": 2450,
+    "completionTokens": 890,
+    "totalTokens": 3340
+  },
+  "metadata": {
+    "duration_ms": 4520,
+    "toolCallCount": 3,
+    "turnCount": 2
+  }
+}
+```
+
+### Field Description
+
+| Field | Type | Description |
+|---|---|---|
+| `id` | string | Unique ID of this response |
+| `conversationId` | string | ID of the ongoing conversation (for multi-turn) |
+| `model` | string | LLM model used |
+| `mode` | enum | Active mode: agent, plan, ask, debug |
+| `created` | string (ISO 8601) | Creation timestamp |
+| `status` | enum | Status: completed, streaming, error |
+| `documents` | array | Ordered array of response documents |
+| `usage` | object | Token usage |
+| `metadata` | object | Additional metadata |
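
The envelope fields above lend themselves to a cheap consistency check on the client. The following is a minimal sketch, not part of the specification: it assumes that `totalTokens` is the sum of the two other counters (as the examples in this document suggest) and that "ordered" means strictly increasing `sequence` values. The names `ResponseEnvelope` and `checkEnvelope` are hypothetical.

```typescript
// Hypothetical, reduced view of the response envelope: only the fields checked here.
interface EnvelopeUsage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

interface EnvelopeDocument {
  id: string;
  sequence: number;
}

interface ResponseEnvelope {
  documents: EnvelopeDocument[];
  usage: EnvelopeUsage;
}

// Returns a list of human-readable problems; an empty list means the checks passed.
function checkEnvelope(response: ResponseEnvelope): string[] {
  const problems: string[] = [];
  const usage = response.usage;

  // Assumption: totalTokens = promptTokens + completionTokens.
  if (usage.promptTokens + usage.completionTokens !== usage.totalTokens) {
    problems.push("totalTokens is not promptTokens + completionTokens");
  }

  // Assumption: documents are ordered by strictly increasing sequence.
  for (let i = 1; i < response.documents.length; i++) {
    if (response.documents[i].sequence <= response.documents[i - 1].sequence) {
      problems.push("document " + response.documents[i].id + " is out of sequence");
    }
  }
  return problems;
}
```

A client could run such a check once per completed response and log the returned problems instead of rejecting the response outright.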
+
+---
+
+## 3. Document Types
+
+Every document in the `documents[]` array has the following base structure:
+
+```json
+{
+  "id": "doc_<sequenceNumber>",
+  "type": "<documentType>",
+  "sequence": 1,
+  "content": "...",
+  "metadata": { /* type-specific metadata */ }
+}
+```
+
+### 3.1 `text` - Explanatory Text
+
+Free-text explanations, descriptions, and instructions from the AI assistant.
+
+```json
+{
+  "id": "doc_001",
+  "type": "text",
+  "sequence": 1,
+  "content": "I will implement an authentication middleware that validates JWT tokens and enriches the request context with user information.\n\nThis requires the following steps:",
+  "metadata": {
+    "format": "markdown"
+  }
+}
+```
+
+---
+
+### 3.2 `code_reference` - Reference to Existing Code
+
+Reference to existing code in the project (Cursor's `startLine:endLine:filepath` format).
+
+```json
+{
+  "id": "doc_002",
+  "type": "code_reference",
+  "sequence": 2,
+  "content": "def main():\n    app = Flask(__name__)\n    app.register_blueprint(api_bp)",
+  "metadata": {
+    "filePath": "src/main.py",
+    "startLine": 12,
+    "endLine": 14,
+    "language": "python"
+  }
+}
+```
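
The `startLine:endLine:filepath` format can be turned into the metadata fields above with a small parser. This is a sketch under assumptions: the helper name `parseCodeReference` is hypothetical, line numbers are taken to be 1-based and inclusive, and everything after the second colon is treated as the path.

```typescript
// Hypothetical result type mirroring the code_reference metadata fields.
interface ParsedCodeReference {
  startLine: number;
  endLine: number;
  filePath: string;
}

// Parses an info string such as "12:14:src/main.py".
// Returns null when the string is not a code reference (for example a plain language tag).
function parseCodeReference(info: string): ParsedCodeReference | null {
  // Two numeric fields first; the rest is the path, so paths that contain ":" stay intact.
  const match = /^(\d+):(\d+):(.+)$/.exec(info.trim());
  if (match === null) {
    return null;
  }
  const startLine = parseInt(match[1], 10);
  const endLine = parseInt(match[2], 10);
  // Assumption: 1-based, inclusive range; reject inverted or zero-based ranges.
  if (startLine < 1 || endLine < startLine) {
    return null;
  }
  return { startLine: startLine, endLine: endLine, filePath: match[3] };
}
```

Returning `null` rather than throwing lets the caller fall back to treating the fence as an ordinary `code_block`.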
+
+---
+
+### 3.3 `code_block` - New/Proposed Code
+
+Code block for new or proposed code that is not directly associated with a file.
+
+```json
+{
+  "id": "doc_003",
+  "type": "code_block",
+  "sequence": 3,
+  "content": "from functools import wraps\nfrom flask import request, jsonify\nimport jwt\n\ndef authMiddleware(f):\n    @wraps(f)\n    def decorated(*args, **kwargs):\n        token = request.headers.get('Authorization')\n        if not token:\n            return jsonify({'error': 'Token missing'}), 401\n        try:\n            data = jwt.decode(token, SECRET_KEY, algorithms=['HS256'])\n            request.userId = data['userId']\n        except jwt.InvalidTokenError:\n            return jsonify({'error': 'Invalid token'}), 401\n        return f(*args, **kwargs)\n    return decorated",
+  "metadata": {
+    "language": "python",
+    "purpose": "new_code"
+  }
+}
+```
+
+---
+
+### 3.4 `file_edit` - File Change
+
+A concrete change to an existing or new file.
+
+```json
+{
+  "id": "doc_004",
+  "type": "file_edit",
+  "sequence": 4,
+  "content": "from functools import wraps\nfrom flask import request, jsonify\nimport jwt\n\nSECRET_KEY = 'your-secret-key'\n\ndef authMiddleware(f):\n    ...",
+  "metadata": {
+    "filePath": "src/middleware/auth.py",
+    "operation": "create",
+    "language": "python",
+    "diff": null
+  }
+}
+```
+
+**Operations**:
+
+| Operation | Description |
+|---|---|
+| `create` | Create a new file |
+| `edit` | Modify an existing file |
+| `delete` | Delete a file |
+| `rename` | Rename a file |
+
+For `edit`, a diff is supplied as well; for the other operations `diff` is `null` or omitted:
+
+```json
+{
+  "id": "doc_005",
+  "type": "file_edit",
+  "sequence": 5,
+  "content": "app.register_blueprint(api_bp)\napp.register_blueprint(auth_bp)",
+  "metadata": {
+    "filePath": "src/main.py",
+    "operation": "edit",
+    "language": "python",
+    "diff": {
+      "oldString": "app.register_blueprint(api_bp)",
+      "newString": "app.register_blueprint(api_bp)\napp.register_blueprint(auth_bp)",
+      "startLine": 14,
+      "endLine": 14
+    }
+  }
+}
+```
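
How a client applies such a diff is not defined in this specification. One plausible reading, sketched here as an assumption, is that `oldString` must occur within lines `startLine..endLine` (1-based, inclusive) and that this occurrence is replaced by `newString`. The names `EditDiff` and `applyEditDiff` are hypothetical.

```typescript
// Mirrors the "diff" object of a file_edit document.
interface EditDiff {
  oldString: string;
  newString: string;
  startLine: number;
  endLine: number;
}

// Replaces the first occurrence of oldString inside the given line range.
// Throws instead of guessing when the range or the old text does not match the file.
function applyEditDiff(fileContent: string, diff: EditDiff): string {
  const lines = fileContent.split("\n");
  if (diff.startLine < 1 || diff.endLine < diff.startLine || diff.endLine > lines.length) {
    throw new Error("diff range is outside the file");
  }
  const before = lines.slice(0, diff.startLine - 1);
  const target = lines.slice(diff.startLine - 1, diff.endLine).join("\n");
  const after = lines.slice(diff.endLine);

  const at = target.indexOf(diff.oldString);
  if (at === -1) {
    throw new Error("oldString not found in the given line range");
  }
  const patched =
    target.slice(0, at) + diff.newString + target.slice(at + diff.oldString.length);
  return before.concat([patched], after).join("\n");
}
```

Checking `oldString` against the file before writing guards against applying an edit to a file that has changed since the model read it.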
+
+---
+
+### 3.5 `tool_call` - Tool Call and Result
+
+Documents a tool call, including its arguments and result.
+
+```json
+{
+  "id": "doc_006",
+  "type": "tool_call",
+  "sequence": 6,
+  "content": null,
+  "metadata": {
+    "toolName": "codebase_search",
+    "toolCallId": "call_abc123",
+    "arguments": {
+      "query": "Where is the Flask app initialized?",
+      "target_directories": ["src/"],
+      "explanation": "Find the main app entry point"
+    },
+    "result": {
+      "status": "success",
+      "data": "Found in src/main.py:12 - app = Flask(__name__)"
+    },
+    "duration_ms": 320
+  }
+}
+```
+
+---
+
+### 3.6 `terminal_command` - Terminal Execution
+
+Documents the execution of a terminal command.
+
+```json
+{
+  "id": "doc_007",
+  "type": "terminal_command",
+  "sequence": 7,
+  "content": null,
+  "metadata": {
+    "command": "pip install PyJWT flask",
+    "workingDirectory": "/path/to/project",
+    "exitCode": 0,
+    "output": "Successfully installed PyJWT-2.8.0 flask-3.0.0",
+    "duration_ms": 5200,
+    "permissions": ["network"]
+  }
+}
+```
+
+---
+
+### 3.7 `plan` - Structured Plan
+
+Result of Plan mode or of a `create_plan` tool call.
+
+```json
+{
+  "id": "doc_008",
+  "type": "plan",
+  "sequence": 8,
+  "content": "# Authentication Middleware\n\n## Overview\nImplementation of a JWT-based auth middleware...\n\n## Files\n- `src/middleware/auth.py` (new)\n- `src/main.py` (modify)\n- `requirements.txt` (modify)",
+  "metadata": {
+    "title": "Auth Middleware Implementation",
+    "overview": "JWT-based authentication middleware with token validation",
+    "todos": [
+      {
+        "id": "create-middleware",
+        "content": "Create the auth middleware module",
+        "status": "pending",
+        "dependencies": []
+      },
+      {
+        "id": "register-blueprint",
+        "content": "Register the blueprint in main.py",
+        "status": "pending",
+        "dependencies": ["create-middleware"]
+      },
+      {
+        "id": "update-deps",
+        "content": "Add PyJWT to requirements.txt",
+        "status": "pending",
+        "dependencies": []
+      },
+      {
+        "id": "protect-routes",
+        "content": "Add the decorator to routes that need protection",
+        "status": "pending",
+        "dependencies": ["create-middleware", "register-blueprint"]
+      }
+    ],
+    "format": "markdown"
+  }
+}
+```
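
The `dependencies` arrays make it possible to decide which todos can be started next. The sketch below assumes that a todo is ready when it is `pending` and all of its dependencies are `completed`; the specification does not state this rule explicitly, and the names `PlanTodoItem` and `readyTodos` are hypothetical.

```typescript
// Mirrors one entry of metadata.todos in a plan document.
interface PlanTodoItem {
  id: string;
  content: string;
  status: "pending" | "in_progress" | "completed" | "cancelled";
  dependencies: string[];
}

// Returns the ids of pending todos whose dependencies are all completed,
// in the order in which they appear in the plan.
function readyTodos(todos: PlanTodoItem[]): string[] {
  const completedIds: string[] = [];
  for (const todo of todos) {
    if (todo.status === "completed") {
      completedIds.push(todo.id);
    }
  }

  const ready: string[] = [];
  for (const todo of todos) {
    if (todo.status !== "pending") {
      continue;
    }
    let blocked = false;
    for (const dep of todo.dependencies) {
      // A dependency that is unknown or not completed blocks the todo.
      if (completedIds.indexOf(dep) === -1) {
        blocked = true;
      }
    }
    if (!blocked) {
      ready.push(todo.id);
    }
  }
  return ready;
}
```

For the plan above, with every todo still `pending`, this yields `create-middleware` and `update-deps`, the two entries without dependencies.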
+
+---
+
+### 3.8 `clarification` - Clarifying Question
+
+A clarifying question from the AI assistant to the user (common in Plan mode).
+
+```json
+{
+  "id": "doc_009",
+  "type": "clarification",
+  "sequence": 9,
+  "content": "Before I finalize the plan, I have the following questions:",
+  "metadata": {
+    "questions": [
+      {
+        "id": "q1",
+        "text": "Should the middleware use JWT or session-based auth?",
+        "options": [
+          { "id": "jwt", "label": "JWT token-based" },
+          { "id": "session", "label": "Session-based with cookies" }
+        ]
+      },
+      {
+        "id": "q2",
+        "text": "Which routes should be protected?",
+        "options": [
+          { "id": "all", "label": "All except login/register" },
+          { "id": "selective", "label": "Only explicitly marked routes" }
+        ]
+      }
+    ]
+  }
+}
+```
+
+---
+
+### 3.9 `error` - Error Message
+
+Error message for a failed tool execution or an LLM error.
+
+```json
+{
+  "id": "doc_010",
+  "type": "error",
+  "sequence": 10,
+  "content": "The file could not be read",
+  "metadata": {
+    "errorCode": "FILE_NOT_FOUND",
+    "source": "read_file",
+    "details": "src/middleware/auth.py does not exist"
+  }
+}
+```
+
+---
+
+### 3.10 `todo_update` - Task Status Change
+
+A change to task tracking during execution.
+
+```json
+{
+  "id": "doc_011",
+  "type": "todo_update",
+  "sequence": 11,
+  "content": null,
+  "metadata": {
+    "todos": [
+      { "id": "create-middleware", "status": "completed" },
+      { "id": "register-blueprint", "status": "in_progress" }
+    ]
+  }
+}
+```
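
A client that keeps the todos of an earlier `plan` document can fold a `todo_update` into that state. The following sketch assumes that updates are partial (only the listed ids change) and that unknown ids are ignored; both are assumptions, and the names `TrackedTodo`, `TodoStatusChange`, and `applyTodoUpdate` are hypothetical.

```typescript
type TodoStatus = "pending" | "in_progress" | "completed" | "cancelled";

interface TrackedTodo {
  id: string;
  content: string;
  status: TodoStatus;
  dependencies: string[];
}

// One entry of metadata.todos in a todo_update document.
interface TodoStatusChange {
  id: string;
  status: TodoStatus;
}

// Returns a new todo list with the status changes applied.
// Todos that are not mentioned in the update are returned unchanged.
function applyTodoUpdate(todos: TrackedTodo[], changes: TodoStatusChange[]): TrackedTodo[] {
  return todos.map((todo) => {
    let updated = todo;
    for (const change of changes) {
      if (change.id === todo.id) {
        // Copy instead of mutating so earlier snapshots of the plan stay valid.
        updated = {
          id: todo.id,
          content: todo.content,
          status: change.status,
          dependencies: todo.dependencies
        };
      }
    }
    return updated;
  });
}
```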
+
+---
+
+## 4. Complete Response Example
+
+A realistic example for the request "Implement JWT authentication":
+
+```json
+{
+  "id": "chat_7f3a2b1c",
+  "conversationId": "conv_9e8d7c6b",
+  "model": "claude-4.6-opus",
+  "mode": "agent",
+  "created": "2026-02-23T10:30:00Z",
+  "status": "completed",
+  "documents": [
+    {
+      "id": "doc_001",
+      "type": "text",
+      "sequence": 1,
+      "content": "I will first analyze the existing project structure to find the best integration strategy.",
+      "metadata": { "format": "markdown" }
+    },
+    {
+      "id": "doc_002",
+      "type": "tool_call",
+      "sequence": 2,
+      "content": null,
+      "metadata": {
+        "toolName": "codebase_search",
+        "toolCallId": "call_001",
+        "arguments": {
+          "query": "Where is the Flask app initialized?",
+          "target_directories": ["src/"]
+        },
+        "result": { "status": "success", "data": "src/main.py:5" },
+        "duration_ms": 280
+      }
+    },
+    {
+      "id": "doc_003",
+      "type": "tool_call",
+      "sequence": 3,
+      "content": null,
+      "metadata": {
+        "toolName": "read_file",
+        "toolCallId": "call_002",
+        "arguments": { "target_file": "src/main.py" },
+        "result": {
+          "status": "success",
+          "data": "import os\nfrom flask import Flask\n\ndef main():\n    app = Flask(__name__)\n    app.register_blueprint(api_bp)\n    return app"
+        },
+        "duration_ms": 45
+      }
+    },
+    {
+      "id": "doc_004",
+      "type": "code_reference",
+      "sequence": 4,
+      "content": "def main():\n    app = Flask(__name__)\n    app.register_blueprint(api_bp)\n    return app",
+      "metadata": {
+        "filePath": "src/main.py",
+        "startLine": 5,
+        "endLine": 8,
+        "language": "python"
+      }
+    },
+    {
+      "id": "doc_005",
+      "type": "text",
+      "sequence": 5,
+      "content": "I will now create the auth middleware and integrate it into the app.",
+      "metadata": { "format": "markdown" }
+    },
+    {
+      "id": "doc_006",
+      "type": "file_edit",
+      "sequence": 6,
+      "content": "from functools import wraps\nfrom flask import request, jsonify\nimport jwt\nimport os\n\nSECRET_KEY = os.environ.get('JWT_SECRET', 'dev-secret')\n\ndef authMiddleware(f):\n    @wraps(f)\n    def decorated(*args, **kwargs):\n        token = request.headers.get('Authorization', '').replace('Bearer ', '')\n        if not token:\n            return jsonify({'error': 'Token required'}), 401\n        try:\n            data = jwt.decode(token, SECRET_KEY, algorithms=['HS256'])\n            request.userId = data['userId']\n        except jwt.InvalidTokenError:\n            return jsonify({'error': 'Invalid token'}), 401\n        return f(*args, **kwargs)\n    return decorated",
+      "metadata": {
+        "filePath": "src/middleware/auth.py",
+        "operation": "create",
+        "language": "python"
+      }
+    },
+    {
+      "id": "doc_007",
+      "type": "file_edit",
+      "sequence": 7,
+      "content": "from flask import Flask\nfrom middleware.auth import authMiddleware",
+      "metadata": {
+        "filePath": "src/main.py",
+        "operation": "edit",
+        "language": "python",
+        "diff": {
+          "oldString": "from flask import Flask",
+          "newString": "from flask import Flask\nfrom middleware.auth import authMiddleware",
+          "startLine": 2,
+          "endLine": 2
+        }
+      }
+    },
+    {
+      "id": "doc_008",
+      "type": "terminal_command",
+      "sequence": 8,
+      "content": null,
+      "metadata": {
+        "command": "pip install PyJWT",
+        "exitCode": 0,
+        "output": "Successfully installed PyJWT-2.8.0",
+        "duration_ms": 3200,
+        "permissions": ["network"]
+      }
+    },
+    {
+      "id": "doc_009",
+      "type": "text",
+      "sequence": 9,
+      "content": "The JWT authentication middleware is implemented. You can now protect routes with the `@authMiddleware` decorator:\n\n```python\n@app.route('/api/protected')\n@authMiddleware\ndef protectedRoute():\n    return jsonify({'userId': request.userId})\n```",
+      "metadata": { "format": "markdown" }
+    }
+  ],
+  "usage": {
+    "promptTokens": 3200,
+    "completionTokens": 1450,
+    "totalTokens": 4650
+  },
+  "metadata": {
+    "duration_ms": 8750,
+    "toolCallCount": 2,
+    "turnCount": 3
+  }
+}
+```
+
+---
+
+## 5. Streaming Format
+
+In streaming mode, each document is transmitted as a series of Server-Sent Events:
+
+### 5.1 Event Types
+
+| Event Type | Description |
+|---|---|
+| `document_start` | A new document begins; contains id, type, and sequence |
+| `content_delta` | Text chunk for a document in progress |
+| `tool_call_start` | Tool call begins |
+| `tool_call_arguments` | Arguments of the tool call |
+| `tool_result` | Result of the tool call |
+| `document_end` | Document completed |
+| `done` | Entire response completed; contains usage |
+
+### 5.2 Event Sequence for a Text Document
+
+```
+event: document_start
+data: {"id":"doc_001","type":"text","sequence":1}
+
+event: content_delta
+data: {"documentId":"doc_001","delta":"I will first analyze "}
+
+event: content_delta
+data: {"documentId":"doc_001","delta":"the existing "}
+
+event: content_delta
+data: {"documentId":"doc_001","delta":"project structure..."}
+
+event: document_end
+data: {"documentId":"doc_001","finalContent":"I will first analyze the existing project structure..."}
+```
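
The wire format above is plain Server-Sent Events, so a client first has to split the text into frames. The sketch below parses a complete buffer with `event:` and `data:` fields and decodes each payload as JSON. It is deliberately simplified: a real client would parse incrementally as chunks arrive, and the names `SseFrame`, `sseFieldValue`, and `parseSseFrames` are hypothetical.

```typescript
interface SseFrame {
  eventName: string;
  payload: unknown;
}

// Returns the value of "field: value" lines, or null if the line is a different field.
// Per the SSE format, one leading space after the colon is not part of the value.
function sseFieldValue(line: string, field: string): string | null {
  const prefix = field + ":";
  if (line.indexOf(prefix) !== 0) {
    return null;
  }
  const value = line.slice(prefix.length);
  return value.charAt(0) === " " ? value.slice(1) : value;
}

// Parses a complete SSE buffer into frames. Frames are separated by a blank line.
function parseSseFrames(raw: string): SseFrame[] {
  const frames: SseFrame[] = [];
  const blocks = raw.replace(/\r\n/g, "\n").split("\n\n");
  for (const block of blocks) {
    let eventName = "message"; // SSE default when no "event:" field is sent
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      const eventValue = sseFieldValue(line, "event");
      const dataValue = sseFieldValue(line, "data");
      if (eventValue !== null) {
        eventName = eventValue;
      } else if (dataValue !== null) {
        dataLines.push(dataValue);
      }
    }
    // Blocks without data (for example the empty tail of the buffer) are skipped.
    if (dataLines.length > 0) {
      frames.push({ eventName: eventName, payload: JSON.parse(dataLines.join("\n")) });
    }
  }
  return frames;
}
```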
+
+### 5.3 Event Sequence for a Tool Call
+
+```
+event: document_start
+data: {"id":"doc_002","type":"tool_call","sequence":2}
+
+event: tool_call_start
+data: {"documentId":"doc_002","toolName":"read_file","toolCallId":"call_001"}
+
+event: tool_call_arguments
+data: {"documentId":"doc_002","arguments":{"target_file":"src/main.py"}}
+
+event: tool_result
+data: {"documentId":"doc_002","result":{"status":"success","data":"import os\n..."}}
+
+event: document_end
+data: {"documentId":"doc_002"}
+```
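
On the client, the two event sequences can be folded back into complete documents. The reducer below is a sketch under assumptions: events arrive already decoded as `{ type, data }` pairs, `finalContent` (when present) overrides the accumulated deltas, and events for unknown documents are ignored. The names `StreamedDocument`, `IncomingEvent`, and `assembleDocuments` are hypothetical.

```typescript
interface StreamedDocument {
  id: string;
  type: string;
  sequence: number;
  content: string | null;
  metadata: Record<string, unknown>;
}

interface IncomingEvent {
  type: string;
  data: Record<string, unknown>;
}

// Rebuilds the documents[] array from a sequence of stream events.
function assembleDocuments(events: IncomingEvent[]): StreamedDocument[] {
  const docs: StreamedDocument[] = [];

  const findById = (id: unknown): StreamedDocument | undefined => {
    for (const doc of docs) {
      if (doc.id === id) {
        return doc;
      }
    }
    return undefined;
  };

  for (const ev of events) {
    const d = ev.data;
    if (ev.type === "document_start") {
      docs.push({
        id: String(d["id"]),
        type: String(d["type"]),
        sequence: Number(d["sequence"]),
        content: null,
        metadata: {}
      });
      continue;
    }
    const doc = findById(d["documentId"]);
    if (doc === undefined) {
      continue; // "done", or an event for a document that was never started
    }
    if (ev.type === "content_delta") {
      doc.content = (doc.content === null ? "" : doc.content) + String(d["delta"]);
    } else if (ev.type === "tool_call_start") {
      doc.metadata["toolName"] = d["toolName"];
      doc.metadata["toolCallId"] = d["toolCallId"];
    } else if (ev.type === "tool_call_arguments") {
      doc.metadata["arguments"] = d["arguments"];
    } else if (ev.type === "tool_result") {
      doc.metadata["result"] = d["result"];
    } else if (ev.type === "document_end") {
      const finalContent = d["finalContent"];
      if (typeof finalContent === "string") {
        doc.content = finalContent; // assumption: finalContent is authoritative
      }
    }
  }
  return docs;
}
```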
+
+---
+
+## 6. TypeScript Interface Definitions
+
+```typescript
+interface ChatResponse {
+  id: string;
+  conversationId: string;
+  model: string;
+  mode: "agent" | "plan" | "ask" | "debug";
+  created: string;
+  status: "completed" | "streaming" | "error";
+  documents: Document[];
+  usage: UsageInfo;
+  metadata: ResponseMetadata;
+}
+
+type DocumentType =
+  | "text"
+  | "code_reference"
+  | "code_block"
+  | "file_edit"
+  | "tool_call"
+  | "terminal_command"
+  | "plan"
+  | "clarification"
+  | "error"
+  | "todo_update";
+
+interface Document {
+  id: string;
+  type: DocumentType;
+  sequence: number;
+  content: string | null;
+  metadata: Record<string, unknown>;
+}
+
+interface TextMetadata {
+  format: "markdown" | "plain";
+}
+
+interface CodeReferenceMetadata {
+  filePath: string;
+  startLine: number;
+  endLine: number;
+  language: string;
+}
+
+interface CodeBlockMetadata {
+  language: string;
+  purpose: "new_code" | "example" | "suggestion";
+}
+
+interface FileEditMetadata {
+  filePath: string;
+  operation: "create" | "edit" | "delete" | "rename";
+  language: string;
+  // Set for "edit"; null or omitted for the other operations.
+  diff?: {
+    oldString: string;
+    newString: string;
+    startLine: number;
+    endLine: number;
+  } | null;
+}
+
+interface ToolCallMetadata {
+  toolName: string;
+  toolCallId: string;
+  arguments: Record<string, unknown>;
+  result: {
+    status: "success" | "error";
+    data: unknown;
+  };
+  duration_ms: number;
+}
+
+interface TerminalCommandMetadata {
+  command: string;
+  workingDirectory?: string;
+  exitCode: number;
+  output: string;
+  duration_ms: number;
+  permissions: string[];
+}
+
+interface PlanMetadata {
+  title: string;
+  overview: string;
+  todos: PlanTodo[];
+  format: "markdown";
+}
+
+interface PlanTodo {
+  id: string;
+  content: string;
+  status: "pending" | "in_progress" | "completed" | "cancelled";
+  dependencies: string[];
+}
+
+interface ClarificationMetadata {
+  questions: ClarificationQuestion[];
+}
+
+interface ClarificationQuestion {
+  id: string;
+  text: string;
+  options: { id: string; label: string }[];
+}
+
+interface ErrorMetadata {
+  errorCode: string;
+  source: string;
+  details: string;
+}
+
+interface TodoUpdateMetadata {
+  todos: { id: string; status: PlanTodo["status"] }[];
+}
+
+interface UsageInfo {
+  promptTokens: number;
+  completionTokens: number;
+  totalTokens: number;
+}
+
+interface ResponseMetadata {
+  duration_ms: number;
+  toolCallCount: number;
+  turnCount: number;
+}
+
+// Streaming Events
+type StreamEventType =
+  | "document_start"
+  | "content_delta"
+  | "tool_call_start"
+  | "tool_call_arguments"
+  | "tool_result"
+  | "document_end"
+  | "done";
+
+interface StreamEvent {
+  type: StreamEventType;
+  documentId?: string;
+  data: unknown;
+}
+```
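
In the interfaces above, `Document.metadata` is typed as `Record<string, unknown>`, so the per-type metadata interfaces are not connected to the `type` field. One possible refinement is a discriminated union. This is a suggestion rather than part of the specification; it is shown for three document types only, and the names `TextDoc`, `TerminalDoc`, `ErrorDoc`, `TypedDoc`, and `summarizeDoc` are hypothetical.

```typescript
interface TextDoc {
  type: "text";
  content: string;
  metadata: { format: "markdown" | "plain" };
}

interface TerminalDoc {
  type: "terminal_command";
  content: null;
  metadata: { command: string; exitCode: number; output: string };
}

interface ErrorDoc {
  type: "error";
  content: string;
  metadata: { errorCode: string; source: string; details: string };
}

// The "type" field is the discriminant: checking it narrows content and metadata.
type TypedDoc = TextDoc | TerminalDoc | ErrorDoc;

// Produces a one-line summary; the compiler verifies that every variant is handled.
function summarizeDoc(doc: TypedDoc): string {
  switch (doc.type) {
    case "text":
      return doc.content;
    case "terminal_command":
      return "$ " + doc.metadata.command + " (exit " + doc.metadata.exitCode + ")";
    case "error":
      return doc.metadata.errorCode + ": " + doc.content;
    default: {
      // Compile-time exhaustiveness check: adding a variant without a case fails here.
      const unreachable: never = doc;
      return unreachable;
    }
  }
}
```

With this shape, a renderer no longer needs casts to read `metadata.command` or `metadata.errorCode`.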
+
+---
+
+## 7. Mapping: Cursor LLM Response → Document Types
+
+How Cursor's raw LLM stream is converted into document types:
+
+```
+LLM Output Stream
+  ↓
+Parser State Machine
+  │
+  ├─> Detects free text → type: "text"
+  │     Accumulates until the next segment begins
+  │
+  ├─> Detects ``` with startLine:endLine:filepath → type: "code_reference"
+  │     Extracts: filePath, startLine, endLine, content
+  │
+  ├─> Detects ``` with language tag → type: "code_block"
+  │     Extracts: language, content
+  │
+  ├─> Detects tool call in the response
+  │     ├─> tool_name == "edit_file" → type: "file_edit"
+  │     ├─> tool_name == "run_terminal_cmd" → type: "terminal_command"
+  │     ├─> tool_name == "create_plan" → type: "plan"
+  │     ├─> tool_name == "todo_write" → type: "todo_update"
+  │     └─> all others → type: "tool_call"
+  │
+  └─> Detects an error → type: "error"
+```
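
The two classification steps of the mapping, tool name to document type and fence info string to document type, can be sketched as pure functions. The tool names are the ones listed in the diagram; the function names `documentTypeForTool` and `documentTypeForFence` are hypothetical, and the full state machine (text accumulation, error detection) is not covered here.

```typescript
type ToolDerivedType = "file_edit" | "terminal_command" | "plan" | "todo_update" | "tool_call";

// Maps a tool name to the document type used for its call, following the diagram above.
function documentTypeForTool(toolName: string): ToolDerivedType {
  switch (toolName) {
    case "edit_file":
      return "file_edit";
    case "run_terminal_cmd":
      return "terminal_command";
    case "create_plan":
      return "plan";
    case "todo_write":
      return "todo_update";
    default:
      return "tool_call"; // every other tool stays a generic tool_call document
  }
}

// Classifies the info string that follows an opening code fence.
// "12:14:src/main.py" is a reference to existing code; anything else is treated as a language tag.
function documentTypeForFence(info: string): "code_reference" | "code_block" {
  return /^\d+:\d+:.+$/.test(info.trim()) ? "code_reference" : "code_block";
}
```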
+
+---
+
+## References
+
+- Cursor architecture: `doc_cursor_ai_agent_architecture.md`
+- API integration concept: `doc_cursor_chat_api_integration_concept.md`
+- OpenAI Chat Completions Streaming: https://platform.openai.com/docs/api-reference/chat/streaming
+- Server-Sent Events Spec: https://html.spec.whatwg.org/multipage/server-sent-events.html