cursor-mode-integration-mvp
parent d8c0ba9237
commit 040067f6af
3 changed files with 1736 additions and 0 deletions

cursor-doc/doc_cursor_ai_agent_architecture.md (normal file, 633 lines)
@ -0,0 +1,633 @@

# Cursor AI Agent Architecture - Agent & Planning Mode

**As of**: February 2026
**Purpose**: Technical documentation of the AI agent integration in Cursor (a VS Code fork), as a basis for developing a standalone chat component with API access

---

## 1. Overview

Cursor is a fork of Visual Studio Code extended with deep AI capabilities. Unlike VS Code extensions, which are limited by the extension API, Cursor as a true fork has full access to the rendering pipeline, the AST level, and the language server infrastructure.

The AI integration consists of two main components:

1. **Client (Body)**: The Electron/VS Code-based desktop application with custom UI elements (chat sidebar, Composer panel, inline completions)
2. **Backend (Brain)**: Cloud services for LLM orchestration, prompt construction, codebase indexing, and embedding storage

```
User Interaction
    ↓
Cursor Client (Electron / VS Code Fork)
    ├─> Chat Panel UI (Custom Webview)
    ├─> Composer Panel (Multi-File Editor)
    ├─> Inline Completions (Tab)
    └─> Cmd/Ctrl+K (Inline Edit)
    ↓
Context Assembly Layer
    ├─> Open Files / Cursor Position
    ├─> Semantic Search (RAG)
    ├─> AST / Language Server Data
    └─> Conversation History
    ↓
Cursor Backend (Cloud)
    ├─> Prompt Construction
    ├─> Model Routing (GPT / Claude / Gemini / Custom)
    ├─> Tool Orchestration
    └─> Response Streaming
    ↓
Response Processing (Client)
    ├─> Text Rendering (Markdown)
    ├─> Code Diff Application
    ├─> Tool Call Execution
    └─> Plan/Todo Management
```

---

## 2. Interaction Modes

Cursor offers four modes, which can be switched with `Ctrl+.`:

### 2.1 Agent Mode (default execution mode)

**Purpose**: Autonomous implementation, refactoring, bug fixing

**Capabilities**:
- Multi-file exploration and editing
- Terminal command execution (sandboxed)
- Codebase search (semantic + grep)
- Iterative refinement with compiler/linter feedback
- Parallel tool calls for speed

**Flow**:
```
User Prompt
    ↓
Context Assembly
    ├─> user_info (OS, Shell, Workspace)
    ├─> project_layout (File Tree Snapshot)
    ├─> open_files / cursor_position
    └─> edit_history / linter_errors
    ↓
System Prompt Injection
    ├─> Role Definition ("AI coding assistant, pair programmer")
    ├─> Communication Guidelines (Markdown, Code Citations)
    ├─> Tool Calling Rules (15+ Tools)
    ├─> Context Strategy (Parallel Execution, Thorough Exploration)
    └─> Code Change Guidelines (Read before Edit, Fix Lints)
    ↓
LLM Request (POST /v1/chat/completions)
    ├─> model: "claude-4.6-opus" | "gpt-5.2" | ...
    ├─> temperature: 0
    ├─> messages: [system, user_context, user_query]
    ├─> tools: [15+ tool definitions]
    ├─> tool_choice: "auto"
    └─> stream: true
    ↓
Streaming Response
    ├─> Text Chunks (Markdown)
    ├─> Tool Calls (JSON)
    │   ├─> Tool Execution (Client-Side)
    │   └─> Tool Results → Next LLM Turn
    └─> Completion
```

### 2.2 Plan Mode (planning mode)

**Purpose**: Break down complex features, analyze the codebase, create implementation plans

**Activation**: `Shift+Tab` in the agent input or `Ctrl+.` → Plan

**Key property**: Read-only; no changes are made to the code

**System reminder in Plan Mode**:
```
Plan mode is active. The user indicated that they do not want you to execute
yet -- you MUST NOT make any edits, run any non-readonly tools (including
changing configs or making commits), or otherwise make any changes to the
system.
```

**Plan Mode workflow**:
```
User describes a complex task
    ↓
Codebase Research (Read-Only)
    ├─> Semantic search for relevant files
    ├─> File reading to understand context
    └─> Grep for specific symbols
    ↓
Clarifying questions to the user
    ├─> Refine requirements
    ├─> Settle on implementation variants
    └─> Narrow the scope
    ↓
Plan creation (create_plan tool)
    ├─> Markdown document with file references
    ├─> Code examples and snippets
    ├─> Structured todos with IDs and dependencies
    └─> Overview (1-2 sentences)
    ↓
User Review & Inline Editing
    ├─> Plan is editable directly in the editor
    ├─> Add/remove todos
    └─> When satisfied: start execution
```

**Plan data model** (`create_plan` tool):
```json
{
  "name": "Short plan name (3-4 words)",
  "overview": "1-2 sentence summary",
  "plan": "# Plan Title\n\nMarkdown content with file references...",
  "todos": [
    {
      "id": "setup-auth",
      "content": "Implement auth middleware",
      "dependencies": []
    },
    {
      "id": "implement-ui",
      "content": "Create login form",
      "dependencies": ["setup-auth"]
    }
  ]
}
```
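The `dependencies` field implies an execution order for the todos. A minimal sketch of how a consumer of this data model could derive that order, assuming todos are plain dicts shaped like the example above; `order_todos` is a hypothetical helper and not part of Cursor:

```python
from graphlib import CycleError, TopologicalSorter


def order_todos(todos):
    """Return todo ids so that every todo comes after its dependencies."""
    known = {todo["id"] for todo in todos}
    graph = {}
    for todo in todos:
        missing = set(todo["dependencies"]) - known
        if missing:
            raise ValueError(f"{todo['id']} depends on unknown todos: {sorted(missing)}")
        # TopologicalSorter expects node -> set of predecessors.
        graph[todo["id"]] = set(todo["dependencies"])
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"cyclic todo dependencies: {exc.args[1]}") from exc


plan = {
    "todos": [
        {"id": "implement-ui", "content": "Create login form", "dependencies": ["setup-auth"]},
        {"id": "setup-auth", "content": "Implement auth middleware", "dependencies": []},
    ]
}
print(order_todos(plan["todos"]))
```

Rejecting unknown ids and cycles up front keeps a malformed plan from stalling execution later.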

### 2.3 Ask Mode (question mode)

**Purpose**: Understand code and answer questions without making changes
**Restriction**: Read-only, no write tools available

### 2.4 Debug Mode (troubleshooting mode)

**Purpose**: Systematic fault analysis with hypotheses, log instrumentation, and runtime analysis

---

## 3. Communication Protocol

### 3.1 Transport Layer

Cursor communicates with the backend via **ConnectRPC** (a gRPC-Web variant) with **Protocol Buffers (Protobuf)** as the serialization format.

**Primary endpoint**: `https://api2.cursor.sh`

**Encoding**: HTTP/2 with binary Protobuf encoding in an envelope format:
```
[type:1 byte][length:4 bytes Big-Endian][payload:N bytes]
```
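The envelope layout above can be sketched as follows. This only illustrates the framing as described here; the payload is treated as opaque bytes and the meaning of the first byte is left uninterpreted. `encode_envelope` and `decode_envelopes` are hypothetical names:

```python
import struct

# 1-byte type field followed by a 4-byte big-endian payload length.
HEADER = struct.Struct(">BI")


def encode_envelope(payload: bytes, type_byte: int = 0) -> bytes:
    """Prefix a serialized message with the 5-byte envelope header."""
    return HEADER.pack(type_byte, len(payload)) + payload


def decode_envelopes(buffer: bytes):
    """Yield (type_byte, payload) pairs; stop at the first incomplete frame."""
    offset = 0
    while offset + HEADER.size <= len(buffer):
        type_byte, length = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(buffer):
            break  # frame not fully received yet
        yield type_byte, buffer[start:end]
        offset = end


frames = encode_envelope(b"hello") + encode_envelope(b"", type_byte=2)
print(list(decode_envelopes(frames)))
```

A streaming client would keep the unconsumed tail of the buffer and retry once more bytes arrive.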

### 3.2 Protobuf Schema (Reverse Engineered)

**Service**: `aiserver.v1.ChatService`

**Core messages**:
```protobuf
message GetChatRequest {
  ModelDetails modelDetails = 1;
  repeated ConversationMessage conversation = 2;
}

message ModelDetails {
  optional string modelName = 1;
}

message ConversationMessage {
  string text = 1;
  MessageType type = 2;

  enum MessageType {
    MESSAGE_TYPE_HUMAN = 0;
    MESSAGE_TYPE_AI = 1;
  }
}
```

**Streaming method**: `StreamChat()` or `StreamUnifiedChatWithTools()`
- Responses arrive as an iterable stream
- Each chunk contains a `text` field with partial content

### 3.3 OpenAI-compatible Chat Completions API

In parallel to the Protobuf protocol, Cursor internally uses the **OpenAI Chat Completions format**:

```
POST /v1/chat/completions
```

**Request structure**:
```json
{
  "model": "claude-4.6-opus",
  "temperature": 0,
  "user": "github|user_ID",
  "messages": [
    {
      "role": "system",
      "content": "[System prompt with all instructions]"
    },
    {
      "role": "user",
      "content": "<user_info>...</user_info>\n<project_layout>...</project_layout>"
    },
    {
      "role": "user",
      "content": "<user_query>...</user_query>\n<system_reminder>...</system_reminder>"
    }
  ],
  "tools": [ /* 15+ tool definitions */ ],
  "tool_choice": "auto",
  "stream": true
}
```

**Context injection in the user message**:

| Tag | Content |
|---|---|
| `<user_info>` | OS, date, shell, workspace path |
| `<project_layout>` | File tree snapshot (once, at the start of the conversation) |
| `<user_query>` | The actual user prompt |
| `<system_reminder>` | Dynamic mode instructions (e.g. Plan Mode active) |
| `<open_and_recently_viewed_files>` | Open and recently viewed files |
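The tag-based injection described above can be illustrated with a small message builder. This is a sketch under assumptions: the exact text inside each tag is not documented here, so the `key: value` rendering of `user_info` and the helper names `wrap` and `build_messages` are made up for illustration:

```python
def wrap(tag: str, body: str) -> str:
    """Wrap a block of text in an XML-style context tag."""
    return f"<{tag}>\n{body}\n</{tag}>"


def build_messages(system_prompt, user_info, project_layout, user_query, system_reminder=None):
    """Assemble the system / context / query messages shown in the request."""
    context = "\n".join([
        wrap("user_info", "\n".join(f"{key}: {value}" for key, value in user_info.items())),
        wrap("project_layout", project_layout),
    ])
    query = wrap("user_query", user_query)
    if system_reminder:
        # Mode instructions ride along with the query, not in the system prompt.
        query += "\n" + wrap("system_reminder", system_reminder)
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": context},
        {"role": "user", "content": query},
    ]


messages = build_messages(
    "You are an AI coding assistant...",
    {"os": "linux", "shell": "bash"},
    "src/\n  main.py",
    "Add authentication",
    system_reminder="Plan mode is active.",
)
```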

---

## 4. System Prompt Architecture

The system prompt is the centerpiece of AI control. It is assembled by the Cursor backend and consists of the following blocks:

### 4.1 Prompt Blocks

```
System Prompt Assembly
    ├─> Role Definition
    │     "You are an AI coding assistant, powered by [model]. You operate in Cursor."
    │     "You are pair programming with a USER to solve their coding task."
    │
    ├─> <communication>
    │     Markdown formatting, code citations, math notation
    │
    ├─> <tool_calling>
    │     Rules for tool use: do not mention tool names, use the standard format
    │
    ├─> <maximize_parallel_tool_calls>
    │     Run independent tool calls in parallel
    │
    ├─> <maximize_context_understanding>
    │     Thorough exploration, symbol tracing, semantic search strategy
    │
    ├─> <making_code_changes>
    │     Read before edit, dependency files, linter fixes
    │
    ├─> <citing_code>
    │     CODE REFERENCES (startLine:endLine:filepath) vs MARKDOWN CODE BLOCKS
    │
    ├─> <task_management>
    │     todo_write tool for complex tasks
    │
    ├─> Workspace Rules (.cursor/rules/)
    │     Project-specific coding conventions
    │
    ├─> User Rules
    │     User-defined preferences
    │
    └─> MCP Instructions
          Configured MCP servers and their usage notes
```

### 4.2 Mode-specific Adjustments

The system prompt is modified depending on the active mode:

| Mode | Additional instructions |
|---|---|
| **Agent** | Full tool palette, edit permission, terminal access |
| **Plan** | Read-only reminder, `create_plan` tool available, no edits |
| **Ask** | Read-only, restricted tools, focus on explanation |
| **Debug** | Hypothesis-driven analysis, log instrumentation |

---

## 5. Tool System

Cursor provides the LLM with 15+ specialized tools, defined in the OpenAI function calling format.

### 5.1 Tool Categories

**Code understanding**:
- `codebase_search` / `SemanticSearch` - Semantic search by meaning
- `grep` / `Grep` - Exact text/regex search (based on ripgrep)
- `read_file` / `Read` - Read files with optional offset/limit
- `glob_file_search` / `Glob` - Find files by pattern
- `list_dir` - List directory contents

**Code changes**:
- `edit_file` - Intelligent code editing with `// ... existing code ...` markers
- `StrReplace` - Exact string replacements
- `Write` - Create or overwrite files
- `delete_file` / `Delete` - Delete files
- `edit_notebook` / `EditNotebook` - Edit Jupyter notebook cells

**Execution**:
- `run_terminal_cmd` / `Shell` - Sandboxed terminal commands
- `web_search` / `WebSearch` - Web research
- `WebFetch` - Fetch URL contents

**Project management**:
- `todo_write` / `TodoWrite` - Task tracking with status
- `create_plan` - Structured planning documents

**Linting**:
- `read_lints` / `ReadLints` - Read linter errors

**Subagents**:
- `Task` - Start specialized subagents (generalPurpose, explore, browser-use)
- `SwitchMode` - Switch mode

### 5.2 Tool Definition Format

Each tool is specified as an OpenAI function definition:

```json
{
  "type": "function",
  "function": {
    "name": "codebase_search",
    "description": "semantic search that finds code by meaning...",
    "parameters": {
      "type": "object",
      "properties": {
        "query": {
          "type": "string",
          "description": "A complete question about what you want to understand"
        },
        "target_directories": {
          "type": "array",
          "items": { "type": "string" }
        },
        "explanation": {
          "type": "string",
          "description": "Why this tool is being used"
        }
      },
      "required": ["query", "target_directories", "explanation"]
    }
  }
}
```

### 5.3 Tool Call and Response Cycle

```
LLM generates a tool call
    ↓
{
  "tool_calls": [{
    "id": "call_abc123",
    "type": "function",
    "function": {
      "name": "read_file",
      "arguments": "{\"target_file\": \"src/main.py\"}"
    }
  }]
}
    ↓
Client executes the tool (locally)
    ↓
Tool result is inserted as a new message
    ↓
{
  "role": "tool",
  "tool_call_id": "call_abc123",
  "content": "1|import os\n2|import sys\n..."
}
    ↓
Next LLM turn with the tool result in context
    ↓
LLM generates a text answer or further tool calls
```
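The cycle above boils down to a loop that alternates between model turns and local tool execution. A minimal sketch, assuming OpenAI-style message dicts; `scripted_model` is a fake stand-in for the LLM and `read_file` returns a placeholder string instead of touching the disk, so both are illustrative only:

```python
import json


def read_file(target_file):
    # Stand-in for a real file read so the sketch has no side effects.
    return f"(contents of {target_file})"


TOOLS = {"read_file": read_file}


def run_agent_turns(call_model, messages, max_turns=5):
    """Alternate model turns and tool executions until a text answer arrives."""
    for _ in range(max_turns):
        reply = call_model(messages)
        messages.append(reply)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            return reply["content"]
        for call in tool_calls:
            tool = TOOLS[call["function"]["name"]]
            arguments = json.loads(call["function"]["arguments"])
            # The result is fed back as a "tool" message tied to the call id.
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": tool(**arguments),
            })
    raise RuntimeError("no final answer within max_turns")


def scripted_model(messages):
    """Fake LLM: requests one file, then answers once a tool result exists."""
    if any(message["role"] == "tool" for message in messages):
        return {"role": "assistant", "content": "main.py imports os and sys."}
    return {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_abc123",
            "type": "function",
            "function": {"name": "read_file", "arguments": json.dumps({"target_file": "src/main.py"})},
        }],
    }


history = [{"role": "user", "content": "What does src/main.py import?"}]
answer = run_agent_turns(scripted_model, history)
```

The `max_turns` cap is a design choice of this sketch to keep a misbehaving model from looping forever.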

### 5.4 Parallel Tool Execution

Cursor optimizes for parallel tool calls. When there are no dependencies between them, several tools are called at the same time:

```
Example: "Add authentication"
    ↓
Parallel:
    ├─> codebase_search("How does authentication work?")
    ├─> grep("auth|login|session")
    ├─> read_file("src/app.py")
    └─> web_search("best practices authentication 2026")
    ↓
Sequential (depends on the results):
    ├─> read_file("src/middleware/auth.py")  [found via search]
    └─> edit_file("src/routes/login.py")     [based on context]
```
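The split between independent and dependent calls maps naturally onto `asyncio.gather`. The following sketch uses a fake tool with artificial delays; `fake_tool` and `explore` are hypothetical names, and real tools would perform I/O instead of sleeping:

```python
import asyncio


async def fake_tool(name, argument, delay):
    """Pretend to run a tool; the sleep stands in for real I/O latency."""
    await asyncio.sleep(delay)
    return f"{name}({argument!r}) done"


async def explore():
    # Independent calls: issue them together and wait for all results.
    independent = await asyncio.gather(
        fake_tool("codebase_search", "How does authentication work?", 0.02),
        fake_tool("grep", "auth|login|session", 0.01),
        fake_tool("read_file", "src/app.py", 0.01),
    )
    # Dependent call: it needs the search results, so it runs afterwards.
    follow_up = await fake_tool("read_file", "src/middleware/auth.py", 0.01)
    return independent, follow_up


independent, follow_up = asyncio.run(explore())
```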

---

## 6. Codebase Indexing and RAG

### 6.1 Indexing Pipeline

```
Open project
    ↓
Background Scanning
    ├─> Split files into chunks (tree-sitter for AST-based splitting)
    ├─> Chunks: a few hundred tokens each, at logical boundaries (functions, classes)
    └─> Respect .gitignore / .cursorignore
    ↓
Embedding computation
    ├─> OpenAI embedding model or custom model
    └─> Vector representation of the semantic content
    ↓
Vector database (backend)
    ├─> Embeddings with metadata (file, lines, chunk ID)
    └─> Nearest-neighbor search for queries
```

### 6.2 Retrieval-Augmented Generation (RAG)

```
User query
    ↓
Query → embedding vector
    ↓
Nearest-neighbor search in the vector DB
    ↓
Top-K relevant code chunks
    ├─> File name + line numbers
    └─> Code content
    ↓
Embedded in the LLM prompt as context
    ↓
LLM generates an answer based on project context
```
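The retrieval step can be illustrated with a brute-force nearest-neighbor search over a toy index. The three-dimensional vectors below are made up for the example; a real system would use model-generated embeddings and an approximate index, and `cosine` and `top_k` are hypothetical helpers:

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=2):
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(index, key=lambda chunk: cosine(query_vec, chunk["embedding"]), reverse=True)
    return ranked[:k]


# Toy index: each chunk carries the metadata listed in the pipeline above.
INDEX = [
    {"file": "src/auth.py", "lines": (1, 40), "embedding": [0.9, 0.1, 0.0]},
    {"file": "src/db.py", "lines": (10, 55), "embedding": [0.0, 0.2, 0.9]},
    {"file": "src/login.py", "lines": (5, 30), "embedding": [0.7, 0.6, 0.1]},
]

hits = top_k([1.0, 0.2, 0.0], INDEX)
print([hit["file"] for hit in hits])
```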

---

## 7. Shadow Workspace

Cursor implements a **shadow workspace** concept: a hidden Electron window that mirrors the project.

**Purpose**:
- Safely test AI-generated code changes before they are presented to the user
- Apply language server feedback (type checking, linting) to proposed changes
- Self-improvement loop: errors are fed back to the LLM

**Flow**:
```
AI generates a code change
    ↓
Shadow Workspace (hidden window)
    ├─> Apply changes (not to the real files)
    ├─> Check with the language server (TypeScript, Python, etc.)
    └─> Collect diagnostics
    ↓
[If errors] → Feedback to LLM → New attempt
[If OK]     → Present the change to the user
```
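The check-and-retry loop can be sketched in a few lines. This is a strongly simplified stand-in: Python's built-in `compile()` plays the role of the language server and only catches syntax errors, and `scripted_propose` fakes an LLM that fixes its draft once it receives feedback. All names are hypothetical:

```python
def diagnostics(source: str):
    """Stand-in for language server feedback: reports syntax errors only."""
    try:
        compile(source, "<candidate>", "exec")  # parses, does not execute
        return []
    except SyntaxError as exc:
        return [f"line {exc.lineno}: {exc.msg}"]


def refine(propose, max_attempts=3):
    """Ask for candidates until one passes the checks; return it with the attempt count."""
    feedback = []
    for attempt in range(1, max_attempts + 1):
        candidate = propose(feedback)
        feedback = diagnostics(candidate)
        if not feedback:
            return candidate, attempt
    raise RuntimeError(f"still failing after {max_attempts} attempts: {feedback}")


def scripted_propose(feedback):
    """Fake LLM: the first draft misses a colon, the retry fixes it."""
    if feedback:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b)\n    return a + b\n"


code, attempts = refine(scripted_propose)
```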

---

## 8. Subagent Architecture

Since January 2026, Cursor has supported starting specialized subagents:

### 8.1 Subagent Types

| Type | Purpose | Capabilities |
|---|---|---|
| `generalPurpose` | Complex, multi-step tasks | Full tool access, code search, execution |
| `explore` | Fast codebase exploration | Glob, Grep, Read; optimized for speed |
| `browser-use` | Browser-based testing | Navigation, interaction, screenshots |

### 8.2 Subagent Communication

```
Parent Agent
    ├─> Task(prompt="...", subagent_type="explore")
    │   ├─> Subagent receives its own context
    │   ├─> Subagent executes tools
    │   └─> Subagent returns a result (single message)
    │
    └─> Up to 4 subagents in parallel
```

**Important**: Subagents have no access to the parent agent's chat history. The prompt must contain all the information they need.

---

## 9. Background Agent API

Since 2026, Cursor has offered a **Background Agent API** for programmatic access:

### 9.1 Authentication

Via `WorkosCursorSessionToken` (cookie or environment variable)

### 9.2 Endpoints

| Endpoint | Method | Description |
|---|---|---|
| Create composer | POST | Start a new background composer with a task description |
| List composers | GET | Retrieve all running background composers |
| Composer details | GET | Status and details of a specific composer |
| Settings | GET | Retrieve user settings |

### 9.3 Programmatic Access (TypeScript)

```typescript
import { CursorAPIClient } from 'cursor-api-client';

// The session token is passed to the client constructor.
const client = new CursorAPIClient('session-token');

// Start a background composer for a repository and branch.
const result = await client.createBackgroundComposer({
  taskDescription: 'Add user authentication',
  repositoryUrl: 'https://github.com/user/repo.git',
  branch: 'main',
  model: 'claude-4-sonnet-thinking'
});

// List the background composers.
const composers = await client.listComposers();
```

### 9.4 Direct RPC Access (Go)

```go
// Imports are omitted, as in the original snippet (context, fmt, log,
// and the cursor / aiserverv1 packages).
creds, err := cursor.GetDefaultCredentials()
if err != nil {
	log.Fatal(err)
}
aiService := cursor.NewAiServiceClient()

model := "gpt-4"
resp, err := aiService.StreamChat(context.TODO(), cursor.NewRequest(creds, &aiserverv1.GetChatRequest{
	ModelDetails: &aiserverv1.ModelDetails{
		ModelName: &model,
	},
	Conversation: []*aiserverv1.ConversationMessage{
		{
			Text: "Hello, who are you?",
			Type: aiserverv1.ConversationMessage_MESSAGE_TYPE_HUMAN,
		},
	},
}))
if err != nil {
	log.Fatal(err)
}

// Print each streamed chunk as it arrives.
for resp.Receive() {
	next := resp.Msg()
	fmt.Print(next.Text) // Print, not Printf: the text is data, not a format string
}
if err := resp.Err(); err != nil {
	log.Fatal(err)
}
```

---

## 10. Model Routing

Cursor uses different models depending on the type of task:

| Task | Model type | Examples |
|---|---|---|
| Chat / complex analysis | Frontier LLM | Claude 4.6 Opus, GPT-5.2/5.3, Gemini 3 Pro |
| Tab completion | Custom fast model | "Copilot++" (based on Llama 70B) |
| Fast Apply (diffs) | Specialized fine-tuned model | Optimized for code diffs, >1000 tokens/s |
| Embeddings | Embedding model | OpenAI embedding or custom |

**Speculative decoding**: A smaller "draft" model generates several tokens ahead, which the main model verifies in parallel → significantly faster inference.
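The idea behind speculative decoding can be shown with a toy, greedy version. This is a sketch under heavy simplification: both "models" are plain functions over integer tokens, verification is written as a loop although a real system batches it into one forward pass, and the bonus token that real implementations append after a fully accepted proposal is omitted. None of this reflects Cursor's actual implementation:

```python
def speculative_decode(target_next, draft_next, prompt, max_new_tokens, lookahead=4):
    """Greedy speculative decoding; output matches plain greedy target decoding."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # 1. The cheap draft model proposes `lookahead` tokens.
        proposal = []
        for _ in range(lookahead):
            proposal.append(draft_next(tokens + proposal))
        # 2. The target model checks each proposed position.
        for i, guess in enumerate(proposal):
            expected = target_next(tokens + proposal[:i])
            if guess != expected:
                # Keep the accepted prefix and take the target's token instead.
                proposal = proposal[:i] + [expected]
                break
        tokens.extend(proposal)
    return tokens[:goal]


def target_next(sequence):
    """Toy target model: counts upward modulo 10."""
    return (sequence[-1] + 1) % 10


def draft_next(sequence):
    """Toy draft model: agrees with the target except right after a 4."""
    return 0 if sequence[-1] == 4 else (sequence[-1] + 1) % 10


decoded = speculative_decode(target_next, draft_next, [0], max_new_tokens=8)
```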

---

## 11. Security Model

### 11.1 Terminal Sandbox

Commands run in a sandbox by default:
- Allowed: writing in the workspace, reading the file system
- Blocked: network, Git writes, ignored files, USB
- Extensible: `network`, `git_write`, `all` permissions

### 11.2 Privacy Mode

- When enabled: no code is stored after the request has been fulfilled
- When disabled: anonymized telemetry, no long-term code storage

### 11.3 Rules System

- `.cursor/rules/` - Project-specific AI behavior rules
- User Rules - Global preferences
- `<system_reminder>` tags - Dynamic instructions (never visible to the user)

---

## Sources and References

- Cursor official documentation: https://docs.cursor.com
- Cursor Blog - Plan Mode: https://cursor.com/blog/plan-mode
- Cursor Blog - Dynamic Context Discovery: https://cursor.com/blog/dynamic-context-discovery
- Cursor Blog - Codex Model Harness: https://cursor.com/blog/codex-model-harness
- Reverse-engineered RPC client: https://github.com/everestmz/cursor-rpc
- Background Agent API client: https://github.com/mjdierkes/cursor-background-agent-api
- Architecture analysis: https://adityarohilla.com/2025/05/08/how-cursor-works-internally/
- System prompt analysis: https://medium.com/@lakkannawalikar/cursor-ai-architecture-system-prompts-and-tools-deep-dive-77f44cb1c6b0

cursor-doc/doc_cursor_chat_api_integration_concept.md (normal file, 377 lines)
@ -0,0 +1,377 @@

# Cursor Chat Component as an API - Integration Concept

**As of**: February 2026
**Purpose**: Concept for extracting the chat functionality from Cursor as a standalone, API-driven component

---

## 1. Goals

Cursor's chat functionality should be usable as a standalone component via an API, without putting Cursor itself at risk. The goals:

- Use the user prompt from chat windows as a component via an API
- Receive responses as a JSON object in which the parts of the answer are captured as individual documents
- Test new features in isolation first, before they are integrated into Cursor

---

## 2. Architecture Options

### Option A: Talk to the Cursor backend directly (recommended)

Cursor's backend is reachable via two protocols:

**Variant 1: ConnectRPC / Protobuf**
```
Own app → ConnectRPC client → api2.cursor.sh → LLM response stream
```

- Uses `aiserver.v1.ChatService`
- Method: `StreamChat()` or `StreamUnifiedChatWithTools()`
- Authentication: Cursor session token
- Advantage: Direct access to all models and features
- Disadvantage: Dependency on Cursor's backend; the Protobuf schema can change

**Variant 2: Background Agent API**
```
Own app → REST/HTTP → Cursor Background Composer API → Agent response
```

- Uses the official Background Agent endpoint
- Authentication: `WorkosCursorSessionToken`
- Advantage: Officially supported, more stable
- Disadvantage: Less control, asynchronous

### Option B: Own LLM orchestration layer (recommended for independence)

Rebuild the Cursor architecture, but as a separate component:

```
Own app
    ↓
Chat API Gateway (own service)
    ├─> Prompt Assembly
    │   ├─> System prompt (configurable)
    │   ├─> Context injection (files, project layout)
    │   └─> Tool definitions
    ├─> LLM Provider (OpenAI / Anthropic / own)
    │   ├─> POST /v1/chat/completions (OpenAI-compatible)
    │   ├─> Streaming via SSE
    │   └─> Tool calling
    ├─> Response Parser
    │   ├─> Extract text segments
    │   ├─> Identify tool calls
    │   ├─> Parse code blocks
    │   └─> Detect plan/todo structures
    └─> JSON Response Builder
        └─> Structured JSON with a documents array
```

**Advantages**:
- No dependency on Cursor's backend
- Free choice of models and providers
- Full control over the prompt structure
- Testable without a Cursor installation
- Extensible with custom tools

---

## 3. API Design

### 3.1 Chat Endpoint

```
POST /api/v1/chat/completions
Content-Type: application/json
Authorization: Bearer <token>
```

**Request**:
```json
{
  "model": "claude-4.6-opus",
  "mode": "agent",
  "context": {
    "workspacePath": "/path/to/project",
    "openFiles": ["src/main.py", "src/utils.py"],
    "projectLayout": "src/\n  main.py\n  utils.py\nREADME.md",
    "rules": [
      "camelCase for variables",
      "Internal functions with _ prefix"
    ]
  },
  "messages": [
    {
      "role": "user",
      "content": "Implement an authentication middleware"
    }
  ],
  "tools": ["codebase_search", "read_file", "edit_file", "grep"],
  "stream": false
}
```

**Response** (see `doc_cursor_response_json_structure.md` for details):
```json
{
  "id": "chat_abc123",
  "model": "claude-4.6-opus",
  "mode": "agent",
  "created": "2026-02-23T10:30:00Z",
  "documents": [
    {
      "id": "doc_001",
      "type": "text",
      "content": "I will implement an authentication middleware..."
    },
    {
      "id": "doc_002",
      "type": "tool_call",
      "toolName": "read_file",
      "arguments": { "target_file": "src/main.py" },
      "result": "import os\nimport sys..."
    },
    {
      "id": "doc_003",
      "type": "code_reference",
      "filePath": "src/main.py",
      "startLine": 12,
      "endLine": 25,
      "content": "def main():\n    app = Flask(__name__)..."
    },
    {
      "id": "doc_004",
      "type": "code_block",
      "language": "python",
      "content": "def authMiddleware(f):\n    @wraps(f)\n    def decorated(*args):\n        ..."
    },
    {
      "id": "doc_005",
      "type": "file_edit",
      "filePath": "src/middleware/auth.py",
      "operation": "create",
      "content": "from functools import wraps\n..."
    },
    {
      "id": "doc_006",
      "type": "plan",
      "title": "Auth Middleware Plan",
      "todos": [
        { "id": "setup-auth", "content": "Create middleware function", "status": "completed" },
        { "id": "add-routes", "content": "Protect routes", "status": "pending" }
      ]
    }
  ],
  "usage": {
    "promptTokens": 2450,
    "completionTokens": 890,
    "totalTokens": 3340
  }
}
```

### 3.2 Streaming Endpoint

```
POST /api/v1/chat/completions
Content-Type: application/json
Accept: text/event-stream
```

Same request as above, but with `"stream": true`.

**SSE response**:
```
data: {"type": "document_start", "document": {"id": "doc_001", "type": "text"}}

data: {"type": "content_delta", "documentId": "doc_001", "delta": "I will "}

data: {"type": "content_delta", "documentId": "doc_001", "delta": "implement an authentication "}

data: {"type": "content_delta", "documentId": "doc_001", "delta": "middleware..."}

data: {"type": "document_end", "documentId": "doc_001"}

data: {"type": "document_start", "document": {"id": "doc_002", "type": "tool_call", "toolName": "read_file"}}

data: {"type": "tool_call_arguments", "documentId": "doc_002", "arguments": {"target_file": "src/main.py"}}

data: {"type": "tool_result", "documentId": "doc_002", "result": "import os\n..."}

data: {"type": "document_end", "documentId": "doc_002"}

data: {"type": "done", "usage": {"promptTokens": 2450, "completionTokens": 890}}

data: [DONE]
```
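A client can fold this event stream back into the `documents` array of the non-streaming response. The sketch below assumes the event types shown above and that every event fits on a single `data:` line; `parse_sse` is a hypothetical helper, and a production client would read the stream incrementally rather than from one string:

```python
import json


def parse_sse(stream_text):
    """Fold `data:` events into a list of documents, in arrival order."""
    documents, by_id, usage = [], {}, None
    for line in stream_text.splitlines():
        if not line.startswith("data:"):
            continue  # blank separators and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        event = json.loads(data)
        kind = event["type"]
        if kind == "document_start":
            document = dict(event["document"])
            by_id[document["id"]] = document
            documents.append(document)
        elif kind == "content_delta":
            document = by_id[event["documentId"]]
            document["content"] = document.get("content", "") + event["delta"]
        elif kind == "tool_call_arguments":
            by_id[event["documentId"]]["arguments"] = event["arguments"]
        elif kind == "tool_result":
            by_id[event["documentId"]]["result"] = event["result"]
        elif kind == "done":
            usage = event.get("usage")
    return documents, usage


EVENTS = [
    {"type": "document_start", "document": {"id": "doc_001", "type": "text"}},
    {"type": "content_delta", "documentId": "doc_001", "delta": "I will "},
    {"type": "content_delta", "documentId": "doc_001", "delta": "implement it."},
    {"type": "document_end", "documentId": "doc_001"},
    {"type": "document_start", "document": {"id": "doc_002", "type": "tool_call", "toolName": "read_file"}},
    {"type": "tool_call_arguments", "documentId": "doc_002", "arguments": {"target_file": "src/main.py"}},
    {"type": "tool_result", "documentId": "doc_002", "result": "import os\n..."},
    {"type": "document_end", "documentId": "doc_002"},
    {"type": "done", "usage": {"promptTokens": 2450, "completionTokens": 890}},
]
SAMPLE = "\n\n".join("data: " + json.dumps(event) for event in EVENTS) + "\n\ndata: [DONE]\n"

documents, usage = parse_sse(SAMPLE)
```

Keying documents by id lets deltas for different documents interleave without special handling.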

### 3.3 Plan Mode Endpoint

```
POST /api/v1/chat/plan
```

Same structure, but:
- Only read-only tools are available
- The response contains a `type: "plan"` document with structured Markdown
- Clarifying questions are returned as `type: "clarification"` documents

---

## 4. Prompt Assembly Service

The core of the integration is the **Prompt Assembly Service**, which implements the same logic as Cursor:

### 4.1 System Prompt Builder

```
buildSystemPrompt(config)
    ├─> Base Role Prompt
    │     "You are an AI coding assistant..."
    ├─> Communication Rules
    │     Markdown, Code Citations
    ├─> Tool Calling Rules
    │     Parallel Execution, Standard Format
    ├─> Context Strategy
    │     Thorough Exploration, Semantic Search
    ├─> Code Change Rules
    │     Read before Edit, Fix Lints
    ├─> Mode-Specific Rules
    │     Agent: Full tools | Plan: Read-only | Ask: Explain only
    ├─> Workspace Rules
    │     From .cursor/rules/ or own configuration
    └─> User Rules
          User-defined preferences
```
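A builder along these lines could be as simple as concatenating blocks in a fixed order. The block texts and the `<mode_rules>`, `<workspace_rules>`, and `<user_rules>` tag names below are placeholders chosen for this sketch, not Cursor's actual prompt wording:

```python
BASE_BLOCKS = [
    "You are an AI coding assistant...",
    "<communication>\nUse Markdown and cite code.\n</communication>",
    "<tool_calling>\nRun independent tool calls in parallel.\n</tool_calling>",
    "<making_code_changes>\nRead before editing; fix linter errors.\n</making_code_changes>",
]

MODE_RULES = {
    "agent": "Full tool access; edits and terminal commands are allowed.",
    "plan": "Read-only: do not edit files; research and produce a plan.",
    "ask": "Read-only: explain the code; do not make changes.",
}


def rules_block(tag, rules):
    """Render a list of rules as a tagged bullet list."""
    return f"<{tag}>\n" + "\n".join(f"- {rule}" for rule in rules) + f"\n</{tag}>"


def build_system_prompt(mode, workspace_rules=(), user_rules=()):
    """Concatenate base, mode-specific, workspace, and user blocks in order."""
    if mode not in MODE_RULES:
        raise ValueError(f"unknown mode: {mode}")
    sections = list(BASE_BLOCKS)
    sections.append(f"<mode_rules>\n{MODE_RULES[mode]}\n</mode_rules>")
    if workspace_rules:
        sections.append(rules_block("workspace_rules", workspace_rules))
    if user_rules:
        sections.append(rules_block("user_rules", user_rules))
    return "\n\n".join(sections)


prompt = build_system_prompt("plan", workspace_rules=["camelCase for variables"])
```

Keeping the mode rules in a lookup table makes it straightforward to add a further mode such as Debug later.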

### 4.2 Context Assembly

```
assembleContext(request)
    ├─> User Info
    │     { os, date, shell, workspacePath }
    ├─> Project Layout
    │     File tree (generated once)
    ├─> Open Files
    │     Currently open files with their contents
    ├─> Recent Files
    │     Recently viewed files
    ├─> Git Status
    │     Branch, staged/unstaged changes
    └─> Linter Errors
          Current diagnostics
```

### 4.3 Tool Registry

Tools are registered dynamically and can be configured per request:

```json
{
  "availableTools": {
    "codebase_search": { "enabled": true, "scope": "workspace" },
    "read_file": { "enabled": true, "allowedPaths": ["src/", "tests/"] },
    "edit_file": { "enabled": true, "requireApproval": true },
    "run_terminal_cmd": { "enabled": false },
    "web_search": { "enabled": true }
  }
}
```
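How such a registry might be combined with the request's tool list and the active mode is sketched below. Which tools count as read-only is an assumption of this example (in particular, treating `web_search` as read-only), and `effective_tools` is a hypothetical helper:

```python
# Assumption: tools in this set never modify the workspace.
READ_ONLY_TOOLS = {"codebase_search", "read_file", "grep", "glob_file_search", "list_dir", "web_search"}


def effective_tools(registry, requested, mode):
    """Return the requested tools that are registered, enabled, and allowed in this mode."""
    allowed = []
    for name in requested:
        config = registry.get(name)
        if not config or not config.get("enabled", False):
            continue  # unknown or disabled tool
        if mode in ("plan", "ask") and name not in READ_ONLY_TOOLS:
            continue  # write tools are withheld in read-only modes
        allowed.append(name)
    return allowed


REGISTRY = {
    "codebase_search": {"enabled": True, "scope": "workspace"},
    "read_file": {"enabled": True, "allowedPaths": ["src/", "tests/"]},
    "edit_file": {"enabled": True, "requireApproval": True},
    "run_terminal_cmd": {"enabled": False},
    "web_search": {"enabled": True},
}
REQUESTED = ["codebase_search", "read_file", "edit_file", "run_terminal_cmd", "web_search", "unknown_tool"]

agent_tools = effective_tools(REGISTRY, REQUESTED, "agent")
plan_tools = effective_tools(REGISTRY, REQUESTED, "plan")
```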

---

## 5. Response Parser

The response parser converts the LLM stream into structured documents:

### 5.1 Parsing Pipeline

```
LLM Streaming Response
    ↓
Token Accumulator
    ↓
Segment Detector
    ├─> Text segment (everything outside code/tools)
    ├─> Code block (``` ... ```)
    │   ├─> Code reference (startLine:endLine:filepath)
    │   └─> Markdown code block (language tag)
    ├─> Tool call (function call JSON)
    ├─> Plan (create_plan tool call)
    └─> Todo (todo_write tool call)
    ↓
Document Builder
    ├─> Assign a document ID to each segment
    ├─> Classify the type
    ├─> Extract metadata (file, lines, language)
    └─> Insert into the documents[] array
    ↓
JSON Response
```

### 5.2 Segment Detection

| Pattern | Document type | Extracted metadata |
|---|---|---|
| Free text | `text` | content |
| ` ```startLine:endLine:filepath ` | `code_reference` | filePath, startLine, endLine, content |
| ` ```python ` | `code_block` | language, content |
| Tool Call: `edit_file` | `file_edit` | filePath, operation, content |
| Tool Call: `read_file` | `tool_call` | toolName, arguments, result |
| Tool Call: `create_plan` | `plan` | title, overview, todos |
| Tool Call: `todo_write` | `todo_update` | todos with id/status |
| Tool Call: `run_terminal_cmd` | `terminal_command` | command, output, exitCode |
|
||||
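
The two fence patterns in the table differ only in the info string that follows the opening backticks. A minimal sketch of that distinction, assuming the info string has already been separated from the fence; `classify_fence` and the regular expression are illustrative, not taken from Cursor:

```python
import re

# "12:14:src/main.py" -> start line, end line, file path
CODE_REFERENCE = re.compile(r"^(\d+):(\d+):(.+)$")


def classify_fence(info):
    """Map the info string of an opening code fence to a document type."""
    info = info.strip()
    match = CODE_REFERENCE.match(info)
    if match:
        start, end, path = match.groups()
        return "code_reference", {
            "startLine": int(start),
            "endLine": int(end),
            "filePath": path,
        }
    # Anything else is treated as a plain code block with an optional language.
    return "code_block", {"language": info or None}
```
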

---

## 6. Implementation Steps

### Phase 1: Chat API Core

1. Build an **LLM Gateway** with an OpenAI-compatible interface
2. Implement a configurable **System Prompt Builder**
3. **Context Assembly** for workspace information
4. **Streaming Response** via SSE
5. **Response Parser** for document extraction

### Phase 2: Tool Integration

1. **Tool Registry** with dynamic configuration
2. **File System Tools** (read, edit, grep, glob)
3. **Tool Execution Engine** with sandbox
4. **Multi-Turn Conversation** with tool results

### Phase 3: Plan Mode

1. **Plan Mode Flag** in the request
2. **Read-Only Enforcement** (read-only tools only)
3. **create_plan Response Parsing**
4. **Clarifying Questions** as a separate document type

### Phase 4: Subagents

1. **Task Delegation** to sub-agents running in parallel
2. **Result Aggregation** into the parent response
3. **Background Agent** for long-running tasks

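
The read-only enforcement listed under Phase 3 can be reduced to filtering the tool list before it is sent to the model. The following is only a sketch: which tools count as read-only is an assumption (the set below is hypothetical), and `filter_tools_for_mode` is not an existing function.

```python
# Assumed set of tools that never modify the workspace.
READ_ONLY_TOOLS = frozenset({"codebase_search", "read_file", "grep", "glob", "web_search"})


def filter_tools_for_mode(mode, requested_tools):
    """In plan mode keep only read-only tools plus create_plan; otherwise keep all."""
    if mode != "plan":
        return list(requested_tools)
    return [
        tool
        for tool in requested_tools
        if tool in READ_ONLY_TOOLS or tool == "create_plan"
    ]
```
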

---

## 7. Technology Recommendation

| Component | Recommendation | Rationale |
|---|---|---|
| API Framework | FastAPI (Python) or Express (Node.js) | SSE support, async, fast |
| LLM Client | `openai` SDK or `anthropic` SDK | Standard, well maintained |
| Streaming | Server-Sent Events (SSE) | Browser-compatible, simple |
| Prompt Templates | Jinja2 (Python) or Handlebars (JS) | Flexible, testable |
| File System Access | Local access or MCP server | Security, modularity |
| Codebase Search | tree-sitter + vector DB | AST-based, like Cursor |


---

## References

- Cursor architecture: `doc_cursor_ai_agent_architecture.md`
- JSON response structure: `doc_cursor_response_json_structure.md`
- OpenAI Chat Completions API: https://platform.openai.com/docs/api-reference/chat
- Server-Sent Events: https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events

726
cursor-doc/doc_cursor_response_json_structure.md
Normal file
@ -0,0 +1,726 @@
# Cursor Response JSON Structure - Specification

**Last updated**: February 2026
**Purpose**: Definition of the JSON response structure for the chat API component, in which the parts of a response are captured as individual documents

---

## 1. Design Principle

An LLM response typically consists of several kinds of segments: explanatory text, code references, new code blocks, tool calls, file changes, and plans. These segments are captured as individual **documents** in an array so that each one can be processed, rendered, and referenced on its own.

---


## 2. Top-Level Response

```json
{
  "id": "chat_<uuid>",
  "conversationId": "conv_<uuid>",
  "model": "claude-4.6-opus",
  "mode": "agent | plan | ask | debug",
  "created": "2026-02-23T10:30:00Z",
  "status": "completed | streaming | error",
  "documents": [ /* array of Document objects */ ],
  "usage": {
    "promptTokens": 2450,
    "completionTokens": 890,
    "totalTokens": 3340
  },
  "metadata": {
    "duration_ms": 4520,
    "toolCallCount": 3,
    "turnCount": 2
  }
}
```

### Field Descriptions

| Field | Type | Description |
|---|---|---|
| `id` | string | Unique ID of this response |
| `conversationId` | string | ID of the ongoing conversation (for multi-turn) |
| `model` | string | LLM model used |
| `mode` | enum | Active mode: agent, plan, ask, debug |
| `created` | ISO 8601 | Creation timestamp |
| `status` | enum | Status: completed, streaming, error |
| `documents` | array | Ordered array of response documents |
| `usage` | object | Token usage |
| `metadata` | object | Additional metadata |

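
A consumer of this structure may want a cheap sanity check before rendering. The sketch below is one possible check and rests on two assumptions drawn from the examples rather than stated as rules: `totalTokens` equals `promptTokens + completionTokens`, and `sequence` values count up from 1 without gaps. `validate_response` is a hypothetical helper.

```python
VALID_MODES = {"agent", "plan", "ask", "debug"}
VALID_STATUSES = {"completed", "streaming", "error"}


def validate_response(response):
    """Return a list of human-readable problems; an empty list means no findings."""
    problems = []
    if response.get("mode") not in VALID_MODES:
        problems.append("unknown mode")
    if response.get("status") not in VALID_STATUSES:
        problems.append("unknown status")
    usage = response.get("usage", {})
    expected_total = usage.get("promptTokens", 0) + usage.get("completionTokens", 0)
    if usage.get("totalTokens") != expected_total:
        problems.append("usage.totalTokens does not match the sum of its parts")
    sequences = [doc["sequence"] for doc in response.get("documents", [])]
    if sequences != list(range(1, len(sequences) + 1)):
        problems.append("document sequence numbers are not contiguous from 1")
    return problems
```
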

---

## 3. Document Types

Every document in the `documents[]` array has the following base structure:

```json
{
  "id": "doc_<sequenceNumber>",
  "type": "<documentType>",
  "sequence": 1,
  "content": "...",
  "metadata": { /* type-specific metadata */ }
}
```

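
The base structure maps naturally onto a small record type plus a builder that hands out IDs in order. This is a sketch under the assumption that IDs are the zero-padded sequence number (`doc_001`, `doc_002`, ...), as the examples in this specification suggest; the class names are illustrative.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Document:
    id: str
    type: str
    sequence: int
    content: Optional[str]
    metadata: dict = field(default_factory=dict)


class DocumentBuilder:
    """Appends documents and assigns id/sequence in arrival order."""

    def __init__(self):
        self.documents = []

    def add(self, doc_type, content=None, metadata=None):
        sequence = len(self.documents) + 1
        document = Document(
            id=f"doc_{sequence:03d}",
            type=doc_type,
            sequence=sequence,
            content=content,
            metadata=metadata or {},
        )
        self.documents.append(document)
        return document

    def to_json_ready(self):
        # Plain dicts that json.dumps can serialize directly.
        return [asdict(document) for document in self.documents]
```
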

### 3.1 `text` - Explanatory Text

Free-text explanations, descriptions, and instructions from the AI assistant.

```json
{
  "id": "doc_001",
  "type": "text",
  "sequence": 1,
  "content": "I will implement an authentication middleware that validates JWT tokens and enriches the request context with user information.\n\nThe following steps are required:",
  "metadata": {
    "format": "markdown"
  }
}
```


---

### 3.2 `code_reference` - Reference to Existing Code

Reference to existing code in the project (Cursor's `startLine:endLine:filepath` format).

```json
{
  "id": "doc_002",
  "type": "code_reference",
  "sequence": 2,
  "content": "def main():\n    app = Flask(__name__)\n    app.register_blueprint(api_bp)",
  "metadata": {
    "filePath": "src/main.py",
    "startLine": 12,
    "endLine": 14,
    "language": "python"
  }
}
```


---

### 3.3 `code_block` - New or Suggested Code

Code block for new or suggested code that is not directly associated with a file.

```json
{
  "id": "doc_003",
  "type": "code_block",
  "sequence": 3,
  "content": "from functools import wraps\nfrom flask import request, jsonify\nimport jwt\n\ndef authMiddleware(f):\n    @wraps(f)\n    def decorated(*args, **kwargs):\n        token = request.headers.get('Authorization')\n        if not token:\n            return jsonify({'error': 'Token missing'}), 401\n        try:\n            data = jwt.decode(token, SECRET_KEY, algorithms=['HS256'])\n            request.userId = data['userId']\n        except jwt.InvalidTokenError:\n            return jsonify({'error': 'Invalid token'}), 401\n        return f(*args, **kwargs)\n    return decorated",
  "metadata": {
    "language": "python",
    "purpose": "new_code"
  }
}
```


---

### 3.4 `file_edit` - File Change

A concrete change to an existing or new file.

```json
{
  "id": "doc_004",
  "type": "file_edit",
  "sequence": 4,
  "content": "from functools import wraps\nfrom flask import request, jsonify\nimport jwt\n\nSECRET_KEY = 'your-secret-key'\n\ndef authMiddleware(f):\n    ...",
  "metadata": {
    "filePath": "src/middleware/auth.py",
    "operation": "create",
    "language": "python",
    "diff": null
  }
}
```

**Operations**:

| Operation | Description |
|---|---|
| `create` | Create a new file |
| `edit` | Modify an existing file |
| `delete` | Delete a file |
| `rename` | Rename a file |

For `edit`, a diff is included in addition:

```json
{
  "id": "doc_005",
  "type": "file_edit",
  "sequence": 5,
  "content": "app.register_blueprint(api_bp)\napp.register_blueprint(auth_bp)",
  "metadata": {
    "filePath": "src/main.py",
    "operation": "edit",
    "language": "python",
    "diff": {
      "oldString": "app.register_blueprint(api_bp)",
      "newString": "app.register_blueprint(api_bp)\napp.register_blueprint(auth_bp)",
      "startLine": 14,
      "endLine": 14
    }
  }
}
```

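
Applying such a diff on the client comes down to an exact string replacement. The sketch below assumes that `oldString` must occur exactly once in the file and rejects the edit otherwise; the specification does not state this rule, and `apply_edit` is a hypothetical helper that ignores `startLine`/`endLine`.

```python
def apply_edit(source, diff):
    """Replace diff['oldString'] with diff['newString'] if the match is unambiguous."""
    old, new = diff["oldString"], diff["newString"]
    occurrences = source.count(old)
    if occurrences != 1:
        raise ValueError(
            f"oldString must match exactly once, found {occurrences} matches"
        )
    return source.replace(old, new, 1)
```
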

---

### 3.5 `tool_call` - Tool Call and Result

Documents a tool call, including its arguments and result.

```json
{
  "id": "doc_006",
  "type": "tool_call",
  "sequence": 6,
  "content": null,
  "metadata": {
    "toolName": "codebase_search",
    "toolCallId": "call_abc123",
    "arguments": {
      "query": "Where is the Flask app initialized?",
      "target_directories": ["src/"],
      "explanation": "Find the main app entry point"
    },
    "result": {
      "status": "success",
      "data": "Found in src/main.py:12 - app = Flask(__name__)"
    },
    "duration_ms": 320
  }
}
```


---

### 3.6 `terminal_command` - Terminal Execution

Documents the execution of a terminal command.

```json
{
  "id": "doc_007",
  "type": "terminal_command",
  "sequence": 7,
  "content": null,
  "metadata": {
    "command": "pip install PyJWT flask",
    "workingDirectory": "/path/to/project",
    "exitCode": 0,
    "output": "Successfully installed PyJWT-2.8.0 flask-3.0.0",
    "duration_ms": 5200,
    "permissions": ["network"]
  }
}
```


---

### 3.7 `plan` - Structured Plan

Result of plan mode or of a `create_plan` tool call.

```json
{
  "id": "doc_008",
  "type": "plan",
  "sequence": 8,
  "content": "# Authentication Middleware\n\n## Overview\nImplementation of a JWT-based auth middleware...\n\n## Files\n- `src/middleware/auth.py` (new)\n- `src/main.py` (modify)\n- `requirements.txt` (modify)",
  "metadata": {
    "title": "Auth Middleware Implementation",
    "overview": "JWT-based authentication middleware with token validation",
    "todos": [
      {
        "id": "create-middleware",
        "content": "Create the auth middleware module",
        "status": "pending",
        "dependencies": []
      },
      {
        "id": "register-blueprint",
        "content": "Register the blueprint in main.py",
        "status": "pending",
        "dependencies": ["create-middleware"]
      },
      {
        "id": "update-deps",
        "content": "Add PyJWT to requirements.txt",
        "status": "pending",
        "dependencies": []
      },
      {
        "id": "protect-routes",
        "content": "Apply the decorator to routes that need protection",
        "status": "pending",
        "dependencies": ["create-middleware", "register-blueprint"]
      }
    ],
    "format": "markdown"
  }
}
```

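
The `dependencies` field turns the todo list into a directed graph, so a valid execution order is a topological sort. A minimal sketch using the standard-library `graphlib` module (Python 3.9+); `execution_order` is an illustrative name, and how an agent actually schedules todos is not specified here.

```python
from graphlib import TopologicalSorter


def execution_order(todos):
    """Return todo IDs in an order where every dependency comes first.

    Raises graphlib.CycleError if the dependencies form a cycle.
    """
    graph = {todo["id"]: set(todo["dependencies"]) for todo in todos}
    return list(TopologicalSorter(graph).static_order())
```

Several orders can be valid for the same plan, so callers should rely only on the guarantee that dependencies precede their dependents.
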

---

### 3.8 `clarification` - Clarifying Question

A clarifying question from the AI assistant to the user (common in plan mode).

```json
{
  "id": "doc_009",
  "type": "clarification",
  "sequence": 9,
  "content": "Before I finalize the plan, I have the following questions:",
  "metadata": {
    "questions": [
      {
        "id": "q1",
        "text": "Should the middleware use JWT or session-based auth?",
        "options": [
          { "id": "jwt", "label": "JWT token-based" },
          { "id": "session", "label": "Session-based with cookies" }
        ]
      },
      {
        "id": "q2",
        "text": "Which routes should be protected?",
        "options": [
          { "id": "all", "label": "All except login/register" },
          { "id": "selective", "label": "Only explicitly marked routes" }
        ]
      }
    ]
  }
}
```


---

### 3.9 `error` - Error Message

Error message for a failed tool execution or an LLM error.

```json
{
  "id": "doc_010",
  "type": "error",
  "sequence": 10,
  "content": "The file could not be read",
  "metadata": {
    "errorCode": "FILE_NOT_FOUND",
    "source": "read_file",
    "details": "src/middleware/auth.py does not exist"
  }
}
```


---

### 3.10 `todo_update` - Task Status Change

A change to the task tracking during execution.

```json
{
  "id": "doc_011",
  "type": "todo_update",
  "sequence": 11,
  "content": null,
  "metadata": {
    "todos": [
      { "id": "create-middleware", "status": "completed" },
      { "id": "register-blueprint", "status": "in_progress" }
    ]
  }
}
```

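
A client that holds the todos from an earlier `plan` document can fold a `todo_update` into them by ID. The sketch below assumes that an update only changes `status` and that an unknown ID is an error; both are assumptions, and `apply_todo_update` is a hypothetical helper.

```python
def apply_todo_update(todos, update_metadata):
    """Return a new todo list with the statuses from a todo_update applied."""
    new_status = {item["id"]: item["status"] for item in update_metadata["todos"]}
    unknown = set(new_status) - {todo["id"] for todo in todos}
    if unknown:
        raise KeyError(f"todo_update references unknown ids: {sorted(unknown)}")
    # Copy each todo so the caller's original plan stays untouched.
    return [
        {**todo, "status": new_status.get(todo["id"], todo["status"])}
        for todo in todos
    ]
```
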

---

## 4. Complete Response Example

A realistic example for the request "Implement JWT authentication":

```json
{
  "id": "chat_7f3a2b1c",
  "conversationId": "conv_9e8d7c6b",
  "model": "claude-4.6-opus",
  "mode": "agent",
  "created": "2026-02-23T10:30:00Z",
  "status": "completed",
  "documents": [
    {
      "id": "doc_001",
      "type": "text",
      "sequence": 1,
      "content": "I will first analyze the existing project structure to find the best integration strategy.",
      "metadata": { "format": "markdown" }
    },
    {
      "id": "doc_002",
      "type": "tool_call",
      "sequence": 2,
      "content": null,
      "metadata": {
        "toolName": "codebase_search",
        "toolCallId": "call_001",
        "arguments": {
          "query": "Where is the Flask app initialized?",
          "target_directories": ["src/"]
        },
        "result": { "status": "success", "data": "src/main.py:5" },
        "duration_ms": 280
      }
    },
    {
      "id": "doc_003",
      "type": "tool_call",
      "sequence": 3,
      "content": null,
      "metadata": {
        "toolName": "read_file",
        "toolCallId": "call_002",
        "arguments": { "target_file": "src/main.py" },
        "result": {
          "status": "success",
          "data": "import os\nfrom flask import Flask\n\ndef main():\n    app = Flask(__name__)\n    app.register_blueprint(api_bp)\n    return app"
        },
        "duration_ms": 45
      }
    },
    {
      "id": "doc_004",
      "type": "code_reference",
      "sequence": 4,
      "content": "def main():\n    app = Flask(__name__)\n    app.register_blueprint(api_bp)\n    return app",
      "metadata": {
        "filePath": "src/main.py",
        "startLine": 4,
        "endLine": 7,
        "language": "python"
      }
    },
    {
      "id": "doc_005",
      "type": "text",
      "sequence": 5,
      "content": "I will now create the auth middleware and integrate it into the app.",
      "metadata": { "format": "markdown" }
    },
    {
      "id": "doc_006",
      "type": "file_edit",
      "sequence": 6,
      "content": "from functools import wraps\nfrom flask import request, jsonify\nimport jwt\nimport os\n\nSECRET_KEY = os.environ.get('JWT_SECRET', 'dev-secret')\n\ndef authMiddleware(f):\n    @wraps(f)\n    def decorated(*args, **kwargs):\n        token = request.headers.get('Authorization', '').replace('Bearer ', '')\n        if not token:\n            return jsonify({'error': 'Token required'}), 401\n        try:\n            data = jwt.decode(token, SECRET_KEY, algorithms=['HS256'])\n            request.userId = data['userId']\n        except jwt.InvalidTokenError:\n            return jsonify({'error': 'Invalid token'}), 401\n        return f(*args, **kwargs)\n    return decorated",
      "metadata": {
        "filePath": "src/middleware/auth.py",
        "operation": "create",
        "language": "python"
      }
    },
    {
      "id": "doc_007",
      "type": "file_edit",
      "sequence": 7,
      "content": "from middleware.auth import authMiddleware\n\napp.register_blueprint(api_bp)\napp.register_blueprint(auth_bp)",
      "metadata": {
        "filePath": "src/main.py",
        "operation": "edit",
        "language": "python",
        "diff": {
          "oldString": "app.register_blueprint(api_bp)",
          "newString": "from middleware.auth import authMiddleware\n\napp.register_blueprint(api_bp)\napp.register_blueprint(auth_bp)",
          "startLine": 6,
          "endLine": 6
        }
      }
    },
    {
      "id": "doc_008",
      "type": "terminal_command",
      "sequence": 8,
      "content": null,
      "metadata": {
        "command": "pip install PyJWT",
        "exitCode": 0,
        "output": "Successfully installed PyJWT-2.8.0",
        "duration_ms": 3200,
        "permissions": ["network"]
      }
    },
    {
      "id": "doc_009",
      "type": "text",
      "sequence": 9,
      "content": "The JWT authentication middleware is implemented. You can now protect routes with the `@authMiddleware` decorator:\n\n```python\n@app.route('/api/protected')\n@authMiddleware\ndef protectedRoute():\n    return jsonify({'userId': request.userId})\n```",
      "metadata": { "format": "markdown" }
    }
  ],
  "usage": {
    "promptTokens": 3200,
    "completionTokens": 1450,
    "totalTokens": 4650
  },
  "metadata": {
    "duration_ms": 8750,
    "toolCallCount": 2,
    "turnCount": 3
  }
}
```


---

## 5. Streaming Format

In streaming mode, each document is transmitted as a series of Server-Sent Events:

### 5.1 Event Types

| Event type | Description |
|---|---|
| `document_start` | A new document begins; contains id and type |
| `content_delta` | Text chunk for a document in progress |
| `tool_call_start` | A tool call begins |
| `tool_call_arguments` | Arguments of the tool call |
| `tool_result` | Result of the tool call |
| `document_end` | Document completed |
| `done` | Entire response completed; contains usage |

### 5.2 Event Sequence for a Text Document

```
event: document_start
data: {"id":"doc_001","type":"text","sequence":1}

event: content_delta
data: {"documentId":"doc_001","delta":"I will first analyze "}

event: content_delta
data: {"documentId":"doc_001","delta":"the existing "}

event: content_delta
data: {"documentId":"doc_001","delta":"project structure..."}

event: document_end
data: {"documentId":"doc_001","finalContent":"I will first analyze the existing project structure..."}
```

### 5.3 Event Sequence for a Tool Call

```
event: document_start
data: {"id":"doc_002","type":"tool_call","sequence":2}

event: tool_call_start
data: {"documentId":"doc_002","toolName":"read_file","toolCallId":"call_001"}

event: tool_call_arguments
data: {"documentId":"doc_002","arguments":{"target_file":"src/main.py"}}

event: tool_result
data: {"documentId":"doc_002","result":{"status":"success","data":"import os\n..."}}

event: document_end
data: {"documentId":"doc_002"}
```

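
On the receiving side, the event sequences above can be folded back into documents. The following is a simplified sketch: it parses an already-received block of SSE text rather than a live connection, handles only the `event:` and `data:` fields, and the function names `parse_sse` and `assemble_documents` are illustrative.

```python
import json


def parse_sse(stream_text):
    """Split raw SSE text into (event_type, data) pairs."""
    events = []
    for block in stream_text.strip().split("\n\n"):
        event_type, data_lines = None, []
        for line in block.splitlines():
            if line.startswith("event:"):
                event_type = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if event_type and data_lines:
            events.append((event_type, json.loads("\n".join(data_lines))))
    return events


def assemble_documents(events):
    """Rebuild document dicts from the stream events, ordered by sequence."""
    documents = {}
    for event_type, data in events:
        if event_type == "document_start":
            documents[data["id"]] = {**data, "content": None, "metadata": {}}
            continue
        document = documents.get(data.get("documentId"))
        if document is None:
            continue  # events without a known document, such as "done"
        if event_type == "content_delta":
            document["content"] = (document["content"] or "") + data["delta"]
        elif event_type == "tool_call_start":
            document["metadata"]["toolName"] = data["toolName"]
            document["metadata"]["toolCallId"] = data["toolCallId"]
        elif event_type == "tool_call_arguments":
            document["metadata"]["arguments"] = data["arguments"]
        elif event_type == "tool_result":
            document["metadata"]["result"] = data["result"]
        elif event_type == "document_end" and "finalContent" in data:
            document["content"] = data["finalContent"]
    return sorted(documents.values(), key=lambda doc: doc["sequence"])
```
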

---

## 6. TypeScript Interface Definitions

```typescript
interface ChatResponse {
  id: string;
  conversationId: string;
  model: string;
  mode: "agent" | "plan" | "ask" | "debug";
  created: string;
  status: "completed" | "streaming" | "error";
  documents: Document[];
  usage: UsageInfo;
  metadata: ResponseMetadata;
}

type DocumentType =
  | "text"
  | "code_reference"
  | "code_block"
  | "file_edit"
  | "tool_call"
  | "terminal_command"
  | "plan"
  | "clarification"
  | "error"
  | "todo_update";

interface Document {
  id: string;
  type: DocumentType;
  sequence: number;
  content: string | null;
  metadata: Record<string, unknown>;
}

interface TextMetadata {
  format: "markdown" | "plain";
}

interface CodeReferenceMetadata {
  filePath: string;
  startLine: number;
  endLine: number;
  language: string;
}

interface CodeBlockMetadata {
  language: string;
  purpose: "new_code" | "example" | "suggestion";
}

interface FileEditMetadata {
  filePath: string;
  operation: "create" | "edit" | "delete" | "rename";
  language: string;
  // Present for "edit"; null or omitted for the other operations
  diff?: {
    oldString: string;
    newString: string;
    startLine: number;
    endLine: number;
  } | null;
}

interface ToolCallMetadata {
  toolName: string;
  toolCallId: string;
  arguments: Record<string, unknown>;
  result: {
    status: "success" | "error";
    data: unknown;
  };
  duration_ms: number;
}

interface TerminalCommandMetadata {
  command: string;
  workingDirectory?: string;
  exitCode: number;
  output: string;
  duration_ms: number;
  permissions: string[];
}

interface PlanMetadata {
  title: string;
  overview: string;
  todos: PlanTodo[];
  format: "markdown";
}

interface PlanTodo {
  id: string;
  content: string;
  status: "pending" | "in_progress" | "completed" | "cancelled";
  dependencies: string[];
}

interface ClarificationMetadata {
  questions: ClarificationQuestion[];
}

interface ClarificationQuestion {
  id: string;
  text: string;
  options: { id: string; label: string }[];
}

interface ErrorMetadata {
  errorCode: string;
  source: string;
  details: string;
}

interface TodoUpdateMetadata {
  todos: { id: string; status: string }[];
}

interface UsageInfo {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

interface ResponseMetadata {
  duration_ms: number;
  toolCallCount: number;
  turnCount: number;
}

// Streaming events
type StreamEventType =
  | "document_start"
  | "content_delta"
  | "tool_call_start"
  | "tool_call_arguments"
  | "tool_result"
  | "document_end"
  | "done";

interface StreamEvent {
  type: StreamEventType;
  documentId?: string;
  data: unknown;
}
```


---

## 7. Mapping: Cursor LLM Response → Document Types

How Cursor's raw LLM stream is converted into document types:

```
LLM Output Stream
        ↓
Parser State Machine
│
├─> Detects free text → type: "text"
│     Accumulates until the next segment begins
│
├─> Detects ``` with startLine:endLine:filepath → type: "code_reference"
│     Extracts: filePath, startLine, endLine, content
│
├─> Detects ``` with a language tag → type: "code_block"
│     Extracts: language, content
│
├─> Detects a tool call in the response
│     ├─> tool_name == "edit_file" → type: "file_edit"
│     ├─> tool_name == "run_terminal_cmd" → type: "terminal_command"
│     ├─> tool_name == "create_plan" → type: "plan"
│     ├─> tool_name == "todo_write" → type: "todo_update"
│     └─> all others → type: "tool_call"
│
└─> Detects an error → type: "error"
```

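
The tool-call branch of the state machine above is a plain lookup with a fallback. A minimal sketch of that branch only; the names `TOOL_DOCUMENT_TYPES` and `document_type_for_tool` are illustrative.

```python
# Tools that get their own document type; everything else is a generic tool_call.
TOOL_DOCUMENT_TYPES = {
    "edit_file": "file_edit",
    "run_terminal_cmd": "terminal_command",
    "create_plan": "plan",
    "todo_write": "todo_update",
}


def document_type_for_tool(tool_name):
    """Return the document type a tool call is rendered as."""
    return TOOL_DOCUMENT_TYPES.get(tool_name, "tool_call")
```
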

---

## References

- Cursor architecture: `doc_cursor_ai_agent_architecture.md`
- API integration concept: `doc_cursor_chat_api_integration_concept.md`
- OpenAI Chat Completions Streaming: https://platform.openai.com/docs/api-reference/chat/streaming
- Server-Sent Events Spec: https://html.spec.whatwg.org/multipage/server-sent-events.html