wiki/z-archive/implementation/implementation_user_prompt_analysis.md at c2ab5d976c8fc26e0fca55769e8d5e95195759d9

ValueOn AG 8bd4e67be6 docs: complete wiki restructuring - new folder hierarchy, canonical reference pages, archive old docs

Made-with: Cursor

2026-04-05 23:28:14 +02:00

User Prompt Analysis: Intent Extraction and Context Documentization

Objective

Extract a clean, concise user intent from the first user message of each workflow round.
Move large or detailed inline supportive content into ChatDocument entries attached to the same first user message.
Persist the cleaned intent in services.currentUserPrompt and keep the original message in services.rawUserPrompt.
Normalize the intent to the detected language.

Integration Point

Layer: Workflow level, same module where task planning is initiated.
Timing: Immediately when a new round starts and the first user message is being created (before task planning and any action planning).
Side effects:
- Create/attach ChatDocument items to the first user message with documentsLabel = "user_context".
- Ensure these documents are discoverable via existing AVAILABLE_DOCUMENTS* placeholders.

Data Flow

Receive raw user message for the round → store services.rawUserPrompt.
Run AI-based analyzer to produce { detectedLanguage, intent, contextItems[] }.
Set services.user.language = detectedLanguage (if present).
Set services.currentUserPrompt = intent.
For each contextItems[i], create a ChatDocument (fileName: user_context_{i}.txt or derived) and attach to the first user message. Group via docList:messageId:user_context.

Minimal User Input Object (in-memory)

detectedLanguage: string (ISO 639-1 code, e.g., "en")
intent: string (concise, normalized)
contextItems: array of items to be persisted as ChatDocuments only (not retained as a list beyond creation)
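
The three fields above could be modeled as a small in-memory structure. The following is a minimal sketch in Python, not the project's actual type: the class names `ContextItem` and `UserInputAnalysis` are hypothetical, and the per-item fields (`title`, `mimeType`, `content`) are taken from the analyzer output format shown in this document.

```python
from dataclasses import dataclass, field


@dataclass
class ContextItem:
    """One extracted block; field names mirror the analyzer's JSON output."""
    title: str
    content: str
    mimeType: str = "text/plain"


@dataclass
class UserInputAnalysis:
    """Hypothetical in-memory container for one analyzer result."""
    detectedLanguage: str  # ISO code such as "en"
    intent: str            # concise, normalized request
    contextItems: list[ContextItem] = field(default_factory=list)


analysis = UserInputAnalysis(detectedLanguage="en", intent="Summarize the attached CSV.")
analysis.contextItems.append(ContextItem(title="User context 1", content="a,b\n1,2"))
```

Because the items are only needed until the ChatDocuments exist, the object can be discarded once document creation has finished.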

AI Analyzer Prompt (JSON braces escaped for docs)

Use this prompt for the analyzer call. The output must be JSON only and follow the structure below. Note: braces are shown doubled ({{ }}) as an escape for display; the analyzer's actual output uses single braces.

You are an input analyzer. Split the user's message into:
1) intent: the user's core request in one concise paragraph, normalized to the user's language.
2) contextItems: supportive data to attach as separate documents if significantly larger than the intent. Include large literal data blocks, long lists/tables, code/JSON blocks, quoted transcripts, CSV fragments, or detailed specs. Keep URLs in the intent unless they include large pasted content.

Rules:
- If total content length (intent + data) is less than 10% of the model's max tokens, do not extract; return an empty contextItems and keep a compact, self-contained intent.
- If content exceeds that, move bulky parts into contextItems, keeping the intent short and clear.
- Preserve critical references (URLs, filenames) in the intent.
- Normalize the intent to the detected language. If mixed-language, use the primary detected language and normalize.

Output JSON only (no markdown):
{{
  "detectedLanguage": "en",
  "intent": "Concise normalized request...",
  "contextItems": [
    {{
      "title": "User context 1",
      "mimeType": "text/plain",
      "content": "Full extracted content block here"
    }}
  ]
}}
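
The doubled braces in the prompt above also happen to match the escape that Python's `str.format` uses for literal braces. If the prompt were stored as a format template (an assumption; this document only says the doubling is for display), rendering it would produce single braces. The template text and the `raw_message` placeholder below are hypothetical and shortened for illustration.

```python
# Hypothetical, shortened template; only the brace handling matters here.
ANALYZER_TEMPLATE = (
    "Output JSON only (no markdown):\n"
    '{{"detectedLanguage": "en", "intent": "...", "contextItems": []}}\n'
    "User message:\n{raw_message}"
)

# str.format turns each "{{" / "}}" pair into a single literal brace
# and substitutes the named placeholder.
rendered = ANALYZER_TEMPLATE.format(raw_message="Translate this text.")
```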

Algorithm (concise)

On new round user message creation:
- Set services.rawUserPrompt = rawMessage.
- Determine model maxTokens (from current model selection).
- Call AI analyzer with prompt above and the raw message.
Parse analyzer result:
- Fallback: if invalid, set services.currentUserPrompt = rawMessage, contextItems = [].
- Else set services.currentUserPrompt = intent, update services.user.language when provided.
Create context documents:
- For each contextItem, create a ChatDocument using component/file interfaces.
- Attach to the first user message; label group as user_context so it appears in docList:messageId:user_context.
Downstream prompt extractors:
- extractUserPrompt returns services.currentUserPrompt if available; otherwise it falls back to its existing behavior.
- AVAILABLE_DOCUMENTS* functions continue to index attached documents.
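
The parse-and-fallback step of the algorithm above can be sketched as follows. This is an illustration under assumptions, not the project's parser: the function name `parse_analysis` and the exact validity checks (non-empty string intent, list-typed `contextItems`) are hypothetical choices.

```python
import json
from typing import Any


def parse_analysis(raw_message: str, analyzer_output: str) -> dict[str, Any]:
    """Return intent, language, and context items; fall back to the raw message."""
    fallback = {"intent": raw_message, "detectedLanguage": None, "contextItems": []}
    try:
        data = json.loads(analyzer_output)
    except (json.JSONDecodeError, TypeError):
        return fallback  # not JSON at all
    if not isinstance(data, dict):
        return fallback
    intent = data.get("intent")
    if not isinstance(intent, str) or not intent.strip():
        return fallback  # an empty intent counts as invalid
    items = data.get("contextItems")
    if not isinstance(items, list):
        items = []
    # Keep only items that actually carry string content.
    items = [i for i in items if isinstance(i, dict) and isinstance(i.get("content"), str)]
    language = data.get("detectedLanguage")
    return {
        "intent": intent.strip(),
        "detectedLanguage": language if isinstance(language, str) else None,
        "contextItems": items,
    }
```

Returning the same dictionary shape in both the valid and the fallback case keeps the caller free of special handling: it can always assign `intent` to services.currentUserPrompt and loop over `contextItems`.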

Pseudocode (high-level)

raw = userMessage.text
services.rawUserPrompt = raw

modelMax = ai.getModelMaxTokens()
analysis = ai.callAnalyzer(raw, modelMax)

if not analysis.valid:
  services.currentUserPrompt = raw
  items = []
else:
  services.user.language = analysis.detectedLanguage or services.user.language
  services.currentUserPrompt = analysis.intent
  items = analysis.contextItems or []

for i, item in enumerate(items):
  fileName = inferFileName(item.title, i)  # default: user_context_{i}.txt
  doc = createChatDocument(fileName, item.mimeType, item.content, messageId=firstMessage.id)
  attachDocumentToMessage(doc, label="user_context")
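
The pseudocode leaves `inferFileName` undefined. One possible shape is sketched below, assuming the name is derived from the item title when one is usable and otherwise defaults to user_context_{i} with an extension chosen from the MIME type. The extension table and the 40-character limit are hypothetical.

```python
import re
from typing import Optional

# Hypothetical mapping; unknown MIME types fall back to ".txt".
EXTENSIONS = {"text/plain": ".txt", "application/json": ".json", "text/csv": ".csv"}


def infer_file_name(title: Optional[str], index: int, mime_type: str = "text/plain") -> str:
    """Derive a file name from the title, else use user_context_{index}.<ext>."""
    ext = EXTENSIONS.get(mime_type, ".txt")
    # Reduce the title to lowercase letters, digits, and underscores.
    slug = re.sub(r"[^a-z0-9]+", "_", (title or "").lower()).strip("_")
    if not slug:
        return f"user_context_{index}{ext}"
    return f"{slug[:40]}{ext}"
```

This sketch does not guard against two items producing the same name; a real implementation would need to decide whether to append the index or rely on the storage layer.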

Edge Cases

Analyzer returns empty/invalid → keep the raw prompt as services.currentUserPrompt and create no documents.
Extremely large context blocks → rely on file storage and existing compression paths.
Mixed-language messages → normalize intent to detected primary language.
Token threshold (~10% of model max) → skip extraction when the message falls below it.
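
The threshold case can also be checked cheaply before or after the analyzer call. The sketch below is an assumption-laden illustration: `should_extract` is a hypothetical helper, and the four-characters-per-token estimate is a rough heuristic rather than a real tokenizer count.

```python
def should_extract(
    raw_message: str,
    model_max_tokens: int,
    ratio: float = 0.10,
    chars_per_token: float = 4.0,
) -> bool:
    """Return True when the message is large enough to justify extraction."""
    # Rough size estimate; swap in the model's tokenizer for accuracy.
    estimated_tokens = len(raw_message) / chars_per_token
    return estimated_tokens >= model_max_tokens * ratio
```

For a model with 8000 max tokens, a 100-character message is estimated at 25 tokens and stays inline, while a 4000-character message is estimated at 1000 tokens and qualifies for extraction.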

Telemetry & Logging

Log analyzer input size, output size, number of context items, and elapsed time.
Trace the final intent and number of documents created (not content).
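
A minimal sketch of these log entries using Python's standard `logging` module is shown below. The logger name, the function `log_analysis`, and the record keys are hypothetical; sizes are measured in characters for simplicity, and document content is deliberately left out of the log.

```python
import logging
import time

logger = logging.getLogger("user_prompt_analysis")


def log_analysis(raw_message: str, intent: str, context_items: list, started_at: float) -> dict:
    """Log sizes, item count, and elapsed time; never the document content."""
    record = {
        "input_chars": len(raw_message),
        "output_chars": len(intent) + sum(len(i.get("content", "")) for i in context_items),
        "context_item_count": len(context_items),
        "elapsed_ms": round((time.monotonic() - started_at) * 1000, 1),
    }
    logger.info("user prompt analysis: %s", record)
    # The intent itself is traced separately, at debug level.
    logger.debug("final intent: %s", intent)
    return record
```

The caller takes `started_at = time.monotonic()` just before the analyzer call and passes it in once the result has been parsed.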

Rollout

Implement analyzer call and storage.
Attach documents and verify they appear in AVAILABLE_DOCUMENTS index.
Update extractUserPrompt to prefer services.currentUserPrompt.
Add metrics and guardrails; enable behind a feature flag if needed.
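
The extractUserPrompt change and the optional feature flag could look like the sketch below. The real signature of extractUserPrompt is not given in this document, so the `services` argument, the `analysis_enabled` flag, and the choice of services.rawUserPrompt as the fallback value are all assumptions.

```python
from types import SimpleNamespace


def extract_user_prompt(services, analysis_enabled: bool = True) -> str:
    """Prefer the cleaned intent; otherwise return the assumed fallback."""
    cleaned = getattr(services, "currentUserPrompt", None)
    if analysis_enabled and cleaned:
        return cleaned
    # Assumed fallback: the unmodified first user message.
    return getattr(services, "rawUserPrompt", "") or ""


# Stand-in for the real services object, for illustration only.
services = SimpleNamespace(
    rawUserPrompt="long raw message ...",
    currentUserPrompt="Summarize the report.",
)
```

With the flag off, the function ignores the cleaned intent entirely, which gives a simple way to compare behavior during rollout.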

Testing

Unit: parsing analyzer response; document creation; extractUserPrompt fallback.
Integration: start workflow round → verify services.currentUserPrompt set and user_context docs indexed.
Regression: prompts render correctly; parameters generation can reference new docs.

Acceptance Criteria

Clean intent set on services.currentUserPrompt consistently.
Context extracted into documents when above threshold; otherwise kept inline.
AVAILABLE_DOCUMENTS* includes new context docs; extractUserPrompt returns cleaned intent.
