diff --git a/poweron/implementation/implementation_user_prompt_analysis.md b/poweron/implementation/implementation_user_prompt_analysis.md
new file mode 100644
index 0000000..25b6803
--- /dev/null
+++ b/poweron/implementation/implementation_user_prompt_analysis.md
@@ -0,0 +1,119 @@
+## User Prompt Analysis: Intent Extraction and Context Documentization
+
+### Objective
+- Extract a clean, concise user intent from the first user message of each workflow round.
+- Move large or detailed inline supportive content into `ChatDocument` entries attached to the same first user message.
+- Persist the cleaned intent in `services.currentUserPrompt` and keep the original message in `services.rawUserPrompt`.
+- Normalize the intent to the detected language.
+
+### Integration Point
+- Layer: Workflow level, same module where task planning is initiated.
+- Timing: Immediately when a new round starts and the first user message is being created (before task planning and any action planning).
+- Side effects:
+  - Create/attach `ChatDocument` items to the first user message with `documentsLabel = "user_context"`.
+  - Ensure these documents are discoverable via existing `AVAILABLE_DOCUMENTS*` placeholders.
+
+### Data Flow
+1) Receive raw user message for the round → store `services.rawUserPrompt`.
+2) Run AI-based analyzer to produce `{ detectedLanguage, intent, contextItems[] }`.
+3) Set `services.user.language = detectedLanguage` (if present).
+4) Set `services.currentUserPrompt = intent`.
+5) For each `contextItems[i]`, create a `ChatDocument` (fileName: `user_context_{i}.txt` or derived) and attach to the first user message. Group via `docList:messageId:user_context`.
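The data flow above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `Services`, `ChatDocument`, `AnalysisResult`, `ContextItem`, and `apply_analysis` are hypothetical stand-ins, and `userLanguage` stands in for `services.user.language`. Only the field names and the `user_context_{i}.txt` naming come from this document.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContextItem:
    title: str
    mimeType: str
    content: str


@dataclass
class AnalysisResult:
    # Mirrors the analyzer output: { detectedLanguage, intent, contextItems[] }
    detectedLanguage: Optional[str]
    intent: str
    contextItems: List[ContextItem] = field(default_factory=list)


@dataclass
class ChatDocument:
    # Hypothetical stand-in for the real ChatDocument type
    fileName: str
    mimeType: str
    content: str
    messageId: str
    documentsLabel: str = "user_context"


@dataclass
class Services:
    # Hypothetical stand-in for the real services object
    rawUserPrompt: str = ""
    currentUserPrompt: str = ""
    userLanguage: str = "en"  # stands in for services.user.language


def apply_analysis(services: Services, raw: str, analysis: AnalysisResult,
                   message_id: str) -> List[ChatDocument]:
    """Apply steps 1 and 3-5 of the data flow; step 2 (the analyzer call) happens upstream."""
    services.rawUserPrompt = raw                           # step 1
    if analysis.detectedLanguage:                          # step 3 (only if present)
        services.userLanguage = analysis.detectedLanguage
    services.currentUserPrompt = analysis.intent           # step 4
    return [                                               # step 5
        ChatDocument(
            fileName=f"user_context_{i}.txt",
            mimeType=item.mimeType,
            content=item.content,
            messageId=message_id,
        )
        for i, item in enumerate(analysis.contextItems)
    ]
```

Returning the documents instead of attaching them inside the function keeps the sketch independent of the real attachment interface, which is not shown in this document.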
+
+### Minimal User Input Object (in-memory)
+- detectedLanguage: string (ISO, e.g., "en")
+- intent: string (concise, normalized)
+- contextItems: array of items to be persisted as ChatDocuments only (not retained as a list beyond creation)
+
+### AI Analyzer Prompt (JSON braces escaped for docs)
+Use this prompt for the analyzer call. Output must be JSON-only and use the following structure. Note: to display JSON in docs, we show braces as doubled `{{` `}}`.
+
+```
+You are an input analyzer. Split the user's message into:
+1) intent: the user's core request in one concise paragraph, normalized to the user's language.
+2) contextItems: supportive data to attach as separate documents if significantly larger than the intent. Include large literal data blocks, long lists/tables, code/JSON blocks, quoted transcripts, CSV fragments, or detailed specs. Keep URLs in the intent unless they include large pasted content.
+
+Rules:
+- If total content length (intent + data) is less than 10% of the model's max tokens, do not extract; return an empty contextItems and keep a compact, self-contained intent.
+- If content exceeds that, move bulky parts into contextItems, keeping the intent short and clear.
+- Preserve critical references (URLs, filenames) in the intent.
+- Normalize the intent to the detected language. If mixed-language, use the primary detected language and normalize.
+
+Output JSON only (no markdown):
+{{
+  "detectedLanguage": "en",
+  "intent": "Concise normalized request...",
+  "contextItems": [
+    {{
+      "title": "User context 1",
+      "mimeType": "text/plain",
+      "content": "Full extracted content block here"
+    }}
+  ]
+}}
+```
+
+### Algorithm (concise)
+1) On new round user message creation:
+   - Set `services.rawUserPrompt = rawMessage`.
+   - Determine model `maxTokens` (from current model selection).
+   - Call AI analyzer with prompt above and the raw message.
+2) Parse analyzer result:
+   - Fallback: if invalid, set `services.currentUserPrompt = rawMessage`, `contextItems = []`.
+   - Else set `services.currentUserPrompt = intent`, update `services.user.language` when provided.
+3) Create context documents:
+   - For each `contextItem`, create a `ChatDocument` using component/file interfaces.
+   - Attach to the first user message; label group as `user_context` so it appears in `docList:messageId:user_context`.
+4) Downstream prompt extractors:
+   - `extractUserPrompt` returns `services.currentUserPrompt` if available, otherwise falls back to the raw message.
+   - `AVAILABLE_DOCUMENTS*` functions continue to index attached documents.
+
+### Pseudocode (high-level)
+```
+raw = userMessage.text
+services.rawUserPrompt = raw
+
+modelMax = ai.getModelMaxTokens()
+analysis = ai.callAnalyzer(raw, modelMax)
+
+if not analysis.valid:
+    # Fallback: keep the raw message as the current prompt
+    services.currentUserPrompt = raw
+    items = []
+else:
+    services.user.language = analysis.detectedLanguage or services.user.language
+    services.currentUserPrompt = analysis.intent
+    items = analysis.contextItems or []
+
+for i, item in enumerate(items):
+    fileName = inferFileName(item.title, i)  # default: user_context_{i}.txt
+    doc = createChatDocument(fileName, item.mimeType, item.content, messageId=firstMessage.id)
+    attachDocumentToMessage(doc, label="user_context")
+```
+
+### Edge Cases
+- Analyzer returns empty/invalid → keep raw prompt as current.
+- Extremely large context blocks → rely on file storage and existing compression paths.
+- Mixed-language messages → normalize intent to detected primary language.
+- Message below the token threshold (~10% of model max) → skip extraction and keep the content inline.
+
+### Telemetry & Logging
+- Log analyzer input size, output size, number of context items, and time.
+- Trace the final intent and number of documents created (not content).
+
+### Rollout
+1) Implement analyzer call and storage.
+2) Attach documents and verify they appear in the `AVAILABLE_DOCUMENTS*` index.
+3) Update `extractUserPrompt` to prefer `services.currentUserPrompt`.
+4) Add metrics and guardrails; enable behind a feature flag if needed.
+
+### Testing
+- Unit: parsing analyzer response; document creation; `extractUserPrompt` fallback.
+- Integration: start workflow round → verify `services.currentUserPrompt` set and `user_context` docs indexed.
+- Regression: prompts render correctly; parameter generation can reference new docs.
+
+### Acceptance Criteria
+- Clean intent set on `services.currentUserPrompt` consistently.
+- Context extracted into documents when above threshold; otherwise kept inline.
+- `AVAILABLE_DOCUMENTS*` includes new context docs; `extractUserPrompt` returns cleaned intent.
+
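The parsing fallback and the `extractUserPrompt` preference listed under Testing and Acceptance Criteria could be exercised with something like the sketch below. It is an illustration only: `parse_analyzer_output` and `extract_user_prompt` are hypothetical names and signatures, and the validation rules (non-empty string intent, items that carry string content) are assumptions that go beyond what this document specifies.

```python
import json
from typing import Any, Dict, Optional


def parse_analyzer_output(text: Optional[str], raw_message: str) -> Dict[str, Any]:
    """Parse the analyzer's JSON reply; fall back to the raw message if it is invalid."""
    fallback = {"detectedLanguage": None, "intent": raw_message, "contextItems": []}
    try:
        data = json.loads(text)
    except (TypeError, ValueError):  # JSONDecodeError is a subclass of ValueError
        return fallback
    if not isinstance(data, dict):
        return fallback
    intent = data.get("intent")
    if not isinstance(intent, str) or not intent.strip():
        return fallback
    items = data.get("contextItems")
    if not isinstance(items, list):
        items = []
    # Assumption: keep only items that actually carry string content
    clean_items = [
        {
            "title": str(item.get("title", f"User context {i + 1}")),
            "mimeType": str(item.get("mimeType", "text/plain")),
            "content": item["content"],
        }
        for i, item in enumerate(items)
        if isinstance(item, dict) and isinstance(item.get("content"), str) and item["content"]
    ]
    lang = data.get("detectedLanguage")
    return {
        "detectedLanguage": lang if isinstance(lang, str) and lang else None,
        "intent": intent.strip(),
        "contextItems": clean_items,
    }


def extract_user_prompt(current_user_prompt: Optional[str], raw_user_prompt: str) -> str:
    """Prefer the cleaned intent; otherwise fall back to the raw prompt."""
    return current_user_prompt if current_user_prompt else raw_user_prompt
```

Treating any malformed reply as "no extraction" keeps the round usable even when the analyzer misbehaves, which matches the edge case of keeping the raw prompt as current.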