identification of missing actions for future implementation
parent f637b83f39
commit 8e16ef81b9
1 changed file with 469 additions and 0 deletions
poweron/reviews/context_checker_analysis.md (Normal file, +469)
@ -0,0 +1,469 @@

# Context Checker Analysis

## Comprehensive User Prompt Testing & Action Coverage Analysis

**IMPORTANT**: This analysis assumes the system has a **dynamic task and action planner** that can chain multiple actions together. Therefore, only user requests that **cannot be fulfilled even by a sequence of actions** are marked as missing.

### Available Actions Summary
1. **ai.process** - Universal AI document processing: accepts MULTIPLE input documents in ANY format, processes them together with a prompt, and produces MULTIPLE output documents in ANY format. Handles: format conversion, document generation, summarization, translation, extraction, analysis, comparison, data transformation, chart generation, and any AI-powered document manipulation.
2. **ai.webResearch** - Web research with search and crawl
3. **outlook.composeAndSendEmailWithContext** - AI-composed emails with context
4. **outlook.readEmails** - Read emails from folders
5. **outlook.searchEmails** - Search emails by query
6. **sharepoint.findDocumentPath** - Find documents/folders in SharePoint
7. **sharepoint.listDocuments** - List documents in SharePoint paths
8. **sharepoint.readDocuments** - Read documents from SharePoint
9. **sharepoint.uploadDocument** - Upload documents to SharePoint
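
To make "chaining" concrete, here is a minimal sketch of how prompt 1 ("Research ValueOn AG ... and create a Word document") might be expressed as a two-step plan. Only the action names come from the list above; the `Step` dataclass, the `run_plan` helper, and the stub handlers are hypothetical and stand in for whatever the real planner uses.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One planned action call (hypothetical shape, not the real planner schema)."""
    action: str
    params: Dict[str, Any] = field(default_factory=dict)


def run_plan(steps: List[Step], handlers: Dict[str, Callable[..., List[str]]]) -> List[str]:
    """Run steps in order, handing each step's output documents to the next step."""
    documents: List[str] = []
    for step in steps:
        handler = handlers[step.action]  # a KeyError here means the action does not exist
        documents = handler(documents, **step.params)
    return documents


# Stub handlers standing in for the real actions (assumed signatures).
def fake_web_research(documents: List[str], query: str) -> List[str]:
    return [f"research-notes({query})"]


def fake_ai_process(documents: List[str], prompt: str, resultType: str) -> List[str]:
    return [f"{resultType}:{prompt} <- {', '.join(documents)}"]


HANDLERS = {"ai.webResearch": fake_web_research, "ai.process": fake_ai_process}

plan = [
    Step("ai.webResearch", {"query": "ValueOn AG Switzerland"}),
    Step("ai.process", {"prompt": "write findings", "resultType": "docx"}),
]
result = run_plan(plan, HANDLERS)
print(result)
```

A request is "truly missing" in the sense used here when no such list of steps can be built from the nine actions, which in this sketch would surface as a `KeyError` for the unknown action name.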

### Coverage Legend
- ✅ **COVERED** - Can be done with a single action or by chaining existing actions
- ❌ **TRULY MISSING** - Cannot be done even with action chaining (requires a new action)
- ⚠️ **UNCERTAIN** - May work by chaining existing actions, but a specialized action would be better
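
Once a prompt has been broken down into the actions it would need, the ✅/❌ decision is a set-membership check against the available actions. The sketch below assumes a hypothetical `classify` helper and a hand-written list of required actions per prompt; how the real planner decomposes a prompt is not described in this analysis.

```python
from typing import List, Tuple

# The nine actions from "Available Actions Summary".
AVAILABLE_ACTIONS = frozenset({
    "ai.process",
    "ai.webResearch",
    "outlook.composeAndSendEmailWithContext",
    "outlook.readEmails",
    "outlook.searchEmails",
    "sharepoint.findDocumentPath",
    "sharepoint.listDocuments",
    "sharepoint.readDocuments",
    "sharepoint.uploadDocument",
})


def classify(required_actions: List[str]) -> Tuple[str, List[str]]:
    """Return the legend status and the actions that would have to be added."""
    missing = sorted(set(required_actions) - AVAILABLE_ACTIONS)
    if missing:
        return "TRULY MISSING", missing
    return "COVERED", []


# Prompt 1: research plus document generation, both available.
print(classify(["ai.webResearch", "ai.process"]))
# Prompt 51: replying needs an action that does not exist yet.
print(classify(["outlook.searchEmails", "outlook.replyToEmail"]))
```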

---

## Test Prompts by Category (129 prompts)

### Category 1: Web Research & Information Gathering (15 prompts)

#### ✅ COVERED (single action or chained)
1. "Research ValueOn AG from Switzerland and create a Word document with findings" - ✅ ai.webResearch + ai.process (docx)
2. "Find information about AI trends in 2024" - ✅ ai.webResearch
3. "Search for the latest news about renewable energy" - ✅ ai.webResearch
4. "Research competitors in the software industry" - ✅ ai.webResearch
5. "Find information about GDPR compliance requirements" - ✅ ai.webResearch
6. "Research market analysis for electric vehicles in Europe" - ✅ ai.webResearch
7. "Search for best practices in project management" - ✅ ai.webResearch
8. "Find information about climate change policies" - ✅ ai.webResearch
9. "Research the history of artificial intelligence" - ✅ ai.webResearch
10. "Search for recent developments in quantum computing" - ✅ ai.webResearch
11. "Research and compare 5 companies in the fintech sector" - ✅ ai.webResearch + ai.process (compare)
12. "Find all articles about cryptocurrency regulation and summarize them" - ✅ ai.webResearch + ai.process (summarize)
13. "Research and create a timeline of major events in 2023" - ✅ ai.webResearch + ai.process (timeline)

#### ✅ COVERED (ai.process with resultType: pptx)
14. "Find the top 10 startups in Silicon Valley and create a presentation" - ✅ ai.webResearch + ai.process (resultType: pptx to create PowerPoint presentation)

#### ✅ COVERED (ai.process can generate images)
15. "Research stock prices for Apple, Google, Microsoft and create a chart" - ✅ ai.webResearch + ai.process (resultType: png/jpg to generate chart image)

---

### Category 2: Document Processing & Generation (20 prompts)

#### ✅ COVERED (ai.process)
16. "Convert this PDF to Word document" - ✅ ai.process (resultType: docx)
17. "Summarize this document" - ✅ ai.process
18. "Translate this document to German" - ✅ ai.process
19. "Extract key points from this presentation" - ✅ ai.process
20. "Generate a report from these Excel files" - ✅ ai.process
21. "Convert this image to text" - ✅ ai.process (OCR if AI supports it)
22. "Create a summary of these 3 documents" - ✅ ai.process (with documentList)
23. "Extract all tables from this PDF" - ✅ ai.process
24. "Generate a JSON file from this Word document" - ✅ ai.process (resultType: json)
25. "Convert this HTML file to PDF" - ✅ ai.process (resultType: pdf)
26. "Create a markdown version of this document" - ✅ ai.process (resultType: md)
27. "Extract metadata from this document" - ✅ ai.process
28. "Generate a CSV from this Excel file" - ✅ ai.process (resultType: csv)
29. "Create a plain text version of this PDF" - ✅ ai.process (resultType: txt)
30. "Convert this Word document to HTML" - ✅ ai.process (resultType: html)
35. "Create a table of contents for this document" - ✅ ai.process

#### ✅ COVERED (ai.process with multi-document output)
31. "Merge these 5 PDFs into one document" - ✅ ai.process (with all 5 PDFs in documentList + prompt to combine into one document)
32. "Split this large PDF into separate pages" - ✅ ai.process (read PDF + prompt to create separate documents per page, generation module outputs multiple files via standardized JSON)

#### ✅ COVERED (ai.process with extraction service)
34. "Extract images from this PDF" - ✅ ai.process (extraction service intelligently extracts images from PDF containers as ContentPart objects with typeGroup="image", which ai.process can then process and output)

#### ❌ TRULY MISSING
33. "Add a watermark to this PDF" - ❌ Missing document manipulation action (requires binary PDF manipulation - AI cannot modify PDF binary structure)

---

### Category 3: Email Operations (25 prompts)

#### ✅ COVERED (outlook.*)
36. "Read my emails from the Inbox"
37. "Search for emails from john@example.com"
38. "Find all emails with 'invoice' in the subject"
39. "Read emails from last week"
40. "Search for emails in the Drafts folder"
41. "Find unread emails from my manager"
42. "Send an email to client@example.com about the project status"
43. "Compose an email to the team about the meeting"
44. "Send a formal email to the board with the quarterly report attached"
45. "Read the 10 most recent emails"
46. "Search for emails with attachments"
47. "Find emails from the last 30 days"
48. "Read emails from the Sent Items folder"
49. "Search for emails containing 'urgent'"
50. "Compose a casual email to my colleague about the lunch meeting"

#### ❌ TRULY MISSING / ⚠️ UNCERTAIN
51. "Reply to the latest email from john@example.com" - ❌ Missing reply action (requires email threading)
52. "Forward an email to someone" - ❌ Missing forward action (requires email forwarding)
53. "Delete emails older than 6 months" - ❌ Missing delete action (requires email deletion)
54. "Mark emails as read/unread" - ❌ Missing mark action (requires email state change)
55. "Move emails to a folder" - ❌ Missing move action (requires email folder move)
56. "Create email rules/filters" - ❌ Missing rule creation action (requires rule management)
57. "Get email statistics (sent, received, etc.)" - ⚠️ Could potentially be done with readEmails + ai.process (analyze), but a specialized action would be better
58. "Schedule an email to be sent later" - ❌ Missing scheduling action (requires email scheduling)
59. "Archive old emails" - ⚠️ Could potentially be done with readEmails + move (if move existed), but archive is a specific operation
60. "Create email templates" - ⚠️ Could potentially be done with ai.process (generate template), but template management would be better

---

### Category 4: SharePoint Operations (25 prompts)

#### ✅ COVERED (sharepoint.*)
61. "Find the document 'Annual Report 2023' in SharePoint"
62. "List all files in the Documents folder"
63. "Read the presentation file from SharePoint"
64. "Upload this document to SharePoint"
65. "Find all Excel files in SharePoint"
66. "List documents in the 'Projects' folder"
67. "Read the contract document from SharePoint"
68. "Upload multiple files to SharePoint"
69. "Find the folder 'Q4 Reports'"
70. "List all PDFs in the current directory"
71. "Read documents from a specific SharePoint site"
72. "Find documents containing 'budget' in the name"
73. "Upload a file with a specific name"
74. "List subfolders in SharePoint"
75. "Read metadata from SharePoint documents"

#### ✅ COVERED (chained sharepoint.* actions)
85. "Download all documents from a folder" - ✅ Can be done with listDocuments + readDocuments (multiple calls, but the planner can handle this)

#### ❌ TRULY MISSING
76. "Delete a document from SharePoint" - ❌ Missing delete action (requires document deletion)
77. "Move a document to another folder" - ❌ Missing move action (requires document move)
78. "Copy a document in SharePoint" - ❌ Missing copy action (requires document copy)
79. "Rename a document in SharePoint" - ❌ Missing rename action (requires document rename)
80. "Create a new folder in SharePoint" - ❌ Missing folder creation action (requires folder creation)
81. "Share a document with specific users" - ❌ Missing sharing action (requires sharing management)
82. "Get document version history" - ❌ Missing version history action (requires version API)
83. "Check out/check in a document" - ❌ Missing document locking action (requires checkout/checkin)
84. "Set document permissions" - ❌ Missing permission management action (requires permission API)

---

### Category 5: Data Analysis & Processing (10 prompts)

#### ✅ COVERED (ai.process)
86. "Analyze this CSV file and find trends"
87. "Extract data from this Excel spreadsheet"
88. "Create a summary of this data"
89. "Generate a report from this JSON data"
90. "Convert this data to a different format"

#### ✅ COVERED (ai.process with chaining)
92. "Perform statistical analysis on this dataset" - ✅ ai.process (statistical analysis)
93. "Compare two datasets and find differences" - ✅ ai.process (with documentList containing both)
94. "Filter and sort this data" - ✅ ai.process
95. "Validate data quality in this file" - ✅ ai.process

#### ✅ COVERED (ai.process can generate images)
91. "Create a chart/graph from this data" - ✅ ai.process (resultType: png/jpg to generate chart image from data)

---

### Category 6: Content Creation & Formatting (9 prompts)

#### ✅ COVERED (ai.process)
96. "Create a professional email template"
97. "Generate a report from these inputs"
98. "Format this document properly"
99. "Create a structured document from this data"

#### ✅ COVERED (ai.process)
102. "Generate a PDF with specific formatting" - ✅ ai.process (resultType: pdf)
103. "Create a formatted table" - ✅ ai.process
104. "Generate formatted invoices" - ✅ ai.process (with appropriate prompt)

#### ✅ COVERED (ai.process)
100. "Create a PowerPoint presentation" - ✅ ai.process (resultType: pptx, generation module has PPTX renderer)
101. "Create an Excel spreadsheet with formulas" - ✅ ai.process (resultType: xlsx, AI can generate Excel with formulas in prompt)

---

### Category 7: Integration & Workflow (10 prompts)

#### ✅ COVERED (action chaining via dynamic planner)
105. "Read emails, extract attachments, and upload to SharePoint" - ✅ outlook.readEmails + ai.process (extract) + sharepoint.uploadDocument
106. "Find documents in SharePoint, process them, and email results" - ✅ sharepoint.findDocumentPath + sharepoint.readDocuments + ai.process + outlook.composeAndSendEmailWithContext
110. "Process all incoming emails and categorize them" - ✅ outlook.readEmails + ai.process (categorize)
113. "Create a data pipeline from emails to SharePoint" - ✅ outlook.readEmails + ai.process + sharepoint.uploadDocument

#### ❌ TRULY MISSING / ⚠️ UNCERTAIN
107. "Monitor emails for attachments and automatically process them" - ❌ Missing automation/monitoring (requires trigger-based automation)
108. "Create a workflow that runs daily" - ❌ Missing scheduling/automation (requires task scheduler)
109. "Send weekly reports automatically" - ❌ Missing automation (requires scheduling)
111. "Backup SharePoint documents to another location" - ⚠️ Could potentially be done with readDocuments + uploadDocument, but "backup" implies specific backup semantics
112. "Sync documents between SharePoint sites" - ⚠️ Could potentially be done with readDocuments + uploadDocument (monitoring/trigger missing)
114. "Automatically archive old SharePoint documents" - ❌ Missing automation (requires scheduling/triggers)

---

### Category 8: Calendar & Scheduling (5 prompts)

#### ❌ TRULY MISSING (Complete calendar integration missing)
115. "Check my calendar for next week" - ❌ Missing calendar.readCalendar action
116. "Create a meeting invitation" - ❌ Missing calendar.createEvent action
117. "Find available time slots for a meeting" - ❌ Missing calendar.findAvailability action
118. "Schedule a meeting with multiple attendees" - ❌ Missing calendar.createEvent action
119. "Get my calendar events for today" - ❌ Missing calendar.readCalendar action

---

### Category 9: Communication & Collaboration (5 prompts)

#### ✅ COVERED (with email workaround)
123. "Notify team members about a document update" - ✅ outlook.composeAndSendEmailWithContext (can notify via email)

#### ❌ TRULY MISSING
120. "Send a message in Microsoft Teams" - ❌ Missing teams.sendMessage action
121. "Create a Teams channel" - ❌ Missing teams.createChannel action
122. "Post an update to SharePoint site" - ❌ Missing sharepoint.postUpdate action (requires SharePoint API for posts)
124. "Create a shared workspace" - ❌ Missing workspace creation action (requires workspace management API)

---

### Category 10: Advanced Operations (5 prompts)

#### ✅ COVERED (if AI supports it)
126. "OCR this scanned document" - ✅ ai.process (if AI model supports OCR)

#### ❌ TRULY MISSING
125. "Extract signatures from documents" - ❌ Missing signature extraction (requires specialized image/document analysis)
127. "Convert audio to text" - ❌ Missing ai.processAudio action (requires audio processing)
128. "Generate an image from a description" - ❌ Missing ai.generateImage action (mentioned in docs but not available)
129. "Create a video transcript" - ❌ Missing ai.processVideo action (requires video processing)

---

## Summary Statistics

### Coverage Analysis (with Dynamic Action Chaining)
- **Total Test Prompts**: 129
- **✅ COVERED** (single action or chained): 102 (79%)
- **❌ TRULY MISSING** (cannot be done even with chaining): 27 (21%)
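
The percentages follow from the stated totals by simple division. The short check below takes the three counts as given (it does not re-tally the per-prompt markers in the categories above) and rounds to whole percent; the constant and function names are illustrative only.

```python
TOTAL_PROMPTS = 129
COVERED = 102
TRULY_MISSING = 27


def percent(count: int, total: int) -> int:
    """Share of `total` as a whole-number percentage."""
    return round(100 * count / total)


# The two buckets must account for every prompt.
assert COVERED + TRULY_MISSING == TOTAL_PROMPTS

covered_pct = percent(COVERED, TOTAL_PROMPTS)
missing_pct = percent(TRULY_MISSING, TOTAL_PROMPTS)
print(f"covered: {covered_pct}%, truly missing: {missing_pct}%")
```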

### Truly Missing Action Categories

#### Critical Missing Actions (High Priority) - Cannot be done even with chaining

1. **Email Management** (6 actions)
   - `outlook.replyToEmail` - Reply to specific emails (requires email threading/reply-to)
   - `outlook.forwardEmail` - Forward emails (requires email forwarding)
   - `outlook.deleteEmail` - Delete emails (requires email deletion API)
   - `outlook.moveEmail` - Move emails to folders (requires folder move API)
   - `outlook.markEmailAsRead` - Mark as read/unread (requires email state change)
   - `outlook.scheduleEmail` - Schedule emails for later sending (requires scheduling API)

2. **SharePoint Management** (9 actions)
   - `sharepoint.deleteDocument` - Delete documents (requires deletion API)
   - `sharepoint.moveDocument` - Move documents between folders (requires move API)
   - `sharepoint.copyDocument` - Copy documents (requires copy API)
   - `sharepoint.renameDocument` - Rename documents (requires rename API)
   - `sharepoint.createFolder` - Create new folders (requires folder creation API)
   - `sharepoint.shareDocument` - Share documents with users (requires sharing API)
   - `sharepoint.getVersionHistory` - Get document versions (requires version API)
   - `sharepoint.setPermissions` - Manage document permissions (requires permission API)
   - `sharepoint.postUpdate` - Post updates to SharePoint site (requires posting API)

3. **Document Binary Operations** (1 action)
   - `document.addWatermark` - Add watermarks to PDFs (requires binary PDF manipulation - AI cannot modify PDF binary structure)

**Note**: Image extraction from PDFs is covered by `ai.process` - the extraction service intelligently extracts images from PDF containers (and other container formats like Office documents) as ContentPart objects, which ai.process can then process and output.

4. **Calendar Integration** (5 actions) - Complete integration missing
   - `calendar.readCalendar` - Read calendar events
   - `calendar.createEvent` - Create calendar events
   - `calendar.findAvailability` - Find available time slots
   - `calendar.updateEvent` - Update existing events
   - `calendar.deleteEvent` - Delete events

5. **Presentation Creation** (0 actions) - ✅ COVERED by ai.process (resultType: pptx); a separate `presentation.createPresentation` action is not required because the generation module has a PPTX renderer

6. **Automation & Scheduling** (3 actions)
   - `workflow.scheduleTask` - Schedule recurring tasks (requires task scheduler)
   - `workflow.monitorTrigger` - Monitor for triggers (e.g., new emails) (requires event monitoring)
   - `workflow.createAutomatedWorkflow` - Create automated workflows (requires workflow automation engine)

7. **Teams Integration** (2 actions)
   - `teams.sendMessage` - Send Teams messages (requires Teams API)
   - `teams.createChannel` - Create Teams channels (requires Teams API)

8. **Media Processing** (4 actions)
   - `ai.generateImage` - Generate images from descriptions (mentioned in docs but not available)
   - `ai.processAudio` - Process audio files (requires audio processing)
   - `ai.processVideo` - Process video files (requires video processing)
   - `document.extractSignature` - Extract signatures from documents (requires specialized image analysis)

#### Medium Priority / Uncertain (May work with ai.process but specialized actions would be better)

9. **Data Visualization** (0 actions) - ✅ COVERED by ai.process (can generate chart images with resultType: png/jpg)

10. **Excel Advanced** (1 action)
    - `excel.createSpreadsheetWithFormulas` - Create Excel files with formulas (may work with ai.process, but formulas need validation)

11. **Advanced Email** (2 actions)
    - `outlook.createRule` - Create email rules/filters (requires rule management API)
    - `outlook.getStatistics` - Email statistics (could be done with readEmails + ai.process, but a specialized action would be better)

---

## Recommendations

### Phase 1: Critical Actions (Implement First) - 21 actions
1. **Email Management** (6 actions) - reply, forward, delete, move, mark, schedule
2. **SharePoint Management** (9 actions) - delete, move, copy, rename, createFolder, share, versionHistory, permissions, postUpdate
3. **Document Binary Operations** (1 action) - addWatermark (merge, split, image extraction, and PowerPoint creation can use ai.process)
4. **Calendar Integration** (5 actions) - read, create, findAvailability, update, delete

### Phase 2: High-Value Actions - 5 actions
5. **Teams Integration** (2 actions) - sendMessage, createChannel
6. **Automation & Scheduling** (3 actions) - scheduleTask, monitorTrigger, createAutomatedWorkflow

### Phase 3: Advanced Features - 4 actions
7. **Media Processing** (4 actions) - generateImage, processAudio, processVideo, extractSignature

---

## Edge Cases & Special Scenarios

### Scenarios Covered by Action Chaining
- ✅ "Research topic X, create presentation, email to team" - ai.webResearch + ai.process + outlook.composeAndSendEmailWithContext
- ✅ "Read emails with attachments, extract data, upload to SharePoint" - outlook.readEmails + ai.process + sharepoint.uploadDocument
- ✅ "Find documents, process them, create report, email to team" - sharepoint.findDocumentPath + sharepoint.readDocuments + ai.process + outlook.composeAndSendEmailWithContext
- ✅ "Research and compare 5 companies" - ai.webResearch + ai.process (compare)
- ✅ "Process all incoming emails and categorize them" - outlook.readEmails + ai.process (categorize)

### Scenarios That Need Better Support (Infrastructure)
- Batch operations (process multiple items at once) - ⚠️ Can be done with loops, but batch actions would be more efficient
- Conditional workflows (if/then logic) - ⚠️ Planner can handle this, but explicit conditional actions might help
- Error handling and retry mechanisms - ✅ Already handled by workflow system
- Progress tracking for long-running operations - ✅ Already handled by workflow system

---

## Conclusion

With the **dynamic task and action planner** that can chain actions together, and considering that `ai.process` can handle **multiple input documents and produce multiple output documents via the standardized JSON generation module**, the current action set covers **~79% of common user scenarios** (102 out of 129 prompts).

### Strong Coverage Areas
- ✅ Web research and information gathering
- ✅ Document processing and format conversion
- ✅ Email reading and composition
- ✅ SharePoint file reading and uploading
- ✅ Multi-step workflows (research → process → email, etc.)
- ✅ Data analysis and transformation

### Critical Gaps (21% - 27 prompts)
The following capabilities **cannot be achieved even with action chaining**:

1. **Email Management** (6 actions) - Reply, forward, delete, move, mark, schedule
2. **SharePoint Management** (9 actions) - Delete, move, copy, rename, createFolder, share, versions, permissions, postUpdate
3. **Calendar Integration** (5 actions) - Complete calendar functionality missing
4. **Document Binary Operations** (1 action) - PDF watermark (merge, split, and image extraction can use ai.process with extraction service and multi-document output)
5. **Presentation Creation** (0 actions) - ✅ COVERED by ai.process (resultType: pptx)
6. **Automation & Scheduling** (3 actions) - Recurring tasks, triggers, workflow automation
7. **Teams Integration** (2 actions) - Send messages, create channels
8. **Media Processing** (4 actions) - Image generation, audio/video processing, signature extraction

---

## Critical Information for AI Planner: The Power of `ai.process`

**IMPORTANT**: The AI planner must understand these capabilities of `ai.process` to correctly identify when it can be used:

### 1. Multi-Input Document Processing
- `ai.process` accepts **multiple input documents** in **ANY format** (docx, pdf, json, txt, xlsx, html, images, etc.)
- All documents in `documentList` are processed together in one AI call
- The extraction service automatically handles container formats (PDF, Office) by intelligently extracting their contents

### 2. Multi-Output Document Generation
- `ai.process` can produce **multiple output documents** in a single call
- The generation module uses standardized JSON with a "documents" array
- Each document in the array becomes a separate output file
- All output documents use the same format (specified by `resultType`)
- The AI can be instructed to create multiple documents via the prompt (e.g., "create separate documents for each page")
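
The multi-output behaviour described above can be sketched as follows. The exact JSON schema is not shown in this analysis, so the per-document `filename` and `content` fields, the `write_documents` helper, and the example payload are assumptions made for illustration; only the top-level "documents" array and the shared `resultType` come from the text.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def write_documents(payload: str, result_type: str, out_dir: Path) -> List[Path]:
    """Write each entry of the "documents" array to its own file."""
    data = json.loads(payload)
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for index, doc in enumerate(data["documents"], start=1):
        # "filename" and "content" are assumed field names, not confirmed ones.
        stem = doc.get("filename") or f"document_{index}"
        # Every output shares the same extension, mirroring the single resultType.
        path = out_dir / f"{stem}.{result_type}"
        path.write_text(doc["content"], encoding="utf-8")
        written.append(path)
    return written


# Hypothetical payload for "split this PDF into separate pages".
EXAMPLE_PAYLOAD = json.dumps({
    "documents": [
        {"filename": "page_1", "content": "Text of page 1"},
        {"filename": "page_2", "content": "Text of page 2"},
        {"content": "Text of page 3"},
    ]
})

with tempfile.TemporaryDirectory() as tmp:
    files = write_documents(EXAMPLE_PAYLOAD, "txt", Path(tmp))
    print([f.name for f in files])
```

Because one `resultType` applies to the whole call, a request that needs mixed output formats (for example a docx report plus a png chart) would presumably need two `ai.process` steps in the plan.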

### 3. Intelligent Container Extraction
- The extraction service automatically extracts content from container formats:
  - **PDFs**: Extracts text per page, tables, and **embedded images** as ContentPart objects
  - **Office documents**: Extracts text, tables, images, structure from Word, Excel, PowerPoint
- Extracted images from PDFs are available to `ai.process` as image ContentParts
- This means "extract images from PDF" is covered by `ai.process` - the extraction happens automatically

### 4. Format Transformation
- `ai.process` can transform between ANY formats via the `resultType` parameter
- Supports: txt, json, md, csv, xml, html, pdf, docx, xlsx, **pptx**, png, jpg, etc.
- The AI intelligently handles the conversion based on the prompt and output format
- **PowerPoint (PPTX) is fully supported** - the generation module has a PPTX renderer that creates professional presentations
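
A planner or wrapper could guard the `resultType` value before calling `ai.process`. The sketch below is an assumption-laden illustration: `normalize_result_type` is a hypothetical helper, and the tuple contains only the formats named explicitly in this section (the "etc." suggests the real list is longer).

```python
# Formats named explicitly above; the real set of supported types may be larger.
KNOWN_RESULT_TYPES = (
    "txt", "json", "md", "csv", "xml", "html",
    "pdf", "docx", "xlsx", "pptx", "png", "jpg",
)


def normalize_result_type(value: str) -> str:
    """Lower-case the value, drop a leading dot, and reject unknown formats."""
    cleaned = value.strip().lower().lstrip(".")
    if cleaned not in KNOWN_RESULT_TYPES:
        raise ValueError(f"unsupported resultType: {value!r}")
    return cleaned


print(normalize_result_type(" .PPTX "))
```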

### 5. Operations Enabled by These Capabilities
- **PDF Split**: Read PDF → generate multiple documents (one per page) → output as separate files
- **PDF Merge**: Read multiple PDFs → combine into one document → output as single file
- **Image Extraction**: Extract images from PDFs automatically via extraction service → process with `ai.process` → output images
- **PowerPoint Creation**: Generate PowerPoint presentations with resultType: pptx → generation module renders professional PPTX files
- **Multi-file Generation**: Generate multiple related documents in one call (e.g., "create separate reports for each department")
- **Format Conversion**: Convert any document format to any other format
- **Content Extraction**: Extract structured data, tables, text from any document format
- **Analysis & Comparison**: Analyze multiple documents together, compare them, find differences
- **Data Transformation**: Transform data from one format/structure to another
- **Chart Generation**: Generate charts/graphs as images with resultType: png/jpg

### 6. What `ai.process` CANNOT Do
- Modify binary PDF structure (watermarking, binary manipulation)
- Extract actual binary image files directly (but can process extracted images from containers)
- Process audio/video files (requires specialized processing)
- Generate images from text descriptions (separate `ai.generateImage` action needed)

These capabilities make `ai.process` extremely powerful, and it should be considered for most document processing tasks.

### Total Missing Actions: 30 actions

**Note**: The extraction service intelligently handles container formats (PDF, Office documents) by extracting images, text, tables, and structure automatically. This means `ai.process` can work with extracted images from PDFs without needing a separate extraction action.

**Recommendation**: Prioritize implementing the **Phase 1 actions (21 actions)** to cover the most critical gaps and improve system coverage from 79% to ~90%+.

---

## Proposal: Wrapper Actions for Better Planner Discovery

While `ai.process` is extremely powerful and can handle most document processing tasks, the AI planner might struggle to identify when to use it for specific intents. Consider adding semantic wrapper actions that internally call `ai.process` but have clearer, more discoverable names:

### Suggested Wrapper Actions (all delegate to ai.process)

1. **Document Transformation**
   - `ai.summarizeDocument` - Summarize one or more documents
   - `ai.translateDocument` - Translate documents to target language
   - `ai.convertDocument` - Convert between formats (PDF→Word, Excel→CSV, etc.)
   - `ai.extractData` - Extract structured data from documents
   - `ai.extractTables` - Extract tables from documents

2. **Content Generation**
   - `ai.generateReport` - Generate reports from input documents/data
   - `ai.generateChart` - Generate charts/graphs from data (resultType: png/jpg)
   - `ai.generateDocument` - Generate documents from scratch or templates

3. **Analysis & Comparison**
   - `ai.analyzeDocuments` - Analyze and find insights in documents
   - `ai.compareDocuments` - Compare multiple documents and find differences
   - `ai.validateData` - Validate data quality and structure

### Benefits of Wrapper Actions
- **Better Intent Matching**: Planner can directly match "summarize" → `ai.summarizeDocument` instead of inferring `ai.process` with summarize prompt
- **Clearer Parameters**: Each wrapper can have domain-specific parameters (e.g., `targetLanguage` for translate)
- **Improved Discoverability**: More actions in AVAILABLE_METHODS means better chance of matching user intent
- **Backward Compatible**: All wrappers delegate to `ai.process`, so existing workflows still work

### Implementation Note
All wrapper actions should be thin wrappers that:
1. Accept domain-specific parameters
2. Build appropriate `aiPrompt` from parameters + user intent
3. Call `ai.process` internally
4. Return results in same format

This keeps the power of `ai.process` while making it easier for the planner to discover and use.
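
The four steps above might look like this for `ai.translateDocument`. This is a sketch under stated assumptions: the parameter names `aiPrompt`, `documentList`, and `resultType` appear in this analysis, but the keyword-argument calling convention, the `translate_document` function, the prompt wording, and the `fake_ai_process` stub are all hypothetical.

```python
from typing import Any, Callable, Dict, List

# Assumed shape of the ai.process entry point: keyword arguments in, result dict out.
AiProcess = Callable[..., Dict[str, Any]]


def translate_document(
    ai_process: AiProcess,
    document_list: List[str],
    target_language: str,
    result_type: str = "docx",
) -> Dict[str, Any]:
    """Thin wrapper: build the prompt, delegate to ai.process, return its result unchanged."""
    # Step 2: derive the prompt from the domain-specific parameter.
    ai_prompt = (
        f"Translate the provided documents to {target_language}. "
        "Preserve the original structure and formatting."
    )
    # Steps 3 and 4: delegate and pass the result through as-is.
    return ai_process(aiPrompt=ai_prompt, documentList=document_list, resultType=result_type)


def fake_ai_process(**kwargs: Any) -> Dict[str, Any]:
    """Stub that echoes its arguments so the delegation can be inspected."""
    return {"received": kwargs}


response = translate_document(fake_ai_process, ["contract.pdf"], "German")
print(response["received"]["aiPrompt"])
```

Keeping the wrapper free of its own processing logic is what makes it backward compatible: anything it can do remains reachable through a direct `ai.process` call with the same prompt.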