# Refactoring Concept: Add WEB_SEARCH_MEDIA Operation Type
## Executive Summary
**Refactoring Goal**: Add image search capability by introducing the `WEB_SEARCH_MEDIA` operation type and a Google Custom Search connector.
**Current State**:
- ✅ `WEB_SEARCH_DATA` already exists (replaces former `WEB_SEARCH`)
- ✅ Tavily connector handles `WEB_SEARCH_DATA` for web page/content search
- ✅ `AiCallPromptWebSearchData` model exists
**Target State**:
- Add `WEB_SEARCH_MEDIA` operation type for image/media search
- Create Google Custom Search connector for `WEB_SEARCH_MEDIA`
- Create `AiCallPromptWebSearchMedia` model
- Add `ai.searchImages` action using `WEB_SEARCH_MEDIA`
**Estimated Complexity**: Medium (1-2 days development)
**Integration Impact**: Low - additive changes, no breaking changes
---
## 1. Current Architecture
### 1.1 Operation Types
**Current Operation Types** (`gateway/modules/datamodels/datamodelAi.py`):
```python
class OperationTypeEnum(str, Enum):
    # ... existing operations ...
    # Web Operations
    WEB_SEARCH_DATA = "webSearchData"  # Web page search (Tavily) ✅ EXISTS
    WEB_CRAWL = "webCrawl"  # Web crawl for a given URL
```
**Target Operation Types**:
```python
class OperationTypeEnum(str, Enum):
    # ... existing operations ...
    # Web Operations
    WEB_SEARCH_DATA = "webSearchData"  # Web page search (Tavily) ✅ EXISTS
    WEB_SEARCH_MEDIA = "webSearchMedia"  # Image/media search (Google) ⬅️ NEW
    WEB_CRAWL = "webCrawl"  # Web crawl for a given URL
```
### 1.2 Model Capabilities
**Tavily Connector** (`gateway/modules/aicore/aicorePluginTavily.py`):
- ✅ Registered for `WEB_SEARCH_DATA` (rating: 9)
- ✅ Registered for `WEB_CRAWL` (rating: 10)
- ❌ Does NOT support image search - designed for web/text content only
**Model Selection**:
- Dynamic model selection routes based on `OperationTypeEnum` in `AiCallOptions`
- Models register capabilities via `operationTypes` with ratings (1-10)
- System automatically selects best model for each operation type
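
The rating-based selection described above can be illustrated with a minimal sketch. `ModelSketch` and `selectBestModel` are hypothetical stand-ins, not the gateway's actual `AiModel` or selection code, and the tie-breaking behavior (first registered model wins) is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class OperationTypeEnum(str, Enum):
    WEB_SEARCH_DATA = "webSearchData"
    WEB_SEARCH_MEDIA = "webSearchMedia"
    WEB_CRAWL = "webCrawl"


@dataclass
class ModelSketch:
    """Hypothetical stand-in for AiModel: a name plus per-operation ratings (1-10)."""
    name: str
    operationTypes: Dict[OperationTypeEnum, int] = field(default_factory=dict)


def selectBestModel(
    models: List[ModelSketch], opType: OperationTypeEnum
) -> Optional[ModelSketch]:
    """Return the highest-rated model for opType, or None if no model supports it."""
    candidates = [m for m in models if opType in m.operationTypes]
    if not candidates:
        return None
    # max() keeps the first candidate on ties, so registration order breaks ties.
    return max(candidates, key=lambda m: m.operationTypes[opType])


models = [
    ModelSketch("tavily", {OperationTypeEnum.WEB_SEARCH_DATA: 9,
                           OperationTypeEnum.WEB_CRAWL: 10}),
    ModelSketch("google-custom-search", {OperationTypeEnum.WEB_SEARCH_MEDIA: 9}),
]
```

With these registrations, a `WEB_SEARCH_MEDIA` call can only resolve to the Google model, because Tavily declares no rating for that operation type.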
---
## 2. Refactoring Plan
### 2.1 Add WEB_SEARCH_MEDIA Operation Type
**File**: `gateway/modules/datamodels/datamodelAi.py`
**Changes**:
```python
class OperationTypeEnum(str, Enum):
    # ... existing operations ...
    # Web Operations
    WEB_SEARCH_DATA = "webSearchData"  # Web page search (Tavily)
    WEB_SEARCH_MEDIA = "webSearchMedia"  # Image/media search (Google) ⬅️ ADD
    WEB_CRAWL = "webCrawl"  # Web crawl for a given URL
```
### 2.2 Create AiCallPromptWebSearchMedia Model
**File**: `gateway/modules/datamodels/datamodelAi.py`
**Add after `AiCallPromptWebSearchData`**:
```python
class AiCallPromptWebSearchMedia(BaseModel):
    """Structured prompt format for WEB_SEARCH_MEDIA operation - returns list of image URLs."""
    instruction: str = Field(description="Search instruction/query for finding relevant images")
    maxResults: Optional[int] = Field(default=10, description="Maximum number of images to return (default: 10)")
    imageType: Optional[str] = Field(default=None, description="Image type filter: 'photo', 'clipart', 'lineart', 'animated'")
    size: Optional[str] = Field(default=None, description="Image size filter: 'small', 'medium', 'large', 'xlarge'")
    color: Optional[str] = Field(default=None, description="Color filter: 'color', 'grayscale', 'transparent'")
    country: Optional[str] = Field(default=None, description="Two-letter country code (lowercase, e.g., ch, us, de, fr)")
    language: Optional[str] = Field(default=None, description="Language code (lowercase, e.g., de, en, fr)")
```
### 2.3 Create Google Custom Search Connector
**New File**: `gateway/modules/aicore/aicorePluginGoogle.py`
**Structure** (similar to `aicorePluginTavily.py`):
```python
class AiGoogle(BaseConnectorAi):
    """Google Custom Search connector for image search."""

    def getModels(self) -> List[AiModel]:
        """Get Google Custom Search model."""
        return [
            AiModel(
                name="google-custom-search",
                displayName="Google Custom Search",
                connectorType="google",
                apiUrl="https://www.googleapis.com/customsearch/v1",
                # ... model configuration ...
                operationTypes=createOperationTypeRatings(
                    (OperationTypeEnum.WEB_SEARCH_MEDIA, 9)
                ),
                functionCall=self._routeWebOperation,
                # ...
            )
        ]

    async def _routeWebOperation(self, modelCall: AiModelCall) -> "AiModelResponse":
        """Route web operation based on operation type."""
        operationType = modelCall.options.operationType
        if operationType == OperationTypeEnum.WEB_SEARCH_MEDIA:
            return await self.webSearchMedia(modelCall)
        else:
            return AiModelResponse(
                content="",
                success=False,
                error=f"Unsupported operation type: {operationType}"
            )

    async def webSearchMedia(self, modelCall: AiModelCall) -> "AiModelResponse":
        """WEB_SEARCH_MEDIA operation - returns list of image URLs using Google Custom Search."""
        # Parse AiCallPromptWebSearchMedia from messages
        # Call Google Custom Search API with searchType=image
        # Return JSON array of image URLs (same format as Tavily for consistency)
```
**Key Implementation Points**:
- Use Google Custom Search API with `searchType=image`
- Return image URLs in JSON array format (consistent with Tavily)
- Support filters: imageType, size, color, country, language
- Handle API errors and rate limiting
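
A minimal sketch of how the filters above could map onto a Custom Search JSON API request. The query parameter names (`key`, `cx`, `q`, `searchType`, `num`, `imgType`, `imgSize`, `imgColorType`, `gl`, `lr`) and the `items[].link` response field follow the public API reference as far as we know and should be verified before implementation. The function names and `COLOR_MAP` are hypothetical, and the translation of `grayscale`/`transparent` into `gray`/`trans` is an assumption about how the plan's filter names line up with Google's values.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import urlencode

API_URL = "https://www.googleapis.com/customsearch/v1"

# Assumed translation from the plan's color filter names to Google's imgColorType values.
COLOR_MAP = {"color": "color", "grayscale": "gray", "transparent": "trans"}


def buildImageSearchUrl(
    apiKey: str,
    engineId: str,
    query: str,
    maxResults: int = 10,
    imageType: Optional[str] = None,
    size: Optional[str] = None,
    color: Optional[str] = None,
    country: Optional[str] = None,
    language: Optional[str] = None,
) -> str:
    """Build a request URL for an image search; unset filters are left out."""
    params: Dict[str, Any] = {
        "key": apiKey,
        "cx": engineId,
        "q": query,
        "searchType": "image",
        # A single request returns at most 10 items, so clamp to 1..10.
        "num": max(1, min(maxResults, 10)),
    }
    if imageType:
        params["imgType"] = imageType
    if size:
        params["imgSize"] = size
    if color:
        params["imgColorType"] = COLOR_MAP.get(color, color)
    if country:
        params["gl"] = country
    if language:
        params["lr"] = f"lang_{language}"
    return f"{API_URL}?{urlencode(params)}"


def extractImageUrls(payload: Dict[str, Any]) -> List[str]:
    """Collect image URLs from a decoded response; items without a link are skipped."""
    return [item["link"] for item in payload.get("items", []) if "link" in item]
```

Because one request is capped at 10 results, a `maxResults` above 10 would need paging (for example via the API's `start` parameter), which this sketch does not attempt.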
### 2.4 Update Service AI Handler
**File**: `gateway/modules/services/serviceAi/mainServiceAi.py`
**Update `_handleWebOperations()` method**:
```python
async def _handleWebOperations(
    self,
    prompt: str,
    options: AiCallOptions,
    opType: OperationTypeEnum,
    aiOperationId: str
) -> AiResponse:
    """Handle WEB_SEARCH_DATA, WEB_SEARCH_MEDIA, and WEB_CRAWL operation types."""
    # Existing logic handles WEB_SEARCH_DATA and WEB_CRAWL
    # Add support for WEB_SEARCH_MEDIA
    if opType in (
        OperationTypeEnum.WEB_SEARCH_DATA,
        OperationTypeEnum.WEB_SEARCH_MEDIA,
        OperationTypeEnum.WEB_CRAWL,
    ):
        # ... existing implementation ...
```
### 2.5 Add searchImages Action
**New File**: `gateway/modules/workflows/methods/methodAi/actions/searchImages.py`
**Implementation**:
```python
async def searchImages(self, parameters: Dict[str, Any]) -> ActionResult:
    """Search for images on the web using a prompt and return them as documents."""
    prompt = parameters.get("prompt")
    if not prompt:
        return ActionResult.isFailure(error="Search prompt is required")
    maxResults = parameters.get("maxResults", 5)
    imageType = parameters.get("imageType")
    size = parameters.get("size")
    color = parameters.get("color")

    # Build AiCallPromptWebSearchMedia
    searchPromptModel = AiCallPromptWebSearchMedia(
        instruction=prompt,
        maxResults=maxResults,
        imageType=imageType,
        size=size,
        color=color
    )

    # Call AI with WEB_SEARCH_MEDIA operation
    searchOptions = AiCallOptions(
        operationType=OperationTypeEnum.WEB_SEARCH_MEDIA,
        resultFormat="json"
    )

    # System will automatically route to Google connector
    searchResponse = await self.services.ai.callAiContent(
        prompt=searchPromptModel.model_dump_json(exclude_none=True, indent=2),
        options=searchOptions,
        outputFormat="json"
    )

    # Parse response to extract image URLs
    # Download images in parallel
    # Create ActionDocument for each image
    # Return ActionResult with list of documents
```
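
The remaining steps of the action (parse the URL list, download in parallel) could be sketched as follows. This is an illustration only: `parseImageUrls`, `downloadAll`, and the injected `fetch` callable are hypothetical names, the concurrency limit of 5 is an arbitrary assumption, and the real action would plug in the gateway's existing HTTP client and wrap each result in an `ActionDocument`.

```python
import asyncio
import json
from typing import Awaitable, Callable, List, Optional, Tuple

# Hypothetical download hook: takes a URL, returns the image bytes.
Fetcher = Callable[[str], Awaitable[bytes]]


def parseImageUrls(content: str) -> List[str]:
    """Parse a JSON array of URLs; return [] for malformed or unexpected content."""
    try:
        data = json.loads(content)
    except ValueError:
        return []
    if not isinstance(data, list):
        return []
    return [u for u in data
            if isinstance(u, str) and u.startswith(("http://", "https://"))]


async def downloadAll(
    urls: List[str], fetch: Fetcher, maxConcurrent: int = 5
) -> List[Tuple[str, bytes]]:
    """Download all URLs concurrently; failed downloads are skipped, order is kept."""
    semaphore = asyncio.Semaphore(maxConcurrent)

    async def fetchOne(url: str) -> Optional[Tuple[str, bytes]]:
        async with semaphore:
            try:
                return url, await fetch(url)
            except Exception:
                # One broken image should not fail the whole action.
                return None

    results = await asyncio.gather(*(fetchOne(u) for u in urls))
    return [r for r in results if r is not None]
```

Passing the downloader in as a parameter keeps the sketch independent of any particular HTTP library and makes the parallel path easy to unit test with a fake.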
**Update**: `gateway/modules/workflows/methods/methodAi/methodAi.py`
**Add action definition**:
```python
"searchImages": WorkflowActionDefinition(
    actionId="ai.searchImages",
    description="Search for images on the web using a prompt and return them as documents",
    dynamicMode=True,
    parameters={
        "prompt": WorkflowActionParameter(...),
        "maxResults": WorkflowActionParameter(...),
        "imageType": WorkflowActionParameter(...),
        "size": WorkflowActionParameter(...),
        "color": WorkflowActionParameter(...)  # read by searchImages, so declared here as well
    },
    execute=searchImages.__get__(self, self.__class__)
)
```
---
## 3. Implementation Steps
### Phase 1: Add WEB_SEARCH_MEDIA Operation Type (30 min)
1. ✅ Add `WEB_SEARCH_MEDIA = "webSearchMedia"` to `OperationTypeEnum`
2. ✅ Create `AiCallPromptWebSearchMedia` model
3. ✅ Update imports/exports
### Phase 2: Create Google Connector (4-6 hours)
1. ✅ Create `aicorePluginGoogle.py` file
2. ✅ Implement `AiGoogle` connector class
3. ✅ Implement `webSearchMedia()` method
4. ✅ Register connector in connector discovery
5. ✅ Test Google API integration
6. ✅ Handle errors and rate limiting
### Phase 3: Update Service Handlers (1-2 hours)
1. ✅ Update `serviceAi/mainServiceAi.py` to handle `WEB_SEARCH_MEDIA`
2. ✅ Ensure routing works correctly
3. ✅ Test model selection routes to Google connector
### Phase 4: Add searchImages Action (3-4 hours)
1. ✅ Create `actions/searchImages.py`
2. ✅ Implement image search logic
3. ✅ Implement parallel image download
4. ✅ Add action definition to `methodAi.py`
5. ✅ Add to `actions/__init__.py`
6. ✅ Write unit tests
### Phase 5: Testing & Integration (2-3 hours)
1. ✅ Test Google connector with real API
2. ✅ Test searchImages action end-to-end
3. ✅ Verify dynamic model selection
4. ✅ Test error handling
5. ✅ Integration testing
**Total Estimated Time**: 1-2 days
---
## 4. Configuration Requirements
### 4.1 Environment Variables
**Required**:
- `GOOGLE_SEARCH_API_KEY` - Google Custom Search API key
- `GOOGLE_SEARCH_ENGINE_ID` - Custom Search Engine ID
- Note: Must enable "Image Search" in Google Custom Search Engine settings
### 4.2 Google Custom Search Setup
1. Create Google Custom Search Engine at https://programmablesearchengine.google.com/
2. Enable "Image Search" in settings
3. Get API key from Google Cloud Console
4. Configure environment variables
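
A minimal sketch of validating this configuration at startup, so that a missing key fails fast instead of surfacing on the first image search. The variable names come from section 4.1; `loadGoogleSearchConfig` is a hypothetical helper, and where it would be called in the gateway's startup sequence is left open.

```python
import os
from typing import Dict, Mapping

REQUIRED_VARS = ("GOOGLE_SEARCH_API_KEY", "GOOGLE_SEARCH_ENGINE_ID")


def loadGoogleSearchConfig(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any is missing or blank."""
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name].strip() for name in REQUIRED_VARS}
```

This only checks that the values are present. Whether the key is accepted by Google, and whether image search is enabled on the engine, can only be confirmed by a real request.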
---
## 5. Files to Create/Modify
### New Files
- `gateway/modules/aicore/aicorePluginGoogle.py` - Google connector
- `gateway/modules/workflows/methods/methodAi/actions/searchImages.py` - Image search action
### Modified Files
- `gateway/modules/datamodels/datamodelAi.py` - Add `WEB_SEARCH_MEDIA` and `AiCallPromptWebSearchMedia`
- `gateway/modules/services/serviceAi/mainServiceAi.py` - Handle `WEB_SEARCH_MEDIA`
- `gateway/modules/workflows/methods/methodAi/methodAi.py` - Add `searchImages` action
- `gateway/modules/workflows/methods/methodAi/actions/__init__.py` - Export `searchImages`
- Connector discovery module (if exists) - Register Google connector
---
## 6. Testing Requirements
### Unit Tests
- Google connector `webSearchMedia()` method
- `searchImages` action with various parameters
- Error handling (API errors, rate limits, invalid responses)
- Image download and validation
### Integration Tests
- End-to-end image search workflow
- Dynamic model selection routes to Google connector
- Multiple image downloads in parallel
- Verify ActionDocuments are created correctly
---
## 7. Risks & Mitigation
| Risk | Impact | Probability | Mitigation |
|------|--------|-------------|------------|
| Google API setup complexity | Medium | Medium | Provide clear setup instructions, validate API keys at startup |
| Dynamic model selection routing | Medium | Low | Thoroughly test operation type routing |
| API rate limiting (Google) | Low | Medium | Implement retry logic with exponential backoff |
| Missing connector registration | Medium | Low | Ensure connector is registered in discovery system |
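
The rate-limiting mitigation in the table could be sketched as below. `RateLimitedError` and `retryWithBackoff` are hypothetical names, and the defaults (4 attempts, 0.5 s base delay, 8 s cap, up to 10% jitter) are illustrative assumptions rather than tuned values.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimitedError(Exception):
    """Hypothetical error the connector would raise on HTTP 429 or a 5xx response."""


async def retryWithBackoff(
    call: Callable[[], Awaitable[T]],
    maxAttempts: int = 4,
    baseDelay: float = 0.5,
    maxDelay: float = 8.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Retry call() on RateLimitedError, doubling the delay after each failure."""
    for attempt in range(maxAttempts):
        try:
            return await call()
        except RateLimitedError:
            if attempt == maxAttempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(maxDelay, baseDelay * (2 ** attempt))
            # Small random jitter so parallel callers do not retry in lockstep.
            await sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("maxAttempts must be at least 1")
```

Only the rate-limit error is retried here. Other failures, such as an invalid API key, propagate immediately, since repeating those requests would not help.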
---
## 8. Dependencies
- **Google Custom Search API** - REQUIRED
- **Google API Client Library** - May need `google-api-python-client` package
- **HTTP Client** - For image downloads (existing)
- **Base64 Encoding** - Python standard library (no dependency)
---
## 9. Success Criteria
- ✅ `WEB_SEARCH_MEDIA` operation type exists
- ✅ Google connector is registered and functional
- ✅ `searchImages` action works end-to-end
- ✅ Dynamic model selection routes `WEB_SEARCH_MEDIA` to Google connector
- ✅ Images are downloaded and returned as ActionDocuments
- ✅ All tests pass
---
**Document Version**: 2.0
**Last Updated**: 2026-01-01
**Status**: Refactoring Concept - Ready for Implementation