# Refactoring Concept: Add WEB_SEARCH_MEDIA Operation Type

## Executive Summary

**Refactoring Goal**: Add image search capability by introducing a `WEB_SEARCH_MEDIA` operation type and a Google Custom Search connector.

**Current State**:
- ✅ `WEB_SEARCH_DATA` already exists (replaces former `WEB_SEARCH`)
- ✅ Tavily connector handles `WEB_SEARCH_DATA` for web page/content search
- ✅ `AiCallPromptWebSearchData` model exists

**Target State**:
- Add `WEB_SEARCH_MEDIA` operation type for image/media search
- Create Google Custom Search connector for `WEB_SEARCH_MEDIA`
- Create `AiCallPromptWebSearchMedia` model
- Add `ai.searchImages` action using `WEB_SEARCH_MEDIA`

**Estimated Complexity**: Medium (1-2 days development)

**Integration Impact**: Low - additive changes, no breaking changes

---
## 1. Current Architecture

### 1.1 Operation Types

**Current Operation Types** (`gateway/modules/datamodels/datamodelAi.py`):
```python
class OperationTypeEnum(str, Enum):
    # ... existing operations ...

    # Web Operations
    WEB_SEARCH_DATA = "webSearchData"  # Web page search (Tavily) ✅ EXISTS
    WEB_CRAWL = "webCrawl"  # Web crawl for a given URL
```

**Target Operation Types**:
```python
class OperationTypeEnum(str, Enum):
    # ... existing operations ...

    # Web Operations
    WEB_SEARCH_DATA = "webSearchData"  # Web page search (Tavily) ✅ EXISTS
    WEB_SEARCH_MEDIA = "webSearchMedia"  # Image/media search (Google) ⬅️ NEW
    WEB_CRAWL = "webCrawl"  # Web crawl for a given URL
```
### 1.2 Model Capabilities

**Tavily Connector** (`gateway/modules/aicore/aicorePluginTavily.py`):
- ✅ Registered for `WEB_SEARCH_DATA` (rating: 9)
- ✅ Registered for `WEB_CRAWL` (rating: 10)
- ❌ Does NOT support image search - designed for web/text content only

**Model Selection**:
- Dynamic model selection routes based on `OperationTypeEnum` in `AiCallOptions`
- Models register capabilities via `operationTypes` with ratings (1-10)
- System automatically selects best model for each operation type

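
The selection rule listed above can be illustrated with a minimal sketch. It assumes that each model exposes a mapping from operation type to rating and that the highest rating wins; `ModelInfo`, `selectBestModel`, and the model name `"tavily"` are hypothetical stand-ins, not the gateway's actual API, and the real selector may weigh further factors.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class OperationTypeEnum(str, Enum):
    WEB_SEARCH_DATA = "webSearchData"
    WEB_SEARCH_MEDIA = "webSearchMedia"
    WEB_CRAWL = "webCrawl"


@dataclass
class ModelInfo:
    """Hypothetical stand-in for AiModel: a name plus ratings (1-10) per operation type."""
    name: str
    operationTypes: Dict[OperationTypeEnum, int] = field(default_factory=dict)


def selectBestModel(models: List[ModelInfo], opType: OperationTypeEnum) -> Optional[ModelInfo]:
    """Return the highest-rated model that supports opType, or None if no model does."""
    candidates = [m for m in models if opType in m.operationTypes]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m.operationTypes[opType])


# Ratings taken from this document; the model names are illustrative.
models = [
    ModelInfo("tavily", {OperationTypeEnum.WEB_SEARCH_DATA: 9, OperationTypeEnum.WEB_CRAWL: 10}),
    ModelInfo("google-custom-search", {OperationTypeEnum.WEB_SEARCH_MEDIA: 9}),
]

# With these registrations, WEB_SEARCH_MEDIA can only resolve to the Google model.
best = selectBestModel(models, OperationTypeEnum.WEB_SEARCH_MEDIA)
```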
---

## 2. Refactoring Plan

### 2.1 Add WEB_SEARCH_MEDIA Operation Type

**File**: `gateway/modules/datamodels/datamodelAi.py`

**Changes**:
```python
class OperationTypeEnum(str, Enum):
    # ... existing operations ...

    # Web Operations
    WEB_SEARCH_DATA = "webSearchData"  # Web page search (Tavily)
    WEB_SEARCH_MEDIA = "webSearchMedia"  # Image/media search (Google) ⬅️ ADD
    WEB_CRAWL = "webCrawl"  # Web crawl for a given URL
```
### 2.2 Create AiCallPromptWebSearchMedia Model

**File**: `gateway/modules/datamodels/datamodelAi.py`

**Add after `AiCallPromptWebSearchData`**:
```python
class AiCallPromptWebSearchMedia(BaseModel):
    """Structured prompt format for WEB_SEARCH_MEDIA operation - returns list of image URLs."""

    instruction: str = Field(description="Search instruction/query for finding relevant images")
    maxResults: Optional[int] = Field(default=10, description="Maximum number of images to return (default: 10)")
    imageType: Optional[str] = Field(default=None, description="Image type filter: 'photo', 'clipart', 'lineart', 'animated'")
    size: Optional[str] = Field(default=None, description="Image size filter: 'small', 'medium', 'large', 'xlarge'")
    color: Optional[str] = Field(default=None, description="Color filter: 'color', 'grayscale', 'transparent'")
    country: Optional[str] = Field(default=None, description="Two-letter country code (lowercase, e.g., ch, us, de, fr)")
    language: Optional[str] = Field(default=None, description="Language code (lowercase, e.g., de, en, fr)")
```
### 2.3 Create Google Custom Search Connector

**New File**: `gateway/modules/aicore/aicorePluginGoogle.py`

**Structure** (similar to `aicorePluginTavily.py`):
```python
class AiGoogle(BaseConnectorAi):
    """Google Custom Search connector for image search."""

    def getModels(self) -> List[AiModel]:
        """Get Google Custom Search model."""
        return [
            AiModel(
                name="google-custom-search",
                displayName="Google Custom Search",
                connectorType="google",
                apiUrl="https://www.googleapis.com/customsearch/v1",
                # ... model configuration ...
                operationTypes=createOperationTypeRatings(
                    (OperationTypeEnum.WEB_SEARCH_MEDIA, 9)
                ),
                functionCall=self._routeWebOperation,
                # ...
            )
        ]

    async def _routeWebOperation(self, modelCall: AiModelCall) -> "AiModelResponse":
        """Route web operation based on operation type."""
        operationType = modelCall.options.operationType

        if operationType == OperationTypeEnum.WEB_SEARCH_MEDIA:
            return await self.webSearchMedia(modelCall)
        else:
            return AiModelResponse(
                content="",
                success=False,
                error=f"Unsupported operation type: {operationType}"
            )

    async def webSearchMedia(self, modelCall: AiModelCall) -> "AiModelResponse":
        """WEB_SEARCH_MEDIA operation - returns list of image URLs using Google Custom Search."""
        # Parse AiCallPromptWebSearchMedia from messages
        # Call Google Custom Search API with searchType=image
        # Return JSON array of image URLs (same format as Tavily for consistency)
```
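
The request and response steps named in the `webSearchMedia` comments could be sketched as follows. The query parameter names (`key`, `cx`, `q`, `searchType`, `num`, `imgType`, `imgSize`, `imgColorType`, `gl`, `lr`) and the `items[].link` response field belong to the Custom Search JSON API. The helper names and the mapping from this document's `color` values to `imgColorType` are assumptions made for illustration, and the HTTP call itself is left out.

```python
from typing import Any, Dict, List, Optional

# Assumed mapping from the model's color filter to the API's imgColorType values.
COLOR_TO_IMG_COLOR_TYPE = {"color": "color", "grayscale": "gray", "transparent": "trans"}


def buildImageSearchParams(
    apiKey: str,
    engineId: str,
    instruction: str,
    maxResults: int = 10,
    imageType: Optional[str] = None,
    size: Optional[str] = None,
    color: Optional[str] = None,
    country: Optional[str] = None,
    language: Optional[str] = None,
) -> Dict[str, Any]:
    """Build query parameters for an image search request (hypothetical helper)."""
    params: Dict[str, Any] = {
        "key": apiKey,
        "cx": engineId,
        "q": instruction,
        "searchType": "image",
        # The API accepts values from 1 to 10 for num, so clamp the request.
        "num": max(1, min(maxResults, 10)),
    }
    if imageType:
        params["imgType"] = imageType
    if size:
        params["imgSize"] = size
    if color in COLOR_TO_IMG_COLOR_TYPE:
        params["imgColorType"] = COLOR_TO_IMG_COLOR_TYPE[color]
    if country:
        params["gl"] = country.lower()
    if language:
        params["lr"] = f"lang_{language.lower()}"
    return params


def extractImageUrls(responseJson: Dict[str, Any]) -> List[str]:
    """Collect image URLs from the items[].link fields of a search response."""
    return [item["link"] for item in responseJson.get("items", []) if item.get("link")]
```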
**Key Implementation Points**:
- Use Google Custom Search API with `searchType=image`
- Return image URLs in JSON array format (consistent with Tavily)
- Support filters: imageType, size, color, country, language
- Handle API errors and rate limiting

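
For the rate-limiting point, one common approach is retrying with exponential backoff, which the risk table in section 7 also names as the mitigation. The sketch below is only an illustration: `RateLimitError` and `callWithBackoff` are hypothetical names, and a real connector would raise the error when the API answers with HTTP 429.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error raised when the API answers with HTTP 429."""


async def callWithBackoff(
    operation: Callable[[], Awaitable[T]],
    maxAttempts: int = 4,
    baseDelay: float = 0.5,
) -> T:
    """Retry `operation` on RateLimitError, doubling the delay after each failure."""
    for attempt in range(maxAttempts):
        try:
            return await operation()
        except RateLimitError:
            if attempt == maxAttempts - 1:
                raise  # out of attempts, surface the error to the caller
            # Exponential delay plus a little jitter to avoid synchronized retries.
            delay = baseDelay * (2 ** attempt) + random.uniform(0, baseDelay / 10)
            await asyncio.sleep(delay)
    raise RuntimeError("maxAttempts must be at least 1")
```

Errors other than `RateLimitError` propagate immediately, so the caller can still turn them into a failed `AiModelResponse`.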
### 2.4 Update Service AI Handler

**File**: `gateway/modules/services/serviceAi/mainServiceAi.py`

**Update `_handleWebOperations()` method**:
```python
async def _handleWebOperations(
    self,
    prompt: str,
    options: AiCallOptions,
    opType: OperationTypeEnum,
    aiOperationId: str
) -> AiResponse:
    """Handle WEB_SEARCH_DATA, WEB_SEARCH_MEDIA, and WEB_CRAWL operation types."""
    # Existing logic handles WEB_SEARCH_DATA and WEB_CRAWL
    # Add support for WEB_SEARCH_MEDIA
    if opType in (
        OperationTypeEnum.WEB_SEARCH_DATA,
        OperationTypeEnum.WEB_SEARCH_MEDIA,
        OperationTypeEnum.WEB_CRAWL,
    ):
        # ... existing implementation ...
```
### 2.5 Add searchImages Action

**New File**: `gateway/modules/workflows/methods/methodAi/actions/searchImages.py`

**Implementation**:
```python
async def searchImages(self, parameters: Dict[str, Any]) -> ActionResult:
    """Search for images on the web using a prompt and return them as documents."""
    prompt = parameters.get("prompt")
    if not prompt:
        return ActionResult.isFailure(error="Search prompt is required")

    maxResults = parameters.get("maxResults", 5)
    imageType = parameters.get("imageType")
    size = parameters.get("size")
    color = parameters.get("color")

    # Build AiCallPromptWebSearchMedia
    searchPromptModel = AiCallPromptWebSearchMedia(
        instruction=prompt,
        maxResults=maxResults,
        imageType=imageType,
        size=size,
        color=color
    )

    # Call AI with WEB_SEARCH_MEDIA operation
    searchOptions = AiCallOptions(
        operationType=OperationTypeEnum.WEB_SEARCH_MEDIA,
        resultFormat="json"
    )

    # System will automatically route to Google connector
    searchResponse = await self.services.ai.callAiContent(
        prompt=searchPromptModel.model_dump_json(exclude_none=True, indent=2),
        options=searchOptions,
        outputFormat="json"
    )

    # Parse response to extract image URLs
    # Download images in parallel
    # Create ActionDocument for each image
    # Return ActionResult with list of documents
```
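
The "Download images in parallel" step is left as a comment above. A minimal sketch of one way to do it with `asyncio.gather` follows. It assumes the HTTP client is injected as an async `fetch(url)` callable returning bytes, so it does not depend on the gateway's existing client; `downloadImages` and `maxConcurrency` are hypothetical names. Failed downloads are skipped rather than failing the whole action, which is a design assumption and not something this concept prescribes.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple


async def downloadImages(
    urls: List[str],
    fetch: Callable[[str], Awaitable[bytes]],
    maxConcurrency: int = 5,
) -> List[Tuple[str, bytes]]:
    """Download all URLs concurrently and return (url, data) pairs for the successes."""
    semaphore = asyncio.Semaphore(maxConcurrency)

    async def fetchOne(url: str) -> bytes:
        async with semaphore:  # cap the number of simultaneous requests
            return await fetch(url)

    # return_exceptions=True returns failures as values instead of raising the first one.
    results = await asyncio.gather(*(fetchOne(u) for u in urls), return_exceptions=True)
    # Keep only successful downloads, preserving the order of the input URLs.
    return [(u, r) for u, r in zip(urls, results) if isinstance(r, bytes)]
```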
**Update**: `gateway/modules/workflows/methods/methodAi/methodAi.py`

**Add action definition**:
```python
"searchImages": WorkflowActionDefinition(
    actionId="ai.searchImages",
    description="Search for images on the web using a prompt and return them as documents",
    dynamicMode=True,
    parameters={
        "prompt": WorkflowActionParameter(...),
        "maxResults": WorkflowActionParameter(...),
        "imageType": WorkflowActionParameter(...),
        "size": WorkflowActionParameter(...),
        "color": WorkflowActionParameter(...)  # read by searchImages, so declared here too
    },
    execute=searchImages.__get__(self, self.__class__)
)
```
---

## 3. Implementation Steps

### Phase 1: Add WEB_SEARCH_MEDIA Operation Type (30 min)

1. ✅ Add `WEB_SEARCH_MEDIA = "webSearchMedia"` to `OperationTypeEnum`
2. ✅ Create `AiCallPromptWebSearchMedia` model
3. ✅ Update imports/exports

### Phase 2: Create Google Connector (4-6 hours)

1. ✅ Create `aicorePluginGoogle.py` file
2. ✅ Implement `AiGoogle` connector class
3. ✅ Implement `webSearchMedia()` method
4. ✅ Register connector in connector discovery
5. ✅ Test Google API integration
6. ✅ Handle errors and rate limiting

### Phase 3: Update Service Handlers (1-2 hours)

1. ✅ Update `serviceAi/mainServiceAi.py` to handle `WEB_SEARCH_MEDIA`
2. ✅ Ensure routing works correctly
3. ✅ Test model selection routes to Google connector

### Phase 4: Add searchImages Action (3-4 hours)

1. ✅ Create `actions/searchImages.py`
2. ✅ Implement image search logic
3. ✅ Implement parallel image download
4. ✅ Add action definition to `methodAi.py`
5. ✅ Add to `actions/__init__.py`
6. ✅ Write unit tests

### Phase 5: Testing & Integration (2-3 hours)

1. ✅ Test Google connector with real API
2. ✅ Test searchImages action end-to-end
3. ✅ Verify dynamic model selection
4. ✅ Test error handling
5. ✅ Integration testing

**Total Estimated Time**: 1-2 days

---
## 4. Configuration Requirements

### 4.1 Environment Variables

**Required**:
- `GOOGLE_SEARCH_API_KEY` - Google Custom Search API key
- `GOOGLE_SEARCH_ENGINE_ID` - Custom Search Engine ID
- Note: Must enable "Image Search" in Google Custom Search Engine settings

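
A startup check for these two variables might look like the sketch below, in line with the "validate API keys at startup" mitigation in section 7. The function name `loadGoogleSearchConfig` is hypothetical, and the check only verifies that the values are present and non-blank, not that Google accepts them.

```python
import os
from typing import Dict, Mapping

REQUIRED_VARS = ("GOOGLE_SEARCH_API_KEY", "GOOGLE_SEARCH_ENGINE_ID")


def loadGoogleSearchConfig(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any is missing or blank."""
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name].strip() for name in REQUIRED_VARS}
```

Passing the environment as a parameter keeps the check easy to unit test with a plain dictionary.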
### 4.2 Google Custom Search Setup

1. Create Google Custom Search Engine at https://programmablesearchengine.google.com/
2. Enable "Image Search" in settings
3. Get API key from Google Cloud Console
4. Configure environment variables

---
## 5. Files to Create/Modify

### New Files
- `gateway/modules/aicore/aicorePluginGoogle.py` - Google connector
- `gateway/modules/workflows/methods/methodAi/actions/searchImages.py` - Image search action

### Modified Files
- `gateway/modules/datamodels/datamodelAi.py` - Add `WEB_SEARCH_MEDIA` and `AiCallPromptWebSearchMedia`
- `gateway/modules/services/serviceAi/mainServiceAi.py` - Handle `WEB_SEARCH_MEDIA`
- `gateway/modules/workflows/methods/methodAi/methodAi.py` - Add `searchImages` action
- `gateway/modules/workflows/methods/methodAi/actions/__init__.py` - Export `searchImages`
- Connector discovery module (if exists) - Register Google connector

---
## 6. Testing Requirements

### Unit Tests
- Google connector `webSearchMedia()` method
- `searchImages` action with various parameters
- Error handling (API errors, rate limits, invalid responses)
- Image download and validation

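
This concept does not specify how downloaded images are validated. One simple option, shown here purely as an assumption, is to check the leading bytes of each download against well-known image signatures, so that HTML error pages or truncated responses are rejected. `detectImageMimeType` is a hypothetical helper name.

```python
from typing import Optional

# Leading byte signatures of common image formats.
IMAGE_SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
)


def detectImageMimeType(data: bytes) -> Optional[str]:
    """Return the MIME type if `data` starts with a known image signature, else None."""
    for signature, mimeType in IMAGE_SIGNATURES:
        if data.startswith(signature):
            return mimeType
    # WebP files are RIFF containers with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None
```

A unit test can then feed the action a mix of valid image bytes and non-image payloads and assert that only the former become documents.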
### Integration Tests
- End-to-end image search workflow
- Dynamic model selection routes to Google connector
- Multiple image downloads in parallel
- Verify ActionDocuments are created correctly

---
## 7. Risks & Mitigation

| Risk | Impact | Probability | Mitigation |
|------|--------|-------------|------------|
| Google API setup complexity | Medium | Medium | Provide clear setup instructions, validate API keys at startup |
| Dynamic model selection routing | Medium | Low | Thoroughly test operation type routing |
| API rate limiting (Google) | Low | Medium | Implement retry logic with exponential backoff |
| Missing connector registration | Medium | Low | Ensure connector is registered in discovery system |

---
## 8. Dependencies

- **Google Custom Search API** - REQUIRED
- **Google API Client Library** - May need `google-api-python-client` package
- **HTTP Client** - For image downloads (existing)
- **Base64 Encoding** - Python standard library (no dependency)

---
## 9. Success Criteria

- ✅ `WEB_SEARCH_MEDIA` operation type exists
- ✅ Google connector is registered and functional
- ✅ `searchImages` action works end-to-end
- ✅ Dynamic model selection routes `WEB_SEARCH_MEDIA` to Google connector
- ✅ Images are downloaded and returned as ActionDocuments
- ✅ All tests pass

---
**Document Version**: 2.0
**Last Updated**: 2026-01-01
**Status**: Refactoring Concept - Ready for Implementation