AI models ready for image analysis

This commit is contained in:
ValueOn AG 2025-10-26 23:09:26 +01:00
parent 2489719c62
commit 5039096a10
8 changed files with 629 additions and 487 deletions

IMAGE_ANALYSE_ANALYSIS.md (new file, 150 lines)

@@ -0,0 +1,150 @@
# Image Analysis Code Flow Analysis
## Summary of Parameters and Handovers
### 1. Flow for image analysis (via document processing)
#### Input
- **methodAi.process()** is called with:
  - `aiPrompt`: text instruction for the image analysis
  - `documentList`: list of documents (including images)
  - `resultType`: output format (optional, default: txt)
#### Processing
1. **mainServiceAi.py** `callAiDocuments()`
   - Delegates to `subCoreAi.callAiDocuments()`
2. **subCoreAi.py** `callAiDocuments()`
   - Checks whether documents are present
   - If so: calls `documentProcessor.callAiText()`
3. **subDocumentProcessing.py** `callAiText()`
   - Calls `processDocumentsPerChunk()`
4. **subDocumentProcessing.py** `_processChunksWithMapping()`
   - Analyzes each chunk
   - **Important**: lines 645-689 - image detection
   - Checks the `is_image` flag based on:
     - `document_mime_type` (e.g. "image/jpeg")
     - `part.mimeType`
     - `part.typeGroup == "image"`
5. **subCoreAi.py** `readImage()` (called for image chunks)
   - Lines 561-625
   - Sets `operationType = IMAGE_ANALYSE`
   - Calls `aiObjects.callImage()` with:
     - `prompt`: the analysis prompt
     - `imageData`: the image data (bytes or base64)
     - `mimeType`: e.g. "image/jpeg"
     - `options`: with `operationType=IMAGE_ANALYSE`
#### Output
- Text analysis of the image
### 2. Flow for direct image analysis
#### Input
- **mainServiceAi.readImage()** is called directly with:
  - `prompt`: text instruction
  - `imageData`: image data (bytes or base64)
  - `mimeType`: e.g. "image/jpeg"
  - `options`: optional; `operationType` is forced to `IMAGE_ANALYSE`
#### Processing
1. **mainServiceAi.py** `readImage()`
   - Delegates to `subCoreAi.readImage()`
2. **subCoreAi.py** `readImage()`
   - Sets `operationType = IMAGE_ANALYSE` (line 582)
   - Calls `aiObjects.callImage()`
#### Output
- Text analysis of the image
## Where Is Each Function Used?
### mainServiceAi.py
#### `readImage()` (lines 96-105)
- **Called by**: external callers (e.g. the API)
- **Delegates to**: `subCoreAi.readImage()`
- **Status**: ✅ in use
#### `generateImage()` (lines 108-118)
- **Called by**: external callers (e.g. the API)
- **Delegates to**: `subCoreAi.generateImage()`
- **Status**: ✅ in use
### subCoreAi.py
#### `readImage()` (lines 561-625)
- **Called by**:
  1. `mainServiceAi.readImage()`
  2. `subDocumentProcessing._processChunksWithMapping()` (line 670)
- **Status**: ✅ in use
#### `generateImage()` (lines 628-660)
- **Called by**: `mainServiceAi.generateImage()`
- **Status**: ✅ in use
### subDocumentProcessing.py
#### `_processChunksWithMapping()` (lines 594-994)
- **Image analysis**: lines 645-689
  - Detects images based on MIME type and typeGroup
  - Calls `core_ai.readImage()`
- **Status**: ✅ in use
## Parameter Validation
### ✅ All parameters correct
1. **operationType**:
   - Always set to `IMAGE_ANALYSE` (subCoreAi line 582)
   - Passed on correctly
2. **imageData**:
   - Loaded and passed on correctly
   - Supports bytes and base64
3. **mimeType**:
   - Detected automatically
   - Default: "image/jpeg"
4. **prompt**:
   - Passed on correctly
   - Can be customized by the user
## Handovers Are Correct
### mainServiceAi → subCoreAi
- ✅ `readImage()` delegates correctly
- ✅ `generateImage()` delegates correctly
### subDocumentProcessing → subCoreAi
- ✅ Detects images correctly (lines 645-689)
- ✅ Calls `readImage()` with the correct parameters
- ✅ Sets `operationType=IMAGE_ANALYSE`
### subCoreAi → aiObjects
- ✅ Calls `callImage()` with the correct parameters
- ✅ Sets `operationType=IMAGE_ANALYSE`
## Identified Issues
### ⚠️ No issues identified
All parameters and handovers are correct:
- ✅ Operation type is set correctly
- ✅ Image data is loaded and passed on correctly
- ✅ MIME type is detected correctly
- ✅ Prompt is passed on correctly
- ✅ All delegations work correctly
## Test Strategy
The test:
1. Performs direct image analysis via `mainServiceAi.readImage()`
2. Tests all models that support `IMAGE_ANALYSE`
3. Validates the responses for content and structure
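Step 2 of the strategy can be sketched as a simple filter over the registered models. The registry shape shown here is an assumption for illustration; the real `modelRegistry.getAvailableModels()` returns model objects, not dicts:

```python
# Hypothetical sketch of the test's model selection; field names are assumptions.
def models_supporting(operation: str, models: list[dict]) -> list[str]:
    """Return the names of models that advertise support for the operation."""
    return [m["name"] for m in models if operation in m.get("operations", [])]
```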


@@ -70,8 +70,8 @@ class AiAnthropic(BaseConnectorAi):
             calculatePriceUsd=lambda processingTime, bytesSent, bytesReceived: (bytesSent / 4 / 1000) * 0.015 + (bytesReceived / 4 / 1000) * 0.075
         ),
         AiModel(
-            name="claude-3-5-sonnet-20241022-vision",
-            displayName="Anthropic Claude 3.5 Sonnet Vision",
+            name="claude-3-5-sonnet-20241022",
+            displayName="Anthropic Claude 3.5 Sonnet Instance Vision",
             connectorType="anthropic",
             apiUrl="https://api.anthropic.com/v1/messages",
             temperature=0.2,
@@ -79,9 +79,8 @@
             contextLength=200000,
             costPer1kTokensInput=0.015,
             costPer1kTokensOutput=0.075,
-            speedRating=6,  # Slower due to high-quality processing
-            qualityRating=10,  # Best quality available
-            # capabilities removed (not used in business logic)
+            speedRating=6,
+            qualityRating=10,
             functionCall=self.callAiImage,
             priority=PriorityEnum.QUALITY,
             processingMode=ProcessingModeEnum.DETAILED,
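The `calculatePriceUsd` lambda registered for these models estimates token counts as bytes divided by 4 and applies per-1k-token rates. Written out as a plain function for clarity (illustrative only; the connector keeps it as a lambda):

```python
def price_usd(bytes_sent: int, bytes_received: int,
              in_rate: float = 0.015, out_rate: float = 0.075) -> float:
    """Estimate cost: bytes / 4 approximates tokens; rates are per 1k tokens."""
    return (bytes_sent / 4 / 1000) * in_rate + (bytes_received / 4 / 1000) * out_rate
```

For example, 4000 bytes sent is treated as 1000 tokens, costing 0.015 USD at the Claude 3.5 Sonnet input rate.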
@@ -234,69 +233,122 @@ class AiAnthropic(BaseConnectorAi):
            AiModelResponse with analysis content
        """
        try:
-           # Extract parameters from modelCall
+           # Extract parameters from messages for Anthropic Vision API
            messages = modelCall.messages
            model = modelCall.model
-           options = modelCall.options
-           prompt = messages[0]["content"] if messages else ""
-           imageData = getattr(options, "imageData", None)
-           mimeType = getattr(options, "mimeType", None)
-           # Debug logging
-           logger.info(f"callAiImage called with imageData type: {type(imageData)}, length: {len(imageData) if imageData else 0}, mimeType: {mimeType}")
-           # Distinguish between file path and binary data
-           if isinstance(imageData, str):
-               # Check if it's base64 encoded data or a file path
-               if len(imageData) > 100 and not os.path.exists(imageData):
-                   # It's likely base64 encoded data
-                   logger.info("Treating imageData as base64 encoded string")
-                   base64Data = imageData
-                   if not mimeType:
-                       mimeType = "image/png"
-               else:
-                   # It's a file path - import filehandling only when needed
-                   logger.info(f"Treating imageData as file path: {imageData}")
-                   from modules import agentserviceFilemanager as fileHandler
-                   base64Data, autoMimeType = fileHandler.encodeFileToBase64(imageData)
-                   mimeType = mimeType or autoMimeType
-           else:
-               # It's binary data
-               logger.info("Treating imageData as binary data")
-               import base64
-               base64Data = base64.b64encode(imageData).decode('utf-8')
-               # MIME type must be specified for binary data
-               if not mimeType:
-                   # Fallback to generic image type
-                   mimeType = "image/png"
-           # Prepare the payload for the Vision API
-           messages = [
-               {
-                   "role": "user",
-                   "content": [
-                       {"type": "text", "text": prompt},
-                       {
-                           "type": "image_url",
-                           "image_url": {
-                               "url": f"data:{mimeType};base64,{base64Data}"
-                           }
-                       }
-                   ]
-               }
-           ]
-           # Create a modelCall for the basic AI function
-           basicModelCall = AiModelCall(
-               messages=messages,
-               model=model
-           )
-           # Use the existing callAiBasic function with the Vision model
-           response = await self.callAiBasic(basicModelCall)
-           # Return the standardized response
-           return response
+           # Verify messages contain image data
+           if not messages or not messages[0].get("content"):
+               raise ValueError("No messages provided for image analysis")
+           logger.info(f"callAiImage called with {len(messages)} message(s)...")
+           # Extract text prompt and image data from messages
+           # Messages format: [{"role": "user", "content": [{"type": "text", "text": "..."}, {"type": "image_url", "image_url": {"url": "data:..."}}]}]
+           userContent = messages[0]["content"]
+           if not isinstance(userContent, list):
+               raise ValueError("Expected content to be a list for vision")
+           textPrompt = ""
+           imageUrl = None
+           for contentItem in userContent:
+               if contentItem.get("type") == "text":
+                   textPrompt = contentItem.get("text", "")
+               elif contentItem.get("type") == "image_url":
+                   imageUrl = contentItem.get("image_url", {}).get("url", "")
+           if not imageUrl or not imageUrl.startswith("data:"):
+               raise ValueError("No image data found in messages")
+           # Extract base64 data and mime type from data URL
+           # Format: data:image/jpeg;base64,/9j/4AAQSkZ...
+           parts = imageUrl.split(";base64,")
+           if len(parts) != 2:
+               raise ValueError("Invalid image data URL format")
+           mimeType = parts[0].replace("data:", "")
+           base64Data = parts[1]
+           # Convert to Anthropic's vision format
+           anthropicMessages = [{
+               "role": "user",
+               "content": [
+                   {"type": "text", "text": textPrompt},
+                   {
+                       "type": "image",
+                       "source": {
+                           "type": "base64",
+                           "media_type": mimeType,
+                           "data": base64Data
+                       }
+                   }
+               ]
+           }]
+           # Call Anthropic API directly for vision
+           import time
+           import base64
+           startTime = time.time()
+           # Prepare system prompt if available
+           systemPrompt = None
+           for msg in messages:
+               if msg.get("role") == "system":
+                   systemContent = msg.get("content")
+                   if isinstance(systemContent, list):
+                       systemPrompt = "\n".join([item.get("text", "") for item in systemContent if item.get("type") == "text"])
+                   else:
+                       systemPrompt = systemContent
+                   break
+           # Get parameters from model (consistent with callAiBasic)
+           maxTokens = model.maxTokens if hasattr(model, 'maxTokens') else 8192
+           temperature = model.temperature if hasattr(model, 'temperature') else 0.2
+           # Prepare API payload
+           payload = {
+               "model": model.name,  # Use standard model.name
+               "max_tokens": maxTokens,
+               "messages": anthropicMessages
+           }
+           if systemPrompt:
+               payload["system"] = systemPrompt
+           # Set temperature from model
+           payload["temperature"] = temperature
+           # Make API call with headers from httpClient (which includes anthropic-version)
+           response = await self.httpClient.post(
+               "https://api.anthropic.com/v1/messages",
+               json=payload
+           )
+           if response.status_code != 200:
+               errorText = response.text
+               logger.error(f"Anthropic API error: {response.status_code} - {errorText}")
+               raise HTTPException(status_code=response.status_code, detail=f"Anthropic API error: {errorText}")
+           # Parse response
+           result = response.json()
+           content = result["content"][0]["text"] if result.get("content") else ""
+           endTime = time.time()
+           processingTime = endTime - startTime
+           # Calculate cost
+           inputTokens = result.get("usage", {}).get("input_tokens", 0)
+           outputTokens = result.get("usage", {}).get("output_tokens", 0)
+           # Return standardized response
+           return AiModelResponse(
+               content=content,
+               success=True,
+               modelId=model.name,
+               processingTime=processingTime
+           )
        except Exception as e:
            logger.error(f"Error during image analysis: {str(e)}", exc_info=True)
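The data-URL parsing performed in the new vision path (splitting `data:image/jpeg;base64,<payload>` into its MIME type and base64 payload) can be isolated as a small helper. This is a sketch; the production code inlines the same logic:

```python
def parse_data_url(url: str) -> tuple[str, str]:
    """Split 'data:<mime>;base64,<payload>' into (mime_type, base64_data)."""
    if not url.startswith("data:"):
        raise ValueError("Not a data URL")
    header, _, payload = url.partition(";base64,")
    if not payload:
        raise ValueError("Invalid image data URL format")
    return header[len("data:"):], payload
```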


@@ -95,8 +95,8 @@ class AiOpenai(BaseConnectorAi):
             calculatePriceUsd=lambda processingTime, bytesSent, bytesReceived: (bytesSent / 4 / 1000) * 0.0015 + (bytesReceived / 4 / 1000) * 0.002
         ),
         AiModel(
-            name="gpt-4o-vision",
-            displayName="OpenAI GPT-4o Vision",
+            name="gpt-4o",
+            displayName="OpenAI GPT-4o Instance Vision",
             connectorType="openai",
             apiUrl="https://api.openai.com/v1/chat/completions",
             temperature=0.2,
@@ -106,7 +106,6 @@
             costPer1kTokensOutput=0.06,
             speedRating=6,  # Slower for vision tasks
             qualityRating=9,  # High quality vision
-            # capabilities removed (not used in business logic)
             functionCall=self.callAiImage,
             priority=PriorityEnum.QUALITY,
             processingMode=ProcessingModeEnum.DETAILED,
@@ -226,42 +225,16 @@
            # Extract parameters from modelCall
            messages = modelCall.messages
            model = modelCall.model
-           options = modelCall.options
-           prompt = messages[0]["content"] if messages else ""
-           imageData = getattr(options, "imageData", None)
-           mimeType = getattr(options, "mimeType", "image/jpeg")
-           logger.debug(f"Starting image analysis with query '{prompt}' for size {len(imageData)}B...")
-           # Ensure imageData is a string (base64 encoded)
-           if not isinstance(imageData, str):
-               raise ValueError("imageData must be a string (base64 encoded)")
-           # Fix base64 padding if needed
-           padding_needed = len(imageData) % 4
-           if padding_needed:
-               imageData += '=' * (4 - padding_needed)
-           logger.debug(f"Using MIME type: {mimeType}")
-           logger.debug(f"Base64 data length: {len(imageData)} characters")
-           # Create the data URL format as required by OpenAI Vision API
-           data_url = f"data:{mimeType};base64,{imageData}"
-           messages = [
-               {
-                   "role": "user",
-                   "content": [
-                       {"type": "text", "text": prompt},
-                       {
-                           "type": "image_url",
-                           "image_url": {
-                               "url": data_url
-                           }
-                       }
-                   ]
-               }
-           ]
+           # Messages should already be in the correct format with image data embedded
+           # Just verify they contain image data
+           if not messages or not messages[0].get("content"):
+               raise ValueError("No messages provided for image analysis")
+           logger.debug(f"Starting image analysis with {len(messages)} message(s)...")
+           # Use the messages directly - they should already contain the image data
+           # in the format: {"type": "image_url", "image_url": {"url": "data:...base64,..."}}
            # Use parameters from model
            temperature = model.temperature
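The base64 padding repair that the old OpenAI path performed (and that callers building data URLs themselves may still need, since base64 payloads must be a multiple of 4 characters) can be sketched as:

```python
import base64

def fix_padding(b64: str) -> str:
    """Pad a base64 string with '=' so its length is a multiple of 4."""
    missing = len(b64) % 4
    return b64 + "=" * (4 - missing) if missing else b64
```

Without this repair, `base64.b64decode` rejects payloads whose trailing `=` characters were stripped in transit.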


@@ -194,6 +194,7 @@ class AiModelResponse(BaseModel):

# Structured prompt models for specialized operations
class AiCallPromptWebSearch(BaseModel):
    """Structured prompt format for WEB_SEARCH operation - returns list of URLs."""


@@ -1,6 +1,7 @@
 import logging
 import asyncio
 import uuid
+import base64
 from typing import Dict, Any, List, Union, Tuple, Optional
 from dataclasses import dataclass
 import time
@@ -74,7 +75,7 @@ class AiObjects:
        logger.info(f"Selected model: {selectedModel.name} ({selectedModel.displayName})")
        return selectedModel.name

-   # AI for Extraction and Text Generation
    async def call(self, request: AiCallRequest) -> AiCallResponse:
        """Call AI model for text generation with model-aware chunking."""
        # Handle content parts (unified path)
@@ -196,11 +197,71 @@
        """Process a single content part with model-aware chunking and fallback."""
        lastError = None
        # Check if this is an image - Vision models need special handling
        isImage = (contentPart.typeGroup == "image") or (contentPart.mimeType and contentPart.mimeType.startswith("image/"))
        for attempt, model in enumerate(failoverModelList):
            try:
                logger.info(f"Processing content part with model: {model.name} (attempt {attempt + 1}/{len(failoverModelList)})")
                # Special handling for images with Vision models
if isImage and hasattr(model, 'functionCall'):
# Call model's functionCall directly (for Vision models this is callAiImage)
from modules.datamodels.datamodelAi import AiModelCall, AiCallOptions as AiCallOpts
try:
modelCall = AiModelCall(
messages=[
{
"role": "user",
"content": [
{"type": "text", "text": prompt},
{
"type": "image_url",
"image_url": {
"url": f"data:{contentPart.mimeType};base64,{contentPart.data}" if isinstance(contentPart.data, str) else
f"data:{contentPart.mimeType};base64,{base64.b64encode(contentPart.data).decode('utf-8')}"
}
}
]
}
],
model=model,
options=AiCallOpts(operationType=options.operationType)
)
modelResponse = await model.functionCall(modelCall)
if not modelResponse.success:
raise ValueError(f"Model call failed: {modelResponse.error}")
logger.info(f"✅ Image content part processed successfully with model: {model.name}")
# Convert to AiCallResponse format
return AiCallResponse(
content=modelResponse.content,
modelName=model.name,
priceUsd=modelResponse.priceUsd if hasattr(modelResponse, 'priceUsd') else 0.0,
processingTime=modelResponse.processingTime if hasattr(modelResponse, 'processingTime') else 0.0,
bytesSent=0, # Will be calculated elsewhere
bytesReceived=0, # Will be calculated elsewhere
errorCount=0
)
except Exception as e:
# Image processing failed with this model
lastError = e
logger.warning(f"❌ Image processing failed with model {model.name}: {str(e)}")
# If this is not the last model, try the next one
if attempt < len(failoverModelList) - 1:
logger.info(f"🔄 Trying next fallback model for image processing...")
continue
else:
# All models failed
logger.error(f"💥 All {len(failoverModelList)} models failed for image processing")
raise
# For non-image parts, check if part fits in model context
partSize = len(contentPart.data.encode('utf-8')) if contentPart.data else 0 partSize = len(contentPart.data.encode('utf-8')) if contentPart.data else 0
modelContextBytes = model.contextLength * 4 # Convert tokens to bytes modelContextBytes = model.contextLength * 4 # Convert tokens to bytes
@@ -319,12 +380,13 @@
            content_parts.append(content_part)

            # Use existing merging system
-           merge_strategy = {
-               "useIntelligentMerging": True,
-               "groupBy": "typeGroup",
-               "orderBy": "id",
-               "mergeType": "concatenate"
-           }
+           from modules.datamodels.datamodelExtraction import MergeStrategy
+           merge_strategy = MergeStrategy(
+               useIntelligentMerging=True,
+               groupBy="typeGroup",
+               orderBy="id",
+               mergeType="concatenate"
+           )
            from modules.services.serviceExtraction.subPipeline import _applyMerging
            merged_parts = _applyMerging(content_parts, merge_strategy)
@@ -365,12 +427,13 @@
            content_parts.append(content_part)

            # Use existing merging system
-           merge_strategy = {
-               "useIntelligentMerging": True,
-               "groupBy": "typeGroup",
-               "orderBy": "id",
-               "mergeType": "concatenate"
-           }
+           from modules.datamodels.datamodelExtraction import MergeStrategy
+           merge_strategy = MergeStrategy(
+               useIntelligentMerging=True,
+               groupBy="typeGroup",
+               orderBy="id",
+               mergeType="concatenate"
+           )
            from modules.services.serviceExtraction.subPipeline import _applyMerging
            merged_parts = _applyMerging(content_parts, merge_strategy)
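The switch from a plain dict to a `MergeStrategy` object buys construction-time validation: a misspelled field fails immediately instead of being silently ignored downstream. A minimal sketch with a stand-in dataclass (the real `MergeStrategy` lives in `modules.datamodels.datamodelExtraction` and may be a Pydantic model with a different field set):

```python
from dataclasses import dataclass

@dataclass
class MergeStrategy:  # stand-in sketch; the real class is a repo datamodel
    useIntelligentMerging: bool = True
    groupBy: str = "typeGroup"
    orderBy: str = "id"
    mergeType: str = "concatenate"

# Typos in field names now fail at construction time instead of at merge time.
strategy = MergeStrategy(mergeType="concatenate")
```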
@@ -462,118 +525,8 @@
                errorCount=0
            )
async def callImage(self, prompt: str, imageData: Union[str, bytes], mimeType: str = None, options: AiCallOptions = None) -> AiCallResponse:
"""Call AI model for image analysis with fallback mechanism."""
if options is None:
options = AiCallOptions(operationType=OperationTypeEnum.IMAGE_ANALYSE)
# Get fallback models for image analysis
availableModels = modelRegistry.getAvailableModels()
failoverModelList = modelSelector.getFailoverModelList(prompt, "", options, availableModels)
if not failoverModelList:
errorMsg = f"No suitable models found for image analysis"
logger.error(errorMsg)
return AiCallResponse(
content=errorMsg,
modelName="error",
priceUsd=0.0,
processingTime=0.0,
bytesSent=0,
bytesReceived=0,
errorCount=1
)
# Try each model in fallback sequence
lastError = None
for attempt, model in enumerate(failoverModelList):
try:
logger.info(f"Attempting image analysis with model: {model.name} (attempt {attempt + 1}/{len(failoverModelList)})")
# Call the model
response = await self._callImageWithModel(model, prompt, imageData, mimeType)
logger.info(f"✅ Image analysis successful with model: {model.name}")
return response
except Exception as e:
lastError = e
logger.warning(f"❌ Image analysis failed with model {model.name}: {str(e)}")
# If this is not the last model, try the next one
if attempt < len(failoverModelList) - 1:
logger.info(f"🔄 Trying next fallback model for image analysis...")
continue
else:
# All models failed
logger.error(f"💥 All {len(failoverModelList)} models failed for image analysis")
break
# All fallback attempts failed - return error response
errorMsg = f"All AI models failed for image analysis. Last error: {str(lastError)}"
logger.error(errorMsg)
return AiCallResponse(
content=errorMsg,
modelName="error",
priceUsd=0.0,
processingTime=0.0,
bytesSent=0,
bytesReceived=0,
errorCount=1
)
async def _callImageWithModel(self, model: AiModel, prompt: str, imageData: Union[str, bytes], mimeType: str) -> AiCallResponse:
"""Call a specific model for image analysis and return the response."""
# Calculate input bytes from prompt and image data
promptBytes = len(prompt.encode('utf-8'))
if isinstance(imageData, str):
# Base64 encoded string
imageBytes = len(imageData.encode('utf-8'))
else:
# Raw bytes
imageBytes = len(imageData)
inputBytes = promptBytes + imageBytes
# Start timing
startTime = time.time()
# Create standardized call object for image analysis
modelCall = AiModelCall(
messages=[{"role": "user", "content": prompt}],
model=model,
options=AiCallOptions(imageData=imageData, mimeType=mimeType)
)
# Call the model with standardized interface
if model.functionCall:
modelResponse = await model.functionCall(modelCall)
# Extract content from standardized response
if not modelResponse.success:
raise ValueError(f"Model call failed: {modelResponse.error}")
content = modelResponse.content
else:
raise ValueError(f"Model {model.name} has no function call defined")
# Calculate timing and output bytes
endTime = time.time()
processingTime = endTime - startTime
outputBytes = len(content.encode("utf-8"))
# Calculate price using model's own price calculation method
priceUsd = model.calculatePriceUsd(processingTime, inputBytes, outputBytes)
return AiCallResponse(
content=content,
modelName=model.name,
priceUsd=priceUsd,
processingTime=processingTime,
bytesSent=inputBytes,
bytesReceived=outputBytes,
errorCount=0
)
# AI for Image Generation
    async def generateImage(self, prompt: str, size: str = "1024x1024", quality: str = "standard", style: str = "vivid", options: AiCallOptions = None) -> AiCallResponse:
        """Generate an image using AI."""


@@ -100,7 +100,7 @@ class AiService:
        mimeType: str = None,
        options: Optional[AiCallOptions] = None,
    ) -> str:
-       """Call AI for image analysis using interface.callImage()."""
+       """Call AI for image analysis using interface.call() with contentParts."""
        await self._ensureAiObjectsInitialized()
        return await self.coreAi.readImage(prompt, imageData, mimeType, options)


@@ -565,7 +565,7 @@ CRITICAL REQUIREMENTS:
        mimeType: str = None,
        options: Optional[AiCallOptions] = None,
    ) -> str:
-       """Call AI for image analysis using interface.callImage()."""
+       """Call AI for image analysis using interface.call() with contentParts."""
        try:
            # Check if imageData is valid
            if not imageData:
@@ -584,30 +584,51 @@
            # Override the operation type to ensure image analysis
            options.operationType = OperationTypeEnum.IMAGE_ANALYSE

-           self.services.utils.debugLogToFile(f"Calling aiObjects.callImage with operationType: {options.operationType}", "AI_SERVICE")
-           logger.info(f"Calling aiObjects.callImage with operationType: {options.operationType}")
+           # Create content parts with image data
+           from modules.datamodels.datamodelExtraction import ContentPart
+           import base64
# ContentPart.data must be a string - convert bytes to base64 if needed
if isinstance(imageData, bytes):
imageDataStr = base64.b64encode(imageData).decode('utf-8')
else:
# Already a base64 string
imageDataStr = imageData
imagePart = ContentPart(
id="image_0",
parentId=None,
label="Image",
typeGroup="image",
mimeType=mimeType or "image/jpeg",
data=imageDataStr, # Must be a string (base64 encoded)
metadata={"imageAnalysis": True}
)
# Create request with content parts
from modules.datamodels.datamodelAi import AiCallRequest
request = AiCallRequest(
prompt=prompt,
context="",
options=options,
contentParts=[imagePart]
)
self.services.utils.debugLogToFile(f"Calling aiObjects.call() with operationType: {options.operationType}", "AI_SERVICE")
logger.info(f"Calling aiObjects.call() with operationType: {options.operationType}")
            # Write image analysis prompt to debug file
            self.services.utils.writeDebugFile(prompt, "image_analysis_prompt")

-           response = await self.aiObjects.callImage(prompt, imageData, mimeType, options)
+           response = await self.aiObjects.call(request)

            # Write image analysis response to debug file
-           result = response.content if hasattr(response, 'content') else str(response)
+           # response is an AiCallResponse object
+           result = response.content
            self.services.utils.writeDebugFile(result, "image_analysis_response")
# Emit stats for image analysis
self.services.workflow.storeWorkflowStat(
self.services.currentWorkflow,
response,
f"ai.image.{options.operationType}"
)
            # Debug the result
-           self.services.utils.debugLogToFile(f"Raw AI result type: {type(response)}, value: {repr(response)}", "AI_SERVICE")
-           # Extract content from response
-           result = response.content if hasattr(response, 'content') else str(response)
+           self.services.utils.debugLogToFile(f"AI image analysis result type: {type(response)}, content length: {len(result)}", "AI_SERVICE")

            # Check if result is valid
            if not result or (isinstance(result, str) and not result.strip()):


@@ -1,9 +1,31 @@
#!/usr/bin/env python3
"""
-AI Models Test - Tests WEB_CRAWL functionality on all models that support it
-
-This script tests all models that have WEB_CRAWL capability, validates that
-they can crawl specific URLs and return content, and analyzes the quality of results.
+AI Models Test - Tests IMAGE_ANALYSE functionality on all models that support it
+
+This script tests all models that have IMAGE_ANALYSE capability, validates that
+they can analyze images and return structured content, and analyzes the quality of results.
CODE FLOW ANALYSIS:
1. methodAi.process() is called by AI planner with prompt and documents (images)
2. mainServiceAi.callAiDocuments() is called
-> delegates to subCoreAi.callAiDocuments()
-> which calls subDocumentProcessing.callAiText()
-> which processes chunks and detects images
-> for image chunks, calls subCoreAi.readImage()
-> which calls aiObjects.callImage() with operationType=IMAGE_ANALYSE
OR direct call:
- mainServiceAi.readImage() can be called directly (used in this test)
-> delegates to subCoreAi.readImage()
-> which calls aiObjects.callImage() with operationType=IMAGE_ANALYSE
WHERE FUNCTIONS ARE USED:
- mainServiceAi.readImage(): Public API entry point for direct image analysis
- mainServiceAi.generateImage(): Public API entry point for image generation
- subCoreAi.readImage(): Internal implementation, called by document processing or directly
- subCoreAi.generateImage(): Internal implementation, called by mainServiceAi
- subDocumentProcessing._processChunksWithMapping(): Detects image chunks and calls readImage()
"""

import asyncio
@@ -54,6 +76,22 @@
            shutil.copy2(testImageSource, testImageDest)
            print(f"📷 Test image copied to: {testImageDest}")
# Find test image
self.testImagePath = None
if os.path.exists(testImageDest):
self.testImagePath = testImageDest
else:
# Try to find any image in modeltest directory
for file in os.listdir(self.modelTestDir):
if file.lower().endswith(('.jpg', '.jpeg', '.png')):
self.testImagePath = os.path.join(self.modelTestDir, file)
break
if self.testImagePath:
print(f"📷 Using test image: {self.testImagePath}")
else:
print(f"⚠️ No test image found in {self.modelTestDir}")
    async def initialize(self):
        """Initialize the AI service."""
        # Set logging level to DEBUG for detailed output
@@ -65,14 +103,18 @@
        from modules.aicore.aicorePluginTavily import AiTavily
        from modules.aicore.aicorePluginPerplexity import AiPerplexity

-       # Register web connectors that support WEB_CRAWL
-       modelRegistry.registerConnector(AiTavily())
-       modelRegistry.registerConnector(AiPerplexity())
+       # Note: We don't need to register web connectors for IMAGE_ANALYSE testing
+       # modelRegistry.registerConnector(AiTavily())
+       # modelRegistry.registerConnector(AiPerplexity())

        # The AI service needs to be recreated with proper initialization
        from modules.services.serviceAi.mainServiceAi import AiService
        self.services.ai = await AiService.create(self.services)

+       # Also initialize extraction service for image processing
+       from modules.services.serviceExtraction.mainServiceExtraction import ExtractionService
+       self.services.extraction = ExtractionService(self.services)

        # Create a minimal workflow context
        from modules.datamodels.datamodelChat import ChatWorkflow
        import uuid
@ -98,228 +140,150 @@ class AIModelsTester:
print(f"📁 Results will be saved to: {self.modelTestDir}") print(f"📁 Results will be saved to: {self.modelTestDir}")
async def testModel(self, modelName: str) -> Dict[str, Any]: async def testModel(self, modelName: str) -> Dict[str, Any]:
"""Test a specific AI model with WEB_CRAWL operation.""" """Test a specific AI model with IMAGE_ANALYSE operation."""
print(f"\n{'='*60}") print(f"\n{'='*60}")
print(f"TESTING MODEL: {modelName}") print(f"TESTING MODEL: {modelName}")
print(f"OPERATION TYPE: WEB_CRAWL") print(f"OPERATION TYPE: IMAGE_ANALYSE")
print(f"{'='*60}") print(f"{'='*60}")
-        # CRAWL CONFIGURATION
-        # Deep and Broad Web Crawl Example:
-        # - maxDepth: 3 (deep) - follows links up to 3 levels from starting page
-        #   - Level 1: Starting page
-        #   - Level 2: Pages linked from starting page
-        #   - Level 3: Pages linked from Level 2 pages
-        # - maxWidth: 50 (broad) - crawls up to 50 pages at each depth level
-        # This results in potential maximum of ~1,250 pages (if 50 links exist at each level)
-        #
-        # Common configurations:
-        # - Fast/Overview: maxDepth=1, maxWidth=5 (shallow, focused)
-        # - General/Standard: maxDepth=2, maxWidth=10 (balanced)
-        # - Deep and Broad: maxDepth=3, maxWidth=50 (comprehensive)
-        CRAWL_DEPTH = 3  # Deep crawl: follows links 3 levels deep
-        CRAWL_WIDTH = 50  # Broad crawl: up to 50 pages per level
-        print(f"Crawl Configuration:")
-        print(f"  - Depth: {CRAWL_DEPTH} levels (deep)")
-        print(f"  - Width: {CRAWL_WIDTH} pages per level (broad)")
-        print(f"  - Theoretical max: {CRAWL_WIDTH ** min(CRAWL_DEPTH, 3)} pages")
-        # Use WEB_CRAWL specific prompt format
-        from modules.datamodels.datamodelAi import AiCallPromptWebCrawl
-        # Test with simple prompt like playground example
-        simplePrompt = f"https://www.valueon.ch: Who works in this company?"
-        # But keep structured format for now to match our API
-        testPrompt = json.dumps({
-            "instruction": "Who works in this company?",
-            "url": "https://www.valueon.ch",
-            "maxDepth": CRAWL_DEPTH,
-            "maxWidth": CRAWL_WIDTH
-        }, indent=2)
-        print(f"Simple prompt (playground style): {simplePrompt}")
-        # For Tavily models, test direct API call for better link following
-        if "tavily" in modelName.lower():
-            return await self._testTavilyDirect(modelName, CRAWL_DEPTH, CRAWL_WIDTH)
+        # Check if test image exists
+        if not self.testImagePath or not os.path.exists(self.testImagePath):
+            result = {
+                "modelName": modelName,
+                "status": "ERROR",
+                "processingTime": 0.0,
+                "responseLength": 0,
+                "responseType": "error",
+                "hasContent": False,
+                "error": "No test image available",
+                "fullResponse": ""
+            }
+            self.testResults.append(result)
+            return result
+        # Test prompt for image analysis
+        testPrompt = "Analyze this image and describe what you see. Extract any text, numbers, or structured data."
+        print(f"Test image: {self.testImagePath}")
         print(f"Test prompt: {testPrompt}")
-        print(f"Prompt length: {len(testPrompt)} characters")
+        # Load image data
+        with open(self.testImagePath, 'rb') as f:
+            imageData = f.read()
+        print(f"Image size: {len(imageData)} bytes")
+        # Determine image MIME type from extension
+        if self.testImagePath.lower().endswith('.png'):
+            mimeType = "image/png"
+        elif self.testImagePath.lower().endswith(('.jpg', '.jpeg')):
+            mimeType = "image/jpeg"
+        else:
+            mimeType = "image/jpeg"  # Default
+        print(f"Image MIME type: {mimeType}")
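As a side note, the hard-coded extension chain added above only distinguishes PNG from JPEG; the stdlib `mimetypes` module can make the same guess for any registered extension while keeping the diff's JPEG fallback. A minimal sketch (the helper name `guessImageMime` is ours, not part of the codebase):

```python
import mimetypes

def guessImageMime(path: str) -> str:
    """Guess an image MIME type from a file path, defaulting to JPEG."""
    mime, _ = mimetypes.guess_type(path)
    # Keep the diff's behavior: anything unknown or non-image falls back to JPEG
    if mime is None or not mime.startswith("image/"):
        return "image/jpeg"
    return mime

assert guessImageMime("scan.png") == "image/png"
assert guessImageMime("photo.JPG") == "image/jpeg"
assert guessImageMime("notes.xyz") == "image/jpeg"
```

This also picks up formats like GIF or WebP for free, should the test images ever vary.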
         startTime = asyncio.get_event_loop().time()
         try:
-            # Create options for WEB_CRAWL operation
-            options = AiCallOptions(
-                operationType=OperationTypeEnum.WEB_CRAWL,
-                preferredModel=modelName
-            )
-            # Call the AI service DIRECTLY through the model's functionCall
-            # This tests the actual model, not the document generation pipeline
-            # Get the model directly from the registry using the model registry
+            # Get model directly from registry and test it
             from modules.aicore.aicoreModelRegistry import modelRegistry
             model = modelRegistry.getModel(modelName)
             if not model:
                 raise Exception(f"Model {modelName} not found")
-            # Create AiModelCall and call the model's functionCall directly
-            from modules.datamodels.datamodelAi import AiModelCall
+            # Import base64 for image data conversion
             import base64
-            import os
-            # For WEB_CRAWL models, use normal functionCall with structured prompt
-            messages = [{"role": "user", "content": testPrompt}]
+            # Convert image data to base64 string
+            if isinstance(imageData, bytes):
+                imageDataStr = base64.b64encode(imageData).decode('utf-8')
+            else:
+                imageDataStr = imageData
+            # Create messages in vision format
+            messages = [
+                {
+                    "role": "user",
+                    "content": [
+                        {"type": "text", "text": testPrompt},
+                        {
+                            "type": "image_url",
+                            "image_url": {
+                                "url": f"data:{mimeType};base64,{imageDataStr}"
+                            }
+                        }
+                    ]
+                }
+            ]
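The `data:{mimeType};base64,…` URL built above is the standard inline-image scheme used by OpenAI-style vision messages. A self-contained sketch of the round trip — how such a URL is assembled from raw bytes and how a receiver can take it apart again (the helper names are illustrative, not from this codebase):

```python
import base64

def buildDataUrl(imageData: bytes, mimeType: str) -> str:
    # b64encode returns bytes; decode to str so the URL can be embedded in JSON
    return f"data:{mimeType};base64,{base64.b64encode(imageData).decode('utf-8')}"

def parseDataUrl(url: str) -> tuple:
    # "data:image/png;base64,AAAA..." -> ("image/png", b"...")
    header, payload = url.split(",", 1)
    mimeType = header[len("data:"):header.index(";")]
    return mimeType, base64.b64decode(payload)

raw = b"\x89PNG\r\n\x1a\n"  # PNG magic bytes as a stand-in image
url = buildDataUrl(raw, "image/png")
assert parseDataUrl(url) == ("image/png", raw)
```

Worth keeping in mind when testing: base64 inflates the payload by about a third, so large test images directly increase request size.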
+            # Create model call
+            from modules.datamodels.datamodelAi import AiModelCall, AiCallOptions
             modelCall = AiModelCall(
                 messages=messages,
                 model=model,
-                options=options
+                options=AiCallOptions(operationType=OperationTypeEnum.IMAGE_ANALYSE)
             )
-            response = await model.functionCall(modelCall)
+            # Call model directly
+            print(f"Calling model.functionCall() for {modelName}")
+            modelResponse = await model.functionCall(modelCall)
+            if not modelResponse.success:
+                raise Exception(f"Model call failed: {modelResponse.error}")
+            result = modelResponse.content
             endTime = asyncio.get_event_loop().time()
             processingTime = endTime - startTime
-            # Analyze response - now we get AiModelResponse objects
-            if hasattr(response, 'success'):
-                # AiModelResponse object
-                if response.success:
-                    result = {
-                        "modelName": modelName,
-                        "status": "SUCCESS",
-                        "processingTime": round(processingTime, 2),
-                        "responseLength": len(response.content) if response.content else 0,
-                        "responseType": "AiModelResponse",
-                        "hasContent": bool(response.content),
-                        "error": None,
-                        "modelUsed": modelName,
-                        "priceUsd": 0.0,  # AiModelResponse doesn't have price info
-                        "bytesSent": 0,
-                        "bytesReceived": len(response.content.encode('utf-8')) if response.content else 0
-                    }
-                    # Extract actual prompt sent if available in metadata
-                    if hasattr(response, 'metadata') and response.metadata:
-                        result["actualPromptSent"] = response.metadata.get("actualPromptSent", "N/A")
-                    # Try to parse content as JSON
-                    if response.content:
-                        try:
-                            json.loads(response.content)
-                            result["isValidJson"] = True
-                        except:
-                            result["isValidJson"] = False
-                        result["responsePreview"] = response.content[:200] + "..." if len(response.content) > 200 else response.content
-                        result["fullResponse"] = response.content
-                    else:
-                        result["isValidJson"] = False
-                        result["responsePreview"] = "Empty response"
-                        result["fullResponse"] = ""
-                    print(f"✅ SUCCESS - Processing time: {processingTime:.2f}s")
-                    print(f"📄 Response length: {len(response.content) if response.content else 0} characters")
-                    print(f"📄 Model used: {modelName}")
-                    print(f"📄 Response preview: {result['responsePreview']}")
-                else:
-                    error = response.error or "Unknown error"
-                    result = {
-                        "modelName": modelName,
-                        "status": "ERROR",
-                        "processingTime": round(processingTime, 2),
-                        "responseLength": 0,
-                        "responseType": "AiModelResponse",
-                        "hasContent": False,
-                        "error": error,
-                        "fullResponse": str(response)
-                    }
-                    print(f"❌ ERROR - {error}")
-            elif isinstance(response, dict):
-                # Fallback for dict responses
-                if response.get("success", True):
-                    result = {
-                        "modelName": modelName,
-                        "status": "SUCCESS",
-                        "processingTime": round(processingTime, 2),
-                        "responseLength": len(str(response)),
-                        "responseType": "dict",
-                        "hasContent": True,
-                        "error": None
-                    }
-                    # Try to parse as JSON
-                    try:
-                        jsonResponse = json.dumps(response, indent=2)
-                        result["responsePreview"] = jsonResponse[:200] + "..." if len(jsonResponse) > 200 else jsonResponse
-                        result["isValidJson"] = True
-                        result["fullResponse"] = jsonResponse
-                    except:
-                        result["responsePreview"] = str(response)[:200] + "..." if len(str(response)) > 200 else str(response)
-                        result["isValidJson"] = False
-                        result["fullResponse"] = str(response)
-                    print(f"✅ SUCCESS - Processing time: {processingTime:.2f}s")
-                    print(f"📄 Response length: {len(str(response))} characters")
-                    print(f"📄 Response preview: {result['responsePreview']}")
-                else:
-                    error = response.get("error", "Unknown error")
-                    result = {
-                        "modelName": modelName,
-                        "status": "ERROR",
-                        "processingTime": round(processingTime, 2),
-                        "responseLength": 0,
-                        "responseType": "error",
-                        "hasContent": False,
-                        "error": error,
-                        "fullResponse": str(response)
-                    }
-                    print(f"❌ ERROR - {error}")
-            else:
-                # String response
-                result = {
+            # Analyze result (string response from readImage)
+            if result:
+                analysisResult = {
                     "modelName": modelName,
                     "status": "SUCCESS",
                     "processingTime": round(processingTime, 2),
-                    "responseLength": len(str(response)),
+                    "responseLength": len(result) if result else 0,
                     "responseType": "string",
                     "hasContent": True,
-                    "error": None
+                    "error": None,
+                    "testPrompt": testPrompt,
+                    "imagePath": self.testImagePath,
+                    "imageSize": len(imageData),
+                    "mimeType": mimeType
                 }
                 # Try to parse as JSON
                 try:
-                    json.loads(str(response))
-                    result["isValidJson"] = True
+                    json.loads(result)
+                    analysisResult["isValidJson"] = True
                 except:
-                    result["isValidJson"] = False
-                result["responsePreview"] = str(response)[:200] + "..." if len(str(response)) > 200 else str(response)
-                result["fullResponse"] = str(response)
+                    analysisResult["isValidJson"] = False
+                analysisResult["responsePreview"] = result[:200] + "..." if len(result) > 200 else result
+                analysisResult["fullResponse"] = result
                 print(f"✅ SUCCESS - Processing time: {processingTime:.2f}s")
-                print(f"📄 Response length: {len(str(response))} characters")
-                print(f"📄 Response preview: {result['responsePreview']}")
-            # Add prompt to result for logging
-            result["testPrompt"] = testPrompt
-            result["crawlConfig"] = {
-                "depth": CRAWL_DEPTH,
-                "width": CRAWL_WIDTH
-            }
-            # For WEB_CRAWL, also validate that content was extracted
-            if result.get("status") == "SUCCESS" and result.get("fullResponse"):
-                self._validateCrawlResponse(modelName, result)
+                print(f"📄 Response length: {len(result)} characters")
+                print(f"📄 Response preview: {analysisResult['responsePreview']}")
+                result = analysisResult
+                # Validate that content was extracted
+                if result.get("status") == "SUCCESS" and result.get("fullResponse"):
+                    self._validateImageResponse(modelName, result)
+            else:
+                result = {
+                    "modelName": modelName,
+                    "status": "ERROR",
+                    "processingTime": round(processingTime, 2),
+                    "responseLength": 0,
+                    "responseType": "error",
+                    "hasContent": False,
+                    "error": "Empty response",
+                    "fullResponse": ""
+                }
         except Exception as e:
             endTime = asyncio.get_event_loop().time()
@@ -334,10 +298,9 @@ class AIModelsTester:
                 "hasContent": False,
                 "error": str(e),
                 "testPrompt": testPrompt,
-                "crawlConfig": {
-                    "depth": CRAWL_DEPTH,
-                    "width": CRAWL_WIDTH
-                }
+                "imagePath": self.testImagePath,
+                "imageSize": len(imageData) if imageData else 0,
+                "mimeType": mimeType
             }
             print(f"💥 EXCEPTION - {str(e)}")
@@ -346,7 +309,7 @@ class AIModelsTester:
         # Save text response even for exceptions to log the prompt
         if result.get("status") in ["SUCCESS", "EXCEPTION", "ERROR"]:
-            self._saveTextResponse(modelName, result)
+            self._saveImageResponse(modelName, result)
         # Save individual model result immediately
         self._saveIndividualModelResult(modelName, result)
@@ -354,54 +317,48 @@ class AIModelsTester:
         return result
     def _saveImageResponse(self, modelName: str, result: Dict[str, Any]):
-        """Save base64 image response to file."""
+        """Save image analysis response to file."""
         try:
-            fullResponse = result.get("fullResponse", "")
-            base64Data = None
-            # Try to extract base64 data from response
-            if isinstance(fullResponse, dict):
-                # Look for base64 data in the response
-                if "content" in fullResponse:
-                    base64Data = fullResponse["content"]
-                elif "data" in fullResponse:
-                    base64Data = fullResponse["data"]
-                elif "image" in fullResponse:
-                    base64Data = fullResponse["image"]
-            else:
-                # Try to find base64 data in string response
-                import re
-                base64Match = re.search(r'data:image/[^;]+;base64,([A-Za-z0-9+/=]+)', str(fullResponse))
-                if base64Match:
-                    base64Data = base64Match.group(1)
-                else:
-                    # Try to find pure base64 string
-                    base64Match = re.search(r'([A-Za-z0-9+/=]{100,})', str(fullResponse))
-                    if base64Match:
-                        base64Data = base64Match.group(1)
-            if base64Data:
-                # Clean base64 data
-                if base64Data.startswith('data:image/'):
-                    base64Data = base64Data.split(',', 1)[1]
-                # Decode and save image
-                imageData = base64.b64decode(base64Data)
-                timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
-                filename = f"{modelName}_{timestamp}.png"
-                filepath = os.path.join(self.modelTestDir, filename)
-                with open(filepath, 'wb') as f:
-                    f.write(imageData)
-                result["savedImage"] = filepath
-                print(f"🖼️ Image saved: {filepath}")
-            else:
-                print(f"⚠️ No base64 image data found in response")
+            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+            filename = f"{modelName}_{timestamp}.txt"
+            filepath = os.path.join(self.modelTestDir, filename)
+            # Prepare content for saving
+            content = result.get("fullResponse", "")
+            if not content:
+                content = result.get("responsePreview", "No content available")
+            # If there's an error, include it in the content
+            if result.get("error"):
+                content = f"ERROR: {result.get('error')}\n\n{content}"
+            # Add metadata header
+            metadata = f"""Model: {modelName}
+Test Time: {timestamp}
+Status: {result.get('status', 'Unknown')}
+Processing Time: {result.get('processingTime', 0):.2f}s
+Response Length: {result.get('responseLength', 0)} characters
+Is Valid JSON: {result.get('isValidJson', False)}
+Image Path: {result.get('imagePath', 'N/A')}
+Image Size: {result.get('imageSize', 'N/A')} bytes
+MIME Type: {result.get('mimeType', 'N/A')}
+--- TEST PROMPT ---
+{result.get('testPrompt', 'N/A')}
+--- RESPONSE CONTENT ---
+{content}
+"""
+            with open(filepath, 'w', encoding='utf-8') as f:
+                f.write(metadata)
+            result["savedTextFile"] = filepath
+            print(f"📄 Analysis response saved: {filepath}")
         except Exception as e:
-            print(f"❌ Error saving image: {str(e)}")
-            result["imageSaveError"] = str(e)
+            print(f"❌ Error saving analysis response: {str(e)}")
+            result["saveError"] = str(e)
     def _saveTextResponse(self, modelName: str, result: Dict[str, Any]):
         """Save text response to file."""
@@ -504,6 +461,41 @@ Width: {crawlWidth}
             print(f"❌ Error validating crawl response: {str(e)}")
             result["crawlValidationError"] = str(e)
+    def _validateImageResponse(self, modelName: str, result: Dict[str, Any]):
+        """Validate that the IMAGE_ANALYSE response contains analyzed content."""
+        try:
+            content = result.get("fullResponse", "")
+            # Check if content is meaningful
+            hasContent = bool(content and len(content.strip()) > 0)
+            contentLength = len(content)
+            result["hasContent"] = hasContent
+            result["contentLength"] = contentLength
+            # Try to determine what kind of content was extracted
+            if hasContent:
+                # Check if it's structured data
+                isStructured = False
+                try:
+                    parsed = json.loads(content)
+                    if isinstance(parsed, dict):
+                        isStructured = True
+                except:
+                    pass
+                result["isStructured"] = isStructured
+                print(f"✅ Successfully analyzed image")
+                print(f"   Content length: {contentLength} characters")
+                print(f"   Is structured: {'Yes' if isStructured else 'No'}")
+            else:
+                print(f"⚠️ Empty or invalid image analysis response")
+        except Exception as e:
+            print(f"❌ Error validating image response: {str(e)}")
+            result["validationError"] = str(e)
     async def _testTavilyDirect(self, modelName: str, crawlDepth: int = 3, crawlWidth: int = 50) -> Dict[str, Any]:
         """Test Tavily API directly using the crawl() method with better link following."""
         print(f"\n{'='*60}")
@@ -660,30 +652,30 @@ Width: {crawlWidth}
             print(f"❌ Error saving individual result: {str(e)}")
     def getAllAvailableModels(self) -> List[str]:
-        """Get all available model names that support WEB_CRAWL."""
+        """Get all available model names that support IMAGE_ANALYSE."""
         from modules.aicore.aicoreModelRegistry import modelRegistry
         from modules.datamodels.datamodelAi import OperationTypeEnum
         # Get all models from registry
         allModels = modelRegistry.getAvailableModels()
-        # Filter models that support WEB_CRAWL
-        webCrawlModels = []
+        # Filter models that support IMAGE_ANALYSE
+        imageAnalyseModels = []
         for model in allModels:
             if model.operationTypes and any(
-                ot.operationType == OperationTypeEnum.WEB_CRAWL
+                ot.operationType == OperationTypeEnum.IMAGE_ANALYSE
                 for ot in model.operationTypes
-            ):  # Include both Tavily and Perplexity models
-                webCrawlModels.append(model.name)
-        # Filter to only "sonar" model for testing
-        webCrawlModels = [m for m in webCrawlModels if m == "sonar"]
-        print(f"Found {len(webCrawlModels)} models that support WEB_CRAWL (filtered to sonar):")
-        for modelName in webCrawlModels:
+            ):
+                imageAnalyseModels.append(model.name)
+        # Filter to common models for testing (remove filter to test all models)
+        # imageAnalyseModels = [m for m in imageAnalyseModels if "gpt" in m.lower() or "claude" in m.lower()]
+        print(f"Found {len(imageAnalyseModels)} models that support IMAGE_ANALYSE:")
+        for modelName in imageAnalyseModels:
             print(f"  - {modelName}")
-        return webCrawlModels
+        return imageAnalyseModels
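The filtering pattern in `getAllAvailableModels` (guard against an empty `operationTypes`, then `any(...)` over the declared operations) can be shown in isolation. A sketch with minimal stand-in types — `OpType` and `Model` are hypothetical placeholders; the real registry returns richer objects:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class OpType:
    operationType: str

@dataclass
class Model:
    name: str
    operationTypes: List[OpType] = field(default_factory=list)

def filterByOperation(models: List[Model], op: str) -> List[str]:
    """Return names of models declaring support for the given operation."""
    return [
        m.name for m in models
        if m.operationTypes and any(ot.operationType == op for ot in m.operationTypes)
    ]

models = [
    Model("gpt-4o", [OpType("IMAGE_ANALYSE"), OpType("TEXT")]),
    Model("sonar", [OpType("WEB_CRAWL")]),
    Model("untyped"),  # no declared operations: filtered out by the guard
]
assert filterByOperation(models, "IMAGE_ANALYSE") == ["gpt-4o"]
```

The `m.operationTypes and ...` guard matters because `any()` over an empty list is False anyway, but it also protects against `operationTypes` being `None` on loosely populated registry entries.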
     def saveTestResults(self):
         """Save detailed test results to file."""
@@ -802,7 +794,7 @@ async def main():
     """Run AI models testing for WEB_CRAWL operation."""
     tester = AIModelsTester()
-    print("Starting AI Models Testing for WEB_CRAWL...")
+    print("Starting AI Models Testing for IMAGE_ANALYSE...")
     print("Initializing AI service...")
     await tester.initialize()
@@ -814,9 +806,9 @@ async def main():
         print(f"  {i}. {model}")
     print(f"\n{'='*80}")
-    print("STARTING WEB_CRAWL TESTS")
+    print("STARTING IMAGE_ANALYSE TESTS")
     print(f"{'='*80}")
-    print("Testing each model's ability to crawl URLs and return content...")
+    print("Testing each model's ability to analyze images and return structured content...")
     print("Press Enter after each model test to continue to the next one...")
     # Test each model individually
@@ -840,7 +832,7 @@ async def main():
     print("TESTING COMPLETED")
     print(f"{'='*80}")
     print(f"📄 Results saved to: {resultsFile}")
-    print(f"📁 Images saved to: {tester.modelTestDir}")
+    print(f"📁 Test results saved to: {tester.modelTestDir}")
 if __name__ == "__main__":
     asyncio.run(main())