wiki/archiv/implementation_content_handling_with_dynamic_ai.md
2026-02-04 10:13:46 +01:00

# Current process
A: GENERIC START
mainServiceAi.py
├── callAiDocuments(prompt, documents, options, outputFormat, title)
│ └── SubCoreAi.callAiDocuments()
│ ├── documents provided?
│ │ └── SubDocumentProcessing.callAiText(prompt, documents, options)
│ │ └── processDocumentsPerChunk(documents, prompt, options)
B: MODEL DATA gathering
│ │ ├── _getModelCapabilitiesForContent(prompt, documents, options) [MODEL SELECTION FOR CHUNKING]
│ │ │ ├── modelRegistry.getAvailableModels()
│ │ │ ├── model_selector.selectModel(prompt, "", options, availableModels)
│ │ │ └── Returns: {maxContextBytes, textChunkSize, imageChunkSize}
│ │ │
C: GENERIC EXTRACTION without AI, without need for model data
│ │ └── extractionService.extractContent(documents, extractionOptions)
│ │ └── For each document in documents:
│ │ └── runExtraction(extractorRegistry, chunkerRegistry, documentBytes, fileName, mimeType, options)
│ │ ├── extractorRegistry.resolve(mimeType, fileName) [FORMAT-SPECIFIC EXTRACTOR]
│ │ │ └── extractor.extract(documentBytes, options) [PDF, HTML, JSON, etc.]
│ │ │ └── Returns: List[ContentPart] (text, table, image, structure, container, binary)
│ │ │
│ │ └── poolAndLimit(parts, chunkerRegistry, options) [CHUNKING CORE - USES MODEL DATA]
│ │ ├── Uses: maxSize from options (derived from model context length)
│ │ ├── For each part that exceeds maxSize:
D: CHUNKING, requires model data
│ │ │ └── chunkerRegistry.resolve(part.typeGroup).chunk(part, options)
│ │ │ ├── TextChunker.chunk() → Uses textChunkSize from options
│ │ │ ├── ImageChunker.chunk() → Uses imageChunkSize from options
│ │ │ ├── TableChunker.chunk() → Uses textChunkSize from options
│ │ │ └── StructureChunker.chunk() → Uses textChunkSize from options
E: Generic part
│ │ └── Returns: List[ContentPart] with chunks marked as metadata["chunk"] = True
│ │
│ │ └── _processChunksWithMapping(extractionResult, prompt, options)
F: HERE CHUNKING NEEDED FOR AI CALLS
│ │ └── For each chunk in parallel (with concurrency control):
G: Idea
│ │ ├── Image chunks:
│ │ │ └── SubCoreAi.readImage(prompt, imageData, mimeType, options) [AI CALL]
│ │ │ └── interfaceAiObjects.callImage() [FALLBACK MODEL SELECTION]
│ │ │
│ │ ├── Container/Binary chunks:
│ │ │ └── aiObjects.call(AiCallRequest) [AI CALL]
│ │ │ └── interfaceAiObjects.call() [FALLBACK MODEL SELECTION]
│ │ │
│ │ └── Text/Table/Structure chunks:
│ │ └── aiObjects.call(AiCallRequest) [AI CALL]
│ │ └── interfaceAiObjects.call() [FALLBACK MODEL SELECTION]
│ │ ├── model_selector.getFallbackModels() [GET PRIORITIZED MODEL LIST]
│ │ ├── Try each model in sequence until success:
│ │ │ ├── _callWithModel(model, prompt, context, temperature, maxTokens, inputBytes)
│ │ │ ├── If fails → try next model in fallback list
│ │ │ └── If all fail → return error
│ │ └── Returns: AiCallResponse with content, modelName, priceCHF, etc.
H: GENERIC MERGING WITHOUT AI
│ │
│ │ └── _mergeChunkResults(chunkResults, options)
│ │ └── Returns: Merged text result
│ │
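
The fallback sequence at the end of the current process (try each model in order, move on when one fails, return an error when all fail) can be sketched roughly as below. This is only an illustration: `callWithFallback`, the reduced `AiCallResponse` fields and the `callWithModel` callback are hypothetical stand-ins, not the real signatures of `interfaceAiObjects.call()` or `_callWithModel()`.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class AiCallResponse:
    # Reduced, hypothetical version of the real response object.
    content: str
    modelName: str
    error: Optional[str] = None


def callWithFallback(
    models: List[str],
    callWithModel: Callable[[str, str], str],
    prompt: str,
) -> AiCallResponse:
    """Try each model in priority order until one succeeds."""
    errors: List[str] = []
    for model in models:
        try:
            content = callWithModel(model, prompt)
            return AiCallResponse(content=content, modelName=model)
        except Exception as exc:
            # This model failed: remember why and try the next one.
            errors.append(f"{model}: {exc}")
    # All models failed: return an error response.
    return AiCallResponse(content="", modelName="", error="; ".join(errors))
```

Note that in this flow the chunks are already fixed before the first model is tried, so a fallback model with a smaller context receives chunks that were sized for a different model.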
# Idea to review:
- Do not mix content extraction into parts with chunking. Separate the two steps: pure extraction (documents -> parts) and chunking (parts -> chunks).
- In step C, only extract, without chunking. Each part should still carry the meta information needed for potential chunking later (this should already be the case). So poolAndLimit does not run here; we just extract the content parts.
- In step F, loop over the parts, not the chunks.
- Handle the chunking inside the AI calls (SubCoreAi.readImage -> interfaceAiObjects.callImage() and aiObjects.call(AiCallRequest) -> interfaceAiObjects.call()) by adding one more attribute to the options, "contentTypeGroup", which can be resolved to a chunker via chunkerRegistry.resolve(part.contentTypeGroup).chunk(part, options). This way chunks are produced only if needed, once the model is known, and each fallback round can chunk again according to that model's size. Before the fallback loop, store the original content part so that each calling round can chunk it afresh.
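
A minimal sketch of the last point, assuming text-only parts and sizes measured in characters rather than bytes. `ContentPart`, `TextChunker`, `ChunkerRegistry` and `chunkForModel` are simplified, hypothetical stand-ins for the real classes; only the `resolve(...).chunk(part, options)` call shape is taken from the notes above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ContentPart:
    contentTypeGroup: str
    data: str
    metadata: Dict[str, object] = field(default_factory=dict)


class TextChunker:
    def chunk(self, part: ContentPart, options: Dict[str, int]) -> List[ContentPart]:
        # Split into fixed-size slices; a real chunker would respect boundaries.
        size = options["textChunkSize"]
        return [
            ContentPart(part.contentTypeGroup, part.data[i:i + size], {"chunk": True})
            for i in range(0, len(part.data), size)
        ]


class ChunkerRegistry:
    def __init__(self) -> None:
        self._chunkers: Dict[str, TextChunker] = {}

    def register(self, typeGroup: str, chunker: TextChunker) -> None:
        self._chunkers[typeGroup] = chunker

    def resolve(self, typeGroup: str) -> TextChunker:
        return self._chunkers[typeGroup]


def chunkForModel(
    originalPart: ContentPart,
    registry: ChunkerRegistry,
    modelOptions: Dict[str, int],
) -> List[ContentPart]:
    """Chunk only if the part exceeds the current model's size."""
    if len(originalPart.data) <= modelOptions["maxSize"]:
        return [originalPart]
    chunker = registry.resolve(originalPart.contentTypeGroup)
    # Always chunk the stored original, never the chunks of a previous round.
    return chunker.chunk(originalPart, modelOptions)
```

Because `chunkForModel` never mutates the original part, every fallback round can call it again with the options of the model currently being tried.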
# Process concept to handle the chunking inside the ai calls:
The architecture for interfaceAiObjects.call + interfaceAiObjects.callImage changes as follows.
## interfaceAiObjects.call (with one contentPart)
1. Get the models with model_selector.getFallbackModels().
2. Initialize processedContentPartChunks[] as empty.
3. Initialize pipelineContentPartChunks[] with one chunk: the contentPart.
4. Set the fail counter of each model to 0. Set the model index to 1 (first model in the list). Set maxModelErrors to 10.
5. LOOP over models:
5.1. If a model's fail counter is > maxModelErrors, remove the model from the list. If no model is left in the list, break out of the LOOP. If the model index is > the max model index, reset it to model index 1.
5.2. Model selection: select currentModel based on the model index. Now we have the model data.
5.3. Pooling: pool the chunkItems in pipelineContentPartChunks[] according to the model's size, using the pooling logic from the poolAndLimit() function, to reduce the number of AI calls. Pooled items replace the single items in pipelineContentPartChunks[].
5.4. Chunking: if a chunkItem is bigger than the model's maxSize, produce chunks for that chunkItem and replace it in pipelineContentPartChunks[] with the new chunks.
5.5. Processing: process the chunkItems in pipelineContentPartChunks[], writing each processed result into processedContentPartChunks[] and removing the chunkItem from pipelineContentPartChunks[]. Repeat until the model fails (increment the fail counter for the model, increment the model index, then start the next LOOP iteration) or no items are left (exit the LOOP).
6. Returns: AiCallResponse with content, modelName, priceCHF, etc., built from the data delivered in processedContentPartChunks[].
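
The steps above can be sketched as follows, under several simplifying assumptions: content is plain text, sizes are character counts, pooling is a greedy merge of adjacent items, and the model index is 0-based. `Model`, `AiCallResponse`, `poolItems`, `splitToSize`, `callWithChunking` and the `callModel` callback are all hypothetical names for illustration, not the existing API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Model:
    name: str
    maxSize: int


@dataclass
class AiCallResponse:
    content: str
    complete: bool  # False if every model dropped out before the pipeline was empty


def poolItems(items: List[str], maxSize: int) -> List[str]:
    # Step 5.3: greedily merge adjacent items while they still fit the model.
    pooled: List[str] = []
    for item in items:
        if pooled and len(pooled[-1]) + len(item) <= maxSize:
            pooled[-1] += item
        else:
            pooled.append(item)
    return pooled


def splitToSize(item: str, maxSize: int) -> List[str]:
    # Step 5.4: items within the limit come back unchanged as a single chunk.
    return [item[i:i + maxSize] for i in range(0, len(item), maxSize)]


def callWithChunking(
    contentPart: str,
    models: List[Model],
    callModel: Callable[[Model, str], str],
    maxModelErrors: int = 10,
) -> AiCallResponse:
    models = list(models)
    failCount = {m.name: 0 for m in models}            # step 4
    processed: List[str] = []                          # step 2
    pipeline: List[str] = [contentPart]                # step 3
    index = 0
    while pipeline:                                    # step 5
        models = [m for m in models if failCount[m.name] <= maxModelErrors]  # 5.1
        if not models:
            break
        if index >= len(models):
            index = 0
        model = models[index]                          # 5.2
        pipeline = poolItems(pipeline, model.maxSize)  # 5.3
        pipeline = [c for item in pipeline for c in splitToSize(item, model.maxSize)]  # 5.4
        try:                                           # 5.5
            while pipeline:
                processed.append(callModel(model, pipeline[0]))
                pipeline.pop(0)  # remove only after a successful call
        except Exception:
            failCount[model.name] += 1
            index += 1
    return AiCallResponse(content="".join(processed), complete=not pipeline)  # step 6
```

Writing the loop out raises a few points for the review: removing a model shifts the positions of the models behind it, so the index may skip one; unprocessed items stay in the pipeline when all models drop out, which the response has to signal somehow; and results may come from several models, so a single modelName and priceCHF in the response would need to be aggregated.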
# Task
Can you review this new process concept critically? Is the logic correct, or are there logic mistakes? Is something missing? Is the data flow clear?