Commit graph

14 commits

Author SHA1 Message Date
Ida
a7f4055130 fix(rag): preserve per-page granularity + remove on-demand extraction fallbacks
The default MergeStrategy concatenates every extracted text part into a
single ContentPart, collapsing a 500-page PDF into one chunk with a
blurred average embedding — RAG retrieval was effectively broken.

- ExtractionOptions.mergeStrategy is now Optional[MergeStrategy]; passing
  None preserves per-part granularity. Default factory kept for
  backward compatibility.
- routeDataFiles._autoIndexFile, _workspaceTools.readFile, and
  _documentTools.describeImage explicitly pass mergeStrategy=None.
- Agent tools no longer carry redundant extraction + requestIngestion
  fallback paths: the unified ingestion lane owns all corpus writes,
  and readFile/describeImage are pure consumers of the knowledge store.
- Unit test asserts runExtraction(mergeStrategy=None) keeps every part.
2026-04-29 14:39:40 +02:00
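The behaviour described in commit a7f4055130 can be illustrated with a minimal sketch. Only the names ExtractionOptions, MergeStrategy, ContentPart, mergeStrategy and runExtraction come from the commit message; the class bodies, the merge() method, and the runExtraction signature below are assumptions made for illustration and are not the repository's actual code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentPart:
    # Hypothetical stand-in for one extracted text part (e.g. one PDF page).
    text: str


class MergeStrategy:
    """Assumed default behaviour: concatenate all parts into one ContentPart."""

    def merge(self, parts: List[ContentPart]) -> List[ContentPart]:
        return [ContentPart("\n".join(p.text for p in parts))]


@dataclass
class ExtractionOptions:
    # The default factory is kept for backward compatibility;
    # passing None explicitly disables merging.
    mergeStrategy: Optional[MergeStrategy] = field(default_factory=MergeStrategy)


def runExtraction(pages: List[str], options: ExtractionOptions) -> List[ContentPart]:
    # Hypothetical signature: one ContentPart is produced per input page.
    parts = [ContentPart(text) for text in pages]
    if options.mergeStrategy is None:
        return parts  # per-part granularity preserved
    return options.mergeStrategy.merge(parts)


pages = ["page 1", "page 2", "page 3"]
merged = runExtraction(pages, ExtractionOptions())
perPage = runExtraction(pages, ExtractionOptions(mergeStrategy=None))
print(len(merged), len(perPage))
```

With the default options the three pages collapse into a single part, while mergeStrategy=None keeps one part per page, so each page can be chunked and embedded on its own.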
ValueOn AG
e942770ffc feat db-clean-ui and unified content udm 2026-04-16 23:13:05 +02:00
ValueOn AG
7fe6f9bc97 new ai agent 2026-03-15 23:38:21 +01:00
ValueOn AG
e1b3cd36f0 enhanced core ai call document handling with document intent 2025-12-25 00:09:27 +01:00
ValueOn AG
4b00e741b3 refactored service center 2025-12-15 21:55:26 +01:00
ValueOn AG
9bd7821cf5 feat: refactored ai calls and pydantic models 2025-11-17 23:12:18 +01:00
ValueOn AG
259ccabbe3 Prompt tuning for generation and validation step 2025-10-31 14:28:14 +01:00
ValueOn AG
8523da7fe2 cleanup pydantic v2, unnecessary pydantic to dict conversions, unnecessary unions removed with clean classes 2025-10-24 22:46:05 +02:00
ValueOn AG
36947b6d7e ready for test revised dynamic ai aware chunking system 2025-10-23 00:35:44 +02:00
ValueOn AG
be5f6773b6 ai core with unlimited document size 2025-10-11 18:30:26 +02:00
ValueOn AG
d2b6820812 cleaned pydantic model classes 2025-10-05 16:28:44 +02:00
ValueOn AG
0e14607f83 ai services extract and web revised 2025-10-02 21:29:21 +02:00
ValueOn AG
501cebe342 start testing with backend running 2025-09-30 18:30:33 +02:00
ValueOn AG
cbea086f91 Ready for test revised generic dynamic ai call center with dynamic generic content extraction engine 2025-09-30 00:02:51 +02:00