The default MergeStrategy concatenates every extracted text part into a
single ContentPart, collapsing a 500-page PDF into one chunk with a
blurred average embedding. This effectively broke RAG retrieval.
- ExtractionOptions.mergeStrategy is now Optional[MergeStrategy]; passing
None preserves per-part granularity. Default factory kept for
backward compatibility.
- routeDataFiles._autoIndexFile, _workspaceTools.readFile, and
_documentTools.describeImage explicitly pass mergeStrategy=None.
- Agent tools no longer carry redundant extraction + requestIngestion
fallback paths: the unified ingestion lane owns all corpus writes,
and readFile/describeImage are pure consumers of the knowledge store.
- Unit test asserts runExtraction(mergeStrategy=None) keeps every part.
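
A minimal sketch of the behavior described above, for illustration only.
The ContentPart shape, the concat_all helper, and the runExtraction
signature here are simplified assumptions, not the real implementation;
only the names ExtractionOptions, MergeStrategy, mergeStrategy, and
runExtraction come from this change.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ContentPart:
    # Hypothetical minimal shape; the real ContentPart carries more fields.
    text: str


# Assumption: a merge strategy is modelled as a callable over a list of parts.
MergeStrategy = Callable[[List[ContentPart]], List[ContentPart]]


def concat_all(parts: List[ContentPart]) -> List[ContentPart]:
    """Stand-in for the default strategy: join every part into one."""
    return [ContentPart("\n".join(p.text for p in parts))]


@dataclass
class ExtractionOptions:
    # The default factory is kept, so existing callers still get merged output.
    # Passing mergeStrategy=None opts out of merging entirely.
    mergeStrategy: Optional[MergeStrategy] = field(
        default_factory=lambda: concat_all
    )


def runExtraction(
    parts: List[ContentPart], options: ExtractionOptions
) -> List[ContentPart]:
    """Simplified: apply the merge strategy only when one is set."""
    if options.mergeStrategy is None:
        return list(parts)  # per-part granularity preserved
    return options.mergeStrategy(parts)


pages = [ContentPart(f"page {i}") for i in range(3)]
merged = runExtraction(pages, ExtractionOptions())
unmerged = runExtraction(pages, ExtractionOptions(mergeStrategy=None))
```

In this sketch, merged holds a single concatenated part while unmerged
keeps all three, which is the property the new unit test checks.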