gateway

Ida dff3d41845 fix(rag): stable ingestion idempotency across re-extractions (AC4)	2026-04-29 14:39:40 +02:00

Re-indexing the same file always triggered a full embedding run; ingestion.skipped.duplicate never fired. Two independent causes:

1. _computeIngestionHash included contentObjectId in its payload, but extractors generate fresh uuid4() per run, making the hash a per-run nonce. Now hashed over (contentType, data) in extractor order: stable across re-extractions, sensitive to content, ordering, and type changes.

2. _autoIndexFile upserted the fresh pre-scan FileContentIndex before requestIngestion's duplicate check, wiping structure._ingestion and status=indexed from the prior run. The pre-upsert now merges the existing _ingestion metadata and preserves the indexed status.

Verified end-to-end: second PATCH /scope on an already-indexed file logs and returns in ~2s with zero embedding API calls.

Adds test_ingestion_hash_stability.py (5 cases).
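
The two fixes described in the commit message can be illustrated roughly as follows. This is only a sketch, not the code in mainServiceKnowledge.py: the function names, the dictionary layout of the content objects and index records, and the choice of SHA-256 over a JSON serialization are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Any, Iterable, Mapping, Optional


def compute_ingestion_hash(content_objects: Iterable[Mapping[str, Any]]) -> str:
    """Hash (contentType, data) pairs in extractor order (hypothetical helper).

    contentObjectId is deliberately left out: it is regenerated on every
    extraction run, so including it would change the hash each time.
    """
    payload = [[obj["contentType"], obj["data"]] for obj in content_objects]
    # Assumption: data is JSON-serializable. List order is preserved, so
    # reordering the content objects changes the hash.
    encoded = json.dumps(
        payload, ensure_ascii=False, sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def merge_prescan_index(
    fresh: Mapping[str, Any], existing: Optional[Mapping[str, Any]]
) -> dict:
    """Carry prior _ingestion metadata and indexed status into a fresh record.

    Hypothetical stand-in for the merge done before the pre-scan upsert.
    """
    merged = dict(fresh)
    if not existing:
        return merged
    structure = dict(merged.get("structure") or {})
    prior = (existing.get("structure") or {}).get("_ingestion")
    if prior is not None:
        structure["_ingestion"] = prior
    merged["structure"] = structure
    if existing.get("status") == "indexed":
        merged["status"] = "indexed"
    return merged
```

With a hash that ignores per-run identifiers and a pre-scan record that keeps the previous run's metadata, a duplicate check that compares the stored hash with the newly computed one has what it needs to skip the embedding run.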
__init__.py	new ai agent	2026-03-15 23:38:21 +01:00
mainServiceKnowledge.py	fix(rag): stable ingestion idempotency across re-extractions (AC4)	2026-04-29 14:39:40 +02:00
subPreScan.py	unified failsafe neutralization architecture	2026-03-29 21:55:09 +02:00