# Budget Rendering Requirement for JSON Continuation Context

## Problem Statement

When rendering `hierarchyContextForPrompt` under a budget constraint, data near the cut point (where the JSON was truncated) must take priority over data near the root. The current implementation renders top-down (root → children), so root-level data consumes the budget before cut-level data is rendered.

## Requirement

### Core Logic

1. **Find the cut element**: Identify the element in the JSON structure where truncation occurred (the cut point).

2. **Build path from cut to root**: Create an ordered list of nodes from the cut element back to the root element. The path should be: `[cut_element, parent_of_cut, grandparent_of_cut, ..., root]`.

3. **Walk backwards from cut to root, consuming budget**:
   - Start at the cut element
   - Walk backwards along the path to root
   - For each value node encountered on the path:
     - If value size ≤ remaining budget: render full value and deduct budget
     - If value size > remaining budget: render type hint (`<str>`, `<number>`, etc.) and **DO NOT** deduct budget (never let budget go below 0)
     - If remaining budget < 50: set budget to 0 and enable "summary mode"

4. **Summary mode (budget = 0)**:
   - Continue walking towards root
   - For elements on the **same level** as current position: render with type hints
   - For elements on **higher levels** (closer to root): render only structure (keys/attributes) without values, showing just the level hierarchy

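
The walk in steps 3 and 4 can be sketched as a small allocation pass. This is a minimal illustration, not the code in `jsonContinuation.py`: `PathNode`, `allocate_along_path`, and `SUMMARY_THRESHOLD` are hypothetical names, and value size is assumed to be the character length of the serialized value.

```python
from dataclasses import dataclass

SUMMARY_THRESHOLD = 50  # from the requirement: below this, budget drops to 0


@dataclass
class PathNode:
    """Hypothetical stand-in for one value node on the cut -> root path."""
    name: str
    value: str  # serialized value text (assumed size measure: its length)


def allocate_along_path(path, budget):
    """Walk value nodes from cut to root and decide what each one gets.

    Returns (decisions, remaining_budget, summary_mode), where decisions
    maps a node name to "full" or "hint".
    """
    decisions = {}
    summary_mode = False
    for node in path:  # path[0] is the cut element, path[-1] is nearest root
        size = len(node.value)
        if not summary_mode and size <= budget:
            decisions[node.name] = "full"
            budget -= size  # safe: size <= budget, so budget stays >= 0
        else:
            decisions[node.name] = "hint"  # too big: budget is NOT deducted
        if budget < SUMMARY_THRESHOLD:
            budget = 0
            summary_mode = True
    return decisions, budget, summary_mode
```

With a 120-character cut value and a budget of 500, this sketch marks the value as `"full"` and leaves 380, matching the arithmetic used in the example flow.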
### Critical Constraints

- **Never let budget go below 0**: If a value is bigger than the remaining budget, use a type hint instead of rendering the data
- **Budget allocation order**: Cut element → parent → grandparent → ... → root (bottom-up)
- **Rendering order**: Can be top-down (root → children) for structure, but budget must be allocated bottom-up
- **When budget < 50**: Set to 0 and enable summary mode immediately

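
The "never below 0" constraint can be illustrated with a helper that either renders a value or falls back to a type hint. This is a sketch under stated assumptions: `type_hint` and `render_value` are hypothetical names, the hint spellings `<array>`, `<bool>`, and `<null>` are guesses (the requirement only names `<str>`, `<number>`, and `<object>`), and size is assumed to be the length of the `json.dumps` output.

```python
import json


def type_hint(value):
    """Map a parsed JSON value to a placeholder such as <str> or <number>."""
    if value is None:
        return "<null>"  # assumed spelling
    if isinstance(value, bool):  # check before int: bool is a subclass of int
        return "<bool>"  # assumed spelling
    if isinstance(value, (int, float)):
        return "<number>"
    if isinstance(value, str):
        return "<str>"
    if isinstance(value, dict):
        return "<object>"
    if isinstance(value, list):
        return "<array>"  # assumed spelling
    raise TypeError(f"not a JSON value: {type(value).__name__}")


def render_value(value, budget):
    """Return (text, remaining_budget); the budget can never go negative."""
    text = json.dumps(value)
    if len(text) <= budget:
        return text, budget - len(text)
    return type_hint(value), budget  # too big: hint only, nothing deducted
```

Because the deduction happens only when the value fits, the remaining budget is non-negative by construction and needs no clamping afterwards.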
### Example Flow

Given JSON structure (truncated at cut point):

```json
{
  "document": {
    "metadata": {
      "title": "My Document",
      "author": "John Doe",
      "version": 1
    },
    "sections": [
      {
        "id": "section1",
        "title": "Introduction",
        "content": "This is the introduction content..."
      },
      {
        "id": "section2",
        "title": "Main Content",
        "content": "This is a very long content that gets cut right here in the middle of this sentence and the JSON is truncated..."
```

**Cut point**: The JSON is truncated in the middle of the `sections[1].content` value.

**Path from cut to root**:

```
[
  sections[1].content (value - CUT ELEMENT),
  sections[1].content (key-value pair),
  sections[1] (object),
  sections (array),
  document (object),
  root (object)
]
```

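
Building such a path from an access path (root → cut) could look like the sketch below. It is an illustration only: `path_cut_to_root` is a hypothetical standalone function, not the `_findPathToRoot()` mentioned in the implementation notes, and it does not distinguish the value node from its key-value pair.

```python
def path_cut_to_root(root, keys):
    """Return [(label, node), ...] ordered from the cut element back to root.

    `keys` is the access path from root to the cut element, e.g.
    ["document", "sections", 1, "content"].
    """
    nodes = [("root", root)]
    current, label = root, ""
    for key in keys:
        current = current[key]
        if isinstance(key, int):
            label = f"{label}[{key}]"  # array index
        else:
            label = f"{label}.{key}" if label else key  # object key
        nodes.append((label, current))
    nodes.reverse()  # collected root-first, returned cut-first
    return nodes


doc = {
    "document": {
        "metadata": {"title": "My Document"},
        "sections": [
            {"id": "section1"},
            {"id": "section2", "content": "This is a very long content"},
        ],
    }
}
for label, _ in path_cut_to_root(doc, ["document", "sections", 1, "content"]):
    print(label)
```

The labels come out fully qualified (`document.sections[1].content` first, `root` last), whereas the listing above abbreviates them by dropping the `document.` prefix.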
**Budget allocation order (walking backwards from cut → root)**:

1. **`sections[1].content` (value)** - CUT ELEMENT
   - Check: value size = 120 chars, budget = 500
   - Action: Render full value "This is a very long content that gets cut right here in the middle of this sentence and the JSON is truncated..."
   - Deduct: 120 chars from budget → remaining = 380

2. **`sections[1]` (object)** - Parent of cut
   - Action: Render structure only (no budget needed for `{`, `}`, keys)
   - Render: `"id": "section2", "title": "Main Content", "content": <already rendered>`

3. **`sections` (array)** - Grandparent
   - Action: Render structure only
   - Render: `[<section1>, <section2>]` where section2 already has content rendered

4. **`document` (object)** - Great-grandparent
   - Action: Render structure only
   - Render: `"metadata": {...}, "sections": <already rendered>`

5. **`root` (object)** - Root
   - Action: Render structure only
   - Render: `{"document": <already rendered>}`

**If the cut value does not fit in step 1** (e.g., value size = 600, budget = 500):
- `sections[1].content` gets the type hint `<str>` (value too big, so the budget is not deducted)
- The budget stays at 500; whenever the remaining budget is below 50, it is set to 0 and summary mode is enabled
- Once summary mode is enabled, continue to root:
  - **Same level elements** (e.g., `sections[1].id`, `sections[1].title`): type hints (`<str>`)
  - **Higher level elements** (e.g., `sections[0]`, `metadata`): structure only (keys, braces, no values)

**Expected output with budget = 500** (sufficient budget):

```json
{
  "document": {
    "metadata": {
      "title": "My Document",
      "author": "John Doe",
      "version": 1
    },
    "sections": [
      {
        "id": "section1",
        "title": "Introduction",
        "content": "This is the introduction content..."
      },
      {
        "id": "section2",
        "title": "Main Content",
        "content": "This is a very long content that gets cut right here in the middle of this sentence and the JSON is truncated..."
```

*All values rendered because budget is sufficient.*

**Expected output if ONE value is too big and the budget is running out** (budget = 100, cut value = 120):

```json
{
  "document": {
    "metadata": <object>,
    "sections": [
      {
        "id": "section1",
        "title": "Introduction",
        "content": "This is the introduction content..."
      },
      {
        "id": "section2",
        "title": <str>,
        "content": "This is a very long content that gets cut right here in the middle of this sentence and the JSON is truncated..."
```

**Key Points:**
- **Single value too big**: Only that value gets a type hint; other data continues to render
- **Budget > 0**: Render side paths (siblings, other branches) as long as budget allows
- **Budget = 0**: Stop rendering side paths and render only the path from cut element to root (structure only for higher levels, type hints for same level)

## Current Implementation Issues

The current implementation in `jsonContinuation.py`:
- Pre-allocates budget to path elements before rendering
- But rendering still happens top-down, so root elements consume budget first
- Path elements check for pre-allocated budget, but non-path elements also consume budget during top-down rendering

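
The effect of top-down consumption can be reproduced with a simplified model. This sketch does not reflect the actual code in `jsonContinuation.py`; `render_top_down` is a hypothetical function, values are flattened into document order, and size is assumed to be the length of the `json.dumps` output.

```python
import json


def render_top_down(values, budget):
    """Spend budget in document order, root-side values first.

    `values` is a list of (name, value) pairs; returns {name: rendered_text}.
    """
    rendered = {}
    for name, value in values:
        text = json.dumps(value)
        if len(text) <= budget:
            rendered[name] = text
            budget -= len(text)
        else:
            rendered[name] = "<str>"  # all values in this demo are strings
    return rendered


values = [
    ("metadata.title", "My Document"),   # near the root, rendered first
    ("metadata.author", "John Doe"),
    ("sections[1].content", "x" * 40),   # the cut element, rendered last
]
result = render_top_down(values, budget=50)
print(result["sections[1].content"])
```

In this run the two metadata values use 23 of the 50 characters, so the 42-character cut value no longer fits and is reduced to a type hint, even though it would have fit if it had been paid for first.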
## Expected Behavior

**Before budget runs out:**
- Cut element and path to root: full values rendered
- Other elements: full values if budget allows

**After budget < 50 (summary mode):**
- Cut element and path to root: full values (if budget was allocated)
- Same level elements: type hints (`<str>`, `<object>`, etc.)
- Higher level elements: structure only (keys, braces, brackets, no values)

## Implementation Notes

- The path should be built using `_findPathToRoot()`, which walks from the root to find the cut element
- Budget should be consumed during a separate pass that walks the path (cut → root)
- During rendering, path elements should check whether they have pre-allocated budget
- Non-path elements should only consume leftover budget after path elements are processed
- Structure elements (objects, arrays) don't consume budget; only values do

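
The two-pass idea in these notes can be sketched as follows. This is one possible reading, not the real implementation: `allocate`, `render`, and `HINTS` are hypothetical names, the cut value is assumed to be a scalar, size is assumed to be the length of the `json.dumps` output, and the summary-mode threshold is left out for brevity.

```python
import json

# Assumed hint spellings for scalar values; only <str> and <number> are
# named in the requirement itself.
HINTS = {str: "<str>", int: "<number>", float: "<number>",
         bool: "<bool>", type(None): "<null>"}


def allocate(root, cut_path, budget):
    """Pass 1: pay for the cut value before anything else is rendered."""
    value = root
    for key in cut_path:
        value = value[key]
    cost = len(json.dumps(value))
    if cost <= budget:
        return {tuple(cut_path)}, budget - cost
    return set(), budget  # too big: nothing paid, budget untouched


def render(node, paid, state, path=()):
    """Pass 2: top-down rendering; non-path values spend only the leftover."""
    if isinstance(node, dict):  # structure costs no budget
        items = (f"{json.dumps(k)}: {render(v, paid, state, path + (k,))}"
                 for k, v in node.items())
        return "{" + ", ".join(items) + "}"
    if isinstance(node, list):
        items = (render(v, paid, state, path + (i,)) for i, v in enumerate(node))
        return "[" + ", ".join(items) + "]"
    text = json.dumps(node)
    if path in paid:  # pre-allocated in pass 1
        return text
    if len(text) <= state["budget"]:
        state["budget"] -= len(text)
        return text
    return HINTS[type(node)]


doc = {"a": "x" * 10, "b": {"c": "yyyy"}}
paid, left = allocate(doc, ["b", "c"], budget=14)
print(render(doc, paid, {"budget": left}))  # {"a": <str>, "b": {"c": "yyyy"}}
```

Rendering still runs root → children, but because `b.c` was paid for in the first pass, the earlier sibling `a` can only draw on the 8 characters left over and falls back to a type hint.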