Functional Differences Analysis: Legacy vs Current Chatbot
Executive Summary
The legacy implementation works because the LLM calls the send_streaming_message tool as instructed. The current implementation fails because the LLM writes status updates as regular text instead of calling the tool. The system's attempt to handle those text messages routes back to the agent on every pass, so the workflow loops until the iteration limit (15) is reached and never produces a final answer.
Core Functional Difference
Legacy: Tool-Based Status Updates (WORKS)
How it works:
- System prompt instructs: "Use send_streaming_message tool for status updates"
- LLM (ChatAnthropic) follows instructions and calls the tool
- Event handler listens ONLY for on_tool_start events with send_streaming_message
- When tool is called → routes to tools node → tool executes → routes back to agent
- Agent then calls SQL tools → processes results → generates final answer
Code Evidence:
# legacy/chatbot.py line 267
if etype == "on_tool_start" and ename == "send_streaming_message":
    tool_in = edata.get("input") or {}
    msg = tool_in.get("message")
    if isinstance(msg, str) and msg.strip():
        yield {"type": "status", "label": msg.strip()}
    continue
Key Point: Legacy ONLY handles tool calls. It doesn't try to detect status messages in regular text.
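The tool-only strategy can be tried in isolation. The following is a minimal sketch, not the legacy code itself: it assumes events arrive as plain dicts with `event`, `name` and `data` keys (the shape suggested by the `etype`/`ename`/`edata` variables in the excerpt), and `status_events` and `sample_events` are hypothetical names introduced for illustration.

```python
from typing import Any, Dict, Iterable, Iterator

STATUS_TOOL = "send_streaming_message"


def status_events(events: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, str]]:
    """Yield one status event per start of the status tool; ignore all other events."""
    for event in events:
        etype = event.get("event")
        ename = event.get("name")
        edata = event.get("data") or {}
        if etype == "on_tool_start" and ename == STATUS_TOOL:
            tool_in = edata.get("input") or {}
            msg = tool_in.get("message")
            if isinstance(msg, str) and msg.strip():
                yield {"type": "status", "label": msg.strip()}


# Hypothetical event stream: one status tool call, one plain-text chunk, one SQL tool call.
sample_events = [
    {"event": "on_tool_start", "name": "send_streaming_message",
     "data": {"input": {"message": "  Durchsuche Datenbank nach LEDs...  "}}},
    {"event": "on_chain_stream", "name": "agent",
     "data": {"chunk": "Ich werde die Datenbank nach Artikeln durchsuchen..."}},
    {"event": "on_tool_start", "name": "sqlite_query",
     "data": {"input": {"query": "SELECT ..."}}},
]

statuses = list(status_events(sample_events))
print(statuses)
```

Only the first event produces a status. The plain-text chunk is never inspected, which is why the legacy handler cannot feed text back into the routing logic.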
Current: Text-Based Status Updates (BROKEN)
How it fails:
- System prompt instructs: "MUST use send_streaming_message tool, VERBOTEN (forbidden) to write text messages"
- LLM (AICenterChatModel) ignores instructions and generates text messages like "Ich werde die Datenbank nach Artikeln durchsuchen..." ("I will search the database for articles...")
- Event handler tries to handle these text messages by detecting them as "status messages"
- When status message detected → routes back to agent (to "fix" it)
- Agent generates another text status message (still not using tool)
- Loop repeats until max iterations (15) is reached
Code Evidence:
# gateway/modules/features/chatbot/chatbotStreaming.py line 198-227
if etype == "on_chain_stream" and ename == "agent":
    # Tries to detect status messages in regular text
    if content and is_status_message(content):
        await _emit_status_event(...)  # Convert to status event
        continue  # Don't store as message
# gateway/modules/features/chatbot/chatbotLangGraph.py line 292-296
if is_status:
    # Status message without tool calls - route back to agent
    # The agent should then call actual tools (like sqlite_query)
    logger.info(f"Status message detected without tool calls, routing back to agent...")
    return "agent"  # THIS CAUSES THE LOOP
Key Point: Current tries to compensate for LLM not following instructions, but this creates a loop.
Why They Differ: Root Causes
1. Model Behavior Difference
| Aspect | Legacy (ChatAnthropic) | Current (AICenterChatModel) |
|---|---|---|
| Tool Calling | Follows prompt, uses send_streaming_message tool | Ignores prompt, generates text instead |
| Instruction Following | Strong adherence to system prompt | Weak adherence to system prompt |
| Model Type | Direct LangChain integration | Bridge to AI center (may use different models) |
Impact: The current model doesn't follow the instruction to use the tool, so it generates text messages that break the workflow.
2. Event Handling Strategy
Legacy Event Handling
# Simple: Only listen for tool calls
if etype == "on_tool_start" and ename == "send_streaming_message":
    # Handle tool call
    yield {"type": "status", "label": msg}
    continue  # Done, move on
Strategy: Trust the LLM to use the tool. Only handle tool calls.
Current Event Handling
# Complex: Try to handle both tool calls AND text messages
if etype == "on_tool_start" and ename == "send_streaming_message":
    # Handle tool call (same as legacy)
    await _emit_status_event(...)
if etype == "on_chain_stream" and ename == "agent":
    # ALSO try to detect status messages in text
    if is_status_message(content):
        await _emit_status_event(...)  # Convert text to status
Strategy: Don't trust the LLM. Try to compensate by detecting status messages in text.
Problem: This creates a feedback loop where status messages trigger re-routing, causing infinite loops.
3. Workflow Routing Logic
Legacy Routing (should_continue)
# Simple logic
def should_continue(state: ChatState) -> str:
    last_message = state.messages[-1]
    tool_calls = getattr(last_message, "tool_calls", None)
    if tool_calls:
        return "tools"  # Has tool calls → execute tools
    else:
        return END  # No tool calls → done
Key Point: No special handling for status messages. If there are tool calls, execute them. Otherwise, end.
Current Routing (should_continue)
# Complex logic with status detection
def should_continue(state: ChatState) -> str:
    last_message = state.messages[-1]
    tool_calls = getattr(last_message, "tool_calls", None)
    if tool_calls:
        return "tools"
    # NEW: Check if it's a status message
    if isinstance(last_message, AIMessage):
        content = last_message.content
        if is_status_message(content):
            return "agent"  # Route back to agent (CAUSES LOOP!)
    return END
Key Point: Tries to "fix" status messages by routing back to agent, but agent just generates another status message.
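To make the difference between the two routers concrete, the sketch below runs both on the same two messages. It is a self-contained approximation under stated assumptions: `AIMessage` here is a minimal stand-in dataclass rather than the LangChain class, `END` is a plain string standing in for the graph's end sentinel, and `is_status_message` is a guess modeled on the length-and-pattern check in the memory filter, using only the two patterns visible in this document.

```python
from dataclasses import dataclass, field
from typing import List

END = "__end__"  # stand-in for the graph's END sentinel
STATUS_PATTERNS = ("ich werde", "ich suche")  # only the patterns visible in this document


@dataclass
class AIMessage:  # minimal stand-in, not the LangChain class
    content: str
    tool_calls: List[dict] = field(default_factory=list)


def is_status_message(content: str) -> bool:
    """Assumed heuristic: short text containing a known status phrase."""
    text = content.lower().strip()
    return len(text) < 150 and any(p in text for p in STATUS_PATTERNS)


def legacy_should_continue(last: AIMessage) -> str:
    # Tool calls go to the tools node; anything else ends the run.
    return "tools" if last.tool_calls else END


def current_should_continue(last: AIMessage) -> str:
    if last.tool_calls:
        return "tools"
    if is_status_message(last.content):
        return "agent"  # re-enters the agent without any new input
    return END


text_status = AIMessage("Ich werde die Datenbank nach Artikeln durchsuchen...")
tool_status = AIMessage("", tool_calls=[
    {"name": "send_streaming_message",
     "args": {"message": "Durchsuche Datenbank nach LEDs..."}},
])

for label, msg in (("text status", text_status), ("tool status", tool_status)):
    print(label, "| legacy:", legacy_should_continue(msg),
          "| current:", current_should_continue(msg))
```

Both routers agree when the model calls the tool. They diverge only on the text-only status message, where the legacy router ends the run and the current router sends control back to the agent.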
4. Message Filtering
Legacy: No Filtering
- All messages are stored in memory
- Status messages from tool calls are handled, but messages themselves are stored
- No filtering of "status-like" text messages
Current: Aggressive Filtering
# chatbotMemory.py - Filters out status messages
if content:
    content_lower = content.lower().strip()
    status_patterns = ["ich werde", "ich suche", ...]
    if len(content) < 150 and any(pattern in content_lower for pattern in status_patterns):
        logger.debug(f"Skipping status update message...")
        continue  # Don't store
# chatbotLangGraph.py - Filters from conversation window
if content and is_status_message(content):
    logger.debug(f"Filtering out status message from conversation window...")
    # Skip this message
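Problem: Status messages are filtered out, so they never accumulate in memory or in the conversation window. The agent is therefore likely called with the same context on every pass, has no record of having already announced its plan, and keeps generating the same kind of status message, which sustains the loop.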
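The effect of the memory filter can be reproduced on a small list of message texts. This is a sketch only: `filter_for_memory` and `turn` are hypothetical names, and the pattern list contains just the two entries visible in the excerpt, since the remainder is elided there.

```python
from typing import List

# The source list continues after these two entries; the rest is not shown.
STATUS_PATTERNS = ["ich werde", "ich suche"]


def filter_for_memory(contents: List[str]) -> List[str]:
    """Drop short messages that contain a status phrase; keep everything else."""
    kept = []
    for content in contents:
        if content:
            content_lower = content.lower().strip()
            if len(content) < 150 and any(p in content_lower for p in STATUS_PATTERNS):
                continue  # treated as a status update, not stored
        kept.append(content)
    return kept


# One user question followed by two identical text-only status messages.
turn = [
    "wie viele leds haben wir auf lager",
    "Ich werde die Datenbank nach Artikeln durchsuchen...",
    "Ich werde die Datenbank nach Artikeln durchsuchen...",
]

stored = filter_for_memory(turn)
print(stored)
```

After filtering, only the user question remains, so the stored history looks the same no matter how many status messages the agent has already produced.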
The Infinite Loop Explained
What Happens in Current Implementation
1. User asks: "wie viele leds haben wir auf lager" ("how many LEDs do we have in stock")
2. Agent generates: "Ich werde die Datenbank nach Artikeln durchsuchen..." (text message, NO tool call)
3. Status detection: is_status_message() returns True
4. Routing: should_continue() returns "agent" (route back)
5. Memory filtering: Message is filtered out (not stored)
6. Agent called again: Generates another status message (still no tool call)
7. Repeat steps 3-6 until max iterations (15) reached
8. Workflow ends: No final answer, only status messages
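The sequence above can be simulated without any graph framework. The sketch below is an illustration under assumptions, not the production workflow: `non_compliant_agent` is a hypothetical stub for a model that always answers with a text status message, `is_status_message` is reduced to a single phrase check, and the iteration cap of 15 is taken from this document.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

END = "__end__"  # stand-in for the graph's END sentinel
MAX_ITERATIONS = 15  # iteration limit named in this document


@dataclass
class AIMessage:  # minimal stand-in, not the LangChain class
    content: str
    tool_calls: List[dict] = field(default_factory=list)


def is_status_message(content: str) -> bool:
    return "ich werde" in content.lower()  # simplified assumption


def non_compliant_agent(window: List[str]) -> AIMessage:
    # Hypothetical stub: announces its plan in plain text and never calls a tool.
    return AIMessage("Ich werde die Datenbank nach Artikeln durchsuchen...")


def should_continue(last: AIMessage) -> str:
    if last.tool_calls:
        return "tools"
    if is_status_message(last.content):
        return "agent"
    return END


def run(question: str) -> Tuple[List[str], Optional[str], List[str]]:
    window = [question]  # conversation window the agent sees
    statuses: List[str] = []
    final_answer: Optional[str] = None
    for _ in range(MAX_ITERATIONS):
        reply = non_compliant_agent(window)
        route = should_continue(reply)
        if route == "agent":
            statuses.append(reply.content)  # emitted as status, filtered from window
            continue
        if route == END:
            final_answer = reply.content
            break
        # The "tools" branch is never reached with this stub agent.
    return statuses, final_answer, window


statuses, final_answer, window = run("wie viele leds haben wir auf lager")
print(len(statuses), final_answer, window)
```

The run ends only because of the iteration cap: fifteen status messages are emitted, no final answer is produced, and the window still holds nothing but the original question.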
What Should Happen (Like Legacy)
1. User asks: "wie viele leds haben wir auf lager"
2. Agent calls tool: send_streaming_message("Durchsuche Datenbank nach LEDs...") (tool call; the message means "Searching database for LEDs...")
3. Tool execution: Tool node executes, emits status event
4. Routing: should_continue() returns "tools" → tools execute → back to agent
5. Agent calls SQL tool: sqlite_query("SELECT ...") (tool call)
6. SQL execution: Tool node executes query, returns results
7. Agent processes: Generates final answer with results
8. Workflow ends: Final answer returned
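For comparison, the intended flow can be traced with a scripted agent and the legacy routing rule. This is a sketch under assumptions: `scripted_agent` is a hypothetical stand-in for a compliant model, tool calls are modeled as dicts with `name` and `args` keys, the SQL text is left elided as in the source, and the final answer is a placeholder string.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple

END = "__end__"  # stand-in for the graph's END sentinel


@dataclass
class AIMessage:  # minimal stand-in, not the LangChain class
    content: str = ""
    tool_calls: List[dict] = field(default_factory=list)


def should_continue(last: AIMessage) -> str:
    # Legacy rule: tool calls go to the tools node, anything else ends the run.
    return "tools" if last.tool_calls else END


def scripted_agent() -> Iterator[AIMessage]:
    # Hypothetical compliant model: status via tool, then SQL, then the answer.
    yield AIMessage(tool_calls=[{"name": "send_streaming_message",
                                 "args": {"message": "Durchsuche Datenbank nach LEDs..."}}])
    yield AIMessage(tool_calls=[{"name": "sqlite_query",
                                 "args": {"query": "SELECT ..."}}])
    yield AIMessage(content="<final answer built from the query result>")


def run() -> Tuple[List[str], List[str], Optional[str]]:
    trace: List[str] = []
    statuses: List[str] = []
    for reply in scripted_agent():
        trace.append("agent")
        if should_continue(reply) == END:
            trace.append("end")
            return trace, statuses, reply.content
        for call in reply.tool_calls:
            trace.append("tools:" + call["name"])
            if call["name"] == "send_streaming_message":
                statuses.append(call["args"]["message"])
    return trace, statuses, None


trace, statuses, answer = run()
print(trace)
print(statuses, answer)
```

The trace alternates between the agent and the tools node and terminates after three agent turns, because every non-final turn carries a tool call for the router to act on.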
Why Legacy Works: Model Compliance
ChatAnthropic Behavior
- Strong tool calling: When instructed to use a tool, it actually uses it
- Prompt following: Adheres to system prompt instructions
- Tool-first approach: Prefers tool calls over text for structured operations
Evidence from Legacy Logs
Denke nach.. ← Tool call
Durchsuche Datenbank nach LEDs... ← Tool call
Berechne Gesamtlagerbestand... ← Tool call
Formuliere finale Antwort... ← Tool call
Aus der Datenbank habe ich 801... ← Final text answer
Each status update is a tool call, not a text message.
Why Current Fails: Model Non-Compliance
AICenterChatModel Behavior
- Weak tool calling: Doesn't reliably use tools when instructed
- Text-first approach: Generates text messages instead of tool calls
- Prompt ignoring: Doesn't follow "VERBOTEN" instructions
Evidence from Current Logs
Ich werde die Datenbank nach Artikeln durchsuchen... ← Text message (WRONG!)
Skipping status update message... ← Filtered out
Status message detected without tool calls... ← Detected as status
Routing back to agent... ← Causes loop
[Repeats 15 times]
Each status update is a text message, not a tool call, causing the loop.
System Prompt Comparison
Legacy Prompt (Works)
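STREAMING-UPDATES: Du hast Zugriff auf das Tool "send_streaming_message",
mit dem du dem Nutzer kurze Status-Updates senden kannst.
Nutze dieses Tool, um den Nutzer über deine aktuellen Aktivitäten zu informieren.
(English: STREAMING UPDATES: You have access to the tool "send_streaming_message", which lets you send the user short status updates. Use this tool to inform the user about your current activities.)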
Tone: Informative, suggests using the tool.
Current Prompt (Doesn't Work)
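STREAMING-UPDATES - ABSOLUT KRITISCH:
⚠️⚠️⚠️ WICHTIG: Du MUSST das Tool "send_streaming_message" verwenden,
um Status-Updates zu senden. VERBOTEN ist es, normale Text-Nachrichten
für Status-Updates zu schreiben!
VERBOTEN: Text-Nachrichten wie "Ich werde die Datenbank durchsuchen..."
ERLAUBT: Nur das Tool "send_streaming_message" für Status-Updates verwenden!
(English: STREAMING UPDATES - ABSOLUTELY CRITICAL: IMPORTANT: You MUST use the tool "send_streaming_message" to send status updates. Writing normal text messages for status updates is FORBIDDEN! FORBIDDEN: text messages such as "I will search the database..." ALLOWED: use only the tool "send_streaming_message" for status updates!)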
Tone: Aggressive, forbids text messages, but model ignores it anyway.
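Irony: The more explicit prompt did not produce better compliance. Because the model changed along with the prompt, this comparison cannot show that the stricter wording itself made things worse, only that it did not help.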
Functional Differences Summary
| Aspect | Legacy | Current | Impact |
|---|---|---|---|
| Model Tool Calling | ✅ Uses tool | ❌ Generates text | CRITICAL |
| Event Handling | Tool calls only | Tool calls + text detection | Creates complexity |
| Routing Logic | Simple (tool calls → tools, else → end) | Complex (status detection → route back) | Creates loop |
| Message Filtering | None | Aggressive filtering | Hides the problem |
| Prompt Style | Informative | Aggressive/forbidding | Model ignores anyway |
Why This Matters
Legacy Success Factors
- Model compliance: ChatAnthropic follows instructions
- Simple event handling: Only handles what's expected (tool calls)
- No compensation logic: Doesn't try to "fix" model behavior
- Trust-based: Assumes model will use tools correctly
Current Failure Factors
- Model non-compliance: AICenterChatModel doesn't follow instructions
- Complex event handling: Tries to handle both tool calls and text
- Compensation logic: Tries to "fix" model behavior, creates loops
- Distrust-based: Assumes model won't use tools, tries to compensate
The Real Problem
The current implementation is trying to compensate for model non-compliance by:
- Detecting status messages in text
- Converting them to status events
- Routing back to agent to "fix" it
But this creates a feedback loop because:
- Agent generates text status message
- System detects it and routes back
- Agent generates another text status message
- Loop continues
The solution is NOT to add more compensation logic. The solution is to fix the root cause: Make the model actually use the tool.
Recommendations
Short-Term Fix
- Remove status message detection from should_continue() - don't route back
- Remove text-to-status conversion - only handle tool calls
- Let status messages be stored - don't filter them aggressively
- Simplify routing - if no tool calls, end (like legacy)
Long-Term Fix
- Fix model behavior - Ensure AICenterChatModel actually uses tools
- Improve prompt - Test different prompt styles to get tool usage
- Model selection - Use a model that reliably follows tool-calling instructions
- Tool binding - Verify tools are properly bound and available to model
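One way to approach the prompt and model items above is to measure how often a given model and prompt combination actually calls the status tool. The harness below is a rough sketch under assumptions: `tool_call_rate` is a hypothetical helper, `invoke` stands for whatever function sends a system prompt and a question to the model under test, and replies are modeled as dicts with `content` and `tool_calls` keys. The two stub functions only imitate a compliant and a text-only model; they do not call any real model.

```python
from typing import Any, Callable, Dict, List

STATUS_TOOL = "send_streaming_message"
Reply = Dict[str, Any]  # assumed shape: {"content": str, "tool_calls": [{"name": ...}]}


def tool_call_rate(invoke: Callable[[str, str], Reply],
                   system_prompt: str,
                   questions: List[str]) -> float:
    """Fraction of first replies that call the status tool."""
    if not questions:
        return 0.0
    hits = 0
    for question in questions:
        reply = invoke(system_prompt, question)
        names = [call.get("name") for call in reply.get("tool_calls") or []]
        if STATUS_TOOL in names:
            hits += 1
    return hits / len(questions)


def compliant_stub(system_prompt: str, question: str) -> Reply:
    # Imitates a model that uses the tool for its status update.
    return {"content": "",
            "tool_calls": [{"name": STATUS_TOOL, "args": {"message": "..."}}]}


def text_only_stub(system_prompt: str, question: str) -> Reply:
    # Imitates a model that writes the status update as plain text.
    return {"content": "Ich werde die Datenbank nach Artikeln durchsuchen...",
            "tool_calls": []}


questions = ["wie viele leds haben wir auf lager"]
rates = {
    "compliant_stub": tool_call_rate(compliant_stub, "<prompt variant>", questions),
    "text_only_stub": tool_call_rate(text_only_stub, "<prompt variant>", questions),
}
print(rates)
```

Replacing the stubs with real calls and running the same question set across prompt variants and models would give a comparable number per combination, which makes it possible to separate prompt effects from model effects.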
Conclusion
The root functional difference lies in model behavior. The architectural differences (text detection, re-routing, filtering) are compensation built on top of it:
- Legacy: Model uses tools → Simple event handling → Works
- Current: Model doesn't use tools → Complex compensation → Breaks
The current implementation is over-engineered to compensate for model non-compliance, but this compensation creates more problems than it solves.
The fix is simple: Make the model use the tool (like legacy), then simplify the event handling to match legacy's simplicity.