Skip to content

Conversation

@Pouyanpi
Copy link
Collaborator

PR Description

Removes LangChain callback dependencies from StreamingHandler. (one callback down)

Key Changes:

  • StreamingHandler no longer inherits from AsyncCallbackHandler
  • Streaming now uses llm.astream() with direct push_chunk() calls
  • Removed LangChain-specific type handling (GenerationChunk, AIMessageChunk, etc.)
  • Added explicit streaming_handler parameter to llm_call()
  • Simplified streaming interface (string-only chunks)

@Pouyanpi Pouyanpi force-pushed the refactor/drop-streaming-callback branch from 040ba3c to a11a88f Compare December 16, 2025 14:49
…amingHandler

Refactored StreamingHandle by removing dependencies on LangChain
callback interfaces (AsyncCallbackHandler, LLMResult, etc.).

- Remove AsyncCallbackHandler inheritance from StreamingHandler
- Replace callback-based streaming with direct push_chunk() interface
- Add streaming_handler parameter to llm_call() for explicit streaming
- Update llm_call to use llm.astream() instead of callbacks
- Simplify push_chunk() to accept only strings (remove LangChain type
conversions)
- Remove on_chat_model_start, on_llm_new_token, on_llm_end callback
methods
- Update tests to use push_chunk() directly instead of mocking callbacks
@Pouyanpi Pouyanpi force-pushed the refactor/drop-streaming-callback branch from a11a88f to bb7f0a3 Compare December 16, 2025 14:52
@Pouyanpi Pouyanpi changed the title refactor(streaming): remove LangChain callback dependencies from Stre… refactor(streaming): remove LangChain callback dependencies from StreamingHandler Dec 16, 2025
@Pouyanpi Pouyanpi marked this pull request as draft December 16, 2025 14:53
@codecov
Copy link

codecov bot commented Dec 16, 2025

Codecov Report

❌ Patch coverage is 94.28571% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
nemoguardrails/actions/llm/utils.py 93.10% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Dec 16, 2025

Greptile Overview

Greptile Summary

This PR refactors the streaming functionality in NeMo Guardrails by removing LangChain callback dependencies from the StreamingHandler class. The key architectural change moves away from LangChain's AsyncCallbackHandler pattern toward a more direct streaming approach. The StreamingHandler no longer inherits from AsyncCallbackHandler and instead accepts only string chunks through direct push_chunk() calls. Streaming now uses llm.astream() calls with explicit streaming_handler parameters passed to llm_call() functions throughout the codebase, replacing the previous custom_callback_handlers pattern. This refactoring simplifies the streaming interface, reduces external dependencies, and makes the system more provider-agnostic while maintaining all existing streaming functionality.

Important Files Changed

Filename Score Overview
nemoguardrails/streaming.py 4/5 Removed AsyncCallbackHandler inheritance and LangChain-specific chunk types, simplified to string-only chunks
nemoguardrails/actions/llm/utils.py 4/5 Added streaming_handler parameter to llm_call() and implemented new _stream_llm_call() function
nemoguardrails/actions/llm/generation.py 4/5 Updated multiple llm_call() invocations to use streaming_handler parameter instead of custom_callback_handlers
nemoguardrails/actions/v2_x/generation.py 4/5 Replaced custom_callback_handlers with streaming_handler parameter in passthrough LLM action
tests/test_streaming_handler.py 4/5 Refactored tests to remove LangChain callback testing and use direct push_chunk() calls
tests/utils.py 5/5 Added _astream method to FakeLLM class to support new streaming architecture in tests
tests/runnable_rails/test_streaming.py 4/5 Modified StreamingFakeLLM to yield GenerationChunk objects with proper error handling

Confidence score: 4/5

  • This PR requires careful review due to architectural changes in critical streaming functionality
  • Score reflects significant refactoring of streaming implementation across multiple core files, though changes appear well-structured and consistent
  • Pay close attention to the streaming handler implementation and LLM call patterns across action files

Sequence Diagram

sequenceDiagram
  participant User
  participant RunnableRails
  participant StreamingHandler as "StreamingHandler"
  participant LLM
  participant llm_call as "llm_call"

  User->>RunnableRails: stream(input)
  RunnableRails->>RunnableRails: _prepare_streaming()
  RunnableRails->>StreamingHandler: new StreamingHandler()
  RunnableRails->>RunnableRails: streaming_handler_var.set()
  RunnableRails->>RunnableRails: generate_async()
  RunnableRails->>llm_call: llm_call(streaming_handler=handler)
  llm_call->>LLM: astream(prompt)
  
  loop For each chunk
    LLM-->>llm_call: yield chunk
    llm_call->>StreamingHandler: push_chunk(chunk.content)
    StreamingHandler->>StreamingHandler: _process(chunk)
    StreamingHandler->>StreamingHandler: queue.put(chunk)
  end
  
  llm_call->>StreamingHandler: push_chunk(END_OF_STREAM)
  StreamingHandler->>StreamingHandler: streaming_finished_event.set()
  llm_call-->>RunnableRails: return completion_text
  
  loop Stream to user
    RunnableRails->>StreamingHandler: __anext__()
    StreamingHandler->>StreamingHandler: queue.get()
    StreamingHandler-->>RunnableRails: chunk
    RunnableRails-->>User: yield chunk
  end
Loading

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

7 files reviewed, 3 comments

Edit Code Review Agent Settings | Greptile

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants