Tracing

Distributed tracing tracks requests as they flow through a distributed system (in this case: a Rasa assistant), sending data about the requests to a tracing backend which collects all trace data and enables inspecting it. Trace data helps you understand the flow of requests through both the components of a single service (Rasa itself), and across different distributed services, for example, your action server.

Supported Tracing Backends/Collectors

To trace requests in Rasa Pro, you can either use Jaeger as a backend, or use the OTEL Collector (OpenTelemetry Collector) to collect traces and then send them to the backend of your choice. See Configuring a Tracing Backend or Collector for instructions.

Rasa Channels

When a request sent via the REST channel carries trace context that follows the W3C Trace Context Specification, Rasa Pro uses that context to continue the trace.
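As a rough illustration of what such trace context looks like, the sketch below builds a version-00 `traceparent` header value in the format defined by the W3C Trace Context Specification and attaches it to the headers of a REST channel request. The helper name `make_traceparent`, the URL, and the payload values are placeholders chosen for this example, not part of Rasa Pro. In practice an OpenTelemetry-instrumented client would normally generate this header for you.

```python
import re
import secrets

# Shape of a version-00 `traceparent` value per the W3C Trace Context spec.
TRACEPARENT_RE = re.compile(r"^00-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$")


def make_traceparent(sampled: bool = True) -> str:
    """Build a `traceparent` header value that starts a new trace."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    parent_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    flags = "01" if sampled else "00"  # lowest bit marks the trace as sampled
    return f"00-{trace_id}-{parent_id}-{flags}"


# Illustrative request parts; host, port, sender and message are placeholders.
url = "http://localhost:5005/webhooks/rest/webhook"
headers = {"Content-Type": "application/json", "traceparent": make_traceparent()}
payload = {"sender": "user-123", "message": "hello"}
```

Sending `payload` to `url` with these `headers` (using any HTTP client) would let the tracing backend attach the Rasa Pro spans to the caller's trace.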

Rasa Inspector

If you have enabled tracing in Rasa Pro and are using the Rasa Inspector debugging tool to try your assistant, note that in addition to the expected tracing span for the Agent.handle_message method call, the tracing backend will collect independent tracing spans for the MessageProcessor.get_tracker method calls. This is expected behaviour because the Rasa Inspector tool uses the Rasa HTTP API endpoints to retrieve the conversation tracker which is required by the Inspector interface.

Action Server

The trace context from Rasa Pro is sent along with requests to the custom action server using the W3C Trace Context Specification and then used to continue tracing the request through the custom action server.

Tracing is continued in the action server by instrumenting the webhook that receives custom actions. See Action server attributes for the attributes captured as part of the trace context.

See traced events for details on what attributes are made available as part of the trace context in Rasa Pro.
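To make "continuing" a trace more concrete, here is a minimal sketch of how a receiving service could read an incoming `traceparent` value and derive the value it would pass further downstream: the trace id is kept, and a new span id takes the place of the parent id. This is only an illustration of the W3C header format, not the code Rasa Pro or the action server runs. The names `TraceContext`, `parse_traceparent` and `child_traceparent` are made up for this example, and the parser only accepts version `00`.

```python
import secrets
from typing import NamedTuple, Optional

HEX_DIGITS = set("0123456789abcdef")


class TraceContext(NamedTuple):
    trace_id: str
    parent_id: str
    sampled: bool


def _is_hex(value: str, length: int) -> bool:
    """True if `value` is exactly `length` lowercase hex characters."""
    return len(value) == length and all(ch in HEX_DIGITS for ch in value)


def parse_traceparent(value: str) -> Optional[TraceContext]:
    """Parse a version-00 `traceparent` value; return None if it is invalid."""
    parts = value.strip().split("-")
    if len(parts) != 4:
        return None
    version, trace_id, parent_id, flags = parts
    if version != "00":
        return None
    if not (_is_hex(trace_id, 32) and _is_hex(parent_id, 16) and _is_hex(flags, 2)):
        return None
    # All-zero ids are invalid according to the specification.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return TraceContext(trace_id, parent_id, bool(int(flags, 16) & 0x01))


def child_traceparent(ctx: TraceContext) -> str:
    """Keep the trace id, but use a fresh span id as the new parent id."""
    flags = "01" if ctx.sampled else "00"
    return f"00-{ctx.trace_id}-{secrets.token_hex(8)}-{flags}"


incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
context = parse_traceparent(incoming)
outgoing = child_traceparent(context)
```

Because `outgoing` shares the trace id of `incoming`, a tracing backend can display the spans of both services as one trace.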

Questions Tracing Can Help Answer

Tracing can help troubleshoot issues in development and production, by answering questions such as:

How does a user message request get processed across different components, i.e. dialogue understanding components (NLU, CommandGenerator, CommandProcessorComponent), policies, and the action server?
Why has my Rasa assistant decided to execute a certain action?
Why has my Rasa assistant been slow to respond?
Why have my custom actions been slow to execute?
What is my OpenAI prompt token usage?
What is the performance of my Rasa assistant across different flows?
What is the performance of my Rasa assistant across different LLM models?
What is the performance of my Rasa assistant across different vector stores?

Configuring a Tracing Backend or Collector

To configure a tracing backend or collector, add a tracing entry to your endpoints i.e. in your endpoints.yml file, or in the relevant section of your Helm values in a deployment.

Jaeger

To configure a Jaeger tracing backend, specify the type as jaeger.

tracing:
  type: jaeger
  host: localhost
  port: 6831
  service_name: rasa
  sync_export: ~

Tip

If you come across the error "OSError: [Errno 40] Message too long", read the instructions here to resolve it.

OTEL Collector

Collectors are components that collect traces in a vendor-agnostic way and then forward them to various backends. For example, the OpenTelemetry Collector (OTEL) can collect traces from multiple different components and instrumentation libraries, and then export them to multiple different backends, e.g. Jaeger.

To configure an OTEL Collector, specify the type as otlp.

tracing:
  type: otlp
  endpoint: my-otlp-host:4318
  insecure: false
  service_name: rasa
  root_certificates: ./tests/unit/tracing/fixtures/ca.pem

Traced Events

The Rasa service areas that are traceable cover the actions required to:

train a model (i.e., the training of each graph component)
handle a message

Model Training

Tracing is enabled for model training by instrumenting Rasa GraphTrainer and GraphNode classes.

`GraphTrainer` Attributes

The following attributes can be inspected during training of GraphTrainer:

training_type of model configuration:
- "NLU"
- "CORE"
- "BOTH"
- "END-TO-END"
language of model configuration
recipe_name used in the config.yml file
output_filename: the location where the packaged model is saved
is_finetuning: boolean argument, if True enables incremental training

`GraphNode` Attributes

The following attributes are captured during the training (as well as prediction during message handling) of every graph node:

node_name
component_class
fn_name: method of component class that gets called

Message Handling

The following Rasa classes are instrumented to enable tracing during message handling:

Agent
MessageProcessor
TrackerStore
LockStore
LLMCommandGenerator
NLUCommandAdapter
FlowPolicy
IntentlessPolicy
EnterpriseSearchPolicy
InformationRetrieval
EndpointConfig

In addition, the following Python modules are instrumented to enable tracing during message handling:

command processor module, i.e. utility functions leveraged by the CommandProcessorComponent to pre-process predicted commands
flow executor module, i.e. utility functions leveraged by FlowPolicy to advance flows

Namely, these operations are now traceable:

receiving a message
parsing the message
predicting commands
pre-processing commands
predicting the next action
running the action
advancing flows
searching documents in vector stores for enterprise search
generating LLM answers by policies e.g. IntentlessPolicy and EnterpriseSearchPolicy
tracing prompt token usage
retrieving and saving the tracker
locking the conversation
publishing to the event broker
making requests to the action server or nlg server
passing the trace context to the action server

Tracing prompt token usage

New in 3.8

Tracing prompt token usage for OpenAI models is available starting with version 3.8.0.

Tracing prompt token usage is available for the following classes if you're using OpenAI models:

LLMCommandGenerator class
IntentlessPolicy class
EnterpriseSearchPolicy class
ContextualResponseRephraser class

The prompt token usage is captured as part of the trace context and can be used to monitor the usage of prompt tokens in the LLM answer generation process. This is only captured if one of the instrumented classes mentioned above is configured to enable capturing the length of the prompt tokens. For example, the LLMCommandGenerator can be configured to trace the length of the prompt tokens by setting the trace_prompt_tokens attribute to true in the config.yml file:

pipeline:
  - name: LLMCommandGenerator
    trace_prompt_tokens: true

It is highly recommended to enable tracing of prompt tokens only in development and not in production, because it could increase assistant response latency.

`Agent` Attributes

Tracing the Agent instance handling a message captures the following attributes:

input_channel: the name of the channel connector
sender_id: the conversation id
model_id: a unique identifier for the model
model_name: the model name

`MessageProcessor` Attributes

The following MessageProcessor attributes are extracted during the tracing:

number_of_events: number of events in tracker
action_name: the name of the predicted and executed action
sender_id: the conversation id of the DialogueStateTracker object
message_id: the unique message id

The latter three attributes are also injected in the trace context that gets passed to the requests made to the custom action server.

`TrackerStore` & `LockStore` Attributes

Observable TrackerStore and LockStore attributes include:

number_of_streamed_events: number of new events to stream
broker_class: the EventBroker on which the new events are published
lock_store_class: name of the lock store used to lock conversations while messages are actively processed

`LLMCommandGenerator` Attributes

New in 3.8

Tracing the described LLMCommandGenerator attributes is available starting with version 3.8.0.

The following attributes are captured as part of the trace context of the LLMCommandGenerator:

class_name: the name of the instrumented component class
llm_model: the name of the LLM used
llm_type: the type of LLM used
embeddings: the embeddings used
llm_temperature: the temperature used for LLM answer generation
request_timeout: the timeout for the LLM request
llm_engine: the engine used for LLM answer generation
len_prompt_tokens: the token length of the prompt (optional, only supported for OpenAI models). To enable this attribute, see instructions in the Tracing prompt token usage section.
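If `len_prompt_tokens` is enabled, one way to answer the question of prompt token usage is to sum that attribute per `class_name` over the exported spans. The sketch below assumes the spans are available as dictionaries with an `attributes` mapping; that layout, the function name `prompt_tokens_by_class`, and the token counts are assumptions made for this example, so adapt them to whatever your tracing backend exports.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Hypothetical exported spans; the token counts are made-up sample data.
SPANS: List[Dict[str, Any]] = [
    {"attributes": {"class_name": "LLMCommandGenerator", "len_prompt_tokens": 812}},
    {"attributes": {"class_name": "LLMCommandGenerator", "len_prompt_tokens": 790}},
    {"attributes": {"class_name": "EnterpriseSearchPolicy", "len_prompt_tokens": 1530}},
    {"attributes": {"class_name": "FlowPolicy"}},  # no prompt tokens recorded
]


def prompt_tokens_by_class(spans: List[Dict[str, Any]]) -> Dict[str, int]:
    """Sum `len_prompt_tokens` per `class_name`, skipping spans without it."""
    totals: Dict[str, int] = defaultdict(int)
    for span in spans:
        attributes = span.get("attributes", {})
        tokens = attributes.get("len_prompt_tokens")
        if tokens is None:
            continue  # attribute is optional and may not be enabled
        totals[attributes.get("class_name", "unknown")] += int(tokens)
    return dict(totals)


usage = prompt_tokens_by_class(SPANS)
```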

`NLUCommandAdapter` Attributes

New in 3.8

Tracing the described NLUCommandAdapter attributes is available starting with version 3.8.0.

The following attributes are captured as part of the trace context of the NLUCommandAdapter:

commands: the predicted commands
intent: the predicted intent of the user message that the NLUCommandAdapter receives as input

Command Processor Module Attributes

New in 3.8

Tracing the described command processor module attributes is available starting with version 3.8.0.

The following attributes are captured as part of the trace context of the command processor module functions:

execute_commands function:
- number_of_events: the number of events in the tracker
- sender_id: the conversation id of the DialogueStateTracker object
validate_state_of_commands function:
- cleaned_up_commands: list of cleaned up commands
clean_up_commands function:
- commands: list of originally parsed commands from the LLM answer
- current_context: the current context of the dialogue stack
remove_duplicated_set_slots function:
- resulting_events: list of events prior to removing duplicated set slot events; note that slot values are removed to prevent PII leakage

Flow Executor Module Attributes

New in 3.8

Tracing the described flow executor module attributes is available starting with version 3.8.0.

The following attributes are captured as part of the trace context of the flow executor module functions:

advance_flow function:
- available_actions: list of available actions
- current_context: the current context of the dialogue stack
advance_flows_until_next_action function:
- action_name: the name of the action to be executed
- score: the score of the executed action
- metadata: the prediction metadata
- events: list of event names if available
run_step function:
- step_custom_id: the custom id of the step if available
- step_description: the description of the step if available
- current_flow_id: the id of the current flow
- current_context: the current context of the dialogue stack

`Policy` subclasses attributes

New in 3.8

Tracing the described Policy subclasses' attributes is available starting with version 3.8.0.

The following attributes are captured as part of the trace context of subclasses of the Policy interface, e.g. FlowPolicy, IntentlessPolicy, EnterpriseSearchPolicy:

priority: the priority of the policy which made the prediction
events: a list of event names which are applied independent of whether the policy wins against other policies or not
optional_events: a list of optional event names if available else None - these events are applied if the policy wins against other policies
is_end_to_end_prediction: a boolean indicating if the prediction used the text of the user message instead of the intent
is_no_user_prediction: a boolean indicating if the prediction uses neither the text of the user message nor the intent
diagnostic_data: intermediate results or other information that is not necessary for Rasa to function, but intended for debugging and fine-tuning purposes
action_metadata: additional metadata that can be passed by policies

`IntentlessPolicy` Attributes

New in 3.8

Tracing the described IntentlessPolicy attributes is available starting with version 3.8.0.

Depending on the instrumented policy method, the following attributes are captured as part of the trace context of the IntentlessPolicy:

current_context: the context of the top dialogue stack frame, received as input by the IntentlessPolicy.find_closest_response method
ai_response_examples: the sample responses that fit the current conversation, returned by the IntentlessPolicy.select_response_examples method
conversation_samples: the conversation samples returned by the IntentlessPolicy.select_few_shot_conversations method
ai_responses: the AI responses extracted from the conversation samples by the IntentlessPolicy.extract_ai_responses method
llm_response: the response generated by the LLM model call, returned by the IntentlessPolicy.generate_answer method
action_name: the name of the action to be executed, received as input by the IntentlessPolicy._prediction_result method
score: the score of the executed action, received as input by the IntentlessPolicy._prediction_result method

In addition, the IntentlessPolicy._generate_llm_answer captures the same attributes as the LLMCommandGenerator class.

`EnterpriseSearchPolicy` Attributes

New in 3.8

Tracing the described EnterpriseSearchPolicy attributes is available starting with version 3.8.0.

The EnterpriseSearchPolicy._generate_llm_answer method captures the same attributes as the LLMCommandGenerator class.

`InformationRetrieval` Attributes

New in 3.8

Tracing the described InformationRetrieval subclasses' attributes is available starting with version 3.8.0.

The following attributes are captured as part of the trace context of the InformationRetrieval subclasses, e.g. Milvus_Store, Qdrant_Store:

query: the query used to search the vector store
document_metadata: the metadata of the documents retrieved from the vector store

`EndpointConfig` Attributes

New in 3.8

Tracing the described EndpointConfig attributes is available starting with version 3.8.0.

The following attributes are captured as part of the trace context of the EndpointConfig:

url: the url of the endpoint
request_body_size_in_bytes: the size of the request body in bytes

Tracing in the Action Server

API requests are traced as they flow through the action server by instrumenting the webhook that receives custom actions and other classes involved in the execution of custom actions.

New in 3.8

Additional classes are now instrumented to improve tracing in the action server.

The following classes are instrumented:

ValidationAction: the base class for custom actions extracting and validating slots that can be set or updated outside a form context.
FormValidationAction: the base class for custom actions extracting and validating slots that are set only within the context of a form.
ActionExecutor: the class that executes the custom actions.

Webhook Attributes

The following attributes are captured as part of the trace context of the webhook that receives custom actions:

http.method: the http method used to make the request
http.route: the endpoint of the request
next_action: the name of the next action to be executed
version: the rasa version used
sender_id: the id of the conversation
message_id: the unique message id

Action Executor Attributes

The following attributes are captured as part of the trace context of the action executor:

action_name: the name of the action to be executed
sender_id: the id of the conversation
events: a list of returned events
slots: a list of filled slots by the executed custom action
utters: a list of executed utterances

Slot Validation Action Attributes

The following attributes are captured as part of the trace context of Slot Validation Actions:

class_name: the name of the instrumented component class
action_name: the name of the action to be executed
sender_id: the id of the conversation
events: a list of returned events
slots: a list of filled slots by the executed custom action
utters: a list of executed utterances
message_count: the number of messages
slots_to_validate: a list of recently filled slots to validate

Debugging custom actions performance

New in 3.8

You can now continue tracing the request further along your custom actions code.

It is now possible to debug the performance of your custom actions by tracing specific parts of your custom actions code. This can be achieved by creating spans to trace the execution of these parts.

In order to create more spans, you can retrieve the tracer object from the ActionExecutorTracerRegister component.

from rasa_sdk.tracing.tracer_register import ActionExecutorTracerRegister

tracer = ActionExecutorTracerRegister().get_tracer()

To create a span that traces a specific part of your custom actions code, as described in the OTEL documentation, you can embed the following code snippet:

with tracer.start_as_current_span("span_name") as span:
  # your code here
  span.set_attribute("attribute_name", "attribute_value")

For example, a complete custom action that implements a custom span is shown below:

from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.events import SlotSet
from rasa_sdk.executor import CollectingDispatcher
from rasa_sdk.tracing.tracer_register import ActionExecutorTracerRegister


# Retrieve the tracer registered by the action server.
tracer = ActionExecutorTracerRegister().get_tracer()


class ActionCheckSufficientFunds(Action):
  def name(self) -> Text:
    return "action_check_sufficient_funds"

  def run(
    self,
    dispatcher: CollectingDispatcher,
    tracker: Tracker,
    domain: Dict[Text, Any]
  ) -> List[Dict[Text, Any]]:
    # Bind the span to a name so that attributes can be set on it below.
    with tracer.start_as_current_span("span_name") as span:
      balance = 1000  # hardcoded balance for tutorial purposes
      transfer_amount = tracker.get_slot("amount")
      has_sufficient_funds = transfer_amount <= balance

      # set trace attributes
      span.set_attribute("has_sufficient_funds", has_sufficient_funds)

      return [SlotSet("has_sufficient_funds", has_sufficient_funds)]
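One practical concern with custom spans is what happens when no tracing backend is configured. The sketch below works on the assumption that `get_tracer()` may return `None` in that case, which is worth verifying against your rasa_sdk version, and falls back to a no-op context manager so the business logic still runs. The helper `span_or_noop`, the function `check_funds`, and the `RecordingTracer` stand-in are all hypothetical names for this example. The stand-in exists only so the snippet runs without rasa_sdk or a tracing backend installed.

```python
from contextlib import contextmanager, nullcontext
from typing import Any, ContextManager, Dict, Iterator, Optional


def span_or_noop(tracer: Optional[Any], name: str) -> ContextManager[Any]:
    """Start a span if a tracer is available, otherwise do nothing."""
    if tracer is None:
        return nullcontext()  # yields None instead of a span
    return tracer.start_as_current_span(name)


def check_funds(tracer: Optional[Any], transfer_amount: float,
                balance: float = 1000.0) -> bool:
    """Business logic that is traced only when a tracer is present."""
    with span_or_noop(tracer, "check_sufficient_funds") as span:
        has_sufficient_funds = transfer_amount <= balance
        if span is not None:
            span.set_attribute("has_sufficient_funds", has_sufficient_funds)
        return has_sufficient_funds


class RecordingSpan:
    """Stand-in span that only remembers the attributes set on it."""

    def __init__(self) -> None:
        self.attributes: Dict[str, Any] = {}

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


class RecordingTracer:
    """Stand-in tracer mimicking the `start_as_current_span` context manager."""

    def __init__(self) -> None:
        self.spans: Dict[str, RecordingSpan] = {}

    @contextmanager
    def start_as_current_span(self, name: str) -> Iterator[RecordingSpan]:
        span = RecordingSpan()
        self.spans[name] = span
        yield span


demo_tracer = RecordingTracer()
traced_result = check_funds(demo_tracer, 1500.0)
untraced_result = check_funds(None, 500.0)
```

In a real custom action you would pass the tracer obtained from `ActionExecutorTracerRegister().get_tracer()` instead of the stand-in.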

Tracing in the action server is enabled and disabled in the same way as in Rasa Pro, as described in the Enabling / Disabling section below. The action server also supports the same Tracing Backends/Collectors listed above. See Configuring a Tracing Backend or Collector for further instructions.

Enabling / Disabling

Tracing is automatically enabled in Rasa Pro by configuring a supported tracing backend. No further action is required to enable tracing.

You can disable tracing by leaving the tracing: configuration key empty in your endpoints file.
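The rule above can be summarised in a few lines of code. The sketch below takes the content of an endpoints file that has already been parsed into a dictionary (for example with a YAML library) and reports which tracing backend type is configured, or `None` when the `tracing` key is missing or empty. It illustrates the described behaviour only and is not how Rasa Pro itself reads the file; the function name `configured_tracing_type` is made up for this example.

```python
from typing import Any, Mapping, Optional

# Backend types described in this document.
SUPPORTED_TYPES = {"jaeger", "otlp"}


def configured_tracing_type(endpoints: Mapping[str, Any]) -> Optional[str]:
    """Return the configured tracing type, or None if tracing is disabled."""
    tracing = endpoints.get("tracing")
    if not tracing:
        # A missing or empty `tracing:` key means tracing is disabled.
        return None
    backend = tracing.get("type")
    if backend not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported tracing type: {backend!r}")
    return backend


# An empty `tracing:` key in YAML is parsed as None.
disabled = configured_tracing_type({"tracing": None})
enabled = configured_tracing_type(
    {"tracing": {"type": "jaeger", "host": "localhost", "port": 6831}}
)
```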
