Test conversational agents using Direct Line

The Direct Line API is the communication interface that client applications use to interact with conversational agents built with Copilot Studio. It transmits messages between the client application and the agent over either WebSocket streams or HTTP requests. For performance testing, Direct Line enables load testing tools to replicate actual user behavior, generate load, and measure response times.

Communicate with Direct Line using WebSockets

Conversational agents built with Copilot Studio are deployed to web applications either as embedded iframes or by using a custom canvas. Both deployment options use WebSocket communication with Direct Line. If your conversational agent is deployed to an app using one of these methods, your performance test script should use WebSocket communication to generate load that resembles real user behavior, and measure performance with a high degree of confidence.

Client applications that use Direct Line and WebSocket communication should follow this flow:

  1. To initiate a conversation, a client application must first obtain a conversation token. If your agent is configured with a Direct Line secret, obtain a token by calling the Direct Line regional endpoint. Tokens for agents not using secrets can be obtained from the token endpoint.
  2. The client application starts a conversation by using the token, and receives a Conversation ID and a WebSocket stream URL.
  3. User messages are sent by sending an HTTP POST request with the Conversation ID.
  4. Messages from the conversational agent are received over the WebSocket stream.
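The flow above can be sketched as follows. This is a minimal illustration using only Python's standard library, assuming the documented Direct Line v3 endpoints; the token is a placeholder obtained as described in step 1, and a real load test script would use its tool's own HTTP client instead.

```python
# Minimal sketch of the conversation setup flow (steps 2-3), assuming the
# documented Direct Line v3 endpoints. The token passed in is a placeholder
# for the conversation token obtained in step 1.
import json
import urllib.request

BASE = "https://directline.botframework.com/v3/directline"

def conversations_url() -> str:
    """Endpoint used to start a conversation (step 2)."""
    return f"{BASE}/conversations"

def activities_url(conversation_id: str) -> str:
    """Endpoint used to POST user messages (step 3)."""
    return f"{BASE}/conversations/{conversation_id}/activities"

def start_conversation(token: str) -> dict:
    """POST to start a conversation; the response carries the conversation ID
    and the WebSocket stream URL (conversationId and streamUrl fields)."""
    req = urllib.request.Request(
        conversations_url(),
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def send_message(token: str, conversation_id: str, text: str) -> dict:
    """POST a user message; the response only confirms the post (step 3).
    The user ID here is a hypothetical value for illustration."""
    body = json.dumps({"type": "message", "from": {"id": "load-test-user"}, "text": text})
    req = urllib.request.Request(
        activities_url(conversation_id),
        data=body.encode(),
        method="POST",
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The agent's replies (step 4) are then read from the WebSocket stream at the returned streamUrl, as described in the following sections.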

Screenshot showing the flow of Direct Line communication using WebSockets.

Communicate with Direct Line using HTTP GET

If your load testing tool can't use WebSocket communication, or if your client-facing application doesn't use WebSocket communication, you can receive activities by sending HTTP GET instead. As shown in the following diagram, the conversation initiation flow doesn't change.

Screenshot showing the flow of Direct Line communication using HTTP GET.

Measure response times

To assess how load affects user experience, make sure performance test scripts track and report response time for the following steps:

Step                                 Impact on user experience
Generate Token                       The time it takes to obtain a conversation token
Start Conversation                   The time it takes to initiate a new conversation
Send Activity                        The time it takes to send a new user message (doesn't include the agent's response)
Receive Activities/Get Activities    The time it takes for an agent to respond

Tracking response times for Generate Token, Start Conversation, and Send Activity is straightforward for load testing tools, because these steps use standard HTTP requests. However, measuring the time it takes an agent to respond to user messages is more complex, for the following reasons:

  • Sending and receiving activities over Direct Line follows an asynchronous pattern. When a user message is sent using a Send Activity request, the response isn't a message from the agent. Instead, it merely confirms that the user message is successfully posted.

  • Based on its design, a conversational agent might send any number of messages back in response to a user message. Therefore, in most cases, you should measure the time it takes an agent to respond as the time that passes between a user message and the last agent message. In the following example, a single user message triggers three agent messages, with API calls running in between. Each message takes about two seconds to come back; however, from a user's perspective, it takes the agent six seconds to respond to the user's request.

    Screenshot showing the response time between messages.

Identify the agent's last response

To measure the time it takes an agent to complete its responses, your performance testing script needs to:

  • Identify the last agent message that follows a user message
  • Calculate the time difference between the two

The underlying protocol that Copilot Studio uses doesn't have a concept of a 'last response,' as both agents and users can send messages at any given time. Therefore, your performance testing script needs to assume that if the agent doesn't send a message within a given timeframe, no further messages will be sent until the next user message is sent. The implementation of this logic varies based on how your script communicates with Direct Line.

Use WebSockets

When communicating with Direct Line over WebSockets, assume that the agent sends no more messages when no more frames can be read from the WebSocket. This condition typically surfaces as a timeout when attempting to read the next frame, though the exact behavior depends on your WebSocket client. If your load testing tool can't detect this condition reliably, consider using HTTP GET polling instead.
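This timeout-based rule can be sketched independently of any particular WebSocket library. In the sketch below, `read_frame` is a hypothetical stand-in for your WebSocket client's receive call and is assumed to raise `TimeoutError` when no frame arrives within the configured timeout:

```python
# Illustrative sketch of the "quiet period" rule: keep reading frames until a
# read times out, then treat the most recently received message as the agent's
# last response for this turn. `read_frame` is a hypothetical stand-in for a
# WebSocket client's receive call with a read timeout.
from typing import Callable, List

def drain_agent_messages(read_frame: Callable[[], str]) -> List[str]:
    """Collect frames until a read times out; the last element is the
    agent's final message for this turn."""
    frames: List[str] = []
    while True:
        try:
            frames.append(read_frame())
        except TimeoutError:
            # No frame within the timeout: assume the agent finished responding.
            return frames
```

In a real script, the read timeout wrapped by `read_frame` should be tuned to the slowest step your agent performs, such as a backend API call.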

Use HTTP GET

Performance testing scripts that use HTTP GET instead of WebSockets should poll the Activities endpoint to get the entire set of user and agent messages. When polling, make sure to provide sufficient time for your agent to respond. For example, if your agent needs to call a backend API to respond to a user query, and the API takes up to 5 seconds to respond, your script shouldn't poll the Activities endpoint until 5 seconds have elapsed.
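A polling loop along these lines might look as follows. This sketch assumes the documented watermark query parameter on the Activities endpoint, which makes each poll return only activities newer than the last one seen; `fetch` is a hypothetical callable that performs the HTTP GET and returns the decoded JSON response.

```python
# Sketch of HTTP GET polling with a watermark, assuming the documented
# Direct Line v3 Activities endpoint. `fetch` is a hypothetical callable:
# fetch(url) -> {"activities": [...], "watermark": "..."}.
import time
from typing import Callable, Optional

BASE = "https://directline.botframework.com/v3/directline"

def poll_url(conversation_id: str, watermark: Optional[str]) -> str:
    """Build the GET Activities URL, carrying the watermark forward."""
    url = f"{BASE}/conversations/{conversation_id}/activities"
    return f"{url}?watermark={watermark}" if watermark else url

def poll_activities(fetch: Callable[[str], dict], conversation_id: str,
                    wait_s: float = 5.0, max_polls: int = 6) -> list:
    """Poll until a quiet poll (no new activities after at least one reply)
    or until max_polls is reached."""
    watermark: Optional[str] = None
    collected: list = []
    for _ in range(max_polls):
        time.sleep(wait_s)  # give the agent time to respond before polling
        page = fetch(poll_url(conversation_id, watermark))
        new = page.get("activities", [])
        collected.extend(new)
        watermark = page.get("watermark", watermark)
        if collected and not new:
            # A quiet poll after at least one reply: assume the agent is done.
            break
    return collected
```

The 5-second default mirrors the backend-API example above; tune both the wait and the poll count to your agent's slowest expected response.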

The following simplified payload represents the response coming back from the Activities endpoint:

[
  {
    "type": "message",
    "id": "98SryQaHr2rGthOGpChPK2-us|0000012",
    "timestamp": "2025-01-07T09:12:22.0329242Z",
    "from": {
      "id": "a688eb7d-092a-42a8-8ef5-73123b9c2aaa",
      "name": ""
    },
    "conversation": {
      "id": "98SryQaHr2rGthOGpChPK2-us"
    },
    "text": "I also want to set up a new account",
  },
  {
    "type": "message",
    "id": "98SryQaHr2rGthOGpChPK2-us|0000017",
    "timestamp": "2025-01-07T09:12:24.5478686Z",
    "from": {
      "id": "4b56bfa5-5574-5bb3-7aa3-99b8798b9d90",
      "name": "Load Testing",
      "role": "bot"
    },
    "conversation": {
      "id": "98SryQaHr2rGthOGpChPK2-us"
    },
    "text": "Sure, please bear with me as I set up your new account",
    "replyToId": "98SryQaHr2rGthOGpChPK2-us|0000012",
  },
  {
    "type": "message",
    "id": "98SryQaHr2rGthOGpChPK2-us|0000018",
    "timestamp": "2025-01-07T09:12:33.1960413Z",
    "from": {
      "id": "4b56bfa5-5574-5bb3-7aa3-99b8798b9d90",
      "name": "Load Testing",
      "role": "bot"
    },
    "conversation": {
      "id": "98SryQaHr2rGthOGpChPK2-us"
    },
    "text": "Almost done! Thank you for your patience",
    "replyToId": "98SryQaHr2rGthOGpChPK2-us|0000012",
  },
  {
    "type": "message",
    "id": "98SryQaHr2rGthOGpChPK2-us|0000019",
    "timestamp": "2025-01-07T09:12:41.9166159Z",
    "from": {
      "id": "4b56bfa5-5574-5bb3-7aa3-99b8798b9d90",
      "name": "Load Testing",
      "role": "bot"
    },
    "conversation": {
      "id": "98SryQaHr2rGthOGpChPK2-us"
    },
    "text": "All done! Your new account is now active.",
    "inputHint": "acceptingInput",
    "replyToId": "98SryQaHr2rGthOGpChPK2-us|0000012"
  }
]

When you parse the payload and calculate response times, consider the following guidelines:

  • Messages from the agent have the property role: bot; messages from the user have no role property.
  • Agent messages sent in response to a user message have a replyToId property whose value is the id of that user message.
  • Calculate the agent response time as the time difference between the user message and the last agent message that replies to it.
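Putting these guidelines together, the response-time calculation can be sketched as a small parsing function. This assumes the timestamp format shown in the sample payload (seven fractional digits with a trailing 'Z'), which is truncated to microseconds so Python's datetime parser accepts it.

```python
# Compute the perceived agent response time from an Activities payload:
# the gap between the user message and the last bot message that replies
# to it, following the guidelines above.
from datetime import datetime

def _parse_ts(ts: str) -> datetime:
    # Direct Line timestamps carry 7 fractional digits and a trailing 'Z';
    # truncate the fraction to microseconds so fromisoformat can parse it.
    base, frac = ts.rstrip("Z").split(".")
    return datetime.fromisoformat(f"{base}.{frac[:6]}")

def agent_response_seconds(activities: list) -> float:
    """Seconds between the user message and the last agent reply to it."""
    # The user message has no role property; bot messages have role: bot.
    user = next(a for a in activities if a["from"].get("role") != "bot")
    replies = [a for a in activities
               if a["from"].get("role") == "bot"
               and a.get("replyToId") == user["id"]]
    last = replies[-1]  # the agent's final message for this turn
    return (_parse_ts(last["timestamp"])
            - _parse_ts(user["timestamp"])).total_seconds()
```

Applied to the sample payload above, this yields roughly 19.9 seconds: the user message arrives at 09:12:22 and the agent's final reply at 09:12:41.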