CometD 2 Load Testing
The CometD project comes with a load test tool that can be used to get the gist of how CometD scales.
Note, however, that real deployments often behave very differently, depending on OS, TCP stack, network, JVM and application settings, so the basic load test tool gives you information only up to a point.
Load testing can be very stressful for the OS, TCP stack and network, so you need to tune a few values to prevent the OS, TCP stack or network from becoming a bottleneck and making you think that CometD does not scale. It does scale.
The setup must be done on both client(s) and server.
A suggested setup is the following (for Linux; it may vary for other operating systems):
# ulimit -n 65536
# ifconfig eth0 txqueuelen 8192 # replace eth0 with the ethernet interface you are using
# /sbin/sysctl -w net.core.somaxconn=4096
# /sbin/sysctl -w net.core.netdev_max_backlog=16384
# /sbin/sysctl -w net.core.rmem_max=16777216
# /sbin/sysctl -w net.core.wmem_max=16777216
# /sbin/sysctl -w net.ipv4.tcp_max_syn_backlog=8192
# /sbin/sysctl -w net.ipv4.tcp_syncookies=1
Read also here for more information.
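To confirm that the sysctl settings above are actually in effect on a host, the current values can be read back. The following is a minimal sketch, assuming a Linux host where each sysctl key is exposed as a file under /proc/sys (dots in the key become slashes in the path). The class and method names are hypothetical, and the sketch does not cover the ulimit and txqueuelen settings.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

class SysctlCheck {
    // Suggested values, copied from the sysctl commands listed above.
    static Map<String, Long> suggested() {
        Map<String, Long> values = new LinkedHashMap<>();
        values.put("net.core.somaxconn", 4096L);
        values.put("net.core.netdev_max_backlog", 16384L);
        values.put("net.core.rmem_max", 16777216L);
        values.put("net.core.wmem_max", 16777216L);
        values.put("net.ipv4.tcp_max_syn_backlog", 8192L);
        values.put("net.ipv4.tcp_syncookies", 1L);
        return values;
    }

    // On Linux, the sysctl key a.b.c is exposed as the file /proc/sys/a/b/c.
    static String procPath(String key) {
        return "/proc/sys/" + key.replace('.', '/');
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> entry : suggested().entrySet()) {
            String key = entry.getKey();
            long wanted = entry.getValue();
            try {
                String raw = new String(Files.readAllBytes(Paths.get(procPath(key)))).trim();
                long current = Long.parseLong(raw);
                String verdict = current >= wanted ? "ok" : "below suggested";
                System.out.printf("%-30s current=%d suggested=%d (%s)%n", key, current, wanted, verdict);
            } catch (IOException | NumberFormatException x) {
                // The file may be missing, for example inside some containers.
                System.out.printf("%-30s unavailable (%s)%n", key, x);
            }
        }
    }
}
```

Running it on both the client and server hosts gives a quick indication of whether any value is still below the suggested one.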
Running the server
If you have already deployed your Bayeux server, then you can use that as a load test server. The load test tool only needs to handshake, subscribe and publish to normal channels, which is what any Bayeux server can do out of the box.
If you have installed a security policy (see here), you may need to tweak the load test tool to satisfy the policy conditions (for example by adding security tokens and the like).
If you use your own server, however, you will lose some important reporting that is available in the CometD load server (but that you can add to your server by looking at how it is done in the CometD load server).
If you don't have a server already running, then you need to build and run the CometD load server.
Follow the instructions in the build section to build the whole project (if not already done), then issue these commands in a terminal window:
$ cd cometd-java/cometd-java-examples // Use "cd cometd-java/cometd-java-client" for CometD versions prior to 2.4.0
$ mvn -Pserver install exec:exec
This command launches the CometD load server after prompting a number of questions in the terminal window; you can additionally tweak the command line by editing the pom.xml section related to the "server" profile.
Running the load test client
The load test client can be run on a different host than the server, and you need to build and run the tool from the sources.
Follow the instructions in the build section to build the whole project (if not already done), then issue these commands:
$ cd cometd-java/cometd-java-examples // Use "cd cometd-java/cometd-java-client" for CometD versions prior to 2.4.0
$ mvn -Pclient install exec:exec
This command launches the CometD load client after prompting a number of questions in the terminal window; you can additionally tweak the command line by editing the pom.xml section related to the "client" profile.
The load test tool simulates a chat client and prompts you for configuration options:
server [localhost]:
port :
transports:
  0 - long-polling
  1 - websocket
transport : 1
use ssl [false]:
max threads :
context [/cometd]:
channel [/chat/demo]:
rooms :
rooms per client :
record latency details [true]:
-----
clients : 1000
Waiting for clients to be ready...
Waiting for clients 999/1000
Clients ready
batch count :
batch size : 1
batch pause (µs) :
message size :
randomize sends [false]:
The default configuration connects 100 Bayeux clients to the server at http://localhost:8080/cometd, then sends 1000 batches of 10 messages (of size 50 bytes) with a 10 ms pause between batches, choosing a random client as the sender of each batch. The client sends the messages to the chat room(s) it is subscribed to; the messages arrive at the server, and the server broadcasts them back to the subscribers of the chat room(s), which receive them.
The latency between the sends and the receives is then measured and displayed, for example:
Outgoing: Elapsed | Rate = 10273 ms | 973 messages/s - 97 requests/s
Waiting for messages to arrive 15815/15868
All messages arrived 15868/15868
Messages - Success/Expected = 15868/15868
Incoming - Elapsed | Rate = 10667 ms | 1487 messages/s
Messages - Latency Distribution Curve (X axis: Frequency, Y axis: Latency):
@ _ 13 ms (2436)
@ _ 26 ms (581)
@ _ 39 ms (3805)
@ _ 52 ms (3301)
@ _ 65 ms (1802)
@ _ 78 ms (1501)
@ _ 91 ms (1029)
@ _ 104 ms (537)
@ _ 117 ms (280)
@ _ 130 ms (163)
@ _ 142 ms (158)
@ _ 155 ms (107)
@ _ 168 ms (76)
@ _ 181 ms (30)
@ _ 194 ms (17)
@ _ 207 ms (33)
@ _ 220 ms (9)
@ _ 233 ms (1)
@ _ 246 ms (0)
@ _ 259 ms (2)
Messages - Latency Min/Ave/Max = 0/48/259 ms
Note how we chose a delay of 10 ms between batches, yielding a potential of 1000 messages sent/s (10 sends every 10 ms), and actually achieved 973 sends/s.
The number of messages received depends on how many clients are subscribed to each chat room: sending one message to a chat room with 5 subscribers results in one message being broadcast back to each of the 5 subscribers.
In the example above we sent 10000 messages and received 15868, which makes sense since we have 100 clients subscribed randomly to 100 rooms, so on average we have one client per room (with some rooms empty and some rooms with more than one subscriber). Had we chosen only 1 chat room, we would have had 10000 * 100 messages back (i.e. each message sent would have been broadcast back to the 100 clients).
The rate of messages back is 1487 messages/s; again, it makes sense that this is greater than the send rate: because of the random subscriptions, each message sent may generate more than one message back (when more than one client is subscribed to the same chat room).
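The figures discussed above can be checked with a few lines of arithmetic. The following is a minimal sketch whose class and method names are hypothetical. It assumes that rates are truncated to whole messages per second, which reproduces the numbers in the sample output but is not confirmed to be how the tool computes them.

```java
class LoadTestMath {
    // Best-case send rate (messages/s) for a given batch size and batch pause.
    static long potentialRate(int batchSize, long batchPauseMicros) {
        return batchSize * 1_000_000L / batchPauseMicros;
    }

    // Observed rate (messages/s), truncated to a whole number.
    static long rate(long messages, long elapsedMillis) {
        return messages * 1000L / elapsedMillis;
    }

    public static void main(String[] args) {
        long sent = 1000L * 10;   // 1000 batches of 10 messages
        long received = 15868;    // from the sample output

        System.out.println("potential send rate = " + potentialRate(10, 10_000)); // 1000
        System.out.println("achieved send rate  = " + rate(sent, 10273));         // 973
        System.out.println("receive rate        = " + rate(received, 10667));     // 1487

        // Average number of deliveries generated by each message sent.
        System.out.println("fan-out             = " + (double) received / sent);

        // With a single chat room, every message reaches all 100 clients.
        System.out.println("single room total   = " + sent * 100);                // 1000000
    }
}
```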
To read the latency distribution curve, imagine rotating it 90 degrees counter-clockwise. It shows a bell-shaped curve with the peak at around 33 ms, and reports the number of messages (in parentheses) received in each time interval. In the curve above, 2436 messages were received with a latency between 0 and 13 ms, 581 messages with a latency between 13 and 26 ms, and so on.
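The interval boundaries in the sample output (13, 26, 39, ... 259 ms) are consistent with the range between the minimum and maximum latency being split into 20 equal-width intervals. The following is a minimal sketch of such a histogram under that assumption. It is not the load test tool's actual code, and the class name, method names and sample latencies are hypothetical.

```java
class LatencyHistogram {
    // Counts latencies into equal-width buckets spanning [min, max].
    // Every latency is assumed to lie within [min, max].
    static long[] bucketize(long[] latenciesMs, long min, long max, int buckets) {
        double width = Math.max(1, max - min) / (double) buckets;
        long[] counts = new long[buckets];
        for (long latency : latenciesMs) {
            int index = (int) ((latency - min) / width);
            counts[Math.min(index, buckets - 1)]++; // the maximum lands in the last bucket
        }
        return counts;
    }

    // Upper bound of the bucket at the given index, rounded to whole milliseconds.
    static long upperBound(long min, long max, int buckets, int index) {
        return min + Math.round((max - min) * (index + 1) / (double) buckets);
    }

    public static void main(String[] args) {
        long[] sample = {0, 5, 13, 30, 35, 48, 100, 259}; // made-up latencies in ms
        long min = 0;
        long max = 259;
        int buckets = 20;
        long[] counts = bucketize(sample, min, max, buckets);
        for (int i = 0; i < buckets; i++) {
            System.out.printf("%d ms (%d)%n", upperBound(min, max, buckets, i), counts[i]);
        }
    }
}
```

With a minimum of 0 ms and a maximum of 259 ms, the computed upper bounds match the ones printed in the sample output, including the 142 ms boundary that follows 130 ms.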
Finally, the minimum, average and maximum latency are reported.
Exiting the load test tool
To exit the load test tool and disconnect the clients in an orderly fashion, just enter 0 as the number of clients:
-----
clients : 0
Waiting for clients to be ready...
Waiting for clients 68/0
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 45 seconds
[INFO] Finished at: Wed Mar 31 18:20:15 CEST 2010
[INFO] Final Memory: 33M/216M
[INFO] ------------------------------------------------------------------------
Many things can influence the results of load testing.
The two most important are:
- OS / network stack setup
See the section on setting up things above for details.
- Resolution of the sleep timer
Pauses in the load test tool are implemented via Thread.sleep(). If you choose a 10 ms pause (equivalent to 10000 µs), the actual time spent sleeping may be much longer than what you specified.
This is because when Thread.sleep() asks the JVM, and in turn the OS, to sleep for 10 ms, the OS may have a coarser sleep resolution and actually sleep for 25 ms or more.
To avoid this problem, the load test tool auto-tunes itself by detecting the timer resolution on startup.
It is not uncommon, especially for virtualized servers, to have timer resolutions as high as 64 ms.
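To get a feel for the sleep resolution of a given host, you can measure how long a short Thread.sleep() really takes. The following is a minimal sketch of such a measurement. It is not the load test tool's own auto-tuning code, and the class and method names are hypothetical.

```java
class SleepResolution {
    // Average time, in microseconds, that Thread.sleep(requestedMillis) actually takes.
    static long measureSleepMicros(long requestedMillis, int samples) {
        long totalNanos = 0;
        int completed = 0;
        for (int i = 0; i < samples; i++) {
            long start = System.nanoTime();
            try {
                Thread.sleep(requestedMillis);
            } catch (InterruptedException x) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop measuring
                break;
            }
            totalNanos += System.nanoTime() - start;
            completed++;
        }
        return completed == 0 ? 0 : totalNanos / completed / 1000;
    }

    public static void main(String[] args) {
        long actualMicros = measureSleepMicros(1, 50);
        System.out.println("Requested 1000 micros, slept about " + actualMicros + " micros on average");
    }
}
```

If the reported average is much larger than the requested pause, batch pauses shorter than that value cannot be honored precisely on that host, and the achieved send rate will be lower than the potential one.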