Sizing Guide - Benchmarks

Presentation of benchmarks

This page gathers the results of a panel of benchmarks run on the integration node pattern to evaluate the exchange of data between an integration node and an application node through the Decision Insight Messaging System (DIMS).

For more information about the integration node pattern, see the Integration node pattern page in the Axway Decision Insight product documentation at https://docs.axway.com/category/analytics.

The goal of the benchmarks is to determine, for a range of daily data volumes to exchange, the minimum configuration of the DIMS nodes.

Several parameters are considered when running these benchmarks:

  • Number of messages per second/day
  • Size of messages
  • Number of topics/producers

Depending on these values, hardware settings must be adapted:

  • CPU
  • RAM
  • Network bandwidth
  • Hard disk

This page only deals with sizing the DIMS nodes. To size the integration and application nodes, refer to the sizing guide in the Axway Decision Insight product documentation at https://docs.axway.com/category/analytics.

Scenario description

The integration pattern involves three types of servers:

  • The integration node handles data extraction. The performance of the node depends on the data sources.
  • DIMS nodes facilitate transferring messages from integration nodes to application nodes and ensure the resiliency of the whole deployment. In our use case, each DIMS server hosts an orchestrator and a messaging server.
    Note: You can isolate orchestrators from messaging servers by installing them on different servers, but this case is outside the scope of this scenario.
  • The application node collects data from DIMS and enriches it through Decision Insight functionalities. 

All servers are in the same subnetwork.

Payload

For the described benchmarks, we sent messages of three different sizes from integration nodes to DIMS. The difference in size comes from the number of fields per message.

A small message simulates the content of a payment (here, a map of four fields):

Small message content
"id": "10000000000", "amount": 10000000, "account": "123456798", "step":"Acknowledge"

A medium message consists of five small messages (20 fields).

Medium message content
"id": "10000000000", "amount": 10000000, "account": "123456798", "step":"Acknowledge",
"id1": "10000000000", "amount1": 10000000, "account1": "123456798", "step1":"Acknowledge",             
"id2": "10000000000", "amount2": 10000000, "account2": "123456798", "step2":"Acknowledge",             
"id3": "10000000000", "amount3": 10000000, "account3": "123456798", "step3":"Acknowledge",             
"id4": "10000000000", "amount4": 10000000, "account4": "123456798", "step4":"Acknowledge"             

A large message consists of ten small messages (40 fields).

Large message content
"id": "10000000000", "amount": 10000000, "account": "123456798", "step":"Acknowledge",
"id1": "10000000000", "amount1": 10000000, "account1": "123456798", "step1":"Acknowledge",             
"id2": "10000000000", "amount2": 10000000, "account2": "123456798", "step2":"Acknowledge",             
"id3": "10000000000", "amount3": 10000000, "account3": "123456798", "step3":"Acknowledge",             
"id4": "10000000000", "amount4": 10000000, "account4": "123456798", "step4":"Acknowledge",             
"id5": "10000000000", "amount5": 10000000, "account5": "123456798", "step5":"Acknowledge",             
"id6": "10000000000", "amount6": 10000000, "account6": "123456798", "step6":"Acknowledge",             
"id7": "10000000000", "amount7": 10000000, "account7": "123456798", "step7":"Acknowledge",             
"id8": "10000000000", "amount8": 10000000, "account8": "123456798", "step8":"Acknowledge",             
"id9": "10000000000", "amount9": 10000000, "account9": "123456798", "step9":"Acknowledge"

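To give an idea of the payload sizes involved, the following sketch rebuilds the three message types from the four-field payment map shown above and prints their approximate size once serialized as JSON. The exact on-the-wire size depends on the serialization used by the routes, so treat the figures as an order of magnitude only.

Payload size estimation (Python)
import json

# Four-field payment map used as the small message.
SMALL = {
    "id": "10000000000",
    "amount": 10000000,
    "account": "123456798",
    "step": "Acknowledge",
}

def build_message(copies):
    # Concatenate `copies` small messages; the fields of the extra copies
    # are suffixed with an index to keep the keys unique, as in the medium
    # and large payloads shown above.
    message = dict(SMALL)
    for i in range(1, copies):
        message.update({f"{key}{i}": value for key, value in SMALL.items()})
    return message

MEDIUM = build_message(5)   # 20 fields
LARGE = build_message(10)   # 40 fields

for name, message in [("small", SMALL), ("medium", MEDIUM), ("large", LARGE)]:
    print(f"{name}: {len(message)} fields, ~{len(json.dumps(message))} bytes as JSON")
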
Hardware

Most benchmarks are run on an Amazon EC2 t2.small instance (1 vCPU, 2 GB of RAM).

The more demanding benchmarks are run on an Amazon EC2 i3.large instance:

  • CPU: 2 x Intel Xeon E5-2686 v4 (Broadwell) - 2.3 GHz
  • RAM: 16 GB
  • Network: 400 Mbits/sec
  • OS: Ubuntu 16.04 LTS

Configuration

Each server hosts an orchestrator and a messaging server:

  • conf/messaging-server/jvm.conf: -Xmx1G
  • conf/orchestrator/jvm.conf: -Xmx512M

Memory consumption varies between 0.5 GB and 1.1 GB.
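
As a quick sanity check on these settings, the sketch below sums the configured maximum heap sizes and compares them with the RAM of the benchmark instances. The 16 GB figure comes from the hardware section above; the 2 GB figure for t2.small is the standard AWS value and is an assumption, not a measurement from these benchmarks.

Memory headroom check (Python)
# Adds up the maximum heap sizes configured in jvm.conf for one DIMS
# server (orchestrator + messaging server) and shows the headroom left
# on each benchmark instance for the OS, page cache and JVM overhead.
MESSAGING_SERVER_XMX_GB = 1.0   # conf/messaging-server/jvm.conf: -Xmx1G
ORCHESTRATOR_XMX_GB = 0.5       # conf/orchestrator/jvm.conf: -Xmx512M

# Instance RAM: i3.large from the hardware section; t2.small is the
# standard AWS figure (assumption, not measured here).
INSTANCE_RAM_GB = {"t2.small": 2.0, "i3.large": 16.0}

total_heap_gb = MESSAGING_SERVER_XMX_GB + ORCHESTRATOR_XMX_GB
for instance, ram_gb in INSTANCE_RAM_GB.items():
    print(f"{instance}: {total_heap_gb:.1f} GB of maximum heap, "
          f"{ram_gb - total_heap_gb:.1f} GB left for the OS and page cache")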

Results

CPU usage

Standard throughputs

For a throughput of up to 1,000 messages per second, the benchmarks have been run on t2.small nodes. As the results below show, this configuration easily handles such a load.


Topic count | 10 messages/sec       | 100 messages/sec      | 1,000 messages/sec
            | Small   Medium  Large | Small   Medium  Large | Small   Medium  Large
1           | 1.5%    2.0%    2.0%  | 5.0%    6.0%    6.5%  | 7.0%    9.0%    9.0%
2           | 1.5%    2.0%    2.5%  | 9.0%    9.0%    9.5%  | 12.0%   15.0%   15.0%
3           | 2.2%    3.0%    3.0%  | 11.0%   12.0%   12.5% | 15.0%   20.5%   20.5%
5           | 3.0%    3.0%    3.5%  | 17.0%   17.5%   18.0% | 27.0%   30.0%   45.0%
10          | 5.0%    5.0%    5.5%  | 25.0%   27.0%   30.0% | 35.0%   50.0%   55.0%
20          | 8.0%    8.0%    8.5%  | 32.5%   40.0%   42.5% | 45.0%   55.0%   60.0%

Higher throughputs

To achieve a higher throughput (10,000 messages per second), a larger configuration than t2.small is required.

These benchmarks have been run on i3.large nodes.

Topic count | 10,000 messages/sec
            | Small   Medium  Large
1           | 25%     25%     40%
2           | 60%     65%     40%
3           | 70%     70%     70%

Network

To size the bandwidth of a DIMS node, take into account these communication channels:

  • Producer: produces data to a specific partition, so, depending on the partition leader, to a specific DIMS server (incoming traffic).
  • Consumer: consumes data from a specific partition, so, depending on the partition leader, from a specific DIMS server (outgoing traffic).
  • DIMS nodes: for resiliency purposes, data must be replicated to the other DIMS nodes (incoming/outgoing traffic).
  • Communication from messaging servers to orchestrators, and between orchestrators; this traffic has no significant impact on network sizing.

 

Minimal bandwidth needed (in Mbit/s) for different message sizes and throughputs, with messages produced at a constant rate throughout the day:

Topic count | 10 messages/sec       | 100 messages/sec     | 1,000 messages/sec   | 10,000 messages/sec
            | Small   Medium  Large | Small  Medium  Large | Small  Medium  Large | Small  Medium  Large
1           | 0.080   0.095   0.130 | 0.65   0.85    1.2   | 1.8    3.6     6.6   | 22.5   55      100
2           | 0.120   0.175   0.240 | 1.2    1.7     2.4   | 3.5    8       13    | 40.5   105     190
3           | 0.185   0.260   0.360 | 1.8    2.6     3.6   | 5.5    12      20    | 54.5   150     270

The bandwidth limit is reached as the message size, the message rate, or the topic count grows.
Of course, the bandwidth of the integration node or of the application node can restrain message production or consumption.

However, a bottleneck can also occur on the messaging server side: its bandwidth must be sized for the incoming and outgoing traffic, plus the data replication traffic between nodes.
If it is undersized, the messaging server delays the acknowledgment sent to the integration nodes, up to the point where a timeout on the integration node cancels the transfer.

For example, given three messaging servers with a replication factor of 3, when an integration node produces at its bandwidth limit of 100 Mbit/s and an application node consumes at the same rate, each messaging server requires at least the following (the sketch after this list reproduces the calculation):

  • 300 Mbit/s of bandwidth for outgoing messages (100 Mbit/s towards the application node, plus 100 Mbit/s towards each of the two other messaging servers for data replication).
  • 300 Mbit/s of bandwidth for incoming messages (100 Mbit/s from the integration node, plus 200 Mbit/s of replication traffic from the two other messaging servers).
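
The sketch below reproduces this worst-case calculation; the producer rate, consumer rate, server count, and replication factor are parameters, so it can be adapted to other deployments.

Bandwidth sizing example (Python)
# Conservative worst-case bandwidth sizing for one messaging server:
# it may lead every partition (receive all produced data, serve the
# consumers, and send one replica stream to each other server) while
# also receiving a full replica stream from every other server.
def messaging_server_bandwidth(producer_mbit_s, consumer_mbit_s,
                               server_count, replication_factor):
    other_replicas = min(replication_factor, server_count) - 1
    incoming = producer_mbit_s + other_replicas * producer_mbit_s
    outgoing = consumer_mbit_s + other_replicas * producer_mbit_s
    return incoming, outgoing

# Example above: 3 servers, replication factor 3, integration and
# application nodes both limited to 100 Mbit/s.
incoming, outgoing = messaging_server_bandwidth(100, 100, 3, 3)
print(f"incoming: {incoming} Mbit/s, outgoing: {outgoing} Mbit/s")
# -> incoming: 300 Mbit/s, outgoing: 300 Mbit/s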

Storage


Minimum hard disk size (in GB) needed to store one day of messages, for different message sizes and numbers of messages per day, produced at a constant rate:

Topic count | 1,000,000 messages   | 10,000,000 messages  | 100,000,000 messages | 1,000,000,000 messages
            | Small  Medium  Large | Small  Medium  Large | Small  Medium  Large | Small   Medium  Large
1           | 0.110  0.4     0.75  | 1.1    4       7.5   | 8.5    30      60    | 125     425     950
2           | 0.210  0.8     1.5   | 2.1    7.7     15    | 16.5   65      120   | 250     930     1900
3           | 0.325  1.2     2.25  | 3.1    11.75   22.5  | 25     95      180   | 375     1400    2850

The performance tests show that a DIMS node can run out of hard disk space if you do not adjust the retention period to the message size and volume.
For instance, given a DIMS cluster with a default retention period of 7 days, processing 1,000,000,000 large messages per day on 3 topics requires about 20 TB of hard disk space (2,850 GB * 7).
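
The sketch below turns this rule of thumb into a small helper: take the disk usage of one day of messages from the table above and multiply it by the retention period.

Storage sizing example (Python)
# Disk space needed on a DIMS node, given the disk usage of one day of
# messages (from the table above) and the retention period in days.
def required_disk_gb(one_day_gb, retention_days):
    return one_day_gb * retention_days

# Example from the text: 1,000,000,000 large messages per day on
# 3 topics (about 2,850 GB per day) with the default 7-day retention.
total_gb = required_disk_gb(2850, 7)
print(f"{total_gb} GB, i.e. about {total_gb / 1000:.0f} TB")
# -> 19950 GB, i.e. about 20 TB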

How to reproduce?

Configuration of the messaging servers

  1. Following the Decision Insight integration node pattern, deploy 3 DIMS nodes, each on a different server. For the purpose of these benchmarks, each server hosts an orchestrator and a messaging server. For further details on deploying a cluster, see Installation.
  2. Upon installing each messaging server, provide:
    • The TLS configuration
    • The replication and partitioning parameters (see the sketch after these steps):

      Default Replication Factor: 3
      Minimal Synchronized Messaging Servers: 2
      Number of partitions: 3
  3. Start the orchestrators.
  4. Start the messaging servers.
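
As a rough guide, the sketch below relates these replication settings to the number of messaging server failures the cluster can absorb. It assumes that a write is acknowledged once the minimal number of synchronized messaging servers hold the data; this acknowledgment behavior is an assumption made for illustration, not a statement taken from the product documentation.

Resiliency estimate (Python)
# Relates the replication settings above to fault tolerance, assuming
# a write is acknowledged once the minimal number of synchronized
# messaging servers hold it (an assumption, not a documented guarantee).
REPLICATION_FACTOR = 3                  # copies of each message
MIN_SYNCHRONIZED_MESSAGING_SERVERS = 2  # replicas required before acknowledging

tolerated_failures = REPLICATION_FACTOR - MIN_SYNCHRONIZED_MESSAGING_SERVERS
print(f"Writes keep being accepted with up to {tolerated_failures} "
      f"messaging server(s) down")
# With the settings above, one messaging server can fail without
# blocking message production.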

For more information about the integration node pattern, see the Integration node pattern page in the Axway Decision Insight product documentation at https://docs.axway.com/category/analytics.

Configuration of the application node

  1. Install and start a Decision Insight node.
  2. Import the application: DIMS.appx.
  3. Change the following properties (see the example after this list):
    • DIMS_SERVERS: comma-separated list of DIMS URIs
    • keyStorePassword: keystore password
    • keyStorePath: keystore path
    • trustStorePassword: truststore password
    • trustStorePath: truststore path
  4. Start the context named Consumer.
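
For illustration only, here is one possible shape for these values; the host names, port, and paths below are hypothetical placeholders, not values used in the benchmarks. The same properties are set on the integration node described in the next section.

Example property values (hypothetical)
DIMS_SERVERS: dims-server-1:1234,dims-server-2:1234,dims-server-3:1234
keyStorePassword: changeit
keyStorePath: /opt/dims/conf/keystore.jks
trustStorePassword: changeit
trustStorePath: /opt/dims/conf/truststore.jks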

Configuration of the integration node

  1. Install and start a Decision Insight node.
  2. Import the DIMS.appx application.
  3. Change the following properties:
    • DIMS_SERVERS: comma-separated list of DIMS URIs
    • keyStorePassword: keystore password
    • keyStorePath: keystore path
    • trustStorePassword: truststore password
    • trustStorePath: truststore path

Different routes are available to reproduce each use case.
