eProsima Fast DDS Performance

 Updated 9th of January 2023

This performance testing, carried out by eProsima, focuses on latency and throughput.

  1. Test Environment
  2. Latency
    1. Latency performance for different delivery mechanisms
      1. Intra-process Delivery
      2. Inter-process Shared Memory
      3. UDP Transport
      4. General Latency Comparison
  3. Throughput
    1. Throughput performance for different delivery mechanisms
      1. Intra-process Delivery
      2. Inter-process Shared Memory
      3. UDP Transport
      4. General Throughput Comparison
  4. Conclusion

Test environment

The performance experiments were run on a single computer with the following characteristics:

  • PowerEdge R330 e34s (8 Intel(R) Xeon(R) CPU E3-1230 v6 @ 3.50GHz, 32 GB RAM)
  • Linux 5.4.0-53-generic
  • OS: Ubuntu 18.04.5 LTS
  • Docker environment:
    • Docker 20.10.7
    • Ubuntu 22.04 LTS

Latency

Latency is usually defined as the amount of time that it takes for a message to traverse a system. In packet-based networking, latency is usually measured either as one-way latency (the time from the source sending the packet to the destination receiving it), or as the round-trip delay time (the time from source to destination plus the time from the destination back to the source). The latter is more often used since it can be measured from a single point.

In the case of a DDS communication exchange, latency could be defined as the time it takes for a DataWriter to serialize and send a data message plus the time it takes for a matching DataReader to receive and deserialize it. Applying the same round-trip concept mentioned before, the round-trip latency could be defined as the time it takes for a message to be sent by a DataWriter, received by a DataReader and sent back to the same DataWriter. For example, in the figure below, the round-trip time would be T2-T1 making the latency estimation (T2-T1)/2.
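The round-trip estimate described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code of the eProsima test tooling: the function names and the sample timestamps are hypothetical, and it assumes T1 and T2 are both taken from the same clock on the sending side.

```python
from statistics import mean


def one_way_latency(t1: float, t2: float) -> float:
    """Estimate one-way latency as half the round-trip time, (T2 - T1) / 2."""
    if t2 < t1:
        raise ValueError("T2 must not be earlier than T1")
    return (t2 - t1) / 2


def summarize(round_trips: list[tuple[float, float]]) -> dict[str, float]:
    """Summarize latency estimates for a series of (T1, T2) timestamp pairs."""
    estimates = [one_way_latency(t1, t2) for t1, t2 in round_trips]
    return {"min": min(estimates), "mean": mean(estimates), "max": max(estimates)}


# Hypothetical timestamps in microseconds, for illustration only.
samples = [(0.0, 40.0), (100.0, 150.0), (200.0, 260.0)]
print(summarize(samples))  # {'min': 20.0, 'mean': 25.0, 'max': 30.0}
```

Halving the round trip assumes that both directions take roughly the same time, which is why this is an estimate rather than an exact one-way measurement.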

Latency image


LATENCY PERFORMANCE FOR DIFFERENT DELIVERY MECHANISMS

Fast DDS supports both synchronous and asynchronous publishing modes, so the middleware layer offers two ways to publish data. For more information about both options, please refer to the Fast DDS documentation.

For these latency performance tests, eProsima focused on the synchronous publication mode, comparing the different delivery mechanisms available in Fast DDS v2.8.0.

 

Intra-process Delivery

Intra-process delivery is a Fast DDS feature that accelerates communication between entities within the same process by avoiding the overhead of the transport layer. Intra-process delivery guarantees that the DataReader receives the message by making the DataWriter directly call the reception routines of the DataReader.

Figures 1.1a and 1.1b show the latency of intra-process delivery with and without the zero-copy delivery mechanism enabled. The graphs show lower latency with zero-copy for the majority of the payloads tested. This indicates that most of the latency overhead is caused by buffer-to-buffer data copies, which are not present in the zero-copy case. Because of this, the larger the data samples, the greater the latency improvement.

Latency same process

Fig 1.1a Fast DDS v2.8.0 Same Process Reliable

Latency same process zoom up

Fig 1.1b Fast DDS v2.8.0 Same Process Reliable - zoom - up to 1 kB payload
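The effect of buffer-to-buffer copies can be illustrated outside Fast DDS with a small Python sketch. It is only an analogy and assumes nothing about Fast DDS internals: `deliver_with_copy` and `deliver_zero_copy` are hypothetical names, and Python's `memoryview` stands in for a loaned buffer. Copying costs time proportional to the payload size, whereas handing over a view does not depend on it.

```python
def deliver_with_copy(sample: bytearray) -> bytes:
    """Hand the reader its own copy of the payload (cost grows with size)."""
    return bytes(sample)


def deliver_zero_copy(sample: bytearray) -> memoryview:
    """Hand the reader a view onto the writer's buffer (no payload copy)."""
    return memoryview(sample)


payload = bytearray(b"\x00" * 1024)  # hypothetical 1 kB sample
copied = deliver_with_copy(payload)
shared = deliver_zero_copy(payload)

payload[0] = 0xFF  # the writer modifies its buffer after delivery
print(copied[0])   # 0: the copy is independent of the writer's buffer
print(shared[0])   # 255: the view refers to the very same memory
```

The second result also shows the usual trade-off of zero-copy schemes: because the memory is shared, the writer must not reuse the buffer while the reader still needs it.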


Inter-process Shared Memory

Figures 1.2a and 1.2b show the latency of the Fast DDS v2.8.0 Shared Memory Transport (SHM) with and without the zero-copy delivery mechanism. With zero-copy, Fast DDS v2.8.0 uses both the Loans and Data Sharing mechanisms, which allow data to be transmitted between applications on the same host without copying it in memory. Due to this architecture, Fast DDS with zero-copy shows far lower latency than without it for the majority of the payloads tested. As stated previously, making fewer copies of large data samples results in remarkable latency improvements.

 Fast DDS SHM Latency

Fig 1.2a Fast DDS v2.8.0 SHM

Fast DDS SHM Latency zoom up

Fig 1.2b Fast DDS v2.8.0 SHM - zoom - up to 1 kB payload


Shared Memory Transport (SHM) is a Fast DDS feature that facilitates communication between entities running on the same processing unit or machine. It provides better performance than the standard UDP transport due to the following factors:

  • Large message support: the only limit on message size is the machine's memory capacity.
  • Fewer memory copies: SHM can directly share the same memory buffer with all the destination endpoints.
  • Less operating system overhead: a shared memory transfer requires far fewer system calls than other protocols such as UDP.

For further information about the three mechanisms, please refer to eProsima Shared Memory.

 

UDP Transport

UDP is a connectionless transport where the receiving DomainParticipant must open a UDP port listening for incoming messages, whereas the sending DomainParticipant sends messages to this port. Fast DDS enables a UDPv4 transport by default. Nevertheless, the application can enable other UDP transports if needed.

Figures 1.3a and 1.3b show the UDP transport latency with and without loans enabled. Latency is considerably lower with loans enabled for the majority of the payloads, since eliminating the overhead caused by data copies notably improves latency.

Latency UDP

Fig 1.3a  Fast DDS v2.8.0 UDP

 

Latency UDP Zoom up

Fig 1.3b Fast DDS v2.8.0 UDP - zoom - up to 1 kB payload


General Latency Comparison

Figures 1.4a and 1.4b compare the latency of Fast DDS across the different transports and mechanisms to give a comprehensive view of its performance. In all cases, the implementation with zero-copy shows lower latency than its counterpart without it. Fast DDS intra-process presents the lowest latency, followed by Fast DDS Shared Memory, while Fast DDS with UDP transport presents the highest latency.

 Latency general

Fig 1.4a  Fast DDS v2.8.0

 

Latency general zoom up

Fig 1.4b Fast DDS v2.8.0 - zoom - up to 1 kB payload


Throughput

In network computing, throughput is defined as the amount of information that can be sent or received through the system per unit of time, i.e. how many bits traverse the system every second. The usual measuring procedure is for a DataWriter to send a large group of messages (known as a batch, burst or demand) within a certain time (the burst interval). If sending the burst has taken less time than the burst interval, the DataWriter rests until the interval has completely elapsed; otherwise, it is not allowed to rest. This operation is repeated until the test time is reached.

On the receiving side, a DataReader takes note of the time when the first message was received and counts every message that arrives. When the test is complete, the DataReader can compute a receiving sample rate. Knowing the size of every message (in bits), the throughput is simply the product of the sample rate and the message size. The following diagram illustrates this process.
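The measurement procedure above reduces to simple arithmetic, sketched below in Python. This is an illustrative sketch, not the eProsima test tool: the function names and the example numbers are hypothetical.

```python
def rest_time(send_duration: float, burst_interval: float) -> float:
    """Seconds the DataWriter rests after a burst (zero if the burst overran)."""
    return max(0.0, burst_interval - send_duration)


def throughput_bps(samples_received: int, elapsed_seconds: float, payload_bytes: int) -> float:
    """Throughput in bits per second: sample rate times message size in bits."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    sample_rate = samples_received / elapsed_seconds  # samples per second
    return sample_rate * payload_bytes * 8


# Hypothetical run: 50,000 samples of 1,024 bytes received in 2 seconds.
print(throughput_bps(50_000, 2.0, 1024))  # 204800000.0 bits/s
print(rest_time(0.003, 0.010))            # burst took 3 ms of a 10 ms interval
print(rest_time(0.020, 0.010))            # 0.0: the burst overran, so no rest
```

Here the elapsed time is assumed to run from the first received message to the end of the test, matching the DataReader's bookkeeping described above.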

Throughput definition image


THROUGHPUT PERFORMANCE FOR DIFFERENT DELIVERY MECHANISMS

Fast DDS supports both synchronous and asynchronous publishing modes, so the middleware layer offers two ways to publish data. For more information about both options, please refer to the Fast DDS documentation.

For these throughput performance tests, eProsima focused on the synchronous publication mode, comparing the different delivery mechanisms available in Fast DDS v2.8.0.

 

Intra-process Delivery

Intra-process delivery is a Fast DDS feature that accelerates communication between entities inside the same process by avoiding the overhead of the transport layer. Intra-process delivery guarantees that the DataReader receives the message by making the DataWriter directly call the reception routines of the DataReader.

Figures 2.1a and 2.1b show the throughput of intra-process communication with and without the zero-copy delivery mechanism available in Fast DDS v2.8.0. The graphs show higher throughput with zero-copy for every payload size tested. This indicates that most of the throughput overhead is caused by buffer-to-buffer data copies, which are not present in the zero-copy case. Because of this, the larger the data samples, the greater the throughput improvement.

Throughput same process

Fig 2.1a Fast DDS v2.8.0 Intra-process Reliable

 

Throughput same process zoom up

Fig 2.1b Fast DDS v2.8.0 Intra-process Reliable - zoom - up to 65 kB data payload


Inter-process Shared Memory

Figures 2.2a and 2.2b compare the throughput of the Fast DDS v2.8.0 Shared Memory Transport (SHM) with that of SHM using the loans (zero-copy) delivery mechanism. Across the different options and data payloads, SHM with loans (zero-copy) shows far better throughput for all payload sizes, due to its architecture. As stated previously, making fewer copies of large data samples results in remarkable throughput improvements.

Throughput SHM

Fig 2.2a Fast DDS v2.8.0 SHM 

 

Throughput SHM zoom up

Fig 2.2b Fast DDS v2.8.0 SHM - zoom - up to 65 kB data payload

 

For further information about the three mechanisms please refer to eProsima Shared Memory.


UDP Transport

UDP is a connectionless transport where the receiving DomainParticipant must open a UDP port listening for incoming messages, whereas the sending DomainParticipant sends messages to this port. Fast DDS enables a UDPv4 transport by default. Nevertheless, the application can enable other UDP transports if needed.

Figures 2.3a and 2.3b show the UDP transport throughput with and without loans. Throughput with loans enabled is considerably higher in all cases and stabilizes for large data over 1 MB. Eliminating the overhead caused by data copies notably improves the throughput seen in all tests.

Throughput UDP

Fig 2.3a  Fast DDS v2.8.0 UDP

 

Throughput UDP zoom up

Fig 2.3b  Fast DDS v2.8.0 UDP - zoom - up to 65 kB data payload


General Throughput Comparison

Figures 2.4a and 2.4b compare the throughput of Fast DDS v2.8.0 across the different transports and mechanisms to give a comprehensive view of its performance. In all cases, the implementation with zero-copy achieves better throughput than its counterpart without it. Fast DDS intra-process presents the best throughput, followed by zero-copy Shared Memory Transport and UDP with loans, while plain Fast DDS SHM and Fast DDS UDP, respectively, show the lowest throughput in the performance testing.

Throughput general

Fig 2.4a  Fast DDS v2.8.0 

 

Throughput general zoom up

Fig 2.4b  Fast DDS v2.8.0 - zoom - up to 65 kB data payload


Conclusion

In summary, Fast DDS v2.8.0 shows great results in terms of latency and throughput, and a striking improvement in latency when using the zero-copy mechanism. Thanks to a proper data model design, Fast DDS has improved and is able to keep latency and throughput stable regardless of the data sample size. The results of this benchmark testing show that Fast DDS is able to provide optimal performance in every scenario.

 

