ROS 2 Communication and Networking: A Brief Introduction

03 Mar, 2025

How ROS2 communication works

ROS2 uses an abstraction layer called the ROS Middleware (RMW) to manage networking and communication.
There are different RMW implementations, but most of them use DDS (Data Distribution Service).
DDS is built on the RTPS (Real-Time Publish Subscribe) protocol.
In the world of DDS, ROS2 nodes publish and subscribe to data, and by default DDS carries that data over the network as UDP messages.
```
ROS -> RMW -> DDS -> UDP
```
RMW has different configurations to achieve simplicity, robustness, or redundancy depending on what you need!
Now the good thing about DDS is that the participants (which are our nodes in this case) do not rely on a single entity to ensure communication. Rather, the entire system is decentralized: the 'DDS System' gives power to each participant and allows peer-to-peer communication, so the participants (nodes) can find each other on their own. And all this, my friends, is called the 'Discovery process'.
When developing a networking system for your robot, you may need to choose which RMW implementation you want to use. Let's say you decide to use eProsima Fast DDS. Now you also have to decide on the discovery method. Fast DDS, which we chose in our example, has:
1. Simple Discovery option
2. Discovery Server option
Now, you may have seen the 'ROS_DOMAIN_ID' variable. Basically, this allows the DDS layer to determine which ports each of your nodes communicates on. Note: I am saying node, but at the DDS level the unit is a 'participant' (in recent ROS2 releases, nodes in the same process typically share one participant). More precisely, the DDS layer combines the domain ID, the participant ID, and a few fixed constants to compute the discovery and data ports, following the port-mapping algorithm defined in the DDS-RTPS specification. The domain ID also acts as a partitioning mechanism that isolates different ROS 2 systems on the same network.
```
+------------------------+     +---------------------+     +---------------------+     +---------------------+
|       ROS 2 Node       |---->|         RMW         |---->|         DDS         |---->|         UDP         |
| (Publisher/Subscriber) |     | (Abstraction Layer) |     | (Data Distribution) |     | (Network Transport) |
+------------------------+     +---------------------+     +---------------------+     +---------------------+
```
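
To make the port mapping a bit more concrete, here is a minimal sketch of the default formula from the DDSI-RTPS specification. The constants are the spec defaults as I understand them, and DDS vendors let you override them, so treat this as an illustration rather than what your particular setup is guaranteed to use. The names `RtpsPorts` and `rtps_ports` are made up for this example.

```python
from dataclasses import dataclass

# Default port-mapping constants from the DDSI-RTPS specification.
# Vendors allow overriding these, so they are assumptions about a stock setup.
PORT_BASE = 7400         # PB
DOMAIN_GAIN = 250        # DG
PARTICIPANT_GAIN = 2     # PG
D0, D1, D2, D3 = 0, 10, 1, 11  # fixed offsets


@dataclass(frozen=True)
class RtpsPorts:
    discovery_multicast: int
    discovery_unicast: int
    user_multicast: int
    user_unicast: int


def rtps_ports(domain_id: int, participant_id: int = 0) -> RtpsPorts:
    """Compute the default RTPS UDP ports for one participant in one domain."""
    base = PORT_BASE + DOMAIN_GAIN * domain_id
    return RtpsPorts(
        discovery_multicast=base + D0,
        discovery_unicast=base + D1 + PARTICIPANT_GAIN * participant_id,
        user_multicast=base + D2,
        user_unicast=base + D3 + PARTICIPANT_GAIN * participant_id,
    )


if __name__ == "__main__":
    print(rtps_ports(domain_id=0))
    print(rtps_ports(domain_id=1, participant_id=2))
```

With these defaults, domain 0 and participant 0 land on ports 7400, 7410, 7401 and 7411. Every step in the domain ID shifts the whole block by 250 ports, which is why two systems on different domain IDs do not hear each other.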

Let's talk about the ROS2 Daemon

A daemon basically is any process that runs in the background, kinda demonic not gonna lie.
In ROS2, the ROS2 Daemon is a 'service' that runs in the background, keeps track of all the ROS2 nodes currently running, and makes that info available to the ROS2 introspection tools on the CLI, like ros2 topic list, ros2 node list, and ros2 topic echo.
It takes some time for the daemon to gather info. However, the primary reason you might not see all topics immediately after starting a node is due to the discovery process itself, not just the daemon's information gathering. DDS participants (nodes) need time to discover each other, and this can vary depending on the RMW implementation and network conditions.
The ROS2 Daemon inherits the environment variables defined in the terminal where it was started. So once you run a ROS2 command and the daemon starts running, changing the ROS_DOMAIN_ID afterwards will not be reflected in your ROS2 CLI commands. You have to stop and restart the daemon to see the changes:
ros2 daemon stop, then ros2 daemon start
Note: If you have two terminals with different ROS2 settings, such as different ROS_DOMAIN_ID values, the ROS2 CLI can only work with one of them at a time, so the daemon must be stopped and restarted in the appropriate terminal before use.
Each computer has 1 daemon with one set of ROS2 communication settings.
The daemon starts automatically when you run a ROS2 CLI command.
The daemon adopts the settings of the terminal. To change the settings, you have to stop and restart the daemon.

When the daemon starts, it takes some time to complete the discovery process.
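
The "daemon keeps the settings it started with" behaviour is really just how process environments work. Here is a small sketch that shows it with a plain Python child process standing in for the daemon. Nothing here talks to ROS2, and the function name is hypothetical.

```python
import subprocess
import sys

# The child waits for a line on stdin, then reports the ROS_DOMAIN_ID it sees.
# It plays the role of a long-running background process such as the daemon.
CHILD_CODE = (
    "import os, sys\n"
    "sys.stdin.readline()\n"
    "print(os.environ.get('ROS_DOMAIN_ID'))\n"
)


def domain_id_seen_by_background_process(initial: int, changed: int) -> str:
    """Start a child with one ROS_DOMAIN_ID, change the value, ask the child."""
    import os

    terminal_env = dict(os.environ, ROS_DOMAIN_ID=str(initial))
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        env=terminal_env,            # environment is copied at launch time
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # "export ROS_DOMAIN_ID=<changed>" in the terminal, after the child started.
    terminal_env["ROS_DOMAIN_ID"] = str(changed)
    output, _ = child.communicate(input="\n")
    return output.strip()


if __name__ == "__main__":
    print(domain_id_seen_by_background_process(5, 9))  # prints 5
```

The child still reports the old value, which is the same reason the daemon has to be stopped and started again before it picks up new settings.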

                              _.-^^---....,,--
                           _--                  --
                           <                        >)
                           |       ROS 2 Daemon      |
                           \._                   _./
                              ```--. . , ; .--'''
                                    | |   |
                                 .-=||  | |=-.
                                 `-=#$%&%$#=-'
                                    | ;  :|
                  _____.,-#%&$@%#&#~,._____
                  /                          \
                  |    Looking at all Nodes   |
                  |    Making info available |
                  \                          /
                  --------------------------

Network settings

Some ROS2 node settings that you should be aware of: 'Domain id', 'Automatic Discovery Range', 'Discovery servers'.
You cannot change ROS2 node settings on the fly.
Fact: When a node starts, a 'locator list' is created. This list contains the IP addresses and ports where the node listens for messages, and it is built once, when the node is first initialized. It is sorta the address for that node, which the discovery process uses to establish communication. So, let's say a computer is assigned a new IP by the DHCP server: the node would not be reachable at the new IP address, since its locator list does not change dynamically. Note: Some DDS implementations can support dynamic locator updates, but this is not universally guaranteed. To prevent this from happening, it is recommended to use static IPs for all devices connected to the ROS2 network.
Key takeaways:
1. Nodes must be restarted to inherit the updated network settings including new or updated IP addresses.
2. Use static IP for all devices and reserve them on the network.
3. Ensure that all networks are active and connected prior to launching the ROS nodes.

Optimize your data before transfer over the network

Image data should be compressed using the standard image transport methods. "Image transport" is a general concept for optimizing image data transfer, including color and depth images.
Depth data should be sent either as lidar scan packets or as depth images. Depth images can also be compressed using image transport mechanisms such as image_transport_plugins, though they may use different compression algorithms optimized for depth data.
Point clouds consume a lot more bandwidth than either of those formats, so avoid sending them over the network when a depth image or scan will do.
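
To get a feel for the difference, here is a rough back-of-the-envelope calculation. The numbers are assumptions picked for illustration: a 640x480 sensor at 30 Hz, 16-bit depth pixels, and a point cloud with three 32-bit floats (x, y, z) per point. Message headers and any compression are ignored.

```python
def stream_bandwidth_mbps(width, height, bytes_per_point, rate_hz):
    """Uncompressed payload bandwidth in megabits per second."""
    bytes_per_second = width * height * bytes_per_point * rate_hz
    return bytes_per_second * 8 / 1_000_000


# Assumed sensor parameters, for illustration only.
WIDTH, HEIGHT, RATE_HZ = 640, 480, 30

depth_image_mbps = stream_bandwidth_mbps(WIDTH, HEIGHT, 2, RATE_HZ)   # 16-bit depth
point_cloud_mbps = stream_bandwidth_mbps(WIDTH, HEIGHT, 12, RATE_HZ)  # x, y, z as float32

if __name__ == "__main__":
    print(f"depth image: {depth_image_mbps:.1f} Mbit/s")   # 147.5
    print(f"point cloud: {point_cloud_mbps:.1f} Mbit/s")   # 884.7
    print(f"ratio: {point_cloud_mbps / depth_image_mbps:.1f}x")
```

Under these assumptions the point cloud needs about six times the bandwidth of the raw depth image, and that is before the depth image gets compressed.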

ROS2 Quality of Service

QoS settings are like 'policies' that dictate how a message is transmitted. Now both the publisher and subscriber can have different QoS settings that they declare.
The subscriber and publisher policies must be compatible, and DDS has rules for QoS compatibility. For example, a subscriber requesting BEST_EFFORT can connect to a publisher offering RELIABLE or BEST_EFFORT, but a subscriber requesting RELIABLE can connect only to a publisher offering RELIABLE.
The reliability policy, for instance, can be Best Effort or Reliable:
- Reliable: The publisher will try to send the message repeatedly if not successfully delivered.
- Best Effort: The publisher will publish the message once and the computer will try to deliver it over the network, but it does not check if the data is properly delivered or not.
For sensor data, where we care about the latest data and don't care if the history is missing or not, we use 'Best effort' typically.
If, for instance, you use a Reliable policy to send a very large message, it puts a lot of burden on the network, because the publisher keeps retrying a message that is very large.
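Also worth knowing: ros2 topic echo does not necessarily subscribe with the same QoS as your publisher. Depending on the ROS2 release it may default to Reliable or try to match the publisher, and you can set it explicitly with the --qos-reliability option.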
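
The reliability compatibility rule boils down to "the publisher must offer at least what the subscriber requests". Here is a tiny sketch of that rule in plain Python. The enum and function are made up for this example and are not the rclpy API.

```python
from enum import IntEnum


class Reliability(IntEnum):
    # Ordered so that a stronger guarantee compares as greater.
    BEST_EFFORT = 1
    RELIABLE = 2


def reliability_compatible(offered: Reliability, requested: Reliability) -> bool:
    """True if a publisher offering `offered` can match a subscriber requesting `requested`."""
    return offered >= requested


if __name__ == "__main__":
    for offered in Reliability:
        for requested in Reliability:
            ok = reliability_compatible(offered, requested)
            print(f"publisher {offered.name:<11} subscriber {requested.name:<11} -> {ok}")
```

The only failing combination is a Best Effort publisher with a Reliable subscriber, which is the classic reason a subscriber silently receives nothing from a sensor topic.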
Other important policies are History and Depth.
They control how many past messages are kept in the queue, and therefore how many can still be delivered or resent.
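
A 'keep last' history with a given depth behaves like a fixed-size queue: once it is full, the oldest message is dropped to make room. Here is a minimal sketch of that idea using a plain Python deque. The class name is hypothetical and this only models the queueing, not what DDS does on the wire.

```python
from collections import deque


class KeepLastQueue:
    """Toy model of a KEEP_LAST history with a fixed depth."""

    def __init__(self, depth: int):
        if depth < 1:
            raise ValueError("depth must be at least 1")
        self._samples = deque(maxlen=depth)  # oldest entries fall off automatically

    def push(self, sample):
        self._samples.append(sample)

    def pending(self):
        """Messages still held, oldest first."""
        return list(self._samples)


if __name__ == "__main__":
    queue = KeepLastQueue(depth=3)
    for message_id in range(5):
        queue.push(message_id)
    print(queue.pending())  # [2, 3, 4]
```

For sensor streams a small depth is usually what you want, since old readings are not worth delivering late.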

Let's talk about ROS2 Discovery Configuration

There are a lot of RMW (ROS2 middleware) implementations to choose from, and each has its own quirks. We will be discussing the eProsima Fast DDS RMW implementation.
This middleware has two options for discovering ROS2 nodes:
1. Simple discovery
2. Discovery Server

Simple Discovery

Default, no manual setup.
ROS2 devices should be discoverable and connect automatically.
Simple systems, small number of nodes within a local network.

Requires 'multicasting', which is a network addressing method where a single message is sent to a group of recipients simultaneously.

        Node 1                             Node 2
     +-----------+                      +-----------+
     |  "I'm here!|                     | "I'm here! |
     | 192.168.1.2|                     | 192.168.1.3|
     +-----|-----+                      +-----|-----+
           |      \                    /      |
           |       \                  /       |
           |        \                /        |
           |             Multicast            |
           |           Announcement           |
           |                                  |
           +-----v---------------------v------+
           |          Network                 |
           +----------------------------------+

Discovery Server

Manual configuration is needed, which also gives you more flexibility.
Based on the principle that a ROS2 network should be controlled, with nodes discovering only the other nodes they need.

Does not require multicasting.

                                 +---------------------+
                                 |  Discovery Server   |
                                 |  (Lookup Table/DB)  |
                                 +----------+----------+
                                            |
            +-------------------------------+-------------------------------+
            |                               |                               |
            v                               v                               v
       +---------+                     +---------+                     +---------+
       |  Node 1 |                     |  Node 2 |                     |  Node 3 |
       +---------+                     +---------+                     +---------+
 "I'm Node 1, need info"      "I'm Node 2, here's my info"            "I'm Node 3"

Let's dive into Simple Discovery and Discovery Server and see how they work

Simple Discovery: Every participant (node) sends a multicast announcement to all the other participants on the network, telling them its unicast address (IP and port). This is like walking into a room and shouting to everyone, "Come talk to me, my home address is ...". All the nodes do the same, so each node ends up with information about every other node. But only the nodes that are interested in a particular node will initiate unicast communication with it, using the address it shared.
DOMAIN_ID can be thought of as a floor in the building. When multicasting, all the nodes on the same floor can contact you using the unicast address you shouted, but the members of other floors cannot hear your message and so cannot communicate with you.
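
The "shouting in a room" picture can be sketched as a toy in-memory simulation. This is not how DDS is implemented and no real network traffic is involved. The class and function names, and the addresses, are made up for illustration.

```python
from collections import defaultdict


class Participant:
    def __init__(self, name, domain_id, unicast_address):
        self.name = name
        self.domain_id = domain_id              # the "floor" this node lives on
        self.unicast_address = unicast_address  # what it shouts to everyone
        self.known_peers = {}                   # name -> unicast address

    def hear(self, sender):
        """Record an announcement from another participant."""
        if sender is not self:
            self.known_peers[sender.name] = sender.unicast_address


def run_simple_discovery(participants):
    """Every participant announces itself to everyone on the same domain."""
    floors = defaultdict(list)
    for participant in participants:
        floors[participant.domain_id].append(participant)
    for sender in participants:
        for listener in floors[sender.domain_id]:
            listener.hear(sender)


camera = Participant("camera", 0, "192.168.1.2:7410")
viewer = Participant("viewer", 0, "192.168.1.3:7410")
other_robot = Participant("other_robot", 1, "192.168.1.4:7660")
run_simple_discovery([camera, viewer, other_robot])

if __name__ == "__main__":
    print(camera.known_peers)       # {'viewer': '192.168.1.3:7410'}
    print(other_robot.known_peers)  # {}
```

Note how the number of announcements grows with every node added to a floor, which is why Simple Discovery is best kept to small systems.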
Discovery Server: In this setup, instead of every node sending its data to everyone in the network, the system is more organized. We have a discovery server, which you can think of as a 'look-up table' or a 'database' that contains information about all the nodes (participants) in the network. Each node shares its information with the server and, when needed, requests information about other participants. This way each node only gets the relevant information and not the information about all the nodes in the network. I like this approach because it also allows you to segregate your network, with multiple servers that each store information about a particular section of the network.
You can set the 'ROS_DISCOVERY_SERVER' environment variable to choose which server a node uses.
So each node only has information about the nodes on the same server.
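
Here is the look-up table idea as a toy sketch. It only models the bookkeeping: nodes register with one server and ask it about a topic they care about. The class, the server names, and the addresses are hypothetical, and the real Fast DDS Discovery Server does a lot more than this.

```python
class DiscoveryServer:
    """Toy look-up table: node name -> (address, topics)."""

    def __init__(self, name):
        self.name = name
        self._table = {}

    def register(self, node_name, address, topics):
        self._table[node_name] = (address, set(topics))

    def lookup(self, asking_node, topic):
        """Return only the peers that are relevant for `topic`."""
        return {
            name: address
            for name, (address, topics) in self._table.items()
            if topic in topics and name != asking_node
        }


# Two servers segregate the network into two sections.
arm_server = DiscoveryServer("arm")
base_server = DiscoveryServer("base")
arm_server.register("camera", "192.168.1.2:7410", ["/image"])
arm_server.register("viewer", "192.168.1.3:7410", ["/image"])
arm_server.register("gripper", "192.168.1.5:7410", ["/grip"])
base_server.register("lidar", "192.168.1.4:7410", ["/scan"])

if __name__ == "__main__":
    print(arm_server.lookup("viewer", "/image"))   # {'camera': '192.168.1.2:7410'}
    print(base_server.lookup("viewer", "/image"))  # {}
```

The viewer learns about the camera but never hears about the gripper or the lidar, which is the "only the relevant information" part of the design.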

ROS is built on DDS

DDS provides discovery, message definition, message serialization, and publish-subscribe transport. DDS (Data Distribution Service) is a specification that defines an API and communication model.
ROS is like a wrapper over the DDS API, though users in ROS2 can still access the vendor-specific DDS API when they need to.
There can be multiple DDS vendors, and ROS acts as a generic wrapper over DDS, so it is not reliant on a specific vendor. Rather, it is modular in that it lets you change the underlying vendor, which future-proofs ROS.
DDS implementations talk to each other using DDSI-RTPS (DDS Interoperability Real-Time Publish Subscribe), the wire protocol for publish and subscribe.
ROS2 has written the 'ros client library API' on top of the DDS API.
ROS2 currently supports the following DDS or RTPS vendors: 'eProsima's Fast DDS', 'RTI's Connext DDS', 'Eclipse Cyclone DDS' and 'GurumNetworks GurumDDS'.
But the default DDS vendor for ROS2 is 'eProsima's Fast DDS'.
You might be wondering what eProsima Fast DDS is: it is basically a complete open-source DDS implementation for real-time embedded architectures and operating systems.
ROS2 also allows you to switch between vendors, that is, to specify a vendor other than the default one.
```
User code -> ros client Api -> abstract DDS Api -> inner details of DDS specific API
```
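
Switching vendors is done with the RMW_IMPLEMENTATION environment variable, set before the node starts. Here is a small sketch of a helper that prepares such an environment. The helper and the short vendor keys are made up for this example, and the RMW package names are the ones I know from recent releases, so check what is actually installed on your distribution.

```python
import os

# Short vendor key (hypothetical) -> RMW package name (verify for your release).
RMW_PACKAGES = {
    "fastdds": "rmw_fastrtps_cpp",
    "cyclonedds": "rmw_cyclonedds_cpp",
    "connext": "rmw_connextdds",
    "gurumdds": "rmw_gurumdds_cpp",
}


def env_for_vendor(vendor, base_env=None):
    """Return a copy of the environment with RMW_IMPLEMENTATION set."""
    if vendor not in RMW_PACKAGES:
        raise ValueError(f"unknown vendor: {vendor!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["RMW_IMPLEMENTATION"] = RMW_PACKAGES[vendor]
    return env


if __name__ == "__main__":
    env = env_for_vendor("cyclonedds")
    print(env["RMW_IMPLEMENTATION"])  # rmw_cyclonedds_cpp
```

You would then pass the returned dictionary as `env=` to `subprocess.run` when launching a node. As with the other network settings, the variable has to be set before the process starts, and all nodes that need to talk should generally use compatible middleware.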

ROS has written a middleware, which is a collection of packages that provide a full or partial implementation of the DDS API.

+-------------------------------------------------+
|                  ROS 2 Application              |
+-------------------------------------------------+
|             ROS Client Library (rcl)            |
+-------------------------|-----------------------+
|        RMW Interface    |                       |
+------------|------------+                       |
|  rmw_fastrtps_cpp       |      Other RMWs...    |
+------------|------------+                       |
|    Fast DDS (eProsima)  |                       |
+-------------------------+-----------------------+
|            DDS / RTPS Protocol                  |
+-------------------------------------------------+

Let's dive a bit deeper into the ROS middleware

There are three main parts to the ROS middleware:

1. CMake Module Packages
2. RMW (ROS middleware) implementation package
3. ROSIDL type support package (remember how we have to add the rosidl generator dependencies to package.xml when writing a ROS2 interface?)

CMake modules help you link to the specific DDS vendor that you are trying to use, to include the DDS libraries.

CMAKE module -> link to Dependencies ( DDS libs )

RMW is the package where the actual communication API for the middleware is implemented. It is the part that allows ROS to interact with a middleware like Fast DDS, Cyclone DDS, etc.

RMW package -> implements ROS middleware api (C++) -> provides the communication functions for ROS

The ROSIDL type support package converts message data types between ROS and DDS. More specifically, it generates the code needed to serialize (convert data to a byte stream for transmission) and deserialize (convert the byte stream back to data) messages for a specific DDS implementation. It makes sure that your sensor data can be properly interpreted by both ROS and DDS.
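
To show what "serialize" and "deserialize" mean in practice, here is a hand-written sketch for a message with three float64 fields, modelled loosely on geometry_msgs/Vector3. Real type support code is generated for you and uses the CDR encoding with its own header and alignment rules, so this is only an illustration of the round trip, not the actual wire format.

```python
import struct
from dataclasses import dataclass


@dataclass
class Vector3:
    x: float
    y: float
    z: float


# Little-endian, three 64-bit floats. An assumption for this sketch, not CDR.
_FORMAT = "<3d"


def serialize(message: Vector3) -> bytes:
    """Convert the message into a byte stream for transmission."""
    return struct.pack(_FORMAT, message.x, message.y, message.z)


def deserialize(data: bytes) -> Vector3:
    """Convert the byte stream back into a message."""
    return Vector3(*struct.unpack(_FORMAT, data))


if __name__ == "__main__":
    original = Vector3(1.0, -2.5, 0.25)
    payload = serialize(original)
    print(len(payload))                      # 24
    print(deserialize(payload) == original)  # True
```

Both sides have to agree on the layout, which is exactly what the generated type support guarantees for every message type and DDS vendor.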

In short:

Your node -- (communicates with) --> rclcpp (ROS client library) -> RMW implementation (e.g. rmw_fastrtps) -> DDS or Middleware

+---------------------+         +-----------------------+         +-----------------------+
|    Your ROS Node    |-------> |   rclcpp/rcl (API)   | ------> |    RMW Implementation |
|                     |         |                       |         | (e.g., rmw_fastrtps)  |
+---------------------+         +-----------------------+         +-----------|-----------+
                                                                              |
                                                                   +----------v----------+
                                                                   |  DDS / Middleware  |
                                                                   | (e.g., Fast DDS)   |
                                                                   +--------------------+
                                                                   |  Serialization     |
+---------------------+         +-----------------------+          | (ROSIDL generated) |
|    package.xml      |-------> |   ROSIDL generator   | ------>   |   .dds (IDL file)  |
| (Dependencies, etc.)|         |   (e.g., rosidl_     |           +--------------------+
+---------------------+         |  generator_dds_idl)  |
                                +-----------------------+