Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Denying/allowing topics on the subscriber side does not work as expected #241

Closed
lukaszanger opened this issue Sep 3, 2024 · 10 comments
Labels
bug Something isn't working

Comments

@lukaszanger
Copy link

lukaszanger commented Sep 3, 2024

Describe the bug

Denying and allowing topics via the configuration file does not work as expected for zenoh-bridge-ros2dds. In the minimal example setup below, even if explicitely denying a specific topic, it is nevertheless subscribed by the Zenoh bridge (and published as a ROS2 topic).

To reproduce

To reproduce the bug, start the following minimal example via docker compose up.

In the setup, we have a publishing ROS2 node (with a publishing Zenoh bridge) with ROS_DOMAIN_ID=1 in a specific Docker network rosnet-pub that publishes a topic with 1 Hz. On the other hand, we have two subscribing Zenoh bridges both with ROS_DOMAIN_ID=2 and both in Docker network rosnet-sub.

One of the subscribing Zenoh bridges allows the topic /pub. The other subscribing Zenoh bridge denies this topic. Nevertheless, it seems that both of these subscribing Zenoh bridges are publishing the ROS2 topic /pub which is not expected. What is observed: ros2 topic info /pub shows that it is published by two ROS2 nodes. ros2 topic hz /pub gives a frequency in the range of 2000 - 4000 Hz (and not 1 Hz).

Expected outcome: Zenoh subscriber sub1 publishes the ROS2 topic /pub. Zenoh subscriber sub2 does not publish anything.
Actual outcome: Both Zenoh subscribers publish the ROS2 topic /pub.

Docker Compose File

docker-compose.yml:

services:

  zenoh-router:
    image: eclipse/zenoh:0.11.0
    restart: unless-stopped
    ports:
      - 7447:7447
    networks:
      - rosnet-pub
      - rosnet-sub

  pub:
    image: rwthika/ros2:latest
    environment:
      - ROS_DOMAIN_ID=1
      - RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
    command:
      - /bin/bash
      - -c
      - |
        apt-get update
        apt-get install -y ros-humble-rmw-cyclonedds-cpp
        ros2 topic pub /pub std_msgs/msg/String "data: 'Topic from pub'" --node-name pub
    networks: 
      - rosnet-pub

  zenoh-bridge-dds-pub:
    image: eclipse/zenoh-bridge-ros2dds:0.11.0
    restart: unless-stopped
    command: -c /config.json5 -d 1
    environment:
      - ROS_DISTRO=humble
    networks: 
      - rosnet-pub
    volumes:
      - ./zenoh_pub.json5:/config.json5

  sub1:
    image: rwthika/ros2:latest
    environment:
      - ROS_DOMAIN_ID=2
      - RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
    command:
      - /bin/bash
      - -c
      - |
        apt-get update
        apt-get install -y ros-humble-rmw-cyclonedds-cpp
        sleep infinity
    networks: 
      - rosnet-sub

  zenoh-bridge-dds-sub1:
    image: eclipse/zenoh-bridge-ros2dds:0.11.0
    restart: unless-stopped
    command: -c /config.json5 -d 2
    environment:
      - ROS_DISTRO=humble
    networks: 
      - rosnet-sub
    volumes:
      - ./zenoh_sub1.json5:/config.json5

  sub2:
    image: rwthika/ros2:latest
    environment:
      - ROS_DOMAIN_ID=2
      - RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
    command:
      - /bin/bash
      - -c
      - |
        apt-get update
        apt-get install -y ros-humble-rmw-cyclonedds-cpp
        sleep infinity
    networks:
      - rosnet-sub

  zenoh-bridge-dds-sub2:
    image: eclipse/zenoh-bridge-ros2dds:0.11.0
    restart: unless-stopped
    command: -c /config.json5 -d 2
    environment:
      - ROS_DISTRO=humble
    networks: 
      - rosnet-sub
    volumes:
      - ./zenoh_sub2.json5:/config.json5

networks:
  rosnet-pub:
  rosnet-sub:

Zenoh Config of publisher

zenoh_pub.json5:

{
  plugins: {
    ros2dds: {
      id: "zenoh_pub",
      nodename: "zenoh_bridge_ros2dds_pub",
      allow: {
        publishers: ["/pub"],
      },
    },
  },
  mode: "client",
  connect: {
    endpoints: [
      "tcp/zenoh-router:7447"
    ]
  },
}

Zenoh Config of subscriber no. 1:

zenoh_sub1.json5:

{
  plugins: {
    ros2dds: {
      id: "zenoh_sub1",
      nodename: "zenoh_bridge_ros2dds_sub1",
      allow: {
        subscribers: ["/pub"],
      },
    },
  },
  mode: "client",
  connect: {
    endpoints: [
      "tcp/zenoh-router:7447"
    ]
  },
}

Zenoh Config of subscriber no. 2

zenoh_sub2.json5:

{
  plugins: {
    ros2dds: {
      id: "zenoh_sub2",
      nodename: "zenoh_bridge_ros2dds_sub2",
      deny: {
        subscribers: ["/pub"],
      },
    },
  },
  mode: "client",
  connect: {
    endpoints: [
      "tcp/zenoh-router:7447"
    ]
  },
}

System info

@lukaszanger lukaszanger added the bug Something isn't working label Sep 3, 2024
@lreiher
Copy link

lreiher commented Sep 4, 2024

Something to add to my colleague's issue description: you may wonder why we are not simply using a single Zenoh bridge on the sub side. The reason is that we are employing Zenoh in a system where we want to dynamically launch and shutdown certain communication channels, so we're effectively launching one bridge per topic.

@TomGrimwood
Copy link

TomGrimwood commented Sep 10, 2024

Initially I thought that because you do not deny sub2 to publish "/pub", it will be republishing sub1_bridge "/pub", but this did not help, and pub_bridge shouldn't subscribe to it anyway creating a loop.

However, wouldn't having two Zenoh bridges on the same ROS_DOMAIN_ID be incorrect usage of the bridge?

The readme states:

It's important to make sure that NO DDS communication can occur between 2 hosts that are bridged by zenoh-bridge-ros2dds. Otherwise, some duplicate or looping traffic can occur.

Running zenoh-bridge-dds-sub1 and zenoh-bridge-dds-sub2 on ROS_DOMAIN_ID=2 seems to break the system, as expected?

@lreiher
Copy link

lreiher commented Sep 10, 2024

Thank you for pointing out that sentence from the README. This might be an unfortunate limitation of the zenoh-bridge-ros2dds for us. Using separate ROS_DOMAIN_IDs for both subscribers does not make sense in our use case, as we want them to both operate on the same ROS domain.

Effectively, what we are trying to do is build a dynamic system. Let's say we have two hosts and are already bridging /topic/a between them. At some point in time we want to automatically launch another communication channel for bridging /topic/b and /topic/c. After some more time, we might want to close down the communication channels for /topic/b and /topic/c then.

We come from using mqtt_client for this. We would simply launch another two mqtt_clients (equivalent to zenoh-bridge-ros2dds) on either host. We now wanted to explore the system's performance using Zenoh. However, it looks like we cannot have two zenoh-bridge-ros2dds operating on the same host on the same ROS_DOMAIN_ID.

Another option could be to reconfigure an existing bridge, e.g., via ROS 2 services. I have found #45, but this doesn't really help us directly, I guess. Shutting down the existing bridge and launching a new one is not really an option, because we would lose traffic in the transition.

Any help is appreciated. :)

@TomGrimwood
Copy link

TomGrimwood commented Sep 10, 2024

From what I understand, if you were able to whitelist topics at runtime, this would be a satisfactory solution?

A very weird idea to try this would be if you are able to have strange topic names.

pub1 publisher whitelist: [".*/pub1/.*"]
sub1 subscriber whitelist [".*/sub1/.*"]
sub2 subscriber whitelist [".*/sub2/.*"]

Then if you create a topic called "/pub1/sub1/battery_state" it would be sent from p1 to s1.
And later a topic called "/pub1/sub2/battery_state" it would be sent from p1 to s2.
or even "/pub1/sub1/sub2/battery_state" to be sent to both.

Maybe this is not a solution for you but an interesting thought.

This would still require all bridges running on different DOMAIN_ID's.

@lreiher
Copy link

lreiher commented Sep 11, 2024

Not really sure I understand your idea. However, we definitely need to be on the same ROS_DOMAIN_ID everywhere, as Sub1 and Sub2 (and others) also need to directly talk to each other, without Zenoh.

As I mentioned, we could achieve the same if we had the option to dynamically reconfigure allow/deny of each Zenoh bridge.

@lreiher
Copy link

lreiher commented Sep 24, 2024

@JEnoch
Copy link
Member

JEnoch commented Nov 29, 2024

Hi @lreiher,
I finally found time to investigate further...

What is observed: ros2 topic info /pub shows that it is published by two ROS2 nodes

This was a bug that is now fixed by #344.

ros2 topic hz /pub gives a frequency in the range of 2000 - 4000 Hz (and not 1 Hz).

The cause is a loop between the 2 bridges on domain 1:

  • the 1st bridge allows subscriptions on /pub and thus creates a route from Zenoh to DDS with an DDS Writer on /pub to republish the data coming from the remote bridge on domain 0.
  • the 2nd bridge discovers this DDS Writer on pub. As it's not denied, it creates a route for it (DDS to Zenoh) and the data flows back to the 1st bridge via Zenoh...

A simple fix is to also deny the publishers on /pub for the 2nd bridge.

@lukaszanger
Copy link
Author

lukaszanger commented Dec 6, 2024

Hi @JEnoch,
thanks for your answer and thanks for fixing the bug!
To sum up, the problem that we had, is solved now. Two aspects were relevant to achieve our expected behavior:

First: Your bug fix in #344. Denying/allowing topics now works as expected.

Second: We have to explicitely allow only those topics that should actually be published/subscribed by the corresponding bridge. Otherwise, our subscribing bridges also act as publishers which leads to (in our case) undesired behavior.

As a result of the second aspect, we need to explicitely add empty lists to the allow entry. E.g., in case of sub1:

  ...
  allow: {
    publishers: [],
    subscribers: ["/pub"],
  },
  ...

In summary, both aspects were important to achieve our desired functionality!

@JEnoch
Copy link
Member

JEnoch commented Dec 6, 2024

As a result of the second aspect, we need to explicitely add empty lists to the allow entry

No: not specifying any publishers field in the allow struct is exactly the same than specifying an empty list with publishers: [].

The problem is rather the configuration of the bridge "Subscriber" 2 and its interaction with the bridge "Subscriber" 1.
Let me try to clarify with a diagram:
Capture d’écran 2024-12-06 à 11 29 02

The bridge "Subscriber" 2 is denied to route DDS Subscribers on "/pub". But is by default allowed to route anything else, including the DDS Publishers on "/pub". Therefore, when discovering the DDS Publisher created by the bridge "Subscriber" 1, it routes it back to Zenoh, and thus to bridge "Subscriber" 1.
This creates a loop of traffic between the 2 bridges, using DDS in one way and Zenoh in the other way.

If you configure the bridge "Subscriber" 2 to deny also the DDS Publishers on "/pub", it will ignore the bridge "Subscriber" 1, and there will be no loop:

...
deny: {
  subscribers: ["/pub"],
  publishers: ["/pub"],
},
...

@lukaszanger
Copy link
Author

No: not specifying any publishers field in the allow struct is exactly the same than specifying an empty list with publishers: [].

Yes, you are right. This is also explained here in the DEFAULT_CONFIG.json5. Thanks for clarification!

To sum it up: This issue is solved with the following two modifications:

  1. Your bug fix in On remote announcement, check local allow/deny config #344.
  2. In addition to my original setup, also deny the publisher /pub as you explained in your previous comment.

So, we can close this issue.

@JEnoch JEnoch closed this as completed Dec 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants