mctpd: AllocateEndpoints implementation #46

Open
PeterHo-wiwynn opened this issue Jul 8, 2024 · 18 comments

@PeterHo-wiwynn

We have a system with devices lower in the bus hierarchy.

As described in DSP0236, bus owners are responsible for allocating pools of EIDs to MCTP bridges that are lower in the bus hierarchy. This is done using the Allocate Endpoint IDs command.

The sequence will be like:

sequenceDiagram
    BMC->>MCTP_bridge: SetEndpoint
    MCTP_bridge->>BMC: Endpoint requires EID pool allocation.
    BMC->>MCTP_bridge: AllocateEndpoints
    MCTP_bridge ->>MCTP_devices: SetEndpoint

The BMC needs to send a Starting EID and a Number of EIDs. I think we can send the next available EID as the Starting EID when we call AssignEndpointStatic (e.g. AssignEndpointStatic with 10, and we give 11 as the Starting EID). However, it is hard to pick a specific value for the Number of EIDs. Moreover, for dynamic EIDs, it's difficult to choose which Starting EID to send. If EIDs 8 and 10 are occupied, should we send Starting EID 9 with 1 as the Number of EIDs?
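
To make the Starting EID question concrete, here is a minimal Python sketch (not mctpd code) of a first-fit search for a free contiguous run of EIDs. It assumes the assignable range is 8..254 (0 is the null EID, 1-7 are reserved, and 255 is broadcast in DSP0236); `find_pool_start` and the `used` set are hypothetical names for illustration.

```python
# Hypothetical helper, not part of mctpd: first-fit search for a free EID run.
FIRST_EID = 8    # assumption: EIDs 0-7 are null/reserved
LAST_EID = 254   # assumption: 255 is the broadcast EID


def find_pool_start(used, pool_size, first=FIRST_EID, last=LAST_EID):
    """Return the lowest start EID of a free run of pool_size EIDs, or None."""
    if pool_size <= 0:
        return None
    run_start = None
    run_len = 0
    for eid in range(first, last + 1):
        if eid in used:
            # An occupied EID breaks the current run.
            run_start, run_len = None, 0
            continue
        if run_start is None:
            run_start = eid
        run_len += 1
        if run_len == pool_size:
            return run_start
    return None


used = {8, 10}
print(find_pool_start(used, 1))  # 9
print(find_pool_start(used, 3))  # 11
```

With EIDs 8 and 10 occupied, a pool of one fits at 9, while a pool of three has to start at 11.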

We might need to clarify how we implement AllocateEndpoints in mctpd.

@jk-ozlabs jk-ozlabs self-assigned this Jul 9, 2024
@jk-ozlabs
Member

jk-ozlabs commented Jul 9, 2024

All sounds good. We'll likely need some configuration to assign these pools of EIDs to the downstream bridges.

My first impression is that this may involve @amboar's work on the OpenBMC mctpreactor to control the configuration of these, which would then send the pool data to mctpd via dbus on discovery of these bridge devices.

So, we'll need to define the appropriate place in the dbus hierarchy to represent the EID pool data.

@jk-ozlabs
Member

Although, given that we have the (requested) pool size as part of the Set Endpoint ID response, we can just allocate that directly from the existing dynamic EID pool, and not need any extra configuration. Would that work for you?

We could later add limits and pre-defined allocations from the dynamic pool if necessary, but that doesn't seem to be required at the moment.
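
As a rough illustration of where the requested pool size comes from, the sketch below decodes the response data of Set Endpoint ID. The byte layout used here (completion code, a status byte with the EID assignment status in bits [5:4] and the EID allocation status in bits [1:0], the EID setting, then the EID pool size) is an assumption based on DSP0236 and should be checked against the spec. `SetEidResponse` and `parse_set_eid_response` are hypothetical names, not mctpd functions.

```python
from dataclasses import dataclass


@dataclass
class SetEidResponse:
    completion_code: int
    eid_accepted: bool   # assumed: EID assignment status == 00b
    needs_pool: bool     # assumed: EID allocation status == 01b
    eid: int
    pool_size: int


def parse_set_eid_response(data: bytes) -> SetEidResponse:
    """Decode Set Endpoint ID response data (assumed layout, see above)."""
    if len(data) < 4:
        raise ValueError("Set Endpoint ID response too short")
    completion_code, status, eid, pool_size = data[:4]
    assignment = (status >> 4) & 0x3
    allocation = status & 0x3
    return SetEidResponse(
        completion_code=completion_code,
        eid_accepted=(assignment == 0),
        needs_pool=(allocation == 1),
        eid=eid,
        pool_size=pool_size,
    )


resp = parse_set_eid_response(bytes([0x00, 0x01, 0x0A, 0x04]))
print(resp.needs_pool, resp.pool_size)  # True 4
```

If `needs_pool` is set, `pool_size` is the number of EIDs to carve out of the dynamic pool for the follow-up Allocate Endpoint IDs command.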

@PeterHo-wiwynn
Author

Although, given that we have the (requested) pool size as part of the Set Endpoint ID response, we can just allocate that directly from the existing dynamic EID pool, and not need any extra configuration. Would that work for you?

Oh, I missed that part. This makes sense now. I think I can start implementing it, and will make a PR when it's done.

@santoshpuranik

@jk-ozlabs, @PeterHo-wiwynn: At NVIDIA, we are also looking at enabling bridge support with mctpd. Has work started on this already? I wanted to get some understanding of what that design/D-Bus API would look like.

As previous comments on this issue already point out, an endpoint can tell the TBO that it is a bridge and requires pool EID allocation as a part of the response to SetEndpointID.

In case the EID assignments on the platform are dynamic, mctpd can simply call AllocateEndpointIDs with the next available EID (and reserve that EID + pool size).
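
A minimal sketch of that dynamic case, assuming the Allocate Endpoint IDs request data is three bytes (operation flags, number of EIDs, starting EID), which should be verified against DSP0236. The helper names and the `used` set are hypothetical; this is not how mctpd is structured.

```python
OP_ALLOCATE = 0x00  # assumed encoding of the "allocate EIDs" operation


def build_allocate_eids_request(start_eid: int, pool_size: int) -> bytes:
    """Encode the assumed request data: operation flags, pool size, start EID."""
    if pool_size < 1:
        raise ValueError("pool size must be at least 1")
    if start_eid < 8 or start_eid + pool_size - 1 > 254:
        raise ValueError("pool falls outside the assignable EID range")
    return bytes([OP_ALLOCATE, pool_size, start_eid])


def reserve_pool(used: set, start_eid: int, pool_size: int) -> bytes:
    """Mark the pool as used and return the request data to send."""
    pool = set(range(start_eid, start_eid + pool_size))
    if pool & used:
        raise ValueError("pool overlaps EIDs that are already in use")
    request = build_allocate_eids_request(start_eid, pool_size)
    used.update(pool)  # only reserve once the request is known to be valid
    return request


used = {8, 9, 10}                    # 10 is the bridge's own EID here
request = reserve_pool(used, 11, 4)  # pool covers EIDs 11..14
print(request.hex())                 # 00040b
```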

However, there are platforms that desire setting fixed EIDs as the TBO. For such a use case, do you think it is reasonable to modify the AssignEndpointStatic D-Bus API to take an additional parameter (pool EID start) in addition to the EID parameter it takes today? Alternatively, we could define a new D-Bus API for bridges?

Can this EID and pool start parameter be supplied as an entity-manager configuration to the MCTP reactor?

Further, after the EID pool is assigned to the bridge, mctpd should query for the routing table from the bridge (GetRoutingTableEntries) and use the responses to setup routes and host the downstream EIDs as D-Bus objects. Does this approach seem right?

We would be happy to make PRs to add this function if you agree with this design/once we decide on the right design.

@amboar
Contributor

amboar commented Nov 13, 2024

mctpd should query for the routing table from the bridge (GetRoutingTableEntries) and use the responses to setup routes and host the downstream EIDs as D-Bus objects. Does this approach seem right?

My understanding is Get Routing Table Entries should not be required for network maintenance operations. Rather, it's a debugging tool provided by the spec (DSP0236). If a design requires its use for non-debugging purposes, my hunch is that the design needs some more consideration.

@santoshpuranik

mctpd should query for the routing table from the bridge (GetRoutingTableEntries) and use the responses to setup routes and host the downstream EIDs as D-Bus objects. Does this approach seem right?

My understanding is Get Routing Table Entries should not be required for network maintenance operations. Rather, it's a debugging tool provided by the spec (DSP0236). If a design requires its use for non-debugging purposes, my hunch is that the design needs some more consideration.

Almost none of the MCTP control commands are marked "mandatory to generate" in Table 12 of the spec. :) So in that sense, yes, Get Routing Table Entries should not be required.

However, the spec. (https://www.dmtf.org/sites/default/files/standards/documents/DSP0236_1.3.1.pdf) does say this in section 12.12:

"This command can be used to request an MCTP bridge or bus owner to return data corresponding to its
present routing table entries. This data is used to enable troubleshooting the configuration of routing
tables and to enable software to draw a logical picture of the MCTP network"

Drawing a logical picture of the MCTP network is a function of the control daemon, I think.

As a more concrete use case, one of the fields in a routing table entry is the physical transport binding identifier (byte 4). For scenarios where an endpoint has multiple (possibly redundant) physical paths from a bridge (ex. Serial/I2C/I3C), this field can be used by applications to choose the "fastest" path.

@amboar
Contributor

amboar commented Nov 13, 2024

Almost none of the MCTP control commands are marked "mandatory to generate" in Table 12 of the spec. :) So in that sense, yes, Get Routing Table Entries should not be required.

Mandatory / Optional wasn't what I was referring to, rather the description of the command itself (as you quoted).

Drawing a logical picture of the MCTP network is a function of the control daemon, I think.

Sure, but who's looking at the picture, and why? My understanding of the intent of the description is "not software, but humans".

As a more concrete use case, one of the fields in a routing table entry is the physical transport binding identifier (byte 4). For scenarios where an endpoint has multiple (possibly redundant) physical paths from a bridge (ex. Serial/I2C/I3C), this field can be used by applications to choose the "fastest" path.

This can be made with entirely local decisions at each node ("Which of the interfaces is the fastest interface for my local route entries through which the destination is reachable?"). I don't see that there's a reason to fetch another node's route table? However, the other way this is learned is via Query Hop as part of path resolution, and again, the response is generated with purely local reasoning based on the route table state at each Query Hop destination.

@santoshpuranik

Almost none of the MCTP control commands are marked "mandatory to generate" in Table 12 of the spec. :) So in that sense, yes, Get Routing Table Entries should not be required.

Mandatory / Optional wasn't what I was referring to, rather the description of the command itself (as you quoted).

Ack.

Drawing a logical picture of the MCTP network is a function of the control daemon, I think.

Sure, but who's looking at the picture, and why? My understanding of the intent of the description is "not software, but humans".

The specification does say software specifically. I understand that was the intent. As for why, please see the response below.

As a more concrete use case, one of the fields in a routing table entry is the physical transport binding identifier (byte 4). For scenarios where an endpoint has multiple (possibly redundant) physical paths from a bridge (ex. Serial/I2C/I3C), this field can be used by applications to choose the "fastest" path.

This can be made with entirely local decisions at each node ("Which of the interfaces is the fastest interface for my local route entries through which the destination is reachable?"). I don't see that there's a reason to fetch another node's route table? However, the other way this is learned is via Query Hop as part of path resolution, and again, the response is generated with purely local reasoning based on the route table state at each Query Hop destination.

Not really? Consider a case where we have an MCTP bridge that connects to the same downstream endpoint via two different PHY buses of different speeds. The bridge's routing table would expose the medium type in the routing table. A TBO above the bridge would then query the bridge for its routing table and determine which of the two EIDs (belonging to the same device) to use. A QueryHop would not work here because both EIDs are at the same distance from the bridge.
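
As a sketch of the selection described above: once the TBO has the bridge's routing table entries, the choice between the two EIDs could look like the following. The entry representation and the `MEDIUM_RANK` ordering are hypothetical (real entries carry a physical media type identifier rather than a string, and any ranking would be platform-specific).

```python
# Hypothetical relative ranking of media; higher means preferred.
MEDIUM_RANK = {"pcie": 3, "i3c": 2, "i2c": 1, "serial": 0}


def pick_eid(entries):
    """Pick the EID on the highest-ranked medium.

    entries: non-empty list of (eid, medium) pairs that all refer to the
    same physical device, as read from the bridge's routing table.
    Unknown media rank lowest.
    """
    best = max(entries, key=lambda entry: MEDIUM_RANK.get(entry[1], -1))
    return best[0]


# Same downstream device, reachable from the bridge over two buses.
print(pick_eid([(20, "i2c"), (21, "i3c")]))  # 21
```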

Another scenario is that of two bus owners for a bridge (Fig. 22, section 12.10 of DSP0236 v1.3.1). How would a bus owner query for EIDs downstream to the bridge without issuing a GetRoutingTableEntries?

In the absence of GetRoutingTableEntries, how would the control daemon know whether each EID in the pool it allocated to the bridge was actually assigned to a device that exists (the pool requested could be a maximum-configuration case for the bridge, for example)? How would the TBO learn about EIDs in the bridge's routing table that may have been statically allocated by the bridge itself?

@amboar
Contributor

amboar commented Nov 27, 2024

The specification does say software specifically. I understand that was the intent. As for why, please see the response below.

The spec does literally contain the word "software", but my interpretation of what's described is that it's a mechanism to query the network topology for the purpose of drawing a picture for people (e.g. via graphviz).

This data is used to enable troubleshooting the configuration of routing tables

That's "figure out what's wrong, after-the-fact", not "ongoing automated network maintenance", IMO.

Consider a case where we have an MCTP bridge that connects to the same downstream endpoint via two different PHY buses of different speeds. The bridge's routing table would expose the medium type in the routing table.

In its own routing table, yes. But the routing table isn't global; each node has its own understanding of the routing table's composition, and the medium type of the interface associated with an EID is only something that is represented for local peers. See the route table entry for EIDs 13 and 14 in Table 5 – Example 2 Routing table for D1 for Figure 13 – Example 2 Routing topology in DSP0236 v1.3.3. The decision of how to route a message from e.g. D2 to D5 is a local decision at bridges D1 and D3. In this case the route is distinguished by the different EIDs used on each port and is discoverable via Query Hop.

A TBO above the bridge would then query the bridge for its routing table and determine which of the two EIDs (belonging to the same device) to use.

I think this is a responsibility that sits at the application layer, above the MCTP transport. Selection of which EID to use as the destination is something the MCTP transport layer needs to be told by the application, not chosen on behalf of the application. We do have to expose enough information for the application to make practical judgements here, but that can only really be done with local information, or (perhaps) information gathered using Query Hop.

I think 9.3.2 Resolving multiple paths is instructive here:

If device D wishes to send a message to EID 100, bridge X can choose to route that message either through bus 1 or bus 2. MCTP does not have a requirement on how this is accomplished. The general recommendation is that the bridge preferentially selects the faster available medium. In this example, that would be PCIe.

(emphasis added)

Another scenario is that of two bus owners for a bridge (Fig. 22, section 12.10 of DSP0236 v1.3.1). How would a bus owner query for EIDs downstream to the bridge without issuing a GetRoutingTableEntries?

It (Device X) doesn't need to, because it allocated the EIDs to the bridge (Device Y). It already knows what EIDs Device Y is capable of handing out by definition.

In the absence of GetRoutingTableEntries, how would the control daemon know if each EID in the pool it allocated to the bridge was actually assigned to a device that exists

It doesn't need to know, because any hop in a route can choose to drop packets if the destination is not routable. The only thing that needs to be known at the BO of a bridge is the range of EIDs the BO allocated to the bridge. The bridge then drops any unroutable packets.
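
A minimal sketch of that model, with hypothetical names: the bus owner keeps one range route per allocated pool and forwards anything in the range to the bridge, without knowing which EIDs in the range are populated.

```python
from typing import NamedTuple, Optional


class RangeRoute(NamedTuple):
    start: int   # first EID in the pool allocated to the bridge
    count: int   # number of EIDs in the pool
    via: int     # EID of the bridge the pool was allocated to


def next_hop(routes, dest_eid) -> Optional[int]:
    """Return the bridge EID to forward to, or None if there is no route."""
    for route in routes:
        if route.start <= dest_eid < route.start + route.count:
            return route.via
    return None


routes = [RangeRoute(start=11, count=4, via=10)]
print(next_hop(routes, 12))  # 10: forwarded; the bridge drops it if 12 is unassigned
print(next_hop(routes, 15))  # None: outside every allocated pool
```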

How would the TBO learn about EIDs in the bridge's routing table that may have been statically allocated by the bridge itself?

The spec makes it clear that any static EID allocation in the network must be global knowledge. You cannot statically allocate an EID under a bridge without the entire network being aware of it. The spec also makes it clear that communicating static allocations of EIDs is outside the scope of the spec.

@santoshpuranik

Thanks for the responses, Andrew.

A TBO above the bridge would then query the bridge for its routing table and determine which of the two EIDs (belonging to the same device) to use.

I think this is a responsibility that sits at the application layer, above the MCTP transport. Selection of which EID to use as the destination is something the MCTP transport layer needs to be told by the application, not chosen on behalf of the application. We do have to expose enough information for the application to make practical judgements here, but that can only really be done with local information, or (perhaps) information gathered using Query Hop.

Yes, it is the responsibility of the application that is above the MCTP layer. As for exposing this information, a Query Hop does not really help in cases where the absolute number of hops taken is the same. Could you elaborate on what "local information" is, please?

I think 9.3.2 Resolving multiple paths is instructive here:

If device D wishes to send a message to EID 100, bridge X can choose to route that message either through bus 1 or bus 2. MCTP does not have a requirement on how this is accomplished. The general recommendation is that the bridge preferentially selects the faster available medium. In this example, that would be PCIe.

(emphasis added)

Yes, this works if the EID is the same and can be reached from two buses. We have cases where the same downstream device is accessible to the bridge via two different buses and two different EIDs. The Medium Type property of the device's routing table entry in the bridge can help the application running on the TBO in making the choice about the EID to use.

Another scenario is that of two bus owners for a bridge (Fig. 22, section 12.10 of DSP0236 v1.3.1). How would a bus owner query for EIDs downstream to the bridge without issuing a GetRoutingTableEntries?

It (Device X) doesn't need to, because it allocated the EIDs to the bridge (Device Y). It already knows what EIDs Device Y is capable of handing out by definition.

Referring to the same fig. in the spec. How does Device Z know about the EIDs downstream to Y?

In the absence of GetRoutingTableEntries, how would the control daemon know if each EID in the pool it allocated to the bridge was actually assigned to a device that exists

It doesn't need to know, because any hop in a route can choose to drop packets if the destination is not routable. The only thing that needs to be known at the BO of a bridge is the range of EIDs the BO allocated to the bridge. The bridge then drops any unroutable packets.

How would the TBO learn about EIDs in the bridge's routing table that may have been statically allocated by the bridge itself?

The spec makes it clear that any static EID allocation in the network must be global knowledge. You cannot statically allocate an EID under a bridge without the entire network being aware of it. The spec also makes it clear that communicating static allocations of EIDs is outside the scope of the spec.

Ack. Not sure I entirely agree when there is a facility in the spec to query the bridge for EIDs and use the information returned to build that knowledge programmatically, but sure, it is outside the scope of the spec.

@amboar
Contributor

amboar commented Dec 2, 2024

Could you elaborate on what "local information" is, please?

"Local information" was referring to the bus type of an interface over which the destination EID is reachable, as will be stored in the current node's route table.

As for exposing this information, a Query Hop does not really help in cases where the absolute number of hops taken is the same.

That's not entirely true. The Query Hop response contains the upstream and downstream transmission unit size for the target link, which may provide an indication of the potential throughput. That will likely be higher for PCIe than SMBus. However, it's only a heuristic.

Yes, this works if the EID is the same and can be reached from two buses. We have cases where the same downstream device is accessible to the bridge via two different buses and two different EIDs. The Medium Type property of the device's routing table entry in the bridge can help the application running on the TBO in making the choice about the EID to use.

This is not, in my reading, how the spec intended the route tables to work, as it requires global knowledge of the network topology at the resolution of the physical transports for each link. Certainly IP networks are not managed that way; similarly, IP nodes make routing decisions for packets based on destination constraints and then their local interface priorities.

My hunch is your design might have entered "don't do that" territory, much like putting MCTP endpoints behind I2C muxes. The alternative is that you don't assign an EID per interface on your destination device. Rather, use a single EID for all interfaces, and then the upstream bridge is able to use its local knowledge of the relevant link speeds to direct traffic appropriately. Set Endpoint ID provides for responding to the bus owner to express that it already has a dynamic EID assigned, so it shouldn't matter which interface first accepts an EID.

Referring to the same fig. in the spec. How does Device Z know about the EIDs downstream to Y?

In accordance with the statements around EID allocation races across the buses from X, there are two scenarios:

  1. Z wins the Set Endpoint ID race for Y, and subsequently allocates an EID pool to Y (which is a subset of the EID pool allocated to Z by X). Z knows the EIDs downstream to Y by definition of having allocated the pool to Y. Z updates its route table accordingly
  2. X wins the Set Endpoint ID race for Y. After allocating a pool of EIDs to Y, X updates its own route table, and then issues a Routing Information Update to Z to inform it of the EIDs associated with Y (Y's own EID and the EIDs in the pool it allocated to Y).

In the second scenario, Z will receive a response to its own Set Endpoint ID that provides the EID that Y has already been allocated, in which case Z can update its own route table to avoid the long path to Y via X (and the devices underneath Y).

@santoshpuranik

"Local information" was referring to the bus type of an interface over which the destination EID is reachable, as will be stored in the current node's route table.

OK, so the local bus to the endpoint.

As for exposing this information, a Query Hop does not really help in cases where the absolute number of hops taken is the same.

That's not entirely true. The Query Hop response contains the upstream and downstream transmission unit size for the target link, which may provide an indication of the potential throughput. That will likely be higher for PCIe than SMBus. However, it's only a heuristic.

Agree. On most implementations, though, the MTU size is the same irrespective of the transport due to the underlying packet buffers in small endpoint devices. However, certain transports can be much faster than others even with the same MTU. Like you said, the MTU size is but a heuristic.

This is not, in my reading, how the spec intended the route tables to work, as it requires global knowledge of the network topology at the resolution of the physical transports for each link. Certainly IP networks are not managed that way; similarly, IP nodes make routing decisions for packets based on destination constraints and then their local interface priorities.

My hunch is your design might have entered "don't do that" territory, much like putting MCTP endpoints behind I2C muxes. The alternative is that you don't assign an EID per interface on your destination device. Rather, use a single EID for all interfaces, and then the upstream bridge is able to use its local knowledge of the relevant link speeds to direct traffic appropriately. Set Endpoint ID provides for responding to the bus owner to express that it already has a dynamic EID assigned, so it shouldn't matter which interface first accepts an EID.

Not sure I understand the parallels with the I2C mux here :) The spec does not explicitly prohibit devices from having a distinct EID per physical bus.

Referring to the same fig. in the spec. How does Device Z know about the EIDs downstream to Y?

In accordance with the statements around EID allocation races across the buses from X, there are two scenarios:

1. `Z` wins the `Set Endpoint ID` race for `Y`, and subsequently allocates an EID pool to `Y` (which is a subset of the EID pool allocated to `Z` by `X`). `Z` knows the EIDs downstream to `Y` by definition of having allocated the pool to `Y`. `Z` updates its route table accordingly

2. `X` wins the `Set Endpoint ID` race for `Y`. After allocating a pool of EIDs to `Y`, `X` updates its own route table, and then issues a `Routing Information Update` to `Z` to inform it of the EIDs associated with `Y` (`Y`'s own EID and the EIDs in the pool it allocated to `Y`).

In the second scenario, Z will receive a response to its own Set Endpoint ID that provides the EID that Y has already been allocated, in which case Z can update its own route table to avoid the long path to Y via X (and the devices underneath Y).

Perhaps I should have made this clearer :) On the topologies we deal with, X and Z are not connected over an MCTP bus directly, so X has no way to send a Routing Information Update to Z. Z has to learn about the EID pool and downstream endpoints by querying Y's routing table.

Leaving Get Routing Table Entries aside for a bit, there are also requirements to keep the bridge's EID and those of the endpoints downstream of the bridge static. With that in mind, do you see any issues if we make a PR to do the following:

  • Extend the D-Bus API AssignEndpointStatic to take both an EID and an (optional) pool EID start in case the device is a bridge. Alternatively, we can add a new D-Bus method specifically to allocate endpoint ID + pool start EID to bridges.
  • As an implementation of the above, mctpd will add routes to the bridge itself as well as all of the downstream endpoints, query their properties (supported message types, uuid) and publish them as D-Bus objects.
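
To illustrate the first option, here is a sketch of the checks such a call might perform once the bridge has reported its pool size. The function name and parameters are hypothetical and do not reflect any existing mctpd D-Bus signature.

```python
def plan_bridge_assignment(eid, pool_start, pool_size, used):
    """Return the EIDs to reserve for a bridge, or raise if they conflict.

    eid:        EID requested for the bridge itself
    pool_start: requested first EID of the bridge's pool, or None
    pool_size:  pool size reported by the bridge (0 if it is not a bridge)
    used:       EIDs already assigned on this network
    """
    wanted = [eid]
    if pool_start is not None and pool_size > 0:
        wanted += range(pool_start, pool_start + pool_size)
    if len(set(wanted)) != len(wanted):
        raise ValueError("bridge EID lies inside its own pool")
    if any(e < 8 or e > 254 for e in wanted):
        raise ValueError("EID outside the assignable range 8..254")
    clash = sorted(set(wanted) & set(used))
    if clash:
        raise ValueError(f"EIDs already in use: {clash}")
    return wanted


print(plan_bridge_assignment(10, 11, 4, used={8, 9}))  # [10, 11, 12, 13, 14]
```

Leaving `pool_start` as `None` keeps the behaviour for non-bridge endpoints unchanged, which is the case for making the parameter optional rather than adding a separate method.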

@amboar
Contributor

amboar commented Dec 4, 2024

Not sure I understand the parallels with the I2C mux here :)

You shouldn't put MCTP-capable devices underneath an I2C mux because the mux state might prevent propagation of messages from the device. So despite the fact that you can design a system with MCTP devices under an I2C mux, my recommendation is that you don't. In much the same way, I'm suggesting that perhaps ...

The spec does not explicitly prohibit devices from having a distinct EID per physical bus.

... you shouldn't assign distinct endpoint IDs to each interface on the device where each interface has vastly different throughputs, as while no-one is stopping you from doing so, it seems you're creating trouble for yourself with the EID selection and routing.

On the topologies we deal with, X and Z are not connected over an MCTP bus directly, so X has no way to send a Routing Information Update to Z.

So Y is the bus owner for Z?

If not, you have multiple roots in your MCTP network, which (as far as I'm aware) goes outside the spec and thus is in "don't do that" territory.

Leaving Get Routing Table Entries aside for a bit, there are also requirements to keep the bridge's EID and those of the endpoints downstream of the bridge static.

I think I'm interpreting "there are also requirements..." correctly as "our platform design desires...", is that reasonable?

Regardless, "static" has a specific meaning in the context of the spec, which is "pre-configured default assigned non-zero value" (line 1071, Section 8.17.2, DSP0236 v1.3.1). Static endpoint IDs are not EIDs that are handed out by a bus owner.

I pushed back a bit against the use of the word "static" in AssignEndpointStatic because I thought it was confusing: If we're assigning an endpoint ID, then it is by definition not static (it may, however, be deterministic, depending on the implementation of the bus owner logic for all bus owners in the device's hierarchy).

It feels like that confusion might be bearing out here.

as well as all of the downstream endpoints

mctpd only has immediate visibility of local endpoints, not endpoints underneath a (local) bridge. If there's out-of-band knowledge of the presence of a device below a bridge, something will need to trigger mctpd to query it, if mctpd is to expose an object for it.

@santoshpuranik

santoshpuranik commented Dec 4, 2024

You shouldn't put MCTP-capable devices underneath an I2C mux because the mux state might prevent propagation of messages from the device. So despite the fact that you can design a system with MCTP devices under an I2C mux, my recommendation is that you don't. In much the same way, I'm suggesting that perhaps ...

Ah! yeah, the whole I2C mux and PLDM type 5/events not supported. Don't get me started on that :)

The spec does not explicitly prohibit devices from having a distinct EID per physical bus.

... you shouldn't assign distinct endpoint IDs to each interface on the device where each interface has vastly different throughputs, as while no-one is stopping you from doing so, it seems you're creating trouble for yourself with the EID selection and routing.

Noted, but this is something we do have to deal with right now because we cannot have all devices change their firmware to deal with this change

On the topologies we deal with, X and Z are not connected over an MCTP bus directly, so X has no way to send a Routing Information Update to Z.

So Y is the bus owner for Z?

If not, you have multiple roots in your MCTP network, which (as far as I'm aware) goes outside the spec and thus is in "don't do that" territory.

No, Z is another bus owner for Y over a different physical bus than X. That is kind of a grey area in the spec., I think. It does not explicitly call out anti-patterns. Nonetheless, we have no choice but to support a topology like that since it already exists, though we could certainly influence future topologies to "don't do that".

Leaving Get Routing Table Entries aside for a bit, there are also requirements to keep the bridge's EID and those of the endpoints downstream of the bridge static.

I think I'm interpreting "there are also requirements..." correctly as "our platform design desires...", is that reasonable?

Yes, "it is desired that" ...

Regardless, "static" has a specific meaning in the context of the spec, which is "pre-configured default assigned non-zero value" (line 1071, Section 8.17.2, DSP0236 v1.3.1). Static endpoint IDs are not EIDs that are handed out by a bus owner.

I pushed back a bit against the use of the word "static" in AssignEndpointStatic because I thought it was confusing: If we're assigning an endpoint ID, then it is by definition not static (it may, however, be deterministic, depending on the implementation of the bus owner logic for all bus owners in the device's hierarchy).

It feels like that confusion might be bearing out here.

Absolutely! I meant to say deterministic EID assignments across devices that the BMC manages (including those underneath any bridges)

as well as all of the downstream endpoints

mctpd only has immediate visibility of local endpoints, not endpoints underneath a (local) bridge. If there's out-of-band knowledge of the presence of a device below a bridge, something will need to trigger mctpd to query it, if mctpd is to expose an object for it.

Okay, this confuses me -- When mctpd sends AllocateEndpoints to the bridge (as a follow up to calling SetEndpointID to the bridge), it already knows the pool EID start and the pool size, so it does have the knowledge of downstream endpoint IDs that the bridge could have assigned.

So, we should just use that information to start querying for those EID's properties and publish them on D-Bus (for those that do respond)? Isn't that what @jk-ozlabs suggested earlier in this issue here: #46 (comment)?

If I read that wrong, how do you think mctpd would know about the endpoints downstream of the bridge? Do you have a flow in mind? It isn't always possible to have out-of-band knowledge of what is downstream.

@amboar
Contributor

amboar commented Dec 5, 2024

as well as all of the downstream endpoints

mctpd only has immediate visibility of local endpoints, not endpoints underneath a (local) bridge. If there's out-of-band knowledge of the presence of a device below a bridge, something will need to trigger mctpd to query it, if mctpd is to expose an object for it.

Okay, this confuses me -- When mctpd sends AllocateEndpoints to the bridge (as a follow up to calling SetEndpointID to the bridge), it already knows the pool EID start and the pool size, so it does have the knowledge of downstream endpoint IDs that the bridge could have assigned.

Sorry, on reflection what I wrote was a bit subtle. You're right that mctpd knows the endpoint ID pool it allocated to the bridge. It's certainly possible that mctpd could regularly poll the EIDs it allocated to the bridge's pool.

But to clarify, I wasn't talking about endpoint IDs, rather the endpoint devices themselves. Unless there are exceptional circumstances, only the bridge is aware of the coming and going of devices on buses for which it is the owner. Thus mctpd is reduced to polling the EIDs it allocated to the bridge's pool, unless some other hardware signal provides presence information for these devices to mctpd. mctpd is also not in charge of managing those (non-local) devices (the bridge is), and so I think would expose a limited set of device properties (e.g. network, UUID) and none of the usual management capabilities (endpoint ID management, recovery).

It isn't always possible to have out-of-band knowledge of what is downstream.

Right, short of mctpd polling non-local EIDs for responses, I think it would require explicit hardware support.

Is it reasonable for mctpd to be regularly polling every EID that it has allocated out? I'm a bit wary of the idea, but I understand why it might feel desirable.
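
To make the cost of that polling concrete, here is a rough Python sketch of a single sweep over a bridge's pool. It is not mctpd code: `sweep_pool` and the `probe` callback are hypothetical names, and `probe` stands in for whatever control request would be used to check that an EID responds.

```python
from typing import Callable, Iterable, Set, Tuple


def sweep_pool(
    pool: Iterable[int],
    known: Set[int],
    probe: Callable[[int], bool],
) -> Tuple[Set[int], Set[int], Set[int]]:
    """Probe every EID in the pool once and diff against the previous sweep.

    `probe` is a hypothetical callback that returns True if the EID answered.
    Returns (present, appeared, disappeared).
    """
    present = {eid for eid in pool if probe(eid)}
    appeared = present - known      # candidates for new D-Bus objects
    disappeared = known - present   # candidates for removal
    return present, appeared, disappeared


if __name__ == "__main__":
    # Assumed example: a pool of 4 EIDs starting at 11, where only 11 and 13 answer.
    responders = {11, 13}
    present, appeared, disappeared = sweep_pool(
        range(11, 15), set(), lambda eid: eid in responders
    )
    print(sorted(present), sorted(appeared), sorted(disappeared))
```

Each sweep costs one request (and potentially one timeout) per allocated EID, which is why the pool size and the polling interval both matter here.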

Regarding:

Absolutely! I meant to say deterministic EID assignments across devices that the BMC manages (including those underneath any bridges)

and

The spec does not explicitly prohibit devices from having a distinct EID per physical bus.

... you shouldn't assign distinct endpoint IDs to each interface on the device where each interface has vastly different throughputs, as while no-one is stopping you from doing so, it seems you're creating trouble for yourself with the EID selection and routing.

Noted, but this is something we do have to deal with right now because we cannot have all devices change their firmware to deal with this change

and

Nonetheless, we have no choice but to support a topology like that since it already exists, though we could certainly influence future topologies to "don't do that".

I have a few thoughts:

  • If you're relying on EID allocations that are deterministic, then is it the case that your platform has a-priori knowledge of which EID is associated with the faster bus interface? One strategy you could use is to do the polling you're proposing for mctpd in the affected applications: Probe the responsiveness of the EID on the fast bus, use it if it's responsive, and if not, fall back to using the EID of the slow bus (... if that's functional). In this case there are no mctpd changes required, just some logic in your affected applications that are already dealing with your deterministic EID assignment, which is where the choice about the destination EID needs to be made anyway.
  • I would like to make a distinction about your use of "we" in "we cannot have all devices change their firmware to deal with this change" - I don't see it as referring to the community of the mctp project here, rather the organisation behind your current platform of interest. While the community here (what I would pick as the meaning for "we" here in the issue tracker) will endeavour to help, IMO it shouldn't feel obliged to support platform designs that appear to be lurking in grey areas of the spec.
  • Together it feels like for future designs you might make different design choices that better align with the intent of the spec. If so, I think it's reasonable that we don't add (what I think are) work-arounds for your current platform to mctpd. If you think changes to mctpd are necessary it might be best if they prove their need in a fork for your current platform. However, I'll defer to @jk-ozlabs and @mkj for their thoughts here too.
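
The application-side fallback in the first point could look roughly like the Python sketch below. The function name and the `is_responsive` callback are assumptions for illustration; how an application actually probes an EID is platform-specific.

```python
from typing import Callable, Optional


def pick_destination_eid(
    fast_eid: int,
    slow_eid: int,
    is_responsive: Callable[[int], bool],
) -> Optional[int]:
    """Prefer the EID on the fast bus, fall back to the slow bus, else None.

    `is_responsive` is a hypothetical probe supplied by the application.
    """
    if is_responsive(fast_eid):
        return fast_eid
    if is_responsive(slow_eid):
        return slow_eid
    return None
```

The choice stays in the application that already knows the deterministic EID assignment, so nothing in this sketch depends on changes to mctpd.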

@santoshpuranik

Sorry, on reflection what I wrote was a bit subtle. You're right that mctpd knows the endpoint ID pool it allocated to the bridge. It's certainly possible that mctpd could regularly poll the EIDs it allocated to the bridge's pool.

But to clarify, I wasn't talking about endpoint IDs, rather the endpoint devices themselves. Unless there are exceptional circumstances, only the bridge is aware of the coming and going of devices on buses for which it is the owner. Thus mctpd is reduced to polling the EIDs it allocated to the bridge's pool, unless some other hardware signal provides presence information for these devices to mctpd. mctpd is also not in charge of managing those (non-local) devices (the bridge is), and so I think would expose a limited set of device properties (e.g. network, UUID) and none of the usual management capabilities (endpoint ID management, recovery).

I see. For starters, I think it would suffice that mctpd follow up the AllocateEndpoints with querying downstream endpoints for their MCTP message type and UUIDs. Like you describe above, we should not allow any "normal" management operations on these downstream endpoints.

It isn't always possible to have out-of-band knowledge of what is downstream.

Right, short of mctpd polling non-local EIDs for responses, I think it would require explicit hardware support.

Is it reasonable for mctpd to be regularly polling every EID that it has allocated out? I'm a bit wary of the idea, but I understand why it might feel desirable.

To solve this very issue, our downstream MCTP control daemon implementation relies on a DiscoveryNotify from the bridge and uses the bridge's routing table to determine the current status of the endpoints downstream (which is what I proposed mctpd do in #53).

I have a few thoughts:

* If you're relying on EID allocations that are deterministic, then is it the case that your platform has a-priori knowledge of which EID is associated with the faster bus interface? One strategy you could use is to do the polling you're proposing for `mctpd` in the affected applications: Probe the responsiveness of the EID on the fast bus, use it if it's responsive, and if not, fall back to using the EID of the slow bus (... if that's functional). In this case there are no `mctpd` changes required, just some logic in your affected applications that are already dealing with your deterministic EID assignment, which is where the choice about the destination EID needs to be made anyway.

Sure, that is reasonable. As long as we have a solid mechanism to control static EID assignments across MCTP interfaces :)

* I would like to make a distinction about your use of "we" in "we cannot have all devices change their firmware to deal with this change" - I don't see it as referring to the community of the mctp project here, rather the organisation behind your current platform of interest. While the community here (what I would pick as the meaning for "we" here in the issue tracker) will endeavour to help, IMO it shouldn't feel obliged to support platform designs that appear to be lurking in grey areas of the spec.

Understood. It wasn't my intention to impose any of our platform's design choices/quirks on mctpd. I was merely pointing out that the end devices we deal with expect different EIDs on different PHY buses.

* Together it feels like for future designs you might make different design choices that better align with the intent of the spec. If so, I think it's reasonable that we don't add (what I think are) work-arounds for your current platform to `mctpd`. If you think changes to `mctpd` are necessary it might be best if they prove their need in a fork for your current platform. However, I'll defer to @jk-ozlabs and @mkj for their thoughts here too.

OK. I do appreciate the discussion on this issue here! I would like to focus on the AllocateEndpoints impl. for a bit and we can discuss the GetRoutingTableEntries impl. (if at all needed) in another issue.

Do you think the following is a reasonable implementation of AllocateEndpoints:

  • mctpd calls SetEndpointID to an MCTP device based on an external trigger like the AssignEndpoint D-Bus API.
  • If the device on the other side is an MCTP bridge, it might indicate that it needs pool EID allocation in the response to SetEndpointID.
  • mctpd will then call AllocateEndpoints with the start EID of the pool, after reserving a contiguous pool of EIDs of the requested size beginning at that start EID (assuming we have enough EIDs available).
  • mctpd will add routes to all the EIDs in the range [pool EID start, pool EID start + pool size - 1].
  • mctpd will query properties from the EIDs in the range above (MCTP message types and UUID).
  • mctpd will publish the bridge as well as the range of EIDs allocated to the bridge as D-Bus objects along with the properties queried above.
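
As a rough illustration of the pool reservation step for the dynamic case, here is a Python sketch that picks the lowest contiguous free range. It is only a sketch under assumptions: `find_pool_start` is a hypothetical helper, not an mctpd function, and it assumes the allocatable range is 8 to 254 (EIDs 0 to 7 are null/reserved and 255 is broadcast in DSP0236).

```python
from typing import Iterable, Optional

EID_MIN = 8    # EIDs 0-7 are null/reserved in DSP0236
EID_MAX = 254  # 255 is the broadcast EID


def find_pool_start(
    used: Iterable[int], pool_size: int, first: int = EID_MIN
) -> Optional[int]:
    """Return the lowest start EID >= first such that
    start .. start + pool_size - 1 are all unused, or None if no run fits."""
    if pool_size < 1:
        return None
    taken = set(used)
    run_start = None
    for eid in range(max(first, EID_MIN), EID_MAX + 1):
        if eid in taken:
            run_start = None  # the contiguous run is broken, start over
            continue
        if run_start is None:
            run_start = eid
        if eid - run_start + 1 == pool_size:
            return run_start
    return None
```

For the case raised at the top of this issue (EIDs 8 and 10 occupied), this returns 9 for a pool of one EID, while any larger pool has to start at 11.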

In addition to the above (which handles dynamic EID allocations), I propose that we:

  • Extend the AssignEndpointStatic method to take an additional argument to get a static EID for the pool start.
  • The rest of the implementation follows the steps outlined above, but uses the static pool start EID supplied by the caller.
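
For the static variant, the main extra step is checking that the caller-supplied range is actually free. A minimal Python sketch, assuming a hypothetical `reserve_static_pool` helper and the same 8 to 254 allocatable range (neither is taken from mctpd):

```python
from typing import Set


def reserve_static_pool(used: Set[int], pool_start: int, pool_size: int) -> range:
    """Reserve pool_start .. pool_start + pool_size - 1, or raise ValueError.

    `used` is the set of EIDs already assigned; it is updated on success.
    """
    if pool_size < 1:
        raise ValueError("pool size must be at least 1")
    pool = range(pool_start, pool_start + pool_size)
    if pool[0] < 8 or pool[-1] > 254:
        raise ValueError("pool falls outside the allocatable EID range 8-254")
    clash = used.intersection(pool)
    if clash:
        raise ValueError(f"EIDs already in use: {sorted(clash)}")
    used.update(pool)
    return pool
```

Rejecting the request outright on a clash, rather than silently moving the pool, keeps the deterministic assignment that motivates the static API in the first place.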

@amboar
Contributor

amboar commented Dec 11, 2024

I see. For starters, I think it would suffice that mctpd follow up the AllocateEndpoints with querying downstream endpoints for their MCTP message type and UUIDs.

So my concern is it's not clear to me that immediately querying the EIDs allocated to the bridge's pool should succeed; there's going to be some level of concurrency there that may not guarantee they've been assigned prior to the query from the bridge's bus owner, which I think gets us back to some level of polling.

To solve this very issue, our downstream MCTP control daemon implementation relies on a DiscoveryNotify from the bridge and uses the bridge's routing table to determine the current status of the endpoints downstream (which is what I proposed mctpd do in #53).

From 12.15 Discovery Notify, DSP0236 v1.3.1:

This message should only be sent from endpoints to the bus owner for the bus that the endpoint is on so
it can notify the bus owner that the endpoint has come online and may require an EID assignment or
update.

In my reading that description excludes a bridge from notifying its own bus owner about EID assignment events to devices downstream of the bridge. The intent is that it's purely a signal for the device's bus owner to assign an EID to the device.

  • mctpd will query properties from the EIDs in the range above (MCTP message types and UUID).
  • mctpd will publish the bridge as well as the range of EIDs allocated to the bridge as D-Bus objects along with the properties queried above.

The only concern I have here is that polling seems necessary, as discussed previously.

  • Extend the AssignEndpointStatic method to take an additional argument to get a static EID for the pool start.

I'm going to defer to @jk-ozlabs and @mkj 's thoughts there. I don't have an immediate opposition beyond vague concerns about spreading complexity of the network configuration (but that's already the case for AssignEndpointStatic).

@santoshpuranik

So my concern is it's not clear to me that immediately querying the EIDs allocated to the bridge's pool should succeed; there's going to be some level of concurrency there that may not guarantee they've been assigned prior to the query from the bridge's bus owner, which I think gets us back to some level of polling.

Yes, that is a fair point. Could that just be done via giving the bridge a "reasonable" amount of time to have completed all the downstream EID pool assignments? We can follow this up later with a polling implementation.
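
Something like the following Python sketch is what I have in mind: wait a grace period, then query the pool with a bounded number of retries. All names and default timings here are placeholders, and `query` is a hypothetical callback that returns the endpoint's properties or `None` if it did not respond.

```python
import time
from typing import Callable, Dict, Iterable, Optional


def query_pool_with_retries(
    pool: Iterable[int],
    query: Callable[[int], Optional[dict]],
    grace_s: float = 1.0,
    retry_interval_s: float = 0.5,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[int, dict]:
    """Query each pool EID after a grace period, retrying the silent ones.

    Returns a map of EID -> properties for the EIDs that answered.
    """
    pending = set(pool)
    found: Dict[int, dict] = {}
    sleep(grace_s)  # give the bridge time to assign EIDs downstream
    for attempt in range(max_attempts):
        for eid in sorted(pending):
            props = query(eid)
            if props is not None:
                found[eid] = props
        pending.difference_update(found)
        if not pending or attempt == max_attempts - 1:
            break
        sleep(retry_interval_s)
    return found
```

EIDs that never answer are simply left out of the result, so unused pool entries would not get D-Bus objects in this scheme.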

To solve this very issue, our downstream MCTP control daemon implementation relies on a DiscoveryNotify from the bridge and uses the bridge's routing table to determine the current status of the endpoints downstream (which is what I proposed mctpd do in #53).

From 12.15 Discovery Notify, DSP0236 v1.3.1:

This message should only be sent from endpoints to the bus owner for the bus that the endpoint is on so
it can notify the bus owner that the endpoint has come online and may require an EID assignment or
update.

In my reading that description excludes a bridge from notifying its own bus owner about EID assignment events to devices downstream of the bridge. The intent is that it's purely a signal for the device's bus owner to assign an EID to the device.

Agreed. Let us discuss DiscoveryNotify handling in a separate issue. You are right that the spec. does not explicitly call out the above as a use-case. I can talk to the author of the spec. to get this added, since they are the ones who suggested we use this mechanism in our current downstream implementation, but that is not going to happen immediately :).

  • mctpd will query properties from the EIDs in the range above (MCTP message types and UUID).
  • mctpd will publish the bridge as well as the range of EIDs allocated to the bridge as D-Bus objects along with the properties queried above.

The only concern I have here is that polling seems necessary, as discussed previously.

  • Extend the AssignEndpointStatic method to take an additional argument to get a static EID for the pool start.

I'm going to defer to @jk-ozlabs and @mkj 's thoughts there. I don't have an immediate opposition beyond vague concerns about spreading complexity of the network configuration (but that's already the case for AssignEndpointStatic).

Ack. Thanks.
