Multiple parallel fi_send/fi_recv through a single FI_EP_MSG endpoint using verbs provider #10327
Unanswered
dariuszsciebura
asked this question in
Q&A
Replies: 1 comment 2 replies
-
If you need to place data into a known destination buffer, use the fi_write* APIs.
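To make the fi_write* suggestion concrete, here is a minimal sketch of the addressing side only. It assumes the receiver registers one memory region split into equally sized slots and shares its base address and key with the sender. The `remote_region` struct and `slot_addr` helper are hypothetical names for illustration and are not part of libfabric.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the receiver registers one region of n_slots
 * equally sized slots and shares base_addr (plus the MR key) with the
 * sender out of band. */
struct remote_region {
    uint64_t base_addr;  /* remote address (or offset) of slot 0 */
    size_t   slot_len;   /* bytes per slot */
    size_t   n_slots;    /* number of slots in the region */
};

/* Return the remote address of slot `idx`. The result is what a sender
 * thread would pass as the `addr` argument of fi_write/fi_writedata. */
static uint64_t slot_addr(const struct remote_region *r, size_t idx)
{
    assert(idx < r->n_slots);  /* never write past the registered region */
    return r->base_addr + (uint64_t)idx * r->slot_len;
}
```

With this layout each sender thread targets its own slot, so the placement no longer depends on the order in which operations are posted. If the receiver also needs a notification, fi_writedata can carry the slot index as immediate data. Whether `addr` is a virtual address or an offset depends on the provider's memory registration mode, so treat `base_addr` here as an assumption.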
-
Hi,
Let's imagine the following setup: we have 2 machines equipped with RNICs supported by the libfabric verbs provider. We create a single FI_EP_MSG endpoint per machine and set up the connection, so we are ready to transfer data.
Scenarios to consider:
A) On the 'sender' machine we create a single thread that posts a series of fi_send/fi_senddata calls to transfer the contents of our N independent buffers. Similarly, on the 'receiver' machine we create a single thread that receives the data (fi_recv), filling its own buffers one after another. I guess that, due to the message ordering of FI_EP_MSG, we receive the data in the order it was sent (known to both sides of the transmission), so if the sender sent buffers A, B, C, the receiver posting 3 fi_recv calls will get A, B, C.
B) On the sender machine we create N threads, each of which posts an fi_send. On the receiver side we still have only a single thread, which posts N fi_recv calls. Now the order of transfers is unknown (nondeterministic). To identify a transfer we can switch from fi_send to fi_senddata to 'tag' the messages, which helps determine which transfer has finished (when polling the CQ). But it doesn't seem to help with matching sends to their corresponding receives. So, as a result, we end up with sender data being put into arbitrary receive buffers. Am I right about it?
A remedy for this issue could be tagged messages, but they are not supported in FI_EP_MSG. Another potential solution for this issue could be scalable endpoints, but again - they are not supported in FI_EP_MSG. Is it then possible to handle such scenarios when using FI_EP_MSG with verbs?
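For scenario B, one possible workaround is to accept that an incoming message lands in whichever receive buffer was posted next, and to build the mapping afterwards from the completions. The sketch below is a self-contained model and does not call libfabric. `rx_completion`, `model_arrivals`, and `demux` are hypothetical names. It assumes that incoming sends consume posted receives in posting order, and that each receive completion exposes both the consumed buffer (for example via the context passed to fi_recv) and the immediate data from fi_senddata.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of a receive completion that
 * matter here: which posted receive was consumed, and the sender's
 * immediate data. */
struct rx_completion {
    size_t   recv_slot;  /* index of the posted receive buffer that was filled */
    uint64_t imm_data;   /* logical buffer ID chosen by the sender */
};

/* Model of the wire: the i-th message to arrive consumes the i-th posted
 * receive, regardless of which sender thread produced it. */
static void model_arrivals(const uint64_t *arrival_ids, size_t n,
                           struct rx_completion *comps)
{
    for (size_t i = 0; i < n; i++) {
        comps[i].recv_slot = i;
        comps[i].imm_data  = arrival_ids[i];
    }
}

/* Receiver-side demultiplexing: after the call, slot_of_id[id] is the
 * receive slot that holds logical buffer `id`. */
static void demux(const struct rx_completion *comps, size_t n,
                  size_t *slot_of_id)
{
    for (size_t i = 0; i < n; i++) {
        assert(comps[i].imm_data < n);  /* IDs are assumed to be 0..n-1 */
        slot_of_id[comps[i].imm_data] = comps[i].recv_slot;
    }
}
```

If the sender threads happen to post in the order C, A, B (IDs 2, 0, 1), `demux` reports that buffer A sits in receive slot 1, B in slot 2, and C in slot 0. This only works when every posted receive buffer is large enough for any of the messages, and it costs a copy or an extra level of indirection on the receiver if the data has to end up in a specific buffer.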