-
Notifications
You must be signed in to change notification settings - Fork 386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
prov/rxm: add FI_PEER support to rxm #10510
base: main
Are you sure you want to change the base?
Conversation
(will move first 3 commits into separate PR, leaving here for testing) |
c860a1e
to
79f6e8b
Compare
prov/rxm/src/rxm_cq.c
Outdated
ret = ofi_cq_write_error_trunc(rx_buf->ep->util_ep.rx_cq, | ||
ret = ofi_peer_cq_write_error_trunc(rx_buf->ep->util_ep.rx_cq, | ||
rx_buf->recv_entry->context, | ||
rx_buf->recv_entry->comp_flags | | ||
rx_buf->pkt.hdr.flags, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alignment is off after the change.
rxm_cq_write(rxm_ep->util_ep.tx_cq, app_context, | ||
comp_flags, 0, NULL, 0, 0); | ||
(void) ofi_peer_cq_write(rxm_ep->util_ep.tx_cq, app_context, | ||
comp_flags, 0, NULL, 0, 0, FI_ADDR_NOTAVAIL); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alignment.
prov/rxm/src/rxm_cq.c
Outdated
|
||
if (ofi_peer_cq_write_error(rxm_ep->util_ep.rx_cq, &err_entry)) | ||
FI_WARN(&rxm_prov, FI_LOG_CQ, | ||
"Unable to ofi_peer_cq_Wwrite_error\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo "Wwrite"
prov/rxm/src/rxm_ep.c
Outdated
rxm_cq_write_rx_error(def_tx_entry->rxm_ep, | ||
ofi_op_msg, | ||
def_tx_entry->rndv_ack.rx_buf-> | ||
recv_entry->context, (int) ret); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try break the line after (
so that all the arguments are aligned. That's the preferred way if lines become too long with the regular style. Same for a few more places following this.
prov/rxm/src/rxm_tagged.c
Outdated
ret = ofi_peer_cq_write_error_peek(rxm_ep->util_ep.rx_cq, tag, | ||
context); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check the alignment.
prov/rxm/src/rxm_msg.c
Outdated
return util_srx_generic_recv(&rxm_ep->srx->ep_fid, msg->msg_iov, | ||
msg->desc, msg->iov_count, msg->addr, msg->context, | ||
flags | rxm_ep->util_ep.rx_msg_flags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alignment.
prov/rxm/src/rxm_msg.c
Outdated
return util_srx_generic_recv(&rxm_ep->srx->ep_fid, &iov, &desc, 1, | ||
src_addr, context, rxm_ep->util_ep.rx_op_flags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alignment.
prov/rxm/src/rxm_msg.c
Outdated
return util_srx_generic_recv(&rxm_ep->srx->ep_fid, iov, desc, count, | ||
src_addr, context, rxm_ep->util_ep.rx_op_flags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alignment.
prov/rxm/src/rxm_tagged.c
Outdated
return util_srx_generic_trecv(&rxm_ep->srx->ep_fid, &iov, &desc, 1, | ||
src_addr, context, tag, ignore, | ||
rxm_ep->util_ep.rx_op_flags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alignment off by 1 space.
prov/rxm/src/rxm_ep.c
Outdated
|
||
if (attr->op_flags & FI_PEER) { | ||
rxm_domain->srx = ((struct fi_peer_srx_context *) | ||
(context))->srx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check if srx
has already been set?
@j-xiong Almost all of the alignment you pointed out are places where I did it on purpose to maintain the 80 character limit. Do you want me to change all of those to adding a new line after the open bracket? It looks weird to me and I don't think we do it most places in libfabric but I can change them all if you like. |
Yes, I prefer making the changes if possible.
…________________________________
From: Alexia Ingerson ***@***.***>
Sent: Wednesday, November 20, 2024 3:19:29 PM
To: ofiwg/libfabric ***@***.***>
Cc: Xiong, Jianxin ***@***.***>; Mention ***@***.***>
Subject: Re: [ofiwg/libfabric] prov/rxm: add FI_PEER support to rxm (PR #10510)
@j-xiong<https://github.com/j-xiong> Almost all of the alignment you pointed out are places where I did it on purpose to maintain the 80 character limit. Do you want me to change all of those to adding a new line after the open bracket? It looks weird to me and I don't think we do it most places in libfabric but I can change them all if you like.
—
Reply to this email directly, view it on GitHub<#10510 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ACA7MLOVE5NFQA3W6WEXOID2BUKHLAVCNFSM6AAAAABQ7JINH2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIOBZG4ZTEMRVGM>.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
Update - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
@shijin-aws Is the AWS CI failure related? |
bot:aws:retest |
2 mpichtestsuite test cases failed with Intel CI over verbs_rxm:
|
The rxm SAR segment type enum was defined inside another struct. While techincally ok, this made it difficult for editors to find the type and reported compiler errors. This cleans it up to make it more readible and easier for editors to find the type Signed-off-by: Alexia Ingerson <[email protected]>
Add application side support for FI_AV_USER_ID which requires saving the fi_addr input as the internal fi_addr (for both the peer API srx use case and for reporting unique source address information). When supporting the capability for the application, remove it form the core provider information as it is only required on the top layer Signed-off-by: Alexia Ingerson <[email protected]>
Support using the peer APIs by default using the util peer helper functions. Instead of going through the rxm-specific functions to write to CQs and counters, use the ofi_peer_cq/cntr APIs which use the owner ops. In the default case where rxm is not being used as a peer these will go to the regular ofi_cq_write functions. Signed-off-by: Alexia Ingerson <[email protected]>
Remove rxm implementation of receive queues and leverage the util srx implementation which supports the peer srx API. This allows rxm to use the peer API calls to match receives. To do this, move the rxm protocol information from the receive entry into the rx_buf and allocate it dynamically as needed to track protocol information. This allows rxm to use the default peer_rx_entry instead of its own custom receive entry. With this last piece of the peer API implemented, rxm can also now advertise full support of the FI_PEER capability. Just like the FI_AV_USER_ID capability, rxm removes the bit from the core provider info as it is only a requirement from the application side and not from the message provider Signed-off-by: Alexia Ingerson <[email protected]>
Add rxm support for FI_PEER (srx, cq, and cntr) and FI_AV_USER_ID to be used with the link provider