Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core/logging: Fix race in writing to vrb_ep_ops #10588

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dsciebu
Copy link

@dsciebu dsciebu commented Nov 28, 2024

'vrb_ep_ops' is a static object which field 'ops_open' gets overriden in 'vrb_open_ep'. If the operation is performed by multiple threads at once, which is totally possible, we end up with a data race. Switching it to thread local fixes it.

Signed-off-by: Dariusz Sciebura [email protected]

'vrb_ep_ops' is a static object which field 'ops_open' gets
overriden in 'vrb_open_ep'. If the operation is performed by
multiple threads at once, which is totally possible, we end
up with a data race. Switching it to thread local fixes it.

Signed-off-by: Dariusz Sciebura <[email protected]>
@dsciebu
Copy link
Author

dsciebu commented Nov 28, 2024

There is a bunch of static variables spread all over the place. IMHO they all should be marked as _Thread_local, as, even if they are not overridden somewhere now, they can be in future leading to another race. If you agree, just let know - I can modify my PR. Otherwise, let's stick to this ver.

@piotrchmiel
Copy link

Isn't it duplicate of #10571 and #10569 ?

@dsciebu
Copy link
Author

dsciebu commented Nov 28, 2024

Isn't it duplicate of #10571 and #10569 ?

It is in a sense that it generally solves the same issue. However what I am promoting here is a general rule of marking static globals with _Thread_local as probably there are still used concurrently in several places and will probably be used that way in the future. IMO the best solution would be to rethink the architecture and refactor the existing code, but as a quick fix adding thread locality to variables like the one mentioned may help fix a lot of UBs for pretty much 'free'.

@dsciebu
Copy link
Author

dsciebu commented Nov 28, 2024

I accidentally found another spot with an identical bug, this time in xnet_ep.c. The line:
(*ep_fid)->fid.ops->ops_open = xnet_ep_ops_open;
overwrites the global assigned to (*ep_fid)->fid.ops assigned in the following line:
(*ep_fid)->fid.ops = &xnet_ep_fi_ops;
I will push a fix for it in a separate PR, but please treat it as an argument for my idea of marking the globals with thread_local.
Edit: The fix mentioned is #10590.

@dsciebu
Copy link
Author

dsciebu commented Nov 29, 2024

I accidentally found another spot with an identical bug, this time in xnet_ep.c. The line: (*ep_fid)->fid.ops->ops_open = xnet_ep_ops_open; overwrites the global assigned to (*ep_fid)->fid.ops assigned in the following line: (*ep_fid)->fid.ops = &xnet_ep_fi_ops; I will push a fix for it in a separate PR, but please treat it as an argument for my idea of marking the globals with thread_local. Edit: The fix mentioned is #10590.

@shefty - counting on your voice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants