-
Notifications
You must be signed in to change notification settings - Fork 268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: unprotected read data race #998
base: master
Are you sure you want to change the base?
Conversation
WalkthroughThe changes involve modifications to the Changes
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
We switched from using the cometBFT ABCI RPC to using CosmosSDK gRPC to query application state in this commit. Our CI then started failing with the following data race:
|
immutable_tree.go
Outdated
@@ -190,10 +190,15 @@ func (t *ImmutableTree) Get(key []byte) ([]byte, error) { | |||
} | |||
|
|||
if fastNode == nil { | |||
latestVersion, err := t.ndb.getLatestVersion() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that getLatestVersion()
will not return t.ndb.latestVersion
if it is 0. I'm not sure if this is a problem or not. This change might therefore result in different logic. We could also add a simple getter that only returns t.ndb.latestVersion
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is a solution for the above issue. t.ndb.latestVersion
should be set within LoadVersion
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to resolve the data race issue, seems like we need to add the API safeGetLatestVersion
locked by ndb.mtx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see there is already an exiting method ImmutableTree::isLatestVersion
which does what my suggestion does, maybe we should just call that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cool-develope I did the change as you suggested using a new safeGetLatestVersion
method instead.
@tac0turtle @cool-develope just bumping this as it is blocking us at omni.network to switch to using Cosmos GRPC directly for state queries as it breaks our CI currently. Just an indication of if the fix is good is sufficient, we can then update our fork and move on. Thanks 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (2)
nodedb.go (2)
800-802
: Add documentation about thread-safety guarantees.Consider adding a comment explaining that this method provides thread-safe access to
latestVersion
and how it differs fromgetLatestVersion()
.// safeGetLatestVersion returns the value of ndb.latestVersion directly // without additional logic performed by getLatestVersion. +// This method is thread-safe and should be used when only the current value +// of latestVersion is needed without the initialization logic present in getLatestVersion. func (ndb *nodeDB) safeGetLatestVersion() int64 {
800-808
: Consider using sync.RWMutex for better read concurrency.Since
latestVersion
is frequently read but less frequently modified, usingsync.RWMutex
could improve performance by allowing multiple concurrent readers.This would allow multiple goroutines to read
latestVersion
simultaneously while still maintaining exclusive access for writers. Consider:- mtx sync.Mutex // Read/write lock. + mtx sync.RWMutex // Read/write lock allowing multiple readers. func (ndb *nodeDB) safeGetLatestVersion() int64 { - ndb.mtx.Lock() - defer ndb.mtx.Unlock() + ndb.mtx.RLock() + defer ndb.mtx.RUnlock()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
immutable_tree.go
(1 hunks)nodedb.go
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- immutable_tree.go
// safeGetLatestVersion returns the value of ndb.latestVersion directly | ||
// without additional logic performed by getLatestVersion. | ||
func (ndb *nodeDB) safeGetLatestVersion() int64 { | ||
ndb.mtx.Lock() | ||
defer ndb.mtx.Unlock() | ||
|
||
return ndb.latestVersion | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Direct accesses to latestVersion need to be protected
Several unprotected accesses to latestVersion
were found that could lead to data races:
- In
nodedb_test.go
: Multiple direct writes and reads tondb.latestVersion
without mutex protection - In
import.go
: Direct read oftree.ndb.latestVersion
without mutex protection - In
mutable_tree.go
: Direct read in conditionversion == tree.ndb.latestVersion
These locations should be updated to use the new safeGetLatestVersion()
method or ensure proper mutex synchronization.
🔗 Analysis chain
LGTM! Thread-safe implementation for accessing latestVersion.
The new method correctly implements thread-safe access to latestVersion
using mutex synchronization.
Let's verify that all direct accesses to latestVersion
have been replaced with this new safe method:
Let me analyze the direct accesses to verify if they are properly protected. I'll check the mutex usage patterns in these locations.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for any remaining direct access to latestVersion that might not be protected
# We should only see accesses within methods that already hold the mutex
rg -A 3 "ndb\.latestVersion\s*[^=]*" --type go
Length of output: 4880
Script:
#!/bin/bash
# Check mutex lock/unlock patterns around these accesses
# Looking for mtx.Lock() calls before these accesses
# First, let's check nodedb.go for mutex patterns
rg -B 5 "mtx\.(Lock|Unlock)" nodedb.go
# Then, check if the test file has any mutex protection
rg -B 5 "mtx\.(Lock|Unlock)" nodedb_test.go
# Finally, check mutable_tree.go for the context of that access
rg -B 10 "version == tree\.ndb\.latestVersion" mutable_tree.go
Length of output: 6644
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mmm, interesting, @cool-develope should I do as suggested here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, will check quickly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Importer
is OK since it is a fresh tree, I think mutable_tree.GetVersioned
should be updated with getLatestVersion
.
@@ -193,7 +193,7 @@ func (t *ImmutableTree) Get(key []byte) ([]byte, error) { | |||
// If the tree is of the latest version and fast node is not in the tree | |||
// then the regular node is not in the tree either because fast node | |||
// represents live state. | |||
if t.version == t.ndb.latestVersion { | |||
if t.version == t.ndb.safeGetLatestVersion() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can use t.ndb.getLatestVersion()
(thread safe)
// safeGetLatestVersion returns the value of ndb.latestVersion directly | ||
// without additional logic performed by getLatestVersion. | ||
func (ndb *nodeDB) safeGetLatestVersion() int64 { | ||
ndb.mtx.Lock() | ||
defer ndb.mtx.Unlock() | ||
|
||
return ndb.latestVersion | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Importer
is OK since it is a fresh tree, I think mutable_tree.GetVersioned
should be updated with getLatestVersion
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a mutex lock/unlock on every ImmutableTree.Get
seems expensive, have you benchmarked the impact at all?
This fixes a data race in immutable tree that accesses
nodeDB.latestVersion
without acquiring the lock.Summary by CodeRabbit
New Features
Bug Fixes