perf: Parallelize saveNewNodes' DB writes with figuring out "what to write" #889

Open · wants to merge 1 commit into master
60 changes: 32 additions & 28 deletions mutable_tree.go
@@ -1028,53 +1028,57 @@ func (tree *MutableTree) balance(node *Node) (newSelf *Node, err error) {
 // saveNewNodes save new created nodes by the changes of the working tree.
 // NOTE: This function clears leftNode/rigthNode recursively and
 // calls _hash() on the given node.
+// TODO: Come back and figure out how to better parallelize this code.
 func (tree *MutableTree) saveNewNodes(version int64) error {
     nonce := uint32(0)
-    newNodes := make([]*Node, 0)
-    var recursiveAssignKey func(*Node) ([]byte, error)
-    recursiveAssignKey = func(node *Node) ([]byte, error) {
+
+    nodeChan := make(chan *Node, 64) // Buffered channel to avoid blocking.
+    doneChan := make(chan error, 1)  // Channel to signal completion and return any errors.
Member:

There is some overhead in creating the channel and goroutine; although it is likely not that big, it might be better to couple channel and goroutine creation with the lifecycle of MutableTree instead of saveNewNodes.

Since I don't know offhand how much overhead that actually is, maybe it's fine as-is too.
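A minimal sketch of what that suggestion could look like, assuming a long-lived worker owned by the tree; saveRequest, saveCh, and startSaveWorker are hypothetical names, not part of this PR:

```go
// Hypothetical: saveCh chan saveRequest would be a new field on MutableTree.
type saveRequest struct {
	node *Node
	done chan error // non-nil marks the end of a batch and receives its result
}

// startSaveWorker would run once, e.g. when the tree is constructed.
func (tree *MutableTree) startSaveWorker() {
	tree.saveCh = make(chan saveRequest, 64)
	go func() {
		var err error
		for req := range tree.saveCh {
			if req.node != nil && err == nil {
				if saveErr := tree.ndb.SaveNode(req.node); saveErr != nil {
					err = saveErr
				} else {
					req.node.leftNode, req.node.rightNode = nil, nil
				}
			}
			if req.done != nil { // end of a batch: report the first error and reset
				req.done <- err
				err = nil
			}
		}
	}()
}
```

saveNewNodes would then send saveRequest{node: n} per node and finish with a request carrying a fresh done channel, so the goroutine and channel setup is paid once per tree rather than once per save.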


+    // Start a goroutine to save nodes.
+    go func() {
+        var err error
+        for node := range nodeChan {
+            if saveErr := tree.ndb.SaveNode(node); saveErr != nil {
+                err = saveErr
+                break
+            }
+            node.leftNode, node.rightNode = nil, nil // Detach children after saving.
+        }
+        doneChan <- err // Send any error encountered or nil if none.
+    }()
+
+    var recursiveAssignKey func(*Node) []byte
+    recursiveAssignKey = func(node *Node) []byte {
         if node.nodeKey != nil {
             if node.nodeKey.nonce != 0 {
-                return node.nodeKey.GetKey(), nil
+                return node.nodeKey.GetKey()
             }
-            return node.hash, nil
+            return node.hash
         }
         nonce++
         node.nodeKey = &NodeKey{
             version: version,
-            nonce:   nonce,
+            nonce:   nonce, // Example nonce calculation; adjust as needed.
Collaborator:

Just realized the original codebase is wrong here; not a bug, just a performance issue. newNodes = append(newNodes, node) should come right after this line: the main idea of the newNodes slice is to save the nodes sorted by nodekey (here the nonce) to reduce compaction.

Just worried we can't assume this sorted order after the refactoring ...
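For illustration only (a sketch against the pre-refactor code, not this PR's diff), the ordering described above would move the append up so it happens as soon as the nonce is assigned; since newNodes holds pointers, the child keys and hash can still be filled in afterwards:

```go
nonce++
node.nodeKey = &NodeKey{
	version: version,
	nonce:   nonce,
}
// Moved up from after _hash(): appending here keeps newNodes ordered by
// nodekey (version, nonce), so the later SaveNode loop writes in key order
// and the backing store has less compaction work to do.
newNodes = append(newNodes, node)
```

With the channel-based version, restoring nonce order is harder, because a node can only be sent once its child keys and hash are filled in; it would take buffering and sorting before the writes, which is presumably the reviewer's concern.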

         }
 
-        var err error
-        // the inner nodes should have two children.
+        // Assign keys recursively to child nodes. (Two children are guaranteed)
         if node.subtreeHeight > 0 {
-            node.leftNodeKey, err = recursiveAssignKey(node.leftNode)
-            if err != nil {
-                return nil, err
-            }
-            node.rightNodeKey, err = recursiveAssignKey(node.rightNode)
-            if err != nil {
-                return nil, err
-            }
+            node.leftNodeKey = recursiveAssignKey(node.leftNode)
+            node.rightNodeKey = recursiveAssignKey(node.rightNode)
         }
 
-        node._hash(version)
-        newNodes = append(newNodes, node)
+        node._hash(version) // Assuming this hashes the node.
+        nodeChan <- node    // Send node to be saved.
 
-        return node.nodeKey.GetKey(), nil
+        return node.nodeKey.GetKey()
     }
 
-    if _, err := recursiveAssignKey(tree.root); err != nil {
+    recursiveAssignKey(tree.root)
+    close(nodeChan)                    // Close the channel on completion.
+    if err := <-doneChan; err != nil { // Wait for the saving goroutine to finish.
Comment on lines +1031 to +1078:
The parallelization of the saveNewNodes function introduces concurrency into the node saving process, which is a significant change aimed at improving performance. However, there are several areas that need attention:

  1. Error Handling: The current implementation stops processing new nodes as soon as an error occurs (lines 1042-1045). This approach might leave some nodes unsaved without any attempt to save them. Consider whether it's acceptable to stop immediately or if it would be better to attempt saving all nodes and collect all errors.

  2. Detaching Children: Detaching children after saving (line 1046) is done within the goroutine. Ensure that this operation does not introduce any race conditions or inconsistencies, especially since the parent node's references to these children are cleared.

  3. Nonce Calculation: The nonce calculation (line 1062) seems to be a placeholder (nonce++;). Ensure that the nonce calculation logic is correctly implemented and serves its intended purpose. If the nonce is meant to ensure uniqueness or order, verify that this incrementation strategy is sufficient and safe under concurrent conditions.

  4. Recursive Key Assignment: The recursive assignment of keys to child nodes (lines 1051-1074) is a critical operation. It's important to ensure that this recursion does not introduce any performance bottlenecks or stack overflow issues for trees with significant depth. Additionally, verify that the operation of assigning keys and subsequently hashing nodes (line 1071) is correct and does not lead to any unintended side effects.

  5. Channel and Goroutine Management: The use of a buffered channel (line 1035) and a single goroutine (lines 1038-1049) for saving nodes is a straightforward approach to parallelization. However, consider the buffer size of the channel and whether it's adequately sized for the expected workload. Also, ensure that the goroutine's error handling and channel closing logic (lines 1077-1078) are robust and won't lead to goroutine leaks or panic due to double closing channels.

  6. Concurrency and Data Races: Given the introduction of concurrency, it's crucial to ensure that there are no data races, especially concerning the access and modification of node properties. Use tools like the Go race detector to verify that the implementation is safe.

Overall, while the parallelization effort is commendable for its potential to improve performance, careful consideration must be given to error handling, concurrency issues, and the correctness of the implementation.

Consider reviewing the error handling strategy, ensuring the safety of detaching children, verifying the nonce calculation logic, assessing the performance and correctness of recursive key assignment, and ensuring robust channel and goroutine management to prevent leaks or panics.
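On points 1 and 5 specifically: because the goroutine breaks out of its loop on the first error, nothing drains nodeChan afterwards, so once the 64-slot buffer fills recursiveAssignKey blocks on send and close(nodeChan) is never reached. A minimal sketch of one way to keep the single-error semantics without that hazard (illustrative, not the PR's code):

```go
go func() {
	var firstErr error
	for node := range nodeChan {
		if firstErr != nil {
			continue // keep draining so the sender never blocks on a full buffer
		}
		if err := tree.ndb.SaveNode(node); err != nil {
			firstErr = err
			continue
		}
		node.leftNode, node.rightNode = nil, nil // detach only after a successful save
	}
	doneChan <- firstErr // nil if every node was saved
}()
```

For point 6, running the existing test suite under the race detector (go test -race ./...) is a cheap way to check that clearing leftNode/rightNode in the writer goroutine does not race with the tree mutation code.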

         return err
     }
 
-    for _, node := range newNodes {
-        if err := tree.ndb.SaveNode(node); err != nil {
-            return err
-        }
-        node.leftNode, node.rightNode = nil, nil
-    }
-
     return nil
 }
