Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

trie_prefetcher: alternate structure #666

Draft
wants to merge 17 commits into
base: master
Choose a base branch
from
Draft

Conversation

darioush
Copy link
Collaborator

@darioush darioush commented Oct 1, 2024

Why this should be merged

The current trie_prefetcher is modified significantly compared to upstream, which makes it difficult to merge changes from upstream.
This PR aims to preserve the existing behavior but only with minor inline modifications to upstream code.

I would be open to other structures with the same idea.

How this works

Uses a wrapper for DB to track the central worker pool.
Adds a couple hooks for stopping the background processes.

How this was tested

CI

@@ -59,7 +59,8 @@ func filledStateDB() *StateDB {

func TestCopyAndClose(t *testing.T) {
db := filledStateDB()
prefetcher := newTriePrefetcher(db.db, db.originalRoot, "", maxConcurrency)
prefetchDb := newPrefetcherDatabase(db.db, maxConcurrency)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to preserve existing UT behavior for now

@darioush darioush force-pushed the trie-prefetcher-alt branch from 3d6f372 to f3612cd Compare October 1, 2024 22:20
@darioush darioush closed this Nov 18, 2024
@darioush darioush reopened this Nov 18, 2024

// triePrefetcher is an active prefetcher, which receives accounts or storage
// items and does trie-loading of them. The goal is to get as much useful content
// into the caches as possible.
//
// Note, the prefetcher's API is not thread safe.
type triePrefetcher struct {
db Database // Database to fetch trie nodes through
db PrefetcherDB // Database to fetch trie nodes through
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed compared to upstream

Comment on lines +65 to +69
var pdb PrefetcherDB
pdb = withPrefetcherDefaults{db}
if db, ok := db.(withPrefetcherDB); ok {
pdb = db.PrefetcherDB()
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed compared to upstream

storageFetchers int64
largestLoad int64
)
defer p.db.Close()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added compared to upstream

p *triePrefetcher

db Database // Database to load trie nodes through
db PrefetcherDB // Database to load trie nodes through
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified compared to upstream

return 0
}
return sf.to.copies
sf.db.WaitTrie(sf.trie)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added compared to upstream

Comment on lines 365 to 367
sf.db.PrefetchAccount(sf.trie, common.BytesToAddress(task))
} else {
sf.db.PrefetchStorage(sf.trie, sf.addr, task)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified. compared to upstream (made this more explicit to avoid overriding GetAccount and GetStorage)

@darioush
Copy link
Collaborator Author

Adding do not merge label prior to verifying performance

@darioush darioush marked this pull request as ready for review November 19, 2024 01:35
@darioush darioush requested review from ceyonur and a team as code owners November 19, 2024 01:35
}

func (*prefetcherDatabase) PrefetchAccount(t Trie, address common.Address) {
t.(*prefetcherTrie).PrefetchAccount(address)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can you be certain that t will be a *prefetcherTrie?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PrefetcherDB type is designed to minimize changes in trie_prefetcher so these methods are only called from the trie_prefetcher.
In this case, sf.trie is initialized with the return values of OpenTrie or OpenStorageTrie, which is the intended use of the PrefetcherDB type.

We could try to return another type from OpenTrie, however this lead to more changes in trie_prefetcher than I would consider "minimal", open to your suggestions.

Copy link
Collaborator Author

@darioush darioush Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also existing code https://github.com/ava-labs/coreth/blob/master/core/state/database.go#L218-L225 couples the DB type to the trie type (CopyTrie). This code seems "gone" https://github.com/ethereum/go-ethereum/blob/master/core/state/database.go#L48 from the current versions of upstream which is very promising for the future

p.workers.Wait()
}

type PrefetcherDB interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) Please move this to the top of the file as it's pertinent from the very beginning so helps in understanding this file with context.

(question) Why are you defining both the interface and the implementation here instead of just using the implementation directly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving it to the top.

The implementation will later stay in coreth but the interfaces (and trie_prefetcher modifications) will be part of libevm.

core/state/prefetcher_database.go Show resolved Hide resolved
}

func (withPrefetcherDefaults) PrefetchAccount(t Trie, address common.Address) {
_, _ = t.GetAccount(address)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're defining the method signature, why not return an error if it occurs and allow the call-site to decide to drop it? Same applies to storage.

Copy link
Collaborator Author

@darioush darioush Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this 66e4674 however I don't really think it is helping with clarity. I'd slightly prefer to revert it and have the empty return signal that the callsite would be ignoring any errors so the callee should eg, log it if they care.
Let me know which you prefer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm happy if you revert and log instead.

core/state/prefetcher_database.go Show resolved Hide resolved
core/state/trie_prefetcher_extra_test.go Outdated Show resolved Hide resolved
core/state/trie_prefetcher_extra_test.go Outdated Show resolved Hide resolved
core/state/trie_prefetcher_extra_test.go Outdated Show resolved Hide resolved
core/state/trie_prefetcher_extra_test.go Outdated Show resolved Hide resolved
core/state/trie_prefetcher_extra_test.go Show resolved Hide resolved
core/state/trie_prefetcher_extra_test.go Outdated Show resolved Hide resolved
@@ -332,6 +330,29 @@ func (sf *subfetcher) loop() {
}
sf.trie = trie
}
handleTask := func(task []byte) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just from upstream?

Copy link
Collaborator Author

@darioush darioush Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this from https://github.com/ava-labs/coreth/pull/666/files/7b9959050f89377e30b15957fd5081e012cd720c..96575b96ada7cc778f0a53c5045a9539f5ec530b#diff-58140c10803d5c33def41482e87d096548e88382aacc21aa7dbee1e918915870L361-L370 so I can call it in defer.

Using defer seems cleaner than to modify the channel logic in the loop or repeating the logic in defer, which could diverge from upstream without us noticing.
As we would encounter a merge conflict for the code that handles the task since we modified GetAccount to PrefetchAccount (to avoid wrapping GetAccount), this seems fine to me.

Here's another approach for example ava-labs/subnet-evm@5a48a7d#diff-58140c10803d5c33def41482e87d096548e88382aacc21aa7dbee1e918915870R361 (but this does not support the original pattern).

Overall I am open to changing it to a different approach if you have one in mind.

Copy link
Collaborator

@ceyonur ceyonur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few nits, otherwise LGTM.

// withPrefetcherDefaults extends Database and implements PrefetcherDB by adding
// default implementations for PrefetchAccount and PrefetchStorage that read the
// account and storage slot from the trie.
type withPrefetcherDefaults struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this no-op prefetcher?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we add a type assertion and maybe move this to no_op_prefetcher file?

@darioush darioush marked this pull request as draft November 25, 2024 19:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants