Actions become substantially slower over time #936

Open · ianb-pomelo opened this issue Aug 22, 2024 · 17 comments

@ianb-pomelo

I have a checkout of a git repo in Sapling that I have been using for a while, and as I've used it more and more, the commands (such as sl status and plain sl) have become substantially slower. I checked out a fresh copy of the repo with Sapling and sl is almost 2x slower in my older checkout, and sl status is ~8x slower. I suspect it has something to do with all the history I have locally that is hidden. Is there a way to clean up the checkout a bit to remove these stale references that I don't want to bring back? Or do I just have to routinely check out new copies?

@bolinfest
Contributor

I'm not sure about sl status, but in terms of interacting with Git, I just ran:

git --git-dir .sl/store/git fsck

which appeared to clean some things up?

@nicktogo

Had the same issue when upgrading to the latest version a while ago. I re-cloned with SL and it was working fine until recently; now it's slower again.

@quark-zju
Contributor

quark-zju commented Sep 18, 2024

Is your repo a .git repo or a .sl repo? Which directory does the repo root have?

If it's a .git/ repo, does running git status make subsequent git/sl status runs faster?
If it's a .sl/ repo, what's the output of sl debugstatus and sl debugtree? Do you have watchman installed? Does --config workingcopy.use-rust=true --config status.use-rust=true (or false) make a difference?

status should not be impacted by commit history.
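
Collected in one place, the .sl diagnostics above boil down to running:

sl debugstatus
sl debugtree
sl status --config workingcopy.use-rust=true --config status.use-rust=true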

@ianb-pomelo
Author

Mine is a .git repo. I tried running git status with git --git-dir .sl/store/git status but got an error: fatal: this operation must be run in a work tree.

I've attached the debug logs for the sl status run that has the couple-second delay in checking the status:

2024-09-18T14:30:50.932032Z DEBUG run: status_info: status_mode="rust"
2024-09-18T14:30:50.932036Z  INFO run:status: workingcopy::workingcopy: enter
2024-09-18T14:30:50.933227Z  INFO run:status:pending_changes: workingcopy::filesystem::watchmanfs::watchmanfs: enter
2024-09-18T14:30:50.933743Z DEBUG progress_model::registry: registering ProgressBar connecting watchman
2024-09-18T14:30:50.946475Z  INFO run:status:pending_changes: workingcopy::filesystem::watchmanfs::watchmanfs: exit
2024-09-18T14:30:50.946499Z DEBUG run:status: watchman_info: watchmanfallback=1
2024-09-18T14:30:50.946506Z  WARN run:status: workingcopy::filesystem::watchmanfs::watchmanfs: watchman error - falling back to slow crawl err=While invoking the watchman CLI to discover the server connection details: No such file or directory (os error 2), stderr=``
2024-09-18T14:30:50.946571Z  INFO run:status:pending_changes: workingcopy::filesystem::physicalfs: enter
2024-09-18T14:30:50.946586Z DEBUG run:status:pending_changes: workingcopy::filesystem::physicalfs: working copy parents: [Ok(HgId("a1b73645a2d513b0b07fbbb8105a0537a04e3f8a"))]
2024-09-18T14:30:50.946737Z DEBUG run:status:pending_changes: progress_model::registry: registering ProgressBar comparing
2024-09-18T14:30:50.946744Z  INFO run:status:pending_changes: workingcopy::filesystem::physicalfs: exit
2024-09-18T14:30:50.946749Z  INFO run:status:compute_status: workingcopy::status: enter
2024-09-18T14:30:50.946758Z  INFO run:status:compute_status:compute_status: workingcopy::status: enter
2024-09-18T14:30:50.955040Z DEBUG progress_model::registry: removing 1 orphan ProgressBar
2024-09-18T14:30:52.747772Z  INFO run:status:compute_status:compute_status:into_iter: workingcopy::filechangedetector: enter
2024-09-18T14:30:52.747795Z  INFO run:status:compute_status:compute_status:into_iter:check manifest{lookups=0}: workingcopy::filechangedetector: enter
2024-09-18T14:30:52.747802Z  INFO run:status:compute_status:compute_status:into_iter:check manifest{lookups=0}:files: manifest_tree: enter
2024-09-18T14:30:52.748272Z DEBUG tree::store::prefetch{ids="9a7be5236f176977ec136e3847dea0a38140c2ee"}: manifest_tree::store: enter
2024-09-18T14:30:52.748327Z DEBUG tree::store::prefetch{ids="9a7be5236f176977ec136e3847dea0a38140c2ee"}: manifest_tree::store: exit
2024-09-18T14:30:52.748335Z DEBUG tree::store::get{id="9a7be5236f176977ec136e3847dea0a38140c2ee"}: manifest_tree::store: enter
2024-09-18T14:30:52.748374Z  INFO run:status:compute_status:compute_status:into_iter:check manifest{lookups=0}:files: manifest_tree: exit
2024-09-18T14:30:52.748540Z DEBUG tree::store::get{id="9a7be5236f176977ec136e3847dea0a38140c2ee"}: manifest_tree::store: exit
2024-09-18T14:30:52.748652Z  INFO run:status:compute_status:compute_status:into_iter:check manifest{lookups=0}: workingcopy::filechangedetector: exit
2024-09-18T14:30:52.748848Z  INFO run:status:compute_status:compute_status:into_iter:compare contents{keys=0}: workingcopy::filechangedetector: enter
2024-09-18T14:30:52.748949Z  INFO run:status:compute_status:compute_status:into_iter:compare contents{keys=0}:get_content_stream: workingcopy::filechangedetector: enter
2024-09-18T14:30:52.749031Z  INFO run:status:compute_status:compute_status:into_iter:compare contents{keys=0}:get_content_stream: workingcopy::filechangedetector: exit
2024-09-18T14:30:52.749039Z  INFO run:status:compute_status:compute_status:into_iter:compare contents{keys=0}: workingcopy::filechangedetector: exit
2024-09-18T14:30:52.749045Z  INFO run:status:compute_status:compute_status:into_iter: workingcopy::filechangedetector: exit
2024-09-18T14:30:52.749054Z  INFO run:status:compute_status:compute_status:maybe_flush_treestate: workingcopy::util: enter
2024-09-18T14:30:52.749060Z DEBUG run:status:compute_status:compute_status:maybe_flush_treestate: workingcopy::util: pending_change_count=0 timeout_secs=Some(0)
2024-09-18T14:30:52.749065Z DEBUG run:status:compute_status:compute_status:maybe_flush_treestate: treestate::dirstate: skipping treestate flush - it is not dirty
2024-09-18T14:30:52.749068Z  INFO run:status:compute_status:compute_status:maybe_flush_treestate: workingcopy::util: exit
2024-09-18T14:30:52.750157Z  INFO run:status:compute_status:compute_status{pending_changes_count=0 manifest_files_count=0}: workingcopy::status: exit
2024-09-18T14:30:52.750163Z  INFO run:status:compute_status: workingcopy::status: exit
2024-09-18T14:30:52.750167Z  INFO run:status{status_len=0}: workingcopy::workingcopy: exit
2024-09-18T14:30:52.751429Z DEBUG run: clidispatch::context: maybe starting pager enable_pager=true reason="auto"
2024-09-18T14:30:52.755092Z DEBUG run: progress_model::registry: removing 1 orphan ProgressBar
2024-09-18T14:30:52.755551Z  INFO run: commands::run: exit

@quark-zju
Contributor

I tried running git status with git --git-dir .sl/store/git status

That would be a .sl repo. A .git repo is one where you ran git init or git clone and then run sl commands in it; that is a separate issue (duplicate of #929).

Could you try the .sl instructions above?

@ianb-pomelo
Author

Ah sorry, the output is

sl debugstatus
len(dirstate) = 7915
len(nonnormal) = 4
len(filtered nonnormal) = 4
clock = None
sl debugtree
dirstate v2 (using treestate/60f847df-7593-4f1f-b826-66c1f92973c4, offset 2811706, 7915 files tracked)

I do not have watchman installed

@quark-zju
Contributor

quark-zju commented Sep 18, 2024

sl debugstatus
len(dirstate) = 7915
len(nonnormal) = 4
len(filtered nonnormal) = 4
clock = None

Thanks. It looks normal. Something might have regressed without us noticing (non-watchman, non-edenfs use cases are rare in production). I can have a look when I get time.

@quark-zju
Contributor

@ianb-pomelo What version of Sapling are you using? Are you building sl from source? Have you tried the following?

sl status --config workingcopy.use-rust=true --config status.use-rust=true

@ianb-pomelo
Author

I installed Sapling via Homebrew and am using Sapling 0.2.20240718-145624-f4e9df48.

I tried that command and the performance was about the same as without those configs.

I have also tried installing watchman and that substantially sped things up, so for me I can just use that setup.
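
For anyone else hitting this on macOS with Homebrew, installing watchman is typically just (adjust for your platform):

brew install watchman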

@nicktogo

nicktogo commented Sep 18, 2024

I'm using .sl now and I have watchman running. Here is my debug output:

➜  sl debugstatus                                                              
len(dirstate) = 519805
len(nonnormal) = 1
len(filtered nonnormal) = 1
clock = c:1726596135:5724:1:46787
➜  sl debugtree                                                                
dirstate v2 (using treestate/37b9b29e-9233-4a38-b688-96be70bd6b3c, offset 43057603, 519805 files tracked)

Setting the flags to true is faster than setting them to false. If I just use sl status, that seems to be the same as setting them to true:

sl status --config workingcopy.use-rust=true --config status.use-rust=true

@nicktogo

@quark-zju do you have any suggestions on how to work around this, other than re-cloning?

@destroycomputers

I have encountered this, too.

A quick run under Instruments showed that in my case the issue was due to multi-pack index validation. The issue only appears after some time because initially there is no multi-pack index, since the number of pack files is low. This also means I could fix it temporarily by running git's garbage collection on the .sl/store/git directory.
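
A sketch of what that cleanup can look like, reusing the --git-dir trick from earlier in the thread (the exact invocation here is my assumption, not necessarily what was run):

git --git-dir .sl/store/git gc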

The reason it's slow seems to be excessive validation on multi-pack index loading, as described in libgit2/libgit2#6166, which is fixed upstream. Indeed, simply recompiling Sapling with the git2 Rust dependency bumped from version 0.14 to the latest 0.19 fixed the problem for me.

@nicktogo

nicktogo commented Oct 3, 2024

Thanks for sharing! @destroycomputers this works for me too!!

@quark-zju
Contributor

@quark-zju do you have some suggestions how to work around this, other than re-cloning?

As a temporary solution, you can try disabling the multi-pack index by appending the following to .sl/store/git/config after #### End maintained by Sapling ####:

[maintenance "gc"]
enabled = true
[maintenance "incremental-repack"]
enabled = false

Then delete the multi-pack index file:

rm .sl/store/git/objects/pack/multi-pack-index

The underlying complexity: with --config git.shallow=1 (a Sapling config that matches git's "filter" concept, not git's "shallow"), the default gc can error out saying some object is not found (reported internally), but multi-pack repack works. If you're not using the git.shallow feature, you don't need the multi-pack index and can use the regular gc.
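
A quick way to check whether that mode is enabled (a sketch, assuming sl config prints a single value the way hg config does):

sl config git.shallow

If it prints nothing or false, the regular gc route applies.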

It seems the multi-pack index is causing multiple issues - the performance issues reported here, and "object not found" issues reported internally ("not found" by libgit2, but found by git). So we might want to abandon its usage altogether. To do that, we might need to move off libgit2, or off the git native store entirely. Those changes are non-trivial and would take some time...

@destroycomputers

It seems the multi-pack index is causing multiple issues - performance issues reported here, and "object not found" issues reported internally ("not found" by libgit2, but found by git). So we might want to abandon its usage all together. To do that, we might need to switch off libgit2, or the use of the git native store. Those are non-trivial and would take some time...

@quark-zju meanwhile, would it be possible to bump the git2 dependency to the latest version? That would at least fix this specific issue and, given that the currently used version is over two years old, it might have some other positive effects as well.

@quark-zju
Contributor

quark-zju commented Oct 7, 2024

would it be possible to bump the version of the git2 dependency to latest?

Good catch. I didn't realize we were using an old version. I tried upgrading it, but it does not seem trivial since other third-party dependencies in the monorepo (like cargo, sqlite, etc.) want the older version. Upgrading them could be an option, but if it's too hard we might try different approaches.

UPDATE: It is more complex than I thought; it's not feasible to upgrade right now. Our plan might be to migrate off libgit2.
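
For anyone exploring this in their own checkout, cargo's reverse-dependency view shows which crates pin git2 (a sketch, run from the workspace root; add @<version> if cargo complains about multiple matching packages):

cargo tree -i git2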

@yiding
Contributor

yiding commented Dec 5, 2024

It seems the multi-pack index is causing multiple issues - performance issues reported here, and "object not found" issues reported internally ("not found" by libgit2, but found by git).

Just as a data point, I've run into this pretty consistently with a large git repo I've been working in: I get "object not found" errors after an sl pull, and disabling incremental repack and deleting the multi-pack-index file fixes it.
