Slow "Set up job" in self hosted runners outside US due to latency when downloading actions #3594

Open
chlunde opened this issue Nov 26, 2024 · 0 comments
Labels
bug Something isn't working

Describe the bug
Network latency significantly slows down the "Set up job" step for self-hosted runners located outside the US. With concurrent downloads, this step could be reduced from 4+ seconds to about 1 second in many cases.

To Reproduce
Steps to reproduce the behavior:

  1. Set up a self-hosted runner on a VM or container in the EU (and maybe one in the US)
  2. Create a workflow that uses the runner and downloads actions from 5+ repositories
  3. Modify the runner code to download in parallel
  4. Compare performance

Expected behavior
As low latency as possible from the time a PR is created until the first action starts.

Runner Version and Platform

Version of your runner? main branch / v2.321.0

OS of the machine running the runner? Linux

What's not working?

Performance could be better

Job Log Output

2024-11-26T14:51:25.6284489Z Getting action download info
2024-11-26T14:51:26.0748747Z Download action repository 'actions/checkout@v4' (SHA:11bd71901bbe5b1630ceea73d27597364c9af683)
2024-11-26T14:51:26.6441629Z Download action repository 'x/actions-aws-cognito-login@v1' (SHA:c4c685143f292222485f32581feb62d0bedfdb11)
2024-11-26T14:51:27.2038260Z Download action repository 'x/actions-cache-s3@main' (SHA:76659e53192849baddea38a67089861e58d716aa)
2024-11-26T14:51:28.5609038Z Download action repository 'actions/setup-go@v5' (SHA:41dfa10bad2bb2ae585af6ee5bb4d7d973ad74ed)
2024-11-26T14:51:29.0825995Z Download action repository 'x/y@main' (SHA:b5c50a71bcc8233d2cd669ccff2389bb3503cc0c)
2024-11-26T14:51:31.3479674Z Download action repository 'actions/github-script@v7' (SHA:60a0d83039c74a4aee543508d2ffcb1c3799cdea)
2024-11-26T14:51:31.8964153Z Complete job name: configrepo

About 6 seconds are lost just to download some fairly small resources. A major contributor is the ~500 millisecond latency, per download, from EU self-hosted runners. "Getting action download info" appears to incur the same latency, so in total N+1 sequential calls are made, where N is the number of actions downloaded.
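To make the per-step cost visible, here is a small Python sketch that computes the gap between consecutive log lines. The timestamps and messages are copied from the job log above (SHAs trimmed for brevity); the helper names `parse_line` and `step_durations` are my own, and I assume the gap after each line is the time spent on that step.

```python
from datetime import datetime

# Timestamps and messages copied from the job log above (SHAs trimmed).
LOG = """\
2024-11-26T14:51:25.6284489Z Getting action download info
2024-11-26T14:51:26.0748747Z Download action repository 'actions/checkout@v4'
2024-11-26T14:51:26.6441629Z Download action repository 'x/actions-aws-cognito-login@v1'
2024-11-26T14:51:27.2038260Z Download action repository 'x/actions-cache-s3@main'
2024-11-26T14:51:28.5609038Z Download action repository 'actions/setup-go@v5'
2024-11-26T14:51:29.0825995Z Download action repository 'x/y@main'
2024-11-26T14:51:31.3479674Z Download action repository 'actions/github-script@v7'
2024-11-26T14:51:31.8964153Z Complete job name: configrepo
"""


def parse_line(line):
    """Split one log line into (timestamp, message)."""
    stamp, message = line.split("Z ", 1)
    # The log has 7 fractional digits; datetime keeps 6, so truncate.
    return datetime.strptime(stamp[:26], "%Y-%m-%dT%H:%M:%S.%f"), message


def step_durations(log):
    """Return (message, seconds until the next log line) pairs."""
    entries = [parse_line(line) for line in log.splitlines() if line.strip()]
    return [
        (message, (next_ts - ts).total_seconds())
        for (ts, message), (next_ts, _) in zip(entries, entries[1:])
    ]


durations = step_durations(LOG)
total = sum(seconds for _, seconds in durations)

for message, seconds in durations:
    print(f"{seconds:6.3f}s  {message}")
print(f"{total:6.3f}s  total")
```

Most steps land near the ~500 ms round trip, with a couple of slower outliers, and the sum is what "Set up job" spends before the first action can start.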

Example of latency:

~$ time curl -q https://pipelinesghubeus12.actions.githubusercontent.com/
real	0m0.543s

This could be mitigated in several ways:

  • In the runner: download actions in parallel
  • In the runner: consider tuning the HTTP client (HTTP/2, connection pooling)
  • By GitHub: provide CDNs around the world for the action cache and OIDC tokens
  • By the self-hoster: use a caching HTTP proxy. This could cause issues with authentication and security, and doesn't help if the cache hit ratio is low.
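The parallel-download idea can be illustrated without touching the runner. The following is only a simulation in Python, not runner code: `fake_download` is a hypothetical stand-in that just sleeps for the assumed latency, and the concurrency limit of 10 is an arbitrary choice.

```python
import asyncio
import time


async def fake_download(name, latency):
    """Hypothetical stand-in for one action download: only waits `latency` seconds."""
    await asyncio.sleep(latency)
    return name


async def download_sequential(names, latency):
    """Download one action at a time; latency adds up per action."""
    return [await fake_download(name, latency) for name in names]


async def download_concurrent(names, latency, limit=10):
    """Download all actions at once, with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(name):
        async with semaphore:
            return await fake_download(name, latency)

    return await asyncio.gather(*(bounded(name) for name in names))


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    actions = [f"action-{i}" for i in range(6)]
    latency = 0.5  # assumed per-download latency, roughly the curl timing above
    _, sequential = timed(download_sequential(actions, latency))
    _, concurrent = timed(download_concurrent(actions, latency))
    print(f"sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

The semaphore keeps the number of simultaneous requests bounded, so a workflow with many actions does not open an unbounded number of connections at once.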

Example output from a patched runner doing downloads in parallel. The step completes in about 1 second instead of the usual 6 (the set of downloads differs slightly from the log above, sorry about that):

2024-11-26T14:52:36.5830458Z Getting action download info
2024-11-26T14:52:36.9554298Z Download action repository 'actions/checkout@v4' (SHA:11bd71901bbe5b1630ceea73d27597364c9af683)
2024-11-26T14:52:36.9594200Z Download action repository 'aws-actions/configure-aws-credentials@v4' (SHA:e3dd6a429d7300a6a4c196c26e071d42e0343502)
2024-11-26T14:52:36.9601916Z Download action repository 'x/actions-cache-s3@main' (SHA:76659e53192849baddea38a67089861e58d716aa)
2024-11-26T14:52:36.9607794Z Download action repository 'actions/setup-go@v5' (SHA:41dfa10bad2bb2ae585af6ee5bb4d7d973ad74ed)
2024-11-26T14:52:36.9614638Z Download action repository 'x/y@main' (SHA:d58bf9d9ffeddedd952508c053d657fc3fd767f4)
2024-11-26T14:52:37.6148979Z Complete job name: configrepo
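As a rough sanity check on these numbers, here is a back-of-the-envelope latency model in Python. It is a sketch under strong assumptions: every call costs exactly one round trip of about 0.5 s, transfer and extraction time are ignored, and the function names are my own.

```python
import math


def sequential_estimate(n_actions, rtt):
    """One download-info call plus one round trip per action, back to back."""
    return (n_actions + 1) * rtt


def concurrent_estimate(n_actions, rtt, limit=10):
    """One download-info call, then downloads in waves of at most `limit`."""
    waves = math.ceil(n_actions / limit)
    return (1 + waves) * rtt


if __name__ == "__main__":
    rtt = 0.5  # assumed round-trip latency in seconds
    print(f"sequential, 6 actions: {sequential_estimate(6, rtt):.1f}s")
    print(f"concurrent, 5 actions: {concurrent_estimate(5, rtt):.1f}s")
```

Under these assumptions the patched run should take about 1.0 s, which is close to the roughly 1.03 s between the first and last log lines above. The unpatched run took longer than the 3.5 s this model gives for six actions, because some downloads there took well over one round trip.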

Experimental patch

I don't know C#/.NET, but here's a patch I made to do the experiment above. I could create a PR if you like the idea but it will probably be easier if you do it to get the right quality and coding standards.

Fork: https://github.com/actions/runner/compare/main...chlunde:runner:parallel-download?expand=1

diff --git a/src/Runner.Worker/ActionManager.cs b/src/Runner.Worker/ActionManager.cs
index f32cad2..e560045 100644
--- a/src/Runner.Worker/ActionManager.cs
+++ b/src/Runner.Worker/ActionManager.cs
@@ -198,22 +198,35 @@ namespace GitHub.Runner.Worker
                 // Get the download info
                 var downloadInfos = await GetDownloadInfoAsync(executionContext, repositoryActions);
 
+                // Limit concurrency to 10
+                var semaphore = new SemaphoreSlim(10);
+
                 // Download each action
-                foreach (var action in repositoryActions)
+                var downloadTasks = repositoryActions.Select(async action =>
                 {
-                    var lookupKey = GetDownloadInfoLookupKey(action);
-                    if (string.IsNullOrEmpty(lookupKey))
+                    await semaphore.WaitAsync();
+                    try
                     {
-                        continue;
-                    }
+                        var lookupKey = GetDownloadInfoLookupKey(action);
+                        if (string.IsNullOrEmpty(lookupKey))
+                        {
+                            return;
+                        }
 
-                    if (!downloadInfos.TryGetValue(lookupKey, out var downloadInfo))
+                        if (!downloadInfos.TryGetValue(lookupKey, out var downloadInfo))
+                        {
+                            throw new Exception($"Missing download info for {lookupKey}");
+                        }
+
+                        await DownloadRepositoryActionAsync(executionContext, downloadInfo);
+                    }
+                    finally
                     {
-                        throw new Exception($"Missing download info for {lookupKey}");
+                        semaphore.Release();
                     }
+                }).ToList();
 
-                    await DownloadRepositoryActionAsync(executionContext, downloadInfo);
-                }
+                await Task.WhenAll(downloadTasks);
 
                 // More preparation based on content in the repository (action.yml)
                 foreach (var action in repositoryActions)
chlunde added the bug ("Something isn't working") label Nov 26, 2024