You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
Network latency significantly slows down the "Set up job" step for self hosted runners located outside US. This could be improved from 4+ seconds to 1 second in many cases with concurrent downloads.
To Reproduce
Steps to reproduce the behavior:
Set up self hosted runner on a VM or container in EU (and maybe one in US)
Create a workflow which uses the runner and downloads actions from 5+ repositories
Modify the runner code to download in parallel
Compare performance
Expected behavior
As low latency as possible from a PR has been created until first action is started
6 seconds are lost just to download some fairly small resources. A major contributor to this is ~500 millisecond latency, per download, from EU self hosted runners. It seems "Getting action download info" is also getting the same latency, so in total N+1 calls are made, where N is the numberof calls.
Example of latency:
~$ time curl -q https://pipelinesghubeus12.actions.githubusercontent.com/
real 0m0.543s
This could be mitigated:
Using parallel downloads in the runner
Consider tuning the HTTP client: using http2, connection pooling
By github: have a CDNs around the world for action cache and OIDC tokens
By self hoster: Using a caching HTTP proxy - could cause issues with authentication and security. Doesn't help if cache hit ratio is low.
Example of output with patched runner doing downloads in aralllel, step is executed in about 1 second instead of the usual 6 (slightly different downloads compared to above, sorry about that):
I don't know C#/.NET, but here's a patch I made to do the experiment above. I could create a PR if you like the idea but it will probably be easier if you do it to get the right quality and coding standards.
diff --git a/src/Runner.Worker/ActionManager.cs b/src/Runner.Worker/ActionManager.cs
index f32cad2..e560045 100644
--- a/src/Runner.Worker/ActionManager.cs+++ b/src/Runner.Worker/ActionManager.cs@@ -198,22 +198,35 @@ namespace GitHub.Runner.Worker
// Get the download info
var downloadInfos = await GetDownloadInfoAsync(executionContext, repositoryActions);
+ // Limit concurrency to 10+ var semaphore = new SemaphoreSlim(10);+
// Download each action
- foreach (var action in repositoryActions)+ var downloadTasks = repositoryActions.Select(async action =>
{
- var lookupKey = GetDownloadInfoLookupKey(action);- if (string.IsNullOrEmpty(lookupKey))+ await semaphore.WaitAsync();+ try
{
- continue;- }+ var lookupKey = GetDownloadInfoLookupKey(action);+ if (string.IsNullOrEmpty(lookupKey))+ {+ return;+ }- if (!downloadInfos.TryGetValue(lookupKey, out var downloadInfo))+ if (!downloadInfos.TryGetValue(lookupKey, out var downloadInfo))+ {+ throw new Exception($"Missing download info for {lookupKey}");+ }++ await DownloadRepositoryActionAsync(executionContext, downloadInfo);+ }+ finally
{
- throw new Exception($"Missing download info for {lookupKey}");+ semaphore.Release();
}
+ }).ToList();- await DownloadRepositoryActionAsync(executionContext, downloadInfo);- }+ await Task.WhenAll(downloadTasks);
// More preparation based on content in the repository (action.yml)
foreach (var action in repositoryActions)
The text was updated successfully, but these errors were encountered:
Describe the bug
Network latency significantly slows down the "Set up job" step for self hosted runners located outside US. This could be improved from 4+ seconds to 1 second in many cases with concurrent downloads.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
As low latency as possible from a PR has been created until first action is started
Runner Version and Platform
Version of your runner? main branch / v2.321.0
OS of the machine running the runner? Linux
What's not working?
Performance could be better
Job Log Output
6 seconds are lost just to download some fairly small resources. A major contributor to this is ~500 millisecond latency, per download, from EU self hosted runners. It seems "Getting action download info" is also getting the same latency, so in total N+1 calls are made, where N is the numberof calls.
Example of latency:
This could be mitigated:
Example of output with patched runner doing downloads in aralllel, step is executed in about 1 second instead of the usual 6 (slightly different downloads compared to above, sorry about that):
Experimental patch
I don't know C#/.NET, but here's a patch I made to do the experiment above. I could create a PR if you like the idea but it will probably be easier if you do it to get the right quality and coding standards.
Fork: https://github.com/actions/runner/compare/main...chlunde:runner:parallel-download?expand=1
The text was updated successfully, but these errors were encountered: