
Starting members not actually limited to batch size of init_count #41

Open
dcheckoway opened this issue Jan 6, 2015 · 2 comments

@dcheckoway
Collaborator

This is not a bug per se, just something I wanted to bring to your attention -- on the off chance it's unintentional or has potential to cause anybody any grief.

Pooler grows the member pool by batches of init_count, but that is not strictly enforced. I don't necessarily think it should be enforced, but some interesting behavior might arise in certain cases because it's not.

What I mean by not strictly enforced is:
https://github.com/seth/pooler/blob/master/src/pooler.erl#L560

Here's a sample case to illustrate the point. All members are currently in use, but there is room for growth:

init_count = 10
max_count = 100
free_pids = []
in_use_count = 10
starting_members = []

Somebody tries to take a member. That results in add_members_async(10, ...) starting up another batch of 10, and it returns error_no_members to the caller. All good so far, and now:

starting_members = [...10 elements...]

Immediately thereafter (prior to any of the new batch being fully started/added/accepted), another caller tries to take a member. This gets evaluated:

NumToAdd = max(min(InitCount - NonStaleStartingMemberCount, NumCanAdd), 1)

...which equates to max(min(10 - 10, 80), 1), so NumToAdd = 1. That results in add_members_async(1, ...) starting up another single member, and it again returns error_no_members to the caller.
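To make the arithmetic concrete, here's a rough sketch (not pooler's actual code; the module and variable names are mine) that replays the two back-to-back takes using the expression above:

    %% growth_sketch.erl -- illustrative only, not pooler internals
    -module(growth_sketch).
    -export([run/0]).

    %% The batch-size expression from pooler.erl quoted above.
    num_to_add(InitCount, StartingCount, NumCanAdd) ->
        max(min(InitCount - StartingCount, NumCanAdd), 1).

    run() ->
        InitCount = 10,
        MaxCount = 100,
        InUse = 10,
        %% First take: nothing starting yet, 90 slots free to grow into.
        CanAdd0 = MaxCount - (InUse + 0),
        N1 = num_to_add(InitCount, 0, CanAdd0),        %% -> 10
        %% Second take arrives before any of those 10 finish starting.
        Starting = N1,
        CanAdd1 = MaxCount - (InUse + Starting),
        N2 = num_to_add(InitCount, Starting, CanAdd1), %% -> 1
        {N1, N2}.                                      %% {10, 1}: 11 starting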

Now we have 11 members being started. This may not be a problem per se, but if (a) it takes long enough for new members to start, and (b) consumers are trying to take a member at a fast enough rate, the number of starting members can grow quite large. The worst case (I believe) is up to max_count - init_count members starting at once.

(a) and (b) together are the perfect storm that can potentially overload things.
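Extending the sketch above (same caveats: illustrative names, not pooler internals), the worst case falls out of the same expression. If takes keep arriving and no member ever finishes starting, each take past the first batch adds one more starting member until in_use + starting hits max_count:

    %% Repeatedly apply the batch-size expression without ever letting a
    %% starting member complete; returns the final starting_members count.
    worst_case(InitCount, MaxCount, InUse, Starting)
      when InUse + Starting < MaxCount ->
        CanAdd = MaxCount - (InUse + Starting),
        N = max(min(InitCount - Starting, CanAdd), 1),
        worst_case(InitCount, MaxCount, InUse, Starting + N);
    worst_case(_InitCount, _MaxCount, _InUse, Starting) ->
        Starting.

    %% worst_case(10, 100, 10, 0) -> 90, i.e. max_count - init_count here.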

This may be a non-issue, but I wanted to bring it to your attention. If this ever bites anybody, the fix would be simple: place a configurable hard limit on it, e.g. max_starting_count. It could default to max(init_count, 1) out of the box, and users could raise or lower it as they see fit.
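A sketch of what I have in mind (max_starting_count is just my suggested name, not an existing option):

    %% Clamp the batch size so starting_members never exceeds the cap.
    num_to_add_capped(InitCount, Starting, CanAdd, MaxStartingCount) ->
        Uncapped = max(min(InitCount - Starting, CanAdd), 1),
        min(Uncapped, max(MaxStartingCount - Starting, 0)).

With the default cap of max(init_count, 1) = 10, the second take in the example above would compute min(1, 0) = 0 and start no eleventh member.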

What do you think?

@seth
Collaborator

seth commented Jan 7, 2015

Hi there,

I've only had a chance to read your report quickly, but it looks to me like you've found a bug and something that could be a problem. Perhaps it would be worth adding the config you suggest.

@oferrigni thoughts?

@dcheckoway
Collaborator Author

@seth @oferrigni FWIW, I've been seeing this "member leakage" behavior in our production environment. We have a setup where init_count is 1000, and we can see our pool grow like:

  • 1000
  • 2001
  • 3003
  • 4008
  • 5017

The overflow at each step is the number of consumers that tried to take a member while pooler was off starting up a new batch of members.
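If I'm reading our numbers right, each jump is one init_count batch plus the leaked single-member starts. A throwaway snippet to pull the leak out of the pool sizes above:

    %% Leak per growth step = size delta minus the init_count batch.
    leaks(Sizes, InitCount) ->
        [B - A - InitCount
         || {A, B} <- lists:zip(lists:droplast(Sizes), tl(Sizes))].

    %% leaks([1000, 2001, 3003, 4008, 5017], 1000) -> [1, 2, 5, 9]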

If you're not opposed to my suggestion above, I'll submit a PR...

@seriyps seriyps self-assigned this Apr 9, 2023