
Backup nodes + httpchk #100

Closed
wants to merge 4 commits

Conversation

@Jaykah (Contributor) commented Nov 23, 2014

This patch makes it possible to mark stub (default) backends as backup servers and adds the ability to use the HAProxy httpchk option with custom check ports for those nodes.
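
(For illustration, a minimal sketch of the HAProxy output this enables for a default server marked as backup with a custom check port; names, addresses, and ports are made up, and "option httpchk" must be enabled for the backend:)

backend database
  option httpchk GET /
  server def-db1 10.0.100.1:3306 backup check port 7201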

@Jaykah (Contributor, Author) commented Jan 8, 2015

Can we please have this merged?

It has been quite a while; is the project still maintained?

Thanks

@clizzin (Contributor) commented Feb 19, 2015

Sorry @Jaykah, the project maintainer @igor47 is really busy with other projects for Airbnb these days. We're doing our best to find time to respond to pull requests, and we're sorry contributors are waiting so long. Please bear with us in the meantime.

@Jaykah (Contributor, Author) commented Apr 15, 2015

Hi,

Any updates on the potential merge?

Thanks!

@Jaykah (Contributor, Author) commented May 28, 2015

@clizzin @igor47 Hi guys, when do you think it will be possible to have this reviewed?

@Jaykah (Contributor, Author) commented Aug 27, 2015

@clizzin @igor47 Just wanted to bump this, since it is a one-line change that, if merged, would make our lives much easier :) There's not much to review either...

@igor47 (Collaborator) commented Aug 27, 2015

Ah, sorry about the delay; this could use a rebase.

Also, we handle this problem for non-default servers with a server_options argument, which can cover all of the use cases you introduce plus any other arbitrary ones. Maybe this PR should just add a server_options argument to the backup servers section instead?
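
(For reference, the existing per-watcher option looks roughly like this in a service's config; the string is appended to every generated server line, and the values here are illustrative:)

services:
  database:
    haproxy:
      server_options: "check inter 2000 rise 3 fall 2"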

@Jaykah (Contributor, Author) commented Aug 27, 2015

Rebase done.

The reason this is done without server_options is to assign the "backup" label to individual backup nodes rather than to the whole group (if I understand your question correctly).

@jolynch (Collaborator) commented Aug 27, 2015

We're adding something similar for the weight change in #131. Perhaps we should add something like server_options to the backend registrations themselves to allow this kind of metadata to be propagated from nerve? If we're going to do that, might it make sense to keep all this frontend-specific metadata in one top-level key?

@Jaykah (Contributor, Author) commented Aug 27, 2015

@jolynch Sure, that's a good approach for ZK-listed servers. However, this particular scenario deals with static default servers, so I don't see an overlap there.

@jolynch (Collaborator) commented Aug 27, 2015

@Jaykah The static registrations from the default servers should (AFAIK) be identical to what nerve is registering, as far as Synapse can tell; in BaseWatcher's set_backends method they are simply merged together.

I guess what I'm saying is that we can solve both use cases with a generic "server_options" top-level key in backend registration hashes.
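
(A sketch of what such a registration hash could look like if this proposal were adopted; host, port, and name are the existing fields, while the server_options key is hypothetical:)

{
  "host": "10.0.100.1",
  "port": 3306,
  "name": "def-db1",
  "server_options": "backup check port 7201"
}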

@Jaykah (Contributor, Author) commented Aug 27, 2015

@jolynch

What I am referring to is the "stub" watcher, rather than any backend discovered via Zookeeper/Docker/etc. My use case is a server in Region A that would use backends from Region B only if all local nodes fail. Those backends are marked as "backup" in the Synapse config. At the same time, region B's backends are primary for their own region (so they won't be marked as "backup" in Nerve), and are discovered through a different path in Zookeeper.

A better option would be to discover backup nodes from another Zookeeper path, but that's a different discussion (we would still need default servers if ZK fails).
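
(For illustration, the desired HAProxy output for this use case would look roughly like the following, with made-up names and addresses; traffic only reaches the Region B server once every non-backup server is down:)

backend database
  option httpchk GET /
  server region-a-db1 10.0.1.1:3306 check port 7201
  server region-a-db2 10.0.1.2:3306 check port 7201
  server region-b-db1 10.1.1.1:3306 backup check port 7201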

@jolynch (Collaborator) commented Aug 28, 2015

Ah, that's good context. We're experimenting with zone failover by using multiple backends and a frontend ACL. We've been relying on the implicit backend name for a bit, but I opened #135 to make it explicit so you can reference backends in ACLs. That being said, the HAProxy guys recommend combining multiple backends + ACLs with backups, so we should probably support it. Definitely interested in coming up with a good way to do failover.
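
(A rough sketch of the multiple-backends-plus-ACL pattern described here, with hypothetical backend names; nbsrv counts the usable servers in a backend, so traffic is routed to the remote backend only when no local server is up:)

frontend database
  bind *:3306
  acl local_up nbsrv(database_local) gt 0
  use_backend database_local if local_up
  default_backend database_remote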

I think I'm starting to see your use case; I'm just trying to find a way to address a few similar feature requests. I feel like adding a bunch of HAProxy-specific keys (like httpchk) is the wrong way going forward, especially if we ever want to add more outputs than HAProxy/file output. I'm not sure what the right abstraction is, though; what do you think?

Also, just FYI, you can use the state cache as an additional guard against ZooKeeper flakiness (it's the only guard we use).

@Jaykah (Contributor, Author) commented Aug 28, 2015

> We've been relying on the implicit backend name for a bit, but I opened #135 to make it explicit so you can reference backends in ACLs.

So if I understand correctly, in that case we would define two (or more) backends (main and backup) in the frontend's Synapse config and fail over to the backup using an ACL? If so, that seems to solve part of what I am trying to achieve, since your example shows ZK discovery for the backup node, which can be done from a different ZK path.

Are there any limitations to such an approach?

> That being said, the HAProxy guys recommend combining multiple backends + ACLs with backups, so we should probably support it.

That's what I meant by

> A better option would be to discover backup nodes from another Zookeeper path, but that's a different discussion (we would still need default servers if ZK fails).

Having the backup option available on a per-node basis for the stub watcher is complementary to #135.

> I feel like adding a bunch of HAProxy-specific keys (like httpchk) is the wrong way going forward, especially if we ever want to add more outputs than HAProxy/file output. I'm not sure what the right abstraction is, though; what do you think?

Sure, having more flexibility is always great, but we should not forget that even per Synapse's readme, "The heart of synapse is actually HAProxy, a stable and proven routing component."

> Also, just FYI, you can use the state cache as an additional guard against ZooKeeper flakiness (it's the only guard we use).

What cache are you referring to? We are not using Curator, if you are talking about the path cache.

@jolynch (Collaborator) commented Sep 7, 2015

@Jaykah
I think we need a general way to inform config generators (HAProxy, file output, etc.) of relevant metadata for backends. @bobtfish is working on adding weights in #131, and I think this change and that one need to be generalized.

> Are there any limitations to such an approach?

From my conversations with the HAProxy folks, ACLs are preferred because they give you more control over when failover occurs, but the blog article I linked to seems to indicate they can be used together. In my opinion the ACL approach strictly dominates backups because I can reason about failover really easily (does the backup backend have any traffic?).

> Having the backup option available on a per-node basis for the stub watcher is complementary to #135.

I agree, I'm just trying to figure out a way that we can unify #100, #131 and other similar features. I really do think that the right way to do this is to attach metadata to backends and we can have well specified behavior per output (HAProxy, files, etc ...)

> Sure, having more flexibility is always great, but we should not forget that even per Synapse's readme, "The heart of synapse is actually HAProxy, a stable and proven routing component."

I can change the readme, but I believe it is in the best interest of the project not to make additional choices that tie us directly to HAProxy. We've already added file output, which has been instrumental in getting things like Cassandra bootstrapping off SmartStack. We had to do some crazy stuff to get HAProxy not to drop traffic during reloads, and HAProxy might not be the only solution we offer going forward.

> What cache are you referring to? We are not using Curator, if you are talking about the path cache.

I'm referring to the state_file_path and state_file_ttl options, which allow you to cache backends across restarts. Those, together with the restart_interval and restart_jitter HAProxy options, have helped us out a ton.
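
(For reference, these are options in the haproxy section of the Synapse config; exact placement and defaults may vary by Synapse version, and the values below are illustrative:)

haproxy:
  state_file_path: "/var/run/synapse/haproxy_state.json"
  state_file_ttl: 86400     # keep cached backends for up to a day
  restart_interval: 60      # reload HAProxy at most once a minute
  restart_jitter: 0.1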

@jolynch (Collaborator) commented Sep 7, 2015

I'm also potentially OK with using a bunch of top-level keys as long as we namespace them or somehow make it obvious that httpchk doesn't do anything unless you're using HAProxy (as opposed to file or nginx or whatever we build in the future).

@jolynch (Collaborator) commented Sep 7, 2015

#83 is also relevant.

@Jaykah (Contributor, Author) commented Sep 10, 2015

@jolynch

> I agree, I'm just trying to figure out a way that we can unify #100, #131 and other similar features. I really do think that the right way to do this is to attach metadata to backends and we can have well specified behavior per output (HAProxy, files, etc ...)

Sure, if we can select main and backup paths per Synapse instance (e.g. a main path for the same region as Synapse, and then a separate "backup" block with a different ZK path for another region), that would cover the need for the backup option.

> I'm also potentially OK with using a bunch of top-level keys as long as we namespace them or somehow make it obvious that httpchk doesn't do anything unless you're using HAProxy (as opposed to file or nginx or whatever we build in the future).

Maybe prefix them with "haproxy"?

haproxy-httpchk: boolean; use the HAProxy httpchk check for this server. Requires "option httpchk /" to be set under the HAProxy options.
haproxy-chkport: custom port for the httpchk health check.

@jolynch (Collaborator) commented Sep 14, 2015

@Jaykah I think keeping it flat with well-named tags can work going forward. Trying to come up with some kind of well-defined nesting structure in the service registrations seems like a losing battle to me, but maybe I'm not creative enough.

Let's do this, but let's add some documentation in both nerve and synapse for what backend registrations can actually contain (e.g. right now it's host, port, name, weight?) and what each field means. Let's try to keep metadata that could only ever be used for one config generator prefixed with that generator's name.

@@ -717,7 +717,7 @@ def generate_backend_stanza(watcher, config)
config.map {|c| "\t#{c}"},
backends.keys.shuffle.map {|backend_name|
backend = backends[backend_name]
-          b = "\tserver #{backend_name} #{backend['host']}:#{backend['port']}"
+          b = "\tserver #{backend_name} #{backend['host']}:#{backend['port']}#{' backup' if backend['backup']}#{' check port ' if backend['httpchk']}#{backend['chkport']}"
@jolynch (Collaborator) commented on the diff:

Thoughts on having a haproxy_server_options catch-all for registration info on backend registrations, similar to the watcher's server_options?

So, instead of haproxy_backup and haproxy_httpchk, you would just have haproxy_server_options, which can contain weights, httpchk settings, check ports, backup status, etc.

@Jaykah (Contributor, Author) commented Sep 14, 2015

@jolynch The issue with server_options (or haproxy_server_options, for that matter) is that those are blanket rules that apply to all backends rather than to individual ones (unless I am missing something).

So in my case it would be:

services:
  database:
    default_servers:
      - name: "def-db1"
        host: "10.0.100.1"
        port: 3306
        haproxy-httpchk: true
        haproxy-chkport: 7201
      - name: "def-db2"
        host: "10.0.100.2"
        port: 3307
        haproxy-httpchk: true
        haproxy-chkport: 7202

which would allow different check ports per instance (useful for a flat infrastructure where two nodes may coexist on the same machine).

However, if we had haproxy_server_options per backend, that would be fine as well.
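
(For comparison, a sketch of the same example using a per-server haproxy_server_options string, assuming such a key were accepted on individual default_servers entries:)

services:
  database:
    default_servers:
      - name: "def-db1"
        host: "10.0.100.1"
        port: 3306
        haproxy_server_options: "check port 7201"
      - name: "def-db2"
        host: "10.0.100.2"
        port: 3307
        haproxy_server_options: "check port 7202"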

@jolynch (Collaborator) commented Sep 14, 2015

@Jaykah, yes, I mean in addition to the service backend's server_options. So you can set options either per backend or per server.

Sweet, let's do that and I'll merge it.

@Jaykah (Contributor, Author) commented Oct 6, 2015

#140

@Jaykah closed this Oct 6, 2015