Re-use SSH connections #166

bowlofeggs · 2016-03-15T17:34:02Z

I've noticed that pulp-smash takes about 6x longer to run when run against a remote host than when run locally. Of course it is expected that it would be a little slower since network latency comes into play, but I suspect that there is a practical way to speed it up a good bit. I believe it may not be keeping ssh connections, and may be starting new connections as it goes. Is this true? If so, could we figure out a way to get it to persist a connection?

Ichimonji10 · 2016-03-16T14:28:45Z

Adding something like this into the client's ~/.ssh/config may help:

ControlMaster auto
ControlPersist 30s
ControlPath ~/.ssh/controlmasters/%C

Ichimonji10 · 2016-03-16T16:12:20Z

I tested the solution proposed above, and I saw a drop in test suite execution time from from 1,967 seconds to 1,635 seconds. That's a significant improvement. My test system sported a dual core Opteron 280, and I suspect that modern systems will see proportionately greater improvements in execution time.

@rbarlow, can you test this solution? All that's needed is to execute mkdir -p ~/.ssh/controlmasters and edit ~/.ssh/config like so:

Host pulp.example.com
    ControlMaster auto
    ControlPersist 5m
    ControlPath ~/.ssh/controlmasters/%C

    # Not necessary, but likely desired.
    User nobody
    IdentityFile ~/.ssh/pulp.example.com
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null

bowlofeggs · 2016-03-17T20:41:35Z

Doing this makes the tests go about 21% faster. However, I still observed quite a few new ssh connections being established even with these settings. Additionally, I don't think users should have to configure ssh for this to work, especially since they may often test temporary hosts that don't have names, or that have ephemeral names. It would be ideal if pulp-smash automatically set up connection pooling, either by using CLI flags on ssh (see man ssh, as it seems possible to do this without editing the ssh config) or by using a Python ssh library that does this.

omaciel · 2016-03-17T21:44:47Z

I fully agree with what you said @rbarlow :-)

Ichimonji10 · 2016-03-18T13:39:13Z

However, I still observed quite a few new ssh connections being established even with these settings.

Do you have any insights into why? Is it because the sample ControlPersist times I've given above (five minutes and thirty seconds, respectively) are too short? Or something else?

or by using a Python ssh library that does this.

Python SSH libraries are out. We looked in to Paramiko, Fabric and Ansible, and none are viable. We've discussed this extensively in #32.

I don't think users should have to configure ssh for this to work,

SSH configuration is unavoidable. At a bare minimum, users must configure SSH keys to test against remote systems, and it is highly likely that they will also want to set a User and IdentityFile.

especially since they may often test temporary hosts that don't have names, or that have ephemeral names.

I don't see how this is relevant. Some SSH configuration is unavoidable. And this is made easier because the SSH config file supports host name globbing, IP address globbing, multiple hostnames and IPs, and so on. As an example, I had entries like this in my ~/.ssh/config for a while:

# Pulp VMs for use by Pulp Smash.

Host 192.168.121.*
    User vagrant
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null

Host 192.168.121.183
    IdentityFile ~/code/pulp-dev-45/pulp/.vagrant/machines/pulp-dev-45/libvirt/private_key

It would be ideal if pulp-smash automatically set up connection pooling

Is it, though? Yes, it'd be nice if connections were speedy and this wasn't something users had to configure. But there's up-sides to relying on the user's SSH configuration. By way of example, let's imagine that we automatically set up connection pooling using CLI flags, and the target Pulp system is configured to disallow SSH connection pooling. First, the user will be confused, because Pulp Smash SSH connections will be broken, but normal SSH connections will work fine. Second, they won't be able to override our CLI flags to fix the bug.

Relying on the user to edit ~/.ssh/config does put more work on them. But our current approach is extremely straightforward. Users never have to ask themselves "is SSH broken because of my configuration or because of some Pulp Smash bug?" And I don't really want us to take on more responsibilities unless we have to. Saying "no" to feature requests is a great way to keep ourselves sane and doing valuable work.

bowlofeggs · 2016-03-18T18:38:15Z

By way of example, let's imagine that we automatically set up connection pooling using CLI flags, and the target Pulp system is configured to disallow SSH connection pooling.

"Let's burn that bridge when we get to it." Our supported operating systems do not have this problem, and having people do a lot of configuration work will take real time, all in an effort to avoid something that would take theoretical time. Let's take the action that will let us all do our jobs faster instead of focusing so much on the "what if this weird edge case that will probably not happen happens" scenario.

Ichimonji10 · 2016-03-18T23:30:44Z

Ah, yes. Pulp is only supported in a limited number of environments, and we don't care about other platforms or interesting user configurations, by it or other applications, presumably now or in the future. sigh

By hard-coding SSH options into our application, we are creating an application that is unable to work in a variety of environments. One is where the target system disallows connection pooling. Another is where we're working with an older version of OpenSSH. I expect there are more cases. We already disallow the use of other SSH clients, such as Dropbear and SecureCRT. :/

To be clear, I am sympathetic to the complaint at the top of this thread. I would like for SSH connections to be efficient, through a mechanism such as connection pooling. But adopting a solution that further lowers the adaptability and portability of Pulp Smash is distasteful to me. This is especially true given that:

Many circumstances already require users to edit their ~/.ssh/config. Using this as a solution — at least until a nicer one is found — fits in with this existing workflow.
Using CLI flags is a non-orthogonal solution. Only a limited number of options are available; using CLI options means that Pulp Smash makes use of both CLI and config file options; using CLI options pushes SSH options into Pulp Smash itself instead of leaving them nicely separated in an external configuration file; and CLI options override whatever is in a configuration file, reducing Pulp Smash's flexibility.
The complaint that we're forcing people to do a lot of configuration work seems incorrect to me. (Remember: copying and pasting three lines into a file.) Disagreeable? Yes. A burdensome waste of time? No.
We haven't even brainstormed for other solutions yet. Hard-coded CLI options are the second solution that has been proffered. The second! To jump on it now and vociferously decry pushback is short-sighted. (It's also annoying.)

Does any of this matter? Yes, I think so. ~~redacted for propriety's sake~~

Finally: are there other solutions? I expect so. And if we discover them, we may find that we unexpectedly solve other, related problems. For example, I dislike how users are required to manually edit the Pulp Smash configuration file. A solution to that issue is to rework and expand pulp_smash/__main__.py and add a simple configuration wizard that creates $XDG_CONFIG_HOME/pulp_smash/settings.json and/or $XDG_CONFIG_HOME/pulp_smash/ssh_config and/or prints out lines the user may want to add to their ~/.ssh/config. When Pulp Smash runs, it can look for and use that file if present. Whaddya know — a flexible solution that solves two real-time-eating problems, not just the one described at the head of this thread.

bowlofeggs · 2016-03-19T16:43:25Z

It is not currently required to have an ssh config to use pulp-smash. I created the file when I tried your suggestion above. I was running pulp-smash against a remote computer without one before. So it's not an existing workflow for many real world user cases. Your position is sacrificing real world use cases for imaginary ones.

Spending time thinking about edge case uses when there are no known edge-case users is not time well spent. As I said, we can burn that bridge when we get to it. Think about it - how hard could it possibly be to respond to such an edge case if it ever happens (and I strongly suspect it never will): you use a try/except, and try again without the control settings and now it works in either environment. This would be a fix in a single place of code and would take probably 10 minutes or less. I don't think that's even worth thinking about at all until it happens, but if it does happen we're not talking about rearchitecting anything at all. In the mean time, pulp-smash will work substantially better.

Ichimonji10 · 2016-03-21T14:04:21Z

It is not currently required to have an ssh config to use pulp-smash. I created the file when I tried your suggestion above. I was running pulp-smash against a remote computer without one before. So it's not an existing workflow for many real world user cases. Your position is sacrificing real world use cases for imaginary ones.

Being able to log in to a remote system without an SSH config requires that the local and remote users have the same name, and that the SSH key for login is password-less or available via an SSH agent. Given this (and my own experiences with managing SSH profiles), I disagree with the notion that the existence of an SSH configuration file is an imaginary or uncommon use case.

Think about it - how hard could it possibly be to respond to such an edge case if it ever happens (and I strongly suspect it never will): you use a try/except, and try again without the control settings and now it works in either environment. This would be a fix in a single place of code and would take probably 10 minutes or less.

First, the solution of "try these arguments, then these arguments (, then these arguments...)" is special casing. In my experience, special casing is not a good technique for solving problems, because it requires that the program explicitly handle every situation that might be encountered. I've found it to be much more profitable to find a single generic solution to a class of issues than to walk through a hard-coded list of possible solutions.

Let's say we go ahead and add hard-coded SSH CLI arguments. How hard is this, and how hard would it be to handle a case where we've found our arguments are not appropriate? Well... hard and fragile.

"Hard," because AFAIK one must explicitly create a master socket when working with the CLI. In addition (again AFAIK) the SSH CLI doesn't support the ControlPersist argument, which means that this socket must be created and maintained in a separate Python thread. I think that'll be hard to write and hard to get right.

"Fragile" because SSH returns 255 for all errors. Catching 255 is like writing the following (pseudocode):

try:
     'ssh -S /path/to/controlmasters/some.socket example.com'
except SSHReturnCode255:  # what does 255 indicate?
    'ssh example.com'

We don't know what we're catching here. The 255 we're catching could indicate a wide range of errors, and we're swallowing them all.

Look, I also want a greater level of efficiency when executing tests. If we can drop execution time by 21%, that's great! But I think the specific solution of hard-coding CLI flags into Pulp Smash is a Bad Idea™. So let's come up with something else. In the immediate future, we can advise users to add several lines into their ~/.ssh/config, and to create the file if needed. Down the road, we can implement an even nicer solution. Maybe that'll be Pulp Smash offering to automagically creating a config file. Maybe it'll be a new PULP_SMASH_SSH_OPTS environment variable. Maybe something else.

Ichimonji10 · 2016-03-23T16:55:28Z

I spent a bit more time reading up on SSH multiplexing. (I was prompted by some benign multiplexing errors, interestingly enough.) I happily rediscovered the -o option provided by the client, and tested them with this script:

#!/usr/bin/env bash

time for i in {1..10}; do
    ssh example.com hostname
done

time for i in {1..10}; do
    ssh \
        -o 'ControlMaster=auto' \
        -o 'ControlPersist=10s' \
        -o 'ControlPath=~/.ssh/controlmasters/%C' \
        example.com hostname
done

It works well. Given this, we could default to inserting these options into SSH commands:

-o ControlMaster=auto -o ControlPersist=5m -o ControlPath=$XDG_RUNTIME_DIR/pulp_smash/controlmasters/%C

How can we control these options? (I think it's a Bad Idea to hard-code any options in.) One option is to support a PULP_SMASH_SSH_ARGS environment variable. Another option is to expand the configuration file. In either case, we should use a default argument only if the user has not provided their own.

Using default arguments in this manner is with precedent. See Ansible's OpenSSH Specific Settings documentation as an analogue.

This solution will require some environment inspection and manipulation. Specifically, we will need to inspect whether XDG_RUNTIME_DIR is set and provide a default value if unset. (Defaults for defaults!) It'll also require that we create the directory in which sockets go. This level of complexity has precedent: We already do similar environment inspection and manipulation in pulp_smash.config.

Ichimonji10 added the Issue Type: Discussion label Jul 8, 2016

Ichimonji10 changed the title ~~pulp-smash takes significantly longer to run when using ssh for commands instead of local~~ Can pulp smash re-use SSH connections? Jul 8, 2016

Ichimonji10 changed the title ~~Can pulp smash re-use SSH connections?~~ Re-use SSH connections Jul 8, 2016

Ichimonji10 added Issue Type: Plan Issue Type: Discussion and removed Issue Type: Discussion Issue Type: Plan labels Jul 8, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Re-use SSH connections #166

Re-use SSH connections #166

bowlofeggs commented Mar 15, 2016

Ichimonji10 commented Mar 16, 2016

Ichimonji10 commented Mar 16, 2016

bowlofeggs commented Mar 17, 2016

omaciel commented Mar 17, 2016

Ichimonji10 commented Mar 18, 2016

bowlofeggs commented Mar 18, 2016

Ichimonji10 commented Mar 18, 2016

bowlofeggs commented Mar 19, 2016

Ichimonji10 commented Mar 21, 2016

Ichimonji10 commented Mar 23, 2016

Re-use SSH connections #166

Re-use SSH connections #166

Comments

bowlofeggs commented Mar 15, 2016

Ichimonji10 commented Mar 16, 2016

Ichimonji10 commented Mar 16, 2016

bowlofeggs commented Mar 17, 2016

omaciel commented Mar 17, 2016

Ichimonji10 commented Mar 18, 2016

bowlofeggs commented Mar 18, 2016

Ichimonji10 commented Mar 18, 2016

bowlofeggs commented Mar 19, 2016

Ichimonji10 commented Mar 21, 2016

Ichimonji10 commented Mar 23, 2016