Authentication provided by worker does not match #127

Closed
pat-s opened this issue Feb 28, 2019 · 3 comments

Comments

pat-s commented Feb 28, 2019

Getting this error recently and I have no clue where the problem might be.
Executing from the master node.

Q(fx, x=1:3, n_jobs=1, template = list(n_cpus = 1, log_file = "log.txt"))

Submitting 1 worker jobs (ID: 6206) ...
Running 3 calculations (1 calls/chunk) ...
Error in qsys$receive_data(timeout = timeout) :
  Authentication provided by worker does not match

Template:

#!/bin/sh
#SBATCH --job-name={{ job_name }}
#SBATCH --partition=normal
#SBATCH --output={{ log_file | /dev/null }} # you can add .%a for array index
#SBATCH --error={{ log_file | /dev/null }}
#SBATCH --cpus-per-task={{ n_cpus }}
#SBATCH --array=1-{{ n_jobs }}

R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")'

Worker log:

WARNING: ignoring environment value of R_HOME

R version 3.5.1 (2018-07-02) -- "Feather Spray"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Warning message:
package 'methods' was built under R version 3.5.2
During startup - There were 12 warnings (use warnings() to see them)
> clustermq:::worker("tcp://gisc:6206")
Master: tcp://gisc:6206
WORKER_UP to: tcp://gisc:6206
slurmstepd: error: *** JOB 405 ON c0 CANCELLED AT 2019-02-28T11:56:35 ***

pat-s commented Feb 28, 2019

This only happens when R is running in packrat mode. What might be clashing here?
The only difference is the R package libraries, I think.

(Btw I had a working SSH setup before which worked with the packrat libs. So I'm a bit confused now).

mschubert commented Feb 28, 2019

I added a simple socket authentication mechanism (#125) as of version 0.8.6. It is only supposed to be checked if the CMQ_AUTH variable is set in the template, and ignored otherwise (with a warning that you are not using authentication).

In my tests this worked flawlessly with CMQ_AUTH set (no warning, but error if tokens do not match) and not set (warning but no error).
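
As a rough illustration of the intended behavior described above, here is a minimal shell sketch. It is not clustermq's actual code; the function name check_auth and its two arguments are hypothetical, and only the decision logic (warn when no token is given, fail when the tokens differ) is taken from the description.

```shell
#!/bin/sh
# Hypothetical sketch of the intended check; NOT clustermq's implementation.
# Usage: check_auth MASTER_TOKEN WORKER_TOKEN
# Returns 0 if the worker is accepted, 1 if it is rejected.
check_auth() {
    master_token=$1
    worker_token=$2

    if [ -z "$worker_token" ]; then
        # CMQ_AUTH was not set in the template: accept the worker, but warn.
        echo "warning: worker is not using authentication" >&2
        return 0
    fi

    if [ "$worker_token" != "$master_token" ]; then
        # A token was provided but differs from the master's: reject.
        echo "error: Authentication provided by worker does not match" >&2
        return 1
    fi

    # Tokens are identical: accept silently.
    return 0
}
```

Calling check_auth "secret" "secret" succeeds silently, check_auth "secret" "" succeeds with a warning, and check_auth "secret" "other" fails with the error message from this issue.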

Can you try (1) making sure both your master and worker use the same version of clustermq and (2) changing your template line to the one below?

CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")'
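
For reference, applying that change to the template from the original post would give roughly the file below. Everything except the last line is copied unchanged from the post; the {{ auth }} placeholder comes from the suggested line above and is assumed to be filled in at submission time like the other {{ ... }} fields.

```shell
#!/bin/sh
#SBATCH --job-name={{ job_name }}
#SBATCH --partition=normal
#SBATCH --output={{ log_file | /dev/null }} # you can add .%a for array index
#SBATCH --error={{ log_file | /dev/null }}
#SBATCH --cpus-per-task={{ n_cpus }}
#SBATCH --array=1-{{ n_jobs }}

# Pass the token to the worker through the CMQ_AUTH environment variable
CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")'
```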

edit: Do you have any idea why the master doesn't show you this warning?

pat-s commented Mar 1, 2019

Ah!

I remember getting the warning message. Then I inserted it as suggested and faced other errors. I did not realize that this error was caused by CMQ_AUTH and assumed other issues were the reason. I did not even notice the update, since I had installed the package on a new machine.

It works now - thanks!
