Add runner for vrap #139

scarlehoff · 2022-06-14T12:19:26Z

This mostly works, but there are two (related) issues that I guess come up for the first time with vrap:

The `results` part of the runner.

If you try to run this, it will fail at the end (after generating everything) when it tries to compare with the Monte Carlo results. There is no "MC result" to compare with in vrap (especially for the scale variations).

I could implement a reader for the central scale I guess or modify vrap to print the result in a more reasonable format, but is there a way to skip that...

One run-per-grid or one-grid-per-table...

There is a problem with some of the FTDY sets: the dataset for E906 is actually a set of 10 bins with different acceptances, so we have a dataset that would need to generate 10 different pineappls. Here we have three choices:

1. Generate the 10 grids, apply the acceptance factors, and then sum them together, so that we end up with a single grid in which the information about the individual runs has been lost.
2. Generate 10 E906_binX folders, each with the kinematics of the given bin.
3. Accommodate the fact that we can actually have grids that are a composition of other grids.

I guess 2. is similar to what we do for jets. But I wonder how terrible it would be to do 1. directly here. In principle the acceptance factors are inherent to the dataset, so the fact that the grid and the MC calculation don't match (this is where both issues are related) is secondary...

Also, please have a look at least to check that the structure is correct.

scarlehoff · 2022-06-14T12:20:19Z

@enocera what do you think of option 1. above? Basically, burning the ACC factors into the grid.

nnpdf31_proc/DYE605/inputDYE605nlo.dat

runcardsrunner/external/vrap.py

cschwan · 2022-06-14T13:35:45Z

Hi @scarlehoff:

- I suggest you detect Vrap by asking whether the directory contains one or more `.dat` files; this should be more robust than checking the dataset names (we might consider running Vrap for things other than fixed-target Drell-Yan).
- Wouldn't it be simpler to have one directory with one kinematics file and one runcard that would produce one grid? That's how we handled DIS and other collider datasets, see for instance the different ATLAS 8 TeV DY datasets: https://github.com/NNPDF/runcards/tree/master/nnpdf31_proc/ATLAS_DY_8TEV_3D_046080_0007. Merging is done afterwards.
- A comparison with the MC result (the central, not scale-varied result) is essential: we really need this check; basically the same check that we did. But skipping this for the time being is OK!
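
The directory-based detection suggested in the first point could look roughly like the sketch below. The function name `is_vrap_dataset` and the `suffix` parameter are hypothetical and not part of the runner; the suffix is kept configurable because the extension to look for is still under discussion in this thread.

```python
from pathlib import Path


def is_vrap_dataset(dataset_dir, suffix=".dat"):
    """Return True if `dataset_dir` holds at least one file ending in `suffix`.

    Hypothetical helper: the name and the default suffix are assumptions.
    """
    folder = Path(dataset_dir)
    # A missing directory simply means "not a vrap dataset".
    return folder.is_dir() and any(
        entry.is_file() and entry.suffix == suffix for entry in folder.iterdir()
    )
```

Note that `Path.suffix` only returns the last extension, so a compound name such as `input.vrap.dat` would still match `suffix=".dat"`.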

runcardsrunner/install.py

alecandido · 2022-06-14T13:43:30Z

> I could implement a reader for the central scale I guess or modify vrap to print the result in a more reasonable format, but is there a way to skip that...

> A comparison with the MC result (the central, not scale-varied result) is essential, we really need this check; basically the same check that we did. But skipping this for the time being is OK!

I just agree with @cschwan: if you want to skip, return something like a `None` from that method. In the consumer method, if a `None` is received, just print/log explicitly that you're skipping this step and return (if you wish I can implement it myself, but it should be pretty simple anyhow).
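
A minimal sketch of that pattern, assuming a producer that returns `None` when no MC result exists and a consumer that logs and returns early. The names `mc_results` and `compare_with_mc` and the placeholder numbers are hypothetical and do not come from the runner code.

```python
import logging

logger = logging.getLogger(__name__)


def mc_results(runner_name):
    """Hypothetical producer: MC central values, or None if there is nothing to read."""
    if runner_name == "vrap":
        return None  # no MC output to read back for this runner yet
    return [1.0, 2.0]  # placeholder numbers, for illustration only


def compare_with_mc(grid_predictions, mc):
    """Hypothetical consumer: return grid/MC ratios, or skip when `mc` is None."""
    if mc is None:
        # Make the skip explicit in the log instead of failing at the end of the run.
        logger.warning("No MC results available: skipping the comparison step")
        return None
    return [grid / ref for grid, ref in zip(grid_predictions, mc)]
```

Returning `None` keeps the skip visible at the call site, so the comparison can be restored later by only changing the producer.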

alecandido · 2022-06-14T13:57:10Z

@scarlehoff please add metadata files, like:
https://github.com/NNPDF/runcards/blob/97c1b9f5a658573c99d0ed8496bd6dfb406dd5f7/nnpdf31_proc/ATLAS_1JET_8TEV_R06/metadata.txt#L1-L13

scarlehoff · 2022-06-14T14:05:38Z

> I suggest you detect Vrap by asking whether the directory contains one or more .dat file, this should be more robust than checking the dataset names (we might consider running Vrap for things other than fixed-target Drell-Yan)

.dat files are used for many different things. I can change the extension to .vrap

> Wouldn't it be simpler to have one directory with one kinematics file and one runcard that would produce one grid?

I honestly prefer the solution of burning in the acceptances...

> A comparison with the MC result (the central, not scale-varied result) is essential, we really need this check; basically the same check that we did. But skipping this for the time being is OK!

The central one is OK. I thought I also needed the scale variations (which I cannot get out of vrap in an easy manner rn).

felixhekhorn · 2022-06-14T14:10:26Z

> .dat files are used for many different things. I can change the extension to .vrap

that is true and even applies in some sense to yadism which uses the very broad name "observable.yaml" - how about .vrap.dat and .yadism.yaml ?

felixhekhorn · 2022-06-14T14:13:23Z

> 3. Accommodate for the fact that we can actually have grids that are a composition of other grids.

if I remember correctly "composition" means "summing", doesn't it?

alecandido

Mostly fine, just a few comments here and there.

I still have to run, I'll do later on :)

runcardsrunner/paths.py

runcardsrunner/install.py

runcardsrunner/external/vrap.py

scarlehoff · 2022-06-14T14:16:37Z

> > .dat files are used for many different things. I can change the extension to .vrap
>
> that is true and even applies in some sense to yadism which uses the very broad name "observable.yaml" - how about .vrap.dat and .yadism.yaml ?

In this case I can just do .vrap. I'm using .dat because that's what was used before but it doesn't really matter.

> if I remember correctly "composition" means "summing", isn't it?

Yes, $E_{906} = \sum_{i} a^{j}_{i} \sigma _{ij}$
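
As a rough numerical illustration of that sum, reading $i$ as the index of the individual runs and $j$ as the data point (this reading of the indices is an assumption), the combination is an acceptance-weighted sum over runs. The sketch below works on plain lists of numbers rather than actual PineAPPL grids, and the function name is hypothetical.

```python
def combine_with_acceptances(sigma, acceptance):
    """Return combined[j] = sum_i acceptance[i][j] * sigma[i][j].

    `sigma[i][j]` is the prediction of run i for data point j and
    `acceptance[i][j]` the matching acceptance factor (assumed layout).
    """
    n_points = len(sigma[0])
    return [
        sum(acc_run[j] * sig_run[j] for acc_run, sig_run in zip(acceptance, sigma))
        for j in range(n_points)
    ]


# Two runs, two data points, made-up numbers for illustration only.
sigma = [[1.0, 2.0], [3.0, 4.0]]
acceptance = [[0.5, 0.5], [1.0, 0.0]]
print(combine_with_acceptances(sigma, acceptance))  # [3.5, 1.0]
```

Once the sum is taken, only the combined values survive, which is the loss of per-run information mentioned for option 1.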

scarlehoff · 2022-06-14T14:25:39Z

> I still have to run, I'll do later on

Wait for it. I will fix the outstanding issues (mainly the results thing).

When finished it should work for all except for E906.

scarlehoff · 2022-06-14T14:33:01Z

@alecandido @felixhekhorn one question, what if instead of having an exact copy of the vrap input card I have a vrap.yaml file with the options and the runcard is generated automagically? Would that be ok? Or do we want an "original runcard" that can be run without the runner?

enocera · 2022-06-14T14:49:35Z

> @enocera what do you think of option 1. above?: Basically burning the ACC factors in the grid.

I think that this is consistent, looking ahead, with what we discussed more generally about NNLO QCD K-factors at the code meeting in Milan. If I'm not mistaken, the proposal was to incorporate K-factors into grids, so that they no longer need to be specified in the fit runcards. While in the future it is more likely that we will get rid of NNLO QCD K-factors by computing NNLO grids exactly, rather than by incorporating K-factors into grids, all K-factors are created equal in this respect. So it'd be silly to get rid of NNLO QCD K-factors, but not of ACC K-factors.

scarlehoff · 2022-06-14T14:50:48Z

Then I will simply add the ACC factors to E906.

enocera · 2022-06-14T14:50:50Z

This is a very complicated way of saying that, of all options, I'm in favour of option 1.

alecandido · 2022-06-14T15:22:08Z

> @alecandido @felixhekhorn one question, what if instead of having an exact copy of the vrap input card I have a vrap.yaml file with the options and the runcard is generated automagically? Would that be ok? Or do we want an "original runcard" that can be run without the runner?

It would be wonderful, I'd say.

The only thing: make sure that it is not only happening automagically, but in principle you can generate the actual runcard in a simple way, i.e. calling a single Python function (single entry point, but make it as modular as possible, as always ^^) and even better if there is a CLI command for it (the moment you have the single entry point, I can expose it to the CLI).
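
A possible shape for that single entry point, as a sketch only. The `key value` line layout is a placeholder and not vrap's actual input card format, and the function names and option keys are hypothetical. The options are passed as a plain dict, which is what a YAML loader would return for a flat `vrap.yaml`.

```python
from pathlib import Path


def render_runcard(options):
    """Render the options as one `key value` line each (placeholder format)."""
    lines = [f"{key} {value}" for key, value in options.items()]
    return "\n".join(lines) + "\n"


def write_runcard(options, destination):
    """Single entry point: write the rendered runcard and return its path."""
    path = Path(destination)
    path.write_text(render_runcard(options))
    return path
```

Keeping the rendering separate from the writing means the text of the card can be inspected or tested without touching the filesystem, and a CLI command only needs to call `write_runcard`.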

alecandido · 2022-06-14T15:24:06Z

> all K-factors are created equal

Yes sir 🇺🇸 🦅

runcardsrunner/external/vrap.py

Co-authored-by: Felix Hekhorn

alecandido · 2022-06-24T12:42:59Z

Rebased on current master, it was just an import issue. Solved.

I'll test and (assuming it works) we'll merge :)

scarlehoff requested review from alecandido, felixhekhorn and enocera June 14, 2022 12:19