Re-implementation of CMS_Z0J_8TEV in the new format #2241

Open · wants to merge 23 commits into master

Conversation

@achiefa (Contributor) commented Dec 6, 2024

This PR implements CMS_Z0J_8TEV in the new format.

$(x,Q^2)$-map and data-theory comparison

Legacy: [default], [sys_10]
New: [default], [sys_10]

@achiefa achiefa self-assigned this Dec 6, 2024
@achiefa achiefa requested a review from scarlehoff December 6, 2024 17:53
@achiefa (Contributor, Author) commented Dec 6, 2024

Hi @scarlehoff & @enocera, I see that this dataset was implemented with three different variants:

• sys_10_NNPDF31
• sys_NNPDF31
• sys_10

I suspect the first two account for K-factors, so they should no longer be necessary. I'm not sure about the last one. Could you please confirm?

@scarlehoff (Member)

As usual, @enocera should confirm. If _10 is also a scale uncertainty, I don't think it should have gone into 4.0?

In any case, if you are able to reproduce the default from hepdata (i.e., without the theory uncertainty, whatever it is), then you can always create the _10 version (taking it from the legacy file) by concatenating that uncertainty with the hepdata ones, so that we can reproduce previous results.

@enocera (Contributor) commented Dec 6, 2024

@achiefa @scarlehoff As in the case of ATLAS, the sys_10 uncertainty is a 1% uncorrelated uncertainty due to the Monte Carlo unreliability of the NNLO QCD K-factors determined with Njetti. @scarlehoff is recomputing the theory with NNLOjet for a separate project. My recommendation is to implement two variants: one variant, which we will match to the NNLOjet computation and which will possibly become the default, without this 1% uncertainty; and one "legacy" variant with the 1% uncertainty, to be used with the old Njetti theory for benchmark purposes. The sys_10_NNPDF31 and sys_NNPDF31 uncertainties can be ignored.
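(For concreteness, a minimal sketch of how such a "legacy" variant could be assembled, assuming the new-format uncertainty file carries a `definitions` block and a per-bin `bins` list; the file names and the `sys_njetti` key are hypothetical, and the ADD/UNCORR treatment is an assumption to be checked against the legacy implementation:)

```python
import yaml

# Hypothetical file names following the new commondata layout.
with open("uncertainties.yaml") as f:   # hepdata-only uncertainties
    unc = yaml.safe_load(f)
with open("data_central.yaml") as f:    # central values
    central = yaml.safe_load(f)["data_central"]

# Declare the extra 1% uncorrelated uncertainty on the K-factors.
unc["definitions"]["sys_njetti"] = {
    "description": "1% MC uncertainty on the NNLO QCD K-factors (Njetti)",
    "treatment": "ADD",   # assumption: additive
    "type": "UNCORR",     # uncorrelated bin by bin
}

# Append 1% of the central value to every bin.
for bin_unc, value in zip(unc["bins"], central):
    bin_unc["sys_njetti"] = 0.01 * value

with open("uncertainties_sys_10.yaml", "w") as f:
    yaml.dump(unc, f, sort_keys=False)
```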

@achiefa (Contributor, Author) commented Dec 9, 2024

The two variants match between the new implementation and the legacy.

I've also uploaded the data-theory comparisons in the description of this PR. The χ²s are unchanged. There is a difference in the $(x,Q^2)$ map, but I think this is expected as we move to the new format (i.e., different functions).

I'm now regenerating the common data using the bot because of the same issue encountered in #2238. After that, all tests should pass successfully.
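(As a rough guide to why the map moves, not necessarily the exact transform validphys applies: a leading-order estimate for Z+jet kinematics sends each $(p_T, y)$ bin to $x_{1,2} \simeq \frac{\sqrt{M_Z^2 + p_T^2}}{\sqrt{s}}\, e^{\pm y}$ with $Q^2 \simeq M_Z^2 + p_T^2$, so changing the mapping functions shifts where the points land on the $(x,Q^2)$ plane without touching the data themselves.)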

[Two screenshots attached with the comparison plots referenced above.]

@achiefa (Contributor, Author) commented Dec 9, 2024

Hi @Radonirinaunimi & @RoyStegeman, I need your help here. I had to regenerate the tests after the implementation in the new format. However, the three tests in test_plots below keep failing even though they succeed locally on my laptop. Do you know what's happening?

=================================== FAILURES ===================================
________________________________ test_plotfancy ________________________________
Error: Image files did not match.
  RMS Value: 41.03842988928521
  Expected:  
    /tmp/tmp1vpabeyq/validphys.tests.test_plots.test_plotfancy/baseline.png
  Actual:    
    /tmp/tmp1vpabeyq/validphys.tests.test_plots.test_plotfancy/result.png
  Difference:
    /tmp/tmp1vpabeyq/validphys.tests.test_plots.test_plotfancy/result-failed-diff.png
  Tolerance: 
    22
________________________________ test_plot_xq2 _________________________________
Error: Image files did not match.
  RMS Value: 33.0809017673408
  Expected:  
    /tmp/tmp1vpabeyq/validphys.tests.test_plots.test_plot_xq2/baseline.png
  Actual:    
    /tmp/tmp1vpabeyq/validphys.tests.test_plots.test_plot_xq2/result.png
  Difference:
    /tmp/tmp1vpabeyq/validphys.tests.test_plots.test_plot_xq2/result-failed-diff.png
  Tolerance: 
    22
_____________________________ test_plot_xq2_custom _____________________________
Error: Image files did not match.
  RMS Value: 32.44088131829135
  Expected:  
    /tmp/tmp1vpabeyq/validphys.tests.test_plots.test_plot_xq2_custom/baseline.png
  Actual:    
    /tmp/tmp1vpabeyq/validphys.tests.test_plots.test_plot_xq2_custom/result.png
  Difference:
    /tmp/tmp1vpabeyq/validphys.tests.test_plots.test_plot_xq2_custom/result-failed-diff.png
  Tolerance: 
    22

@achiefa achiefa requested a review from RoyStegeman December 9, 2024 16:50
@RoyStegeman (Member)

Thanks. For me they do fail (are you using the --mpl flag and removing the decorator @pytest.mark.linux?). Tests can be quite sensitive to the mpl version, and I imagine OS as well, so you could try generating the figures on a linux machine while making sure the mpl version is the same as in the CI.

I'm leaving the office now, but can have a closer look tomorrow.
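(For reference, a hedged sketch of the pattern these image tests follow with pytest-mpl; the test name and figure below are illustrative, not the actual validphys code. The pixel comparison only runs under `pytest --mpl`, and baselines can be regenerated with `pytest --mpl-generate-path=<baseline-dir>`:)

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, as in CI
import matplotlib.pyplot as plt
import numpy as np
import pytest

@pytest.mark.mpl_image_compare(tolerance=22)  # RMS tolerance; pytest-mpl's default is 2
def test_example_plot():
    # pytest-mpl renders the returned figure and compares it pixel-wise
    # against the stored baseline image when invoked with --mpl.
    x = np.linspace(0, 1, 50)
    fig, ax = plt.subplots()
    ax.plot(x, x**2)
    return fig
```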

@achiefa (Contributor, Author) commented Dec 9, 2024

Hi @RoyStegeman, thanks for that. Indeed, I was only removing the @pytest.mark.linux decorator. I'll generate the new figures tomorrow.

@Radonirinaunimi (Member)

(Re-)generating reference plots is always tricky: not only are they highly dependent on the package versions, but also on the platform on which they are run. In #2191 I asked @scarlehoff to generate the plots.

@achiefa (Contributor, Author) commented Dec 10, 2024

Hi @Radonirinaunimi, thanks for that. So, what do you think is the best way to go forward, given the circumstances? Is it worth trying what @RoyStegeman suggested?

@RoyStegeman (Member) commented Dec 10, 2024

Yes, unless @Radonirinaunimi has a better idea, please do try.

@RoyStegeman RoyStegeman marked this pull request as ready for review December 11, 2024 15:41
@achiefa (Contributor, Author) commented Dec 11, 2024

I've regenerated the files using a linux machine. I'm not sure this will work out, but let's wait for the test.

Incidentally, I was thinking that it may be worth implementing a bot that re-generates the test plots when needed, similar to the bot that regenerates data. That would produce figures consistently, right?

@RoyStegeman (Member)

> Incidentally, I was thinking that it may be worth implementing a bot that re-generates the test plots when needed, similar to the bot that regenerates data. That would produce figures consistently, right?

In principle yes, but with the repo size in mind I'm not sure we want to regenerate all figures when the tests fail because of a subset. A workaround would be to have the bot share the plots via a comment.

@RoyStegeman (Member) commented Dec 11, 2024

I increased the tolerance, but I don't know how strict a tolerance of 24 is; maybe it's too lenient? The default setting is 2.

Of course, we still have the pip installation test, which is a bit stricter.

Review comment (Member):

Is there a reason the variable on the x-axis has been changed?

Reply from @achiefa (Contributor, Author):

I chose to present the data as a function of pT for each rapidity bin, because this is how the experimentalists present the results in the paper. But if you prefer the other way around, I can change it.
