Changing RGCN/ExpGrad reshape and sampling #14

Closed
wants to merge 3 commits into from

jdiaz4302
Contributor

This PR applies the reshape and sampling changes discussed in #13 to the 03_04 notebook (which uses Expected Gradients on river-dl data).

Simon and I previously found that a difference in compute time was caused by our using different river-dl data (sequence length 360 vs. 180, and 16 vs. 7 input features), so I subset my data to match his for faster compute.

This seems to work well, though the exact convergence may be slightly weaker (see the cell that plots expected_gradiants_ls1 vs. expected_gradiants_ls2). Both notebooks show consistent, near-identical results, so I'm not concerned about it.
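For readers who want to see what this kind of convergence check looks like, here is a minimal sketch. It is not the notebook code: `toy_model`, `toy_model_grad`, `expected_gradients`, and the random background data are all hypothetical stand-ins, and `eg_1` / `eg_2` only play the role of two independent Expected Gradients estimates. It assumes the usual Expected Gradients estimator (average of `(x - baseline) * gradient` over random baselines and random interpolation points).

```python
import numpy as np


def toy_model(x, w):
    """Hypothetical stand-in model: f(x) = sum_i w_i * x_i**2."""
    return np.sum(w * x**2, axis=-1)


def toy_model_grad(x, w):
    """Analytic gradient of toy_model with respect to x."""
    return 2.0 * w * x


def expected_gradients(x, background, w, n_samples, rng):
    """Monte Carlo Expected Gradients estimate for a single input x."""
    # Draw random baselines from the background data (with replacement).
    idx = rng.integers(0, background.shape[0], size=n_samples)
    baselines = background[idx]                      # (n_samples, n_features)
    # Draw a random interpolation coefficient per sample.
    alphas = rng.uniform(size=(n_samples, 1))
    points = baselines + alphas * (x - baselines)
    grads = toy_model_grad(points, w)
    return np.mean((x - baselines) * grads, axis=0)  # (n_features,)


n_features = 7  # matches the subset feature count mentioned above
w = np.linspace(0.5, 2.0, n_features)
x = np.full(n_features, 2.0)
background = np.random.default_rng(0).normal(size=(200, n_features))

# Two independent estimates that differ only in their random draws.
eg_1 = expected_gradients(x, background, w, 20_000, np.random.default_rng(1))
eg_2 = expected_gradients(x, background, w, 20_000, np.random.default_rng(2))

max_abs_diff = float(np.max(np.abs(eg_1 - eg_2)))
correlation = float(np.corrcoef(eg_1, eg_2)[0, 1])
# Completeness: attributions should sum to f(x) minus the mean background output.
completeness_gap = float(
    eg_1.sum() - (toy_model(x, w) - toy_model(background, w).mean())
)
print(max_abs_diff, correlation, completeness_gap)
```

If the two estimates fall close to the 1:1 line (small `max_abs_diff`, correlation near 1), the sample count is probably large enough; a visibly looser scatter suggests drawing more samples.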

@jdiaz4302
Contributor Author

Since sampling random combinations of segments and years gives the same answers as sampling random years (the same year for all segments), I think we can go with the latter, since we were leaning that way anyway.

@jdiaz4302
Contributor Author

jdiaz4302 commented Jun 10, 2022

I can make these changes to the xai_utils.py file pending any discussion. The changes would be:

  • Fix data reshaping
  • Sample the same random year for all segments

These changes can already be seen in the 03_04_02 notebook included in this PR.
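To make the two sampling options concrete, here is a rough sketch. It is not the code in xai_utils.py or the notebook: the function names and the `(n_years, n_segs, seq_len, n_features)` array layout are assumptions made purely for illustration, and the toy data just stores the year index so the difference is easy to see.

```python
import numpy as np


def sample_same_year(data, rng):
    """Pick one random year and use it as the baseline for every segment."""
    n_years = data.shape[0]
    year = rng.integers(n_years)
    return data[year]                          # (n_segs, seq_len, n_features)


def sample_seg_year_combos(data, rng):
    """Pick an independent random year for each segment."""
    n_years, n_segs = data.shape[:2]
    years = rng.integers(n_years, size=n_segs)
    return data[years, np.arange(n_segs)]      # (n_segs, seq_len, n_features)


# Hypothetical layout: (n_years, n_segs, seq_len, n_features).
n_years, n_segs, seq_len, n_features = 5, 4, 180, 7
data = np.empty((n_years, n_segs, seq_len, n_features))
# Fill every entry with its year index so sampled years can be read back.
data[:] = np.arange(n_years)[:, None, None, None]

rng = np.random.default_rng(0)
same_year = sample_same_year(data, rng)
mixed_years = sample_seg_year_combos(data, rng)

# Year used for each segment under each scheme.
print(same_year[:, 0, 0])
print(mixed_years[:, 0, 0])
```

With the first function every segment's baseline comes from the same year, so spatial relationships between segments within the baseline stay intact; the second mixes years across segments.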

@jdiaz4302
Contributor Author

One thing to note: the compute time tests (%%time) in the notebooks give a misleading comparison. The notebook that initially shows faster compute time was running on its own at first, but then the other notebook (03_04_02) was started and ran concurrently. That slowed both down, but it makes 03_04_02 appear slower across the board. The last %%time usage in both notebooks shows the real comparison: they are the same. The timings would all be more reliable if I had not run the notebooks together on a bottlenecked local CPU.
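As a side note on measuring this more robustly, here is a small sketch using only the standard library. `time_workload` and `toy_workload` are hypothetical names, not anything from the notebooks. The idea is to repeat the measurement and record both wall-clock time and this process's CPU time, since wall-clock time is the figure most inflated when another job competes for the same CPU.

```python
import time


def time_workload(fn, repeats=3):
    """Run fn several times; return the best wall-clock and CPU times in seconds."""
    wall_times, cpu_times = [], []
    for _ in range(repeats):
        wall_start = time.perf_counter()   # wall-clock timer
        cpu_start = time.process_time()    # CPU time used by this process only
        fn()
        wall_times.append(time.perf_counter() - wall_start)
        cpu_times.append(time.process_time() - cpu_start)
    return min(wall_times), min(cpu_times)


def toy_workload():
    """Hypothetical stand-in for an expensive notebook cell."""
    return sum(i * i for i in range(200_000))


best_wall, best_cpu = time_workload(toy_workload)
print(f"best wall time: {best_wall:.4f} s, best CPU time: {best_cpu:.4f} s")
```

Taking the minimum over a few repeats reduces the influence of transient contention, though the cleanest comparison is still to run each notebook by itself.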

@jdiaz4302 jdiaz4302 closed this Jun 10, 2022