Inspect hidden states for process understanding #45
I started looking at this a little bit and it's interesting! A little mysterious, but interesting. Here are a few plots so far. I first ran the model with randomized weights to get "DO" predictions and hidden states to compare the trained run against:

**Randomized weights**

[figure: "DO" predictions]
[figure: hidden states]

**Trained weights**

[figure: DO predictions]
[figure: hidden states]
Awesome, @jsadler2 - thanks for sharing these! 🚀 I'm wondering how/whether I should compare the h time series in the randomized-weights plots above versus the trained-weights plots below. The weights more or less determine how the inputs are mapped to outputs, right? So in the model run with randomized weights, it seems like the model is learning the relationship between air temperature and DO (e.g. I'm looking at h_2), but that dynamic isn't really maintained in h_2 with the trained weights. It's also interesting that predicted daily mean DO is often greater than the predicted daily max DO for this model run (and that min DO doesn't really have much seasonality) - does that just point to the importance of model training? For the model run with trained weights, h_3 and h_9 jump out to me, as the model appears to be inferring some dynamic that is relatively higher from Oct - May compared to the rest of the year. h_1 looks sort of like a hydrograph (see below), and I can imagine h_0 as some combination of h_3 and h_1. Mysterious but interesting, indeed! I'm also interested in your thoughts/interpretation so far, and whether I'm reading these tea leaves appropriately with regard to the randomized versus trained weights.
Yeah. The randomized-weights output is just that - it's totally random. So the seasonal trend we are seeing in h_2 is basically just the trend in the temperature inputs that is randomly coming through. The model hasn't seen any DO data, so it's all just noise. And yes - that is especially apparent in the "DO" predictions. They are all over the place and unrealistic (mean > max).
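One way to check the "inputs leaking through random weights" idea is to correlate each hidden-state time series with a candidate driver like air temperature. This is just a hedged numpy sketch; the array shapes and variable names are assumptions, not the repo's actual API:

```python
import numpy as np

def input_correlation(h, x):
    """Pearson correlation between each hidden-state series and one input series.

    h: (time, n_states) hidden-state time series
    x: (time,) input series, e.g. air temperature
    A state from a randomly initialized model that tracks air temperature
    should show high |r| purely because the input's seasonality leaks through.
    """
    hz = (h - h.mean(axis=0)) / h.std(axis=0)
    xz = (x - x.mean()) / x.std()
    return hz.T @ xz / len(x)  # (n_states,) vector of correlations
```

This wouldn't distinguish a state that merely echoes temperature from one that learned a temperature-driven process, but it gives a quick ranking of which states to look at more closely.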
Those ones stood out to me too. So interesting that there are these two very distinct seasons. That is most apparent in h_3, but I think you could argue that there are two distinct regimes in nearly all of the states. For example:
I agree. It would be interesting to plot those in the same figure.
That is how I'm reading the tea leaves too :) One other thing that stood out to me is the sudden drops in h_0. For example, what was it in mid-April 2018 that caused the sudden drop in that state? There are many similar patterns in h_0, but mid-Apr '18 is the ~biggest magnitude.
Thanks for this explanation, Jeff - that makes a lot more sense. I didn't realize the model hadn't seen any DO data in the randomized version (and that's probably why you had DO in quotes 😃 )
Yeah, that's interesting. I haven't looked at the input variable time series, but it does look like there was a storm during mid-April 2018. It's not the biggest storm in the record (or even in that year), but we might expect some storms to be more consequential than others if they occur during windows of time that are conducive to relatively high biological activity. That'd be pretty cool if the model could pick up on that.
This is a really interesting conversation. It makes me wonder:
Good question. I'll do a couple runs today to see how they compare across model runs. I can also look across sites.
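One wrinkle in comparing hidden states across model runs (or sites) is that independently trained LSTMs carry no fixed ordering or sign for their hidden units, so h_3 in one run may correspond to h_7 (possibly flipped) in another. A hypothetical numpy sketch that matches each state in one run to its most-correlated counterpart in another:

```python
import numpy as np

def match_states(h_run1, h_run2):
    """Match hidden-state columns across two runs by absolute Pearson correlation.

    h_run1, h_run2: (time, n_states) hidden-state time series from two runs.
    Returns, for each state in run 1, the index of its best match in run 2
    and the corresponding |correlation|.
    """
    r1 = (h_run1 - h_run1.mean(axis=0)) / h_run1.std(axis=0)
    r2 = (h_run2 - h_run2.mean(axis=0)) / h_run2.std(axis=0)
    corr = np.abs(r1.T @ r2) / len(r1)       # (n1, n2) |correlation| matrix
    best = corr.argmax(axis=1)
    return best, corr[np.arange(len(best)), best]
```

A state that finds a strong match in every run would suggest the model is consistently inferring the same latent dynamic, rather than an artifact of one training run.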
I'm also wondering if dynamic time warping would be a good way to measure similarity.
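For the DTW idea, a bare-bones implementation (no external DTW library assumed) that could compare two hidden-state series might look like:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two 1-D series.

    Fills the cumulative-cost matrix; cost[i, j] is the cheapest alignment
    of a[:i] with b[:j], allowing stretches and compressions in time.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],       # insertion
                                 cost[i, j - 1],       # deletion
                                 cost[i - 1, j - 1])   # match
    return cost[n, m]
```

Unlike a plain correlation, DTW would count two states as similar even if one lags or stretches the other, which seems relevant for routed hydrologic signals. In practice a library such as dtaidistance would be faster than this O(n·m) pure-Python loop.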
Good idea. If I'm understanding this, though, the weights will be just a static 10x3 (or 3x10?) matrix, so I don't think those would tell us anything about the importance in time, but I do think they'd be worth looking at too.
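Even though the output-layer weights are static, they could still show which hidden states carry the most weight for each prediction. A hypothetical sketch, assuming a dense layer mapping the hidden states to the DO outputs (the output names here are invented for illustration):

```python
import numpy as np

def rank_hidden_states(w_out, output_names):
    """Order hidden states by |weight| connecting them to each output.

    w_out: (n_hidden, n_outputs) dense-layer weight matrix
    Returns {output name: hidden-state indices, most influential first}.
    """
    return {
        name: list(np.argsort(-np.abs(w_out[:, j])))
        for j, name in enumerate(output_names)
    }
```

If, say, h_3 tops the ranking for every DO output, that would corroborate the visual impression that it encodes something central to the predictions.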
Jeff has some useful Python code in his forked version of the repo for plotting the hidden states. @jsadler2, do you have anything to add here, or any steps you think we should take based on the commits referenced above? Or can we close this issue?
An idea that has come up is to inspect the hidden states to see if they are behaving as we would expect some state or flux in the process would behave.
The two examples that have come up are:
We can look into answering these questions with the baseline LSTM model (#40).
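As a starting point for pulling out the hidden states, here is a self-contained numpy sketch of a single-layer LSTM forward pass that keeps every time step's hidden state. The real model presumably lives in a deep-learning framework; this only illustrates the shapes involved, and the gate ordering is an assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_hidden_states(x, W, U, b):
    """Forward pass of a single-layer LSTM, keeping every hidden state.

    x: (T, n_in) input series; W: (4*n_h, n_in); U: (4*n_h, n_h); b: (4*n_h,)
    Gate order assumed: input, forget, cell candidate, output.
    Returns H: (T, n_h), the hidden-state time series to plot and inspect.
    """
    n_h = U.shape[1]
    h = np.zeros(n_h)
    c = np.zeros(n_h)
    H = np.empty((len(x), n_h))
    for t, x_t in enumerate(x):
        z = W @ x_t + U @ h + b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update cell state
        h = sigmoid(o) * np.tanh(c)                   # update hidden state
        H[t] = h
    return H
```

Because h is an output gate in (0, 1) times tanh of the cell state, every hidden-state series is bounded in (-1, 1), which is worth keeping in mind when interpreting the magnitudes in the plots above.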