This repository has been archived by the owner on Jun 1, 2023. It is now read-only.

Compare inventory vs pulled sites #42

Open
limnoliver opened this issue Apr 16, 2021 · 2 comments

@limnoliver
Member

I noticed that we are pulling far fewer temperature sites than are in the inventory for WQP. Why is this? Some known reasons:

  • some bad site IDs are dropped because we can't pull them
  • some sites are returned in the inventory but don't have data

We should create a comparison in the pipeline for review, and should follow up by investigating some sites (particularly if there are sites where the inventory says there is a lot of data, but we don't get any back). This isn't new to the recent pulls (see this PR).
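The comparison described above could be sketched roughly as follows. This is a minimal, language-agnostic illustration in Python, not the pipeline's actual code: the function name, the shape of the inputs, and the example site IDs and counts are all hypothetical. It lists sites that appear in the inventory but not in the pulled data, with the sites where the inventory reports the most data first.

```python
def compare_inventory_to_pull(inventory, pulled_site_ids):
    """List inventory sites that are absent from the pulled data.

    inventory: dict mapping site ID -> result count reported by the inventory
               (hypothetical structure, assumed for illustration).
    pulled_site_ids: iterable of site IDs present in the pulled data.
    Returns (site_id, inventory_count) tuples, largest inventory count first.
    """
    pulled = set(pulled_site_ids)
    missing = [(site, n) for site, n in inventory.items() if site not in pulled]
    # Sort by descending inventory count, then by site ID for a stable order.
    missing.sort(key=lambda item: (-item[1], item[0]))
    return missing


# Made-up example values, for illustration only.
inventory = {"USGS-01": 1200, "USGS-02": 15, "WIDNR-03": 0, "WIDNR-04": 430}
pulled = ["USGS-02"]
missing = compare_inventory_to_pull(inventory, pulled)
for site, n in missing:
    print(site, n)
```

Sorting by inventory count puts the most surprising gaps (lots of expected data, nothing returned) at the top of the review list.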

@limnoliver
Member Author

I think we need to 1) look for sites in the inventory files that are missing from the pulled data, then 2) test whether these sites can be pulled individually from WQP using readWQPdata. If rows are returned, are the observations missing some critical piece of information (a temperature value, a date, lat/lon, etc.)?
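The second check, whether returned rows lack a critical field, could look something like the sketch below. It assumes each observation has already been read into a dictionary; the field names in `REQUIRED_FIELDS` are hypothetical placeholders, not actual WQP column names.

```python
# Hypothetical field names; real WQP column names will differ.
REQUIRED_FIELDS = ("temperature", "date", "latitude", "longitude")


def missing_fields(row):
    """Return the required fields that are absent or empty in one observation."""
    return [f for f in REQUIRED_FIELDS if row.get(f) in (None, "")]


def summarize_missing(rows):
    """Count, per required field, how many observations lack it."""
    counts = {f: 0 for f in REQUIRED_FIELDS}
    for row in rows:
        for f in missing_fields(row):
            counts[f] += 1
    return counts


# Made-up rows, for illustration only.
rows = [
    {"temperature": 12.5, "date": "2020-06-01", "latitude": 43.1, "longitude": -89.4},
    {"temperature": None, "date": "2020-06-02", "latitude": 43.1, "longitude": -89.4},
    {"temperature": 0.0, "date": "", "latitude": 43.1},
]
summary = summarize_missing(rows)
print(summary)
```

A value of `0.0` is treated as present, so a legitimate zero-degree reading is not counted as missing.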

@RAtshan
Contributor

RAtshan commented May 20, 2021

There are 288,065 sites in the initial WQP inventory that are missing from the final WQP data pull. Of these, 287,097 are dropped by the site type filter here #L19. Most are appropriately dropped, but 782 appear to be streams or rivers whose site types use slightly different naming conventions (e.g., "River/Stream", "Lake", "Stream: Tidal stream"); these could be recovered by modifying this line of code #L14. Some stream or river sites were also dropped even though their site types use the expected naming (e.g., "Stream", "Lake, Reservoir, Impoundment") and so should not have been filtered. 59 sites are dropped because of bad site IDs.
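One way to make a site type filter tolerant of naming variants like those above is to match on keywords rather than exact strings. The sketch below is only an illustration of that idea in Python, not the filter the pipeline uses: the `KEEP_KEYWORDS` set and the function name are assumptions, and the actual list of site types to keep would need to be confirmed.

```python
import re

# Assumed keyword list for illustration; the real set of kept types is not
# stated in this thread.
KEEP_KEYWORDS = {"stream", "river", "lake", "reservoir", "impoundment"}


def is_kept_site_type(site_type):
    """Return True if any word in the site type matches a keep keyword."""
    if not site_type:
        return False  # empty or missing site type
    # Split on anything that is not a letter, so "River/Stream" and
    # "Stream: Tidal stream" both yield matchable words.
    tokens = re.split(r"[^a-z]+", site_type.lower())
    return any(token in KEEP_KEYWORDS for token in tokens)


examples = ["River/Stream", "Stream: Tidal stream", "Stream",
            "Lake, Reservoir, Impoundment", "Well", "Not Assigned", ""]
kept = [s for s in examples if is_kept_site_type(s)]
print(kept)
```

Keyword matching may be over-inclusive (it keeps any type containing one of the words), so the matched types would still be worth reviewing by hand.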

An additional 168 sites could not be accounted for in the code. I attempted to pull a subset of these with dataRetrieval and was able to retrieve temperature data for some of them. For a subset of the sites I could not pull with dataRetrieval, I downloaded the data from the WQP web interface and found that some were dropped because they are missing a site type name (e.g., "Not Assigned" or an empty cell). The others were dropped for the reason mentioned above, related to stream or river naming.
