Make GF download reproducible #47

aappling-usgs · 2020-06-16T20:23:22Z

Right now we have an .ind file and a build/status file to represent the download of the geospatial fabric (GF), and the corresponding command creates the .ind file but doesn't push the data to Drive. That's sorta weird and probably creates some fragilities that I haven't quite pinned down.

We're also referring to the file in later targets with I('1_network/in/GeospatialFabric_National.gdb') rather than via the .ind file, so if the .gdb contents change, downstream targets won't get rebuilt. That's definitely fragile.

This file is tricky because it's a huge download, so we'd prefer that not everyone has to download it.

I proposed a handful of solutions in a Teams thread with Hayley and Sam today, but now I think they're all wrong. At the moment I think the solution might be to

1. convert the current target into a getter in getters.yml that produces a summary file (.yml extension)
2. create an .ind target in 1_network.yml that builds the getter target?
3. make downstream targets depend on the .ind target and call sc_retrieve.
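To make the three steps above concrete, here is a rough sketch of how the two remake files might look. It is only an illustration of the proposal, not a tested configuration: the summary file name, the downstream target, and the functions download_gf, indicate_gf, and build_network are all hypothetical, and it assumes the usual convention that an .ind file is named after its data file plus ".ind" so that sc_retrieve can find the matching getter target.

```yaml
# getters.yml (sketch; target and function names are hypothetical)
targets:
  # Getter target: downloads the GF .gdb and writes a small summary of
  # what was downloaded to a .yml file.
  1_network/in/GeospatialFabric_National_summary.yml:
    command: download_gf(summary_file = target_name, gdb_dir = I('1_network/in/GeospatialFabric_National.gdb'))

---
# 1_network.yml (sketch; target and function names are hypothetical)
targets:
  # Indicator target: the hypothetical indicate_gf() would build the getter
  # target above and then write the .ind file, which gets git committed.
  1_network/in/GeospatialFabric_National_summary.yml.ind:
    command: indicate_gf(ind_file = target_name, getters_file = I('getters.yml'))

  # Downstream target: depends on the .ind file rather than on
  # I('...gdb'); the hypothetical build_network() would call
  # sc_retrieve() on gf_ind before reading the .gdb.
  1_network/out/network.rds.ind:
    command: build_network(ind_file = target_name, gf_ind = '1_network/in/GeospatialFabric_National_summary.yml.ind')
```

The point of the layout is that downstream targets track the committed .ind file, so a change to the GF would only propagate if the .ind file changes too.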

Scenarios:

1. Nobody has ever downloaded the file: neither target is built; the .ind target gets requested, which builds the getter target and then creates the .ind file; the .ind file gets git committed.
2. Somebody else has downloaded the file but you haven't: you get the .ind file, so you only rebuild downstream targets if there's a need. If you do rebuild downstream targets, those builds call sc_retrieve to build the getter... but this approach doesn't rebuild the .ind file, so there's potential for a mismatch between the .ind file and an updated GF file. And isn't there potential for a double build somewhere in here, too? So I still don't have it right...

Why is this so hard for me today? Don't we handle big input files all the time?
