Refactoring #21
Open
Andrey170170 wants to merge 50 commits into main from refactoring
Prepared file structure for package creation
Some small fixes
Fixes so that scripts are runnable in the new file structure
Added config file in yaml format
Created new wrapper scripts that control the whole process, to make the project more package-like.
Finished `main.py` script. Added folder structure initialization steps to `server_prep.py` script (now renamed to `initialization.py`).
Created `fake_profiler.py`, which initializes profiles with a constant rate_limit. Rewrote `MPI_download_prep` to follow the new structure logic.
Moved downloader job submission inside schedule_creator. Added a restriction prohibiting the user from running main.py if schedule creation was already scheduled and has not completed yet. Some minor changes.
Small fix
Small fix
Added filtering scripts: one based on image size and one based on similarity between MD5 hash sums. Also added scripts to delete images that were filtered out.
Added filtering scripts: one based on image size and one based on similarity between MD5 hash sums. Also added scripts to delete images that were filtered out.
Some minor changes and fixes
Added name_table to keep names stable across several sections of the data transfer
minor updates
Fixed bug in schedule creation script.
Made downloader scripts consistent with the new configuration format (using a `.yaml` file). Added a verification step inside the downloading job (`slurm` files) to reduce the total number of jobs that are scheduled.
Added a check in the main function for whether an infinite loop is possible or whether all servers are already downloaded
Added scripts to perform data merging
Some small adjustments
Transferred code of all filters into a new file structure.
Changed how the registry works: it now uses decorators. Added wrapper runner scripts for each stage of the tool.
Completed tools refactoring; not tested yet
# Conflicts:
#	README.md
#	requirements.txt
Some minor fixes
Some minor fixes
Extracted Config and Checkpoint logic into separate classes. Updated all scripts to follow it.
Updated tools to follow the new Config/Checkpoint logic. Refactored code to follow the snake_case scheme for all file fields.
Added a config checking mechanism (compares the config with a template). Added reset options for the downloader and tools, so they can now be relaunched automatically.
Updated structure so that the package is installable
Updated documentation (Readme.md file)
Added example for ignored_servers
Small readme fixes
Documentation
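The commit that moves the registry to decorators does not show its code in this view. As a rough illustration only, a decorator-based registry in Python could look like the sketch below. Every name in it (`FILTER_REGISTRY`, `register_filter`, `size_filter`, `run_filter`) and the 224 px threshold are hypothetical and not taken from this PR.

```python
from typing import Callable, Dict, List

# Hypothetical registry: maps a filter name to the function implementing it.
FILTER_REGISTRY: Dict[str, Callable[[dict], bool]] = {}


def register_filter(name: str) -> Callable[[Callable], Callable]:
    """Return a decorator that stores the decorated function under `name`."""
    def decorator(func: Callable) -> Callable:
        if name in FILTER_REGISTRY:
            raise ValueError(f"filter {name!r} is already registered")
        FILTER_REGISTRY[name] = func
        return func  # the function itself is returned unchanged
    return decorator


@register_filter("size_based")
def size_filter(image: dict) -> bool:
    # Keep an image only if both sides are at least 224 px (assumed threshold).
    return image["width"] >= 224 and image["height"] >= 224


def run_filter(name: str, images: List[dict]) -> List[dict]:
    """Look the filter up by name and return the images it keeps."""
    keep = FILTER_REGISTRY[name]
    return [image for image in images if keep(image)]
```

With this shape, adding a new filter only requires decorating a function; the runner script does not need to be edited to know about it.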
thompsonmj reviewed Jul 31, 2024
thompsonmj reviewed Jul 31, 2024
thompsonmj reviewed Jul 31, 2024
Installation instruction fix
Installation instruction fix
thompsonmj reviewed Aug 7, 2024
Changed gbif_id to source_id
Co-authored-by: Matt Thompson <[email protected]>
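One of the commits above adds a config check that compares the config with a template, but the mechanism itself is not visible here. The sketch below is one possible way to do such a comparison, assuming the YAML config and the template have already been loaded into nested dictionaries. The function name `missing_keys` and the keys used in the usage note are hypothetical, not the package's actual ones.

```python
from typing import Any, Dict, List


def missing_keys(config: Dict[str, Any],
                 template: Dict[str, Any],
                 prefix: str = "") -> List[str]:
    """Return dotted paths of template keys that the config does not satisfy.

    A key is reported when it is absent from `config`, or when the template
    expects a nested section (a dict) but the config holds a non-dict value.
    """
    missing: List[str] = []
    for key, template_value in template.items():
        path = f"{prefix}{key}"
        if key not in config:
            missing.append(path)
        elif isinstance(template_value, dict):
            if isinstance(config[key], dict):
                # Recurse into the nested section, extending the dotted path.
                missing.extend(missing_keys(config[key], template_value, f"{path}."))
            else:
                missing.append(path)
    return missing
```

For example, with a template of `{"paths": {"input": None, "output": None}, "rate_limit": None}` and a config of `{"paths": {"input": "/data"}}`, the function returns `["paths.output", "rate_limit"]`. Extra keys in the config are ignored in this sketch; a stricter check could report those as well.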
Completed the first version of the Distributed downloader package. It is runnable and installable.
There are some non-critical problems: