Refactoring #21
Merged
Prepared file structure for package creation
some little fixes
fixes for scripts to be runnable in new file structure
Added config file in yaml format
Created new wrapper scripts that control the whole process, to make the project more package-like.
Finished `main.py` script. Added folder structure initialization steps to `server_prep.py` script (now renamed to `initialization.py`).
Created `fake_profiler.py`, which initializes profiles with a constant rate_limit. Rewrote `MPI_download_prep` to follow the new structure logic.
Moved downloader job submission into schedule_creator. Added a restriction that prevents the user from running main.py while schedule_creation is scheduled but has not completed yet. Some minor changes.
Small fix
Small fix
Added filtering scripts: one based on image size and one based on similarity between MD5 hashsums. Also added scripts to delete images that were filtered out.
Some minor changes and fixes
Added name_table to have stable names between several sections of data transfer
minor updates
Fixed bug in schedule creation script.
Made downloader scripts consistent with the new configuration format (using a `.yaml` file). Added a verification step inside the downloading job (`slurm` files) to reduce the total number of jobs that are scheduled.
Added a check in the main function for a possible infinite loop and for whether all servers have already been downloaded.
Added scripts to perform data merging
some small adjustments
Transferred code of all filters into a new file structure.
Changed how the registry works: it now uses decorators. Added wrapper runner scripts for each stage of the tool.
Completed tools refactoring; not tested yet.
# Conflicts:
#	README.md
#	requirements.txt
Some minor fixes
Some minor fixes
Updated tools to follow the new Config/Checkpoint logic. Refactored code to follow the snake_case scheme for all file fields.
Added a config checking mechanism (compares the config with a template). Added reset options for the downloader and tools, so they can now be relaunched automatically.
Updated structure so the package is installable.
Updated documentation (Readme.md file)
Added example for ignored_servers
Small readme fixes
Documentation
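One of the commits above switches the registry to decorators. The general pattern can be sketched as follows. This is a minimal illustration, not the code from this PR: `FILTER_REGISTRY`, `register_filter`, `keep_large_images`, `run_filter`, and the 224-pixel threshold are all hypothetical names and values.

```python
from typing import Callable, Dict, List

# Hypothetical registry: maps a filter name to the function implementing it.
FILTER_REGISTRY: Dict[str, Callable[[dict], bool]] = {}


def register_filter(name: str) -> Callable:
    """Return a decorator that stores the decorated function under `name`."""
    def decorator(func: Callable[[dict], bool]) -> Callable[[dict], bool]:
        if name in FILTER_REGISTRY:
            raise ValueError(f"filter {name!r} is already registered")
        FILTER_REGISTRY[name] = func
        return func  # the function itself is returned unchanged
    return decorator


@register_filter("size_based")
def keep_large_images(record: dict) -> bool:
    # Assumed record layout: a dict with integer "width" and "height" fields.
    return record["width"] >= 224 and record["height"] >= 224


def run_filter(name: str, records: List[dict]) -> List[dict]:
    """Look the filter up by name and keep the records it accepts."""
    predicate = FILTER_REGISTRY[name]
    return [r for r in records if predicate(r)]
```

With this layout a wrapper runner only needs the filter name, for example `run_filter("size_based", records)`, and adding a new stage does not require editing a central dispatch table.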
thompsonmj reviewed Jul 31, 2024
thompsonmj reviewed Jul 31, 2024
thompsonmj reviewed Jul 31, 2024
Installation instruction fix
Installation instruction fix
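The size-based and MD5-based filters mentioned in the commits above could look roughly like this. It is a sketch under stated assumptions: images are given as in-memory `(name, data, width, height)` tuples, "similarity" is taken to mean identical MD5 digests, and `split_images` and `min_side` are hypothetical names that do not come from the PR.

```python
import hashlib
from typing import Iterable, List, Set, Tuple

# Assumed input shape: (name, raw bytes, width, height).
Image = Tuple[str, bytes, int, int]


def split_images(images: Iterable[Image], min_side: int = 224) -> Tuple[List[str], List[str]]:
    """Return (kept, dropped) image names.

    An image is dropped if its shorter side is below `min_side`, or if an
    earlier kept image has the same MD5 digest (i.e. identical bytes).
    """
    seen_digests: Set[str] = set()
    kept: List[str] = []
    dropped: List[str] = []
    for name, data, width, height in images:
        digest = hashlib.md5(data).hexdigest()
        too_small = min(width, height) < min_side
        if too_small or digest in seen_digests:
            dropped.append(name)
        else:
            seen_digests.add(digest)
            kept.append(name)
    return kept, dropped
```

Returning the dropped names separately mirrors the two-step flow described in the commit: filter first, then run a separate deletion step over the filtered-out images.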
thompsonmj reviewed Aug 7, 2024
Changed gbif_id to source_id
Co-authored-by: Matt Thompson <31709066+thompsonmj@users.noreply.github.com>
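The config checking mechanism (comparing the config with a template) can be illustrated with a recursive key comparison. This is only a sketch of the idea: it works on plain dictionaries, as they would look after the `.yaml` file has been parsed, and `missing_keys` together with the key names in the usage note are hypothetical.

```python
from typing import Any, Dict, List


def missing_keys(config: Dict[str, Any], template: Dict[str, Any], prefix: str = "") -> List[str]:
    """List dotted paths that the template requires but the config lacks.

    A nested dict in the template means the config must hold a dict with
    at least the same keys at that position.
    """
    problems: List[str] = []
    for key, expected in template.items():
        path = f"{prefix}{key}"
        if key not in config:
            problems.append(path)
        elif isinstance(expected, dict):
            if isinstance(config[key], dict):
                problems.extend(missing_keys(config[key], expected, path + "."))
            else:
                # A section was expected, but a scalar value was found.
                problems.append(path)
    return problems
```

For example, with a template of `{"path_to_output": None, "downloader_parameters": {"batch_size": None, "rate_limit": None}}` and a config of `{"downloader_parameters": {"batch_size": 100}}`, the function reports `path_to_output` and `downloader_parameters.rate_limit` as missing, so the run can be stopped before any job is submitted.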
thompsonmj approved these changes Feb 4, 2025
thompsonmj left a comment
👍
Completed the first version of the Distributed downloader package. It is runnable and installable.
There are some non-critical problems: