forked from DIMEX2022/SPFFinalReport
Do initial conversion to R package #1
Open: pricet1 wants to merge 48 commits into development from 38-convert-to-r-pkg
- Add RStudio project file (SPFFinalReport.Rproj)
- Add R package structure (DESCRIPTION, R directory, .Rbuildignore, LICENSE)
  Note that the email currently specified for package maintainer is a placeholder
- Add non-standard directories to .Rbuildignore
- Add R project-specific entries to .gitignore
- Move R scripts from Code/CaseStudy2/ to R/
- For now, remove Code/CaseStudy1/, Code/CaseStudy2/RTests/ and Report/ (some of which may be added back later if needed)

- Create one function for each original R script
- Strip whitespace off line endings
  (Note: Causes lines of identical-looking code to appear in a git diff)
- Convert 'library()' statements into package dependencies
- Remove occurrences of 'setwd()' and 'rm(list = ls())' as these requirements are now addressed by converting to a package
- Where appropriate, parameterise functions with an 'msoa_lim' parameter to limit the number of loops within functions (a temporary aid to get the initial code running)
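The 'msoa_lim' idea from the last bullet can be sketched roughly as follows. This is only an illustration under stated assumptions: `process_msoas()`, its inputs, and the per-area work are hypothetical, and only the `msoa_lim` parameter name comes from the commit message.

```r
# Hypothetical example: only the 'msoa_lim' parameter name comes from the
# commit message; the function, area IDs, and per-area work are made up.
process_msoas <- function(msoa_codes, msoa_lim = NULL) {
  # When msoa_lim is given, keep only the first 'msoa_lim' areas so that a
  # trial run finishes quickly; NULL means "process every area".
  if (!is.null(msoa_lim)) {
    msoa_codes <- utils::head(msoa_codes, msoa_lim)
  }
  results <- vector("list", length(msoa_codes))
  for (i in seq_along(msoa_codes)) {
    # Placeholder for the real per-area computation
    results[[i]] <- data.frame(
      area_id = msoa_codes[i],
      n_chars = nchar(msoa_codes[i])
    )
  }
  do.call(rbind, results)
}

codes <- c("area_1", "area_2", "area_3", "area_4")  # placeholder IDs
n_all <- nrow(process_msoas(codes))                 # loops over all four areas
n_lim <- nrow(process_msoas(codes, msoa_lim = 2))   # loops over the first two only
```

Defaulting the limit to `NULL` keeps the full run as the normal behaviour, so the limit can later be dropped without changing callers.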

- Change output locations to allow for comparing current results with historical data during initial testing
- Add bare-bones vignette to describe data strategy
- Replace occurrences of Data/ with Data_ref/ or Data_act/ as appropriate
- Replace occurrences of Output/ with Output_ref/ or Output_act/ as appropriate
  (where '_ref' is for historical data and '_act' is for data produced by the current code)
- Update paths in 4*_CollateResults_*.R to match current code
For 1a_DataPrep_StudyRegion.R:
- Qualify occurrences of package functions
- Save processed data as separate .rds files to facilitate testing
- Derive parent_area_name instead of loading from Data_ref/Raw/Shapefiles/area_hierarchy.csv (which is not currently available)
- Implement test to ensure (saved) data produced by running code matches reference data
For 1b_DataPrep_Population.R:
- Qualify occurrences of package functions
- Save processed data as separate .rds file to facilitate testing
- Implement test to ensure (saved) data produced by running code matches reference data
Also:
- Remove maptools package (it was retired in Oct 2023 and does not seem to be used in the code)
- Add 'Development' vignette
For 1c_DataPrep_TUS.R:
- Qualify occurrences of package functions
- Save processed data as separate .rds file to facilitate testing
- Implement test to ensure (saved) data produced by running code matches reference data
For 1d_DataPrep_PM25_CAMS.R:
- Qualify occurrences of package functions
- Save processed data as separate .rds file to facilitate testing
- Implement test to ensure (saved) data produced by running code matches reference data
…part of #42)
- Implement system env vars (SPF_RUN_HLDT_1* or SPF_RUN_ALL_HLDT) which specify whether to run 'high-level' data tests
- Add config vignette and update other vignettes
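A minimal sketch of how such environment variables might gate a test is shown here. The helper `should_run_hldt()` is hypothetical, `SPF_RUN_HLDT_1A` is a made-up instance of the `SPF_RUN_HLDT_1*` pattern, and treating the value "true" as "on" is an assumption; the commit message does not say which values the package accepts.

```r
# Assumption: a variable counts as "on" when it is set to "true"
# (case-insensitive). The values accepted by the package are not stated.
should_run_hldt <- function(test_var) {
  is_on <- function(name) {
    identical(tolower(Sys.getenv(name, unset = "")), "true")
  }
  # Run when either the global switch or the test-specific switch is on
  is_on("SPF_RUN_ALL_HLDT") || is_on(test_var)
}

# "SPF_RUN_HLDT_1A" is a hypothetical name used for illustration only.
Sys.setenv(SPF_RUN_ALL_HLDT = "false", SPF_RUN_HLDT_1A = "true")
run_specific <- should_run_hldt("SPF_RUN_HLDT_1A")   # specific switch is on

Sys.setenv(SPF_RUN_HLDT_1A = "false")
run_none <- should_run_hldt("SPF_RUN_HLDT_1A")       # both switches are off

Sys.setenv(SPF_RUN_ALL_HLDT = "TRUE")
run_all <- should_run_hldt("SPF_RUN_HLDT_1A")        # global switch is on
```

Inside a testthat test, a helper like this could be combined with `testthat::skip_if_not()` so that slow data tests are skipped unless explicitly requested.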
For 1e_DataPrep_PM25_EMEP.R:
- Qualify occurrences of package functions
- Save processed data as separate .rds file to facilitate testing
- Implement test to ensure (saved) data produced by running code matches reference data
For 1f_DataPrep_PM25_GM.R:
- Qualify occurrences of package functions
- Save processed data as separate .rds file to facilitate testing
- Implement test to ensure (saved) data produced by running code matches reference data
For 2_Activities_2021.R:
- Qualify occurrences of package functions
- Save processed data as separate .rds file to facilitate testing
- Implement test to ensure (saved) data produced by running code matches reference data
For 3a_Exposures_July_2021.R:
- Qualify occurrences of package functions
- Save processed data as separate .rds file to facilitate testing
- Implement test to ensure (saved) data produced by running code matches reference data
For 3b_Exposures_Q1_2021.R:
- Qualify occurrences of package functions
- Save processed data as separate .rds file to facilitate testing
- Implement test to ensure (saved) data produced by running code matches reference data
For 4a_CollateResults_July_2021.R:
- Qualify occurrences of package functions
- Save processed data as separate .rds file to facilitate testing
- Implement test to ensure (saved) data produced by running code matches reference data
For 4b_CollateResults_Q1_2021.R:
- Qualify occurrences of package functions
- Save processed data as separate .rds file to facilitate testing
- Implement test to ensure (saved) data produced by running code matches reference data
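The recurring "data produced by running code matches reference data" check could look roughly like the sketch below. It uses only base R; `matches_reference()` and the demo file names are hypothetical, and the `Output_ref`/`Output_act` directory names are borrowed from the earlier commit message only to show the idea.

```r
# Hypothetical helper: compare an .rds file written by the current code with
# a reference copy. all.equal() tolerates tiny floating-point differences.
matches_reference <- function(act_path, ref_path) {
  act <- readRDS(act_path)
  ref <- readRDS(ref_path)
  isTRUE(all.equal(act, ref))
}

# Build a throwaway directory layout for the demonstration
tmp <- tempfile("spf_demo_")
dir.create(file.path(tmp, "Output_ref"), recursive = TRUE)
dir.create(file.path(tmp, "Output_act"), recursive = TRUE)
ref_file <- file.path(tmp, "Output_ref", "demo.rds")
act_file <- file.path(tmp, "Output_act", "demo.rds")

# Identical data in both locations: the check passes
dat <- data.frame(area_id = c("a", "b"), population = c(100, 250))
saveRDS(dat, ref_file)
saveRDS(dat, act_file)
same <- matches_reference(act_file, ref_file)

# Change one value in the "actual" output: the check fails
dat$population[2] <- 251
saveRDS(dat, act_file)
changed <- matches_reference(act_file, ref_file)
```

In a testthat suite the same comparison would more likely be written as `testthat::expect_equal(readRDS(act_file), readRDS(ref_file))`, which also reports where the objects differ.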
For 5a_PlotResults_July_2021.R:
- Qualify occurrences of package functions
- Substitute gridExtra::arrangeGrob for deprecated multiplot
- Use ggplot2::ggsave to save graphs
For 5b_PlotResults_Q1_2021.R:
- Qualify occurrences of package functions
- Substitute gridExtra::arrangeGrob for deprecated multiplot
- Use ggplot2::ggsave to save graphs
Also:
- Update man pages
- Move test-helpers.R to testthat/helper-run-test-check.R as it should be in the testthat directory
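As an illustration of the plotting change, the sketch below combines two plots with `gridExtra::arrangeGrob()` and writes the result with `ggplot2::ggsave()`. The data, plots, and output file name are invented for the example, and it assumes the ggplot2 and gridExtra packages are installed; it is not the package's plotting code.

```r
# Made-up data purely for illustration
df <- data.frame(x = 1:10, y = (1:10)^2)

# Package functions are qualified with '::' rather than attached via library()
p1 <- ggplot2::ggplot(df, ggplot2::aes(x = x, y = y)) +
  ggplot2::geom_point()
p2 <- ggplot2::ggplot(df, ggplot2::aes(x = x, y = y)) +
  ggplot2::geom_line()

# arrangeGrob() builds the combined layout without drawing it
combined <- gridExtra::arrangeGrob(p1, p2, ncol = 2)

# ggsave() accepts the combined object through its 'plot' argument
out_file <- file.path(tempdir(), "combined_demo.pdf")
ggplot2::ggsave(out_file, plot = combined, width = 8, height = 4)
```

Because `arrangeGrob()` returns an object instead of drawing to the current device, the saved file does not depend on whichever graphics device happens to be open.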
Add option to specify whether to import the openair AURN data or use a cached version
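One way an "import or use cached version" option might be structured is sketched below. This is a generic pattern, not the package's implementation: `get_data()`, its arguments, and the stand-in fetch function are hypothetical, and no real openair call is made.

```r
# Hypothetical helper: return cached data when allowed and available,
# otherwise call 'fetch' (e.g. a function wrapping a download) and cache it.
get_data <- function(cache_file, fetch, use_cache = TRUE) {
  if (use_cache && file.exists(cache_file)) {
    return(readRDS(cache_file))
  }
  dat <- fetch()
  saveRDS(dat, cache_file)
  dat
}

# Stand-in for a slow import; counts how many times it is called
n_fetches <- 0
fake_import <- function() {
  n_fetches <<- n_fetches + 1
  data.frame(site = "demo", value = 1.5)
}

cache_file <- tempfile(fileext = ".rds")
first  <- get_data(cache_file, fake_import)                    # fetches and caches
second <- get_data(cache_file, fake_import)                    # read from the cache
third  <- get_data(cache_file, fake_import, use_cache = FALSE) # forces a re-import
```

Passing the import step as a function keeps the caching logic testable without network access.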
Note: These parameters will provide a mechanism for parameterising a run, though they do not serve any purpose yet.
Add utilities for managing file system paths, e.g. to split a path into its sub-directory components and reconstruct a path from a vector of components.
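A minimal sketch of splitting a path into components and rebuilding it is shown below. The function names `split_path()` and `join_path()` are hypothetical, and the sketch assumes relative paths that use forward slashes.

```r
# Hypothetical helpers; the names are not taken from the package.
# Assumes relative paths with "/" as the separator.
split_path <- function(path) {
  parts <- strsplit(path, "/", fixed = TRUE)[[1]]
  parts[nzchar(parts)]  # drop empty pieces caused by repeated slashes
}

join_path <- function(parts) {
  # file.path() joins its arguments with "/" on all platforms
  do.call(file.path, as.list(parts))
}

parts <- split_path("Data_ref/Raw/Shapefiles/area_hierarchy.csv")
rebuilt <- join_path(parts)

# Swapping one component, e.g. reference data for current data
parts_act <- parts
parts_act[1] <- "Data_act"
act_path <- join_path(parts_act)
```

Working on a vector of components makes it straightforward to swap a single directory, such as `Data_ref` for `Data_act`, without string surgery on the whole path.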
Setup code:
- Create test fixtures
- Silence cli output in tests
Teardown code:
- Ensure cli output is restored after tests
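The silence-then-restore pattern can be sketched with base R's `options()`. This assumes that cli output is controlled through the `cli.default_handler` option; the commit message does not say which mechanism the package uses, and `silence_cli()` and `restore_cli()` are hypothetical names.

```r
# Assumption: cli messages are routed through the 'cli.default_handler'
# option, so a no-op handler silences them. Helper names are hypothetical.
silence_cli <- function() {
  # options() returns the previous values, which are kept for teardown
  options(cli.default_handler = function(...) invisible(NULL))
}

restore_cli <- function(old) {
  options(old)
}

before <- getOption("cli.default_handler")

old <- silence_cli()                                   # setup
silenced <- is.function(getOption("cli.default_handler"))

restore_cli(old)                                       # teardown
after <- getOption("cli.default_handler")
```

Storing the value returned by `options()` and passing it back on teardown restores whatever was set before, including the case where the option was unset.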
Note: The 'Browse source code' link on the website's front page is currently not being rendered so this is to provide a link for it.
These commits fulfil #40 and #42: the initial conversion to an R package, with automated testing that will facilitate refactoring the code into smaller units. They involve large changes to the original R codebase because the conversion requires restructuring. Note that this is a minimal conversion; it does not include the configuration system, which would support (inter alia) eliminating hard-coded paths.