Refactoring: streamline perf() and tune() functions to do performance assessment on just input model #343

Open · 4 tasks done
evaham1 opened this issue Nov 12, 2024 · 1 comment · Fixed by #344 · May be fixed by #348

evaham1 commented Nov 12, 2024

Describe the issue
The functions perf() and tune() currently have overlapping roles that do not exactly match their names.

Expected behaviour
Based on the function names, one would expect tune() to build and assess multiple models with different parameters to aid parameter selection, whilst perf() would assess the performance of just one final model.

Solution
Refactor the code for perf() and tune() so that they can still be run as before, BUT can also be run in this new way:

# perform tuning for component selection across all distances
tune(X, Y, ncomp = 10, keepX = NULL, dist = NULL)
# perform tuning for variable selection using chosen ncomp and distance
tune(X, Y, ncomp = 3, keepX = c(1, 10, 100), dist = "max.dist")
# once the final model is built, assess performance using just the ncomp it was fitted with
perf(model)
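
For context, the model passed to perf() above would be the final model fitted with the chosen parameters, e.g. (splsda() and the values below are purely illustrative):

# fit the final model with the parameters chosen during tuning, then assess only that model
model <- splsda(X, Y, ncomp = 3, keepX = c(1, 10, 100))
perf(model, validation = "Mfold", folds = 5, nrepeat = 10)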

The old way of running these methods also needs to remain unchanged, i.e.

# tuning for keepX only
tune(X, Y, ncomp = 3, keepX = c(1, 10, 100), dist = "max.dist")
# using perf() to identify optimal number of components
perf(model, ncomp = 10)

Update: for this to be possible we will need to add a new function called perf.assess() rather than edit perf(), as it would not be possible to both add the new functionality and keep the old function mechanism.

Functions to add:

  • perf.assess.mixo.plsda() == perf.assess.mixo.splsda()
  • perf.assess.mixo.pls() == perf.assess.mixo.spls()
  • perf.assess.sgccda()
  • perf.assess.mint.plsda() == perf.assess.mint.splsda()
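
A minimal sketch of how the proposed perf.assess() generic could dispatch (argument names and defaults are illustrative only; the method below is named after the mixo_splsda S3 class):

perf.assess <- function(object, ...) {
  UseMethod("perf.assess")
}

perf.assess.mixo_splsda <- function(object, validation = "Mfold", folds = 10,
                                    nrepeat = 1, ...) {
  # unlike perf(), do not sweep over 1:ncomp - assess only the model as fitted,
  # i.e. with its own ncomp and keepX (delegating to perf() here is just a
  # placeholder for that restricted cross-validation loop)
  perf(object, validation = validation, folds = folds, nrepeat = nrepeat, ...)
}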
@evaham1 added the refactorisation label Nov 12, 2024
@evaham1 self-assigned this Nov 12, 2024
@evaham1 linked a pull request Nov 13, 2024 that will close this issue
@evaham1 changed the title from "Refactoring: streamline perf() and tune() functions so they perform different roles" to "Refactoring: streamline perf() functions to do performance assessment on just input model" Nov 18, 2024
@evaham1 removed their assignment Nov 18, 2024
@evaham1 changed the title from "Refactoring: streamline perf() functions to do performance assessment on just input model" to "Refactoring: streamline perf() and tune() functions to do performance assessment on just input model" Nov 19, 2024
evaham1 commented Nov 19, 2024

Going through tune() functions to see which need to be updated:

  • tune.pca() is fine - tunes ncomp

  • tune.rcc() is fine - tunes lambda1 and lambda2

  • tune.spca() is fine - tunes ncomp and test.keepX simultaneously, but, as described in the sPCA Multidrug Case Study, users should use tune.pca() to choose ncomp and then tune.spca() to choose which variables to keep; see the sketch below
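
That two-step workflow looks roughly like this (values illustrative):

# step 1: choose ncomp from the explained variance reported by tune.pca()
tune.pca.res <- tune.pca(X, ncomp = 10)
# step 2: with ncomp fixed, tune how many variables to keep per component
tune.spca.res <- tune.spca(X, ncomp = 3, test.keepX = c(5, 15, 25),
                           folds = 5, nrepeat = 3)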

  • tune.splslevel() does not need to be edited - only for variable selection

  • tune.splsda() tunes ncomp and test.keepX but doesn't work when test.keepX = NULL

library(mixOmics)
data(liver.toxicity)

plot(tune(method = "splsda", liver.toxicity$gene, liver.toxicity$treatment$Treatment.Group,
          test.keepX = c(15, 25), ncomp = 3, nrep = 3)) # works
plot(tune(method = "splsda", liver.toxicity$gene, liver.toxicity$treatment$Treatment.Group,
          test.keepX = NULL, ncomp = 3, nrep = 3)) # fails
  • tune.plsda(), which would only tune ncomp, doesn't exist (it should essentially do the same as perf()); see the sketch below
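A rough sketch of what a tune.plsda() that only tunes ncomp could look like, assuming it simply fits a PLS-DA model at the maximum ncomp and hands cross-validation over to perf() (argument names and defaults are illustrative):

tune.plsda <- function(X, Y, ncomp = 2, validation = "Mfold", folds = 10,
                       nrepeat = 1, ...) {
  # fit a PLS-DA model with the largest candidate ncomp, then let perf()
  # cross-validate each component count to pick the optimal one
  model <- plsda(X, Y, ncomp = ncomp)
  perf(model, validation = validation, folds = folds, nrepeat = nrepeat, ...)
}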
  • tune.spls() tunes ncomp, test.keepX and test.keepY, but fails if both test.keepX = NULL and test.keepY = NULL
plot(tune.spls(liver.toxicity$gene, liver.toxicity$clinic, ncomp = 3,
               test.keepX = c(5, 10, 15), test.keepY = c(3, 6, 8),
               folds = 5, nrepeat = 3, progressBar = TRUE)) # works
plot(tune.spls(liver.toxicity$gene, liver.toxicity$clinic, ncomp = 3,
               test.keepX = NULL, test.keepY = c(3, 6, 8),
               folds = 5, nrepeat = 3, progressBar = TRUE)) # works
plot(tune.spls(liver.toxicity$gene, liver.toxicity$clinic, ncomp = 3,
               test.keepX = c(5, 10, 15), test.keepY = NULL,
               folds = 5, nrepeat = 3, progressBar = TRUE)) # works
plot(tune.spls(liver.toxicity$gene, liver.toxicity$clinic, ncomp = 3,
               test.keepX = NULL, test.keepY = NULL,
               folds = 5, nrepeat = 3, progressBar = TRUE)) # fails
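One possible way to handle the failing case, sketched under the assumption that an empty grid should simply fall back to a plain performance assessment (not the actual implementation):

# inside tune.spls(): if no keepX/keepY grid is supplied, skip the grid search
# and just cross-validate the full model, as perf() would
if (is.null(test.keepX) && is.null(test.keepY)) {
  model <- spls(X, Y, ncomp = ncomp)
  return(perf(model, validation = validation, folds = folds, nrepeat = nrepeat))
}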
  • tune.spls1(), which is called inside tune.spls(), can stay the same, but update the examples and make sure spls1 models can be tuned in the new way using tune.spls()

  • tune.pls(), which would only tune ncomp, doesn't exist (it should essentially do the same as perf()); also make sure this runs for pls1 models

  • tune.block.splsda() tunes test.keepX for each data block and component; needs editing so it can just run perf() if test.keepX = NULL

  • tune.block.plsda(), which would only tune ncomp, doesn't exist (it should basically call perf())

  • tune.mint.splsda() tunes test.keepX; needs editing so it can just run perf() if test.keepX = NULL

  • tune.mint.plsda(), which would only tune ncomp, doesn't exist (it should basically call perf())

  • put all new tuning functions into the tune() wrapper and add unit tests to make sure they run the same in and out of the wrapper. Also check how all new and adapted tune functions work with a variety of parameters such as modes, distances, validation methods, multilevel, etc.

  • add unit tests for all new tune functions to check they give the same outputs as perf() on the same models; see the sketch below
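
An illustrative testthat sketch of the wrapper-consistency check (the dataset, seed handling, and compared output elements are assumptions, not the final tests):

library(testthat)
library(mixOmics)
data(liver.toxicity)

test_that("tune() wrapper matches tune.splsda()", {
  set.seed(42)
  direct <- tune.splsda(liver.toxicity$gene, liver.toxicity$treatment$Treatment.Group,
                        ncomp = 2, test.keepX = c(5, 10), folds = 3, nrepeat = 1)
  set.seed(42)
  wrapped <- tune(method = "splsda", liver.toxicity$gene,
                  liver.toxicity$treatment$Treatment.Group,
                  ncomp = 2, test.keepX = c(5, 10), folds = 3, nrepeat = 1)
  # assumes the wrapper forwards arguments unchanged and the seed fixes the CV folds
  expect_equal(wrapped$error.rate, direct$error.rate)
})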

@evaham1 reopened this Nov 22, 2024
@evaham1 self-assigned this Nov 26, 2024
@evaham1 linked a pull request Nov 26, 2024 that will close this issue