Abandon plugin system for Galaxy stored workflows #1292
Replies: 3 comments
-
copying from Gitter:
-
Thank you for copying the conversation from Gitter here, @innovate-invent.

What you propose was actually in part the original way IRIDA interacted with Galaxy. Mainly, a user would install a (specifically structured) workflow in Galaxy and load the corresponding workflow identifier into IRIDA. Subsequently, IRIDA would just execute that specific workflow in Galaxy. This is also how I've seen other applications that use Galaxy operate (I recall the Refinery Platform http://www.refinery-platform.org/ required developers to annotate workflows in Galaxy with specific tags, which would mark them as Refinery Platform workflows).

We later switched to storing the workflows as a file in IRIDA and loading them into Galaxy whenever a workflow was executed. This was done in part to make sure we are always distributing and executing an identical workflow, no matter which instance of IRIDA was being run. I may have been a bit overly concerned with this issue at the time, but I wanted to guarantee that every workflow version used by IRIDA was identical (loading workflows from Galaxy leaves open the potential for someone to modify one accidentally, or for the workflow/tools/parameters to change depending on the version of Galaxy it was loaded into and the tools available). Storing the Galaxy workflow (the

We were initially storing all workflows in the IRIDA code, which made it convenient to store and distribute these workflows (though tools in Galaxy still had to be installed separately). But this meant any new workflows would have to be built directly into the IRIDA code. Hence, we defined a plugin system for people to package up the IRIDA workflow files (

What you're proposing is another possibility for getting workflows to work in IRIDA (which reminds me a lot of the Refinery Platform https://refinery-platform.readthedocs.io/en/latest/administrator/preparing_galaxy_workflows.html). And I do agree that there is a lot of complexity in just writing a workflow for IRIDA right now.
However, I do think that the system you propose has its own complexity. One issue is loading metadata into IRIDA from output files. You are correct that this could be done with a tool in Galaxy (though IRIDA authentication would have to be properly handled). However, in a way, this just replaces writing Java code for parsing files in IRIDA with writing a custom tool in Galaxy for doing the same thing. So I don't think it makes things less complex.

One advantage of your proposal is that it makes it much easier to install new workflows in IRIDA (mainly you just have to define them in Galaxy and tell IRIDA to use a specific workflow). However, this does introduce the potential that you could change workflows used by IRIDA without anybody knowing. Plus, this makes it more challenging to distribute workflows to other users (as everything would have to be encoded into the workflow file, including the required input files, the specific output files to save into IRIDA, and the messages displayed in the IRIDA interface).

So, overall, while I do think there are some advantages to using stored workflows in Galaxy, there would also be a lot of work needed to make such a system work. And I'm not sure it would save people much time, as custom tools would then need to be developed for each workflow to save metadata back to IRIDA. I do agree, though, that we could do more work to improve the development process. Also note you don't have to recompile the plugins for every IRIDA update, only for every workflow release (which is an advantage to me as it enforces proper versioning).
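To make the "changed without anybody knowing" concern concrete: one way a dependent system could notice a silent change is to pin a digest of the workflow definition and compare it before each run. The following is only a minimal sketch, not how IRIDA does it. It assumes the workflow is available as a parsed JSON dictionary (for example, from a `.ga` export), and `workflow_fingerprint` and `is_unchanged` are hypothetical helper names. Which fields should be excluded before hashing is left open here.

```python
import hashlib
import json


def workflow_fingerprint(workflow: dict) -> str:
    """Return a SHA-256 hex digest of a workflow definition.

    Keys are sorted and separators fixed, so two dictionaries with the
    same content hash identically regardless of key order.
    """
    canonical = json.dumps(workflow, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_unchanged(workflow: dict, pinned_digest: str) -> bool:
    """Check a workflow against a digest recorded when it was registered."""
    return workflow_fingerprint(workflow) == pinned_digest
```

With something like this, a modified tool version or parameter inside the definition would produce a different digest, so the change would at least be detectable, whichever side stores the workflow.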
-
Thanks for considering my proposal. There are new features of Galaxy workflows that you may not be aware of since the last time you evaluated its functionality.

Galaxy tracks changes to a workflow, meaning that a dependent system can detect when a workflow has been modified. Workflows also allow you to mark specific output files as 'workflow outputs'. Workflow inputs are also specified in a workflow and provide label and description fields. Galaxy workflows also aggressively track tool versions and raise errors if the incorrect one is installed. Given the amount of information the Galaxy API now provides, IRIDA can discover and dynamically link all of its internals.

The issue with metadata loading doesn't require a custom tool per workflow, but a tool per metadata schema. Even then, the metadata loading tools could likely be further generalized. Security can be handled by passing the loading tool a callback URL containing a long random string.

Requiring bioinformaticians to become familiar with the Java stack has a chilling effect on IRIDA contributions and adoption.
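For illustration, the callback URL idea could look roughly like the sketch below. This is a sketch under assumptions, not an existing IRIDA or Galaxy API: the function names, the URL layout, and the in-memory `store` dictionary are all hypothetical, and a real deployment would persist tokens and expire them.

```python
import hmac
import secrets


def issue_callback_url(base_url: str, submission_id: int, store: dict) -> str:
    """Create a single-use callback URL for one analysis submission.

    The token is a long random string; `store` (hypothetical) maps
    submission IDs to the token that was handed out.
    """
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe encoding
    store[submission_id] = token
    return f"{base_url}/{submission_id}?token={token}"


def verify_callback(submission_id: int, presented: str, store: dict) -> bool:
    """Accept a callback only if the presented token matches the stored one."""
    expected = store.get(submission_id)
    if expected is None:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode("utf-8"), presented.encode("utf-8")):
        return False
    del store[submission_id]  # single use: a replayed callback is rejected
    return True
```

The URL would be passed to the metadata loading tool as a workflow parameter, so the tool never needs IRIDA user credentials.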
-
The plugin system presents a significant barrier to implementing, maintaining, and distributing workflows for IRIDA. Plugins also currently provide no value other than specifying a color for display in the workflow list.
I propose we abandon the plugin system and store workflows in Galaxy. IRIDA can then query the Galaxy API for the workflows and their descriptions.
Workflows can be distributed as Galaxy workflow files and imported under the irida user.
Galaxy workflows provide input annotations and output specifications that can be dynamically discovered.
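As a rough sketch of what "dynamically discovered" could mean in practice, the function below walks an exported workflow definition and collects its inputs, marked outputs, and pinned tool versions. It assumes the definition has already been parsed into a dictionary, and that steps carry `type`, `label`, `annotation`, `tool_id`, `tool_version`, and `workflow_outputs` fields, as I understand the `.ga` export format to do; treat those field names as assumptions to check against a real export. `describe_workflow` is a hypothetical name.

```python
INPUT_TYPES = ("data_input", "data_collection_input", "parameter_input")


def describe_workflow(workflow: dict) -> dict:
    """Summarize inputs, marked outputs, and tool versions of a workflow."""
    inputs, outputs, tools = [], [], []
    steps = workflow.get("steps", {})
    # Step keys are assumed to be numeric strings ("0", "1", ...).
    for key in sorted(steps, key=int):
        step = steps[key]
        if step.get("type") in INPUT_TYPES:
            inputs.append({
                "label": step.get("label"),
                "description": step.get("annotation", ""),
                "type": step["type"],
            })
        elif step.get("type") == "tool":
            tools.append((step.get("tool_id"), step.get("tool_version")))
        # Only outputs explicitly marked as workflow outputs are collected.
        for out in step.get("workflow_outputs", []):
            outputs.append(out.get("label") or out.get("output_name"))
    return {
        "name": workflow.get("name"),
        "inputs": inputs,
        "outputs": outputs,
        "tools": tools,
    }
```

IRIDA could use a summary like this to build the launch form (labels and descriptions) and to decide which output files to save, without a plugin describing them.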
I would be willing to contribute this functionality if it is acceptable.