Build the “composite” life stages ontology directly in Uberon #3443
Draft: gouttegd wants to merge 9 commits into master from add-composite-life-stages
The SSLSO (species-specific life stage ontology) is now a fairly normal ODK-managed ontology that we can "import" (talking about our "local imports" here, not imports in an ODK sense -- those are the imports that are used to build Composite Metazoan) without any special treatment.
The SSLSO project provides its own mapping set, so we just need to fetch it, then we can generate the bridge at the same time as all the other bridges. We do _not_ generate a distinct bridge for all the species present in SSLSO, and we will not do that until/unless there is an explicit demand for it. All bridging axioms to SSLSO terms are in a single bridge, except for HsapDv and MmusDv terms (we need MmusDv as a separate bridge to construct composite-mouse; there is no real reason I can think of to have a separate HsapDv bridge, but we always had it, so I can already hear people screaming if I dare remove it.)
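The bridge layout described above can be sketched as a simple partition of mappings by the prefix of the SSLSO-side term. This is only an illustration: the real bridges are generated from the SSSOM mapping set by the existing pipeline, and the shared bridge name `sslso`, the `XxxxDv`/`YyyyDv` prefixes, and all term IDs below are placeholders, not real names or mappings.

```python
from collections import defaultdict

# Prefixes that keep a bridge of their own, as stated in the commit message.
SEPARATE_BRIDGE_PREFIXES = {"HsapDv", "MmusDv"}

# Hypothetical name for the single bridge shared by all other SSLSO species.
SHARED_BRIDGE = "sslso"


def bridge_for(object_id):
    """Return the name of the bridge that should hold a mapping to object_id."""
    prefix = object_id.split(":", 1)[0]
    if prefix in SEPARATE_BRIDGE_PREFIXES:
        return prefix.lower()
    return SHARED_BRIDGE


def partition_mappings(mappings):
    """Group (subject, object) pairs by destination bridge."""
    bridges = defaultdict(list)
    for subject_id, object_id in mappings:
        bridges[bridge_for(object_id)].append((subject_id, object_id))
    return dict(bridges)


# Placeholder IDs only; these pairs do not assert real mappings.
EXAMPLE = [
    ("UBERON:1", "HsapDv:1"),
    ("UBERON:1", "MmusDv:1"),
    ("UBERON:1", "XxxxDv:1"),
    ("UBERON:2", "YyyyDv:7"),
]

for bridge, pairs in sorted(partition_mappings(EXAMPLE).items()):
    print(bridge, len(pairs))
```

Adding a dedicated bridge for another species would, in this sketch, only mean adding its prefix to `SEPARATE_BRIDGE_PREFIXES`.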
Add a new product coming out of the Composite pipeline: "composite-lifestages". This is basically the equivalent to the "ssso-merged-uberon" product that used to be produced by the "developmental stages ontology" project. The intermediate file on the way to get to "composite-lifestages", "collected-lifestages", is basically the equivalent of "ssso-merged".
gouttegd added the tech, pipeline, composite, and bridge-files labels on Dec 6, 2024
QC workflow cancelled as it is bound to fail currently, since the new version of SSLSO is not publicly available yet.
Instead of calling the uberon:merge-species command repeatedly, once for every species to merge, we call it only once, with a batch file listing all the species for which a merge is required. This removes some clutter from the Makefile, but most importantly this also makes the whole operation much faster (from ~45min down to ~7min, on my machine), because in batch mode the reasoner state is shared between all merge operations -- we don't need to create a new reasoner and have it reason over the ontology for every merge, which is what takes the most time. The reasoner is initialised once at the beginning of the first merge, and then it just needs to be kept updated for the subsequent merges, which is much faster than creating a whole new reasoner instance.
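The reason batch mode is faster can be illustrated with a toy model. This is not the uberon:merge-species implementation: `ToyReasoner`, `merge_one_by_one`, and `merge_in_batch` are hypothetical names, and the "reasoner" here only counts how many times the expensive full initialisation would happen.

```python
class ToyReasoner:
    """Stand-in for an OWL reasoner; it does no actual reasoning."""

    def __init__(self, axioms, stats):
        self.axioms = set(axioms)
        stats["full_inits"] += 1  # the expensive step in a real reasoner

    def update(self, new_axioms, stats):
        self.axioms |= set(new_axioms)
        stats["updates"] += 1  # the comparatively cheap incremental step


def merge_one_by_one(base_axioms, per_species):
    """One command invocation per species: a fresh reasoner every time."""
    stats = {"full_inits": 0, "updates": 0}
    for axioms in per_species.values():
        reasoner = ToyReasoner(base_axioms, stats)
        reasoner.update(axioms, stats)
    return stats


def merge_in_batch(base_axioms, per_species):
    """One invocation for all species: the reasoner is created once and reused."""
    stats = {"full_inits": 0, "updates": 0}
    reasoner = ToyReasoner(base_axioms, stats)
    for axioms in per_species.values():
        reasoner.update(axioms, stats)
    return stats


# Placeholder axioms and species keys, for illustration only.
BASE = {"a", "b"}
SPECIES = {"species1": {"c"}, "species2": {"d"}, "species3": {"e"}}

print(merge_one_by_one(BASE, SPECIES))
print(merge_in_batch(BASE, SPECIES))
```

Both variants perform the same number of incremental updates; only the number of full initialisations differs, which is where the time saving reported above would come from.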
As for composite-vertebrate.owl and composite-metazoan.owl, we need a separate rule to create the composite-lifestages.owl product. The generic rule 'composite-%.owl' is not enough because the standard ODK-generated Makefile already contains a more specific rule, which can only be overridden by an equally specific rule.
Now that we generate those two additional products, we must take care that they are not inadvertently committed to the repository.
Currently, the information about which bridges to generate and how, and which species to unfold in composite-metazoan and how, is dispersed in two different places: in the bridges/bridges.rules.m4 source file to generate the bridges, and in config/tax-merges.tsv to generate composite-metazoan. This commit proposes to make those config data more manageable by moving them all to a single config/taxa.yaml file, from which we derive (using a relatively simple Python script) both the SSSOM/T rule file and the batch file that drives the compositing process. Arguably, having the SSSOM/T ruleset generated by a Python script is more maintainer-friendly than having it generated by M4 macros, given that there are likely many more ontology engineers who can read and write Python than ontology engineers who can read and write M4 (which I believe is a shame, as M4 is a powerful and lightweight tool that can do great things when used well, but that's unfortunately beside the point).
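The single-source idea can be sketched as follows. Everything here is an assumption made for illustration: the keys in `TAXA` are not the actual schema of config/taxa.yaml, the dict stands in for the parsed YAML so that the sketch has no third-party dependency, and the lines produced are placeholders rather than real SSSOM/T rules or the real batch file format.

```python
# Stand-in for the parsed content of config/taxa.yaml (assumed schema).
TAXA = {
    "human": {"prefix": "HsapDv", "taxon": "NCBITaxon:9606", "unfold": False},
    "mouse": {"prefix": "MmusDv", "taxon": "NCBITaxon:10090", "unfold": True},
    "zebrafish": {"prefix": "ZFA", "taxon": "NCBITaxon:7955", "unfold": True},
}


def render_bridge_rules(taxa):
    """Return one placeholder rule line per taxon (NOT real SSSOM/T syntax)."""
    return [
        f"# bridge rule placeholder: object prefix {entry['prefix']} "
        f"-> taxon {entry['taxon']}"
        for _, entry in sorted(taxa.items())
    ]


def render_merge_batch(taxa):
    """Return one tab-separated placeholder line per taxon to unfold."""
    return [
        f"{entry['taxon']}\t{entry['prefix']}"
        for _, entry in sorted(taxa.items())
        if entry["unfold"]
    ]


if __name__ == "__main__":
    print("\n".join(render_bridge_rules(TAXA)))
    print("\n".join(render_merge_batch(TAXA)))
```

The point of the design is that both outputs come from the same entries, so adding or changing a taxon in one place keeps the bridge rules and the compositing batch consistent.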
(Draft PR as this is a work that depends on:
This PR changes the way Uberon interacts with the “Species-Specific Life Stages Ontology” (SSLSO) project. Roughly, instead of relying on that project to provide us with a “pre-composited” version of the life stage ontologies, we do everything here in Uberon. All SSLSO has to do is to provide us with the mappings between their terms and the corresponding taxon-neutral terms in Uberon.
There are several reasons for such a change, the most important being that it keeps all the logic to create the “composite” ontologies in the same place, here in Uberon. Having the SSLSO perform its own compositing leads to a lot of duplicated code, a lot of unnecessary back-and-forth between Uberon and SSLSO (Uberon generating the bridges with FBdv and WBls, which are then fetched by SSLSO to produce ssso-merged-uberon, which is then fetched by Uberon to produce composite-metazoan), and a risk that the two composite pipelines (the one in SSLSO and the one in Uberon) are not kept in sync and therefore behave slightly differently (which is exactly the case currently).