Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

COB Revised -- Wait for James to Merge #288

Open
wants to merge 14 commits into
base: master
Choose a base branch
from
Open

COB Revised -- Wait for James to Merge #288

wants to merge 14 commits into from

Conversation

jamesaoverton
Copy link
Member

@jamesaoverton jamesaoverton commented Oct 24, 2024

This revision is based on the results of the September 2024 COB workshop.

Update: The upshot of these changes is to decide which classes should be in the COB namespace and which should remain in their source namespace. Previous versions of COB were not careful or consistent about this.

Changes have not yet been vetted.

Update: This PR has been public for 6 weeks. It has been reviewed by two senior OBO Technical Working Group members. OBO Operations members have been specifically asked to review if interested. We are now asking the OBO Discuss mailing list for comment.

We need to compare to the previously released cob.owl file and ensure that we aren't dropping any important terms.

Update: We have carefully compared to the previously released cob.owl and are satisfied that changes are intentional.

The biggest change here is a src/ontology/cob-edit.tsv file, which is a ROBOT template that replaces cob-edit.owl. Given COB's small size and tight integration, I think that a template is a good choice for editing, but this needs discussion we could reverse course. I may have bent ODK conventions to the breaking point.

The src/scripts/split-cob-edit.py script splits the cob-edit.tsv into three "modules":

  • BASE has only terms in the COB ID space
  • FULL has all the BASE terms, and other OBO terms used to define them and fill out COB
  • ROOT has all of FULL, plus "root" terms from various OBO ontologies, which demonstrate the interoperability of OBO

Update: COB workshop term review sheet

Update: Direct links to download the OWL files:

I think that the most controversial part of the current version is the 'REPLACED' and 'REPLACEMENT' object properties. I added temporary object properties such as 'has material part' with tight domains and ranges, and used these in the logical axioms of the terms in the sheet. The split-cob-edit.py script replaces these temporary object properties with existing object properties, such as RO 'has part'. The existing object properties may have domains and ranges that we were trying to leave out of COB, such as 'continuant'.

This revision is based on the results
of the September 2024 COB workshop.
Changes have not yet been vetted.
@jamesaoverton
Copy link
Member Author

Looking at the diff, I already see some changes I want to make:

  • remove obo:COB_module annotations from the OWL files
  • figure out language tags for labels

@jamesaoverton
Copy link
Member Author

@cmungall I'd appreciate your feedback on the overall approach here.

@cmungall
Copy link
Contributor

I'm afraid I only have time now for a cursory look.

Minor: cob-to-external is now subsumed into cob-edit. This makes sense (we should make sure to sync docs in this PR or closely after). It looks like not everything is accounted for. E.g. we had a lot of narrow classes e.g ZFA roots. Maybe these are of less importance but I think we should have a trail here, maybe leave behind a rump cob-to-external with the entries not in cob-edit

Domains and ranges: we still need to solve this problem (#213). This PR improves over the current release (which actually injects domains and ranges that are too strict).

@jamesaoverton
Copy link
Member Author

I made a few improvements:

  • remove COB:module and COB:module_reason annotations
  • obsolete some terms, based on discussions at the workshop
  • restore some terms that were in the previous release -- these still need discussion and are marked as such in the cob-edit.tsv file
  • restore some missing ontology metadata
  • update the generated OWL files

@sebastianduesing Can you please sanity check the cob.owl, cob-base.owl and cob-root.owl files for me?

@sebastianduesing
Copy link
Collaborator

I went through those three files to check for the following:

  1. All terms are in the right module(s).
  2. The new terms are in the right modules and placed correctly in the hierarchy.
  3. Expected metadata is present.

Everything looks good on points 1 and 2. While checking on point 3, I noticed a problem: cob-edit.tsv lists the ID of IAO:'definition' as IAO:0000015. This is actually the ID for IAO:'information carrier'. The correct ID for IAO:'definition' is IAO:0000115.

I think this mix-up started in the workshop Google Sheet; I've now fixed it there. Let me know if you'd like me to add a commit here to fix it in cob-edit.tsv. If there's anything else I should be checking for, just let me know of that too and I'll do another look-over.

@jamesaoverton
Copy link
Member Author

Thanks @sebastianduesing! I think this is in good enough shape to share and discuss.

Still to do:

  • discuss some remaining terms: marked as "TO DISCUSS" in cob-edit.tsv
  • update documentation
  • language tags for labels and definitions (a ROBOT template column either uses language tag or it does not, but the use of language tags in the source ontologies will be mixed, which could cause problems with false duplicates for labels and definitions)
  • @cmungall's suggestion to trace dropped imports: while OBO is very strict that an ontology should never just drop its own terms (they should be obsoleted), I don't see a problem with dropping terms that we no longer want to import from another ontology, i.e. ZFA or XAO. I could be convinced otherwise.
  • RO axioms that have domain/range constraints with BFO classes above the COB shoreline #213: sort out domains and ranges for imported relations

GitHub was complaining about "illegal quoting", so this might make it happier.
@jamesaoverton
Copy link
Member Author

Copy link
Contributor

@matentzn matentzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR constitutes a critical improvement for COB moving forward. I endorse all the proposed changes from a high level - but I admittedly leave details to my trust in @jamesaoverton diligence in all things (sorry).

With this review I am assuming someone scrolled through the cob.owl diff and convinced themselves all changes are expected - I didn't do that (just eyeball, lgtm).

I reviewed cob-root.owl and cob-base.owl They look great, clean and all! I personally find it a little unfortunate that not all COB classes have a definition, e.g. cellular organism or the all dangerous characteristic, but that is not really key to this PR here.

I made a few minor comments, but I don't think they are critical to be addressed before merging this.

src/ontology/cob.Makefile Show resolved Hide resolved
src/ontology/cob.Makefile Show resolved Hide resolved
@sebastianduesing
Copy link
Collaborator

I noticed today that it appears that IAO:0000109 "measurement datum" was dropped from this version of COB. I went back to the workshop sheet to check if that's just one we hadn't made a decision about, but I didn't find it in that sheet either. I may have forgotten a conversation we've had about that one—was that an intentional omission? I've got no stake on whether it's in there or not, just want to make sure I didn't miss something.

@sebastianduesing
Copy link
Collaborator

By robot-exporting this version and the existing version of COB and diffing the TSVs, I found a couple more terms that appear to be unaccounted for in this version (ignoring terms that were simply not decided on at the workshop):

  • 'mass' and 'charge' were both placed in full according to the workshop sheet, but do not appear in this branch.
  • 'is specified input of' is missing in this branch.
  • Adjacently, COB currently changes the label of 'is specified output of' to 'specified output of'. I can't recall whether this was an agreed-upon change.

Let me know if you'd like me to add a commit to fix any/all of these.

@bpeters42
Copy link
Contributor

bpeters42 commented Dec 11, 2024 via email

@jamesaoverton
Copy link
Member Author

I made a few updates to address comments by @sebastianduesing:

  • restore 'measurement datum'
  • restore 'is specified input of'
  • relabel 'is specified output of'
  • restore 'mass'
  • restore 'electric charge' -- I decided to use the actual PATO label instead of just 'charge'

I also restored the generated cob.tsv file, as suggested by @matentzn.

@jamesaoverton jamesaoverton self-assigned this Dec 12, 2024
@jamesaoverton jamesaoverton changed the title COB Revised COB Revised -- Wait for James to Merge Dec 12, 2024
@jamesaoverton jamesaoverton marked this pull request as ready for review December 12, 2024 20:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants