Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support new KB API corpus changesCorpus KB API changes #248

Merged
merged 7 commits into from
Jul 11, 2022

Conversation

johnbradley
Copy link
Contributor

@johnbradley johnbradley commented Jul 8, 2022

Changes phenoscape_api() to return https://dev.phenoscape.org/api/v2-beta.
Fixes issue with anatomy_ontology_iris() returning PATO, and ZP.
Fixes errors with Resnik simlarity by removing sampling that occasionally caused problems.
Updates tests now that the KB API returns classification data in term info response.
Adds "states" to the corpus parameter for corpus_size() and term_freqs().

Fixes #243
Fixes #245
Fixes #246

@johnbradley
Copy link
Contributor Author

I think there is a problem with term_freqs().
With the current changes there is an issue with passing an array of as values to term_freqs().
The /similarity/frequency endpoint has a new required value of type that only takes a single value.

term_freqs <- function(x,
as = c("phenotype", "entity", "quality"),
corpus = c("taxa", "taxon_annotations", "gene_annotations", "genes"),
decodeIRI = FALSE,
...) {
as <- match.arg(as, several.ok = TRUE)

ontology_terms_type <- as
if (ontology_terms_type == "entity") {
ontology_terms_type <- "anatomical_entity"
}
query <- list(terms = as.character(jsonlite::toJSON(x)),
corpus = corpus,
type = ontology_terms_type)

@hlapp
Copy link
Member

hlapp commented Jul 8, 2022

There are I think only two ways to fix this:

  1. Change the as parameter to be a single value, and if in following the previous API a list is supplied, stop() if they're not all of the same value or if they are, change to that value.
  2. Break the input list of terms into chunks corresponding to the same value in as. (This would only be needed if as is provided as a list of terms and they are not all the same.) Then process each chunk and finally re-assemble the list.

Having 2 would be nice. But one could consider 1 to be a first step in that direction by allowing as to be a vector, with the caveat that the values not all being the same is not yet supported.

@hlapp
Copy link
Member

hlapp commented Jul 8, 2022

BTW is the following still true? I thought not?

if (any(as != "phenotype"))
stop("corpus '", corpus, "' requires phenotype terms", call. = FALSE)

@johnbradley
Copy link
Contributor Author

BTW is the following still true? I thought not?

Based on the new type parameter for/similarity/frequency that allows either phenotype and anatomical_entity I would assume not.

Blame says this commit was added via 330fc7d. I read through the referenced issues but am unsure what might have been the source of this code. Perhaps phenoscape/phenoscape-kb-services#146 (comment).

@hlapp
Copy link
Member

hlapp commented Jul 9, 2022

@johnbradley are you still working on this? It looks ready to merge?

@johnbradley
Copy link
Contributor Author

@hlapp I think the code is good now, but the documentation still needs work.

@johnbradley
Copy link
Contributor Author

@hlapp Could you take a look at updating the documentation for term_freq() and corpus_size()?

@hlapp hlapp marked this pull request as ready for review July 10, 2022 12:33
Copy link
Member

@hlapp hlapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks all ready to go to me, and docs should now be good. Tests pass for me on local.

@johnbradley
Copy link
Contributor Author

johnbradley commented Jul 11, 2022

@hlapp Are you good with me merging the commits as they are or should I rebase/squash them?

@hlapp
Copy link
Member

hlapp commented Jul 11, 2022

Let's leave the commits, I think most of them make sense on their own.

johnbradley and others added 7 commits July 11, 2022 14:41
Fixes anatomy_ontology_iris to exclude PATO and ZP from the results.
Fixes #246
Removes sampling that occasionally caused failures implementing
fix suggested here: #235 (comment)

Fixes #246
The new API now returns classification for terminfo.
Fixes tests to accommodate this change.
Removes limitation on only passing "phenotype" for `as`. Users can
now pass "entity" as well. User may still pass in a vector matching
the same length as the input IRIs, but they must be all the same
value for now.
@johnbradley johnbradley force-pushed the corpus-kb-api-changes branch from fd44100 to a9659db Compare July 11, 2022 18:48
@johnbradley johnbradley merged commit a9659db into baseline-v0.3.0 Jul 11, 2022
@johnbradley johnbradley deleted the corpus-kb-api-changes branch August 10, 2022 12:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants