Index is out of bound during the all_by_all_pairwise_similarity #28

Open
souzadevinicius opened this issue Feb 3, 2023 · 0 comments
Labels
bug Something isn't working

Component

GrapeImplementation.all_by_all_pairwise_similarity

Description

Calling GrapeImplementation.all_by_all_pairwise_similarity with two different term lists raises an index out of bounds exception:

IndexError                                Traceback (most recent call last)
Cell In [64], line 1
----> 1 tp = oi.all_by_all_pairwise_similarity(oba_list, vt_list)

File ~/.pyenv/versions/3.10.8/lib/python3.10/site-packages/oakx_grape/grape_implementation.py:402, in GrapeImplementation.all_by_all_pairwise_similarity(self, subjects, objects, predicates)
    398     raise ValueError("For now can only use hardcoded ensmallen predicates")
    400 resnik_model = self._make_grape_resnik_model()
--> 402 sim = resnik_model.get_similarities_from_bipartite_graph_node_names(
    403     source_node_names=subjects,
    404     destination_node_names=objects,
    405     return_similarities_dataframe=True,
    406     return_node_names=True,
    407 )
    409 pairs = iter(self._df_to_pairwise_similarity(sim))
    411 return pairs

File ~/.pyenv/versions/3.10.8/lib/python3.10/site-packages/embiggen/similarities/dag_resnik.py:145, in DAGResnik.get_similarities_from_bipartite_graph_node_names(self, source_node_names, destination_node_names, minimum_similarity, return_similarities_dataframe, return_node_names)
    120 def get_similarities_from_bipartite_graph_node_names(
    121     self,
    122     source_node_names: List[str],
   (...)
    126     return_node_names: bool = False
...
     81     ),
     82     "resnik_score": similarities
     83 })

To Reproduce

Steps to reproduce the behavior:

I first merged two ontologies (OBA and VT) into a single file, then built two term lists by subsetting the merged terms by prefix. The first list contains OBA terms and the second contains VT terms.

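import pandas as pd
from oaklib.selector import get_implementation_from_shorthand

oi = get_implementation_from_shorthand("grape:sqlite:../tmp/oba-vt.owl")
# Each file has one CURIE per line and no header row.
# Assumption: oba_list / vt_list are the first column converted to a plain list.
oba_list = pd.read_csv('../tmp/oba_terms.txt', header=None)[0].tolist()
# ['OBA:1000035', 'OBA:1000045', 'OBA:0000003', 'OBA:0000005', 'OBA:0000006']
vt_list = pd.read_csv('../tmp/vt_terms.txt', header=None)[0].tolist()
# ['VT:0000181', 'VT:0000362', 'VT:0000717', 'VT:0000813', 'VT:0001097']
tp = oi.all_by_all_pairwise_similarity(oba_list, vt_list)  # raises IndexError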
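One check that might help narrow this down is whether every term in both lists is actually present as a node in the merged graph. This is only a sketch and not a confirmed diagnosis: the helper `missing_terms` and the `known_nodes` set are hypothetical, and in practice the set would have to be built from the loaded ontology (for example from `oi.entities()`, assuming the adapter exposes it).

```python
from typing import Iterable, List, Set


def missing_terms(terms: Iterable[str], known_nodes: Set[str]) -> List[str]:
    """Return the terms that are not in known_nodes, preserving input order."""
    return [t for t in terms if t not in known_nodes]


# Hypothetical stand-in for the node names of the merged graph.
known_nodes = {"OBA:1000035", "OBA:1000045", "VT:0000181"}

# Terms reported here are not nodes of the graph and are worth inspecting first.
print(missing_terms(["OBA:1000035", "OBA:0000003"], known_nodes))
print(missing_terms(["VT:0000181", "VT:0000362"], known_nodes))
```

If both lists come back empty, missing nodes can be ruled out and the problem is more likely in the bipartite similarity call itself.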

Expected behavior

The call should return pairwise similarities for the two lists. When I pass the same list for both parameters of GrapeImplementation.all_by_all_pairwise_similarity, everything works fine:

tp = oi.all_by_all_pairwise_similarity(oba_list, oba_list)

for t in tp:
    print(t.ancestor_information_content)
10.202258110046387
5.5109100341796875
0.0001483669620938599
0.0001483669620938599
0.11778302490711212
5.5109100341796875
10.202258110046387
0.0001483669620938599
0.0001483669620938599
0.11778302490711212
10.202258110046387
5.587137222290039
5.587137222290039
10.202258110046387
0.0001483669620938599
0.0001483669620938599
10.202258110046387
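The output above shows only the scores, so it is hard to tell which pairs they belong to. If the list holds the five terms shown in the comments, 25 ordered pairs are possible, while 17 scores are printed. A sketch like the following could make absent pairs visible. The `Sim` class is a hypothetical stand-in for the objects returned by the iterator, assuming they expose `subject_id`, `object_id`, and `ancestor_information_content`. The helper names are also made up for illustration.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, Iterable, List, Sequence, Tuple


@dataclass
class Sim:
    """Hypothetical stand-in for one pairwise similarity result."""
    subject_id: str
    object_id: str
    ancestor_information_content: float


def index_by_pair(results: Iterable[Sim]) -> Dict[Tuple[str, str], float]:
    """Map (subject, object) to the ancestor information content."""
    return {(r.subject_id, r.object_id): r.ancestor_information_content for r in results}


def absent_pairs(
    subjects: Sequence[str],
    objects: Sequence[str],
    indexed: Dict[Tuple[str, str], float],
) -> List[Tuple[str, str]]:
    """List the ordered pairs that have no entry in the indexed results."""
    return [pair for pair in product(subjects, objects) if pair not in indexed]


# Made-up results, only to show the shape of the check.
results = [Sim("A:1", "A:1", 10.2), Sim("A:1", "A:2", 5.5), Sim("A:2", "A:2", 10.2)]
indexed = index_by_pair(results)
print(absent_pairs(["A:1", "A:2"], ["A:1", "A:2"], indexed))
```

With the real iterator, `index_by_pair(tp)` would replace the made-up `results` list.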

Additional context

Library versions:
oaklib 0.1.70
oakx-grape 0.1.2
embiggen 0.11.39

souzadevinicius added the bug label Feb 3, 2023