Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add similarity to raw_lookup endpoint #1639

Closed
wants to merge 6 commits into from

Conversation

prenner
Copy link
Contributor

@prenner prenner commented Oct 2, 2024

Summary

  • write tests for raw_lookup (some tests are 503'ing?)
  • add to lookup
  • add to anywhere where we try and find relevant matches
  • poke holes in api design

Test Plan

to run api tests:

docker exec -it app /bin/bash
pip install .
pip install pytest
cd OpenMediaMatch
pytest tests/

"""
signal = require_request_param("signal")
signal_type_name = require_request_param("signal_type")
return lookup_signal(signal, signal_type_name)


def lookup_signal(signal: str, signal_type_name: str) -> dict[str, list[int]]:
def lookup_signal(signal: str, signal_type_name: str) -> dict[str, dict[str, str]]:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this response shape, but just noting that it is technically a breaking change for clients of this API

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, probably the right thing to do is version the api change

@andrew-dillon
Copy link

Related to #1573

@prenner prenner force-pushed the prenner/add-similarity-score branch from 93e50ea to e9e823d Compare October 2, 2024 20:31
Copy link
Contributor

@Dcallies Dcallies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job! This ended up being simpler than I had thought it would be.

Tests are not expected to 503!

I believe the only path that can lead to 503 is if the index is not available when you try and query it. Are you saying they are 503-ing in main or just when you make the changes?

Comment on lines +331 to +332
if entry is not None and is_in_pytest():
entry.reload_if_needed(get_storage())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocking q: Hmm, I'm very suspicious of any branching in the code that specifically changes behavior in tests.

Is there another solution that you can use in the test itself that can achieve the same results? (e.g. disabling background loading in the fixture for the test? Manually calling a load_all_indices() function?)

Comment on lines +342 to +343
def is_in_pytest():
return "pytest" in sys.modules
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Danger! I don't think this works consistently, and someone could accidentally include a test library in the build that silently changes the behavior of HMA. This is why I am generally suspicious of this kind of branching logic!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ha agreed, glad we found the root cause!

"""
signal = require_request_param("signal")
signal_type_name = require_request_param("signal_type")
return lookup_signal(signal, signal_type_name)
include_distance = bool(request.args.get("include_distance", False)) == True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works for me as a solution, though I think this bool parsing is different than what we do for other flags.

blocking: Can you include it in the list of inputs in the docstring?


return lookup_signal_func(signal, signal_type_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Both of these functions return the same shape, it might be slightly easier to have this return the list and do

return {
  "matches": lookup_signal_func()
}

@prenner prenner closed this Oct 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants