Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Display help menus more promptly #62

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Display help menus more promptly #62

wants to merge 2 commits into from

Conversation

thompsonmj
Copy link
Contributor

Addresses #57 (speed up help menu displays by deferring large imports until called for to execute a command).

For a brief demo of the quality of life improvement, timing runs on my laptop for displaying help menus before and after the lazy loading are shown below.

Eager loading, fresh environment:

$ time bioclip -h
usage: bioclip [-h] {predict,embed,list-models} ...

BioCLIP command line interface

options:
  -h, --help            show this help message and exit

commands:
  {predict,embed,list-models}
    predict             Use BioCLIP to generate predictions for image files.
    embed               Use BioCLIP to generate embeddings for image files.
    list-models         List available models and pretrained model checkpoints.

real    0m18.306s
user    0m0.000s
sys     0m0.062s

Eager loading, subsequently in the same environment:

$ time bioclip -h
usage: bioclip [-h] {predict,embed,list-models} ...

BioCLIP command line interface

options:
  -h, --help            show this help message and exit

commands:
  {predict,embed,list-models}
    predict             Use BioCLIP to generate predictions for image files.
    embed               Use BioCLIP to generate embeddings for image files.
    list-models         List available models and pretrained model checkpoints.

real    0m4.203s
user    0m0.000s
sys     0m0.047s

Lazy loading, fresh environment:

$ time bioclip -h
usage: bioclip [-h] {predict,embed,list-models} ...

BioCLIP command line interface

options:
  -h, --help            show this help message and exit
 
commands:
  {predict,embed,list-models}
    predict             Use BioCLIP to generate predictions for image files.
    embed               Use BioCLIP to generate embeddings for image files.
    list-models         List available models and pretrained model checkpoints.

real    0m2.612s
user    0m0.000s
sys     0m0.031s

Lazy loading, subsequently in the same environment:

$ time bioclip -h
usage: bioclip [-h] {predict,embed,list-models} ...

BioCLIP command line interface

options:
  -h, --help            show this help message and exit

commands:
  {predict,embed,list-models}
    predict             Use BioCLIP to generate predictions for image files.
    embed               Use BioCLIP to generate embeddings for image files.
    list-models         List available models and pretrained model checkpoints.

real    0m0.604s
user    0m0.000s
sys     0m0.046s

Similarly for sub-command menus.

Copy link
Member

@hlapp hlapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the help menus appear much more quickly is certainly helpful for a user. But we should still consider a cost-benefit analysis.

Obviously, there is a benefit for the user. However, the benefit is only for learning how one can control what the tool does, not for running the tool for a chosen task.

In terms of cost, it's hard to see how there could be costs for the user, but there can be and arguably are costs for developers. For example, some loss of code clarity, which is perhaps minor. If, however, this change affects (reduces) the way code completion works in widely used code editors (because they can't determine the imports anymore), then I would consider this major. And at this point in the codebase's lifecycle I would rank developer efficiency higher than a user's ability to shave off 10 seconds from the time it takes to get a help message.

So how does this affect code completion in, say, VS Code?

@thompsonmj
Copy link
Contributor Author

The VS Code IntelliSense auto-complete recommendations are still available:
image

image

I believe this is thanks to the declaration in __init__.py of __all__ = ["TreeOfLifeClassifier", "Rank", "CustomLabelsClassifier", "CustomLabelsBinningClassifier"]

@hlapp
Copy link
Member

hlapp commented Nov 13, 2024

I actually mostly meant code completions on the imported modules, not only the pybioclip modules.

@thompsonmj
Copy link
Contributor Author

Ah, gotcha.

I think this demonstrates what you're looking for then:

image

@hlapp
Copy link
Member

hlapp commented Nov 13, 2024

I think this demonstrates what you're looking for then:

Not quite. But I notice that it's only the open_clip module that would now be hidden away (and that presumably is causing the delay otherwise, due to its dependency on Torch etc?). Will it still autocomplete on open_clip?

@thompsonmj
Copy link
Contributor Author

Here's open_clip autocomplete working:
image

And yes, open_clip as well as the torch imports in predict.py were biggest contributors to the delay.

@hlapp
Copy link
Member

hlapp commented Nov 13, 2024

But your screenshot is with a direct import. Of course that works, that wasn't my concern.

@thompsonmj
Copy link
Contributor Author

How about this?
image

Where list_pretrained_tags_by_model is an oc method.

import os
import json
import sys
import prettytable as pt
import pandas as pd
import argparse

DEFAULT_MODEL_STR = "hf-hub:imageomics/bioclip"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This duplicates the value from here:

BIOCLIP_MODEL_STR = "hf-hub:imageomics/bioclip"



class TestParser(unittest.TestCase):

def test_parse_args_lazy_import(self):
"""Test that Rank is only imported when needed"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of testing that certain classes are loaded or not could we test how long it takes to display help?
I think it would be rather easy to accidentally undo the changes here by importing something new in main.

@johnbradley
Copy link
Collaborator

I'm wondering if there is a way to break up __main__.py so that it just contains the create_parser().
Then create a new file commands.py for example with the other content.
The idea is to have only one place we do lazy import.
So main() in __main__.py would do something like

parser = create_parser()
args = parser.parse_args()
import bioclip.commands
bioclip.commands.run(args)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants