-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Jupyter Notebook #264
Conversation
Current dependencies on/for this PR:
This stack of pull requests is managed by Graphite. |
485b941
to
63d0a83
Compare
ruff_lsp/server.py
Outdated
|
||
settings = _get_settings_by_document(document.path) | ||
|
||
if utils.is_stdlib_file(document.path): | ||
if document.is_stdlib_file(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also provide notebook specific code-actions. These are prefixed by namespace notebook.
in code action kind.
63d0a83
to
c01f75c
Compare
c01f75c
to
c12b459
Compare
0621462
to
095a808
Compare
c12b459
to
8505d3a
Compare
095a808
to
b295f14
Compare
8505d3a
to
6ecdd97
Compare
ruff_lsp/server.py
Outdated
notebook_document = LSP_SERVER.workspace.get_notebook_document(cell_uri=uri) | ||
if notebook_document is None: | ||
notebook_document = LSP_SERVER.workspace.get_notebook_document( | ||
notebook_uri=uri | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't use:
LSP_SERVER.workspace.get_notebook_document(notebook_uri=uri, cell_uri=uri)
Because the method uses either notebook_uri
or cell_uri
where notebook_uri
is preferred over cell_uri
. While our implementation will first try the cell_uri
and then the notebook_uri
.
|
||
This works for both Python files and Notebook Documents. For Notebook Documents, | ||
the hover works at the cell level. | ||
""" | ||
document = LSP_SERVER.workspace.get_text_document(params.text_document.uri) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is either a Python file or a Notebook cell.
ruff_lsp/server.py
Outdated
document = Document.from_text_document( | ||
LSP_SERVER.workspace.get_text_document(params.text_document.uri) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we creating the document again here when it's already constructed above at the start of the function?
This function is a generic implementation to create code actions at both source and line level. The document created above can only be used to create source level code actions as it represents either a Python file or an entire Notebook.
Here, we need a document which represents a Python file or an individual cell because we need to get the line level code action. Here, we already have the diagnostics information which is why this works as otherwise we would've been running Ruff on an individual cell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind adding something to this effect as a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've simplified this part such that each branch uses the document through our custom abstraction or the one from pygls
.
def _result_to_edits( | ||
document: workspace.TextDocument, | ||
result: RunResult | None, | ||
) -> list[TextEdit]: | ||
def _result_to_workspace_edit( | ||
document: Document, result: RunResult | None | ||
) -> WorkspaceEdit | None: | ||
"""Converts a run result to a WorkspaceEdit.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Source level code actions should apply the fix for every cell in a Notebook. Earlier the API was to create a single text edit to replace the entire content of a text document with the fixed source, create a workspace edit corresponding the text edit and send it back. This behavior is preserved when the document kind is Python.
But, for a Notebook, we need to create text edits for each cell where the fix exists which is why this was changed.
ruff_lsp/server.py
Outdated
if new_source.endswith("\r\n"): | ||
new_source = new_source[:-2] | ||
elif new_source.endswith("\n"): | ||
new_source = new_source[:-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more of a TODO for me to check if this is actually required now. Ruff adds and removes the newline at the end of every cell but I need to verify that behavior. I'm not sure what kind of example can be used here. I might have to check with a Ruff version when Jupyter Notebook support was not present.
6ecdd97
to
8cbb83d
Compare
8cbb83d
to
e062625
Compare
e062625
to
33a2738
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Would love to get a second look at this because I'm unfamiliar with our Python LSP implementation
cells: list[CodeCell | MarkdownCell] | ||
|
||
|
||
def _create_notebook_json(notebook_document: NotebookDocument) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wow, this feels kind of expensive. Do we run this on every keystroke or only on save? How does it perform for large notebooks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good questions. What about this function feels expensive? I guess the json.dumps
part?
Do we run this on every keystroke or only on save?
It depends on user settings but it could run on every keystroke when "run": "onType"
.
How does it perform for large notebooks?
I'll test it out.
We could potentially create our own layer of notebook synchronization but it feels not worth the effort if we're going to move the LSP to Rust.
Context: openlawlibrary/pygls#394
ruff_lsp/server.py
Outdated
# Publish diagnostics for every code cell, replacing the previous diagnostics. | ||
for cell_idx, cell in enumerate(notebook_document.cells): | ||
if cell.kind is not NotebookCellKind.Code: | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth adding this explanation to the inline comment
ruff_lsp/server.py
Outdated
|
||
def is_notebook_file(self) -> bool: | ||
"""Return True if the document belongs to a Notebook or a cell in a Notebook.""" | ||
return self.kind is DocumentKind.Notebook or self.path.endswith(".ipynb") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, so the latter case is a cell? We use DocumentKind.Text
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, although I'm thinking of adding it's own variant (#264 (comment))
ruff_lsp/server.py
Outdated
document = Document.from_text_document( | ||
LSP_SERVER.workspace.get_text_document(params.text_document.uri) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind adding something to this effect as a comment?
document = Document.from_uri(uri) | ||
workspace_edit = await _fix_document_impl(document) | ||
if workspace_edit is None: | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a behavior change? Like does this action now not show up in some cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a behavior change?
No, it's not.
Like does this action now not show up in some cases?
No, this action is registered and always shown on the client side using the VSCode extension. When invoked through VSCode, the code executes this command registered in the LSP.
@@ -1192,13 +1539,9 @@ async def _run_check_on_document( | |||
) | |||
|
|||
|
|||
async def _run_format_on_document(document: workspace.TextDocument) -> RunResult | None: | |||
async def _run_format_on_document(document: Document) -> RunResult | None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, so this works interchangeably with Python files and notebooks now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the Document
object creates an abstraction to store all the information required in the format (specifically the source) it is required.
workspace_edit = _result_to_workspace_edit(document, results) | ||
if workspace_edit is None: | ||
return | ||
LSP_SERVER.apply_edit(workspace_edit, "Ruff: Format document") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you walk me through what happens if we receive a notebook cell here? How does that interact with _run_format_on_document
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The applyFormat
command works at the Notebook level.
For a Notebook the arguments
is going to contain a single entry which will be a TextDocument
for the current cell (active cell or the cell containing the cursor). We'll construct a Document
which represents either a Python file or a Notebook.
After the formatting is done, the _result_to_workspace_edit
will generate a single TextDocumentEdit
for the Python file or a list of TextDocumentEdit
for each cell in a Notebook. These are then a part of WorkspaceEdit
which are then applied.
Thanks for the reviews! I just realized while playing around with the There are 2 ways to go about solving this:
(Sorry for the churn, I must've missed this in my earlier testing and I wasn't aware of the |
33a2738
to
8e0b58a
Compare
ruff_lsp/server.py
Outdated
def is_notebook_cell(self) -> bool: | ||
"""Return True if the document belongs to a cell in a Notebook.""" | ||
return self.kind is DocumentKind.Text and self.path.endswith(".ipynb") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is similar to is_notebook_file
. We represent a cell as Text
because that's what pygls
does as well using TextDocument
.
ruff_lsp/server.py
Outdated
workspace_edit = await _fix_document_impl( | ||
Document.from_uri(params.text_document.uri), only="I001" | ||
) | ||
if workspace_edit: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Document
creation is done only when required as it's eager to read the source text or construct the JSON notebook format.
workspace_edit = _result_to_workspace_edit(document, results) | ||
if workspace_edit is None: | ||
return | ||
LSP_SERVER.apply_edit(workspace_edit, "Ruff: Format document") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The applyFormat
command works at the Notebook level.
For a Notebook the arguments
is going to contain a single entry which will be a TextDocument
for the current cell (active cell or the cell containing the cursor). We'll construct a Document
which represents either a Python file or a Notebook.
After the formatting is done, the _result_to_workspace_edit
will generate a single TextDocumentEdit
for the Python file or a list of TextDocumentEdit
for each cell in a Notebook. These are then a part of WorkspaceEdit
which are then applied.
ruff_lsp/server.py
Outdated
class DocumentSource(NamedTuple): | ||
"""The source of a document.""" | ||
|
||
path: str | ||
"""The path to the document.""" | ||
|
||
text: str | ||
"""The text of the document.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so that we can pass a different source to Ruff than the one in the document. For example, in Format Cell
document, we'll pass the notebook representation containing a single cell and then compare against the original source. Refer to format_document
function.
DemoFormatting
notebook_formatting.movCode Actions
notebook_code_actions.movDiagnostics
diagnostics_on_type.mov
diagnostics_on_save.movHovernotebook_hover.mov |
8e0b58a
to
56a8a5a
Compare
(@dhruvmanila - FYI looks like a CI failure here.) |
56a8a5a
to
2b4d6ab
Compare
Sorry, it's fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the great work, and the thorough testing -- gives me a lot of confidence.
LGTM |
09587d4
to
244a246
Compare
Cell = enum.auto() | ||
"""A cell in a Notebook Document.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to introduce the cell specific document kind now that they're being used in multiple places.
@classmethod | ||
def from_cell_or_text_uri(cls, uri: str) -> Self: | ||
"""Create a `Document` representing either a Python file or a Notebook cell from | ||
the given URI. | ||
|
||
The function will try to get the Notebook cell first, and if there's no cell | ||
with the given URI, it will fallback to the text document. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if there's any better name for this.
workspace_edit = await _fix_document_impl( | ||
Document.from_cell_or_text_uri(params.text_document.uri), | ||
only="I001", | ||
) | ||
if workspace_edit: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code actions now work at cell level for a Notebook. This is consistent with everything else in the VS Code ecosystem and a user can use builtin commands (Ruff:
) to run an action on the entire notebook.
def _result_single_cell_notebook_to_edits( | ||
document: Document, result: RunResult | ||
) -> list[TextEdit] | None: | ||
"""Converts a run result to a list of TextEdits. | ||
|
||
if new_source == document.source: | ||
The result is expected to be a single cell Notebook Document. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function exists because for formatting provider, we need to return a list of TextEdit
while for the command we need to apply a WorkspaceEdit
.
(@charliermarsh sorry for asking a review after approving, this is the final commit :)) I've pushed it as a separate commit to facilitate the review process: 244a246 Updates since the last reviewThe major update is that now the code actions work at cell level. Earlier, they were applied at the Notebook level. My thinking before was that source level code actions should be applied on the entire Notebook but it doesn't work well with VS Code settings, specifically the following: {
// Enable formatting the entire Notebook on save
"notebook.formatOnSave.enabled": true,
// Run the enabled code actions on the entire Notebook on save
"notebook.codeActionsOnSave": {
"source.fixAll": true,
"source.organizeImports.ruff": true
},
} This also means that the generic |
I'll review and approve this today, but before doing so I'll also check it out and poke around locally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent, tested this out myself too.
Summary
This PR adds support for Jupyter Notebook. It requires client support for LSP 3.17 which contains the Notebook support.
Implementation
Context
Document
: LSP type representing a text file (Python file for Ruff).TextDocument
:pygls
representation of the LSPDocument
. This is an abstraction created from aDocument
which provides some useful methods like getting the file path, source code, etc.NotebookDocument
type was added to represent a Notebook which consists of a list of cells (NotebookCell
). Note that these are all LSP types coming fromlsprotocol
.pygls
, a Notebook cell is represented as a text document (TextDocument
).There are methods provided by
pygls
to get the object:get_text_document
- Returns aTextDocument
which either represents a Python file or a Notebook cellget_notebook_document
- Returns aNotebookDocument
either using the Notebook URI or a cell URI. For cell URI, it returns theNotebookDocument
containing the cell.Document
A new
Document
type was created to facilitate the implementation. This represents either a Python file, a Notebook or a Notebook cell. There are various constructor methods which should be used to create this type:from_uri
orfrom_text_document
.from_uri
orfrom_notebook_document
.from_cell_or_text_uri
orfrom_notebook_cell
.Notebook JSON
Ruff expects the source content of a Notebook file to be in JSON format following the [Notebook format specification] but the protocol uses it's own abstraction and doesn't store the JSON format. This means that we need to create a JSON string representing the Notebook from the available information. This doesn't need all the information as Ruff only uses the cell source and version information. So, we create a minimal JSON string representing the Notebook document and pass it to Ruff.
An example JSON string representing a Notebook Document:
We need to pass in every cell including the markdown cell to get an accurate information like the cell number.
For the cell document kind, the source value is a JSON string containing just a single code cell. This is required as code actions and formatting work at both cell and notebook level.
Configuration
For VSCode users, the
notebook.*
configuration is used to run the formatter or code actions on save:The way the above settings work in VSCode is that the editor runs the actions in parallel for every cell. This has the illusion that it was run on the entire Notebook. The commands defined by us (
Ruff: Organize imports
andRuff: Fix all auto-fixable problems
) are run on the entire Notebook at once. This is important because in the latter case theruff
command is invokedn
number of times wheren
is the number of cells while for the former it's run only once.Commands
Builtin
Ruff: Organize Imports
: Works at Notebook levelRuff: Fix all auto-fixable problems
: Works at Notebook levelVSCode specifc
Format Cell
: Formats the current cellNotebook: Format Notebook
: Formats the entire Notebook by running the formatter for every cellOrganize Imports
: Runs thesource.organizeImports
code action on every cell in parallelFix All
: Runs thesource.fixAll
code action on every cell in parallelFeature checklist
ruff.applyAutofix
ruff.applyOrganizeImports
ruff.applyFormat
Test Plan
Manually testing for all the features mentioned above.
How to run this locally?
git checkout dhruv/notebook
in theruff-lsp
repositoryruff-vscode
, uninstallruff-lsp
(pip uninstall --yes ruff-lsp
) as we'd want to use the local version. To install the localruff-lsp
version inruff-vscode
, follow Modifying the LSP.ruff-vscode
directory -> "Run and Debug" section from the sidebar -> "Debug Extension and Python" config.This will then open a VSCode development session which can be used to test out the notebook features.
Test notebooks:
Requires: astral-sh/ruff#7664 which was released in
v0.1.0
fixes: #267
closes: astral-sh/ruff-vscode#256
closes: astral-sh/ruff-vscode#314
closes: astral-sh/ruff-vscode#51