Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

/* comments */ should not be reformatted #179

Open
toastal opened this issue Apr 2, 2024 · 18 comments
Open

/* comments */ should not be reformatted #179

toastal opened this issue Apr 2, 2024 · 18 comments
Labels
bug Something isn't working

Comments

@toastal
Copy link
Contributor

toastal commented Apr 2, 2024

Description

/* */ style comments should not be reformatted. They have additional use cases such as tree-sitter picking up the the syntax to highlight before scripts inside multiline strings.

Small example input

/* lua */ ''
  print("Hello, world!")
''

Expected output

/* lua */ ''
  print("Hello, world!")
''

(now tree-sitter in my editor highlights this string to Lua which really helps me read/write code in this block now that it’s not just syntax highlighted as a string)

Actual output

# lua
''
  print("Hello, world!")
''
@toastal toastal changed the title /* comments */ should be reformatted /* comments */ should not be reformatted Apr 2, 2024
@toastal
Copy link
Contributor Author

toastal commented Apr 2, 2024

In the wild, you can see this all over Nixpkgs test such as https://github.com/NixOS/nixpkgs/blob/master/nixos/tests/switch-test.nix#L8

@dasJ
Copy link
Member

dasJ commented Apr 9, 2024

Fwiw, nvim-treesitter now supports that: nvim-treesitter/nvim-treesitter#6418 (comment)

@SuperSandro2000
Copy link
Member

They have additional use cases such as tree-sitter picking up the the syntax to highlight before scripts inside multiline strings.

or fenced vim hints.

nvim-treesitter now supports that

That's not released, yet.

@infinisil
Copy link
Member

infinisil commented Apr 16, 2024

We discussed this in the team meeting today:

Conclusion: We generally think that it would be fine to preserve /*-style comments more. While we don't intend to fix this ourselves, anybody is free to PR this. If you do, make sure to update this line in the standard.

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/formatting-team-meeting-2024-04-16/43533/1

@MattSturgeon
Copy link

Looks like this is the relevant rendering code:

instance Pretty Trivium where
pretty EmptyLine = emptyline
pretty (LineComment c) = comment ("#" <> c) <> hardline
pretty (BlockComment isDoc c) =
comment (if isDoc then "/**" else "/*")
<> hardline
-- Indent the comment using offset instead of nest
<> offset 2 (hcat $ map prettyCommentLine c)
<> comment "*/"
<> hardline
where
prettyCommentLine :: Text -> Doc
prettyCommentLine l
| Text.null l = emptyline
| otherwise = comment l <> hardline

And here is the corresponding parsing code:

-- Try to parse /** before /*, but don't parse /**/ (i.e. the empty comment)
isDoc <- try ((True <$ char '*') <* notFollowedBy (char '/')) <|> pure False
chars <- manyTill anySingle $ chunk "*/"
return $ PTBlockComment isDoc $ dropWhile Text.null $ fixIndent pos' $ removeStars pos' $ splitLines $ pack chars
where

Maybe there should be an additional isInline parameter passed to BlockComment, true when chars contains no linebreaks?
I guess it should also check if there is additional content after the end of the comment, but on the same line?
When isInline is true, the rendering code would avoid inserting newlines unless otherwise necessary due to line-length.

Having never touched haskell, I doubt I'm up to the task. Hopefully pointing to the relevant code will inspire someone more knowledgeable to submit a PR.

@piegamesde
Copy link
Member

The issue is less with the parsing, that's a bit fiddly but no more than a chore. The big open question is, what to do in the renderer with such comments when they are followed by significant amounts of code which overflow the line length limit.

The rendering engine currently ignores all line length calculations involving comments, however this strongly depends on the assumption that comment tokens are always strictly followed by a line break.

@MattSturgeon
Copy link

MattSturgeon commented Jun 4, 2024

Thanks for clarifying!

The big open question is, what to do in the renderer with such comments when they are followed by significant amounts of code which overflow the line length limit.

Ideally, inserting a line break before the comment would be enough to not overflow, but that's the best case scenario...

I think it's fair to say, that if someone has written an inline block comment on the same line as a long string (for example), it was intentional. Maybe in that scenario it is just accepted that the line will overflow?
Alternatively, perhaps a '' indented raw string could be used, with the string content starting after a "''\n"" sequence? Though this seems overly complex at first glance.

If the long content to the right of the inline block comment is not a string, then I think it is ok to reformat that as usual, such that it is wrapped over multiple lines.

In summary, yes - you're right; this is more complex that it first appears!

The rendering engine currently ignores all line length calculations involving comments, however this strongly depends on the assumption that comment tokens are always strictly followed by a line break.

That sounds like a bigger issue. Perhaps "inline" comments can be a distinct type from other comments and therefore included in line length calculations?

@piegamesde
Copy link
Member

So one thing that one could do would be to special case these comments in the parser but only if they are followed by '' and then a line break. I think that should cover all reasonable use cases without breaking everything else.

@MattSturgeon
Copy link

So one thing that one could do would be to special case these comments in the parser but only if they are followed by '' and then a line break. I think that should cover all reasonable use cases without breaking everything else.

While it'd be nice to have support for normal strings too, I agree that is a reasonable approach that'd work for most usage.

I do sometimes do things like this in my config.

{
  # This snippet is so short it feels strange to use an indented string:
  someLuaOption = /* lua */ "function() print('hello, world!') end";
}

But some support is better than no support, and it can always be iterated on in the future.

@dasJ dasJ added the bug Something isn't working label Jun 26, 2024
@ian-h-chamberlain
Copy link

Just to propose one additional use case for /* */ comment (non-)formatting: it's possible to toggle a block comment with a single # using something like the following:

{
  /* <-- Toggling a # line comment at the beginning of this line toggles the whole block
    services.foo = {
      enable = true;
      bar = "baz";
    };
  # */
}

For this kind of case I think it would be valuable to avoid splitting the final # */ onto two lines (and maybe also the first line, but that's less important). Otherwise this "quick toggle" results in invalid syntax, e.g.

{
  # /*
    <-- Toggling a # line comment at the beginning of this line toggles the whole block
      services.foo = {
        enable = true;
        bar = "baz";
      };
    #
  */
}

I use this in my configs to quickly toggle on/off large sections of the config. I'm not sure how popular a practice it is in general, but it seems to be moderately common from a cursory search:
https://github.com/search?q=lang%3Anix+%2F%28%23+%5C*%5C%2F%7C%23+%5C%2F%5C*%29%2F&type=code

@MattSturgeon
Copy link

Just to propose one additional use case for /* */ comment (non-)formatting

I don't think this issue is proposing that block comments not be formatted; instead this issue is about allowing short and/or inline /* */ comments.

Given nixfmt is deterministic, it'd probably have to be based on how long the comment content is and whether or not there is trailing code after the comment?

Your issue about block comments being formatted messing up your workflow should probably be tracked in a separate ticket and may be best implemented as a configurable cli flag.

@piegamesde
Copy link
Member

@ian-h-chamberlain commenting out code blocks with /* has the advantage that it is low on diff noise, however it has the big disadvantage that it breaks as soon as that block contains another nested /* comment.

@ian-h-chamberlain
Copy link

I opened #225 to track separately, maybe it would also make sense to update the title of this issue to be more specifically about short/inline /* */ comments just to clarify?
Any more discussion about whether / how to handle this case can continue on that issue, I suppose.

@infinisil
Copy link
Member

Btw here's the code that handles the conversion from /* foo */ to # foo:

nixfmt/src/Nixfmt/Lexer.hs

Lines 131 to 169 in e819b2d

convertTrailing :: [ParseTrivium] -> Maybe TrailingComment
convertTrailing = toMaybe . join . map toText
where
toText (PTLineComment c _) = strip c
toText (PTBlockComment False [c]) = strip c
toText _ = ""
join = Text.unwords . filter (/= "")
toMaybe "" = Nothing
toMaybe c = Just $ TrailingComment c
convertLeading :: [ParseTrivium] -> Trivia
convertLeading =
concatMap
( \case
PTNewlines 1 -> []
PTNewlines _ -> [EmptyLine]
PTLineComment c _ -> [LineComment c]
PTBlockComment _ [] -> []
PTBlockComment False [c] -> [LineComment $ " " <> strip c]
PTBlockComment isDoc cs -> [BlockComment isDoc cs]
)
isTrailing :: ParseTrivium -> Bool
isTrailing (PTLineComment _ _) = True
isTrailing (PTBlockComment False []) = True
isTrailing (PTBlockComment False [_]) = True
isTrailing _ = False
convertTrivia :: [ParseTrivium] -> Pos -> (Maybe TrailingComment, Trivia)
convertTrivia pts nextCol =
let (trailing, leading) = span isTrailing pts
in case (trailing, leading) of
-- Special case: if the trailing comment visually forms a block with the start of the following line,
-- then treat it like part of those comments instead of a distinct trailing comment.
-- This happens especially often after `{` or `[` tokens, where the comment of the first item
-- starts on the same line ase the opening token.
([PTLineComment _ pos], (PTNewlines 1) : (PTLineComment _ pos') : _) | pos == pos' -> (Nothing, convertLeading pts)
([PTLineComment _ pos], [PTNewlines 1]) | pos == nextCol -> (Nothing, convertLeading pts)
_ -> (convertTrailing trailing, convertLeading leading)

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/enforcing-nix-formatting-in-nixpkgs/49506/8

@infinisil
Copy link
Member

For reference, we had a discussion about this on Matrix

@omkumar312
Copy link

I want to work on this issues please assign me this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: Todo
Development

No branches or pull requests

9 participants