Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a more helpful diagnostic on "shebang lookalikes" (and soon "frontmatter lookalikes" too) #137249

Open
fmease opened this issue Feb 19, 2025 · 8 comments · May be fixed by #137619
Open
Assignees
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-parser Area: The parsing of Rust source code to an AST D-terse Diagnostics: An error or lint that doesn't give enough information about the problem at hand. P-low Low priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@fmease
Copy link
Member

fmease commented Feb 19, 2025

For the token sequence #! to start a shebang which would get stripped, it must be located at the very start of the file (modulo Unicode BOM) and not be followed by (whitespace | line_comment | block_comment)* [. If that's not given, the lexer won't remove it for the parser to digest instead. Meaning if the #! doesn't start an inner attribute, the parser will emit a relatively opaque / terse error of the form expected `[`, found $ACTUAL.

The most common trigger would be leading whitespace (incl. leading newlines). Example:

 #!/usr/bin/env -S cargo +nightly -Zscript
error: expected `[`, found `/`
 --> src/lib.rs:1:4
  |
1 |  #!/usr/bin/env -S cargo +nightly -Zscript
  |    ^ expected `[`

Ideally the parser would helpfully point out that the #! doesn't start a shebang here because $reasons.

I suspect this to get hit increasingly more often with the advent of RFC3424 Cargo scripts but we're generally still in uncommon and P-low territory I'd say.


This diagnostic was originally encountered by @epage in the context of doctests (source). Example reproducer:

//! ```
//! #!/usr/bin/env -S cargo +nightly -Zscript
//! fn main() {}
//! ```

rustdoc file.rs --test currently yields (abbreviated):

---- file.rs - (line 1) stdout ----
error: expected `[`, found `/`
 --> file.rs:2:3
  |
2 | #!/usr/bin/env rustc
  |   ^ expected `[`

error: aborting due to 1 previous error

Couldn't compile the test.

NB: Whether rustdoc should interpret the #! in #[doc = "#!..."] (notice: no leading space!) or equivalently /// #!... as the start of a shebang (right now it doesn't) assuming the doc attr/comment is sandwiched in between /// ``` of course is a separate question for T-rustdoc to decide. #! will likely never start a shebang inside doctests and that's good, too.


Once rustc supports frontmatter (#136889) which will get treated very similarly, we can think about improving diagnostics for their "lookalikes", too (i.e., detecting "stray" --- (etc.) in the parser).

@fmease fmease added A-diagnostics Area: Messages for errors, warnings, and lints A-parser Area: The parsing of Rust source code to an AST D-terse Diagnostics: An error or lint that doesn't give enough information about the problem at hand. P-low Low priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 19, 2025
@fmease fmease changed the title Provide a more helpful diagnostic on "shebang lookalikes" (and "frontmatter lookalikes" soon too ) Provide a more helpful diagnostic on "shebang lookalikes" (and soon "frontmatter lookalikes" too) Feb 19, 2025
@theemathas

This comment has been minimized.

@fmease
Copy link
Member Author

fmease commented Feb 19, 2025

cc #134810

I'm not sure how that's related apart from the word "exclamation mark"? #! (Pound, Bang) falls into a different lexical / syntactic category and is much simpler to robustly detect and deal with by several orders of magnitude.

@epage
Copy link
Contributor

epage commented Feb 19, 2025

Overall, I feel like the lookalikes are most likely to happen in rustdoc that I suspect focusing on that can offer better results if its done before the conversion to test code. Frontmatters in rustdoc is being discussed on zulip

@fmease
Copy link
Member Author

fmease commented Feb 19, 2025

I'm not sure I completely agree but I do see your point. As long as the checks are robust enough I'd be okay with that as well. E.g., we shouldn't start rejecting the following code:

//! ```
//! #!//usr/bin/env -S cargo -Zscript
//! /* Intro /*!*/ */
//! [deny(unused_must_use)]
//! fn main() {}
//! ```

There's no shebang here.

Just note that handling this inside rustc would get us two for one.

@Pyr0de
Copy link
Contributor

Pyr0de commented Feb 22, 2025

I'd like to give try solving this issue.
Should I replace the current error message, produced here with another error message? and what should the error message be?

@rustbot claim

@fmease
Copy link
Member Author

fmease commented Feb 23, 2025

I think I would keep the message and simply attach a (subdiagnostic) note because it wouldn't be super high-confidence (since it likely admits several false positives (which is okay)). Thanks for taking a shot at it :)

@Pyr0de
Copy link
Contributor

Pyr0de commented Feb 24, 2025

If strip_shebang returns None here, we could try to strip the leading whitespaces(and maybe comments?) and check with strip_shebang again to know if there is a shebang with leading whitespaces to then emit a warning

but this would not give a warning for a shebang which is written after some valid syntax like

#![allow(unused_mut)]
#!/usr/bin/env -S cargo +nightly -Zscript

Is it needed to check for shebang throughout the file or just at the start?

@epage
Copy link
Contributor

epage commented Feb 24, 2025

@Pyr0de , If this is being used to help with errors in doctests, then a shebang can appear anywhere rustdoc would be injecting it.

@fmease I overlooked this previously:

I'm not sure I completely agree but I do see your point. As long as the checks are robust enough I'd be okay with that as well. E.g., we shouldn't start rejecting the following code:

//! //! #!//usr/bin/env -S cargo -Zscript //! /* Intro /*!*/ */ //! [deny(unused_must_use)] //! fn main() {} //!

There's no shebang here.

Just note that handling this inside rustc would get us two for one.

rustc's strip_shebang will not strip the shebang in that case because it looks for #! followed by a [, allowing for whitespace and comments in between.

@Pyr0de Pyr0de linked a pull request Feb 25, 2025 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-parser Area: The parsing of Rust source code to an AST D-terse Diagnostics: An error or lint that doesn't give enough information about the problem at hand. P-low Low priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants