Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New macro formatting getting clues from whitespace #101

Merged
merged 3 commits into from
Feb 2, 2025

Conversation

andrewbaxter
Copy link
Owner

@andrewbaxter andrewbaxter commented Jan 30, 2025

Fix #100

The original macro formatting determined if spaces should be added purely depending on the punctuation and some information carried over from the previous punctuation. This isn't sufficient to determine whether two pieces of punctuation need to be joined or separated.

For instance * = -- the previous method tried to make this *= but if the next symbol is > it should actually be * =>. The provided rust libraries (and macro processing) aren't powerful enough to determine this and the rust syntax is too ambiguous/complex.

The new processing relies on whitespace to determine if two subsequent punctuation tokens must be separated or joined. I.e. if they were originally separated with space, they must not be separated with space. When writing code, people will add spaces if to pieces of punctuation can't be joined, so this is a good alternate signal.

I tried to apply then the same logic for common cases: . typically pulls (non-punctuation) things together, ` typically begins a label so no space afterwards, # similarly (for quote!() macros).

I'm not sure how much this affects existing code, so I want to test it on a few codebases to try to reduce other unnecessary/undesired changes.

As an aside, I think I was originally hung up on determinism and whitespace - a deterministic formatter will produce the same output regardless of the initial formatting. Basically the parsed token stream doesn't directly encode all significant whitespace, and I assumed that data model was complete and no additional whitespace was significant. The presence of whitespace is significant when reformatting raw token streams. Only the type and amount of whitespace is insignificant.


I agree that when the request is merged I assign the copyright of the request to the repository owner.

@rootnas
Copy link

rootnas commented Jan 30, 2025

That's disappointing that Rust is too ambiguous to make this easy, but I guess that's a consequence of trying to format macros. I don't think the pattern * => exists outside of macro definitions/invocations, so perhaps this behavior can be special-cased for those contexts.

@andrewbaxter
Copy link
Owner Author

Yeah, the basic philosophy regarding macro formatting here is that ideally most macros look like normal rust function calls or class definitions for the most part so we can just parse them as normal rust code and format them that way (e.g. println!). Macro definitions are the ones that break the rules the most so there it's basically best effort... but in a way that shouldn't break anything.

I did some more testing and (along with an issue with the shebang workaround) I made it cause fewer style changes testing in a bunch of codebases I have locally. Since it's mostly my code it probably doesn't test everything, but most codebases had 0 formatting changes so I think this is probably pretty good now...

@rootnas
Copy link

rootnas commented Jan 31, 2025

Macro definitions are the ones that break the rules the most

This is true, but I can also define an invocation syntax completely foreign to normal Rust code.

@andrewbaxter andrewbaxter merged commit c06d730 into master Feb 2, 2025
5 checks passed
@andrewbaxter
Copy link
Owner Author

Released in genemichaels 0.6.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants