New macro formatting getting clues from whitespace #101
Fix #100
The original macro formatting determined if spaces should be added purely depending on the punctuation and some information carried over from the previous punctuation. This isn't sufficient to determine whether two pieces of punctuation need to be joined or separated.
For instance, given `*` followed by `=`, the previous method tried to make this `*=`, but if the next symbol is `>` it should actually be `* =>`. The provided Rust libraries (and macro processing) aren't powerful enough to determine this, and the Rust syntax is too ambiguous/complex.

The new processing relies on whitespace to determine if two subsequent punctuation tokens must be separated or joined. I.e. if they were originally separated with space, they must stay separated with space; otherwise they are joined. When writing code, people will add spaces if two pieces of punctuation can't be joined, so this is a good alternate signal.
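
To illustrate the idea, here is a minimal sketch using only the standard library. It is a toy model, not the code in this PR: the names `Tok`, `Spaced`, `lex` and `reformat` are hypothetical, and the lexer is a stand-in for the real token stream. Each token records whether whitespace preceded it, and that flag alone decides whether two punctuation tokens are joined.

```rust
/// A token in this toy model: a single punctuation character or a word.
#[derive(Debug)]
enum Tok {
    Punct(char),
    Word(String),
}

/// A token plus whether whitespace preceded it in the original source.
#[derive(Debug)]
struct Spaced {
    tok: Tok,
    space_before: bool,
}

/// Split the input into words and single punctuation characters, recording
/// for each token whether any whitespace came directly before it.
fn lex(src: &str) -> Vec<Spaced> {
    let mut out = Vec::new();
    let mut space_before = false;
    let mut chars = src.chars().peekable();
    while let Some(c) = chars.next() {
        if c.is_whitespace() {
            space_before = true;
        } else if c.is_alphanumeric() || c == '_' {
            let mut word = String::from(c);
            while let Some(&n) = chars.peek() {
                if n.is_alphanumeric() || n == '_' {
                    word.push(n);
                    chars.next();
                } else {
                    break;
                }
            }
            out.push(Spaced { tok: Tok::Word(word), space_before });
            space_before = false;
        } else {
            out.push(Spaced { tok: Tok::Punct(c), space_before });
            space_before = false;
        }
    }
    out
}

/// Re-emit the tokens. Between two punctuation tokens, a single space is
/// written only if the source had whitespace there. Every other adjacent
/// pair gets a single space (a simplification for this sketch).
fn reformat(tokens: &[Spaced]) -> String {
    let mut out = String::new();
    let mut prev: Option<&Tok> = None;
    for t in tokens {
        if let Some(p) = prev {
            let both_punct =
                matches!(p, Tok::Punct(_)) && matches!(t.tok, Tok::Punct(_));
            if !both_punct || t.space_before {
                out.push(' ');
            }
        }
        match &t.tok {
            Tok::Punct(c) => out.push(*c),
            Tok::Word(w) => out.push_str(w),
        }
        prev = Some(&t.tok);
    }
    out
}
```

In this sketch, `reformat(&lex("a   *  =>  b"))` gives `a * => b` and `reformat(&lex("a*=b"))` gives `a *= b`: the amount of whitespace is normalized, but whether there was any between punctuation is preserved.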
I then tried to apply the same logic for common cases: `.` typically pulls (non-punctuation) things together, `'` typically begins a label so no space afterwards, `#` similarly (for `quote!()` macros).

I'm not sure how much this affects existing code, so I want to test it on a few codebases to try to reduce other unnecessary/undesired changes.
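
The common-case rules described above could be expressed as a single decision function. This is only a sketch under assumptions: `Kind` and `space_between` are hypothetical names, `Other` lumps together identifiers, literals and groups, and the real rules in this PR may differ in detail and ordering.

```rust
/// Coarse token classification for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Punct(char),
    Other, // identifiers, literals, groups
}

/// Decide whether to emit a space between `prev` and `next`.
/// `space_in_source` says whether the original text had whitespace there.
fn space_between(prev: Kind, next: Kind, space_in_source: bool) -> bool {
    match (prev, next) {
        // `.` pulls non-punctuation neighbours together, as in `a.b`.
        (Kind::Other, Kind::Punct('.')) | (Kind::Punct('.'), Kind::Other) => false,
        // `'` begins a label and `#` begins a `quote!()` interpolation,
        // so nothing is inserted after either of them.
        (Kind::Punct('\''), _) | (Kind::Punct('#'), _) => false,
        // Two punctuation tokens: follow the whitespace in the source.
        (Kind::Punct(_), Kind::Punct(_)) => space_in_source,
        // Everything else is separated by a single space.
        _ => true,
    }
}
```

The special cases come before the whitespace rule so that they win even when the source happened to contain a space there. Whether that ordering is desirable is exactly the kind of thing testing on real codebases would show.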
As an aside, I think I was originally hung up on determinism and whitespace - a deterministic formatter will produce the same output regardless of the initial formatting. Basically the parsed token stream doesn't directly encode all significant whitespace, and I assumed that data model was complete and no additional whitespace was significant. The presence of whitespace is significant when reformatting raw token streams. Only the type and amount of whitespace is insignificant.
I agree that when the request is merged I assign the copyright of the request to the repository owner.