Why parsing failed #1808
Code for playground:

    (function jsonGrammarOnlyExample() {
      const { createToken, EmbeddedActionsParser, Lexer } = chevrotain;

      // Token definitions
      const LCurly = createToken({ name: "LCurly", pattern: /{/, label: "{" });
      const RCurly = createToken({ name: "RCurly", pattern: /}/, label: "}" });
      const Comma = createToken({ name: "Comma", pattern: /,/, label: "," });
      const Word = createToken({ name: "Word", pattern: /\w+/ });
      const Space = createToken({ name: "Space", pattern: /\s+/ });

      const jsonTokens = [RCurly, LCurly, Comma, Word, Space];
      const JsonLexer = new Lexer(jsonTokens);

      class JsonParser extends EmbeddedActionsParser {
        constructor() {
          super(jsonTokens);
          const $ = this;

          $.RULE("members", () => {
            $.CONSUME(LCurly);
            // Zero or more members, each followed by a comma
            this.MANY(() => {
              $.OPTION(() => $.CONSUME1(Space));
              $.CONSUME(Word);
              $.OPTION1(() => $.CONSUME2(Space));
              $.CONSUME(Comma);
            });
            // Optional last member without a trailing comma
            $.OPTION2(() => {
              $.OPTION3(() => $.CONSUME3(Space));
              $.CONSUME1(Word);
            });
            $.OPTION4(() => $.CONSUME4(Space));
            $.CONSUME(RCurly);
            return "SUCCESS";
          });

          this.performSelfAnalysis();
        }
      }

      return {
        lexer: JsonLexer,
        parser: JsonParser,
        defaultRule: "members"
      };
    }())

Parsing succeeds for:

But fails for:

It looks like the problem is in OPTION3, but what's wrong?
Replies: 2 comments
You can tell Chevrotain to ignore the whitespace token, which should make your grammar a lot simpler: https://chevrotain.io/docs/tutorial/step1_lexing.html#skipping-tokens
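As a rough, library-free sketch of why this helps: once whitespace never reaches the parser, a comma-separated word list in braces no longer needs any optional `Space` steps. In Chevrotain itself, skipping is configured on the token definition with `group: Lexer.SKIPPED`, as the linked tutorial describes. The `tokenize` and `parseMembers` helpers below are hypothetical stand-ins written for illustration, not Chevrotain APIs.

```javascript
// Hypothetical hand-rolled tokenizer: whitespace is matched but never emitted,
// mimicking what a skipped token group does in a lexer.
function tokenize(text) {
  const tokens = [];
  const pattern = /\s+|\w+|[{},]/y; // sticky: match exactly at lastIndex
  let pos = 0;
  while (pos < text.length) {
    pattern.lastIndex = pos;
    const match = pattern.exec(text);
    if (!match) throw new Error(`Unexpected character at offset ${pos}`);
    if (!/^\s/.test(match[0])) tokens.push(match[0]); // drop whitespace
    pos = pattern.lastIndex;
  }
  return tokens;
}

// members: "{" (Word ("," Word)*)? "}"
// No whitespace handling is needed here, because the tokenizer removed it.
function parseMembers(text) {
  const tokens = tokenize(text);
  let i = 0;
  const isWord = (t) => t !== undefined && /^\w+$/.test(t);
  const expect = (ok, what) => {
    if (!ok) throw new Error(`Expected ${what} at token ${i}`);
    return tokens[i++];
  };

  expect(tokens[i] === "{", "{");
  const words = [];
  if (isWord(tokens[i])) {
    words.push(tokens[i++]);
    while (tokens[i] === ",") {
      i++; // consume the comma
      words.push(expect(isWord(tokens[i]), "Word"));
    }
  }
  expect(tokens[i] === "}", "}");
  if (i !== tokens.length) throw new Error("Unexpected trailing input");
  return words;
}

console.log(parseMembers("{ a , b }")); // [ 'a', 'b' ]
```

The parser only ever sees `{`, `,`, `}` and words, so every spacing variant of the same list goes through the same path.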
Hi @PavelDymkov,

When entering the `MANY` call, Chevrotain will try to identify whether it should actually parse it using its lookahead algorithm. So it basically compares the next tokens in the input with the possible next tokens in the `MANY` call. The issue is that the tokens in the `MANY` call match exactly with the tokens after that. The tokens `{`, `X`, `" "` can be parsed using either:

    this.MANY(() => {
      $.OPTION(() => $.CONSUME1(Space));
      $.CONSUME(Word);
      $.OPTION1(() => $.CONSUME2(Space));
      $.CONSUME(Comma);
    });

or

    $.OPTION2(() => {
      $.OPTION3(() => $.CONSUME3(Space));
      $.CONSUME1(Word);
    });
    $.OPTION4(() => $.CONSUME4(Space));

Because Chevrotain will always take the first match for parsing, it will always parse any string which starts with these tokens using the `MANY` call.

Here's an improved version of this rule:

    $.RULE("members", () => {
      $.CONSUME(LCurly);
      $.OPTION(() => {
        $.OPTION1(() => $.CONSUME1(Space));
        $.CONSUME(Word);
        $.OPTION2(() => $.CONSUME2(Space));
        $.MANY(() => {
          $.CONSUME(Comma);
          $.OPTION3(() => $.CONSUME3(Space));
          $.CONSUME1(Word);
          $.OPTION4(() => $.CONSUME4(Space));
        });
      });
      $.CONSUME(RCurly);
      return "SUCCESS";
    });

Also, please use ignored whitespace tokens, as @NaridaL suggests; it makes grammars a lot simpler 👍
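To make the overlap concrete, here is a toy model in plain JavaScript. It is not Chevrotain's actual lookahead implementation: the paths are written by hand from the original rule, and the helper names (`expand`, `prefixes`, `sharedPrefixes`) as well as the lookahead depth `k = 3` are assumptions for illustration. It lists the token sequences that can start a `MANY` iteration and those that can follow the `MANY`, then reports the length-`k` prefixes they have in common.

```javascript
// Expand a sequence containing optional tokens ("Name?") into every
// concrete token sequence it can produce.
function expand(seq) {
  return seq.reduce((acc, tok) => {
    if (tok.endsWith("?")) {
      const name = tok.slice(0, -1);
      return acc.flatMap((p) => [p, [...p, name]]);
    }
    return acc.map((p) => [...p, tok]);
  }, [[]]);
}

// Distinct prefixes of length k (shorter sequences are kept whole).
function prefixes(seqs, k) {
  return new Set(seqs.map((s) => s.slice(0, k).join(" ")));
}

// One MANY iteration in the original rule: Space? Word Space? Comma
const insideMany = expand(["Space?", "Word", "Space?", "Comma"]);

// What can follow the MANY in the original rule:
//   with OPTION2:    Space? Word Space? RCurly
//   without OPTION2: Space? RCurly
const afterMany = [
  ...expand(["Space?", "Word", "Space?", "RCurly"]),
  ...expand(["Space?", "RCurly"]),
];

// Prefixes that cannot tell "enter MANY" apart from "skip MANY" at depth k.
function sharedPrefixes(k) {
  const inside = prefixes(insideMany, k);
  const after = prefixes(afterMany, k);
  return [...inside].filter((p) => after.has(p));
}

console.log(sharedPrefixes(3)); // [ 'Space Word Space' ]
console.log(sharedPrefixes(4)); // []
```

In this model, looking three tokens ahead cannot separate the two branches when the input continues with `Space Word Space`, so a parser that takes the first match would enter the `MANY` and then fail on the missing `Comma`. Moving the comma to the front of each iteration, as in the improved rule, makes the first token of the loop body decisive.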