Repeat Parser Method n times #1876

vedant-g · 2022-10-31T06:36:40Z

vedant-g
Oct 31, 2022

Is there a way to specify how many times a parser method can be repeated in a production rule?

For example, below is the snippet for a select clause from the tutorials:

$.RULE("selectClause", () => {
  $.CONSUME(Select);
  $.AT_LEAST_ONE_SEP({
    SEP: Comma,
    DEF: () => {
      $.CONSUME(Identifier);
    },
  });
});

Here there must be at least one identifier, but I can't specify an upper limit. If I want to have exactly 5 identifiers separated by commas, how can I specify that using the AT_LEAST_ONE_SEP method?

Answered by msujew

msujew · 2022-10-31T11:59:40Z

msujew
Oct 31, 2022
Collaborator

Hey @vedant-g,

effectively, Chevrotain uses a grammar notation that is quite similar to the well-known EBNF. The MANY and AT_LEAST_ONE functions map to * and + in EBNF notation, respectively. However, there's no way to specify how many times a repetition should be parsed, neither in EBNF nor in Chevrotain. You can only specify that parsing should stop, using GATE properties.

If I want to have exactly 5 identifiers separated by commas, how can I specify that using the AT_LEAST_ONE_SEP method?

On another note, I would advise you not to go down that route. In general, grammars should be kept flexible, maybe even allowing relatively unreasonable constructs to be parsed. A post-processing phase should then validate the input, assert that domain/semantic rules are correctly followed, and inform the user of any errors. In your case, the parser would simply return a mismatched token exception, which can be hard to understand for a user unfamiliar with the grammar, while a custom error message can help resolve the issue.
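A minimal sketch of what such a post-processing check could look like for the select clause. It does not call Chevrotain; `sampleCst` is a hand-built stand-in that mimics the shape of a CST node (a `name` plus a `children` dictionary keyed by token name). The function name `validateSelectClause`, the error object shape, and the limit of 5 are assumptions for illustration only.

```javascript
// Assumption: the limit of 5 comes from the question, not from any library.
const EXPECTED_IDENTIFIER_COUNT = 5;

// Hypothetical validator: inspects a CST-like node after parsing has succeeded
// and returns a list of human-readable errors instead of throwing.
function validateSelectClause(cst, expected = EXPECTED_IDENTIFIER_COUNT) {
  const identifiers = cst.children.Identifier || [];
  const errors = [];
  if (identifiers.length !== expected) {
    const last = identifiers[identifiers.length - 1];
    errors.push({
      message: `SELECT expects exactly ${expected} columns but found ${identifiers.length}.`,
      // Report the position of the last identifier, if there is one.
      line: last ? last.startLine : undefined,
    });
  }
  return errors;
}

// Hand-built stand-in for the CST of "SELECT a, b, c".
const sampleCst = {
  name: "selectClause",
  children: {
    Select: [{ image: "SELECT", startLine: 1 }],
    Identifier: ["a", "b", "c"].map((image) => ({ image, startLine: 1 })),
    Comma: [
      { image: ",", startLine: 1 },
      { image: ",", startLine: 1 },
    ],
  },
};

const selectErrors = validateSelectClause(sampleCst);
console.log(selectErrors);
```

The grammar stays as permissive as AT_LEAST_ONE_SEP makes it, and the count rule lives in one place where the error wording can be tailored to the user.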

bd82 · 2022-11-02T16:19:18Z

bd82
Nov 2, 2022
Maintainer

Hello @vedant-g

I suspect you can implement such a helper method yourself.
This sounds like a use case for user defined macros:

User Defined Macros #1004

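// Macro sketch: must run with the parser instance as `this`
// (e.g. defined as a method on the parser class).
function EXACTLY_N_TIMES(n, subrule) {
    // Each call gets a distinct index (1..n), as repeated calls in one rule require.
    for (let i = 1; i <= n; i++) {
        this.subrule(i, subrule)
    }
}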

But as @msujew mentioned, why would you want to do this?
If your grammar only allows something to repeat (at most) X times, won't it be simpler to evaluate the "correctness" in a post-parsing phase?

Imagine checking that the number of arguments passed to a function matches the number of parameters defined for it...

I personally prefer to keep grammars as simple as possible and defer other (non-syntactic) concerns to later stages in the pipeline.

vedant-g · 2022-11-03T19:34:42Z

vedant-g
Nov 3, 2022
Author

Thanks @msujew and @bd82 for your input on this. I think it makes sense to keep the grammar rules simple and validate the result in a post-parsing stage. Is there an example of this that I can refer to?

But as @msujew mentioned, why would you want to do this?

I am trying to add a few functions to the calculator example, like

  A(x)
  B(x,y)

But as you mentioned the number of arguments passed to a function should match the number of parameters defined for it.

In the parser I wrote different rules like below

$.RULE('binaryFunction', () => {
  $.CONSUME(B)
  $.CONSUME(LParen)
  $.SUBRULE($.expression, { LABEL: 'a' })
  $.CONSUME(Comma)
  $.SUBRULE1($.expression, { LABEL: 'b' })
  $.CONSUME(RParen)
})

$.RULE('unaryFunction', () => {
  $.CONSUME(A)
  $.CONSUME(LParen)
  $.SUBRULE($.expression, { LABEL: 'a' })
  $.CONSUME(RParen)
})

If I keep adding rules like this, one for each number of arguments, a lot of code will be repeated. So I was wondering whether I can use the AT_LEAST_ONE_SEP method for the parameters part and specify how many times it should be repeated.

I would like to know your thoughts on this.

Thanks in advance!

bd82 · 2022-11-07T09:48:37Z

bd82
Nov 7, 2022
Maintainer

I would like to know your thoughts on this.

Hi @vedant-g

In the general case (parsing a complete programming language) the grammar should be more generic.
But your case is much simpler (calculator), where the types/names of the functions are known in advance.

You could consider a somewhat hybrid approach:

- One rule for the pre-known function names (a big OR).
- One generic rule for function arguments (0-N).
- Combine these two rules to handle all function calls.
- Add a stage that builds an AST from the Chevrotain CST.
- Perform semantic validation (e.g., wrong number of arguments) on the AST.
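The last two stages above could be sketched roughly as follows. This is an illustration under stated assumptions, not Chevrotain API: `callCst` is a hand-built object shaped like a CST node, the `functionName` and `expression` keys are hypothetical labels that the generic function-call rule would have to produce, and `FUNCTION_ARITY`, `toFunctionCallAst` and `validateFunctionCall` are made-up names.

```javascript
// Hypothetical arity table for the calculator functions from the question.
const FUNCTION_ARITY = new Map([
  ["A", 1],
  ["B", 2],
]);

// Build a small AST node from a CST-like node.
// Assumes the generic rule stored the name token under `functionName`
// and every argument sub-node under `expression`.
function toFunctionCallAst(cstNode) {
  const nameToken = cstNode.children.functionName[0];
  const args = cstNode.children.expression || [];
  return { type: "FunctionCall", name: nameToken.image, args };
}

// Semantic validation on the AST: compare argument count with the table.
function validateFunctionCall(ast) {
  if (!FUNCTION_ARITY.has(ast.name)) {
    return [`Unknown function "${ast.name}".`];
  }
  const expected = FUNCTION_ARITY.get(ast.name);
  if (ast.args.length !== expected) {
    return [
      `Function ${ast.name} expects ${expected} argument(s) but received ${ast.args.length}.`,
    ];
  }
  return [];
}

// Hand-built stand-in for the CST of "B(1)": one argument where two are required.
const callCst = {
  name: "functionCall",
  children: {
    functionName: [{ image: "B" }],
    expression: [{ name: "expression", children: {} }],
  },
};

const callErrors = validateFunctionCall(toFunctionCallAst(callCst));
console.log(callErrors);
```

With this split, adding a function with a new argument count means adding one entry to the table rather than writing another grammar rule.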

WDYT?

1 reply

vedant-g Nov 11, 2022
Author

Thank you very much @bd82. Now I understand what you meant by validation in the post-parsing phase: keep the grammar simple and validate the AST instead. I will try this out.
