You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardexpand all lines: README.md
+45-39
Original file line number
Diff line number
Diff line change
@@ -6,17 +6,57 @@ Proofs written in [Lean4](https://leanprover.github.io/) for the core [katydid](
6
6
7
7
## Goal
8
8
9
-
The goal is to formalize the core [katydid](https://katydid.github.io/) validation algorithm. This algorithm allows us to validate millions of serialized data structures per second on a single core. The algorithm is based on derivatives for regular expressions and extends this to Visibly Pushdown Automata (VPA), by splitting the derivative function into two functions. It also includes several basic optimizations, such as memoization, simplification, laziness, zipping of multiple states, short circuiting, evaluation at compilation, symbolic derivatives and a pull based parser for serialized data structures that allows us to skip over some of the parsing. You can play around with the validation language on its [playground](http://katydid.github.io/play/).
9
+
The goal is to formalize the core [katydid](https://katydid.github.io/) validation algorithm. This algorithm allows us to validate millions of serialized data structures per second on a single core. The algorithm is based on derivatives for regular expressions and extends this to Visibly Pushdown Automata (VPA), by splitting the derivative function into two functions. It also includes several basic optimizations, such as memoization, simplification, laziness, zipping of multiple states, short circuiting, evaluation at compilation, symbolic derivatives and a pull based parser for serialized data structures that allows us to skip over some of the parsing. You can play around with the validation language on its [playground](http://katydid.github.io/play/).
10
10
11
-
## Prerequisites
11
+
## Plan
12
12
13
-
This is required reading for understanding the underlying algorithm, the subject of the proofs in this repo:
13
+
This is just a quick overview of the steps towards our goal.
14
+
15
+
## Symbolic regular expressions
16
+
17
+
Prove theorems about Symbolic regular expressions as a foundation to build upon.
18
+
19
+
-[ ] Prove correctness of derivative algorithm via a commuting diagram.
20
+
-[ ] Prove correctness of derivative algorithm via a Regex type indexed with Language.
21
+
-[ ] Prove decidability of derivative algorithm.
22
+
-[ ] Prove correctness of simplification rules.
23
+
-[ ] Prove correctness of smart constructors.
24
+
25
+
Reuse as much as we can from [our previous work in Coq](https://github.com/katydid/regex-derivatives-coq) and out attempt at [Reproving Agda in Lean](https://github.com/katydid/symbolic-automatic-derivatives)
26
+
27
+
## Symbolic predicates
28
+
29
+
-[ ] Create expression language as described in the post: [Derivatives of Symbolic Automata explained](https://medium.com/@awalterschulze/derivatives-of-symbolic-automata-explained-4673dee6af82)
30
+
-[ ] Prove correctness of simplification rules for `or`, `and`, `false`, etc.
31
+
-[ ] Prove that non-reader functions can be pre-computed before evaluating time
32
+
-[ ] Prove that the optimized comparison method using a hash is comparable (transitive, associative, etc.)
33
+
34
+
## Create the symbolic regular expressions for trees
35
+
36
+
-[ ] Create Language definition for the symbolic tree expressions.
37
+
-[ ] Code Pull-based Parser class in Lean and implement JSON as an example.
38
+
-[ ] Code Katydid algorithm in Lean.
39
+
-[ ] Prove correctness of derivative tree algorithm.
40
+
-[ ] Prove decidablity of derivative tree algorithm.
41
+
-[ ] Prove that the simple tree function and the VPA functions are equivalent and equivalent to the inductive predicate.
42
+
-[ ] Prove correctness of new simplification rules
43
+
-[ ] Prove all optimizations of the katydid algorithm
44
+
45
+
### Contributing
46
+
47
+
Please check the [prerequisites](https://github.com/katydid/proofs#prerequisites) and read the [contributing guidelines](https://github.com/katydid/proofs/blob/master/CONTRIBUTING.md). The contributing guidelines are short and shouldn't be surprising.
48
+
49
+
### Understanding Brzozowski derivatives
50
+
51
+
Contributing to this repo requires an understanding the underlying algorithm that is the subject of the proofs in this repo:
14
52
15
53
-[Derivatives of Regular Expressions explained](https://medium.com/@awalterschulze/how-to-take-the-derivative-of-a-regular-expression-explained-2e7cea15028d)
16
54
-[Derivatives of Context-Free Grammars explained](https://medium.com/@awalterschulze/derivatives-of-context-free-grammars-explained-3f930c5e363b) (only the simplification rules, smart constructors and memoization are important)
17
55
-[Derivatives of Symbolic Automata explained](https://medium.com/@awalterschulze/derivatives-of-symbolic-automata-explained-4673dee6af82)
18
56
19
-
This repo also requires the following background:
57
+
### Understanding LeanProver
58
+
59
+
This repo also requires an understanding of proof assistants, since all the proofs in this repo are done using LeanProver:
20
60
21
61
- Knowledge of dependent types, induction and understanding the difference between a property `True` and a boolean `true`. We recommend reading [The Little Typer](https://mitpress.mit.edu/9780262536431/the-little-typer/) to gain an understanding of the basics.
22
62
- Experience with an Interactive Theorem Prover, like Coq or Lean, including using tactics and Inductive Predicates. If you are unfamiliar with interactive theorem provers you can watch our [talk](https://www.youtube.com/watch?v=-NHWF4ntc1I) for a taste. We recommend reading [Coq in a Hurry](https://cel.archives-ouvertes.fr/file/index/docid/459139/filename/coq-hurry.pdf) as a quick overview and [Coq Art](https://www.labri.fr/perso/casteran/CoqArt/) up to `Chapter 8: Inductive Predicates` for a proper understanding.
@@ -33,42 +73,8 @@ Optionally the following will also be helpful, but this is not required:
33
73
34
74
Questions about Lean4 can be asked on [proofassistants.stackexchange](https://proofassistants.stackexchange.com/) by tagging questions with `lean` and `lean4` or in the [Zulip Chat](https://leanprover.zulipchat.com/).
35
75
36
-
## Development
37
-
38
-
### Pair Programming Proofs
39
-
40
-
We have pair programming sessions between 18:00 - 20:00 UK time, on dates that are announced on [meetup.com](https://www.meetup.com/london-tydd/).
41
-
42
-
- If you would like to watch, this will be streamed on [Twitch](https://www.twitch.tv/awalterschulze).
43
-
- If you would like to do more than watch and join us inside the zoom session, please make sure you have met the [prerequisites](https://github.com/katydid/proofs#prerequisites) and email [Walter](https://github.com/awalterschulze).
44
-
45
-
### Plan
46
-
47
-
This is just a quick overview of the steps towards our goal.
48
-
49
-
- Create Symbolic Predicates
50
-
+ Create expression language as described in the post: [Derivatives of Symbolic Automata explained](https://medium.com/@awalterschulze/derivatives-of-symbolic-automata-explained-4673dee6af82)
51
-
+ Prove simplification rules for `or`, `and`, `false`, etc.
52
-
+ Prove that non-reader functions can be pre-computed before evaluating time
53
-
+ Prove that the optimized comparison method using a hash is comparable (transitive, associative, etc.)
54
-
- Create the symbolic regular expressions for trees
55
-
+ Steal as much as we can from [our previous work in Coq](https://github.com/awalterschulze/regex-reexamined-coq/)
56
-
+ Prove that the induction predicate is decidable.
57
-
+ Create derivative functions for the symbolic tree expressions.
58
-
+ Prove that the simple tree function and the VPA functions are equivalent and equivalent to the inductive predicate.
59
-
+ Prove simplification rules
60
-
+ Prove all optimizations of the katydid algorithm
61
-
+ ...
62
-
63
-
### Contributing
64
-
65
-
Please check the [prerequisites](https://github.com/katydid/proofs#prerequisites) and read the [contributing guidelines](https://github.com/katydid/proofs/blob/master/CONTRIBUTING.md). The contributing guidelines are short and shouldn't be surprising.
66
-
67
76
### Setup
68
77
69
78
- Lean4 has exceptional [instructions for installing Lean4 in VS Code](https://github.com/leanprover/lean4/blob/master/doc/quickstart.md).
70
79
- Remember to also add `lake` (the build system for lean) to your `PATH`. You can do this on mac by adding `export PATH=~/.elan/bin/:${PATH}` to your `~/.zshrc` file
71
-
- Use mathlib's cache to speed up building time by running: `$lake exe cache get`
72
-
73
-
74
-
80
+
- Use mathlib's cache to speed up building time by running: `$ lake exe cache get`
0 commit comments