Constrained Dynamic Parser
This page describes the requirements for a generic parser for the Event-B Mathematical Language. It will also draft a first design proposal.
Requirements
In order to be usable mathematical extensions require the event-b mathematical language syntax to be extendable by the final user.
Thus, the lexical analyser and the syntaxic parser have to be extendable in a simple enough way (from a user point of vue).
Requirements Exported by the Current Language Design
Operator Priority
- operator are defined by group,
- group of operator have a defined precedences,
- there may be precedences defined inside groups.
Operator Associativity
- a compatibility table defines allowed associativity inside a group,
- a compatibility tables defines allowed associativity between groups (it allows to forbid a syntaxic construction like
- nota: this requirement was added afterwards with consistency in mind.
Expected Extension Schemes
We do want to at least define operators of the following form :
- infix : or
- prefix :
- postfix :
- closed :
- parentheses sugar :
- functional postfix :
- functional prefix :
- lambda like :
- Union like : or
- sets : or or
We also like to define precisely precedences and associativity between existing and new operators.
Requirements exported by the dynamic feature
- the precedence should not be enumerated but defined by a relation, like: '+' < '*' and '*' < '**', ...
Limitations
Design Alternatives
Make Existing Parser Extendable
The existing parser is a LL recursive descent parser generated by the Coco/R Compiler Generator, which makes extensive use of pre-computed lookahead, which makes it very difficult to be transformed in a sufficiently enough generic parser.
Parser Combinator
- on wikipedia. Some implementations are referenced [here http://pdos.csail.mit.edu/~baford/packrat/].
- with functionnal language Agda[1] : Parsing Mixfix Operators
- This paper is interesting in its proposal of using an acyclic graph to define operator precedence.
- with functional language OPAL : How To Obtain Powerful Parsers That Are Elegant and Practical
Pratt Parser
Some Existing Extendable Languages
- Katahdin, and the associated paper A Programming Language Where the Syntax and Semantics Are Mutable at Runtime which describes the language and its implementation (in C#).
- XMF whose implementation seems not be describes in details.
- XLR: Extensible Language and Runtime, which looks like a pratt parser according to the sources.
- whose implementation seems not be describes in details (but sources availables).
Design Proposal
Main Concepts
- symbol
- a symbol is a lexem known by the parser.
- group
- each symbol belongs to one and only one group.
- symbol compatibility
- a relation telling if inside a group two symbol are compatibles.
- Two symbol are compatibles is they can be parsed without parentheses.
- group compatibility
- a graph telling which groups are compatible.
- group associativity
- a graph
- symbol associativity
Proposed User Interface
The user will be able to create
Core Algorithm
Sample Implementation
A prototype as been developed in Python to quickly try different solutions.
The development tree is available at http://bitbucket.org/matclab/eventb_pratt_parser/