# Constrained Dynamic Parser

This page describes the requirements for a generic parser for the Event-B Mathematical Language. It will also draft a first design proposal.

## Requirements

In order to be usable mathematical extensions require the event-b mathematical language syntax to be extendable by the final user.

Thus, the lexical analyser and the syntaxic parser have to be extendable in a simple enough way (from a user point of vue).

### Requirements Exported by the Current Language Design

#### Operator Priority

• operator are defined by group,
• group of operator have a defined precedences,
• there may be precedences defined inside groups.

#### Operator Associativity

• a compatibility table defines allowed associativity inside a group,
• a compatibility tables defines allowed associativity between groups (it allows to forbid a syntaxic construction like $f(x)^{-1}\;$
nota: this requirement was added afterwards with consistency in mind.

### Expected Extension Schemes

We do want to at least define operators of the following form :

• infix : $a + b\;$ or $a \vdash b : c\;$
• prefix : $\neg a\;$
• postfix : $R^*\;$
• closed : $\|a\|\;$
• parentheses sugar : $(a +b) * c\;$
• functional postfix : $a \langle b \rangle\;$
• functional prefix : $\langle b \rangle f\;$
• lambda like : $\lambda x\mapsto y . P | E\;$
• Union like : $\Union\{ e \mid P\}$ or $\Union\{ x,y . P \mid e\}$
• sets : $\{a, b, c + e\}\;$ or $\{ e \mid P\}\;$ or $\{x,y . P \mid e\}\;$

We also like to define precisely precedences and associativity between existing and new operators.

### Requirements exported by the dynamic feature

• the precedence should not be enumerated but defined by a relation, like: '+' < '*' and '*' < '**', ...

## Design Alternatives

### Make Existing Parser Extendable

The existing parser is a LL recursive descent parser generated by the Coco/R Compiler Generator, which makes extensive use of pre-computed lookahead, which makes it very difficult to be transformed in a sufficiently enough generic parser.

### Parser Combinator

This paper is interesting in its proposal of using an acyclic graph to define operator precedence.

### Pratt Parser

Note that a pratt' parser is essentially a parser for expressions (where expression parts are themselves expressions)... But it may be tweaked to limit the accepted sub-expressions or to enforce some non-expression parts inside an expression. More on that hereafter.

## Design Proposal

### Main Concepts

symbol
a symbol is a lexem known by the parser.
group
each symbol belongs to one and only one group.
symbol compatibility
a relation telling if inside a group two symbol are compatibles.
Two symbol are compatibles is they can be parsed without parentheses (see Event-B Mathematical Language for more on compatibility).
group compatibility
a relation telling if two groups are compatibles.
Two symbol of different groups are said compatibles if their group are compatibles.
symbol associativity
a relation which gives the associative precedence of symbols inside a given group.
group associativity
a relation which gives the associative precedence of groups.
Two symbol of different groups have a relative precedence given by the associative precedence of their groups.
generic parser
the parser which is defined.
This parser may be called recursively (an infix expression being build upon a left and right sub-expression) and is often the default parser used when building the AST associated with a symbol.
specific parser
some specific parser, predefined as helper functions to build specific AST.
Some example of specific parser are the one used to parse and build list of identifiers (for quantified predicates), mapplets lists (for lambda), and so one...

### Proposed User Interface

The user will be able to create the following symbols:

infix

### Sample Implementation

A prototype as been developed in Python to quickly try different solutions.

The development tree is available at http://bitbucket.org/matclab/eventb_pratt_parser/