Constrained Dynamic Parser

From Event-B
Revision as of 21:01, 30 January 2010 by imported>Mathieu (→‎Design Proposal)

This page describes the requirements for a generic parser for the Event-B Mathematical Language. It also drafts a first design proposal.

Requirements

To be usable, mathematical extensions require the syntax of the Event-B mathematical language to be extensible by the end user.

Thus, the lexical analyser and the syntactic parser have to be extensible in a way that is simple enough from a user's point of view.

Requirements Exported by the Current Language Design

Operator Priority

  • operators are defined by group,
  • groups of operators have defined precedences,
  • there may be precedences defined inside groups.

Operator Associativity

  • a compatibility table defines the allowed associativity inside a group,
  • a compatibility table defines the allowed associativity between groups (this makes it possible to forbid a syntactic construction such as f(x)^{-1}).
Note: this requirement was added afterwards with consistency in mind.

Expected Extension Schemes

We want to be able to define at least operators of the following forms:

  • infix: a + b or a \vdash b : c
  • prefix: \neg a
  • postfix: R^*
  • closed: \|a\|
  • parentheses sugar: (a + b) * c
  • functional postfix: a \langle b \rangle
  • functional prefix: \langle b \rangle f
  • lambda like: \lambda x\mapsto y . P | E
  • union like: \Union\{ e \mid P\} or \Union\{ x,y . P \mid e\}
  • sets: \{a, b, c + e\} or \{ e \mid P\} or \{x,y . P \mid e\}

We also want to be able to define precisely the precedence and associativity between existing and new operators.
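As an illustration of how several of these forms (infix, prefix, postfix, closed and parentheses sugar) could be registered at run time, here is a minimal Pratt-style sketch in Python. It is not the prototype's code: the class name ExtensibleParser, its methods, the symbols and the numeric binding powers are all assumptions made for the example, and numeric powers are used only for brevity.

```python
import re


class ExtensibleParser:
    """Pratt-style parser whose operator tables can grow at run time (sketch)."""

    def __init__(self):
        self.prefix = {}   # symbol -> binding power applied to its operand
        self.infix = {}    # symbol -> (left power, right power)
        self.postfix = {}  # symbol -> left power
        self.closed = {}   # opening symbol -> (closing symbol, transparent?)

    def add_prefix(self, sym, power):
        self.prefix[sym] = power

    def add_infix(self, sym, power, right_assoc=False):
        # A left-associative operator demands a strictly tighter right operand.
        self.infix[sym] = (power, power if right_assoc else power + 1)

    def add_postfix(self, sym, power):
        self.postfix[sym] = power

    def add_closed(self, opening, closing, transparent=False):
        # transparent=True gives "parentheses sugar": no tree node is built.
        self.closed[opening] = (closing, transparent)

    def tokenize(self, text):
        symbols = set(self.prefix) | set(self.infix) | set(self.postfix)
        for opening, (closing, _) in self.closed.items():
            symbols |= {opening, closing}
        # Longest symbols first, so that '**' is preferred over '*'.
        ordered = sorted(symbols, key=len, reverse=True)
        pattern = "|".join([r"\w+"] + [re.escape(s) for s in ordered])
        # Whitespace and unknown characters are skipped here; a real lexer
        # should report unknown characters as errors.
        return re.findall(pattern, text)

    def parse(self, text):
        self.tokens, self.pos = self.tokenize(text), 0
        tree = self.expression(0)
        if self.pos != len(self.tokens):
            raise SyntaxError(f"unexpected token {self.tokens[self.pos]!r}")
        return tree

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def expression(self, min_power):
        tok = self.advance()
        if tok in self.closed:
            closing, transparent = self.closed[tok]
            inner = self.expression(0)
            if self.advance() != closing:
                raise SyntaxError(f"expected {closing!r}")
            left = inner if transparent else (tok + closing, inner)
        elif tok in self.prefix:
            left = (tok, self.expression(self.prefix[tok]))
        elif re.fullmatch(r"\w+", tok):
            left = tok  # identifier or literal
        else:
            raise SyntaxError(f"unexpected token {tok!r}")
        # Absorb postfix and infix operators that bind at least as tightly
        # as the context requires.
        while True:
            tok = self.peek()
            if tok in self.postfix and self.postfix[tok] >= min_power:
                self.advance()
                left = (tok, left)
            elif tok in self.infix and self.infix[tok][0] >= min_power:
                self.advance()
                left = (tok, left, self.expression(self.infix[tok][1]))
            else:
                return left


# Hypothetical operator set, chosen only for the example.
parser = ExtensibleParser()
parser.add_infix("+", 10)
parser.add_infix("*", 20)
parser.add_prefix("¬", 40)
parser.add_postfix("~", 50)
parser.add_closed("(", ")", transparent=True)
```

With these declarations, parser.parse("(a + b) * c") returns ('*', ('+', 'a', 'b'), 'c'). A new operator can be added afterwards, for instance parser.add_infix("**", 30, right_assoc=True), without regenerating the parser. The binder forms (lambda like, union like, set comprehension) are not covered by this sketch.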

Requirements Exported by the Dynamic Feature

  • precedence should not be given as enumerated levels but defined by a relation, for example: '+' < '*' and '*' < '**', ...
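One possible reading of this requirement is to store only the declared pairs and to derive every other comparison by transitivity. The sketch below assumes that reading; the class PrecedenceRelation and its method names are hypothetical, and rejecting cycles is a design choice of the example rather than something stated on this page.

```python
from collections import defaultdict


class PrecedenceRelation:
    """Precedence stored as declared pairs 'low < high' (illustrative sketch)."""

    def __init__(self):
        # symbol -> set of symbols directly declared to bind tighter
        self._higher = defaultdict(set)

    def declare_lower(self, low, high):
        """Record 'low < high', refusing declarations that would form a cycle."""
        if low == high or self.is_lower(high, low):
            raise ValueError(f"declaring {low!r} < {high!r} would create a cycle")
        self._higher[low].add(high)

    def is_lower(self, low, high):
        """True if 'low < high' follows from the declarations by transitivity."""
        seen, stack = set(), [low]
        while stack:
            current = stack.pop()
            for nxt in self._higher.get(current, ()):
                if nxt == high:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False


# The two declarations used as an example in the requirement above.
relation = PrecedenceRelation()
relation.declare_lower("+", "*")
relation.declare_lower("*", "**")
```

Here relation.is_lower("+", "**") is True although that pair was never declared, while two operators with no declared path between them are simply not comparable. A new operator can then be positioned relative to a few existing ones without renumbering any levels.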

Limitations

Design Alternatives

Make Existing Parser Extendable

The existing parser is an LL recursive descent parser generated by the Coco/R Compiler Generator. It makes extensive use of pre-computed lookahead, which makes it very difficult to transform into a sufficiently generic parser.

Parser Combinator

This paper is interesting in its proposal of using an acyclic graph to define operator precedence.

Pratt Parser

Some Existing Extendable Languages

Design Proposal

Main Concepts

symbol
a symbol is a lexeme known to the parser.
group
each symbol belongs to one and only one group.
symbol compatibility
a relation telling whether two symbols inside a group are compatible.
Two symbols are compatible if they can be parsed without parentheses.
group compatibility
a graph telling which groups are compatible.
group associativity
a graph
symbol associativity
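The first four concepts above (symbol, group, symbol compatibility, group compatibility) could be held in a small registry such as the following sketch. The class OperatorRegistry, its method names and the example groups are hypothetical and not taken from the Event-B language definition. Storing compatibility as ordered (left, right) pairs is also an assumption, since the definitions above do not say whether the relations are symmetric.

```python
class OperatorRegistry:
    """Holds symbols, their groups and the two compatibility relations (sketch)."""

    def __init__(self):
        self.group_of = {}          # symbol -> name of its unique group
        self.symbol_compat = set()  # ordered pairs (left, right), same group
        self.group_compat = set()   # ordered pairs (left group, right group)

    def add_symbol(self, symbol, group):
        # Each symbol belongs to one and only one group.
        if symbol in self.group_of:
            raise ValueError(f"{symbol!r} already belongs to {self.group_of[symbol]!r}")
        self.group_of[symbol] = group

    def allow_symbols(self, left, right):
        # Symbol compatibility is only defined inside a single group.
        if self.group_of[left] != self.group_of[right]:
            raise ValueError("symbols of different groups: use allow_groups")
        self.symbol_compat.add((left, right))

    def allow_groups(self, left_group, right_group):
        self.group_compat.add((left_group, right_group))

    def compatible(self, left, right):
        """True if 'left' followed by 'right' may be parsed without parentheses."""
        left_group, right_group = self.group_of[left], self.group_of[right]
        if left_group == right_group:
            return (left, right) in self.symbol_compat
        return (left_group, right_group) in self.group_compat


# Hypothetical groups and symbols, for illustration only.
registry = OperatorRegistry()
registry.add_symbol("+", "arithmetic")
registry.add_symbol("-", "arithmetic")
registry.add_symbol("∪", "set operators")
registry.add_symbol("∩", "set operators")
registry.allow_symbols("+", "-")
registry.allow_symbols("∪", "∪")
```

In this example registry.compatible("∪", "∩") is False, so an expression mixing the two would have to be parenthesised, whereas registry.compatible("+", "-") is True.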

Proposed User Interface

The user will be able to create


Core Algorithm

Sample Implementation

A prototype has been developed in Python to quickly try out different solutions.

The development tree is available at http://bitbucket.org/matclab/eventb_pratt_parser/