imported>Mathieu |
imported>WikiSysop |
Line 1: |
Line 1: |
− | {{TOCright}}
| + | = Rodin User and Developer Workshop, 27-29 February 2012, Fontainebleau, France = |
− | This page describes the requirements for a '''generic parser''' for the [[Event-B Mathematical Language]].
| |
| | | |
− | A first design proposal is also drafted.
| |
| | | |
− | == Requirements ==
| + | Event-B is a formal method for system-level modelling and analysis. The Rodin Platform is an Eclipse-based toolset for Event-B that provides effective support for modelling and automated proof. The platform is open source and is further extendable with plug-ins. A range of plug-ins have already been developed including ones that support animation, model checking and UML-B. |
− | In order to be usable [[mathematical extensions]] require that the event-b mathematical language syntax can be extended by the final user.
| + | The [http://wiki.event-b.org/index.php/Rodin_Workshop_2009 first Rodin User and Developer Workshop was held in July 2009 at the University of Southampton] while the [http://wiki.event-b.org/index.php/Rodin_Workshop_2010 second took place at the University of Düsseldorf in September 21-23, 2010]. The 2012 workshop will be part of the [http://www.bmethod.com/php/federated-event-2012-en.php DEPLOY Federated Event]. |
| | | |
− | The lexical analyser and the syntaxic parser thus have to be extensible in a simple enough way (from a user point of vue). | + | While much of the development and use of Rodin takes place within the [http://www.deploy-project.eu EU FP7 DEPLOY Project], there is a growing group of users and plug-in developers outside of DEPLOY. The purpose of this workshop is to bring together existing and potential users and developers of the Rodin toolset and to foster a broader community of Rodin users and developers. |
| | | |
− | === Requirements Exported by the Current Language Design ===
| + | For Rodin users the workshop will provide an opportunity to share tool experiences and to gain an understanding of on-going tool developments. For plug-in developers the workshop will provide an opportunity to showcase their tools and to achieve better coordination of tool development effort. |
− | ==== Operator Priority ====
| |
− | * operator are defined by group,
| |
− | * group of operator have a defined precedences,
| |
− | * there may be precedences defined inside groups.
| |
| | | |
− | ==== Operator Compatibility ====
| + | The format will be presentations together with plenty of time for discussion. On Day 1 a Developer Tutorial will be held while Days 2 and 3 will be devoted to tool usage and tool developments. The workshop will be followed by an open [http://www.bmethod.com/php/federated-event-2012-en.php Industry Day]. |
− | * a compatibility table defines allowed associativities inside a group,
| |
− | * a compatibility table defines allowed associativities between groups (it allows to forbid a syntaxic construction like <math>f(x)^{-1}\;</math>
| |
− | : nota: this requirement was added afterwards with consistency in mind.
| |
| | | |
− | === Expected Extension Schemes ===
| + | If you are interested in giving a presentation at the Rodin workshop, send a short abstract (1 or 2 pages of A4) to rodin@ecs.soton.ac.uk by 16 January 2012. Indicate whether it is a tool usage or tool development presentation. Plug-in presentations may be about existing developments or planned future developments. We will endeavour to accommodate all submissions that are relevant to Rodin and Event-B. |
− | We do want to at least define operators of the following form :
| |
− | * infix : <math>a + b\;</math> or <math>a \vdash b : c\;</math>
| |
− | * prefix : <math>\neg a\;</math>
| |
− | * postfix : <math> R^*\;</math>
| |
− | * closed : <math>\|a\|\;</math>
| |
− | * parentheses sugar : <math>(a +b) * c\;</math>
| |
− | * functional postfix : <math> a \langle b \rangle\;</math>
| |
− | * functional prefix : <math> \langle b \rangle f\;</math>
| |
− | * lambda like : <math> \lambda x\mapsto y . P | E\;</math>
| |
− | * Union like : <math> \Union\{ e \mid P\}</math> or <math> \Union\{ x,y . P \mid e\}</math>
| |
− | * sets : <math>\{a, b, c + e\}\;</math> or <math>\{ e \mid P\}\;</math> or <math>\{x,y . P \mid e\}\;</math>
| |
| | | |
− | We also like to define precisely precedences and associativity between existing and new operators.
| |
| | | |
− | === Requirements exported by the dynamic feature ===
| + | '''Organisers''' |
− | * the precedence should not be enumerated but defined by a relation, like: '+' < '*' and '*' < '**', ...
| |
| | | |
− | == Limitations ==
| + | Michael Butler, University of Southampton |
| | | |
− | == Design Alternatives ==
| + | Stefan Hallerstede, University of Aarhus |
| | | |
− | === Make Existing Parser Extensible ===
| + | Thierry Lecomte, ClearSy |
− | The existing parser is a LL recursive descent parser generated by the [http://www.ssw.uni-linz.ac.at/Coco/ Coco/R Compiler Generator], which makes extensive use of pre-computed lookahead, which makes it very difficult to be transformed in a sufficiently enough generic parser.
| |
| | | |
− | === Parser Combinator ===
| + | Michael Leuschel, University of Düsseldorf |
− | :* on [http://en.wikipedia.org/wiki/Parser_combinator wikipedia]. Some implementations are referenced [http://pdos.csail.mit.edu/~baford/packrat/ here].
| |
− | :* with functionnal language [http://wiki.portal.chalmers.se/agda/Agda Agda] : ''[http://www.cs.nott.ac.uk/~nad/publications/danielsson-norell-mixfix.pdf Parsing Mixfix Operators]
| |
− | :: This paper is interesting in its proposal of using an acyclic graph to define operator precedence.
| |
− | :* with functional language [http://user.cs.tu-berlin.de/~opal/opal-language.html OPAL] : ''[http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=8A8A8B39FFBF1AEF17C782C4822C5AF6?doi=10.1.1.58.655&rep=rep1&type=url&i=0 How To Obtain Powerful Parsers That Are Elegant and Practical]''
| |
| | | |
− | === Pratt Parser ===
| + | Alexander Romanovsky, University of Newcastle |
− | :* [http://portal.acm.org/citation.cfm?id=512931 original paper from Vaughan Pratt]
| |
− | :* [http://effbot.org/zone/tdop-index.htm SImple ''Pratt'' parsing in python]
| |
− | :* [http://javascript.crockford.com/tdop/tdop.html ''Pratt'' parsing in javascript]
| |
− | Note that a ''pratt' parser is essentially a parser for expressions (where expression parts are themselves expressions)... But it may be tweaked to limit the accepted sub-expressions or to enforce some non-expression parts inside an expression. More on that hereafter.
| |
| | | |
− | === Some Existing Extensible Languages ===
| + | Laurent Voisin, Systerel |
− | * [http://www.chrisseaton.com/katahdin/ Katahdin], and the associated paper [http://www.chrisseaton.com/katahdin/katahdin-thesis.pdf A Programming Language Where the Syntax and Semantics Are Mutable at Runtime] which describes the language and its implementation (in C#).
| |
− | * [http://itcentre.tvu.ac.uk/~clark/xmf.html XMF] whose implementation seems not be describes in details.
| |
− | * [http://xlr.sourceforge.net/index.html XLR: Extensible Language and Runtime], which looks like a ''pratt'' parser according to the [http://xlr.svn.sourceforge.net/viewvc/xlr/trunk/xl2/xlr/parser.cpp?revision=805&view=markup sources].
| |
− | * [http://www.pi-programming.org/What.html <math>\pi</math>] whose implementation seems not be describes in details (but sources availables).
| |
− | | |
− | == Design Proposal ==
| |
− | === Main Concepts ===
| |
− | ;symbol: a symbol is a lexem known by the parser.
| |
− | ;group: each symbol belongs to one and only one group.
| |
− | ;symbol compatibility: a relation telling if inside a group two symbol are compatibles.
| |
− | :Two symbol are compatibles is they can be parsed without parentheses (see [[Event-B Mathematical Language]] for more on compatibility).
| |
− | ;group compatibility: a relation telling if two groups are compatibles.
| |
− | :Two symbol of different groups are said compatibles if their group are compatibles.
| |
− | ;symbol associativity: a relation which gives the associative precedence of symbols inside a given group.
| |
− | ;group associativity: a relation which gives the associative precedence of groups.
| |
− | :Two symbol of different groups have a relative precedence given by the associative precedence of their groups.
| |
− | ;generic parser: the parser which is defined.
| |
− | : This parser may be called recursively (an infix expression being build upon a left and right sub-expression) and is often the default parser used when building the AST associated with a ''symbol''.
| |
− | ;specific parser: some specific parser, predefined as helper functions to build specific AST.
| |
− | :Some example of specific parser are the one used to parse and build list of identifiers (for quantified predicates), mapplets lists (for lambda), and so one...
| |
− | | |
− | === Proposed User Interface ===
| |
− | ''The hereafter proposition does not preclude the final UI to take other (more user friendly forms). But the concepts should probably stay the same''
| |
− | | |
− | We propose that the user will be able to create several predefined type of symbols (new types may certainly be further added by mean of eclipse extension points). When defining a symbol, the user can choose the parser used to parse the sub-expressions (default to generic parser, but a specific parser may be used) and can enforce the form of the resulting parsed sub-expression in terms of their AST root type.
| |
− | | |
− | ==== Predefined Symbol Builders ====
| |
− | The proposed predefined symbol builders are:
| |
− | ;infix: For example predicate conjunction which expects two predicates for left and right parts may be defined by <code>infix("∧", gid="(logic pred)", expected_nonterm=["predicate", "predicate"])</code>
| |
− | :As another example, substitution ≔ of group '(infix subst)' which expect two ''ident list'' for left and right expressions may be defined by <code>infix(symbol="≔", gid="(infix subst)", parsers=["ident_list", "ident_list"], expected_nonterm=["ident_list", "ident_list"])</code>
| |
− | | |
− | ;prefix: <code>prefix("¬", "(not pred)", expected_nonterm=["predicate"])</code>
| |
− | | |
− | ;atomic: <code>atomic("⊤", "(atomic pred)")</code>
| |
− | | |
− | ;prefix closed: used to build prefix symbol which must be followed by enclosing symbol(brackets, parentheses,..). For example <tt>finite</tt> may be defined by: <code>prefix_closed("finite", "(", ")","(atomic pred)", expected_nonterm=["expression"])</code>
| |
− | | |
− | :postfix: <code>postfix("∼", gid="(unary relation)", expected_nonterm=["expression"])</code>
| |
− | | |
− | ;quantified: <code>quantified("∀", "·", expected_nonterm=["predicate"]</code>
| |
− | | |
− | ;lambda like: <code>lambda_like("λ", "·", u"∣", gid="(quantification)", expected_nonterm=["ident list", "predicate", "expression"])</code>
| |
− | | |
− | ;quantified union like: <code>Union_like("⋃", "·", "∣", gid="(quantification)")</code>
| |
− | | |
− | ;closed sugar: use for parentheses added to enforce associativity: <code>closed_sugar("(", ")")</code>
| |
− | | |
− | ;functional postfix: <code>functional_postfix("(", ")", gid="(functional)", expected_nonterm=["expression", "expression"])</code>
| |
− | | |
− | Some right associative versions of those symbol builders may also be supplied (infix_right, or a specific parameter ''right associative'' parameter may also be provided to the ''infix'' symbol builder).
| |
− | | |
− | ==== Predefined Specific Parser ====
| |
− | * expression lists
| |
− | * ident lists
| |
− | * maplet tree (for lambda)
| |
− | | |
− | | |
− | ==== Defining Compatibility and Precedences ====
| |
− | Inside a group, ''compatibility'' may be defined like:
| |
− | <code>
| |
− | g = group["(binop)"]
| |
− | g.add_many_compatibility(
| |
− | [ ("∪", "∪"),
| |
− | ("∩", "∩"),
| |
− | ("∩", "∖"),
| |
− | ("∩", "▷"),
| |
− | ("∩", "⩥"),
| |
− | ("∘", "∘"),
| |
− | (";", ";"),
| |
− | (";", "▷"),
| |
− | (";", "⩥"),
| |
− | ("", ""),
| |
− | ("◁", "∩"),
| |
− | ("◁", "∖"),
| |
− | ("◁", ";"),
| |
− | ("◁", "⊗"),
| |
− | ("◁", "▷"),
| |
− | ("◁", "⩥"),
| |
− | ("⩥", "∩"),
| |
− | ("⩥", "∖"),
| |
− | ("⩥", ";"),
| |
− | ("⩥", "⊗"),
| |
− | ("⩥", "▷"),
| |
− | ("⩥", "⩥"),
| |
− | ("×", "×"), ]
| |
− | </code>
| |
− | which defines the compatibility table for expression binary operator as defined in [[Event-B Mathematical Language]].
| |
− | | |
− | | |
− | Inside a group, ''precedence'' may be defined like:
| |
− | <code>
| |
− | g = group["(arithmetic)"]
| |
− | g.add_many_precedence(
| |
− | [ ("^", "÷"),
| |
− | ("^", "∗"),
| |
− | ("^", "+"),
| |
− | ("^", "−"),
| |
− | ("^", "mod"),
| |
− | ("+", "∗"),
| |
− | ("−", "∗"),
| |
− | ("+", "÷"),
| |
− | ("−", "÷"),
| |
− | ("+", "mod"),
| |
− | ("−", "mod"), ] )
| |
− | </code>
| |
− | which defines ^ as having the least precedence amongst the operators of the expression arithmetic group.
| |
− | | |
− | Similarly, precedences and compatibilities between groups may be defined :
| |
− | <code>
| |
− | group_precedence.add(("(binop)", "(interval)"))
| |
− | group_precedence.add(("(interval)", "(arithmetic)"))
| |
− | group_precedence.add(("(arithmetic)", "(unary relation)"))
| |
− | group_precedence.add(("(unary relation)", "(bound unary)"))
| |
− | group_precedence.add(( "(bound unary)", "(bool)"))
| |
− | | |
− | group_compatibility.add_all( [ "(binop)", "(interval)", (arithmetic)", "(unary relation)", "(unary relation)", "(bound unary)",....])
| |
− | </code>
| |
− | | |
− | ==== Some Difficulties ====
| |
− | * different builders for the same symbol
| |
− | * common prefixes {{TODO| backtracking + memoisation}}
| |
− | * Grammar testing {{TODO| should be include in the user interface: the user shall provide tests with its extensions to ensure that he agree with the parser on the semantic of its extension}}
| |
− | | |
− | === Core Algorithm ===
| |
− | === Sample Implementation ===
| |
− | A prototype as been developed in Python to quickly try different solutions.
| |
− | | |
− | The development tree is available at http://bitbucket.org/matclab/eventb_pratt_parser/
| |
− | | |
− | | |
− | | |
− | | |
− | | |
− | | |
− | [[Category:Design Proposal]]
| |