Difference between pages "D45 Prover Enhancement" and "Mathematical Extensions"

From Event-B
= Overview =

Currently, the operators and basic predicates of the Event-B mathematical language supported by Rodin are fixed. We propose to extend Rodin so that users can define basic predicates, new operators and new algebraic types.

* Two tasks concerned the prover performance of the core platform: the addition of rewriting and inference rules, and the addition of a mechanism allowing tactics to be customized, parameterized and combined. While adding rewriting and inference rules has been a regular task to enhance the Rodin integrated prover throughout the DEPLOY lifetime, a new way to manage tactics has been implemented: the user is now able to define various types of tactics, called 'profiles', which can be customized and parameterized to discharge specific proof obligations.

* The ProB plug-in makes it possible to find and visualize counterexamples to invariant and deadlock preservation proof obligations. ProB has also been used to find counterexamples to proof rules of the industrial partner Siemens.

* The SMT Solvers plug-in, which makes SMT solvers usable within Rodin, is an effective alternative to the Atelier-B provers, particularly when reasoning on linear arithmetic.

* The Isabelle/HOL integration plug-in extends the Rodin proving capabilities through tactics written in Isabelle/HOL. It thus helps to automatically discharge proof obligations, and also allows the verification of new extensions of the Rodin sequent prover written in Java through the translation of their corresponding rules.
 
  
= Motivations =

== Requirements ==
== New rewriting and inference rules ==

In an Event-B development, more than 60% of the time is spent on proofs. It has been a continuous aim to increase the number of automatically discharged proof obligations (POs) by improving the capabilities of the integrated sequent prover through the addition of rewriting and inference rules. These rules were provided through tactics, either existing or newly created. These tactics were automatic, manual, or sometimes both. Providing new proving rules, even when it does not directly increase the number of automatically discharged POs, helps the user discharge them interactively and saves time.
 
  
== Advanced Preferences for Auto-tactics ==

=== User Requirements ===

* Binary operators (prefix, infix or suffix form).

* Operators on boolean expressions.

* Unary operators, such as absolute values.

: Note that the pipe, which is already used for set comprehension, cannot be used to enter absolute values (''in fact, as in the new design the pipe used in set comprehension is only syntactic sugar, the same symbol may be used for absolute value. To be confirmed with the prototyped parser. It may however not be allowed at first, until backtracking is implemented, due to the use of lookahead making assumptions on how the pipe symbol is used. -Mathieu'').

* Basic predicates (e.g., the symmetry of relations <math>sym(R) \defi R=R^{-1}</math>).

: Having a way to enter such predicates may be considered syntactic sugar, because it is already possible to use sets (e.g., <math>R \in sym</math>, where <math>sym \defi \{R \mid R=R ^{-1}\}</math>) or functions (e.g., <math>sym(R) = \True</math>, where <math>sym \defi (\lambda R \qdot R \in A \rel B \mid \bool(R = R^{-1}))</math>).

* Quantified expressions (e.g., <math>\sum x \qdot P \mid y</math>, <math>\prod x \qdot P \mid y</math>, <math>~\min(S)</math>, <math>~\max(S)</math>).

* Types.

** Enumerated types.

** Scalar types.
  
=== User Input ===

The end-user shall provide the following information:

* keyboard input

* Lexicon and syntax. <br/>More precisely, this includes the symbols, the form (prefix, infix, postfix), the grammar, associativity (left- or right-associative), commutativity, priority, the mode (flattened or not), ...

* Pretty-print. <br/>Alternatively, the rendering may be determined from the notation parameters passed to the parser.

* Typing rules.

* Well-definedness.

The proportion of automatically discharged proof obligations heavily depends on the Auto-Tactic configuration. Sometimes, the automatic prover fails because the tactics are applied in a 'wrong' order - 'wrong' for a given PO - even though all the needed tactics are present.

Early versions of Rodin provided preferences for automatic tactics that allowed reordering them, but the ordering was lost at each change: one could not record a particular tactic order in order to reuse it later.

Another issue is to have more than one way to combine tactics. Indeed, the only implicit combination of tactics available consisted in trying to apply them in turn for every open node of a proof. In the proving area, there exists a notion of ''tactic combinators'', also called ''tacticals'', that combine tactics in various specific manners, thus providing a sort of tactic arithmetic.
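The idea of tacticals can be illustrated with a minimal Java sketch. The interfaces below are hypothetical and far simpler than the actual Rodin seqprover API; they only show how combinators build new tactics from existing ones:

```java
// A tactic maps a proof node to a success/failure outcome.
// These interfaces are illustrative only, not the Rodin seqprover API.
interface Tactic {
    boolean apply(ProofNode node); // true if the tactic made progress on the node
}

class ProofNode { }

class Tacticals {
    // 'orElse': try the first tactic; on failure, fall back to the second.
    static Tactic orElse(Tactic first, Tactic second) {
        return node -> first.apply(node) || second.apply(node);
    }

    // 'andThen': apply both in sequence; succeed only if both succeed.
    static Tactic andThen(Tactic first, Tactic second) {
        return node -> first.apply(node) && second.apply(node);
    }

    // 'attempt': never fails, turning a failing tactic into a no-op.
    static Tactic attempt(Tactic t) {
        return node -> { t.apply(node); return true; };
    }
}
```

Composing such combinators yields the "tactic arithmetic" mentioned above: a profile becomes an expression over basic tactics rather than a flat ordered list.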
=== Development Requirements ===

* Scalability.

The advanced preferences for auto-tactics solved these two issues.
== Towards a generic AST ==

The following AST parts are to become generic, or at least parameterised:

* [[Constrained_Dynamic_Lexer | Lexer]]

* [[Constrained Dynamic Parser | Parser]]

* Nodes (''Formula'' class hierarchy): parameters needed for:

** Type Solve (type rule needed to synthesize the type)

** Type Check (type rule needed to verify constraints on children types)

** WD (WD predicate)

** PrettyPrint (tag image + notation (prefix, infix, postfix) + needs parentheses)

** Visit Formula (getting children + visitor callback mechanism)

** Rewrite Formula (associative formulæ have a specific flattening treatment)

* Types (''Type'' class hierarchy): parameters needed for:

** Building the type expression (type rule needed)

** PrettyPrint (set operator image)

** Getting the Base / Source / Target type (type rule needed)

* Verification of preconditions (see for example <tt>AssociativeExpression.checkPreconditions</tt>)

== Isabelle Plug-in ==

At the beginning of the DEPLOY project, the user had three choices when valid proof obligations were not discharged automatically: (1) do manual proofs, (2) reorganise the model to make it "simpler" for Rodin's theorem provers, or (3) implement an extension of Rodin's theorem provers in Java. None of these options is fully satisfactory for models at industrial scale:

# The number of undischarged proof obligations is typically very high, which makes manual proving an expensive task; in particular, users have reported that they had to enter very similar proofs again and again (for different proof obligations), which they found frustrating.

# Reorganising the model helps in some situations, but not in others, and it requires considerable experience and a deep understanding of Rodin's theorem provers.

# Extending Rodin's theorem provers is quite effective in principle. However, prover extensions need to be defined in Java in a procedural style; the source code of prover extensions is therefore hard to read and write. Moreover, it is quite difficult to ensure that prover extensions are sound. If a model has been verified with an extended version of Rodin's theorem provers, it is therefore questionable whether the model is indeed correct.

The integration of Isabelle/HOL<ref>T. Nipkow, L. C. Paulson, and M. Wenzel. Isabelle/HOL - A Proof Assistant for Higher-Order Logic, volume 2283 of Lecture Notes in Computer Science. Springer, 2002.</ref> into Rodin is one solution to this problem. Isabelle/HOL is the instantiation of the generic theorem prover Isabelle<ref>L. C. Paulson. The foundation of a generic theorem prover. Journal of Automated Reasoning, 5(3):363-397, 1989.</ref> to higher-order logic<ref>P. B. Andrews. An Introduction to Mathematical Logic and Type Theory. Springer, 2002.</ref><ref>A. Church. A formulation of the simple theory of types. Journal of Symbolic Logic, 5(2):56-68, 1940.</ref><ref>M. J. C. Gordon and T. F. Melham. Introduction to HOL. Cambridge University Press, 1993.</ref>. The main advantage of Isabelle is that it provides a high degree of extensibility whilst ensuring soundness. In a nutshell, every proof in Isabelle has to be approved by Isabelle's core, which has matured over several decades and is therefore most likely correct. In Isabelle, user-supplied prover extensions are sound by construction.

Isabelle comes with a vast collection of automated proof tactics that can be customised by the user in a declarative manner, i.e., by declaring new inference and rewrite rules. It provides link-ups to several competition-winning automated theorem provers such as Z3<ref>L. M. de Moura and N. Bjørner. Z3: An efficient SMT solver. In TACAS, volume 4963 of Lecture Notes in Computer Science, pages 337-340. Springer, 2008.</ref> and Vampire<ref>A. Riazanov and A. Voronkov. The design and implementation of Vampire. AI Communications, 15(2-3):91-110, 2002.</ref>. Last but not least, the default configuration of Isabelle's proof tactics has matured over years or decades and has been applied in numerous verification projects.
  
== ProB Disprover ==

ProB can be used to find counterexamples to invariant preservation and deadlock preservation proof obligations. This feature relies on the constraint-solving capabilities of the ProB kernel. It is available from the ProB plug-in after loading an Event-B model.

Within the context of WP2 (Transportation), ProB has also been used to try to find counterexamples to individual proof rules in the Siemens proof rule database. For this, the ProB command line tool has been extended to load proof rule files (using a Prolog parser to avoid the JVM startup time) and then apply the ProB constraint solver to try to find counterexamples to each proof rule.

This work has already identified several errors in the Siemens rule database and is still ongoing.

A current issue is the addition of explicit well-definedness conditions into the proof rules, as well as the fact that some proof rules use substitutions inside predicates.

=== Vocabulary ===

An '''extension''' is to be understood as a single additional operator definition.

=== Tags ===

Every extension is associated with an integer tag, just like existing operators. Thus, questions arise about how to allocate new tags and how to deal with existing tags.

The solution proposed here consists in keeping existing tags 'as is'. They are already defined and hard-coded in the ''Formula'' class, so this choice is made with backward compatibility in mind.

Now, concerning extension tags, we first introduce a few hypotheses:

* Tags_Hyp1: tags are never persisted across sessions on a given platform

* Tags_Hyp2: tags are never referenced for sharing purposes across various platforms

In other words, cross-platform/session formula references are always made through their ''String'' representation. These assumptions, which were already made and verified for Rodin before extensions, lead us to restrict further considerations to the scope of a single session on a single platform.
  
The following definitions hold at a given instant <math>t</math> and for the whole platform.

Let <math>\mathit{EXTENSION}_t</math> be the set of extensions supported by the platform at instant <math>t</math>; let <math>\mathit{tag}_t</math> denote the assignment of tags to extensions at instant <math>t</math> (<math>\mathit{tag}_t \in \mathit{EXTENSION}_t \pfun \intg</math>); let <math>\mathit{COMMON}</math> be the set of existing tags defined by the ''Formula'' class (<math>\mathit{COMMON} \sub \intg</math>). The following requirements emerge:

* Tags_Req1: <math>\forall t \qdot \mathit{tag}_t \in \mathit{EXTENSION}_t \tinj \intg</math>

* Tags_Req2: <math>\forall e, t_1,t_2 \qdot \mathit{tag}_{t_1}(e)=\mathit{tag}_{t_2}(e)</math> where <math>t_1, t_2</math> are two instants during a given session

* Tags_Req3: <math>\forall t \qdot \ran(\mathit{tag}_t) \cap \mathit{COMMON} = \empty</math>

== SMT Solver Integration ==

The integration of SMT solvers into the Rodin platform is motivated by two main reasons: on the one hand, the enhancement of its proving capability, especially in the field of arithmetic; on the other hand, the ability to extract useful information from the proofs produced by these solvers, such as unsatisfiable cores, in order to significantly decrease the time necessary to prove a modified model.
  
The above-mentioned scope-restricting hypothesis can be reformulated as: <math>\mathit{tag}</math> need not be stable across sessions nor across platforms.

== Incorporating SAT Solvers via Kodkod ==

The motivation behind incorporating the relational logic solver Kodkod into ProB was to evaluate how other technologies can complement ProB's constraint-solving capabilities.

A tight integration into ProB makes it available for model checking and animation, as well as for the disproving capabilities mentioned above.

Our integration allows predicates from Event-B (and B, Z and TLA+ as well) to be solved using a mixture of SAT solving and ProB's own constraint-solving capabilities developed using constraint logic programming: the first-order parts are dealt with by Kodkod and the remaining parts are solved by the existing ProB kernel.

We have conducted a first empirical evaluation and analyzed the respective merits of SAT solving and classical constraint solving. We have also compared with using SMT solvers via the recently available translators for Event-B.

Our experiments have shown that the translation can be highly beneficial for certain kinds of constraints, and as such opens up new ways to analyze and validate formal specifications in Event-B.
=== Formula Factory ===
However, the experiments have also shown that the constraint logic programming approach of ProB can be superior in a considerable number of scenarios; the translation to Kodkod and down to SAT is not (yet) the panacea. The same can be said of the existing translations from B to SMT. As such, we believe that much more research is required to reap the best of both worlds (SAT/SMT and constraint programming).

An interesting side-effect of our work is that the ProB toolset now provides a double chain of validation for first-order predicates (relying on technology developed independently and using different programming languages and paradigms), which should prove relevant in high safety integrity level contexts.
 
  
= Choices / Decisions =

== New rewriting and inference rules ==

{{TODO}} ''To be completed by Laurent Voisin''

== Advanced Preferences for Auto-tactics ==

Since Rodin 2.1, users can create their own tactic profiles. A tactic profile makes it possible to set and record a particular order of chosen basic tactics. Furthermore, a profile can be applied either globally (as before) or specifically for a given project.

The requirements about tags give rise to a need for centralising the <math>\mathit{tag}</math> relation in order to enforce tag uniqueness. The Formula Factory appears to be a convenient and logical candidate for this role. Each time an extension is used to make a formula, the factory is called; it can check whether the associated tag exists, create it if needed, then return the new extended formula while maintaining tag consistency.

The factory can also provide an API for requests about tags and extensions: getting the tag from an extension, and conversely.
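A minimal sketch of such session-scoped tag allocation could look like the following. The class and the threshold constant are hypothetical, not the actual factory code; the point is only that the three tag requirements fall out of a single centralised map:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of session-scoped tag allocation (not the actual Rodin code).
// Tags_Req1: injectivity -- each extension gets a distinct tag.
// Tags_Req2: stability -- the same extension always yields the same tag within a session.
// Tags_Req3: extension tags never collide with the tags hard-coded in Formula (COMMON).
class TagRegistry {
    // Assumed to lie strictly above every tag hard-coded in the Formula class.
    private static final int FIRST_EXTENSION_TAG = 1000;

    private final Map<String, Integer> tags = new HashMap<>();
    private int nextTag = FIRST_EXTENSION_TAG;

    // Returns the tag for an extension id, allocating a fresh one on first use.
    synchronized int getTag(String extensionId) {
        return tags.computeIfAbsent(extensionId, id -> nextTag++);
    }
}
```

Because tags are never persisted (Tags_Hyp1/Hyp2), nothing here needs to survive a session, which is what makes such a simple in-memory counter sufficient.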
  
Since Rodin 2.3, tacticals and parameterization have been added to the profiles, thus increasing the potential of this proving feature. A tactic profile may now be composed of tacticals, which combine any number of basic tactics and other profiles. The parameterization makes it possible, for example, to set a custom timeout on external provers such as AtelierB P1.

We also need additional methods to create extended formulæ. A first problem to address is: which type should these methods return? We could define as many extended types as the common AST API does, namely ''ExtendedUnaryPredicate'', ''ExtendedAssociativeExpression'', and so on, but this would lead to a large number of new types to deal with (in visitors, filters, …), together with a constraint about which types extensions would be forced to fit into. It is thus preferable to have as few extended types as possible, but with as much parameterisation as possible. Considering that the two basic types ''Expression'' and ''Predicate'' have to be extensible, we add two extended types, ''ExtendedExpression'' and ''ExtendedPredicate''.
 
  
== Isabelle Plug-in ==

The Isabelle plug-in translates a proof obligation to Isabelle/HOL, invokes a custom Isabelle tactic and reports the result back to Rodin. Most of the work done so far concerned the study of the Event-B theory and its translation to HOL. Unlike translations to other theorem provers (such as PP, newPP, and ML), the translation to HOL preserves provability: the translations of provable proof obligations are provable and the translations of unprovable proof obligations are unprovable. (This claim rests on the assumption that the implementation of the translation is correct. We regard this as a reasonable assumption, as the implementation of the translation is quite straightforward and concise.)

The performance of the default configuration can be tweaked for specific situations: users can inspect the translated proof obligations in Isabelle/HOL, test the behaviour of various proof tactics, and declare new inference and rewrite rules to meet the desired performance goals.

The Isabelle plug-in was evaluated on the BepiColombo model [42] by Space Systems Finland. Isabelle discharged out of the box some of the proof obligations that could not be discharged automatically by Rodin. The automation rate could be drastically increased by declaring new inference and rewrite rules. This approach had an important advantage over directly entering manual proofs: by declaring new rules, we did not only make Isabelle's automated tactics prove the proof obligation under investigation, but also other proof obligations that we had not yet looked at.

ExtendedExpression makeExtendedExpression( ? )

ExtendedPredicate makeExtendedPredicate( ? )
 
  
Second problem to address: which arguments should these methods take?

Other factory methods take the tag, a collection of children where applicable, and a source location. In order to discuss what can be passed as arguments to make extended formulæ, we have to recall that the make… factory methods have various kinds of clients, namely:

* the parser

* the POG

* the provers

(other factory clients use the parsing or identifier utility methods: SC modules, indexers, …)

Thus, the arguments should be convenient for these clients, depending on which information they have at hand. The source location does not obviously require any adaptation and can be taken as an argument the same way. Concerning the tag, it depends on whether clients have a tag or an extension at hand. Both are intended to be easily retrieved from the factory. As a preliminary choice, we go for the tag and will adjust this decision when we know more about client convenience.

== ProB Disprover ==

Currently, the constraint-based invariant and deadlock checker is run from within the ProB plug-in and not from the prover interface on individual proof obligations. Indeed, for invariant preservation the disprover checks an event against all invariants, rather than being applied on individual proof obligations related to that event. However, ProB does know which invariants have already been proven to be preserved by that particular event.
 
The rationale for running the invariant preservation checks in this way is that:

* the counterexample is a full valuation of the model's variables and constants, which can be displayed using the ProB interface,

* the truth values of the invariants and guards can be inspected in detail using the ProB interface, making the counterexample intelligible to the user,

* ProB can find out which of the axioms are theorems and ignore them. Indeed, in the old ProB disprover approach all of the assumptions were fed to the solver, and ProB did not know which ones were relevant to find correct values for the constants and variables. In fact, some of the theorems turned out to be very complicated to check, and prevented ProB from finding counterexamples.

Similarly, the constraint-based deadlock checker is run from the ProB plug-in on a loaded model. As Rodin does not currently generate the deadlock freedom proof obligations, this was an obvious choice. Also, it enables the user to type in additional predicates to find "particularly interesting" counterexamples (see the ICLP'2011 article below). Another advantage is that, again, the counterexample can be inspected using the ProB interface.

As for children, the problem is more about their types. We want to be able to handle as many children as needed, and of all possible types. Looking at existing formula children configurations, we can find:

* expressions with predicate children: <math>\mathit{bool}(P)</math>

* expressions with expression children: <math>E_1 + E_2</math>

* predicates with predicate children: <math>P_1 \limp P_2</math>

* predicates with expression children: <math>\mathit{partition}(S, E_1, E_2)</math>

* mixed operators: <math>\{x \qdot P(x) \mid E(x)\}</math>, although the possibility of introducing bound variables in extended formulæ is not established yet.

Thus, for the sake of generality, children of both types should be supported for both extended predicates and extended expressions.
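A simplified sketch of a factory method built along these lines is shown below. All types here are illustrative stand-ins for the real AST classes, reduced to the bare minimum needed to show the shape of the call:

```java
import java.util.Arrays;

// Simplified stand-ins for the real AST types (illustrative only).
class Expression { }
class Predicate { }
class SourceLocation {
    final int start, end;
    SourceLocation(int start, int end) { this.start = start; this.end = end; }
}

// An extended expression keeps its tag plus children of both kinds.
class ExtendedExpression extends Expression {
    final int tag;
    final Expression[] childExprs;
    final Predicate[] childPreds;

    ExtendedExpression(int tag, Expression[] es, Predicate[] ps) {
        this.tag = tag;
        // Defensive copies keep the AST node immutable after construction.
        this.childExprs = Arrays.copyOf(es, es.length);
        this.childPreds = Arrays.copyOf(ps, ps.length);
    }
}

class FormulaFactory {
    // Mirrors the proposed shape: tag + expression children + predicate children + location.
    // (makeExtendedPredicate would be the symmetric method returning a Predicate.)
    ExtendedExpression makeExtendedExpression(int tag, Expression[] es, Predicate[] ps,
            SourceLocation loc) {
        return new ExtendedExpression(tag, es, ps);
    }
}
```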
  
ExtendedExpression makeExtendedExpression(int tag, Expression[] expressions, Predicate[] predicates, SourceLocation location)

ExtendedPredicate makeExtendedPredicate(int tag, Expression[] expressions, Predicate[] predicates, SourceLocation location)

== SMT Solver Integration ==

The translation of the Event-B language into the SMT-LIB language is the main issue of this integration. Two approaches were developed. The more efficient one is based on the translation capabilities of the integrated predicate prover of the Rodin platform (PP). It is completed by translating membership using an uninterpreted predicate symbol, refined with an axiom of set theory.

Technically, the plug-in classes extend the existing XProverCall, XProverInput, XProverReasoner, AbstractLazilyConstrTactic and ITacticParametizer classes, which makes it easy to integrate new tactics, reasoners and external prover calls.
 
  
== Incorporating SAT Solvers via Kodkod ==
 
Before starting our translation to Kodkod, we had experimented with several other alternate approaches to solve constraints in ProB.
 
[http://bddbddb.sourceforge.net/ bddbddb] offers the user a Datalog-like language that aims to support program analysis.
 
It uses BDDs to represent relations and compute queries on these relations.
 
In particular, one has to represent a state of the model as a bit-vector and events have to be implemented as relations between two of those bit-vectors.
 
These relations have to be constructed by creating BDDs directly with the underlying BDD library (JavaBDD) and storing them into a file.
 
Soon after we started experimenting with [http://bddbddb.sourceforge.net/ bddbddb], it became apparent that, due to the lack of data types more abstract than bit vectors, the complexity of a direct translation from B to [http://bddbddb.sourceforge.net/ bddbddb] was too high, even for small models, and this avenue was abandoned.
 
 
SAL is a model-checking framework combining a range of tools for reasoning about systems.
 
The SAL tool suite includes state-of-the-art symbolic (BDD-based) and bounded (SAT-based) model checkers.
 
Some first results were encouraging for a small subset of the Event-B language, but the gap between B and SAL turned out to be too big in general and no realistic way was found to handle important B operators.
 
  
Kodkod has the advantage that it provides good support for relations and sets, which play an essential role in Event-B's mathematical notation.
=== Defining Extensions ===

An extension is meant to contain all the information and behaviour required by:

* Keyboard
* (Extensible fonts?)
* Lexer
* Parser
* AST
* Static Checker
* Proof Obligation Generator
* Provers

= Available Documentation =

* {{TODO}} Links for New rewriting and inference rules
* {{TODO}} Links for Advanced Preferences for Auto-tactics
* {{TODO}} Links for Isabelle Plug-in
* Links for ProB Disprover
** [http://www.stups.uni-duesseldorf.de/w/Special:Publication/HaLe2011 Constraint-Based Deadlock Checking of High-Level Specifications. In Proceedings ICLP'2011, Cambridge University Press, 2011.]
* Links for ProB Kodkod Integration
** [http://www.stups.uni-duesseldorf.de/w/Special:Publication/PlaggeLeuschel_Kodkod2012 Validating B, Z and TLA+ using ProB and Kodkod. Technical Report, 2012.]
* {{TODO}} Links for SMT Solver Integration

= Status =

==== Keyboard requirements ====
== New rewriting and inference rules ==
 
'''Kbd_req1''': an extension must provide a combo/translation association for every uncommon symbol involved in the notation.

{{TODO}} ''To be completed by Laurent Voisin''

== Advanced Preferences for Auto-tactics ==

Advanced Preferences for Auto-tactics are functional in Rodin 2.3. This release provides a first set of tacticals and parameterization options. Further releases may offer additional tacticals and options, according to user feedback.

==== Lexer requirements ====

'''Lex_req1''': an extension must provide a lexeme/token association for every uncommon symbol involved in the notation.

== Isabelle Plug-in ==

{{TODO}} ''To be completed by Matthias Schmaltz''
 
== ProB Disprover ==

The ProB constraint-based invariant and deadlock checkers are available in the current release of ProB for Rodin. The deadlock checker has been applied successfully to a very large industrial model within WP1 (automotive).

There is, of course, always scope to improve the tool by improving its constraint-solving capabilities. In particular, further simplification of existential quantifiers will be important for the deadlock checker to be applied effectively on certain models.

The Siemens proof rule database validation work is still ongoing. This work has already identified several errors in the Siemens rule database. However, a current issue is the addition of explicit well-definedness conditions into the proof rules, as well as the fact that some proof rules use substitutions inside predicates.

== SMT Solver Integration ==

{{TODO}} ''Laurent Voisin & Yoann Guyot''

==== Parser requirements ====

According to [[Constrained_Dynamic_Parser| the Parser page]], the following information is required by the parser in order to parse a formula containing a given extension:

* symbol compatibility
* group compatibility
* symbol precedence
* group precedence
* notation (see below)
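The group-compatibility part of this information can be sketched as follows. The classes and method names are hypothetical (the real design lives in the Constrained Dynamic Parser); they only illustrate the kind of record an extension would contribute:

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of the parser-facing data an extension supplies
// (hypothetical names, not the actual Constrained Dynamic Parser design).
class OperatorGroup {
    final String name;
    private final Set<String> compatibleGroups = new HashSet<>();

    OperatorGroup(String name) { this.name = name; }

    // Declares that operators of 'other' may appear unparenthesized
    // inside operators of this group.
    void addCompatibility(OperatorGroup other) {
        compatibleGroups.add(other.name);
    }

    // Compatibility is directional unless declared both ways.
    boolean isCompatibleWith(OperatorGroup other) {
        return compatibleGroups.contains(other.name);
    }
}

// A new operator carries its symbol plus its group and precedence data.
class OperatorProperties {
    final String symbol;       // e.g. a new infix symbol
    final OperatorGroup group; // used for compatibility and precedence checks
    final int priority;        // precedence within the grammar

    OperatorProperties(String symbol, OperatorGroup group, int priority) {
        this.symbol = symbol;
        this.group = group;
        this.priority = priority;
    }
}
```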
  
== Incorporating SAT Solvers via Kodkod ==

The Kodkod integration is available in the latest nightly build version of ProB (1.3.5-beta7). The scope of the translations is being extended; the work has been partially carried out within the ADVANCE project and will be continued within that project.

= References =

<references/>

==== AST requirements ====

The following hard-coded concepts need to be reified to support extensions in the AST. An extension instance will then provide this reified information to an extended formula class so that it can fulfil its API. It is the expression of the information missing for an ''ExtendedExpression'' (resp. ''ExtendedPredicate'') to behave as an ''Expression'' (resp. ''Predicate''). It can also be viewed as the parametrization of the AST.
 
  
[[Category:D45 Deliverable]]

===== Notation =====

The notation defines how the formula will be printed. In this document, we use the following convention:

* <math>e_i</math> is the i-th child expression of the extended formula
* <math>p_i</math> is the i-th child predicate of the extended formula

''Example'': infix operator "<math>\lozenge</math>"

<math>e_1 \lozenge e_2 \lozenge \ldots \lozenge e_n</math>

We define the following notation framework:

[[Image:notation_uml.png]]

On the "<math>\lozenge</math>" infix operator example, the iterable notation would return successively:

* an ''IFormulaChild'' with index 1
* the ''INotationSymbol'' "<math>\lozenge</math>"
* an ''IFormulaChild'' with index 2
* the ''INotationSymbol'' "<math>\lozenge</math>"
* &hellip;
* an ''IFormulaChild'' with index <math>n</math>

For the iteration not to run forever, the limit <math>n</math> needs to be known: this is the role of the ''mapsTo()'' method, which fixes the number of children; it is called when this number is known (i.e. for a particular formula instance).
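The iteration described above can be sketched in Java as follows. The class names echo the document's ''IFormulaChild''/''INotationSymbol'' interfaces, but the code itself is illustrative, not the proposed API:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the iterable notation for an associative infix operator.
interface NotationElement { }

class FormulaChild implements NotationElement {
    final int index;
    FormulaChild(int index) { this.index = index; }
}

class NotationSymbol implements NotationElement {
    final String symbol;
    NotationSymbol(String symbol) { this.symbol = symbol; }
}

class InfixNotation {
    private final String symbol;
    InfixNotation(String symbol) { this.symbol = symbol; }

    // mapsTo(n) fixes the number of children and yields the concrete sequence:
    // child 1, symbol, child 2, symbol, ..., child n.
    List<NotationElement> mapsTo(int n) {
        List<NotationElement> elements = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            if (i > 1) elements.add(new NotationSymbol(symbol));
            elements.add(new FormulaChild(i));
        }
        return elements;
    }
}
```

Until ''mapsTo()'' is called, the notation remains a variable-size pattern; afterwards it is a finite sequence a pretty-printer can walk.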
'''Open question''': how to integrate bound identifier lists in the notation?

We may make a distinction between fixed-size notations (like n-ary operators for a given n) and variable-size notations (like associative infix operators). While fixed-size notations would have no specific expressivity limitations (besides parser conflicts with other notations), only a finite set of pre-defined variable-size notation patterns will be proposed to the user, for technical reasons related to the parser.

The following features need to be implemented to support the notation framework:

* a notation parser that extracts a fixed-size notation pattern from a user-input String
 +
===== Well-Definedness =====

WD predicates are also user-provided data.

''Example'':

<math>\mathit{D}(e_1) \land (\forall i \cdot i \in 1\mathit{..}(n-1) \Rightarrow (\mathit{D}(e_i) \Rightarrow \mathit{D}(e_{i+1})))</math>

In order to process WD predicates, we need to add the following features to the AST:

* the <math>\mathit{D}</math> operator
* expression variables (predicate variables already exist)
* special expression variables and predicate variables that denote a particular formula child (we need to refer to <math>e_1</math> and <math>e_i</math> in the above example)
* a ''parse()'' method that accepts these special meta variables and the <math>\mathit{D}</math> operator and returns a ''Predicate'' (a WD Predicate Pattern)
* a ''makeWDPredicate''(aWDPredicatePattern, aPatternInstantiation) method that makes an actual WD predicate
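Pattern instantiation can be sketched as follows, with predicates modelled as plain strings for brevity; the real proposal manipulates ''Predicate'' ASTs, and the names here are hypothetical:

```java
import java.util.List;
import java.util.function.Function;

// Illustrative sketch of WD-pattern instantiation (not the proposed AST API).
// A pattern maps the children's WD predicates (the D(e_i) of the document)
// to the WD predicate of the whole extended formula.
class WDPatterns {
    static String makeWDPredicate(Function<List<String>, String> pattern,
            List<String> childWDs) {
        return pattern.apply(childWDs);
    }

    // The simplest pattern: the conjunction D(e_1) & ... & D(e_n).
    static final Function<List<String>, String> CONJUNCTIVE =
            wds -> String.join(" \u2227 ", wds);
}
```

The document's chained example (where <math>\mathit{D}(e_{i+1})</math> is only required under <math>\mathit{D}(e_i)</math>) would be another such pattern; the mechanism is the same, only the pattern body changes.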
===== Type Check =====

An extension shall give a type rule, which consists of:

* type check predicates (addressed in this section)
* a resulting type expression (only for expressions)

''Example'':

<math>(\forall i \cdot \mathit{type}(e_i) = \pow(\alpha)) \land (\forall i,j \cdot \mathit{type}(e_i)=\mathit{type}(e_j))</math>

Type checking can be reified provided the following new AST features:

* the <math>\mathit{type}</math> operator
* type variables (<math>\alpha</math>)
* the above-mentioned expression variables and predicate variables
* a ''parse()'' method that accepts these special meta variables and the <math>\mathit{type}</math> operator and returns a ''Predicate'' (a Type Predicate Pattern)
* a ''makeTypePredicate''(aTypePredicatePattern, aPatternInstantiation) method that makes an actual Type predicate
===== Type Solve =====

This section addresses type synthesis for extended expressions (the resulting-type part of a type rule). It is given as a type expression pattern, so that the actual type can be computed from the children.

''Example'':

<math>\pow(\mathit{type}(e_1))</math>

In addition to the requirements for Type Check, the following features are needed:

* a ''parse()'' method that accepts special meta variables and the <math>\mathit{type}</math> operator and returns an ''Expression'' (a Type Expression Pattern)
* a ''makeTypeExpression''(aTypeExpressionPattern, aPatternInstantiation) method that makes an actual Type expression
 +
 
 +
==== Static Checker requirements ====
 +
{{TODO}}
 +
==== Proof Obligation Generator requirements ====
 +
{{TODO}}
 +
==== Provers requirements ====
 +
{{TODO}}
 +
 
 +
=== Extension compatibility issues ===
 +
{{TODO}}
 +
 
 +
== User Input Summarization ==
 +
 
 +
Identified required data entail the following user input:
 +
 
 +
{{TODO}}
 +
 
 +
== Impact on other tools ==
 +
 
 +
Impacted plug-ins (use a factory to build formulæ):
 +
* <tt>org.eventb.core</tt>
 +
: In particular, the static checker and proof obligation generator are impacted.
 +
* <tt>org.eventb.core.seqprover</tt>
 +
* <tt>org.eventb.pp</tt>
 +
* <tt>org.eventb.pptrans</tt>
 +
* <tt>org.eventb.ui</tt>
 +
 
 +
== Identified Problems ==
 +
The parser shall enforce verifications to detect the following situations:
 +
* Two mathematical extensions are not compatible (the extensions define symbols with the same name but with a different semantics).
 +
* A mathematical extension is added to a model and there is a conflict between a symbol and an identifier.
 +
* An identifier which conflicts with a symbol of a visible mathematical extension is added to a model.
 +
 
 +
Beyond that, the following situations are problematic:
 +
* A formula has been written with a given parser configuration and is read with another parser configuration.
 +
: As a consequence, it appears as necessary to remember the parser configuration.
 +
: The static checker will then have a way to invalid the sources of conflicts (e.g., priority of operators, etc).
 +
:: ''The static checker will then have a way to invalid the formula upon detecting a conflict (name clash, associativity change, semantic change...) [[User:Mathieu|mathieu]]
 +
 
 +
* A proof may free a quantified expression which is in conflict with a mathematical extension.
 +
: SOLUTION #1: Renaming the conflicting identifiers in proofs?
 +
 
 +
== Open Questions ==
 +
 
 +
=== New types ===
 +
Which option should we prefer for new types?
 +
* OPTION #1: Transparent mode.
 +
:In transparent mode, it is always referred to the base type. As a consequence, the type conversion is implicitly supported (weak typing).
 +
:For example, it is possible to define the <tt>DISTANCE</tt> and <tt>SPEED</tt> types, which are both derived from the <math>\intg</math> base type, and to multiply a value of the former type with a value of the latter type.
 +
 
 +
* OPTION #2: Opaque mode.
 +
:In opaque mode, it is never referred to the base type. As a consequence, values of one type cannot be converted to another type (strong typing).
 +
:Thus, the above multiplication is not allowed.
 +
:This approach has at least two advantages:
 +
:* Stronger type checking.
 +
:* Better prover performances.
 +
:It also has some disadvantages:
 +
:* need of ''extractors'' to convert back to base types.
 +
:* need of extra circuitry to allow things like <math>x:=d*2</math> where <math>x, d</math> are of type <tt>DISTANCE</tt>
 +
 
 +
* OPTION #3: Mixed mode.
 +
:In mixed mode, the transparent mode is applied to scalar types and the opaque mode is applied to other types.
 +
 
 +
=== Scope of the mathematical extensions ===
 +
* OPTION #1: Project scope.
 +
:The mathematical extensions are implicitly visible to all components of the project that has imported them.
 +
* OPTION #2: Component scope.
 +
:The mathematical extensions are only visible to the components that have explicitly imported them. However, note that this visibility is propagated through the hierarchy of contexts and machines (<tt>EXTENDS</tt>, <tt>SEES</tt> and <tt>REFINES</tt> clauses).
 +
:An issue has been identified. Suppose that <tt>ext1</tt> extension is visible to component <tt>C1</tt> and <tt>ext2</tt> is visible to component <tt>C2</tt>, and there is no compatibility issue between <tt>ext1</tt> and <tt>ext2</tt>. It is not excluded that an identifier declared in <tt>C1</tt> conflict with a symbol in <tt>ext2</tt>. As a consequence, a global verification is required when adding a new mathematical extension.
 +
 
 +
== Bibliography ==
 +
* J.R. Abrial, M.Butler, M.Schmalz, S.Hallerstede, L.Voisin, [http://deploy-eprints.ecs.soton.ac.uk/80 ''Proposals for Mathematical Extensions for Event-B''], 2009.
 +
:This proposal consists in considering three kinds of extension:
 +
# Extensions of set-theoretic expressions or predicates: example extensions of this kind consist in adding the transitive closure of relations or various ordered relations.
 +
# Extensions of the library of theorems for predicates and operators.
 +
# Extensions of the Set Theory itself through the definition of algebraic types such as  lists or ordered trees using new set constructors.
 +
 
 +
[[Category:Design proposal]]

Revision as of 15:07, 25 March 2010

Currently the operators and basic predicates of the Event-B mathematical language supported by Rodin are fixed. We propose to extend Rodin so that users can define basic predicates, new operators, or new algebraic types.

Requirements

User Requirements

  • Binary operators (prefix form, infix form or suffix form).
  • Operators on boolean expressions.
  • Unary operators, such as absolute values.
Note that the pipe, which is already used for set comprehension, cannot be used to enter absolute values (in fact, since in the new design the pipe used in set comprehension is only syntactic sugar, the same symbol might be reused for absolute value. To be confirmed with the prototype parser. It may, however, be disallowed at first, until backtracking is implemented, because the lookahead makes assumptions on how the pipe symbol is used. -Mathieu ).
  • Basic predicates (e.g., the symmetry of relations sym(R) \defi R=R^{-1}).
Having a way to enter such predicates may be considered syntactic sugar, because it is already possible to use sets (e.g., R \in sym, where sym \defi \{R \mid R=R ^{-1}\}) or functions (e.g., sym(R) = \True, where sym \defi (\lambda R \qdot R \in A \rel B \mid \bool(R = R^{-1}))).
  • Quantified expressions (e.g., \sum x \qdot P \mid y, \prod x \qdot P \mid y, ~\min(S), ~\max(S)).
  • Types.
    • Enumerated types.
    • Scalar types.

User Input

The end-user shall provide the following information:

  • Keyboard input.
  • Lexicon and Syntax.
    More precisely, this includes the symbols, the form (prefix, infix, postfix), the grammar, associativity (left- or right-associative), commutativity, priority, the mode (flattened or not), ...
  • Pretty-print.
    Alternatively, the rendering may be determined from the notation parameters passed to the parser.
  • Typing rules.
  • Well-definedness.

Development Requirements

  • Scalability.

Towards a generic AST

The following AST parts are to become generic, or at least parameterised:

  • Lexer
  • Parser
  • Nodes (Formula class hierarchy): parameters needed for:
    • Type Solve (type rule needed to synthesize the type)
    • Type Check (type rule needed to verify constraints on children types)
    • WD (WD predicate)
    • PrettyPrint (tag image + notation (prefix, infix, postfix) + needs parentheses)
    • Visit Formula (getting children + visitor callback mechanism)
    • Rewrite Formula (associative formulæ have a specific flattening treatment)
  • Types (Type class hierarchy): parameters needed for:
    • Building the type expression (type rule needed)
    • PrettyPrint (set operator image)
    • getting Base / Source / Target type (type rule needed)
  • Verification of preconditions (see for example AssociativeExpression.checkPreconditions)

Vocabulary

An extension is to be understood as a single additional operator definition.

Tags

Every extension is associated with an integer tag, just like existing operators. Thus, questions arise about how to allocate new tags and how to deal with existing tags.
The solution proposed here is to keep existing tags as they are. They are already defined and hard-coded in the Formula class, so this choice is made with backward compatibility in mind.

Now concerning extension tags, we will first introduce a few hypotheses:

  • Tags_Hyp1: tags are never persisted across sessions on a given platform
  • Tags_Hyp2: tags are never referenced for sharing purposes across various platforms

In other words, cross-platform/session formula references are always made through their String representation. These assumptions, which were already made and verified for Rodin before extensions, lead us to restrict further considerations to the scope of a single session on a single platform.

The following definitions hold at a given instant t and for the whole platform.
Let \mathit{EXTENSION}_t be the set of extensions supported by the platform at instant t;
let \mathit{tag}_t denote the assignment of tags to extensions at instant t (\mathit{tag}_t \in \mathit{EXTENSION}_t \pfun \intg);
let \mathit{COMMON} be the set of existing tags defined by the Formula class (\mathit{COMMON} \sub \intg).
The following requirements emerge:

  • Tags_Req1: \forall t \qdot \mathit{tag}_t \in \mathit{EXTENSION}_t \tinj \intg
  • Tags_Req2: \forall e, t_1,t_2 \qdot \mathit{tag}_{t_1}(e)=\mathit{tag}_{t_2}(e) where t_1, t_2 are two instants during a given session
  • Tags_Req3: \forall t \qdot \ran(\mathit{tag}_t) \cap \mathit{COMMON} = \empty

The above-mentioned scope-restricting hypothesis can be reformulated as: \mathit{tag} need not be stable across sessions nor across platforms.
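As an illustration, a session-local tag allocator satisfying Tags_Req1–Tags_Req3 could be sketched as follows (all names, and the FIRST_EXTENSION_TAG bound, are hypothetical; the actual allocation lives in the Formula class and factory):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical session-local tag allocator; names are illustrative only.
class ExtensionTagRegistry {
    // Assumed to be strictly greater than every tag in COMMON
    // (the tags hard-coded in the Formula class), so Tags_Req3 holds.
    static final int FIRST_EXTENSION_TAG = 1000;

    private final Map<String, Integer> tags = new HashMap<>();
    private int next = FIRST_EXTENSION_TAG;

    // Tags_Req2: the same extension id always yields the same tag
    // within a session.
    // Tags_Req1: distinct ids get distinct tags (the counter never repeats).
    synchronized int getTag(String extensionId) {
        return tags.computeIfAbsent(extensionId, id -> next++);
    }
}
```

Since the hypotheses rule out persisting tags across sessions, a plain in-memory counter is enough here; nothing needs to be written to disk.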

Formula Factory

The requirements about tags give rise to a need for centralising the \mathit{tag} relation in order to enforce tag uniqueness. The Formula Factory appears to be a convenient and logical candidate for playing this role. Each time an extension is used to make a formula, the factory is called and it can check whether its associated tag exists, create it if needed, then return the new extended formula while maintaining tag consistency.

The factory can also provide API for requests about tags and extensions: getting the tag from an extension and conversely.

We also need additional methods to create extended formulæ. A first problem to address is: which type should these methods return? We could define as many extended types as the common AST API does, namely ExtendedUnaryPredicate, ExtendedAssociativeExpression, and so on, but this would lead to a large number of new types to deal with (in visitors, filters, …), together with a constraint about which types extensions would be forced to fit into. It is thus preferable to have as few extended types as possible, but with as much parameterisation as possible. Considering that the two basic types Expression and Predicate have to be extensible, we come to add two extended types ExtendedExpression and ExtendedPredicate.

ExtendedExpression makeExtendedExpression( ? )
ExtendedPredicate makeExtendedPredicate( ? )

A second problem to address is: which arguments should these methods take? Other factory methods take the tag, a collection of children where applicable, and a source location. In order to discuss what can be passed as argument to make extended formulæ, we have to recall that the make… factory methods have various kinds of clients, namely:

  • parser
  • POG
  • provers

(other factory clients use the parsing or identifier utility methods: SC modules, indexers, …)

Thus, the arguments should be convenient for clients, depending on which information they have at hand. The source location does not seem to require any adaptation and can be taken as argument the same way. Concerning the tag, it depends on whether clients have a tag or an extension at hand. Both are intended to be easily retrieved from the factory. As a preliminary choice, we can go for the tag and adjust this decision when we know more about client convenience.

As for children, the problem is more about their types. We want to be able to handle as many children as needed, and of all possible types. Looking at existing formula children configurations, we can find:

  • expressions with predicate children: \mathit{bool}(P)
  • expressions with expression children: E_1 + E_2
  • predicates with predicate children: P_1 \limp P_2
  • predicates with expression children: \mathit{partition}(S, E_1, E_2)
  • mixed operators: \{x \qdot P(x) \mid E(x)\}, but it is worth noting that the possibility of introducing bound variables in extended formulæ is not established yet.

Thus, for the sake of generality, children of both types should be supported for both extended predicates and extended expressions.

ExtendedExpression makeExtendedExpression(int tag, Expression[] expressions, Predicate[] predicates, SourceLocation location)
ExtendedPredicate makeExtendedPredicate(int tag, Expression[] expressions, Predicate[] predicates, SourceLocation location)
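To make the intent concrete, here is a heavily simplified sketch of such a factory method (all classes below are illustrative stand-ins, not the real org.eventb.core.ast types):

```java
// Simplified stand-ins for the AST types; illustrative only.
class Expression {}
class Predicate {}

class SourceLocation {
    final int start, end;
    SourceLocation(int start, int end) { this.start = start; this.end = end; }
}

// An extended expression keeps its tag plus both kinds of children.
class ExtendedExpression extends Expression {
    final int tag;
    final Expression[] expressions;
    final Predicate[] predicates;
    final SourceLocation location;

    ExtendedExpression(int tag, Expression[] expressions,
            Predicate[] predicates, SourceLocation location) {
        this.tag = tag;
        this.expressions = expressions;
        this.predicates = predicates;
        this.location = location;
    }
}

class SketchFormulaFactory {
    // Mirrors the proposed signature: tag + both children arrays + location.
    ExtendedExpression makeExtendedExpression(int tag, Expression[] expressions,
            Predicate[] predicates, SourceLocation location) {
        if (expressions == null || predicates == null)
            throw new IllegalArgumentException("children arrays must be non-null");
        return new ExtendedExpression(tag, expressions, predicates, location);
    }
}
```

A real implementation would additionally consult the tag registry and verify the extension's preconditions before building the node.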


Defining Extensions

An extension is meant to contain all the information and behaviour required by:

  • Keyboard
  • (Extensible Fonts ?)
  • Lexer
  • Parser
  • AST
  • Static Checker
  • Proof Obligation Generator
  • Provers

Keyboard requirements

Kbd_req1: an extension must provide an association combo/translation for every uncommon symbol involved in the notation.

Lexer requirements

Lex_req1: an extension must provide an association lexeme/token for every uncommon symbol involved in the notation.

Parser requirements

According to the Parser page, the following information is required by the parser in order to be able to parse a formula containing a given extension.

  • symbol compatibility
  • group compatibility
  • symbol precedence
  • group precedence
  • notation (see below)
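For illustration only, this information could be bundled in a small descriptor object handed to the parser (field names are invented, not the actual parser API):

```java
// Illustrative bundle of the parsing information an extension must supply;
// names are hypothetical, taken from the list above.
class NotationInfo {
    final String symbol;     // e.g. "<>" for the lozenge operator
    final String group;      // operator group, used for group compatibility
    final int precedence;    // symbol precedence within its group
    final boolean infix;     // notation kind (infix vs prefix/postfix)

    NotationInfo(String symbol, String group, int precedence, boolean infix) {
        this.symbol = symbol;
        this.group = group;
        this.precedence = precedence;
        this.infix = infix;
    }
}
```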

AST requirements

The following hard-coded concepts need to be reified for supporting extensions in the AST. An extension instance will then provide this reified information to an extended formula class so that it can fulfil its API. This is the information missing for an ExtendedExpression (resp. ExtendedPredicate) to behave as an Expression (resp. Predicate). It can also be viewed as the parameterisation of the AST.

Notation

The notation defines how the formula will be printed. In this document, we use the following convention:

  • e_i is the i-th child expression of the extended formula
  • p_i is the i-th child predicate of the extended formula

Example: infix operator "\lozenge"

e_1 \lozenge e_2 \lozenge \ldots \lozenge e_n

We define the following notation framework:

[UML class diagram of the notation framework: Notation uml.png]

On the "\lozenge" infix operator example, the iterable notation would successively return:

  • an IFormulaChild with index 1
  • the INotationSymbol "\lozenge"
  • an IFormulaChild with index 2
  • the INotationSymbol "\lozenge"
  • …
  • an IFormulaChild with index n

For the iteration not to run forever, the limit n needs to be known: this is the role of the mapsTo() method, which fixes the number of children and is called once this number is known (i.e., for a particular formula instance).

Open question: how to integrate bound identifier lists in the notation?

We may make a distinction between fixed-size notations (like n-ary operators for a given n) and variable-size notations (like associative infix operators). While fixed-size notations would have no specific expressivity limitations (besides parser conflicts with other notations), only a finite set of pre-defined variable-size notation patterns will be proposed to the user, for technical reasons related to the parser.

The following features need to be implemented to support the notation framework:

  • a notation parser that extracts a fixed-size notation pattern from a user-input String
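The iteration described above can be sketched as follows (class names are invented stand-ins for IFormulaChild and INotationSymbol):

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-ins for the notation framework interfaces.
interface NotationElement {}

class FormulaChild implements NotationElement {
    final int index;
    FormulaChild(int index) { this.index = index; }
}

class NotationSymbol implements NotationElement {
    final String image;
    NotationSymbol(String image) { this.image = image; }
}

// Variable-size infix notation pattern, e.g. e1 <> e2 <> ... <> en.
class InfixNotation {
    private final String symbol;
    InfixNotation(String symbol) { this.symbol = symbol; }

    // mapsTo(n): fix the number of children for a particular formula
    // instance, then enumerate children interleaved with the symbol.
    List<NotationElement> mapsTo(int n) {
        List<NotationElement> elements = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            if (i > 1) elements.add(new NotationSymbol(symbol));
            elements.add(new FormulaChild(i));
        }
        return elements;
    }
}
```

For n = 3 this yields exactly the child/symbol/child/symbol/child sequence of the "\lozenge" example.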
Well-Definedness

WD predicates are also user-provided data.

Example:

\mathit{D}(e_1) \land (\forall i \cdot i \in 1\mathit{..}(n-1) \Rightarrow (\mathit{D}(e_i) \Rightarrow \mathit{D}(e_{i+1})))

In order to process WD predicates, we need to add the following features to the AST:

  • the \mathit{D} operator
  • expression variables (predicate variables already exist)
  • special expression variables and predicate variables that denote a particular formula child (we need to refer to e_1 and e_i in the above example)
  • a parse() method that accepts these special meta variables and the \mathit{D} operator and returns a Predicate (a WD Predicate Pattern)
  • a makeWDPredicate(aWDPredicatePattern, aPatternInstantiation) method that makes an actual WD predicate
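As a string-level illustration of what makeWDPredicate could produce (the real method would build a Predicate AST, not text; everything here is hypothetical), the example pattern unrolled for a concrete n gives:

```java
// String-level sketch of WD pattern instantiation; illustrative only.
class WDPatternSketch {
    // Instantiates D(e1) & (forall i . i in 1..(n-1) => (D(ei) => D(e(i+1))))
    // for a concrete n by unrolling the quantifier into a conjunction.
    static String makeWDPredicate(int n) {
        StringBuilder sb = new StringBuilder("D(e1)");
        for (int i = 1; i <= n - 1; i++) {
            sb.append(" & (D(e").append(i)
              .append(") => D(e").append(i + 1).append("))");
        }
        return sb.toString();
    }
}
```

For n = 3 the pattern unrolls to D(e1) & (D(e1) => D(e2)) & (D(e2) => D(e3)), which is the shape the AST-level makeWDPredicate would build.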
Type Check

An extension shall give a type rule, which consists of:

  • type check predicates (addressed in this section)
  • a resulting type expression (only for expressions)

Example:

(\forall i \cdot \mathit{type}(e_i) = \pow(\alpha)) \land (\forall i,j \cdot \mathit{type}(e_i)=\mathit{type}(e_j))

Type checking can be reified provided the following new AST features:

  • the \mathit{type} operator
  • type variables (\alpha)
  • the above-mentioned expression variables and predicate variables
  • a parse() method that accepts these special meta variables and the \mathit{type} operator and returns a Predicate (a Type Predicate Pattern)
  • a makeTypePredicate(aTypePredicatePattern, aPatternInstantiation) method that makes an actual Type predicate
Type Solve

This section addresses type synthesis for extended expressions (the resulting type part of a type rule). The type is given as a type expression pattern, so that the actual type can be computed from the children.

Example:

 \pow(\mathit{type}(e_1))

In addition to the requirements for Type Check, the following features are needed:

  • a parse() method that accepts special meta variables and the \mathit{type} operator and returns an Expression (a Type Expression Pattern)
  • a makeTypeExpression(aTypeExpressionPattern, aPatternInstantiation) method that makes an actual Type expression
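For instance (an illustrative instantiation, not part of the proposal), consider a binary extended expression e_1 \lozenge e_2 whose children are both sets of integers. The type check pattern and the type expression pattern combine as follows:

```latex
% Type check: both children share the set type \pow(\alpha)
\mathit{type}(e_1) = \pow(\intg) \land \mathit{type}(e_2) = \pow(\intg)
% The check succeeds with \alpha := \intg, and Type Solve then
% synthesizes the resulting type from the pattern:
\pow(\mathit{type}(e_1)) = \pow(\pow(\intg))
```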

Static Checker requirements

TODO

Proof Obligation Generator requirements

TODO

Provers requirements

TODO

Extension compatibility issues

TODO

User Input Summarization

Identified required data entail the following user input:

TODO

Impact on other tools

Impacted plug-ins (those that use a factory to build formulæ):

  • org.eventb.core
In particular, the static checker and proof obligation generator are impacted.
  • org.eventb.core.seqprover
  • org.eventb.pp
  • org.eventb.pptrans
  • org.eventb.ui

Identified Problems

The parser shall perform verifications to detect the following situations:

  • Two mathematical extensions are not compatible (the extensions define symbols with the same name but with a different semantics).
  • A mathematical extension is added to a model and there is a conflict between a symbol and an identifier.
  • An identifier which conflicts with a symbol of a visible mathematical extension is added to a model.

Beyond that, the following situations are problematic:

  • A formula has been written with a given parser configuration and is read with another parser configuration.
As a consequence, it appears necessary to remember the parser configuration.
The static checker will then have a way to invalidate the sources of conflicts (e.g., priority of operators, etc.).
The static checker will then have a way to invalidate the formula upon detecting a conflict (name clash, associativity change, semantic change, ...) mathieu
  • A proof may free a quantified expression which is in conflict with a mathematical extension.
SOLUTION #1: Renaming the conflicting identifiers in proofs?

Open Questions

New types

Which option should we prefer for new types?

  • OPTION #1: Transparent mode.
In transparent mode, the base type is always referred to. As a consequence, type conversion is implicitly supported (weak typing).
For example, it is possible to define the DISTANCE and SPEED types, which are both derived from the \intg base type, and to multiply a value of the former type by a value of the latter.
  • OPTION #2: Opaque mode.
In opaque mode, the base type is never referred to. As a consequence, values of one type cannot be converted to another type (strong typing).
Thus, the above multiplication is not allowed.
This approach has at least two advantages:
  • Stronger type checking.
  • Better prover performance.
It also has some disadvantages:
  • Need for extractors to convert back to base types.
  • Need for extra circuitry to allow things like x := d * 2, where x and d are of type DISTANCE.
  • OPTION #3: Mixed mode.
In mixed mode, the transparent mode is applied to scalar types and the opaque mode is applied to other types.
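The opaque option can be pictured in Java terms, where the new type wraps the base type and only an explicit extractor converts back (a hypothetical analogy, not proposed Event-B syntax):

```java
// Opaque-mode analogy: DISTANCE wraps the base type; its values cannot be
// mixed with SPEED values or raw integers without an explicit extractor.
final class Distance {
    private final int base;                 // underlying \intg value
    Distance(int base) { this.base = base; }

    int extract() { return base; }          // extractor back to the base type

    // The "extra circuitry" needed for things like x := d * 2 in opaque mode.
    Distance times(int scalar) { return new Distance(base * scalar); }
}
```

In transparent mode neither extract() nor times() would be needed, which is exactly the trade-off between the two options.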

Scope of the mathematical extensions

  • OPTION #1: Project scope.
The mathematical extensions are implicitly visible to all components of the project that imported them.
  • OPTION #2: Component scope.
The mathematical extensions are only visible to the components that have explicitly imported them. However, note that this visibility is propagated through the hierarchy of contexts and machines (EXTENDS, SEES and REFINES clauses).
An issue has been identified. Suppose that extension ext1 is visible to component C1, that ext2 is visible to component C2, and that there is no compatibility issue between ext1 and ext2. It is not excluded that an identifier declared in C1 conflicts with a symbol in ext2. As a consequence, a global verification is required when adding a new mathematical extension.

Bibliography

  • J.R. Abrial, M. Butler, M. Schmalz, S. Hallerstede, L. Voisin, Proposals for Mathematical Extensions for Event-B, http://deploy-eprints.ecs.soton.ac.uk/80, 2009.
This proposal considers three kinds of extension:
  1. Extensions of set-theoretic expressions or predicates: examples of this kind include adding the transitive closure of relations or various ordered relations.
  2. Extensions of the library of theorems for predicates and operators.
  3. Extensions of the Set Theory itself through the definition of algebraic types such as lists or ordered trees using new set constructors.