Difference between pages "Constrained Dynamic Parser" and "Membership in Goal"

From Event-B
(Difference between pages)
Jump to navigationJump to search
imported>Mathieu
m (→‎Some Difficulties: remove tODO)
 
imported>Billaude
 
Line 1: Line 1:
{{TOCright}}
+
= Objective =
This page describes the requirements for a '''generic parser''' for the [[Event-B Mathematical Language]].
 
  
A first design proposal is also drafted.
+
This page describes the design of the reasoner MembershipGoal and its associated tactic MembershipGoalTac.<br>
 +
This reasoner discharges sequent whose goal denotes a membership which can be inferred from hypotheses. Here an basic example of what it discharges :<br>
 +
<math>H,\quad x\in S,\quad S\subset T,\quad T\subseteq U \quad\vdash x\in U</math>
  
== Requirements ==
+
= Analysis =
In order to be usable [[mathematical extensions]] require that the event-b mathematical language syntax can be extended by the final user.
 
  
The lexical analyser and the syntaxic parser thus have to be extensible in a simple enough way (from a user point of vue).
+
Such sequent are proved by PP and ML. But, these provers have both drawbacks :
 +
*All the visible are added as needed hypotheses, which is most of the time not the case.
 +
*They take quite consequent time to prove it (even with the basic example given here above, the difference in time execution is noticeable).
 +
*If there are too many hypotheses, or if the expression of the <math>x</math> is too complicated, they may not prove it.
 +
This is particularly true when in the list of inclusion expressions of each side of the relation are not equal. For example : <math>H,\quad a\in S,\quad S\subset T_1\cap T_2,\quad T_1\cup T_3\subseteq  U\quad\vdash a\in U</math>
 +
<p>
 +
Such a reasoner contributes to prove more Proof Obligations automatically, faster and with fewer needed hypotheses which makes proof rule more legible and proof replay less sensitive to modifications.
  
=== Requirements Exported by the Current Language Design ===
+
= Design Decision =
==== Operator Priority ====
 
* operator are defined by group,
 
* group of operator have a defined precedences,
 
* there may be precedences defined inside groups.
 
  
==== Operator Compatibility ====
+
== Tactic ==
* a compatibility table defines allowed associativities inside a group,
 
* a compatibility table defines allowed associativities between groups (it allows to forbid a syntaxic construction like <math>f(x)^{-1}\;</math>
 
: nota: this requirement was added afterwards with consistency in mind.
 
  
=== Expected Extension Schemes ===
+
This part explains how the tactic (MembershipGoalTac) associated to the reasoner MembershipGoal is working.
We do want to at least define operators of the following form :
+
=== Goal ===
* infix : <math>a + b\;</math> or <math>a \vdash b : c\;</math>
+
The tactic (as the reasoner) should works only on goals such as :
* prefix : <math>\neg a\;</math>
+
*<math>\cdots~\in~\cdots</math>
* postfix : <math> R^*\;</math>  
+
For examples :
* closed : <math>\|a\|\;</math>
+
*<math>f(x)\in g\otimes h</math>
* parentheses sugar : <math>(a +b) * c\;</math>
+
*<math>x\in A\cprod\left(B\cup C\right)</math>
* functional postfix : <math> a \langle b \rangle\;</math>
+
*<math>x\mapsto y\in A\cprod B</math>
* functional prefix : <math> \langle b \rangle f\;</math>
+
In the latter case, the reasoner won't try to prove that ''x'' belongs to ''A'' and ''y'' belongs to ''B'', but that the mapplet belong to the Cartesian product.
* lambda like : <math> \lambda x\mapsto y . P | E\;</math>
+
=== Hypotheses ===
* Union like : <math> \Union\{ e \mid P\}</math> or <math> \Union\{ x,y . P \mid e\}</math>
+
Now we have to find hypotheses leading to discharge the sequent. To do so, the tactic recovers two kinds of hypotheses :
* sets : <math>\{a, b, c + e\}\;</math> or <math>\{ e \mid P\}\;</math> or <math>\{x,y . P \mid e\}\;</math>
+
#the ones related to the left member of the goal <math>\left( x\in S\right)</math> (this is the start point):
 +
#*<math>x\in \cdots</math>
 +
#*<math>\cdots\mapsto x\mapsto\cdots\in\cdots</math>
 +
#*<math>\left\{\cdots, x,\cdots\right\}=\cdots</math>
 +
#*<math>\left\{\cdots, \cdots\mapsto x\mapsto\cdots,\cdots\right\}=\cdots</math>
 +
#the ones denoting inclusion (all but the ones matching the description of the first point) :
 +
#*<math>\cdots\subset\cdots</math>
 +
#*<math>\cdots\subseteq\cdots</math>
 +
#*<math>\cdots=\cdots</math>
 +
Then, it will search a link between those hypotheses so that the sequent can be discharged.
 +
=== Find a path ===
 +
Now that we recovered all the hypotheses that could be useful for the reasoner, it remains to find a path among the hypotheses leading to discharge the sequent. Depending on the relations on each side of the inclusion, we will act differently. <math>f</math> always represent an expression (may be a domain, a range, etc.).
 +
#The following sequent is provable because <math>f\subseteq \varphi (f)</math>.
 +
#*<math>x\in f,\quad \varphi (f)\subseteq g\quad\vdash\quad x\in g</math>
 +
#*<math>\varphi (f) = f\quad\mid\quad f\cup h \quad\mid\quad h\cup f \quad\mid\quad h\ovl f</math>
 +
#The following sequent is provable because <math>\psi (f)\subseteq f</math>.
 +
#*<math>x\in \psi (f),\quad f\subseteq g\quad\vdash\quad x\in g</math>
 +
#*<math>\psi (f) = f\quad\mid\quad f\cap h \quad\mid\quad h\cap f \quad\mid\quad f\setminus h \quad\mid\quad f\ransub A \quad\mid\quad f\ranres A \quad\mid\quad A\domsub f \quad\mid\quad A\domres f</math>
 +
#We can generalized the first two points. This is the Russian dolls system. We can easily prove a sequent with multiple inclusions by going from hypothesis to hypothesis.
 +
#*<math>x\in \psi (f),\quad \varphi (f)\subseteq g\quad\vdash\quad x\in g</math>
 +
#For some relations, [[#positions|positions]] are needed to be known to continue to find hypotheses, but it is not always necessary.
 +
#*<math>x\mapsto y\in f,\quad f\subseteq A\cprod B\quad\vdash\quad x\in A</math>
 +
#*<math>x\in dom(f),\quad f\subseteq A\cprod B\quad\vdash\quad x\in A</math>
 +
#*<math>x\in ran(f),\quad f\subseteq A\cprod B\quad\vdash\quad x\in B</math>
  
We also like to define precisely precedences and associativity between existing and new operators.
+
By using these inclusion and rewrites, it tries to find a path among the recovered hypotheses. Every one of those should only be used once, avoiding possible infinite loop <math>\left(A\subseteq B,\quad B\subseteq A\right)</math>. Among all paths that lead to discharge the sequent, the tactic give the first it finds. Moreover, so that the reasoner does not do the same work as the tactic of writing new hypothesis, it gives all needed hypotheses and added hypotheses in the input.
  
=== Requirements exported by the dynamic feature ===
+
== Reasoner ==
* the precedence should not be enumerated but defined by a relation, like: '+' < '*' and '*' < '**', ...
 
  
== Limitations ==
+
The way the reasoner must work is still in discussion.
  
== Design Alternatives ==
+
= Implementation =  
  
=== Make Existing Parser Extensible ===
+
This section explain how the tactic has bee implemented.
The existing parser is a LL recursive descent parser generated by the [http://www.ssw.uni-linz.ac.at/Coco/ Coco/R Compiler Generator], which makes extensive use of pre-computed lookahead, which makes it very difficult to be transformed in a sufficiently enough generic parser.
 
  
=== Parser Combinator ===
+
=== Positions ===
:* on [http://en.wikipedia.org/wiki/Parser_combinator wikipedia]. Some implementations are referenced [http://pdos.csail.mit.edu/~baford/packrat/ here].
+
As it was said, we may sometimes need the position. It is represented by an array of integer. Here are the possible values the array contains (atomic positions) :
:* with functionnal language [http://wiki.portal.chalmers.se/agda/Agda Agda] : ''[http://www.cs.nott.ac.uk/~nad/publications/danielsson-norell-mixfix.pdf Parsing Mixfix Operators]
+
* '''''kdom''''' : it corresponds to the domain.
:: This paper is interesting in its proposal of using an acyclic graph to define operator precedence.
+
**<math>\left[A\cprod B\right]_{pos\;=\;kdom} = A</math>
:* with functional language [http://user.cs.tu-berlin.de/~opal/opal-language.html OPAL] : ''[http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=8A8A8B39FFBF1AEF17C782C4822C5AF6?doi=10.1.1.58.655&rep=rep1&type=url&i=0 How To Obtain Powerful Parsers That Are Elegant and Practical]''
+
**<math>\left[x\mapsto y\right]_{pos\;=\;kdom} = x</math>
 +
**<math>\left[g\right]_{pos\;=\;kdom} = dom(g)</math>
 +
* '''''kran''''' : it corresponds to the domain.
 +
**<math>\left[A\cprod B\right]_{pos\;=\;kran} = B</math>
 +
**<math>\left[x\mapsto y\right]_{pos\;=\;kran} = y</math>
 +
**<math>\left[g\right]_{pos\;=\;kran} = ran(g)</math>
 +
* '''''leftDProd''''' : it corresponds to the left member of a direct product.
 +
**<math>\left[f\otimes g\right]_{pos\;=\;leftDProd} = f</math>
 +
**<math>\left[A\cprod \left(B\cprod C\right)\right]_{pos\;=\;leftDProd} = A\cprod B</math>
 +
* '''''rightDProd''''' : it corresponds to the right member of a direct product
 +
**<math>\left[f\otimes g\right]_{pos\;=\;rightDProd} = g</math>
 +
**<math>\left[A\cprod\left(B\cprod C\right)\right]_{pos\;=\;rightDProd} = A\cprod C</math>
 +
* '''''leftPProd''''' : it corresponds to the left member of a parallel product
 +
**<math>\left[f\parallel g\right]_{pos\;=\;leftPProd} = f</math>
 +
**<math>\left[\left(A\cprod B\right)\cprod\left(C\cprod D\right)\right]_{pos\;=\;leftPProd} = A\cprod C</math>
 +
* '''''rightPProd''''' : it corresponds to the right member of a parallel product
 +
**<math>\left[f\parallel g\right]_{pos\;=\;rightPProd} = g</math>
 +
**<math>\left[\left(A\cprod B\right)\cprod\left(C\cprod D\right)\right]_{pos\;=\;rightPProd} = B\cprod D</math>
 +
* '''''converse''''' : it corresponds to the child of an inverse
 +
**<math>\left[f^{-1}\right]_{pos\;=\;converse}=f</math>
 +
**<math>\left[A\cprod B\right]_{pos\;=\;converse} = B\cprod A</math>
 +
For example, the following expressions at the given positions are equivalent.
 +
:<math>\left[f\otimes\left(\left(A\cprod dom(g)\right)\parallel g\right)\right]_{pos\;=\;\left[rightDProd,\;leftPProd,\;kran\right]} = \left[\left(A\cprod dom(g)\right)\parallel g\right]_{pos\;=\;\left[leftPProd,\;kran\right]} = \left[A\cprod dom(g)\right]_{pos\;=\;\left[kran\right]} = \left[dom(g)\right]_{pos\;=\;\left[~\right]} = \left[g\right]_{pos\;=\;\left[kdom\right]}</math>
  
=== Pratt Parser ===
+
Some combinations of atomic positions are equivalent. In order to simplify comparison between positions, they are normalized :
:* [http://portal.acm.org/citation.cfm?id=512931 original paper from Vaughan Pratt]
+
*<math>ran(f^{-1}) = dom(f)\quad\limp\quad \left[f\right]_{pos \;=\; \left[converse,~kran\right]} = \left[f\right]_{pos\;=\;\left[kdom\right]}</math>
:* [http://effbot.org/zone/tdop-index.htm SImple ''Pratt'' parsing in python]
+
*<math>dom(f^{-1}) = ran(f)\quad\limp\quad \left[f\right]_{pos \;=\; \left[converse,~kdom\right]} = \left[f\right]_{pos\;=\;\left[kran\right]}</math>
:* [http://javascript.crockford.com/tdop/tdop.html ''Pratt'' parsing in javascript]
+
*<math>\left(f^{-1}\right)^{-1} = f\quad\limp\quad \left[f\right]_{pos \;=\; \left[converse,~converse\right]} = \left[f\right]_{pos\;=\;\left[~\right]}</math>
Note that a ''pratt'' parser is essentially a parser for expressions (where expression parts are themselves expressions)... But it may be tweaked to limit the accepted sub-expressions or to enforce some non-expression parts inside an expression. More on that hereafter.
 
  
=== Some Existing Extensible Languages ===
+
=== Goal ===
* [http://www.chrisseaton.com/katahdin/ Katahdin], and the associated paper [http://www.chrisseaton.com/katahdin/katahdin-thesis.pdf A Programming Language Where the Syntax and Semantics Are Mutable at Runtime] which describes the language and its implementation (in C#).
 
* [http://itcentre.tvu.ac.uk/~clark/xmf.html XMF] whose implementation seems not be describes in details.
 
* [http://xlr.sourceforge.net/index.html XLR: Extensible Language and Runtime], which looks like a ''pratt'' parser according to the [http://xlr.svn.sourceforge.net/viewvc/xlr/trunk/xl2/xlr/parser.cpp?revision=805&view=markup sources].
 
* [http://www.pi-programming.org/What.html <math>\pi</math>] whose implementation seems not be describes in details (but sources availables).
 
  
== Design Proposal ==
+
As explained in the design decision part, goal is checked. If it matches the description here above <math>\left(x\in S\right)</math> then ''x'' is stored in an attribute. Moreover, from the set ''S'', we compute every pair ''expression'' & ''position'' equivalent to it. For example, from the set <math>dom(ran(ran(g)))</math>, the map will be computed :
=== Main Concepts ===
+
*<math>dom(ran(ran(g)))\;\mapsto\;[\;]</math>
;symbol: a symbol is a lexem known by the parser.
+
*<math>ran(ran(g))\;\mapsto\;[0]</math>
;group: each symbol belongs to one and only one group.
+
*<math>ran(g)\;\mapsto\;[1,~0]</math>
;symbol compatibility: a relation telling if inside a group two symbol are compatibles.
+
*<math>g\;\mapsto\;[1,~1,~0]</math>
:Two symbol are compatibles is they can be parsed without parentheses (see [[Event-B Mathematical Language]] for more on compatibility).
+
Only ''range'', ''domain'' and ''converse'' can be taken into account to get all the possibles goals.
;group compatibility: a relation telling if two groups are compatibles.
 
:Two symbol of different groups are said compatibles if their group are compatibles.
 
;symbol associativity: a relation which gives the associative precedence of symbols inside a given group.
 
;group associativity: a relation which gives the associative precedence of groups.
 
:Two symbol of different groups have a relative precedence given by the associative precedence of their groups.
 
;generic parser: the parser which is defined.
 
: This parser may be called recursively (an infix expression being build upon a left and right sub-expression) and is often the default parser used when building the AST associated with a ''symbol''.
 
;specific parser: some specific parser, predefined as helper functions to build specific AST.
 
:Some example of specific parser are the one used to parse and build list of identifiers (for quantified predicates), mapplets lists (for lambda), and so one...
 
  
=== Proposed User Interface ===
+
A pair (expression ; position) is said equal to the goal if and only if there exists a pair equivalent to the goal (GoalExp ; GoalPos) and a pair equivalent to the given pair (Exp ; Pos) such as ''Pos = GoalPos'' and ''Exp'' is contained in ''GoalExp''.
''The hereafter proposition does not preclude the final UI to take other (more user friendly forms). But the concepts should probably stay the same''
 
  
We propose that the user will be able to create several predefined type of symbols (new types may certainly be further added by mean of eclipse extension points). When defining a symbol, the user can choose the parser used to parse the sub-expressions (default to generic parser, but a specific parser may be used) and can enforce the form of the resulting parsed sub-expression in terms of their AST root type.
+
=== Hypotheses ===
  
==== Predefined Symbol Builders ====
+
As explained in the design decision part, there are two kinds of hypotheses which are recovered. But when hypotheses related to the left member of the goal <math>\left(x\in S\right)</math> are stored, the position of ''x'' is also record. Then, if there is an hypothesis such as <math>\left\{\cdots\;,\;y\mapsto x\mapsto z\;,\;m\mapsto x\;,\;\cdots\right\} = A</math>, then this hypothesis is mapped to the positions <math>\left\{\left[0,~1\right],~\left[1\right]\right\}</math>.
The proposed predefined symbol builders are:
 
;infix: For example predicate conjunction which expects two predicates for left and right parts may be defined by <code>infix("∧", gid="(logic pred)", expected_nonterm=["predicate", "predicate"])</code>
 
:As another example, substitution ≔ of group '(infix subst)' which expect two ''ident list'' for left and right expressions may be defined by <code>infix(symbol="≔", gid="(infix subst)", parsers=["ident_list", "ident_list"], expected_nonterm=["ident_list", "ident_list"])</code>
 
  
;prefix: <code>prefix("¬", "(not pred)", expected_nonterm=["predicate"])</code>
+
=== Find a path ===
  
;atomic: <code>atomic("⊤", "(atomic pred)")</code>
+
Let's considered the sequent with the following goal : <math>x\in V</math>.
 +
We start with the hypotheses which have a connection with the goal's member. Such a hypothesis gives two informations : the position <math>pos</math> and the set <math>S</math> as explained in [[#Hypotheses|hypotheses]]. Then, for each equivalent pair to these one <math>\left(S', pos'\right)</math>, we compute set containing <math>S'</math> ([[#Design Decision#Tactic#Find a path| Find a path 2.]]). For every new pair, we test if it is contained in the goal.
  
;prefix closed: used to build prefix symbol which must be followed by enclosing symbol(brackets, parentheses,..). For example <tt>finite</tt> may be defined by: <code>prefix_closed("finite", "(", ")","(atomic pred)", expected_nonterm=["expression"])</code>
+
To be continued.
  
:postfix: <code>postfix("∼", gid="(unary relation)",  expected_nonterm=["expression"])</code>
+
= Untreated cases =
  
;quantified: <code>quantified("∀", "·", expected_nonterm=["predicate"]</code>
+
Some cases are not treated. Further enhancement may be provided for some.
 +
*<math>x\in A\cup B,\quad A\cup B\cup C\subseteq D\quad\vdash\quad x\in D</math>
 +
*<math>x\in f,\quad f\in A\;op\;B\quad\vdash\quad x\in A\cprod B</math>
 +
*<math>f\ovl \left\{x\mapsto y\right\}\subseteq A\cprod B\quad\vdash\quad x\in A</math>
 +
*<math>x\in f\otimes g,\quad f\subseteq A\cprod B,\quad g\subseteq C\cprod D\quad\vdash\quad x\in (A\cprod C)\cprod(B\cprod D)</math>
 +
*<math>x\in f\otimes g,\quad f\subseteq h\quad\vdash\quad x\in h\otimes g</math>
 +
*<math>x\in \left\{a,~b,~c\right\},\quad\left\{a,~b,~c,~d,~e,~f\right\}\subseteq D\quad\vdash\quad x\in D</math>
 +
*<math>x\in A\cprod B,\quad A\subseteq C\quad\vdash\quad x\in C\cprod B</math>
 +
*<math>x\in dom(f)\cap A\quad\vdash\quad x\in dom(A\domres f)</math>
 +
*<math>x\in ran(f)\cap A\quad\vdash\quad x\in ran(f\ranres A)</math>
 +
*<math>x\in \quad\vdash\quad</math>
 +
*<math>\quad\vdash\quad</math>
 +
*<math>\quad\vdash\quad</math>
 +
*<math>\quad\vdash\quad</math>
 +
*:<math>\bigl(</math> where <math>op_1</math> and <math>op_2</math> are ones of :<math>\quad\rel, \trel, \srel, \strel, \pfun, \tfun, \pinj, \tinj, \psur, \tsur, \tbij\bigr)</math>
  
;lambda like: <code>lambda_like("λ", "·", u"∣", gid="(quantification)", expected_nonterm=["ident list", "predicate", "expression"])</code>
+
[[Category:Design proposal]]
 
 
;quantified union like: <code>Union_like("⋃", "·", "∣", gid="(quantification)")</code>
 
 
 
;closed sugar: use for parentheses added to enforce associativity: <code>closed_sugar("(", ")")</code>
 
 
 
;functional postfix: <code>functional_postfix("(", ")", gid="(functional)", expected_nonterm=["expression", "expression"])</code>
 
 
 
Some right associative versions of those symbol builders may also be supplied (infix_right, or a specific parameter ''right associative'' parameter may also be provided to the ''infix'' symbol builder).
 
 
 
==== Predefined Specific Parser ====
 
* expression lists
 
* ident lists
 
* maplet tree (for lambda)
 
 
 
 
 
==== Defining Compatibility and Precedences ====
 
Inside a group, ''compatibility'' may be defined like:
 
<code>
 
    g = group["(binop)"]
 
    g.add_many_compatibility(
 
            [  ("∪", "∪"),
 
                ("∩", "∩"),
 
                ("∩", "∖"),
 
                ("∩", "▷"),
 
                ("∩", "⩥"),
 
                ("∘", "∘"),
 
                (";", ";"),
 
                (";", "▷"),
 
                (";", "⩥"),
 
                ("", ""),
 
                ("◁", "∩"),
 
                ("◁", "∖"),
 
                ("◁", ";"),
 
                ("◁", "⊗"),
 
                ("◁", "▷"),
 
                ("◁", "⩥"),
 
                ("⩥", "∩"),
 
                ("⩥", "∖"),
 
                ("⩥", ";"),
 
                ("⩥", "⊗"),
 
                ("⩥", "▷"),
 
                ("⩥", "⩥"),
 
                ("×", "×"), ]
 
</code>
 
which defines the compatibility table for expression binary operator as defined in [[Event-B Mathematical Language]].
 
 
 
 
 
Inside a group, ''precedence'' may be defined like:
 
<code>
 
g = group["(arithmetic)"]
 
g.add_many_precedence(
 
            [  ("^", "÷"),
 
                ("^", "∗"),
 
                ("^", "+"),
 
                ("^", "−"),
 
                ("^", "mod"),
 
                ("+", "∗"),
 
                ("−", "∗"),
 
                ("+", "÷"),
 
                ("−", "÷"),
 
                ("+", "mod"),
 
                ("−", "mod"), ] )
 
</code>
 
which defines ^ as having the least precedence amongst the operators of the expression arithmetic group.
 
 
 
Similarly, precedences and compatibilities between groups may be defined :
 
<code>
 
    group_precedence.add(("(binop)", "(interval)"))
 
    group_precedence.add(("(interval)", "(arithmetic)"))
 
    group_precedence.add(("(arithmetic)", "(unary relation)"))
 
    group_precedence.add(("(unary relation)", "(bound unary)"))
 
    group_precedence.add(( "(bound unary)", "(bool)"))
 
 
 
    group_compatibility.add_all( [ "(binop)", "(interval)", (arithmetic)", "(unary relation)", "(unary relation)", "(bound unary)",....])
 
</code>
 
 
 
==== Some Difficulties ====
 
Some difficulties appears when trying to implement an event-b parser with the preceding scheme:
 
* A symbol may be declared by several different builders (thinf of unary and infix minus for example).
 
** ''pratt'' parser are able to cope with a symbol appearing either at the beginning of an expression or inside an expression (see ''led'' and ''nud'' hereafter). For example, unary minus and infix minus wan be both defined by the proposed scheme.
 
** but in their default form, ''pratt'' parser failed in parsing symbol like '{' which appears at the beginning of extension sets, comprehension sets and quantified sets.
 
::The existing event-b parser use infinite lookahead to determine which one of those three sets is actually being parsed. We can not afford such a solution, as it would hinder extensibility of the parser. Implementing a backtracking parser is probably our best solution provided that we use memoisation to alleviate the cost of parsing formulas that require heavy backtracking.
 
 
 
* Defining a grammar or a grammar extension is not an easy job, even with those simple mechanisms.
 
: It would certainly be a good idea to implement a user interface to grammar testing so that an extension designer as the mean to ensure that he agree with the parser on the semantic of (or at least on the AST obtained by using) its extension.
 
 
 
=== Core Algorithm ===
 
=== Sample Implementation ===
 
A prototype as been developed in Python to quickly try different solutions.
 
 
 
The development tree is available at http://bitbucket.org/matclab/eventb_pratt_parser/
 
 
 
 
 
 
 
 
 
 
 
 
 
[[Category:Design Proposal]]
 

Revision as of 13:28, 9 August 2011

Objective

This page describes the design of the reasoner MembershipGoal and its associated tactic MembershipGoalTac.
This reasoner discharges sequent whose goal denotes a membership which can be inferred from hypotheses. Here an basic example of what it discharges :
H,\quad x\in S,\quad S\subset T,\quad T\subseteq U \quad\vdash x\in U

Analysis

Such sequent are proved by PP and ML. But, these provers have both drawbacks :

  • All the visible are added as needed hypotheses, which is most of the time not the case.
  • They take quite consequent time to prove it (even with the basic example given here above, the difference in time execution is noticeable).
  • If there are too many hypotheses, or if the expression of the x is too complicated, they may not prove it.

This is particularly true when in the list of inclusion expressions of each side of the relation are not equal. For example : H,\quad a\in S,\quad S\subset T_1\cap T_2,\quad T_1\cup T_3\subseteq  U\quad\vdash a\in U

Such a reasoner contributes to prove more Proof Obligations automatically, faster and with fewer needed hypotheses which makes proof rule more legible and proof replay less sensitive to modifications.

Design Decision

Tactic

This part explains how the tactic (MembershipGoalTac) associated to the reasoner MembershipGoal is working.

Goal

The tactic (as the reasoner) should works only on goals such as :

  • \cdots~\in~\cdots

For examples :

  • f(x)\in g\otimes h
  • x\in A\cprod\left(B\cup C\right)
  • x\mapsto y\in A\cprod B

In the latter case, the reasoner won't try to prove that x belongs to A and y belongs to B, but that the mapplet belong to the Cartesian product.

Hypotheses

Now we have to find hypotheses leading to discharge the sequent. To do so, the tactic recovers two kinds of hypotheses :

  1. the ones related to the left member of the goal \left( x\in S\right) (this is the start point):
    • x\in \cdots
    • \cdots\mapsto x\mapsto\cdots\in\cdots
    • \left\{\cdots, x,\cdots\right\}=\cdots
    • \left\{\cdots, \cdots\mapsto x\mapsto\cdots,\cdots\right\}=\cdots
  2. the ones denoting inclusion (all but the ones matching the description of the first point) :
    • \cdots\subset\cdots
    • \cdots\subseteq\cdots
    • \cdots=\cdots

Then, it will search a link between those hypotheses so that the sequent can be discharged.

Find a path

Now that we recovered all the hypotheses that could be useful for the reasoner, it remains to find a path among the hypotheses leading to discharge the sequent. Depending on the relations on each side of the inclusion, we will act differently. f always represent an expression (may be a domain, a range, etc.).

  1. The following sequent is provable because f\subseteq \varphi (f).
    • x\in f,\quad \varphi (f)\subseteq g\quad\vdash\quad x\in g
    • \varphi (f) = f\quad\mid\quad f\cup h \quad\mid\quad h\cup f \quad\mid\quad h\ovl f
  2. The following sequent is provable because \psi (f)\subseteq f.
    • x\in \psi (f),\quad f\subseteq g\quad\vdash\quad x\in g
    • \psi (f) = f\quad\mid\quad f\cap h \quad\mid\quad h\cap f \quad\mid\quad f\setminus h \quad\mid\quad f\ransub A \quad\mid\quad f\ranres A \quad\mid\quad A\domsub f \quad\mid\quad A\domres f
  3. We can generalized the first two points. This is the Russian dolls system. We can easily prove a sequent with multiple inclusions by going from hypothesis to hypothesis.
    • x\in \psi (f),\quad \varphi (f)\subseteq g\quad\vdash\quad x\in g
  4. For some relations, positions are needed to be known to continue to find hypotheses, but it is not always necessary.
    • x\mapsto y\in f,\quad f\subseteq A\cprod B\quad\vdash\quad x\in A
    • x\in dom(f),\quad f\subseteq A\cprod B\quad\vdash\quad x\in A
    • x\in ran(f),\quad f\subseteq A\cprod B\quad\vdash\quad x\in B

By using these inclusion and rewrites, it tries to find a path among the recovered hypotheses. Every one of those should only be used once, avoiding possible infinite loop \left(A\subseteq B,\quad B\subseteq A\right). Among all paths that lead to discharge the sequent, the tactic give the first it finds. Moreover, so that the reasoner does not do the same work as the tactic of writing new hypothesis, it gives all needed hypotheses and added hypotheses in the input.

Reasoner

The way the reasoner must work is still in discussion.

Implementation

This section explain how the tactic has bee implemented.

Positions

As it was said, we may sometimes need the position. It is represented by an array of integer. Here are the possible values the array contains (atomic positions) :

  • kdom : it corresponds to the domain.
    • \left[A\cprod B\right]_{pos\;=\;kdom} = A
    • \left[x\mapsto y\right]_{pos\;=\;kdom} = x
    • \left[g\right]_{pos\;=\;kdom} = dom(g)
  • kran : it corresponds to the domain.
    • \left[A\cprod B\right]_{pos\;=\;kran} = B
    • \left[x\mapsto y\right]_{pos\;=\;kran} = y
    • \left[g\right]_{pos\;=\;kran} = ran(g)
  • leftDProd : it corresponds to the left member of a direct product.
    • \left[f\otimes g\right]_{pos\;=\;leftDProd} = f
    • \left[A\cprod \left(B\cprod C\right)\right]_{pos\;=\;leftDProd} = A\cprod B
  • rightDProd : it corresponds to the right member of a direct product
    • \left[f\otimes g\right]_{pos\;=\;rightDProd} = g
    • \left[A\cprod\left(B\cprod C\right)\right]_{pos\;=\;rightDProd} = A\cprod C
  • leftPProd : it corresponds to the left member of a parallel product
    • \left[f\parallel g\right]_{pos\;=\;leftPProd} = f
    • \left[\left(A\cprod B\right)\cprod\left(C\cprod D\right)\right]_{pos\;=\;leftPProd} = A\cprod C
  • rightPProd : it corresponds to the right member of a parallel product
    • \left[f\parallel g\right]_{pos\;=\;rightPProd} = g
    • \left[\left(A\cprod B\right)\cprod\left(C\cprod D\right)\right]_{pos\;=\;rightPProd} = B\cprod D
  • converse : it corresponds to the child of an inverse
    • \left[f^{-1}\right]_{pos\;=\;converse}=f
    • \left[A\cprod B\right]_{pos\;=\;converse} = B\cprod A

For example, the following expressions at the given positions are equivalent.

\left[f\otimes\left(\left(A\cprod dom(g)\right)\parallel g\right)\right]_{pos\;=\;\left[rightDProd,\;leftPProd,\;kran\right]} = \left[\left(A\cprod dom(g)\right)\parallel g\right]_{pos\;=\;\left[leftPProd,\;kran\right]} = \left[A\cprod dom(g)\right]_{pos\;=\;\left[kran\right]} = \left[dom(g)\right]_{pos\;=\;\left[~\right]} = \left[g\right]_{pos\;=\;\left[kdom\right]}

Some combinations of atomic positions are equivalent. In order to simplify comparison between positions, they are normalized :

  • ran(f^{-1}) = dom(f)\quad\limp\quad \left[f\right]_{pos \;=\; \left[converse,~kran\right]} = \left[f\right]_{pos\;=\;\left[kdom\right]}
  • dom(f^{-1}) = ran(f)\quad\limp\quad \left[f\right]_{pos \;=\; \left[converse,~kdom\right]} = \left[f\right]_{pos\;=\;\left[kran\right]}
  • \left(f^{-1}\right)^{-1} = f\quad\limp\quad \left[f\right]_{pos \;=\; \left[converse,~converse\right]} = \left[f\right]_{pos\;=\;\left[~\right]}

Goal

As explained in the design decision part, goal is checked. If it matches the description here above \left(x\in S\right) then x is stored in an attribute. Moreover, from the set S, we compute every pair expression & position equivalent to it. For example, from the set dom(ran(ran(g))), the map will be computed :

  • dom(ran(ran(g)))\;\mapsto\;[\;]
  • ran(ran(g))\;\mapsto\;[0]
  • ran(g)\;\mapsto\;[1,~0]
  • g\;\mapsto\;[1,~1,~0]

Only range, domain and converse can be taken into account to get all the possibles goals.

A pair (expression ; position) is said equal to the goal if and only if there exists a pair equivalent to the goal (GoalExp ; GoalPos) and a pair equivalent to the given pair (Exp ; Pos) such as Pos = GoalPos and Exp is contained in GoalExp.

Hypotheses

As explained in the design decision part, there are two kinds of hypotheses which are recovered. But when hypotheses related to the left member of the goal \left(x\in S\right) are stored, the position of x is also record. Then, if there is an hypothesis such as \left\{\cdots\;,\;y\mapsto x\mapsto z\;,\;m\mapsto x\;,\;\cdots\right\} = A, then this hypothesis is mapped to the positions \left\{\left[0,~1\right],~\left[1\right]\right\}.

Find a path

Let's considered the sequent with the following goal : x\in V. We start with the hypotheses which have a connection with the goal's member. Such a hypothesis gives two informations : the position pos and the set S as explained in hypotheses. Then, for each equivalent pair to these one \left(S', pos'\right), we compute set containing S' ( Find a path 2.). For every new pair, we test if it is contained in the goal.

To be continued.

Untreated cases

Some cases are not treated. Further enhancement may be provided for some.

  • x\in A\cup B,\quad A\cup B\cup C\subseteq D\quad\vdash\quad x\in D
  • x\in f,\quad f\in A\;op\;B\quad\vdash\quad x\in A\cprod B
  • f\ovl \left\{x\mapsto y\right\}\subseteq A\cprod B\quad\vdash\quad x\in A
  • x\in f\otimes g,\quad f\subseteq A\cprod B,\quad g\subseteq C\cprod D\quad\vdash\quad x\in (A\cprod C)\cprod(B\cprod D)
  • x\in f\otimes g,\quad f\subseteq h\quad\vdash\quad x\in h\otimes g
  • x\in \left\{a,~b,~c\right\},\quad\left\{a,~b,~c,~d,~e,~f\right\}\subseteq D\quad\vdash\quad x\in D
  • x\in A\cprod B,\quad A\subseteq C\quad\vdash\quad x\in C\cprod B
  • x\in dom(f)\cap A\quad\vdash\quad x\in dom(A\domres f)
  • x\in ran(f)\cap A\quad\vdash\quad x\in ran(f\ranres A)
  • x\in \quad\vdash\quad
  • \quad\vdash\quad
  • \quad\vdash\quad
  • \quad\vdash\quad
    \bigl( where op_1 and op_2 are ones of :\quad\rel, \trel, \srel, \strel, \pfun, \tfun, \pinj, \tinj, \psur, \tsur, \tbij\bigr)