Difference between revisions of "D32 Provers"


Revision as of 11:57, 25 November 2010

Overview

Concerning Rodin's provers, the following contributions have been made:

  • Jann Röder (ETH Zurich) has developed a relevance filter plug-in. The plug-in provides a proof tactic that first removes hypotheses from a given sequent according to several heuristics. The tactic then passes the reduced sequent to one or several of Rodin's external provers (PP, newPP, ML). Jann Röder carried out experiments using Event-B models from different domains and observed that his tactic significantly increases the number of proof obligations proved automatically.
  • Matthias Schmalz (ETH Zurich) has worked out the theoretical foundations of Event-B's logic. "Event-B's logic" stands for the formalism in which, e.g., guards, invariants, axioms, theorems, and proof obligations are expressed, and in which proof obligations are proved. He provides a rigorous specification of syntax, semantics, proofs, theories, and mathematical extensions in one document. The document introduces a small theory "Core", proves the soundness of "Core", and shows how to define the remaining operators, types, and binders available in Rodin using mathematical extensions. The document thus provides a proof calculus for Event-B that is sound by construction, and a methodology for reasoning about the soundness of Event-B proof rules within Event-B. The document also allows users to look up the definitions of predefined operators and binders, answering questions like "what is the meaning of <math>x \div y</math> if <math>x</math> or <math>y</math> is negative?". For developers, it sheds some light on intricate questions concerning partial functions, e.g., why it is sound to rewrite <math>x \in \{y \mid \varphi(y)\}</math> to <math>\varphi(x)</math> but unsound (in general) to rewrite <math>\varphi(x)</math> to <math>x \in \{y \mid \varphi(y)\}</math>.

Motivations

Relevance Filtering

Rodin's external provers (PP, newPP, and sometimes also ML) tend to perform poorly in the presence of irrelevant hypotheses. For PP and newPP, users can still manually select the hypotheses they consider relevant, but that is a tedious and error-prone process, in particular for large models. Several heuristics for selecting relevant hypotheses have been proposed in the literature[1][2][3][4]. The relevance filter plug-in implements these and other heuristics, and provides a default configuration that has been shown to be almost optimal on a given collection of models from different domains[5]. The relevance filter plug-in has also significantly increased the number of automatically discharged proof obligations on models of industrial partners, which were not used for fine-tuning the heuristics.
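To make the idea of hypothesis selection concrete, the following is a minimal Python sketch of one simple heuristic: keep only those hypotheses that share a symbol with the goal, either directly or through hypotheses already selected. It is an illustration only and does not reproduce the plug-in's implementation or any of the cited heuristics exactly; the textual formula representation, the function names, and the rounds parameter are assumptions made for this example.

```python
import re


def symbols(formula):
    """Return the identifiers occurring in a formula given as plain text.

    Assumption: formulas are strings and identifiers are alphanumeric names.
    """
    return set(re.findall(r"[A-Za-z_]\w*", formula))


def filter_hypotheses(hypotheses, goal, rounds=2):
    """Select hypotheses connected to the goal by shared symbols.

    In each round, every not yet selected hypothesis that mentions a symbol
    from the current relevant set is selected, and its symbols are then
    added to the relevant set. Hypothetical helper, for illustration only.
    """
    relevant = symbols(goal)
    keep = set()  # indices of selected hypotheses
    for _ in range(rounds):
        newly = [i for i, h in enumerate(hypotheses)
                 if i not in keep and symbols(h) & relevant]
        if not newly:
            break
        keep.update(newly)
        for i in newly:
            relevant |= symbols(hypotheses[i])
    # Return the selection in the original hypothesis order.
    return [h for i, h in enumerate(hypotheses) if i in keep]


if __name__ == "__main__":
    hyps = ["x > 0", "y = x + 1", "z = y", "a = b", "c < d"]
    print(filter_hypotheses(hyps, "y > 1"))
```

In this toy sequent, the hypotheses about a, b, c, and d never become reachable from the goal and are dropped. The rounds parameter controls how far the selection spreads, which is one example of a heuristic parameter that has to be tuned.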

Foundations of Event-B's Logic

As Rodin is used to develop safety-critical systems, bugs in Rodin's theorem prover constitute a serious problem. Unfortunately, several bugs have been discovered that make Rodin's theorem prover unsound. Any examination of soundness presupposes a clearly written specification of the logic's syntax, semantics, and proof calculus. There are several publications on the logic of Event-B, but they fail to serve as specification documents, because either the logic defined therein is inconsistent [6] or only fragments of the logic implemented in Rodin are considered [7] [8]. Therefore we have devised a rigorous specification document for the logic of Event-B [9].

Mathematical extensions play an important role in avoiding unsoundness, because they allow the user to define new operators, binders, types, and inference and rewrite rules in a soundness-preserving fashion. The specification document [9] also develops the theoretical foundations of mathematical extensions. Note that mathematical extensions are well understood for, e.g., HOL[10], but the extension methods for HOL cannot be straightforwardly adopted for Event-B because of Event-B's well-definedness mechanism and non-standard term rewriting.

Choices / Decisions

Relevance Filtering

The relevance filter heuristics we have considered do not work out of the box: their parameters need to be carefully adjusted. The major design decision concerned how to carry out the process of fine-tuning. We started with an ad-hoc benchmark containing models from several problem domains and aimed to maximize the number of automatically discharged proof obligations on this benchmark while minimizing the amount of time spent on proving. We experimented with different filter configurations, i.e., combinations of heuristics, heuristic parameters, provers (PP, newPP, or ML), and prover timeouts. Finally, the parameters and timeouts were chosen such that

  • the number of automatically discharged proof obligations is almost maximal among all considered filter configurations, and
  • decreasing the timeouts would significantly decrease the number of automatically discharged proof obligations.

To rebut criticism of overfitting, we tested the final filter configuration on a validation benchmark, which was chosen independently of the benchmark used for fine-tuning. We observed that the final filter configuration significantly increases the number of automatically discharged proof obligations on the validation benchmark in comparison to not using relevance filtering.
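The selection criteria above can be sketched in Python as follows. This is only a rough reading of the two criteria, not the procedure actually used for the plug-in: the Result record, the tolerance parameter that stands in for "almost maximal", and all numbers in the usage example are assumptions invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    name: str         # label of the filter configuration (hypothetical)
    timeout_s: float  # prover timeout used in this benchmark run
    discharged: int   # proof obligations discharged automatically


def choose_configuration(results, tolerance=0.02):
    """Pick the cheapest configuration that is almost maximal.

    A configuration counts as almost maximal if it discharges at least
    (1 - tolerance) times as many proof obligations as the best one.
    Among those, the one with the smallest timeout wins; ties are broken
    in favour of the higher discharge count.
    """
    best = max(r.discharged for r in results)
    near_optimal = [r for r in results
                    if r.discharged >= (1 - tolerance) * best]
    return min(near_optimal, key=lambda r: (r.timeout_s, -r.discharged))


if __name__ == "__main__":
    # Made-up benchmark outcomes, not measurements from the plug-in.
    runs = [
        Result("a", 1.0, 700),
        Result("b", 5.0, 990),
        Result("c", 30.0, 1000),
        Result("d", 60.0, 1001),
    ]
    print(choose_configuration(runs).name)
```

With these invented numbers, configuration "b" is selected: it is within two percent of the best discharge count, and the next shorter timeout loses a large share of the proof obligations. Checking the chosen configuration on an independently chosen validation benchmark, as described above, remains a separate step.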

Foundations of Event-B's Logic

The major design decision concerned the logic in which the semantics of Event-B's logic is formalized. We experimented with ZF set theory and HOL. In the end, we decided to define the semantics in terms of a (shallow) embedding into HOL, because that allows us to carry out large parts of our soundness proofs using Isabelle/HOL[11]. In the long term, the embedding allows us to use Isabelle/HOL as an external theorem prover for Rodin.

Other design decisions, e.g., concerning terminology, are discussed in [9].

Available Documentation

  • The internals of the relevance filter plug-in and the process of fine-tuning are documented in [5].
  • A rigorous specification of Event-B's logic (for Rodin developers) and a reference document containing the definitions of built-in symbols (for Rodin developers and users) can be found in [9].

Planning

In DEPLOY's fourth year, we intend to provide a link-up between Rodin and Isabelle/HOL. That allows us to implement proof tactics that internally use Isabelle/HOL to discharge the given sequent. Consistency of these tactics depends only on the consistency of Isabelle/HOL and the correctness of the translation from Event-B to Isabelle/HOL, which is quite straightforward. As Isabelle/HOL comes with link-ups to first-order solvers such as E[12], Spass[13], and Vampire[14], and to SMT solvers such as Z3[15], a link-up between Rodin and Isabelle/HOL also makes these solvers available to Rodin.

References

  1. K. Hoder. SUMO inference engine. http://www.cs.manchester.ac.uk/~hoderk/sine
  2. J. Meng and L. C. Paulson. Lightweight relevance filtering for machine-generated resolution problems. Journal of Applied Logic, 7(1):41-57, 2009.
  3. A. Roederer, Y. Puzis, and G. Sutcliffe. Divvy: an ATP meta-system based on axiom relevance ordering. In CADE, pages 157-162, 2009.
  4. G. Sutcliffe and Y. Puzis. SRASS - a semantic relevance axiom selection system. In CADE, pages 295-310, 2007.
  5. J. Röder. Relevance filters for Event-B. Master Thesis, ETH Zurich, 2010. http://n.ethz.ch/~roederja/download/thesis.pdf
  6. J.-R. Abrial. Modeling in Event-B: system and software engineering. Cambridge University Press, 2010. http://www.event-b.org/abook.html
  7. F. D. Mehta. Proofs for the working engineer. PhD Thesis, ETH Zurich, 2008. http://e-collection.ethbib.ethz.ch/eserv/eth:30601/eth-30601-02.pdf
  8. C. Metayer and L. Voisin. The Event-B mathematical language, 2009. http://deploy-eprints.ecs.soton.ac.uk/11/4/kernel_lang.pdf
  9. M. Schmalz. The logic of Event-B. Technical Report 698, ETH Zurich, Switzerland, 2010. ftp://ftp.inf.ethz.ch/pub/publications/tech-reports/6xx/698.pdf
  10. M. J. C. Gordon and T. F. Melham. Introduction to HOL. Cambridge University Press, 1993.
  11. T. Nipkow, L. C. Paulson, and M. Wenzel. Isabelle/HOL - a proof assistant for higher-order logic. LNCS 2283, 2002.
  12. S. Schulz. E - a brainiac theorem prover. AI Commun., 15(2-3):111-126, 2002.
  13. SPASS: an automated theorem prover for first-order logic with equality. http://www.spass-prover.org
  14. A. Riazanov and A. Voronkov. The design and implementation of VAMPIRE. AI Commun., 15(2-3):91-110, 2002.
  15. L. M. de Moura and N. Bjørner. Z3: an efficient SMT solver. In TACAS, pages 337-340, 2008.