D32 Provers

From Event-B
Jump to navigationJump to search

Overview

Concerning Rodin's provers the following contributions have been made:

  • Jann Röder (ETH Zurich) has developed a relevance filter plug-in. The plug-in provides a proof tactic that first removes hypotheses from a given sequent according to several heuristics. The tactic then inputs the reduced sequent to one or several of Rodin's external provers (PP, newPP, ML). Jann Röder carried out experiments using Event-B models from different domains and observed that his tactic significantly increases the number of proof obligations proved automatically.
  • Matthias Schmalz (ETH Zurich) formally expressed the theoretical foundations of Event-B's logic. "Event-B's logic" stands for the formalism in which, e.g., guards, invariants, axioms, and theorems are expressed, and proof obligations are expressed and proved. He provides a rigorous specification of syntax, semantics, proofs, theories, and mathematical extensions [1] in one document. The document encompasses a small theory "Core", proves "Core"'s soundness, and shows how to define the remaining operators, types, and binders available in Rodin using mathematical extensions. The document thus provides a proof calculus for Event-B that is sound by construction, and a methodology for reasoning about the soundness of Event-B proof rules within Event-B. The document also allows users to look-up definitions of predefined operators and binders, answering questions like "what is the meaning of x \div y if x or  y  is negative". For developers, it sheds some light on intricate questions concerning partial functions, e.g., why it is sound to rewrite x \in \{y \mid \varphi(y)\} to  \varphi(x) but unsound (in general) to rewrite  \varphi(x) to x \in \{y \mid \varphi(y)\}.
  • Systerel improved the support for well-definedness in the sequent prover. The new implementation generates much smaller well-definedness predicates and new automated tactics have been added to discharge most of the well-definedness subgoals, thus rendering well-definedness almost transparent to the end-user.

Motivations

Relevance Filtering

Rodin's external provers (PP, newPP, and sometimes also ML) tend to perform poorly in the presence of irrelevant hypotheses. For PP and newPP the user can still manually select the hypotheses he considers relevant, but that is a tedious and error-prone process, in particular for large models. Several heuristics for selecting relevant hypotheses have been proposed in the literature[2][3][4][5]. The relevance filter plug-in implements these and other heuristics, and provides a default configuration that has been shown to be almost optimal on a given collection of models from different domains[6]. The relevance filter plug-in has also significantly increased the number of automatically discharged proof obligations on models of deployment partners, which had not been used for fine tuning the heuristics.

Foundations of Event-B's Logic

As Rodin is used to develop safety critical systems, bugs in Rodin's theorem prover constitute a serious problem. Unfortunately, several bugs have been discovered that make Rodin's theorem prover unsound. Obviously, any examination of soundness presupposes a clearly written specification of the logic's syntax, semantics, and proof calculus. There are several publications on the logic of Event-B, but they fail to serve as specification documents, because the logic defined therein is inconsistent [7] or only fragments of the logic implemented in Rodin are considered [8] [9]. Therefore we have devised a rigorous specification document for the logic of Event-B [10].

Mathematical extensions[1] play an important role in avoiding unsoundness, because they allow the user to define new operators, binders, types, and inference and rewrite rules in a soundness preserving fashion. The specification document [10] also devises the theoretical foundations of mathematical extensions. Note that mathematical extensions are well-understood for, e.g., HOL[11], but the extension methods for HOL cannot be straightforwardly adopted for Event-B because of Event-B's well-definedness [12] mechanism and non-standard term rewriting.

Improved WD Support

When the user enters an expression or predicate that is possibly ill-defined (such as applying a partial function), the Rodin platform insists that the user demonstrates that this formula is indeed well-defined (e.g., the partial function is applied to an element of its domain) before using it. This verification is implemented by generating a well-definedness (WD) predicate, based on the syntax of the input formula.

In previous releases of the Rodin platform, the generation of WD predicates was implemented in a very simple manner, and the generated predicate was usually highly redundant. Moreover, the support by automated tactics for discharging such predicates was not always appropriate. Consequently, many well-definedness subproofs needed to be carried out interactively in a very cumbersome and repetitive manner.

Choices / Decisions

Relevance Filtering

The relevance filter heuristics we have considered do not work out of the box - their parameters need to be carefully adjusted. The major design decision concerned how to carry out the process of fine tuning. We started with an ad-hoc benchmark containing models of several problem domains and aimed for maximizing the number of automatically discharged proof obligations among this benchmark while minimizing the amount of time spent for proving. We experimented with different filter configurations, i.e., combinations of heuristics, heuristic parameters, provers (PP, newPP, or ML) and prover timeouts. Finally, the parameters and timeouts were chosen such that

  • the number of automatically discharged proof obligations is almost maximal among all considered filter configurations, and
  • decreasing the timeouts would significantly decrease the number of automatically discharged proof obligations.

To rebut criticism of overfitting, we tested the final filter configuration on a validation benchmark (based on deployment partners' models), which was chosen independently from the benchmark used for fine-tuning. We observed that the final filter configuration significantly increases the number of automatically discharged proof obligations among the validation benchmark in comparison to not using relevance filtering.

Foundations of Event-B's Logic

The major design decision concerned the logic in which the semantics of Event-B's logic is formalized. We experimented with ZF set theory and HOL. Finally, we decided to define semantics in terms of a (shallow) embedding into HOL, because that allows us to carry out vast parts of our soundness proofs using Isabelle/HOL[13]. In the long term, the embedding allows us to use Isabelle/HOL as an external theorem prover for Rodin.

Other design decisions, e.g., concerning terminology, are discussed in [10].

Improved WD Support

To improve generated WD lemmas, the generating algorithm has not been changed (to ensure safety) but enhanced by a back-end that simplifies the generated lemma after the fact. The enhancement consists in removing all sub-predicates that are subsumed within the WD lemma.

Also, as WD lemmas were changing between two releases of the platform, the automated proof replay mechanism needed to better tackle changes in proof obligations (when they get simpler). This allows user to retain their proof status, although proof obligations have changed.

As concerns automated support, it has been chosen not to add new reasoners (to avoid expanding the trusted base of the sequent prover) but rather to work on the outside by adding new tactics that schedule the existing reasoners to discharge the WD subgoals. This approach also allowed to start introducing speculative reasoning within tactics (attempt proofs).

Available Documentation

  • The internals of the relevance filter plug-in and the process of fine tuning are documented in J. Röder's Master thesis.[6]
  • A rigorous specification of Event-B's logic (for Rodin developers) and a reference document containing the definitions of built-in symbols (for Rodin developers and users) can be found in "The logic of Event-B" report.[10]
  • The specification of the Improved WD Lemma Generation [14] is available from the Rodin Wiki.

Planning

In DEPLOY's fourth year, we intend to provide a link-up between Rodin and Isabelle/HOL. That allows us to implement proof tactics that internally use Isabelle/HOL to discharge the given sequent. Consistency of these tactics depends merely on the consistency of Isabelle/HOL and correctness of the translation from Event-B to Isabelle/HOL, which is quite straightforward. As Isabelle/HOL comes with link-ups to first-order solvers such as E[15], Spass[16], and Vampire[17] and SMT solvers such as Z3[18], a link-up between Rodin and Isabelle/HOL makes these solvers also available to Rodin.

References

  1. 1.0 1.1 http://wiki.event-b.org/index.php/D32_Mathematical_Extensions Cite error: Invalid <ref> tag; name "mathext" defined multiple times with different content
  2. K. Hoder. SUMO infernce engine.
  3. J. Meng and L. C. Paulson. Lightweight relevance filtering for machine-generated resolution problems. Journal of Applied Logic, 7(1);41-57, 2009.
  4. A. Roederer, Y. Puzis, and G. Sutcliffe. Divvy: an atp meta-system based on axiom relevance ordering. In CADE, pages 157-162, 2009.
  5. G. Sutcliffe and Y. Puzis. SRASS - a semantic relevance axiom selection system. In CADE, pages 295-310, 2007.
  6. 6.0 6.1 J. Röder. Relevance filters for Event-B. Master Thesis, ETH Zurich, 2010.
  7. J.-R. Abrial. Modeling in Event-B: system and software engineering. Cambridge University Press, 2010
  8. F. D. Mehta. Proofs for the working engineer. PhD Thesis, ETH Zurich, 2008.
  9. C. Metayer and L. Voisin. The Event-B mathematical language, 2009.
  10. 10.0 10.1 10.2 10.3 M. Schmalz. The logic of Event-B. Technical Report 698, ETH Zurich, Switzerland, 2010.
  11. M. J. C. Gordon and T. F. Melham. Introduction to HOL. Cambridge University Press, 1993.
  12. http://wiki.event-b.org/index.php/Well-definedness_in_Event-B
  13. T. Nipkow, L. C. Paulson, and M. Wenzel. Isabelle/HOL - a proof assistant for higher-order logic. LNCS 2283, 2002.
  14. http://wiki.event-b.org/index.php/Improved_WD_Lemma_Generation
  15. S. Schulz. E - a brainiac theorem prover. AI Commun. 15(2-3);11-126, 2002.
  16. SPASS: an automated theorem prover for first-order logic with equality.
  17. A. Riazanov and A. Voronkov. The design and implementation of VAMPIRE. AI Commun. 15(2-3);91-110, 2002.
  18. L. M. de Moura and N. Bjorner. Z3: an efficient SMT solver. TACAS, pages 337-340, 2008.