ADVANCE D3.3 Improvement of automated proof
In a regular Event-B modelling activity, more than 60% of the time is spent on proofs. Therefore, increasing the rate of automated proofs is a productivity booster which decreases the overall cost of formal modelling. Consequently, enhancing the automated prover has been a continuous task since the inception of the Rodin platform.
This task can be achieved by refactoring the core platform, by adding new features to it, such as new integrated reasoners and tactics, but also by connecting some external reasoning ability such as external provers (e.g., SMT solvers).
During the second period of the ADVANCE project, all these three kinds of activities have been performed.
Motivations / Decisions
Internal code refactoring
The membership goal reasoner has been part of the integrated sequent prover since Rodin 2.3. This tool provides specialized reasoning capability about pure set membership and chains of set inclusion. For instance, it can be used to discharge sequents like .
Internally, the reasoner used a specialized algorithm to discover chains of inclusions and all hypotheses matching . Trying to extend the reasoner to other cases of inclusion such as , we found out that the reasoner was in fact solving a SAT problem where all atoms are of the form . So, we stepped back and replaced the specialized machinery by a call to the SAT4J solver which is already part of Eclipse (it is used to solve dependencies when installing a new plug-in). This refactoring, although not increasing yet the rate of automated proofs, opens the way for further improvements of this reasoner, and this without any penalty, as the new version of the reasoner takes the same amount of time to run (as shown by benchmarks).
During this period, we have implemented new normalization rules about empty sets and their dual : complete sets (that is sets that fill entirely their type). These new rules are now part of the normalizer which is run at each step of the interactive prover. They reduce the diversity of predicates and increase the automated proofs of trivial proof obligations often encountered in degenerated cases (that is not interesting to the modeller, but that need to be carried out nevertheless).
In a similar vein, we have also improved the HYP, HYP_OR, CNTR and GEN_MP reasoners. All these reasoners have in common to search for a hypothesis matching a certain pattern. Until Rodin 3.0, only exact matches were sought for. Now, the reasoners also look for variations of a predicate, that is either equivalent predicates or predicates that provide a sufficient condition. Benchmarking this improvement, we measured that we had an overall 4 % increase in automated proofs compared to the previous version. As for the membership goal reasoner, this improvement comes at no cost as the new versions of these reasoners does not take more time to execute.
The proportion of automatically discharged proof obligations heavily depends on the auto-tactic configuration. Sometimes, the automatic prover fails because the tactics are applied in an unappropriate order. Since Rodin 3.0, plug-ins can contribute their own default profiles (through extension points, for convenience) which allows users to pick up their auto-tactic from a portfolio, rather than having to define a new one by themselves. This therefore improves the usability of the auto-prover and allows to capitalize easily on best practices.
At the beginning of the ADVANCE project, an initial version of the SMT solvers plug-in was available. However, this early version was still very fragile and most of the times was producing proofs that could not be saved, rendering it useless. During the first and second period of the project, as a background task, this plug-in has been strengthened and brought to industrial grade. Version 1.0 of the plug-in has been released in June 2013.
ProB as a Disprover / Prover We reactivated a plug-in for Rodin, that uses the ProB animator and model-checker to generate counterexamples for proof obligations. Searching for a counterexample can be done effortlessly and without user interaction. If successful, the futility of a manual proof attempt is detected early. Furthermore, the counterexample itself provides an insight into the problem and aids in debugging the model.
Additionally, we investigated the use of ProB as a prover. By observing the number and quality of occurring enumerations, ProB is able to tell if the search for a counterexample was done exhaustively. If this is the case, the proof obligation in question is discharged. An initial empirical evaluation was performed. For certain proof obligations containing variables with finite data types, ProB is able to find proofs that are not found by the integrated provers or the SMT solvers.
Other external provers We initially planned to connect the Super Zenon and the iProver external provers to the Rodin platform. The development of iProver seems to have gone closed source : there is no development anymore on the Google project, although a new version 1.0 has been presented to the latest CASC-24 competition. Consequently, we could not make any progress with this prover. Super Zenon sources have not yet been made publicly available, but it is planned to be released under a GPL license by the end of 2013. We will then work on its integration into Rodin.
Tactic profiles are described in the Rodin Handbook.
The user manual of the SMT Solvers plug-in is available on the Event-B wiki.
The introduction of variations in the HYP, HYP_OR, CNTR and GenMP reasoners is described in the Event-B wiki .
The design of the Membership Goal reasoner has been updated to reflect the use of a SAT solver.
During the last period of the ADVANCE project, we plan to continue to improve the rate of automated proofs based on feedback from the case studies.
As concerns the SMT solver integration plug-in, we will benchmark it on the case study models and investigate the integration of new solvers such as CVC4.
Finally, as concerns external provers, we plan to work on Super Zenon integration once it is made open source. We will also check regularly for public releases of new provers that could be integrated.
We will evaluate the ProB Disprover in particular with respect to correctness. We will also prepare a set of automatic tests to ensure that we don't regress.