Difference between pages "Indexing System" and "The Use of Theories in Code Generation"

From Event-B
(Difference between pages)
Jump to navigationJump to search
imported>Nicolas
m
 
imported>Andy
 
Line 1: Line 1:
This document aims at helping developers getting into the code of the '''index''' plug-in.
+
= Defining Translations Using The Theory Plug-in =
Please refer to [[Rodin Index Design]] for more general considerations about the purpose and design.
+
The theory plug-in is used to add mathematical extensions to Rodin. The theories are created, and deployed, and can then be used in any models in the workspace. When dealing with implementation level models, such as in Tasking Event-B, we need to consider how to translate newly added types and operators into code. We have augmented the theory interface with a Translation Rules section. This enables a user to define translation rules that map Event-B formulas to code.
 +
== Translation Rules==
 +
<div id="fig:Translation Rules">
 +
<br/>
 +
[[Image:TheoryCGRules.png|center||caption text]]
 +
<center>'''Figure 1''': Translation Rules</center>
 +
<br/>
 +
</div>
  
==General structure of the package==
+
Figure 1 shows the interface, and some translations rules of the mapping to Ada.
  
The indexer plug-in is made of the following packages:
+
The theory is given a name, and may import some other theories. Type parameters can be added, and we use them here to type the meta-variables. The meta-variable ''a'' is restricted to be an integer type, but meta-variable ''c'' can be any type. Meta-variables are used in the translator rules for pattern matching.
  
* '''org.rodinp.core.index''': public interfaces and Plugin class
+
Translator rules are templates, which are used in pattern matching. Event-B formulas are defined on the left hand side of the rule, and the code to be output (as text) appears on the right hand side of the matching rule. During translation an abstract syntax tree (AST) representation of the formula is used. The theory plug-in attempts to match the formulas in the rules with each syntactic element of the AST. As it does so it builds the textual output as a string, until the whole AST has been successfully matched. When a complete tree is matched, the target code is returned. If the AST is not matched, a warning is issued, and a string representation of the original formula is returned.
* '''org.rodinp.internal.core.index''': implementation of public interfaces and main classes (including '''IndexManager''')
 
* '''org.rodinp.internal.core.index.tables''': index tables (memory storage)
 
* '''org.rodinp.internal.core.index.sort''': generical sorting support (used to manage file dependencies)
 
* '''org.rodinp.internal.core.index.persistence''': index saving main classes (disk storage)
 
* '''org.rodinp.internal.core.index.persistence.xml''': index saving in xml (Strategy pattern)
 
  
 +
== Type Rules for Code Generation ==
  
==The Index Manager==
+
The type rules section, shown in Figure 1, is where the relationship is defined, between Event-B types and the type system of the implementation.
  
'''IndexManager''' is the master class of the plug-in. There resides the ''start()'' method called in a background job at plug-in startup. Firstly, it restores saved index data from previous session (see [[#Persistence]]). Then it loops, waiting for database deltas to be enqueued and launching the indexing of changed files via a user-cancellable job (see [[#Concurrency]]).  
+
= Adding New (implementation-level) Types =
 +
When we are working at abstraction levels close to the implementation level, we may make an implementation decision which requires the introduction of a new type to the development. We give an example of our approach, where we add a new array type, shown in Figure 2, and then define its translation to code.
  
Indexing is done in two steps, project per project:
+
== An Array Type Definition ==
* updating file [[#Dependencies|dependencies]] to compute indexing order
+
<div id="fig:Extension with an Array Type">
* indexing files in computed order to update [[#Index Tables|index tables]]
+
<br/>
 +
[[Image:ArrayDef.png|center||caption text]]
 +
<center>'''Figure 2''': Array Definition</center>
 +
<br/>
 +
</div>
  
Finally, when the session closes, index data is saved (see [[#Persistence]]) and the background job is cancelled.
+
The array operator notation is defined in the expression array(s: P(T)); and the semantics is defined in the direct definition. arrayN constrains the arrays to be of fixed length. Array lookup, update, and constructor operators are subsequently defined. In the next step we need to define any translations required to implement the array in code.
  
The IndexManager is also the entry point for index requesting.
+
== Translation Rules ==
  
 +
<div id="Translation Rules for the Array Type">
 +
<br/>
 +
[[Image:ArrayTrans.png|center||caption text]]
 +
<center>'''Figure 3''': Translation Rules for the Array Type</center>
 +
<br/>
 +
</div>
  
===Project Index Manager===
+
Figure 3 shows the Ada translation; beginning with the meta-variable definitions that are used for pattern matching in the translation rules. Each of the operators; ''newArray'', and ''update'', and an expression using the ''lookup'' operator, are mapped to their implementations on the right hand side of the rule. The ''Type Rules'' section describes the implementation's description of the ''arrayN'' type.
 
 
A '''ProjectIndexManager''' (PIM) is responsible for maintaining [[#Index Tables|index tables]] of one project. The IndexManager has a '''PerProjectPIM''', that is simply an association between projects and their respective PIMs.
 
 
 
A PIM is called by the IndexManager at two moments:
 
* when a file has just changed (''fileChanged()''), to update file [[#Dependencies|dependencies]]
 
* when the indexing job is scheduled (''doIndexing()''), to index files and update its [[#Index Tables|index tables]]
 
 
 
It delegates indexer calling to the [[#File Indexing Manager|file indexing manager]].
 
 
 
===Indexer Registry===
 
 
 
The '''IndexerRegistry''' (Singleton) records registered indexers. It maps file types on lists of indexers, so that when a given file has to be indexed, it can be requested to give the corresponding registered indexers, in the order they have to be called.
 
 
 
===File Indexing Manager===
 
 
 
Actual indexer calls are achieved by the '''FileIndexingManager''' (FIM), which implements Singleton pattern. Being given a file, it gets appropriate indexers from the [[#Indexer Registry|indexer registry]], asks them either to get [[#Dependencies|dependencies]] of the file or to index it, then returns the result to the calling [[#Project Index Manager|PIM]].
 
 
 
==Index Tables==
 
 
 
Those tables store indexing result data.
 
 
 
===Rodin Index===
 
 
 
'''RodinIndex''' is the main index table. It stores the entities together with their occurrences. More precisely, it maps each indexed IInternalElement to a '''Descriptor'''.
 
 
 
A Descriptor is a '''Declaration''' (handle + name) of an element plus a set of '''Occurrence'''s of this element.
 
 
 
An Occurrence has an '''OccurrenceKind''' (defined by indexer plug-in implementers) and a '''RodinLocation''' (a location of the element within another in some file).
 
 
 
===Export Table===
 
 
 
The '''ExportTable''' records declarations of elements exported by each indexed file.
 
 
 
===File Table===
 
 
 
The '''FileTable''' gives, for each file, the elements indexed in it. It contains no more information than the RodinIndex (can be constructed entirely from it), but is intended to give faster access to indexed elements of a given file, rather than iterating through the whole index.
 
 
 
===Name Table===
 
 
 
The '''NameTable''' follows the same optimization idea: for each indexed user-defined name, it gives a set of elements declared with this name.
 
 
 
==Dependencies==
 
 
 
File dependencies are treated generically in the '''org.rodinp.internal.core.index.sort''' package. The final goal is to maintain an iterable [[#Total Order|total order]] on the files of each project. PIMs will then follow this order when indexing their project, using type ''IRodinFile'' for generic parameter T.
 
 
 
===Node===
 
 
 
The '''Node''' is to be understood as a node in the dependence graph. It has a label of type T and lists of predecessors and successors.
 
 
 
Telling that ''N1'' is a predecessor of ''N2'' means that ''N2'' depends on ''N1'', so ''N1'' must be indexed before ''N2''.
 
 
 
Nodes also have an order position (''orderPos'' field) that is the position of this node in the total order. Its value can only be known after nodes have been [[#Sorted Nodes|sorted]].
 
 
 
Not all nodes need to be processed each time. This is the reason why nodes have a ''mark'': when iterating over nodes, marked ones are processed (indexed) whereas others are simply ignored.
 
 
 
===Total Order===
 
 
 
'''TotalOrder''' is the main class. It implements ''Iterator'', providing ''next()'', ''hasNext()'' and ''remove()'' methods, along with other specific facilities.
 
 
 
It delegates graph maintenance to '''Graph''' and node sorting and iterating to '''SortedNodes'''.
 
 
 
When the graph changes, nodes are sorted again, but only if iterating is requested. This avoids sorting at each change.
 
 
 
Graph change events are received through a listener. This might seem strange at first: as the TotalOrder class calls for graph modification itself, it should know when it changes. The problem is that, for instance, predecessors can be set to exactly the same list of nodes as previously. In this case, the graph remains unchanged but TotalOrder cannot guess it and will unnecessarily sort the whole graph again (well, actually it could guess, but we prefer delegating change check to '''Graph'''). Same problem when creating a node that already exists.
 
 
 
===Graph===
 
 
 
The '''Graph''' stores nodes, applies changes to their links and maintains their coherence: if N1 is a predecessor of N2 then N2 is a successor of N1.
 
 
 
As explained above, it fires change events when a method call actually modifies the graph.
 
 
 
===Sorted Nodes===
 
 
 
'''SortedNodes''' maintains a list of iterable sorted nodes. It delegates sorting to a '''Sorter'''.
 
 
 
An important feature is that, if the graph is modified during iteration, it restarts from the first node that differs between previous order and new order. To that end, iterated nodes are recorded.  
 
 
 
==Indexing Toolkit==
 
 
 
The '''IndexingToolkit''' acts as a kind of a go-between during the indexing of a single file. It is built by the PIM, then sent to indexers through the FIM. Indexers get the file to index from it, then they fill it with declarations and occurrences. From these, the IndexingToolkit  builds an '''IndexingResult''' that is sent back to the PIM (through the FIM again).
 
 
 
Thus, IndexingToolkit is the visible part of the indexing system, a front-end, from indexers point of view.
 
 
 
This role gives it the responsibility to enforce constraints like:
 
* an element occurrence cannot have an occurrence if it has not been declared before
 
* an element cannot be declared more than once
 
* declared elements must be local to the file
 
* occurring elements must be either local or imported
 
 
 
 
 
==Persistence==
 
 
 
Index data persistence is implemented following the 'Strategy' pattern. Current implemention provides an xml strategy, however any other means could be used instead.
 
 
 
Saved and restored data consist in current deltas plus, for each PIM:
 
* index table
 
* export table
 
* file sort order
 
* iterating state
 
 
 
The '''PersistenceManager''' controls restoring and saving, being called by the IndexManager at startup and listening to session and project closure as a '''ISaveParticipant'''.
 
 
 
Actual xml persistence is achieved by the '''XMLPersistor''' that implements the required '''IPersistor''' interface. Then each type of persistent object (index table, ...) is decomposed into specific "persistors" that provide both ''save()'' and ''restore()'' methods for their assigned object.
 
 
 
Persistent object are passed to persistors as '''PersistentIndexManager''', '''PersistentSortedNodes''', '''PersistentTotalOrder'''. These objects contain the state and/or data of IndexManager, SortedNodes, TotalOrder, that they have to save and from which they can be restored.
 
 
 
==Concurrency==
 
 
 
===Delta Queue===
 
 
 
The '''DeltaQueue''', used by the IndexManager, is the place where deltas sent by the database are temporarily stored, waiting to be processed. '''DeltaQueuer''' is the listener. Actually, deltas are not enqueued "as is", but as '''IndexDelta''', that is a IRodinElement (file or project) plus a '''Kind''' (file changed, project closed …).
 
 
 
To properly handle index requests, we have to be in a stable state, waiting until the queue is empty. This is achieved by a '''CountUpDownLatch''' that counts the number of elements in the queue. It is incremented when an element is added and decremented when an element has been removed '''and processed'''.
 
 
 
 
 
===Jobs and Locks===
 
 
 
 
 
The IndexManager starts as a background job with low priority, then launches a user-cancelable indexing job at each delta processing.
 
 
 
During indexing, which potentially involves several projects, resources must be left unchanged. Therefore, we use the RodinDB scheduling rule to lock it entirely, forcing the indexing job to wait until it is released.
 
 
 
Restoring and saving index data (see [[#Persistence]]) may lead to concurrency issues if deltas or requests were to be processed at the same time. Protection is achieved by locks in IndexManager (restore and requests), in PIMs (read/write access to index tables) and in PersistenceManager (project save).
 
Full save is left unlocked, as no deltas or requests can occur during session closure.
 

Revision as of 15:51, 15 May 2012

Defining Translations Using The Theory Plug-in

The theory plug-in is used to add mathematical extensions to Rodin. The theories are created, and deployed, and can then be used in any models in the workspace. When dealing with implementation level models, such as in Tasking Event-B, we need to consider how to translate newly added types and operators into code. We have augmented the theory interface with a Translation Rules section. This enables a user to define translation rules that map Event-B formulas to code.

Translation Rules


caption text
Figure 1: Translation Rules


Figure 1 shows the interface, and some translations rules of the mapping to Ada.

The theory is given a name, and may import some other theories. Type parameters can be added, and we use them here to type the meta-variables. The meta-variable a is restricted to be an integer type, but meta-variable c can be any type. Meta-variables are used in the translator rules for pattern matching.

Translator rules are templates, which are used in pattern matching. Event-B formulas are defined on the left hand side of the rule, and the code to be output (as text) appears on the right hand side of the matching rule. During translation an abstract syntax tree (AST) representation of the formula is used. The theory plug-in attempts to match the formulas in the rules with each syntactic element of the AST. As it does so it builds the textual output as a string, until the whole AST has been successfully matched. When a complete tree is matched, the target code is returned. If the AST is not matched, a warning is issued, and a string representation of the original formula is returned.

Type Rules for Code Generation

The type rules section, shown in Figure 1, is where the relationship is defined, between Event-B types and the type system of the implementation.

Adding New (implementation-level) Types

When we are working at abstraction levels close to the implementation level, we may make an implementation decision which requires the introduction of a new type to the development. We give an example of our approach, where we add a new array type, shown in Figure 2, and then define its translation to code.

An Array Type Definition


caption text
Figure 2: Array Definition


The array operator notation is defined in the expression array(s: P(T)); and the semantics is defined in the direct definition. arrayN constrains the arrays to be of fixed length. Array lookup, update, and constructor operators are subsequently defined. In the next step we need to define any translations required to implement the array in code.

Translation Rules


caption text
Figure 3: Translation Rules for the Array Type


Figure 3 shows the Ada translation; beginning with the meta-variable definitions that are used for pattern matching in the translation rules. Each of the operators; newArray, and update, and an expression using the lookup operator, are mapped to their implementations on the right hand side of the rule. The Type Rules section describes the implementation's description of the arrayN type.