Library regexp.examples.cident.definition




From regexp Require Import synthesis byte.

The module regexp.synthesis provides all these and some more helper functions. The regexp.byte module exposes the byte types and various boolean functions on it.

Regular expressions as rules.

The regular expressions are represented as elements of type Rule Inp, where the type Inp is the type of the alphabet. For the current example, the input type Inp is byte and the rules for matching an identifier are given below.

Definition rules : Rule byte
  := [rule|
       let start := any [isAlpha; char "_" ] in
       let rest := any [isAlphaNum; char "_"] in
       match (start ++ many rest)%re
       |].

Our library uses the Notation mechanism of to give a readable syntax which we hope is self explanatory. The let bindings define the atomic regular expressions or_atoms, which are then used in the regular expression given in the match clause. In the example above, we define the atoms start and rest, both of which are predicates of type byte -> bool and correspond to the character classes and respectively. The final regular expressions is then defined in terms of these atoms using the concatenation operator ++, and the Kleene star (the function many).
The ability to use arbitrary predicates on the type Inp as atoms rather than individual elements is an important feature of our library and allows us to express character classes like start succinctly.


Synthesizing the automaton

Given the regular expression associated with rules : Rule byte, we synthesize an automaton out of it as shown below.


Definition automaton :=
  Eval lazy -[isAlpha char isDigit isAlphaNum] in
    synthesize rules.

The use of Eval is a technical requirement that is not necessary when exploring the design at the level. In fact, skipping the Eval and deferring it to just before extraction can speed up the interactive design phase where the user is interacting with the system through an editor interface such as Proof General. However, appropriate evaluation is necessary to make the extracted code type check when compiling with . In systems that blur the boundary between types and values, type checking is closely tied to term reduction. The HDL with its type level nats used in vector types is one such system and the purpose of the Eval is to force sufficient reduction at the level so that the resulting extracted code can be seen as well-typed by the compiler.


The synthesized automaton in the above step can be considered as a hardware module that is meant to be embedded in larger hardware designs. To this end, we expose its starting state, transition function, and finality checking function.

Definition state0 := start automaton.

Definition tr := transition automaton.

Definition isFinal :=
  Eval lazy -[isAlpha char isDigit isAlphaNum] in
    isFinal automaton.


To demonstrate such an embedding, we give a simple topEntity in that synthesizes a Moore machine which essentially runs the automaton. The input to the Moore machine is a stream of 8-bit bytes. It emits a stream of booleans that signifies whether the bytes streamed so far is accepted by the automaton.
--
-- Some boilerplate imports suppressed.
--
import qualified CIdent

topEntity :: Clock System
          -> Reset System
          -> Enable System
          -> Signal System (Unsigned 8)
          -> Signal System Bool
topEntity = exposeClockResetEnable
          $ moore CIdent.tr CIdent.isFinal CIdent.state0