regexp.abstract.automata.build: Regexp to NFA construction

Overview

re2nfa: Regular expression to NFA conversion.
correctness:Proof of correctness of re2nfa.

The re2nfa builds the automata recursively with one recursive step for each constructor. The proof is carried out by induction on the structure of the regular expression with one inductive step for each constructor.

The recursive steps themselves are nontrivial and hence we have packaged them in a module of their own (one for each constructor of the regular expression).

Require regexp.abstract.automata.alt.
Require regexp.abstract.automata.epsilon.
Require regexp.abstract.automata.concat.
Require regexp.abstract.automata.oneOrMore.
Require regexp.abstract.automata.singleton.

Each of the above modules expose two functions

build: Which gives the appropriate construction
correctness: The correctness of the above construction.

Section Cxt.
  Context {B : finType}.
  Fixpoint State (re : expr.t B) : finType :=
    match re with
    | expr.eps ⇒ epsilon.State
    | expr.single _ ⇒ singleton.State
    | expr.alt re1 re2 ⇒ @alt.State (State re1) (State re2)
    | expr.concat re1 re2 ⇒ @concat.State (State re1) (State re2)
    | expr.many1 rep ⇒ State rep
    end.

The function re2nfa recurses on the structure of re and uses the appropriate build for each constructor

  Fixpoint re2nfa (re : expr.t B) : NFA B (State re) :=
    match re with
    | expr.eps ⇒ epsilon.build
    | expr.single x ⇒ singleton.build x
    | expr.alt re1 re2 ⇒ alt.build (re2nfa re1) (re2nfa re2)
    | expr.concat re1 re2 ⇒ concat.build (re2nfa re1) (re2nfa re2)
    | expr.many1 rep ⇒ oneOrMore.build (re2nfa rep)
    end.

Each inductive step corresponds to a constructor and the associated correctness theorem together with any inductive hypothesis is used to rewrite and obtain the r

  Theorem correctness (re : expr.t B) : accept (re2nfa re) ≡ expr.language re.
  Proof using Type.
    elim: re ⇒ [|b|re1 IH1 re2 IH2| re1 IH1 re2 IH2| re IH]; rewrite /expr.language -/expr.language .
    - by rewrite epsilon.correctness.
    - by rewrite singleton.correctness.
    - by rewrite alt.correctness IH1 IH2.
    - by rewrite concat.correctness IH1 IH2.
    - by rewrite oneOrMore.correctness IH.
  Qed.

End Cxt.