faop(3)



grammar::fa::op(3tcl)Finite automaton operations and usaggrammar::fa::op(3tcl)

______________________________________________________________________________

NAME
       grammar::fa::op - Operations on finite automatons

SYNOPSIS
       package require Tcl  8.4

       package require snit

       package require struct::list

       package require struct::set

       package require grammar::fa::op  ?0.4.1?

       ::grammar::fa::op::constructor cmd

       ::grammar::fa::op::reverse fa

       ::grammar::fa::op::complete fa ?sink?

       ::grammar::fa::op::remove_eps fa

       ::grammar::fa::op::trim fa ?what?

       ::grammar::fa::op::determinize fa ?mapvar?

       ::grammar::fa::op::minimize fa ?mapvar?

       ::grammar::fa::op::complement fa

       ::grammar::fa::op::kleene fa

       ::grammar::fa::op::optional fa

       ::grammar::fa::op::union fa fb ?mapvar?

       ::grammar::fa::op::intersect fa fb ?mapvar?

       ::grammar::fa::op::difference fa fb ?mapvar?

       ::grammar::fa::op::concatenate fa fb ?mapvar?

       ::grammar::fa::op::fromRegex fa regex ?over?

       ::grammar::fa::op::toRegexp fa

       ::grammar::fa::op::toRegexp2 fa

       ::grammar::fa::op::toTclRegexp regexp symdict

       ::grammar::fa::op::simplifyRegexp regexp

______________________________________________________________________________

DESCRIPTION
       This  package provides a number of complex operations on finite automa-
       tons (Short: FA), as provided by the package grammar::fa.  The  package
       does  not provide the ability to create and/or manipulate such FAs, nor
       the ability to execute a FA for a stream of symbols.  Use the  packages
       grammar::fa and grammar::fa::interpreter for that.  Another package re-
       lated to this is grammar::fa::compiler which turns a FA into an  execu-
       tor class which has the definition of the FA hardwired into it.

       For  more  information about what a finite automaton is see section FI-
       NITE AUTOMATONS in package grammar::fa.

API
       The package exports the API described here.  All commands modify  their
       first  argument.  I.e. whatever FA they compute is stored back into it.
       Some of the operations will construct an automaton whose states are all
       new, but related to the states in the source automaton(s). These opera-
       tions take variable names as optional arguments where they  will  store
       mappings  which  describe  the  relationship(s).  The operations can be
       loosely partitioned into structural and language operations. The latter
       are  defined  in terms of the language the automaton(s) accept, whereas
       the former are defined in terms of the structural properties of the in-
       volved automaton(s). Some operations are both.  Structure operations

       ::grammar::fa::op::constructor cmd
              This  command has to be called by the user of the package before
              any other operations is performed, to establish a command  which
              can  be  used to construct a FA container object. If this is not
              done several operations will fail as they  are  unable  to  con-
              struct  internal  and  transient containers to hold state and/or
              partial results.

              Any container class using this package  for  complex  operations
              should set its own class command as the constructor. See package
              grammar::fa for an example.

       ::grammar::fa::op::reverse fa
              Reverses the fa. This is done by reversing the direction of  all
              transitions and swapping the sets of start and final states. The
              language of fa changes unpredictably.

       ::grammar::fa::op::complete fa ?sink?
              Completes the fa complete, but nothing is done if the fa is  al-
              ready  complete. This implies that only the first in a series of
              multiple consecutive complete operations on fa will perform any-
              thing. The remainder will be null operations.

              The language of fa is unchanged by this operation.

              This is done by adding a single new state, the sink, and transi-
              tions from all other states to that sink for  all  symbols  they
              have  no  transitions  for.  The sink itself is made complete by
              adding loop transitions for all symbols.

              Note: When a FA has epsilon-transitions transitions over a  sym-
              bol for a state S can be indirect, i.e. not attached directly to
              S, but to a state in the epsilon-closure of S. The  symbols  for
              such indirect transitions count when computing completeness of a
              state. In other words, these indirectly reached symbols are  not
              missing.

              The  argument  sink provides the name for the new state and most
              not be present in the fa if specified. If the name is not speci-
              fied  the command will name the state "sinkn", where n is set so
              that there are no collisions with existing states.

              Note that the sink state is not useful by definition.  In  other
              words,  while  the FA becomes complete, it is also not useful in
              the strict sense as it has a state from which no final state can
              be reached.

       ::grammar::fa::op::remove_eps fa
              Removes all epsilon-transitions from the fa in such a manner the
              the language of fa is unchanged. However nothing is done if  the
              fa is already epsilon-free.  This implies that only the first in
              a series of multiple consecutive complete operations on fa  will
              perform anything. The remainder will be null operations.

              Note:  This  operation may cause states to become unreachable or
              not useful. These states are not removed by this operation.  Use
              ::grammar::fa::op::trim for that instead.

       ::grammar::fa::op::trim fa ?what?
              Removes unwanted baggage from fa.  The legal values for what are
              listed below. The command defaults to !reachable|!useful  if  no
              specific argument was given.

              !reachable
                     Removes  all  states which are not reachable from a start
                     state.

              !useful
                     Removes all states which are  unable  to  reach  a  final
                     state.

              !reachable&!useful

              !(reachable|useful)
                     Removes  all  states which are not reachable from a start
                     state and are unable to reach a final state.

              !reachable|!useful

              !(reachable&useful)
                     Removes all states which are not reachable from  a  start
                     state or are unable to reach a final state.

       ::grammar::fa::op::determinize fa ?mapvar?
              Makes  the  fa  deterministic  without changing the language ac-
              cepted by the fa. However nothing is done if the fa  is  already
              deterministic.  This  implies that only the first in a series of
              multiple consecutive complete operations on fa will perform any-
              thing. The remainder will be null operations.

              The  command will store a dictionary describing the relationship
              between the new states of the resulting dfa and  the  states  of
              the  input  nfa in mapvar, if it has been specified. Keys of the
              dictionary are the handles for the states of the resulting  dfa,
              values are sets of states from the input nfa.

              Note:  An  empty dictionary signals that the command was able to
              make the fa deterministic without performing a full subset  con-
              struction,  just  by  removing  states and shuffling transitions
              around (As part of making the FA epsilon-free).

              Note: The algorithm fails to make the FA  deterministic  in  the
              technical  sense if the FA has no start state(s), because deter-
              minism requires the FA to have exactly  one  start  states.   In
              that  situation  we  make  a  best effort; and the missing start
              state will be the only condition preventing the generated result
              from  being deterministic.  It should also be noted that in this
              case the possibilities for trimming states from the FA are  also
              severely reduced as we cannot declare states unreachable.

       ::grammar::fa::op::minimize fa ?mapvar?
              Creates  a  FA  which accepts the same language as fa, but has a
              minimal number of states. Uses Brzozowski's method to accomplish
              this.

              The  command will store a dictionary describing the relationship
              between the new states of  the  resulting  minimal  fa  and  the
              states of the input fa in mapvar, if it has been specified. Keys
              of the dictionary are the handles for the states of the  result-
              ing minimal fa, values are sets of states from the input fa.

              Note:  An  empty dictionary signals that the command was able to
              minimize the fa without  having  to  compute  new  states.  This
              should happen if and only if the input FA was already minimal.

              Note: If the algorithm has no start or final states to work with
              then the result might be technically minimal, but  have  a  very
              unexpected structure.  It should also be noted that in this case
              the possibilities for trimming states from the FA are  also  se-
              verely reduced as we cannot declare states unreachable.

       Language operations All operations in this section require that all in-
       put FAs have at least one start and at least one final state. Otherwise
       the  language  of  the  FAs  will  not be defined, making the operation
       senseless (as it operates on the languages of the FAs in a defined man-
       ner).

       ::grammar::fa::op::complement fa
              Complements  fa.  This is possible if and only if fa is complete
              and deterministic. The resulting FA  accepts  the  complementary
              language  of  fa. In other words, all inputs not accepted by the
              input are accepted by the result, and vice versa.

              The result will have all states and transitions  of  the  input,
              and different final states.

       ::grammar::fa::op::kleene fa
              Applies  Kleene's  closure  to fa.  The resulting FA accepts all
              strings S for which we can find a natural number n (0 inclusive)
              and  strings  A1 ... An in the language of fa such that S is the
              concatenation of A1 ... An.  In other words, the language of the
              result  is  the infinite union over finite length concatenations
              over the language of fa.

              The result will have all states and transitions  of  the  input,
              and new start and final states.

       ::grammar::fa::op::optional fa
              Makes  the  fa optional. In other words it computes the FA which
              accepts the language of fa and the empty the word  (epsilon)  as
              well.

              The  result  will  have all states and transitions of the input,
              and new start and final states.

       ::grammar::fa::op::union fa fb ?mapvar?
              Combines the FAs fa and fb such that the  resulting  FA  accepts
              the union of the languages of the two FAs.

              The result will have all states and transitions of the two input
              FAs, and new start and final states. All states of fb which  ex-
              ist in fa as well will be renamed, and the mapvar will contain a
              mapping from the old states of fb to the new ones, if present.

              It should be noted that the result  will  be  non-deterministic,
              even if the inputs are deterministic.

       ::grammar::fa::op::intersect fa fb ?mapvar?
              Combines  the  FAs  fa and fb such that the resulting FA accepts
              the intersection of the languages  of  the  two  FAs.  In  other
              words,  the result will accept a word if and only if the word is
              accepted by both fa and fb. The result will be useful,  but  not
              necessarily deterministic or minimal.

              The  command will store a dictionary describing the relationship
              between the new states of the resulting  fa  and  the  pairs  of
              states  of  the  input  FAs in mapvar, if it has been specified.
              Keys of the dictionary are the handles for the states of the re-
              sulting fa, values are pairs of states from the input FAs. Pairs
              are represented by lists. The first element in each pair will be
              a state in fa, the second element will be drawn from fb.

       ::grammar::fa::op::difference fa fb ?mapvar?
              Combines  the  FAs  fa and fb such that the resulting FA accepts
              the difference of the languages of the two FAs. In other  words,
              the  result  will  accept  a word if and only if the word is ac-
              cepted by fa, but not by fb. This can also be expressed  as  the
              intersection of fa with the complement of fb. The result will be
              useful, but not necessarily deterministic or minimal.

              The command will store a dictionary describing the  relationship
              between  the  new  states  of  the resulting fa and the pairs of
              states of the input FAs in mapvar, if  it  has  been  specified.
              Keys of the dictionary are the handles for the states of the re-
              sulting fa, values are pairs of states from the input FAs. Pairs
              are represented by lists. The first element in each pair will be
              a state in fa, the second element will be drawn from fb.

       ::grammar::fa::op::concatenate fa fb ?mapvar?
              Combines the FAs fa and fb such that the  resulting  FA  accepts
              the cross-product of the languages of the two FAs. I.e. a word W
              will be accepted by the result if there are two words  A  and  B
              accepted by fa, and fb resp. and W is the concatenation of A and
              B.

              The result FA will be non-deterministic.

       ::grammar::fa::op::fromRegex fa regex ?over?
              Generates a non-deterministic FA which accepts the same language
              as  the regular expression regex. If the over is specified it is
              treated as the set of symbols the regular expression and the au-
              tomaton  are defined over. The command will compute the set from
              the "S" constructors in regex when over was not specified.  This
              set  is  important if and only if the complement operator "!" is
              used in regex as the complementary language of an  FA  is  quite
              different for different sets of symbols.

              The  regular  expression  is represented by a nested list, which
              forms a syntax tree. The following structures are legal:

              {S x}  Atomic regular expression. Everything else is constructed
                     from these. Accepts the Symbol "x".

              {. A1 A2 ...}
                     Concatenation  operator. Accepts the concatenation of the
                     regular expressions A1, A2, etc.

                     Note that this operator accepts zero or  more  arguments.
                     With  zero arguments the represented language is epsilon,
                     the empty word.

              {| A1 A2 ...}
                     Choice operator, also called "Alternative".  Accepts  all
                     input accepted by at least one of the regular expressions
                     A1, A2, etc. In other words, the union of A1, A2.

                     Note that this operator accepts zero or  more  arguments.
                     With zero arguments the represented language is the empty
                     language, the language without words.

              {& A1 A2 ...}
                     Intersection operator, logical and. Accepts all input ac-
                     cepted  which  is  accepted by all of the regular expres-
                     sions A1, A2, etc. In other words,  the  intersection  of
                     A1, A2.

              {? A}  Optionality operator. Accepts the empty word and anything
                     from the regular expression A.

              {* A}  Kleene closure. Accepts the empty  word  and  any  finite
                     concatenation of words accepted by the regular expression
                     A.

              {+ A}  Positive Kleene closure. Accepts any finite concatenation
                     of  words  accepted  by the regular expression A, but not
                     the empty word.

              {! A}  Complement operator. Accepts any word not accepted by the
                     regular expression A. Note that the complement depends on
                     the set of symbol the result should  run  over.  See  the
                     discussion of the argument over before.

       ::grammar::fa::op::toRegexp fa
              This  command  generates  and returns a regular expression which
              accepts the same language as the finite automaton fa. The  regu-
              lar  expression is in the format as described above, for ::gram-
              mar::fa::op::fromRegex.

       ::grammar::fa::op::toRegexp2 fa
              This   command   has   the   same   functionality   as   ::gram-
              mar::fa::op::toRegexp,  but  uses  a different algorithm to sim-
              plify the generated regular expressions.

       ::grammar::fa::op::toTclRegexp regexp symdict
              This command generates and returns a regular expression  in  Tcl
              syntax  for  the regular expression regexp, if that is possible.
              regexp  is  in  the  same  format   as   expected   by   ::gram-
              mar::fa::op::fromRegex.

              The command will fail and throw an error if regexp contains com-
              plementation and intersection operations.

              The argument symdict is a dictionary  mapping  symbol  names  to
              pairs of syntactic type and Tcl-regexp. If a symbol occurring in
              the regexp is not listed in this dictionary then  single-charac-
              ter  symbols are considered to designate themselves whereas mul-
              tiple-character symbols are considered to be a  character  class
              name.

       ::grammar::fa::op::simplifyRegexp regexp
              This  command  simplifies  a  regular expression by applying the
              following algorithm first to the main expression and then recur-
              sively to all sub-expressions:

              [1]    Convert the expression into a finite automaton.

              [2]    Minimize the automaton.

              [3]    Convert the automaton back to a regular expression.

              [4]    Choose  the shorter of original expression and expression
                     from the previous step.

EXAMPLES
BUGS, IDEAS, FEEDBACK
       This document, and the package it describes, will  undoubtedly  contain
       bugs and other problems.  Please report such in the category grammar_fa
       of the Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist].   Please
       also  report any ideas for enhancements you may have for either package
       and/or documentation.

       When proposing code changes, please provide unified diffs, i.e the out-
       put of diff -u.

       Note  further  that  attachments  are  strongly  preferred over inlined
       patches. Attachments can be made by going  to  the  Edit  form  of  the
       ticket  immediately  after  its  creation, and then using the left-most
       button in the secondary navigation bar.

KEYWORDS
       automaton, finite automaton, grammar, parsing, regular expression, reg-
       ular grammar, regular languages, state, transducer

CATEGORY
       Grammars and finite automata

COPYRIGHT
       Copyright (c) 2004-2008 Andreas Kupries <andreas_kupries@users.sourceforge.net>

tcllib                                0.4                grammar::fa::op(3tcl)

Man(1) output converted with man2html
list of all man pages