string::token::shell(3tcl) Text and string utilities string::token::shell(3tcl)

______________________________________________________________________________

NAME
       string::token::shell - Parsing of shell command line

SYNOPSIS
       package require Tcl  8.5

       package require string::token::shell  ?1.2?

       package require string::token  ?1?

       package require fileutil

       ::string token shell ?-indices? ?-partial? ?--? string

______________________________________________________________________________

DESCRIPTION
       This package provides a command which parses a line of text using basic
       sh-syntax into a list of words.

       The complete set of procedures is described below.

       ::string token shell ?-indices? ?-partial? ?--? string
              This command parses the input string, assuming that it follows
              basic sh-syntax.  The result of the command is a list of the
              words in the string.  An error is thrown if the input does not
              follow the allowed syntax.  The behaviour can be modified by
              specifying either or both of the options -indices and -partial.

              --     When specified, option parsing stops at this point. This
                     option is needed if the input string may start with a
                     dash. In other words, it is effectively required whenever
                     string is user input.

              -indices
                     When specified, the output is not a list of words, but a
                     list of 4-tuples describing the words. Each tuple contains
                     the type of the word, its start and end indices in the
                     input, and the actual text of the word.

                     Note that the length of the word as given by the indices
                     can differ from the length of the word found in the last
                     element of the tuple. The indices describe the word's
                     extent in the input, including delimiters, intra-word
                     quoting, etc., whereas in the actual text of the word the
                     delimiters are stripped, intra-word quoting is decoded,
                     etc.

                     The possible token types are

                     PLAIN  Plain word, not quoted.

                     D:QUOTED
                            Word is delimited by double-quotes.

                     S:QUOTED
                            Word is delimited by single-quotes.

                     D:QUOTED:PART

                     S:QUOTED:PART
                            Like D:QUOTED and S:QUOTED respectively, but the
                            word has no closing quote, i.e. it is incomplete.
                            These token types can occur if and only if the
                            option -partial was specified, and only for the
                            last word of the result. If the option -partial
                            was not specified, such incomplete words cause the
                            command to throw an error instead.

              -partial
                     When specified, the parser accepts an incomplete quoted
                     word (i.e. one without a closing quote) at the end of the
                     line as valid instead of throwing an error.
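
       The default mode and the two options can be compared side by side. The
       following is a minimal sketch, assuming that tcllib is installed and
       that the package can be loaded; the input lines and variable names are
       hypothetical and chosen only for illustration.

```tcl
package require Tcl 8.5
package require string::token::shell

# Hypothetical input line, used only for illustration.
set line {cp 'my file.txt' "backup dir"}

# Default mode: the result is a plain list of decoded words.
# The "--" protects against input that happens to start with a dash.
set words [string token shell -- $line]
foreach w $words {
    puts "word: <$w>"
}

# -indices: every element is a 4-tuple {type start end text}.
# The indices cover the raw extent in the input, quotes included.
set types {}
foreach tok [string token shell -indices -- $line] {
    lassign $tok type start end text
    lappend types $type
    puts "$type $start..$end raw=<[string range $line $start $end]> text=<$text>"
}

# -partial: an unterminated quoted word at the end of the line is accepted.
set open {echo "unfinished text}
set last [lindex [string token shell -indices -partial -- $open] end]
puts "last token type: [lindex $last 0]"

# Without -partial the same input is rejected with an error.
set failed [catch {string token shell -- $open} msg]
puts "failed without -partial: $failed"
```

       Printing the raw range next to the decoded text makes the difference
       between the two lengths visible for the quoted words.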

       The basic shell syntax accepted here consists of unquoted,
       single-quoted, and double-quoted words, separated by whitespace.
       Leading and trailing whitespace is permitted and is stripped.  Shell
       variables in their various forms are not recognized, nor are
       sub-shells.  The recognized forms of words are specified in detail
       below.

              single-quoted word
                     A single-quoted word begins with a single-quote
                     character, i.e. ' (ASCII 39), followed by zero or more
                     Unicode characters other than a single-quote, and is
                     closed by a single-quote.

                     The word must be followed by either the end of the
                     string or whitespace. Another word cannot directly
                     follow it.

              double-quoted word
                     A double-quoted word begins with a double-quote
                     character, i.e. " (ASCII 34), followed by zero or more
                     Unicode characters other than a double-quote, and is
                     closed by a double-quote.

                     In contrast to single-quoted words, a double-quote can
                     be embedded into the word by escaping it with a
                     preceding backslash character \ (ASCII 92). Similarly, a
                     backslash character must be escaped with another
                     backslash to be inserted literally.

              unquoted word
                     Unquoted words are not delimited by quotes and thus
                     cannot contain whitespace or single-quote characters.
                     Double-quote and backslash characters can be put into
                     unquoted words by escaping them with a backslash, as in
                     double-quoted words.

              whitespace
                     Whitespace is any Unicode space character.  This is
                     equivalent to string is space, or the regular expression
                     \s.

                     Whitespace  may occur before the first word, or after the
                     last word. Whitespace must occur between adjacent words.
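
       The word forms described above can be tried out directly. The following
       is a minimal sketch, assuming that tcllib is installed; the sample
       inputs and variable names are hypothetical. The inputs are written in
       braces so that Tcl itself does not interpret the backslashes.

```tcl
package require Tcl 8.5
package require string::token::shell

# An unquoted word with escaped double-quotes, a double-quoted word with an
# escaped backslash, and a single-quoted word, where nothing is decoded.
set escaped [string token shell -- {say \"hi\" "C:\\tmp" 'it is \n raw'}]
foreach w $escaped {
    puts "word: <$w>"
}

# Leading and trailing whitespace is stripped; tabs and newlines count as
# whitespace too. Double quotes are used here so that Tcl produces a real
# tab and a real newline in the input.
set spaced [string token shell -- "  alpha\tbeta \n"]
puts "words: $spaced"

# A word directly following a quoted word, without whitespace in between,
# does not follow the syntax and is expected to raise an error.
set adjacent [catch {string token shell -- {'a'b}} msg]
puts "adjacent words rejected: $adjacent"
```

       Catching the error, as in the last step, is one way to report
       malformed input to a user without aborting the calling script.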

BUGS, IDEAS, FEEDBACK
       This document, and the package it describes, will  undoubtedly  contain
       bugs  and  other problems.  Please report such in the category textutil
       of the Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist].   Please
       also  report any ideas for enhancements you may have for either package
       and/or documentation.

       When proposing code changes, please provide unified diffs, i.e. the
       output of diff -u.

       Note  further  that  attachments  are  strongly  preferred over inlined
       patches. Attachments can be made by going  to  the  Edit  form  of  the
       ticket  immediately  after  its  creation, and then using the left-most
       button in the secondary navigation bar.

KEYWORDS
       bash, lexing, parsing, shell, string, tokenization

CATEGORY
       Text processing

tcllib                                1.2           string::token::shell(3tcl)
