erl_scan(3erl)             Erlang Module Definition             erl_scan(3erl)

NAME
       erl_scan - The Erlang token scanner.

DESCRIPTION
       This  module  contains  functions  for tokenizing (scanning) characters
       into Erlang tokens.

DATA TYPES
       category() = atom()

       error_description() = term()

       error_info() =
           {erl_anno:location(), module(), error_description()}

       option() =
           return | return_white_spaces | return_comments | text |
           {reserved_word_fun, resword_fun()}

       options() = option() | [option()]

       symbol() = atom() | float() | integer() | string()

       resword_fun() = fun((atom()) -> boolean())

       token() =
           {category(), Anno :: erl_anno:anno(), symbol()} |
           {category(), Anno :: erl_anno:anno()}

       tokens() = [token()]

       tokens_result() =
           {ok, Tokens :: tokens(), EndLocation :: erl_anno:location()} |
           {eof, EndLocation :: erl_anno:location()} |
           {error,
            ErrorInfo :: error_info(),
            EndLocation :: erl_anno:location()}

EXPORTS
       category(Token) -> category()

              Types:

                 Token = token()

              Returns the category of Token.

       column(Token) -> erl_anno:column() | undefined

              Types:

                 Token = token()

              Returns the column of Token's collection of annotations. If
              there is no column, undefined is returned.

       end_location(Token) -> erl_anno:location() | undefined

              Types:

                 Token = token()

              Returns the end location of the text of  Token's  collection  of
              annotations. If there is no text, undefined is returned.

       format_error(ErrorDescriptor) -> string()

              Types:

                 ErrorDescriptor = error_description()

              Uses an ErrorDescriptor and returns a string that describes the
              error or warning. This function is usually called implicitly
              when an ErrorInfo structure is processed (see section Error
              Information).

       line(Token) -> erl_anno:line()

              Types:

                 Token = token()

              Returns the line of Token's collection of annotations.

       location(Token) -> erl_anno:location()

              Types:

                 Token = token()

              Returns the location of Token's collection of annotations.

       reserved_word(Atom :: atom()) -> boolean()

              Returns true if Atom  is  an  Erlang  reserved  word,  otherwise
              false.
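
              As a minimal sketch, the check can be applied to a list of
              candidate atoms. The module name resword_demo and the helper
              classify/1 are hypothetical and exist only for illustration.

```erlang
-module(resword_demo).
-export([classify/1]).

%% Hypothetical helper: pair each atom with the result of
%% erl_scan:reserved_word/1.
classify(Atoms) ->
    [{Atom, erl_scan:reserved_word(Atom)} || Atom <- Atoms].
```

              For example, resword_demo:classify(['case', 'end', foo]) marks
              'case' and 'end' as reserved and foo as an ordinary atom.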

       string(String) -> Return

       string(String, StartLocation) -> Return

       string(String, StartLocation, Options) -> Return

              Types:

                 String = string()
                 Options = options()
                 Return =
                     {ok, Tokens :: tokens(), EndLocation} |
                     {error, ErrorInfo :: error_info(), ErrorLocation}
                 StartLocation = EndLocation = ErrorLocation =
                     erl_anno:location()

              Takes the list of characters String and tries to scan (tokenize)
              them. Returns one of the following:

                {ok, Tokens, EndLocation}:
                  Tokens are the Erlang tokens from String. EndLocation is the
                  first location after the last token.

                {error, ErrorInfo, ErrorLocation}:
                  An error occurred. ErrorLocation is the first location after
                  the erroneous token.

              string(String)   is   equivalent   to   string(String,  1),  and
              string(String, StartLocation) is  equivalent  to  string(String,
              StartLocation, []).
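
              The equivalence of the three arities can be sketched as
              follows. The module name scan_basic and the function demo/0 are
              assumptions made for this example.

```erlang
-module(scan_basic).
-export([demo/0]).

demo() ->
    Src = "foo(X) -> X + 1.",
    %% string/1 starts at line 1, so each annotation is the integer 1.
    {ok, Tokens, EndLocation} = erl_scan:string(Src),
    %% The shorter arities are shorthand for string/3; matching against
    %% the already bound variables asserts that the results are equal.
    {ok, Tokens, EndLocation} = erl_scan:string(Src, 1),
    {ok, Tokens, EndLocation} = erl_scan:string(Src, 1, []),
    {Tokens, EndLocation}.
```

              Here Tokens begins with {atom,1,foo} and ends with {dot,1}, and
              EndLocation is the line number 1.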

              StartLocation indicates the initial location when scanning
              starts. If StartLocation is a line, Anno, EndLocation, and
              ErrorLocation are lines. If StartLocation is a pair of a line
              and a column, Anno takes the form of an opaque compound data
              type, and EndLocation and ErrorLocation are pairs of a line and
              a column. The token annotations contain information about the
              column and the line where the token begins, as well as the text
              of the token (if option text is specified), all of which can be
              accessed by calling column/1, line/1, location/1, and text/1.
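
              The effect of the two kinds of StartLocation can be sketched
              with the accessor functions, which avoid depending on the
              opaque annotation format. The module name scan_location and
              the function demo/0 are hypothetical.

```erlang
-module(scan_location).
-export([demo/0]).

demo() ->
    %% Line-only start location: no column and no text are recorded.
    {ok, [Plain | _], _} = erl_scan:string("foo.", 1),
    %% {Line, Column} start location combined with option text.
    {ok, [Rich | _], _} = erl_scan:string("foo.", {1, 1}, [text]),
    #{plain => {erl_scan:line(Plain),
                erl_scan:column(Plain),
                erl_scan:text(Plain)},
      rich => {erl_scan:location(Rich),
               erl_scan:text(Rich),
               erl_scan:end_location(Rich)}}.
```

              For the first token, the plain entry is {1, undefined,
              undefined}, whereas the rich entry carries the location {1,1},
              the text "foo", and the end location {1,4}.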

              A token is a tuple containing information about syntactic
              category, the token annotations, and the terminal symbol. For
              punctuation characters (such as ; and |) and reserved words, the
              category and the symbol coincide, and the token is represented
              by a two-tuple. Three-tuples have one of the following forms:

                * {atom, Anno, atom()}

                * {char, Anno, char()}

                * {comment, Anno, string()}

                * {float, Anno, float()}

                * {integer, Anno, integer()}

                * {string, Anno, string()}

                * {var, Anno, atom()}

                * {white_space, Anno, string()}
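
              The split between two-tuples and three-tuples can be made
              visible with a small sketch. The module name scan_shapes and
              the helper shapes/1 are hypothetical.

```erlang
-module(scan_shapes).
-export([shapes/1]).

%% Hypothetical helper: scan String and label each token by tuple size.
shapes(String) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(String),
    [shape(Token) || Token <- Tokens].

%% Punctuation and reserved words: category and symbol coincide.
shape({Category, _Anno}) ->
    {two_tuple, Category};
%% All other tokens carry a separate terminal symbol.
shape({Category, _Anno, Symbol}) ->
    {three_tuple, Category, Symbol}.
```

              Calling scan_shapes:shapes("case X of 42 -> ok end.") labels
              'case', 'of', '->', 'end', and dot as two-tuples, and X, 42,
              and ok as three-tuples.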

              Valid options:

                {reserved_word_fun, resword_fun()}:
                  A  callback  function  that  is  called when the scanner has
                  found an unquoted atom. If the function  returns  true,  the
                  unquoted  atom  itself becomes the category of the token. If
                  the function returns false, atom becomes the category of the
                  unquoted atom.

                return_comments:
                  Return comment tokens.

                return_white_spaces:
                  Return white space tokens. By convention, a newline
                  character, if present, is always the first character of the
                  text (there cannot be more than one newline in a white space
                  token).

                return:
                  Short for [return_comments, return_white_spaces].

                text:
                  Include the token text in the token annotation. The text  is
                  the part of the input corresponding to the token.
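
              The options can be compared side by side by looking only at
              the token categories. This is a sketch: the module name
              scan_options, the helper categories/2, and the callback that
              treats the atom unless as reserved are all assumptions made
              for illustration.

```erlang
-module(scan_options).
-export([categories/2, demo/0]).

%% Hypothetical helper: scan String with Options and keep the categories.
categories(String, Options) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(String, 1, Options),
    [erl_scan:category(Token) || Token <- Tokens].

demo() ->
    Src = "a % hi\n",
    %% Hypothetical callback: only the atom 'unless' counts as reserved.
    IsReserved = fun(Atom) -> Atom =:= unless end,
    #{default => categories(Src, []),
      comments => categories(Src, [return_comments]),
      %% A single option may be given without wrapping it in a list.
      all => categories(Src, return),
      reserved => categories("unless foo",
                             [{reserved_word_fun, IsReserved}])}.
```

              With no options only the atom is returned. return_comments
              adds the comment token, return adds the white space tokens as
              well, and the custom callback turns unless into a token whose
              category is unless.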

       symbol(Token) -> symbol()

              Types:

                 Token = token()

              Returns the symbol of Token.

       text(Token) -> erl_anno:text() | undefined

              Types:

                 Token = token()

              Returns  the text of Token's collection of annotations. If there
              is no text, undefined is returned.

       tokens(Continuation, CharSpec, StartLocation) -> Return

       tokens(Continuation, CharSpec, StartLocation, Options) -> Return

              Types:

                 Continuation = return_cont() | []
                 CharSpec = char_spec()
                 StartLocation = erl_anno:location()
                 Options = options()
                 Return =
                     {done,
                      Result :: tokens_result(),
                      LeftOverChars :: char_spec()} |
                     {more, Continuation1 :: return_cont()}
                 char_spec() = string() | eof
                 return_cont()
                   An opaque continuation.

              This is the re-entrant scanner, which scans characters until
              either a dot ('.' followed by a white space) or eof is reached.
              It returns:

                {done, Result, LeftOverChars}:
                  Indicates that there is sufficient input data to get a
                  result. Result is:

                  {ok, Tokens, EndLocation}:
                    The  scanning was successful. Tokens is the list of tokens
                    including dot.

                  {eof, EndLocation}:
                    End of file was encountered before any more tokens.

                  {error, ErrorInfo, EndLocation}:
                    An error occurred. LeftOverChars is the remaining
                    characters of the input data, starting from EndLocation.

                {more, Continuation1}:
                  More  data  is  required  for building a term. Continuation1
                  must be passed in a new call to tokens/3,4 when more data is
                  available.

              The  CharSpec  eof signals end of file. LeftOverChars then takes
              the value eof as well.

              tokens(Continuation, CharSpec, StartLocation) is  equivalent  to
              tokens(Continuation, CharSpec, StartLocation, []).

              For a description of the options, see string/3.
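
              A minimal sketch of the re-entrant protocol, feeding one form
              in two chunks. The module name scan_reentrant and the function
              demo/0 are hypothetical.

```erlang
-module(scan_reentrant).
-export([demo/0]).

demo() ->
    %% First call: the continuation must be []. The input stops before
    %% a dot is seen, so the scanner asks for more characters.
    {more, Continuation} = erl_scan:tokens([], "foo(bar", 1),
    %% Second call: the dot completes the token list, and the characters
    %% after it are handed back as LeftOverChars.
    {done, {ok, Tokens, _EndLocation}, LeftOverChars} =
        erl_scan:tokens(Continuation, "). next", 1),
    %% Passing eof with nothing pending produces an eof result.
    {done, {eof, _}, eof} = erl_scan:tokens([], eof, 1),
    {Tokens, LeftOverChars}.
```

              The returned token list ends with the dot token, and
              LeftOverChars holds the characters that follow the dot, ready
              to be passed to the next call with continuation [].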

ERROR INFORMATION
       ErrorInfo is the standard ErrorInfo structure that is returned from all
       I/O modules. The format is as follows:

       {ErrorLocation, Module, ErrorDescriptor}

       A string describing the error is obtained with the following call:

       Module:format_error(ErrorDescriptor)
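
       As a sketch of how the ErrorInfo structure is typically consumed, the
       following turns a scan error into a readable message. The module name
       scan_error and the function describe/1 are hypothetical.

```erlang
-module(scan_error).
-export([describe/1]).

describe(String) ->
    case erl_scan:string(String) of
        {ok, _Tokens, _EndLocation} ->
            ok;
        {error, {ErrorLocation, Module, ErrorDescriptor}, _} ->
            %% Dispatch on the module named in ErrorInfo. The result is
            %% flattened in case a module returns a deep character list.
            Message = lists:flatten(Module:format_error(ErrorDescriptor)),
            {error, ErrorLocation, Message}
    end.
```

       For example, scan_error:describe("'oops") reports an error at line 1,
       because the quoted atom is never terminated.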

NOTES
       The continuation of the first call to the re-entrant input functions
       must be []. For a complete description of how the re-entrant input
       scheme works, see Armstrong, Virding and Williams: 'Concurrent
       Programming in Erlang', Chapter 13.

SEE ALSO
       erl_anno(3erl), erl_parse(3erl), io(3erl)

Ericsson AB                       stdlib 3.13                   erl_scan(3erl)
