data/TWiki/TWikiInfixParserDotPm.txt
changeset 0 414e01d06fd5
-1:000000000000 0:414e01d06fd5
       
     1 ---+ Package =TWiki::Infix::Parser=
       
     2 
       
     3 A simple stack-based parser that parses infix expressions with nullary,
       
     4 unary and binary operators specified using an operator table.
       
     5 
       
     6 Escapes are supported in strings, using backslash.
       
     7 
       
     8 
       
     9 %TOC%
       
    10 
       
    11 ---++ new($client_class, \%options) -> parser object
       
    12 
       
    13 Creates a new infix parser. Operators must be added for it to be useful.
       
    14 
       
    15 The tokeniser matches tokens in the following order: operators,
       
    16 quotes (" and '), numbers, words, brackets. If you have any overlaps (e.g.
       
     17 an operator '<' and a bracket operator '<<') then whichever comes first in
       
     18 this order will match (here, the operator '<').
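
The matching order described above can be illustrated with a small standalone sketch. It is not the module's own tokeniser: the token patterns, the =tokenise= function and the =@classes= table are all assumptions made for the example. It only shows why an operator '<' hides a bracket '<<' when operators are tried first.

```perl
use strict;
use warnings;

# Hypothetical token classes, tried in the documented order.
# The patterns are illustrative assumptions, not the module's own.
my @classes = (
    [ operator => qr/</ ],                                          # an operator '<'
    [ quote    => qr/"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'/ ],        # backslash escapes allowed
    [ number   => qr/\d+/ ],
    [ word     => qr/\w+/ ],
    [ bracket  => qr/<<|>>/ ],                                      # a bracket pair '<<' ... '>>'
);

# Split $input into [class, text] pairs, taking the first class that
# matches at the current position.
sub tokenise {
    my ($input) = @_;
    my @tokens;
    pos($input) = 0;
  TOKEN:
    while (1) {
        $input =~ /\G\s+/gc;                      # skip whitespace
        last if pos($input) >= length($input);
        for my $class (@classes) {
            my ($name, $re) = @$class;
            if ($input =~ /\G($re)/gc) {
                push @tokens, [ $name, $1 ];
                next TOKEN;
            }
        }
        die "Cannot tokenise at offset " . pos($input) . "\n";
    }
    return @tokens;
}

# '<<' is never seen as a bracket: the operator '<' matches first, twice.
my @tokens = tokenise('a << b');
print join(' ', map { "$_->[0]($_->[1])" } @tokens), "\n";
```

In this sketch, renaming either the operator or the bracket so that they no longer share a prefix is the only way to make the bracket reachable.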
       
    19 
       
    20 =$client_class= needs to be the _name_ of a _package_ that supports the
       
    21 following two functions:
       
    22    * =newLeaf($val, $type)= - create a terminal. $type will be:
       
    23       1 if the terminal matched the =words= specification (see below).
       
     24       2 if the terminal matched the =numbers= specification (see below)
       
    25       3 if it is a quoted string
       
     26    * =newNode($op, @params)= - create a new operator node. @params
       
    27      is a variable-length list of parameters, left to right. $op
       
    28      is a reference to the operator hash in the \@opers list.
       
    29 These functions should throw Error::Simple in the event of errors.
       
    30 TWiki::Infix::Node is such a class, ripe for subclassing.
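
A minimal client class might look like the sketch below. The package name =MyExpr::Node=, the =stringify= helper and the internal hash layout are hypothetical, and the sketch assumes the parser invokes =newLeaf= and =newNode= as class methods (so the class name arrives as the first argument). It uses plain =die= to stay self-contained; a real client would throw Error::Simple as described above.

```perl
use strict;
use warnings;

# Hypothetical client class; the name MyExpr::Node is an assumption.
package MyExpr::Node;

# Leaf type codes as listed above.
use constant { NAME => 1, NUMBER => 2, STRING => 3 };

# Create a terminal. Assumes the parser calls this as a class method.
sub newLeaf {
    my ($class, $val, $type) = @_;
    # A real client would throw Error::Simple here instead of die.
    die "Unknown leaf type\n"
        unless defined $type
        && ($type == NAME || $type == NUMBER || $type == STRING);
    return bless { type => $type, value => $val }, $class;
}

# Create an operator node. $op is the operator hash; @params are the
# operands, left to right.
sub newNode {
    my ($class, $op, @params) = @_;
    return bless { op => $op, params => [@params] }, $class;
}

# Render the tree as a prefix expression, for inspection only.
sub stringify {
    my ($this) = @_;
    return $this->{value} unless $this->{op};
    return '(' . join(' ', $this->{op}{name},
        map { $_->stringify } @{ $this->{params} }) . ')';
}

package main;

# Build by hand the tree a parser might produce for "a + 2".
my $plus = { name => '+', prec => 1, arity => 2 };
my $tree = MyExpr::Node->newNode(
    $plus,
    MyExpr::Node->newLeaf('a', MyExpr::Node::NAME),
    MyExpr::Node->newLeaf(2,   MyExpr::Node::NUMBER),
);
print $tree->stringify, "\n";    # prints "(+ a 2)"
```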
       
    31 
       
    32 The remaining parameters are named, and specify options that affect the
       
    33 behaviour of the parser:
       
    34    1 =words=>qr//= - should be an RE specifying legal words (unquoted
       
    35      terminals that are not operators i.e. names and numbers). By default
       
    36      this is =\w+=.
       
    37      It's ok if operator names match this RE; operators always have precedence
       
    38      over atoms.
       
    39    2 =numbers=>qr//= - should be an RE specifying legal numbers (unquoted
       
    40      terminals that are not operators or words). By default
       
    41      this is =qr/[+-]?(?:\d+\.\d+|\d+\.|\.\d+|\d+)(?:[eE][+-]?\d+)?/=,
       
    42      which matches integers and floating-point numbers. Number
       
    43      matching always takes precedence over word matching (i.e. "1xy" will
       
     44      be parsed as a number followed by a word). A typical usage of this option
       
    45      is when you only want to recognise integers, in which case you would set
       
    46      this to =numbers => qr/\d+/=.
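
The effect of the default =numbers= pattern can be checked outside the parser. The sketch below copies the default expression quoted above and applies it to the start of a few strings; the =leading_number= helper is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# Default number pattern quoted in the text above.
my $numbers = qr/[+-]?(?:\d+\.\d+|\d+\.|\.\d+|\d+)(?:[eE][+-]?\d+)?/;

# Return the number at the start of $text and whatever follows it,
# or (undef, $text) if $text does not start with a number.
sub leading_number {
    my ($text, $re) = @_;
    return $text =~ /^($re)(.*)$/s ? ($1, $2) : (undef, $text);
}

for my $input ('1xy', '3.14e-2', '.5', '42') {
    my ($num, $rest) = leading_number($input, $numbers);
    printf "%-8s number=%-8s rest=%s\n", $input, $num, $rest;
}

# Integer-only variant suggested in the text: "3.14" now stops at "3".
my ($int, $tail) = leading_number('3.14', qr/\d+/);
print "integer=$int rest=$tail\n";
```

For "1xy" the number part is "1" and the remainder "xy" is left for the =words= pattern, which is the behaviour the option description refers to.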
       
    47 
       
    48 
       
    49 ---++ ObjectMethod *addOperator* <tt>(%oper)</tt>
       
    50 Add an operator to the parser.
       
    51 
       
    52 =%oper= is a hash, containing the following fields:
       
    53    * =name= - operator string
       
     54    * =prec= - operator precedence, a positive integer.
       
    55      Larger number => higher precedence.
       
    56    * =arity= - set to 1 if this operator is unary, 2 for binary. Arity 0
       
    57      is legal, should you ever need it.
       
    58    * =close= - used with bracket operators. =name= should be the open
       
     59      bracket string, and =close= the close bracket. The existence of =close=
       
    60      marks this as a bracket operator.
       
     61    * =casematters= - indicates that the parser should check case in the
       
    62      operator name (i.e. treat 'AND' and 'and' as different).
       
    63      By default operators are case insensitive. *Note* that operator
       
    64      names must be caselessly unique i.e. you can't define 'AND' and 'and'
       
    65      as different operators in the same parser. Does not affect the
       
    66      interpretation of non-operator terminals (names).
       
    67 Other fields in the hash can be used for other purposes; the parse tree
       
    68 generated by this parser will point to the hashes passed to this function.
       
    69 
       
    70 Field names in the hash starting with =InfixParser_= are reserved for use
       
    71 by the parser.
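
As a sketch of what an operator table could look like, the example below defines a few operator hashes and checks them against the rules stated above (positive integer precedence, caseless uniqueness of names, reserved =InfixParser_= prefix). The operators, precedences, the arity given to the bracket operator and the =check_operators= helper are all assumptions for illustration; the parser may or may not perform such checks itself.

```perl
use strict;
use warnings;

# Illustrative operator table; names and precedences are assumptions.
my @operators = (
    { name => 'or',  prec => 100, arity => 2 },
    { name => 'and', prec => 200, arity => 2 },
    { name => 'not', prec => 300, arity => 1 },
    { name => '=',   prec => 400, arity => 2, casematters => 1 },
    { name => '(',   prec => 500, arity => 1, close => ')' },   # bracket operator
);

# Check a table against the documented rules; returns a list of problems.
sub check_operators {
    my @ops = @_;
    my (%seen, @problems);
    for my $op (@ops) {
        my $name = $op->{name};
        push @problems, "'$name': prec must be a positive integer"
            unless defined $op->{prec} && $op->{prec} =~ /^[1-9]\d*$/;
        push @problems, "'$name': clashes caselessly with another operator"
            if $seen{ lc $name }++;
        push @problems, "'$name': field names starting InfixParser_ are reserved"
            if grep { /^InfixParser_/ } keys %$op;
    }
    return @problems;
}

# Adding 'AND' alongside 'and' breaks the caseless uniqueness rule.
my @problems = check_operators(@operators,
    { name => 'AND', prec => 200, arity => 2 });
print "$_\n" for @problems;

# With a parser object in hand, each valid entry would be registered
# with a call of the form: $parser->addOperator(%$op);
```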
       
    72 
       
    73 
       
    74 
       
    75 ---++ ObjectMethod *parse* <tt>($string) -> $parseTree</tt>
       
    76 Parses =$string=, calling =newLeaf= and =newNode= in the client class
       
    77 as necessary to create a parse tree. Returns the result of calling =newNode=
       
    78 on the root of the parse.
       
    79 
       
    80 Throws TWiki::Infix::Error in the event of parse errors.
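
One way a caller might guard against parse errors is sketched below. The =try_parse= helper is hypothetical, and =StubParser= is a stand-in used only so that the example runs without TWiki installed; with a real parser object the same helper would be called with that object instead. The sketch assumes that the thrown TWiki::Infix::Error object stringifies to a readable message.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Parse $expr with $parser. Returns ($tree, undef, 0) on success, or
# (undef, $message, $is_parse_error) on failure. Works with any object
# that has a parse($string) method.
sub try_parse {
    my ($parser, $expr) = @_;
    my $tree;
    my $ok = eval { $tree = $parser->parse($expr); 1 };
    return ($tree, undef, 0) if $ok;
    my $err = $@;
    # Distinguish the documented parse error from other failures.
    my $is_parse_error =
        blessed($err) && $err->isa('TWiki::Infix::Error') ? 1 : 0;
    return (undef, "$err", $is_parse_error);
}

# Hypothetical stand-in for a parser, used only to exercise try_parse.
package StubParser;
sub new { return bless {}, shift }
sub parse {
    my ($this, $expr) = @_;
    die "Missing operand in '$expr'\n" if $expr =~ /\band\s*$/;
    return { parsed => $expr };
}

package main;

for my $expr ('a and b', 'a and') {
    my ($tree, $err) = try_parse(StubParser->new, $expr);
    print defined $err ? "error: $err" : "ok: $tree->{parsed}\n";
}
```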
       
    81 
       
    82