PHP Cross Reference of DokuWiki

/inc/Parsing/Lexer/ -> Lexer.php (summary)

Lexer adapted from Simple Test: http://sourceforge.net/projects/simpletest/ For an intro to the Lexer see: https://web.archive.org/web/20120125041816/http://www.phppatterns.com/docs/develop/simple_test_lexer_notes

Author: Marcus Baker http://www.lastcraft.com
File Size: 347 lines (11 kb)
Included or required: 0 times
Referenced: 0 times
Includes or requires: 0 files

Defines 1 class

Lexer:: (14 methods):
  __construct()
  addPattern()
  addEntryPattern()
  addExitPattern()
  addSpecialPattern()
  mapHandler()
  parse()
  dispatchTokens()
  isModeEnd()
  isSpecialMode()
  decodeSpecial()
  invokeHandler()
  reduce()
  escape()


Class: Lexer  - X-Ref

Accepts text and breaks it into tokens.

Some optimisation is applied to make sure the content is only scanned by the PHP regex
parser once. Lexer modes must not start with leading underscores.

__construct($handler, $start = "accept", $case = false)   X-Ref
Sets up the lexer with case-insensitive matching by default.

param: \Doku_Handler $handler  Handling strategy by reference.
param: string $start            Starting handler.
param: boolean $case            True for case sensitive.

addPattern($pattern, $mode = "accept")   X-Ref
Adds a token search pattern for a particular parsing mode.

The pattern does not change the current mode.

param: string $pattern      Perl style regex, but ( and ) lose the usual meaning.
param: string $mode         Should only apply this pattern when dealing with this type of input.

addEntryPattern($pattern, $mode, $new_mode)   X-Ref
Adds a pattern that will enter a new parsing mode.

Useful for entering parenthesis, strings, tags, etc.

param: string $pattern      Perl style regex, but ( and ) lose the usual meaning.
param: string $mode         Should only apply this pattern when dealing with this type of input.
param: string $new_mode     Change parsing to this new nested mode.

addExitPattern($pattern, $mode)   X-Ref
Adds a pattern that will exit the current mode and re-enter the previous one.

param: string $pattern      Perl style regex, but ( and ) lose the usual meaning.
param: string $mode         Mode to leave.
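
The entry and exit patterns imply a stack of modes: an entry pattern pushes the new nested mode, and an exit pattern pops back to the previous one. The following is a minimal sketch of that idea only. The class `ModeStackSketch` and its method names are hypothetical and are not taken from this file.

```php
<?php
// Hypothetical illustration; class and method names are assumptions.
final class ModeStackSketch
{
    /** @var string[] Modes from outermost (index 0) to innermost. */
    private $stack;

    public function __construct(string $start = 'accept')
    {
        $this->stack = [$start];
    }

    // The mode currently used for matching is the top of the stack.
    public function current(): string
    {
        return $this->stack[count($this->stack) - 1];
    }

    // What an entry pattern would trigger: nest into a new mode.
    public function enter(string $mode): void
    {
        $this->stack[] = $mode;
    }

    // What an exit pattern would trigger: return to the previous mode.
    // Returns false when only the starting mode is left.
    public function leave(): bool
    {
        if (count($this->stack) <= 1) {
            return false;
        }
        array_pop($this->stack);
        return true;
    }
}

$modes = new ModeStackSketch('base');
$modes->enter('strong');
echo $modes->current(), "\n"; // prints "strong"
$modes->leave();
echo $modes->current(), "\n"; // prints "base"
```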

addSpecialPattern($pattern, $mode, $special)   X-Ref
Adds a pattern that has a special mode.

Acts as an entry and exit pattern in one go, effectively calling a special
parser handler for this token only.

param: string $pattern      Perl style regex, but ( and ) lose the usual meaning.
param: string $mode         Should only apply this pattern when dealing with this type of input.
param: string $special      Use this mode for this one token.

mapHandler($mode, $handler)   X-Ref
Adds a mapping from a mode to another handler.

param: string $mode        Mode to be remapped.
param: string $handler     New target handler.

parse($raw)   X-Ref
Splits the page text into tokens.

Will fail if the handlers report an error or if no content is consumed. If successful then each
unparsed and parsed token invokes a call to the held listener.

param: string $raw        Raw HTML text.
return: boolean           True on success, else false.

dispatchTokens($unmatched, $matched, $mode, $initialPos, $matchPos)   X-Ref
Sends the matched token and any leading unmatched
text to the parser, changing the lexer to a new
mode if one is listed.

param: string $unmatched Unmatched leading portion.
param: string $matched Actual token match.
param: bool|string $mode Mode after match. A boolean false mode causes no change.
param: int $initialPos
param: int $matchPos Current byte index location in the raw document that is being parsed
return: boolean             False if there was any error from the parser.

isModeEnd($mode)   X-Ref
Tests whether the new mode is actually an instruction to leave the current mode and pop an item
from the mode stack.

param: string $mode    Mode to test.
return: boolean        True if this is the exit mode.

isSpecialMode($mode)   X-Ref
Tests whether the mode is a special mode, that is, one entered for this token only and left
again immediately afterwards.

param: string $mode    Mode to test.
return: boolean        True if this is a special (single token) mode.

decodeSpecial($mode)   X-Ref
Strips the magic underscore marking single token modes.

param: string $mode    Mode to decode.
return: string         Underlying mode name.
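
The descriptions of isSpecialMode() and decodeSpecial() suggest that single token modes are marked with a leading underscore, which is also why ordinary mode names must not start with one. A minimal sketch of that convention follows. The function names and the mode name `_example` are hypothetical, and the leading position of the marker is an assumption drawn from the class description.

```php
<?php
// Hypothetical helpers; they only illustrate the underscore convention.

// A mode counts as "special" here when its name starts with an underscore.
function isSpecialModeSketch(string $mode): bool
{
    return $mode !== '' && $mode[0] === '_';
}

// Drop the leading marker character to recover the underlying mode name.
function decodeSpecialSketch(string $mode): string
{
    return (string) substr($mode, 1);
}

$mode = '_example';
if (isSpecialModeSketch($mode)) {
    echo decodeSpecialSketch($mode), "\n"; // prints "example"
}
```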

invokeHandler($content, $is_match, $pos)   X-Ref
Calls the parser method named after the current mode.

Empty content will be ignored. The lexer has a parser handler for each mode in the lexer.

param: string $content Text parsed.
param: boolean $is_match Token is recognised rather than unparsed data.
param: int $pos Current byte index location in raw doc
return: bool

reduce(&$raw)   X-Ref
Tries to match a chunk of text and if successful removes the recognised chunk and any leading
unparsed data. Empty strings will not be matched.

param: string $raw         The subject to parse. This is the content that will be eaten.
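return: array|bool         Three item list of unparsed content followed by the recognised token and finally the action the parser is to take. True if no match, false if there is a parsing error.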
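
The "eating" behaviour described for reduce() can be sketched with a single `preg_match()` call that records the match offset. This is an illustration under assumptions, not the method's actual code: `reduceSketch` is a hypothetical function, it takes one plain regex instead of the lexer's per-mode patterns, and it returns only two items (unparsed text and matched token), or null when nothing matches.

```php
<?php
// Hypothetical sketch of "match a chunk, then remove it and any leading text".
function reduceSketch(string &$raw, string $regex): ?array
{
    if (!preg_match($regex, $raw, $m, PREG_OFFSET_CAPTURE)) {
        return null; // no match: $raw is left untouched
    }
    [$match, $offset] = $m[0];
    if ($match === '') {
        return null; // empty strings are not treated as matches
    }
    // Everything before the match is unparsed leading text.
    $unparsed = (string) substr($raw, 0, $offset);
    // Consume the leading text and the match from the subject.
    $raw = (string) substr($raw, $offset + strlen($match));
    return [$unparsed, $match];
}

$raw = 'foo **bar';
$result = reduceSketch($raw, '/\*\*/');
// $result is ['foo ', '**'] and $raw is now 'bar'
```

Passing the subject by reference mirrors the `&$raw` signature above: each successful call shortens the remaining input.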

escape($str)   X-Ref
Escapes regex characters other than (, ) and /

param: string $str
return: string
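
As a rough illustration of escaping regex characters while leaving `(`, `)` and `/` alone, the sketch below prefixes a fixed set of characters with a backslash. The function name `escapeSketch` and the exact character set are assumptions. The real method may escape a different set or work differently.

```php
<?php
// Hypothetical sketch; the list of escaped characters is an assumption.
function escapeSketch(string $str): string
{
    // Note that "(", ")" and "/" are deliberately absent from this list.
    $special = ['\\', '.', '+', '*', '?', '[', '^', ']', '$',
                '{', '}', '=', '!', '<', '>', '|', ':'];
    $out = '';
    $length = strlen($str);
    for ($i = 0; $i < $length; $i++) {
        $ch = $str[$i];
        // Prefix special characters with a backslash, copy the rest as-is.
        $out .= in_array($ch, $special, true) ? '\\' . $ch : $ch;
    }
    return $out;
}

echo escapeSketch('a.b(c)/d?'), "\n"; // prints a\.b(c)/d\?
```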