PHP Cross Reference of DokuWiki |
Author Marcus Baker: http://www.lastcraft.com. Version adapted from Simple Test: http://sourceforge.net/projects/simpletest/. For an intro to the Lexer see: https://web.archive.org/web/20120125041816/http://www.phppatterns.com/docs/develop/simple_test_lexer_notes
Author: | Marcus Baker |
Version: | $Id: lexer.php,v 1.1 2005/03/23 23:14:09 harryf Exp $ |
File Size: | 614 lines (20 kb) |
Included or required: | 2 times |
Referenced: | 0 times |
Includes or requires: | 0 files |
Doku_LexerParallelRegex:: (6 methods):
__construct()
addPattern()
match()
split()
_getCompoundedRegex()
_getPerlMatchingFlags()
Doku_LexerStateStack:: (4 methods):
__construct()
getCurrent()
enter()
leave()
Doku_Lexer:: (14 methods):
__construct()
addPattern()
addEntryPattern()
addExitPattern()
addSpecialPattern()
mapHandler()
parse()
_dispatchTokens()
_isModeEnd()
_isSpecialMode()
_decodeSpecial()
_invokeParser()
_reduce()
Doku_Lexer_Escape()
Class: Doku_LexerParallelRegex - X-Ref
Compounded regular expression. Any of the contained patterns could match, and when one does its label is returned.
__construct($case) X-Ref |
Constructor. Starts with no patterns. param: boolean $case True for case sensitive, false for case insensitive. |
addPattern($pattern, $label = true) X-Ref |
Adds a pattern with an optional label. param: mixed $pattern Perl style regex. Must be UTF-8 param: bool|string $label Label of regex to be returned |
match($subject, &$match) X-Ref |
Attempts to match all patterns at once against a string. param: string $subject String to match against. param: string $match First matched portion of subject. return: boolean True on success. |
split($subject, &$split) X-Ref |
Attempts to split the string against all patterns at once. author: Christopher Smith <chris@jalakai.co.uk> param: string $subject String to match against. param: array $split The split result: an array containing the pre-match, match and post-match strings. return: boolean True on success. |
_getCompoundedRegex() X-Ref |
Compounds the patterns into a single regular expression separated with the "or" operator. Caches the regex. Will automatically escape (, ) and / tokens. return: null|string |
_getPerlMatchingFlags() X-Ref |
Accessor for perl regex mode flags to use. return: string Perl regex flags. |
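The compounding described for Doku_LexerParallelRegex can be sketched roughly as follows. This is a minimal illustration, not DokuWiki's code: the function name sketch_parallel_match is hypothetical, and it assumes no pattern contains its own capturing group or an unescaped "/" delimiter.

```php
<?php
// Hypothetical helper, not DokuWiki's code: joins labelled patterns into one
// regex and reports which alternative produced the first match in the subject.
function sketch_parallel_match(array $patterns, string $subject): ?array
{
    // Assumption: no pattern has its own capturing group or an unescaped "/".
    $groups = [];
    $labels = [];
    foreach ($patterns as $label => $pattern) {
        $groups[] = '(' . $pattern . ')';
        $labels[] = $label;
    }
    // One group per pattern, separated by the "or" operator.
    $regex = '/' . implode('|', $groups) . '/u';
    if (!preg_match($regex, $subject, $matches)) {
        return null;
    }
    // Group i+1 belongs to pattern i; the non-empty group is the one that matched.
    for ($i = 1; $i < count($matches); $i++) {
        if ($matches[$i] !== '') {
            return ['label' => $labels[$i - 1], 'match' => $matches[0]];
        }
    }
    return null;
}

$result = sketch_parallel_match(
    ['bold' => '\*\*', 'italic' => '\/\/'],
    'plain //text// and **more**'
);
// $result is ['label' => 'italic', 'match' => '//']
```

Because every pattern is a separate group, the position of the non-empty group identifies the label, which is presumably why the real class escapes parentheses inside patterns.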
Class: Doku_LexerStateStack - X-Ref
States for a stack machine.
__construct($start) X-Ref |
Constructor. Starts in named state. param: string $start Starting state name. |
getCurrent() X-Ref |
Accessor for current state. return: string State. |
enter($state) X-Ref |
Adds a state to the stack and sets it to be the current state. param: string $state New state. |
leave() X-Ref |
Leaves the current state and reverts to the previous one. return: boolean False if we drop off the bottom of the list. |
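A mode stack with the four operations listed for Doku_LexerStateStack might look like the sketch below. The class name SketchStateStack is hypothetical, and the choice to refuse popping the starting state is an assumption drawn from the "False if we drop off" note on leave().

```php
<?php
// Hypothetical sketch of a mode stack; names are illustrative only.
final class SketchStateStack
{
    private array $stack;

    public function __construct(string $start)
    {
        $this->stack = [$start];
    }

    // The current state is always the top of the stack.
    public function getCurrent(): string
    {
        return $this->stack[count($this->stack) - 1];
    }

    public function enter(string $state): void
    {
        $this->stack[] = $state;
    }

    // Assumption: returns false rather than popping the last remaining state.
    public function leave(): bool
    {
        if (count($this->stack) === 1) {
            return false;
        }
        array_pop($this->stack);
        return true;
    }
}

$modes = new SketchStateStack('base');
$modes->enter('quoted');
$inside = $modes->getCurrent();   // 'quoted'
$left = $modes->leave();          // true, back in 'base'
$dropped = $modes->leave();       // false, 'base' cannot be left
```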
Class: Doku_Lexer - X-Ref
Accepts text and breaks it into tokens.
__construct($parser, $start = "accept", $case = false) X-Ref |
Sets up the lexer in case insensitive matching by default. param: Doku_Parser $parser Handling strategy by reference. param: string $start Starting handler. param: boolean $case True for case sensitive. |
addPattern($pattern, $mode = "accept") X-Ref |
Adds a token search pattern for a particular parsing mode. The pattern does not change the current mode. param: string $pattern Perl style regex, but ( and ) lose the usual meaning. param: string $mode Should only apply this pattern when dealing with this type of input. |
addEntryPattern($pattern, $mode, $new_mode) X-Ref |
Adds a pattern that will enter a new parsing mode. Useful for entering parentheses, strings, tags, etc. param: string $pattern Perl style regex, but ( and ) lose the usual meaning. param: string $mode Should only apply this pattern when dealing with this type of input. param: string $new_mode Change parsing to this new nested mode. |
addExitPattern($pattern, $mode) X-Ref |
Adds a pattern that will exit the current mode and re-enter the previous one. param: string $pattern Perl style regex, but ( and ) lose the usual meaning. param: string $mode Mode to leave. |
addSpecialPattern($pattern, $mode, $special) X-Ref |
Adds a pattern that has a special mode. Acts as an entry and exit pattern in one go, effectively calling a special parser handler for this token only. param: string $pattern Perl style regex, but ( and ) lose the usual meaning. param: string $mode Should only apply this pattern when dealing with this type of input. param: string $special Use this mode for this one token. |
mapHandler($mode, $handler) X-Ref |
Adds a mapping from a mode to another handler. param: string $mode Mode to be remapped. param: string $handler New target handler. |
parse($raw) X-Ref |
Splits the page text into tokens. Will fail if the handlers report an error or if no content is consumed. If successful then each unparsed and parsed token invokes a call to the held listener. param: string $raw Raw HTML text. return: boolean True on success, else false. |
_dispatchTokens($unmatched, $matched, $mode = false, $initialPos, $matchPos) X-Ref |
Sends the matched token and any leading unmatched text to the parser, changing the lexer to a new mode if one is listed. param: string $unmatched Unmatched leading portion. param: string $matched Actual token match. param: bool|string $mode Mode after match. A boolean false mode causes no change. param: int $initialPos param: int $matchPos return: boolean False if there was any error |
_isModeEnd($mode) X-Ref |
Tests to see if the new mode is actually to leave the current mode and pop an item from the matching mode stack. param: string $mode Mode to test. return: boolean True if this is the exit mode. |
_isSpecialMode($mode) X-Ref |
Tests to see if the mode is one that is entered for this token only and left automatically immediately afterwards. param: string $mode Mode to test. return: boolean True if this is a single-token (special) mode. |
_decodeSpecial($mode) X-Ref |
Strips the magic underscore marking single token modes. param: string $mode Mode to decode. return: string Underlying mode name. |
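The three mode-name helpers (_isModeEnd, _isSpecialMode, _decodeSpecial) suggest that mode changes are encoded in the label string itself. A minimal sketch of such an encoding follows; the "__exit" sentinel and all sketch_ function names are assumptions for illustration, since the reference only states that single-token modes carry a leading underscore.

```php
<?php
// Assumption: "__exit" as the leave sentinel is illustrative; the reference
// only says that special (single-token) modes are marked with an underscore.
const SKETCH_EXIT = '__exit';

function sketch_is_mode_end(string $mode): bool
{
    return $mode === SKETCH_EXIT;
}

function sketch_is_special_mode(string $mode): bool
{
    return strncmp($mode, '_', 1) === 0;
}

// Strips the leading underscore to recover the underlying mode name.
function sketch_decode_special(string $mode): string
{
    return substr($mode, 1);
}

// Classifies a pattern label the way a dispatcher might.
function sketch_classify(string $mode): string
{
    // The exit sentinel also starts with "_", so it must be tested first.
    if (sketch_is_mode_end($mode)) {
        return 'leave';
    }
    if (sketch_is_special_mode($mode)) {
        return 'special:' . sketch_decode_special($mode);
    }
    return 'enter:' . $mode;
}
```

For example, sketch_classify('_smiley') yields 'special:smiley', while sketch_classify('strong') yields 'enter:strong'.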
_invokeParser($content, $is_match, $pos) X-Ref |
Calls the parser method named after the current mode. Empty content will be ignored. The lexer has a parser handler for each mode in the lexer. param: string $content Text parsed. param: boolean $is_match Token is recognised rather than unparsed data. param: int $pos Current byte index location in raw doc return: bool |
_reduce(&$raw) X-Ref |
Tries to match a chunk of text and if successful removes the recognised chunk and any leading unparsed data. Empty strings will not be matched. param: string $raw The subject to parse. This is the content that will be eaten. return: array Three item list of unparsed content followed by the recognised token and finally the action the parser is to take. |
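The reduce step described for _reduce() can be illustrated with a small loop. This is a sketch under stated assumptions: sketch_reduce is a hypothetical name, a single fixed pattern stands in for the per-mode compound regex, and the mode label that the real method is documented to return as a third item is omitted.

```php
<?php
// Hypothetical reduce step: consumes text up to and including the first match.
// Returns [leading unmatched text, matched token], or null when nothing matches.
function sketch_reduce(string &$raw, string $regex): ?array
{
    if ($raw === '' || !preg_match($regex, $raw, $m, PREG_OFFSET_CAPTURE)) {
        return null;
    }
    [$matched, $offset] = $m[0];
    if ($matched === '') {
        return null; // Empty matches are refused so the loop always makes progress.
    }
    // Offsets from PREG_OFFSET_CAPTURE are byte offsets, as are substr/strlen.
    $unmatched = (string) substr($raw, 0, $offset);
    $raw = (string) substr($raw, $offset + strlen($matched));
    return [$unmatched, $matched];
}

// Assumption: one illustrative pattern instead of a compound, labelled regex.
$text = 'one **two** three';
$tokens = [];
while (($step = sketch_reduce($text, '/\*\*/')) !== null) {
    $tokens[] = $step;
}
// $tokens is [['one ', '**'], ['two', '**']]; $text holds the remainder ' three'.
```

A parse loop built this way would still need to hand the trailing remainder to the handler as unmatched text once no further token is found.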
Doku_Lexer_Escape($str) X-Ref |
Escapes regex characters other than (, ) and / param: string $str return: mixed |
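An escaping helper of the kind described for Doku_Lexer_Escape() could be approximated as below. The name sketch_lexer_escape and the exact character list are assumptions; the only property taken from the reference is that "(", ")" and "/" are left untouched, presumably because the lexer escapes those itself.

```php
<?php
// Hypothetical stand-in: backslash-escapes regex metacharacters but leaves
// "(", ")" and "/" alone.
function sketch_lexer_escape(string $str): string
{
    // Assumption: this character list is illustrative, not the library's exact set.
    return preg_replace('/[\\\\.+*?\[\]^${}=!<>|:-]/', '\\\\$0', $str);
}

$escaped = sketch_lexer_escape('a.b(c)/d*');
// $escaped is 'a\.b(c)/d\*'
```

PHP's built-in preg_quote() is not a drop-in substitute here, since it also escapes the parentheses.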