PHPXRef 0.7.1 : DokuWiki : Detail view of Clean.php

Methods to assess and clean UTF-8 strings

isASCII($str) X-Ref

Checks if a string contains 7-bit ASCII only

return: bool
param: string $str
author: Andreas Haerter <andreas.haerter@dev.mail-node.com>
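
The 7-bit ASCII check described above can be illustrated with a minimal sketch. This is not DokuWiki's own code: the function name isAsciiSketch is hypothetical, and the sketch assumes the check can be done by searching for any byte outside 0x00-0x7F.

```php
<?php
// Hypothetical helper (not part of DokuWiki): true when every byte is 7-bit ASCII.
function isAsciiSketch(string $str): bool
{
    // Without the /u modifier PCRE works on bytes, so this class matches 0x80-0xFF.
    return preg_match('/[^\x00-\x7F]/', $str) !== 1;
}

var_dump(isAsciiSketch('plain text'));   // ASCII only
var_dump(isAsciiSketch("caf\xC3\xA9"));  // ends with the UTF-8 bytes for "é"
```

An empty string contains no high bytes, so the sketch treats it as ASCII.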

isUtf8($str) X-Ref

Tries to detect whether a string is UTF-8 encoded

return: bool
param: string $str
link: http://php.net/manual/en/function.utf8-encode.php
author: <bmorel@ssi.fr>
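
One common way to test for well-formed UTF-8 in PHP is to let PCRE validate the subject string. The sketch below takes that route; it is an assumption for illustration, and DokuWiki's isUtf8 may work differently (for example by inspecting bytes directly). The name isUtf8Sketch is hypothetical.

```php
<?php
// Hypothetical helper (not DokuWiki's implementation): validate UTF-8 via PCRE.
function isUtf8Sketch(string $str): bool
{
    // With the /u modifier, preg_match() returns false for a malformed UTF-8
    // subject and 1 when the empty pattern matches a well-formed one.
    return preg_match('//u', $str) === 1;
}

var_dump(isUtf8Sketch("Gr\xC3\xBC\xC3\x9Fe")); // "Grüße" as UTF-8 bytes
var_dump(isUtf8Sketch("\xFF\xFE"));             // bytes that never occur in UTF-8
```

Note that a pure ASCII string is also valid UTF-8, so a check like this cannot tell the two apart.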

strip($str) X-Ref

Strips all high byte chars

Returns a pure ASCII7 string

return: string
param: string $str
author: Andreas Gohr <andi@splitbrain.org>
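
A minimal sketch of the stripping behaviour described above, assuming "high byte" means any byte from 0x80 to 0xFF. The function name stripHighBytesSketch is hypothetical and this is not DokuWiki's own code.

```php
<?php
// Hypothetical helper: drop every byte in the range 0x80-0xFF, keep 7-bit ASCII.
function stripHighBytesSketch(string $str): string
{
    // Byte-wise match (no /u modifier), so each byte of a multi-byte
    // UTF-8 character is removed individually.
    return preg_replace('/[\x80-\xFF]/', '', $str) ?? '';
}

echo stripHighBytesSketch("caf\xC3\xA9 au lait"), "\n";
```

Because whole multi-byte characters disappear rather than being transliterated, the result can lose information; deaccent or romanize is the better fit when readable ASCII is wanted.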

stripspecials($string, $repl = '', $additional = '') X-Ref

Removes special characters (nonalphanumeric) from a UTF-8 string

This function also strips the control characters 0x00 to 0x19, which are
not included in $UTF8_SPECIAL_CHARS.

return: string
param: string $string The UTF8 string to strip of special chars
param: string $repl Replace special with this string
param: string $additional Additional chars to strip (used in regexp char class)
author: Andreas Gohr <andi@splitbrain.org>
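
The way the three parameters combine can be sketched as follows. This is an illustration under assumptions, not DokuWiki's implementation: stripSpecialsSketch is a hypothetical name, and the short punctuation list is a stand-in for the much longer $UTF8_SPECIAL_CHARS table.

```php
<?php
// Hypothetical sketch. $specials is an illustrative stand-in for $UTF8_SPECIAL_CHARS.
function stripSpecialsSketch(string $string, string $repl = '', string $additional = ''): string
{
    $specials = preg_quote('.,;:!?"\'()[]{}<>@#$%^&*+=|~/\\', '/');
    // $additional is inserted unescaped because it is used inside a regex character class.
    // The control characters 0x00-0x19 are added to the class explicitly.
    $pattern = '/[' . $additional . '\x00-\x19' . $specials . ']/u';
    // preg_replace() returns null on error (e.g. invalid UTF-8); fall back to the input.
    return preg_replace($pattern, $repl, $string) ?? $string;
}

echo stripSpecialsSketch('Hello, World!'), "\n";          // punctuation removed
echo stripSpecialsSketch('Hello, World!', '_'), "\n";     // punctuation replaced
echo stripSpecialsSketch('Hello, World!', '', ' '), "\n"; // spaces stripped too
```

Since $additional goes into the character class verbatim, callers are responsible for escaping anything that has a special meaning there, such as a closing bracket.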

replaceBadBytes($str, $replace = '') X-Ref

Replaces bad bytes with an alternative character

An ASCII character is recommended as the replacement.

Bad bytes are located with a PCRE pattern taken from the W3C FAQ "Multilingual Forms",
modified to cover the full ASCII range, including control characters.

return: string
see: http://www.w3.org/International/questions/qa-forms-utf-8
param: string $str to search
param: string $replace string to replace bad bytes with (defaults to '', per the signature) - use ASCII
author: Harry Fuecks <hfuecks@gmail.com>
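
The pattern-based approach described above can be sketched as below. The byte ranges follow the W3C "Multilingual Forms" pattern with the ASCII branch widened to 0x00-0x7F; the surrounding function, its name replaceBadBytesSketch, and the use of preg_replace_callback are assumptions made for illustration and are not taken from DokuWiki.

```php
<?php
// Hypothetical sketch: group 1 matches one well-formed UTF-8 character,
// group 2 matches any single byte that did not fit (a "bad" byte).
function replaceBadBytesSketch(string $str, string $replace = ''): string
{
    $pattern = '/([\x00-\x7F]'                   // ASCII, including control chars
        . '|[\xC2-\xDF][\x80-\xBF]'              // non-overlong 2-byte
        . '|\xE0[\xA0-\xBF][\x80-\xBF]'          // 3-byte, excluding overlongs
        . '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'   // straight 3-byte
        . '|\xED[\x80-\x9F][\x80-\xBF]'          // 3-byte, excluding surrogates
        . '|\xF0[\x90-\xBF][\x80-\xBF]{2}'       // planes 1-3
        . '|[\xF1-\xF3][\x80-\xBF]{3}'           // planes 4-15
        . '|\xF4[\x80-\x8F][\x80-\xBF]{2}'       // plane 16
        . ')|(.)/s';                             // anything else: one bad byte
    $result = preg_replace_callback(
        $pattern,
        static function (array $m) use ($replace): string {
            // Group 2 is only non-empty when the byte was not part of a valid character.
            return (($m[2] ?? '') !== '') ? $replace : $m[0];
        },
        $str
    );
    return $result ?? $str;
}

echo replaceBadBytesSketch("a\xFFb", '?'), "\n";
```

Each bad byte is replaced individually, so an overlong two-byte sequence yields two replacement characters.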

deaccent($string, $case = 0) X-Ref

Replace accented UTF-8 characters by unaccented ASCII-7 equivalents

Use the optional parameter to deaccent only lowercase ($case = -1) or only uppercase
($case = 1) letters. The default ($case = 0) deaccents both cases.

return: string
param: string $string
param: int $case
author: Andreas Gohr <andi@splitbrain.org>
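
The effect of the $case parameter can be sketched with two small lookup tables. Everything here is illustrative: deaccentSketch is a hypothetical name, the tables hold only a handful of entries, and DokuWiki's real tables are far larger and may map some characters differently. The source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical sketch with tiny stand-in tables for lower- and uppercase accents.
function deaccentSketch(string $string, int $case = 0): string
{
    $lower = ['à' => 'a', 'é' => 'e', 'î' => 'i', 'õ' => 'o', 'ù' => 'u'];
    $upper = ['À' => 'A', 'É' => 'E', 'Î' => 'I', 'Õ' => 'O', 'Ù' => 'U'];

    if ($case <= 0) {
        // -1 or 0: replace lowercase accented characters
        $string = strtr($string, $lower);
    }
    if ($case >= 0) {
        // 1 or 0: replace uppercase accented characters
        $string = strtr($string, $upper);
    }
    return $string;
}

echo deaccentSketch('École à Été'), "\n";      // both cases
echo deaccentSketch('École à Été', -1), "\n";  // lowercase only
echo deaccentSketch('École à Été', 1), "\n";   // uppercase only
```

strtr with an array works on byte strings, so multi-byte UTF-8 keys are matched as whole sequences without needing the mbstring extension.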

romanize($string) X-Ref

Romanize a non-Latin string

return: string
param: string $string
author: Andreas Gohr <andi@splitbrain.org>

correctIdx($str, $i, $next = false) X-Ref

Adjust a byte index into a UTF-8 string to a UTF-8 character boundary

return: int byte index into $str, now pointing to a UTF-8 character boundary
param: string $str UTF-8 character string
param: int $i byte index into $str
param: bool $next direction to search for the boundary: false = back to the start of the current character, true = forward to the start of the next character
author: chris smith <chris@jalakai.co.uk>
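
The boundary adjustment can be sketched by skipping UTF-8 continuation bytes, which always have the bit pattern 10xxxxxx. The function name correctIdxSketch is hypothetical, and clamping out-of-range indexes to 0 and to the string length is an assumption of this sketch.

```php
<?php
// Hypothetical sketch: move a byte index off any continuation byte (10xxxxxx).
function correctIdxSketch(string $str, int $i, bool $next = false): int
{
    $limit = strlen($str);
    if ($i <= 0) {
        return 0;
    }
    if ($i >= $limit) {
        return $limit;
    }
    if ($next) {
        // Walk forward to the first byte of the next character.
        while ($i < $limit && (ord($str[$i]) & 0xC0) === 0x80) {
            $i++;
        }
    } else {
        // Walk backward to the first byte of the current character.
        while ($i > 0 && (ord($str[$i]) & 0xC0) === 0x80) {
            $i--;
        }
    }
    return $i;
}

$s = "a\xC3\xBCb"; // "aüb": byte 1 starts "ü", byte 2 is its continuation byte
var_dump(correctIdxSketch($s, 2));        // moves back to the start of "ü"
var_dump(correctIdxSketch($s, 2, true));  // moves forward to "b"
```

An index that already sits on a character boundary is returned unchanged in either direction.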

File Size:	203 lines (6 kb)
Included or required:	0 times
Referenced:	0 times
Includes or requires:	0 files

/inc/Utf8/ -> Clean.php (summary)

Defines 1 class