PHP Cross Reference of DokuWiki

/inc/ -> SafeFN.class.php (summary)

(no description)

File Size: 169 lines (6 kb)
Included or required:0 times
Referenced: 0 times
Includes or requires: 0 files

Defines 1 class

SafeFN:: (6 methods):
  encode()
  decode()
  validatePrintableUtf8()
  validateSafe()
  unicodeToSafe()
  safeToUnicode()


Class: SafeFN

Class to safely store UTF-8 in a Filename

Encodes a UTF-8 string using only the following characters: 0-9a-z_.-%
Characters 0-9a-z in the original string are preserved ("plain").
All other characters are "converted": each is represented by a
substring that starts with '%'.
The transition from converted substrings to plain characters is
marked with a '.'
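
As a rough illustration of a single "converted" substring, the sketch below builds one from a codepoint. It assumes the 0x20 offset and base36 digits given in the encode() description; the helper name `sketch_convert_codepoint` is hypothetical and not part of SafeFN.

```php
<?php
// Hypothetical helper, not part of SafeFN: build the converted
// substring for one codepoint, assuming '%' as the pre-indicator
// and an offset of 0x20 so that space maps to zero.
function sketch_convert_codepoint(int $codepoint): string
{
    // base_convert() takes a numeric string and returns lowercase digits
    return '%' . base_convert((string) ($codepoint - 0x20), 10, 36);
}

echo sketch_convert_codepoint(0x20), "\n";   // space  -> %0
echo sketch_convert_codepoint(0xE9), "\n";   // U+00E9 -> %5l
echo sketch_convert_codepoint(0x20AC), "\n"; // U+20AC -> %6fg
```

The substring carries no length marker of its own, which is why a following plain character has to be announced by the '.' transition marker.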

encode($filename)
Convert a UTF-8 string to a safe ASCII string

conversion process
- if codepoint is a plain or post_indicator character,
  - if previous character was "converted", append post_indicator to output, clear "converted" flag
  - append ascii byte for character to output
  (continue to next character)

- if codepoint is a pre_indicator character,
  - append ascii byte for character to output, set "converted" flag
  (continue to next character)

(all remaining characters)
- subtract 0x20 from the codepoint value to skip the non-printable ASCII characters (0x00 - 0x1f), so that space becomes zero
- convert reduced value to base36 (0-9a-z)
- append pre_indicator character followed by base36 string to output, set "converted" flag
(continue to next character)

return: string    an encoded representation of $filename using only 'safe' ASCII characters
author: Christopher Smith <chris@jalakai.co.uk>
param: string    $filename     a utf8 string, should only include printable characters - not 0x00-0x1f
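
The conversion process can be sketched roughly as below. This is a minimal illustration that follows only the steps listed above, not SafeFN's actual code: the function name `sketch_encode`, the plain character set and the '.' post-indicator are assumptions, and the input is taken as an array of codepoints so the block does not depend on any UTF-8 helper.

```php
<?php
/**
 * Hypothetical sketch of the conversion process (not SafeFN itself).
 *
 * @param int[] $codepoints unicode codepoints, none below 0x20
 */
function sketch_encode(array $codepoints): string
{
    // Assumed character classes, chosen for illustration only
    $plain = '0123456789abcdefghijklmnopqrstuvwxyz';
    $pre   = '%';
    $post  = '.';

    $out = '';
    $converted = false;
    foreach ($codepoints as $cp) {
        if ($cp < 0x80 && strpos($plain . $post, chr($cp)) !== false) {
            // plain or post-indicator character: mark the end of a converted run first
            if ($converted) {
                $out .= $post;
                $converted = false;
            }
            $out .= chr($cp);
        } elseif ($cp === ord($pre)) {
            // literal pre-indicator: emitted as-is, but counts as converted
            $out .= $pre;
            $converted = true;
        } else {
            // all remaining characters: offset by 0x20, then base36
            $out .= $pre . base_convert((string) ($cp - 0x20), 10, 36);
            $converted = true;
        }
    }
    return $out;
}

echo sketch_encode([0x61, 0x20, 0x62]), "\n"; // "a b" -> a%0.b
echo sketch_encode([0x20, 0x2E]), "\n";       // " ."  -> %0..
```

In the second call the first '.' is the transition marker and the second is the literal character from the input.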

decode($filename)
decoding process
- split the string into substrings at any occurrence of pre or post indicator characters
- check the first character of the substring
  - if it's not a pre_indicator character
    - if previous character was converted, skip over post_indicator character
    - copy codepoint values of remaining characters to the output array
    - clear any converted flag
    (continue to next substring)

  - else (it's a pre_indicator character)
    - if string length is 1, copy the pre_indicator character to the output array
      (continue to next substring)

    - else (string length > 1)
      - skip the pre_indicator character and convert remaining string from base36 to base10
      - add 0x20 to restore the original codepoint value (undoing the offset for non-printable ASCII characters)
      - append codepoint to output array
      (continue to next substring)

return: string    decoded utf8 representation of $filename
author: Christopher Smith <chris@jalakai.co.uk>
param: string    $filename     a 'safe' encoded ASCII string
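
The decoding process can be sketched along the same lines. The function name `sketch_decode` and the choice of '%' and '.' as indicators are assumptions for illustration; the sketch returns an array of codepoints and leaves the final conversion to a UTF-8 string out.

```php
<?php
/**
 * Hypothetical sketch of the decoding process (not SafeFN itself).
 *
 * @return int[] unicode codepoints
 */
function sketch_decode(string $safe): array
{
    $pre  = '%'; // assumed pre-indicator
    $post = '.'; // assumed post-indicator

    // Zero-width lookahead split: every substring after the first
    // starts with an indicator character.
    $pattern = '/(?=[' . preg_quote($pre . $post, '/') . '])/';
    $parts = preg_split($pattern, $safe, -1, PREG_SPLIT_NO_EMPTY);

    $codepoints = [];
    $converted = false;
    foreach ($parts as $part) {
        if ($part[0] !== $pre) {
            // plain run: skip a leading post-indicator that only ends a converted run
            $start = $converted ? 1 : 0;
            for ($i = $start, $len = strlen($part); $i < $len; $i++) {
                $codepoints[] = ord($part[$i]);
            }
            $converted = false;
        } elseif (strlen($part) === 1) {
            // lone pre-indicator: a literal '%' in the original string
            $codepoints[] = ord($pre);
            $converted = true;
        } else {
            // base36 digits after the pre-indicator, offset restored by 0x20
            $codepoints[] = 0x20 + (int) base_convert(substr($part, 1), 36, 10);
            $converted = true;
        }
    }
    return $codepoints;
}

print_r(sketch_decode('a%0.b')); // codepoints 97, 32, 98 ("a b")
```

Splitting with a lookahead keeps each indicator at the start of its substring, so checking the first character is enough to tell the three cases apart.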

validatePrintableUtf8($printable_utf8)
No description

validateSafe($safe)
No description

unicodeToSafe($unicode)
convert an array of unicode codepoints into 'safe_filename' format

return: string        the unicode represented in 'safe_filename' format
author: Christopher Smith <chris@jalakai.co.uk>
param: array  int    $unicode    an array of unicode codepoints

safeToUnicode($safe)
convert a 'safe_filename' string into an array of unicode codepoints

return: array   int    an array of unicode codepoints
author: Christopher Smith <chris@jalakai.co.uk>
param: string         $safe     a filename in 'safe_filename' format