Moodle PHP Documentation 4.3
Moodle 4.3.5 (Build: 20240610) (7dcfaa79f78)
core_text Class Reference

Defines the string API for manipulating strings.

Static Public Member Functions

static code2utf8 ($num)
 Returns the UTF-8 string corresponding to the Unicode value (from php.net, courtesy of romans@void.lv).
 
static convert ($text, $fromCS, $toCS='utf-8')
 Converts the text between different encodings.
 
static encode_mimeheader ($text, $charset='utf-8')
 Generate a correct base64 encoded header to be used in MIME mail messages.
 
static entities_to_utf8 ($str, $htmlent=true)
 Converts all the numeric entities &#nnnn; or &#xnnn; to UTF-8. Original from laurynas dot butkus at gmail at http://php.net/manual/en/function.html-entity-decode.php#75153, with some custom modifications to provide more functionality.
 
static get_encodings ()
 Returns encoding options for select boxes, utf-8 and platform encoding first.
 
static is_charset_supported (string $charset)
 Check whether the charset is supported by mbstring.
 
static parse_charset ($charset)
 Standardise charset name.
 
static remove_unicode_non_characters ($value)
 There are a number of Unicode non-characters including the byte-order mark (which may appear multiple times in a string) and also other ranges.
 
static reset_caches ()
 Reset internal textlib caches.
 
static specialtoascii ($text, $charset='utf-8')
 Try to convert upper unicode characters to plain ascii, the returned string may contain unconverted unicode characters.
 
static str_max_bytes ($string, $bytes)
 Truncates a string to no more than a certain number of bytes in a multi-byte safe manner.
 
static strlen ($text, $charset='utf-8')
 Multibyte safe strlen() function, uses mbstring or iconv.
 
static strpos ($haystack, $needle, $offset=0)
 Find the position of the first occurrence of a substring in a string.
 
static strrchr ($haystack, $needle, $part=false)
 Finds the last occurrence of a character in a string within another.
 
static strrev ($str)
 Reverses a UTF-8 multibyte string (used for RTL languages). This exists only because there is no mb_strrev or iconv_strrev.
 
static strrpos ($haystack, $needle)
 Finds the position of the last occurrence of a substring in a string. UTF-8 ONLY safe strrpos(), uses mbstring.
 
static strtolower ($text, $charset='utf-8')
 Multibyte safe strtolower() function, uses mbstring.
 
static strtotitle ($text)
 Makes first letter of each word capital - words must be separated by spaces.
 
static strtoupper ($text, $charset='utf-8')
 Multibyte safe strtoupper() function, uses mbstring.
 
static substr ($text, $start, $len=null, $charset='utf-8')
 Multibyte safe substr() function, uses mbstring or iconv.
 
static trim_utf8_bom ($str)
 Removes the BOM from a Unicode string.
 
static utf8_to_entities ($str, $dec=false, $nonnum=false)
 Converts all Unicode chars > 127 to numeric entities &#nnnn; or &#xnnn;.
 
static utf8ord ($utf8char)
 Returns the code of the given UTF-8 character.
 

Public Attributes

string const UTF8_BOM = "\xef\xbb\xbf"
 Byte order mark for UTF-8.
 

Static Protected Member Functions

static get_entities_table ()
 Returns HTML entity transliteration table.
 

Static Protected Attributes

static string[] $noncharacters
 Array of strings representing Unicode non-characters.
 

Detailed Description

Defines the string API for manipulating strings.

This class is used to manipulate strings in Moodle 1.6 and later. As UTF-8 text became mandatory, a pool of functions that are safe under this encoding became necessary. The method names are exactly the same as their PHP originals.

This class was previously based on Typo3, which has since been removed; it now uses native functions.

License
http://www.gnu.org/copyleft/gpl.html GNU GPL v3 or later

Member Function Documentation

◆ code2utf8()

static core_text::code2utf8 ( $num)
static

Returns the UTF-8 string corresponding to the Unicode value (from php.net, courtesy of romans@void.lv).

Parameters
int $num: one Unicode value
Return values
string: the UTF-8 char corresponding to the Unicode value
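
To make the mapping from a code point to bytes concrete, here is a minimal sketch of UTF-8 encoding in plain PHP. The function name sketch_code2utf8 is hypothetical and this is not Moodle's implementation; returning an empty string for out-of-range values is an assumption made for the example.

```php
<?php
// Hypothetical helper (not part of core_text): encodes one Unicode code point as UTF-8.
function sketch_code2utf8(int $num): string {
    if ($num < 0) {
        return ''; // assumption: invalid input yields an empty string
    }
    if ($num < 0x80) {
        // 1 byte: 0xxxxxxx
        return chr($num);
    }
    if ($num < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($num >> 6)) . chr(0x80 | ($num & 0x3F));
    }
    if ($num < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($num >> 12))
            . chr(0x80 | (($num >> 6) & 0x3F))
            . chr(0x80 | ($num & 0x3F));
    }
    if ($num < 0x110000) {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        return chr(0xF0 | ($num >> 18))
            . chr(0x80 | (($num >> 12) & 0x3F))
            . chr(0x80 | (($num >> 6) & 0x3F))
            . chr(0x80 | ($num & 0x3F));
    }
    return ''; // assumption: values above U+10FFFF yield an empty string
}

echo bin2hex(sketch_code2utf8(0x20AC)), "\n"; // U+20AC EURO SIGN
```

The number of bytes grows with the size of the code point, which is why byte-oriented functions such as strlen() miscount non-ASCII text.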

◆ convert()

static core_text::convert ( $text,
$fromCS,
$toCS = 'utf-8' )
static

Converts the text between different encodings.

It uses the iconv extension with the //TRANSLIT parameter. If both source and target are UTF-8, it only tries to fix invalid characters.

Parameters
string $text
string $fromCS: source encoding
string $toCS: result encoding
Return values
string|bool: converted string or false on error

◆ encode_mimeheader()

static core_text::encode_mimeheader ( $text,
$charset = 'utf-8' )
static

Generate a correct base64 encoded header to be used in MIME mail messages.

This function seems to be 100% compliant with RFC1342. Credits go to: paravoid (http://www.php.net/manual/en/function.mb-encode-mimeheader.php#60283).

Parameters
string $text: input string
string $charset: encoding of the text
Return values
string: base64 encoded header
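
For orientation, a base64 "encoded word" in a mail header has the shape =?charset?B?data?=. The sketch below builds a single encoded word in plain PHP. The function name sketch_encode_word is hypothetical, it is not Moodle's implementation, and it deliberately ignores the splitting of long headers across several encoded words that a compliant encoder has to perform.

```php
<?php
// Hypothetical helper (not part of core_text): wraps text in one base64 encoded word.
// Assumption: the text is short enough that no header folding is required.
function sketch_encode_word(string $text, string $charset = 'UTF-8'): string {
    return '=?' . $charset . '?B?' . base64_encode($text) . '?=';
}

// "\u{e1}" is U+00E1 (a with acute), written as an escape so the file encoding does not matter.
echo sketch_encode_word("Ol\u{e1}"), "\n";
```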

◆ entities_to_utf8()

static core_text::entities_to_utf8 ( $str,
$htmlent = true )
static

Converts all the numeric entities &#nnnn; or &#xnnn; to UTF-8. Original from laurynas dot butkus at gmail at http://php.net/manual/en/function.html-entity-decode.php#75153, with some custom modifications to provide more functionality.

Parameters
string $str: input string
boolean $htmlent: also convert HTML entities (defaults to true)
Return values
string: encoded UTF-8 string
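
The numeric part of this conversion can be sketched with a regular expression and mb_chr(). The function name sketch_numeric_entities_to_utf8 is hypothetical and this is not Moodle's implementation; it handles only &#nnnn; and &#xnnn; forms and leaves named entities such as &amp; untouched, roughly what $htmlent = false suggests. It requires the mbstring extension.

```php
<?php
// Hypothetical helper (not part of core_text): decodes numeric entities only.
function sketch_numeric_entities_to_utf8(string $str): string {
    return preg_replace_callback(
        '/&#(x[0-9a-fA-F]{1,6}|[0-9]{1,7});/',
        function (array $m): string {
            // Hexadecimal form starts with "x"; otherwise the number is decimal.
            $code = ($m[1][0] === 'x') ? (int) hexdec(substr($m[1], 1)) : (int) $m[1];
            if ($code > 0x10FFFF || ($code >= 0xD800 && $code <= 0xDFFF)) {
                return $m[0]; // not a valid code point: keep the entity as written
            }
            $char = mb_chr($code, 'UTF-8');
            return $char === false ? $m[0] : $char;
        },
        $str
    );
}

echo sketch_numeric_entities_to_utf8('caf&#xe9; &amp; cr&#232;me'), "\n";
```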

◆ get_encodings()

static core_text::get_encodings ( )
static

Returns encoding options for select boxes, utf-8 and platform encoding first.

Return values
array: encodings

◆ get_entities_table()

static core_text::get_entities_table ( )
static protected

Returns HTML entity transliteration table.

Return values
array: with (html entity => utf-8) elements

◆ is_charset_supported()

static core_text::is_charset_supported ( string $charset)
static

Check whether the charset is supported by mbstring.

Parameters
string $charset: Normalised charset
Return values
bool

◆ parse_charset()

static core_text::parse_charset ( $charset)
static

Standardise charset name.

Please note it does not mean the returned charset is actually supported.

Parameters
string $charset: raw charset name
Return values
string: normalised lowercase charset name

◆ remove_unicode_non_characters()

static core_text::remove_unicode_non_characters ( $value)
static

There are a number of Unicode non-characters including the byte-order mark (which may appear multiple times in a string) and also other ranges.

These can cause problems for some processing.

This function removes the characters using string replace, so that the rest of the string remains unchanged.

Parameters
string $value: Input string
Return values
string: Cleaned string value
Since
Moodle 3.5

◆ reset_caches()

static core_text::reset_caches ( )
static

Reset internal textlib caches.

Deprecated
since Moodle 4.0. See MDL-53544.
Todo
To be removed in Moodle 4.4 - MDL-71748

◆ specialtoascii()

static core_text::specialtoascii ( $text,
$charset = 'utf-8' )
static

Try to convert upper unicode characters to plain ascii, the returned string may contain unconverted unicode characters.

With the removal of Typo3, iconv conversion was found to be the best alternative to Typo3's function. However, the standard iconv call iconv($charset, 'ASCII//TRANSLIT//IGNORE', (string) $text); produced invalid strings for special characters from Russian and Japanese. To solve this, the transliterator was used, but it returned empty strings for certain inputs in our tests. It was decided to use a combination of the two to cover all cases. Refer to MDL-53544 for further information.

Parameters
string $text: input string
string $charset: encoding of the text
Return values
string: converted ascii string

◆ str_max_bytes()

static core_text::str_max_bytes ( $string,
$bytes )
static

Truncates a string to no more than a certain number of bytes in a multi-byte safe manner.

UTF-8 only!

Parameters
string $string: String to truncate
int $bytes: Maximum length in bytes of the result
Return values
string: Portion of string specified by $bytes
Since
Moodle 3.1
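
Byte-limited truncation that never splits a character can be sketched with PHP's mb_strcut(), which cuts on byte offsets but backs off to the previous character boundary. The function name sketch_str_max_bytes is hypothetical; whether core_text does exactly this internally is an assumption. It requires the mbstring extension.

```php
<?php
// Hypothetical helper (not part of core_text): truncates UTF-8 text to at most $bytes bytes.
function sketch_str_max_bytes(string $string, int $bytes): string {
    // mb_strcut() works on byte positions but does not cut inside a multi-byte character.
    return mb_strcut($string, 0, $bytes, 'UTF-8');
}

// "\u{e9}" is U+00E9, which takes two bytes in UTF-8.
$text = "h\u{e9}llo";
echo strlen(sketch_str_max_bytes($text, 2)), "\n"; // byte length of the truncated result
```

A limit of 2 bytes would fall in the middle of the two-byte character, so the result stops after the first letter rather than returning a broken sequence.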

◆ strlen()

static core_text::strlen ( $text,
$charset = 'utf-8' )
static

Multibyte safe strlen() function, uses mbstring or iconv.

Parameters
string $text: input string
string $charset: encoding of the text
Return values
int: number of characters
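
The difference between counting bytes and counting characters is easiest to see side by side. The snippet below uses only native PHP functions (strlen, mb_strlen, mb_substr) as an illustration of what "multibyte safe" means; it does not call core_text and requires the mbstring extension.

```php
<?php
// Illustration only: byte-oriented versus character-oriented string functions.
// "\u{e9}" is U+00E9, which takes two bytes in UTF-8.
$text = "caf\u{e9}s";

$bytes = strlen($text);                        // counts bytes
$chars = mb_strlen($text, 'UTF-8');            // counts characters
$tail  = mb_substr($text, -2, null, 'UTF-8');  // last two characters, not last two bytes

echo $bytes, ' ', $chars, ' ', bin2hex($tail), "\n";
```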

◆ strpos()

static core_text::strpos ( $haystack,
$needle,
$offset = 0 )
static

Find the position of the first occurrence of a substring in a string.

UTF-8 ONLY safe strpos(), uses mbstring

Parameters
string $haystack: the string to search in
string $needle: one or more characters to search for
int $offset: offset from beginning of string
Return values
int: the numeric position of the first occurrence of needle in haystack.

◆ strrchr()

static core_text::strrchr ( $haystack,
$needle,
$part = false )
static

Finds the last occurrence of a character in a string within another.

UTF-8 ONLY safe mb_strrchr().

Parameters
string $haystack: The string from which to get the last occurrence of needle.
string $needle: The string to find in haystack.
boolean $part: If true, returns the portion before needle; otherwise returns the portion after (including needle).
Return values
string|false: False when not found.
Since
Moodle 2.4.6, 2.5.2, 2.6

◆ strrev()

static core_text::strrev ( $str)
static

Reverses a UTF-8 multibyte string (used for RTL languages). This exists only because there is no mb_strrev or iconv_strrev.

Parameters
string $str: the multibyte string to reverse
Return values
string: the reversed multibyte string
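
Since PHP's strrev() reverses bytes and would corrupt multi-byte characters, a character-wise reversal has to split the string first. The sketch below does this with a Unicode-aware regular expression; the function name sketch_strrev is hypothetical and it is not presented as Moodle's implementation. Note that it reverses code points, so combining sequences are not kept together.

```php
<?php
// Hypothetical helper (not part of core_text): reverses a UTF-8 string by code point.
function sketch_strrev(string $str): string {
    // The /u modifier makes "." match one code point; /s lets it match newlines too.
    preg_match_all('/./us', $str, $matches);
    return implode('', array_reverse($matches[0]));
}

// "\u{f1}" is U+00F1 (n with tilde), two bytes in UTF-8.
echo bin2hex(sketch_strrev("a\u{f1}b")), "\n";
```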

◆ strrpos()

static core_text::strrpos ( $haystack,
$needle )
static

Finds the position of the last occurrence of a substring in a string. UTF-8 ONLY safe strrpos(), uses mbstring.

Parameters
string $haystack: the string to search in
string $needle: one or more characters to search for
Return values
int: the numeric position of the last occurrence of needle in haystack

◆ strtolower()

static core_text::strtolower ( $text,
$charset = 'utf-8' )
static

Multibyte safe strtolower() function, uses mbstring.

Parameters
string $text: input string
string $charset: encoding of the text (may not work for all encodings)
Return values
string: lower case text

◆ strtotitle()

static core_text::strtotitle ( $text)
static

Makes first letter of each word capital - words must be separated by spaces.

Use with care, this function does not work properly in many locales!!!

Parameters
string $text: input string
Return values
string

◆ strtoupper()

static core_text::strtoupper ( $text,
$charset = 'utf-8' )
static

Multibyte safe strtoupper() function, uses mbstring.

Parameters
string $text: input string
string $charset: encoding of the text (may not work for all encodings)
Return values
string: upper case text

◆ substr()

static core_text::substr ( $text,
$start,
$len = null,
$charset = 'utf-8' )
static

Multibyte safe substr() function, uses mbstring or iconv.

Parameters
string $text: string to truncate
int $start: negative value means from end
int $len: maximum length of characters beginning from start
string $charset: encoding of the text
Return values
string: portion of string specified by the $start and $len

◆ trim_utf8_bom()

static core_text::trim_utf8_bom ( $str)
static

Removes the BOM from a Unicode string.

Parameters
string $str: input string
Return values
string
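
A BOM check is a simple prefix test on the three bytes listed for UTF8_BOM above. The sketch below strips one leading BOM and leaves everything else untouched; the function name sketch_trim_utf8_bom is hypothetical and it is not presented as Moodle's implementation.

```php
<?php
// Hypothetical helper (not part of core_text): strips a single leading UTF-8 BOM.
function sketch_trim_utf8_bom(string $str): string {
    $bom = "\xef\xbb\xbf"; // the same bytes as the UTF8_BOM constant
    if (strncmp($str, $bom, strlen($bom)) === 0) {
        return substr($str, strlen($bom));
    }
    return $str;
}

// Typical case: the first line of an uploaded CSV file that starts with a BOM.
echo sketch_trim_utf8_bom("\xef\xbb\xbfid,name"), "\n";
```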

◆ utf8_to_entities()

static core_text::utf8_to_entities ( $str,
$dec = false,
$nonnum = false )
static

Converts all Unicode chars > 127 to numeric entities &#nnnn; or &#xnnn;.

Parameters
string $str: input string
boolean $dec: output only decimal numeric entities
boolean $nonnum: remove all non-numeric entities
Return values
string: converted string
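
The core of this conversion, replacing every character above 127 with a numeric entity, can be sketched with preg_replace_callback() and mb_ord(). The function name sketch_utf8_to_entities is hypothetical, the exact hexadecimal formatting used by Moodle is not reproduced here, and the $nonnum behaviour is left out. It requires the mbstring extension.

```php
<?php
// Hypothetical helper (not part of core_text): turns non-ASCII characters into numeric entities.
function sketch_utf8_to_entities(string $str, bool $dec = false): string {
    return preg_replace_callback(
        '/[^\x00-\x7F]/u', // any single code point outside the ASCII range
        function (array $m) use ($dec): string {
            $code = mb_ord($m[0], 'UTF-8');
            // Decimal form &#nnnn; when $dec is true, hexadecimal form &#xnnn; otherwise.
            return $dec ? '&#' . $code . ';' : '&#x' . dechex($code) . ';';
        },
        $str
    );
}

echo sketch_utf8_to_entities("caf\u{e9}", true), "\n";
```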

◆ utf8ord()

static core_text::utf8ord ( $utf8char)
static

Returns the code of the given UTF-8 character.

Parameters
string $utf8char: one UTF-8 character
Return values
int: the code of the given character

The documentation for this class was generated from the following file: