Moodle PHP Documentation 4.5
Moodle 4.5dev (Build: 20240606) (d3ae1391abe)
HTMLPurifier_Lexer Class Reference
Inherited by HTMLPurifier_Lexer_DOMLex, HTMLPurifier_Lexer_DirectLex, and HTMLPurifier_Lexer_PH5P.

Public Member Functions

 extractBody ($html)
 Takes a string of HTML (fragment or document) and returns the content of its body element, or the string unchanged when no body element is found.
 
 normalize ($html, $config, $context)
 Normalizes a piece of HTML by converting entities, fixing encoding, extracting the body where applicable, and performing related cleanup.
 
 parseAttr ($string, $config)
 
 parseData ($string, $is_attr, $config)
 Parses special entities into the proper characters.
 
 parseText ($string, $config)
 
 tokenizeHTML ($string, $config, $context)
 Lexes an HTML string into tokens.
 

Static Public Member Functions

static create ($config)
 Retrieves or sets the default Lexer as a Prototype Factory.
 

Public Attributes

 $tracksLineNumbers = false
 Whether or not this lexer implements line-number/column-number tracking.
 

Static Protected Member Functions

static CDATACallback ($matches)
 Callback function for escapeCDATA() that does the work.
 
static escapeCDATA ($string)
 Translates CDATA sections into regular sections (through escaping).
 
static escapeCommentedCDATA ($string)
 Handles the especially convoluted special case of CDATA sections wrapped in comments inside <script> elements.
 
static removeIEConditional ($string)
 Special Internet Explorer conditional comments should be removed.
 

Protected Attributes

 $_special_entity2str
 Most common entity to raw value conversion table for special entities.
 

Member Function Documentation

◆ CDATACallback()

static HTMLPurifier_Lexer::CDATACallback ( $matches)
static protected

Callback function for escapeCDATA() that does the work.

Warning
This exists only to serve as the callback for escapeCDATA(); calling it directly is not recommended.
Parameters
array $matches: PCRE matches array, with index 0 the entire match and index 1 the inside of the CDATA section.
Return values
string: Escaped internals of the CDATA section.

◆ create()

static HTMLPurifier_Lexer::create ( $config)
static

Retrieves or sets the default Lexer as a Prototype Factory.

By default HTMLPurifier_Lexer_DOMLex will be returned. There are a few exceptions involving special features that only DirectLex implements.

Note
The behavior of this method has changed: rather than accepting a prototype object, it now accepts a configuration object. To specify your own prototype, set Core.LexerImpl to it. This change de-singletonizes the lexer object.
Parameters
HTMLPurifier_Config $config
Return values
HTMLPurifier_Lexer
Exceptions
HTMLPurifier_Exception
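
As a rough usage sketch of the factory described above, the following asks create() for a lexer and reports its class. It assumes the HTML Purifier classes have already been loaded (for example by an autoloader); the helper name lexer_class_for() is hypothetical and not part of the library, and the guard simply returns null when the library is not available.

```php
<?php
// Hypothetical helper (not part of HTML Purifier): reports which lexer class
// create() hands back for a given Core.LexerImpl value. Passing null keeps
// the library's default choice.
function lexer_class_for(?string $impl = null): ?string
{
    // Assumption: the library is loaded by the caller or by an autoloader.
    if (!class_exists('HTMLPurifier_Lexer') || !class_exists('HTMLPurifier_Config')) {
        return null; // library not available in this environment
    }

    $config = HTMLPurifier_Config::createDefault();
    if ($impl !== null) {
        // Core.LexerImpl is the directive named in the note above.
        $config->set('Core.LexerImpl', $impl);
    }

    // create() takes the configuration object, not a prototype lexer.
    return get_class(HTMLPurifier_Lexer::create($config));
}
```

Calling lexer_class_for('DirectLex') should name the DirectLex subclass when the library is present; calling it with no argument shows whichever lexer the library picks by default.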

◆ escapeCDATA()

static HTMLPurifier_Lexer::escapeCDATA ( $string)
static protected

Translates CDATA sections into regular sections (through escaping).

Parameters
string $string: HTML string to process.
Return values
string: HTML with CDATA sections escaped.
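
The interplay between escapeCDATA() and its callback can be illustrated with a standalone sketch. The function names and the exact pattern below are assumptions made for illustration, not the library's code; the sketch only shows the general technique of replacing each CDATA section with an escaped copy of its contents.

```php
<?php
// Illustrative stand-in for the callback: escapes the inside of one CDATA section.
function sketch_cdata_callback(array $matches): string
{
    // $matches[0] is the whole section, $matches[1] is its inner text.
    return htmlspecialchars($matches[1], ENT_COMPAT, 'UTF-8');
}

// Illustrative stand-in for the escaping pass over a whole HTML string.
function sketch_escape_cdata(string $html): string
{
    // The "s" modifier lets "." span newlines; ".+?" stops at the first "]]>".
    $result = preg_replace_callback(
        '/<!\[CDATA\[(.+?)\]\]>/s',
        'sketch_cdata_callback',
        $html
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}
```

With this sketch, the CDATA markers disappear and the characters they protected are turned into entities, so the result can be treated as ordinary markup text.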

◆ escapeCommentedCDATA()

static HTMLPurifier_Lexer::escapeCommentedCDATA ( $string)
static protected

Handles the especially convoluted special case of CDATA sections wrapped in comments inside <script> elements.

Parameters
string $string: HTML string to process.
Return values
string: HTML with CDATA sections escaped.

◆ extractBody()

HTMLPurifier_Lexer::extractBody ( $html)

Takes a string of HTML (fragment or document) and returns the content of its body element, or the string unchanged when no body element is found.

Todo
Consider making protected
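
A minimal sketch of the idea behind extractBody(), assuming a simple regular-expression approach: return what sits between the body tags of a full document, and pass fragments through untouched. The function name and pattern are hypothetical, and the sketch ignores edge cases such as body tags that appear inside comments.

```php
<?php
// Hypothetical illustration: unwrap the body of a document, leave fragments alone.
function sketch_extract_body(string $html): string
{
    // "i" ignores case, "s" lets "." match newlines; the greedy (.*) runs
    // up to the last closing body tag in the string.
    if (preg_match('~<body[^>]*>(.*)</body>~is', $html, $matches)) {
        return $matches[1];
    }

    return $html; // fragment: nothing to unwrap
}
```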

◆ normalize()

HTMLPurifier_Lexer::normalize ( $html, $config, $context )

Normalizes a piece of HTML by converting entities, fixing encoding, extracting the body where applicable, and performing related cleanup.

Parameters
string $html: HTML.
HTMLPurifier_Config $config
HTMLPurifier_Context $context
Return values
string
Todo
Consider making protected

◆ parseData()

HTMLPurifier_Lexer::parseData ( $string, $is_attr, $config )

Parses special entities into the proper characters.

This method translates escaped versions of the special characters into the correct ones.

Parameters
string $string: String character data to be parsed.
Return values
string: Parsed character data.
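
To make the translation concrete, here is a standalone sketch that applies the conversion table listed under $_special_entity2str on this page. It is an assumption-laden illustration, not the library's implementation: the constant and function names are hypothetical, and the $is_attr and $config arguments are left out entirely.

```php
<?php
// Table copied from the $_special_entity2str initial value shown on this page.
const SKETCH_SPECIAL_ENTITIES = [
    '&quot;' => '"',
    '&amp;'  => '&',
    '&lt;'   => '<',
    '&gt;'   => '>',
    '&#39;'  => "'",
    '&#039;' => "'",
    '&#x27;' => "'",
];

// Hypothetical illustration of translating special entities to raw characters.
function sketch_parse_data(string $data): string
{
    // Fast path: without an ampersand there is nothing to translate.
    if (strpos($data, '&') === false) {
        return $data;
    }

    // strtr() never re-examines text it has already replaced,
    // so "&amp;lt;" becomes "&lt;" rather than "<".
    return strtr($data, SKETCH_SPECIAL_ENTITIES);
}
```

Using strtr() with an array rather than chained str_replace() calls is a deliberate choice in this sketch: it avoids decoding the same text twice.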

◆ removeIEConditional()

static HTMLPurifier_Lexer::removeIEConditional ( $string)
static protected

Special Internet Explorer conditional comments should be removed.

Parameters
string $string: HTML string to process.
Return values
string: HTML with conditional comments removed.
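
As an illustration of what removing conditional comments means in practice, the sketch below strips downlevel-hidden blocks of the form <!--[if ...]> ... <![endif]--> with a single regular expression. The function name and the pattern are assumptions for this example, not a statement of how the library does it.

```php
<?php
// Hypothetical illustration: drop Internet Explorer conditional comment blocks.
function sketch_remove_ie_conditional(string $html): string
{
    // Lazy ".*?" stops at the first "<![endif]-->"; "s" spans newlines, "i" ignores case.
    $result = preg_replace('#<!--\[if [^>]+\]>.*?<!\[endif\]-->#si', '', $html);

    // preg_replace() returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}
```

Ordinary comments do not match the pattern and are left in place by this sketch.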

◆ tokenizeHTML()

HTMLPurifier_Lexer::tokenizeHTML ( $string, $config, $context )

Lexes an HTML string into tokens.

Parameters
$string: String HTML.
HTMLPurifier_Config $config
HTMLPurifier_Context $context
Return values
HTMLPurifier_Token[]: Array representation of the HTML.

Reimplemented in HTMLPurifier_Lexer_DirectLex, HTMLPurifier_Lexer_DOMLex, and HTMLPurifier_Lexer_PH5P.

Member Data Documentation

◆ $_special_entity2str

HTMLPurifier_Lexer::$_special_entity2str
protected
Initial value:
= array(
    '&quot;' => '"',
    '&amp;' => '&',
    '&lt;' => '<',
    '&gt;' => '>',
    '&#39;' => "'",
    '&#039;' => "'",
    '&#x27;' => "'"
)

Most common entity to raw value conversion table for special entities.

Type: array

◆ $tracksLineNumbers

HTMLPurifier_Lexer::$tracksLineNumbers = false

Whether or not this lexer implements line-number/column-number tracking.

If it does, set to true.


The documentation for this class was generated from the following file: