HTML5 Definitions for HTML Purifier
This library provides HTML5 element definitions for HTML Purifier,
compliant with the WHATWG spec.
It is the most complete HTML5-compliant solution among those based on HTML Purifier. Besides providing the most extensive set of element definitions, it provides tidy/sanitization rules that transform the input into valid HTML5 output.
Installation
Install with Composer by running the following command:
composer require xemlock/htmlpurifier-html5
Usage
The most basic usage is similar to that of the original HTML Purifier. Create an HTML5-compatible config
using the HTMLPurifier_HTML5Config::createDefault() factory method, and then pass it to an HTMLPurifier
instance:
$config = HTMLPurifier_HTML5Config::createDefault();
$purifier = new HTMLPurifier($config);
$clean_html5 = $purifier->purify($dirty_html5);
To modify the config you can either pass a configuration array to
HTMLPurifier_HTML5Config::create() when instantiating the config, or call the set
method on an already existing config instance.
For example, to allow IFRAMEs with YouTube videos you can do the following:
$config = HTMLPurifier_HTML5Config::create(array(
'HTML.SafeIframe' => true,
'URI.SafeIframeRegexp' => '%^//www\.youtube\.com/embed/%',
));
or equivalently:
$config = HTMLPurifier_HTML5Config::createDefault();
$config->set('HTML.SafeIframe', true);
$config->set('URI.SafeIframeRegexp', '%^//www\.youtube\.com/embed/%');
Configuration
Apart from HTML Purifier's built-in configuration directives, the following new directives are also supported:
- Attr.AllowedInputTypes

Version added: 0.1.12\
Type: Lookup (or null)\
Default: null

List of allowed input types, chosen from the types defined in the spec. By default the setting is null,
meaning there is no restriction on allowed types. An empty array means that no explicit type
attributes are allowed, effectively making all inputs text inputs.
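As a minimal sketch of restricting input types: the snippet assumes the package was installed with Composer (the autoloader path is an assumption) and that form elements are permitted through the HTML.Forms directive described below. The sample markup and the chosen types are made up for illustration.

```php
<?php
// Assumed location of Composer's autoloader; adjust to your project layout.
require __DIR__ . '/vendor/autoload.php';

$config = HTMLPurifier_HTML5Config::create(array(
    // Inputs are form elements, so forms have to be permitted first.
    'HTML.Forms' => true,
    // Only these explicit type values are permitted on <input>.
    'Attr.AllowedInputTypes' => array('text', 'email', 'number'),
));
$purifier = new HTMLPurifier($config);

// Illustrative input: the second field uses a type outside the allowed list.
$dirty_html5 = '<form action="/subscribe"><p>'
    . '<input type="email" name="email">'
    . '<input type="file" name="upload">'
    . '</p></form>';

echo $purifier->purify($dirty_html5);
```

Comparing the output with the input shows how the purifier treats a type that is not on the list.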
- HTML.Forms

Version added: 0.1.12\
Type: Boolean\
Default: false

Whether or not to permit form elements in the user input, regardless of
the %HTML.Trusted value.
Be very careful when using this functionality, as enabling forms in untrusted
documents may allow for phishing attacks.
- HTML.IframeAllowFullscreen

Version added: 0.1.11\
Type: Boolean\
Default: false

Whether or not to permit the allowfullscreen attribute on iframe tags. It requires either
%HTML.SafeIframe or %HTML.Trusted to be true.
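The following sketch combines this directive with the YouTube iframe settings from the Usage section. The autoloader path and the VIDEO_ID placeholder are assumptions made for illustration.

```php
<?php
// Assumed location of Composer's autoloader; adjust to your project layout.
require __DIR__ . '/vendor/autoload.php';

$config = HTMLPurifier_HTML5Config::create(array(
    // allowfullscreen is only honoured when iframes themselves are permitted.
    'HTML.SafeIframe' => true,
    'URI.SafeIframeRegexp' => '%^//www\.youtube\.com/embed/%',
    'HTML.IframeAllowFullscreen' => true,
));
$purifier = new HTMLPurifier($config);

// VIDEO_ID is a placeholder, not a real video identifier.
$dirty_html5 = '<iframe src="//www.youtube.com/embed/VIDEO_ID" allowfullscreen></iframe>';

echo $purifier->purify($dirty_html5);
```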
- HTML.Link

Version added: 0.1.12\
Type: Boolean\
Default: false

Permit link tags in the user input, regardless of
the %HTML.Trusted value.
This effectively allows link tags without allowing other untrusted elements.

If enabled, URIs in link tags will not be matched against the whitelist specified
in %URI.SafeLinkRegexp (unless %HTML.SafeLink is also enabled).
- HTML.SafeLink

Version added: 0.1.12\
Type: Boolean\
Default: false

Whether to permit link tags in untrusted documents. This directive must
be accompanied by a whitelist of permitted URIs via %URI.SafeLinkRegexp,
otherwise no link tags will be allowed.
- HTML.XHTML

Version added: 0.1.12\
Type: Boolean\
Default: false

Although deprecated in the HTML 4.01 / XHTML 1.0 context, in HTML5 this directive
enables support for namespaced attributes and XML self-closing tags.

When enabled, it causes the xml:lang attribute to take precedence over lang
when both attributes are present on the same element.
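A short sketch of enabling this directive follows. The autoloader path and the sample paragraph, which carries both lang and xml:lang, are assumptions made for illustration.

```php
<?php
// Assumed location of Composer's autoloader; adjust to your project layout.
require __DIR__ . '/vendor/autoload.php';

$config = HTMLPurifier_HTML5Config::createDefault();
// Turn on namespaced attributes and XML self-closing tags.
$config->set('HTML.XHTML', true);
$purifier = new HTMLPurifier($config);

// Illustrative input with conflicting lang and xml:lang values.
$dirty_html5 = '<p lang="en" xml:lang="pl">First line<br>second line</p>';

echo $purifier->purify($dirty_html5);
```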
- URI.SafeLinkRegexp

Version added: 0.1.12\
Type: String\
Default: null

A PCRE regular expression that will be matched against a <link> URI. This directive
only has an effect if %HTML.SafeLink is enabled. Here are some example values:

  - %^https?://localhost/% - Allow localhost URIs

Use Attr.AllowedRel to control permitted link relationship types.
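A sketch of how the link-related directives could fit together, reusing the localhost pattern from above. The autoloader path, the stylesheet URLs, and the choice of 'stylesheet' as the only permitted relationship are assumptions made for illustration.

```php
<?php
// Assumed location of Composer's autoloader; adjust to your project layout.
require __DIR__ . '/vendor/autoload.php';

$config = HTMLPurifier_HTML5Config::create(array(
    // Permit <link> tags, but only for whitelisted URIs.
    'HTML.SafeLink' => true,
    'URI.SafeLinkRegexp' => '%^https?://localhost/%',
    // Restrict the rel values that may appear on links.
    'Attr.AllowedRel' => array('stylesheet'),
));
$purifier = new HTMLPurifier($config);

// Illustrative input: the first URI matches the whitelist, the second does not.
$dirty_html5 = '<link rel="stylesheet" href="http://localhost/style.css">'
    . '<link rel="stylesheet" href="http://example.com/style.css">';

echo $purifier->purify($dirty_html5);
```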
Supported HTML5 elements
Aside from HTML elements supported originally by HTML Purifier, this library
adds support for the following HTML5 elements:
<article>, <aside>, <audio>, <bdi>, <data>, <details>, <dialog>, <figcaption>, <figure>, <footer>,
<header>, <hgroup>, <main>, <mark>, <nav>, <picture>, <progress>, <section>, <source>, <summary>,
<time>, <track>, <video>, <wbr>
as well as HTML5 attributes added to existing HTML elements, such as:
<a>, <del>, <fieldset>, <ins>, <script>
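To try some of these elements with the default configuration, a sketch along the following lines may help. The autoloader path and the sample markup are assumptions made for illustration.

```php
<?php
// Assumed location of Composer's autoloader; adjust to your project layout.
require __DIR__ . '/vendor/autoload.php';

$config = HTMLPurifier_HTML5Config::createDefault();
$purifier = new HTMLPurifier($config);

// Illustrative input built from several of the HTML5 elements listed above.
$dirty_html5 = '<article>'
    . '<header><h1>Title</h1></header>'
    . '<figure><img src="photo.jpg" alt="Photo"><figcaption>Caption</figcaption></figure>'
    . '<p>Published <time datetime="2024-01-01">on New Year\'s Day</time>, <mark>highlighted</mark>.</p>'
    . '</article>';

echo $purifier->purify($dirty_html5);
```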
License
The MIT License (MIT). See the LICENSE file.