Chapter 7: Comments
Limiting the HTML in Comments
Problem
You allow HTML in comments, but you want to limit which tags are displayed.
Solution
Use the "Limit HTML Tags" settings of the weblog.
Discussion
Data submitted by visitors to your site should not be trusted blindly. For example, if you allow HTML in your comments, visitors could submit malicious HTML, or scripts in JavaScript or PHP, to run code on your site. This code could do anything, including reading cookies or reading private files on your server.
To protect your site, Movable Type can limit any markup submitted by visitors via comments or TrackBack pings. (Prior to MT 3.2 this was referred to as "sanitizing.") This process removes any code that could compromise the security of your site. The sanitization process works by allowing only specific HTML tags; all other tags and scripting instructions (PHP, JSP, JavaScript) are removed.
The basic specification format is a string of markup tag names separated by commas. The sanitization process assumes that no HTML attributes are permitted and removes them all. This behavior can be modified on a per-tag basis by following the tag name with the permitted attribute names, separated by spaces and without commas. More detail on defining your own specs follows below.
By default, MT permits the following HTML tags and attributes: a href, b, i, br/, p, strong, em, ul, ol, li, blockquote, pre. This collection can be modified with the GlobalSanitizeSpec directive in the mt-config.cgi file. The permitted tags in the system configuration can also be overridden on a per-weblog basis in the weblog settings. To specify the permitted markup for a weblog, go to the General Settings screen and use the controls labeled "Limit HTML Tags".
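To make the spec format concrete, here is a minimal Python sketch of how such a string could be broken down into tags and their permitted attributes. It is not Movable Type's own code; the function name parse_sanitize_spec is hypothetical, and treating the trailing slash in br/ as a marker to strip from the tag name is an assumption.

```python
DEFAULT_SPEC = "a href,b,i,br/,p,strong,em,ul,ol,li,blockquote,pre"

def parse_sanitize_spec(spec: str) -> dict:
    """Map each permitted tag name to the set of attributes allowed on it."""
    allowed = {}
    for item in spec.split(","):
        parts = item.split()
        if not parts:
            continue  # ignore empty entries, e.g. from a trailing comma
        # Assumption: a trailing slash ("br/") marks an empty element,
        # so it is stripped to get the bare tag name.
        tag = parts[0].rstrip("/").lower()
        # Everything after the tag name is a permitted attribute.
        allowed[tag] = {attr.lower() for attr in parts[1:]}
    return allowed

spec = parse_sanitize_spec(DEFAULT_SPEC)
print(sorted(spec["a"]))  # ['href']
print(sorted(spec["b"]))  # []
```

Tags absent from the resulting mapping would be removed, and a tag with an empty attribute set would be kept with all of its attributes stripped.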
The sanitization process will also add closing tags to any tags left open in the sanitized text. For instance, if a visitor to your site places a <b> tag in a comment and forgets to close it, the sanitization process will add a </b> tag. If that tag were left open, all content that follows the comment on that page would be in bold. It's quite easy for this to happen by accident with even the most well-meaning commenter, so monitoring markup is not entirely about security.
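The tag-closing behavior can be illustrated with a short Python sketch. It is a simplified stand-in rather than Movable Type's implementation: the name close_open_tags, the regular expression, and the VOID_TAGS set are all assumptions made for the example.

```python
import re

# Matches an opening, closing, or self-closing tag and captures its parts.
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(/?)>")
VOID_TAGS = {"br"}  # assumption: tags that never take a closing tag

def close_open_tags(html: str) -> str:
    """Append closing tags for any elements left open in html."""
    stack = []
    for closing, name, self_closed in TAG_RE.findall(html):
        name = name.lower()
        if self_closed or name in VOID_TAGS:
            continue  # nothing to balance for empty elements
        if closing:
            # Pop back to the matching opener; stray closers are ignored.
            if name in stack:
                while stack.pop() != name:
                    pass
        else:
            stack.append(name)
    # Close whatever is still open, innermost element first.
    return html + "".join(f"</{name}>" for name in reversed(stack))

print(close_open_tags("This is <b>important"))  # This is <b>important</b>
```

Because the missing tags are appended at the end of the comment text, the unclosed <b> can no longer affect anything that follows the comment on the page.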
To sanitize the output of any MT template tag, use the sanitize global filter.
<$MTFoo sanitize="1"$>
This global filter runs the sanitize filter on the text the tag outputs. If the value of the attribute is 1 (true), the default sanitize spec for the weblog is used. If the value is 0 (false), sanitizing is turned off for this tag. Designers can define a sanitization spec inline by giving this global filter a value other than 1 or 0. For example:
<$MTEntryBody sanitize="a href"$>
Here the sanitize filter would strip all markup from the entry body except the a tag with its href attribute.
The default value for sanitize is 0 (false) except for the following tags, where it is treated as true (1) unless explicitly turned off:
MTCommentAuthor
MTCommentEmail
MTCommentURL
MTCommentBody
MTPingTitle
MTPingURL
MTPingBlogName
MTPingExcerpt
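The rules for the sanitize attribute and its per-tag defaults can be summarized in a small Python sketch. This only models the decision described above; the function name effective_spec and its parameters are hypothetical and do not correspond to anything in Movable Type's code.

```python
# Tags that the text above lists as sanitized unless explicitly turned off.
SANITIZED_BY_DEFAULT = {
    "MTCommentAuthor", "MTCommentEmail", "MTCommentURL", "MTCommentBody",
    "MTPingTitle", "MTPingURL", "MTPingBlogName", "MTPingExcerpt",
}

def effective_spec(tag_name, sanitize_attr, weblog_spec):
    """Return the spec string to sanitize with, or None if sanitizing is off.

    sanitize_attr is the raw attribute value, or None when the tag
    carries no sanitize attribute at all.
    """
    if sanitize_attr is None:
        # Fall back to the per-tag default: on for comment and ping tags.
        sanitize_attr = "1" if tag_name in SANITIZED_BY_DEFAULT else "0"
    if sanitize_attr == "0":
        return None           # sanitizing turned off for this tag
    if sanitize_attr == "1":
        return weblog_spec    # use the weblog's default spec
    return sanitize_attr      # any other value is an inline spec

print(effective_spec("MTEntryBody", "a href", "b,i"))  # a href
print(effective_spec("MTCommentBody", None, "b,i"))    # b,i
print(effective_spec("MTEntryBody", None, "b,i"))      # None
```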
For more details on the sanitize specification format see the article on the GlobalSanitizeSpec configuration directive.



