Six Apart Guide to Comment Spam

This document describes how malicious or unwanted comments ('comment spam') affect weblogs, the techniques spammers use to abuse weblogs, and the tactics that can be used to prevent and defend against these attacks. Also included are a review of the strengths and weaknesses of each tactic, instructions for implementing the tactics on your weblog, and our recommendations for the best protection.

Throughout this document, we refer to comments and comment spam, but these references apply to all reader-submitted content, including both comments and TrackBacks, except where specified.

If you're under attack from comment spam right now, you can skip ahead to our list of recommendations.

Overview of Comment Spam Terms and Techniques

In basic terms, the problem of comment spam boils down to a conflict of convenience. Weblog publishers provide comments so that their readers can easily share their thoughts and ideas. Making it easy for readers to add comments makes it more likely that readers will do so. The problem is that by making it easy for readers to add legitimate comments, you also make it easy for spammers to abuse your comment system.

Consider an analogy to shoplifting. Store owners would certainly like to stop 100 percent of all shoplifting attempts. However, the only means to do so are both expensive and intrusive. Each customer could be followed and observed throughout the store by a store employee. This might prevent nearly all cases of shoplifting, but at the expense of requiring store owners to hire a large number of extra employees. Worse, many honest customers might reasonably consider such a tactic to be both inconvenient and a violation of their privacy. Such honest customers would be likely to take their business elsewhere.

In the same way that store owners want to prevent shoplifting without unduly inconveniencing or offending their honest customers, weblog publishers want to prevent comment spam without unduly inconveniencing readers who wish to submit legitimate comments. The ideal solution would block all comment spam without any inconvenience to legitimate posters; unfortunately, in practice, that's no more possible than it is for a store to prevent every instance of shoplifting with no inconvenience to honest customers.

A simple, rudimentary form of comment spam involves spammers sitting in front of web browsers, pasting spam into your comment form, and submitting it manually -- in other words, submitting spam the same way legitimate comments are submitted by your readers. While this method maximizes a spammer's ability to blend in with regular commenters, it's a relatively slow, cumbersome process; and hence this method is rarely used.

The real problem is automated comment spamming, driven by scripts or software written specifically for the purpose of producing comment spam. Such software can submit thousands of spam comments in a very short period of time, to many pages on many weblogs.

These automated scripts don't typically submit comments by going through the comment entry forms on your weblog. They call your weblog's comment submission script directly. With Movable Type, this script is called "mt-comments.cgi", by default. This form of attack is not specific to Movable Type's commenting implementation; calling a remote server script directly is a fundamental feature of the web and the HTTP network protocol.

The Anatomy of a Submission

Web developers: feel free to skip this section, which provides a basic overview of the way form submission works on the web.

In the same way that the contents of a web page can be downloaded by any networked software, not just web browsers, a server-side script can be invoked from any software, not just web browsers.

In a nutshell, the way that web forms work is that each form has an action attribute. The action is, typically, the URL for the server-side script that processes the form submission. For example, in a typical comment form on a Movable Type weblog, the action attribute of the form is the URL for your mt-comments.cgi script.

Each form also specifies various variables. For example, in a typical comment form, these variables contain data such as the text of the comment, the name of person submitting the comment, and the ID of the weblog entry associated with the comment. When you submit the form to send your comment, your web browser connects to the URL of the form's action attribute, and provides the script with the form's variable data.

The problem is that any software can mimic what your web browser is doing here -- connecting to the URL of your comment script, and providing it with data to generate a new comment on your weblog. Without any protection in place, such software can connect to your comment script repeatedly, sending comments as fast as your comment script is capable of accepting them.
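To make this concrete, the following Python sketch builds (without sending) the kind of request a browser produces when a comment form is submitted. The URL and the field names "entry_id", "author", and "text" are assumptions chosen for illustration; they are not a statement of what mt-comments.cgi actually expects.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form action URL; a real form's action attribute would supply this.
ACTION_URL = "http://example.com/mt/mt-comments.cgi"

def build_comment_request(entry_id, author, text):
    """Build (but do not send) the POST request a browser would make."""
    # Hypothetical field names standing in for the form's variables.
    fields = {"entry_id": entry_id, "author": author, "text": text}
    body = urlencode(fields).encode("ascii")
    return Request(ACTION_URL, data=body, method="POST")

req = build_comment_request(42, "Alice", "Nice post!")
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Nothing in the request records whether a person typed the values into a form or a program assembled them, which is why the server cannot tell the two apart from the request alone.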

Ideally, what we would like is to allow real people sitting in front of real web browsers to submit comments freely, and to disallow any comments submitted from automated comment-submission software. Unfortunately, this isn't possible.

User Agents

First, it is not possible to determine whether the software connecting to your comment script is a real web browser. Incoming requests do contain something called a user-agent string, which is a way for web client software to identify itself. Most web browsers send a user-agent string that informs the server which browser (and which version of it) you're using. But the user-agent string is just a bit of text, and is thus easily forged. If comment-spam submission software identified itself honestly, it would be easy to block. So, of course, such software lies, by specifying user-agent strings that are identical to those of real web browsers.

This means that if a spammer is using software that sends the same user-agent string as Firefox 1.0, your comment script has no way to tell which comments have been submitted from real people using Firefox, and which are coming from comment-spam submission software that claims to be Firefox. From a programmer's standpoint, user-agent strings are as easy to forge as they are to set honestly.
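As a rough illustration of why user-agent checks fail, here is a minimal Python sketch of a naive filter. The blocklist entries and the tool name "CommentBlaster" are invented for the example, and the browser string is only an example of a Firefox-style user-agent.

```python
# Hypothetical substrings that an honest spam tool might announce.
BLOCKED_AGENT_HINTS = ("spambot", "commentblaster")

def agent_looks_suspicious(user_agent):
    """Return True if the self-reported user-agent contains a blocked hint."""
    ua = user_agent.lower()
    return any(hint in ua for hint in BLOCKED_AGENT_HINTS)

honest_bot = "CommentBlaster/2.0"
# The same tool, now claiming to be a browser (example Firefox-style string).
forged_bot = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) "
              "Gecko/20041107 Firefox/1.0")

print(agent_looks_suspicious(honest_bot))
print(agent_looks_suspicious(forged_bot))
```

The filter only catches software that chooses to identify itself; a forged string is, from the server's side, the same text a real browser would send.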

Referrers

Second, it isn't possible to only accept comments which were actually submitted through the comment forms on your weblog. It's not the form that connects to your comment script, it's each person's web browser that does so. A comment script could try to enforce this, but it would not prevent spammers from submitting bogus comments that appear to come from people who used the forms on your site, and it would prevent some legitimate commenters from being able to submit comments.

Here's why. Similar to the user-agent string, each HTTP request also contains an information field called a referrer. The purpose of the referrer field is to indicate the URL of the page where the request originated. For example, if you click on a link from site A to site B, the referrer field will contain the URL of the page on site A from which you clicked the link.

When someone submits a comment to your site, the referrer field should be set to the URL of the page on your site where they filled in the comment field. So, your comment script could conceivably check the referrer of each submitted comment, and if the referrer isn't the URL of a page on your weblog, reject the comment.

Unfortunately, in the same way that user-agent strings are easy to forge, so too are referrers. Some comment-spam software already specifies referrers that appear to be valid, and if more weblogs began checking referrers, it'd be easy for all spamming software to adapt.

Worse, some people configure their web browsers not to send referrer information, for various reasons. If your comment script blocked comments based on the referrer, such people would be unable to submit comments to your weblog.
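Both weaknesses can be seen in a small Python sketch of a referrer check. The host name is hypothetical, and the function is an illustration of the idea rather than anything Movable Type ships.

```python
from urllib.parse import urlparse

SITE_HOST = "example.com"  # hypothetical weblog host

def referrer_allows(referrer):
    """Naive check: accept only if the referrer points at our own host."""
    if not referrer:
        # Browsers configured to omit the referrer are rejected here.
        return False
    return urlparse(referrer).hostname == SITE_HOST

# A real reader and a spam script sending a forged value look identical.
print(referrer_allows("http://example.com/archives/000123.html"))
# A legitimate reader whose browser sends no referrer is turned away.
print(referrer_allows(""))
```

The check passes anything that supplies a plausible URL on your site and fails honest readers who send none, so it blocks the wrong group.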

IP Addresses

Another piece of identifying information that accompanies every HTTP connection is the IP address of the computer making the request. An IP address is a number such as "192.0.34.166", which uniquely identifies a computer on the Internet. It's the loose equivalent of a phone number.

At first thought, you might hope to block comment spam by blocking the IP addresses of known spammers. The idea is that once you receive spam from "192.0.34.166", you could mark that comment as spam, and your comment script could reject further attempts to submit comments from that IP address. Or, to defend against spammers who send a flood of spam -- dozens or hundreds of comments in the span of a few minutes -- your comment script could place a limit on the number of comments that can be sent from a single IP address during any given period of time. Real people submitting legitimate comments to your weblog are unlikely to be hindered by being unable to send more than, say, five comments per hour. This technique is known as IP throttling.

And in fact, throttling does work against certain instances of comment spam, which are sent from a single computer under control of the spammer. However -- and unfortunately -- many spammers take advantage of "open proxy relays". There are thousands of open proxies on the Internet, and they allow spammers to submit comments anonymously from thousands of unique IP addresses.

Let's say a spammer wants to submit 50 comments to your weblog. If he sends all 50 from the same IP address (perhaps that of his own computer, for example), IP throttling will block all but the first few comments he submits. However, if the spammer is using software that can connect to open relays, he can submit 50 comments from 50 different IP addresses. Simple IP-based throttling is ineffective against this spamming technique.
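The 50-comment scenario can be sketched in Python. This is a minimal per-IP throttle assuming a limit of five comments per rolling hour (the figure used as an example earlier); the class and parameter names are invented for illustration.

```python
from collections import defaultdict, deque

class IPThrottle:
    """Allow at most max_comments per IP address within a rolling window."""

    def __init__(self, max_comments=5, window_seconds=3600):
        self.max_comments = max_comments
        self.window = window_seconds
        self.history = defaultdict(deque)  # ip -> timestamps of accepted comments

    def allow(self, ip, now):
        recent = self.history[ip]
        # Forget accepted comments that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_comments:
            return False
        recent.append(now)
        return True

# 50 comments from one address, one per second.
single = IPThrottle()
from_one_ip = sum(single.allow("192.0.34.166", t) for t in range(50))

# 50 comments from 50 different addresses (as through open proxies).
spread = IPThrottle()
from_many_ips = sum(spread.allow(f"10.0.0.{n}", n) for n in range(50))

print(from_one_ip, from_many_ips)
```

With a single source the throttle accepts only the first five comments; spread across fifty addresses, every comment is the first from its address and all fifty are accepted.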

Anti-Spam Comment Strategies and Tools

In broad terms, these are the techniques that can be used to block or mitigate comment spam.

Upgrade to the Latest Version of Movable Type

Movable Type 3.1x offers several advantages over earlier versions to help thwart or mitigate the effects of comment spam. (It's also worth noting that for a single author and up to three weblogs, you can upgrade to Movable Type 3.1x for free.) If you're still running Movable Type v2.6 or earlier, here are a few of the reasons to consider upgrading:

Movable Type menu with comments and trackback buttons
  1. Centralized Comment Management: All comments can be searched and modified from a central "Comments" page. The same goes for TrackBacks as well.

  2. Comment Registration: Movable Type 3 and later offer a way to accept an identity for each commenter, and to control or manage submissions based on that identity. For identity management, site owners can use either a centralized service like Six Apart's TypeKey service or a local one like Tim Appnel's Tiny Orwell. See the section below on "Authentication and Identification" for details.

  3. Comment Moderation: Incoming comments from unidentified users -- i.e. those who haven't logged in via TypeKey -- can be held for moderation. Rather than being immediately published or completely rejected, moderated comments are held, pending your approval.

  4. Dynamic Publishing: Switching to dynamic publishing doesn't avoid comment spam, but it can significantly reduce the amount of work your server needs to handle for each incoming spam. Using dynamic publishing obviates the need for Movable Type to rebuild your pages on each comment received -- a fairly intensive task. If you're using Movable Type's default static publishing and get hit by a flood of comment spam, the program has much more work to handle than it would if you were using dynamic publishing.

  5. Plugins: Many of the plugins discussed in this article require features or plugin APIs only available in Movable Type 3.1 or later. This includes:

    • MT-Blacklist 2
    • MT-DSBL

    Because Movable Type version 3 introduced a significantly improved set of APIs for plugin developers, this list will continue to grow over time. Plugins written for these versions can do things that weren't possible in previous versions of Movable Type.

Security by Obscurity

Part of the problem with a default Movable Type installation is that the commenting mechanism is identical to that of all other default Movable Type installations, which means that once a spammer has a script that can spam one site with a default configuration, it can spam any of them.

By making your installation sufficiently different from the default, you place unexpected obstacles in a spammer's way. This is called "security through obscurity". Obscurity strategies vary widely and wildly and can effectively foil comment-spam software written specifically to target Movable Type weblogs.

Changing the Name of Your Comment Script

For example, one simple form of obscurity is to rename the CGI script that handles comments. By renaming this script to something entirely unique, you can foil any comment-spamming software which relies on the default comment script name.

By default, this script is named mt-comments.cgi. Rename it, upload it in ASCII (text) mode to the same directory and then remove your old comment script.

After uploading the new comment script, you then need to tell Movable Type what script name you are using. Use a text editor to open your mt.cfg file (installed in the same directory as the comment script) and add a line (or change it if one exists already) that looks like this:

CommentScript name-of-script.cgi

Within your comment forms, make sure the <form> tag's "action" parameter looks like this:

action="<$MTCGIPath$><$MTCommentScript$>"

After making these changes, you'll need to rebuild your site so that your published pages reflect the changes.

Pros: Changing the name of your comment script is relatively easy, causes no change to the functionality of your site, and does in fact foil many automated comment spammers.

Cons: It's easy for spammers to work around this. All their software needs to do is download a copy of the pages on your site containing your comment forms, and locate the new name of the comment script in the source code. You can't completely hide the name of the comment script, because when legitimate visitors to your site create comments, their web browsers need to know the URL of your comment script in order to submit them.

The software used by many spammers is already capable of doing this.

Adding Additional Form Values

Rather than (or in addition to) simply renaming your "mt-comments.cgi" file, you can add additional fields to the form, and modify the script to require "correct" data in these fields for any submitted comments. The idea is that comments sent via your weblog comment forms will contain the necessary additional fields; comments sent using spamming software that only sends data corresponding to what's expected by the default mt-comments.cgi script will not.
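The server-side half of this idea might look like the following Python sketch. Movable Type's comment script is not written in Python, so this only illustrates the logic; the field name "blog_check" and its expected value are hypothetical and would be chosen by you.

```python
# Hypothetical extra field added to your comment form as a hidden input.
EXPECTED_FIELD = "blog_check"
EXPECTED_VALUE = "tangerine-7"

def has_required_field(form):
    """Return True only if the submission carries the site-specific field."""
    return form.get(EXPECTED_FIELD, "") == EXPECTED_VALUE

# A submission built for a default installation lacks the extra field.
default_style = {"entry_id": "42", "author": "x", "text": "buy now"}
# A submission made through your own modified form includes it.
from_your_form = dict(default_style, blog_check="tangerine-7")

print(has_required_field(default_style))
print(has_required_field(from_your_form))
```

Because the expected value is visible in your page source, this only stops software that has not been adapted to your site.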

Pros: This can be much more effective than simply renaming your comments script. The same basic idea is at work, where you're making your weblog different enough from a default installation of Movable Type that any spamming software written specifically to target the default Movable Type configuration will no longer work.

Cons: This is still a form of "security through obscurity". There's nothing to stop spammers from customizing their scripts to send the data fields expected by your customized installation -- you're just hoping that it won't be worth their while to do so.

This hints at a larger flaw -- by definition, a technique that depends on obscurity is only "obscure" if few people are using it. If you download a plugin or follow a set of publicized instructions for achieving this technique, it's likely that it's being used by enough other people that spammers will work around it.

Obscurity strategies are generally best kept secret -- not because the secrecy prevents spammers from circumventing the measures, but because if yours is the only weblog using a particular form modification, spammers might not deem it worthwhile to make a special case for it.

Don't Send Update Pings

Sites such as weblogs.com and blo.gs publish lists of "recently updated weblogs". They produce these lists via "pings" -- short notifications to those servers from weblog software.

There is growing speculation that comment-spammers poll these sites and use the results to send spam to recently-updated weblogs.

Pros: Might stop some spammers from noticing when you've updated.

Cons: Unlikely to have any long-term effect. Spammers could just as easily identify recently-updated sites by tracking RSS and Atom syndicated feeds. Also, by not sending pings, some regular readers of your site may not notice when you've created a new entry.

Form Obfuscation

Although you can't completely hide the name of your comments script, or the values and names of the fields in your comment forms, you can attempt to obfuscate them. Most commonly, this is achieved by using JavaScript to generate your comment form, or at least parts of it.

Spammers often use automated scripts that download the source code to your web pages, and search them for the names of your comment script and comment form field names. If you've obfuscated these values using JavaScript, the spammers' spidering software would need to be able to parse and execute JavaScript to determine the correct values.

Pros: Makes spammers' jobs more difficult.

Cons: Unless you devise a particularly clever means of obfuscation, it's not that much more difficult for spammers to determine the information they need to spam you. Spammers can circumvent an entire range of JavaScript obfuscation tactics simply by running all of the weblogs they spider through a JavaScript interpreter before parsing them. And, unfortunately, it doesn't require much work to do so.

Turing Tests

Named for the pioneering computer scientist Alan Turing, a Turing test is an attempt to pose a challenge that can be passed by humans, but not by computers.

By adding a Turing test to your comment form, the goal is that your legitimate human commenters will pass through unhindered, but automated attacks by spamming software will not. In this way, Turing tests are an extended form of adding additional form values, with the additional twist that they require human interaction to pass.

One simplistic example would be to require commenters to answer a natural language question, such as "What is the last name of the author of this weblog?", or "Which month immediately precedes August?" The problem with this technique is that to be effective, the questions need to change frequently. If you ask the same question, spammers will seed their scripts with the answer.

CAPTCHAs

Perhaps the most famous Turing-style test in use as an anti-spam technique is the CAPTCHA (a cutesy acronym that stands for "Completely Automated Public Turing test to tell Computers and Humans Apart"). CAPTCHAs frequently come in the form of images of fuzzy or distorted letters and numbers, which humans can read and parrot into a text field, but which automated optical character recognition software has trouble identifying.

Example: James Seng's MT-Scode plugin

Pros: Can be very difficult for spammers to work around.

Cons: Numerous. First, an image-based CAPTCHA is impossible to solve for people with impaired vision, those with reading difficulties (e.g. dyslexia), or those using text-only web browsers. If the only way to comment on your site is by solving an image-based CAPTCHA, you have a serious accessibility problem.

Plus, because CAPTCHAs are in use on numerous high-profile sites, such as Yahoo Groups and PayPal, spammers have devoted significant effort to automating ways to solve them. For example, this report by Cory Doctorow at Boing Boing indicates that spammers have begun using unsuspecting web surfers on other sites to do the work for them in real time.

Authentication and Identification

Spammers do their best work under the cover of anonymity. If a spammer cannot be tied to a particular piece of unchanging information (e.g. IP address, email address, URL), they are better able to circumvent any roadblocks that you put in their way based on these characteristics.

One way to overcome this problem is to require that commenters identify themselves in some reliable and consistent way. With this method, it's not important that you actually know who a commenter is, so long as you can tie that commenter to each of their previous comments. Once their pseudonymous identity is authenticated, akin to a username on a message board, they can post comments on your site until such time as you revoke that right.

TypeKey

Movable Type 3 introduced a user authentication service called TypeKey. TypeKey is a service that lets you confidently identify a particular user, and it is a free service for both publishers and users. The TypeKey system takes care of managing user accounts and verifying authentication.

For more information on how to enable TypeKey on your Movable Type weblog, see this article on the Six Apart Professional Network weblog. To use TypeKey in other applications, see Everything TypeKey.

TypeKey is relatively easy to enable on a Movable Type weblog, but it is not tied to Movable Type specifically. TypeKey provides an open API that allows other applications (e.g. DropCash) to use it for verifying user identities.

With TypeKey enabled, Movable Type provides you with several configurable options for using it. The most effective configuration to prevent comment spam is to only accept comments from identifiable visitors and to disable the "Automatically Approve Registered Commenters" option. With automatic approval disabled, the first time a TypeKey user submits a comment to your weblog, it will be held for your approval. After approving the comment (and implicitly, the commenter), it will be published to your weblog, and any subsequent comments from that user will be published automatically.

Screenshot of configuration options

The worst-case scenario when using TypeKey in this way would be if a spammer created a TypeKey account, and used it to send spam to your weblog. However, because the first comment from any TypeKey user must be approved by you before being published, the only way a spammer could sneak spam onto your site would be to first submit a comment that appears to be legitimate. While it's possible that some spammers might attempt this, it is highly unlikely that they would be able to do this using automated scripts. If they do and are reported to Six Apart, TypeKey's terms of service allow us to disable their accounts.

If you wish, you can also accept comments from people who don't have TypeKey accounts. To prevent spam from being published, you can configure Movable Type to automatically moderate all comments from users who aren't using TypeKey. This requires your approval for each such comment, but prevents comment spam from being published on your weblog.

Screenshot of configuration options
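The moderation rules described in this section can be summarized as a small decision function. The Python below is a sketch of that logic only; it is not Movable Type's code, and the function name, parameters, and status strings are invented for illustration.

```python
def comment_status(commenter_id, approved_commenters, accept_unregistered=True):
    """Decide what happens to an incoming comment.

    commenter_id is None for a visitor who has not signed in; otherwise it is
    the authenticated identity. approved_commenters is the set of identities
    whose earlier comment you have already approved.
    """
    if commenter_id is None:
        # Unregistered visitors are held for approval, or refused outright.
        return "moderate" if accept_unregistered else "reject"
    if commenter_id in approved_commenters:
        return "publish"
    # First comment from a registered but not yet approved commenter.
    return "moderate"

approved = {"alice"}
print(comment_status("alice", approved))
print(comment_status("newcomer", approved))
print(comment_status(None, approved))
print(comment_status(None, approved, accept_unregistered=False))
```

In every branch, a comment from an unknown source is held or refused, so nothing from a new identity is published without your approval.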

TypeKey strikes a balance between convenience and security -- any system significantly more convenient than TypeKey would also be more convenient for spammers.

Pros: TypeKey is free to use and easy to enable using Movable Type 3.1. It makes it more difficult for spammers to send comments to your weblog, because they need a valid TypeKey account to do so. And, even if they do submit spam, their initial messages need to be approved by you before they'll appear on your weblog. Using TypeKey is very effective at stopping spam, and should remain so going forward.

Cons: Users who haven't yet created their own TypeKey accounts must go through the account-creation process before commenting on your site. This is a one-time process, and allows them to use TypeKey on any other TypeKey-enabled web sites, but it still might lead some casual commenters to decide it isn't worth the trouble. Also, creating a new TypeKey account requires solving a CAPTCHA (only once, during account creation), which entails certain accessibility problems.

Tiny Orwell

Movable Type's support for user authentication is not tied specifically to the TypeKey service. Movable Type can work with any authentication system that supports the TypeKey API. One example is Tim Appnel's Tiny Orwell, a simple and free authentication system that runs locally on your server.

If you have an existing authentication system, the information at Everything TypeKey should help you allow it to be used as a comment authentication system for Movable Type.

Pros: Prevents spam in a method similar to TypeKey.

Cons: Commenters must create new accounts on each site using Tiny Orwell, whereas with TypeKey, you can create a single account to use on any TypeKey-enabled web site. (This could also be seen as an advantage, however, since it means spammers would need to create separate accounts for each site as well.)

Content Filtering

Content filtering encompasses a wide range of tactics to identify the differences between legitimate comments and spam. This includes not just examining the contents of the comment messages themselves, but also the HTTP headers that accompany the comment. HTTP headers include information such as the referrer, the user-agent string, and the IP address of the computer sending the comment.

Content filtering is one of the primary means of combatting email spam, and can be effective against comment and TrackBack spam as well.

However, successful content filtering is notoriously difficult, because spammers expend significant effort to circumvent the filters. The result is an ever-escalating arms race: anti-spam developers create clever new filtering schemes; spammers eventually figure out a way to work around them; which in turn leads anti-spam developers to devise new filtering schemes.

MT-Blacklist

Jay Allen's MT-Blacklist plugin for Movable Type provides content filtering against a list of items that you maintain yourself and can also update automatically from a central list maintained at the Comment Spam Clearinghouse. MT-Blacklist offers numerous features in addition to its namesake blacklisting feature, such as blocking duplicated submissions, forcing moderation on older posts, and limiting the number of URLs within the content of a single comment (spammers often attempt to send comments containing dozens or even hundreds of URLs; legitimate comments seldom contain more than one or two).

For a much more in-depth look, see the tutorial entitled "Getting the most out of MT-Blacklist."
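Two of these checks, matching against a blacklist and limiting the number of URLs, can be sketched in Python. This is a simplified illustration and not MT-Blacklist's implementation; the blacklist entry, the limit of two URLs, and the result strings are assumptions.

```python
import re

# Counts occurrences of http:// or https:// as a rough proxy for links.
URL_PATTERN = re.compile(r"https?://", re.IGNORECASE)

def filter_comment(text, blacklist, max_urls=2):
    """Return a verdict string for a comment body."""
    lowered = text.lower()
    for term in blacklist:
        if term.lower() in lowered:
            return "blocked: blacklisted term"
    if len(URL_PATTERN.findall(text)) > max_urls:
        return "blocked: too many URLs"
    return "accepted"

blacklist = ["cheap-pills.example"]  # hypothetical entry
print(filter_comment("Great point, see http://example.org/notes", blacklist))
print(filter_comment("Visit http://cheap-pills.example today", blacklist))
print(filter_comment("http://a.example http://b.example http://c.example", blacklist))
```

A stricter variant could return a "moderate" verdict instead of blocking, which trades a little of your time for fewer lost legitimate comments.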

Pros: Combines several effective strategies in a single plugin. Provides a web-based user interface (integrated with Movable Type's own interface) for configuring options and monitoring MT-Blacklist's effectiveness. Helps block TrackBack spam in addition to comment spam.

Cons: Content filtering is not 100-percent accurate. The more aggressively you configure MT-Blacklist, the more likely it is to mistakenly flag good comments as spam. Conversely, the more leniently you configure MT-Blacklist, the more likely spam is to slip through. However, you can configure MT-Blacklist to use moderation on certain blacklist terms, which will prevent spam containing those terms from being published, but will not block legitimate comments which happen to contain those terms. In other words, with moderation, the worst fate for a good comment mistakenly flagged as spam is that it will be delayed until you approve it; it won't be outright rejected.

MT-DSBL

Brad Choate's free MT-DSBL plugin uses real-time DNS lookups to determine if the commenter's IP address is listed as an open proxy. Spammers often use open proxies because they allow the spammer to send comments through remote computers instead of their own machines. The real source of an attack becomes much harder to determine in such a case.

Depending on how this plugin is installed, it can either moderate or deny comments coming from open proxies and all actions are recorded in the Movable Type activity log. You can also install an option that denies TrackBacks that come from these sources as well.
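DNS-based blocklists are conventionally queried by reversing the octets of the IP address and appending the list's zone name; if the resulting name resolves, the address is listed. The Python sketch below shows that convention. The zone name "list.dsbl.org" is an assumption, the service may no longer be reachable, and this is not the MT-DSBL plugin's code.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone="list.dsbl.org"):
    """Build the DNS name to look up for an IPv4 address (zone is assumed)."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def is_listed(ip, zone="list.dsbl.org"):
    """Return True if the blocklist has an entry for this address.

    Performs a live DNS lookup, so the result depends on the network
    and on the blocklist still being in service.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False

print(dnsbl_query_name("192.0.34.166"))
```

Because the check is an ordinary DNS lookup, it adds little overhead per comment, but it also means a DNS failure is indistinguishable here from "not listed".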

Pros: Very easy to set up, and once installed, blocks spam from known open proxies. Requires absolutely no maintenance -- the list of proxies is kept up-to-date by the maintainers of the DSBL.org web site.

Cons: This could conceivably block some legitimate comments. Honest people might use open proxies in order to leave comments anonymously. For example, users who are extremely sensitive regarding their privacy, or those who are leaving comments of an extremely private or confidential nature, may choose to use a proxy.

MT-Bayesian

James Seng's MT-Bayesian plugin uses statistical analysis to flag comments that appear to be spam. For example, by looking at word frequency, some words appear with disproportionate frequency in spam. If several of these suspicious words appear in a comment, it's likely that the comment is spam. For more information and background on Bayesian-style filtering, see Paul Graham's seminal essay, "A Plan for Spam."
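As a rough sketch of the statistics involved, the Python below estimates how "spammy" each word is from training counts and then combines the per-word estimates in the way Graham's essay describes. It is a simplification for illustration (the clamping bounds and function names are assumptions), not the MT-Bayesian plugin's code.

```python
from math import prod

def word_spam_probability(spam_count, ham_count, n_spam, n_ham):
    """Estimate P(spam | word) from how often the word appears in each corpus."""
    spam_rate = spam_count / n_spam
    ham_rate = ham_count / n_ham
    if spam_rate + ham_rate == 0:
        return 0.5  # a word never seen before tells us nothing
    p = spam_rate / (spam_rate + ham_rate)
    # Clamp so that no single word is treated as absolute proof.
    return min(0.99, max(0.01, p))

def combined_spam_probability(token_probs):
    """Combine per-word probabilities into one score for the comment."""
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)

# Two strongly spam-associated words outweigh one innocent-looking word.
print(combined_spam_probability([0.99, 0.99, 0.2]))
```

A comment would be flagged when the combined score exceeds a threshold you choose; the score is only as good as the counts the filter was trained on.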

Pros: Bayesian-style content filtering can be surprisingly accurate if the spam is sufficiently different from the non-spam.

Cons: Weblog spam is often indistinguishable from a normal comment aside from a single spammy link. Plus, to achieve reasonable accuracy, a Bayesian-style content filter must be trained, typically by seeding it with large amounts of spam and non-spam messages at the outset, and then manually correcting any mistakes. As you correct mistakes, accuracy should increase; if you don't correct mistakes made by the filter, however, its accuracy will get worse, because the filter will have an inaccurate conception of what's spam and what's not.

Throttling

Automated spam often comes in large waves -- dozens or even hundreds of spam comments within the span of a brief period of time.

You can defend against a flood of comment spam by throttling. By default, Movable Type is configured with a simple built-in throttling mechanism: if a new comment from any IP address is submitted within 20 seconds of a previous comment from that same address, the program returns an error message informing the commenter that they must slow down. You can change the period of time using the ThrottleSeconds directive in your "mt.cfg" file.

The biggest limitation to Movable Type's built-in throttle is that it only works against multiple spam comments coming from the same IP address; spammers who use open proxies to send comments will not be caught using this feature.

Real Comment Throttle

Phil Ringnalda's Real Comment Throttle is a free plugin that implements a simple throttle that ignores comments' IP addresses. Instead, it institutes a hard limit on the total number of comments your weblog will accept per hour and per day, and once that limit is reached, all subsequent comments will be rejected until the period of time is over.
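The idea of a site-wide limit can be sketched in Python as follows. This is an illustration of the technique, not the plugin's code: the limits are arbitrary, the names are invented, and the sketch uses rolling one-hour and one-day windows, which may differ from how the plugin measures its periods.

```python
class GlobalThrottle:
    """Cap the total number of comments accepted per hour and per day."""

    def __init__(self, per_hour=10, per_day=50):
        self.per_hour = per_hour
        self.per_day = per_day
        self.accepted = []  # timestamps (in seconds) of accepted comments

    def allow(self, now):
        # Keep only comments from the last 24 hours.
        self.accepted = [t for t in self.accepted if now - t < 86400]
        last_hour = sum(1 for t in self.accepted if now - t < 3600)
        if last_hour >= self.per_hour or len(self.accepted) >= self.per_day:
            return False
        self.accepted.append(now)
        return True

throttle = GlobalThrottle()
# A burst of 15 comments, one per second, regardless of IP address.
accepted_in_burst = sum(throttle.allow(t) for t in range(15))
print(accepted_in_burst)
```

Note that the function never looks at who sent the comment, which is both why open proxies cannot evade it and why legitimate commenters can be caught by it.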

Pros: Very simple to configure and use (and also to uninstall). Mitigates the effects of a large-scale flood of comment spam.

Cons: This technique effectively blocks all spam that comes in once the throttle threshold has been reached. However, it does so by blocking *all* comments, spam or not. It's entirely possible that legitimate commenters could be blocked once your throttle threshold is reached.

Moderating or Closing Comments on Old Entries

Keeping comments open on every entry in your weblog archive presents spammers with more targets to which to send spam. Spammers often target old entries because the spam is less likely to be noticed by you than on your most recent entries. Plus, older articles often bring in more traffic via search engines.

Closing comments on old entries -- or configuring Movable Type to automatically moderate comments sent to old entries -- can help reduce comment spam. Several plugins make these tasks easier. Using moderation on old entries allows readers to continue commenting on old topics. Closing comments on old entries is easier, but prevents further discussion or corrections on old entries.
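The decision these plugins automate comes down to a comparison of dates. Here is a minimal Python sketch; the function name comment_policy, the 14-day default, and the "open", "moderate", and "closed" labels are assumptions chosen for illustration and are not taken from any particular plugin.

```python
from datetime import datetime, timedelta


def comment_policy(entry_date, now, moderate_after_days=14, close_after_days=None):
    """Decide how to treat a new comment based on the age of its entry.

    Returns "open", "moderate", or "closed". Pass close_after_days=None
    to moderate old entries indefinitely instead of closing them.
    """
    age = now - entry_date
    if close_after_days is not None and age >= timedelta(days=close_after_days):
        return "closed"
    if age >= timedelta(days=moderate_after_days):
        return "moderate"
    return "open"


if __name__ == "__main__":
    posted = datetime(2005, 1, 1)
    for days in (3, 30, 400):
        today = posted + timedelta(days=days)
        print(days, comment_policy(posted, today, close_after_days=365))
```

Setting only moderate_after_days keeps old discussions alive at the cost of reviewing a queue; adding close_after_days trades that effort for a hard cutoff.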

Plugins for Moderating Comments on Old Entries

Pros: Can significantly cut down on spam; many spammers repeatedly send comments to a list of old entries on your weblog. Does not hamper active discussion on recent entries.

Cons: Does nothing to combat spam sent to recent posts.

Plugins for Closing Comments on Old Entries

Pros: Unlike moderating, closing comments requires no further effort on your part.

Cons: Closing comments on old entries prevents your readers from correcting or updating outdated information in old entries.

Six Apart Recommendations for Movable Type Users

The only 100% solution to weblog spam is to never accept comments or TrackBacks. However, you can still come close with submissions open if you employ a multi-layer defense. Our recommendations were chosen on the basis of four main characteristics:

Diversity through plugins

By upgrading to the latest version of Movable Type, you not only get the benefit of a larger feature set (upon which many of our recommendations rely) and much better performance in the face of a comment spam attack, but also the most powerful and extensible plugin API of any version of Movable Type.

Check out all Movable Type plugins at mt-plugins.org

This is very important since plugin authors will always be more nimble in their response to the adaptations of comment spammers than the core Movable Type code will. Perhaps more important, though, is the fact that plugins are the best way to create diversity in what has been, until now, a monoculture of default Movable Type installations. The less a spammer can rely on a particular configuration, the more work they must do and the less attractive the weblog medium is as a delivery vehicle for their payloads.

Elimination by proxy

Since spammers love to attack from behind a rotating lot of anonymous proxies, the second thing you should do is employ Brad Choate's MT-DSBL. In doing so, you make it extremely difficult for spammers to hit your installation from multiple IP addresses, and hence sharpen the teeth of Movable Type's built-in IP-based comment and TrackBack throttles.
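DNS-based blacklists are conventionally queried by reversing the octets of the IP address, appending the list's zone, and checking whether that name resolves. The Python sketch below shows that convention only; it is not MT-DSBL's code, the function names are hypothetical, and the zone dnsbl.example.org is a placeholder rather than a real service.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up, e.g. 192.0.2.7 -> 7.2.0.192.<zone>."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolver=socket.gethostbyname):
    """Return True if the blacklist zone has a record for this address."""
    try:
        resolver(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No record (or the lookup failed): treat the address as not listed.
        return False
    return True


if __name__ == "__main__":
    # Placeholder zone; substitute the zone of a list you actually use.
    print(dnsbl_query_name("192.0.2.7", "dnsbl.example.org"))
```

This sketch fails open: if the lookup itself fails, the comment is allowed through. That keeps a blacklist outage from locking out your readers, at the cost of letting some spam in while the list is unreachable.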

What's more, the anonymous proxy blacklist that MT-DSBL employs is updated in real-time by a centralized service dedicated to this mission, meaning absolutely no maintenance headaches for you.

Halt! Who goes there?

There is no doubt about the effectiveness of authentication in stopping comment spam. Whether you use TypeKey, Tiny Orwell, or any other authentication service that supports the TypeKey API in the future, requiring a proof-of-identity (even a pseudonymous identity) presents an almost insurmountable barrier to spammers who are looking to extract the maximum effectiveness for the least possible work.

For more information on how to enable TypeKey on your Movable Type weblog, see this article on the Six Apart Professional Network weblog. To use TypeKey in other applications, see Everything TypeKey.

Of course, authentication also presents a barrier to entry, however slight, to your users. However, you can mitigate that by accepting both authenticated and non-authenticated comments while moderating the latter. By doing so, you give everyone the ability to comment while creating a "fast track" path for those who are willing to prove they are not spammers.

When you do this, you should place instructions by your comment form to let your visitors know about your policy. Something like the following would be effective:

If you wish for your comment to be posted immediately, [sign in via TypeKey](link). Otherwise, your comment will be entered into a queue for approval by the site owner.

Because everyone loves immediacy, you will be surprised at how many choose to authenticate rather than wait for approval.
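The "fast track" policy described above amounts to a two-way routing decision. A minimal Python sketch follows; the Comment and CommentQueue classes and the "published" and "pending" labels are hypothetical and do not reflect Movable Type's internals or the TypeKey API.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    author: str
    body: str
    authenticated: bool = False  # True if the commenter signed in


@dataclass
class CommentQueue:
    """Hypothetical store: authenticated comments skip the approval queue."""

    published: list = field(default_factory=list)
    pending: list = field(default_factory=list)

    def submit(self, comment):
        if comment.authenticated:
            self.published.append(comment)
            return "published"
        self.pending.append(comment)
        return "pending"

    def approve(self, comment):
        """Called by the site owner to publish a moderated comment."""
        self.pending.remove(comment)
        self.published.append(comment)
```

Nobody is turned away under this policy; unauthenticated commenters simply wait for approval.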

Swiss Army Defense

For a much more in-depth look, see the tutorial entitled "Getting the most out of MT-Blacklist."

Even if you loathe forced moderation for all non-authenticated commenters, there is an alternative. MT-Blacklist is itself a Swiss Army knife of comment/TrackBack defenses all rolled into one package. These include:

In essence, MT-Blacklist cuts out the spam while reducing the number of comments you have to moderate.

Throttle 'em

Given the defenses we've outlined above, it is doubtful that a single piece of spam would ever get through and be posted to your site, much less a massive attack. However, should a massive attack get through, you're going to want to limit the damage and minimize the amount of time you'll have to spend cleaning up. For that, Phil Ringnalda's Real Comment Throttle will do the trick.

Note that by employing the defenses we've outlined in front of the throttle, you mitigate its one downside: a spammer who launches a severe attack gets to choose whether or not your site accepts comments. Since the other defenses essentially reduce the possibility of high-volume attacks to zero, the throttle should never have to kick in. But if it does, you'll be glad it's there.

If all else fails...

If for some reason all of your defenses fail and your server is under a severe attack, you can simply disable all comments temporarily. In Weblog Config » Preferences, uncheck the following two boxes:

We are sure that you will never have to do this if you've followed our recommendations, but it was worth a mention because it's not entirely obvious how one might quickly turn off all comments.

It is not possible to easily disable all TrackBacks at this time. If you are under attack, however, you can simply remove the mt-tb.cgi script to prevent any submissions.

Summary

Whether we want to admit it or not, our little corner of the web has blipped onto the radar screens of spammers. Gone are the days when the ultimate simplicity of a comment form and a submit button immediately and frictionlessly connected you with hundreds or thousands of your readers.

When a small community grows up and finds it necessary to put locks on their doors to stop the would-be robbers, human nature tends to view this as the end of an era -- as the death of simplicity and trust. In the end, however, locks have never prevented neighbors from dropping by the house to say "Hi!".

So too will it be for weblogs.

We at Six Apart are of the strongest opinion that this does not signal the end of interactivity. By implementing the steps we've outlined in this guide, interaction between you and your visitors on your site should proceed with a minimal amount of fuss while locking spammers out almost entirely.

The adjustments that we must make to confront this new element are nothing more than a temporary awkward adolescence for weblogs. Once those adjustments are made, we can move on and get back to doing what we love: Making the web a richer place.

We here at Six Apart are eager and committed to doing what we can to rid the world of comment spam.

Some key links for those concerned about comment spam, especially on the Movable Type platform:

Six Apart sites:

Tutorials:

Plugins: