Movable Type Documentation

Chapter SpamLookup


When Movable Type (or any other blogging software) sends a TrackBack ping, the IP address of the server that sent the ping usually matches the IP address of the domain listed in the TrackBack URL.

On the other hand, when spammers send TrackBack pings, they tend to spam with many different TrackBack URLs, none of whose domains match the sending IP address.

This test will check the TrackBack ping IP address and the IP address of the domain listed in the URL to see if they match or are at least close, indicating that they might be on an associated server.
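The comparison described above can be sketched roughly as follows. This is only an illustration, not SpamLookup's actual code: the function names are hypothetical, and treating "close" as "in the same /24 network" is an assumption, since the text does not say how closeness is measured.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def same_network(ping_ip, domain_ip, prefix_len=24):
    """True if both IPv4 addresses share the same /prefix_len network.

    prefix_len=24 is an assumed meaning of "close"; 32 demands an exact match.
    """
    network = ipaddress.ip_network(f"{ping_ip}/{prefix_len}", strict=False)
    return ipaddress.ip_address(domain_ip) in network


def trackback_ip_check(ping_ip, source_url):
    """Resolve the domain in the TrackBack URL and compare it to the sender."""
    host = urlsplit(source_url).hostname
    if host is None:
        return False
    try:
        domain_ip = socket.gethostbyname(host)
    except OSError:
        # A domain that cannot be resolved cannot match the sender.
        return False
    return same_network(ping_ip, domain_ip)
```

A ping for which `trackback_ip_check` returns False would be a candidate for a junk score contribution.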

Recommended settings

Spammers almost never send TrackBack pings from the same IP address as the domains listed in the TrackBack URL, so this is a great way to catch TrackBack spam. However, there is a downside: pings sent directly from desktop blogging clients will also match the criteria for this test, because they come from the user's own computer rather than from the weblog's server.

We consider this acceptable collateral damage since, in our experience, very few false positives result, while a massive number of junk TrackBacks are easily caught with this test. We recommend this test be turned on with a score weighting of 1.

If you are unsure, just make sure to check your Junk folder periodically to see if legitimate TrackBacks from your users are being incorrectly marked as Junk. You can always use the Lookup whitelist to make exceptions if the number of users affected is small.

SpamLookup can perform lookups on the domain names of all URLs contained in a feedback item with any domain blacklist service. These services track domain names used in spam.

There are three settings for this test's operation:

  • Off - This test is not performed
  • Moderate - When a feedback item is received with a domain name that is listed with one of the specified blacklist services, the item is left unpublished and awaiting administrator approval.
  • Junk - The same as above, except that instead of moderating it, SpamLookup contributes to the item's Junk score, which may, depending on the other scores the item receives, lead to it being marked as Junk.

Junk score weight

For the Junk option, you can adjust the score this test contributes based on your confidence in the test, from 1 (least confident, the default) to 10 (most confident).

Domain Blacklist Services

You can alter the blacklist services used in the domain name lookups. The default services are bsb.spamlookup.net and sc.surbl.org.
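As a rough sketch of how such a lookup and the three settings could fit together: DNS-based domain blacklists are commonly queried by prepending the domain to the service name and checking whether the resulting hostname resolves. That SpamLookup works exactly this way is an assumption, the function names are hypothetical, and the negative sign on the junk contribution follows the convention described elsewhere in this chapter.

```python
import socket


def domain_query_name(domain, service):
    """Build the hostname to resolve, e.g. example.com.sc.surbl.org."""
    return f"{domain.lower().rstrip('.')}.{service}"


def domain_is_listed(domain, service):
    """A listed domain resolves; an unlisted one does not."""
    try:
        socket.gethostbyname(domain_query_name(domain, service))
        return True
    except OSError:
        # Any resolution failure is treated as "not listed" in this sketch.
        return False


def lookup_outcome(setting, listed, weight=1):
    """Return (hold_for_approval, score_contribution) for one test.

    setting is "off", "moderate" or "junk"; weight is the 1-10 junk weight.
    """
    if setting == "off" or not listed:
        return (False, 0)
    if setting == "moderate":
        return (True, 0)
    if setting == "junk":
        return (False, -weight)
    raise ValueError(f"unknown setting: {setting}")
```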

Recommended settings

The default domain blacklists tend to be extremely accurate in determining what is and is not spam. We recommend that this setting be set to Junk matching items with a junk score weight of either 1 or 2.

SpamLookup can perform IP address lookups with any IP address blacklist service. These services often contain the IP addresses of known anonymous proxies, which most comment and TrackBack spammers use.

There are three settings for this test's operation:

  • Off - This test is not performed
  • Moderate - When a feedback item is received with an IP address that is listed with one of the specified blacklist services, the item is left unpublished and awaiting administrator approval.
  • Junk - The same as above, except that instead of moderating it, SpamLookup contributes to the item's Junk score, which may, depending on the other scores the item receives, lead to it being marked as Junk.

Junk score weight

For the Junk option, you can adjust the score this test contributes based on your confidence in the test, from 1 (least confident, the default) to 10 (most confident).

IP Blacklist Services

You can alter the blacklist services used in the IP address lookups. The default services are bsb.spamlookup.net and opm.blitzed.org.
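For illustration, DNS-based IP blacklists are conventionally queried by reversing the octets of the IPv4 address and appending the service name. The sketch below assumes that convention applies here; the function names are hypothetical.

```python
import ipaddress
import socket


def ip_query_name(ip, service):
    """Build the lookup hostname: 192.0.2.99 becomes 99.2.0.192.<service>."""
    # IPv4Address raises ValueError for anything that is not a valid IPv4 address.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + service


def ip_is_listed(ip, services):
    """True if any of the configured services has a record for this address."""
    for service in services:
        try:
            socket.gethostbyname(ip_query_name(ip, service))
            return True
        except OSError:
            # No record (or a failed lookup) for this service; try the next one.
            continue
    return False
```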

Recommended settings

The default IP address blacklists are constantly updated with new and accurate information and thus far have done an excellent job of determining what is and isn't spam. However, IP addresses can change rapidly and new anonymous proxies are created all of the time. In our experience thus far, there is little chance of false positives, but the blacklists may not catch the newest proxies.

We recommend that this setting be set to Junk matching items with a junk score weight of 1.

SpamLookup is an anti-spam plugin created by lead Movable Type engineer Brad Choate. It identifies spam based on a number of unique characteristics and administrator-configured tests.

This plugin utilizes the feedback rating framework. Each test that is enabled contributes to the final composite score of a feedback item. The final score (which is an average of all tests, including SpamLookup and other plugins) determines whether an item is marked as Junk or not.
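The averaging described above can be sketched as follows. The function names and the threshold of 0 are assumptions made for illustration; the text only states that the final score is an average of all contributing tests and that junk contributions are negative.

```python
from statistics import mean


def composite_score(test_scores):
    """Average the scores contributed by every enabled test."""
    return mean(test_scores) if test_scores else 0.0


def is_junk(test_scores, threshold=0.0):
    """Assumed rule: an item is Junk when its average falls below threshold."""
    return composite_score(test_scores) < threshold
```

For example, two junk contributions of -1 and -2 outweigh a single credit of +1, while two credits of +1 outweigh a single -1.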

Because of its many accurate and customizable tests, SpamLookup is the most successful anti-spam plugin for Movable Type. It is bundled with Movable Type 3.3 and Movable Type Enterprise, and can be configured on both the system level and weblog level (by authors with the appropriate permissions).

When a comment or TrackBack is received, you can direct SpamLookup to scan the content for matches to certain keywords, domain names and patterns. When a match is found, you can have SpamLookup either hold the item for your approval (leaving it unpublished) or contribute to the item's feedback rating.

Syntax of the keywords filter fields

Each item in your keyword filter list must be on its own line. It can be something as simple as a word, a domain name or even a phrase. For the more advanced, complex Perl regular expressions can be used directly to accurately match what you're looking for.

Words and phrases can be listed plainly. They are tested in a case-insensitive manner and match against "whole" words. That is, cialis would match cialis or Cialis but not specialist.

For those who are familiar with such things, Perl regular expressions can also be used and are denoted by use of slashes (/) before and after and followed by any optional modifiers. For example:

/online-?casino/i

If you wish, you can also specify the junk score of items in the "Keywords to Junk" field by including the score (from 1 to 10 inclusive) at the end of the line, like so:

phentermine 4

This is good for making darn sure other tests don't let something through.
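The line syntax above could be interpreted along these lines. This is a sketch in Python rather than SpamLookup's own Perl: the function name is hypothetical, only the i modifier is handled, the default score of 1 is an assumption, and Python's regular expression syntax differs from Perl's for more advanced patterns.

```python
import re

# An entry followed by whitespace and a score from 1 to 10.
TRAILING_SCORE = re.compile(r"^(.*\S)\s+(10|[1-9])$")
# A /pattern/modifiers entry.
SLASHED_REGEX = re.compile(r"^/(.*)/([a-z]*)$")


def parse_filter_line(line, default_score=1):
    """Turn one filter line into (compiled_pattern, junk_score)."""
    entry = line.strip()
    score = default_score
    scored = TRAILING_SCORE.match(entry)
    if scored:
        # Note: a plain phrase that ends in a number is read as entry + score.
        entry, score = scored.group(1), int(scored.group(2))
    slashed = SLASHED_REGEX.match(entry)
    if slashed:
        flags = re.IGNORECASE if "i" in slashed.group(2) else 0
        return re.compile(slashed.group(1), flags), score
    # Plain words and phrases: case-insensitive, whole words only.
    return re.compile(r"\b" + re.escape(entry) + r"\b", re.IGNORECASE), score
```

Calling `parse_filter_line("phentermine 4")` yields a whole-word pattern for phentermine together with a score of 4.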

When you receive a comment or a TrackBack from someone and publish it, there is a very strong likelihood that future contributions from that person are also not spam. SpamLookup can be directed to contribute a positive (non-spam) score to:

  • Comments with a previously seen email address
  • Comments with a previously seen URL (in the URL field, not in the comment body)
  • TrackBacks with a previously seen source URL

This credit is given only if the rest of the item is devoid of other URLs, which protects you in many cases from a spammer who submits one good comment and then follows up with a lot of spam comments.
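A minimal sketch of that rule for comments, under stated assumptions: the function and parameter names are hypothetical, "other URLs" is approximated by looking for http:// or https:// in the body, and the sets of previously seen values are assumed to come from already published feedback.

```python
import re

URL_IN_BODY = re.compile(r"https?://", re.IGNORECASE)


def previously_seen_credit(email, url, body, seen_emails, seen_urls, credit=1):
    """Return a positive score for a known commenter, or 0 for no credit."""
    if URL_IN_BODY.search(body):
        # The body carries other URLs, so no credit is given.
        return 0
    known_email = bool(email) and email in seen_emails
    known_url = bool(url) and url in seen_urls
    return credit if (known_email or known_url) else 0
```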

Recommended settings

We recommend that both tests be enabled and their scores set to 1.

SpamLookup can be directed to count the number of links included in a comment or TrackBack. High numbers of links are almost always indicative of spam whereas items without links are almost never Junk.

There are three link limit settings which are pretty self-explanatory:

  • Credit feedback rating when no hyperlinks are present
  • Moderate when more than N link(s) are given
  • Junk when more than N link(s) are given

The first actually contributes a positive score (non-junk) to the final score. The last contributes a negative score. The second does not contribute to the junk score but instead directs Movable Type to hold the item for approval.

Recommended settings

We recommend that all three tests be enabled and the scoring tests be set to 1. We recommend that you moderate with 3 links and junk with 10.

NOTE: There is a typographical error in the SpamLookup text. Where it says "Moderate/Junk when more than N link(s) are given" it should say "Moderate/Junk when N or more link(s) are given".
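The three link limit settings, using the "N or more" behavior from the note above and the recommended limits of 3 and 10, might be sketched like this. The function name is hypothetical, and counting links by occurrences of http:// or https:// is a simplifying assumption, not necessarily how SpamLookup counts them.

```python
import re


def link_limit_outcome(text, moderate_at=3, junk_at=10, weight=1):
    """Return (link_count, hold_for_approval, score_contribution)."""
    links = len(re.findall(r"https?://", text, flags=re.IGNORECASE))
    score = 0
    if links == 0:
        score += weight   # credit: no hyperlinks present
    if links >= junk_at:
        score -= weight   # junk: N or more links
    hold = links >= moderate_at  # moderate: N or more links
    return (links, hold, score)
```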

In blocking spam, there is always a struggle between blocking as much spam as you can and inadvertently blocking things which are not spam (otherwise known as "false positives").

Luckily, Lookup whitelists allow you to cast a wide net for spammers while also dealing with persistent false positives which might otherwise keep you from using effective tests.

How to use

If you find that, in addition to a lot of spam, legitimate feedback items are being caught by one or more of the SpamLookup tests, you can add to the whitelist either the sending server's IP address (or even a partial IP address to allow an entire IP block) or the domain that would be included as the TrackBack URL.

This is especially effective if you've turned on advanced TrackBack lookups but have one visitor who often sends TrackBack pings from their blogging client.

Simply place each full or partial domain or IP address on a line by itself. If a comment or TrackBack is submitted from an IP address that matches a whitelist entry, or contains a domain that matches one, SpamLookup will not contribute to its junk rating.
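One way to picture the matching is sketched below. The exact matching rules are not spelled out above, so this makes assumptions: partial IP addresses match as leading octets, partial domains match as substrings, and the function name and parameters are hypothetical.

```python
def is_whitelisted(sender_ip, domains, whitelist_text):
    """True if the sender IP or any domain in the item matches an entry.

    whitelist_text holds one full or partial IP address or domain per line.
    """
    for raw in whitelist_text.splitlines():
        entry = raw.strip().lower()
        if not entry:
            continue
        if all(ch.isdigit() or ch == "." for ch in entry):
            # IP entry: exact match, or a prefix ending on an octet boundary.
            prefix = entry.rstrip(".")
            if sender_ip == prefix or sender_ip.startswith(prefix + "."):
                return True
        elif any(entry in domain.lower() for domain in domains):
            # Domain entry: assumed to match as a substring.
            return True
    return False
```

With a whitelist containing the lines `192.0.2.` and `example.com`, a ping from 192.0.2.55 is exempted, while one from 192.0.20.5 is not.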
