April 15, 2007

Administrative Note on Spam

I’ve modified my blog software to help me catch the latest deluge of spam. I’ve gotten sick of deleting it. I haven’t yet instituted the hated automatic moderation, CAPTCHA, or forced registration (although I have not ruled them out for the future), but I am instituting keyword scoring.

Certain words, and certain combinations of words, now make it more likely that a comment gets flagged as junk or held for moderation. However, if you’re a friendly human, you get bonus points that keep things flowing smoothly.

What this means to you: The software might tell you that your comment is awaiting moderation if you use certain words and/or are not acting like a friendly human.

What are those words? Since they are subject to change, it doesn’t make sense to tell you so you can try to avoid them. But think credit card consolidation and Viagra and you get the idea.

What’s a “friendly human”? Someone who uses a consistent email address can be recognized as friendly the next time they post, so the system scores them more highly as human. The URL field is another opportunity to rack up bonus human points. The way I’ve set it up, your comment is less likely to be rejected if you use the same email address and the same URL every time you post. If you leave those fields blank, you start out at zero points. If you fill them with something generic (fakemail@somedomain.com), I’m going to have to consider rejecting the comment, because fake addresses poison the scoring system.
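
For the curious, here is a rough sketch of the idea in Python. It is not the actual code running this blog: the word list, the weights, the thresholds, and the function names are all made up for illustration. It only shows how keyword penalties and "friendly human" bonuses could add up to a publish, moderate, or junk decision.

```python
# Illustrative sketch only. Every word, weight, and threshold below is
# a made-up assumption, not the blog's real configuration.
SPAM_WORDS = {"viagra": -3, "consolidation": -2, "credit": -1, "debt": -1}
KNOWN_BONUS = 3        # bonus for a previously seen email or URL
JUNK_AT_OR_BELOW = -4  # scores this low go to the junk drawer


def score_comment(text, email, url, known_emails, known_urls):
    """Add keyword penalties and friendly-human bonuses into one score."""
    score = 0  # blank email and URL fields start out at zero points
    for word in text.lower().split():
        score += SPAM_WORDS.get(word.strip(".,!?\"'"), 0)
    if email and email.lower() in known_emails:
        score += KNOWN_BONUS
    if url and url.lower() in known_urls:
        score += KNOWN_BONUS
    return score


def classify(score):
    """Map a score to one of the three outcomes described in the post."""
    if score <= JUNK_AT_OR_BELOW:
        return "junk"
    if score < 0:
        return "moderate"
    return "publish"
```

With these invented numbers, an anonymous comment pitching credit card consolidation and Viagra scores -6 and is junked, while the same words from a commenter with a known email and URL climb back to 0 and get published.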

If your comment falls into the moderation bin, I get notified, and I can publish it. If it gets junked, I don’t get notified. The comment is not lost, but I won’t be checking the junk drawer that frequently. So, be a friendly human if you like things to go smoothly!

Posted by James at April 15, 2007 11:40 AM

I don't mind moderation, but I think you'd be driven crazy by it: you get too many comments to want to bother moderating them all. I do detest CAPTCHAs, though, because they're annoying and hard to deal with, and some implementations (like the one that Blogger uses) are truly horrible. Blogger times out the CAPTCHA way too soon, for instance. So I go to make a comment and it shows me a CAPTCHA. I type in the comment, answer the CAPTCHA... and I get another CAPTCHA, because it took me "too long" to type my comment.

Registrations are a mixed bag. If I think I'll make frequent comments, I don't mind registering. But a registration requirement probably will stop me from making a casual one-time comment if I think I might not do so very often. That usually means that the first couple of times I might have commented, I won't... and then by the third time, I'll say, "Well, OK, I'll go ahead and register."

Posted by: Barry Leiba at April 15, 2007 6:51 PM

I feel pretty much the same way.

The system MT has built into it (at least, in the version I'm using) allows me to score each comment. So, if a bot wants to talk about debt and credit card consolidation, it's going to get trashed. If a regular poster does it, it will get through.

If it works, I think it's a happy medium. It could be cracked by savvy spammers, and it will annoy people who don't want to use an email address or a URL, but it's easier than registration.

Of course, as soon as I change to new blog software, all this will change yet again. But I keep running into roadblocks when I consider doing that (this server doesn't support the latest WordPress, for example, and I'm afraid the blog will stop working if I try upgrading MT again -- upgrading MT is a nightmare). I really want to upgrade or crossgrade, because I'm sick of how slow this blog software is.

Posted by: James at April 15, 2007 6:58 PM

I have to say that, with all the advantages of running things oneself and using MT or WP, and all the possible negative things about using a public blog service, I quite like using Blogger, and having it all managed for me.

Posted by: Barry Leiba at April 16, 2007 10:02 AM

Sorry, I just commented (to a different post) without an email address. I forgot. Just FYI, this is an old email address, and while it's technically still "good," it kind of sucks (bounces a lot of emails) and I'll probably get rid of it in the near future.

Posted by: Julie at April 16, 2007 12:20 PM

It's a fairly simple rating scheme. If you decide to use an old email address, it will work fine. Heck, you can switch among 3 email addresses if you like. As long as it gets through the filter once, future posts get the bonus points.
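
If it helps to picture it, here is a tiny Python sketch of that "get through once, get the bonus afterward" behavior. This is my own illustration, not MT's code: the class name, the method names, and the bonus value are all hypothetical.

```python
# Illustrative sketch only. FriendlyRegistry, its methods, and BONUS
# are hypothetical names and values, not part of Movable Type.
BONUS = 3


class FriendlyRegistry:
    """Remembers email addresses that have had a comment published."""

    def __init__(self):
        self.trusted = set()

    @staticmethod
    def _normalize(email):
        return email.strip().lower()

    def bonus(self, email):
        """Return the bonus for an address that already got through."""
        return BONUS if self._normalize(email) in self.trusted else 0

    def record_published(self, email):
        """Call once a comment from this address has been published."""
        if email.strip():  # blank addresses earn nothing
            self.trusted.add(self._normalize(email))
```

Each address is tracked on its own, so switching among three addresses works as long as each one has made it through the filter at least once.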

Posted by: James at April 16, 2007 12:32 PM

Please, if you decide on CAPTCHA, can you also allow some regular commenters to be exempt? I have a really hard time using it, as do many other brain injury survivors. I just hate it, and I'll often just forget registering for anything requiring it.


Having said that, I don't blame you at all. I'm just waiting for Deb to get her laptop fixed so we can get rid of the 300+ spam comments on MarineCorpsMoms, since I apparently don't have permission to delete stuff. I hate spammers.

Posted by: Cindy at April 16, 2007 2:01 PM

"What are those words?"

This is why, whenever it comes up on this blog (rarely, but it sometimes does), I type in pig latin "ornpay."

Posted by: Patti M. at April 16, 2007 3:14 PM

Cindy: I hate CAPTCHA, and while I won't rule anything out, I consider it a last resort. If I had to use CAPTCHA I would prefer to give people a choice: allow anonymous visitors to use CAPTCHA and allow repeat visitors to have sign-in accounts.

Patti: It should be pretty safe now for people who are regulars on this blog to say things like "debt consolidation." But only because of the bonus points.

Posted by: James at April 16, 2007 4:36 PM

By the way, I never post the email addresses that people use when they comment on this blog. They're just used to let me know who is commenting, and so I can reply by email if I want to.

There used to be a note about that on the comment form, but it's gone now. I'll put it back when I get a chance.

Posted by: James at April 16, 2007 4:53 PM

I mentioned it for the specific reason that you might try to reply to that email address. :)

Posted by: Julie at April 16, 2007 5:11 PM

I think it's a good solution. A good heuristic algorithm can be trained to handle most of the junk in the short term. And since you are thinking of switching blog software, it certainly is good enough for now.

Posted by: briwei at April 16, 2007 5:52 PM

That's what I figure. Good enough for now. Until spammers change. They're like the freaking Borg.

Posted by: James at April 16, 2007 6:42 PM

I saw a slightly tamer sort of CAPTCHA verification the other day, but I have no idea how well it works.

I forget the blog I was on, but the CAPTCHA picture contained several different series of numbers, and the text next to the image said "Click 2344 to proceed." If you didn't click the right spot, it didn't work, but you didn't need to type anything in.

Posted by: Chuck S. at April 17, 2007 1:24 PM

I have a friend who uses a simple math problem instead of a captcha.

Posted by: Julie at April 17, 2007 1:49 PM

Well that's one way to keep me from posting here.

Posted by: Patti M. at April 17, 2007 3:49 PM

