MSNbot Madness

If you’re experiencing an unexpected increase in bandwidth usage, I recommend taking a look at your logfiles before celebrating the higher traffic too much. You might find it’s all a Microsoft-induced mirage.

The other day I noticed that one of my sites seemed to have been unusually busy, so I checked the usage in the AWStats reports. Initially I noticed about 30 MB of hits from the Russian Federation, but soon realised that there was a much larger source of usage – nearly 1 GB of hits from MSNbot! That was actually twice the level of genuine traffic over the same period. Clearly something was amiss, and having read an article last week about the same robot apparently ignoring robots.txt files (“MSNbot 2.0b is ignoring robots.txt and No Index meta tags”), I went looking in the raw logfiles.
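For anyone who wants to run the same kind of check directly on the raw logfiles, here is a minimal Python sketch that adds up the bytes served to each user agent. It assumes the server writes the Apache "combined" log format, which this post does not confirm, and the sample lines (addresses, sizes, agent strings) are made up purely for illustration.

```python
import re
from collections import Counter

# Assumed layout (Apache combined log format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?P<bytes>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def bytes_by_agent(lines):
    """Sum the response bytes per user-agent string."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        size = match.group("bytes")
        # A "-" in the bytes field means no response body was sent.
        totals[match.group("agent")] += 0 if size == "-" else int(size)
    return totals


# Made-up sample lines, for illustration only.
sample = [
    '192.0.2.10 - - [01/Jan/2009:00:00:01 +0000] "GET /blog/feed/ HTTP/1.1" '
    '200 57344 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"',
    '192.0.2.10 - - [01/Jan/2009:00:01:02 +0000] "GET /blog/feed/ HTTP/1.1" '
    '200 57344 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"',
    '198.51.100.7 - - [01/Jan/2009:00:01:30 +0000] "GET / HTTP/1.1" '
    '200 1200 "-" "Mozilla/5.0"',
    'this line is not a log entry',
]

totals = bytes_by_agent(sample)
for agent, total in totals.most_common():
    print(f"{total:>10} bytes  {agent}")
```

To try it on a real file, pass an open file handle instead of the sample list, for example `bytes_by_agent(open("access.log"))`, where `access.log` is a placeholder name.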

I soon found the problem – the bot was requesting my blog feed about once every minute, at about 56 KB a time. At that rate it adds up pretty quickly. I first tried adding a line to the robots.txt file to restrict the frequency of bot requests, but this had no effect. Next I tried blocking the blog feed directory, but again the bot kept on requesting the feed. Eventually I was forced to try blocking msnbot from the entire site and, somewhat to my surprise, it worked – I had been ready to use the .htaccess file to deny the bot any access to the server at all.
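The exact robots.txt lines are not reproduced in this post, so the rule below is only a guess at what a site-wide block for msnbot might look like. The Python sketch feeds that assumed rule to the standard library's `urllib.robotparser` to confirm that it shuts msnbot out while leaving other crawlers alone. The `/blog/feed/` path and the Googlebot comparison are illustrative assumptions too.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: block msnbot everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical feed path, used only to exercise the rules.
msnbot_allowed = parser.can_fetch("msnbot", "/blog/feed/")
googlebot_allowed = parser.can_fetch("Googlebot", "/blog/feed/")

print("msnbot may fetch the feed:", msnbot_allowed)
print("Googlebot may fetch the feed:", googlebot_allowed)
```

This only shows how a well-behaved parser reads the rules. Whether a given bot actually honours them can only be seen in the logfiles.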

Denying the bot for a while until Microsoft sort it out obviously means that I’ll lose rankings in MSN/Live, but the traffic from it is minuscule anyway compared to Google – 32 referrals compared to 828 for that site – so I can live with that if it means not getting hit for a GB of pointless bandwidth in two weeks.
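As a rough sanity check, the figures quoted in this post hang together. The short Python calculation below assumes exactly one request per minute at exactly 56 KB each for a full 14 days. Both inputs are approximations, so the result is only a ballpark figure.

```python
FEED_KB = 56                 # approximate size of one feed response
REQUESTS_PER_DAY = 60 * 24   # one request per minute, around the clock
DAYS = 14                    # the two-week period mentioned in the post

total_kb = FEED_KB * REQUESTS_PER_DAY * DAYS
total_gb = total_kb / (1024 * 1024)

print(f"{total_kb} KB, or roughly {total_gb:.2f} GB")
```

That comes out at a little over 1 GB, in the same region as the usage AWStats reported.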

Get to know your logfiles – it could save you a lot of problems.
