My friend Matt has a somewhat sensitive job. He keeps a blog here on Tabulas, but he'd rather not have search engines pick up his information (normal privacy concerns). So he logically set the option that Tabulas has: "Prevent search engines from crawling my site." This inserts a <meta name="robots" content="noindex, nofollow" /> into the header of his document, which kindly asks search engines not to index this particular page or follow the links on it.
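For the curious, checking whether a page actually carries that tag is easy to do by hand. Here is a minimal Python sketch using only the standard library's html.parser. The names RobotsMetaParser and robots_directives are made up for illustration; this is not what Tabulas or any search engine actually runs.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here,
        # because the default handle_startendtag calls handle_starttag.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots meta directives present in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = (
        '<html><head><meta name="robots" content="noindex, nofollow" />'
        "</head><body>hello</body></html>"
    )
    print(sorted(robots_directives(page)))  # ['nofollow', 'noindex']
```

A well-behaved crawler that sees noindex in that set should leave the page out of its results, and one that sees nofollow should not follow the page's links.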

Matt recently told me that Live.com was picking up his Tabulas (my previous review on live.com here), so I went to investergate (that word sounds pretty cool, donchathink?).

Google and Yahoo! have absolutely no problem not displaying the site, because they respect that tag. But let's check out MSN/Live.com: 3rd result.

Now, in MSN's defense, it doesn't look like they're actually crawling the site, just indexing the link. How is this any better? The intent of noindex,nofollow is: "I don't want my stuff indexed or cached by any search engine, ever." I hope this is a bug and not normal operating procedure for the MSN team; otherwise I'm going to be very disappointed. Why would you completely disrespect a gold standard that web publishers and search engines have both supported for years?

MSN/Microsoft, please play nice. Scoble's been doing such a good job earning you guys positive karma with the developer community... just don't shaft me like this.

I may only have 14,000 URLs in MSN's database, but I have absolutely no qualms about completely blocking MSN's search bots from Tabulas if they cannot respect noindex,nofollow. I'll do it for my users.
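One way such a block might look is a robots.txt rule aimed at MSN's crawler. The sketch below is an assumption, not the actual Tabulas configuration: it assumes the crawler identifies itself as msnbot, the URL is a placeholder, and the helper name build_rules is made up. It uses Python's standard urllib.robotparser to confirm the rules do what they are meant to.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out msnbot entirely, allow everyone else.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /

User-agent: *
Disallow:
"""


def build_rules(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


if __name__ == "__main__":
    rules = build_rules(ROBOTS_TXT)
    url = "http://example.com/some-entry"  # placeholder URL
    for agent in ("msnbot", "Googlebot"):
        verdict = "allowed" if rules.can_fetch(agent, url) else "blocked"
        print(agent, verdict)
```

Keep in mind that robots.txt only tells a bot what it may fetch. A search engine that lists bare links without crawling them could still show a URL, so this is a blunt instrument rather than a guaranteed fix.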

Posted by roy on March 21, 2006 at 06:50 PM in Ramblings | 2 Comments


Comment posted on March 21st, 2006 at 10:41 PM
your lifeblood is google anyway =P down with big evil!

user (guest)

Comment posted on March 21st, 2006 at 10:32 PM
<a href="http://www.mattcutts.com/blog/googlebot-keep-out/" rel="nofollow">http://www.mattcutts.com/blog/googlebot-keep-out/</a>

That's some good reading from Google's point of view.