The recent C2C battle between Baidu and Taobao has been intense. Taobao blocked Baidu's web crawler, and Sohu Blogs, 51.com, Xiaonei (a social networking site), and Hainei soon followed suit in blocking Baidu's spider. The blocking is done through robots.txt, the plain-text file at a site's root that tells crawlers which paths they may fetch. Have you ever thought about checking these sites' robots.txt files for yourself?
If you're curious (or just plain bored), here are the addresses where you can take a look:
Taobao: http://www.taobao.com/robots.txt
-----------------------------------------------------Evil dividing line------------------------
User-agent: Baiduspider
Disallow: /
User-agent: baiduspider
Disallow: /
-----------------------------------------------------Evil dividing line------------------------
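Whether a given crawler is actually shut out by rules like these can be checked with Python's standard `urllib.robotparser` module. Here is a small sketch feeding it the Taobao rules quoted above (the URLs are just examples, not anything the file itself mentions):

```python
from urllib import robotparser

# The rules Taobao serves (quoted above): Baiduspider is blocked everywhere.
rules = [
    "User-agent: Baiduspider",
    "Disallow: /",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Baiduspider is denied the entire site...
print(rp.can_fetch("Baiduspider", "http://www.taobao.com/"))  # False
# ...while a crawler not named in the file falls back to "allowed".
print(rp.can_fetch("Googlebot", "http://www.taobao.com/"))    # True
```

Note the default: a crawler that matches no `User-agent` line (and finds no `User-agent: *` section) is allowed everything, which is why only Baidu is affected.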
Xiaonei: http://www.xiaonei.com/robots.txt
-----------------------------------------------------Evil dividing line------------------------
# Robots.txt file from http://www.xiaonei.com
# All robots will spider the domain
User-agent: BaiduSpider
Disallow: /
-----------------------------------------------------Evil dividing line------------------------
Hainei: http://www.hainei.com/robots.txt
That one is fierce: its robots.txt declares every directory off-limits to every search engine crawler…
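The Hainei file itself isn't quoted here, but a blanket ban of that sort is conventionally written with the wildcard user-agent, along these lines:

```
User-agent: *
Disallow: /
```

`*` matches every crawler, and `Disallow: /` covers every path under the site root, so nothing is left for any spider to index.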