Barack’s Robots.txt: Google, Yes You Can!
Barack Obama’s new administration said “Yes You Can” to Googlebot on its first day in office, as reported by Jason Kottke.
According to Kottke, up until yesterday, the whitehouse.gov robots.txt file (the file on a website that tells search engine crawlers which paths they should not crawl) contained approximately 2,400 exclusions, including the following:
User-agent: *
Disallow: /cgi-bin
Disallow: /search
Disallow: /query.html
Disallow: /omb/search
Disallow: /omb/query.html
Disallow: /expectmore/search
Disallow: /expectmore/query.html
Disallow: /results/search
Disallow: /results/query.html
Disallow: /earmarks/search
Disallow: /earmarks/query.html
Disallow: /help
Disallow: /360pics/text
Disallow: /911/911day/text
Disallow: /911/heroes/text
Disallow: /911/messages/text
Disallow: /911/patriotism/text
Disallow: /911/patriotism2/text
Disallow: /911/progress/text
Disallow: /911/remembrance/text
Disallow: /911/response/text
Disallow: /911/sept112002/text
Disallow: /911/text
Disallow: /ConferenceAmericas/text
Disallow: /GOVERNMENT/text
Disallow: /africanamerican/text
Disallow: /africanamericanhistory/text
Disallow: /agencycontact/text
Disallow: /appointments/text
Disallow: /avian/text
Disallow: /avianflu/text
Disallow: /bioshield/text
Disallow: /birdflu/text
Disallow: /blackhistory/text
Disallow: /budget/text
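For anyone curious how a crawler reads rules like these, here is a minimal sketch using Python's standard urllib.robotparser module. It only loads a handful of the lines quoted above, not the full file. The host name www.example.gov and the /news/ path are made up for illustration. Each Disallow line is treated as a path prefix, so anything beginning with a listed path is off limits and anything else is allowed.

```python
from urllib.robotparser import RobotFileParser

# A few of the rules quoted above (a small sample, not the full file).
OLD_RULES = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /search
Disallow: /911/heroes/text
Disallow: /budget/text
"""

# Hypothetical host, used only so that we can build complete URLs.
HOST = "http://www.example.gov"


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text held in a string (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


old_policy = build_parser(OLD_RULES)

# /news/ is an invented path that no rule covers.
for path in ("/911/heroes/text/index.html", "/budget/text", "/news/"):
    allowed = old_policy.can_fetch("Googlebot", HOST + path)
    print(path, "allowed" if allowed else "blocked")
```

The first two paths come out as blocked and the invented /news/ path as allowed, because "User-agent: *" applies to every crawler, Googlebot included.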
So, Google, with the Bush administration, do not ‘expect more’… Being from across the pond, it is not my place to comment on whether one could expect more from Bush, or on what was actually delivered. However, I think the world is excited by the prospect and the potential promise of a truly multicultural president in the highest office of government. I digress.
Bush’s robots.txt says disallow information on 9/11 heroes, messages and patriotism, which to me seems sad. OK, maybe they don’t want to rank in the search engines for words within the messages left, but isn’t this an important legacy? 9/11 is about many things, which are fiercely debated throughout the web, but one of the biggest takeaways is the demonstration of the spirit of the American people, which words struggle to sum up: courage, tenacity, kindness, selflessness, support.
Google, do not include information on our ‘bioshield’ (intriguing), our ‘avianflu’ or our ‘budget’. Need we say more? But with the new dawn in US politics comes the launch of a new robots.txt and new ‘accessibility’ for the search engines.
Apparently Barack Obama’s team, perhaps in an effort towards transparency(?), has removed all of these exclusions and now has a robots.txt file which allows Googlebot to visit everything on the site apart from the /include/ directory. Interesting.
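To picture what that means for a crawler, here is a rough sketch, again using Python's standard urllib.robotparser. The two-line policy below is my own reconstruction from the description, not a copy of the real file, and www.example.gov and header.html are invented names.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the new policy: everything open except /include/.
NEW_RULES = """\
User-agent: *
Disallow: /include/
"""

new_policy = RobotFileParser()
new_policy.parse(NEW_RULES.splitlines())


def is_allowed(path: str, agent: str = "Googlebot") -> bool:
    """Return True if the assumed policy lets `agent` fetch `path`."""
    # www.example.gov is a placeholder host, not the real site.
    return new_policy.can_fetch(agent, "http://www.example.gov" + path)


print(is_allowed("/911/heroes/text/"))     # formerly excluded path
print(is_allowed("/include/header.html"))  # still excluded
```

Under this assumed policy the first check comes back True and the second False, so a path that used to be excluded is now open to Googlebot while anything under /include/ stays hidden.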
Looking at how other people are covering this on the web, starting from Jason’s post, I smiled at part of Ron’s comment on shimonsandler.com: “…would I need to disallow robots.txt file in my robots.txt file so the robots.txt file isn’t indexed?” It was possibly a serious comment, but amusing nonetheless.
Thank you so much to Web Accessibility Guru Ian, who forwarded this on to me this morning. It tickled me so much that I had to commit ‘fingers to keys’ and share it straight away.
Looking forward to the new edition of Build Your Own Web Site, Ian.