interesting-people message



Subject: [IP] New whitehouse.gov robots.txt file




Begin forwarded message:

From: Joseph Lorenzo Hall <joehall@gmail.com>
Date: January 21, 2009 8:03:57 AM EST
To: Dave Farber <dave@farber.net>
Subject: New whitehouse.gov robots.txt file

(see here: http://www.kottke.org/09/01/the-countrys-new-robotstxt-file
via Aaron Burstein)

Hi Dave,

Here's another fascinating sign of increased transparency in the new
administration:

The whitehouse.gov robots.txt file -- a file that specifies which areas
of a web site web spiders may crawl[1] -- has gone from 2400
lines to just two lines:

  User-agent: *
  Disallow: /includes/

This means that most of whitehouse.gov will now be available to search
engines and other web services that use automated crawlers to
retrieve and index content.
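
For anyone who wants to check what those two lines permit, here is a
minimal sketch using Python's standard urllib.robotparser module. Only
the two robots.txt lines come from the file quoted above; the crawler
name "ExampleBot" and the sample paths are made up for illustration
and are not taken from the real site.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt quoted in the message.
ROBOTS_TXT = """\
User-agent: *
Disallow: /includes/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical crawler name and paths, chosen only to show the rule.
for path in ("/", "/briefing_room/", "/includes/header.html"):
    verdict = "allowed" if parser.can_fetch("ExampleBot", path) else "blocked"
    print(f"{path}: {verdict}")
```

Under these assumptions, only paths beginning with /includes/ are
reported as blocked; every other path is open to any crawler that
honors the file.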

best, Joe

[1]: http://en.wikipedia.org/wiki/Robots.txt

--
Joseph Lorenzo Hall
ACCURATE Postdoctoral Research Associate
UC Berkeley School of Information
Princeton Center for Information Technology Policy
http://josephhall.org/



