Here is a small practical hint on how to control the way web crawlers (robots, spiders) see your web site. The meta tags below go in the head section of your HTML.
content="robots-terms" is a comma-separated list used in the Robots META tag. It may contain one or more of the following keywords, without regard to case: noindex, nofollow, all, index and follow.
Page may not be indexed by a search service.
<meta name="robots" content="noindex">
Robots are not to follow links from this page.
<meta name="robots" content="nofollow">
Admin Note: The robots directives "index, follow" and "all" are not required, as they describe the default behavior of indexing spiders.
<meta name="robots" content="index, follow">
<meta name="robots" content="all">
Robots are welcome to include this page in search services.
<meta name="robots" content="index">
Robots are welcome to follow links from this page to find other pages.
<meta name="robots" content="follow">
If this meta tag is missing, or if there is no content, or the robot terms are not specified, then the robot terms will be assumed to be "index, follow" (i.e. "all"). If the keyword all is found in the robots terms list, it overrides all other values. That is, a robots terms list of “nofollow, all, noindex, nofollow” would effectively be “all”.
If the robots terms contain contradictory information (e.g. "follow, nofollow, follow"), then the robot is free to do whatever it wishes with regard to the behavior being addressed (in this case the follow behavior).
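The resolution rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not how any particular crawler is implemented; the function name resolve_robots_terms and the use of None for "contradictory, robot's choice" are assumptions made for this example.

```python
from typing import Dict, Optional


def resolve_robots_terms(content: Optional[str]) -> Dict[str, Optional[bool]]:
    """Resolve a robots META content string into index/follow decisions.

    True = allowed, False = not allowed, None = contradictory terms,
    so the robot is free to choose (hypothetical convention for this sketch).
    """
    # Split on commas, ignore case and surrounding whitespace.
    terms = [t.strip().lower() for t in (content or "").split(",") if t.strip()]

    # A missing or empty list defaults to "index, follow";
    # the keyword "all" overrides every other value.
    if not terms or "all" in terms:
        return {"index": True, "follow": True}

    def decide(positive: str, negative: str) -> Optional[bool]:
        if positive in terms and negative in terms:
            return None  # contradictory: behavior is left to the robot
        return negative not in terms  # default is to allow

    return {
        "index": decide("index", "noindex"),
        "follow": decide("follow", "nofollow"),
    }


print(resolve_robots_terms("nofollow, all, noindex, nofollow"))
print(resolve_robots_terms("follow, nofollow, follow"))
```

The first call shows "all" winning over the other keywords; the second leaves the follow decision undefined while indexing stays at its default.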
There is also another way to control robots and crawlers: a robots.txt file in your webroot. The file must be reachable as www.yoursite.com/robots.txt, which means it has to sit in the root of your website.
The content of a robots.txt file might look like this:
# robots.txt generated at http://www.mcanerin.com
User-agent: Googlebot
Disallow: /

User-agent: googlebot-image
Disallow: /

User-agent: *
Disallow:
Disallow: /cgi-bin/

Sitemap: http://www.yoursite.com/sitemap.gz
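To see how a crawler might read such a file, here is a minimal sketch using Python's standard urllib.robotparser module. The agent name "SomeBot" and the helper allowed() are made up for this example. It only checks the unambiguous cases: how an empty Disallow line followed by a more specific one is interpreted can differ between parsers, so /cgi-bin/ is deliberately not checked here.

```python
from urllib.robotparser import RobotFileParser

# The example file from above, one directive per line.
ROBOTS_TXT = """\
# robots.txt generated at http://www.mcanerin.com
User-agent: Googlebot
Disallow: /

User-agent: googlebot-image
Disallow: /

User-agent: *
Disallow:
Disallow: /cgi-bin/

Sitemap: http://www.yoursite.com/sitemap.gz
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if this parser lets `agent` fetch `path` (hypothetical helper)."""
    return parser.can_fetch(agent, path)


print("Googlebot /index.html:", allowed("Googlebot", "/index.html"))
print("SomeBot   /index.html:", allowed("SomeBot", "/index.html"))
```

With this file, Googlebot is shut out of the whole site, while a crawler that matches only the * group may fetch ordinary pages.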
You might also use a good generator like this one:
Everything you have to know about robots can be found at http://www.robotstxt.org