"Paul Zhao" <ppzhao (AT) aol (DOT) com> wrote:
Quote:
My question before was when you do site:washingtonpost.com,
none of the pages are cached. My original finding is the tags
below:
<META CONTENT="NO-CACHE" HTTP-EQUIV="CACHE-CONTROL">
<META CONTENT="NO-CACHE" HTTP-EQUIV="PRAGMA">
Those tags are aimed at the web browser on the client side and at
any proxy server that caches content for a network of systems. A
browser or proxy that honors them does not serve the page from its
own cache. The browser sends a fresh request every time, and if a
proxy server sits between the client computer and the source website,
the proxy forwards that request to the original source instead of
answering from its cache. In practice most proxies look only at the
real HTTP response headers rather than at tags inside the HTML, so
the meta versions mainly affect browsers. Either way, none of this has
anything to do with Google caching the content of the page.
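To make the distinction concrete, here is a minimal Python sketch that pulls out only the http-equiv meta tags, i.e. the ones a browser would treat like response headers. The names HttpEquivCollector and http_equiv_directives are made up for illustration, and this is not how any particular browser or proxy is implemented.

```python
from html.parser import HTMLParser


class HttpEquivCollector(HTMLParser):
    """Collect <meta http-equiv=... content=...> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if "http-equiv" in attr_map:
            key = attr_map["http-equiv"].strip().lower()
            self.directives[key] = attr_map.get("content", "").strip().lower()


def http_equiv_directives(html):
    """Return {header-name: value} for every http-equiv meta tag (hypothetical helper)."""
    collector = HttpEquivCollector()
    collector.feed(html)
    return collector.directives


page = '''<META CONTENT="NO-CACHE" HTTP-EQUIV="CACHE-CONTROL">
<META CONTENT="NO-CACHE" HTTP-EQUIV="PRAGMA">'''

print(http_equiv_directives(page))
# prints {'cache-control': 'no-cache', 'pragma': 'no-cache'}
```

Nothing in that result is addressed to a search engine robot, which is the point: these are browser-side cache hints only.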
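Whether Google keeps a cached copy is controlled by robots.txt
or by a robots meta tag. This one keeps the page out of the index
entirely, and therefore out of the cache as well:
<meta name="robots" content="noindex,nofollow" />
Google honors a meta tag addressed to its own crawler as well:
<meta name="googlebot" content="noindex" />
To keep a page indexed but without a cached copy, which matches
what you describe, use noarchive instead:
<meta name="googlebot" content="noarchive" />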
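As a rough illustration of how a robot might read these tags, the Python sketch below collects the directives from name="robots" and name="googlebot" meta tags and decides whether a cached copy would be allowed. The names RobotsMetaCollector, robots_directives and cached_copy_allowed are hypothetical, and the rule that noindex, noarchive or none blocks the cached copy is my reading of Google's documented directives, not Googlebot's actual code.

```python
from html.parser import HTMLParser


class RobotsMetaCollector(HTMLParser):
    """Collect directives from <meta name="robots"> and bot-specific tags."""

    def __init__(self, bot_names):
        super().__init__()
        self.bot_names = bot_names
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() in self.bot_names:
            # content is a comma-separated list such as "noindex,nofollow"
            for token in attr_map.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def robots_directives(html, bot="googlebot"):
    """Return the set of directives that apply to the given bot."""
    collector = RobotsMetaCollector({"robots", bot.lower()})
    collector.feed(html)
    return collector.directives


def cached_copy_allowed(html, bot="googlebot"):
    """Assumption: noindex, noarchive or none each rule out a cached copy."""
    blocked = {"noindex", "noarchive", "none"}
    return not (robots_directives(html, bot) & blocked)


print(cached_copy_allowed('<META CONTENT="NO-CACHE" HTTP-EQUIV="PRAGMA">'))
print(cached_copy_allowed('<meta name="robots" content="noindex,nofollow" />'))
```

The first call prints True, since an http-equiv tag says nothing to a robot, and the second prints False.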
See:
http://www.google.com/webmasters/bot.html
Hope this helps.
--
Jim Carlock
http://www.aquaticcreationsnc.com/
Post replies to the group.