HighDots Forums  

Meta robots noarchive

Search Engine Optimization Discussion about SEO/Search Engine Optimization (alt.internet.search-engines)


#1
MyndPhlyp

Meta robots noarchive - 11-06-2004, 10:45 AM






I've been looking ... and looking ... and looking, but I have yet to find an
authoritative and official source defining the robots meta tag's valid
content values and combination with respect to noarchive.

Google's information hints that noarchive can be combined with index|noindex
and follow|nofollow, as do several other unofficial sites and forums.

robotstxt.org knows nothing about noarchive, but that is to be expected since
it really pertains only to the robots.txt file.

w3.org mentions the robots meta tag but lists only index|noindex and
follow|nofollow as possibilities.

Is something like the following valid?

<meta name="robots" content="index, nofollow, noarchive">

Or is noarchive really only a Google thing that should have its own meta tag
like the following?

<meta name="googlebot" content="noarchive">

And if the latter is true, how does one go about indicating noarchive to all
the other bots?

I'd also really appreciate it if somebody could post a URL to an
authoritative and official source for this.



#2
Big Bill

Re: Meta robots noarchive - 11-06-2004, 12:17 PM






On Sat, 06 Nov 2004 15:45:00 GMT, "MyndPhlyp" <nobody (AT) homeright (DOT) now>
wrote:

Quote:
I've been looking ... and looking ... and looking, but I have yet to find an
authoritative and official source defining the robots meta tag's valid
content values and combination with respect to noarchive.

Google's information hints that noarchive can be combined with index|noindex
and follow|nofollow as do several other unofficial sites and forums.

robotstxt.org knows nothing about noarchive but that is to be expected since
it really pertains only to the robots.txt file.

And these aren't robots.txt files we're discussing, that's something
completely different. I shouldn't bother with robots meta tags, as the
engines seem to have a kind of "Yeah well, maybe... (pauses for sulk)... if
we're in the mood then we MIGHT take notice of them, K?"
bratty-teenager attitude towards them. Which for our devoutly
professional purposes is no good at all. So forget about them.

BB
www.kruse.co.uk SEO (AT) kruse (DOT) demon.co.uk


#3
MyndPhlyp

Re: Meta robots noarchive - 11-06-2004, 01:24 PM




"Big Bill" <kruse (AT) cityscape (DOT) co.uk> wrote

Quote:
On Sat, 06 Nov 2004 15:45:00 GMT, "MyndPhlyp" <nobody (AT) homeright (DOT) now>
wrote:

I've been looking ... and looking ... and looking, but I have yet to find an
authoritative and official source defining the robots meta tag's valid
content values and combination with respect to noarchive.

Google's information hints that noarchive can be combined with index|noindex
and follow|nofollow as do several other unofficial sites and forums.

robotstxt.org knows nothing about noarchive but that is to be expected since
it really pertains only to the robots.txt file.

And these aren't robots.txt files we're discussing, that's something
completely different.

Right. I'm only interested in the <meta name="robots">, specifically the
so-called standard "noarchive" value in the "content=" portion, and whether
or not it is valid to use in conjunction with (no)index and (no)follow.
There is lots of discussion out there, but nothing official and
authoritative. Everybody appears to be parroting each other. (Yeah, like
that never happens on the 'Net.) I would expect W3 to have the lowdown on
this, but they don't seem to mention "noarchive" as a possible option for
this <meta>, leading me to believe it is only a quasi-standard, loosely
adopted (if that term can even be used).




#4
Big Bill

Re: Meta robots noarchive - 11-06-2004, 03:09 PM



On Sat, 06 Nov 2004 18:24:08 GMT, "MyndPhlyp" <nobody (AT) homeright (DOT) now>
wrote:

Quote:
"Big Bill" <kruse (AT) cityscape (DOT) co.uk> wrote in message
news:5e1qo09stdu753n5jpco3uqvn18unonfp2 (AT) 4ax (DOT) com...
On Sat, 06 Nov 2004 15:45:00 GMT, "MyndPhlyp" <nobody (AT) homeright (DOT) now>
wrote:

I've been looking ... and looking ... and looking, but I have yet to find an
authoritative and official source defining the robots meta tag's valid
content values and combination with respect to noarchive.

Google's information hints that noarchive can be combined with index|noindex
and follow|nofollow as do several other unofficial sites and forums.

robotstxt.org knows nothing about noarchive but that is to be expected since
it really pertains only to the robots.txt file.

And these aren't robots.txt files we're discussing, that's something
completely different.

Right. I'm only interested in the <meta name="robots">, specifically the
so-called standard "noarchive" value in the "content=" portion, and whether
or not it is valid to use in conjunction with (no)index and (no)follow.
There is lots of discussion out there, but nothing official and
authoritative. Everybody appears to be parroting each other. (Yeah, like
that never happens on the 'Net.) I would expect W3 to have the lowdown on
this, but they don't seem to mention "noarchive" as a possible option for
this <meta>, leading me to believe it is only a quasi-standard, loosely
adopted (if that term can even be used).

Isn't that what I said?

BB
www.kruse.co.uk SEO (AT) kruse (DOT) demon.co.uk


#5
David Off

Re: Meta robots noarchive - 11-06-2004, 05:34 PM



MyndPhlyp wrote:
Quote:
Right. I'm only interested in the <meta name="robots">

I see Big Bill's point. How robots actually interpret the robots meta
tag can differ from the standard (for example, they may ignore it).

For Google the following seems to be true:

First of all you can use googlebot as the name if you wish to restrict
this action just to Google:

<meta name="googlebot" content="robots-terms">

Googlebot understands the following terms: noindex, nofollow, and
noarchive. The tag should, of course, be placed in the <HEAD> section of
your HTML file. The terms are given as a comma-separated list.

noindex: the document should not be indexed by Googlebot.

nofollow: hyperlinks within the document will not be followed by
Googlebot. Hmmm, sneaky; could be useful for all you folks worried about
PageRank dilution :-)

noarchive: don't store the document in Google's cache.

nosnippet: as noarchive, but also do not return a snippet of the document
with the query terms highlighted in the SERPs. Not obeyed by Google.

By default the terms are "index, follow"
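
To make the description above concrete, here is a minimal Python sketch of how a robot might read these tags. Everything in it is an assumption for illustration, not Google's actual behaviour: the class name `RobotsMetaCollector`, the function `policy_for`, the choice to merge the generic "robots" tag with the bot-specific one, and the idea that archiving is allowed unless "noarchive" appears.

```python
from html.parser import HTMLParser


class RobotsMetaCollector(HTMLParser):
    """Collect terms from <meta name="robots"> and one bot-specific tag.

    Hypothetical helper: merging both tags into one set is an assumption.
    """

    def __init__(self, bot_name="googlebot"):
        super().__init__()
        self.names = {"robots", bot_name.lower()}
        self.terms = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when written without a value.
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() not in self.names:
            return
        # The content value is a comma-separated list of terms.
        for term in attr.get("content", "").split(","):
            term = term.strip().lower()
            if term:
                self.terms.add(term)


def policy_for(html, bot_name="googlebot"):
    """Return the permissions implied by the page's robots meta tags."""
    collector = RobotsMetaCollector(bot_name)
    collector.feed(html)
    terms = collector.terms
    # Start from the "index, follow" default; a "no..." term switches
    # the matching permission off.
    return {
        "index": "noindex" not in terms,
        "follow": "nofollow" not in terms,
        "archive": "noarchive" not in terms,
    }
```

Calling `policy_for(page)` on a page that carries both a "robots" tag and a "googlebot" tag returns one combined set of permissions for Googlebot, while a robot with a different name would only see the generic tag.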


#6
MyndPhlyp

Re: Meta robots noarchive - 11-07-2004, 01:59 AM




"David Off" <david.off_dumpthisbit_ (AT) voila (DOT) fr> wrote

Quote:

Okay, so the drift I'm getting here is that "noarchive" is a loose
adaptation/enhancement to the <meta name="robots"> tag, and that anything
outside the (no)index and (no)follow parameters is nonstandard. Ya?

Any firsthand knowledge of the impact from including nonstandard parameters
along with standard parameters in the <meta name="robots"> tag? For example,
if an arbitrary SE finds a <meta name="robots"> with "index, nofollow,
noarchive", will (should) it dismiss the entire <meta name="robots"> or
simply ignore the "noarchive" parameter?

Yeah, I know - each SE operates on its own set of rules, and what might be
tolerable for or respected by one SE is not an indication of what other SEs
may tolerate or respect. I'm just trying to get a grip on what the SOP is
supposed to be for these types of situations. Whether or not all SEs follow
the rules is less of an issue than what the rules really are. W3 goes
through some pretty lengthy stuff on how to render HTML to the display, but
they don't seem to have that level of detail on the <meta> tags.

If there is another official source that might answer my ponderings, by all
means point me in that direction.




#7
MyndPhlyp

Re: Meta robots noarchive - 11-07-2004, 02:32 AM



Well, I managed to stumble upon some antique stuff (vintage 1996) on the
subject.

http://www.w3.org/Search/9605-Indexi.../Spidering.txt

http://www.kollar.com/robots.html

A few of the SE's got together with W3 (or vice versa) and got into a
similar discussion by the looks of things. The key phrase that stuck in my
mynd is that adding robot-specific permissions (e.g., "noarchive") to the
standard has been dropped.

Still looking for processing rules for <meta> tags though. I'm curious
whether the entire <meta> is supposed to be dropped if a nonstandard
parameter exists or if only the nonstandard parameter is supposed to be
ignored.

(Yes, I'm still aware that standards may or may not be followed and that
standards may or may not be interpreted correctly. I've accepted that as a
fact of life ever since I started in on RS-232 only to discover the only
part of that standard faithfully followed is pins 2, 3, and 7.)
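
The two candidate behaviours in question (drop the whole tag, or ignore only the nonstandard parameter) can be sketched in Python as follows. This is only an illustration of the open question: nothing cited in this thread says which one a robot is supposed to use, and the names `STANDARD_TERMS`, `lenient`, and `strict` are made up for the example.

```python
# Assumption: only the (no)index and (no)follow values count as standard.
STANDARD_TERMS = {"index", "noindex", "follow", "nofollow"}


def split_terms(content):
    """Split a content value such as "index, nofollow" into clean terms."""
    return [t.strip().lower() for t in content.split(",") if t.strip()]


def lenient(content):
    """Keep the terms the robot recognises and silently skip the rest."""
    return [t for t in split_terms(content) if t in STANDARD_TERMS]


def strict(content):
    """Discard the entire tag if any single term is unrecognised."""
    terms = split_terms(content)
    if all(t in STANDARD_TERMS for t in terms):
        return terms
    return []
```

With the content value "index, nofollow, noarchive", `lenient` keeps "index" and "nofollow", while `strict` returns an empty list, so the page would fall back to whatever the robot's defaults are.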



#8
David Off

Re: Meta robots noarchive - 11-07-2004, 06:54 AM



MyndPhlyp wrote:
Quote:
Well, I managed to stumble upon some antique stuff (vintage 1996) on the
subject.

http://www.w3.org/Search/9605-Indexi.../Spidering.txt

http://www.kollar.com/robots.html

A few of the SE's got together with W3 (or vice versa) and got into a
similar discussion by the looks of things. The key phrase that stuck in my
mynd is that adding robot-specific permissions (e.g., "noarchive") to the
standard has been dropped.

Which is probably why nosnippet, which I think was a Googlism, was dropped.

Quote:
Still looking for processing rules for <meta> tags though. I'm curious
whether the entire <meta> is supposed to be dropped if a nonstandard
parameter exists or if only the nonstandard parameter is supposed to be
ignored.
Hmmm good question - one for the reverse engineering guys. I will try it
on one of my pages too and let you know the results.

Quote:
I started in on RS-232 only to discover the only
part of that standard faithfully followed is pins 2, 3, and 7.)

You are lucky that Microsoft were never interested in the RS-232
interface! Still, it is a prime example of something overengineered.


#9
MyndPhlyp

Re: Meta robots noarchive - 11-07-2004, 01:09 PM




"David Off" <david.off_dumpthisbit_ (AT) voila (DOT) fr> wrote

Quote:
Still looking for processing rules for <meta> tags though. I'm curious
whether the entire <meta> is supposed to be dropped if a nonstandard
parameter exists or if only the nonstandard parameter is supposed to be
ignored.

Hmmm good question - one for the reverse engineering guys. I will try it
on one of my pages too and let you know the results.
Reverse engineering, or laying a trap? :-)

Quote:
I started in on RS-232 only to discover the only
part of that standard faithfully followed is pins 2, 3, and 7.)

You are lucky that Microsoft were never interested in the RS232
interface! Still it is the prime example of something overengineered.
You just HAD to mention the "M" word. That bunch of misfits couldn't follow
a standard if they were lashed to the lead horse. They can't even follow
their OWN internal so-called standards.




#10
MyndPhlyp

Re: Meta robots noarchive (ping: David Off) - 11-12-2004, 08:01 AM




"David Off" <david.off_dumpthisbit_ (AT) voila (DOT) fr> wrote

Quote:
MyndPhlyp wrote:
Well, I managed to stumble upon some antique stuff (vintage 1996) on the
subject.


http://www.w3.org/Search/9605-Indexi.../Spidering.txt

http://www.kollar.com/robots.html

A few of the SE's got together with W3 (or vice versa) and got into a
similar discussion by the looks of things. The key phrase that stuck in my
mynd is that adding robot-specific permissions (e.g., "noarchive") to the
standard has been dropped.

which is probably why nosnippet, which I think was a googlism was dropped.


Still looking for processing rules for <meta> tags though. I'm curious
whether the entire <meta> is supposed to be dropped if a nonstandard
parameter exists or if only the nonstandard parameter is supposed to be
ignored.

Hmmm good question - one for the reverse engineering guys. I will try it
on one of my pages too and let you know the results.
Did you arrive at a conclusion after setting your traps?



