HighDots Forums  

Google ignoring noindex META Tag

Search Engine Optimization Discussion about SEO/Search Engine Optimization (alt.internet.search-engines)





#1
Rik

Google ignoring noindex META Tag - 06-07-2006, 05:54 PM






I have recently noticed some of my pages showing up in the Google cache
even though the pages contained a "noindex" META tag. These are private
pages for inter-office use and are not meant for public display.

Is there another META tag that will prevent Google from caching these
pages?

Since the pages are not meant for public view, I have just renamed the
files so anyone who clicks them from Google will just get my "not
found" page.

My problem is that I really have no way to keep track of the pages on
which Google has ignored my noindex META tag. I have now included the
noarchive META tag in the hope that Googlebot might understand that one.

Any suggestions?

Rik

P.S. I posted this question in the public forum but received no
suggestions, so I'm hoping someone in here may have run into this
problem. Please pardon my cross-post.


#2
Paul

Re: Google ignoring noindex META Tag - 06-07-2006, 06:03 PM







What about password-protected files?
plh
Paul

--

----== Posted via Newsfeeds.Com - Unlimited-Unrestricted-Secure Usenet News==----
http://www.newsfeeds.com The #1 Newsgroup Service in the World! 120,000+ Newsgroups
----= East and West-Coast Server Farms - Total Privacy via Encryption =----


#3
Big Bill

Re: Google ignoring noindex META Tag - 06-07-2006, 07:25 PM




Use robots.txt to exclude the files. You can't do this with the old
ones, but you can with any new ones you make.

You know robots.txt?

Have a read:

http://www.robotstxt.org/wc/robots.html
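
As a rough illustration of the robots.txt suggestion, here is a minimal Python sketch that uses the standard library's urllib.robotparser to check which paths a well-behaved crawler would skip. The robots.txt content and every path in it are made up for this example; nothing here comes from the poster's actual site, and it only models crawlers that choose to obey the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the paths are invented for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /reports/internal-memo.html
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if a compliant crawler may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/private/nav.html",
                 "/reports/internal-memo.html",
                 "/reports/public.html"):
        print(path, is_allowed(ROBOTS_TXT, "Googlebot", path))
```

Note that a single file can be disallowed without blocking the rest of its folder, as the second Disallow line shows.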

BB


--

http://www.kruse.co.uk/seo-services.htm
http://www.here-be-posters.co.uk/lempicka-prints.htm
http://www.crystal-liaison.com/armani/index.html



#4
Paul

Re: Google ignoring noindex META Tag - 06-07-2006, 07:34 PM



On Wed, 07 Jun 2006 23:25:33 GMT, Big Bill <kruse (AT) cityscape (DOT) co.uk>
wrote:

Quote:
Use robots.txt to exclude the files. You can't do this with the old
ones but you can with any new ones you make.

You know robots.txt?

Have a read;

http://www.robotstxt.org/wc/robots.html

BB

Only works with good bots though, BB.

Password protection is far better.
plh
Paul
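
To make the password-protection point concrete, here is a minimal Python sketch of the check a server performs for HTTP Basic authentication. The user name, password, and function names are hypothetical and exist only for this example; a real site would normally rely on the web server's own access control and keep hashed passwords outside the code. The point it illustrates is that a crawler sending no credentials gets a 401 response and never sees the page body, whether or not it honours robots.txt or META tags.

```python
import base64
import hmac

# Hypothetical credentials, for illustration only.
EXPECTED_USER = "office"
EXPECTED_PASSWORD = "change-me"


def is_authorized(authorization_header):
    """Check an HTTP Basic 'Authorization' header value against the expected credentials."""
    if not authorization_header or not authorization_header.startswith("Basic "):
        return False
    try:
        encoded = authorization_header[len("Basic "):]
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        # Malformed base64 or non-UTF-8 payload: treat as unauthenticated.
        return False
    user, separator, password = decoded.partition(":")
    if not separator:
        return False
    # compare_digest avoids leaking information through comparison timing.
    user_ok = hmac.compare_digest(user.encode(), EXPECTED_USER.encode())
    password_ok = hmac.compare_digest(password.encode(), EXPECTED_PASSWORD.encode())
    return user_ok and password_ok


def response_status(authorization_header):
    """Return 200 for valid credentials, otherwise 401 (no page content is sent)."""
    return 200 if is_authorized(authorization_header) else 401
```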



#5
Rik

Re: Google ignoring noindex META Tag - 06-07-2006, 07:43 PM




Paul wrote:
Quote:
What about password protected files ?
plh
Paul

The page that contains the links leading to our private pages is
password protected. That navigation page resides in a folder that is
disallowed through my robots.txt file.

The private pages in question reside in folders that also contain public
pages, so I was afraid to disallow anything in those folders using the
robots.txt file for fear of the bot ignoring the whole folder.
That's why I chose to use the noindex META tag on the individual pages.

Is it common for Google to ignore META tags like the noindex,noarchive
I am currently using? I have seen Google ignore my robots.txt file
before, but this is the first time I have seen it ignore the noindex
command.
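
One way to confirm that a page really carries the intended directives is to parse it and list what the robots META tag says. The sketch below is a minimal Python example using the standard library's html.parser; the class and function names are made up for illustration. It only verifies what the page declares, not whether any crawler obeys it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots directives declared in the given HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives
```

Running robots_directives over each private page and flagging any result that lacks "noindex" would catch pages where the tag was forgotten or mistyped.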



#6
Roy Schestowitz

Re: Google ignoring noindex META Tag - 06-07-2006, 10:59 PM



__/ [ Rik ] on Thursday 08 June 2006 00:43 \__

Quote:
Paul wrote:
On 7 Jun 2006 14:54:58 -0700, "Rik" <rik (AT) rmcaudio (DOT) com> wrote:

I have recently noticed some of my pages showing up in the Google cache

I will gladly take this 'problem' off your hands. Google Cache has been
problematic in recent months.


Quote:
even though the page contained a "noindex" META Tag. These are private
pages for inter office use and are not meant for public display.

Is there another META tag that will prevent Google from caching these
pages?

Meta tags are not the most reliable, as not every crawler/cacher will honour
them. Exclusions using robots.txt are likewise unreliable and, in a sense,
even worse, as they publicly expose a listing of potentially 'sensitive' pages.
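
The exposure concern is easy to demonstrate: robots.txt is a plain public file, so anyone can read its Disallow lines. A minimal Python sketch follows, assuming a made-up robots.txt; the function name and sample paths are hypothetical.

```python
def disallowed_paths(robots_txt):
    """Return every non-empty Disallow path listed in a robots.txt text."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then split "Field: value".
        line = line.split("#", 1)[0].strip()
        field, separator, value = line.partition(":")
        value = value.strip()
        if separator and field.strip().lower() == "disallow" and value:
            paths.append(value)
    return paths


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /private/\nDisallow: /staff/salaries.html\n"
    # Anyone who fetches the file learns these locations.
    print(disallowed_paths(sample))
```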


Quote:
Since the pages are not meant for public view, I have just re-named the

files so anyone that may click them from Google will just get my not
found page.

My problem is that I really have no way to keep up with pages which
Google has ignored my noindex META. I have now included the noarchive
meta in the hopes the Googlebot might understand that one.

Any suggestions?

Also see the following:

http://www.i18nguy.com/markup/metatags.html


Quote:
Rik

P.S. I posted this question in the public forum but received no
suggestions so I'm hoping some-one in here may have run in to this
problem. Please pardon my cross post.

I notice that Paul (or you) has reduced the distributions


Quote:
What about password protected files ?

I would suggest the same. Too many times in the past I have had my 'hidden'
pages indexed. This was a bit embarrassing at times. The bigger issue is with
the cache, as the information is no longer in your control and cannot be
removed from the public eye immediately. If you contact Google, however, and
follow the correct route, then you can request that they remove the unwanted
cached copies.


Quote:
The page that contains the links leading to our private pages is
password protected. That navigation page resides in a folder that is
disallowed through my robots.txt file.

The private pages in question reside in folders that contain public
pages so I was afraid to disallow anything in those folders using the
robots.txt file for fear of the bot ignoring the folder.
That's why I chose to use the noindex meta on the individual pages.

Is it common for Google to ignore META tags like the noindex,noarchive
I am currently using? I have seen Google ignore my robots.txt file
before but this is the first time I have seen them ignore the noindex
command.

I think I have heard similar stories. These mechanisms should never be
trusted, and the files also need careful testing, for which I know of
no tools.

It's the same situation with "X-No-Archive: Yes" in newsgroups. Too many
ratbots and aggregators ignore these and, once somebody replies to messages,
all protection is stripped off. You can think of this as the equivalent of
someone scraping your 'noindex' pages, putting them in public space
elsewhere.

Best wishes,

Roy

--
Roy S. Schestowitz | Open Source Othello: http://othellomaster.com
http://Schestowitz.com | SuSE GNU/Linux ¦ PGP-Key: 0x74572E8E
3:45am up 41 days 9:18, 11 users, load average: 0.30, 0.70, 0.87
http://iuron.com - help build a non-profit search engine


#7
Paul

Re: Google ignoring noindex META Tag - 06-08-2006, 02:33 AM



On Thu, 08 Jun 2006 03:59:01 +0100, Roy Schestowitz
<newsgroups (AT) schestowitz (DOT) com> wrote:

Quote:
I notice that Paul (or you) has reduced the distributions

Eh? In English?

plh
Paul



#8
Big Bill

Re: Google ignoring noindex META Tag - 06-08-2006, 03:15 AM



On 7 Jun 2006 16:43:17 -0700, "Rik" <rik (AT) rmcaudio (DOT) com> wrote:

Quote:
Is it common for Google to ignore META tags like the noindex,noarchive
I am currently using? I have seen Google ignore my robots.txt file
before but this is the first time I have seen them ignore the noindex
command.

I don't think anything takes any notice of meta commands.

BB



#9
Eric Johnston

Re: Google ignoring noindex META Tag - 06-08-2006, 04:17 AM




"Paul" <lamewolf2004[REMOVE]@yahoo.com> wrote

Quote:
Only works with good bots though BB.

password protected is far better.
plh
Paul


See this: http://www.google.co.uk/intl/en/webmasters/remove.html

Also, it is a good idea to have a default home page in every subdirectory
(e.g. called index.html) to prevent the server revealing listings of all
files present.
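
A quick way to audit this is to walk the site's document root and list every directory that lacks a default page. The Python sketch below is a minimal example; the index file names are an assumption (the default document name depends on how the server is configured), and the function name is made up for illustration.

```python
import os

# Assumed default document names; adjust to match the server's configuration.
INDEX_NAMES = ("index.html", "index.htm")


def directories_without_index(root):
    """Return a sorted list of directories under root that contain no index file."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if not any(name in filenames for name in INDEX_NAMES):
            missing.append(dirpath)
    return sorted(missing)


if __name__ == "__main__":
    # Hypothetical usage: scan the current directory as if it were the document root.
    for path in directories_without_index("."):
        print("no index page:", path)
```

Any directory this reports may show a file listing to visitors if the server has directory listings enabled.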

Best regards, Eric.




#10
Eric Johnston

Re: Google ignoring noindex META Tag - 06-08-2006, 04:34 AM




"Eric Johnston" <nospam (AT) redyonder (DOT) co.uk> wrote

Quote:
See this http://www.google.co.uk/intl/en/webmasters/remove.html

Also, it is a good idea to have a default home page in every subdirectory
(e.g. called index.html) to prevent the server revealing listings of all
files present.

Best regards, Eric.

Further to this, it is also a good idea to make sure the Google Toolbar PR
display is turned off whenever you or your colleagues view your private
documents; otherwise you are telling Google the file names. (Read about the
privacy implications of the PR display:
http://www.google.com/support/toolba...cy&hl=en&v=3.0 )

Ideally, of course, all your private documents should be simply deleted from
the public area of your server.

Best regards, Eric.



