#2
Google is not obeying robots.txt at all now. Here is the typical Drupal robots.txt with a few modifications:

User-agent: *
Disallow: /aggregator
Disallow: /tracker
Disallow: /comment/reply
Disallow: /node/add
Disallow: /user
Disallow: /search
Disallow: /admin

Google has quite a few pages like the following indexed:

/comment/reply/1
/user/register
/aggregator/sources/1
/user/password
/tracker

Has anyone else seen this problem? (Or is there a mistake in robots.txt?) This is the second site that I have seen it on.
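One way to rule out a mistake in the file itself is to run the quoted rules through a parser. The following is a minimal sketch using Python's standard `urllib.robotparser`; the helper name `is_allowed` and the `Googlebot` agent string are assumptions made for illustration, and Python's parser is not Google's, so this only checks the rules as written against plain prefix matching.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in the post, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /aggregator
Disallow: /tracker
Disallow: /comment/reply
Disallow: /node/add
Disallow: /user
Disallow: /search
Disallow: /admin
"""

# The paths the post reports as indexed.
REPORTED = [
    "/comment/reply/1",
    "/user/register",
    "/aggregator/sources/1",
    "/user/password",
    "/tracker",
]


def is_allowed(robots_txt, path, agent="Googlebot"):
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Map each reported path to whether the rules permit crawling it.
results = {path: is_allowed(ROBOTS_TXT, path) for path in REPORTED}
for path, allowed in results.items():
    print(f"{path}: {'allowed' if allowed else 'disallowed'}")
```

If every reported path comes back as disallowed, the rules themselves are probably not the problem, and the explanation lies elsewhere (for example, in how the URLs were discovered).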
#5
wd wrote:
> Google is not obeying robots.txt at all now. Here is the typical Drupal robots.txt with a few modifications:
>
> User-agent: *
> Disallow: /aggregator
> Disallow: /tracker
> Disallow: /comment/reply
> Disallow: /node/add
> Disallow: /user
> Disallow: /search
> Disallow: /admin
>
> Google has quite a few pages like the following indexed:
>
> /comment/reply/1
> /user/register
> /aggregator/sources/1
> /user/password

Sounds interesting. What's in /user/password?
#6
Brian Wakem <no (AT) email (DOT) com> wrote:
> wd wrote:
>> Google is not obeying robots.txt at all now. Here is the typical Drupal robots.txt with a few modifications:
>>
>> User-agent: *
>> Disallow: /aggregator
>> Disallow: /tracker
>> Disallow: /comment/reply
>> Disallow: /node/add
>> Disallow: /user
>> Disallow: /search
>> Disallow: /admin
>>
>> Google has quite a few pages like the following indexed:
>>
>> /comment/reply/1
>> /user/register
>> /aggregator/sources/1
>> /user/password
>
> Sounds interesting. What's in /user/password?

151,000 answers: http://www.google.com/search?q=inurl...er/password%22
#7
> wd wrote:
>> Google is not obeying robots.txt at all now. Here is the typical Drupal robots.txt with a few modifications:
>>
>> User-agent: *
>> Disallow: /aggregator
>> Disallow: /tracker
>> Disallow: /comment/reply
>> Disallow: /node/add
>> Disallow: /user
>> Disallow: /search
>> Disallow: /admin
>>
>> Google has quite a few pages like the following indexed:
>>
>> /comment/reply/1
>> /user/register
>> /aggregator/sources/1
>> /user/password
>
> Sounds interesting. What's in /user/password?

It's a standard page in the Drupal
#8
wd wrote:
> Google is not obeying robots.txt at all now. Here is the typical Drupal robots.txt with a few modifications:
>
> User-agent: *
> Disallow: /aggregator
> Disallow: /tracker
> Disallow: /comment/reply
> Disallow: /node/add
> Disallow: /user
> Disallow: /search
> Disallow: /admin
>
> Google has quite a few pages like the following indexed:
>
> /comment/reply/1
> /user/register
> /aggregator/sources/1
> /user/password
> /tracker
>
> Has anyone else seen this problem? (or is there a mistake in robots.txt?) This is the second site that I have seen it on.

Try a trailing slash if you exclude directories. It may or may not help.
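To see what a trailing slash changes, here is a rough sketch that compares `Disallow: /user` with `Disallow: /user/` under plain prefix matching, again using Python's standard `urllib.robotparser`. The helper `is_blocked` and the path `/users-guide` are made up for illustration, and a real crawler may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(disallow_rule, path, agent="Googlebot"):
    """Return True if a single Disallow rule blocks `path` for `agent`."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {disallow_rule}"])
    return not parser.can_fetch(agent, path)


# "/users-guide" is a hypothetical path, included to show prefix over-matching.
PATHS = ["/user", "/user/register", "/user/password", "/users-guide"]

# For each path: (blocked by "/user", blocked by "/user/").
comparison = {
    path: (is_blocked("/user", path), is_blocked("/user/", path))
    for path in PATHS
}
for path, (no_slash, with_slash) in comparison.items():
    print(f"{path}: /user -> {no_slash}, /user/ -> {with_slash}")
```

Under prefix matching the slash only narrows the rule: `/user/` stops matching `/user` itself and unrelated paths that merely start with `/user`, while pages such as `/user/register` are blocked by either form.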
#9
It could be, as can always be the case, that a less conscientious spider is indexing the disallowed files and that Google is in turn indexing them from there. If you don't want it indexed, don't put it on the web.
#10
PS - I think that's what they call Google hacking.