HighDots Forums  

Spectacular Googlebot arrival

Search Engine Optimization Discussion about SEO/Search Engine Optimization (alt.internet.search-engines)


#1
Phil Payne
Spectacular Googlebot arrival - 05-05-2006, 06:17 AM

At around 09:00 UTC this morning.

First it tried to get 0T717Q3K81F45P9CHK78.htm - obviously a 404
functionality test.

Then it proceeded to download the site. Yes, all of it. We'll see
what turns up in the SERPs.


#2
Roy Schestowitz
Re: Spectacular Googlebot arrival - 05-05-2006, 06:23 AM

__/ [ Phil Payne ] on Friday 05 May 2006 11:17 \__

Quote:
At around 09:00 UTC this morning.

First it tried to get 0T717Q3K81F45P9CHK78.htm - obviously a 404
functionality test.

Then it proceeded to download the site. Yes, all of it. We'll see
what turns up in the SERPs.

How many pages in total? Googlebot never appears to do 404 tests. Neither
do MSNBot, Yahoo/Inktomi Slurp and other well-known spiders (although Yahoo
used to be buggy enough that it requested the wrong files from
the wrong sites). What I am trying to suggest is that somebody may have
forged the user-agent. It's very simple to do this. It gives a cloak of
stealth to someone wishing to rip off your site entirely, possibly using a
grabber, e.g.

wget -r --user-agent="Googlebot whatever..." your_site_URL

Best wishes,

Roy

--
Roy S. Schestowitz, Ph.D. Candidate (Medical Biophysics)
http://Schestowitz.com | Free as in Free Beer ¦ PGP-Key: 0x74572E8E
11:15am up 7 days 18:12, 12 users, load average: 1.02, 0.86, 0.76
http://iuron.com - semantic engine to gather information
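
To illustrate why the user-agent alone proves nothing, here is a minimal Python sketch. It builds a request object without sending it, so nothing is fetched. The target URL and the helper `claims_to_be_googlebot` are hypothetical names made up for this example; the user-agent string is the one that shows up in the log lines later in this thread, with the `+` separators written as spaces.

```python
from urllib.request import Request

# User-agent string as logged later in this thread ('+' written as spaces).
FORGED_UA = "Mozilla/5.0 (compatible; Googlebot/2.42; +http://www.google.com/bot.html)"

# Hypothetical target URL. The request is only constructed, never sent.
req = Request("http://www.example.com/", headers={"User-Agent": FORGED_UA})


def claims_to_be_googlebot(user_agent):
    """Naive check (hypothetical helper) that trusts the header text."""
    return "Googlebot" in user_agent


# urllib stores header names capitalized as "User-agent".
sent_ua = req.get_header("User-agent")
print(claims_to_be_googlebot(sent_ua))  # True, although this client is not Google
```

Any client can put any text in that header, so a check like `claims_to_be_googlebot` only tells you what the visitor says it is.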


#3
Roy Schestowitz
Re: Spectacular Googlebot arrival - 05-05-2006, 06:25 AM

PS: Get the IP address(es) from the logs and run a reverse DNS lookup:

http://remote.12dt.com/rns/

Was it truly Google? The last thing you want is someone mirroring your site
or using it as a starting point.
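
The reverse lookup can also be scripted. Below is a rough Python sketch, not a definitive check. It assumes that genuine Googlebot hosts reverse-resolve to names under googlebot.com or google.com, and it adds a forward lookup to confirm that the name maps back to the same IP, since a PTR record on its own can be faked by whoever controls the address block. The function name and the injectable `reverse`/`forward` parameters are inventions for this example; they exist so the logic can be tried without a network.

```python
import socket

# Assumption: real Googlebot hosts have reverse DNS names under these domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse=None, forward=None):
    """Return True if ip passes a reverse-then-forward DNS check.

    reverse: callable ip -> hostname (defaults to a real PTR lookup)
    forward: callable hostname -> list of IPs (defaults to a real A lookup)
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip).lower().rstrip(".")
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # The claimed hostname must resolve back to the same address.
        return ip in forward(host)
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses:
        # no PTR record, or the name does not resolve.
        return False
```

With a network connection, `is_real_googlebot("212.94.37.218")` would run the live lookups for an address taken from your own logs.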


#4
Phil Payne
Re: Spectacular Googlebot arrival - 05-05-2006, 06:38 AM

You may well be right. It doesn't look like a Google dotted quad:

2006-05-05 08:56:53 212.94.37.218 - W3SVC19 WWW4 217.161.12.181 80 GET
/0T717Q3K81F45P9CHK78.htm - 404 2 4203 169 0 HTTP/1.1
www.hotlines.co.uk Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) -
-

2006-05-05 08:58:30 212.94.37.218 - W3SVC19 WWW4 217.161.12.181 80 GET
/index.html - 200 0 4543 168 453 HTTP/1.1 www.hotlines.co.uk
Mozilla/5.0+(compatible;+Googlebot/2.42;++http://www.google.com/bot.html)
- -

Oh, well. Here comes YET ANOTHER six-month penalty from Google for
duplicate content.
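
The giveaway in the two log lines above is one client IP presenting two different user-agents within two minutes. A rough Python sketch of how to spot that automatically follows. The field order in `FIELDS` is an assumption inferred from the sample lines, not taken from the log's `#Fields:` header, so adjust it to your server's configuration. All function names are made up for this example.

```python
from collections import defaultdict

# Assumed field order, inferred from the sample lines (check your #Fields header).
FIELDS = [
    "date", "time", "c_ip", "cs_username", "s_sitename", "s_computername",
    "s_ip", "s_port", "cs_method", "cs_uri_stem", "cs_uri_query",
    "sc_status", "sc_win32_status", "sc_bytes", "cs_bytes", "time_taken",
    "cs_version", "cs_host", "user_agent", "cookie", "referer",
]

# The two log entries from the post above, unwrapped to one line each.
SAMPLE = [
    "2006-05-05 08:56:53 212.94.37.218 - W3SVC19 WWW4 217.161.12.181 80 GET "
    "/0T717Q3K81F45P9CHK78.htm - 404 2 4203 169 0 HTTP/1.1 "
    "www.hotlines.co.uk Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) - -",
    "2006-05-05 08:58:30 212.94.37.218 - W3SVC19 WWW4 217.161.12.181 80 GET "
    "/index.html - 200 0 4543 168 453 HTTP/1.1 www.hotlines.co.uk "
    "Mozilla/5.0+(compatible;+Googlebot/2.42;++http://www.google.com/bot.html) - -",
]


def parse_line(line):
    """Split one space-separated log line into a dict, or None if malformed."""
    values = line.split()
    if len(values) != len(FIELDS):
        return None
    return dict(zip(FIELDS, values))


def agents_by_ip(lines):
    """Map each client IP to the set of user-agent strings it sent."""
    seen = defaultdict(set)
    for line in lines:
        record = parse_line(line)
        if record is not None:
            seen[record["c_ip"]].add(record["user_agent"])
    return seen


def suspicious_ips(lines):
    """IPs that claimed to be Googlebot and also sent a different user-agent."""
    return sorted(
        ip for ip, agents in agents_by_ip(lines).items()
        if len(agents) > 1 and any("Googlebot" in agent for agent in agents)
    )


print(suspicious_ips(SAMPLE))  # ['212.94.37.218']
```

This only flags candidates; a reverse DNS lookup on each flagged address is still needed to say who the visitor really was.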


#5
Roy Schestowitz
Re: Spectacular Googlebot arrival - 05-05-2006, 06:47 AM

__/ [ Phil Payne ] on Friday 05 May 2006 11:38 \__

Quote:
You may well be right. It doesn't look like a Google dotted quad:

2006-05-05 08:56:53 212.94.37.218 - W3SVC19 WWW4 217.161.12.181 80 GET
/0T717Q3K81F45P9CHK78.htm - 404 2 4203 169 0 HTTP/1.1
www.hotlines.co.uk Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) -
-

2006-05-05 08:58:30 212.94.37.218 - W3SVC19 WWW4 217.161.12.181 80 GET
/index.html - 200 0 4543 168 453 HTTP/1.1 www.hotlines.co.uk
Mozilla/5.0+(compatible;+Googlebot/2.42;++http://www.google.com/bot.html)
- -

Oh, well. Here comes YET ANOTHER six-month penalty from Google for
duplicate content.

Not necessarily. You are looking too far ahead. The violator seems to have come
from http://www. softplus. net/ [collapse spaces]. Consider contacting
the abuse@ address at that domain, quoting the log lines above as evidence. I too had some
people harvesting my site very heavily. I never found out why, but I knew
where it all came from.

Best wishes,

Roy

--
Roy S. Schestowitz | Useless fact: 12345679 x 8 = 98765432
http://Schestowitz.com | Open Prospects ¦ PGP-Key: 0x74572E8E
11:40am up 7 days 18:37, 12 users, load average: 1.00, 0.62, 0.58
http://iuron.com - knowledge engine, not a search engine

