HighDots Forums  

Re: I am giving up wikipedia mirror

Search Engine Optimization Discussion about SEO/Search Engine Optimization (alt.internet.search-engines)


#1
Big Bill
Re: I am giving up wikipedia mirror - 03-06-2006, 03:40 PM

On Mon, 06 Mar 2006 19:32:17 GMT, Ignoramus23035
<ignoramus23035 (AT) NOSPAM (DOT) 23035.invalid> wrote:

Quote:
About 8 months ago I set up a Wikipedia mirror. I also let search
engines crawl it. In return, I got about $10 per day in AdSense earnings
and an incredible amount of hassle. Googlebot is completely out of
control and was mercilessly hammering my website. It does around 4
queries per second. I think that I pay about as much in bandwidth as I
make, plus I have a big headache.

So, I decided to keep the Wikipedia mirror (I use it as content for some
of my chapters), but I will no longer let search engines, especially
the badly behaving Googlebot, index them.

Last night, I made changes to robots.txt, so far no effect.

I tried using sitemaps to tell Googlebot not to crawl a page more than
once a month, but that only made it worse and bolder.

i

Take the pages down for a bit, then put them back up again. Let the
Googlebot get the idea that they aren't there. Also validate your
robots.txt.
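
One way to do that validation offline is sketched below, using Python's standard urllib.robotparser. The robots.txt content, the /wiki/ path and the example.com URLs are placeholders invented for illustration, not the poster's actual setup, and this only checks how a standards-following parser reads the file, not what Googlebot itself will do.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the /wiki/ directory is an assumption,
# not the real layout of the site discussed in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com stands in for the real host name.
    for url in ("http://example.com/wiki/Main_Page",
                "http://example.com/index.html"):
        state = "blocked" if is_blocked(ROBOTS_TXT, "Googlebot", url) else "allowed"
        print(url, state)
```

If a mirror URL comes back as "allowed", the Disallow line is probably pointing at the wrong path, which would explain a crawler carrying on as before.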

BB
--

http://homepage.ntlworld.com/bill.kr...ird-prints.htm
http://www.crystal-liaison.com/harmo...dom/index.html
kruse (AT) crystal-liaison (DOT) com Gifty! Shiny! BB!


#2
William Tasso
Re: I am giving up wikipedia mirror - 03-06-2006, 04:02 PM

Fleeing from the madness of the NTL jungle
Big Bill <kruse (AT) cityscape (DOT) co.uk> stumbled into
news:alt.internet.search-engines,alt.www.webmaster
and said:

Quote:
On Mon, 06 Mar 2006 19:32:17 GMT, Ignoramus23035
<ignoramus23035 (AT) NOSPAM (DOT) 23035.invalid> wrote:
...
Last night, I made changes to robots.txt, so far no effect.

I tried using sitemaps to tell Googlebot not to crawl a page more than
once a month, but that only made it worse and bolder.

Take the pages down for a bit, then put them back up again. Let the
Googlebot get the idea that they aren't there...

how long is a bit?

--
William Tasso

whither a trophy?


#3
GreyWyvern
Re: I am giving up wikipedia mirror - 03-06-2006, 04:04 PM

And lo, William Tasso didst speak in
alt.internet.search-engines,alt.www.webmaster:

Quote:
Big Bill <kruse (AT) cityscape (DOT) co.uk> said:

Take the pages down for a bit, then put them back up again. Let the
Googlebot get the idea that they aren't there...

how long is a bit?

An 8th of a byte.

*rimshot*

Grey

--
The technical axiom that nothing is impossible sinisterly implies the
pitfall corollary that nothing is ridiculous.
- http://www.greywyvern.com/orca#sear - Orca Search: Full-featured spider
and site-search engine


#4
Borek
Re: I am giving up wikipedia mirror - 03-06-2006, 04:37 PM

On Mon, 06 Mar 2006 22:02:29 +0100, William Tasso <SpamBlocked (AT) tbdata (DOT) com>
wrote:

Quote:
Last night, I made changes to robots.txt, so far no effect.

I tried using sitemaps to tell Googlebot not to crawl a page more than
once a month, but that only made it worse and bolder.

Take the pages down for a bit, then put them back up again. Let the
Googlebot get the idea that they aren't there...

how long is a bit?

Depends on the definition of 'enough'.

Best,
Borek
--
http://www.chembuddy.com
http://www.bpp.com.pl


#5
Borek
Re: I am giving up wikipedia mirror - 03-06-2006, 04:44 PM

On Mon, 06 Mar 2006 22:05:18 +0100, Ignoramus23035
<ignoramus23035 (AT) NOSPAM (DOT) 23035.invalid> wrote:

Quote:
Take the pages down for a bit, then put them back up again. Let the
Googlebot get the idea that they aren't there. Also validate your
robots.txt.

That's an interesting idea. If I can get Googlebot to crawl a lot less
often, I would certainly like to resume.

I have an idea what to do in such a situation, but I have no place to test
it; it seems you may be able to try. Check the user agent string and, if it
is Googlebot, delay the answer for, say, 5 seconds. If the next query is not
sent before the answer to the previous one is received, you will force
Googlebot to slow down.
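
A rough sketch of that idea, using only Python's standard http.server, might look like the following. The handler name, the placeholder page body, the port and the plain substring match on "googlebot" are all assumptions made for illustration. As noted above, it is untested against the real crawler, and it can only help if the crawler waits for each answer before sending the next request.

```python
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

CRAWLER_DELAY_SECONDS = 5.0  # the "say, 5 seconds" suggested in the post


def delay_for(user_agent: str) -> float:
    """Seconds to wait before answering, judged only by the User-Agent header."""
    return CRAWLER_DELAY_SECONDS if "googlebot" in user_agent.lower() else 0.0


class ThrottlingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: serves a placeholder page, slowly for Googlebot."""

    def do_GET(self):
        # Hold the response back for crawlers; ordinary visitors get 0.0.
        time.sleep(delay_for(self.headers.get("User-Agent", "")))
        body = b"placeholder page\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8000) -> ThreadingHTTPServer:
    """Build a threaded server so a sleeping crawler request does not block others."""
    return ThreadingHTTPServer(("127.0.0.1", port), ThrottlingHandler)
```

Calling make_server().serve_forever() starts it locally. Each delayed request still ties up a thread and a connection for the full 5 seconds, so this trades bandwidth for open connections rather than removing the load.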

Best,
Borek
--
http://www.chembuddy.com
http://www.bpp.com.pl


#6
Big Bill
Re: I am giving up wikipedia mirror - 03-06-2006, 05:41 PM

On Mon, 06 Mar 2006 21:02:29 -0000, "William Tasso"
<SpamBlocked (AT) tbdata (DOT) com> wrote:

Quote:
Fleeing from the madness of the NTL jungle
Big Bill <kruse (AT) cityscape (DOT) co.uk> stumbled into
news:alt.internet.search-engines,alt.www.webmaster
and said:

On Mon, 06 Mar 2006 19:32:17 GMT, Ignoramus23035
<ignoramus23035 (AT) NOSPAM (DOT) 23035.invalid> wrote:
...
Last night, I made changes to robots.txt, so far no effect.

I tried using sitemaps to tell Googlebot not to crawl a page more than
once a month, but that only made it worse and bolder.

Take the pages down for a bit, then put them back up again. Let the
Googlebot get the idea that they aren't there...

how long is a bit?

How long do you want it? I was thinking until the bot had been by a couple
of times, enough to hopefully register that the pages had gone. Then redo
the robots.txt, pray it gets followed this time, and put the pages up
again in the by-now forbidden directory.
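
To judge when the bot has "been a couple of times", one option is to count its failed requests in the web server's access log. The sketch below assumes Apache's "combined" log format and a /wiki/ prefix for the mirror. Both are guesses, since the thread never describes the actual server setup, and the sample lines are made up.

```python
import re

# Assumes Apache "combined" log format; the real log layout is unknown.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_gone_hits(lines, prefix="/wiki/"):
    """Count Googlebot requests under prefix that were answered 404 or 410."""
    hits = 0
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not look like combined-format entries
        if (m["path"].startswith(prefix)
                and m["status"] in ("404", "410")
                and "googlebot" in m["agent"].lower()):
            hits += 1
    return hits


# Made-up sample entries (192.0.2.x is a documentation-only address range).
SAMPLE = [
    '192.0.2.1 - - [06/Mar/2006:21:00:00 +0000] "GET /wiki/Foo HTTP/1.1" 404 209 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.1 - - [06/Mar/2006:21:00:01 +0000] "GET /index.html HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.7 - - [06/Mar/2006:21:00:02 +0000] "GET /wiki/Bar HTTP/1.1" 404 209 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64) Firefox/1.5"',
]

if __name__ == "__main__":
    print(googlebot_gone_hits(SAMPLE))
```

Once that count has been climbing for a while, the bot has at least been told the pages are gone; whether it then crawls less is another matter.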

BB
--

http://homepage.ntlworld.com/bill.kr...ird-prints.htm
http://www.crystal-liaison.com/harmo...dom/index.html
kruse (AT) crystal-liaison (DOT) com Gifty! Shiny! BB!

