HighDots Forums  

Google just can't get it up

Search Engine Optimization Discussion about SEO/Search Engine Optimization (alt.internet.search-engines)





  #1  
aengel@paterra.com
 

Google just can't get it up - 01-05-2006, 04:07 PM






I run a website with about 4 million pages (http://cxp.paterra.com).
Although the Google spiders are very active and have been pulling
100,000+ pages per day for the last 3 months, few pages show up on
Google. See http://www.paterra.com/GoogleVsAskVsBaidu.pdf. Google
indexing of this site essentially collapsed in January 2005 when the
number of pages was increased from about 1 million to 4 million.

AskJeeves, on the other hand, indexes 95% of the site.

My current working hypothesis as to why these pages don't show up on
Google centers on Google's repetitive pulling of pages to test
stability and refresh its indexes. Suppose Google has to be able to
pull the same page twice over a two week period before it posts to the
index. Suppose also that Google has a maximum pull rate per site.
Also, suppose that Google expires pages after a month. With more than
4 million pages, Google cannot do repeat pulls fast enough to keep the
pulled pages in the index.
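
The arithmetic behind this hypothesis can be sketched roughly as follows. The 100,000 pages/day rate and the page counts come from the figures above; the two-week confirmation window and one-month expiry are the suppositions stated in the hypothesis, not documented Google behavior, and the function names are made up for illustration. The sketch also assumes the crawler cycles through every page evenly before revisiting any of them.

```python
def full_pass_days(total_pages: int, pulls_per_day: int) -> float:
    """Days needed to pull every page once at a fixed per-site rate."""
    return total_pages / pulls_per_day


def stays_indexed(total_pages: int, pulls_per_day: int,
                  confirm_window_days: int = 14,
                  expiry_days: int = 30) -> bool:
    """True if a page can be pulled again inside both assumed limits.

    With an even round-robin crawl, the gap between two pulls of the
    same page equals the time for one full pass over the site.
    """
    revisit = full_pass_days(total_pages, pulls_per_day)
    return revisit <= confirm_window_days and revisit <= expiry_days


PULLS_PER_DAY = 100_000  # observed Googlebot rate quoted in the post

for pages in (1_000_000, 4_000_000):
    days = full_pass_days(pages, PULLS_PER_DAY)
    ok = stays_indexed(pages, PULLS_PER_DAY)
    print(f"{pages:,} pages: revisit every {days:.0f} days, stays indexed: {ok}")
```

Under these assumptions a 1 million page site is revisited every 10 days and fits inside both limits, while a 4 million page site is revisited only every 40 days and misses both.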

Does this make sense to anyone intimately familiar with Google
indexing? If this hypothesis is correct, is there a way to get Google
to ease the repeatability requirements?


  #2  
jazzrabbit@gmail.com
 

Re: Google just can't get it up - 01-06-2006, 08:06 AM






I don't think it is related to the number of links. It is more likely
the depth of the links, the strange subject of your site, the bad
directory structure (all files located in the main directory), and bad
link titles that do not really mean anything.

Less batching up and more real information on each page would work
better, such as showing the last 100 new items on the front page.


  #3  
Alan
 

Re: Google just can't get it up - 01-07-2006, 10:33 AM



To test this, I have throttled Googlebot back to about 1/3 of the 4
million pages and may throttle it back even further, even though
Google's indexing of Chinese content is starting to rise above the
baseline. See http://www.paterra.com/GoogleVsAskVsBaidu.pdf.

If the hypothesis is correct, Google coverage should rise in a few
weeks.
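
As a rough check on why one third of the site is a sensible test size, the sketch below works out how many pages could be revisited inside the supposed two-week window. It assumes Googlebot keeps pulling about 100,000 pages per day after the throttle, which is a guess; the window length is the supposition from the first post, and the function name is hypothetical.

```python
def max_pages_for_window(pulls_per_day: int, window_days: int) -> int:
    """Largest page count that can be fully re-pulled within the window."""
    return pulls_per_day * window_days


TOTAL_PAGES = 4_000_000   # site size quoted in the thread
PULLS_PER_DAY = 100_000   # assumed to stay the same after throttling
WINDOW_DAYS = 14          # supposed confirmation window, not a known Google rule

ceiling = max_pages_for_window(PULLS_PER_DAY, WINDOW_DAYS)
exposed = TOTAL_PAGES // 3  # pages still offered to Googlebot

print(f"ceiling: {ceiling:,} pages, exposed: {exposed:,} pages")
print("fits inside window:", exposed <= ceiling)
```

On these numbers the ceiling is 1.4 million pages, so exposing roughly 1.33 million pages sits just under it. If the real window is shorter, or the pull rate drops along with the throttle, a further reduction would be needed.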

Quote:
From the web logs, it is clear that Google is spidering the full depth
and breadth of the site. It just doesn't add these pages to its
indexes. In contrast, AskJeeves also spiders the full depth and
breadth of the site and does add the pages to its indexes. The key
difference, according to this hypothesis, is that AskJeeves works on a
two to three month refresh cycle and uses a different verification
mechanism.
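
For anyone wanting to make the same check against their own logs, here is a minimal sketch that counts Googlebot pulls per day and the number of distinct URLs fetched. It assumes the Apache combined log format and matches on the user-agent string only, which can be spoofed. The sample lines, paths, and addresses are invented for illustration and are not taken from the site's logs.

```python
import re
from collections import Counter

# Apache combined log format (an assumption about the server's configuration).
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_coverage(lines):
    """Return (successful pulls per day, set of distinct paths) for Googlebot."""
    per_day = Counter()
    distinct = set()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or "Googlebot" not in m.group("agent"):
            continue
        if m.group("status") != "200":
            continue  # count successful fetches only
        per_day[m.group("day")] += 1
        distinct.add(m.group("path"))
    return per_day, distinct


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample = [
    f'192.0.2.10 - - [07/Jan/2006:10:00:01 -0500] "GET /a HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'192.0.2.10 - - [07/Jan/2006:10:00:05 -0500] "GET /b HTTP/1.1" 200 4096 "-" "{GOOGLEBOT}"',
    f'192.0.2.10 - - [08/Jan/2006:09:30:00 -0500] "GET /a HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    '192.0.2.77 - - [08/Jan/2006:09:31:00 -0500] "GET /a HTTP/1.1" 200 5120 "-" "Mozilla/4.0"',
]

per_day, distinct = googlebot_coverage(sample)
print(dict(per_day), len(distinct))
```

Comparing the distinct URL count with the total number of pages gives a measure of breadth, and the per-day counts show whether the pull rate holds steady.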



  #4  
Alan
 

Re: Google just can't get it up - 01-07-2006, 10:35 AM



The directory structure is a fiction. The site is almost totally
dynamic.

