HighDots Forums  

checking to see if SE can spider or not

Search Engine Optimization Discussion about SEO/Search Engine Optimization (alt.internet.search-engines)





  #1  
-keevill-

checking to see if SE can spider or not - 07-04-2005 , 11:05 PM






Is there a way to check whether Google or any other SE spider can crawl
your website? Perhaps something like Xenu? If Xenu cannot get through the
website, is it safe to say that the SE spiders cannot either? (It's a
dynamically created website using XML/PHP.)



  #2  
Big Bill

Re: checking to see if SE can spider or not - 07-05-2005 , 12:39 AM






On Tue, 5 Jul 2005 10:05:12 +0700, "-keevill-" <keevillus (AT) yahoo (DOT) com>
wrote:

Quote:
Is there a way of checking if Google or any SE spider can spider your
website? Perhaps something like Xenu ? If Xenu cannot pass through the
website is it safe to say that the SE spiders cannot also? ( It's a
dynamically created website using XML/Php )

If you look at www.kruse.co.uk/seo-tools.htm, there's a load of
spider-checkers and related stuff. Go peek!

BB

--
www.kruse.co.uk/ seo (AT) kruse (DOT) demon.co.uk
seo that watches the river flow...
--


  #3  
John Bokma

Re: checking to see if SE can spider or not - 07-05-2005 , 01:05 AM



"-keevill-" <keevillus (AT) yahoo (DOT) com> wrote:

Quote:
Is there a way of checking if Google or any SE spider can spider your
website?

If you can browse the whole site with JavaScript off and cookies disabled,
and there are no session IDs encoded in the URLs, then you can be quite
sure that a spider can get through as well.
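
A rough way to approximate that check without a browser is sketched below. It reads a page's HTML the way a script-less, cookie-less client would, keeps only plain <a href> links, and flags URLs that appear to carry a session ID. The parameter names in SESSION_PARAMS and the helper names are assumptions made for illustration, not anything from this thread, and real spiders may behave differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, parse_qs

# Hypothetical list of query parameters that often carry session IDs.
SESSION_PARAMS = {"phpsessid", "sid", "sessionid"}


class LinkExtractor(HTMLParser):
    """Collect href values from plain <a> tags, as a script-less client would."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # javascript: pseudo-links need a script engine, so they are skipped.
        if href and not href.lower().startswith("javascript:"):
            self.links.append(urljoin(self.base_url, href))


def crawlable_links(html, base_url):
    """Return absolute URLs of the links a simple crawler could follow."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def has_session_id(url):
    """True if the query string has a parameter that looks like a session ID."""
    params = parse_qs(urlparse(url).query)
    return any(name.lower() in SESSION_PARAMS for name in params)
```

If crawlable_links() returns nothing for the home page, a link checker such as Xenu would probably get stuck there too, which is at least a hint about what a spider sees.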

Quote:
Perhaps something like Xenu ? If Xenu cannot pass through the
website is it safe to say that the SE spiders cannot also? ( It's a
dynamically created website using XML/Php )

Again: "dynamic" is something a bot cannot see unless you
give it away. E.g. script.php?id=1231231 might be considered dynamic based
on the URL, and yet it can be just a static HTML page with a funny URL.
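
To make the point concrete, here is a minimal sketch of the kind of URL-only guess a bot could make. The extension list and the function name are hypothetical, chosen for illustration; no search engine is known to use exactly this rule.

```python
from urllib.parse import urlparse

# Hypothetical heuristic: extensions that hint at server-side scripting.
SCRIPT_EXTENSIONS = (".php", ".asp", ".cgi", ".pl")


def looks_dynamic(url):
    """Guess 'dynamic' from the URL alone: a query string or a script extension."""
    parts = urlparse(url)
    return bool(parts.query) or parts.path.lower().endswith(SCRIPT_EXTENSIONS)
```

The guess uses nothing but the URL. A script that is served under a path like /page.html would pass as static, and a fixed HTML page behind script.php?id=1231231 would be flagged as dynamic.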

--
John Perl SEO tools: http://johnbokma.com/perl/
Experienced (web) developer: http://castleamber.com/
Get a SEO report of your site for just 100 USD:
http://johnbokma.com/websitedesign/seo-expert-help.html


  #4  
Big Bill

Re: checking to see if SE can spider or not - 07-05-2005 , 01:39 AM



On 5 Jul 2005 05:05:03 GMT, John Bokma <john (AT) castleamber (DOT) com> wrote:

Quote:
"-keevill-" <keevillus (AT) yahoo (DOT) com> wrote:

Is there a way of checking if Google or any SE spider can spider your
website?

If you can with a browser with JavaScript off, no cookies, and there are no
session IDs encoded in the URL then you can be quite sure.

Perhaps something like Xenu ? If Xenu cannot pass through the
website is it safe to say that the SE spiders cannot also? ( It's a
dynamically created website using XML/Php )

Again: dynamically is something that can not be seen by a bot unless you
give it away. E.g. script.php?id=1231231 might be considered dynamic based
on the URL, and yet it can be just a static HTML page with a funny URL.

Did he say dynamic? I missed that. Early here!

BB


  #5  
-keevill-

Re: checking to see if SE can spider or not - 07-05-2005 , 01:50 AM



Quote:
Again: dynamically is something that can not be seen by a bot unless you
give it away. E.g. script.php?id=1231231 might be considered dynamic based
on the URL, and yet it can be just a static HTML page with a funny URL.

Did he say dynamic? I missed that. Early here!

BB

I would still like to know whether Xenu's inability to get past the first
page means that the bots cannot get past it either. My website has been
around long enough, and the logs show that Googlebot comes in each day but
never fetches more than a couple of pages.
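
One way to put numbers on what the logs show is sketched below. It assumes the server writes the common "combined" access log format and that the bot identifies itself with "Googlebot" in the user-agent string. Both are assumptions, since the thread does not say what the logs look like, and a user-agent string can be faked.

```python
import re
from collections import Counter

# Matches the request, status and user-agent fields of a combined-format line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_fetches(lines):
    """Count (path, status) pairs requested by clients claiming to be Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "Googlebot" in match.group("agent"):
            counts[(match.group("path"), match.group("status"))] += 1
    return counts
```

If the result only ever contains the home page and one or two other paths, the bot is arriving but not finding links to follow, or is being answered with redirects or errors, which the status column would show.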





  #6  
Big Bill

Re: checking to see if SE can spider or not - 07-05-2005 , 04:51 AM



On Tue, 5 Jul 2005 12:50:37 +0700, "-keevill-" <keevillus (AT) yahoo (DOT) com>
wrote:

Quote:
Again: dynamically is something that can not be seen by a bot unless you
give it away. E.g. script.php?id=1231231 might be considered dynamic based
on the URL, and yet it can be just a static HTML page with a funny URL.

Did he say dynamic? I missed that. Early here!

BB
I still would like to know if the inability of Xenu to get past the first
page means that the bots cannot also. My website has been around long enough
and upon examining the logs, Googlebot comes in each day but never gathers
more than a couple of pages.

I forget where your site is now. Are you one of those folk who have
their domain registered with one company, like uk2.net, for instance,
and have their actual web pages hosted somewhere else? Maybe the
forwarding isn't working properly. I've seen it happen. Remind me
where it is and I'll have a look.

BB



  #7  
-keevill-

Re: checking to see if SE can spider or not - 07-05-2005 , 04:54 AM



Quote:
I forget where your site is at now. Are you one of those folk who have
their domain registered with one company, like uk2.net, for instance,
and have their actual web pages hosted somewhere else? Maybe the
forwarding isn't working properly. I've seen it happen. Remind me
where it is, I'll have a look.

BB

It's www.mygermanyhotels.com

Thanks.




  #8  
Borek

what is dynamic, what is static? - 07-05-2005 , 05:29 AM



On Tue, 05 Jul 2005 07:05:03 +0200, John Bokma <john (AT) castleamber (DOT) com>
wrote:

Quote:
Again: dynamically is something that can not be seen by a bot unless you
give it away. E.g. script.php?id=1231231 might be considered dynamic
based on the URL, and yet it can be just a static HTML page with a funny
URL.

That rang a bell - I had a discussion about this with a Junior a few
days ago.

If the content for a given URL is the same every time it is displayed, but
is created on the server side by scripts (so there are no ready-made HTML
files, only PHP or some CGI), is it static or dynamic?

Perhaps one should differentiate between dynamic generation and dynamic
content?
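
The distinction can be tested from the outside, at least roughly: fetch the same URL a few times and compare the bodies. The sketch below assumes a caller-supplied fetch function (hypothetical, so that any HTTP client can be plugged in) and treats identical bytes as "static content", whatever generates them on the server.

```python
import hashlib


def content_is_stable(fetch, url, attempts=3):
    """Return True if repeated fetches of url yield byte-identical bodies.

    fetch is a caller-supplied function taking a URL and returning bytes.
    """
    digests = {hashlib.sha256(fetch(url)).hexdigest() for _ in range(attempts)}
    return len(digests) == 1
```

A page that embeds the current time or a rotating banner would fail this test even though its real content never changes, so the result is only a hint.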

Best,
Borek
--
http://www.chembuddy.com - chemical calculators for labs and education
BATE - program for pH calculations
CASC - Concentration and Solution Calculator
pH lectures - guide to hand pH calculation with examples


  #9  
Borek

Re: checking to see if SE can spider or not - 07-05-2005 , 05:43 AM



On Tue, 05 Jul 2005 07:50:37 +0200, -keevill- <keevillus (AT) yahoo (DOT) com> wrote:

Quote:
I still would like to know if the inability of Xenu to get past the first
page means that the bots cannot also. My website has been around long
enough and upon examining the logs, Googlebot comes in each day but
never gathers more than a couple of pages.

Nothing strange - as I see it, the question is not how often the whole
site is spidered, but _whether_ the whole site is spidered.

Googlebot visits my sites (PR2 and 3) every day, fetching a few pages
from each. Some pages are spidered more often, some less often. About
every two weeks every page (these are small sites) is fetched.

On June 24th I added about 30 pages to one of my sites, and they have
not all been spidered yet - about 20 were spidered, and about
10 have already made it into the Google index.

I decided it was going a little too slowly and added a sitemap on Sunday
(together with the next 30 pages). The funny thing is that Googlebot has
fetched the sitemap 6 times in 24 hours, but in the same time it has not
fetched any of the new pages yet. That's a little bit stupid.
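
For anyone who wants to try the same thing, a minimal sitemap can be generated as sketched below. The namespace is the one used by the sitemaps.org protocol, which is an assumption here: the post does not say which sitemap format was used, and the function name is made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org protocol (assumed, not taken from the post).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")
```

As the post suggests, listing a page in a sitemap only tells the bot that the page exists. When the page is actually fetched is still up to the search engine.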

Best,
Borek


  #10  
Craig

Re: what is dynamic, what is static? - 07-05-2005 , 06:04 AM



On Tue, 05 Jul 2005 11:29:13 +0200, Borek
<borek (AT) parts (DOT) bpp.to.com.remove.pl> wrote:

Quote:
On Tue, 05 Jul 2005 07:05:03 +0200, John Bokma <john (AT) castleamber (DOT) com
wrote:

Again: dynamically is something that can not be seen by a bot unless you
give it away. E.g. script.php?id=1231231 might be considered dynamic
based on the URL, and yet it can be just a static HTML page with a funny
URL.

That ringed a bell - I have a discussion with a Junior few days ago.

If the content for a given URL is displayed every time the same, but
is on the server site created using scripts (so there are no ready HTML
files, but php or some cgi) is it static, or dynamic?

Perhaps one should differentiate between dynamic generation and dynamic
content?

John pretty much summed it up.

Surely "dynamic" describes any page whose content is created on
the fly instead of being hard-coded and served as a file. If the
page is processed by a parser or interpreter (such as ASP or PHP) and
is just streamed out by the server, then it's pretty much going to be a
static page.

But what does that matter? All .asp and .php pages run through an
interpreter even if they don't contain any code, so they can be static,
and in any case there is no real way for Google to be 100% sure that a
page is dynamic. (It can make good assumptions, usually from the URL, but
as John stated above these can be wrong; it's just an assumption.)

The problem occurs when a product database, for example, uses a page
such as product.asp?prodid=12345 and there's no actual link to that
page anywhere. For example, if the only way to get to a page
displaying details about Heinz Beanz is to enter the word "beanz" into
a search box, then there's no way Google or any search engine will be
able to find it. (Note to self: create a search engine that is able to
complete forms with random data and take over the world.)

So make sure that if you have dynamic pages, each page
(or at least every page that you want indexed) is linked to using a
full URL that will return that page (not relying on cookies or
sessions).
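
The advice above amounts to a reachability check: start at the home page, follow plain links only, and see which known pages are never reached. A minimal sketch, assuming the link graph has already been collected into a dictionary (the page names in the usage note are made up):

```python
from collections import deque


def unreachable_pages(links, start):
    """Return the known pages that cannot be reached from start via links.

    links maps each known page to the list of pages it links to.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return set(links) - seen
```

With links = {"/": ["/products.asp"], "/products.asp": [], "/product.asp?prodid=12345": []}, the call unreachable_pages(links, "/") reports the product page, because it can only be reached through the search box.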

Phew... sorry bout that

Craig
----------------------------------------------------
Active member of the googuru community

