HighDots Forums  

does the team think this link

Search Engine Optimization Discussion about SEO/Search Engine Optimization (alt.internet.search-engines)


#11 Big Bill
Re: does the team think this link - 11-18-2004, 12:18 PM






On Thu, 18 Nov 2004 13:52:27 +0000, Victoria Clare
<victoria (AT) markpoles (DOT) org.uk> wrote:

Quote:
Big Bill <kruse (AT) cityscape (DOT) co.uk> wrote in
news:5uenp0dllts9nfu56m7fmn1q8bt48edt9v (AT) 4ax (DOT) com:

will actually be followed by the engines?

http://www.foo.co.uk/document.aspx?id=yada.htm

I'm thinking that big bad session id is going to upset them.

But I'd like some input in case there's some dumb thing I'm
overlooking.


Um, what session id?
Well I wondered that too, even though I am the OP. Definitions seem to
vary, which does actually make sense.

Quote:
I can see a ? and something that probably calls a particular document into
a template, but nothing that looks likely to change by session in that URL.

Or do you mean that this page is setting a cookie, and uses a session ID if
the cookie is denied?
Ummmmmmmmmmmm, no.

Quote:
Usually a URL with a session on it looks more like this:

http://www.foo.co.uk/document.aspx?i...=1232454129879

and the SID is often only visible if you turn off cookies.

URL in the style of http://www.foo.co.uk/document.aspx?id=yada.htm should
be fine in Google if that's all it is. Yahoo and MSN usually OK too,
though they are a bit more picky - I would not rely on them to spider a
dynamic doc from another dynamic doc, though Google will.
Ah, now we're getting to the "only spiders dynamic links one deep"
thing I've mentioned and no-one else has (so far) picked up on.

Quote:
Oh, I did once have a dynamic site where changing id=bla to something like
page=bla solved a spidering problem - there was some idea about at the time
that spiders disliked particular variable names - but I didn't test it
extensively as the first thing I tried worked ;-)
So we're even more in the dark than ever then.

BB
--
www.kruse.co.uk SEO (AT) kruse (DOT) demon.co.uk
home of SEO that's shiny!
--


#12 John Bokma
Re: does the team think this link - 11-18-2004, 02:11 PM






Big Bill wrote:

Quote:
On 17 Nov 2004 22:40:36 GMT, John Bokma <postmaster (AT) castleamber (DOT) com
wrote:

Since there is no way that Google can see if a page is dynamic or not,
they go for a few simple rules, and one is id=somenumber afaik/iirc.

So the webmaster has some control over the spider speed. If you want it
not too fast, use URLs that trigger one of Google's "this is a dynamic
page" rules. Otherwise use mod_rewrite, and hide the fact :-)

Or at least, that is how I explain things; I might be wrong.

and if you explain things wrong, who does know? Anywhere?
Google.

Quote:
Who do I believe?
Not the people who claim that they know how Google works :-D.

Quote:
I'm for sticking to static pages linking in to a series of
dynamic links for product exhibition. It cuts all the imponderables
out.
Technically, a web browser/bot can't see if a page is statically or
dynamically generated (unless it comes every few seconds and compares),
so Googlebot must make an educated guess.

You cannot tell from the headers the webserver sends, since those
headers can be faked.

And the educated guess is based on the *looks* of the URI.

One rule people agree on is that the URI contains id=somenumber.
Another is that it has several & signs (maybe ; too, since ; is preferred
over & to separate parts in a QUERY_STRING, although it's little used).

See my other posting wrt using PATH_INFO

Or use mod_rewrite
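
As a rough illustration of the "looks of the URI" guess described above, here is a minimal Python sketch. The rules and the threshold are assumptions taken from the guesses in this thread, not anything Google has published; `looks_dynamic` and `max_params` are hypothetical names.

```python
from urllib.parse import urlsplit, parse_qsl


def looks_dynamic(url, max_params=2):
    """Guess from the URL alone whether a page is dynamically generated.

    Hypothetical rules, based only on the guesses in this thread:
    a numeric id= parameter, or more than max_params parameters.
    """
    query = urlsplit(url).query
    # Treat both & and ; as parameter separators, as the post mentions.
    params = parse_qsl(query.replace(";", "&"), keep_blank_values=True)
    has_numeric_id = any(
        name.lower() == "id" and value.isdigit() for name, value in params
    )
    return has_numeric_id or len(params) > max_params
```

Under these assumed rules, http://example.invalid/show.cgi?id=14 is flagged, while http://www.foo.co.uk/document.aspx?id=yada.htm is not, because its id value is not a number and it has only one parameter.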

--
John -> http://johnbokma.com/ MexIT: http://johnbokma.com/mexit/
Perl & Google/WWW: http://johnbokma.com/perl/
Experienced programmer and SEO available: PR7 http://castleamber.com/
Happy Customers: http://castleamber.com/testimonials.html


#13 John Bokma
Re: does the team think this link - 11-18-2004, 02:18 PM



David Off wrote:

Quote:
Big Bill wrote:
will actually be followed by the engines?

http://www.foo.co.uk/document.aspx?id=yada.htm

I'm thinking that big bad session id is going to upset them.


Depends where id comes from. If it is obtained by filling in a form or
by URL rewriting robots won't do it.
You mean if the URL rewriting process hides the id? ie,

http://example.invalid/go/yada.htm is rewritten internally to:

http://example.invalid/document.aspx?id=yada.htm?

In that case, indeed, the robot can't see the id. It is perfectly
hidden. But of course the robot follows those links, it can't see that
yada.htm is static or dynamic (without comparing the contents every
time)
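
The internal rewrite described above can be sketched in Python as a simple path mapping. This only illustrates the idea; a real site would normally do it in the webserver (for example with mod_rewrite). The /go/ prefix comes from the example URL above, while `PUBLIC_PATTERN` and `rewrite_internally` are hypothetical names.

```python
import re

# Assumed rule: the public path /go/<name> is served by /document.aspx?id=<name>.
PUBLIC_PATTERN = re.compile(r"^/go/(?P<name>[A-Za-z0-9_.-]+)$")


def rewrite_internally(public_path):
    """Map a public path to the internal one, or return None if no rule matches."""
    match = PUBLIC_PATTERN.match(public_path)
    if match is None:
        return None
    return "/document.aspx?id=" + match.group("name")
```

The robot only ever requests the public path, so the id parameter never appears in any URL it sees.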

Quote:
If it is picked up from an OBL on
another page, no problem, at least in theory.

Session ids are usually used to refer to information obtained by
logging into a site or to information stored in the URL (via URL
rewriting) or a cookie to track a user around a site. Spiders won't
save these pieces of data for subsequent requests.
If the dynamic generated page with id=xxxx uses id as a session id,
links on that page can have id=xxx as part of their URI, so a bot *does*
keep that info.

However, if it is a genuine session id, the session might have been
expired on the next visit of the bot, and hence it gets an error page.

One trick to solve this is to check if it's googlebot, and don't expire
its session id. This sounds like cloaking, but it isn't.
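
A minimal Python sketch of that trick, assuming sessions expire after a fixed idle time and that the bot is recognised by its User-Agent string. The 30 minute lifetime and the function names are assumptions, and the User-Agent check is naive since any client can send that header.

```python
import time

SESSION_LIFETIME = 30 * 60  # seconds; an assumed value


def is_googlebot(user_agent):
    # Naive check: the User-Agent header can be faked by any client.
    return "googlebot" in user_agent.lower()


def session_is_valid(last_seen, user_agent, now=None):
    """Return True if the session behind a URL id may still be used."""
    if now is None:
        now = time.time()
    if is_googlebot(user_agent):
        return True  # never expire the bot's session id
    return now - last_seen <= SESSION_LIFETIME
```

The bot gets the same page content as any other visitor; only the expiry check differs.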

--
John -> http://johnbokma.com/ MexIT: http://johnbokma.com/mexit/
Perl & Google/WWW: http://johnbokma.com/perl/
Experienced programmer and SEO available: PR7 http://castleamber.com/
Happy Customers: http://castleamber.com/testimonials.html


#14 John Bokma
Re: does the team think this link - 11-18-2004, 02:23 PM



Big Bill wrote:

Quote:
Ah, now we're getting to the "only spiders dynamic links one deep"
thing I've mentioned and no-one else has (so far) picked up on.
I doubt that statement. But maybe someone else can clarify?

Quote:
Oh, I did once have a dynamic site where changing id=bla to something
like page=bla solved a spidering problem - there was some idea about
at the time that spiders disliked particular variable names - but I
didn't test it extensively as the first thing I tried worked ;-)

So we're even more in the dark than ever then.
No, you called id=.... a session id. A session id is basically a unique
identifier (id :-) a visitor gets so you can track him/her on your site.

In your example the value of id looked like the name of a template, so it
is not a session id.

By just looking at the URI one can't say if something is a session id or
not, for example:

http://example.invalid/show.cgi?id=14

14 can be the category id of some products in a webshop.


--
John -> http://johnbokma.com/ MexIT: http://johnbokma.com/mexit/
Perl & Google/WWW: http://johnbokma.com/perl/
Experienced programmer and SEO available: PR7 http://castleamber.com/
Happy Customers: http://castleamber.com/testimonials.html


#15 Tony
Re: does the team think this link - 11-18-2004, 04:14 PM



Big Bill wrote:

Quote:
Much of which I knew, actually, but I don't think the web hosts I'm
dealing with know about any of this stuff. Client is entirely
bewildered and I'm trying to do best by all parties. There's a follow
up to this post I'd like you to keep an eye out for please. I also
understand from various sources that Google will only go one deep into
a dynamic layer of links (think, won't go past second breadcrumb). Any
comments there please?


(One) source:

http://www.webmasterworld.com/forum3/23297-3-10.htm

"If you want your site to be crawled by as many search engines as
possible, things like static urls always help, and if you can't do that
then fewer parameters probably help."

and:

http://answers.google.com/answers/threadview?id=196839



Personally, I'd always use mod rewrite with apache for a site that
required dynamic links (one, two or more).

It does get pretty techy though. <you've been warned ;-)>


On another note, url rewriting is a job for the webserver imo, rather
than the scripting language (i.e. PATH_INFO with PHP)


http://httpd.apache.org/docs/mod/mod_rewrite.html
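
For comparison, this is roughly what the script-side PATH_INFO approach looks like, sketched here as a small Python WSGI application rather than PHP. It assumes the script is mounted at a path such as /document, so that a request for /document/yada.htm arrives with PATH_INFO set to /yada.htm; the function names are hypothetical.

```python
def document_id_from_path_info(environ):
    """Return the document name carried in PATH_INFO, or None if absent."""
    path_info = environ.get("PATH_INFO", "")
    name = path_info.strip("/")
    return name or None


def app(environ, start_response):
    """Minimal WSGI application; assumed to be mounted at /document."""
    doc_id = document_id_from_path_info(environ)
    if doc_id is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"no document requested\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [("document id: %s\n" % doc_id).encode("utf-8")]
```

With mod_rewrite the same mapping lives in the server configuration instead, and the script keeps reading an ordinary id parameter.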


Quote:
It's nice to see some new people coming out of the woodwork because a
lot of the knowledgeable people in here seem to have got pissed off
with the likes of Stoma and small mouse trumpeting their ignorance and
gone off to forums. I've not found one I'm comfortable in yet so,
still here.

http://www.sitepoint.com is pretty good for this kinda stuff, and
possibly better than alt.*.usenet.




--
Tony


#16 Big Bill
Re: does the team think this link - 11-18-2004, 04:54 PM



On Thu, 18 Nov 2004 21:14:21 GMT, Tony <spamkill (AT) nospam (DOT) net> wrote:

Quote:
Big Bill wrote:

Much of which I knew, actually, but I don't think the web hosts I'm
dealing with know about any of this stuff. Client is entirely
bewildered and I'm trying to do best by all parties. There's a follow
up to this post I'd like you to keep an eye out for please. I also
understand from various sources that Google will only go one deep into
a dynamic layer of links (think, won't go past second breadcrumb). Any
comments there please?



(One) source:

http://www.webmasterworld.com/forum3/23297-3-10.htm

"If you want your site to be crawled by as many search engines as
possible, things like static urls always help, and if you can't do that
then fewer parameters probably help."

and:

http://answers.google.com/answers/threadview?id=196839



Personally, I'd always use mod rewrite with apache for a site that
required dynamic links (one, two or more).

It does get pretty techy though. <you've been warned ;-)>


On another note, url rewriting is a job for the webserver imo, rather
than the scripting language (i.e. PATH_INFO with PHP)


http://httpd.apache.org/docs/mod/mod_rewrite.html


It's nice to see some new people coming out of the woodwork because a
lot of the knowledgeable people in here seem to have got pissed off
with the likes of Stoma and small mouse trumpeting their ignorance and
gone off to forums. I've not found one I'm comfortable in yet so,
still here.


http://www.sitepoint.com is pretty good for this kinda stuff, and
possibly better than alt.*.usenet.
Is that the sitepoint forums? I know them. I just got a couple of
their CSS books actually.
All the link-reading info seems so contradictory. I don't have the
technical nous to work this out for myself and it's such a pain when
trying to explain to clients.

BB
--
www.kruse.co.uk SEO (AT) kruse (DOT) demon.co.uk
home of SEO that's shiny!
--


#17 David Off
Re: does the team think this link - 11-18-2004, 05:57 PM



John Bokma wrote:
Quote:
However, if it is a genuine session id, the session might have been
expired on the next visit of the bot, and hence it gets an error page.
sorry John, that is what I meant to say.


#18 Big Bill
Re: does the team think this link - 11-19-2004, 02:38 AM



On 18 Nov 2004 19:23:01 GMT, John Bokma <postmaster (AT) castleamber (DOT) com>
wrote:

Quote:
Big Bill wrote:

Ah, now we're getting to the "only spiders dynamic links one deep"
thing I've mentioned and no-one else has (so far) picked up on.

I doubt that statement. But maybe someone else can clarify?
I might. My machine's playing up and messing with my newsreader. Posts
are appearing and disappearing out of sequence. My bads.

BB


--
www.kruse.co.uk SEO (AT) kruse (DOT) demon.co.uk
home of SEO that's shiny!
--


#19 Victoria Clare
Re: does the team think this link - 11-19-2004, 06:21 AM



Big Bill <kruse (AT) cityscape (DOT) co.uk> wrote in
news:8rlpp0tdhprr7g4es424tft5i49gevriq1 (AT) 4ax (DOT) com:

Quote:
URL in the style of http://www.foo.co.uk/document.aspx?id=yada.htm
should be fine in Google if that's all it is. Yahoo and MSN usually
OK too, though they are a bit more picky - I would not rely on them to
spider a dynamic doc from another dynamic doc, though Google will.

Ah, now we're getting to the "only spiders dynamic links one deep"
thing I've mentioned and no-one else has (so far) picked up on.
I have seen MSN and Yahoo pick up a dynamic url apparently from another
dynamic page*.

For example: www.plymouthguild.org.uk/cdb/course.php?id=82 I'm pretty sure
is only linked to from another dynamic page, and is in Yahoo.

I just don't trust them to do it reliably or quickly, in the same way that
I don't trust them to spider quickly as many levels down as Google does.

Yahoo seems to take forever to update its index: as most dynamic pages are
data driven *because* they need to change often, it's not uncommon for the
site to be changing faster than the search engine can keep up.

It's hard to tell if there is actually a technological limitation here, or
if it just takes a long time for the info from pages several levels down on
a really big site to make it into the database.

*hard to be absolutely certain, because there could be a link to the
dynamic page from somewhere you haven't considered or aren't aware of -
certainly by the time the dratted thing has been spidered and indexed by
Yahoo the URL could be all over the place.

Victoria
--
Clare Associates Ltd
http://www.clareassoc.co.uk/
--


#20 Big Bill
Re: does the team think this link - 11-19-2004, 07:09 AM



On Fri, 19 Nov 2004 11:21:09 +0000, Victoria Clare
<victoria (AT) markpoles (DOT) org.uk> wrote:

Quote:
Big Bill <kruse (AT) cityscape (DOT) co.uk> wrote in
news:8rlpp0tdhprr7g4es424tft5i49gevriq1 (AT) 4ax (DOT) com:

URL in the style of http://www.foo.co.uk/document.aspx?id=yada.htm
should be fine in Google if that's all it is. Yahoo and MSN usually
OK too, though they are a bit more picky - I would not rely on them to
spider a dynamic doc from another dynamic doc, though Google will.

Ah, now we're getting to the "only spiders dynamic links one deep"
thing I've mentioned and no-one else has (so far) picked up on.

I have seen MSN and Yahoo pick up a dynamic url apparently from another
dynamic page*.

For example: www.plymouthguild.org.uk/cdb/course.php?id=82 I'm pretty sure
is only linked to from another dynamic page, and is in Yahoo.

I just don't trust them to do it reliably or quickly, in the same way that
I don't trust them to spider quickly as many levels down as Google does.
Me neither. Or at all.

BB


--
www.kruse.co.uk SEO (AT) kruse (DOT) demon.co.uk
home of SEO that's shiny!
--

