#11
Big Bill <kruse (AT) cityscape (DOT) co.uk> wrote in news:5uenp0dllts9nfu56m7fmn1q8bt48edt9v (AT) 4ax (DOT) com:

will actually be followed by the engines? http://www.foo.co.uk/document.aspx?id=yada.htm I'm thinking that big bad session id is going to upset them. But I'd like some input in case there's some dumb thing I'm overlooking.

Um, what session id?

I can see a ? and something that probably calls a particular document into a template, but nothing in that URL that looks likely to change by session. Or do you mean that this page sets a cookie, and uses a session ID if the cookie is denied?

Usually a URL with a session on it looks more like this: http://www.foo.co.uk/document.aspx?i...=1232454129879 and the SID is often only visible if you turn off cookies. A URL in the style of http://www.foo.co.uk/document.aspx?id=yada.htm should be fine in Google if that's all it is. Yahoo and MSN are usually OK too, though they are a bit more picky - I would not rely on them to spider a dynamic doc from another dynamic doc, though Google will.

Oh, I did once have a dynamic site where changing id=bla to something like page=bla solved a spidering problem - there was some idea about at the time that spiders disliked particular variable names - but I didn't test it extensively as the first thing I tried worked ;-)
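
For anyone who wants to check their own links, here is a minimal Python sketch of the distinction drawn above between a plain `id=` parameter and a real session ID. The list of parameter names is my own assumption (common conventions, not anything a search engine has published), and the second URL in the usage note is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical list of parameter names that commonly carry session IDs.
# This is an assumption for illustration, not a rule used by any search engine.
SESSION_PARAM_NAMES = {"sid", "sessionid", "session_id", "phpsessid", "jsessionid"}

def session_like_params(url):
    """Return the query parameter names in `url` that look like session IDs."""
    query = urlsplit(url).query
    return [
        name
        for name, _value in parse_qsl(query, keep_blank_values=True)
        if name.lower() in SESSION_PARAM_NAMES
    ]
```

Called on http://www.foo.co.uk/document.aspx?id=yada.htm this returns an empty list, while a made-up URL such as http://www.example.com/page.aspx?id=yada.htm&sid=12345 returns `["sid"]`.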
#12
On 17 Nov 2004 22:40:36 GMT, John Bokma <postmaster (AT) castleamber (DOT) com> wrote:

Since there is no way that Google can see if a page is dynamic or not, they go for a few simple rules, and one is id=somenumber afaik/iirc. So the webmaster has some control over the spider speed. If you want it not too fast, use URLs that trigger one of Google's "this is a dynamic page" rules. Otherwise use mod_rewrite, and hide the fact :-) Or at least, that is how I explain things, they might be wrong.

and if you explain things wrong, who does know? Anywhere?

Who do I believe?

I'm for sticking to static pages linking in to a series of dynamic links for product exhibition. It cuts all the imponderables out.
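
For reference, the mod_rewrite suggestion quoted above boils down to mapping a static-looking path onto the real dynamic URL. Here is a rough Python sketch of that mapping only, not an Apache configuration; the `/document/<id>` public scheme and the `rewrite` function name are hypothetical, chosen to match the example URL in this thread.

```python
import re

# Hypothetical public URL scheme: /document/<id> stands in for
# /document.aspx?id=<id>. Both the scheme and the pattern are assumptions.
STATIC_PATTERN = re.compile(r"^/document/([A-Za-z0-9_.-]+)$")

def rewrite(path):
    """Map a static-looking path to the dynamic URL behind it, or None if no match."""
    match = STATIC_PATTERN.match(path)
    if match is None:
        return None
    return "/document.aspx?id=" + match.group(1)
```

With this, `rewrite("/document/yada.htm")` gives `/document.aspx?id=yada.htm`, so the spider only ever sees the static-looking form.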
#13
Big Bill wrote:

will actually be followed by the engines? http://www.foo.co.uk/document.aspx?id=yada.htm I'm thinking that big bad session id is going to upset them.

Depends where id comes from. If it is obtained by filling in a form or by URL rewriting, robots won't do it.

If it is picked up from an OBL on another page, no problem, at least in theory. Session ids are usually used to refer to information obtained by logging into a site, or to information stored in the URL (via URL rewriting) or in a cookie to track a user around a site. Spiders won't save these pieces of data for subsequent requests.
#14
Ah, now we're getting to the "only spiders dynamic links one deep" thing I've mentioned and no-one else has (so far) picked up on.

Oh, I did once have a dynamic site where changing id=bla to something like page=bla solved a spidering problem - there was some idea about at the time that spiders disliked particular variable names - but I didn't test it extensively as the first thing I tried worked ;-)

So we're even more in the dark than ever then.
#15
Much of which I knew, actually, but I don't think the web hosts I'm dealing with know about any of this stuff. Client is entirely bewildered and I'm trying to do best by all parties. There's a follow up to this post I'd like you to keep an eye out for please. I also understand from various sources that Google will only go one deep into a dynamic layer of links (think, won't go past second breadcrumb). Any comments there please?

It's nice to see some new people coming out of the woodwork because a lot of the knowledgeable people in here seem to have got pissed off with the likes of Stoma and small mouse trumpeting their ignorance and gone off to forums. I've not found one I'm comfortable in yet so, still here.
#16
Big Bill wrote:

Much of which I knew, actually, but I don't think the web hosts I'm dealing with know about any of this stuff. Client is entirely bewildered and I'm trying to do best by all parties. There's a follow up to this post I'd like you to keep an eye out for please. I also understand from various sources that Google will only go one deep into a dynamic layer of links (think, won't go past second breadcrumb). Any comments there please?

(One) source: http://www.webmasterworld.com/forum3/23297-3-10.htm

"If you want your site to be crawled by as many search engines as possible, things like static urls always help, and if you can't do that then fewer parameters probably help."

and: http://answers.google.com/answers/threadview?id=196839

Personally, I'd always use mod_rewrite with Apache for a site that required dynamic links (one, two or more). It does get pretty techy though. You've been warned ;-)

On another note, url rewriting is a job for the webserver imo, rather than the scripting language (i.e. PATH_INFO with PHP): http://httpd.apache.org/docs/mod/mod_rewrite.html

It's nice to see some new people coming out of the woodwork because a lot of the knowledgeable people in here seem to have got pissed off with the likes of Stoma and small mouse trumpeting their ignorance and gone off to forums. I've not found one I'm comfortable in yet so, still here.

http://www.sitepoint.com is pretty good for this kinda stuff, and possibly better than alt.*.usenet.
#17
However, if it is a genuine session id, the session might have expired by the bot's next visit, in which case the bot gets an error page.
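
One way round the expiry problem is to make sure the links you publish never carry the session parameter in the first place. A minimal Python sketch, assuming the session ID travels in a query parameter called `sid` (that name, the function name and the example URL in the note are hypothetical):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_session_param(url, param="sid"):
    """Return `url` with the (assumed) session parameter removed from the query string."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name != param
    ]
    # Rebuild the URL with only the remaining parameters.
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

For a made-up URL like http://www.example.com/page.aspx?id=yada.htm&sid=12345 this returns http://www.example.com/page.aspx?id=yada.htm, which stays valid however long the bot waits before coming back.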
#18
Big Bill wrote:

Ah, now we're getting to the "only spiders dynamic links one deep" thing I've mentioned and no-one else has (so far) picked up on.

I doubt that statement. But maybe someone else can clarify?
#19
URL in the style of http://www.foo.co.uk/document.aspx?id=yada.htm should be fine in Google if that's all it is. Yahoo and MSN usually OK too, though they are a bit more picky - I would not rely on them to spider a dynamic doc from another dynamic doc, though Google will.

Ah, now we're getting to the "only spiders dynamic links one deep" thing I've mentioned and no-one else has (so far) picked up on.
#20
Big Bill <kruse (AT) cityscape (DOT) co.uk> wrote in news:8rlpp0tdhprr7g4es424tft5i49gevriq1 (AT) 4ax (DOT) com:

URL in the style of http://www.foo.co.uk/document.aspx?id=yada.htm should be fine in Google if that's all it is. Yahoo and MSN usually OK too, though they are a bit more picky - I would not rely on them to spider a dynamic doc from another dynamic doc, though Google will.

Ah, now we're getting to the "only spiders dynamic links one deep" thing I've mentioned and no-one else has (so far) picked up on.

I have seen MSN and Yahoo pick up a dynamic url apparently from another dynamic page*. For example, www.plymouthguild.org.uk/cdb/course.php?id=82 is, I'm pretty sure, only linked to from another dynamic page, and it is in Yahoo. I just don't trust them to do it reliably or quickly, in the same way that I don't trust them to spider as many levels down, as quickly, as Google does.