HighDots Forums  

Re: I am giving up wikipedia mirror

Search Engine Optimization Discussion about SEO/Search Engine Optimization (alt.internet.search-engines)





#1
John Bokma

Re: I am giving up wikipedia mirror - 03-06-2006, 03:14 PM






Ignoramus23035 <ignoramus23035 (AT) NOSPAM (DOT) 23035.invalid> wrote:

Quote:
Last night, I made changes for robots.txt, so far no effect.

It takes at least a day. Did you check with Google Sitemaps whether Google
understands your new version? It's easy to make a tiny mistake.
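
For a quick local sanity check before waiting on Google, something like the following might help. It is only a sketch using Python's standard urllib.robotparser module; the robots.txt content, the example.com URLs, and the is_allowed helper are made up for illustration, and this parser is not guaranteed to read the file the same way Googlebot does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if this parser thinks user_agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/wiki/Main_Page",
                "http://example.com/about.html"):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

A typo such as a misspelled path shows up here as a URL that is unexpectedly allowed.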


--
John Experienced (web) developer: http://castleamber.com/

Perl RSS Builder: http://johnbokma.com/perl/rss-web-feed-builder.html


#2
John Bokma

Re: I am giving up wikipedia mirror - 03-06-2006, 03:37 PM






Ignoramus23035 <ignoramus23035 (AT) NOSPAM (DOT) 23035.invalid> wrote:

Quote:
On 6 Mar 2006 20:14:06 GMT, John Bokma <john (AT) castleamber (DOT) com> wrote:
Ignoramus23035 <ignoramus23035 (AT) NOSPAM (DOT) 23035.invalid> wrote:

Last night, I made changes for robots.txt, so far no effect.

Takes at least a day. Did you check with Google Sitemaps if Google
understands your new version? It's easy to make a tiny mistake.

Well, I think that robots.txt overrides sitemaps.

No, I mean that Google Sitemaps has an option to check robots.txt. Apologies for
being a bit vague.


http://www.google.com/webmasters/sit...stats?siteUrl=

and click on robots.txt tab.

Nifty eh?

--
John Freelance Perl programmer: http://castleamber.com/

Quick Bookmarks: http://johnbokma.com/firefox/quick-l...bookmarks.html


#3
William Tasso

Re: I am giving up wikipedia mirror - 03-06-2006, 04:05 PM



Fleeing from the madness of the Castle Amber - software development jungle
John Bokma <john (AT) castleamber (DOT) com> stumbled into
news:alt.internet.search-engines,alt.www.webmaster
and said:

Quote:
...
and click on robots.txt tab.

Talking of robots.txt - is it possible to stick wildcards in the exclusion
list?

--
William Tasso

whither a trophy?


#4
John Bokma

Re: I am giving up wikipedia mirror - 03-06-2006, 04:44 PM



"William Tasso" <SpamBlocked (AT) tbdata (DOT) com> wrote:

Quote:
Fleeing from the madness of the Castle Amber - software development
jungle John Bokma <john (AT) castleamber (DOT) com> stumbled into
news:alt.internet.search-engines,alt.www.webmaster
and said:

...
and click on robots.txt tab.

talking of robots.txt - is it possible to stick wildcards in the
exclusion list?

<http://www.robotstxt.org/wc/norobots.html>

From what I read, no. The problem is that the robots.txt "standard" (ha ha) is
badly written, maybe because back in those days you could count the bots
on one finger.
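
To make the "no wildcards" point concrete: under the original document, a Disallow value is a plain prefix, and a URL is excluded when its path starts with that value. Below is a minimal sketch of that rule; the function name and the sample rules are hypothetical, and real robots may behave differently.

```python
def disallowed_by_prefix(path: str, disallow_rules: list[str]) -> bool:
    """Original-standard style matching: a rule blocks any path starting with it."""
    # An empty Disallow value blocks nothing, so empty rules are skipped.
    return any(rule and path.startswith(rule) for rule in disallow_rules)


# Hypothetical rules: the second one contains a "*" that is NOT treated as a wildcard.
rules = ["/cgi-bin/", "/*.gif"]

if __name__ == "__main__":
    print(disallowed_by_prefix("/cgi-bin/search", rules))
    print(disallowed_by_prefix("/images/logo.gif", rules))
```

With pure prefix matching the "/*.gif" rule only blocks paths that literally begin with "/*.gif", so "/images/logo.gif" stays allowed.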

--
John
Sig-o-matic says:
OMG, lolzzz (c)... brb


#5
Borek

Re: I am giving up wikipedia mirror - 03-06-2006, 04:45 PM



On Mon, 06 Mar 2006 22:05:21 +0100, William Tasso <SpamBlocked (AT) tbdata (DOT) com>
wrote:

Quote:
and click on robots.txt tab.

talking of robots.txt - is it possible to stick wildcards in the
exclusion list?

In general - no (check www.robotstxt.org for details). Some bots may know
better.

Best,
Borek
--
http://www.chembuddy.com
http://www.bpp.com.pl


#6
Borek

Re: I am giving up wikipedia mirror - 03-06-2006, 06:45 PM



On Mon, 06 Mar 2006 22:45:22 +0100, Borek
<m.borkowski (AT) delete (DOT) chembuddy.these.com.parts> wrote:

Quote:
and click on robots.txt tab.

talking of robots.txt - is it possible to stick wildcards in the
exclusion list?

In general - no (check www.robotstxt.org for details). Some bots may
know better.

Sorry all - no idea why dupes are being posted. I am sending only once; it looks
like there is a bug either in the news server or in Opera. It happens only when
the thread is shared between newsgroups (alt.internet.search-engines and
alt.www.webmaster in this case). In one group (aise, where I
originally posted) there are single posts; in the second (aww) there are
dupes.

I am posting in aww this time, perhaps there will be dupes in aise?

Best,
Borek
--
http://www.chembuddy.com
http://www.ph-meter.info


#7
Mark Parnell

Re: I am giving up wikipedia mirror - 03-06-2006, 07:12 PM



Deciding to do something for the good of humanity, Borek
<m.borkowski (AT) delete (DOT) chembuddy.these.com.parts> declared in
alt.internet.search-engines,alt.www.webmaster:

Quote:
In one group (aise - where I have
originally posted) there are single posts, in the second - aww - there are
dupes.

Not here. Only saw your post once.

--
Mark Parnell

Now implementing http://blinkynet.net/comp/uip5.html


#8
Borek

Re: I am giving up wikipedia mirror - 03-06-2006, 07:19 PM



On Tue, 07 Mar 2006 01:12:22 +0100, Mark Parnell
<webmaster (AT) clarkecomputers (DOT) com.au> wrote:

Quote:
In one group (aise - where I have
originally posted) there are single posts, in the second - aww - there
are dupes.

Not here. Only saw your post once.

Doesn't matter where I post - it lands twice in aww and once in aise. But
if it is only my problem, I am not going to investigate.

Best,
Borek
--
http://www.chembuddy.com
http://www.bpp.com.pl


#9
Jim

Re: I am giving up wikipedia mirror - 03-07-2006, 09:42 AM



Quote:
talking of robots.txt - is it possible to stick wildcards in the
exclusion list?

There's "User-Agent: *", but apart from that, not according to the
standard. (Though certain robots support certain extensions to the
standard.)

Google has some support for wildcards
(http://www.google.com/webmasters/remove.html).
To remove all files of a specific file type (for example, .gif), you'd use
the following robots.txt entry:
User-agent: Googlebot
Disallow: /*.gif$

To remove dynamically generated pages, you'd use this robots.txt entry:
User-agent: Googlebot
Disallow: /*?
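
For anyone wanting to test such rules offline, here is a rough sketch of how the two extensions in the entries above could be interpreted: "*" as "any run of characters" and a trailing "$" as "end of URL". The function names are made up, and this is an assumption about the matching semantics, not Google's actual implementation.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a wildcard Disallow rule into a compiled regular expression."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then rejoin the pieces with ".*" where "*" stood.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    # A trailing "$" in the rule pins the match to the end of the path.
    return re.compile(pattern + (r"\Z" if anchored else ""))


def blocked(path: str, rule: str) -> bool:
    """Return True if path is excluded by the given wildcard rule."""
    # re.match only matches at the start of the string, like a prefix rule.
    return rule_to_regex(rule).match(path) is not None


if __name__ == "__main__":
    print(blocked("/images/logo.gif", "/*.gif$"))
    print(blocked("/images/logo.gif?x=1", "/*.gif$"))
    print(blocked("/page.php?id=3", "/*?"))
    print(blocked("/page.html", "/*?"))
```

Under these assumptions the first and third calls report a blocked path and the other two do not. Robots that only follow the original standard would read both rules as literal prefixes and block none of these paths.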



