HighDots Forums  

Will Googlebot crawl non-indexed files?

Search Engine Optimization Discussion about SEO/Search Engine Optimization (alt.internet.search-engines)


  #1  
dgarrard@nufocusinc.com
Will Googlebot crawl non-indexed files? - 02-21-2008, 07:35 AM

Hi, I have a database that generates stories and an associated image
that I wish to have indexed by Google and Google Images. Here is an
example: http://www.nufocusinc.com/sightings/...ons_KAN_3.html

These pages are autogenerated in the /sightings/ directory and I do
not want to have to maintain an index.html file that lists all the
files.

- My site root is successfully being crawled.

- My directory listing access in Apache for /sightings/ is forbidden
as in:
http://www.nufocusinc.com/sightings/

- I have added a link in my home page to a dummy file:
http://www.nufocusinc.com/sightings/google.html

So our questions are:
1) Do I have to allow directory listing access to that directory for
googlebot?
2) Is the dummy file idea of any use?
3) Do I have to have an index.html file in /sightings/ ?

My goal is that the pages that are placed in the /sightings/ directory
are crawled and indexed by googlebot.


  #2  
CreativeMind
Re: Will Googlebot crawl non-indexed files? - 02-21-2008, 10:47 AM

Google will not spider newly generated content right away, but it may
do so after some period of time. Actually, I have the same situation,
and I am also looking for an answer to your question.




  #3  
Tonnie Lubbers
Re: Will Googlebot crawl non-indexed files? - 02-21-2008, 12:49 PM

dgarrard (AT) nufocusinc (DOT) com wrote:
Quote:
Hi, I have a database that generates stories and an associated image
that I wish to have indexed by Google and Google Images. Here is an
example: http://www.nufocusinc.com/sightings/...ons_KAN_3.html

These pages are autogenerated in the /sightings/ directory and I do
not want to have to maintain an index.html file that lists all the
files.

- My site root is successfully being crawled.

- My directory listing access in Apache for /sightings/ is forbidden
as in:
http://www.nufocusinc.com/sightings/

- I have added a link in my home page to a dummy file:
http://www.nufocusinc.com/sightings/google.html

So our questions are:
1) Do I have to allow directory listing access to that directory for
googlebot?

NO

Quote:
2) Is the dummy file idea of any use?

NO

Quote:
3) Do I have to have an index.html file in /sightings/ ?

YES, preferably with links. The bot follows links and nothing else. It
might also pick up URLs that appear as plain text and try to access them.
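
To illustrate the point about following links, here is a minimal sketch of how a simple crawler could discover URLs on a page. It is not Googlebot's actual implementation; the names `LinkCollector` and `extract_links` and the example.com URLs are made up for illustration. A page without `<a href>` elements gives such a crawler nothing to follow.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(page_html, base_url):
    """Return the list of link targets found in page_html (hypothetical helper)."""
    collector = LinkCollector(base_url)
    collector.feed(page_html)
    return collector.links
```

Feeding it a page that has only text returns an empty list, which is the situation of a dummy page with no links.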

Quote:
My goal is that the pages that are placed in the /sightings/ directory
are crawled and indexed by googlebot.

Then make a page where the bot can pick them up.

--
Webdesign: http://vision2form.nl/webontwerp/
Short guide to search engine optimization / getting found:
http://vision2form.nl/webontwerp/gevonden-worden.html
Lifestyle - living, travel and enjoyment: http://vision4living.com


  #4  
John Bokma
Re: Will Googlebot crawl non-indexed files? - 02-21-2008, 01:35 PM

dgarrard (AT) nufocusinc (DOT) com wrote:

Quote:
Hi, I have a database that generates stories and an associated image
that I wish to have indexed by Google and Google Images. Here is an
example: http://www.nufocusinc.com/sightings/...ons_KAN_3.html

You might want to give your files smarter names, for example:

/sightings/orange-chat-aurifrons.html

or better:

/sightings/orange-chat-epthianura-aurifrons.html

If the 3 is required (for ID, or because there are more orange chat
sightings):

/sightings/orange-chat-epthianura-aurifrons-3.html


In the filename, though:

* Don't use _
* Don't use CAPS
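
A minimal sketch of how the generator could build such names, assuming the common name, scientific name and optional number are available as separate fields. The function `sighting_filename` and its parameters are hypothetical, not part of the original site.

```python
import re


def sighting_filename(common_name, scientific_name="", number=None):
    """Build a lowercase, hyphen-separated .html filename (hypothetical helper)."""
    parts = [common_name, scientific_name]
    if number is not None:
        parts.append(str(number))
    text = " ".join(p for p in parts if p).lower()
    # Turn every run of non-alphanumeric characters (spaces, underscores,
    # punctuation) into a single hyphen, then trim hyphens at the ends.
    slug = re.sub(r"[^a-z0-9]+", "-", text).strip("-")
    return slug + ".html"


print(sighting_filename("Orange Chat", "Epthianura aurifrons", 3))
# prints: orange-chat-epthianura-aurifrons-3.html
```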


Use h1 instead of h2 if it's the first heading on the page (which it
is).

You might want to link to pages other than the "click here" software
page, even if your site is set up just to feed many pages to Google in
order to promote the software.

You might want to link to similar or related species, for example, or
to an overview.

Quote:
These pages are autogenerated in the /sightings/ directory and I do
not want to have to maintain an index.html file that lists all the
files.

Simple answer: generate that page. It's not that hard.
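
As a rough sketch of what generating that page could look like, assuming the stories are written as .html files into one directory: the function name `build_index`, the page title and the markup are illustrative choices, not the original site's code.

```python
import html
from pathlib import Path


def build_index(directory, index_name="index.html"):
    """Write an index page linking to every other .html file in `directory`.

    Returns the list of filenames that were linked.
    """
    folder = Path(directory)
    pages = sorted(p.name for p in folder.glob("*.html") if p.name != index_name)
    items = "\n".join(
        f'<li><a href="{html.escape(name, quote=True)}">'
        f"{html.escape(Path(name).stem)}</a></li>"
        for name in pages
    )
    document = (
        "<!DOCTYPE html>\n<html><head><title>Sightings</title></head>\n"
        f"<body><h1>Sightings</h1>\n<ul>\n{items}\n</ul>\n</body></html>\n"
    )
    (folder / index_name).write_text(document, encoding="utf-8")
    return pages
```

Calling `build_index` on the sightings directory from the same job that writes the stories would keep the index current without manual maintenance.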

Quote:
- My site root is successfully being crawled.

- My directory listing access in Apache for /sightings/ is forbidden
as in:
http://www.nufocusinc.com/sightings/

- I have added a link in my home page to a dummy file:
http://www.nufocusinc.com/sightings/google.html

So, Google will find a page with hardly any content, which looks spammy,
and no links.

Quote:
So our questions are:
1) Do I have to allow directory listing access to that directory for
googlebot?

What you need is links to every page in sightings. The best way to do
this is to make an index page which links to several category pages,
instead of an index page which links to each and every other page. On
the index page you link to, say, 25 category pages, which should be
useful to your visitors so they can quickly pick the right category for
a bird. On each category page you put links to the birds in that
category.
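
The two-level structure described above could be sketched roughly as follows. This assumes each story already carries a category; the function `build_category_pages`, the `category-` filename prefix and the bare markup are hypothetical choices for illustration.

```python
import html


def build_category_pages(categories):
    """Return {filename: html} for one index page plus one page per category.

    `categories` maps a category name to a list of (href, title) pairs.
    """

    def page(title, links):
        items = "\n".join(
            f'<li><a href="{html.escape(href, quote=True)}">{html.escape(text)}</a></li>'
            for href, text in links
        )
        return f"<h1>{html.escape(title)}</h1>\n<ul>\n{items}\n</ul>\n"

    pages = {}
    index_links = []
    for name in sorted(categories):
        # Hyphens and lowercase only, in line with the filename advice.
        filename = "category-" + name.lower().replace(" ", "-") + ".html"
        pages[filename] = page(name, categories[name])
        index_links.append((filename, name))
    # The index links only to the category pages, not to every story.
    pages["index.html"] = page("Sightings", index_links)
    return pages
```

Every story is then reachable in two clicks from the index, and the index itself stays short.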

Quote:
2) Is the dummy file idea of any use?

No. Google sees no links on it. It *can't* find pages unless there is a
link to them somewhere.

Quote:
3) Do I have to have an index.html file in /sightings/ ?

Yes

Quote:
My goal is that the pages that are placed in the /sightings/ directory
are crawled and indexed by googlebot.

So you need links to those pages. If you configure Apache to
auto-generate an index for you (one link for each file), you have the
worst way of linking, but it's better than nothing (i.e. having a
google.html page with no links and some keywords as content; even the
filename is pointless and misleading).

--
John Bokma http://johnbokma.com/

