Roll your own search engine
12-05-2005, 10:37 PM
Is anyone here familiar with good software for running a niche search
engine (crawling a subset of the web, say 10 million pages)? So far
we've found that ASPseek seems dead and apparently has issues on newer
Linux distros. We're playing with Nutch, which is active, but we're a
LAMP shop, not a Java shop, and my developers have been struggling for a
week or two to get it to do much (I suspect it's just a very powerful
tool that's poorly documented). Ideally I'd like something GPL'ed so I
can monkey with the code as needed, for example so I can specify how I
define which sites match the niche.
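To make that last requirement concrete, here is a rough sketch of the kind of hook I'd want to be able to edit. It is plain Python, not taken from any crawler; the function name `matches_niche`, the host list and the keywords are all made-up placeholders, and a real rule would probably be more involved.

```python
from urllib.parse import urlparse

# Placeholder niche definition: these hosts and keywords are illustrative only.
NICHE_HOSTS = {"example.org", "example.net"}
NICHE_KEYWORDS = ("telescope", "astronomy")


def matches_niche(url, page_text=""):
    """Return True if the host is allowlisted or the text mentions a niche keyword."""
    host = (urlparse(url).hostname or "").lower()
    # Accept an allowlisted host itself and any of its subdomains.
    if any(host == h or host.endswith("." + h) for h in NICHE_HOSTS):
        return True
    # Otherwise fall back to a simple case-insensitive keyword check.
    text = page_text.lower()
    return any(kw in text for kw in NICHE_KEYWORDS)
```

The crawler would call something like this on each candidate URL (and optionally the fetched text) before deciding whether to index the page or follow its links.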
Is there any software out there that might fit the bill?
Thanks in advance.