[squid-users] Squid-cache search engine?

From: Michael Carmack <[email protected]>
Date: Sun, 18 Nov 2001 09:33:46 +0000

Just curious, has there been any discussion about writing a search engine
for Squid? A quick Google search didn't reveal anything, so I apologize
if this has been brought up before.

I recently set up a cache, and I'm thinking it would be interesting to
perform searches that were restricted to the local node, and then extend
this so that queries could be transparently passed on to peers, and in
doing so build a distributed search engine.
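
A rough sketch of the local-then-peers idea, assuming each node can answer a
plain substring query over the pages it has cached. The `Node` class,
`local_search`, `distributed_search` and `max_hops` are hypothetical names made
up for illustration; nothing here is an existing Squid interface, and the
"forwarding" is simulated in one process rather than sent over the network.

```python
class Node:
    """One cache in the sketch: a name, some cached pages, and a peer list."""

    def __init__(self, name, pages=None):
        self.name = name
        self.pages = dict(pages or {})  # url -> page text (stand-in for a real index)
        self.peers = []                 # other Node objects this node forwards to

    def local_search(self, term):
        """Return URLs of locally cached pages whose text contains the term."""
        term = term.lower()
        return [url for url, text in self.pages.items() if term in text.lower()]


def distributed_search(start, term, max_hops=1):
    """Search `start`, then its peers, then their peers, up to `max_hops` hops.

    Returns {url: set of node names holding a matching copy}.
    max_hops=0 restricts the search to the local node.
    """
    seen = {start.name}   # nodes already queried or queued, to avoid loops
    frontier = [start]
    hits = {}
    for _ in range(max_hops + 1):
        next_frontier = []
        for node in frontier:
            for url in node.local_search(term):
                hits.setdefault(url, set()).add(node.name)
            for peer in node.peers:
                if peer.name not in seen:
                    seen.add(peer.name)
                    next_frontier.append(peer)
        frontier = next_frontier
    return hits
```

The `seen` set keeps a query from bouncing between nodes that peer with each
other, and the hop limit bounds how far a single query spreads.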

It seems that you would be able to build a large search engine quite
cheaply. For example, a node peered with 10 machines caching 40GB each can
search 400GB of data. If each of those forwards to another 10, it's 4TB.
Forward once or twice more and you approach Google's scale.
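
The arithmetic above, written out as a small sketch. It assumes every node has
10 distinct peers and that no two caches hold the same data, so the figures are
an upper bound; `coverage_gb` and its parameters are hypothetical names used
only for this illustration.

```python
def coverage_gb(cache_gb, fanout, hops):
    """Cache size reachable at hop `hops`, in GB.

    Assumes each node forwards to `fanout` distinct peers and that cached
    content never overlaps between nodes (a best-case upper bound).
    """
    return cache_gb * fanout ** hops


if __name__ == "__main__":
    for hops in range(1, 5):
        print(hops, "hop(s):", coverage_gb(40, 10, hops), "GB")
```

With 40GB per cache and 10 peers per node this gives 400GB at one hop, 4TB at
two, 40TB at three and 400TB at four. Real caches overlap heavily, so actual
coverage would be lower.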

It also seems that with such a configuration you'd get ranking "for free"
simply by looking at timestamps and the number of overlapping matches.
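
One way the "free" ranking might look, as a minimal sketch: count how many
distinct nodes hold a matching copy of a URL, and break ties by the most recent
cache timestamp. The `(url, node, cached_at)` tuple format and the `rank`
function are assumptions for illustration, and preferring overlap count over
freshness is just one possible choice.

```python
def rank(hits):
    """Order URLs by overlap count, then by freshness.

    `hits` is an iterable of (url, node_name, cached_at) tuples, where
    cached_at is a Unix timestamp of when that node cached the page.
    """
    holders = {}  # url -> set of node names holding a matching copy
    newest = {}   # url -> most recent cached_at seen for that url
    for url, node, cached_at in hits:
        holders.setdefault(url, set()).add(node)
        newest[url] = max(newest.get(url, 0), cached_at)
    # More holders first; among equals, the most recently cached first.
    return sorted(holders, key=lambda u: (len(holders[u]), newest[u]), reverse=True)
```

A URL reported by many peers is treated as popular, and a recent timestamp as a
sign that it is still being requested.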

I would think that a few thousand volunteers contributing to a distributed
search network would secure a weak point in Internet infrastructure. If one
of the nodes dies or unacceptably changes its ranking policy, you ignore
it and only lose a tiny fraction of the search engine. OTOH, if something like
Google goes, you lose 100% and you're left in limbo.

Does such a thing already exist? I've looked at a few distributed search
engines, but none that I looked at worked this way.

m.
Received on Sun Nov 18 2001 - 02:33:49 MST
