Re: [squid-users] accelerator farm: optimizing the sibbling_hit

From: Ard van Breemen <[email protected]>
Date: Thu, 6 Mar 2003 19:04:17 +0100

On Mon, Mar 03, 2003 at 10:26:51PM +0100, Henrik Nordstrom wrote:
> For your situation and most accelerator farms, the following
> configuration should be optimal, I think:
>
> 1. Use smart request routing within the array of Squids to make sure
> that for each URL one of the Squids is denoted "master". For example
> by using the CARP hash algorithm or a manual division using
> cache_peer_access. This gets rid of ICP while at the same time
> preserving cache redundancy (assuming clients hit all your servers
> "randomly").

Hmmm, that sounds very good :-). That makes cache coherency
approach a good 100%.
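
Just to check that I read you right, for the manual division I
picture something like this on each front-end squid (hostnames and
the URL split are invented, and untested):

  # the other squids in the array act as parents; no ICP queries
  cache_peer sq1.example.com parent 80 0 no-query
  cache_peer sq2.example.com parent 80 0 no-query

  # manual division of the URL space: each peer is "master" for
  # one half (the split below is only an example, not exhaustive)
  acl half_a urlpath_regex ^/[a-m]
  acl half_b urlpath_regex ^/[n-z0-9]
  cache_peer_access sq1.example.com allow half_a
  cache_peer_access sq1.example.com deny all
  cache_peer_access sq2.example.com allow half_b
  cache_peer_access sq2.example.com deny all

The CARP hash would presumably replace the acl juggling with a
single option on the cache_peer lines, if the Squid build supports
it.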

> 2. prefer_direct off (as ICP is not used..)
>
> 3. very short peer connect timeout for quick recovery in case of peer
> failure, to compensate for the lack of ICP in determining peer
> availability.
>
> 4. Squid modified to collapse refreshes of the same cached object into
> one request to avoid storms of refreshes when an object expires.
> Normally Squid will issue a refresh for each new request received
> until a new reply has been received after the object has expired,
> which may cause a bit of grief in a heavily hit accelerator setup.

That will always be very interesting, whether you have a farm or
not. I have seen a webserver hit more than 30 times with the same
request, just because some script was too slow, and that was with
only a single squid in front of it.
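
For points 2 and 3, I imagine the relevant knobs look roughly like
this (the timeout values are guesses to tune, not recommendations):

  # do not go direct when a suitable peer is configured
  prefer_direct off

  # fail over quickly when a peer is down, since there is no ICP
  # to tell us about peer availability
  peer_connect_timeout 2 seconds
  connect_timeout 10 seconds

Point 4 would still need the source modification you describe; as
far as I know there is no configuration knob for it.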

> 5. Maybe also to collapse requests for the same URL into one even if
> not already cached, but this is more risky and may cause high latency
> for dynamic information that is not cached.

I think 4 is usually enough for an accelerator setup, since the
high-profile dynamic pages will already be cached, albeit stale,
although it would be cleaner to also collapse their first GET.
 
> The drawback of such a design is that the more members your
> accelerator array has, the closer the cache hop count approaches 2 on
> average for cache misses: the more members, the less likely it is that
> the client request hits the "denoted master cache" for that URL.

Yes, I think you can say the hop count really approaches 2 :-)
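
(Back-of-the-envelope: with N front-end caches hit uniformly and one
denoted master per URL, a miss takes 1 hop with probability 1/N and
2 hops otherwise, so the expected hop count is roughly
1*(1/N) + 2*(1 - 1/N) = 2 - 1/N, which tends to 2 as N grows.)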

I can only think of one reason not to do it, and that is the
failure of one of the caches. A cache failure means that the other
caches will go direct instead of using a second-in-line cache.
That means the site will probably get the same request about
(number of caches) times instead of a single time.
For some sites that is a real no-go.

With ICP you more or less get a "just put any number of caches
there and it works" setup. If one cache fails, who would have
noticed? Hmmm, maybe the two can be combined: use CARP to route
to a cluster of two squids that talk ICP to each other.
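
Something like this is what I have in mind, reusing the half_a acl
from the split above (hostnames invented, completely untested):

  # on each front-end squid: both members of pair "a" are allowed
  # for half_a, so if one dies the request still stays in the farm
  cache_peer sq-a1.example.com parent 80 0 no-query round-robin
  cache_peer sq-a2.example.com parent 80 0 no-query round-robin
  cache_peer_access sq-a1.example.com allow half_a
  cache_peer_access sq-a1.example.com deny all
  cache_peer_access sq-a2.example.com allow half_a
  cache_peer_access sq-a2.example.com deny all

  # on sq-a1 itself: its partner is a plain ICP sibling
  cache_peer sq-a2.example.com sibling 80 3130 proxy-only

That way, I think, a dead peer only costs the pair its redundancy
instead of turning into a storm of directs at the origin.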

(Too bad I only have time for the fuzzy expire :-( )

-- 
program signature;
begin  { telegraaf.com
} writeln("<ard@telegraafnet.nl> SMA-IS | Geeks don't get viruses");
end
.