Re: Redundancy

From: Dancer <[email protected]>
Date: Thu, 06 Nov 1997 21:03:54 +1000

Well, that's why I wrote this code. I've got three servers set up: the main one,
where people normally connect, and two proxy-only Squids on other machines. If
people connect through a proxy that's using sparent and the main one goes out for
any reason, it seamlessly connects to the next server in line. Users don't even
notice.

This is doable for us, as we have a central hub serving a number of leaf systems.
The leaf systems are where the users dial up, and each has a Squid proxy that
fetches its requests from the main proxy at the hub. By setting up each leaf
system with a copy of 'sparent', I can do anything at all to the main proxy, and
only connections in progress are affected. If I put a long enough
shutdown_lifetime on each one, then users never realise that any given unit isn't
responding, as sparent has routed their request to one of the redundant units that
_is_ responding. The redundant units don't cache, of course.
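
For anyone curious, the gist of it is just connect-time failover in front of the
upstream proxies. The sketch below (Python, and NOT the actual sparent code; the
hostnames and ports are only placeholders) shows the idea: listen locally, try the
main proxy first, and quietly fall through to the next upstream if the connect
fails, so only connections already in flight are ever affected.

#!/usr/bin/env python
# Illustrative sketch only -- not the real 'sparent'.  Hostnames and
# ports are placeholders; adjust to taste.
import socket
import threading

LISTEN_PORT = 3128
UPSTREAMS = [("hub-proxy.example.net", 3128),    # main caching proxy
             ("backup1.example.net", 3128),      # proxy-only fallback
             ("backup2.example.net", 3128)]      # proxy-only fallback

def connect_upstream():
    # Try each upstream in order; the first one that accepts the
    # connection wins.
    for host, port in UPSTREAMS:
        try:
            return socket.create_connection((host, port), timeout=5)
        except OSError:
            continue
    return None

def pump(src, dst):
    # Copy bytes one way until the sender closes, then signal EOF to
    # the other side.
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    try:
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass

def handle(client):
    upstream = connect_upstream()
    if upstream is None:
        client.close()          # nothing upstream is answering; give up
        return
    a = threading.Thread(target=pump, args=(client, upstream))
    b = threading.Thread(target=pump, args=(upstream, client))
    a.start(); b.start()
    a.join(); b.join()
    client.close()
    upstream.close()

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("", LISTEN_PORT))
listener.listen(5)
while True:
    conn, _ = listener.accept()
    threading.Thread(target=handle, args=(conn,)).start()

The leaf Squid then just points at the forwarder as its parent, so restarting or
reconfiguring the main proxy at the hub costs nothing beyond the requests that
were already in flight.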

WWW server manager wrote:

> Henny Bekker wrote:
> >
> > [snip; about how to avoid transient disruptions to service due to
> > reconfiguring, rotating logs, etc., also crashes]
> >
> > True.. But in practice Netscape and Microsoft Internet Explorer are
> > the most used browsers.. I agree it's no solution for other browsers such
> > as lynx (which I'm using quite often). However the main proxy does not
> > crash that often (I hope).
>
> Crashes are a fact of life, and hopefully rare. More of a problem is that
>
> * when asked to load a new configuration, Squid rejects connections while
> it writes a new cache/log file containing details of the current cached
> objects. On our server, that takes 2-5 minutes. That's in addition to
> the configurable timeout during which it will reject new connections
> but attempt to finish old ones; with the sample configuration, that adds
> a 30-second delay.
>
> * when asked to rotate the logs, in addition to rotating the "real" logs
> (the ones recording what it's been doing for later analysis) it also
> writes out the current object details to cache/log. That's good since
> the file would otherwise grow without bound, but it's bad because
> while Squid seems to continue accepting connections and to maintain
> existing ones, those connections are "on hold" until it's finished
> writing out cache/log, which again may take 2-5 minutes on our server.
>
> When users start getting annoyed if a remote server cannot be reached or
> fails to return the desired document in maybe 5-10 seconds, multi-minute
> periods with connections rejected or frozen are *not* helpful and are, as
> indicated in the original query, liable to result in users bypassing the
> cache and never bothering to use it again.
>
> The same issue arises for the all too common situation where "it takes
> forever to load via the cache but is immediate if I bypass the cache",
> mostly due to parent caches failing to respond quickly for whatever reason,
> though I'm not convinced that's the whole story. It's not relevant to the
> question under discussion, though, except in relation to user response to
> how the cache behaves (but if there's a subtle but common cause that can
> readily be fixed, details please!).
>
> > You can also use it as a load balancing mechanism.. The only problem with
> > that is that you want to have specific Web traffic on the same Web caching
> > server to improve the HIT-rate..
>
> Load-balancing how? Unless things have changed relatively recently, few if
> any browsers will try more than the first DNS A record found under the cache
> server's name. And if one server of two (say) is not responding then relying
> on DNS round-robin shuffling of A-records would at best arrange for users to
> get a working cache server 50% of the time - which for a page with many
> embedded icons or other images means that even if the textual part of the
> page happens to be loaded via the working server, 50% of the images will be
> "broken image" icons or equivalent, representing failed retrievals via the
> unresponsive cache.
>
> It would be extremely helpful if Squid could be enhanced so as to avoid the
> extended delay while writing out cache/log, either with the sockets still
> live (as with log rotation) or already closed (hence no new connections, as
> with reconfiguration). I suspect this might be tricky, though...?
>
> Would it be viable if in those situations Squid were to treat all requests
> handled while the file was being written as proxy-only, using cache files
> which would be immediately erased after the document had been transmitted to
> the browser? The only specific issue that occurs to me is that it would be
> essential to avoid reusing a cache file in such a way that after reloading
> cache/log a cache file might exist but not contain the file identified by
> data from cache/log ...
>
> An alternative approach might be to keep track of documents processed while
> cache/log was being written, and freeze processing of requests only very
> briefly while those details are appended to cache/log after the pre-existing
> object details have been written and before closing the file.
>
> This would also help in the case of Squid restarts, system shutdowns, etc.,
> where at present you also get a configurable delay with the sockets closed,
> then a long delay while cache/log is written. Either approach would mean
> that the "dead time" would be much reduced.
>
> This doesn't directly address the period with sockets closed after a HUP or
> shutdown request, but it's implicit in the above suggestions that the
> sockets would *not* be closed as early as they are now, else making it
> possible to continue processing requests would be useful only in the log
> rotation case.
>
> A configurable (down to zero) delay with new connections rejected, while
> attempting to finish old requests, *after* writing cache/log (with
> in-progress requests proxied not cached) would make more sense if it were
> possible to continue handling requests while writing cache/log.
>
> Is any of this feasible?
>
> One of my few regrets in deciding to use Squid rather than Netscape's proxy
> server is that with the latter, reconfiguration is "instantaneous" and
> likewise shutdown, at the expense of dropping current requests on the floor;
> in terms of maintaining the confidence of your user community and not having
> them give up on the cache as "useless", dropping a handful of in-progress
> requests with no perceptible period rejecting new connections is a big plus.
> 2-5 minutes of no response or rejected connections equates to a lot more
> disruption than dropping a handful of current connections!
>
> John Line
> --
> University of Cambridge WWW manager account (usually John Line)
> Send general WWW-related enquiries to webmaster@ucs.cam.ac.uk
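
A footnote on the DNS round-robin point quoted above: the problem is precisely
that most browsers stop at the first A record they're handed. If the client
walked the whole list and fell back on a failed connect, round-robin would
degrade gracefully instead of serving up broken images half the time. A rough
sketch of the difference (Python again; the hostname is a placeholder):

import socket

CACHE_NAME = "cache.example.net"   # placeholder; has several A records
PORT = 3128

def naive_connect():
    # What most browsers do: resolve to a single address and hope it answers.
    addr = socket.gethostbyname(CACHE_NAME)
    return socket.create_connection((addr, PORT), timeout=5)

def failover_connect():
    # What you'd want: try every A record before giving up.
    for family, _, _, _, sockaddr in socket.getaddrinfo(
            CACHE_NAME, PORT, socket.AF_INET, socket.SOCK_STREAM):
        try:
            return socket.create_connection(sockaddr, timeout=5)
        except OSError:
            continue
    raise OSError("no cache server answered")

sparent does the equivalent at connect time on the leaf systems, so the browsers
never have to be that clever.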

--
Note to evil sorcerers and mad scientists: don't ever, ever summon powerful
demons or rip holes in the fabric of space and time. It's never a good idea.
ICQ UIN: 3225440