Re: Redundancy

From: WWW server manager <[email protected]>
Date: Fri, 7 Nov 1997 22:14:11 +0000 (GMT)

David J N Begley wrote:
>
> On Thu, 6 Nov 1997, WWW server manager wrote:
>
> > * when asked to load a new configuration, Squid rejects connections while
> > it writes a new cache/log file containing details of the current cached
> > objects. On our server, that takes 2-5 minutes.
>
> Ouch! Is this something that can be fixed in software without risking the
> integrity of the data, or is it a problem at which "bigger iron" should be
> thrown? How different is this process from what happens when Squid first
> starts up (it's still reading the log file, yet it responds to requests
> anyway)?

Oops, on re-reading the text you quoted above, I just realised I made a
mistake there. cache/log doesn't get written out for a reconfigure, but
does when you rotate the logs and when you shut down squid (which, if you're
e.g. stopping it to load a new version, may be something you want to be as
quick and/or non-disruptive as possible, though clearly it's less important
if you're rebooting the whole system). So, for reconfigure the only delay
is the configurable one while Squid waits for existing requests to finish
(if possible, within the timeout). I'm sorry about the accidental
misinformation there.

My main concern at the moment is the delay (with connections accepted but
nothing happening) while cache/log is saved during log rotation; until now,
I've been rotating the logs only monthly, but that will soon be changing to
daily as the size of the logs has grown too much to keep a whole
month's-worth non-compressed for later analysis, and with daily rotation
it's more likely to be noticed and cause complaints (or cause people to stop
using the cache).

Although it happens less often, the delay (this time with new connections
rejected; this is the case I muddled with reconfiguration) also arises
during Squid shutdown, when it waits for existing connections to finish and
only then saves cache/log. Since writing cache/log
takes a long time (somewhat variable, at least 20-30 seconds, up to several
minutes, for our cache), that means somewhere around a minute minimum, up to
several minutes, with connections rejected when the aim of the exercise may
have been an "unnoticed" shutdown of one version and restart with another
(or a configuration change which only takes effect with a full restart, e.g.
changing cache_swap seems to be ignored until Squid is restarted).

In both those situations, if Squid could continue to process requests while
writing cache/log (only rejecting connections totally for the configured
"finishing old requests" timeout when shutting down), the disruption would
be kept to a minimum (and could be set as low as required by reducing the
timeout).
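
To make the idea concrete, here is a minimal sketch of interleaving the two
kinds of work in a single loop. It is purely illustrative and is not how
Squid is implemented: every name in it (write_index_in_chunks,
serve_while_writing, chunk_size) is hypothetical, and requests are modelled
as items in a queue rather than real connections.

```python
import io
from collections import deque


def write_index_in_chunks(entries, out, chunk_size=1000):
    """Write entries to out, yielding after each chunk of lines.

    Yielding hands control back to the caller, so other work can be
    done before the next chunk is written.
    """
    for start in range(0, len(entries), chunk_size):
        chunk = entries[start:start + chunk_size]
        for line in chunk:
            out.write(line + "\n")
        yield start + len(chunk)  # running total of lines written


def serve_while_writing(entries, pending, out, chunk_size=1000):
    """Alternate between writing one chunk and serving queued requests."""
    served = []
    for _written in write_index_in_chunks(entries, out, chunk_size):
        # Between chunks, drain whatever requests are waiting.
        while pending:
            served.append(pending.popleft())
    # Serve anything still queued once the write has finished.
    while pending:
        served.append(pending.popleft())
    return served


if __name__ == "__main__":
    entries = [f"object-{i}" for i in range(2500)]  # stand-in for cache/log lines
    pending = deque(["request-a", "request-b"])     # stand-in for client requests
    out = io.StringIO()
    served = serve_while_writing(entries, pending, out)
    print(len(served), "requests served;", out.getvalue().count("\n"), "lines written")
```

The sketch sidesteps the integrity question raised earlier: it writes a
fixed snapshot, so objects added or removed during the write would need
separate handling.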

I'm hoping Duane or someone else who is familiar with Squid's inner workings
will be able to tell us if it's feasible (and if feasible, whether it's at
all likely it might be done now that the question has been raised).

> At midnight our logs are rotated - at this point, Squid does a
> "storeWriteCleanLog" .. upon its conclusion it reports:
>
> 97/11/07 00:00:52| Finished. Wrote 968123 lines.
> 97/11/07 00:00:52| Took 52 seconds (18617.8 lines/sec).
>
> During those 52 seconds, users desperately trying to "click 'n' shoot" may
> have a couple of false starts, but for the majority of users the delay is
> completely unnoticed (or by the time they hit "reload", everything is
> running again). Total complaints after >12 months of doing this - zero.

It has sometimes been quite a bit longer on our server, admittedly when I needed
to do a "quick" Squid restart during the day rather than for log rotation
overnight. I'd still like to avoid it if possible (which the external
redirector suggestions wouldn't do, as Squid accepts connections but then
does nothing with them for a while) since even if people don't complain,
they may vote with their feet, or rather, their browser configurations.

A faster system could reduce the delay, but that would be a rather expensive
solution...
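
For a rough sense of scale, the rate in the quoted log excerpt follows
directly from its two figures, and the same arithmetic gives a crude
estimate of the pause for a larger index. This is only a sketch: the
function names are made up for illustration, and the estimate assumes the
write rate stays constant as the index grows, which may not hold in
practice.

```python
def write_rate(lines: int, seconds: float) -> float:
    """Lines written per second for one cache/log write."""
    return lines / seconds


def estimated_pause(lines: int, rate: float) -> float:
    """Seconds a write would take, ASSUMING the rate stays constant."""
    return lines / rate


# Figures taken from the quoted log excerpt.
rate = write_rate(968123, 52)
print(f"{rate:.1f} lines/sec")

# Hypothetical case: an index twice that size on the same hardware.
print(f"{estimated_pause(2 * 968123, rate):.0f} seconds")
```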

> > indicated in the original query, liable to result in users bypassing the
> > cache and never bothering to use it again.
>
> If it matters not to the site if the users go direct or via the proxy,
> then who cares? If it *does* matter, then there are alternatives:
>
> - force users to use the proxy;
> - if users are charged for traffic volume, provide discounts for use of
> the proxy (ie., it becomes cheaper for users to wait for the proxy than
> to impatiently go direct);
> - provide properly redundant proxying systems (a single front-end to a
> single back-end proxy isn't "redundancy"); or,
> - throw enough grunt at the one proxy you have so that it *does* stay up
> without imposing unbearable delays on proxy usage.

Access is not charged by usage and we cannot force people to use the cache,
but there is a requirement to try to encourage people to use the cache (or,
more importantly, to gain the indirect effect of reducing usage on the
international links due to cache hits at the parent (national) cache
systems). The redundant-systems option doesn't help if connections are
accepted but then "put on hold" for a while. Beefing up the system might
reduce the delays enough that they could be ignored...

> I know this sounds simplistic and there probably is something in Squid
> that can be "tweaked" .. I just don't agree with the idea that whenever
> something goes wrong it's automatically "Squid's fault".

I didn't mean to suggest that it was "Squid's fault" - especially as I can
see in general terms how the implementation arrived in its current form
(and indeed, the current handling of cache/log is vastly better than the way
the cache was - or was not - preserved over restarts in early versions of
Squid and/or its Harvest cache precursor). And I certainly did not mean to
sound like I was demanding that it be fixed, if that's how it came across.

Rather, I was prompted by the discussion to document (or misdocument, in
part; sorry) the points which had been worrying me, and ask whether in
fact there was any possibility of changes that would reduce or avoid their
impact, presumably as part (a new part, unless it was already on Duane's
list of desirable enhancements) of the general Squid development plan.
[As a side effect of raising the issues, it could also serve to find out
whether there was much support for this, or whether no-one else saw it as an
issue...]

                                John Line

-- 
University of Cambridge WWW manager account (usually John Line)
Send general WWW-related enquiries to webmaster@ucs.cam.ac.uk
Received on Fri Nov 07 1997 - 14:28:19 MST
