Autoup tuning - WAS: Squid hangs weekly on SUN Ultra (fwd)

From: Stewart Forster <[email protected]>
Date: Wed, 01 Apr 1998 10:17:14 +1000

> Where synchronous writes are taking place (e.g. to raw or via a ufs filesystem
> using directio) fsflush should not be causing much load at all, since it will
> have nothing to flush, because the data is written directly to the disk.

        The asynchronous library I wrote was implemented using pthreads:
it selects a free thread from a pool and issues the disk operation from
that thread. All disk operations are therefore synchronous with respect to
the OS. This is the model being used in squid 1.2 and currently on our
modified squid 1.1 caches.
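
        Roughly, the model looks like this (a minimal C sketch of the
thread-pool idea; the structure and function names are illustrative, not
the actual library's):

    #include <pthread.h>
    #include <stddef.h>
    #include <unistd.h>

    /* One queued disk operation; fields are illustrative. */
    struct disk_request {
        int fd;
        const void *buf;
        size_t len;
        struct disk_request *next;
    };

    static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  queue_cond = PTHREAD_COND_INITIALIZER;
    static struct disk_request *queue_head = NULL;

    /* Each pool thread loops here.  The write() it issues is an
     * ordinary blocking call, so every disk operation is synchronous
     * as far as the OS is concerned; only the main thread sees the
     * operation as asynchronous. */
    static void *worker_main(void *arg)
    {
        struct disk_request *req;

        (void) arg;
        for (;;) {
            pthread_mutex_lock(&queue_lock);
            while (queue_head == NULL)
                pthread_cond_wait(&queue_cond, &queue_lock);
            req = queue_head;
            queue_head = req->next;
            pthread_mutex_unlock(&queue_lock);

            (void) write(req->fd, req->buf, req->len);
            /* ...signal completion back to the main thread here... */
        }
        return NULL;
    }

    /* The main thread pushes a request onto the list and wakes a
     * free worker, then carries on without blocking on the disk. */
    static void queue_disk_op(struct disk_request *req)
    {
        pthread_mutex_lock(&queue_lock);
        req->next = queue_head;
        queue_head = req;
        pthread_cond_signal(&queue_cond);
        pthread_mutex_unlock(&queue_lock);
    }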

> Usual cases where the system can benefit from tuning are where there is a great
> deal of local filesystem i/o - so proxy servers sound good.

        Agreed.

> Remember this is no magic bullet to speed up your filesystems - it just
> reduces the amount of CPU time that the fsflush process consumes.

        I think you've overlooked that disks, if they are asked to do
too much at once, start to thrash badly. Disk drives have a VERY real knee
in their performance curve. Once you push them beyond a certain point, the
service times start to skyrocket with each additional disk transaction.
The only way the load balances is through the disks running so slowly that
cache hits become sluggish and users start backing off. This is not an
ideal scenario.
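
        (To put that knee in rough queueing terms: in a simple M/M/1
model, the expected time a request spends at the disk is W = 1/(mu - lambda),
where mu is the disk's service rate and lambda is the arrival rate. As
lambda approaches mu, W climbs toward infinity; that is the skyrocketing
service time. This is only an illustrative approximation, not a measurement
from our caches.)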

        CPU is irrelevant in the context of `autoup' tuning, given the
way a proxy-cache demands disk IO.

> Basically, tune fsflush as follows - watch the amount of CPU the process
> takes. If more than about 5% (i.e. 5 CPU seconds in 100 real seconds) then
> add 120 say to the autoup value in /etc/system:
>
> set autoup=120
>
> Give it a boot and see what happens. Note that this is an iterative process
> and you might need to do it a few times before you can get the loading down.
>
> One thing to be aware of is that fsflush is also responsible for flushing
> modified entries from the inode cache to disk. So possibly you need to
> look at the inode cache tuning as well. Check out SunWorld Online for
> some tuning docs on this, or get Adrian Cockcroft's new edition of Sun Performance
> and Tuning. I think it is due out shortly.

        CPU is not the issue (again). By setting a larger autoup value, the
buffer cache is being asked to keep more objects. This strains VM resources
severely, since pages cannot simply be tossed quickly when more memory is
required; they have to be flushed out synchronously, delaying the
processing of the cache software. Further, every `autoup' seconds there
appears to be a massive burst of IO (I haven't been able to identify what
this consists of, inode??), which sends the disks into a state of thrashing
that lasts for up to 15 seconds (we have 10 2GB and 6 4GB spindles on one
of our caches, for example). During this thrash period, EVERYTHING suffers.
Writes and reads get delayed, meaning that dirty pages sit in memory; this
forces the machine out of memory, causing massive IO hangs. Cache hits take
ages to complete.

        The trick is to use Squid's own object cache. Squid caches entire
objects, not disk pages, so it already has a better chance of utilising
available memory correctly. A web proxy cache does a whole lot of writes
that never get read again within the period during which the object resides
in Squid's memory object store. Disk write buffering therefore becomes
effectively irrelevant. Better to flush that data out ASAP to free more
memory.
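
        As a minimal C sketch of what I mean (write_object_now is an
illustrative name, not a Squid function; it assumes plain POSIX calls):

    #include <fcntl.h>
    #include <unistd.h>

    /* Write an object's bytes and force them to disk immediately, so
     * the pages are clean and can be reclaimed at once rather than
     * sitting dirty in the buffer cache waiting for fsflush. */
    int write_object_now(const char *path, const void *buf, size_t len)
    {
        int fd;
        ssize_t n;

        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return -1;
        n = write(fd, buf, len);
        if (n == (ssize_t) len)
            fsync(fd);      /* push the dirty pages out now */
        close(fd);
        return (n == (ssize_t) len) ? 0 : -1;
    }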

        Our caches are moving around 2MB/sec of data. If the disks start to
thrash for 10 seconds, that's a 20MB memory shortfall the system has to
overcome immediately, because the disks are failing to keep up. If the
buffer cache is full of dirty pages from tuning autoup too high, the
problem just compounds. We've seen our caches suffer a measurable 50x
slowdown when the autoup flush occurred. Making autoup bigger made the
problem worse.

        Squid and proxy-caching place demands on general-purpose operating
systems that they were never really designed to handle. The theory may be
sound, but in practice the machines are constantly running just below 4
different performance knees (disk, memory, CPU, network), and it's a major
balancing act. By tuning variables without understanding the side-effects
(and there are MANY), you risk causing more damage than you cure. Tuning a
web-proxy system demands an entirely "holistic" approach. Everything must
be taken into account.

        Stew.

-- 
Stewart Forster (Snr. Development Engineer)
connect.com.au pty ltd, Level 9, 114 Albert Rd, Sth Melbourne, VIC 3205, Aust.
Email: slf@connect.com.au   Phone: +61 3 9251-3684   Fax: +61 3 9251-3666