Re: Squid is OK, but which UNIX for high loads?

From: Stewart Forster <[email protected]>
Date: Fri, 23 May 1997 13:07:05 +1000

> Have you tried increasing your TCP Listen Queue? This sounds like the listen
> queue is overflowing. On Solaris, you can use "ndd" to increase it, on Linux,
> you need a kernel/libc(?)/squid rebuild.

        This will help only minimally (a 5% performance difference, if that).
It is not the bottleneck you are hitting. Squid under heavy load experiences
multiple bottlenecks, which I'll explain:

1) UDP_HIT_OBJ causes a disk access every time, and causes squid to wait until
    the object is fetched from disk. We saw a substantial speed gain by
    modifying the code to return UDP_HIT_OBJ only for objects already in memory.
2) At 70000 hits/hr you might be starting to hit VM limits with 256MB, as each
    incoming object is brought into RAM completely before being written to
    disk.
    I have mentioned several times on this list that you need a VM friendly
    malloc routine. NB. GNU-malloc and libc malloc are NOT VM friendly. We
    experienced a 5x speedup under tight memory operations when moving to a
    VM friendly malloc. Our malloc routine was written internally at our
    company by myself, so I can't redistribute it, but it has the following
    properties:
        a) Separates malloc meta-data from malloc allocated space
        b) Keeps ALL allocations (meta data and allocated space) on VM page
            boundaries.
        c) Keeps all like-sized allocations together on the same VM page.
        d) Any allocations occur from as low in the address space as possible.
    Squid does many small allocs and frees, requiring the malloc free list to
    be searched constantly. By separating the freelist meta-data from the
    allocated space, all the allocated space doesn't need to be paged in when
    looking for a free block, and so thrashing is reduced. Further, by keeping
    allocations on page boundaries and like sized allocations together, cross
    page usage overlap is prevented, hence reducing thrashing again.

    I wrote our malloc package in an afternoon using the above parameters and
    reverse engineering from the malloc man page. It's not too much effort.
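    The layout properties above can be sketched in C. Everything here (the
    size classes, the 32-slot cap, the function names) is invented for
    illustration; it is not our production allocator:

```c
/* Toy allocator illustrating (a)-(d): per-page metadata kept apart from
 * the data pages, one size class per page, page-aligned pages.  This
 * sketch never returns pages to the OS and caps slots at 32 per page. */
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>
#include <stdint.h>

#define PAGE_SIZE 4096
#define NCLASSES  4
static const size_t class_size[NCLASSES] = { 16, 64, 256, 1024 };

/* (a) Metadata lives here, OUTSIDE the page it describes, so walking
 * the free lists never touches (or pages in) user data. */
struct page_meta {
    void    *page;              /* (b) page-aligned data page */
    uint32_t free_map;          /* bit i set => slot i is free */
    struct page_meta *next;
};

static struct page_meta *pages[NCLASSES];   /* (c) one list per class */

static int pick_class(size_t n)
{
    for (int k = 0; k < NCLASSES; k++)
        if (n <= class_size[k])
            return k;
    return -1;                  /* large blocks handled elsewhere */
}

void *sf_malloc(size_t n)
{
    int k = pick_class(n);
    if (k < 0)
        return NULL;
    size_t slots = PAGE_SIZE / class_size[k];
    if (slots > 32)
        slots = 32;             /* one 32-bit map per page in this toy */
    for (struct page_meta *m = pages[k]; m; m = m->next) {
        if (m->free_map) {
            /* (d) take the lowest free slot on an existing page first */
            int i = __builtin_ctz(m->free_map);
            m->free_map &= ~(1u << i);
            return (char *)m->page + (size_t)i * class_size[k];
        }
    }
    /* No free slot anywhere: add a fresh page-aligned page to class k. */
    struct page_meta *m = malloc(sizeof *m);
    if (m == NULL || posix_memalign(&m->page, PAGE_SIZE, PAGE_SIZE) != 0) {
        free(m);
        return NULL;
    }
    m->free_map = (slots == 32) ? 0xffffffffu : ((1u << slots) - 1);
    m->next = pages[k];
    pages[k] = m;
    return sf_malloc(n);        /* retry; the new page has free slots */
}
```

    Because same-sized allocations always share a page and metadata is
    scanned separately, a free-list search never faults in allocated data.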
3) Disks start slowing things down in a BIG way. The open(), close(), and
    unlink() system calls all take an extraordinary amount of time to complete
    under heavy load (up to 2 secs or more). Squid waits for these calls to
    return and meanwhile does nothing else (it can't, it's blocked). Further,
    read() and write() aren't non-blocking even though the man pages say they
    are. They WILL block. I've seen write()'s take up to 1 second to complete.

    All this badness starts happening at a magical point, where disk loads
    start to creep up and average service times start to skyrocket. At this
    point, everything goes haywire unless you do something about it.

    There are two solutions here.
        a) Flush disk buffers regularly with sync(). Squid builds up an
            enormous amount of changed data over the regular 30s sync interval,
            and most of it is one-off data that gains nothing from sitting in
            the UBC (unified buffer cache). When the 30s sync() finally runs,
            we're talking major disk overload for some number of seconds while
            the disks thrash their little hearts out. We got major wins by
            running a looping shell script:
            
#!/bin/sh
# Push dirty buffers out every 2 seconds so the kernel's 30s update
# cycle never has a huge backlog to flush in one burst.
while :
do
    sleep 2
    sync
done

            This reduced our average disk service time from 500ms to 40ms.
                BIG WIN!
        b) Implementing non-blocking (asynch) read/write/open/close/unlink.
            This is a MAJOR win. I'll be releasing a patched 1.1.10 with this
            code soonish.
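            The idea in (b) is to hand the blocking syscall to a worker
            thread so the main select() loop never stalls on it. A minimal
            sketch (the aio_unlink name and request struct are hypothetical;
            the real patch covers open/close/read/write/unlink with a pool
            of workers):

```c
/* Sketch: run a blocking unlink(2) in a worker thread while the event
 * loop keeps servicing sockets.  Names here are invented for the
 * example, not the actual patched Squid code. */
#include <pthread.h>
#include <string.h>
#include <unistd.h>

struct aio_request {
    char path[256];
    int  done;                  /* set by the worker when finished */
    int  result;                /* return value of unlink(2) */
    pthread_mutex_t lock;
};

static void *unlink_worker(void *arg)
{
    struct aio_request *r = arg;
    int rc = unlink(r->path);   /* may block for seconds under load */
    pthread_mutex_lock(&r->lock);
    r->result = rc;
    r->done = 1;
    pthread_mutex_unlock(&r->lock);
    return NULL;
}

/* Fire and continue: the event loop calls this, goes straight back to
 * servicing sockets, and checks r->done on a later pass. */
int aio_unlink(struct aio_request *r, const char *path, pthread_t *tid)
{
    strncpy(r->path, path, sizeof r->path - 1);
    r->path[sizeof r->path - 1] = '\0';
    r->done = 0;
    r->result = -1;
    pthread_mutex_init(&r->lock, NULL);
    return pthread_create(tid, NULL, unlink_worker, r);
}
```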

        Our machines are Sun SPARC Ultra 2s with 768MB RAM, 2 CPUs and 20GB of
disk. We regularly see loads of 350000 conns/hr and still get a response of
under .1s when telnetting to port 8080 (our cache port) to get the
"Connected..." string.

        Using the above patches we have been able to effectively increase the
speed of squid by 100x under heavy loads, such that it is rarely even
noticeable that the cache boxes are under heavy load. We have yet to see a
peak/cap where our boxes get slow. We've seen 1.85M TCP connections and 3M UDP
conns in a day, and transferred 24GB through the box that day, and it still
did not run noticeably slower for anything less than a 512Kb/sec connection to
our networks.

        Cheers.
>
>
> >
> > I'm also running Linux (2.0.30) and am having similar performance
> > problems. I'd consider switching to FreeBSD (I have the latest CDROM);
> > what are the caveats?
> >
> > Brian
> >
> > On Thu, 22 May 1997, Arjan de Vet wrote:
> >
> > > In article <01BC66CD.59BFEB90@dima.cyberia.net.lb> you write:
> > >
> > > >I'm using Squid 1.1.10 on a PPro 200MHz + 256MB RAM, Linux 2.0.29.
> > > >Squid was running OK for months (started with 1.1.0), until recently,
> > > >when problems started as the load increased (~70,000 connections/hour).
> > > >Symptoms:
> > > >1. Squid's response time increased dramatically. Telnetting to port 8082
> > > >(the cache port) takes 3-4 seconds on the command line for the
> > > >"Connected.." string to appear.
> > > >2. Even when connected, Squid takes 3-4 seconds to retrieve even cached
> > > >objects!
> > >
> > > I have experienced the same problems when I was running Squid on a
> > > Sparc20, 384MB memory with Solaris 2.5.1 and around 50000 conns/hour. As
> > > soon as the number of open connections rose above a certain level,
> > > performance dropped to even worse numbers than yours (>20 secs for
> > > small-sized cache hits).
> > >
> > > >It was suggested that this may be a problem with Squid, so we upgraded
> > > >to 1.1.10 (we were running 1.1.8). It was also suggested that this may
> > > >be a Linux problem, so we upgraded to 2.0.30 (which has the ISS).
> > > >Performance improved, but the delays were still unacceptable. Note that
> > > >memory is abundant (I always have 70-80 Megs of free physical RAM) and
> > > >CPU load rarely goes beyond 1.
> > >
> > > >I want to keep my existing hardware, which leaves me with 2 choices: NT
> > > >and Solaris x86 (don't want FreeBSD). NT is a no-no, but
> > >
> > > Why don't you want FreeBSD? Did you ever try it? I replaced the
> > > Sparc/Solaris setup with a PPRO200/128MB with FreeBSD 2.2.1 for a few days
> > > and the response times for the small-sized cache hits went down from >20
> > > secs to <300 milliseconds... I'm now using BSD/OS 3.0 (also BSD4.4
> > > derived) because the company wanted a commercial OS.
> > >
> > > >Is Solaris x86 2.5.1 recommended for such a setup?
> > >
> > > I've never tried Solaris on x86. But why not try the available options for
> > > some time and then make your decision? And if you need some tuning tips for
> > > FreeBSD or BSD/OS, contact me.
> > >
> > > Arjan
> > >
> > >
> >
>
> --
> Michael 'Moose' Dinn \ Make Lots of Money
> Michael.Dinn@iSTAR.Ca \ Operate Within the Law
> iSTAR Internet Inc. \ Enjoy Your Work
> (902) 481-4524 Voice \ .... choose 2
>
> If you own a 1972 Mustang Convertible... I want to hear from you!
>
Received on Thu May 22 1997 - 20:09:24 MDT

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:35:14 MST