Re: How to tell that my cache is overloaded...

From: Oskar Pearson <[email protected]>
Date: Mon, 30 Jun 1997 11:05:00 +0200

Nigel Metheringham writes:
>
> I'd like a list of things to watch so that I can tell when our cluster of
> cache machines is running out of steam. What points, other than general
> response time, which tends to depend very much on what you are looking
> at, show a cache system that is just getting more hits than it can cope
> with??
hmm - this is an ugly question :)

what we have here:

A script runs every 5 minutes and fetches the cache-info page. It then does
all sorts of things, like printing out disk objects etc. It is the same
script that http://ircache.nlanr.net/Cache/Statistics/Vitals/ runs, but with
all sorts of extra stuff that does graphing etc.

Also - we then run a script called response-times on the access.log files
that does the following (I didn't write these - Duane did, but he considers
them "non-release-code" so don't bother him with requests! This is just
giving credit when it's due!):

tcp-svc-time-hist.pl < ~/access.log > svc-time-hist
system("~/response-times.gnuplot | ppmchange rgb:0/0/0 rgb:ffff/ffff/ffff rgb:0303/0303/0303 rgb:0/0/0 | ppmtogif > /home/httpd/html/cache/vitals/response-time-$DATE.gif 2>/dev/null");

tcp-svc-time-hist.pl looks like this:
--------------
#!/usr/bin/perl

# $Id: tcp-svc-time-hist.pl,v 1.1 1997/05/16 04:56:13 wessels Exp $

# Make a log-based histogram of HTTP request service times.
#
# perl tcp-svc-time-hist.pl < /usr/local/squid/logs/access.log > hist
# gnuplot
# > set logscale x
# > plot 'hist' using 1:4 with lines

$SF = 10;	# scale factor: number of bins per unit of ln(milliseconds)

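while (<>) {
        @F = split;
        # 4th field is the result code (e.g. TCP_HIT/200); skip other lines
        next unless ($F[3] =~ /^TCP_/);
        # 2nd field is the elapsed time in ms; log() dies on values <= 0
        next if ($F[1] <= 0);
        $bin = int( $SF * log($F[1]) + 0.5);
        $H[$bin]++;
}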

$sum1 = 0;
$sum2 = 0;

for ($i=0; $i<=$#H; $i++) {
        $sum1 += $H[$i];
}

# columns: time (seconds), count, fraction of total, cumulative fraction
for ($i=0; $i<=$#H; $i++) {
        $sum2 += $H[$i];
        printf "%14.5f %9d %10.5f %10.5f\n",
                (exp($i/$SF)/1000),
                $H[$i],
                $H[$i]/$sum1,
                $sum2/$sum1;
}

# print the median, interpolating linearly inside the bin where the
# cumulative count crosses 50%
$sum3 = 0;
for ($i=0; $i<=$#H; $i++) {
        $sum3 += $H[$i];
        $S[$i] = $sum3;
        last if ($sum3 / $sum1 >= 0.5);
}

$X = ($i > 0) ? $S[$i-1] : 0;	# $S[-1] would wrap around to the last element
$Z = $S[$i];
$Y = $sum2 / 2;
print "#i=$i\n";
print "#X=$X\n";
print "#Z=$Z\n";
print "#Y=$Y\n";
die if ($Y < $X);
die if ($Y > $Z);
$B = ($i -1) + ($Y-$X) / ($Z - $X);
print "#B=$B\n";
printf "# median is %f seconds\n", exp($B/$SF) / 1000;
---------------------
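
To make the binning a bit more concrete, here is a minimal sketch (not part
of Duane's script; the helper names bin_of and bin_to_seconds are made up
for illustration) that applies the same two formulas to a few elapsed times
in milliseconds:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $SF = 10;    # same scale factor as in tcp-svc-time-hist.pl

# Hypothetical helper: histogram bin for an elapsed time in milliseconds,
# using the same expression as the script above.
sub bin_of {
    my ($ms) = @_;
    return int($SF * log($ms) + 0.5);
}

# Hypothetical helper: the time in seconds that a bin index maps back to,
# as printed in the first output column.
sub bin_to_seconds {
    my ($bin) = @_;
    return exp($bin / $SF) / 1000;
}

for my $ms (1, 10, 100, 1000, 10000) {
    my $bin = bin_of($ms);
    printf "%6d ms -> bin %3d -> %.5f s\n", $ms, $bin, bin_to_seconds($bin);
}
```

Since neighbouring bins differ by a factor of exp(1/$SF), roughly 10% in
time, the points end up evenly spaced once gnuplot is told to use a log
scale on x.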

response-times.gnuplot looks like this:

------------------
#!/usr/bin/gnuplot
set term pbm small color
#set size 0.88,0.88
set xlabel 'Time transaction took (seconds)'
set ylabel 'Number of documents'
set logscale x
plot 'svc-time-hist' using 1:2 title 'cache' with lines
------------------
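
If you want a single number to watch rather than a graph, something like
the following could be run over the same 'svc-time-hist' file. This is only
a sketch and not something we run here: the function name
fraction_slower_than and the idea of a fixed threshold in seconds are
assumptions for illustration. It relies only on the first two columns that
tcp-svc-time-hist.pl prints (time in seconds, count) and skips the '#'
lines at the end.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given a threshold in seconds and the lines of a
# histogram file, return the fraction of requests that fell into bins
# whose time is above the threshold.
sub fraction_slower_than {
    my ($threshold, @lines) = @_;
    my ($total, $slow) = (0, 0);
    for my $line (@lines) {
        next if $line =~ /^\s*#/;          # skip "#i=", "# median is" etc.
        my @f = split ' ', $line;
        next unless @f >= 2;               # skip blank or short lines
        $total += $f[1];
        $slow  += $f[1] if $f[0] > $threshold;
    }
    return $total ? $slow / $total : 0;
}

# Usage: perl slow-fraction.pl svc-time-hist 5
# (the script name and the threshold value are placeholders)
if (@ARGV == 2) {
    my ($file, $threshold) = @ARGV;
    open(my $fh, '<', $file) or die "cannot open $file: $!";
    my @lines = <$fh>;
    close($fh);
    printf "%.1f%% of requests took longer than %s seconds\n",
        100 * fraction_slower_than($threshold, @lines), $threshold;
}
```

Which threshold counts as "too slow" depends on your own traffic, so the
trend of this number over time is probably more useful than its absolute
value.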
Received on Mon Jun 30 1997 - 02:11:40 MDT
