Re: [squid-users] squid "stops working" several times a day

From: Marcus Kool <marcus.kool_at_urlfilterdb.com>
Date: Wed, 29 Feb 2012 12:22:44 -0300

The reads have many errors, about 25%, and your other email shows that the error code is EAGAIN.
All calls to connect() fail, and all calls to recvfrom() fail.
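
To get the error ratio per syscall without counting by hand, something like the following could be used. It is only a sketch: it assumes the summary layout that strace -c prints (the same columns as the table quoted below), and the names parse_strace_summary, error_rates and SAMPLE are made up for this example.

```python
def parse_strace_summary(text):
    """Map each syscall in an `strace -c` summary to (calls, errors)."""
    stats = {}
    for line in text.splitlines():
        # Drop mail quote markers ("> ") before splitting into columns.
        fields = line.lstrip("> ").split()
        # Data rows have 5 columns (no errors) or 6 (with errors) and
        # start with a percentage; this skips the header and the rules.
        if len(fields) not in (5, 6):
            continue
        if not fields[0].replace(".", "", 1).isdigit():
            continue
        if fields[-1] == "total":
            continue
        calls = int(fields[3])
        errors = int(fields[4]) if len(fields) == 6 else 0
        stats[fields[-1]] = (calls, errors)
    return stats


def error_rates(stats):
    """Return {syscall: errors / calls} for syscalls with at least one error."""
    return {
        name: errors / calls
        for name, (calls, errors) in stats.items()
        if errors and calls
    }


# Three rows copied from the strace summary in this thread.
SAMPLE = """\
 23.38 0.045513 0 125618 32228 read
 11.07 0.021552 0 90238 epoll_ctl
  0.02 0.000046 0 240 240 connect
"""

for name, rate in sorted(error_rates(parse_strace_summary(SAMPLE)).items()):
    print(f"{name}: {rate:.1%}")
```

For the three sample rows this reports connect at 100.0% and read at 25.7%, which matches the figures above.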

It seems to me that the system has a resource problem:
it is running out of file descriptors or other system resources.
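
One way to check the file descriptor theory is to compare the number of descriptors the squid process has open with its limit. The sketch below assumes a Linux /proc filesystem and that it is run as root or as the user that owns the process; fd_usage and parse_open_files_limit are hypothetical helper names, not part of squid or strace.

```python
import os


def parse_open_files_limit(limits_text):
    """Return the soft "Max open files" limit from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max open files", soft limit, hard limit, units.
            soft = line.split()[3]
            return int(soft) if soft.isdigit() else None
    return None


def fd_usage(pid):
    """Return (open descriptors, soft limit) for a Linux process."""
    # Each entry in /proc/<pid>/fd is one open descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    with open(f"/proc/{pid}/limits") as limits:
        soft_limit = parse_open_files_limit(limits.read())
    return open_fds, soft_limit
```

Calling fd_usage(18021), with the PID taken from your strace output, during a bad period and again at a normal time would show whether the count gets close to the limit.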

I suggest looking at /var/log/messages and running strace again to
find out why connect, recvfrom and accept fail.
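
Since the full strace output is huge, it may help to trace only these three calls, for example with strace -f -o squid.strace -e trace=connect,recvfrom,accept -p PID (the file name is just an example), and then count the error codes. The sketch below does the counting; tally_errnos is a made-up name, and it assumes the usual strace line layout where a failed call ends in "= -1 ENAME (description)". Keep in mind that EAGAIN and EINPROGRESS can be routine on non-blocking sockets, while EMFILE or ENFILE would point at file descriptor exhaustion.

```python
import re
from collections import Counter

# Finds failed calls of the form "name(args) = -1 ENAME (text)". Using
# search() lets a leading PID or timestamp on the line be ignored.
FAILED_CALL = re.compile(
    r"(?P<syscall>[a-z_0-9]+)\(.*\)\s+=\s+-1\s+(?P<errno>E[A-Z0-9]+)"
)


def tally_errnos(lines, syscalls=("connect", "recvfrom", "accept")):
    """Count (syscall, errno) pairs for failed calls to the given syscalls."""
    counts = Counter()
    for line in lines:
        match = FAILED_CALL.search(line)
        if match and match["syscall"] in syscalls:
            counts[(match["syscall"], match["errno"])] += 1
    return counts
```

With the trace saved to a file, tally_errnos(open("squid.strace")).most_common() lists the most frequent failures first.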

Marcus

karj wrote:
> The output of strace at a normal time:
>
> Process 18021 attached - interrupt to quit
> ^CProcess 18021 detached
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  40.42    0.078685           1    131179        16 write
>  23.38    0.045513           0    125618     32228 read
>  11.07    0.021552           0     90238           epoll_ctl
>   6.81    0.013266           0     64968           fcntl
>   4.64    0.009038           1     12177           open
>   3.95    0.007696           0     22260           close
>   3.92    0.007640           0     24285           stat
>   1.82    0.003541           0     19901           lseek
>   1.82    0.003536           0     12021      2105 accept
>   0.98    0.001913           0      4169           epoll_wait
>   0.74    0.001440           0      9916           getsockname
>   0.27    0.000533           0      8244           fstat
>   0.14    0.000265           1       245       245 recvfrom
>   0.02    0.000046           0       240       240 connect
>   0.00    0.000000           0        12           brk
>   0.00    0.000000           0       240           socket
>   0.00    0.000000           0         3           sendto
>   0.00    0.000000           0       240           bind
>   0.00    0.000000           0       240           setsockopt
>   0.00    0.000000           0       239           getsockopt
>   0.00    0.000000           0         1           getrusage
>   0.00    0.000000           0        15           getdents64
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.194664                526451     34834 total
>
> Thanks again
>
>
>
> -----Original Message-----
> From: karj [mailto:gkaragiannidis_at_dolnet.gr]
> Sent: Wednesday, 29 February 2012 3:40 PM
> To: 'Sebastian Muniz'; squid-users_at_squid-cache.org
> Subject: RE: [squid-users] squid "stops working" several times a day
>
> I am able to ping the machines.
> The one thing I have observed is that,
> at the time of the crisis, the squid process is using 100% of the CPU.
> That happens on every server that has the problem...
> I have tried to use strace, but without success, since the strace output is
> huge.
> What else can I do to identify the problem?
>
>
> At the time of the problem, cache.log suggests that squid loses connectivity
> with almost everybody:
>
> 2012/02/29 09:15:51| TCP connection to assets_servers (xxx.xxx.xxx.xxx:80) failed
> 2012/02/29 09:15:51| TCP connection to typos_servers (xxx.xxx.xxx.xxx:80) failed
> 2012/02/29 09:15:51| Detected DEAD Parent: tityros_servers
> 2012/02/29 09:15:51| TCP connection to assets_servers (xxx.xxx.xxx.xxx:80) failed
> 2012/02/29 09:15:51| TCP connection to tityros_servers (xxx.xxx.xxx.xxx:80) failed
> 2012/02/29 09:15:51| TCP connection to typos_servers (xxx.xxx.xxx.xxx:80) failed
>
> From another sibling's log at the same time:
> 2012/02/29 09:15:51| Detected DEAD Sibling: xxx.xxx.xxx.xxx
>
>
> Thanks in advance
> Yiannis
>
> -----Original Message-----
> From: Sebastian Muniz [mailto:basurerosebita_at_gmail.com]
> Sent: Tuesday, 28 February 2012 11:52 PM
> To: squid-users_at_squid-cache.org
> Subject: Re: [squid-users] squid "stops working" several times a day
>
> On 2/28/2012 2:54 PM, karj wrote:
>> Hi All,
>> I have a problem with my squids.
>> Squid "stops working" several times a day.
>> The only thing that warns me that something is wrong in cache.log is
>> the "Detected DEAD Sibling: xxx.xx.xxx.xxx" message.
>> After a few seconds everything goes back to normal.
>> We are using 5 squids (version 2.7.STABLE9) in accelerator mode, which
>> are siblings of each other.
>> So we have 5 sibling squids in front of our web farms, serving almost
>> 7000 requests per second at peak time, and an average of 4500 requests
>> per second.
>> The problem occurs randomly on all servers...
> Are you able to reach (telnet or ping or anything) the sibling during the
> times that squid stops working?
> What can you tell about the sibling logs? Especially the cache.log.
>
> Regards
> Sebastian
>
>
>
Received on Wed Feb 29 2012 - 15:22:49 MST
