Re: [squid-users] what are the Pros and cons filtering urls using squid.conf?

From: Jose-Marcio Martins <Jose-Marcio.Martins_at_mines-paristech.fr>
Date: Tue, 11 Jun 2013 09:13:54 +0200

Thanks for the info,

On 06/11/2013 05:26 AM, Amos Jeffries wrote:
> On 11/06/2013 9:03 a.m., Jose-Marcio Martins wrote:

> When Squid reloads it pauses *everything* it is doing while the reload is happening.
> * 100% of resources get dedicated to the reload ensuring fastest possible recovery then everything
> resumes exactly as before.

OK! But, IMHO, Squid should not need to be reloaded when a helper only needs to reload its own
database. On huge servers, that is a design error.

> When a squid helper reloads it pauses *just* transactions which are depending on it, other
> transactions remain processing.
> * Some small % of resources get dedicated to the reload.
> * each helper instance of that type must do its own reload, multiplying the work performed during
> reload by M times.

OK. So the goal is to MINIMIZE the time taken to reload/reopen databases, WITHOUT reloading all of Squid.

This kind of problem arises in many other kinds of online filtering software, e.g. mail filters
(milters, ...).

If the long part is converting, e.g., a text file into a .db file, you can do something like the
following:

1. mv bl.db bl.db.old
2. makedatabase bl.db
3. tell all helpers to reopen the database (e.g. using some signal).
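
A minimal sketch of these three steps in Python, assuming a POSIX system. The names
publish_blacklist, build_database and helper_pids are hypothetical, and SIGHUP is only one possible
choice of signal; nothing here is a Squid API.

```python
import os
import signal


def publish_blacklist(db_path, build_database, helper_pids, sig=signal.SIGHUP):
    """Rename the old database, build a new one, then signal the helpers.

    build_database is a caller-supplied function (hypothetical) that writes
    the new database file at the path it is given.
    """
    old_path = db_path + ".old"

    # Step 1: rename. Helpers that already have the file open keep reading it.
    if os.path.exists(db_path):
        os.replace(db_path, old_path)

    # Step 2: the slow part, done while helpers still serve from the old file.
    build_database(db_path)

    # Step 3: ask each helper to close the old file and open the new one.
    for pid in helper_pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # that helper has already exited; nothing to notify
```

How the helper PIDs are collected (a pid file, the process table, ...) is left open here.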

Renaming the file in step 1 does not change any open file descriptors, so from step 1 to step 3 the
helpers still use the old database. In step 3 each helper just closes the old database and opens the
new one. This takes very little time (surely much less than a millisecond). Just lock database
access while the database is being reopened.
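
On the helper side, this could look roughly like the Python sketch below. It assumes a POSIX system
and, to stay self-contained, uses a plain text file with one URL per line in place of a real .db
library. BlacklistHandle, request_reload and check are hypothetical names. The signal handler only
sets a flag, and the lookup path does the reopen, so the lock is never taken inside the handler.

```python
import signal
import threading


class BlacklistHandle:
    """Keeps one open file and swaps it for a fresh one on request."""

    def __init__(self, path):
        self._path = path
        self._lock = threading.Lock()
        self._file = open(path, "r")

    def contains(self, url):
        with self._lock:
            self._file.seek(0)
            return any(line.strip() == url for line in self._file)

    def reopen(self):
        new_file = open(self._path, "r")  # opened before taking the lock
        with self._lock:                  # the lock covers only the swap
            old_file, self._file = self._file, new_file
        old_file.close()


reload_requested = threading.Event()


def request_reload(signum, frame):
    # Only set a flag; taking the lock inside a signal handler could deadlock.
    reload_requested.set()


def install_reload_handler(sig=signal.SIGHUP):
    """Call once at helper startup (SIGHUP is an assumed choice)."""
    signal.signal(sig, request_reload)


def check(handle, url):
    """Handle one lookup, reopening first if a reload was requested."""
    if reload_requested.is_set():
        reload_requested.clear()
        handle.reopen()
    return handle.contains(url)
```

Until request_reload fires, lookups keep hitting the renamed old file through the descriptor that
is already open; the first lookup after the signal switches to the new file.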

If you use in-memory databases, the same reasoning applies, except that you are swapping memory
pointers instead of file descriptors/database handles.
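
For the in-memory case, a minimal Python sketch of the same idea, where the "pointer" is just a
reference to an immutable set. InMemoryBlacklist is a hypothetical name; the slow build happens
outside the lock and only the swap is protected.

```python
import threading


class InMemoryBlacklist:
    """Holds the current blacklist and replaces it by swapping a reference."""

    def __init__(self, entries):
        self._lock = threading.Lock()
        self._entries = frozenset(entries)

    def contains(self, url):
        with self._lock:
            entries = self._entries  # grab the current reference
        return url in entries        # lookup runs on that snapshot

    def replace(self, new_entries):
        fresh = frozenset(new_entries)  # slow build, done outside the lock
        with self._lock:
            self._entries = fresh       # the swap itself is one assignment
```

A lookup that started before the swap simply finishes against the old set, which is freed once
nothing references it any more.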

Sure, with a very large number of helpers this may be a problem for in-memory databases. But in
that case you should probably ask why you need so many helpers. Maybe there is some optimisation
to be done on the programming side, or you could use a fast disk-based database, or shared memory.

Another situation arises when you have different kinds of helpers: one kind doing URL filtering
and another doing content filtering (e.g., viruses, ...). If each of them needs to reload all of
Squid whenever its own data changes... it's crazy...

It seems that ICAP allows all this in a cleaner way.

What about multithreading? Which solution can be used with multithreaded helpers?

Thanks for the pointer on hotconf, I'll take a look.

Regards,

José-Marcio

>
> When ICAP reloads it has the option of signalling Squid no more transactions and completing the
> existing ones first, or spawning a new service instance with the new config and then swapping over
> seamlessly.
> * the resources of some other server are usually being applied to the problem - leaving Squid to run
> happily
> * Squid can failover to other instances of that ICAP service for handling new transactions.
>
> No matter how you slice it, Squid will eventually need reconfiguring for something and we come back
> to Squid needing to accept new configuration without pausing at all.
> There is the "HotConf" project (http://wiki.squid-cache.org/Features/HotConf) which 3.x releases are
> being prepared for through the code cleanup we are doing in the background on each successive
> release. There is also CacheMgr.JS project Kinkie and I have underway to polish up the manager API,
> which will eventually result in some configuration options being configurable via the web API.
>
> Amos

-- 
  Sent from my typewriter.
  ---------------------------------------------------------------
   Spam : Classement statistique de messages électroniques -
          Une approche pragmatique
   Chez Amazon.fr : http://amzn.to/LEscRu ou http://bit.ly/SpamJM
  ---------------------------------------------------------------
  Jose Marcio MARTINS DA CRUZ            http://www.j-chkmail.org
  Ecole des Mines de Paris                   http://bit.ly/SpamJM
  60, bd Saint Michel                      75272 - PARIS CEDEX 06
  mailto:Jose-Marcio.Martins_at_mines-paristech.fr
Received on Tue Jun 11 2013 - 07:14:01 MDT
