Splay tree bug

From: Arjan de Vet <[email protected]>
Date: Thu, 13 May 1999 18:57:21 +0200 (CEST)

In article <E10enxg-0005n0-00@cygnus.csx.cam.ac.uk> you write:

>Tim Burgess wrote:
>>
>> We have a pornography etc blocking set of ACLs and we get a similar list of
>> errors when we start. But it still works - so who cares?
>>
>> Sure, the errors may be misleading - but squid still runs and the ACLs work.
>
>Did you check it really did work (rather than simply not crash), in all
>cases? The warning messages, and comments I received about a related issue,
>made me rather wary; the impression given is that depending on "random
>factors", the lookup may not match in the way you'd expect, depending on how
>building the splay tree structure works out based on the particular entries.
>And that could mean it works one day, but miss some matches the next after
>the list has been updated.

There's indeed some kind of bug. Triggered by these warning messages I
decided to investigate what was going on and after some trying I had
created a list of domains and hostname to check to illustrate the
problem in a reproducable manner.

Domain list:

    def.sub.company.com
    company.com
    qrs.def.sub.company.com

host to check:

    abcdef.sub.company.com

This check will fail.

After reordering the domain list:

    def.sub.company.com
    qrs.def.sub.company.com
    company.com

abcdef.sub.company.com checks OK.

In the past I have contributed binary trees for Squid but these had the
same problem; however I wasn't able to prove that splay trees had the
problem too.

>Actually, since Duane Wessels mailed me a reply indicating it *was* just an
>incorrect diagnostic (with a patch to try), you would have been right - but

Can you mail me that patch? I'm very interested because I'll be putting
some 2.2stable2 proxies with access lists consisting of 4000+ entries
into production soon.

>I'd have been wary of taking the chance without checking that it was going
>to work in general, not just with specific entries in the tree. [I've
>confirmed that the patch worked, except for a misleadingly-worded warning
>for an exact duplicate entry, which was reported as a subdomain, so the
>patch or some variant of it will presumably appear in future versions and
>maybe as a freestanding patch.]
>
>In response to a suggestion by private mail from someone else, noting that I
>could avoid the problem by using multiple acl definition, those would
>have to be in the main config file and I was using an acl referencing a file
>to avoid frequent edits to the main config (possibly by a variety of
>people), plus it would be messy splitting the entries into the minimal
>number that avoided conflicts. Not an issue to the extent that the specific
>problem was due to a bug anyway.
>
>However, as noted in my reply to Duane, I'd be happier if Squid
>automatically ignored exact duplicates and hosts/subdomains covered by
>higher-level domain entries in an ACL definition, since that would make it
>easier and less error-prone maintaining long lists of unrelated special
>cases which could by chance be identical or overlap. Squid should be
>well-placed to handle it automatically (but the current implementation may,
>of course, make it less easy than it would appear), whereas doing it by hand
>(reinstating commented-out entries when conflicting entries are deleted, or
>whatever) would be error-prone, and writing checking scripts to spot problems
>or derive a "sanitised" version from a master file seems like the wrong way
>to do it (complicating things unnecessarily)...

I do agree. What I've been thinking about is the following.

Let the aclDomainCompare(a, b) function return -2 if a < b in the normal
(lexical order) sense and b is a subdomain of a; return 2 if a > b and a
is a subdomain of b. These are the cases for which the warnings are
given and which now return -1 and 1. Note that most compare functions
simply return 0, <0 and >0 and should be changed to return 0, -1 and 1
in the normal case.

When inserting a new entry into the splay tree we have two extra cases:

aclDomainCompare(new, node->data) == 2:

    The new entry is a subdomain of the current node, do not insert the
    new entry in the tree: everything that matches the new entry already
    matches the current node.

aclDomainCompare(new, node->data) == -2:

    The current node is a subdomain of the new entry, therefore it makes
    sense to replace the current node with the new entry because it is
    more general. But now comes the complicated stuff: there can be
    other nodes which are subdomains of the new node, these nodes should
    be deleted as well.

The second case can be avoided by sorting the list first based on
lengths of the domains; the more generic domains will then come first.
But it's an ugly workaround and nothing more.

Does anybody have other suggestions to solve this bug?

Note: the problem for domain lists is also present for ip lists where
network specifications can overlap too: 192.1.1.16/28 and 192.1.1.0/24
for example.

Arjan
Received on Thu May 13 1999 - 10:57:37 MDT

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:46:17 MST