Re: [squid-users] anyone have a good expressions list

From: Henrik Nordstrom <[email protected]>
Date: Fri, 14 Feb 2003 19:54:21 +0100

Building a good regex list which blocks only porn is an almost impossible
task if you also want it to actually block porn.

In almost all cases you will need a whitelist when using regex patterns
for blocking, to exclude things which should not be blocked but whose
names too closely resemble a name which normally should be blocked.
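
As a rough illustration of why the whitelist is needed, here is a minimal
sketch in Python (not squid configuration). The patterns, the host names
and the is_blocked helper are all made up for the example; the only point
is that the whitelist is checked before the block patterns.

```python
import re

# Hypothetical patterns, for illustration only; not a real blocklist.
BLOCK_PATTERNS = [re.compile(r"sex", re.IGNORECASE)]
ALLOW_PATTERNS = [re.compile(r"^www\.essex\.example$", re.IGNORECASE)]

def is_blocked(host):
    """Return True if host hits a block pattern and no whitelist pattern."""
    if any(p.search(host) for p in ALLOW_PATTERNS):
        return False  # whitelisted: never blocked
    return any(p.search(host) for p in BLOCK_PATTERNS)

print(is_blocked("sex.example"))          # True: intended hit
print(is_blocked("www.essex.example"))    # False: rescued by the whitelist
print(is_blocked("www.sussex.example"))   # True: false positive, not whitelisted
print(is_blocked("www.squid-cache.org"))  # False: no block pattern matches
```

The third host shows the problem: every innocent name that happens to
contain a blocked substring needs its own whitelist entry.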

Writing regex patterns is not that hard. Some quick guidelines:

1. regex matches are partial string matches, not "word" matches.

2. . is a special character matching any single character. To match a
literal . you need to use \.

3. ^ and $ are also special characters, matching the beginning and end of
the string respectively. This means that a regex pattern starting with
^ matches only at the beginning of the string (e.g. ^www\.
matches www.anything), and a pattern ending in $ matches only at
the end of the string (e.g. \.com$ matches anything.com).

4. * and {nn} make repetitions. * repeats the previous atom zero or
more times, {nn} exactly nn times. There is also a {min,max}
repetition count.

5. To group things you can use (), e.g. (ab){4} matches abababab (ab
repeated 4 times).

6. To express alternatives you can use |, e.g. a(b|c|de)f matches
abf, acf or adef.

7. There are a number of other special constructs, such as word boundary
matches. See the "man 7 regex" manual page for a full list of regex
capabilities. (Squid uses what is referred to in most documentation as
"modern" or "extended" regex syntax.)
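
Guidelines 1 to 6 can be tried out with a small Python sketch. Note that
Python's re module is not the regex library squid uses, so treat this only
as a way to experiment; the constructs shown here should behave the same
way for these simple examples. The matches helper and the test strings
are made up for the illustration.

```python
import re

def matches(pattern, text):
    """True if pattern matches anywhere in text (a partial match)."""
    return re.search(pattern, text) is not None

# 1. Partial string match: "sex" is found inside an unrelated name.
print(matches(r"sex", "www.essex.example"))        # True

# 2. An unescaped . matches any character; \. matches only a literal dot.
print(matches(r"www.example", "wwwXexample"))      # True
print(matches(r"www\.example", "wwwXexample"))     # False

# 3. ^ anchors the match at the start of the string, $ at the end.
print(matches(r"^www\.", "www.anything"))          # True
print(matches(r"^www\.", "ftp.www.anything"))      # False
print(matches(r"\.com$", "anything.com"))          # True
print(matches(r"\.com$", "anything.com.example"))  # False

# 4. Repetition: * (zero or more), {n} (exactly n), {min,max}.
print(matches(r"^ab*c$", "ac"))                    # True, zero b's
print(matches(r"^ab*c$", "abbbc"))                 # True
print(matches(r"^a{3}$", "aaa"))                   # True
print(matches(r"^a{2,3}$", "aaaa"))                # False, more than max

# 5. Grouping with (): the whole group is repeated.
print(matches(r"^(ab){4}$", "abababab"))           # True
print(matches(r"^(ab){4}$", "ababab"))             # False, only 3 times

# 6. Alternation with |.
print(matches(r"^a(b|c|de)f$", "abf"))             # True
print(matches(r"^a(b|c|de)f$", "adef"))            # True
print(matches(r"^a(b|c|de)f$", "abef"))            # False
```

The ^ and $ in examples 4 to 6 are only there to force a whole-string
match; without them guideline 1 applies and the pattern may match in the
middle of a longer string.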

Regards
Henrik

Jeff Donovan wrote:
>
> greetings
>
> I'm looking for a good expressions list. Something that only targets
> porn sites. I had been using the default exp list that comes with the
> blacklists, but it seems to block out many sites that are not adult
> related.
>
> I'm pretty much REGEX illiterate.
>
> --jeff
Received on Fri Feb 14 2003 - 11:58:45 MST
