Wednesday, April 28, 2010

Installing and configuring Dansguardian with Squid proxy on Linux RHEL 5 or CentOS 5

Instructions for installing and configuring the Squid internet proxy can be found at the following link.

After configuring the Squid proxy, you can install and configure Dansguardian.

Dansguardian (DG) is a content filter that works together with Squid. DG ships with many list files to which we can easily add and remove entries, so there is no need to write complicated ACL rules in Squid.

DG acts as a filter point in front of Squid. We have to configure two things in DG: a filter IP and a filter port. A web request hits this port first, and only then goes to Squid.

The data flow is as follows.

A request from the web browser hits the iptables rules first, then DG, and then Squid.

The package can be downloaded from the project site. Before installation, check that all the prerequisites are met.

  1. gcc
  2. rpm-build
  3. kernel-devel
  4. pcre (Perl Compatible Regular Expressions) and pcre-devel

Download the package from

Extract the package:
#tar zxvf dansguardian-

Change Directory:
#cd dansguardian-

Configure the software:
We are going to install the DG in the location /usr/local/dans
#./configure --prefix=/usr/local/dans


#make install

Installation is over. Now we can see four directories created under /usr/local/dans/
etc - configuration files
sbin - daemons
share - language and display settings
var - log files

Edit the configuration file:
#vi /usr/local/dans/etc/dansguardian/dansguardian.conf

filterip =
filterport = 9999
proxyip =
proxyport = 8080

Give the proper permissions and ownership:
#chown -R root:squid /usr/local/dans/var/log/dansguardian/
#chmod -R 777 /usr/local/dans/var/log/dansguardian/

Now start the server.
# /usr/local/dans/sbin/dansguardian
Change the proxy IP and port in the browser to the filter IP and port.

Important files:
#cd /usr/local/dans/etc/dansguardian/lists/
All the files in this directory are access control files.

For example:
If you add a site to bannedsitelist and restart DG as below
# /usr/local/dans/sbin/dansguardian -Q
you can no longer access that site.

If you add the word football to bannedphraselist and restart DG, you can no longer view any page that contains the word football.

The usage and syntax are explained inside each file in lists.

Monday, April 26, 2010

How to install and configure Squid proxy server on Linux RHEL 5 / CentOS

Squid is a widely used caching web proxy server. This post explains how to install and configure the Squid proxy on a Linux RHEL 5 / CentOS system.
Software = SQUID
Version = squid-2.6.STABLE6-3.el5

[root@vm1 ~]# yum install squid*
If you don't have a yum repository, use rpm instead.

Starting the service:
First set a fully qualified domain name for the host. Otherwise Squid may fail to start.
#/etc/init.d/squid start

By default Squid binds to port 3128, but it is often changed to 8080.

In Squid we configure things by writing rules, known as ACL rules.
A simple ACL rule:
acl aclname acltype string1
http_access allow|deny aclname
aclname = name of the rule
acltype = the type of string we are using, e.g. src, dst
string1 = can be IPs, networks, URLs etc.
acl mynetwork src
http_access allow mynetwork

NOTE: Specify the rules before the line
# http_access deny all
This is because the rules are parsed from top to bottom, and the first matching rule wins.

To check which port the proxy is currently bound to:
[root@vm1 ~]# nmap
Starting Nmap 4.11 ( ) at 2010-04-16 16:59 IST
Interesting ports on
Not shown: 1672 closed ports
22/tcp open ssh
[output truncated]
3128/tcp open squid-http
Nmap finished: 1 IP address (1 host up) scanned in 0.448 seconds

To block internet usage from a particular IP address:

Write this rule:
#acl block_ip src
#http_access deny block_ip

It should be placed above these lines:
#acl mynetwork src
#http_access allow mynetwork

Otherwise the request is already allowed by the mynetwork rule and the deny rule is never reached. Always keep in mind that Squid interprets rules from top to bottom.
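
The ordering rule can be illustrated with a small shell sketch. This is not how Squid is implemented; it only mimics first-match evaluation over an ordered rule list. The addresses, and the glob patterns standing in for src ACLs, are hypothetical.

```shell
# first_match CLIENT_IP
# Walks a rule list from top to bottom and prints the action of the
# first rule whose pattern matches, similar to how http_access lines are checked.
first_match() {
    client="$1"
    # Hypothetical rules in file order: "<action> <glob pattern>"
    while read -r action pattern; do
        case "$client" in
            $pattern) echo "$action"; return 0 ;;
        esac
    done <<'EOF'
deny 192.168.1.50
allow 192.168.1.*
deny *
EOF
}

first_match 192.168.1.50   # prints: deny  (the blocking rule is reached first)
first_match 192.168.1.7    # prints: allow
```

If the first two rules were swapped, 192.168.1.50 would match the allow rule and the deny rule would never be reached.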

To block internet usage from two or more IP addresses using an ACL list:

Write this rule:
#acl block_ips src IP1 IP2
#http_access deny block_ips

#acl block_ips src
#http_access deny block_ips

Or you can define rules like this:
#acl block_ips src
#acl block_ips src
#http_access deny block_ips

To block a particular URL:
For blocking the URL
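To block only one site, use the acl type dst. It matches the destination server's IP address, which Squid resolves from the host name in the request.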

#acl block_yahoo dst
#http_access deny block_yahoo
You can see that the site yahoomail is still accessible, so this blocks a single site only.

To block only one domain:
Eg for blocking all systems from accessing

#acl block_orkut dstdomain
#http_access deny block_orkut

To block a list of sites specified in a file:
First we have to create a file and save all the URLs we want to block in it.
In this example the file is saved as /etc/squid/block_list.txt and is given read permission for all.

[root@vm1 ~]# cat /etc/squid/block_list.txt

#acl block_list url_regex "/etc/squid/block_list.txt"
#http_access deny block_list

Blocking web access by time:

The syntax is as follows:
#acl aclname time [day-abbrevs] [h1:m1-h2:m2]
h1:m1 must be less than h2:m2
S - Sunday
M - Monday
T - Tuesday
W - Wednesday
H - Thursday
F - Friday
A - Saturday

We are going to block all systems of mynetwork from accessing the web at lunch time, where lunch time is 02:32-03:00.

#acl mynetwork src
#acl lunch time MTWHFA 02:32-03:00
#http_access deny mynetwork lunch
The mynetwork ACL must be defined before it is used.
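
To make the h1:m1-h2:m2 window concrete, here is a minimal shell sketch that checks whether a given time falls inside it. It only illustrates the comparison. Treating both ends of the range as inclusive is an assumption of this sketch, not a statement about Squid's internals, and the function names are made up.

```shell
# to_minutes HH:MM: minutes since midnight
to_minutes() {
    h=${1%%:*}; m=${1##*:}
    # Strip one leading zero so that 08 and 09 are not read as octal
    echo $(( ${h#0} * 60 + ${m#0} ))
}

# in_window NOW START END: succeeds if START <= NOW <= END (all HH:MM)
in_window() {
    now=$(to_minutes "$1"); start=$(to_minutes "$2"); end=$(to_minutes "$3")
    [ "$now" -ge "$start" ] && [ "$now" -le "$end" ]
}

in_window 02:45 02:32 03:00 && echo "inside lunch window"
in_window 03:10 02:32 03:00 || echo "outside lunch window"
```

The comparison only makes sense when the start time is earlier than the end time, which is why h1:m1 must be less than h2:m2.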

Log files of squid:
/var/log/squid/cache.log -> Memory and CPU information
/var/log/squid/squid.out -> Basic system information

You can check whether your requests were a HIT or a MISS in /var/log/squid/access.log. MISS means the object was not served from the cache and HIT means it was served from the cache.

Syntax / Fields in /var/log/squid/access.log

1st field is the request time in Unix epoch format, with milliseconds appended.
2nd field is the elapsed time (in ms) of page/object delivery.
3rd field is the remote host IP.
4th field is code (Squid action) / status (HTTP status code).
5th field is bytes delivered to the client.
6th field is the method that is used to retrieve the page.
7th field is the URL.
8th field is the IDENT identification.
9th field is the hierarchy (e.g. DIRECT/IP).
10th field is MIME type.
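
The field layout above can be picked apart with awk. The following is a minimal sketch, assuming the native log format (emulate_httpd_log off). The sample line is made up for illustration and the function name is hypothetical.

```shell
# parse_access_line LINE: print selected fields of one native-format access.log line
parse_access_line() {
    printf '%s\n' "$1" | awk '{
        split($4, a, "/")            # 4th field is code/status
        print "client=" $3, "code=" a[1], "status=" a[2], "bytes=" $5, "url=" $7
    }'
}

# Hypothetical sample line, not taken from a real log
line='1271488151.123    145 192.168.1.10 TCP_MISS/200 1234 GET http://example.com/ - DIRECT/192.0.2.1 text/html'
parse_access_line "$line"
# prints: client=192.168.1.10 code=TCP_MISS status=200 bytes=1234 url=http://example.com/
```

The same awk program can be run directly over /var/log/squid/access.log, for example to pick out the MISS entries.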

store.log records objects as they are stored in and released from the cache.

Syntax / Fields in /var/log/squid/store.log:
1st field is time in unix epoch format.
2nd field is action (Release, Create, swapout, swapin)
   swapout - the object is moved from memory to disk
   released - the object is neither on disk nor in memory
   all objects are in /var/spool/squid
3rd field is location on disk
4th field is HTTP status
5th field is HTTP date
6th field is last modified
7th field is expiration of content
8th field is MIME type
9th field is size of the content: content_length/actual size of content.
10th field is Method ( Get,Post,Connect)
11th field is URL

To turn on the Common Log Format (CLF):

#vi /etc/squid/squid.conf

# emulate_httpd_log off
emulate_httpd_log on

To check the speed, or the time Squid takes to load a page:

[root@vm1 ~]# squidclient -g
2010-04-17 12:59:11 [1]: 0.142 secs, 6.567727 KB/s (1KB)
2010-04-17 12:59:12 [2]: 0.145 secs, 6.431843 KB/s (1KB)
2 requests, round-trip (secs) min/avg/max = 0.142/0.143/0.145

If you are checking from a remote machine, then:
#squidclient -h Squid_server_ip -g URL

[root@server ~]# squidclient -h -g
2010-04-17 13:12:47 [1]: 0.128 secs, 7.286072 KB/s (1KB)
2010-04-17 13:12:48 [2]: 0.115 secs, 8.109715 KB/s (1KB)
3 requests, round-trip (secs) min/avg/max = 0.115/0.132/0.153

How to set proxy through bash shell:

[root@server ~]# export http_proxy=URL/IP_of_Squid_Server
[root@server ~]# export http_proxy=
Now we are on the client machine and have set the proxy with the above command. When accessing the net, you can see that the request goes through the proxy.

[root@server ~]# wget
Connecting to connected.
Proxy request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `index.html'
[ <=> ] 17,208 33.3K/s in 0.5s
13:42:36 (33.3 KB/s) - `index.html' saved [17208]
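
As a small sketch of the export step, the helper below builds the proxy URL from a host and a port. The function name and the address 192.0.2.10 are placeholders. Port 8080 is the port used elsewhere in this post.

```shell
# proxy_url HOST PORT: build the URL form used for http_proxy
proxy_url() {
    printf 'http://%s:%s/\n' "$1" "$2"
}

# 192.0.2.10 is a placeholder for the Squid server address
export http_proxy="$(proxy_url 192.0.2.10 8080)"
echo "$http_proxy"   # prints: http://192.0.2.10:8080/
```

Running unset http_proxy in the same shell goes back to direct access.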

Cache Managing Script:
Squid comes with a cache manager script: cachemgr.cgi
We access this script through a web interface and can get a lot of cache information.

First we have to install and start httpd in our system. Then we have to copy the cgi script to the apache cgi-bin directory.

[root@vm1 ~]# rpm -ql squid | grep cgi
[root@vm1 ~]# cp /usr/lib/squid/cachemgr.cgi /var/www/cgi-bin/

Now open the browser and enter http://IP_proxy/cgi-bin/cachemgr.cgi in the address bar.
You will get a cache management page. Click continue and you will get the cache manager menu.

To change the default squid proxy port:

Now we'll change the default squid port 3128 to 8080
#vi /etc/squid/squid.conf
Find the following directive.
http_port 3128
Now change it to
http_port 8080

Reload the service
#service squid reload

Now it's changed.
[root@vm1 ~]# netstat -tlpn | grep squid
tcp 0 0* LISTEN 5228/(squid)
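
The same edit can be scripted with sed instead of vi. This is a minimal sketch that works on a scratch file. The function name is hypothetical, and it only prints the rewritten configuration rather than changing /etc/squid/squid.conf in place.

```shell
# set_http_port FILE PORT: print FILE with its http_port line rewritten
set_http_port() {
    sed "s/^http_port[[:space:]].*/http_port $2/" "$1"
}

# Try it on a scratch copy rather than on the real configuration
tmp=$(mktemp)
printf 'http_port 3128\nvisible_hostname vm1\n' > "$tmp"
set_http_port "$tmp" 8080    # first line becomes: http_port 8080
rm -f "$tmp"
```

Reviewing the printed output before replacing the real file keeps a typo from taking the proxy down.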

How to write combined rules/ACLs in Squid:

Syntax is as follows:
#acl acl_name1 acl_type string
#acl acl_name2 acl_type string
#acl acl_name3 acl_type string
#http_access deny acl_name1 acl_name2 acl_name3

All ACLs listed on one http_access line must match for the rule to apply. Suppose you want to block the web access of a group of users at lunch time.
#acl lunch time MTWHF 09:00-17:00
#acl spys src
#http_access deny spys lunch

To block URLs that contain a given word:

We will use the acl type url_regex to match regular expressions in the URL. The rule is as follows.

#acl block_word_url url_regex sex
#http_access deny block_word_url

Reload the service.
Now you won't be able to open any URL that contains the word sex.

If you give -i, the match becomes case insensitive.
#acl block_word_url url_regex -i sex

To prevent downloading files:

Suppose we are going to block download of .exe files.
#acl block_exe url_regex .*\.exe$
#http_access deny block_exe

Reload the service. Now you won't be able to download any .exe files. If you want to block many formats, you can list all of them in a file and give the file name in the string field.

#acl block_exe url_regex "/etc/squid/block_downloads.txt"
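
Before putting patterns into such a file, they can be tried out with grep. The sketch below uses grep -E as a stand-in for Squid's regex matching, which is an approximation. The pattern file contents, URLs and function name are made up for illustration.

```shell
# is_blocked URL PATTERN_FILE: succeeds if URL matches any regex in the file
is_blocked() {
    printf '%s\n' "$1" | grep -E -q -f "$2"
}

# Hypothetical pattern file, one extended regex per line
patterns=$(mktemp)
printf '%s\n' '\.exe$' '\.zip$' > "$patterns"

is_blocked 'http://example.com/setup.exe' "$patterns" && echo "blocked"
is_blocked 'http://example.com/index.html' "$patterns" || echo "allowed"
rm -f "$patterns"
```

As with url_regex without -i, this check is case sensitive, so SETUP.EXE would not match here.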

To block access to some TLDs:

Suppose you want to block access to some TLDs (.uk, .pk etc )
#acl block_tld dstdom_regex \.pk$
#http_access deny block_tld
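
The difference from url_regex is that dstdom_regex looks only at the host name, not at the whole URL. A rough shell sketch of that idea follows. The helper names and URLs are hypothetical, and the sed expression is a simplification that handles plain scheme://host[:port]/path URLs only.

```shell
# host_of URL: strip the scheme, then everything from the first ":" or "/"
host_of() {
    printf '%s\n' "$1" | sed -E 's#^[a-zA-Z]+://##; s#[:/].*$##'
}

# tld_blocked URL: succeeds if the host part ends in .pk
tld_blocked() {
    host_of "$1" | grep -E -q '\.pk$'
}

tld_blocked 'http://example.pk/index.html' && echo "blocked"
tld_blocked 'http://example.com/file.pk' || echo "allowed"
```

The second URL ends in .pk but its host does not, so a host-based rule leaves it alone.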

To setup squid as a non-caching proxy server:

The following rule tells Squid not to cache requests from any host in any network.
#acl non_caching_hosts src
#no_cache deny non_caching_hosts

To not cache requests from a particular IP:
#acl non_caching_hosts src
#no_cache deny non_caching_hosts

To disable caching of specified sites:

#acl no_cache_sites dstdomain
#no_cache deny no_cache_sites

The above rule prevents Squid from caching content from the specified sites.
You can see MISS in access.log

How to configure load balancing with squid and DNS:
DNS has a special feature: the round-robin mechanism.
Suppose the zone file has two A records for the same name, like this:
IN A
IN A

For the first request for that name, DNS answers with the first IP. For the second request it gives the second IP, for the third the first again, and it keeps alternating between the IPs.

Using this property we can achieve load balancing for Squid, with roughly 50% of the requests going to each server.

Suppose your Squid proxy server is already configured on one system. Now take another system and install Squid on it exactly as on the first. You can copy the same configuration if you want.
Start the service on both systems.

In the DNS server, add two entries as shown below.
IN A
IN A

In the browser, change the proxy name to the DNS name. The first request from the browser will go to the first server, the second request will go to the second, and it will alternate for further requests. But there is no guarantee that it will operate as expected.
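
The alternation can be pictured with a tiny shell sketch. It only simulates an ideal rotation and does not query DNS. The function name and both addresses are placeholders.

```shell
# pick_server N SERVER...: server that answers the Nth request (N starts at 1)
pick_server() {
    n=$1; shift
    # $# is now the number of servers; skip (N-1) mod count of them
    shift $(( ($n - 1) % $# ))
    printf '%s\n' "$1"
}

# Placeholder addresses for the two Squid systems
for i in 1 2 3 4; do
    pick_server "$i" 192.0.2.11 192.0.2.12
done
# prints 192.0.2.11, 192.0.2.12, 192.0.2.11, 192.0.2.12 on separate lines
```

Real resolvers and browsers cache answers, which is one reason the actual split may differ from this ideal pattern.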

Bandwidth Management using Delay Pools:

Delay pools have three different classes.
Class one allows us to restrict the rate for large downloads. We are going to restrict the download speed of files larger than 75 MB to at most about 200 KB/s (200000 bytes per second) for all machines in the network.

#defining worker_bees
acl worker_bees src
#delay_pools Number_of_Delay_pools
#the number of delay pools we are using
delay_pools 1
#delay_class Number_of_delaypool The_class_number
#set up pool number 1 as class number 1
delay_class 1 1
#delay_parameters pool_number restore_rate/max_size
#pool 1: restore rate 200000 bytes per second, bucket size 15000 bytes
delay_parameters 1 200000/15000
#delay_access pool_number allow worker_bees
#allowing worker_bees
delay_access 1 allow worker_bees

acl worker_bees src
delay_pools 1
delay_class 1 1
delay_parameters 1 200000/75000000
delay_access 1 allow worker_bees

We created a 100 MB file named proxy_test_100mb on the server and downloaded it.

[root@vm1 ~]# wget
Connecting to connected.
Proxy request sent, awaiting response... 200 OK
Length: 102400000 (98M) [text/plain]
Saving to: `proxy_test_100mb'
100%[=======================================================================>] 102,400,000 12.2M/s in 7.2s
15:19:03 (13.5 MB/s) - `proxy_test_100mb' saved [102400000/102400000]

After adding the acls and restarting squid.

[root@vm1 ~]# wget
Connecting to connected.
Proxy request sent, awaiting response... 200 OK
Length: 102400000 (98M) [text/plain]
Saving to: `proxy_test_100mb'
100%[=======================================================================>] 102,400,000 189K/s in 8m 25s
15:29:07 (198 KB/s) - `proxy_test_100mb' saved [102400000/102400000]
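
As a rough sanity check of the second result, the expected duration is simply the file size divided by the restore rate. The sketch below ignores whatever is in the bucket at the start, so it is an estimate only. The function name is made up.

```shell
# transfer_seconds BYTES RATE: whole seconds needed at RATE bytes per second
transfer_seconds() {
    echo $(( $1 / $2 ))
}

secs=$(transfer_seconds 102400000 200000)
echo "${secs}s = $(( secs / 60 ))m $(( secs % 60 ))s"   # prints: 512s = 8m 32s
```

That is close to the 8m 25s reported by wget, which suggests the pool is limiting the transfer as intended.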

Bandwidth Management with Aggregate Rate:

Using class one we can only limit the overall rate of all downloads. What if we want both an aggregate restriction and a per-user restriction? That's where we use class two. The values in delay_parameters are in bytes per second. Suppose the maximum speed of our connection is 2 MB/s [2097152 bytes per second]. We'll allow an aggregate of 0.5 MB/s [524288 bytes per second] for the network, and restrict each user to 0.05 MB/s [52428.8 bytes per second]. That is, 10 users can use the net at 0.05 MB/s [about 51 KB/s] each at the same time, for an aggregate of 0.5 MB/s [512 KB/s].

#acl worker_bees src
#delay_pools 1
#delay_class 1 2
#delay_parameters 1 524288/524288 52428/52428
#delay_access 1 allow worker_bees


[root@vm1 ~]# wget
Connecting to connected.
Proxy request sent, awaiting response... 200 OK
Length: 10240000 (9.8M) [text/plain]
Saving to: `proxy_test_10mb'
100%[=======================================================================>] 10,240,000 42.9K/s in 3m 49s
16:09:20 (43.6 KB/s) - `proxy_test_10mb' saved [10240000/10240000]
      43.6KB/s < 51 KB/s
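
The per-user figure used in delay_parameters is just the aggregate divided by the number of users. A minimal sketch of that arithmetic, with a made-up function name and integer division:

```shell
# per_user_rate AGGREGATE USERS: equal share in bytes per second
per_user_rate() {
    echo $(( $1 / $2 ))
}

rate=$(per_user_rate 524288 10)
echo "$rate bytes/s, about $(( rate / 1024 )) KB/s"   # prints: 52428 bytes/s, about 51 KB/s
```

The measured 43.6 KB/s stays under this 51 KB/s ceiling.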

Bandwidth Management with Aggregate Rate and per Subnet limits:

It can be achieved using class 3 of delay pools.

delay_pools pool_num
delay_class pool_num class_num
delay_parameters pool_num agg_max/agg_max For_each_nw/For_each_nw Per_user/Per_user
# allow all networks
delay_access pool_num allow all

agg_max = Max speed squid can allow all together.
For_each_nw = Max speed each subnet can get.
Per_user = Max speed each user can get.

#delay_pools 1
#delay_class 1 3
#delay_parameters 1 524288/524288 262144/262144 100000/100000
#delay_access 1 allow all

[root@vm1 ~]# wget
Connecting to connected.
Proxy request sent, awaiting response... 200 OK
Length: 10240000 (9.8M) [text/plain]
Saving to: `proxy_test_10mb'
100%[=======================================================================>] 10,240,000 89.4K/s in 1m 50s
17:02:17 (91.0 KB/s) - `proxy_test_10mb' saved [10240000/10240000]
      91.0 KB/s < 100 KB/s

How to limit the number of connections per client:

We have to use a new acl type named maxconn for this. The rule is as follows:
#acl con_limit maxconn 2
#http_access deny con_limit all
It restricts every client to at most 2 connections.