Monday, April 26, 2010

How to install and configure Squid proxy server on Linux (RHEL 5 / CentOS)


Squid is a widely used caching web proxy server. This post explains how to install and configure Squid on a Linux RHEL 5 / CentOS system.
Software = SQUID
Version = squid-2.6.STABLE6-3.el5

Installing SQUID PROXY SERVER:
[root@vm1 ~]# yum install squid*
If you don't have a yum repository available, install the package with rpm instead.

Starting the service:
First set a fully qualified domain name for the host (or set visible_hostname in squid.conf); otherwise Squid may fail to start.
#/etc/init.d/squid start

By default Squid listens on port 3128, but it is often changed to 8080.

ACCESS CONTROL
In Squid, access is controlled by writing rules, known as ACL rules.
A simple ACL rule:
acl aclname acltype string1
http_access allow|deny aclname
aclname = name of the rule
acltype = the type of string being matched, e.g. src, dst
string1 = the value to match: IP addresses, networks, URLs etc.
acl mynetwork src 192.168.0.0/255.255.255.0
http_access allow mynetwork

NOTE: Put your rules before the line
# http_access deny all
This is because the rules are evaluated from top to bottom and the first matching rule wins.

To check which port the proxy is currently listening on:
[root@vm1 ~]# nmap 192.168.0.21
Starting Nmap 4.11 ( http://www.insecure.org/nmap/ ) at 2010-04-16 16:59 IST
Interesting ports on 192.168.0.21:
Not shown: 1672 closed ports
PORT STATE SERVICE
22/tcp open ssh
[output truncated]
3128/tcp open squid-http
Nmap finished: 1 IP address (1 host up) scanned in 0.448 seconds
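
If nmap is not installed, a plain TCP connect answers the same question for a single port. This is a minimal sketch and not part of Squid; the helper name is_port_open is made up for illustration.

```python
import socket

def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError on refusal, timeout or unreachable host
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling is_port_open("192.168.0.21", 3128) from a machine on the same network should return True while Squid is listening on its default port.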

To block internet usage from a particular IP address:

Write this rule:
#acl block_ip src 192.168.0.66
#http_access deny block_ip

It must be placed above these lines:
#acl mynetwork src 192.168.0.0/255.255.255.0
#http_access allow mynetwork

Otherwise the deny rule never takes effect, because the mynetwork allow rule matches first. Always keep in mind that Squid evaluates rules from top to bottom.
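
The effect of rule order can be sketched in a few lines. This is only an illustration of first-match evaluation, not Squid's code; the function first_match and the (action, network) rule tuples are made up for the example, and the fallback "deny" stands in for a final http_access deny all.

```python
import ipaddress

def first_match(rules, client_ip):
    """Return the action of the first rule whose network contains client_ip."""
    ip = ipaddress.ip_address(client_ip)
    for action, network in rules:
        if ip in ipaddress.ip_network(network):
            return action
    return "deny"  # assumption: behaves like a closing "http_access deny all"

# deny block_ip listed before allow mynetwork
correct_order = [("deny", "192.168.0.66/32"), ("allow", "192.168.0.0/24")]
# allow mynetwork listed first, so the deny line is never reached
wrong_order = [("allow", "192.168.0.0/24"), ("deny", "192.168.0.66/32")]

print(first_match(correct_order, "192.168.0.66"))  # deny
print(first_match(wrong_order, "192.168.0.66"))    # allow
```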

To block internet usage from two or more IP addresses using an ACL list:

Write this rule:
#acl block_ips src IP1 IP2
#http_access deny block_ips

Eg:
#acl block_ips src 192.168.0.21 192.168.0.22
#http_access deny block_ips

Or you can define the same ACL over several lines; the values are combined into one list:
#acl block_ips src 192.168.0.21
#acl block_ips src 192.168.0.22
#http_access deny block_ips

To block a particular site:
To block the single site www.yahoo.com, use the ACL type dst.

#acl block_yahoo dst www.yahoo.com
#http_access deny block_yahoo
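Other Yahoo sites such as Yahoo Mail remain accessible, so this blocks only that one host. Note that dst matches on the destination IP address: Squid resolves www.yahoo.com to its IP addresses and blocks those.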

To block a whole domain:
For example, to block all systems from accessing orkut.com and its subdomains (the leading dot makes the rule match subdomains as well):

#acl block_orkut dstdomain .orkut.com
#http_access deny block_orkut
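
The leading-dot behaviour of dstdomain can be illustrated like this. The function dstdomain_matches is a simplified stand-in written for this post, not Squid's matcher.

```python
def dstdomain_matches(pattern, host):
    """Simplified dstdomain check: a leading dot matches the domain and its subdomains."""
    host = host.lower().rstrip(".")
    pattern = pattern.lower()
    if pattern.startswith("."):
        # ".orkut.com" matches "orkut.com" itself and anything ending in ".orkut.com"
        return host == pattern[1:] or host.endswith(pattern)
    return host == pattern

print(dstdomain_matches(".orkut.com", "www.orkut.com"))  # True
print(dstdomain_matches(".orkut.com", "orkut.com"))      # True
print(dstdomain_matches(".orkut.com", "notorkut.com"))   # False
```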

To block a list of sites specified in a file:
First create a file and save all the URLs we want to block in it, one per line.
In this example the file is saved as /etc/squid/block_list.txt and is readable by all users.

[root@vm1 ~]# cat /etc/squid/block_list.txt
www.hotmail.com
www.ibm.com
www.hp.com

#acl block_list url_regex "/etc/squid/block_list.txt"
#http_access deny block_list

Blocking web access by time:

The syntax is as follows:
#acl aclname time [day-abbrevs] [h1:m1-h2:m2]
h1:m1 must be less than h2:m2
Day-abbrevs:
S - Sunday
M - Monday
T - Tuesday
W - Wednesday
H - Thursday
F - Friday
A - Saturday

We are going to block all systems in mynetwork from accessing the web at lunch time, where lunch time is 02:32-03:00.

#acl mynetwork src 192.168.0.0/255.255.255.0
#acl lunch time MTWHFA 02:32-03:00
#http_access deny mynetwork lunch
The mynetwork ACL must be defined before the http_access line that uses it.
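
To check by hand whether a given moment falls inside a time ACL, the day letters and the time range can be evaluated as below. This is a rough sketch for reasoning about rules, not Squid's implementation; the function name time_acl_matches is made up, and treating the end time as inclusive is an assumption.

```python
from datetime import datetime

# Index matches datetime.weekday(): Monday=0 ... Sunday=6
DAY_LETTERS = "MTWHFAS"

def time_acl_matches(days, time_range, when):
    """Return True if `when` falls on one of `days` and inside `time_range` (h1:m1-h2:m2)."""
    start, end = time_range.split("-")
    start_t = datetime.strptime(start, "%H:%M").time()
    end_t = datetime.strptime(end, "%H:%M").time()
    day_ok = DAY_LETTERS[when.weekday()] in days
    return day_ok and start_t <= when.time() <= end_t

# 26 April 2010 was a Monday, 25 April 2010 a Sunday
print(time_acl_matches("MTWHFA", "02:32-03:00", datetime(2010, 4, 26, 2, 45)))  # True
print(time_acl_matches("MTWHFA", "02:32-03:00", datetime(2010, 4, 25, 2, 45)))  # False
```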

Log files of squid:
/var/log/squid/access.log
/var/log/squid/cache.log -> Squid status and error messages, including memory and CPU information
/var/log/squid/store.log
/var/log/squid/squid.out -> Basic system information.

/var/log/squid/access.log
In this log file you can check whether each request was a HIT or a MISS. HIT means the object was served from the cache; MISS means it was not.

Syntax / Fields in /var/log/squid/access.log

1st field is the request time in Unix epoch format, with milliseconds appended.
2nd field is the elapsed time (in ms) of page/object delivery.
3rd field is the remote host IP.
4th field is code (Squid action)/status (HTTP status code).
5th field is bytes delivered to the client.
6th field is the method that is used to retrieve the page.
7th field is URL.
8th field is the IDENT identity of the client.
9th field is hierarchy (e.g. DIRECT/IP).
10th field is MIME type.
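
The ten fields above can be split out with a few lines of Python. This is a minimal sketch: the sample line is invented for illustration (it is not taken from a real log), and the dictionary keys are names chosen here, not anything Squid defines.

```python
from datetime import datetime, timezone

FIELDS = ["time", "elapsed_ms", "client", "code_status", "bytes",
          "method", "url", "ident", "hierarchy", "mime"]

def parse_access_line(line):
    """Split one native-format access.log line into a dictionary."""
    rec = dict(zip(FIELDS, line.split()))
    rec["time"] = datetime.fromtimestamp(float(rec["time"]), tz=timezone.utc)
    rec["elapsed_ms"] = int(rec["elapsed_ms"])
    rec["bytes"] = int(rec["bytes"])
    # 4th field is "code/status", e.g. TCP_MISS/200
    rec["code"], rec["status"] = rec.pop("code_status").split("/")
    rec["is_hit"] = "HIT" in rec["code"]
    return rec

# Hypothetical log line, made up for this example
sample = ("1271489351.123 142 192.168.0.99 TCP_MISS/200 17208 GET "
          "http://www.yale.edu/index.html - DIRECT/192.0.2.10 text/html")
rec = parse_access_line(sample)
print(rec["code"], rec["status"], rec["bytes"], rec["is_hit"])
```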

/var/log/squid/store.log
store.log records the objects stored in and released from the cache.

Syntax / Fields in /var/log/squid/store.log:
1st field is time in unix epoch format.
2nd field is action (Release, Create, swapout, swapin)
swapout - the object is moved from memory to disk
release - the object is removed and is neither on disk nor in memory
All cached objects are stored under /var/spool/squid
3rd field is location on disk
4th field is HTTP status
5th field is HTTP date
6th field is last modified
7th field is expiration of content
8th field is MIME type
9th field is size of the content: content_length/actual size of content.
10th field is Method (GET, POST, CONNECT)
11th field is URL

To turn on the Common Log Format (CLF):

#vi /etc/squid/squid.conf

#Default:
# emulate_httpd_log off
emulate_httpd_log on

To check how long Squid takes to load a page:

[root@vm1 ~]# squidclient -g http://www.google.com
2010-04-17 12:59:11 [1]: 0.142 secs, 6.567727 KB/s (1KB)
2010-04-17 12:59:12 [2]: 0.145 secs, 6.431843 KB/s (1KB)
Interrupted.
2 requests, round-trip (secs) min/avg/max = 0.142/0.143/0.145

If you are checking from a remote machine, then:
#squidclient -h Squid_server_ip -g URL

[root@server ~]# squidclient -h 192.168.0.21 -g http://www.google.com
2010-04-17 13:12:47 [1]: 0.128 secs, 7.286072 KB/s (1KB)
2010-04-17 13:12:48 [2]: 0.115 secs, 8.109715 KB/s (1KB)
Interrupted.
3 requests, round-trip (secs) min/avg/max = 0.115/0.132/0.153

How to set the proxy in a bash shell:

[root@server ~]# export http_proxy=URL/IP_of_Squid_Server
[root@server ~]# export http_proxy=http://192.168.0.21:3128
Here we are on the machine 192.168.0.99 and have set the proxy with the above command. Web requests now go through the proxy server, as the wget output shows:

[root@server ~]# wget http://www.yale.edu/index.html
--13:42:35-- http://www.yale.edu/index.html
Connecting to 192.168.0.21:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `index.html'
[ <=> ] 17,208 33.3K/s in 0.5s
13:42:36 (33.3 KB/s) - `index.html' saved [17208]
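
Many other tools read the same http_proxy variable. As a small illustration, Python's standard library picks it up through urllib.request.getproxies(). The variable is set inside the script here only to keep the sketch self-contained; in practice the export command above does that.

```python
import os
import urllib.request

# Same value as in the export command; set here only so the example runs on its own
os.environ["http_proxy"] = "http://192.168.0.21:3128"

# getproxies() returns a mapping of scheme -> proxy URL taken from the environment
proxies = urllib.request.getproxies()
print(proxies.get("http"))
```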

Cache Managing Script:
We have a cache managing script associated with squid: cachemgr.cgi
We access this script through a web interface to get a lot of information about the cache.

First we have to install and start httpd on the system. Then we copy the CGI script to the Apache cgi-bin directory.

[root@vm1 ~]# rpm -ql squid | grep cgi
/usr/lib/squid/cachemgr.cgi
/usr/share/doc/squid-2.6.STABLE6/cachemgr.cgi.8
/usr/share/doc/squid-2.6.STABLE6/cachemgr.cgi.8.in
/usr/share/man/man8/cachemgr.cgi.8.gz
[root@vm1 ~]# cp /usr/lib/squid/cachemgr.cgi /var/www/cgi-bin/

Now open a browser and enter http://IP_proxy/cgi-bin/cachemgr.cgi in the address bar.
You will get a cache management page. Click continue to reach the cache manager menu.

To change the default squid proxy port:

Now we'll change the default squid port 3128 to 8080
#vi /etc/squid/squid.conf
Find the following directive:
http_port 3128
Now change it to
http_port 8080

Reload the service so the change takes effect:
#service squid reload

The port is now changed:
[root@vm1 ~]# netstat -tlpn | grep squid
tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN 5228/(squid)

How to write combined rules/ACLs in Squid:

Syntax is as follows:
#acl acl_name1 acl_type string
#acl acl_name2 acl_type string
#acl acl_name3 acl_type string
#http_access deny acl_name1 acl_name2 acl_name3

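All ACLs listed on one http_access line must match for the rule to apply (logical AND). Suppose you want to block the web access of a group of users during a time window. The ACL below is named lunch, although the example window covers 09:00-17:00 on weekdays.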
#acl lunch time MTWHF 09:00-17:00
#acl spys src 192.168.0.66 192.168.0.99
#http_access deny spys lunch
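
The AND behaviour of a combined rule can be pictured as follows. This is only a sketch: the request dictionary and the functions spys, lunch and deny_applies are invented for illustration and mirror the two ACLs above.

```python
from datetime import time

def spys(request):
    """Mirrors: acl spys src 192.168.0.66 192.168.0.99"""
    return request["src"] in {"192.168.0.66", "192.168.0.99"}

def lunch(request):
    """Mirrors: acl lunch time MTWHF 09:00-17:00"""
    return request["day"] in "MTWHF" and time(9, 0) <= request["time"] <= time(17, 0)

def deny_applies(request, acls):
    # http_access deny spys lunch: every listed ACL has to match
    return all(acl(request) for acl in acls)

in_window = {"src": "192.168.0.66", "day": "M", "time": time(12, 30)}
after_hours = {"src": "192.168.0.66", "day": "M", "time": time(18, 0)}
other_host = {"src": "192.168.0.10", "day": "M", "time": time(12, 30)}

print(deny_applies(in_window, [spys, lunch]))    # True
print(deny_applies(after_hours, [spys, lunch]))  # False
print(deny_applies(other_host, [spys, lunch]))   # False
```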

To block URLs containing a given word:

We will use the ACL type url_regex to block URLs that match a regular expression. The rule is as follows.

#acl block_word_url url_regex sex
#http_access deny block_word_url

Reload the service.
Now URLs containing the word sex will be blocked.

If you add -i, the match becomes case insensitive.
#acl block_word_url url_regex -i sex

To prevent downloading files:

Suppose we are going to block download of .exe files.
#acl block_exe url_regex .*\.exe$
#http_access deny block_exe

Reload the service. Now you won't be able to download any .exe files. If you want to block many formats, list the patterns in a file, one per line, and give the file name in the string field.

#acl block_exe url_regex "/etc/squid/block_downloads.txt"
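
Before putting a pattern into squid.conf it helps to try it against a few URLs. The sketch below uses Python's re module purely for illustration; Squid uses its own regex library, but simple patterns like these should behave the same way. The example.com URLs are made up.

```python
import re

word = re.compile(r"sex")                   # url_regex sex
word_i = re.compile(r"sex", re.IGNORECASE)  # url_regex -i sex
exe = re.compile(r".*\.exe$")               # url_regex .*\.exe$

print(bool(word.search("http://www.example.com/sex/")))    # True
print(bool(word.search("http://www.example.com/SEX/")))    # False, case differs
print(bool(word_i.search("http://www.example.com/SEX/")))  # True with -i
# A plain word matches anywhere in the URL, so unrelated sites can be caught too
print(bool(word.search("http://www.essex.example/")))      # True

print(bool(exe.search("http://www.example.com/setup.exe")))      # True
# $ anchors at the end of the URL, so a query string escapes the pattern
print(bool(exe.search("http://www.example.com/setup.exe?v=2")))  # False
```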

To block access to some TLDs:

Suppose you want to block access to some TLDs (.uk, .pk etc.)
#acl block_tld dstdom_regex \.pk$
#http_access deny block_tld

To setup squid as a non-caching proxy server:

The following rule tells Squid not to cache anything for any host on any network. (In later Squid releases this directive is named cache.)
#acl non_caching_hosts src 0.0.0.0/0.0.0.0
#no_cache deny non_caching_hosts

To not cache requests from a particular IP:
#acl non_caching_hosts src 192.168.0.66
#no_cache deny non_caching_hosts

To disable caching of specified sites:

#acl no_cache_sites dstdomain .blogspot.com
#no_cache deny no_cache_sites

The above rule prevents Squid from caching content from any .blogspot.com pages.
Requests for those sites will show as MISS in access.log.

How to configure load balancing with squid and DNS:
DNS has a special feature: the round robin mechanism.
If the zone file contains entries like this

cache.abc.com IN A 192.168.0.21
cache.abc.com IN A 192.168.0.22

then for the first request for cache.abc.com, DNS answers with 192.168.0.21;
for the second request it gives 192.168.0.22, for the third 192.168.0.21 again, and so on, alternating between the two addresses.

Using this property we can split the load between two Squid servers, so that each handles roughly 50% of the requests.

Step1:
Suppose your Squid proxy server is configured on 192.168.0.21. Take another system, 192.168.0.22, and install Squid on it exactly as on 192.168.0.21. You can copy the same configuration if you want.
Start the service on both systems.

Step2:
In the DNS server, add two entries as shown below.

cache.abc.com IN A 192.168.0.21
cache.abc.com IN A 192.168.0.22

Step3:
In the browser, change the proxy name to cache.abc.com. The first request from the browser then goes to 192.168.0.21, the second to 192.168.0.22, and further requests keep alternating. But there is no guarantee that it will operate as expected, for example because resolvers and browsers may cache DNS answers.
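
The ideal alternation described above can be sketched with a simple rotation. This models only the best case, where every lookup reaches the DNS server and the answers strictly alternate; real clients cache answers, so the split is usually less even.

```python
from collections import Counter
from itertools import cycle

# The two A records from the zone file
servers = ["192.168.0.21", "192.168.0.22"]
rotation = cycle(servers)

# Ten consecutive lookups in the idealised round robin
answers = [next(rotation) for _ in range(10)]
print(answers[:4])
print(Counter(answers))
```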

Bandwidth Management using Delay Pools:

Delay pools have three different classes.
Class one allows us to restrict the rate for large downloads. Here we restrict the download speed of files larger than 75 MB to at most 200 KB/s, for all machines in the network 192.168.0.0/24.

#defining worker_bees
acl worker_bees src 192.168.0.0/24
#delay_pools Number_of_Delay_pools
#the number of delay pools we are using
delay_pools 1
#delay_class Number_of_delaypool The_class_number
#pool number 1 uses class number 1
delay_class 1 1
#delay_parameters pool_number restore_rate/max_rate (bytes per second / bytes)
#pool 1: restore 200000 bytes per second, bucket maximum 15000 bytes
delay_parameters 1 200000/15000
#delay_access pool_number allow aclname
#allowing worker_bees
delay_access 1 allow worker_bees

Eg:
acl worker_bees src 192.168.0.0/24
delay_pools 1
delay_class 1 1
delay_parameters 1 200000/75000000
delay_access 1 allow worker_bees

Before:
We created a 100 MB file, proxy_test_100mb, on 192.168.0.99 and downloaded it through the proxy.

[root@vm1 ~]# wget http://192.168.0.99/proxy_test_100mb
--15:18:55-- http://192.168.0.99/proxy_test_100mb
Connecting to 192.168.0.21:8080... connected.
Proxy request sent, awaiting response... 200 OK
Length: 102400000 (98M) [text/plain]
Saving to: `proxy_test_100mb'
100%[=======================================================================>] 102,400,000 12.2M/s in 7.2s
15:19:03 (13.5 MB/s) - `proxy_test_100mb' saved [102400000/102400000]

After:
After adding the ACLs and restarting Squid:

[root@vm1 ~]# wget http://192.168.0.99/proxy_test_100mb
--15:20:42-- http://192.168.0.99/proxy_test_100mb
Connecting to 192.168.0.21:8080... connected.
Proxy request sent, awaiting response... 200 OK
Length: 102400000 (98M) [text/plain]
Saving to: `proxy_test_100mb'
100%[=======================================================================>] 102,400,000 189K/s in 8m 25s
15:29:07 (198 KB/s) - `proxy_test_100mb' saved [102400000/102400000]
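
The restore_rate/max_rate pair can be thought of as a token bucket: the bucket holds up to max_rate bytes and is refilled with restore_rate bytes every second. The sketch below is a simplified model written for this post, not Squid's implementation; the function simulate_download and the starting bucket levels are assumptions.

```python
def simulate_download(size, restore, maximum, initial):
    """Seconds needed to send `size` bytes through one token bucket."""
    bucket = initial
    sent = 0
    seconds = 0
    while sent < size:
        take = min(bucket, size - sent)  # send whatever the bucket allows
        sent += take
        bucket -= take
        if sent < size:
            # one second passes and the bucket is refilled, capped at its maximum
            bucket = min(maximum, bucket + restore)
            seconds += 1
    return seconds

SIZE = 102_400_000  # the 100 MB test file from the wget output

# Bucket starts full: the first 75,000,000 bytes pass unrestricted
print(simulate_download(SIZE, 200_000, 75_000_000, initial=75_000_000))  # 137
# Bucket starts empty: the whole file is limited to 200,000 bytes per second
print(simulate_download(SIZE, 200_000, 75_000_000, initial=0))           # 512
```

In this model the empty-bucket case takes 512 seconds, which is close to the 8m 25s measured above.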

Bandwidth Management with Aggregate Rate:

With class one we can only limit the overall rate of all downloads together. What if we want both an aggregate restriction and a per-user restriction? That is where class two comes in. Suppose the maximum speed of our connection is 2 MB/s [2097152 bytes/s]. We will allow an aggregate of 0.5 MB/s [524288 bytes/s] for the 192.168.0.0/24 network, and restrict each user to 0.05 MB/s [52428.8 bytes/s]. That way 10 users can use the net at 0.05 MB/s [about 51 KB/s] each at the same time, within the aggregate of 0.5 MB/s [512 KB/s].

Eg:
#acl worker_bees src 192.168.0.0/24
#delay_pools 1
#delay_class 1 2
#delay_parameters 1 524288/524288 52428/52428
#delay_access 1 allow worker_bees

After:

[root@vm1 ~]# wget http://192.168.0.99/proxy_test_10mb
--16:05:31-- http://192.168.0.99/proxy_test_10mb
Connecting to 192.168.0.21:8080... connected.
Proxy request sent, awaiting response... 200 OK
Length: 10240000 (9.8M) [text/plain]
Saving to: `proxy_test_10mb'
100%[=======================================================================>] 10,240,000 42.9K/s in 3m 49s
16:09:20 (43.6 KB/s) - `proxy_test_10mb' saved [10240000/10240000]
      43.6KB/s < 51 KB/s
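
The numbers in this class two example can be checked with a little arithmetic. The values are the ones from the delay_parameters line and the wget output; the variable names are chosen here for illustration.

```python
AGGREGATE = 524288     # bytes per second for the whole pool
PER_USER = 52428       # bytes per second for each client
FILE_SIZE = 10240000   # bytes, size of proxy_test_10mb

per_user_kb = PER_USER / 1024             # per-user limit in KB/s, about 51.2
full_speed_users = AGGREGATE // PER_USER  # clients that fit at full per-user speed
best_case_seconds = FILE_SIZE / PER_USER  # fastest possible download for one client

print(round(per_user_kb, 1), full_speed_users, round(best_case_seconds))
```

The best case is a little over three minutes, so the measured 3m 49s at 43.6 KB/s stays under the per-user limit as intended.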

Bandwidth Management with Aggregate Rate and per-Subnet limits:

It can be achieved using class 3 of delay pools.

delay_pools pool_num
delay_class pool_num class_num
delay_parameters pool_num agg_max/agg_max For_each_nw/For_each_nw Per_user/Per_user
delay_access pool_num allow all # all networks

agg_max = Max speed squid can allow all together.
For_each_nw = Max speed each subnet can get.
Per_user = Max speed each user can get.

#delay_pools 1
#delay_class 1 3
#delay_parameters 1 524288/524288 262144/262144 100000/100000
#delay_access 1 allow all

[root@vm1 ~]# wget http://192.168.0.99/proxy_test_10mb
--17:00:27-- http://192.168.0.99/proxy_test_10mb
Connecting to 192.168.0.21:8080... connected.
Proxy request sent, awaiting response... 200 OK
Length: 10240000 (9.8M) [text/plain]
Saving to: `proxy_test_10mb'
100%[=======================================================================>] 10,240,000 89.4K/s in 1m 50s
17:02:17 (91.0 KB/s) - `proxy_test_10mb' saved [10240000/10240000]
      91.0 KB/s < 100 KB/s
How to limit the number of connections per client:
We have to use a new ACL type named maxconn for this. The rule is as follows:
#acl con_limit maxconn 2
#http_access deny con_limit all
It will prevent every client from making more than 2 connections at a time.

