Binding qpopper to one IP address with xinetd
In its default package on Debian, qpopper is a POP3 daemon that is easy to configure and quite complete in its implementation (it supports APOP and SSL encryption). It has one nasty disadvantage, however: it listens on port 110 on all interfaces the server provides. The reason is that on Debian qpopper is started via inetd, and inetd does not know how to listen on specific interfaces. If we want to change this, we have two possibilities to choose from:
1. Compiling qpopper ourselves as a standalone server and having it listen on one interface.
2. Replacing inetd with xinetd.
The latter is the one I would like to focus on, because it achieves what we want with minimal changes. First, a word on what xinetd is: it is intended as a replacement for inetd, and one of its biggest advantages is that it can make services listen on specific interfaces only, even if they themselves do not provide such a configuration option. xinetd can easily be installed with the usual apt-get install command. /etc/xinetd.d is the directory where every service that is supposed to be run by the daemon should have its configuration file. Since we want to run qpopper, we simply create a new file called “pop3” (after the service) and fill it with the following values:
service pop3
{
disable = no
id = pop3
socket_type = stream
protocol = tcp
user = root
wait = no
flags = nameinargs
server = /usr/sbin/tcpd
server_args = /usr/sbin/in.qpopper -f /etc/qpopper.conf
bind = 1.2.3.4
}
Of course you want to replace 1.2.3.4 with the IP address of the interface you would like to use for qpopper. Restart xinetd by invoking
# /etc/init.d/xinetd restart
as root and if things went well, you should see qpopper now listening at your specified ip address:
Proto Recv-Q Send-Q Local Address   Foreign Address State  PID/Program name
tcp   0      0      127.0.0.1:587   0.0.0.0:*       LISTEN 20888/sendmail: MTA
tcp   0      0      1.2.3.4:110     0.0.0.0:*       LISTEN 14263/xinetd
[...]
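If you would rather not eyeball the whole table, a small filter can pick out the relevant lines. This is only a minimal sketch and not part of qpopper or xinetd: listeners_on_port is a hypothetical helper name, and it assumes the Linux netstat column layout shown above (local address in the fourth column, state in the sixth).

```shell
# Hypothetical helper: reads netstat-style lines on stdin and prints the
# local addresses that are in LISTEN state on the given TCP port.
listeners_on_port() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $6 == "LISTEN" {
            # The port is whatever follows the last colon of the local address
            n = split($4, parts, ":")
            if (parts[n] == port) print $4
        }'
}
```

Feeding it with netstat -tln | listeners_on_port 110 should then print a single line with your chosen address. If 0.0.0.0:110 shows up instead, the old inetd entry is most likely still active.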
How to do PHP based 301 redirects
One common problem with script based redirects is that they often default to 302 (moved temporarily) as response code. However, as “moved temporarily” already implies, that code is not meant for pointing to permanent locations such as the sites linked by a redirector script (for instance an outbound click tracker). In this case it would be more appropriate to tell both browsers and search engines that the endpoint of the redirect should be preferred over the link that caused the redirect. So how do we get it done the correct way, when the stock Location header sent by PHP defaults to code 302?
The answer lies in reading the PHP documentation thoroughly, especially the provided examples ;-). As long as no HTML output has been sent (sometimes accidentally via whitespace as a result of sloppy editing), you can send as many headers as you like. The documentation specifically mentions two cases:
There are two special-case header calls. The first is a header that starts with the string “HTTP/” (case is not significant), which will be used to figure out the HTTP status code to send.
[...]
The second special case is the “Location:” header. Not only does it send this header back to the browser, but it also returns a REDIRECT (302) status code to the browser unless some 3xx status code has already been set.
(emphasis mine)
There lies the answer: If we want to use a 301 redirect, we will have to send two headers:
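<?php
// Set the status line first; the Location header then keeps this 3xx code
header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.example.com/");
exit; // stop here so that no further output follows the redirect
?>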
which results in:
HTTP/1.1 301 Moved Permanently
Date: Fri, 11 Apr 2008 21:22:56 GMT
Server: Apache/1.3.34 (Unix)
Location: http://www.example.com/
Content-Type: text/html
Exactly what we wanted.
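To check a redirector from the command line, the response headers can be boiled down to the two values that matter. The following is a rough sketch under a few assumptions: redirect_summary is a hypothetical helper name, and it expects raw response headers on stdin, such as the ones curl prints with -sI.

```shell
# Hypothetical helper: reads HTTP response headers on stdin and prints
# the status code followed by the Location target (if there is one).
redirect_summary() {
    awk '
        { sub(/\r$/, "") }                        # drop the CR of CRLF line ends
        NR == 1 { status = $2 }                   # "HTTP/1.1 301 Moved Permanently"
        tolower($1) == "location:" { target = $2 }
        END { print status, target }'
}
```

Called as curl -sI http://www.example.com/redirect.php | redirect_summary (the script name is made up for illustration), it should report 301 together with the target address once both headers are in place.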
This response also demonstrates that the headers we send replace similar ones, while the rest is filled in from the server defaults. Another use of this “replace” feature could be to fool nasty bots with unexpected error codes like:
<?php
header("HTTP/1.1 402");
?>
which yields:
HTTP/1.1 402 Payment Required
Date: Fri, 11 Apr 2008 21:29:57 GMT
Server: Apache/1.3.34 (Unix)
Content-Type: text/html
You may wish to add an error page with a credit card payment form to complete the confusion :-).
Configuring Sendmail to use a specific IP address
By default, Sendmail listens on every IP address it can find on a host. If those are aliases rather than separate interfaces, mail will always be sent and received via the main address (eth0 on Linux). This is less than optimal if you have a couple of IP addresses to play with and would like to keep mail apart from addresses that are primarily meant to serve web pages. Ideally we have IP addresses in different subnets and can pick the least troublesome one for sending and receiving mail: one that is not listed on any blocklist and shows no history of spam from its previous owner in search engines. The goal of this article is to configure Sendmail so that it listens on only one interface and uses it for sending and receiving mail, appearing to the outside world as a separate server and revealing less about our server setup. The example refers to Debian Linux, but should work similarly on other Linux flavours, too.
Assuming we want to use the IP address 10.10.0.1 as the main “mail interface” while making sure that local mail submissions (daemon notifications to root via loopback) still work as expected, we can add the following entries to sendmail.mc:
DAEMON_OPTIONS(`Addr=127.0.0.1, Family=inet, Name=MTA-v4, Port=smtp')dnl
DAEMON_OPTIONS(`Addr=127.0.0.1, Family=inet, Name=MSA, Port=submission')dnl
DAEMON_OPTIONS(`Addr=10.10.0.1, Family=inet, Name=MTA-v4, Port=smtp, M=bh')dnl
Our loopback will now listen on two ports, because on Debian the submission port is used by daemons to send their notifications to root. This may be Debian specific, and other Linux distributions may not require the submission port to be open for local mail. The address used for communication with the outside world carries two “Modifier” flags (M=bh): “b” makes Sendmail send outgoing mail from the same interface the mail was received on, and “h” makes it use that interface's hostname in its HELO greeting. Mind the order: you cannot mix different interfaces, as this will result in at least one of them not getting started at all. It is best to group the entries by address and port number.
To make these changes take effect, you need to switch to the /etc/mail directory and run make. This will update the configuration files residing there accordingly. After that you should change to /etc/init.d and restart sendmail. If everything worked as expected, you should see something like this when typing netstat -an:
tcp 0 0 127.0.0.1:25  0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:587 0.0.0.0:* LISTEN
tcp 0 0 10.10.0.1:25  0.0.0.0:* LISTEN
In case you still see sendmail sitting on all interfaces, check whether the previous Family=inet lines that did not specify any address are still around. If they are, you need to remove them, switch to /etc/mail, run make, change to /etc/init.d and restart sendmail again.
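Hunting for such leftover lines can be scripted. Below is a minimal sketch, assuming the macro file lives at /etc/mail/sendmail.mc; unbound_daemon_options is a hypothetical helper name, and it only looks at active DAEMON_OPTIONS lines, so entries commented out with a leading dnl are ignored.

```shell
# Hypothetical helper: print (with line numbers) every active
# DAEMON_OPTIONS entry in the given file that does not pin an address.
unbound_daemon_options() {
    grep -n '^[[:space:]]*DAEMON_OPTIONS' "$1" | grep -v 'Addr='
}
```

Running unbound_daemon_options /etc/mail/sendmail.mc should print nothing once every listener is tied to an address. Any line it does print is a candidate for removal before you run make again.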
Redirects to renamed documents with Mod Rewrite
Everybody has some words they love to misspell, for instance “seperate” instead of “separate”. As long as these typos occur within the document body, they are easy to correct. But once the document name itself has been misspelt and spidered this way, simply renaming it may leave people with a 404 error as search result rather than the information they were looking for. This is of course not what a sensible webmaster wants, especially if the document ranks well for its relevant keywords in search engines. The best solution on an Apache server is therefore to use a set of Rewrite rules to remedy the problem:
Let us assume the site www.example.com has a popular page with the path
http://www.example.com/seperate-code-from-gui.php
and we want it to look like
http://www.example.com/separate-code-from-gui.php
then the best approach is to check for requests containing the misspelt word and permanently redirect them to the corrected version. This can be done with the following Rewrite rule:
RewriteEngine On
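# typo handling
RewriteCond %{THE_REQUEST} seperate-
# (.*) rather than (.+): in <Directory>/.htaccess context the leading slash is
# stripped, so nothing may precede "seperate" for a top-level document
RewriteRule (.*)seperate(.+)$ $1separate$2 [R=301,L]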
Best served with ice and within the <Directory> section of a virtual host or the .htaccess file of your choice ;-). Please note that the line RewriteEngine On should only be added if it is absent. If it is already there, there is no need for any redundancy.
So what does this funny thing do? The Rewrite condition checks whether a request contains the misspelt “seperate” followed by a dash. If it does, the Rewrite rule is applied and a permanent redirect (301) to the corrected spelling is performed. Because of the redirect, it does not make much sense to continue with any further checks, so the “L” (=last) flag is used to stop processing further rules. As you may have noticed, two expressions in brackets are used in the Rewrite rule, both of which capture data that must not be altered. They are called back references and are stored in variables that are assigned in the order they appear (read from left to right) and can be used in any order you see fit in the replacement string.
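If you want to watch back references at work without touching the server, sed can stand in for a quick experiment. This is only an analogy and not how Apache evaluates rules: fix_spelling is a hypothetical function name, and sed writes the references as \1 and \2 where mod_rewrite uses $1 and $2.

```shell
# Hypothetical demo of the capture-and-reinsert idea: everything before and
# after the misspelt word is captured and put back around the corrected one.
fix_spelling() {
    printf '%s\n' "$1" | sed -E 's/(.*)seperate(.+)$/\1separate\2/'
}

fix_spelling "/seperate-code-from-gui.php"   # prints /separate-code-from-gui.php
```

The first group holds whatever precedes the typo (here just the slash), the second group holds the rest of the document name, and the replacement glues both back around the corrected word.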
From now on, both bots and visitors will be directed to the right document, no matter which spelling was used. This is an advantage whenever people have linked to the previous spelling. Additionally, the 301 redirect tells search engines to permanently replace the obsolete URI with the new one. Yahoo is a special case: whatever it has fetched once will remain in its index forever, so a lot of patience (or a dictionary check before publishing) is required ;-).
Tracking entire search phrases with BBClone
Those of you who run Google Adwords would of course like to know the strongest phrases to build a campaign upon. By default, however, BBClone lists single keywords, which is useful for tweaking one’s meta headers so that they contain only words people use to find one’s site. But when it comes to the higher spheres of SEO, tracking entire phrases like electro acoustic expressionism would be an advantage. This can be accomplished with a relatively simple yet effective hack that helps you optimise your Adwords to gain a competitive edge and increase your revenue.
Look for the file log_processor.php in your BBClone directory and go to line 230:
// Search engine keywords in detailed stats
$flt_search = bbc_get_keywords($connect['referer'], $char);
$last['traffic'][$old_cnt]['search'] = ($flt_search !== false) ?
implode(" ", $flt_search) : "-";
// Search engine keywords in global stats
if ($flt_search !== false) bbc_update_key_stats($flt_search);
This is what the original should look like. Now replace this part with the following code (you can use the comments as orientation guides):
// Search engine keywords in detailed stats
$flt_search = bbc_get_keywords($connect['referer'], $char);
$last['traffic'][$old_cnt]['search'] = ($flt_search !== false) ?
str_replace("-", " ", implode(" ", $flt_search)) : "-";
// Search engine keywords in global stats
$query = $last['traffic'][$old_cnt]['search'];
if ($query != "-") {
$access['key'][$query] = !isset($access['key'][$query]) ? 1 :
++$access['key'][$query];
}
And voilà, from now on BBClone will display entire phrases on the global statistics page and help you run ad campaigns more effectively.
Multiple IP addresses with IP aliasing on FreeBSD
One feature I really like is IP aliasing, which makes it possible to assign a bunch of IP addresses to the same network interface. This comes in quite handy once you want separate IP addresses for virtual hosts on an Apache server, or simply separate IP addresses for the services you offer in your network. The feature is supported on both FreeBSD and Linux. In this article I would like to describe how it works on FreeBSD; a description for Linux will follow soon.
Aliases can be specified in two ways:
1. at runtime
ifconfig rl0 inet 192.168.5.23 netmask 255.255.255.255 alias up
This would use the first Realtek card with the specified IP address and bring the alias up immediately. Note that 255.255.255.255 is intentionally chosen to avoid conflicts with the base IP address. If the address is no longer needed, remove it with “-alias” (“down” would mark the whole interface as down, not just the alias):
ifconfig rl0 inet 192.168.5.23 -alias
While typing this on the console at runtime is suitable for testing purposes or as temporary measure (in case of an outage of the machine normally using the alias address), we’d certainly prefer to bring our aliases up…
2. on bootup
In this case /etc/rc.conf is the place to add the aliases. Note that there is a pitfall: the alias syntax differs from the one used for normal interfaces.
ifconfig_rl0_alias0="inet 192.168.5.23 netmask 255.255.255.255"
The first alias gets number zero. For each additional alias you would like to have, you need to increment this value accordingly:
(still file /etc/rc.conf)
ifconfig_rl0_alias0="inet 192.168.5.23 netmask 255.255.255.255"
ifconfig_rl0_alias1="inet 192.168.7.24 netmask 255.255.255.255"
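With more than a handful of addresses, numbering the entries by hand gets error-prone. The following is a small sketch that generates the lines instead: rc_alias_lines is a hypothetical helper name, and it assumes that every alias uses the 255.255.255.255 netmask from the examples above.

```shell
# Hypothetical helper: print rc.conf alias entries for an interface,
# numbering them consecutively from alias0.
rc_alias_lines() {
    iface="$1"
    shift
    n=0
    for addr in "$@"; do
        printf 'ifconfig_%s_alias%d="inet %s netmask 255.255.255.255"\n' \
            "$iface" "$n" "$addr"
        n=$((n + 1))
    done
}
```

Calling rc_alias_lines rl0 192.168.5.23 192.168.7.24 prints the two lines shown above. Review the output before appending it to /etc/rc.conf, since a gap or duplicate in the numbering is easy to miss later on.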
A second note: Before adding aliases you must already have assigned a base address to this interface with a reasonable subnet mask:
ifconfig_rl0="inet 192.168.7.1 netmask 255.255.255.0"
When I set up a new box I usually keep base addresses and aliases in separate groups to avoid naming conflicts. Also note that FreeBSD, unlike Linux, derives interface names from the driver, so 3com or Intel cards have different names. Here is an Intel card as an example:
ifconfig_fxp0="inet 192.168.5.23 netmask 255.255.255.255"
If things went well, you should see the aliases after a reboot:
tabidachi# ifconfig rl0
rl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
options=8<VLAN_MTU>
inet6 fe80::200:e8ff:fe85:dbb0%rl0 prefixlen 64 scopeid 0x2
inet 192.168.0.1 netmask 0xffffff00 broadcast 192.168.0.255
inet 192.168.0.2 netmask 0xffffffff broadcast 192.168.0.2
inet 192.168.0.3 netmask 0xffffffff broadcast 192.168.0.3
inet 192.168.0.4 netmask 0xffffffff broadcast 192.168.0.4
inet 192.168.0.5 netmask 0xffffffff broadcast 192.168.0.5
inet 192.168.0.6 netmask 0xffffffff broadcast 192.168.0.6
inet 192.168.0.7 netmask 0xffffffff broadcast 192.168.0.7
inet 192.168.0.8 netmask 0xffffffff broadcast 192.168.0.8
inet 192.168.0.9 netmask 0xffffffff broadcast 192.168.0.9
inet 192.168.0.10 netmask 0xffffffff broadcast 192.168.0.10
ether 00:00:e8:85:db:b0
media: Ethernet autoselect (100baseTX <full-duplex>)
status: active
These addresses can now be used for any service or virtual host (Apache) running on that machine.
Custom logfile for referrer spam
One of the most annoying characteristics of referrer spam is the clutter it leaves in Apache’s access log files, making analysis of them nearly pointless. But Apache would not be Apache if there were no workaround for it. Thanks to Mod SetEnvIf we can inspect client headers with regular expressions and, in the event of a match, set an environment variable. This sounds quite useless at first glance, because we would expect some sort of immediate action instead. But I can assure you it is not, because the same way we can write
Deny from 192.168.0.0/24
we can also check for the variable’s existence and regulate access depending on the outcome:
Deny from env=spam
or use it for creating special logfiles only for matches or the opposite:
CustomLog /var/log/httpd/test.example.com.access.log combined env=!spam
Based on this property we are now able to keep unwanted referrer spam and autoposting spambots out of our access.log and get a handy source for looking up badly behaving visitors, so that we can take appropriate action (like firewalling them or merely denying them access). In order to make everything work we need root access to httpd.conf, so those without a dedicated server may not be able to organise their log entries like this.
Ok, ready? Fine, then let us get started. At first a summary of what we are trying to accomplish:
- Create a set of SetEnvIf rules that will be used to check incoming requests
- If a request meets the criterion of being unwanted, write it into the block log
- In case it is a legitimate request write it into the access log
I will skip explaining what SetEnvIf is and what good it can do for you, so in case you lack that knowledge you may have to study the Apache manual first. But fear not: if you are familiar with regular expressions, you will quickly get into it. What we now have to consider is the location of our rules. It could be either httpd.conf or .htaccess files, and each choice has its advantages:
Everything in httpd.conf is loaded once on startup (or each time you reload the server and force it to reread its configuration file) and then kept in memory for as long as the server is running. Things that are always needed, like permanent redirects or search engine friendly links via mod rewrite rules, are therefore best kept here.
.htaccess, however, is read each time a directory is accessed, so any changes made here take effect immediately. But due to the way this file works, a huge load of rules can slow down the server considerably, so you only want to keep things here that change very often and put as much as possible into your httpd.conf. To get back to the subject: our preferred location for the ruleset is of course httpd.conf, since spammy keywords in referrer strings are very unlikely to change over time.
In case there is not already a section with SetEnvIf rules we create one now by adding the following scheme:
<IfModule mod_setenvif.c>
SetEnvIfNoCase User-Agent "Indy Library|OmniExplorer" spam
SetEnvIfNoCase Referer "^http://([0-9a-z_.\-]*(poker|casino)\.)" spam
SetEnvIf Remote_Addr "^69\.31\.(79|93|132)\.[0-9]{1,3}$" spam
</IfModule>
Note that this is only an example. There is of course no limit to the level of complexity you can apply to your regexp rules, and there are other environment variables that can be used for filtering as well. Each time one of our rules is triggered, the variable “spam” will be set and a check for it will return true (boolean comparison).
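Before loading a new pattern into the server, it can be worth trying it against a few referrer strings on the command line. The sketch below does this for the Referer rule above. It rests on two assumptions: is_spam_referer is a hypothetical function name, and grep -E uses POSIX extended expressions rather than Apache's regex engine, which makes no difference for a pattern as simple as this one (the dash merely moves to the end of the bracket expression).

```shell
# Hypothetical helper: succeeds if the given referrer would match the
# poker/casino pattern from the SetEnvIfNoCase rule (case-insensitively).
is_spam_referer() {
    printf '%s\n' "$1" | grep -Eiq '^http://([0-9a-z_.-]*(poker|casino)\.)'
}

# Example referrers are made up for illustration
is_spam_referer "http://www.best-poker.example/" && echo "spam"
is_spam_referer "http://www.example.com/" || echo "clean"
```

The first call reports a match because “poker.” appears in the host name, while the second one finds neither keyword and passes as clean.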
Next location we are heading to is the virtual host section of our httpd.conf:
<VirtualHost 192.168.0.1>
ServerAdmin webmaster@example.com
DocumentRoot /usr/local/www/test/
ServerName test.example.com
ErrorLog /var/log/httpd/test.example.com-error.log
CustomLog /var/log/httpd/test.example.com-spam.log combined env=spam
CustomLog /var/log/httpd/test.example.com-access.log combined env=!spam
</VirtualHost>
The notable additions are the two CustomLog lines. The negation of a match is no typo, it is really written like this (those familiar with programming will be slightly irritated, as most languages use != to mark a negative comparison).
Now there may be more complex comparisons that require the usage of Rewrite rules, and then an entry, although blocked, will remain in our access log. Luckily, Mod Rewrite takes even such situations into account and allows setting environment variables with the E flag. This means we can continue using our “spam” variable, only that we have to assign “1” as its value in order to indicate that the variable has been set. An example as illustration:
# spambot detection
RewriteCond %{THE_REQUEST} \?lng= [NC]
RewriteRule .* - [E=spam:1,L]
The only thing we need to take care of with this combination of SetEnvIf and Rewrite rules is that the final Deny from env=spam line is still in place, so that detecting the variable actually results in denying access to our server (or whatever action seems appropriate).
And voilà, now we have accomplished the following: spam is written to a separate logfile and no longer appears in the access log itself, so we are able to analyse our traffic again.