electro acoustic expressionism
nodepet
January 17th, 2008

Redirects to renamed documents with Mod Rewrite

Filed under: Howto — olliver @ 21:10 h

Everybody has a few words they love to misspell, for instance “seperate” instead of “separate”. As long as such typos occur within the document body, they are easy to correct. Once the document name itself has been misspelt and spidered that way, however, simply renaming the file means that people clicking a search result get a 404 error rather than the information they were looking for. This is of course not what a sensible webmaster wants, especially if the document ranks well for its relevant keywords in search engines. The best solution on an Apache server is therefore to use a Rewrite rule to remedy the problem:

Let us assume the site www.example.com has a popular page with the path

http://www.example.com/seperate-code-from-gui.php

and we want it to look like

http://www.example.com/separate-code-from-gui.php

then the best approach is to check for requests containing the misspelt word and permanently redirect them to the corrected version. This can be realised with the following Rewrite rule:

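RewriteEngine On

# typo handling: only act when the client's original request line contains the misspelling
RewriteCond %{THE_REQUEST} seperate-
# In .htaccess and <Directory> context the leading slash is stripped before matching,
# so the first group must be allowed to be empty: (.*) rather than (.+).
# The leading slash in the target assumes the rule lives at the document root.
RewriteRule ^(.*)seperate(.+)$ /$1separate$2 [R=301,L]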

Best served with ice and within the <Directory> section of a virtual host or the .htaccess file of your choice ;-). Please note that the line RewriteEngine On should only be added if it is not already present; there is no need to repeat it.

So what does this funny thing do? The Rewrite condition checks whether the request contains the misspelt “seperate” followed by a dash. If it does, the Rewrite rule is applied and a permanent redirect (301) to the corrected spelling is performed. Because of the redirect, it does not make much sense to continue with any further checks, so the “L” (= last) flag is used to stop processing any further rules. As you may have noticed, the Rewrite rule contains two expressions in parentheses. Each of them captures a part of the path that must not be altered. The captured parts are available as back references ($1, $2 and so on), numbered in the order the parentheses appear (read from left to right), and can be used in any order you see fit in the replacement string.
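To see the capturing in isolation, here is a minimal Python sketch that mimics the substitution outside of Apache. It is only an illustration: Python's re module stands in for Apache's regular expression engine, the pattern has the same two-group shape as the rule above, and the function name corrected_path is made up for this example.

```python
import re

# Two capture groups around the misspelt word, same shape as the Rewrite rule.
# The first group may be empty, as it is for a file at the top level.
PATTERN = re.compile(r"^(.*)seperate(.+)$")


def corrected_path(path):
    """Return the path with the typo fixed; non-matching paths come back unchanged."""
    # \1 and \2 play the role of $1 and $2 in the RewriteRule substitution.
    return PATTERN.sub(r"\1separate\2", path)


print(corrected_path("seperate-code-from-gui.php"))       # separate-code-from-gui.php
print(corrected_path("docs/seperate-code-from-gui.php"))  # docs/separate-code-from-gui.php
print(corrected_path("index.php"))                        # index.php
```

Everything before the typo ends up in the first group and everything after it in the second, so only the misspelt word itself is replaced.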

By now, both bots and visitors will be directed to the right document, no matter which spelling was used. This is an advantage whenever people have linked to the previous spelling. Additionally, the 301 redirect tells search engines to permanently replace the obsolete URI with the new one. Yahoo is a special case: whatever it has fetched once seems to remain in its index forever, so a lot of patience, or a dictionary check for the correct spelling beforehand, is required ;-).
