Imagine you have a website backed by a database of articles.
To make the URLs SEO-friendly you usually put something like this in your .htaccess:
RewriteRule ^([0-9A-Za-z\-\_]+)_([0-9]+)\.html$ /show_article.php?articleId=$2&tit=$1 [QSA,L]
A URL like http://yourdomain.com/my_first_article_1.html will then lead to the article with ID 1.
But what happens if someone enters
http://yourdomain.com/my_second_article_1.html?
They will also be taken to article 1, but through a different URL. This happens because the article is usually selected by its ID only.
If somebody links to you with such a wrong URL, search engines will index both URLs, and both lead to the same article. That's bad both for PageRank and for analyzing your traffic data.
And http://yourdomain.com/my_second_article_333.html will return status 200 even if there is no article with ID 333, which tells search engines that this article exists.
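To see the problem concretely, the rewrite pattern can be replayed in PHP. This is only an illustrative sketch: the function name extractArticleId is made up for this example, and the pattern is the one from the RewriteRule above, rewritten for preg_match with a leading slash added.

```php
<?php
// Hypothetical helper: applies the same pattern as the RewriteRule and
// returns the captured article ID, or null if the path does not match.
function extractArticleId(string $path): ?int
{
    if (preg_match('~^/([0-9A-Za-z_-]+)_([0-9]+)\.html$~', $path, $m)) {
        return (int) $m[2];
    }
    return null;
}

// The title part is ignored, so both paths resolve to article 1.
var_dump(extractArticleId('/my_first_article_1.html'));  // int(1)
var_dump(extractArticleId('/my_second_article_1.html')); // int(1)
```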
There are two ways to avoid this:
1. Pass the whole URI to PHP
Instead of defining separate rewrite rules for articles, categories, etc., you can pass the whole URI to PHP.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
This way, requests for files that do not exist (that is, the rewritten URLs) are handed to index.php.
index.php can then interpret the value of $_SERVER['REQUEST_URI'], which contains the requested path and query string without the domain. You can validate the requested URL by comparing it with the URI the page is supposed to have, and if the two are not equal you can send a 301 redirect.
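The decision index.php has to make could look roughly like the sketch below. It is not a complete front controller: resolveRequest and the $lookupCanonicalPath callback are names invented for this example, the in-memory array stands in for the real database, and only article URLs are handled.

```php
<?php
// Sketch of the routing decision inside index.php.
// $lookupCanonicalPath is a hypothetical callback: it takes an article ID and
// returns that article's canonical path, or null if the article does not exist.
function resolveRequest(string $requestUri, callable $lookupCanonicalPath): array
{
    // REQUEST_URI may carry a query string; only the path is compared here.
    $path = (string) parse_url($requestUri, PHP_URL_PATH);

    if (!preg_match('~^/[0-9A-Za-z_-]+_([0-9]+)\.html$~', $path, $m)) {
        return ['status' => 404, 'location' => null]; // not an article URL
    }

    $canonical = $lookupCanonicalPath((int) $m[1]);
    if ($canonical === null) {
        return ['status' => 404, 'location' => null]; // unknown article ID
    }
    if ($canonical !== $path) {
        return ['status' => 301, 'location' => $canonical]; // wrong title part
    }
    return ['status' => 200, 'location' => null];
}

// Example with an in-memory "database" standing in for the real one.
$articles = [1 => '/my_first_article_1.html'];
$lookup = function (int $id) use ($articles): ?string {
    return $articles[$id] ?? null;
};

$result = resolveRequest('/my_second_article_1.html', $lookup);
// $result is ['status' => 301, 'location' => '/my_first_article_1.html']
```

In a real index.php the first argument would be $_SERVER['REQUEST_URI'], and the returned status would be turned into header('Location: ' . $result['location'], true, 301) or http_response_code(404) before any output is sent.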
2. Check in your show_article.php whether the article exists and the URL is correct
Then you can send a 404 if the article does not exist, or a redirect if the URLs don't match.
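// $articleId is assumed to have been read from the articleId query parameter.
$articleData = $dbModule->getArticle($articleId);
if (empty($articleData)) {
    // Unknown article: send a real 404. All headers must go out before the page body.
    header('HTTP/1.0 404 Not Found');
    header('Status: 404 Not Found');
    header('Connection: close');
    require 'your404page.php';
    exit();
}
// RUN actions here
$articleSEOurl = '/' . $yourutils->makeSEOurl($articleData);
if ($_SERVER['REQUEST_URI'] != $articleSEOurl) {
    // The requested URL is not the canonical one: redirect permanently.
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: ' . $articleSEOurl);
    header('Connection: close');
    exit();
}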
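The snippet relies on $yourutils->makeSEOurl(), which is not shown. One possible shape for such a helper is sketched below. Everything in it is an assumption: the function name makeSeoUrl, the array keys 'title' and 'id', and the slug rules are chosen only so that the result matches the rewrite pattern from the beginning of the article.

```php
<?php
// Hypothetical stand-in for $yourutils->makeSEOurl(); the array keys
// 'title' and 'id' are assumed, the real article data layout is unknown.
function makeSeoUrl(array $articleData): string
{
    $slug = strtolower($articleData['title']);
    // Replace every run of characters outside a-z and 0-9 with one underscore.
    $slug = preg_replace('/[^a-z0-9]+/', '_', $slug);
    $slug = trim($slug, '_');
    if ($slug === '') {
        $slug = 'article'; // fallback so the URL still matches the rewrite pattern
    }
    return $slug . '_' . (int) $articleData['id'] . '.html';
}

echo makeSeoUrl(['id' => 1, 'title' => 'My first article!']), "\n";
// prints: my_first_article_1.html
```

Because the URL is always derived from the stored article data, there is exactly one valid URL per article, and every other spelling is redirected to it.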
Another nice side effect of the second solution: if you run actions (make_comment, new_article) before sending the 301 redirect, a request like this:
http://yourdomain.com/show_article.php?articleId=4&action=make_comment
creates the new comment, and afterwards the user is redirected to the SEO-friendly URL.
I think this is a very elegant way to keep the browser's address bar clean.