Basic Elements of URL Rewriting for Beginners

Many Web companies agonize over finding the best domain name for their clients. They look for a name that is relevant and appropriate, looks good on a business card, is easy to read, remember and spell correctly even over the phone, sounds professional yet distinctive, and is still available as a dot-com.

They do not hesitate to spend thousands of dollars to buy the name they want when it turns out to have already been registered by a forward-thinking and hard-to-find early adopter.

They agonize over the domain name but neglect the rest of the URL, the part after the domain name, which matters just as much. It too should be appropriate, relevant, professional and easy to spell, read and remember, and for the same reason: to attract customers and to improve search rankings.

Fortunately, a technique called URL rewriting can turn clumsy URLs into clean ones, with far less agony and expense than picking a good domain name. It lets you fill your URLs with friendly, readable keywords without affecting the underlying structure of your pages.

This article covers the following:

1. What is URL rewriting?
2. How can URL rewriting help your search rankings?
3. Examples of URL rewriting, including regular expressions, flags and conditionals.
4. URL rewriting in the wild, such as on Wikipedia, WordPress and shopping websites.
5. Creating Friendly URLs
6. Changing page names and URLs
7. Checklist and troubleshooting

What is URL Rewriting?

Suppose you want to send a business proposal to a partner. You would open your word processor and create a file with a name like proposal.doc. That file might be saved in your Documents directory with a full path of C:\Windows\users\david\Documents\proposal.doc. One file path = one document.

In the same way, if you are building a business website, you might create a page named page1.html, upload it and point your browser to http://www.mybusiness.com/page1.html. One URL = one resource. In this case the resource is a physical Web page, but it could just as well be a page or product pulled from a CMS.

URL rewriting changes all that. It lets you completely separate the URL from the resource. With URL rewriting, you could have http://www.mybusiness.com/aboutus.html taking the user to …/page1.html or …/about-us/ or …/about-this-website-and-Xy7863/. It is a bit like shortcuts or symbolic links on your hard drive. One URL = one way to find a resource.

With URL rewriting, the URL and the resource it leads to can be completely independent of each other. In practice they are usually not wholly independent: the URL normally contains some code, number or name that lets the CMS look up the resource. In theory, though, URL rewriting provides a complete separation.

How Does URL Rewriting Help?

It is almost impossible to tell what this URL is selling:

http://www.diy.com/diy/jsp/bq/nav.jsp?action=detail&fh_secondid=11577676

B&Q went to all the trouble and expense of acquiring diy.com and implementing a stock-controlled e-commerce website, but neglected to make its URLs readable. If you guessed “brown guttering,” you might want to consider playing the lottery.

Even when you search Google UK directly for the product name, “miniflow gutter brown”, B&Q’s own page ranks only seventh in the organic search results. That is despite B&Q having more than 300 branches and a far larger budget, size and exposure than the websites that rank above it for the same search term. Those other results have URLs such as http://www.prof…co.uk/products/brown-miniflo-gutter-148/, in which the URL itself contains the words of the search term.

Looking at B&Q’s URL, we can assume that a file named nav.jsp in the sub-directory /diy/jsp/bq/ is used to display products when given their ID number, such as 12639979. That is the resource tied to this URL.

With URL rewriting, B&Q could turn this into something more recognizable, without restructuring the website:

http://www.diy.com/products/miniflow-gutter-brown/12639979

In this example, URL rewriting tells the Web server to answer a request for /products/miniflow-gutter-brown/12639979 by showing, without the customer or search engine knowing, the page /diy/jsp/bq/nav.jsp?action=detail&fh_secondid=12639979.

How To Rewrite URLs

How you implement URL rewriting on your website depends on your Web server. Apache usually comes with a URL rewriting module, mod_rewrite, already installed. It is easy to set up, and all of the examples in this article are based on it.

The Simplest Case

One of the simplest and most common uses of URL rewriting is renaming a single static Web page, which is much easier than the B&Q example above. To use Apache’s URL rewriting, create or edit the .htaccess file in your website’s document root.

For example, if you have a Web page about computers named Ba7BauefTau.htm, you can add these lines to .htaccess:
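A minimal sketch of those lines, assuming mod_rewrite is enabled and the .htaccess file sits in the document root next to Ba7BauefTau.htm. The pattern is deliberately left unanchored at this stage:

```apache
# Turn on the rewriting engine for this directory
RewriteEngine On
# Serve Ba7BauefTau.htm for any path that contains "computer.htm"
RewriteRule computer.htm Ba7BauefTau.htm
```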

Now, when you visit http://www.mywebsite.com/computer.htm, you will actually be shown the Web page Ba7BauefTau.htm. The browser’s address bar continues to show computer.htm, so visitors and search engines never know that the page originally had such a cryptic name.

Introducing Regular Expressions

In URL rewriting you match only the path of the URL, not the domain name or the first slash. A rule whose pattern is simply computer.htm tells Apache to show the Web page Ba7BauefTau.htm for any path that contains computer.htm. That is a little worrying, because a visit to http://www.mywebsite.com/supercomputer.html would match too. So what we really need is this:

RewriteEngine On
RewriteRule ^computer.htm$ Ba7BauefTau.htm

The pattern ^computer.htm$ is not an ordinary search string but a regular expression, in which special characters such as ^ . + ? * { } ( ) [ ] and $ have extra significance. The ^ matches the beginning of the URL’s path, and $ matches the end. Together they say that the path must begin and end with computer.htm, so computer.htm will match but supercomputer.htm and computer.html will not. (Strictly speaking, the unescaped dot matches any single character; write computer\.htm to match only a literal dot.) This matters for search engines such as Google, which can penalize what they see as duplicate pages, that is, identical pages that can be reached through multiple URLs.

Without File Endings

This can be improved further by dropping the file ending altogether, so that you can visit either

http://www.mywebsite.com/computer or http://www.mywebsite.com/computer/

RewriteEngine On
RewriteRule ^computer/?$ Ba7BauefTau.htm [NC]

The ? means that the preceding character is optional, so in this case the URL will work with or without the slash at the end. A search engine should not treat these as duplicate URLs, and the rule avoids confusion when people accidentally add a slash. [NC] is a flag meaning that the rule is case-insensitive, so http://www.mywebsite.com/CoMpUtEr will also work.

Wikipedia Example

Now it is time to look at a real-world example. Wikipedia appears to use URL rewriting, passing the title of the page to a PHP file. For example:

http://en.wikipedia.org/wiki/Barack_obama

can be rewritten to

http://en.wikipedia.org/w/index.php?title=Barack_obama

This could be implemented with an .htaccess file like so:
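Wikipedia’s real configuration is not shown in this article, so the following is only a sketch of a rule that would produce the mapping above, assuming an .htaccess file in the document root:

```apache
RewriteEngine On
# Pass everything after wiki/ to index.php as the page title
RewriteRule ^wiki/(.+)$ w/index.php?title=$1 [L]
```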

In the previous rule, /? meant zero or one slashes. Had it been /+, it would have meant one or more slashes, so even http://www.mywebsite.com/computer//// would have worked. In this rule, the dot (.) matches any character, so .+ matches one or more of any character. The parentheses ( ) ask Apache to remember whatever .+ matched. The rule therefore tells Apache to look for wiki/ followed by one or more of any character and to remember those characters, which are then written into the new URL as $1. After rewriting, wiki/Barack_obama becomes w/index.php?title=Barack_obama.

Comments and Flags

The example above also introduced comments. Anything after a # is ignored by Apache, so you can explain your rewriting rules for the benefit of future generations. The [L] flag means that if this rule matches, Apache can stop now. Otherwise, Apache would go on applying the subsequent rules, which is a significant feature for complex rule sets but not one that every rule set needs.

Implementing the B&Q Example

The B&Q example discussed above could be implemented with an .htaccess file like this:
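B&Q’s actual server setup is not known (a JSP website may not run on Apache at all), so this is purely an illustrative sketch. It assumes an .htaccess file in the document root and reuses the nav.jsp script and parameters from the B&Q URL shown earlier:

```apache
RewriteEngine On
# Ignore the descriptive words and pass only the numeric ID to the product script
RewriteRule ^products/.*/([0-9]+)$ diy/jsp/bq/nav.jsp?action=detail&fh_secondid=$1 [NC,L]
```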

In this rule, .* means zero or more of any character, that is, nothing or anything. [0-9] matches a single numerical digit, and [0-9]+ matches one or more digits.

Conditional Rewriting

URL rewriting can also include conditions and make use of environment variables. These features make it easy to redirect requests from one domain to another, which is particularly useful when a website changes its domain, for example from mywebsite.co.uk to mywebsite.com.

Domain forwarding

Most domain registrars offer domain forwarding, which redirects all requests made to one domain to another domain. With plain forwarding, however, a request for www.mywebsite.co.uk/computer is sent to the home page at www.mywebsite.com, not to www.mywebsite.com/computer. You can achieve the latter with URL rewriting:
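A sketch of such a rule set, assuming all of the domains are served from the same document root, that this .htaccess file sits in it, and that the website is served over plain http:

```apache
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.mywebsite\.com$ [NC]
RewriteRule (.*) http://www.mywebsite.com/$1 [R=301,L]
```

The block is left without comments so that the condition stays on the second line and the rule it guards on the third.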

The second line in this example is a RewriteCond rather than a RewriteRule. It compares an Apache environment variable on the left with a regular expression on the right. The rule on the next line is considered only if this condition is true.

Here, %{HTTP_HOST} represents the host, or domain, that the browser is trying to visit, such as www.mywebsite.co.uk. The ! means “not.” Together this tells Apache: if the host does not begin and end with www.mywebsite.com, then remember zero or more of any character and rewrite it to www.mywebsite.com/$1. This turns www.mywebsite.co.uk/anything-at-all into www.mywebsite.com/anything-at-all. It also works for all other aliases, such as www.mywebsite.biz/anything-at-all and mywebsite.com/anything-at-all.

The [R=301] flag is an important one. It tells Apache to perform a 301 (that is, permanent) redirect. Apache sends the new URL back to the browser or search engine, which then has to request it again. This time the new URL appears in the browser’s location bar, and search engines take note of it and update their databases. [R] by itself is the same as [R=302] and signifies a temporary redirect.

File Existence and WordPress

Webgranth runs on the popular blogging software WordPress, which lets authors write their own URL, called a “slug.”

WordPress’s .htaccess file looks like this:
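For reference, this is the kind of block WordPress typically writes to .htaccess when friendly permalinks are enabled on a blog installed in the document root. Treat it as a sketch: the exact contents vary with the WordPress version and installation path, and the comments have been added here for explanation.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# Requests for index.php itself are left alone
RewriteRule ^index\.php$ - [L]
# If the request is not an existing file...
RewriteCond %{REQUEST_FILENAME} !-f
# ...and not an existing directory...
RewriteCond %{REQUEST_FILENAME} !-d
# ...hand it to index.php
RewriteRule . /index.php [L]
</IfModule>
```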

Here, -f means “this is a file” and -d means “this is a directory.” Together the conditions tell Apache: if the requested file name is not a file, and the requested file name is not a directory, then rewrite everything (any path containing any character) to the page index.php. The rule is not triggered if you request an image that exists or the log-in page wp-login.php. But if you request anything else, index.php jumps into action.

Internally, index.php looks at an environment variable such as $_SERVER['REQUEST_URI'] and extracts the information it needs to work out what it is looking for. This gives it even more flexibility than Apache’s rewrite rules and lets WordPress mimic URL rewriting rules of its own. When administering a WordPress blog, go to Settings, then Permalinks, in the left-hand menu and choose the type of URL rewriting you would like to mimic.

Rewriting Query Strings

If you rebuild an existing website from scratch, you might want to redirect some of the old website’s popular URLs to their locations on the new website. This could mean redirecting things like prod.php?id=20 to products/great-product/2166, which is itself rewritten to the actual product page.

Apache’s RewriteRule applies only to the path in the URL, not to parameters like id=20. For this type of rewriting you need to refer to the Apache environment variable %{QUERY_STRING}. It can be done like this:
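A sketch under stated assumptions: the .htaccess file is in the document root, and productview.php is a hypothetical name for the script that displays a product on the new website.

```apache
RewriteEngine On
# Old URL: permanently redirect prod.php?id=20 to the new address.
# The trailing "?" drops the old query string from the redirect target.
RewriteCond %{QUERY_STRING} ^id=20$
RewriteRule ^prod\.php$ /products/great-product/2166? [R=301,L]
# New URL: internally rewrite to the script that shows the product.
# productview.php is a hypothetical script name.
RewriteRule ^products/.*/([0-9]+)$ productview.php?id=$1 [L]
```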

In this example, the first RewriteRule triggers a permanent redirect from the old website’s URL to the new website’s URL. The second rule rewrites the new URL to the actual PHP page that displays the product.

Examples of URL Rewriting On Shopping Websites

Complex content-managed websites still face the problem of how to map friendly URLs to underlying resources. The simple example above did that mapping by hand, associating a URL like computer.htm with the file or resource Ba7BauefTau.htm. Wikipedia looks up the resource based on the title, and WordPress applies some complex internal rule sets. But what if your data is more complex, with thousands of products in hundreds of categories? This section shows the approach taken by Shopbychoice and several other online shopping websites.

If you come across a URL like this on Shopbychoice, http://www.shopbychoice.com/smartphone/htc/R0000MY83S, you might assume that the website has a sub-directory called /smartphone/htc/ that contains a file named R0000MY83S.

That is very unlikely. You could try changing the name of the top-level “directory” and you would still arrive at the same page:

http://www.shopbychoice.com/smartphone/htc/R0000MY83S

What really matters is the bit at the end. Looking at this URL, you will see that R0000MY83S is the product’s standard identification number. Change it and you will get either a “Page not found” or an entirely different product page:

http://www.shopbychoice.com/smartphone/htc/R0083M9Y28

The /htc/ also matters. Changing it may also lead to a “Page not found.” So the R0000MY83S tells Shopbychoice what to display, and htc tells it how to display it. This is URL rewriting in action, with the original URL possibly being rewritten to something like this:

http://www.shopbychoice.com/displayproduct.php?asin=R0000MY83S

Features of the Shopbychoice URL

This illustrates some important features of Shopbychoice’s URLs that can be applied to any website with a complex set of resources. It shows that a URL can be generated automatically and can include up to three parts:

1. The words. In this case, the words are based on the product’s category and brand, and all non-alphanumeric characters are replaced, so a slash inside a name would become a hyphen. This is the part that helps humans and search engines.

2. An ID number, or something else that tells the website what to look up, such as R0000MY83S.

3. An identifier, or something else that tells the website where to look for it and how to display it. If htc tells Shopbychoice to look for a product, then it probably triggers a database statement such as

SELECT * FROM products WHERE id='R0000MY83S'

Friendly URLs

When creating friendly URLs, current advice is to separate words with hyphens rather than underscores and to capitalize consistently. Most people search in lowercase, so lowercase is a good choice. Punctuation such as commas and dots should also be turned into hyphens; otherwise some of it would be encoded as sequences like %2C, which look clumsy and might break the URL when it is cut, copied and pasted. For the same reason, you should remove apostrophes and parentheses too.

Whether to replace accented characters is debatable. URLs with accents (or any non-Roman characters) may look bad or even break when rendered in a different character encoding. However, replacing them with their non-accented equivalents, or with hyphens, might make the URLs harder for search engines to find. If the website is aimed at a French audience, leave the French accents in. Substitute them if the French words are few and far between on a mainly English website.

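All of the suggestions above can be handled in a single PHP function, referred to later in this article as GenerateUrl, that takes a title and returns a lowercase, hyphen-separated string ready to be used in a URL.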

Changing Page Names

Search engines generally ignore duplicate content, that is, multiple pages containing the same information, and they may penalize a website that they think is using it deliberately. So avoid it where possible. Google recommends using 301 redirects to send users from old pages to new ones.

When a URL-rewritten page is renamed, both the old and the new URL should keep working so that users are not inconvenienced. To avoid the risk of duplicate content, the old URL should automatically redirect to the new one, as WordPress does.

This is easy to do in PHP. A small function, referred to later in this article as CheckUrl, looks at the current URL and, if it is not the desired URL, redirects the user to the desired one.

Before relying on such a function, test it in your own environment with your own rewrite rules to make sure that it does not cause an infinite redirect loop.

Checklist And Troubleshooting

Use the following checklist when implementing URL rewriting:

Check That It’s Supported

Not all Web servers support URL rewriting. If you put your .htaccess file on one that does not, it will either be ignored or trigger a “500 Internal Server Error.”

Plan Your Approach

Work out what will be mapped to what, and how the correct information will still be retrieved. Suppose you want to introduce new URLs like my-great-product/p/123 to replace your current product URLs like product.php?id=123, and new-category/c/12 to replace category.php?id=12.

Create Your Rewrite Rules

Create an .htaccess file for your new rules. You can start in a /testing/ sub-directory and use the [R] flag, which lets you see where things go.
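A sketch of what that test file might contain, assuming it is saved as /testing/.htaccess and that the product and category URLs follow the pattern planned in the previous step:

```apache
RewriteEngine On
# [R] makes each rewrite visible in the browser's address bar while testing.
# Targets start with /testing/ because this file lives in that sub-directory.
RewriteRule ^.+/p/([0-9]+)$ /testing/product.php?id=$1 [NC,L,R]
RewriteRule ^.+/c/([0-9]+)$ /testing/category.php?id=$1 [NC,L,R]
```

When the file later moves to the document root, the /testing prefix in the targets would need to be removed along with the [R] flag.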

Now, when you visit www.mywebsite.com/testing/my-great-product/p/123, you should be sent to www.mywebsite.com/testing/product.php?id=123. You will get a “Page not found” because product.php is not in your /testing/ sub-directory, but at least you will know that your rules work. Once you are satisfied, move the .htaccess file to your document root and remove the [R] flag. Now www.mywebsite.com/my-great-product/p/123 should work.

Check Your Pages

Make sure that your new URLs still bring in all the correct images, CSS and JavaScript files. For example, the Web browser now believes that your Web page is named 123 in a directory named my-great-product/p/. If the HTML refers to a file named images/logo.jpg, the browser would request the image from www.mywebsite.com/my-great-product/p/images/logo.jpg and would come up with a “File not found.”

You would also need to rewrite the image locations, or make the references absolute (for example, /images/logo.jpg), or put a base href at the top of the <head> of the page (for example, <base href="/product.php">). But if you do that, you would need to fully specify any internal links that begin with # or ?, because they would now go to something like product.php#details.

Change Your URLs

Now find all references to your old URLs and replace them with the new URLs, using a function such as GenerateUrl to create the new URLs consistently. This is the only step that might require looking deep into the underlying code of your website.

Automatically Redirect Your Old URLs

Now that URL rewriting is in place, you will probably want Google to forget about your old URLs and start using the new ones. That is, when a search result brings up product.php?id=123, you would want the user to be visibly redirected to my-great-product/p/123, which would then be internally rewritten back to product.php?id=123.

You could add another rule to .htaccess to accomplish this, but the browser can end up in a redirect loop if you get the rules in the wrong order.
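One way to avoid the loop, sketched here for a single hard-coded product (Apache on its own cannot look up the product name that belongs to an ID), is to test %{THE_REQUEST}. That variable holds the request line exactly as the browser sent it, so the redirect does not fire again after the internal rewrite. The sketch assumes an .htaccess file in the document root:

```apache
RewriteEngine On
# Visibly redirect only direct requests for the old URL.
# THE_REQUEST looks like "GET /product.php?id=123 HTTP/1.1".
RewriteCond %{THE_REQUEST} \s/product\.php\?id=123\s
RewriteRule ^product\.php$ /my-great-product/p/123? [R=301,L]
# Internally rewrite the new URL back to the script
RewriteRule ^.+/p/([0-9]+)$ product.php?id=$1 [NC,L]
```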

Alternatively, you can do the redirect in PHP, using something like the CheckUrl function mentioned above. This has the added advantage that when you rename a product, the old URL immediately becomes invalid and redirects to the new one.

Update and Resubmit Your Site Map

Make sure to carry your new URLs through to your site map, your product feeds and everywhere else they appear.

Conclusion

URL rewriting is a relatively quick and easy way to make your website more appealing to customers and search engines. This article has walked through some examples and the technical details so that you can implement URL rewriting on your own website.

David Meyer

As the most experienced developer of CSSChopper - PSD to HTML Conversion Company, David Meyer firmly believes in building the new ways that lead the people towards success. He focuses on an ideal approach and tries to deliver the perfect services close to the defined needs.