My Stack:

  • LAMP
  • Apache/2.4.41

Background Information:

I recently launched a new website for a client. During the re-design process we decided to:

  • Switch to sitewide HTTPS
  • Remove the .php extension from the URLs
  • Switch to a CMS

Example of OLD URL:
http://www.example.com/courses/acme-course.php

Example of NEW URL:
https://www.example.com/courses/acme-course

My Issue:

An unnecessary additional 301 Redirect is occurring when a user navigates to one of the OLD URLs.

I do not understand why this additional 301 redirect is created, instead of the user being sent directly to the correct destination URL with a single 301 redirect.

Interesting Observation:

The unnecessary additional 301 Redirect does not occur when I use the OLD URL with HTTPS instead of HTTP.

Example:
https://www.example.com/courses/acme-course.php

Using the above URL will correctly do a single 301 Redirect to the correct destination URL of: https://www.example.com/courses/acme-course

Here's an Example of a 301 Redirect Chain:

Original Request URL:

http://www.example.com/courses/acme-course.php

1ST 301 Redirect (Unnecessary):

FROM:

http://www.example.com/courses/acme-course.php

TO:

https://www.example.com/index.php?url=courses/acme-course.php

2ND 301 Redirect (Correct Final Destination URL):

FROM:

https://www.example.com/index.php?url=courses/acme-course.php

TO:

https://www.example.com/courses/acme-course

My .htaccess code:

    # (1) General Settings
    <IfModule mod_rewrite.c>
    Options +FollowSymLinks
    RewriteEngine On
    </IfModule>

    # (2) Force WWW
    <IfModule mod_rewrite.c>
    RewriteCond %{HTTPS} !=off
    RewriteCond %{HTTP_HOST} !^www\. [NC]
    RewriteCond %{SERVER_ADDR} !=127.0.0.1
    RewriteCond %{SERVER_ADDR} !=::1
    RewriteRule ^ %{ENV:PROTO}://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
    </IfModule>

    # (3) Force HTTPS
    <IfModule mod_rewrite.c>
    RewriteCond %{HTTPS} !=on
    RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]
    </IfModule>

    # (4) URL Routing for CMS
    <IfModule mod_rewrite.c>
    RewriteCond %{HTTPS} =on
    RewriteRule ^ - [env=proto:https]
    RewriteCond %{HTTPS} !=on
    RewriteRule ^ - [env=proto:http]

    ## Check if file/directory exists
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d

    ## Route all other URLs to index.php/URL
    RewriteRule ^(.*)$ index.php?url=$1 [PT,L,QSA]
    </IfModule>

  • Do you have any intention of implementing HSTS in the future? Commented Jan 11, 2020 at 19:24
  • @MrWhite Yes, I am familiar with and intend on implementing HSTS in the future, AFTER I've seen (i) the old HTTP URLs get pushed out of Google Index and (ii) confirm that the new HTTPS URLs have fully-resolved and are indexed. Commented Jan 11, 2020 at 21:22
  • @MrWhite If I reorder the rules/conditions, does my placement of Options +FollowSymlinks stay the same? Commented Jan 11, 2020 at 21:36
  • It doesn't much matter. I've addressed the placement of these directives in more detail in my answer. Commented Jan 11, 2020 at 22:25

1 Answer

You have two main issues:

  1. Your directives are in the wrong order in the .htaccess file. Your HTTP to HTTPS and www canonical redirects need to go before the front-controller that routes the URL to your CMS. Otherwise you get the incorrect external redirect to /index.php?url=courses/acme-course.php, which exposes your internal CMS URL structure.
  2. The removal of .php is not actually being performed by your .htaccess directives. I assume this is being done by your application/CMS logic? Consequently, this will always result in a second redirect (since .htaccess first redirects to HTTPS on the same URL-path). You need to do something like the following at the top of your .htaccess file to remove the .php extension. The condition restricts the rule to the initial request, so that it does not also match the internal rewrite to index.php and cause a redirect loop.

    RewriteCond %{ENV:REDIRECT_STATUS} ^$
    RewriteRule (.+)\.php$ https://www.example.com/$1 [R=301,L]

UPDATE: If I reorder the rules/conditions, does my placement of Options +FollowSymlinks stay the same?

It doesn't really matter where the Options directive occurs. However, it is logical (from a readability standpoint) to have it near the top. (Apache directives don't necessarily execute in the order they appear in the config file, as each module works independently.)

Assuming you are hand-coding your .htaccess file then it can be tidied up...

  1. There is no need for the (multiple) <IfModule mod_rewrite.c> wrappers. Is mod_rewrite optional? Is your site being ported to multiple servers where mod_rewrite is not enabled?

  2. There is no need for multiple RewriteEngine directives. The last instance actually wins and controls the entire file.

    Multiple <IfModule> blocks and RewriteEngine are typical of systems that are automatically edited by code and/or designed to function unedited on multiple servers.

So, your .htaccess file should be rewritten like this in this order:

    Options +FollowSymlinks

    # Enable the rewrite engine...
    RewriteEngine On

    # ----------------------------------------------------------------------
    # | Forcing `https://`                                                 |
    # ----------------------------------------------------------------------

    # Redirect to HTTPS on the "same host" (requirement for HSTS)
    RewriteCond %{HTTPS} !=on
    RewriteRule (.*) https://%{HTTP_HOST}/$1 [R=301,L]

    # ----------------------------------------------------------------------
    # | Forcing `www`                                                      |
    # ----------------------------------------------------------------------

    RewriteCond %{HTTP_HOST} !^www\.
    RewriteCond %{SERVER_ADDR} !=127.0.0.1
    RewriteCond %{SERVER_ADDR} !=::1
    RewriteRule ^ https://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

    # ----------------------------------------------------------------------
    # | URL Routing for CMS                                                |
    # ----------------------------------------------------------------------

    # (3)
    RewriteCond %{HTTPS} =on
    RewriteRule ^ - [env=proto:https]
    RewriteCond %{HTTPS} !=on
    RewriteRule ^ - [env=proto:http]

    # (4) - Check if physical file exists
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d

    # (5) - Rewrite all other URLs to index.php/URL
    RewriteRule (.*) index.php?url=$1 [L,QSA]
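If you also want the old .php URLs to reach their destination in a single hop, one possible arrangement is sketched below. It is only a sketch and rests on two assumptions that are not confirmed above: that every old URL maps to the same path minus the .php extension (your url_redirects table may contain exceptions), and that www.example.com is the only canonical host. The check on REDIRECT_STATUS limits the rule to the initial request, so the internal rewrite to index.php is not redirected as well.

```apache
Options +FollowSymlinks
RewriteEngine On

# Old .php URLs: a single external redirect to the canonical https://www URL.
# REDIRECT_STATUS is empty on the initial request and set after an internal
# rewrite, so the later rewrite to index.php is not caught by this rule.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule (.+)\.php$ https://www.example.com/$1 [R=301,L]

# All other requests: HTTP to HTTPS on the same host...
RewriteCond %{HTTPS} !=on
RewriteRule (.*) https://%{HTTP_HOST}/$1 [R=301,L]

# ...then non-www to www
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^ https://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

# Front-controller: anything that is not a physical file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) index.php?url=$1 [L,QSA]
```

Note that the first rule sends an HTTP request for a .php URL straight to HTTPS and strips the extension at the same time. For requests that already use the www host this is still a redirect to HTTPS on the same hostname.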

Additional Notes:

  • The PROTO environment variable contains whatever protocol is being requested. With the order of the redirects this is now always going to be HTTPS. The reason for this variable at all is so the CMS can redirect to HTTP if HTTP is accessed, or HTTPS if HTTPS is accessed. If you are forcing HTTPS then it doesn't really apply. (Although this env var may still be used by your application.)

  • Rarely should you use the NC flag on a negated condition, which is why I removed it from the condition !^www\. You want the redirect to happen whenever the host does not start with www. in all lowercase. With the NC flag, a host such as WwW. would not be redirected, although this would be very rare anyway.

  • I've removed the unnecessary check for HTTPS on the www canonical redirect.

  • The PT flag on the last RewriteRule is not required in .htaccess. In .htaccess this is the default behaviour (pass through).

  • You will need to clear your browser cache before testing, as the erroneous 301 redirects will likely have been cached by the browser. It is a good idea to test with 302 (temporary) redirects for this reason.
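As a rough illustration of the last point, the two canonical redirects could temporarily be written with 302 while you test, and switched back to 301 once the chain looks right. This is a sketch based on the rules above, with the localhost checks left out for brevity:

```apache
# TESTING ONLY: temporary (302) redirects are not cached permanently by
# browsers. Change R=302 back to R=301 once the behaviour is confirmed.
RewriteCond %{HTTPS} !=on
RewriteRule (.*) https://%{HTTP_HOST}/$1 [R=302,L]

RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^ https://www.%{HTTP_HOST}%{REQUEST_URI} [R=302,L]
```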

  • #2: No, I currently don't have anything that 'removes' the .php extension. If the requested URI is found in my url_redirects table, it redirects the user to the new destination URL. Example: courses/acme-course.php > courses/acme-course Commented Jan 11, 2020 at 21:29
  • OK, so your "url_redirects table" is effectively your .php removal script then, it would seem? A small redirect "chain" of two redirects (which you can't avoid if you are performing one redirect in .htaccess and a second in your application) is not necessarily a bad thing. And if you are planning to implement HSTS (preload list) then this actually becomes a requirement (at least to redirect HTTP to HTTPS on the same hostname). Commented Jan 11, 2020 at 21:55
  • I've updated my answer. Commented Jan 11, 2020 at 22:27
  • Your Additional Notes section really helped me understand. After reviewing your updated code, do I really need the following? `RewriteCond %{HTTPS} =on` `RewriteRule ^ - [env=proto:https]` `RewriteCond %{HTTPS} !=on` `RewriteRule ^ - [env=proto:http]` Commented Jan 11, 2020 at 22:49
  • "do I really need the following?" - As mentioned above, only if your application uses it. Some CMS use an environment variable like this to construct absolute URLs internally. Whether your CMS uses this I couldn't say. Commented Jan 11, 2020 at 22:54
