I am attempting to modify my .htaccess file within a specific directory. If a web user requests any file in this directory whose name looks like the examples below, I want them to be redirected back to the home page. Here are some example file names.

  • /cat_1234.pdf
  • /cat_blahbla.doc
  • /cat_$9989&428.jpg
  • /cat_-309bn-020n.webp

...how can I tell my RewriteCond to look out for these patterns? Here was my best attempt, which I thought would work, but it doesn't...

    <IfModule mod_rewrite.c>
    RewriteCond %{REQUEST_URI} ^cat_([0-9a-zA-Z_]+)\.(pdf|doc|jpg|webp) [NC]
    RewriteRule . /index.php [R=302,L]
    </IfModule>

What am I missing?

  • Please question the <IfModule mod_rewrite> in this. It's often recommended because it means "a missing module shouldn't cause the server to not work at all", but it also means if you deploy your software on a server that hasn't mod_rewrite enabled, your cat_* files are completely unprotected. Commented Mar 29, 2024 at 6:51
  • For starters, it doesn't look like that regex will match the hyphen in your fourth example. On the other hand, it will match an underscore that you don't appear to need. Commented Mar 29, 2024 at 8:30
  • Examples don't make a pattern. Describe the patterns more specifically. Is any character allowed between cat_ and the extension? Or only specific ones? Do they depend on the extension? Does it matter if the regular expression matches anything starting with cat_? Does the extension actually matter? Commented Mar 29, 2024 at 14:48

1 Answer

You've not stated the "specific directory" in which the .htaccess file and the files you are protecting are located. (Although that shouldn't matter if we rework the rule.)

    RewriteCond %{REQUEST_URI} ^cat_([0-9a-zA-Z_]+)\.(pdf|doc|jpg|webp) [NC]
    RewriteRule . /index.php [R=302,L]

The REQUEST_URI server variable contains the full URL-path (including the slash prefix), so the regex would normally need to include the "specific directory", not just the filename (unless you adjust the regex). Because your regex has a start-of-string anchor (^) and REQUEST_URI always starts with a slash, ^cat_ can never match, so this condition (RewriteCond directive) will never be successful. (You have also omitted the end-of-string anchor.)
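If you did want to keep the condition, adjusting the regex could look something like the sketch below. It drops the start-of-string anchor and matches the last path segment instead, so it does not depend on the directory name. The [^/]+ filename part is an assumption about what you want to match; it is broader than your original character class.

```apache
# Sketch only: REQUEST_URI looks like /some/dir/cat_1234.pdf, so match the
# final path segment rather than anchoring "cat_" to the start of the string.
RewriteEngine On
RewriteCond %{REQUEST_URI} /cat_[^/]+\.(pdf|doc|jpg|webp)$ [NC]
RewriteRule ^ /index.php [R=302,L]
```

The /index.php target is kept from the question only so the two versions are easy to compare.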

Your regex would also fail to match your 3rd and 4th examples because your regex character class ([0-9a-zA-Z_]) omits the special characters $, & and - that are present in these filenames. Although I would surmise you do not need to be so specific and catching cat_<anything>.pdf (for example) would be OK.

However, you do not need a separate condition here. It is easier and more efficient to just use the RewriteRule pattern, which matches relative to the directory that contains the .htaccess file (and excludes the slash prefix), so you do not need to worry about the rest of the URL-path.

I also doubt that you should be redirecting to /index.php. Should this not be simply / (the root directory), allowing the directory index (i.e. index.php) to be served by mod_dir? Is that not your canonical URL?

Try the following instead, in the .htaccess file in the directory you are protecting.

RewriteRule ^cat_[^/]+\.(pdf|doc|jpg|webp)$ / [R=302,L] 

This regex is perhaps slightly broader than it needs to be, but that also makes it simpler: [^/] matches any character that is not a / (path separator).
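For context, a minimal sketch of how the whole .htaccess file might look. This assumes the rewrite engine is not already enabled elsewhere, and that you want to keep the case-insensitive matching from your original condition (the NC flag is my addition to the rule above, not a requirement).

```apache
# .htaccess in the directory that holds the cat_* files
RewriteEngine On

# The pattern is matched against the path relative to this directory
# (no leading slash). NC makes the match case-insensitive, e.g. cat_1234.PDF
RewriteRule ^cat_[^/]+\.(pdf|doc|jpg|webp)$ / [R=302,NC,L]
```

With this in place, a request for cat_1234.pdf in that directory should receive a 302 redirect to /, while other files in the directory are unaffected.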

And no need for the <IfModule> wrapper, unless this rule is entirely optional.


However, instead of redirecting to the homepage (which is confusing for users and unnecessary for bots), I would simply block such requests with a 403 Forbidden. For example:

    <FilesMatch "^cat_[^/]+\.(pdf|doc|jpg|webp)$">
    Require all denied
    </FilesMatch>
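If you would rather keep everything in mod_rewrite, a roughly equivalent sketch uses the F (forbidden) flag. The NC flag here is an assumption carried over from your original condition.

```apache
RewriteEngine On

# "-" means no substitution; F responds with 403 Forbidden and implies L.
RewriteRule ^cat_[^/]+\.(pdf|doc|jpg|webp)$ - [F,NC]
```

Note that the Require directive in the <FilesMatch> version is Apache 2.4 syntax, whereas this variant depends only on mod_rewrite being enabled.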
  • MrWhite, thank you for the learning curve on this process. This was a major help for me. Commented Mar 28, 2024 at 17:53
  • @klewis If things like cat_.docx are also restricted then consider using this regex: ^cat_[^/]*\.(pdf|docx?|jpe?g|webp)$ Commented Mar 29, 2024 at 10:42
  • @klewis Also, I prefer one-liners and think a 404 would be better than a 403. RewriteRule ^cat_[^/]*\.(pdf|docx?|jpe?g|webp)$ - [R=404,NC,L]. A redirect or forbidden reveals that the asset exists, a 404 at least masks the existence. Commented Mar 29, 2024 at 10:47
  • Does it actually make sense to allow other extensions? Otherwise just ^cat_ (without anything after) would be enough. Commented Mar 29, 2024 at 12:21
  • @DidierL ^cat_[^/]*$ Commented Mar 29, 2024 at 15:12
