Opened 4 years ago

Closed 4 years ago

#2320 closed defect (duplicate)

Inconsistent URL decoding with X-Accel-Redirect depending on whether the original request URL was url-encoded

Reported by: ThiefMaster Owned by:
Priority: major Milestone:
Component: nginx-core Version: 1.19.x
Keywords: Cc: ThiefMaster
uname -a: Linux eluvian 5.15.19-gentoo #1 SMP Sun Feb 6 19:44:25 CET 2022 x86_64 AMD Ryzen 5 3600X 6-Core Processor AuthenticAMD GNU/Linux
nginx -V: nginx version: nginx/1.20.1
built with OpenSSL 1.1.1l 24 Aug 2021
TLS SNI support enabled
configure arguments: --prefix=/usr --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error_log --pid-path=/run/nginx.pid --lock-path=/run/lock/nginx.lock --with-cc-opt=-I/usr/include --with-ld-opt=-L/usr/lib64 --http-log-path=/var/log/nginx/access_log --http-client-body-temp-path=/var/lib/nginx/tmp/client --http-proxy-temp-path=/var/lib/nginx/tmp/proxy --http-fastcgi-temp-path=/var/lib/nginx/tmp/fastcgi --http-scgi-temp-path=/var/lib/nginx/tmp/scgi --http-uwsgi-temp-path=/var/lib/nginx/tmp/uwsgi --with-compat --with-debug --with-http_v2_module --with-pcre --without-http_browser_module --without-http_empty_gif_module --without-http_geo_module --without-http_grpc_module --without-http_limit_req_module --without-http_limit_conn_module --without-http_memcached_module --without-http_mirror_module --without-http_scgi_module --without-http_ssi_module --without-http_split_clients_module --without-http_upstream_hash_module --without-http_upstream_ip_hash_module --without-http_upstream_keepalive_module --without-http_upstream_least_conn_module --without-http_upstream_zone_module --without-http_userid_module --with-http_addition_module --with-http_stub_status_module --with-http_sub_module --with-http_realip_module --add-module=external_module/ngx-fancyindex-0.4.4 --with-http_ssl_module --without-stream_access_module --without-stream_geo_module --without-stream_limit_conn_module --without-stream_map_module --without-stream_return_module --without-stream_split_clients_module --without-stream_upstream_hash_module --without-stream_upstream_least_conn_module --without-stream_upstream_zone_module --without-mail_imap_module --without-mail_pop3_module --without-mail_smtp_module --user=nginx --group=nginx

Description

I'm using nginx and x-accel-redirect in what's probably a slightly creative way to proxy file downloads from S3 in order to not show the very ugly and temporary S3 URL to my users. This works fine, but I noticed that in a few very rare cases the requests fail (usually due to an invalid S3 signature).

In any case, I managed to reproduce it without S3 and it seem to happen because depending on whether the URL the user accesses contains url-encoded charaters or not, the URL encoding behavior in the internal request changes.

This is a problem of course, because it means I cannot reliable know whether I need to URL-encode the path included in my x-accel-redirect header or not unless I use a horrible workaround of checking whether the current request url contains a % character.

I believe this behavior is so strange that it can only be a bug, but if there are any workarounds to get consistent behavior (so I can either always url-encode the path in my x-accel-redirect header or never do so) please let me know.

I'm attaching a minimal test case to reproduce this. The nginx version used there was the one that's stable on my Gentoo system, but I'm also including a Dockerfile which can be used to test against the nginx:latest image on Docker hub in this Git repo: https://github.com/ThiefMaster/nginx-bug-repro

If you want to run the test case locally:
Put the two files in some folder, create a Python virtualenv, activate it and install flask in there (pip install flask).

Then run the flask app with flask run -p 12345 -h 0.0.0.0 and nginx with nginx -c $(realpath nginx.conf) -e /dev/stdout.

nginx.conf:

daemon off; worker_processes 2; pid /tmp/nginx/nginx.pid; error_log /dev/stdout info; events { worker_connections 1024; use epoll; } http {	client_body_temp_path /tmp/nginx/client;	proxy_temp_path /tmp/nginx/proxy;	fastcgi_temp_path /tmp/nginx/fcgi;	uwsgi_temp_path /tmp/nginx/uwsgi; log_format short '$remote_addr [$time_local] "$request" $status $bytes_sent';	server { listen 8000 default_server; listen [::]:8000 default_server; access_log /dev/stdout short;	location ~ ^/_internal/(https?)/([^/]+)/(.+)$ {	internal;	set $download_protocol $1;	set $download_host $2;	set $download_path $3;	set $download_url $download_protocol://$download_host/$download_path;	resolver 127.0.0.1;	proxy_pass $download_url$is_args$args;	}	location / {	proxy_pass http://127.0.0.1:12345;	}	root /var/empty; } } 

app.py:

from urllib.parse import quote from flask import Flask, url_for, request app = Flask(__name__) @app.route('/') def index(): return f''' WITHOUT explicitly encoding x-accel-redirect url:<br> <a href="{url_for('test_without')}">Test [main url without encoding, target without encoding]</a><br> <a href="{url_for('test_without', enc_target=1)}">Test [main url without encoding, target with encoding]</a> 💣<br> <br> <a href="{url_for('test_with')}">Test [main url with encoding, target without encoding]</a><br> <a href="{url_for('test_with', enc_target=1)}">Test [main url with encoding, target with encoding]</a><br> <br><br> WITH explicitly encoding x-accel-redirect url:<br> <a href="{url_for('test_without', enc_xar=1)}">Test [main url without encoding, target without encoding]</a><br> <a href="{url_for('test_without', enc_xar=1, enc_target=1)}">Test [main url without encoding, target with encoding]</a><br> <br> <a href="{url_for('test_with', enc_xar=1)}">Test [main url with encoding, target without encoding]</a><br> <a href="{url_for('test_with', enc_xar=1, enc_target=1)}">Test [main url with encoding, target with encoding]</a> ⚠ ''' def _test(): endpoint = 'target_with' if request.args.get('enc_target') == '1' else 'target_without' path = url_for(endpoint) internal_url = f'/_internal/http/{request.host}{path}' print(f'Redirecting internally to {internal_url}') response = app.make_response('') if request.args.get('enc_xar') == '1': internal_url = quote(internal_url) response.headers['X-Accel-Redirect'] = internal_url return response @app.route('/test-without/') def test_without(): return _test() @app.route('/test with/') def test_with(): return _test() def _target(): return f"target called url={request.base_url}" @app.route('/target-without') def target_without(): return _target() @app.route('/target with') def target_with(): return _target() @app.errorhandler(404) def notfound(exc): return f'NOT FOUND: {request.base_url}' 

Change History (1)

comment:1 by Maxim Dounin, 4 years ago

Resolution: duplicate
Status: newclosed

This looks like a duplicate of #348: excessive URI encoding is done by set ... $1; if there were escaped characters in the original request URI. As a workaround, consider using named captures to avoid this extra escaping.

Note: See TracTickets for help on using tickets.