Bug #2738

mediawiki redirect loop if REQUEST_URI not orig req in 1.4.40

Added by LoneFox over 1 year ago. Updated over 1 year ago.

Status: Fixed
Priority: Normal
Assignee: -
Category: core
Target version:
Start date: 2016-07-19
Due date:
% Done: 100%
Estimated time:
Missing in 1.5.x:

Description

I have mediawiki set up with some URL cleanup rewrites, basically copied from https://www.mediawiki.org/wiki/Manual:Short_URL/wiki/Page_title_--_Lighttpd_rewrite--root_access

  alias.url += ("/w" => "/usr/share/mediawiki/")
  url.rewrite-if-not-file += (
    "^/wiki/(mw-)?config/?" => "$0",
    "^/wiki/([^?]*)(?:\?(.*))?" => "/w/index.php?title=$1&$2",
    "^/wiki/([^?]*)" => "/w/index.php?title=$1" 
  )

This works in earlier versions of lighttpd, but with 1.4.40 it results in an endless 301 Moved Permanently loop.

According to git bisect dbdab5dbc9b98df9c40f11e0fc6a6ce49bfea804 is the first bad commit.
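A rough Python model of the rewrite rules above shows which URI they hand to index.php. This is only a sketch under stated assumptions: `RULES` and `rewrite` are hypothetical names, Python's `re` stands in for lighttpd's PCRE matching, and the file-existence check implied by `url.rewrite-if-not-file` is ignored.

```python
import re

# Rules copied from the report, in order; the first matching rule wins.
RULES = [
    (r"^/wiki/(mw-)?config/?", "$0"),
    (r"^/wiki/([^?]*)(?:\?(.*))?", "/w/index.php?title=$1&$2"),
    (r"^/wiki/([^?]*)", "/w/index.php?title=$1"),
]


def rewrite(uri):
    """Return the rewritten URI, or the input unchanged if no rule matches."""
    for pattern, target in RULES:
        m = re.search(pattern, uri)
        if m is None:
            continue
        # Expand $0..$9; a group that did not participate becomes "".
        return re.sub(r"\$(\d)", lambda ref: m.group(int(ref.group(1))) or "", target)
    return uri


print(rewrite("/wiki/Main_Page"))  # /w/index.php?title=Main_Page&
```

Because the second rule always appends `&$2`, a request without a query string is rewritten with a trailing `&`.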

lighttpd.conf.tar.gz (2.32 KB), attached by LoneFox, 2016-07-19 07:50

Associated revisions

Revision ed340897 (diff)
Added by gstrauss over 1 year ago

do not set REDIRECT_URI in mod_magnet, mod_rewrite (#2738)

reverts b473220d

x-ref:
"mediawiki redirect loop if REQUEST_URI not orig req in 1.4.40"
https://redmine.lighttpd.net/issues/2738

Revision 9af58a97 (diff)
Added by gstrauss over 1 year ago

revert 1.4.40 swap of REQUEST_URI, REDIRECT_URI (fixes #2738)

reverts part of dbdab5db which swapped REQUEST_URI, REDIRECT_URI

x-ref:
"mediawiki redirect loop if REQUEST_URI not orig req in 1.4.40"
https://redmine.lighttpd.net/issues/2738

Explanation:

REQUEST_URI and REDIRECT_URI are not part of CGI standard environment.
The reason for their existence is that PATH_INFO in CGI environment may
be different from the path in the current request. The main reason for
this potential difference is that the URI path is normalized to a path
in the filesystem and tested against the filesystem to determine which
part is SCRIPT_NAME and which part is PATH_INFO. In case-insensitive
filesystems, the URI might be lowercased before testing against the
filesystem, leading to loss of case-sensitive submission in any
resulting PATH_INFO. Also, duplicated slashes "///" and directory
references "/." and "/.." are removed, including prior path component in
the case of "/..". This might be undesirable when the information after
the SCRIPT_NAME is virtual information and the target script needs the
virtual path preserved as-is. In that case, the target script can
re-parse REQUEST_URI (or REDIRECT_URI, as appropriate) to obtain the
unmodified information from the URI.

con->request.uri is equivalent to con->request.orig_uri unless the
request has been internally rewritten (e.g. by mod_rewrite, mod_magnet,
others), in which case con->request.orig_uri is the request made by the
client, and con->request.uri is the current URI being processed.

Historical REQUEST_URI (environment variable) lighttpd inconsistencies
- mod_cml set REQUEST_URI to con->request.orig_uri
- mod_cgi set REQUEST_URI to con->request.orig_uri
- mod_fastcgi set REQUEST_URI to con->request.orig_uri
- mod_scgi set REQUEST_URI to con->request.orig_uri

- mod_ssi set REQUEST_URI to current con->request.uri
- mod_magnet set MAGNET_ENV_REQUEST_URI to current con->request.uri
and MAGNET_ENV_REQUEST_ORIG_URI to con->request.orig_uri

Historically, REDIRECT_URI (environment variable) was set only in
mod_fastcgi and mod_scgi, and was set to con->request.uri

Since lighttpd 1.4.40 provides REDIRECT_URI with con->request.orig_uri,
changes were made to REQUEST_URI for consistency, with the hope that
there would be little impact to existing configurations since the
request uri and original request uri are the same unless there has been
an internal redirect. It turns out that various PHP frameworks use
REQUEST_URI and require that it be the original URI requested by client.

Therefore, this change is being reverted, and lighttpd will set
REQUEST_URI to con->request.orig_uri in mod_cgi, mod_fastcgi, mod_scgi
as was done in lighttpd 1.4.39 and earlier. Similarly, REDIRECT_URI
also has the prior behavior in mod_fastcgi and mod_scgi, and is added to
mod_cgi.

A future release of lighttpd might change mod_ssi to be consistent with
the other modules in setting REQUEST_URI to con->request.orig_uri and to
add REDIRECT_URI, when an internal redirect has occurred.
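The difference between the reverted 1.4.40 behavior and the restored behavior can be sketched as a small Python model. The function name `uri_env` and the `swapped` flag are hypothetical. The model also assumes REDIRECT_URI is only emitted when the request was internally rewritten, which is how the explanation above reads but is not verified here against the lighttpd source.

```python
def uri_env(orig_uri, uri, swapped=False):
    """Return the REQUEST_URI / REDIRECT_URI pair a backend would see.

    orig_uri models con->request.orig_uri (what the client requested).
    uri models con->request.uri (the URI after any internal rewrite).
    swapped=True models the lighttpd 1.4.40 behavior that was reverted.
    """
    if swapped:
        request_uri, redirect_uri = uri, orig_uri
    else:
        request_uri, redirect_uri = orig_uri, uri

    env = {"REQUEST_URI": request_uri}
    # Assumption: REDIRECT_URI appears only after an internal rewrite.
    if uri != orig_uri:
        env["REDIRECT_URI"] = redirect_uri
    return env
```

With `orig_uri="/"` and `uri="/index.php?_url=/"`, the default call reproduces the values reported for 1.4.39 in comment #16, and `swapped=True` reproduces the 1.4.40 values.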

History

#1

Updated by gstrauss over 1 year ago

Thanks for the report. I'll try to repro. Some more details would be useful (besides "download and install mediawiki" :))

#2

Updated by gstrauss over 1 year ago

The git bisect commit you referenced did not touch mod_alias or mod_rewrite, and affected error status >= 400. Might there be any other relevant parts of your lighttpd.conf? Would you attach your lighttpd.conf?

#3

Updated by gstrauss over 1 year ago

FYI: the commit following dbdab5db is b473220d which sets REDIRECT_URI in mod_rewrite if it has not already been set.

#4

Updated by gstrauss over 1 year ago

  • Status changed from New to Need Feedback

What is the request that results in a 301? (What other details can you provide about this request and what you expect it to access and what you expect lighttpd to do?)

#5

Updated by LoneFox over 1 year ago

gstrauss wrote:

Might there be any other relevant parts of your lighttpd.conf? Would you attach your lighttpd.conf?

I'm attaching a tarball that contains the config files.
On the mediawiki side I have the changes from that link in the original report, and I'm also using the Auth_remoteuser extension. Other than that, it should be the standard configuration.

gstrauss wrote:

What is the request that results in a 301? (What other details can you provide about this request and what you expect it to access and what you expect lighttpd to do?)

Any attempt to access the wiki causes it, for example https://localhost:1985/wiki/Main_Page
(Normally I use a domain from FreeDNS, but I don't want to put that on this public bugtracker, so I changed it to use localhost instead. Same behavior in both cases.)

#6

Updated by gstrauss over 1 year ago

LoneFox, what is the version of lighttpd you are currently using (which works for you)?

How did you install mediawiki? In /wiki or /w? This is not a pure vanilla installation. You have made changes for short URLs. What changes were made? What else might be useful information for someone who has NEVER before run mediawiki (me)? We're not going to get very far if you think that I am going to install an entire framework and figure out what customizations you made for special features, having never used it before. Do things work if you (temporarily!) symlink wiki -> w?

If you have things working on localhost in a test config, please try disabling various modules (in your test config) so that they can be ruled out. mod_compress, mod_fastcgi (unused in your config), mod_userdir, mod_status, mod_auth, mod_setenv, mod_redirect. Comment out the directives for the functionality provided by these modules. Try to narrow down the problem. If only accessible from localhost, can you also disable the ssl? It would be helpful to reduce your config to the smallest interaction between modules.

Separately from the above, you can set debug.log-request-handling = "enable", restart lighttpd, trigger the 301 loop for a second. Lots of output will be logged. Please take a look to make sure the data can be shared and then post the log so that we can see what is looping. (If you change anything in the log, please note what you substituted in, but obviously omit the original.)

#7

Updated by LoneFox over 1 year ago

I did some experiments and found out what is happening. The mod_cgi.c change in that commit modifies REQUEST_URI.
With 1.4.39 (and 1.4.40 with that change reverted), mediawiki sees REQUEST_URI = /wiki/Main_Page and works properly.
With unmodified 1.4.40 it sees REQUEST_URI = /w/index.php?title=Main_Page& and redirects.

#8

Updated by gstrauss over 1 year ago

Thanks for that! It would appear the behavior change came from the >= 0 test, instead of > 0. If you compile your own version of lighttpd, you might test with that change in mod_cgi.c.

-               if (!buffer_string_is_empty(con->request.orig_uri)) {
+               if (con->error_handler_saved_status >= 0) {
+                       cgi_env_add(&env, CONST_STR_LEN("REQUEST_URI"), CONST_BUF_LEN(con->request.uri));
+               } else {
                        cgi_env_add(&env, CONST_STR_LEN("REQUEST_URI"), CONST_BUF_LEN(con->request.orig_uri));
                }

I'm going to look into the mediawiki code, too, to confirm this is what is happening. REQUEST_URI is a non-standard CGI environment variable, and so is REDIRECT_URI, which is added in the patch you mention. Perhaps mediawiki should prefer REDIRECT_URI over REQUEST_URI? I am not sure yet. Still looking...

#9

Updated by gstrauss over 1 year ago

mediawiki includes/DefaultSettings.php contains the following

/**
 * Whether to support URLs like index.php/Page_title These often break when PHP
 * is set up in CGI mode. PATH_INFO *may* be correct if cgi.fix_pathinfo is set,
 * but then again it may not; lighttpd converts incoming path data to lowercase
 * on systems with case-insensitive filesystems, and there have been reports of
 * problems on Apache as well.
 *
 * To be safe we'll continue to keep it off by default.
 *
 * Override this to false if $_SERVER['PATH_INFO'] contains unexpectedly
 * incorrect garbage, or to true if it is really correct.
 *
 * The default $wgArticlePath will be set based on this value at runtime, but if
 * you have customized it, having this incorrectly set to true can cause
 * redirect loops when "pretty URLs" are used.
 * @since 1.2.1
 */

In lighttpd 1.4.40, lighttpd attempts to preserve the case in PATH_INFO, so the comment above about lighttpd is no longer true.

Since you're using the rewrite rules, have you tried changing LocalSettings.php to use $wgUsePathInfo = false; instead of true?

#10

Updated by LoneFox over 1 year ago

gstrauss wrote:

Thanks for that! It would appear the behavior change came from the >= 0 test, instead of > 0. If you compile your own version of lighttpd, you might test with that change in mod_cgi.c.

Yes, that fixes the problem.

#11

Updated by gstrauss over 1 year ago

Glad to hear that it works for you and mediawiki.

I'm not convinced that change is the generic solution, as the problem stems from PHP and, separately, mediawiki, path shenanigans.

As previously noted, REQUEST_URI is not part of the CGI spec, and neither is REDIRECT_URI.

I think that mediawiki should use REDIRECT_URI, if set, instead of REQUEST_URI, if mediawiki otherwise uses the same parsing code.

However, you've already parsed path info into the query string in the rewrite rule, so mediawiki should just get the info from there (in REDIRECT_URI, which already provides 'title=...' in the query string), rather than attempting to reparse path info! All these shenanigans are mediawiki attempts to obtain path info before physical path translation to filesystem paths, which removes double slashes "//" and path segments like "/../", and potentially lowercases on case-insensitive filesystems, or other physical filesystem path translation. You've provided that unmodified path in the query string as title=..., and I think that mediawiki should be using that.
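A small Python illustration of the point about path translation, as a sketch only: `posixpath.normpath` stands in for lighttpd's own path simplification, and the requested URL is a made-up example.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit

# Hypothetical request whose virtual title contains "//" and "/..".
requested = "/wiki/Foo//Bar/../Baz"

# Filesystem-style normalization collapses "//" and resolves "/..",
# so the original title cannot be recovered from the translated path.
normalized = posixpath.normpath(requested)

# The rewrite rule copied the raw title into the query string instead,
# where it survives untouched.
rewritten = "/w/index.php?title=Foo//Bar/../Baz&"
title = parse_qs(urlsplit(rewritten).query)["title"][0]

print(normalized)  # /wiki/Foo/Baz
print(title)       # Foo//Bar/../Baz
```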

#12

Updated by gstrauss over 1 year ago

I am curious if the following would work if you added it to initialization code in mediawiki/includes/WebStart.php (and if you don't patch lighttpd as you did above)

untested:

# mediawiki expects REQUEST_URI to be the original URL.  
# However, lighttpd 1.4.40 and later set REDIRECT_URI to 
# the original URL and set REQUEST_URI to the current URL
# when request has been internally redirected in lighttpd.
if ( !empty( $_SERVER['REDIRECT_URI'] )
  && isset(  $_SERVER['SERVER_SOFTWARE'] )
  && strpos( $_SERVER['SERVER_SOFTWARE'], 'lighttpd/1.4.40') === 0 ) {
    $_SERVER['REQUEST_URI'] = $_SERVER['REDIRECT_URI'];
}

edited to specify this only for lighttpd 1.4.40
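For a backend that is not written in PHP, the same workaround could be sketched in Python against a CGI-style environment dictionary. This is a hedged port of the snippet above, equally untested against a real lighttpd 1.4.40, and `restore_request_uri` is a hypothetical helper name.

```python
def restore_request_uri(environ):
    """Copy REDIRECT_URI over REQUEST_URI, for lighttpd 1.4.40 only.

    environ is a CGI-style mapping of server variables and is modified
    in place; it is also returned for convenience.
    """
    software = environ.get("SERVER_SOFTWARE", "")
    if environ.get("REDIRECT_URI") and software.startswith("lighttpd/1.4.40"):
        environ["REQUEST_URI"] = environ["REDIRECT_URI"]
    return environ
```

Restricting the check to the exact version string keeps the workaround inert on releases where REQUEST_URI already holds the original request.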

#13

Updated by gstrauss over 1 year ago

  • Subject changed from rewrite-if-not-file broken in 1.4.40 to mediawiki redirect loop if REQUEST_URI not orig req in 1.4.40

#14

Updated by LoneFox over 1 year ago

gstrauss wrote:

I'm not convinced that change is the generic solution, as the problem stems from PHP and, separately, mediawiki, path shenanigans.

As previously noted, REQUEST_URI is not part of the CGI spec

But it is part of this de facto standard called apache, and mediawiki is probably not the only popular web application that expects it to behave like it does in apache...

I am curious if the following would work if you added it to initialization code in mediawiki/includes/WebStart.php (and if you don't patch lighttpd as you did above)

Yes, it does work.

#15

Updated by gstrauss over 1 year ago

Thank you very much for your help in tracking this down and identifying where the problem is.

The solution is still up for discussion. It may be that lighttpd should revert this behavior.

Thanks to your help, there is a workaround for those affected, and the workaround is doable in the backend, which is typically a dynamic language. (Asking people to recompile lighttpd would be a higher barrier to entry, though you were able to test that.)

#16

Updated by olegcorner over 1 year ago

I hope this can help

Arch Linux

lighttpd 1.4.40-1

config :

url.rewrite-once = (
            "^(/assets.*)$"  => "$1",
            "^(/partials.*)$"  => "$1",
            "^/([^.?]*)\?(.*)$" =>  "/index.php?_url=/$1&$2",
            "^/([^.?]*)$"       =>  "/index.php?_url=/$1" 
    )

Steps to reproduce:
Just dump the two server variables in PHP:

echo '<pre>';
echo 'REQUEST_URI : ';
var_dump($_SERVER['REQUEST_URI']);
echo 'REDIRECT_URI : ';
var_dump($_SERVER['REDIRECT_URI']);

On version 1.4.39 the response is:

REQUEST_URI : string(1) "/" 
REDIRECT_URI : string(17) "/index.php?_url=/" 

And on version 1.4.40:

REQUEST_URI : string(17) "/index.php?_url=/" 
REDIRECT_URI : string(1) "/" 

Result of the same query to the internal PHP server:

curl http://desktop:8000
REQUEST_URI : string(1) "/" 
REDIRECT_URI : <pre><span style='color: #ff0000'><br />
<b>Notice</b>:  Undefined index: REDIRECT_URI in <b>/home/olegn/Lukas/projects/bridge/public/index.php</b> on line <b>25</b><br />
</span></pre>NULL
</pre>

REQUEST_URI and REDIRECT_URI are simply swapped.

#17

Updated by lonypny over 1 year ago

I can confirm this is happening on any wordpress installation with the following rewrite rules (default):

url.rewrite-if-not-file = (
"^/(wp-.+).*/?" => "$0",
"^/keyword/([A-Za-z_0-9\-]+)/?$" => "/index.php?keyword=$1",
"^/.*?(\?.*)?$" => "/index.php$1"
)

#18

Updated by gstrauss over 1 year ago

  • Target version changed from 1.4.x to 1.4.41

FYI: there will be a release of lighttpd 1.4.41 in the next two weeks (and possibly before end of July), which reverts this change made in lighttpd 1.4.40.

#19

Updated by gstrauss over 1 year ago

  • Status changed from Need Feedback to Patch Pending

#20

Updated by gstrauss over 1 year ago

  • Category set to core

#21

Updated by gstrauss over 1 year ago

  • Status changed from Patch Pending to Fixed
  • % Done changed from 0 to 100
