Bug #2738

closed

mediawiki redirect loop if REQUEST_URI not orig req in 1.4.40

Added by LoneFox almost 8 years ago. Updated over 7 years ago.

Status: Fixed
Priority: Normal
Category: core
Target version: 1.4.41

Description

I have mediawiki set up with some URL cleanup rewrites, basically copied from https://www.mediawiki.org/wiki/Manual:Short_URL/wiki/Page_title_--_Lighttpd_rewrite--root_access

  alias.url += ("/w" => "/usr/share/mediawiki/")
  url.rewrite-if-not-file += (
    "^/wiki/(mw-)?config/?" => "$0",
    "^/wiki/([^?]*)(?:\?(.*))?" => "/w/index.php?title=$1&$2",
    "^/wiki/([^?]*)" => "/w/index.php?title=$1" 
  )

This works in earlier versions of lighttpd, but with 1.4.40 it results in an endless 301 Moved Permanently loop.

According to git bisect, dbdab5dbc9b98df9c40f11e0fc6a6ce49bfea804 is the first bad commit.
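
To make the effect of the second rewrite rule above concrete, here is a rough Python model of it. This is only a sketch: lighttpd uses PCRE and its own substitution code, the "rewrite-if-not-file" file check is left out, and the names `RULE` and `rewrite` are hypothetical helpers invented for this illustration.

```python
import re

# Hypothetical model of one rule from the config above (not lighttpd code).
RULE = (r"^/wiki/([^?]*)(?:\?(.*))?", "/w/index.php?title=$1&$2")


def rewrite(uri):
    """Return the rewritten URI if the rule matches, else the input unchanged."""
    pattern, target = RULE
    match = re.match(pattern, uri)
    if match is None:
        return uri
    # Replace $1, $2 with the captured groups; a group that did not
    # participate in the match is treated as an empty string.
    return re.sub(r"\$(\d)", lambda m: match.group(int(m.group(1))) or "", target)


print(rewrite("/wiki/Main_Page"))
print(rewrite("/wiki/Main_Page?action=history"))
```

Under these assumptions, a request without a query string ends up with a trailing "&" in the rewritten URL, because the optional second group is empty.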


Files

lighttpd.conf.tar.gz (2.32 KB) LoneFox, 2016-07-19 07:50

Actions #1

Updated by gstrauss almost 8 years ago

Thanks for the report. I'll try to repro. Some more details would be useful (besides "download and install mediawiki" :))

Actions #2

Updated by gstrauss almost 8 years ago

The git bisect commit you referenced did not touch mod_alias or mod_rewrite, and affected error status >= 400. Might there be any other relevant parts of your lighttpd.conf? Would you attach your lighttpd.conf?

Actions #3

Updated by gstrauss almost 8 years ago

FYI: the commit following dbdab5db is b473220d, which sets REDIRECT_URI in mod_rewrite if it has not already been set.

Actions #4

Updated by gstrauss almost 8 years ago

  • Status changed from New to Need Feedback

What is the request that results in a 301? (What other details can you provide about this request and what you expect it to access and what you expect lighttpd to do?)

Actions #5

Updated by LoneFox almost 8 years ago

gstrauss wrote:

Might there be any other relevant parts of your lighttpd.conf? Would you attach your lighttpd.conf?

I'm attaching a tarball that contains the config files.
On the mediawiki side I have the changes from that link in the original report, and I'm also using the Auth_remoteuser extension. Other than that, it should be the standard configuration.

gstrauss wrote:

What is the request that results in a 301? (What other details can you provide about this request and what you expect it to access and what you expect lighttpd to do?)

Any attempt to access the wiki causes it, for example https://localhost:1985/wiki/Main_Page
(Normally I use a domain from FreeDNS, but I don't want to put that on this public bugtracker, so I changed it to use localhost instead. Same behavior in both cases.)

Actions #6

Updated by gstrauss almost 8 years ago

LoneFox, what is the version of lighttpd you are currently using (which works for you)?

How did you install mediawiki? In /wiki or /w? This is not a pure vanilla installation. You have made changes for short URLs. What changes were made? What else might be useful information for someone who has NEVER before run mediawiki (me)? We're not going to get very far if you think that I am going to install an entire framework and figure out what customizations you made for special features, having never used it before. Do things work if you (temporarily!) symlink wiki -> w?

If you have things working on localhost in a test config, please try disabling various modules (in your test config) so that they can be ruled out. mod_compress, mod_fastcgi (unused in your config), mod_userdir, mod_status, mod_auth, mod_setenv, mod_redirect. Comment out the directives for the functionality provided by these modules. Try to narrow down the problem. If only accessible from localhost, can you also disable the ssl? It would be helpful to reduce your config to the smallest interaction between modules.

Separately from the above, you can set debug.log-request-handling = "enable", restart lighttpd, and trigger the 301 loop for a second. Lots of output will be logged. Please review the log to make sure the data can be shared, then post it so that we can see what is looping. (If you change anything in the log, please note what you substituted, but obviously omit the original.)

Actions #7

Updated by LoneFox almost 8 years ago

I did some experiments and found out what is happening. The mod_cgi.c change in that commit modifies REQUEST_URI.
With 1.4.39 (and 1.4.40 with that change reverted), mediawiki sees REQUEST_URI = /wiki/Main_Page and works properly.
With unmodified 1.4.40 it sees REQUEST_URI = /w/index.php?title=Main_Page& and redirects.
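
A toy Python model of why this turns into a loop, assuming (as a simplification, not MediaWiki's actual logic) that the application answers 200 only when REQUEST_URI equals the short URL and otherwise sends a 301 to it. The functions `app_response` and `follow` are hypothetical names made up for this sketch.

```python
def app_response(request_uri, title):
    """Toy application: answer 200 only if REQUEST_URI is the short URL."""
    canonical = "/wiki/" + title
    if request_uri == canonical:
        return 200, canonical
    # Anything else is "corrected" with a permanent redirect.
    return 301, canonical


def follow(client_url, pass_rewritten_uri, max_hops=5):
    """Return the status codes a client would see, giving up after max_hops."""
    statuses = []
    for _ in range(max_hops):
        title = client_url[len("/wiki/"):]
        rewritten = "/w/index.php?title=" + title + "&"
        # 1.4.39 passed the original URL; 1.4.40 passes the rewritten one.
        request_uri = rewritten if pass_rewritten_uri else client_url
        status, location = app_response(request_uri, title)
        statuses.append(status)
        if status != 301:
            break
        client_url = location
    return statuses


print(follow("/wiki/Main_Page", pass_rewritten_uri=False))
print(follow("/wiki/Main_Page", pass_rewritten_uri=True))
```

In this model the first call stops after a single 200, while the second keeps receiving 301 until the hop limit, since the redirect target is rewritten to the same internal URL every time.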

Actions #8

Updated by gstrauss over 7 years ago

Thanks for that! It would appear the behavior change came from the >= 0 test, instead of > 0. If you compile your own version of lighttpd, you might test with that change in mod_cgi.c.

-               if (!buffer_string_is_empty(con->request.orig_uri)) {
+               if (con->error_handler_saved_status >= 0) {
+                       cgi_env_add(&env, CONST_STR_LEN("REQUEST_URI"), CONST_BUF_LEN(con->request.uri));
+               } else {
                        cgi_env_add(&env, CONST_STR_LEN("REQUEST_URI"), CONST_BUF_LEN(con->request.orig_uri));
                }

I'm going to look into the mediawiki code, too, to confirm this is what is happening. REQUEST_URI is a non-standard CGI environment variable, and so is REDIRECT_URI, which is added in the patch you mention. Perhaps mediawiki should prefer REDIRECT_URI over REQUEST_URI? I am not sure yet. Still looking...
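
The difference between the two comparisons can be sketched in Python as follows. This assumes that the saved error-handler status is 0 for a plain rewritten request (inferred from the discussion in this comment, not checked against the lighttpd source); `request_uri_for_cgi` and its `strict` flag are hypothetical names for this illustration.

```python
def request_uri_for_cgi(saved_status, uri, orig_uri, strict):
    """Pick the value exported as REQUEST_URI.

    strict=False models the `>= 0` test from 1.4.40;
    strict=True models the suggested `> 0` test.
    """
    if strict:
        use_current_uri = saved_status > 0
    else:
        use_current_uri = saved_status >= 0
    return uri if use_current_uri else orig_uri


# Assumption: status 0 means "no error handler involved".
current = "/w/index.php?title=Main_Page&"
original = "/wiki/Main_Page"
print(request_uri_for_cgi(0, current, original, strict=False))
print(request_uri_for_cgi(0, current, original, strict=True))
```

Under that assumption, only the `> 0` variant hands the original request URL to the CGI backend for an ordinary rewrite.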

Actions #9

Updated by gstrauss over 7 years ago

mediawiki includes/DefaultSettings.php contains the following

/**
 * Whether to support URLs like index.php/Page_title These often break when PHP
 * is set up in CGI mode. PATH_INFO *may* be correct if cgi.fix_pathinfo is set,
 * but then again it may not; lighttpd converts incoming path data to lowercase
 * on systems with case-insensitive filesystems, and there have been reports of
 * problems on Apache as well.
 *
 * To be safe we'll continue to keep it off by default.
 *
 * Override this to false if $_SERVER['PATH_INFO'] contains unexpectedly
 * incorrect garbage, or to true if it is really correct.
 *
 * The default $wgArticlePath will be set based on this value at runtime, but if
 * you have customized it, having this incorrectly set to true can cause
 * redirect loops when "pretty URLs" are used.
 * @since 1.2.1
 */

In lighttpd 1.4.40, lighttpd attempts to preserve the case in PATH_INFO, so the comment above about lighttpd is no longer true.

Since you're using the rewrite rules, have you tried changing LocalSettings.php to use $wgUsePathInfo = false; instead of true?

Actions #10

Updated by LoneFox over 7 years ago

gstrauss wrote:

Thanks for that! It would appear the behavior change came from the >= 0 test, instead of > 0. If you compile your own version of lighttpd, you might test with that change in mod_cgi.c.

Yes, that fixes the problem.

Actions #11

Updated by gstrauss over 7 years ago

Glad to hear that it works for you and mediawiki.

I'm not convinced that change is the generic solution, as the problem stems from PHP and, separately, mediawiki, path shenanigans.

As previously noted, REQUEST_URI is not part of the CGI spec, and neither is REDIRECT_URI.

I think that mediawiki should use REDIRECT_URI, if set, instead of REQUEST_URI, if mediawiki otherwise uses the same parsing code.

However, you've already parsed path info into the query string in the rewrite rule, so mediawiki should just get the info from there (in REDIRECT_URI, which already provides 'title=...' in the query string), rather than attempting to reparse path info! All these shenanigans are mediawiki attempts to obtain path info before physical path translation to filesystem paths, which removes double slashes "//" and path segments like "/../", and potentially lowercases the path on case-insensitive filesystems, or applies other physical filesystem path translation. You've provided that unmodified path in the query string as title=..., and I think that mediawiki should be using that.

Actions #12

Updated by gstrauss over 7 years ago

I am curious if the following would work if you added it to initialization code in mediawiki/includes/WebStart.php (and if you don't patch lighttpd as you did above)

untested:

# mediawiki expects REQUEST_URI to be the original URL.  
# However, lighttpd 1.4.40 and later set REDIRECT_URI to 
# the original URL and set REQUEST_URI to the current URL
# when request has been internally redirected in lighttpd.
if ( !empty( $_SERVER['REDIRECT_URI'] )
  && isset(  $_SERVER['SERVER_SOFTWARE'] )
  && strpos( $_SERVER['SERVER_SOFTWARE'], 'lighttpd/1.4.40') === 0 ) {
    $_SERVER['REQUEST_URI'] = $_SERVER['REDIRECT_URI'];
}

edited to specify this only for lighttpd 1.4.40
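
For checking the condition outside of PHP, the same logic can be modeled in Python against a plain dictionary of CGI variables. This is only an illustrative translation of the snippet above; `normalize_request_uri` is a hypothetical helper, not part of mediawiki or lighttpd.

```python
def normalize_request_uri(env):
    """Return a copy of env with REQUEST_URI restored from REDIRECT_URI.

    Mirrors the PHP condition: REDIRECT_URI is non-empty and
    SERVER_SOFTWARE starts with 'lighttpd/1.4.40'.
    """
    fixed = dict(env)
    redirect_uri = env.get("REDIRECT_URI")
    software = env.get("SERVER_SOFTWARE")
    if redirect_uri and software is not None and software.startswith("lighttpd/1.4.40"):
        fixed["REQUEST_URI"] = redirect_uri
    return fixed


env = {
    "REQUEST_URI": "/w/index.php?title=Main_Page&",
    "REDIRECT_URI": "/wiki/Main_Page",
    "SERVER_SOFTWARE": "lighttpd/1.4.40",
}
print(normalize_request_uri(env)["REQUEST_URI"])
```

The function returns a copy rather than mutating its argument, so the original values remain available for comparison.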

Actions #13

Updated by gstrauss over 7 years ago

  • Subject changed from rewrite-if-not-file broken in 1.4.40 to mediawiki redirect loop if REQUEST_URI not orig req in 1.4.40

Actions #14

Updated by LoneFox over 7 years ago

gstrauss wrote:

I'm not convinced that change is the generic solution, as the problem stems from PHP and, separately, mediawiki, path shenanigans.

As previously noted, REQUEST_URI is not part of the CGI spec

But it is part of this de facto standard called apache, and mediawiki is probably not the only popular web application that expects it to behave like it does in apache...

I am curious if the following would work if you added it to initialization code in mediawiki/includes/WebStart.php (and if you don't patch lighttpd as you did above)

Yes, it does work.

Actions #15

Updated by gstrauss over 7 years ago

Thank you very much for your help in tracking this down and identifying where the problem is.

The solution is still up for discussion. It may be that lighttpd should revert this behavior.

Thanks to your help, there is a workaround for those affected, and the workaround is doable in the backend, which is typically a dynamic language. (Asking people to recompile lighttpd would be a higher barrier to entry, though you were able to test that.)

Actions #16

Updated by olegcorner over 7 years ago

I hope this can help

Arch Linux

lighttpd 1.4.40-1

config :

url.rewrite-once = (
            "^(/assets.*)$"  => "$1",
            "^(/partials.*)$"  => "$1",
            "^/([^.?]*)\?(.*)$" =>  "/index.php?_url=/$1&$2",
            "^/([^.?]*)$"       =>  "/index.php?_url=/$1" 
    )

Steps to reproduce:
Just headers dump in PHP

echo '<pre>';
echo 'REQUEST_URI : ';
var_dump($_SERVER['REQUEST_URI']);
echo 'REDIRECT_URI : ';
var_dump($_SERVER['REDIRECT_URI']);

On version 1.4.39 we get this response

REQUEST_URI : string(1) "/" 
REDIRECT_URI : string(17) "/index.php?_url=/" 

And on version 1.4.40

REQUEST_URI : string(17) "/index.php?_url=/" 
REDIRECT_URI : string(1) "/" 

Result of the same query to the internal PHP server

curl http://desktop:8000
REQUEST_URI : string(1) "/" 
REDIRECT_URI : <pre><span style='color: #ff0000'><br />
<b>Notice</b>:  Undefined index: REDIRECT_URI in <b>/home/olegn/Lukas/projects/bridge/public/index.php</b> on line <b>25</b><br />
</span></pre>NULL
</pre>

It looks like REQUEST_URI and REDIRECT_URI are simply swapped.
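
The swap reported here can be reproduced in a small Python model of the last rewrite rule from this config. It is a sketch based only on the values shown in this comment; `rewrite_root_rule` and `cgi_env` are hypothetical helpers, and the version strings merely select which reported behavior to mimic.

```python
import re


def rewrite_root_rule(uri):
    """Model of the rule "^/([^.?]*)$" => "/index.php?_url=/$1"."""
    match = re.match(r"^/([^.?]*)$", uri)
    if match is None:
        return uri
    return "/index.php?_url=/" + match.group(1)


def cgi_env(original, version):
    """Build the two variables as reported for each lighttpd version."""
    rewritten = rewrite_root_rule(original)
    if version == "1.4.40":
        return {"REQUEST_URI": rewritten, "REDIRECT_URI": original}
    return {"REQUEST_URI": original, "REDIRECT_URI": rewritten}


for version in ("1.4.39", "1.4.40"):
    env = cgi_env("/", version)
    print(version, env["REQUEST_URI"], env["REDIRECT_URI"])
```

For the request "/", the rewritten value is the 17-character string from the dumps above, and the two versions differ only in which variable carries it.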

Actions #17

Updated by lonypny over 7 years ago

I can confirm this is happening on any wordpress installation with the following rewrite rules (default):

url.rewrite-if-not-file = (
"^/(wp-.+).*/?" => "$0",
"^/keyword/([A-Za-z_0-9\-]+)/?$" => "/index.php?keyword=$1",
"^/.*?(\?.*)?$" => "/index.php$1"
)

Actions #18

Updated by gstrauss over 7 years ago

  • Target version changed from 1.4.x to 1.4.41

FYI: there will be a release of lighttpd 1.4.41 in the next two weeks (and possibly before end of July), which reverts this change made in lighttpd 1.4.40.

Actions #19

Updated by gstrauss over 7 years ago

  • Status changed from Need Feedback to Patch Pending

Actions #20

Updated by gstrauss over 7 years ago

  • Category set to core

Actions #21

Updated by gstrauss over 7 years ago

  • Status changed from Patch Pending to Fixed
  • % Done changed from 0 to 100