[Solved] Understanding rewrite rules ...

Added by bwechner over 4 years ago

I admit I don't alas ....

I'm running a NextCloud with PicoCMS behind lighttpd 1.4.55.

It uses long URLs like this:

https://mydomain.tld/index.php/apps/cms_pico/pico/mysite

and supports short URLs like this:

https://mydomain.tld/sites/mysite

This requires a server config that rewrites the short form to the long form; Apache and Nginx examples are provided.

So I tried this:

$HTTP["host"] =~ "(mydomain.tld|myserver.lan)" {
    server.name          = "mydomain.tld" 
    server.document-root = "/var/www/html/nextcloud" 

    url.rewrite-once = ("/sites/(.*)" => "/index.php/apps/cms_pico/pico/$1")
}

But it doesn't work. I admit I'm confused after a lot of playing around with forms and reading, but the main symptom seems to be that this behaves like a redirect. Maybe that is what it's meant to do, but I see no clear docs, and my lay understanding was that a rewrite, unlike a redirect, leaves the URL in the browser's address bar unchanged. That is, it should be invisible to an end user.

In practice this seems to function identically to a redirect. That is, I surf to the short URL and end up with the long URL in the address bar.

I can't find detailed debug output for the rewrite; the best I've found is:

debug.log-request-handling   = "enable" 

which produces a block like this:

2020-08-24 20:45:33: (response.c.447) -- splitting Request-URI 
2020-08-24 20:45:33: (response.c.448) Request-URI     :  /sites/mysite 
2020-08-24 20:45:33: (response.c.449) URI-scheme      :  https 
2020-08-24 20:45:33: (response.c.450) URI-authority   :  mydomain.tld
2020-08-24 20:45:33: (response.c.451) URI-path (raw)  :  /sites/mysite 
2020-08-24 20:45:33: (response.c.452) URI-path (clean):  /sites/mysite 
2020-08-24 20:45:33: (response.c.453) URI-query       :   
2020-08-24 20:45:33: (mod_access.c.177) -- mod_access_uri_handler called 
2020-08-24 20:45:33: (response.c.598) -- before doc_root 
2020-08-24 20:45:33: (response.c.599) Doc-Root     : /var/www/html/nextcloud 
2020-08-24 20:45:33: (response.c.600) Rel-Path     : /sites/mysite 
2020-08-24 20:45:33: (response.c.601) Path         :  
2020-08-24 20:45:33: (response.c.643) -- after doc_root 
2020-08-24 20:45:33: (response.c.644) Doc-Root     : /var/www/html/nextcloud 
2020-08-24 20:45:33: (response.c.645) Rel-Path     : /sites/mysite 
2020-08-24 20:45:33: (response.c.646) Path         : /var/www/html/nextcloud/sites/mysite 
2020-08-24 20:45:33: (response.c.670) -- logical -> physical 
2020-08-24 20:45:33: (response.c.671) Doc-Root     : /var/www/html/nextcloud 
2020-08-24 20:45:33: (response.c.672) Basedir      : /var/www/html/nextcloud 
2020-08-24 20:45:33: (response.c.673) Rel-Path     : /sites/mysite 
2020-08-24 20:45:33: (response.c.674) Path         : /var/www/html/nextcloud/sites/mysite 
2020-08-24 20:45:33: (response.c.686) -- handling physical path 
2020-08-24 20:45:33: (response.c.687) Path         : /var/www/html/nextcloud/sites/mysite 
2020-08-24 20:45:33: (response.c.162) -- file not found 
2020-08-24 20:45:33: (response.c.163) Path         : /var/www/html/nextcloud/sites/mysite 
2020-08-24 20:45:34: (response.c.447) -- splitting Request-URI 
2020-08-24 20:45:34: (response.c.448) Request-URI     :  /index.php/apps/cms_pico/pico/mysite 
2020-08-24 20:45:34: (response.c.449) URI-scheme      :  https 
2020-08-24 20:45:34: (response.c.450) URI-authority   :  mydomain.tld
2020-08-24 20:45:34: (response.c.451) URI-path (raw)  :  /index.php/apps/cms_pico/pico/mysite 
2020-08-24 20:45:34: (response.c.452) URI-path (clean):  /index.php/apps/cms_pico/pico/mysite 
2020-08-24 20:45:34: (response.c.453) URI-query       :   
2020-08-24 20:45:34: (mod_access.c.177) -- mod_access_uri_handler called 
2020-08-24 20:45:34: (response.c.598) -- before doc_root 
2020-08-24 20:45:34: (response.c.599) Doc-Root     : /var/www/html/nextcloud 
2020-08-24 20:45:34: (response.c.600) Rel-Path     : /index.php/apps/cms_pico/pico/mysite 
2020-08-24 20:45:34: (response.c.601) Path         :  
2020-08-24 20:45:34: (response.c.643) -- after doc_root 
2020-08-24 20:45:34: (response.c.644) Doc-Root     : /var/www/html/nextcloud 
2020-08-24 20:45:34: (response.c.645) Rel-Path     : /index.php/apps/cms_pico/pico/mysite 
2020-08-24 20:45:34: (response.c.646) Path         : /var/www/html/nextcloud/index.php/apps/cms_pico/pico/mysite 
2020-08-24 20:45:34: (response.c.670) -- logical -> physical 
2020-08-24 20:45:34: (response.c.671) Doc-Root     : /var/www/html/nextcloud 
2020-08-24 20:45:34: (response.c.672) Basedir      : /var/www/html/nextcloud 
2020-08-24 20:45:34: (response.c.673) Rel-Path     : /index.php/apps/cms_pico/pico/mysite 
2020-08-24 20:45:34: (response.c.674) Path         : /var/www/html/nextcloud/index.php/apps/cms_pico/pico/mysite 
2020-08-24 20:45:34: (response.c.686) -- handling physical path 
2020-08-24 20:45:34: (response.c.687) Path         : /var/www/html/nextcloud/index.php/apps/cms_pico/pico/mysite 
2020-08-24 20:45:34: (response.c.694) -- handling subrequest 
2020-08-24 20:45:34: (response.c.695) Path         : /var/www/html/nextcloud/index.php 
2020-08-24 20:45:34: (response.c.696) URI          : /index.php 
2020-08-24 20:45:34: (response.c.697) Pathinfo     : /apps/cms_pico/pico/mysite 

In the too hard basket for me right now, alas. I am wondering, however, whether my expectation of rewrites is awry: that they change the URL that is processed but do not update the browser's address bar.

Of course I am aware that Nextcloud might be to blame here; it's conceivable that when accessed using the long URL it forces the address bar to honesty. That was not my first thought, given they recommend rewrites to hide the long URLs from users.


Replies (11)

RE: Understanding rewrite rules ... - Added by gstrauss over 4 years ago

Did you read the documentation for PicoCMS on how to configure lighttpd?
http://picocms.org/docs/#lighttpd

RE: Understanding rewrite rules ... - Added by bwechner over 4 years ago

Thanks for the tip. I had seen that before but didn't think to look back there this time. I have now, but alas can't seem to nut out what's going on from that either. It basically recommends this block:

url.rewrite-once = (
    "^/pico/(config|content|vendor|composer\.(json|lock|phar))(/|$)" => "/pico/index.php",
    "^/pico/(.+/)?\.(?!well-known(/|$))" => "/pico/index.php" 
)

url.rewrite-if-not-file = (
    "^/pico(/|$)" => "/pico/index.php" 
)

The first block just denies access to pico internals by bouncing back to index.php.

The second lends a possible syntax clue, so I tried the Nextcloud analog:

    url.rewrite-if-not-file = (
        "^/sites/" => "/index.php/apps/cms_pico/pico/" 
    )

But alas, testing this just results in the URL https://mydomain.tld/sites/mysite landing on Nextcloud files. Go figure. And I can't really see why. The log in this case though produces:


2020-08-25 11:52:30: (response.c.448) Request-URI     :  /sites/mysite 
2020-08-25 11:52:30: (response.c.449) URI-scheme      :  https 
2020-08-25 11:52:30: (response.c.450) URI-authority   :  mydomain.tld
2020-08-25 11:52:30: (response.c.451) URI-path (raw)  :  /sites/mysite 
2020-08-25 11:52:30: (response.c.452) URI-path (clean):  /sites/mysite 
2020-08-25 11:52:30: (response.c.453) URI-query       :   
2020-08-25 11:52:30: (mod_access.c.177) -- mod_access_uri_handler called 
2020-08-25 11:52:30: (response.c.598) -- before doc_root 
2020-08-25 11:52:30: (response.c.599) Doc-Root     : /var/www/html/nextcloud 
2020-08-25 11:52:30: (response.c.600) Rel-Path     : /sites/mysite 
2020-08-25 11:52:30: (response.c.601) Path         :  
2020-08-25 11:52:30: (response.c.643) -- after doc_root 
2020-08-25 11:52:30: (response.c.644) Doc-Root     : /var/www/html/nextcloud 
2020-08-25 11:52:30: (response.c.645) Rel-Path     : /sites/mysite 
2020-08-25 11:52:30: (response.c.646) Path         : /var/www/html/nextcloud/sites/mysite 
2020-08-25 11:52:30: (response.c.447) -- splitting Request-URI 
2020-08-25 11:52:30: (response.c.448) Request-URI     :  /index.php/apps/cms_pico/pico/ 
2020-08-25 11:52:30: (response.c.449) URI-scheme      :  https 
2020-08-25 11:52:30: (response.c.450) URI-authority   :  mydomain.tld
2020-08-25 11:52:30: (response.c.451) URI-path (raw)  :  /index.php/apps/cms_pico/pico/ 
2020-08-25 11:52:30: (response.c.452) URI-path (clean):  /index.php/apps/cms_pico/pico/ 
2020-08-25 11:52:30: (response.c.453) URI-query       :   
2020-08-25 11:52:30: (mod_access.c.177) -- mod_access_uri_handler called 
2020-08-25 11:52:30: (response.c.598) -- before doc_root 
2020-08-25 11:52:30: (response.c.599) Doc-Root     : /var/www/html/nextcloud 
2020-08-25 11:52:30: (response.c.600) Rel-Path     : /index.php/apps/cms_pico/pico/ 
2020-08-25 11:52:30: (response.c.601) Path         :  
2020-08-25 11:52:30: (response.c.643) -- after doc_root 
2020-08-25 11:52:30: (response.c.644) Doc-Root     : /var/www/html/nextcloud 
2020-08-25 11:52:30: (response.c.645) Rel-Path     : /index.php/apps/cms_pico/pico/ 
2020-08-25 11:52:30: (response.c.646) Path         : /var/www/html/nextcloud/index.php/apps/cms_pico/pico/ 
2020-08-25 11:52:30: (response.c.670) -- logical -> physical 
2020-08-25 11:52:30: (response.c.671) Doc-Root     : /var/www/html/nextcloud 
2020-08-25 11:52:30: (response.c.672) Basedir      : /var/www/html/nextcloud 
2020-08-25 11:52:30: (response.c.673) Rel-Path     : /index.php/apps/cms_pico/pico/ 
2020-08-25 11:52:30: (response.c.674) Path         : /var/www/html/nextcloud/index.php/apps/cms_pico/pico/ 
2020-08-25 11:52:30: (response.c.686) -- handling physical path 
2020-08-25 11:52:30: (response.c.687) Path         : /var/www/html/nextcloud/index.php/apps/cms_pico/pico/ 
2020-08-25 11:52:30: (response.c.694) -- handling subrequest 
2020-08-25 11:52:30: (response.c.695) Path         : /var/www/html/nextcloud/index.php 
2020-08-25 11:52:30: (response.c.696) URI          : /index.php 
2020-08-25 11:52:30: (response.c.697) Pathinfo     : /apps/cms_pico/pico/

and all we've achieved is for the subrequest to lose "mysite", which is no doubt why we bounce back to Nextcloud files as a default.
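The loss of "mysite" can be reproduced outside the server. What follows is a minimal Python sketch, not lighttpd's implementation. It assumes, as the log above suggests, that a matching rule replaces the whole request URI with the target string after substituting $1-style references. The function name rewrite is hypothetical, and Python's re module stands in for the regex engine lighttpd uses.

```python
import re

def rewrite(rules, uri):
    """Return the target of the first matching rule, or uri unchanged."""
    for pattern, target in rules:
        match = re.search(pattern, uri)
        if match:
            # Substitute $1..$9 with the captured groups; any part of the
            # URI that was not captured is simply discarded.
            return re.sub(r"\$(\d)",
                          lambda ref: match.group(int(ref.group(1))) or "",
                          target)
    return uri

# Rule modelled on the url.rewrite-if-not-file attempt (no capture group).
without_capture = [(r"^/sites/", "/index.php/apps/cms_pico/pico/")]
# Rule that captures the site name and re-inserts it with $1.
with_capture = [(r"^/sites/(.*)", "/index.php/apps/cms_pico/pico/$1")]

print(rewrite(without_capture, "/sites/mysite"))
print(rewrite(with_capture, "/sites/mysite"))
```

Under these assumptions the first rule yields a URI without the site name, in line with the Pathinfo in the log, while the second keeps it.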

The original question stands though: is there a way to better debug these rewrites, and is it kosher to expect that they are behind the scenes, not visible in the URL bar? And can caching issues confuse things? Hmmmm.

RE: Understanding rewrite rules ... - Added by gstrauss over 4 years ago

is it kosher to expect that they are behind the scenes, not visible in the URL bar

Your phrasing suggests that English is not your first language. Instead of using poor phrasing, please try to write more simply and clearly. The mod_rewrite documentation begins: "internal redirects, url rewrite". Yes, mod_rewrite performs internal URL rewriting. If you are still confused, please locate the nearest dictionary and look up the word "internal".

And can caching issues confuse things? Hmmmm.

Client caching can confuse you, but client caching in a web browser is outside the scope of lighttpd (a web server). If you do not know how to use your client or clear the browser cache, please use your favorite search engine to find the answer. This forum is not for how to use a client web browser.

is there a way to better debug these rewrites

Among the best ways to troubleshoot anything is to simplify the test. One such simple test case would be lighttpd with mod_rewrite and mod_cgi, running a script which reflects its environment back at you. Such a script is one of the most basic things to write yourself and is one of the first examples in many PHP tutorials (phpinfo())
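A minimal sketch of such a reflecting script, written here in Python for mod_cgi rather than with PHP's phpinfo(). The selection of variables and the name render_env are assumptions made for illustration, and the script still has to be mapped to an interpreter in your own config (for example via cgi.assign).

```python
#!/usr/bin/env python3
import os

# Variables that are most relevant when comparing a direct request
# with a rewritten one (an illustrative selection, not a complete list).
KEYS = ("REQUEST_URI", "REDIRECT_URI", "SCRIPT_NAME", "PATH_INFO", "QUERY_STRING")

def render_env(environ):
    """Format the selected CGI variables, one KEY=value pair per line."""
    lines = [f"{key}={environ.get(key, '(unset)')}" for key in KEYS]
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # A CGI response is a header block, a blank line, then the body.
    print("Content-Type: text/plain")
    print()
    print(render_env(os.environ), end="")
```

Requesting the script once directly and once through a rewrite rule, then comparing the two outputs, shows which variables the backend actually receives in each case.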

How to Get Support

RE: Understanding rewrite rules ... - Added by bwechner over 4 years ago

Now now. I apologise for some typing errors which I'd fix but it seems this forum doesn't support edits, but there's no need to get condescending.

There are indeed deep caching issues, so I used a fresh Chromium browser, clearing the cache between tests, then shut down all other access to the server so the log is clear and easy to read. I have discovered something interesting and I wonder if there's a way to diagnose this more deeply. To get this far I have:

## enable debugging
debug.log-request-header     = "enable" 
debug.log-response-header    = "enable" 
debug.log-request-handling   = "enable" 
#debug.log-file-not-found     = "enable" 
#debug.log-condition-handling = "enable" 

accesslog.filename          = "/data/log/lighttpd/access.log" 
server.errorlog             = "/data/log/lighttpd/error.log" 

The first test is the redirect as follows:

        url.redirect = (
            "/sites/(.*)" => "/index.php/apps/cms_pico/pico/$1" 
        )

And it produces debug output that concludes with a 301 status pointing to "/index.php/apps/cms_pico/pico/mysite"

Perfectly as expected!

Here is the log output:

2020-08-25 21:39:54: (connections.c.774) fd: 9 request-len: 928
GET /sites/mysite HTTP/1.1
Host: mydomain.tld
Connection: keep-alive
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/83.0.4103.97 Chrome/83.0.4103.97 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Sec-Fetch-Site: none
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Cookie: oc_sessionPassphrase=2hKXLTrCCVmLxHRHS3IXTmKXqLMzwy7WH8XiBW8mgT8Kq10jLKCnSbx06nPJkv6TXddnmi59hti3cbKsZX5HTCguAP8ObJ8AJ50TB3zsBkoJ3v0AwvZkw7pTqUzulohm; __Host-nc_sameSiteCookielax=true; __Host-nc_sameSiteCookiestrict=true; ocnsjvv2rmtc=bstbhi9g2a1gsk41a097ce1dho; nc_username=bernd; nc_token=jcqpdpUH8DhVW0MkhMyqETMbYiZZ3cTN; nc_session_id=bstbhi9g2a1gsk41a097ce1dho

2020-08-25 21:39:54: (response.c.447) -- splitting Request-URI
2020-08-25 21:39:54: (response.c.448) Request-URI     :  /sites/mysite
2020-08-25 21:39:54: (response.c.449) URI-scheme      :  https
2020-08-25 21:39:54: (response.c.450) URI-authority   :  mydomain.tld
2020-08-25 21:39:54: (response.c.451) URI-path (raw)  :  /sites/mysite
2020-08-25 21:39:54: (response.c.452) URI-path (clean):  /sites/mysite
2020-08-25 21:39:54: (response.c.453) URI-query       :
2020-08-25 21:39:54: (mod_access.c.177) -- mod_access_uri_handler called
2020-08-25 21:39:54: (response.c.125) Response-Header:
HTTP/1.1 301 Moved Permanently
Location: /index.php/apps/cms_pico/pico/mysite
Content-Length: 0
Date: Tue, 25 Aug 2020 11:39:54 GMT
Server: lighttpd/1.4.55

The second test is with a rewrite:

        url.rewrite-once = (
            "/sites/(.*)" => "/index.php/apps/cms_pico/pico/$1" 
        )

only this one ends up with a 302 status that points to: https://mydomain.tld/index.php/apps/files/

The debug log contains:

2020-08-25 21:04:00: (connections.c.774) fd: 9 request-len: 928
GET /sites/mysite HTTP/1.1
Host: mydomain.tld
Connection: keep-alive
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/83.0.4103.97 Chrome/83.0.4103.97 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Sec-Fetch-Site: none
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Cookie: oc_sessionPassphrase=2hKXLTrCCVmLxHRHS3IXTmKXqLMzwy7WH8XiBW8mgT8Kq10jLKCnSbx06nPJkv6TXddnmi59hti3cbKsZX5HTCguAP8ObJ8AJ50TB3zsBkoJ3v0AwvZkw7pTqUzulohm; __Host-nc_sameSiteCookielax=true; __Host-nc_sameSiteCookiestrict=true; ocnsjvv2rmtc=bstbhi9g2a1gsk41a097ce1dho; nc_username=bernd; nc_token=jcqpdpUH8DhVW0MkhMyqETMbYiZZ3cTN; nc_session_id=bstbhi9g2a1gsk41a097ce1dho

2020-08-25 21:04:00: (response.c.447) -- splitting Request-URI
2020-08-25 21:04:00: (response.c.448) Request-URI     :  /sites/mysite
2020-08-25 21:04:00: (response.c.449) URI-scheme      :  https
2020-08-25 21:04:00: (response.c.450) URI-authority   :  mydomain.tld
2020-08-25 21:04:00: (response.c.451) URI-path (raw)  :  /sites/mysite
2020-08-25 21:04:00: (response.c.452) URI-path (clean):  /sites/mysite
2020-08-25 21:04:00: (response.c.453) URI-query       :
2020-08-25 21:04:00: (response.c.447) -- splitting Request-URI
2020-08-25 21:04:00: (response.c.448) Request-URI     :  /index.php/apps/cms_pico/pico/mysite
2020-08-25 21:04:00: (response.c.449) URI-scheme      :  https
2020-08-25 21:04:00: (response.c.450) URI-authority   :  mydomain.tld
2020-08-25 21:04:00: (response.c.451) URI-path (raw)  :  /index.php/apps/cms_pico/pico/mysite
2020-08-25 21:04:00: (response.c.452) URI-path (clean):  /index.php/apps/cms_pico/pico/mysite
2020-08-25 21:04:00: (response.c.453) URI-query       :
2020-08-25 21:04:00: (mod_access.c.177) -- mod_access_uri_handler called
2020-08-25 21:04:00: (response.c.598) -- before doc_root
2020-08-25 21:04:00: (response.c.599) Doc-Root     : /var/www/html/nextcloud
2020-08-25 21:04:00: (response.c.600) Rel-Path     : /index.php/apps/cms_pico/pico/mysite
2020-08-25 21:04:00: (response.c.601) Path         :
2020-08-25 21:04:00: (response.c.643) -- after doc_root
2020-08-25 21:04:00: (response.c.644) Doc-Root     : /var/www/html/nextcloud
2020-08-25 21:04:00: (response.c.645) Rel-Path     : /index.php/apps/cms_pico/pico/mysite
2020-08-25 21:04:00: (response.c.646) Path         : /var/www/html/nextcloud/index.php/apps/cms_pico/pico/mysite
2020-08-25 21:04:00: (response.c.670) -- logical -> physical
2020-08-25 21:04:00: (response.c.671) Doc-Root     : /var/www/html/nextcloud
2020-08-25 21:04:00: (response.c.672) Basedir      : /var/www/html/nextcloud
2020-08-25 21:04:00: (response.c.673) Rel-Path     : /index.php/apps/cms_pico/pico/mysite
2020-08-25 21:04:00: (response.c.674) Path         : /var/www/html/nextcloud/index.php/apps/cms_pico/pico/mysite
2020-08-25 21:04:00: (response.c.686) -- handling physical path
2020-08-25 21:04:00: (response.c.687) Path         : /var/www/html/nextcloud/index.php/apps/cms_pico/pico/mysite
2020-08-25 21:04:00: (response.c.694) -- handling subrequest
2020-08-25 21:04:00: (response.c.695) Path         : /var/www/html/nextcloud/index.php
2020-08-25 21:04:00: (response.c.696) URI          : /index.php
2020-08-25 21:04:00: (response.c.697) Pathinfo     : /apps/cms_pico/pico/mysite
2020-08-25 21:04:00: (mod_access.c.177) -- mod_access_uri_handler called
2020-08-25 21:04:00: (gw_backend.c.2416) handling it in mod_gw
2020-08-25 21:04:00: (response.c.125) Response-Header:
HTTP/1.1 302 Found
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate
Pragma: no-cache
Content-Security-Policy: default-src 'self'; script-src 'self' 'nonce-bm5WUDA2dlVycnpFd3dwdkN6WjBJemoyS2dwcUVZVVlKRGRNMlkrL28xND06NXowKzVkR08zZEM5bVY4QmFWa3hjQTZEVGlVWlpzaEpRRzFqa3U3WnlnWT0='; style-src 'self' 'unsafe-inline'; frame-src *; img-src * data: blob:; font-src 'self' data:; media-src *; connect-src *; object-src 'none'; base-uri 'self';
Referrer-Policy: no-referrer
X-Content-Type-Options: nosniff
X-Download-Options: noopen
X-Frame-Options: SAMEORIGIN
X-Permitted-Cross-Domain-Policies: none
X-Robots-Tag: none
X-XSS-Protection: 1; mode=block
Location: https://mydomain.tld/index.php/apps/files/
Content-type: text/html; charset=UTF-8
Strict-Transport-Security: max-age=15552000
Content-Length: 0
Date: Tue, 25 Aug 2020 11:04:00 GMT
Server: lighttpd/1.4.55

But there is no clue in the lines above the Response header that helps us understand why lighty chooses that Location to 302 to.

My best guess from your comment and underscoring of the word "internal" (as if that was unambiguously clear in meaning and I don't know what the word means) is that Lighty is locally (server side only) loading something defined by:

2020-08-25 21:04:00: (response.c.694) -- handling subrequest
2020-08-25 21:04:00: (response.c.695) Path         : /var/www/html/nextcloud/index.php
2020-08-25 21:04:00: (response.c.696) URI          : /index.php
2020-08-25 21:04:00: (response.c.697) Pathinfo     : /apps/cms_pico/pico/mysite

Something that index.php does not understand in the same way that it understands the URL: /index.php/apps/cms_pico/pico/mysite

Alas, the debug logs also lend little insight into how

https://myserver.tld/index.php/apps/cms_pico/pico/mysite

is different from the subrequest above (the URI of /index.php and Pathinfo of /apps/cms_pico/pico/mysite).

I guess the question I need to explore further (and can use some insight into) is centered on this difference.

Particularly:

  1. Is there something in my rewrite syntax that can be improved to elicit the desired behaviour?
  2. Is there a doc I've not found explaining the URI and Pathinfo, and how they relate to https://.../x.php/stuff?

Kind regards and hoping for some understanding and slightly more tolerance of some haste induced typing errors. This is costly diagnosis and a luxury for me to indulge in, motivated primarily by a thirst for learning.

RE: Understanding rewrite rules ... - Added by gstrauss over 4 years ago

I think you skipped a few steps.

is there a way to better debug these rewrites

Among the best ways to troubleshoot anything is to simplify the test. One such simple test case would be lighttpd with mod_rewrite and mod_cgi, running a script which reflects its environment back at you. Such a script is one of the most basic things to write yourself and is one of the first examples in many PHP tutorials (phpinfo())

Had you taken that step, the answer would likely be more apparent to you.

Did you read How to Get Support ?
You did not include your config. Instead, you included small pieces of it. If you have questions and don't know the answers, why would you conclude that you know what to include or omit?

But there is no clue in the lines above the Response header that helps us understand why lighty chooses that Location to 302 to.

lighttpd debugging does not try to tell you "why" it does anything. Instead, it tries to tell you "what" it did.
2020-08-25 21:04:00: (gw_backend.c.2416) handling it in mod_gw
The shared gateway in lighttpd contacted the backend (your PHP), and you see the response. The PHP is likely the thing producing the 302. Had you included your lighttpd config, someone other than you would be able to confirm that the PHP is producing the response. Since you did not include your config, I can not see whether or not you defined things such as the Content-Security-Policy in your lighttpd.conf or if it is coming from the PHP. However, the inclusion of a dynamically generated nonce in the CSP suggests it was produced by the PHP.

Did you read the documentation for PicoCMS on how to configure lighttpd?
http://picocms.org/docs/#lighttpd

I think you might want to read the documentation more carefully (all one paragraph of it) about how to configure lighttpd and how to configure PicoCMS config.yml.

Separately, your rewrite rule ought to be anchored with ^

        url.rewrite-once = (
            "^/sites/(.*)" => "/index.php/apps/cms_pico/pico/$1" 
        )
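The effect of the ^ anchor can be illustrated with a short Python sketch. It uses Python's re module as a stand-in for the regex engine in lighttpd, and the second test URI is a hypothetical path chosen only to show a match in the middle of a URL.

```python
import re

unanchored = re.compile(r"/sites/(.*)")
anchored = re.compile(r"^/sites/(.*)")

def matched(pattern, uri):
    """True if the pattern matches anywhere it is allowed to in uri."""
    return pattern.search(uri) is not None

# The second URI is hypothetical: "/sites/" appears mid-path.
for uri in ("/sites/mysite", "/index.php/apps/files/sites/mysite"):
    print(uri, matched(unanchored, uri), matched(anchored, uri))
```

Without the anchor the rule would also fire for any URL that merely contains "/sites/", which is rarely what is intended.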

RE: Understanding rewrite rules ... - Added by bwechner over 4 years ago

So, I snooped into the PHP of index.php with a debugger and I have some progress to report. NO solution yet nor full understanding. But inside of the PHP that runs, we see a difference between the redirect and rewrite as follows:

< [REQUEST_URI] => /index.php/apps/cms_pico/pico/ktlt-tourney
---
> [REDIRECT_URI] => /index.php/apps/cms_pico/pico/ktlt-tourney
> [REQUEST_URI] => /sites/ktlt-tourney

< is the redirect, and works.

> is the rewrite, and does not work: the PHP acts on a REQUEST_URI that it does not recognise, so it falls back on a default view.

I now have to work out whence PHP gets these vars, and whether lighty can be coaxed into delivering what's needed so that REQUEST_URI is rewritten!

Will keep plodding forward, but if these clues shed light for a more experienced eye, I'm all ears. If not, fret not, more plodding ahead.

RE: Understanding rewrite rules ... - Added by bwechner over 4 years ago

And, further research uncovers that these are in $_SERVER:

https://www.php.net/manual/en/reserved.variables.server.php

and:

$_SERVER is an array containing information such as headers, paths, and script locations. The entries in this array are created by the web server. There is no guarantee that every web server will provide any of these; servers may omit some, or provide others not listed here.

I'm not sure of the mechanism by which lighty provides $_SERVER to the PHP context, but it seems clear that the salient question is why a rewrite, after the solid claim that it's a totally internal rewrite, does not rewrite $_SERVER["REQUEST_URI"] but instead provides $_SERVER["REDIRECT_URI"] to the PHP context.

More reading ahead, but as ever all tips appreciated. Certainly url.rewrite-once does not seem to be doing what I expected.

RE: Understanding rewrite rules ... - Added by bwechner over 4 years ago

And the answer is before us. This is intentional behaviour:

https://www.lighttpd.net/page6.html

1.4.40: REDIRECT_URI is set for internal redirects (cgi, magnet, rewrite, errdoc)
1.4.41: reverted REQUEST_URI/REDIRECT_URI to match behavior in lighttpd <= 1.4.39
1.4.42: REQUEST_URI is original client request, instead of URI modified by mod_rewrite.

So to get what behaves like a true rewrite, either the backend application has to honour REDIRECT_URI when it is supplied, or mod_magnet has to be used to work around the issue.

http://www.web-site-scripts.com/knowledge-base/article/AA-00382/0/URL-rewrite-support-for-Lighttpd-server.html
https://redmine.lighttpd.net/projects/1/wiki/Docs_ModMagnet

Now to fix it!

RE: Understanding rewrite rules ... - Added by gstrauss over 4 years ago

RFC 3875 The Common Gateway Interface (CGI) Version 1.1
https://tools.ietf.org/html/rfc3875
(published in Oct 2004)

REQUEST_URI is not a standard variable in the CGI environment.

REDIRECT_URI is not a standard variable in the CGI environment.

The commit message for "1.4.41: reverted REQUEST_URI/REDIRECT_URI to match behavior in lighttpd <= 1.4.39" has some gory details on why the change was reverted.
https://redmine.lighttpd.net/projects/lighttpd/repository/14/revisions/9af58a9716b120209c4011b265657b32a414dff9
If you follow the links, they mention issues with (PHP apps) MediaWiki and Wordpress for why the change to those variables was reverted, and has not changed in lighttpd since 2016.

.

I am not sure why you continue to think that there is an issue with lighttpd, when the problems emanate from PHP.

https://www.php.net/manual/en/reserved.variables.server.php

'REQUEST_URI'
    The URI which was given in order to access this page; for instance, '/index.html'. 

A user comment may help you, but note that lighttpd REQUEST_URI and lighttpd REDIRECT_URI (note the spelling) are not necessarily what PHP regurgitates for you
https://www.php.net/manual/en/reserved.variables.server.php#120979

RE: Understanding rewrite rules ... - Added by bwechner over 4 years ago

"I am not sure why you continue to think that there is an issue with lighttpd, when the problems emanate from PHP."

My, my, you are sensitive. I never meant to suggest it was a clear fault anywhere, I guess, but you should be aware that there are standards, formal, informal, universal, partly adopted, widely adopted, de facto, tweaked and more, in the world that all devs have to navigate.

Either way, the fix may be mod_magnet based, though I cannot find access to REDIRECT_URI in lua:

https://redmine.lighttpd.net/projects/1/wiki/Docs_ModMagnet#lightyenv

Do you know perchance if it is available and undocumented? I'll see if I can find out empirically in any case.

If it's not available then url.rewrite is useless to Nextcloud, which, like Wikipedia and others, it seems ignores REDIRECT_URI and uses only REQUEST_URI.

I mean, surely at some level you have to admit it seems logical that a request put to lighty to rewrite the URL should provide identical server variables to a CGI script as loading the rewritten URL does. In short I find it a reasonable assumption (competing of course against other reasonable assumptions) that a directive like this:

        url.rewrite-once = (
            "^/sites/(.*)" => "/index.php/apps/cms_pico/pico/$1" 
        )

makes the following two requests from a browser indistinguishable to the CGI script:

https://mydomain.tld/index.php/apps/cms_pico/pico/mysite
https://mydomain.tld/sites/mysite

and that isn't the case. The second URL differs from the first in two respects: REQUEST_URI keeps the original /sites/... value, and a new server variable is provided: REDIRECT_URI

That is hardly a rewrite, and smacks of a redirect oddly enough.

If lua cannot see the REDIRECT_URI then there are only two ways forward:

1) do the rewrite entirely in lua and do away with the url.rewrite directive above, or

2) patch the PHP with this at its head:

    if (isset($_SERVER['REDIRECT_URI'])) {
       $_SERVER['REQUEST_URI'] = $_SERVER['REDIRECT_URI'];
       unset($_SERVER['REDIRECT_URI']);
    }

I feel a little like if lua can see REDIRECT_URI then that's the path of least resistance. Option 1 seems to require lua coding for every URL rewrite, which may or may not be efficient, and option 2 demands a patch to Nextcloud (with attendant PR hurdles, or upgrade issues in the face of a local patch).

That PHP snippet sure does the trick and the rewrite now works (though as stated, and as the PHP code implies, it's not a rewrite but an internal redirect request).

RE: Understanding rewrite rules ... - Added by bwechner over 4 years ago

And it turns out there's a PHP fix.

In php.ini I add:

auto_prepend_file = /var/www/html/lighttpd.php

and in lighttpd.php:

<?php
// Make the rewritten URI visible to applications that only read REQUEST_URI.
if (isset($_SERVER['REDIRECT_URI'])) {
    $_SERVER['REQUEST_URI'] = $_SERVER['REDIRECT_URI'];
    unset($_SERVER['REDIRECT_URI']);
}
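The logic of that shim can be checked with a small Python model. This is only a sketch under stated assumptions: backend_env and prepend_shim are hypothetical names, and only REQUEST_URI and REDIRECT_URI are modelled, following the diff of $_SERVER shown earlier in the thread. All other variables are ignored.

```python
def backend_env(client_uri, rewritten_uri=None):
    """Model the two variables the backend sees for a request.

    REQUEST_URI is the original client request; REDIRECT_URI is only
    present when the URI was rewritten internally.
    """
    env = {"REQUEST_URI": client_uri}
    if rewritten_uri is not None:
        env["REDIRECT_URI"] = rewritten_uri
    return env

def prepend_shim(env):
    """Python equivalent of the PHP snippet: promote REDIRECT_URI."""
    env = dict(env)  # work on a copy, leave the input untouched
    if "REDIRECT_URI" in env:
        env["REQUEST_URI"] = env.pop("REDIRECT_URI")
    return env

direct = backend_env("/index.php/apps/cms_pico/pico/mysite")
rewritten = backend_env("/sites/mysite", "/index.php/apps/cms_pico/pico/mysite")

print(direct == rewritten)                # the application can tell them apart
print(prepend_shim(rewritten) == direct)  # after the shim it no longer can
```

In this model the shim makes the rewritten request look the same as the direct one, which is the property the application relies on.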

As an aside, most of the sites listed here:

https://redmine.lighttpd.net/projects/lighttpd/wiki/PoweredByLighttpd

have moved to nginx or apache, as revealed by "curl -I". Not a big issue, but I may take some time and remove entries that no longer use lighttpd.
