Url rewriting/ clean URL's

Added by Peeteris over 4 years ago

Hello there!

I'm pretty new at this and there is a lot I don't know yet, so I'm asking for help.
I have a forum and I would like to hide .php in URLs. That's pretty easy to do in standard Apache with .htaccess, but what can I do in lighttpd?
I saw some instructions where you rewrite each page by hand. That sounds a bit too ridiculous, so I'd better ask before doing anything: is there a clear and simple way to make URLs clean?
Thank you!

---
Distributor ID: Raspbian
Description: Raspbian GNU/Linux 9.11 (stretch)
Release: 9.11
Codename: stretch
---
lighttpd/1.4.45 (ssl) - a light and fast webserver
Build-Date: Jan 14 2017 21:07:19


Replies (10)

RE: Url rewriting/ clean URL's - Added by gstrauss over 4 years ago

Please start with mod_rewrite

e.g. url.rewrite-if-not-file to rewrite URLs that do not map to files on disk (e.g. static files) from /url/path/file into /url/path/file.php
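
A minimal sketch of such a rule, assuming every clean URL is a single path segment with no dot in it and that a same-named .php script sits in the document root (both are assumptions, not something stated in this thread):

```lighttpd
# Requires "mod_rewrite" in server.modules.
# Applied only when the request does not map to an existing file on disk.
#   $1 = one path segment without "/", "." or "?"
#   $2 = the query string, if any (the pattern is matched against path plus query)
url.rewrite-if-not-file = (
    "^/([^/.?]+)(\?.*)?$" => "/$1.php$2"
)
```

With this, /forums would be served by /forums.php and /viewtopic?id=1 by /viewtopic.php?id=1, while a URL such as /style.css is left alone because it contains a dot. A real directory such as /somefolder is probably not treated as a file by this check, so it would need its own rule placed before the generic one.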

RE: Url rewriting/ clean URL's - Added by Peeteris over 4 years ago

So, I should write

url.rewrite-if-not-file = (
"^/forums[^?]*(\?.*)?$" => "/forums.php$1",
"^/viewtopic[^?]*(\?.*)?$" => "/viewtopic.php$1" 
)

For every page on server?
--
Also, is there a way to make content inaccessible under a URL with a trailing slash?
E.g. only site.com/forums works and site.com/forums/ doesn't exist.

I'm getting this warning from [link deleted]:
"This website returns page content with or without a trailing slash on the URLs. Search engines might see these as separate pages with duplicate content which they could penalise for."

RE: Url rewriting/ clean URL's - Added by gstrauss over 4 years ago

> For every page on server?

No. I'd suggest a regex pattern that works for whatever framework you are using. (This is not the forum to ask about your favorite framework du jour)

> Also, is there a way to make content with a trailing slash URL inaccessible?

Yes.

What you should do depends on the framework and your site. You can use mod_redirect or mod_access or possibly some other modules to achieve your aim.
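
As a rough sketch of the mod_redirect route, assuming the clean URLs are single path segments without dots and that no real directory is meant to be reachable under such a name:

```lighttpd
# Requires "mod_redirect" in server.modules.
url.redirect-code = 301
url.redirect = (
    # /forums/ -> /forums, /forums/?x=1 -> /forums?x=1
    "^/([^/.?]+)/(\?.*)?$" => "/$1$2"
)
```

For a real directory this would likely fight with lighttpd's own redirect that adds the trailing slash to directory URLs, so such paths are better listed explicitly or excluded.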

RE: Url rewriting/ clean URL's - Added by stbuehler over 4 years ago

A proper framework that uses rewrites in the first place should use a single entry point (maybe a second one for admin/setup/update stuff), usually named "index.php" for php-based frameworks, and handle all request routing internally (apart from static files like css/js). This includes redirecting with/without trailing slash. That way the webserver config is small and easily portable between different webservers.
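
For illustration, the single-entry-point setup described here might look roughly like this in lighttpd 1.4.45, assuming the framework reads the original path from PATH_INFO (a hypothetical setup, not FluxBB's actual one):

```lighttpd
# Requires "mod_rewrite" in server.modules.
# Static files that exist on disk are served directly; everything else
# goes to index.php, with the original path passed as PATH_INFO.
url.rewrite-if-not-file = (
    "^/(.*)$" => "/index.php/$1"
)
```

The query string is part of the matched text, so /topic/5?page=2 would become /index.php/topic/5?page=2.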

RE: Url rewriting/ clean URL's - Added by Peeteris over 4 years ago

I found out that it's possible to get rid of trailing slashes with this code:

$HTTP["host"] == "www.example.com" {
        url.redirect-code = 301
        url.redirect = ("^index.php(.+)/$" => "$1")

        server.document-root = "/var/www/public" 
        url.rewrite-if-not-file = (
                "^/(.*)$" => "index.php/$1",
        )
}

From : http://www.livingwithphp.com/removing-trailing-slash-using-lighttpd/

But this breaks all the other rewrites: with this code added, I can't access pages whose URLs have no extension.
E.g. folders with their own index.php, like SiteURL.url/somefolder, can't be accessed (the request goes to the root index.php); only SiteURL.url/somefolder/index.php works.
I also can't access /forums, though /forums.php works, even though I've used

url.rewrite-if-not-file = ( "^/forums[^?]*(\?.*)?$" => "/forums.php$1" )

RE: Url rewriting/ clean URL's - Added by Peeteris over 4 years ago

Hi there!
I almost have a solution, but one thing still bothers me and hurts the SEO rating.
The pages I've listed in the config file display perfectly fine, and URLs ending in / are redirected to the version without the trailing slash [1st screenshot].

BUT! As soon as a page is visited with .php and a trailing slash added, it shows a duplicate page [2nd screenshot].
I have no idea how to get rid of this duplicate.
Help would be much appreciated, as this has been bothering me for a long time.

CONFIG file:

server.modules = (
    "mod_access",
    "mod_alias",
    "mod_compress",
    "mod_redirect",
    "mod_rewrite",
)

server.document-root        = "/var/www/html" 
server.upload-dirs          = ( "/var/cache/lighttpd/uploads" )
server.errorlog             = "/var/log/lighttpd/error.log" 
server.pid-file             = "/var/run/lighttpd.pid" 
server.username             = "www-data" 
server.groupname            = "www-data" 
server.port                 = 80

index-file.names            = ( "index.php", "index.html", "index.lighttpd.html" )
url.access-deny             = ( "~", ".inc" )
static-file.exclude-extensions = ( ".php", ".pl", ".fcgi" )

compress.cache-dir          = "/var/cache/lighttpd/compress/" 
compress.filetype           = ( "application/javascript", "text/css", "text/html", "text/plain" )

# default listening port for IPv6 falls back to the IPv4 port
include_shell "/usr/share/lighttpd/use-ipv6.pl " + server.port
include_shell "/usr/share/lighttpd/create-mime.assign.pl" 
include_shell "/usr/share/lighttpd/include-conf-enabled.pl" 

$HTTP["host"] =~ "^www\.(.*)$" {
  url.redirect = ( "^/(.*)" => "http://%1/$1" )
}

$HTTP["host"] == "REMOVED" {
    url.redirect-code = 301
    url.redirect = (
        "^/forums/$" => "/forums",
        "^/search/$" => "/search",
        "^/userlist/$" => "/userlist",
        "^/register/$" => "/register",
        "^/login/$" => "/login",
        "^/PM/$" => "/PM",
        "^/downloads/$" => "/downloads",
        "^/oldbutgold/$" => "/oldbutgold",
    )
}

url.rewrite-once = (
    "^/forums$" => "/forums.php",
    "^/search$" => "/search.php",
    "^/userlist$" => "/userlist.php",
    "^/register$" => "/register.php",
    "^/login$" => "/login.php",
    "^/PM$" => "/pmsnew.php",
    "^/downloads$" => "/downloads.php",
    "^/oldbutgold$" => "/oldbutgold/index.html",
)

1.png (135 KB) - 1st screenshot
2.png (52.8 KB) - 2nd screenshot

RE: Url rewriting/ clean URL's - Added by gstrauss over 4 years ago

> BUT! As soon as page with .php is visited and trailing slash added - it shows a duplicate page [2nd screenshot]

What is adding the trailing slash?

It is valid PATH_INFO to have a trailing slash at the end of the xxxxx.php
Is your framework adding the trailing slash on the server side?
Is your framework adding the trailing slash on the client side? (e.g. javascript)

While you think you shared your lighttpd config, you only shared part of it.

> include_shell "/usr/share/lighttpd/include-conf-enabled.pl"

That generates more config.

Run lighttpd -p -f /etc/lighttpd.conf to print the whole config. Then, you should look to see if anything is adding a trailing slash in those configs. The trailing slash is more likely being added by your framework. If your framework actually knew how to do things well for pretty URLs, the framework would hide the .php extensions and would have instructions to show you how to do so in the web server config, too.

RE: Url rewriting/ clean URL's - Added by Peeteris over 4 years ago

This is the full config file:
https://pastebin.com/D3Fr4GjA
It seems the additional lines aren't affecting the redirects.

I'm using the FluxBB forum engine, and even the official forum has the same problem:
https://fluxbb.org/forums/index.php/
and
https://fluxbb.org/forums/index.php
give different pages. So that's a framework problem, but I have no idea how to fix it.

RE: Url rewriting/ clean URL's - Added by gstrauss over 4 years ago

> https://fluxbb.org/forums/index.php
> Gives different pages. So that's a framework problem, but I have no idea how to fix that.

If it is a framework problem, then it is probably best to post on the fluxbb forums or issue tracker.

I haven't looked at fluxbb, but if it always redirects /foo.php to /foo.php/, then you should do it in lighttpd with a redirect (if visible to the user) or via a rewrite to /foo.php/ instead of /foo.php.

RE: Url rewriting/ clean URL's - Added by Peeteris over 4 years ago

It's likely not going to be fixed by the FluxBB team, and I don't have the necessary knowledge to fix the framework myself.

So I made a terribly long and inefficient list of redirects and rewrites. At least it works the way I need it to, but does anyone have suggestions on how to make it simpler or more efficient?

$HTTP["host"] =~ "^www\.(.*)$" {
  url.redirect = ( "^/(.*)" => "http://%1/$1" )
}

$HTTP["host"] == "hostrm" {
url.redirect-code = 301
url.redirect = (
"^/forums/$" => "/forums",
"^/forums.php/[^?]*(\?.*)?$" => "/forums",
"^/search/$" => "/search",
"^/search.php/[^?]*(\?.*)?$" => "/search",
"^/userlist/$" => "/userlist",
"^/userlist.php/[^?]*(\?.*)?$" => "/userlist",
"^/register/$" => "/register",
"^/register.php/[^?]*(\?.*)?$" => "/register",
"^/login/$" => "/login",
"^/login.php/[^?]*(\?.*)?$" => "/login",
"^/PM/$" => "/PM",
"^/downloads/$" => "/downloads",
"^/downloads.php/[^?]*(\?.*)?$" => "/downloads",
"^/oldbutgold/$" => "/oldbutgold",
"^/oldbutgold/index.html/[^?]*(\?.*)?$" => "/oldbutgold",
"^/viewforum/$" => "/viewforum",
"^/viewforum.php/[^?]*(\?.*)?$" => "/viewforum",
"^/viewtopic/$" => "/viewtopic",
"^/viewtopic.php/[^?]*(\?.*)?$" => "/viewtopic",
"^/replies/$" => "/replies",
"^/newest/$" => "/newest",
"^/recent/$" => "/recent",
"^/unanswered/$" => "/unanswered",
"^/profile.php/[^?]*(\?.*)?$" => "/profile",
"^/pmsnew.php/[^?]*(\?.*)?$" => "/PM",
)
}

url.rewrite-once = (
"^/forums$" => "/forums.php",
"^/search$" => "/search.php",
"^/userlist$" => "/userlist.php",
"^/register$" => "/register.php",
"^/login$" => "/login.php",
"^/PM$" => "/pmsnew.php",
"^/downloads$" => "/downloads.php",
"^/oldbutgold$" => "/oldbutgold/index.html",
"^/viewforum$" => "/viewforum.php",
"^/viewtopic$" => "/viewtopic.php",
"^/replies$" => "/search.php?action=show_replies",
"^/newest$" => "/search.php?action=show_new",
"^/recent$" => "/search.php?action=show_recent",
"^/unanswered$" => "/search.php?action=show_unanswered",
"^/profile$" => "/profile.php",
)

Maybe there's a better way, so that I could access (for example) /viewforum?id=15 instead of /viewforum.php?id=15? With my current code, /viewforum?id=15 ends up on a 404 page.
Thank you! :)
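
The 404 is most likely because lighttpd matches rewrite patterns against the path together with the query string, so "^/viewforum$" no longer matches once ?id=15 is appended. A possible condensed version follows as an untested sketch. It assumes every other clean name maps to a same-named .php script, and unlike the list above it carries the query string through the redirects:

```lighttpd
$HTTP["host"] == "hostrm" {
    url.redirect-code = 301
    url.redirect = (
        # special cases: script name differs from the clean name
        "^/pmsnew\.php/[^?]*(\?.*)?$" => "/PM$1",
        "^/oldbutgold/index\.html/[^?]*(\?.*)?$" => "/oldbutgold$1",
        # /name.php/anything -> /name
        "^/([^/.?]+)\.php/[^?]*(\?.*)?$" => "/$1$2",
        # /name/ -> /name
        "^/([^/.?]+)/(\?.*)?$" => "/$1$2"
    )
}

url.rewrite-once = (
    # special cases first; the first matching rule wins
    "^/PM(\?.*)?$"         => "/pmsnew.php$1",
    "^/oldbutgold(\?.*)?$" => "/oldbutgold/index.html$1",
    "^/replies$"           => "/search.php?action=show_replies",
    "^/newest$"            => "/search.php?action=show_new",
    "^/recent$"            => "/search.php?action=show_recent",
    "^/unanswered$"        => "/search.php?action=show_unanswered",
    # generic: /name or /name?query -> /name.php or /name.php?query
    "^/([^/.?]+)(\?.*)?$"  => "/$1.php$2"
)
```

Because the generic patterns exclude dots and further slashes, requests for static files such as stylesheets or images are not touched. Any real directory that should be reachable by its bare name would need its own rule above the generic ones, as /oldbutgold has here.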
