Feature #299: Accessing the unmatched part of a pcre in mod_rewrite - Lighttpd - lighty labs

Feature #299

closed

Added by Anonymous over 19 years ago. Updated over 16 years ago.

Status:

Wontfix

Priority:

Normal

Category:

mod_rewrite

Target version:

Description

Currently, there is no way to access the unmatched part of the subject string in the substitution part of a rewrite rule. This is useful when the PCRE pattern contains a lookahead assertion, because text matched only by the assertion is not included in the overall match.
It is also useful to be able to put literal % and $ characters in the substitution string.

The following patch makes two additional magic tokens, $^ and $$, available in the substitution part. They reference the unmatched parts of the original string before and after the match, respectively.
It also adds %% and %$ for literal % and $ characters.


--- /usr/pkgsrc/www/lighttpd/work/lighttpd-1.4.3/src/mod_rewrite.c      2005-08-15 19:18:32.000000000 +0200
+++ ./mod_rewrite.c     2005-09-29 18:20:10.000000000 +0200
@@ -379,7 +379,16 @@

                        start = 0; end = pattern_len;
                        for (k = 0; k < pattern_len; k++) {
-                               if ((pattern[k] == '$' || pattern[k] == '%') &&
+                               if (pattern[k] == '%' &&
+                                   (pattern[k + 1] == '$' || pattern[k + 1] == '%')) {
+                                       end = k;
+
+                                       buffer_append_string_len(con->request.uri, pattern + start, end - start);
+                                       buffer_append_string_len(con->request.uri, pattern + k + 1, 1);
+
+                                       k++;
+                                       start = k + 1;
+                               } else if ((pattern[k] == '$' || pattern[k] == '%') &&
                                    isdigit((unsigned char)pattern[k + 1])) {
                                        /* got one */

@@ -400,7 +409,19 @@

                                        k++;
                                        start = k + 1;
-                               }
+                               } else if (pattern[k] == '$' &&
+                                  (pattern[k + 1] == '^' || pattern[k + 1] == '$')) {
+                                       end = k;
+
+                                       buffer_append_string_len(con->request.uri, pattern + start, end - start);
+
+                                       if (pattern[k + 1] == '^')
+                                               buffer_append_string_len(con->request.uri, p->match_buf->ptr, ovec[0]);
+                                       else if (pattern[k + 1] == '$')
+                                               buffer_append_string_len(con->request.uri, p->match_buf->ptr + ovec[1], p->match_buf->used - 1 - ovec[1]);
+                                       k++;
+                                       start = k + 1;
+                               }
                        }

                        buffer_append_string_len(con->request.uri, pattern + start, pattern_len - start);

There may be better names than $^ and $$.
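
The token semantics described above can be tried outside lighttpd with a small standalone sketch. This is not lighttpd code: `expand_subst` and `append_bytes` are hypothetical helpers, `match_start` and `match_end` stand in for `ovec[0]` and `ovec[1]` from the patch, and the existing `$N`/`%N` references are deliberately left unexpanded to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append n bytes of src to out at *pos; returns -1 if out would overflow. */
static int append_bytes(char *out, size_t out_sz, size_t *pos,
                        const char *src, size_t n) {
    if (*pos + n >= out_sz) return -1;  /* keep room for the final NUL */
    memcpy(out + *pos, src, n);
    *pos += n;
    out[*pos] = '\0';
    return 0;
}

/*
 * Hypothetical helper (an assumption for illustration, not lighttpd API).
 * Expands only the four proposed tokens:
 *   %%  -> literal '%'
 *   %$  -> literal '$'
 *   $^  -> subject before the match, subject[0 .. match_start)
 *   $$  -> subject after the match,  subject[match_end .. end)
 * Returns 0 on success, -1 if out is too small.
 */
int expand_subst(const char *pattern, const char *subject,
                 size_t match_start, size_t match_end,
                 char *out, size_t out_sz) {
    size_t subject_len = strlen(subject);
    size_t pattern_len = strlen(pattern);
    size_t pos = 0, start = 0, k;

    assert(match_start <= match_end && match_end <= subject_len);
    if (out_sz == 0) return -1;
    out[0] = '\0';

    for (k = 0; k < pattern_len; k++) {
        char next = pattern[k + 1];  /* at worst the terminating NUL */
        int is_literal = pattern[k] == '%' && (next == '%' || next == '$');
        int is_unmatched = pattern[k] == '$' && (next == '^' || next == '$');

        if (!is_literal && !is_unmatched) continue;

        /* Flush the plain text that precedes the token. */
        if (append_bytes(out, out_sz, &pos, pattern + start, k - start)) return -1;

        if (is_literal) {
            /* Copy the character after '%', i.e. pattern[k + 1]. */
            if (append_bytes(out, out_sz, &pos, pattern + k + 1, 1)) return -1;
        } else if (next == '^') {
            if (append_bytes(out, out_sz, &pos, subject, match_start)) return -1;
        } else {
            if (append_bytes(out, out_sz, &pos, subject + match_end,
                             subject_len - match_end)) return -1;
        }

        k++;            /* skip the second character of the token */
        start = k + 1;
    }

    /* Flush whatever plain text follows the last token. */
    return append_bytes(out, out_sz, &pos, pattern + start, pattern_len - start);
}
```

For example, with the subject `/foo/bar.php` and a match covering `/bar` (offsets 4 to 8), the substitution `[$^|$$|%%|%$]` would expand to `[/foo|.php|%|$]`.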

-- support