Bug #3137
closedTRACEME environment option in tests broken with LISTEN_PID
Description
5b1b9f78247e122804490b17112e94df32851464 switched to using (systemd) socket activation to pass an ephemeral listening port to the test instance.
For this, LISTEN_PID
needs to be set to the process id the test instance is running as.
But the TRACEME variants all call some wrapper program (gdb, strace, valgrind, ...), which doesn't update LISTEN_PID to the actual (sub) process id of the spawned instance; the nested process id also can't be predicted (for me adding 10 worked with gdb most of the time...).
The tests will then fail with something like this:
... lighttpd1.4/src/network.c.419) can't bind to socket: 0.0.0.0:80: Permission denied
- There should be a way to require systemd socket activation in the config (instead of falling back to bind :80), and it should be used in the test configs
- lighttpd should allow a special
LISTEN_PID
value (like "ignore") to skip the pid test; this should be used to spawn the test instance if TRACEME is active. (Maybe it would be possible to use something likeLISTEND_PID=parent=xxx
to have it match the parent process id; but I'm not sure if the TRACEME programs only add a single process layer in the tree.)
Updated by gstrauss about 3 years ago
- Description updated (diff)
- Category set to test_component
Updated by gstrauss about 3 years ago
Those ideas are good, and I am considering others.
On (native) Windows, I may need a different solution to socket activation.
It is possible that for testing, a lighttpd.conf.$test is created dynamically, and might incorporate LISTEN_PID=TRACEME or something like that to skip the test and reset the variable to getpid().
Updated by gstrauss about 3 years ago
My understanding is that the LISTEN_PID is a sanity check.
I believe that getppid()
is fairly portable (and part of the POSIX.1-2001 standard), so this might be sufficient for (most) tracing.
--- a/src/network.c +++ b/src/network.c @@ -581,7 +581,7 @@ static int network_socket_activation_from_env(server *srv, network_socket_config char *listen_fds = getenv("LISTEN_FDS"); pid_t lpid = listen_pid ? (pid_t)strtoul(listen_pid,NULL,10) : 0; int nfds = listen_fds ? atoi(listen_fds) : 0; - int rc = (lpid == getpid() && nfds > 0 && nfds < 5000) + int rc = ((lpid==getpid() || lpid==getppid()) && nfds > 0 && nfds < 5000) ? network_socket_activation_nfds(srv, s, nfds) : 0; unsetenv("LISTEN_PID");
Updated by stbuehler about 3 years ago
It certainly is a sanity check, and most certainly exactly against processes forwarding env and fds to nested processes. So allowing ppid to match by default is not a good idea imho.
Updated by gstrauss about 3 years ago
It certainly is a sanity check, and most certainly exactly against processes forwarding env and fds to nested processes. So allowing ppid to match by default is not a good idea imho.
Good point. [Edit: although if using the lighttpd server and these variables are set, it is likely that the lighttpd server is the target of systemd socket activation)]
What do you think of combining the getppid check along with requiring the existence of TRACEME environment variable?(lpid == getppid() && NULL != getenv("TRACEME"))
Updated by stbuehler about 3 years ago
gstrauss wrote in #note-5:
It certainly is a sanity check, and most certainly exactly against processes forwarding env and fds to nested processes. So allowing ppid to match by default is not a good idea imho.
Good point. [Edit: although if using the lighttpd server and these variables are set, it is likely that the lighttpd server is the target of systemd socket activation)]
Either lighttpd was the target and we don't need to check LISTEN_PID
in the first place, or we want a safe-guard if lighttpd wasn't actually the target, then we really shouldn't allow ppid.
The probability that lighttpd was the target has nothing to do with whether allowing ppid is a good idea (only whether checking LISTEN_PID
is needed).
What do you think of combining the getppid check along with requiring the existence of TRACEME environment variable?
(lpid == getppid() && NULL != getenv("TRACEME"))
a) whether ppid actually matches (for all "TRACME" tools) is still not certain, right?
b) checking TRACEME seems very specific to me; accepting a LISTEND_PID=any
(or LISTEND_PID=parent
if you really want to go for ppid matching) or similar still sounds like a better solution to me
Apart from that I suppose it would work.
Updated by gstrauss about 3 years ago
a) whether ppid actually matches (for all "TRACME" tools) is still not certain, right?
In a test on Linux, valgrind preserve the original process pid.
In a test on Linux, strace and gdb change the process pid, and LISTEN_PID matches parent pid.
In a test on FreeBSD, truss changes the process pid, and LISTEN_PID matches parent pid.
My hesitancy in extending LISTEN_PID stems from its origin: systemd socket activation. Who is to say when systemd might extend it?
Then again, this is for a feature of lighttpd testing, and LISTEND_PID=parent
presently seems like a reasonable solution.
Updated by gstrauss about 3 years ago
Well, LISTEN_PID=parent
does not specify the pid expected to match.
I suppose I could use something like LISTEN_PID=parent:<pid>
Updated by gstrauss about 3 years ago
- Status changed from New to Fixed
Applied in changeset 1e335b3724ae3f839170286d572b3056d6814b54.
Updated by gstrauss about 3 years ago
- Target version changed from 1.4.xx to 1.4.64
Also available in: Atom