Bug #2029

mod_fastcgi and spawn-fcgi USR1 signal handling

Added by kapouer over 5 years ago. Updated over 5 years ago.

Status: Invalid
Start date: 2009-07-12
Priority: Normal
Due date:
Assignee: -
% Done: 100%
Category: -
Target version: -
Missing in 1.5.x:

Description

Hi, I'm currently seeing weird behavior using lighttpd + fastcgi + a rails app:
if I let lighttpd spawn the rails app (configured to use kill-signal 10), stopping lighttpd doesn't kill the app.
Sending the USR1 signal manually to the rails app doesn't stop it either.
If I use spawn-fcgi and send USR1 (10) to the spawned process, it stops gracefully.
Does lighttpd 1.4.23 use the same code base as spawn-fcgi to manage the spawning?
If not, why not? Isn't that redundant?

Associated revisions

Revision 2582
Added by stbuehler over 5 years ago

Reset ignored signals to SIG_DFL before exec() in fastcgi/scgi (fixes #2029)

History

#1 Updated by stbuehler over 5 years ago

lighttpd ignores SIGUSR1, and that setting is inherited through exec():

http://www.opengroup.org/onlinepubs/000095399/functions/exec.html:

Except for SIGCHLD, signals set to be ignored (SIG_IGN) by the calling process image shall be set to be ignored by the new process image.

I guess resetting the ignored signals to the default behaviour should fix this.
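The inheritance rule quoted above can be reproduced outside lighttpd. The following is a minimal sketch, not the code from r2582: `reset_ignored_signals` and `child_dies_on_usr1` are hypothetical names, the child program (`sleep 5`) is an arbitrary stand-in for a FastCGI backend, and only SIGUSR1 is reset here. A close-on-exec pipe is used so the parent sends the signal only after the child has called exec().

```c
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: restore the default action for SIGUSR1 so the
 * exec'd program does not inherit SIG_IGN from its parent. */
void reset_ignored_signals(void) {
    signal(SIGUSR1, SIG_DFL);
}

/* Spawns `sleep 5` from a parent that ignores SIGUSR1, then sends SIGUSR1.
 * Returns 1 if the child was terminated by SIGUSR1, 0 if it survived,
 * -1 on setup errors. */
int child_dies_on_usr1(int reset_before_exec) {
    int sync_pipe[2];
    if (pipe(sync_pipe) < 0) return -1;
    /* Close-on-exec write end: the parent sees EOF once exec has happened. */
    fcntl(sync_pipe[1], F_SETFD, FD_CLOEXEC);

    signal(SIGUSR1, SIG_IGN);   /* the parent ignores SIGUSR1 */

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        close(sync_pipe[0]);
        if (reset_before_exec) reset_ignored_signals();
        execlp("sleep", "sleep", "5", (char *)NULL);
        _exit(127);             /* only reached if exec failed */
    }

    close(sync_pipe[1]);
    char byte;
    while (read(sync_pipe[0], &byte, 1) > 0) { }  /* wait for EOF */
    close(sync_pipe[0]);

    kill(pid, SIGUSR1);

    int status = 0;
    struct timespec pause = {0, 50 * 1000 * 1000};  /* 50 ms */
    for (int i = 0; i < 20; i++) {                  /* poll for about 1 s */
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return WIFSIGNALED(status) && WTERMSIG(status) == SIGUSR1;
        if (r < 0) return -1;
        nanosleep(&pause, NULL);
    }

    /* The child ignored SIGUSR1: clean it up and report survival. */
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return 0;
}
```

Calling `child_dies_on_usr1(0)` models the behaviour before the fix (the child inherits SIG_IGN and keeps running), while `child_dies_on_usr1(1)` models resetting to SIG_DFL before exec().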

#2 Updated by stbuehler over 5 years ago

  • Status changed from New to Fixed
  • % Done changed from 0 to 100

Applied in changeset r2582.

#3 Updated by kapouer over 5 years ago

I applied changeset r2582 to lighttpd 1.4.23, but
the spawned process still doesn't respond to SIGUSR1,
and stopping lighttpd doesn't stop the spawned process.

#4 Updated by stbuehler over 5 years ago

Could you please try adding a "setsid();" call before the exec call (where the signal(..) calls were added)?

#5 Updated by stbuehler over 5 years ago

  • Status changed from Fixed to Reopened
  • Target version set to 1.4.24

#6 Updated by kapouer over 5 years ago

unfortunately, no luck with this :

reset_signals();
setsid();
/* exec the cgi */
execve(arg.ptr[0], arg.ptr, env.ptr);

#7 Updated by kapouer over 5 years ago

also note that i got several other spawned processes :

www-data 27248     1  0 16:47 ?        00:00:00 /usr/sbin/lighttpd -f /etc/lighttpd/lighttpd.conf
www-data 27251 27248  1 16:47 ?        00:00:00 /usr/bin/php-cgi
www-data 27255 27251  0 16:47 ?        00:00:00 /usr/bin/php-cgi
www-data 27257 27251  0 16:47 ?        00:00:00 /usr/bin/php-cgi
www-data 27258 27248 32 16:47 ?        00:00:00 /usr/bin/octave -qf /home/dev/WORKSPACES/climelioth/octave/main.m --fcgi
www-data 27259 27248 32 16:47 ?        00:00:00 /usr/bin/octave -qf /home/dev/WORKSPACES/climelioth/octave/main.m --fcgi
www-data 27260 27248  3 16:47 ?        00:00:00 /home/dev/public_html/webkitpdf/webkitpdf-fcgi
www-data 27261 27248 72 16:47 ?        00:00:00 ruby /usr/share/redmine/public/dispatch.fcgi
www-data 27262 27248 69 16:47 ?        00:00:00 ruby /usr/share/redmine/public/dispatch.fcgi

All of them are configured with "kill-signal" => 10;
only the ruby ones don't quit.
If I stop lighttpd just after I start it (within 3 seconds or less), the ruby processes quit gracefully.
If I wait 10 seconds, they don't. Maybe it's something with ruby, too?

#8 Updated by darix over 5 years ago

Well, signal 10 is SIGBUS, which is not a proper signal to kill your apps. Maybe try 15, which would be SIGTERM?

#9 Updated by kapouer over 5 years ago

not on my debian distrib :

kill -l USR1

10

kill -l BUS

7

of course i could use kill-signal 9, but that would not be very graceful.
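The disagreement over what signal 10 means comes from signal numbers being platform-dependent: the `kill -l` output above shows USR1 = 10 and BUS = 7 on this Debian system, whereas on the BSDs 10 is SIGBUS. A small sketch for checking the numbers on a given platform follows; `signal_number` is a hypothetical helper, not part of lighttpd, and it only knows the four names listed.

```c
#include <signal.h>
#include <string.h>

/* Hypothetical helper: map a few signal names to this platform's numbers
 * using the constants from <signal.h>. Returns -1 for unknown names. */
int signal_number(const char *name) {
    if (strcmp(name, "USR1") == 0) return SIGUSR1;
    if (strcmp(name, "BUS") == 0)  return SIGBUS;
    if (strcmp(name, "TERM") == 0) return SIGTERM;
    if (strcmp(name, "KILL") == 0) return SIGKILL;
    return -1;
}
```

Printing `signal_number("USR1")` on the target machine gives the value to put into "kill-signal" there.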

#10 Updated by kapouer over 5 years ago

it is still buggy with the patch, but something strange is happening :
when i start lighttpd, i have two ruby processes launched
if i don't use the ruby webapp (here it's redmine), then stop lighttpd, the two instances are not killed.
if i use the webapp, browse some pages, then stop lighttpd, ONE instance is killed, not the other.

#11 Updated by stbuehler over 5 years ago

Could you please check if you still can't kill them manually with kill -USR1 ?

There may have been two problems: 1. ignoring the signal and 2. lighty didn't send a signal

#12 Updated by kapouer over 5 years ago

I checked again: no, I still can't kill them manually with kill -USR1.
i tried on the other fastcgi instances, and it works on them.

i also added logs to see if lighttpd was sending the signal,
and i can confirm that :

kill(proc->pid, host->kill_signal);

is called (in mod_fastcgi_free) with the right parameters :
i compared the proc->pid with pids of running ruby instances,
and also the kill-signal was correct.
I guess there is some weird interaction between ruby and the
way lighttpd spawns processes compared to the way spawn-fcgi spawns
processes.

#13 Updated by stbuehler over 5 years ago

Looks like there is/was a problem with ruby-fcgi (different behaviour in ruby and c implementation), see http://dev.rubyonrails.org/ticket/8704 .

#14 Updated by stbuehler over 5 years ago

  • Status changed from Reopened to Invalid

Let me summarize:
- lighty resets default signal behaviour (so it should be the same as spawn-fcgi)
- lighty does the correct kill() call

I don't think that is our problem, sorry. (As it works with spawn-fcgi, just use spawn-fcgi. It is the right thing anyway :) )

#15 Updated by stbuehler over 5 years ago

  • Target version deleted (1.4.24)
