Bug #419
closed
rrd update fails during graceful restart
Description
It appears that during graceful restarts where the old lighttpd takes some time to spin down, the new lighttpd's updates of rrd data can cause an rrdtool error:
2005-12-18 06:28:00: (mod_rrdtool.c.410) rrdtool-response: update /path/to/rrd/lighttpd.rrd N:5876:138222:23 ERROR: illegal attempt to update using time 1134887281 when last update time is 1134887281 (minimum one second step) OK u:0.00 s:0.00 r:29.41
2005-12-18 06:28:00: (server.c.900) one of the triggers failed
This error does not occur under normal restarts, only when two lighttpd processes are running (one waiting for existing connections to close, the other accepting new connections). After the error, the new lighttpd process does not log further rrd data.
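The collision can be shown with a toy model. This is not rrdtool's code: `ToyRrd` and its method are made-up names, the values in the first update are invented, and the class only mimics the rule quoted in the error message, that an update's timestamp must be later than the last accepted one.

```python
class ToyRrd:
    """Toy stand-in for an rrd file: only enforces the one-second minimum step."""

    def __init__(self):
        self.last_update = None  # epoch seconds of the last accepted update

    def update(self, timestamp, values):
        # Mimic the rule from the error message: a new update must be at
        # least one second later than the last accepted one.
        if self.last_update is not None and timestamp <= self.last_update:
            return ("ERROR: illegal attempt to update using time %d "
                    "when last update time is %d (minimum one second step)"
                    % (timestamp, self.last_update))
        self.last_update = timestamp
        return "OK"


# Two lighttpd processes (old one draining, new one accepting) share one
# file and both try to update it in the same second.
rrd = ToyRrd()
old_result = rrd.update(1134887281, (5870, 138000, 20))  # invented values
new_result = rrd.update(1134887281, (5876, 138222, 23))  # values from the log
```

Whichever process writes second in a given second gets the error, which matches the log above where the new process is the one that fails.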
Is there any workaround? Is it possible to have same-second updates of the rrd database treated as a soft failure and logged, without stopping future updates? As you might imagine, having rrd and graceful restarts work together is very desirable on systems under heavy load. :)
Thanks,
James
-- jbyers
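The soft-failure behaviour asked for above might look roughly like this. It is a sketch of the idea only, not how mod_rrdtool is written: `classify_response`, `run_updates` and the three labels are hypothetical, and the match is done on the error text from the log in the description.

```python
def classify_response(response):
    """Classify one rrdtool reply line as 'ok', 'soft' or 'hard'."""
    text = response.strip()
    if text.startswith("OK"):
        return "ok"
    # A same-second collision loses one sample but the file is still usable,
    # so it is treated as a soft failure: log it and keep updating.
    if text.startswith("ERROR:") and "minimum one second step" in text:
        return "soft"
    return "hard"


def run_updates(responses):
    """Return (samples_written, soft_failures); stop only on a hard failure."""
    written = soft = 0
    for response in responses:
        kind = classify_response(response)
        if kind == "ok":
            written += 1
        elif kind == "soft":
            soft += 1  # would be logged, but later updates continue
        else:
            break
    return written, soft
```

With this split, a same-second collision costs one sample instead of ending rrd logging for the lifetime of the new process.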
Updated by Anonymous about 19 years ago
I'm going to investigate stopping rrdtool updates when lighttpd gets a SIGINT, combined with a short delay on performing the first update on startup. Will advise.
-- jbyers
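A minimal sketch of that plan, assuming a small gate object that the update trigger consults before each write. `RrdUpdateGate`, its methods and the two-second default delay are all hypothetical and not taken from lighttpd.

```python
class RrdUpdateGate:
    """Decides whether this process should send an rrd update right now."""

    def __init__(self, start_time, startup_delay=2):
        # Skip updates for a short while after startup so the old process
        # has time to stop writing (the delay length is an assumption).
        self.first_allowed = start_time + startup_delay
        self.stopped = False

    def on_sigint(self, signum=None, frame=None):
        # Graceful shutdown requested: leave the rrd file to the new process.
        self.stopped = True

    def may_update(self, now):
        return not self.stopped and now >= self.first_allowed
```

In a real process, `on_sigint` could be registered with `signal.signal(signal.SIGINT, gate.on_sigint)`; the class itself has no dependency on the signal module, which keeps it easy to test.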
Updated by Anonymous over 18 years ago
Note that rrd also fails with lighttpd in multiple listener mode, for similar reasons.
-- jbyers
Updated by Anonymous over 18 years ago
The problem still exists in 1.4.12. My suggested fix is to stop rrdtool updates when the old process receives SIGINT.
-- Elan Ruusamäe <glen
Updated by jan over 18 years ago
Handle this as a known limitation. It will be fixed in 1.5.x.
Updated by Anonymous over 17 years ago
A hack-around to this problem in 1.4 is to kill the rrdtool owned by the first lighttpd instance before the restart. This will result in a few lines in the error log (broken pipe; trigger failed) but will allow the second lighttpd to pick up with rrd updates.
-- jbyers
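The workaround could be scripted along these lines. This is a sketch under assumptions: it supposes rrdtool runs as a direct child of the old lighttpd process, that the caller already knows that process's PID, and that `ps_output` comes from `ps -e -o pid= -o ppid= -o comm=`. Both function names are made up.

```python
import os
import signal


def find_rrdtool_children(ps_output, parent_pid):
    """Return PIDs of rrdtool processes whose parent is parent_pid.

    ps_output is text in the shape produced by
    `ps -e -o pid= -o ppid= -o comm=` (three whitespace-separated columns).
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue
        pid, ppid, command = fields
        # comm may be a bare name or a full path depending on the platform.
        name = os.path.basename(command.strip())
        if int(ppid) == parent_pid and name == "rrdtool":
            pids.append(int(pid))
    return pids


def kill_rrdtool_children(ps_output, parent_pid):
    """Send SIGTERM to each rrdtool child of the old lighttpd process."""
    for pid in find_rrdtool_children(ps_output, parent_pid):
        os.kill(pid, signal.SIGTERM)
```

Run before the restart, this leaves the old lighttpd with a broken pipe to rrdtool (hence the error log lines mentioned above) while the new instance's rrdtool is untouched.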
Updated by stbuehler almost 16 years ago
- Status changed from New to Fixed
- % Done changed from 0 to 100
Applied in changeset r2401.