Bug #419

rrd update fails during graceful restart

Added by Anonymous over 8 years ago. Updated over 5 years ago.

Status: Fixed
Start date:
Priority: Low
Due date:
Assignee: jan
% Done: 100%
Category: mod_rrdtool
Target version: 1.5.0
Missing in 1.5.x:

Description

It appears that during graceful restarts where the old lighttpd takes some time to spin down, the new lighttpd's updates of rrd data can cause an rrdtool error:


2005-12-18 06:28:00: (mod_rrdtool.c.410) rrdtool-response: update /path/to/rrd/lighttpd.rrd N:5876:138222:23
 ERROR: illegal attempt to update using time 1134887281 when last update time is 1134887281 (minimum one second step)
OK u:0.00 s:0.00 r:29.41 
2005-12-18 06:28:00: (server.c.900) one of the triggers failed 

This error does not occur under normal restarts, only when two lighttpd processes are running (one waiting for existing connections to close, the other accepting new connections). After the error, the new lighttpd process does not log further rrd data.

Is there any workaround? Is it possible to have same-second updates of the rrd database treated as a soft failure that is logged but does not stop future updates? As you might imagine, having rrd and graceful restarts work together is very desirable on systems under heavy load. :)
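To make the request concrete, here is a minimal sketch of what "soft failure" handling could look like, assuming the rrdtool response text looks like the log excerpt above. The type `rrd_resp_t` and the function `classify_rrd_response` are hypothetical names for illustration and are not taken from mod_rrdtool.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of an rrdtool response (not lighttpd's API). */
typedef enum {
    RRD_RESP_OK,        /* no error reported */
    RRD_RESP_SOFT_FAIL, /* same-second update rejected: log it, keep updating */
    RRD_RESP_HARD_FAIL  /* any other error: caller may stop updating */
} rrd_resp_t;

/* Classify a NUL-terminated response by matching the error text seen in
 * the log above. Matching on message text is an assumption of this sketch;
 * the wording could differ between rrdtool versions. */
rrd_resp_t classify_rrd_response(const char *resp) {
    assert(resp != NULL);

    if (strstr(resp, "ERROR:") == NULL) {
        return RRD_RESP_OK;
    }
    if (strstr(resp, "illegal attempt to update using time") != NULL) {
        return RRD_RESP_SOFT_FAIL;
    }
    return RRD_RESP_HARD_FAIL;
}
```

With something like this, the trigger would log a `RRD_RESP_SOFT_FAIL` and try again on the next interval instead of disabling further updates.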

Thanks,
James

-- jbyers

Associated revisions

Revision 2401
Added by stbuehler over 5 years ago

Fix rrd error after graceful restart (fixes #419)

Revision 2416
Added by stbuehler over 5 years ago

Port some mod_rrdtool fixes from 1.4.x (#604, #419 and more)

History

#1 Updated by Anonymous over 8 years ago

I'm going to investigate stopping rrdtool updates when lighttpd gets a SIGINT, combined with a short delay on performing the first update on startup. Will advise.
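The idea can be sketched roughly as follows, assuming a flag set from a SIGINT handler and a fixed start-up delay. All names here (`rrd_gate`, `rrd_gate_init`, `rrd_gate_install_handler`, `rrd_update_allowed`) are hypothetical, and the delay value is left to the caller; this is not how lighttpd implements it.

```c
#include <assert.h>
#include <signal.h>
#include <time.h>

/* Set from the signal handler; sig_atomic_t keeps the write signal-safe. */
static volatile sig_atomic_t rrd_shutting_down = 0;

/* Handler only records that a shutdown was requested. */
static void rrd_on_sigint(int signo) {
    (void)signo;
    rrd_shutting_down = 1;
}

/* Hypothetical gate deciding whether an rrd update may be sent. */
typedef struct {
    time_t started_at;          /* when this process started */
    double first_delay_seconds; /* wait this long before the first update */
} rrd_gate;

void rrd_gate_init(rrd_gate *g, time_t now, double first_delay_seconds) {
    assert(g != NULL);
    g->started_at = now;
    g->first_delay_seconds = first_delay_seconds;
}

/* Install the SIGINT handler; returns 0 on success, -1 on failure. */
int rrd_gate_install_handler(void) {
    return (signal(SIGINT, rrd_on_sigint) == SIG_ERR) ? -1 : 0;
}

/* Returns 1 if an update may be sent at time `now`, 0 otherwise:
 * never after SIGINT, and not before the start-up delay has elapsed. */
int rrd_update_allowed(const rrd_gate *g, time_t now) {
    assert(g != NULL);
    if (rrd_shutting_down) {
        return 0;
    }
    return difftime(now, g->started_at) >= g->first_delay_seconds;
}
```

The old process stops writing as soon as it is told to wind down, and the new process holds back its first write, so the two should not hit the rrd file within the same second during the handover.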

-- jbyers

#2 Updated by Anonymous almost 8 years ago

Note that rrd also fails with lighttpd in multiple listener mode, for similar reasons.

-- jbyers

#3 Updated by Anonymous almost 8 years ago

The problem still exists in 1.4.12. My suggestion for a fix is to stop rrdtool updates when the old process receives SIGINT.

-- Elan Ruusamäe <glen

#4 Updated by jan almost 8 years ago

Handle this as a known limitation. It will be fixed in 1.5.x.

#5 Updated by Anonymous almost 7 years ago

A hack-around to this problem in 1.4 is to kill the rrdtool owned by the first lighttpd instance before the restart. This will result in a few lines in the error log (broken pipe; trigger failed) but will allow the second lighttpd to pick up with rrd updates.

-- jbyers

#6 Updated by stbuehler over 5 years ago

  • Status changed from New to Fixed
  • % Done changed from 0 to 100

Applied in changeset r2401.
