Bug #419

closed

rrd update fails during graceful restart

Added by Anonymous almost 19 years ago. Updated almost 16 years ago.

Status:
Fixed
Priority:
Low
Category:
mod_rrdtool
Target version:

Description

It appears that during graceful restarts where the old lighttpd takes some time to spin down, the new lighttpd's updates of rrd data can cause an rrdtool error:


2005-12-18 06:28:00: (mod_rrdtool.c.410) rrdtool-response: update /path/to/rrd/lighttpd.rrd N:5876:138222:23
 ERROR: illegal attempt to update using time 1134887281 when last update time is 1134887281 (minimum one second step)
OK u:0.00 s:0.00 r:29.41 
2005-12-18 06:28:00: (server.c.900) one of the triggers failed 

This error does not occur under normal restarts, only when two lighttpd processes are running (one waiting for existing connections to close, the other accepting new connections). After the error, the new lighttpd process does not log further rrd data.

Is there any workaround? Is it possible to have same-second updates of the rrd database treated as a soft failure that is logged but does not stop future updates? As you might imagine, having rrd and graceful restarts work together is very desirable on systems under heavy load. :)
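To illustrate the soft-failure idea, here is a minimal sketch in Python (not lighttpd's actual C code). It classifies an rrdtool response using only the error text from the log excerpt above. The function name `classify_response` and the three result labels are assumptions made up for this example.

```python
import re

# Error text as it appears in the log excerpt of this report.
SAME_SECOND_ERROR = re.compile(
    r"ERROR: illegal attempt to update using time (\d+) "
    r"when last update time is (\d+)"
)


def classify_response(response):
    """Return 'ok', 'soft' or 'hard' for an rrdtool response.

    Hypothetical helper: 'soft' means the caller should log the error
    and keep sending updates instead of disabling the trigger.
    """
    error_lines = [
        line.strip()
        for line in response.splitlines()
        if line.strip().startswith("ERROR:")
    ]
    if not error_lines:
        return "ok"
    for line in error_lines:
        match = SAME_SECOND_ERROR.search(line)
        # Soft only when the rejected timestamp is not newer than the
        # last recorded update (another process wrote first).
        if match and int(match.group(1)) <= int(match.group(2)):
            continue
        return "hard"
    return "soft"
```

With this split, a same-second collision between the old and new process would be logged and skipped, while any other rrdtool error would still be treated as a real failure.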

Thanks,
James

-- jbyers

Actions #1

Updated by Anonymous almost 19 years ago

I'm going to investigate stopping rrdtool updates when lighttpd gets a SIGINT, combined with a short delay on performing the first update on startup. Will advise.
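A rough sketch of that idea in Python, for illustration only (mod_rrdtool itself is C, and this is not the change that was eventually applied). The class `RrdUpdateGate`, the `startup_delay` value, and the helper `install_sigint_handler` are all hypothetical names chosen for this example.

```python
import signal
import time


class RrdUpdateGate:
    """Decides whether this process may send an rrd update (hypothetical)."""

    def __init__(self, startup_delay=2.0, clock=time.monotonic):
        self._clock = clock
        # The first update is held back so the old process can go quiet.
        self._not_before = clock() + startup_delay
        self._stopping = False

    def stop(self, signum=None, frame=None):
        # Signature matches a Python signal handler, so it can be
        # registered directly for SIGINT.
        self._stopping = True

    def may_update(self):
        return not self._stopping and self._clock() >= self._not_before


def install_sigint_handler(gate):
    # Must be called from the main thread; after SIGINT the gate stays closed.
    signal.signal(signal.SIGINT, gate.stop)
```

The periodic trigger would call `may_update()` before writing to rrdtool. The old process stops updating as soon as it receives SIGINT, and the new one waits out the delay, so the two should not write within the same second as long as the delay is longer than one update step.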

-- jbyers

Actions #2

Updated by Anonymous over 18 years ago

Note that rrd also fails when lighttpd runs in multiple listener mode, for similar reasons.

-- jbyers

Actions #3

Updated by Anonymous about 18 years ago

The problem still exists in 1.4.12. My suggested fix is to stop rrdtool updates when the old process receives SIGINT.

-- Elan Ruusamäe <glen

Actions #4

Updated by jan about 18 years ago

Handle this as a known limitation. It will be fixed in 1.5.x.

Actions #5

Updated by Anonymous about 17 years ago

A hack-around to this problem in 1.4 is to kill the rrdtool owned by the first lighttpd instance before the restart. This will result in a few lines in the error log (broken pipe; trigger failed) but will allow the second lighttpd to pick up with rrd updates.
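One way to script that workaround, as a sketch only: it assumes a system with the procps `pkill` utility, that rrdtool runs as a direct child of the old lighttpd process, and that its process name is exactly `rrdtool`. The function names are made up for this example, and the caller has to supply the old lighttpd PID.

```python
import subprocess


def rrdtool_kill_command(lighttpd_pid):
    """Build the pkill command line (hypothetical helper).

    -P <pid> restricts matches to children of that parent process,
    -x requires the process name to match exactly.
    """
    return ["pkill", "-P", str(lighttpd_pid), "-x", "rrdtool"]


def kill_rrdtool_child(lighttpd_pid):
    # pkill exits with status 0 when at least one process was signalled.
    result = subprocess.run(rrdtool_kill_command(lighttpd_pid))
    return result.returncode == 0
```

Calling `kill_rrdtool_child(old_pid)` just before the graceful restart would correspond to the manual step described above; the broken pipe and trigger failed lines in the old instance's error log are expected.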

-- jbyers

Actions #6

Updated by stbuehler almost 16 years ago

  • Status changed from New to Fixed
  • % Done changed from 0 to 100

Applied in changeset r2401.
