Lighttpd - Bug #673: Connection error on Solarishttps://redmine.lighttpd.net/issues/673?journal_id=16312006-12-01T00:50:36ZAnonymous
<ul></ul><p>ioctl may return any negative value as an error, so I changed the line to check for a value less than 0. It hasn't been long since I made the change, but I have not seen this error log message appear since.</p>
<pre>
Line 221:
if (ioctl(con->fd, FIONREAD, &toread) < 0) {
</pre>
<p>-- joe</p> Lighttpd - Bug #673: Connection error on Solarishttps://redmine.lighttpd.net/issues/673?journal_id=16322006-12-04T01:50:16ZAnonymous
<ul></ul><p>The error message appears less often now. When it does occur, it is logged as the remote host dropping the connection or as a broken pipe. I assume this is because the browser disappeared, but there could still be a bug involved.</p>
<p>-- joe</p> Lighttpd - Bug #673: Connection error on Solarishttps://redmine.lighttpd.net/issues/673?journal_id=16332007-01-15T02:47:36ZAnonymous
<ul></ul><p>Follow-up: The code change helped, but not much. However, after digging around SunSolve and OpenSolaris, I turned up some information that suggests changing a TCP setting to work around the problem. The links are below. The setting is:</p>
<p>ndd -set /dev/tcp tcp_co_min 1500</p>
<p>1500 = MTU of your network interface card.</p>
<p><a class="external" href="http://sunsolve.sun.com/search/document.do?assetkey=1-1-4701102-1">http://sunsolve.sun.com/search/document.do?assetkey=1-1-4701102-1</a></p>
<p><a class="external" href="http://bugs.opensolaris.org/bugdatabase/view_bug.do;jsessionid=2387e881a19c7affffffffdbf791ee9a8d6b1?bug_id=4789772">http://bugs.opensolaris.org/bugdatabase/view_bug.do;jsessionid=2387e881a19c7affffffffdbf791ee9a8d6b1?bug_id=4789772</a></p>
<p>-- joe</p> Lighttpd - Bug #673: Connection error on Solarishttps://redmine.lighttpd.net/issues/673?journal_id=16342007-06-26T16:57:41Zingenthr
<ul></ul><p>After looking into this for a customer, I can say pretty confidently that the cause is not bug 4701102, as it was fixed back in 2003 and the changes for that fix are still in current Solaris/OpenSolaris code.</p>
<p>I checked with another engineer and learned that this may just be incorrect error handling on the stream when using the devpoll backend. In other words, this ioctl() can return an error while the stream is still readable. The best fix would probably be to change the error handling to anticipate a possible failure of this ioctl() when using this type of socket. The failure of this ioctl() in this case is not an indication of error.</p> Lighttpd - Bug #673: Connection error on Solarishttps://redmine.lighttpd.net/issues/673?journal_id=16352007-06-26T16:58:49Zingenthr
<ul></ul><p>One other note, this was investigated with 1.4.15, but I also looked at a couple of files in 1.4.11 and it doesn't appear the behavior in this area has changed at all.</p> Lighttpd - Bug #673: Connection error on Solarishttps://redmine.lighttpd.net/issues/673?journal_id=16362007-06-29T01:14:01ZAnonymous
<ul></ul><p>Hello ingenthr. So you would recommend to just simply ignore the return value of ioctl() altogether?</p>
<p>-- joe</p> Lighttpd - Bug #673: Connection error on Solarishttps://redmine.lighttpd.net/issues/673?journal_id=16372007-06-29T01:36:27Zingenthr
<ul></ul><p>I believe so. After checking with another engineer to verify, we believe that ioctl() is not necessary with this nonblocking stream socket, and the error message therefore isn't required either. It can then fall through to the buffer code and the read.</p>
<p>In fact, it may not be necessary in the Linux epoll or poll cases either. This style of check is normally not used with a nonblocking socket, and dropping it would remove a syscall from the critical path here. If the event mechanism in use (devpoll, epoll, poll()) says there's data there, it should be safe to do a read and check for errors from there. There could be something I'm not aware of on other implementations.</p>
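<p>To illustrate the approach described above, here is a minimal sketch of reading straight from a nonblocking descriptor and classifying the result, with no FIONREAD ioctl() beforehand. It is not the lighttpd code and not necessarily what the eventual fix looks like; the names read_status, set_nonblocking and read_ready are hypothetical and exist only for this example.</p>

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; these are not lighttpd names. */
enum read_status { READ_DATA, READ_AGAIN, READ_EOF, READ_ERROR };

/* Put fd into nonblocking mode. Returns 0 on success, -1 on failure. */
static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Called after the event mechanism reports fd as readable.
 * Reads up to cap bytes (cap must be > 0) without asking FIONREAD first.
 * On READ_DATA, *got holds the byte count; otherwise *got is 0.
 */
static enum read_status read_ready(int fd, char *buf, size_t cap, size_t *got) {
    ssize_t n;

    *got = 0;
    do {
        n = read(fd, buf, cap);          /* retry if interrupted by a signal */
    } while (n < 0 && errno == EINTR);

    if (n > 0) {
        *got = (size_t)n;
        return READ_DATA;
    }
    if (n == 0) return READ_EOF;         /* peer closed the connection */

    /* Spurious wakeup or data already consumed: wait for the next event. */
    if (errno == EAGAIN || errno == EWOULDBLOCK) return READ_AGAIN;

    return READ_ERROR;                   /* real failure, e.g. ECONNRESET */
}
```

<p>With this shape, the only error log message would come from READ_ERROR, and a readable-but-empty wakeup simply returns to the event loop.</p>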
<p>The one thing I'm certain of is that it is not related to bugid 4701102 or 4789772. Implementing the workaround from those descriptions on a very busy system had no effect, and both bugs have been closed and integrated for a couple of years. If it were that bug, the cause of which was notification propagating before data was available at the stream head, setting tcp_co_min to the MTU (or higher) would mean you couldn't get into that condition. It would, though, also have a negative effect on performance, so the workaround was more of a test to verify where the error was than a proper workaround. The fix was straightforward, and you can still see it in the OpenSolaris code for tcp.c to this day.</p>
<p>Do you still see those messages occasionally on your system as well? I would imagine you probably do, since we saw them even though the workaround was in place.</p>
<p>By the way, I'm matt dot ingenthron at sun dot com if you'd like to discuss directly and update the bug as needed.</p> Lighttpd - Bug #673: Connection error on Solarishttps://redmine.lighttpd.net/issues/673?journal_id=16382008-09-30T14:28:41Zstbuehler
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Fixed</i></li><li><strong>Resolution</strong> set to <i>fixed</i></li></ul><p>Fixed in r2317</p>