Project

General

Profile

Actions

Feature #2139

closed

Optimizing buffer_append_long()

Added by crypt over 14 years ago. Updated almost 8 years ago.

Status:
Fixed
Priority:
Normal
Category:
core
Target version:
-
ASK QUESTIONS IN Forums:

Description

lighttpd's internal buffer datatype has different functions in order to write data into the char array behind the scenes. One of them is buffer_append_long() which is as the name suggests, a function to append an integer to the buffer by using the internal function LI_ltostr(). The patch changes the calculation a bit to be able to deliver the result faster. I've made quite a few benchmarks for this as I needed a similar function in another program. And this little change improves the performance by a few percent - why should we give it away?


Files

buffer.patch (670 Bytes) buffer.patch crypt, 2009-12-31 14:35

Related issues 1 (0 open1 closed)

Related to Feature #2126: lighttpd patchesDuplicate2009-12-28Actions
Actions #1

Updated by stbuehler about 14 years ago

I had a look at the patch:
I think the compiler should be able to optimize the use of "x = val % 10, y = val / 10" - perhaps it doesn't, but i don't like optimizing it myself (on x86 the "DIV" command calculates both values anyway).
The swap optimization is even uglier - you could make the swap variable local to the loop, but i doubt XOR is faster than swapping with a temporary var (did you actually use "-O2" for benchmarking it?).

Actions #2

Updated by stbuehler about 14 years ago

  • Status changed from New to Wontfix
Actions #3

Updated by stbuehler about 14 years ago

  • Target version deleted (1.4.26)
Actions #4

Updated by crypt about 14 years ago

  • Assignee set to stbuehler

Sorry for my late response. The XOR-swap was just an attempt to remove the temporary variable. I think on most common architectures the way over a third variable should be imperceptibly slower than the way over bit manipulation. Under the few of readability you should leave this part.

But as you may noticed, the slow DIV is called twice even with -O23 and my optimization by rewriting the formula at this point makes the biggest performance benefit of the patch.

Ultimately, it's your decision, and there are surely many other points where divisions could be optimized in lighttpd, but as this function is called many times per request, I thought it could be a little improvement.

Actions #5

Updated by stbuehler about 14 years ago

I don't know who told you that, but it is wrong.

Example (i'm on x86_64):

You will see, that the swap with temporary var is definitely faster; temporary vars are almost always held in registers; and for xor you need to copy them into a register anyway (the compiler adds temporary vars for every calculation step anyway).

In the case with div/mod: if you don't know the divisor, using / and % is faster (only one idiv); if the divisor is 10, the resulting code is pretty much the same (just not really readable, gcc uses some magic foo here :D), so it doesn't matter.

Now, some of these things may behave differently in loops; perhaps the compiler can somehow optimize some variants better than others; but this very much depends on the compiler you are using, and so i prefer to use the clean way, which compilers should be able to optimize way better.

Perhaps you want to read http://dl.fefe.de/optimizer-isec.pdf

void swap1(char *a, char *b) {
    char tmp = *a; *a = *b; *b = tmp;
}

void swap2(char *a, char *b) {
    *a ^= *b ^= *a ^= *b;
}

gcc -S -O3 test_swaps.c

swap1:
    movzbl  (%rdi), %edx
    movzbl  (%rsi), %eax
    movb    %al, (%rdi)
    movb    %dl, (%rsi)
    ret

swap2:
    movzbl  (%rsi), %eax
    movl    %eax, %edx
    xorb    (%rdi), %dl
    xorl    %edx, %eax
    movb    %al, (%rsi)
    xorl    %edx, %eax
    movb    %al, (%rdi)
    ret

And here the results for div+mod:

int a, b;

void test_divmod1(int x, int y) {
    a = x % y;
    b = x / y;
}

void test_divmod2(int x, int y) {
    a = x - (b = x / y) * x;
}

void test_divmod3(int x) {
    a = x % 10;
    b = x / 10;
}

void test_divmod4(int x) {
    a = x - (b = x / 10) * 10;
}
test_divmod1:
    movl    %edi, %edx
    movl    %edi, %eax
    sarl    $31, %edx
    idivl    %esi
    movl    %edx, a(%rip)
    movl    %eax, b(%rip)
    ret

test_divmod2:
    movl    %edi, %edx
    movl    %edi, %eax
    sarl    $31, %edx
    idivl    %esi
    movl    %eax, %ecx
    movl    %eax, b(%rip)
    movl    $1, %eax
    subl    %ecx, %eax
    imull    %edi, %eax
    movl    %eax, a(%rip)
    ret

test_divmod3:
    movl    %edi, %eax
    movl    $1717986919, %edx
    imull    %edx
    movl    %edi, %eax
    sarl    $31, %eax
    sarl    $2, %edx
    subl    %eax, %edx
    leal    (%rdx,%rdx,4), %eax
    movl    %edx, b(%rip)
    addl    %eax, %eax
    subl    %eax, %edi
    movl    %edi, a(%rip)
    ret

test_divmod4:
    movl    %edi, %eax
    movl    $1717986919, %edx
    imull    %edx
    movl    %edi, %eax
    sarl    $31, %eax
    sarl    $2, %edx
    subl    %eax, %edx
    movl    %edx, b(%rip)
    leal    (%rdx,%rdx,4), %edx
    addl    %edx, %edx
    subl    %edx, %edi
    movl    %edi, a(%rip)
    ret
Actions #6

Updated by gstrauss about 8 years ago

  • Status changed from Wontfix to Fixed
  • Target version set to 1.4.x

optimized in r6afad87d

commit  6afad87d2ed66a48cda2a7c01dbcc59023774b73
Author: Stefan Bühler <stbuehler@web.de>
Date:   Sun Feb 8 12:37:10 2015 +0000

    git-svn-id: svn://svn.lighttpd.net/lighttpd/branches/lighttpd-1.4.x@2975 152afb58-edef-0310-8abb-c4023f1b3aa9
Actions #7

Updated by stbuehler almost 8 years ago

  • Assignee deleted (stbuehler)
  • Target version deleted (1.4.x)
Actions

Also available in: Atom