Feature #1355
closed
reduce sizeof(connection) from 736 to 708 (on IA-32)
Description
reduce sizeof(connection) from 736 to 708 (on IA-32)
Updated by ralf about 17 years ago
Why?
I don't think there is a real need to save these few bytes.
I'm not sure, but doesn't the use of bit masks "stress" the CPU a bit more than just using some integers?
Updated by Safari about 17 years ago
Setting a bit to 1 or 0 needs only one assembly instruction (on IA-32): orb or andb.
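As a rough illustration of the point above, here is a minimal C sketch of setting, clearing and testing a single flag bit. The flag names are hypothetical (only loosely modeled on fields such as file_started) and are not taken from the lighttpd source. Whether a given compiler really emits a single orb/andb for these helpers depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag names for illustration; not the lighttpd definitions. */
enum {
    CON_FILE_STARTED  = 1u << 0,
    CON_FILE_FINISHED = 1u << 1,
    CON_KEEP_ALIVE    = 1u << 2
};

/* Set flag(s): a single OR against the flags byte. */
static inline void flag_set(uint8_t *flags, uint8_t mask) {
    *flags |= mask;
}

/* Clear flag(s): a single AND with the inverted mask. */
static inline void flag_clear(uint8_t *flags, uint8_t mask) {
    *flags &= (uint8_t)~mask;
}

/* Test flag(s): returns 1 if any bit of mask is set, else 0. */
static inline int flag_test(uint8_t flags, uint8_t mask) {
    return (flags & mask) != 0;
}

/* Small self-check that exercises the three helpers together. */
static void flag_demo(void) {
    uint8_t flags = 0;
    flag_set(&flags, CON_FILE_STARTED);
    assert(flag_test(flags, CON_FILE_STARTED));
    flag_clear(&flags, CON_FILE_STARTED);
    assert(flags == 0);
}
```

Compiling this with something like gcc -O2 -S and reading the generated assembly is one way to check what the helpers turn into on a given target.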
Updated by ralf about 17 years ago
Hi Safari,
can I contact you by mail or something?
I have some questions about this, and I think it's not really lighttpd related.
Updated by ralf about 17 years ago
Replying to Safari:
Well, what do you have in mind?
OK.
I played around a bit with 32-bit on Linux,
and the bitmask function seems to have more, uhm, "function calls"?
Therefore I think (I know nearly nothing about asm) this ends up costing more CPU.
Updated by Safari about 17 years ago
Which is faster kind of depends on what you do, but with int you have to write the variables individually (one instruction per write),
whereas with bit fields you can test/set/clear up to 32 or 64 (or 128 with SSE*) bits at a time.
http://safari.iki.fi/testbits/testbits.c
http://safari.iki.fi/testbits/testbits.s
Which one is better now? =)
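To make the "many flags at once" argument concrete without relying on the linked files, here is a minimal C sketch. It is not the content of testbits.c; the struct and flag names are hypothetical. It contrasts one int per flag with the same flags packed into a 32-bit word, where a single mask covers several flags in one operation.

```c
#include <assert.h>
#include <stdint.h>

/* Variant 1: one int per flag (hypothetical field names). */
struct state_ints {
    int a, b, c, d;
};

/* Variant 2: the same four flags as bits in one 32-bit word. */
#define FLAG_A   (1u << 0)
#define FLAG_B   (1u << 1)
#define FLAG_C   (1u << 2)
#define FLAG_D   (1u << 3)
#define FLAG_ALL (FLAG_A | FLAG_B | FLAG_C | FLAG_D)

struct state_bits {
    uint32_t flags;
};

/* Clearing all flags: four separate assignments in the source
 * (an optimizing compiler may or may not merge the stores). */
static void ints_reset(struct state_ints *s) {
    s->a = 0; s->b = 0; s->c = 0; s->d = 0;
}

/* Clearing all flags: one AND with the inverted mask. */
static void bits_reset(struct state_bits *s) {
    s->flags &= ~FLAG_ALL;
}

/* Testing all flags: four comparisons. */
static int ints_all_set(const struct state_ints *s) {
    return s->a && s->b && s->c && s->d;
}

/* Testing all flags: one AND plus one comparison. */
static int bits_all_set(const struct state_bits *s) {
    return (s->flags & FLAG_ALL) == FLAG_ALL;
}

/* Small self-check showing both variants agree. */
static void state_demo(void) {
    struct state_ints i = { 1, 1, 1, 1 };
    struct state_bits b = { FLAG_ALL };
    assert(ints_all_set(&i) && bits_all_set(&b));
    ints_reset(&i);
    bits_reset(&b);
    assert(!ints_all_set(&i) && !bits_all_set(&b));
}
```

The advantage only shows up when several flags are handled together; for a single flag both variants come down to roughly one load plus one test, so this sketch says nothing about which is faster in practice.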
Updated by ralf about 17 years ago
OK, but this test says it's faster if you query (nearly) all variables.
But in a real-world program (lighttpd) this never happens.
I grepped around a bit with "{{{ grep -- '->file_started' *.c }}}" and there is (mostly?) only one check of a single part of the bitmask at a time.
However, I think the cost of this is too small to measure, but I haven't benchmarked it.
Updated by gstrauss over 8 years ago
- Description updated (diff)
- Status changed from New to Missing Feedback
- Assignee deleted (jan)
The number of assembly instructions is one factor. Memory cache usage (cache hits, cache eviction) is another. Bitmasks of n bits use less memory than an array of n ints. This might lead to fewer cache misses, and subsequently better performance. On the other hand, reading bits on some systems can take more assembly instructions than simply reading a value that is the same size as the CPU register. The additional cost to test bits can vary depending on the processor.
In any case, code changes should not be made for potential optimizations without demonstrable evidence (empirical data) of said optimization. This proposed change will not be considered without clear, reproducible benchmarks which (approximately) simulate lighttpd usage of bitmasks (not ideal use of all bits in the bitmask sequentially), and can show beneficial results across different architectures (and not detrimental on other architectures).
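The memory side of the trade-off described in this comment can be illustrated with a small C sketch that compares the size of a struct holding one int per flag with the same flags declared as 1-bit bit-fields. The field names are illustrative assumptions (only file_started is mentioned in this thread), and the structs are not the real lighttpd connection struct. Bit-field layout is implementation-defined, so the exact sizes depend on the compiler and ABI.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative subset of per-connection flags, one int each. */
struct conn_flags_int {
    int file_started;
    int file_finished;
    int keep_alive;
    int is_readable;
    int is_writable;
};

/* The same flags as 1-bit bit-fields sharing one storage unit. */
struct conn_flags_bits {
    unsigned int file_started  : 1;
    unsigned int file_finished : 1;
    unsigned int keep_alive    : 1;
    unsigned int is_readable   : 1;
    unsigned int is_writable   : 1;
};

/* Bytes saved per struct by packing the flags. */
static size_t bytes_saved(void) {
    return sizeof(struct conn_flags_int) - sizeof(struct conn_flags_bits);
}

/* Small self-check: packed flags behave like the int flags. */
static void size_demo(void) {
    struct conn_flags_bits b = { 0 };
    b.file_started = 1;
    assert(b.file_started == 1 && b.keep_alive == 0);
    assert(sizeof(struct conn_flags_bits) <= sizeof(struct conn_flags_int));
}
```

With a 4-byte int, as on common IA-32 and x86-64 ABIs, this is typically 20 bytes versus 4. Any effect of that saving on cache behavior would still have to be shown by the kind of benchmark requested above.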