Feature #1355

reduce sizeof(connection) from 736 to 708 (on IA-32)

Added by Safari about 12 years ago. Updated about 3 years ago.

Status: Missing Feedback
Priority: Low
Assignee: -
Category: core
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Missing in 1.5.x:

Description

reduce sizeof(connection) from 736 to 708 (on IA-32)

History

#1

Updated by ralf about 12 years ago

Why?

I don't think there is a real need to save these few bytes.

I'm not sure, but doesn't the use of bitmasks "stress" the CPU a bit more than just using some integers?

#2

Updated by Safari about 12 years ago

Setting a bit to 1 or 0 needs only one assembly instruction (on IA-32): orb or andb.
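
For illustration, a minimal C sketch of setting, clearing, and testing a single flag in a bitmask. The flag and function names are hypothetical and not taken from the lighttpd source. Whether the compiler really emits a single orb or andb depends on the compiler, the optimization level, and the target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag names, for illustration only. */
enum {
    FLAG_FILE_STARTED  = 1u << 0,
    FLAG_FILE_FINISHED = 1u << 1,
    FLAG_KEEP_ALIVE    = 1u << 2
};

/* Set the bits in mask: a single OR on the flags byte. */
static inline uint8_t flag_set(uint8_t flags, uint8_t mask)
{
    assert(mask != 0);
    return (uint8_t)(flags | mask);
}

/* Clear the bits in mask: a single AND with the inverted mask. */
static inline uint8_t flag_clear(uint8_t flags, uint8_t mask)
{
    assert(mask != 0);
    return (uint8_t)(flags & ~mask);
}

/* Return 1 if any bit in mask is set, 0 otherwise. */
static inline int flag_test(uint8_t flags, uint8_t mask)
{
    assert(mask != 0);
    return (flags & mask) != 0;
}
```

Compiling such a file with something like gcc -O2 -S and reading the generated assembly is one way to check which instructions are actually produced on a given target.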

#3

Updated by ralf about 12 years ago

Hi Safari,

Can I contact you by mail or similar?

I have some questions about this, and I think they are not really lighttpd-related.

#4

Updated by Safari about 12 years ago

Well, what do you have in mind?

#5

Updated by ralf about 12 years ago

Replying to Safari:

Well, what do you have in mind?

Ok.

I played around a bit with 32-bit on Linux,
and the bitmask function seems to have more, uhm, "function calls"?
Therefore I think (I know next to nothing about asm) this ends up costing more CPU.

#6

Updated by Safari about 12 years ago

Which one is faster kind of depends on what you do, but with ints you have to write the variables individually (one instruction per write),
whereas with bit fields you can test/set/clear up to 32 or 64 (or 128 with SSE*) bits at a time.

http://safari.iki.fi/testbits/testbits.c

http://safari.iki.fi/testbits/testbits.s

Which one is better now? =)
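
A rough C sketch of the difference described above, independent of the linked testbits files. The struct, flag, and function names are hypothetical. With separate ints each variable is written or read on its own, while a mask lets one AND clear several flags and one AND plus one compare test them. An optimizing compiler may merge adjacent int stores, so this shows the source-level difference only, not a measured one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags packed into one 32-bit word. */
#define F_FILE_STARTED  (1u << 0)
#define F_FILE_FINISHED (1u << 1)
#define F_KEEP_ALIVE    (1u << 2)
#define F_ALL           (F_FILE_STARTED | F_FILE_FINISHED | F_KEEP_ALIVE)

struct conn_ints { int file_started; int file_finished; int keep_alive; };
struct conn_bits { uint32_t flags; };

/* Separate ints: one store per variable. */
static inline void reset_ints(struct conn_ints *c)
{
    assert(c != NULL);
    c->file_started = 0;
    c->file_finished = 0;
    c->keep_alive = 0;
}

/* Bitmask: one AND clears all three flags, other bits are untouched. */
static inline void reset_bits(struct conn_bits *c)
{
    assert(c != NULL);
    c->flags &= ~F_ALL;
}

/* Bitmask: one AND plus one compare tests all three flags at once. */
static inline int all_set_bits(const struct conn_bits *c)
{
    assert(c != NULL);
    return (c->flags & F_ALL) == F_ALL;
}

/* Separate ints: each variable is read and tested individually. */
static inline int all_set_ints(const struct conn_ints *c)
{
    assert(c != NULL);
    return c->file_started && c->file_finished && c->keep_alive;
}
```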

#7

Updated by ralf about 12 years ago

OK, but this test says it's faster if you query (nearly) all of the variables.

But in a real-world program (lighttpd) this never happens.

I grepped around a bit (grep -- '->file_started' *.c) and there is (mostly?) only one query for a single part of the bitmask at a time.

However, I think the cost of this is immeasurable, but I haven't benchmarked it.

#8

Updated by gstrauss about 3 years ago

  • Description updated (diff)
  • Status changed from New to Missing Feedback
  • Assignee deleted (jan)

The number of assembly instructions is one factor. Memory cache usage (cache hits, cache eviction) is another. Bitmasks of n bits use less memory than an array of n ints. This might lead to fewer cache misses, and subsequently better performance. On the other hand, reading bits on some systems can take more assembly instructions than simply reading a value that is the same size as the CPU register. The additional cost to test bits can vary depending on the processor.
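
To make the memory side concrete, a small C sketch comparing eight flags stored as ints with the same flags stored as one-bit bit-fields. The struct and member names are hypothetical and this is not the actual layout of lighttpd's connection struct. Bit-field layout is implementation-defined, so the exact sizes depend on the compiler and ABI.

```c
#include <assert.h>
#include <stddef.h>

/* Eight hypothetical flags, each stored in a full int. */
struct flags_as_ints {
    int file_started, file_finished, keep_alive, in_error;
    int is_readable, is_writable, is_ssl, is_closing;
};

/* The same eight flags as one-bit bit-fields. */
struct flags_as_bits {
    unsigned int file_started  : 1;
    unsigned int file_finished : 1;
    unsigned int keep_alive    : 1;
    unsigned int in_error      : 1;
    unsigned int is_readable   : 1;
    unsigned int is_writable   : 1;
    unsigned int is_ssl        : 1;
    unsigned int is_closing    : 1;
};

/* Compile-time check (C11) that the packed form is the smaller one. */
static_assert(sizeof(struct flags_as_bits) < sizeof(struct flags_as_ints),
              "bit-field struct should be smaller than the int struct");

/* Bytes saved per struct by packing the flags. */
static inline size_t saved_bytes(void)
{
    return sizeof(struct flags_as_ints) - sizeof(struct flags_as_bits);
}
```

Smaller structs mean more of them fit in a cache line, but as noted above, that only matters if a benchmark shows it.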

In any case, code changes should not be made for potential optimizations without demonstrable evidence (empirical data) of said optimization. This proposed change will not be considered without clear, reproducible benchmarks which (approximately) simulate lighttpd usage of bitmasks (not ideal use of all bits in the bitmask sequentially), and can show beneficial results across different architectures (and not detrimental on other architectures).
