<<
 
>>
 
 
justin = {main feed , music , code , askjf , pubkey };
 
checking assumptions
July 15, 2014
Very often in computer code integers are divided by powers of two, such as 2, 4, 8, 16, 32, 64, and so on). These divisions are much faster than dividing by other numbers (since computers represent the underlying number in binary). In C/C++, there are two ways this type of division is typically expressed: normal division (x/256), or a shift (x<<8). These two methods produce the same results for non-negative values, and depending on the meaning of the code in question, (x/256) is often more readable, and thus I use it regularly.

If x is signed and is negative, division rounds towards 0, whereas the shift rounds towards negative infinity, but in situations where rounding of negative values is not important, I had generally assumed that modern compilers would generate similarly efficient code (reducing x/256 into a single x86 sar instruction).

It turns out, testing with a two modern compilers (and one slightly out of date compiler), this is not the case.

Here is the C code:

  void b(int r);
  void f(int v)
  {
    int i;
    for (i=0;i<v/256;i++) b(v > 0 ? v/65536 : 0);
  }

  void f2(int v)
  {
    int i;
    for (i=0;i<(v>>8);i++) b(v > 0 ? v >> 16 : 0);
  }

Ideally, a compiler should generate identical code for each of these functions. In the case of the loop counter, if v is less than 0, how it is rounded makes no difference. In the case of the parameter to b(), the code v >> 16 is only evaluated if v is known to be above 0.

Let's look at the output of some compilers (removing decoration and unrelated code). I've marked some code as bold to signify instructions that could be eliminated (with slight changes to the surrounding instructions):

Conclusions?

I'm not sure what to say here -- modern compilers can generate a lot of really good code, especially looking at floating point and SSE, but this makes me feel as though some of the basics have been neglected. If I were a better programmer I'd go dig into LLVM and GCC and submit patches.

I should have also tested ICC, but I've spent enough time on this, and the only ICC version we use is old enough that I would just regret not using the latest.

For comparison, here is what I would like to see LLVM generate for f():


f():                              # using divides
	cmpl	$256, %edi
	jl	LBB0_3            # if v is less than 256, skip loop

	movl	%edi, %ebx
	movl	%edi, %ebp
	shrl	$8, %ebx          # ebx = v/256, since v is non-negative
	shrl	$16, %ebp         # ebp = v/65536, since v is non-negative

LBB0_2:
	movl	%ebp, %edi
	callq	_b
	decl    %ebx
	jnz	LBB0_2
LBB0_3:

Performance-wise, I'm sure they wouldn't differ in any meaningful way, but the decrease in size would be nice.

Finally: write for the compiler you have, not the compiler you wish you had. When performance is important, use shifts instead of divides, or use unsigned types (which really should generate the same code for (x/256) vs (x>>8)). Move as much logic out of the loop as you can -- yes, the compiler might be able to do it for you, but why depend on that? But most important of all: test your assumptions.
5 Comments:

Posted by Eugene on Wed 16 Jul 2014 at 10:30 from 95.31.43.x

I've found out with clang that if you use unsigned integer type, it will generate code close to what you expect.

You can use the site
gcc.gnubolt.orgto experiment.

Posted by Eugene on Wed 16 Jul 2014 at 10:31 from 95.31.43.x

sorry, gcc.godbolt.org

Posted by Eugene on Wed 16 Jul 2014 at 10:33 from 95.31.43.x

Here's permalink to the example I'm talking about:

goo.gl/iYu8dw

Posted by Justin on Wed 16 Jul 2014 at 13:08 from 198.228.200.x

Yeah, in my conclusion I mentioned unsigned types as being a solution. It would be nice if it were smarter when operating on signed integers that are known to be non-negative.

Posted by Justin on Wed 16 Jul 2014 at 13:09 from 198.228.200.x

Here's a function for which LLVM generates the code I requested:

void f3(int v)
{
  if (v>=256) {
    int i=v>>8;
    const int p = v>>16;
    do {
      b(p);
    } while (--i);
  }
}

Add comment:

Name:
Human?: (no or yes, patented anti crap stuff here)
Comment:
search : rss : recent comments : Copyright © 2024 Justin Frankel