
Unroll your loops!

Today, while refactoring some code, I had an epiphany I’d like to share. It’s about why you should choose clean code over apparent performance: your gut feeling about performance is usually wrong.

What I started with was something that conceptually was like the following code:

function doTheThing( ) {
  var sum = 0;
  var maxValue = 0;

  for each ( var item in bunchOfItems ) {
    sum += item.someValue;
    maxValue = Math.max(item.someOtherValue, maxValue);
  }

  // ...
}

Basically I loop over a bunch of items and calculate the sum of one field and the max value of another field. I was doing a lot more things in my actual code, of course, but the example above suffices to show what I mean.

The refactoring I wanted to do was to extract methods, and (among other things) place the summation and the finding of the max value in their own methods, a quite simple refactoring:

function doTheThing( ) {
  var sum = getSumOfSomething();
  var maxValue = getMaxOfSomethingElse();

  // ...
}

function getSumOfSomething( ) {
  var sum = 0;

  for each ( var item in bunchOfItems ) {
    sum += item.someValue;
  }

  return sum;
}

function getMaxOfSomethingElse( ) {
  var maxValue = 0;

  for each ( var item in bunchOfItems ) {
    maxValue = Math.max(item.someOtherValue, maxValue);
  }

  return maxValue;
}

Having done this, my gut feeling tells me that the resulting code is less efficient than the original (it is certainly more code, but cleaner code). In the original I loop over the list once, but in the refactored version I loop over the same list twice. That has to be bad, doesn’t it?

Now, I usually don’t listen to my gut feeling about speed optimizations, because I know that premature optimization is the root of all evil. Most likely it wouldn’t even matter if the code were less efficient; there is probably a worse bottleneck somewhere else. I don’t optimize when I refactor; I save that for later and use a profiler to be sure. However, I still have that gut feeling. What makes my gut so sure? Is it even right?

For maximum effect, at this point you should decide too, and see if your gut feeling is anything to trust.

Turns out it’s not.

I had my epiphany when I unrolled the loops in my head, just for fun: There is actually no difference between the two versions. It doesn’t matter that the loop runs twice, because each only does half the job. There is some overhead in the method call, and the setting up of the loop, but it is negligible.

Unrolling the loops makes it obvious:

Original version

sum += items[0].someValue;
maxValue = Math.max(items[0].someOtherValue, maxValue);
sum += items[1].someValue;
maxValue = Math.max(items[1].someOtherValue, maxValue);
sum += items[2].someValue;
maxValue = Math.max(items[2].someOtherValue, maxValue);
// and so on...

Refactored version

// getSumOfSomething
sum += items[0].someValue;
sum += items[1].someValue;
sum += items[2].someValue;
// and so on...

// getMaxOfSomethingElse
maxValue = Math.max(items[0].someOtherValue, maxValue);
maxValue = Math.max(items[1].someOtherValue, maxValue);
maxValue = Math.max(items[2].someOtherValue, maxValue);
// and so on...

Save for the order in which the statements are executed, there is no difference at all.
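The equivalence is easy to check directly. Here is a minimal sketch in TypeScript (the field and function names mirror the example above; the sample data is made up) asserting that the fused loop and the two split loops produce identical results:

```typescript
interface Item {
  someValue: number;
  someOtherValue: number;
}

const bunchOfItems: Item[] = [
  { someValue: 3, someOtherValue: 7 },
  { someValue: 5, someOtherValue: 2 },
  { someValue: 1, someOtherValue: 9 },
];

// Original shape: one loop computes both results.
function doTheThingFused(items: Item[]): { sum: number; maxValue: number } {
  let sum = 0;
  let maxValue = 0;
  for (const item of items) {
    sum += item.someValue;
    maxValue = Math.max(item.someOtherValue, maxValue);
  }
  return { sum, maxValue };
}

// Refactored shape: two loops, each doing half the job.
function getSumOfSomething(items: Item[]): number {
  let sum = 0;
  for (const item of items) {
    sum += item.someValue;
  }
  return sum;
}

function getMaxOfSomethingElse(items: Item[]): number {
  let maxValue = 0;
  for (const item of items) {
    maxValue = Math.max(item.someOtherValue, maxValue);
  }
  return maxValue;
}
```

Both shapes execute the same additions and the same `Math.max` calls, just in a different order, so the results cannot differ.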

This might have been obvious, and I hope it’s obvious to everyone at this point, but I wouldn’t be surprised if many of you had the same gut feeling as I had. Actually, even Martin Fowler makes this mistake in his book Refactoring:

The other concern with this refactoring lies in performance. The old code executed the “while” loop once, the new code executes it three times. A while loop that takes a long time might impair performance. Many programmers would not do this refactoring simply for this reason. But note the words if and might. Until I profile I cannot tell how much time is needed for the loop to calculate or whether the loop is called often enough for it to affect the overall performance of the system. Don’t worry about this while refactoring. When you optimize you will have to worry about it, but you will then be in a much better position to do something about it, and you will have more options to optimize effectively.

Martin Fowler: Refactoring, p. 32

It should be said that Fowler’s example is very similar, but not identical. Even so, if you unroll the loops in his example you find that it too executes the same statements, albeit with the overhead of setting up an iterator three times, and two extra method calls.

Why is it that we think that running a loop twice must be twice as slow? Why do we place more significance in the loop than in what the loop does?

I think this fallacy needs a name (perhaps it already has one?); please write suggestions in the comments.

5 Responses to “Unroll your loops!”

  1. ronen Says:

It’s all a matter of which operation you’re measuring the runtime of (the summation and max operations in your case). Let’s imagine a case where you have methods calcA and calcB, each costing O(1), used in the following loop:

    int acc = 0, temp = 0;
    for (int i = 0; i < n; i++) {
      temp += calcA(i);
      acc += calcB(temp);
    }

    The runtime is O((1+1)n) = O(n).

    Now let’s unroll this loop into two separate loops:

    int methodA(int k) {
      int temp = 0;
      for (int j = 0; j <= k; j++) {
        temp += calcA(j);
      }
      return temp;
    }

    int methodB(int n) {
      int acc = 0;
      for (int i = 0; i < n; i++) {
        acc += calcB(methodA(i));
      }
      return acc;
    }

    For each calcB we now run calcA i + 1 times, which makes the runtime (1+1) + (1+2) + … + (1+n) = O(n^2). This is quadratic time instead of linear time! For a simple split like this it’s visible and can be prevented by a worthy programmer; the problem is that real-life code isn’t as clear as in this case. Still, you are correct, since usually most of the code doesn’t reside within the 10% in which the program spends 90% of its processing time (therefore making such optimizations worthless).
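Ronen’s point can be made concrete by counting calls. In this sketch (TypeScript; calcA and calcB are hypothetical stand-ins for O(1) work), the fused loop calls calcA once per iteration, while the split version recomputes the running sum from scratch each time, for n(n+1)/2 calls in total:

```typescript
let calcACalls = 0; // instrumentation: counts how often calcA runs

function calcA(i: number): number {
  calcACalls++;
  return i;
}

function calcB(t: number): number {
  return t;
}

// Fused loop: calcA runs exactly n times.
function fused(n: number): number {
  let acc = 0;
  let temp = 0;
  for (let i = 0; i < n; i++) {
    temp += calcA(i);
    acc += calcB(temp);
  }
  return acc;
}

// Split version: methodA rebuilds the running sum for every i,
// so calcA runs 1 + 2 + ... + n = n(n+1)/2 times.
function methodA(k: number): number {
  let temp = 0;
  for (let j = 0; j <= k; j++) {
    temp += calcA(j);
  }
  return temp;
}

function methodB(n: number): number {
  let acc = 0;
  for (let i = 0; i < n; i++) {
    acc += calcB(methodA(i));
  }
  return acc;
}
```

Both versions return the same result; only the call counts differ, which is exactly the interdependence problem that the article’s sum/max example does not have.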

  2. Theo Says:

    You’re right of course, but it’s also a very different example. The point of mine was that the parts of the loop body weren’t interdependent. In your example the parts don’t do half the work each. It’s less likely that you’d want to refactor a piece of code that worked like that, because the parts depend on each other. I’m not claiming that any loop can be split apart while maintaining the same time complexity.

  3. bjorn Says:

    Good one Theo, my gut told me the same thing yours did .. but when thinking about it I can’t see why I thought setting up a loop should be so costly :-)

  4. Severin Says:

    Well, I’d say your guts were right, at least if this code is really a performance bottleneck: if you did, let’s say, 1000000 iterations of doTheThing(), then the first version would be substantially faster, at least if there’s only a little bunch of items to iterate locally. Why? Because method calls in AS3 are much slower (I’m talking of factor 100) than assignments. But I agree that, if you don’t have to do the thing many, many times, there’s only a theoretical difference. Ehm, hope my 2 dirty cents made themselves kind of clear ;)

  5. Theo Says:

    The lesson here is that you shouldn’t optimize while you refactor.
