
Stack Overflow

Do compilers produce better code for do-while loops versus other types of loops?

Question (68 votes, 7 favorites)

There's a comment in the zlib compression library (which is used in the Chromium project among many others) which implies that a do-while loop in C generates "better" code on most compilers. Here is the snippet of code where it appears.

    do {
    } while (*(ushf*)(scan+=2) == *(ushf*)(match+=2) &&
             *(ushf*)(scan+=2) == *(ushf*)(match+=2) &&
             *(ushf*)(scan+=2) == *(ushf*)(match+=2) &&
             *(ushf*)(scan+=2) == *(ushf*)(match+=2) &&
             scan < strend);
    /* The funny "do {}" generates better code on most compilers */

https://code.google.com/p/chromium/codesearch#chromium/src/third_party/zlib/deflate.c&l=1225

Is there any evidence that most (or any) compilers would generate better (e.g. more efficient) code?

Update: Mark Adler, one of the original authors, gave a bit of context in the comments.

Tags: c, performance, compiler

asked Nov 24 at 7:59, edited Nov 25 at 4:29, by Dennis

By the way, to clarify, this is not part of Chromium. As you can deduce from the URL, this is a "3rd-party" project, and if you look at it even more closely, you can perceive that this code is from ZLib, a widely used, general-purpose compression library. (H2CO3, Nov 24 at 8:22, 5 votes)

The funny "do {}" generates better code --- better than what? Than funny while() or than boring, regular do{}? (n.m., Nov 24 at 8:41, 1 vote)

@H2CO3 thank you for the clarification, I've edited the question to be more specific about the origin. (Dennis, Nov 24 at 9:31)

That comment was written more than 18 years ago in the era of Borland and Sun C compilers. Any relevance to compilers today would be purely accidental. Note that this particular usage of do, as opposed to just a while, does not avoid a conditional branch. (Mark Adler, Nov 25 at 1:45, 31 votes)

6 Answers

Accepted answer (81 votes)

First of all:

A do-while loop is not the same as a while-loop or a for-loop.

while and for loops may not run the loop body at all.
A do-while loop always runs the loop body at least once - it skips the initial condition check.

So that's the logical difference. That said, not everyone strictly adheres to this. It is quite common for while or for loops to be used even when it is guaranteed that they will always loop at least once. (Especially in languages with foreach loops.)

So to avoid comparing apples and oranges, I'll proceed assuming that the loop will always run at least once. Furthermore, I won't mention for loops again since they are essentially while loops with a bit of syntactic sugar for a loop counter.

So I'll be answering the question:

If a while loop is guaranteed to loop at least once, is there any performance gain from using a do-while loop instead?

A do-while skips the first condition check. So there is one less branch and one less condition to evaluate.

If the condition is expensive to check, and you know you're guaranteed to loop at least once, then a do-while loop could be faster.

And while this is considered a micro-optimization at best, it is one that the compiler can't always do: specifically, when the compiler is unable to prove that the loop will always enter at least once.

In other words, a while-loop:

    while (condition){
        body
    }

Is effectively the same as this:

    if (condition){
        do{
            body
        }while (condition);
    }

If you know that you will always loop at least once, that if-statement is extraneous.

Likewise at the assembly level, this is roughly how the different loops compile:

do-while loop:


    start:
        body
        test
        conditional jump to start

while-loop:

        test
        conditional jump to end
    start:
        body
        test
        conditional jump to start
    end:

Note that the condition has been duplicated. An alternate approach is:

        unconditional jump to end
    start:
        body
    end:
        test
        conditional jump to start

... which trades away the duplicate code for an additional jump.

Either way, it's still worse than a normal do-while loop. That said, compilers can do what they want. And if they can prove that the loop always enters once, then they have done the work for you.

But things are a bit weird for the particular example in the question because it has an empty loop body. Since there is no body, there's no logical difference between while and do-while.

FWIW, I tested this in Visual Studio 2012:

With the empty body, it does actually generate the same code for while and do-while. So that part is likely a remnant of the old days when compilers weren't as great.
But with a non-empty body, VS2012 manages to avoid duplication of the condition code, but still generates an extra conditional jump.

So it's ironic that while the example in the question highlights why a do-while loop could be faster in the general case, the example itself doesn't seem to give any benefit on a modern compiler.

Considering how old the comment was, we can only guess at why it would matter. It's very possible that the compilers at the time weren't capable of recognizing that the body was empty. (Or if they did, they didn't use the information.)

answered Nov 24 at 8:02, edited Nov 27 at 8:01, by Mysticial

So is checking the condition one less time such a great advantage? I highly doubt that. Run the loop 100 times and it becomes entirely insignificant. (H2CO3, Nov 24 at 8:20, 11 votes)

@H2CO3 But what if the loop only runs once or twice? And what about that increased code-size from the duplicated condition code? (Mysticial, Nov 24 at 8:22, 5 votes)

@Mysticial If a loop runs only once or twice, then that loop isn't worth optimizing. And the increased code size is... not a solid argument, at best. It's not a requirement that every compiler implement it the way you showed. I have written a compiler for my own toy language, and the compilation of while loops is implemented with an unconditional jump to the beginning of the loop, so code for the condition is only emitted once. (H2CO3, Nov 24 at 8:25, 5 votes)

@H2CO3 "If a loop runs only once or twice, then that loop isn't worth optimizing." - I beg to differ. It could be inside another loop. Tons of my own highly optimized HPC code is like this. And yes the do-while does make a difference. (Mysticial, Nov 24 at 8:27, 20 votes)

@H2CO3 Where did I say that I was encouraging it? The question asks is a do-while loop faster than a while loop. And I answered the question by saying it can be faster. I didn't say by how much. I didn't say whether it was worthwhile. I didn't recommend anybody to start converting to do-while loops. But simply denying that there is a possibility of an optimization, even if it's a small one, is in my opinion a disservice to those who do care and are interested in these things. (Mysticial, Nov 24 at 20:17, 17 votes)

Answer (18 votes)

> Is there any evidence that most (or any) compilers would generate better (e.g. more efficient) code?

Not much, unless you look at the actual generated assembly of an actual, specific compiler on a specific platform with some specific optimization settings.

This was probably worth worrying about decades ago (when ZLib was written), but certainly not nowadays, unless you found, by real profiling, that this removes a bottleneck from your code.

answered Nov 24 at 8:28 by H2CO3

Well put - the phrase premature optimization comes to mind here. (James Snell, Nov 24 at 15:17, 9 votes)

@JamesSnell exactly. And that's what the top rated answer supports/encourages. (H2CO3, Nov 24 at 17:06)

I don't think the top rated answer encourages premature optimization. I would argue that it shows a difference in efficiency is possible, however slight or insignificant it may be. But people interpret things differently and some may see it as a sign to start using do-while loops when not necessary (I hope not). Anyway, I'm glad with all of the answers so far. They provide valuable information w.r.t. the question and generated interesting discussion. (Dennis, Nov 25 at 4:56, 13 votes)

@Dennis Yeah, they generated discussion, for sure... (H2CO3, Nov 25 at 17:33, 1 vote)
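
For readers who want to follow the advice above and inspect what their own compiler does, here is a minimal sketch in C. The function names (sum_while, sum_rotated, sum_do_while) are made up for illustration and come neither from zlib nor from any answer in this thread. It puts the three loop shapes from the accepted answer side by side: a plain while, the same loop rewritten by hand as a guard plus do-while, and a bare do-while that is only valid when the caller guarantees at least one iteration.

```c
#include <stddef.h>

/* Plain while loop: the condition is tested before the first pass. */
long sum_while(const int *a, size_t n) {
    long total = 0;
    size_t i = 0;
    while (i < n) {
        total += a[i];
        i++;
    }
    return total;
}

/* The same loop rewritten as "if + do-while": one guard test up front,
 * then the condition is tested only at the bottom of the loop. */
long sum_rotated(const int *a, size_t n) {
    long total = 0;
    size_t i = 0;
    if (i < n) {
        do {
            total += a[i];
            i++;
        } while (i < n);
    }
    return total;
}

/* Bare do-while: no guard at all.
 * Assumption: the caller guarantees n >= 1, otherwise a[0] is read
 * out of bounds. */
long sum_do_while(const int *a, size_t n) {
    long total = 0;
    size_t i = 0;
    do {
        total += a[i];
        i++;
    } while (i < n);
    return total;
}
```

Compiling this with something like gcc -O2 -S (or the equivalent option of another compiler) and comparing the three functions shows whether the extra guard test survives on that particular compiler and platform. The result may well differ between compilers and versions, which is the point the answer above makes.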

Answer (9 votes)

In a nutshell (tl;dr):

I'm interpreting the comment in the OP's code a little differently. I think the "better code" they claim to have observed was due to moving the actual work into the loop "condition". I completely agree, however, that it's very compiler specific and that the comparison they made, while able to produce slightly different code, is mostly pointless and probably obsolete, as I show below.

Details:

It's hard to say what the original author meant by his comment about this do {} while producing better code, but I'd like to speculate in another direction than what was raised here - we believe that the difference between do {} while and while {} loops is pretty slim (one less branch, as Mysticial said), but there's something even "funnier" in this code, and that's putting all the work inside this crazy condition and keeping the internal part empty (do {}).

I've tried the following code on gcc 4.8.1 (-O3), and it gives an interesting difference:

    #include <stdio.h>

    int main (){
        char buf[10];
        char *str = "hello";
        char *src = str, *dst = buf;

        char res;
        do {                            // loop 1
            res = (*dst++ = *src++);
        } while (res);
        printf ("%s\n", buf);

        src = str;
        dst = buf;
        do {                            // loop 2
        } while (*dst++ = *src++);
        printf ("%s\n", buf);

        return 0;
    }

After compiling:

    00000000004003f0 <main>:
      ...
      ; loop 1
      400400: 48 89 ce       mov    %rcx,%rsi
      400403: 48 83 c0 01    add    $0x1,%rax
      400407: 0f b6 50 ff    movzbl 0xffffffffffffffff(%rax),%edx
      40040b: 48 8d 4e 01    lea    0x1(%rsi),%rcx
      40040f: 84 d2          test   %dl,%dl
      400411: 88 16          mov    %dl,(%rsi)
      400413: 75 eb          jne    400400 <main+0x10>
      ...
      ; loop 2
      400430: 48 83 c0 01    add    $0x1,%rax
      400434: 0f b6 48 ff    movzbl 0xffffffffffffffff(%rax),%ecx
      400438: 48 83 c2 01    add    $0x1,%rdx
      40043c: 84 c9          test   %cl,%cl
      40043e: 88 4a ff       mov    %cl,0xffffffffffffffff(%rdx)
      400441: 75 ed          jne    400430 <main+0x40>
      ...

So the first loop does 7 instructions while the second does 6, even though they're supposed to do the same work. Now, I can't really tell if there's some compiler smartness behind this; probably not, and it's just coincidental, but I haven't checked how it interacts with other compiler options this project might be using.

On clang 3.3 (-O3), on the other hand, both loops generate this 5-instruction code:

      400520: 8a 88 a0 06 40 00    mov    0x4006a0(%rax),%cl
      400526: 88 4c 04 10          mov    %cl,0x10(%rsp,%rax,1)
      40052a: 48 ff c0             inc    %rax
      40052d: 48 83 f8 05          cmp    $0x5,%rax
      400531: 75 ed                jne    400520 <main+0x20>

Which just goes to show that compilers are quite different, and advancing at a far faster rate than some programmers may have anticipated several years ago. It also means that this comment is pretty meaningless and probably there because no one had ever checked if it still makes sense.

Bottom line - if you want to optimize to the best possible code (and you know what it should look like), do it directly in assembly and cut the "middle-man" (compiler) from the equation, but take into account that newer compilers and newer HW might make this optimization obsolete. In most cases it's far better to just let the compiler do that level of work for you, and focus on optimizing the big stuff.

Another point that should be made - instruction count (assuming this is what the original OP's code was after) is by no means a good measurement of code efficiency. Not all instructions were created equal, and some of them (simple reg-to-reg moves, for example) are really cheap as they get optimized by the CPU. Other optimizations might actually hurt CPU internal optimizations, so eventually only proper benchmarking counts.

answered Nov 24 at 10:06, edited 2 days ago, by Leeor

It looks like it saves a register move.
mov %rcx,%rsi :) I can see how rearranging code can do that. (Mysticial, Nov 24 at 10:10)

@Mysticial, you're right about micro optimization though. Sometimes even saving a single instruction isn't worth anything (and reg-to-reg moves should be almost free with reg renaming today). (Leeor, Nov 24 at 10:13)

It doesn't seem that move renaming was implemented until AMD Bulldozer and Intel Ivy Bridge. That's a surprise! (Mysticial, Nov 24 at 10:33)

@Mysticial, note that these are roughly the first processors implementing a physical register file. Old out-of-order designs just place the register in the reorder buffer, where you can't do that. (Leeor, Nov 24 at 10:41)

Looks like you interpreted the comment in the original code differently than most, and it makes sense. The comment says "the funny do{}.." but doesn't say to what non-funny version it compares. Most people know the difference between do-while and while, so my guess is that "the funny do{}" didn't apply to that, but to the loop-unrolling and/or the lack of the extra assignment, as you showed here. (Abel, Nov 27 at 1:14, 2 votes)

Answer (7 votes)

A while loop is often compiled as a do-while loop with an initial branch to the condition, i.e.

        bra $1               ; unconditional branch to the condition
    $2:
        ; loop body
    $1:
        tst <condition>      ; the condition
        brt $2               ; branch if condition true

whereas the compilation of a do-while loop is the same without the initial branch. You can see from that that it's inherently less efficient by the cost of the initial branch, which is however only paid once. [Compare to the naive way of implementing while, which requires both a conditional branch and an unconditional branch per iteration.]

Having said that, they aren't really comparable alternatives. It is painful to transform a while loop into a do-while loop and vice versa. They do different things. And in this case the several method calls would totally dominate whatever the compiler did with while as against do-while.

answered Nov 24 at 9:08, edited Nov 26 at 22:27, by EJP

Answer (3 votes)

The remark is not about the choice of the control statement (do vs. while), it is about the loop unrolling!

As you can see, this is a string comparison function (string elements probably being 2 bytes long), which could have been written with a single comparison rather than four in a short-circuit AND expression.

This latter implementation is for sure faster, as it does a single check of the end-of-string condition after every four element comparisons, whereas standard coding would involve one check per comparison. Said differently, 5 tests per 4 elements vs. 8 tests per 4 elements.

Anyway, it will work only if the string length is a multiple of four or has a sentinel element (so that the two strings are guaranteed to differ past the strend border). Pretty risky!

answered Nov 26 at 21:20 by Yves Daoust
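
The test-count argument above can be sketched in portable C. This is an index-based illustration, not zlib's code: the names match_len_simple and match_len_unrolled are hypothetical, and the tail loop is an addition that goes beyond the original snippet (an assumption made here so that no sentinel element or multiple-of-four length is required).

```c
#include <stddef.h>
#include <stdint.h>

/* One element per iteration: a bounds test plus an equality test
 * for every element, i.e. 8 tests per 4 matching elements. */
size_t match_len_simple(const uint16_t *a, const uint16_t *b, size_t n) {
    size_t i = 0;
    while (i < n && a[i] == b[i])
        i++;
    return i;
}

/* Unrolled by four: a single bounds test guards four equality tests,
 * i.e. 5 tests per 4 matching elements in the main loop. */
size_t match_len_unrolled(const uint16_t *a, const uint16_t *b, size_t n) {
    size_t i = 0;
    while (n - i >= 4 &&
           a[i]     == b[i]     && a[i + 1] == b[i + 1] &&
           a[i + 2] == b[i + 2] && a[i + 3] == b[i + 3])
        i += 4;
    /* Tail loop (not in the original snippet): finds the exact mismatch
     * position inside the last group and handles lengths that are not
     * a multiple of four. */
    while (i < n && a[i] == b[i])
        i++;
    return i;
}
```

Both functions return the number of leading elements on which the two arrays agree. Whether the unrolled form is actually faster depends on the compiler and target, since an optimizer may unroll the simple version on its own.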


That's an interesting observation and something that everyone has overlooked until now. But wouldn't a compiler have no effect on that? In other words, it would always be more efficient regardless of which compiler is used. So then why is there a comment that mentions compilers? (Dennis, Nov 27 at 1:02)

@Dennis: different compilers have different ways of optimizing generated code. Some may do loop-unrolling themselves (to some extent) or optimize away assignments. Here the coder forces the compiler into the loop-unrolling, making less optimizing compilers perform well still. I think Yves is exactly right about his assumptions, but without the original coder around, it remains a bit of a mystery what the real thought was behind the "funny" remark. (Abel, Nov 27 at 1:24)

@Abel thanks for clarifying, I understand the (assumed) meaning behind the comment better now. Yves definitely came closest to solving the mystery behind the comment, but I'm going to accept Mysticial's answer since I think he answered my question best. Turns out I was asking the wrong question because the comment misled me to focus on the type of loop, while it's probably referring to the condition. (Dennis, Nov 27 at 4:36, 1 vote)

Answer (0 votes)

This discussion of while vs. do efficiency is completely pointless in this case, as there is no body.

    while (Condition)
    {
    }

and

    do
    {
    }
    while (Condition);

are absolutely equivalent.

answered Nov 27 at 7:27 by Yves Daoust
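
The equivalence for an empty body can be checked with a small counting sketch. The helper condition() and the two wrapper functions are hypothetical names introduced here for illustration; the counter simply records how often the condition is evaluated in each form.

```c
/* Shared state for the illustration. */
static int calls;      /* how many times the condition was evaluated */
static int remaining;  /* how many more times the condition is true  */

/* Side-effecting condition, standing in for the work zlib does inside
 * its loop condition. */
static int condition(void) {
    calls++;
    return remaining-- > 0;
}

/* Empty-body while: the condition runs until it first returns false. */
int count_with_while(int n) {
    calls = 0;
    remaining = n;
    while (condition()) {
    }
    return calls;
}

/* Empty-body do-while: the (empty) body runs first, then the condition
 * is evaluated in exactly the same sequence as above. */
int count_with_do_while(int n) {
    calls = 0;
    remaining = n;
    do {
    } while (condition());
    return calls;
}
```

For any n >= 0 both functions evaluate the condition n + 1 times, so with an empty body the two forms differ only in spelling. The difference discussed earlier in the thread appears once the body does something.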