Question: Does Go runtime evaluate the for loop condition every iteration?


Does Go runtime evaluate the for loop condition every iteration?

Answers 3
Added at 2016-12-26 08:12

Here is a code snippet from the book "The Go Programming Language":

for t := 0.0; t < cycles*2*math.Pi; t += res {

It appears that the expression in the for loop condition t < cycles*2*math.Pi must be evaluated before every iteration of the for loop. Or, does the compiler optimize this by pre-calculating the result of the expression (assuming none of the variables change during the iteration)? Does the above style of coding affect performance?

nr: #1 dodano: 2016-12-26 09:12

It does not seems optimized, and is not listed in "Compiler And Runtime Optimizations".

As mentioned in this older discussion,

The gc compiler does not do any loop optimizations. One of the major goals of that compiler is to compile quickly. While improved optimization is always useful, it has to fit within that goal

But that could have changed. A good technique to see what is going on is illustrated in "Recursion And Tail Calls In Go", where you can look at the assembly code produced by a go program.
See also the more recent 2016 "Reversing GO binaries like a pro" article.
And "" can help too: see the assembly code here.

You can see the "t < cycles*2*math.Pi" part always evaluated

        movsd   xmm0, QWORD PTR [rbp-24]
        addsd   xmm0, xmm0
        movsd   xmm1, QWORD PTR .LC2[rip]
        mulsd   xmm0, xmm1
        ucomisd xmm0, QWORD PTR [rbp-8]
        seta    al
        test    al, al
nr: #2 dodano: 2016-12-26 09:12

This really depends on the Go version but go version go1.7 windows/amd64 does appear to calculate the value once.

Go code:

var cycles = 10.0
var res = 1000.0
for t := 0.0; t < cycles*2*math.Pi; t += res {

Asm code:

movsd   [rsp+58h+var_20], xmm0
mov     [rsp+58h+var_18], 0
mov     [rsp+58h+var_10], 0
lea     rax, qword_494500
mov     [rsp+58h+var_58], rax
lea     rax, [rsp+58h+var_20]
mov     [rsp+58h+var_50], rax
mov     [rsp+58h+var_48], 0
call    runtime_convT2E
mov     rax, [rsp+58h+var_40]
mov     rcx, [rsp+58h+a] ; a
mov     [rsp+58h+var_18], rax
mov     [rsp+58h+var_10], rcx
lea     rax, [rsp+58h+var_18]
mov     [rsp+58h+var_58], rax
mov     [rsp+58h+var_50], 1
mov     [rsp+58h+var_48], 1
call    fmt_Println
movsd   xmm0, cs:$f64_408f400000000000
movsd   xmm1, [rsp+58h+t]
addsd   xmm0, xmm1
movsd   [rsp+58h+t], xmm0
movsd   xmm1, cs:$f64_404f6a7a2955385e
ucomisd xmm1, xmm0
ja      loc_401083

f64_404f6a7a2955385e is a precalculated double value equal to 10 * 2 * math.Pi or 62.8318530718

Go compiler recently switched to SSA, so these kind of optimizations will just keep improving as they greatly benefit from it. For now SSA is only available on amd64:

Compiler Toolchain

This release includes a new code generation back end for 64-bit x86 systems, following a proposal from 2015 that has been under development since then. The new back end, based on SSA, generates more compact, more efficient code and provides a better platform for optimizations such as bounds check elimination.

1.8 should have it for all supported architectures:

Compiler Toolchain

Go 1.7 introduced a new compiler back end for 64-bit x86 systems. In Go 1.8, that back end has been developed further and is now used for all architectures.

nr: #3 dodano: 2016-12-26 11:12

The current Go compiler does not move loop invariant computations outside loops.

The list of passes of the compiler can be seen here

In the example by @creker, the compiler did constant folding, not loop invariant code motion.

As a side note, I did make a few months ago a LICM pass for the Go compiler

which does not improve performance very much on the typically used Go benchmarks. (I blame the atrocious register allocation :P)

Source Show
◀ Wstecz