As @rickster suggested, I looked at the x86 assembly generated for this file (simple.swift):
    func thingWithClosure(a: Int, b: (() -> Void)?) {
        println(a)
        b?()
    }

    thingWithClosure(3) {
        println("i'm a closure")
    }
    thingWithClosure(5, nil)
    thingWithClosure(2) {}
My assembly is pretty rusty, but I can squint a little...
The main section of the non-optimized generated x86 assembly, or at least part of it, is as follows:
        callq _swift_once
        movq __TZvOSs7Process11_unsafeArgvGVSs20UnsafeMutablePointerGS0_VSs4Int8__@GOTPCREL(%rip), %rax
        movq -64(%rbp), %rcx
        movq %rcx, (%rax)
        leaq l_metadata+16(%rip), %rdi
        movl $32, %r9d
        movl %r9d, %eax
        movl $7, %r9d
        movl %r9d, %edx
        movq %rax, %rsi
        movq %rdx, -80(%rbp)
        movq %rax, -88(%rbp)
        callq _swift_allocObject
        leaq __TF6simpleU_FT_T_(%rip), %rcx
        movq %rcx, 16(%rax)
        movq $0, 24(%rax)
        leaq __TPA__TTRXFo__dT__XFo_iT__iT__(%rip), %rcx
        movq %rcx, -16(%rbp)
        movq %rax, -8(%rbp)
        movq -16(%rbp), %rsi
        movl $3, %r9d
        movl %r9d, %edi
        movq %rax, %rdx
    --> callq __TF6simple16thingWithClosureFTSiGSqFT_T___T_
        movq $0, -24(%rbp)
        movq $0, -32(%rbp)
        movl $5, %r9d
        movl %r9d, %edi
        movq -72(%rbp), %rsi
        movq -72(%rbp), %rdx
    --> callq __TF6simple16thingWithClosureFTSiGSqFT_T___T_
        leaq l_metadata2+16(%rip), %rdi
        movq -88(%rbp), %rsi
        movq -80(%rbp), %rdx
        callq _swift_allocObject
        leaq __TF6simpleU0_FT_T_(%rip), %rcx
        movq %rcx, 16(%rax)
        movq $0, 24(%rax)
        leaq __TPA__TTRXFo__dT__XFo_iT__iT__3(%rip), %rcx
        movq %rcx, -48(%rbp)
        movq %rax, -40(%rbp)
        movq -48(%rbp), %rsi
        movl $2, %r9d
        movl %r9d, %edi
        movq %rax, %rdx
    --> callq __TF6simple16thingWithClosureFTSiGSqFT_T___T_
        xorl %eax, %eax
        addq $96, %rsp
        popq %rbp
        retq
        .cfi_endproc
I marked the places where the function is called with -->. Looking a few instructions above each callq, you can see where the argument a is moved into the r9d register (and from there into edi).
Similarly, optimized output:
        callq _swift_once
        movq __TZvOSs7Process11_unsafeArgvGVSs20UnsafeMutablePointerGS0_VSs4Int8__@GOTPCREL(%rip), %rax
        movq %r14, (%rax)
        movq $3, -24(%rbp)
        movq __TMdSi@GOTPCREL(%rip), %rbx
        addq $8, %rbx
        leaq -24(%rbp), %rdi
        movq %rbx, %rsi
        callq __TFSs7printlnU__FQ_T_
        leaq L___unnamed_1(%rip), %rax
        movq %rax, -48(%rbp)
        movq $13, -40(%rbp)
        movq $0, -32(%rbp)
        movq __TMdSS@GOTPCREL(%rip), %rsi
        addq $8, %rsi
        leaq -48(%rbp), %rdi
    --> callq __TFSs7printlnU__FQ_T_
        movq $5, -56(%rbp)
        leaq -56(%rbp), %rdi
        movq %rbx, %rsi
    --> callq __TFSs7printlnU__FQ_T_
        movq $2, -64(%rbp)
        leaq -64(%rbp), %rdi
        movq %rbx, %rsi
    --> callq __TFSs7printlnU__FQ_T_
        xorl %eax, %eax
        addq $48, %rsp
        popq %rbx
        popq %r14
        popq %rbp
        retq
        .cfi_endproc
Here, the compiler inlined the function, so I marked the println calls with --> instead.
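To make the inlining concrete, here is a rough sketch in current Swift syntax of what the optimized listing appears to boil down to: four direct prints and no call to thingWithClosure at all. This is my reading of the listing, not something the compiler reports. The names callsAsWritten, callsAfterInlining and lines are made up for illustration, and the sketch appends to an array instead of calling println so the two variants can be compared (which also means the closure here captures state, unlike the one in simple.swift).

```swift
// Variant 1: the calls as written in simple.swift, updated to current Swift syntax.
// Assumption: output is recorded in `lines` rather than printed.
func callsAsWritten() -> [String] {
    var lines: [String] = []
    func thingWithClosure(_ a: Int, _ b: (() -> Void)?) {
        lines.append(String(a))   // stands in for println(a)
        b?()                      // call the closure only if one was passed
    }
    thingWithClosure(3) { lines.append("i'm a closure") }
    thingWithClosure(5, nil)
    thingWithClosure(2) {}
    return lines
}

// Variant 2: what the -O listing seems to do after inlining.
// The optional check and the closure calls are gone; only the prints remain.
func callsAfterInlining() -> [String] {
    var lines: [String] = []
    lines.append(String(3))
    lines.append("i'm a closure")
    lines.append(String(5))
    lines.append(String(2))
    return lines
}
```

Both functions should produce the same four lines, which matches the four println calls in the optimized listing.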
I took an intro to x86 assembly on an emulated 16-bit processor many years ago, so I'm not going to pretend to know exactly what is happening here, but it seems to me that when compiling with -O the compiler outputs roughly equivalent code for each case (in terms of the number of instructions, though maybe not in terms of memory lookups, etc.). The println calls are interspersed with leaq (load effective address) instructions, so we might be jumping all over the place, but I'm not sure where to (maybe more instructions? maybe loading static data?), or whether that matters.
The non-optimized version uses far fewer instructions for the nil parameter case than for the closure cases, so the main difference may be in debug-build performance.
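One way to see why the two cases differ at the source level: nil and {} are different values of the optional closure type, so without optimization the compiler still has to build something callable for {}. In the non-optimized listing, the {} call is preceded by a callq _swift_allocObject and the nil call is not. A small sketch, where describe is a hypothetical helper and not part of simple.swift:

```swift
// Hypothetical helper: reports whether an optional closure is present.
func describe(_ b: (() -> Void)?) -> String {
    if b == nil {
        return "nil"              // nothing to call, nothing to build
    }
    return "closure"              // a real closure value, even if its body is empty
}

let forNil = describe(nil)        // "nil"
let forEmpty = describe({})       // "closure"
```

So an empty closure is still a closure as far as the callee is concerned, and only the optimizer can notice that calling it does nothing.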
Of course, this is x86, so it could be completely different on ARM... Perhaps the ARM assembly, LLVM IR, or Swift IR would shed more light on this?
If someone with a better understanding can clarify, I will gladly update this answer.