@blakespot So, both the "adjust flag" (AF) and the "parity flag" (PF) come from the 8080-family CPUs from the 1970s. Today they're almost completely unused. The parity flag is set if, in the low eight bits of the result, the number of set bits is even. Otherwise, it's cleared. The adjust flag is set if there's a "carry out" from the low four bits of the addition (and otherwise cleared). This was used for binary-coded decimal: that flag would indicate a carry from one 4-bit digit to the next.
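To make those two definitions concrete, here's a minimal Python model of them. The function names are mine, purely for illustration, and this only models the arithmetic, not how any CPU actually implements it.

```python
def parity_flag(result):
    """PF: 1 if the low eight bits of the result hold an even number of 1 bits."""
    low_byte = result & 0xFF
    return 1 if bin(low_byte).count("1") % 2 == 0 else 0


def adjust_flag_add(a, b):
    """AF for a + b: 1 if adding the low four bits carries out of bit 3."""
    return 1 if (a & 0xF) + (b & 0xF) > 0xF else 0


# BCD example: 0x19 + 0x09 is 0x22 in plain binary, but 19 + 9 = 28 in decimal.
# The low digits (9 + 9) overflow four bits, so AF is set, which tells the
# decimal-adjust step that the low digit needs correcting.
print(adjust_flag_add(0x19, 0x09))  # 1
print(parity_flag(0x22))            # 1 (two set bits, an even count)
```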

@blakespot Both of these flags are computed on every ADD or SUB instruction, which are extremely common, and 64-bit ARM has no such functionality. For example, to compute the parity flag on ARM, a subtraction turns into something like:

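subs w4, w4, w5            // the actual subtraction
dup v24.16b, w4            // copy the low byte of the result into a vector register
cnt v23.16b, v24.16b       // count the set bits in each byte
umov w22, v23.b[0]         // move the bit count back to a general register
and w22, w22, #1           // keep the low bit: 1 if the count is odd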

That's a lot of work for one subtraction (5x as many instructions), and that's not all of it: we still haven't computed AF.
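For a sense of what the missing AF step involves, here's a small Python model of AF for a 32-bit subtraction. It relies on the standard identity that bit 4 of `a ^ b ^ result` is the borrow coming out of the low four bits. The name `sub_with_af` is made up for this sketch, and this is not the instruction sequence Rosetta 2 actually emits.

```python
MASK32 = 0xFFFFFFFF


def sub_with_af(a, b):
    """Return (result, AF) for a 32-bit a - b, with AF as x86 defines it for SUB."""
    result = (a - b) & MASK32
    # Bit 4 of a ^ b ^ result is set exactly when the low nibble had to borrow.
    af = ((a ^ b ^ result) >> 4) & 1
    return result, af


print(sub_with_af(0x10, 0x01))  # (15, 1): low nibble 0 - 1 needs a borrow
print(sub_with_af(0x15, 0x03))  # (18, 0): low nibble 5 - 3 does not
```

On ARM that's a couple of extra logical instructions on top of the parity sequence, for every flag-setting subtraction.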

@blakespot But it's not really a lot of work for Intel: every x86 CPU just includes some logic that computes both AF and PF at the same time as doing the subtraction. So, since Apple design their own CPUs, they decided to also build this logic. When running Rosetta 2, the CPU is configured to enable this functionality (since it'd break the ARM specification to have it enabled all the time). With that set, Rosetta 2 does:

subs w4, w4, w5

And both flags are computed for it by the hardware.

@blakespot Rosetta 2 can also run in Linux VMs on Apple Silicon. In a VM, it isn't able to configure the host CPU, so it can't use this functionality. There are two other options. Either you can skip computing the flags, because they're mostly useless and most software won't care, or you can compute them the long way shown above. Rosetta 2 chooses the second option, and this mostly works out fine, because they have an "unused flags" optimisation that avoids the computation a lot of the time.
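I don't know the details of how Rosetta 2 implements that optimisation, but the general idea can be sketched as a backwards liveness pass: a flag result only needs computing if something might read it before the next instruction overwrites it. Everything in this Python sketch (the tuple layout, `flags_needed`, treating AF and PF as a single unit) is an assumption for illustration.

```python
def flags_needed(block):
    """For each instruction, report whether its AF/PF results must be computed.

    `block` is a list of (name, reads_flags, writes_flags) tuples, a toy
    stand-in for a translated basic block.
    """
    live = True  # be conservative: code after the block might read the flags
    needed = [False] * len(block)
    for i in range(len(block) - 1, -1, -1):
        _name, reads, writes = block[i]
        if writes:
            needed[i] = live   # only needed if someone reads them later
            live = False       # earlier values are overwritten here
        if reads:
            live = True        # this instruction consumes the incoming flags
    return needed


block = [
    ("sub", False, True),   # flags overwritten by the add before anyone looks
    ("mov", False, False),
    ("add", False, True),   # read by the conditional jump below
    ("jp", True, False),    # x86 "jump if parity" reads PF
]
print(flags_needed(block))  # [False, False, True, False]
```

In this toy block only the `add` has to pay for the long flag computation; the `sub` can be translated as a plain subtraction.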