[Fix] Strange uses of movzx by Clang and GCC

While coding in C, I recently stumbled upon some uses of movzx by both the Clang and GCC compilers that I did not expect. Here is the code I tried with GCC 12:

#include <stdint.h>

int add2bytes(uint8_t* a, uint8_t* b) {
    return (uint8_t)(*a + *b);
}

GCC 12 generated the following assembly:

add2bytes(unsigned char*, unsigned char*):
        movzx   eax, BYTE PTR [rsi]
        add     al, BYTE PTR [rdi]
        movzx   eax, al
        ret

When I compiled the same code with Clang, I got different and even more confusing output:

add2bytes(unsigned char*, unsigned char*):                       # @add2bytes(unsigned char*, unsigned char*)
        mov     al, byte ptr [rsi]
        add     al, byte ptr [rdi]
        movzx   eax, al
        ret

Below are the steps that helped me understand and fix these strange uses of movzx by Clang and GCC.

Why do these strange uses of movzx by Clang and GCC appear?

These strange uses of movzx appear because both GCC and Clang do a poor job on this code. Clang's output is the worse of the two: loading the byte with a plain mov into AL has no real upside.

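It also has an easily avoidable downside on almost all CPUs except decade-old Intel ones. Modern x86 CPUs have more physical registers than architectural register names, so registers are constantly being renamed. A write to AL alone replaces only the low byte of RAX, so on most CPUs the new byte has to be merged with the old contents of RAX, and the load ends up waiting for whatever last wrote RAX (a false dependency).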

The simple and optimal sequence is a movzx load followed by a byte add. That leaves the uint8_t result in the low byte, already zero-extended to int as the C semantics require, so no separate movzx is needed afterwards.

A more detailed explanation is provided below:

How to understand and fix the strange uses of movzx by Clang and GCC

To avoid the extra instruction, the generated code should do a movzx load and then a byte add. This leaves a uint8_t result in the low byte, correctly zero-extended to int as required by the C semantics.
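To make "required by the C semantics" concrete, here is a minimal sketch that spells out the same computation one step at a time. The function name add2bytes_steps is made up for this illustration and is not part of the original code.

```c
#include <stdint.h>

/* Hypothetical helper: the same computation as add2bytes, one step per line. */
int add2bytes_steps(const uint8_t *a, const uint8_t *b) {
    int wide = *a + *b;              /* both operands are promoted to int: 0..510 */
    uint8_t narrow = (uint8_t)wide;  /* conversion keeps only the low 8 bits      */
    return narrow;                   /* back to int, zero-extended: 0..255        */
}
```

For example, with *a = 200 and *b = 100 the sum is 300 as an int, the truncation leaves 44, and 44 is what the function returns. The zero-extension in the last step is the job that movzx does in the compiler output.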

A movzx is necessary somewhere, because the C semantics require the result to be zero-extended to int, but it can be the initial load. (A movzx is usually a good idea for a byte load anyway, because it avoids a false dependency on the old value of RAX. Clang's choice to save 1 byte of code size is not a great one, even when there is no need for a separate movzx right after.)
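The two-instruction sequence can be tried out by hand. The sketch below is only an illustration, not what either compiler emits: it assumes GCC or Clang targeting x86-64 with the default AT&T inline-asm syntax, and falls back to plain C elsewhere. The names add2bytes_ref and add2bytes_sketch are made up for this example.

```c
#include <stdint.h>

/* Portable reference: the computation the C source asks for. */
int add2bytes_ref(const uint8_t *a, const uint8_t *b) {
    return (uint8_t)(*a + *b);
}

#if defined(__x86_64__) && defined(__GNUC__)
/* Assumption: GNU-style inline asm in AT&T syntax (GCC/Clang default). */
int add2bytes_sketch(const uint8_t *a, const uint8_t *b) {
    int result;
    __asm__("movzbl %1, %0\n\t"  /* zero-extending byte load: writes the full 32-bit register */
            "addb   %2, %b0"     /* byte add: only the low 8 bits change, upper bits stay 0   */
            : "=&r"(result)      /* early-clobber: written before the second input is read    */
            : "m"(*b), "m"(*a)
            : "cc");
    return result;
}
#else
/* Fallback for other targets: same result, computed in plain C. */
int add2bytes_sketch(const uint8_t *a, const uint8_t *b) {
    return add2bytes_ref(a, b);
}
#endif
```

Because the add only touches the low byte, the upper bits written by the zero-extending load stay zero, so the register already holds the correctly extended int and no second movzx is needed. Comparing add2bytes_sketch against add2bytes_ref for all 256 x 256 input pairs is a quick way to check this.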

This should help you fix and understand the problem correctly.

Conclusion

To understand and fix the strange uses of movzx by Clang and GCC, remember that the optimal code is a movzx load followed by a byte add, which leaves a uint8_t result in the low byte, correctly zero-extended to int as the C semantics require.