This small investigation looks at whether common Unix compilers automatically convert 32-bit and 64-bit byte-swapping code (between BE and LE forms), written in a universal manner not tied to any specific processor, into native byte-swap instructions.

It was done for the following discussions:
  http://www.rsdn.ru/forum/cpp.applied/5398146.flat (in Russian)
  http://stackoverflow.com/questions/7342527/byte-swap-during-copy

See the test program code below. To detect automatic conversion of the complex conversion code (splitting into separate bytes using masks and recombining using shifts and logical OR), I compiled it without any USE_* macro defined and then analyzed the resulting code manually. On every platform only the native target was checked (no cross-compilation). All the main optimization levels were tested: -O0, -O1, -O2, -O3, -Os, and -Og (the last one only since gcc 4.8).

Platform 1: FreeBSD/i386.
* gcc 4.2 (from base system), any optimization level: no bswaps.
* gcc 4.7-4.9 (from ports): automatic detection (3 bswaps in code) for -O2, -O3, -Os; no detection for -O0, -O1, -Og.
* clang 3.2 (from base system): automatic detection (3 bswaps in code) for any level except -O0.

Overall, this is the best result among all checked platforms: the compiler is able to detect byte swapping even for a 64-bit value on a 32-bit platform. OTOH, detection isn't available at the lower optimization levels.

Platform 2: FreeBSD/amd64.
* gcc 4.2 (from base system), any optimization level: no bswaps.
* gcc 4.7-4.9 (from ports): automatic detection only for the 64-bit value, for -O2, -O3, -Os; no detection for -O0, -O1, -Og. Masks and shifts are used for the 32-bit value.
* clang 3.2 (from base system) and 3.3 (from ports): automatic detection only for the 64-bit value, for any level except -O0. Masks and shifts are used for the 32-bit value.

Platform 3: Linux/x86_64 (OpenSuSE 12.2).
* gcc 4.6, 4.7 (from distro): automatic detection only for the 64-bit value, for -O2, -O3, -Os; no detection for -O0, -O1, -Og. Masks and shifts are used for the 32-bit value.
* clang-2.7, clang-3.5-current (both built locally): automatic detection only for the 64-bit value, for any level except -O0. Masks and shifts are used for the 32-bit value.
* open64-5.0 (from its site): no automatic detection at any optimization level.

Conclusion: one still should not rely on compiler intelligence for byte swapping, all the more so because the 32-bit swap, which is the more important one (at least for IPv4 addresses), suffers on the 64-bit platform (x86-64 aka amd64). Even where compilers are able to generate bswap instructions, they do so only at optimization levels that can be improper for a specific task. Explicit means, such as built-in functions and assembler statements, are still preferred.

This could be extended (if anybody is interested) to other platform and compiler combinations, but I won't spend much time on adapting a compiler; short, simple and unambiguous instructions must be available.

The test program code:
===
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#if USE_BUILTIN

#define bswap_constant_32 __builtin_bswap32
#define bswap_constant_64 __builtin_bswap64

#elif USE_ASM

static inline uint32_t
bswap_constant_32(uint32_t x)
{
  uint32_t r;
  asm ("bswap %0" : "=r" (r) : "0" (x));
  return r;
}

#ifdef __x86_64
static inline uint64_t
bswap_constant_64(uint64_t x)
{
  uint64_t r;
  asm ("bswap %0" : "=r" (r) : "0" (x));
  return r;
}
#else
/* 32-bit target: swap each half and exchange the halves. */
static inline uint64_t
bswap_constant_64(uint64_t x)
{
  return (((uint64_t)bswap_constant_32(x&0xFFFFFFFFllu)) << 32) |
    bswap_constant_32((x&0xFFFFFFFF00000000llu)>>32);
}
#endif

#else

/* Generic variant: the one checked for automatic detection. */
#define bswap_constant_32(x) \
     ((((x) & 0xff000000ul) >> 24) \
      | (((x) & 0x00ff0000ul) >>  8) \
      | (((x) & 0x0000ff00ul) <<  8) \
      | (((x) & 0x000000fful) << 24))

#define bswap_constant_64(x) \
     ((((x) & 0xff00000000000000ull) >> 56) \
      | (((x) & 0x00ff000000000000ull) >> 40) \
      | (((x) & 0x0000ff0000000000ull) >> 24) \
      | (((x) & 0x000000ff00000000ull) >>  8) \
      | (((x) & 0x00000000ff000000ull) <<  8) \
      | (((x) & 0x0000000000ff0000ull) << 24) \
      | (((x) & 0x000000000000ff00ull) << 40) \
      | (((x) & 0x00000000000000ffull) << 56))

#endif

int
main(int argc, char *argv[])
{
  uint32_t u = 0x1A2B3C4Dul;
  uint64_t u2 = 0x1F2E3D4C5B6A7988ull;
  if (argc >= 2)
    u = strtoul(argv[1], NULL, 0);
  if (argc >= 3)
    u2 = strtoull(argv[2], NULL, 0);
  /* The casts make the arguments match %lX and %llX for every variant. */
  printf("%lX %llX\n",
      (unsigned long) bswap_constant_32((unsigned)u),
      (unsigned long long) bswap_constant_64(u2));
  return 0;
}
===

SY, Valentin Nechayev (Netch)
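
One possible way to act on the conclusion is a small wrapper that uses the built-in functions when the compiler provides them and falls back to masks and shifts otherwise. The following is only a sketch under stated assumptions, not part of the tested program: the names swap32, swap64, HAVE_BSWAP_BUILTINS and swap_selfcheck are hypothetical, and the version check assumes that __builtin_bswap32/__builtin_bswap64 are available in gcc 4.3 and later, while other compilers are probed through __has_builtin where they support it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature macro: set when the bswap built-ins can be used. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_bswap32) && __has_builtin(__builtin_bswap64)
#    define HAVE_BSWAP_BUILTINS 1
#  endif
#endif
/* Assumption: gcc provides both built-ins starting with version 4.3. */
#if !defined(HAVE_BSWAP_BUILTINS) && defined(__GNUC__) && \
    (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))
#  define HAVE_BSWAP_BUILTINS 1
#endif

/* Reverse the byte order of a 32-bit value. */
static inline uint32_t
swap32(uint32_t x)
{
#ifdef HAVE_BSWAP_BUILTINS
  return __builtin_bswap32(x);
#else
  /* Fallback: masks and shifts kept in 32-bit arithmetic. */
  return ((x & UINT32_C(0xff000000)) >> 24)
       | ((x & UINT32_C(0x00ff0000)) >>  8)
       | ((x & UINT32_C(0x0000ff00)) <<  8)
       | ((x & UINT32_C(0x000000ff)) << 24);
#endif
}

/* Reverse the byte order of a 64-bit value. */
static inline uint64_t
swap64(uint64_t x)
{
#ifdef HAVE_BSWAP_BUILTINS
  return __builtin_bswap64(x);
#else
  /* Fallback: swap each 32-bit half, then exchange the halves. */
  return ((uint64_t)swap32((uint32_t)(x & UINT32_C(0xffffffff))) << 32)
       | swap32((uint32_t)(x >> 32));
#endif
}

/* Self-check using the constants from the test program. */
static inline void
swap_selfcheck(void)
{
  assert(swap32(UINT32_C(0x1A2B3C4D)) == UINT32_C(0x4D3C2B1A));
  assert(swap64(UINT64_C(0x1F2E3D4C5B6A7988)) == UINT64_C(0x88796A5B4C3D2E1F));
}
```

Calling swap_selfcheck() once at startup is a cheap way to confirm that whichever branch was selected really reverses the bytes. Whether a given compiler turns the fallback branch into bswap is exactly the question investigated above, so the fallback should be treated as a correctness net rather than as the fast path.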