jmwilson 4y ago

With shifts and masks you know where the bits are. With bitfields, you don't because the specification leaves everything up to the compiler.

  struct foo {
    char a : 4;
    char b : 4;
  };

Is a in the high-order 4 bits, or the lower 4 bits? Both choices are allowed, so it's up to the compiler and makes the code non-portable.
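A quick way to see which choice a given compiler made is to set only a and look at the raw byte. This is a rough sketch, not a portable test: foo_probe is a made-up name, it assumes 8-bit bytes, and it uses unsigned char fields (accepted by GCC and Clang as an extension) so the value 0xF is not affected by the signedness of plain char.

```c
#include <string.h>

/* unsigned char bit-fields are implementation-defined;
   GCC and Clang accept them. */
struct foo {
    unsigned char a : 4;
    unsigned char b : 4;
};

/* Returns the first byte of a struct foo with a = 0xF and b = 0.
   0x0F means a sits in the low nibble, 0xF0 means the high nibble. */
static unsigned char foo_probe(void)
{
    struct foo f;
    unsigned char byte;

    memset(&f, 0, sizeof f);   /* clear any padding bits too */
    f.a = 0xF;
    f.b = 0;
    memcpy(&byte, &f, 1);      /* inspect the object representation */
    return byte;
}
```

Whatever it returns only tells you about this compiler and target, which is the point.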

dmitrygr 4y ago

sometimes you do not care, and

   x = foo.a

is simpler than

  x = (foo & FOO_MASK_A) >> FOO_SHIFT_A

and for assignments, the difference is even bigger:

  foo.a = x

is much better than

  foo = (foo & ~FOO_MASK_A) | ((x << FOO_SHIFT_A) & FOO_MASK_A)
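For comparison, the manual version is usually wrapped in small helpers so the expression is written once. A minimal sketch, assuming field A is the high nibble of an 8-bit word; the macro values and the helper names foo_get_a and foo_set_a are made up for illustration:

```c
#include <stdint.h>

/* Hypothetical layout: field A is the high nibble of an 8-bit word. */
#define FOO_SHIFT_A 4u
#define FOO_MASK_A  (0xFu << FOO_SHIFT_A)

/* Extract field A. */
static inline unsigned foo_get_a(uint8_t foo)
{
    return (foo & FOO_MASK_A) >> FOO_SHIFT_A;
}

/* Return foo with field A replaced by the low 4 bits of x. */
static inline uint8_t foo_set_a(uint8_t foo, unsigned x)
{
    return (uint8_t)((foo & ~FOO_MASK_A) | ((x << FOO_SHIFT_A) & FOO_MASK_A));
}
```

With that, the call sites read as x = foo_get_a(foo) and foo = foo_set_a(foo, x), which narrows the gap with the bitfield syntax somewhat.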

InitialLastName 4y ago

The case where a) you don't care about the in-memory representation of your struct and b) you care a lot about being able to pack into the absolute minimum memory space, but not enough to make sure the compiler actually packs the fields (depending on architecture and optimization settings, they might not!) is vanishingly small.

The more frequent perceived use for bit-fields (in the situation where they actually work) is to pack into a serialized data format, such that memory or a data stream can be accessed elsewhere. In that case, "the compiler can do whatever it wants with your data packing" is pretty useless, since your "elsewhere" might have a different compiler that does a totally different thing.

dfox 4y ago

Optimization settings should not affect memory layout, as that is specified by the ABI (and a large part of the “art of structure packing” is about manually reordering struct fields because the compiler cannot do that, however obvious the optimization would be).

And as for the second part: anything that writes sizeof(struct foo) bytes of struct foo is inherently non-portable. If you want to (de)serialize something portably, you want to write the thing out explicitly; very often the compiler will optimize it to a more direct implementation. (And well, this is only portable to platforms where CHAR_BIT == 8.)

dmitrygr 4y ago

> is vanishingly small.

Ladies and gentlemen, this thought is why we now consider 8GB of ram to be a "weak device".

No, no no no no, 1000 times no. Every situation is a low ram situation. Every!

WalterBright 4y ago

One of the annoying things about doing it manually is you have to come up with all those special identifier names for the shifts and masks.

zozbot234 4y ago

If "foo" is defined as part of an API/ABI that's used in multiple compile units you will always care, since otherwise a random change in "implementation defined" bitfield encodings on some obscure architecture might break your build. Bitfields are a misfeature in most real-world cases.

PaulDavisThe1st 4y ago

You omitted the CAS required for assignment in threaded or otherwise reentrant code. Understandably, of course.
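For reference, a CAS-based update of one field might look roughly like this with C11 atomics. It is a sketch under assumptions: the word is held in an atomic_uint, the FOO_* values are made up, and foo_set_a_atomic is a hypothetical name.

```c
#include <stdatomic.h>

/* Hypothetical layout: field A occupies bits 4..7 of the word. */
#define FOO_SHIFT_A 4u
#define FOO_MASK_A  (0xFu << FOO_SHIFT_A)

/* Atomically replace field A, preserving whatever the other bits
   hold at the moment the exchange succeeds. */
static void foo_set_a_atomic(atomic_uint *foo, unsigned x)
{
    unsigned old = atomic_load(foo);
    unsigned desired;

    do {
        desired = (old & ~FOO_MASK_A) | ((x << FOO_SHIFT_A) & FOO_MASK_A);
        /* On failure, old is refreshed with the current value and we retry. */
    } while (!atomic_compare_exchange_weak(foo, &old, desired));
}
```

A plain foo.a = x on a bitfield compiles to a non-atomic read-modify-write of the containing storage unit, so it needs the same care (or a lock) when neighbouring fields are written from other threads.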

chrisseaton 4y ago

Surely there’s an ABI? Otherwise how does this work at all?

masklinn 4y ago

> Otherwise how does this work at all?

Hopes, prayers, and a single version of a single compiler being involved.

ncmncm 4y ago

This. The ABI is an aspiration, not an implementation. It has worked out well in areas other than bitfields.

throwaway7033 4y ago

Most ABIs are complex and only partially specified; you're playing with fire if you use bitfields or other ill-defined features in public APIs.

pavlov 4y ago

Never ever use bitfields in structs that may cross library boundaries. There are some corners of C that are not fit for public APIs.

Findecanor 4y ago

While the C and C++ language specs don't specify the layout of bitfields, modern platforms tend to have a specified ABI which compilers follow when compiling for that platform.

64-bit Linux distros and the BSDs follow the convention once set by the "C ABI for Itanium".

In that, bitfields are grouped in declaration order into container words of the same width as the bitfield's type (char, int, etc.). Bitfields don't span multiple container words, and container words don't overlap. On little-endian platforms, bitfields are packed LSB first, but on big-endian platforms they are packed MSB first within their container word. Alignment rules apply only to the container words.
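To make the container rule concrete, here is a small sketch with two made-up structs (split_char, shared_int, and print_sizes are hypothetical names). Under the rule as described, I'd expect it to print 2 and 4 on x86-64 Linux with GCC or Clang, but treat that as something to check on your own target rather than a guarantee.

```c
#include <stdio.h>

/* 6 + 6 bits do not fit in one 8-bit container, so under the rule
   described above b starts a second container. */
struct split_char {
    unsigned char a : 6;
    unsigned char b : 6;
};

/* The same widths fit together in one unsigned int container. */
struct shared_int {
    unsigned int a : 6;
    unsigned int b : 6;
};

/* Print the sizes this compiler and ABI actually chose. */
static void print_sizes(void)
{
    printf("%zu %zu\n", sizeof(struct split_char), sizeof(struct shared_int));
}
```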

ncmncm 4y ago

That is all very fine.

If the instructions emitted and the instructions implemented both happen to match that, on every chip your code must run on, you got lucky.

dfox 4y ago

The point is that if you care about the resulting in-memory layout, then you by definition know what platform the code will run on and what the ABI is.

If you want to produce the same sequence of bytes regardless of the underlying platform, then you have to do it by hand with uint8_t[] buffers and explicit shifts and masks. Casting a pointer to a struct to char* and writing it somewhere is inherently non-portable, and this has nothing to do with bitfields and nothing to do with things like __attribute__((packed)), although both of these things are useful when you want to do that and understand the (non-)portability implications.
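A minimal sketch of the by-hand approach, assuming a made-up 3-byte wire format (two 4-bit fields packed into byte 0, then a 16-bit big-endian length). struct rec, rec_encode, and rec_decode are hypothetical names:

```c
#include <stdint.h>

/* Hypothetical record. The wire format is chosen here, not by the compiler:
   byte 0 = (a << 4) | b, bytes 1..2 = len, big-endian. */
struct rec {
    unsigned a;    /* 0..15 */
    unsigned b;    /* 0..15 */
    uint16_t len;
};

enum { REC_WIRE_SIZE = 3 };

/* Write r into out using only shifts and masks. */
static void rec_encode(const struct rec *r, uint8_t out[REC_WIRE_SIZE])
{
    out[0] = (uint8_t)(((r->a & 0xFu) << 4) | (r->b & 0xFu));
    out[1] = (uint8_t)(r->len >> 8);
    out[2] = (uint8_t)(r->len & 0xFFu);
}

/* Read the same format back into r. */
static void rec_decode(const uint8_t in[REC_WIRE_SIZE], struct rec *r)
{
    r->a = in[0] >> 4;
    r->b = in[0] & 0xFu;
    r->len = (uint16_t)(((unsigned)in[1] << 8) | in[2]);
}
```

The in-memory struct rec can be laid out however the compiler likes; only the byte buffer has a fixed format.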

iainmerrick 4y ago

> With shifts and masks you know where the bits are.

You know where the bits are within a single word. But if you have a struct with multiple fields, it’s not safe to rely on the exact memory layout even if it doesn’t have any bitfields.

If you need to represent a very specific memory layout, it’s not just bitfields you need to avoid, it’s structs in general.

Conversely, if you don’t need to guarantee a specific layout, bitfields are fine to use, and could be a useful optimisation hint for the compiler.

ncmncm 4y ago

In other words, you don't understand.

iainmerrick 4y ago

Here’s an example where I think bitfields are totally appropriate:

Say I have a window manager, and I want to attach a bunch of boolean flags to each window object (isVisible, isMaximized, etc). I don’t need to serialize them to disk. It’s highly preferable that they should be efficiently bit-packed, but not strictly essential.

The conservative way to implement that would be bit-shifts and masking (either manually or via a macro). But implementing it with bitfields would be a lot easier and less error-prone, and would work just as well. What problems do you see with the bitfield approach?
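Roughly what I have in mind, as a sketch. struct window, its geometry members, and the two helper functions are made up for illustration; only isVisible and isMaximized come from the description above:

```c
#include <stdbool.h>

/* Hypothetical window object. The flags never leave the process,
   so their exact bit positions do not matter. */
struct window {
    int x, y, width, height;
    unsigned isVisible   : 1;
    unsigned isMaximized : 1;
    unsigned isFocused   : 1;
};

/* Show the window and mark it maximized. */
static void window_show_maximized(struct window *w)
{
    w->isVisible = 1;
    w->isMaximized = 1;
}

/* True if the window is shown at its normal (non-maximized) size. */
static bool window_is_restored(const struct window *w)
{
    return w->isVisible && !w->isMaximized;
}
```

The flags read like ordinary members, and there is no mask table to keep in sync when a flag is added.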

kevin_thibedeau 4y ago

Could be neither if char is bigger than 8 bits.
