The thing I am describing is when you link a compilation unit using:
struct internal_state { int dummy; } state;
with another compilation unit that defines the same state differently:
struct internal_state {
    int actual_meaningful_member_1;
    unsigned long actual_meaningful_member_2;
} state;
As far as I know, BSD sockets do not do this. Zlib was doing this (https://github.com/pascal-cuoq/zlib-fork/blob/a52f0241f72433... ), but I have had the privilege of discussing this with Mark Adler, and I think the no-longer-necessary hack has since been removed from Zlib.
BSD sockets probably have a different kind of UB, related to the so-called “strict aliasing” rules, unless they have been carefully audited and revised since the carefree times in which they were written. I am going to have to let you read this article for details (example st1, page 5): https://trust-in-soft.com/wp-content/uploads/2017/01/vmcai.p...
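To make the strict-aliasing concern a little more concrete, here is a minimal sketch of the kind of function where it bites. It is not copied from the article; the names `hdr_a`, `hdr_b` and `set_tags` are made up for illustration.

```c
/* Hypothetical types for illustration; they are not taken from the article. */
struct hdr_a { int tag; double payload; };
struct hdr_b { int tag; long long payload; };

/*
 * The two parameters point to different struct types, so the compiler may
 * assume they never designate the same object and compile the return
 * statement as "return 1" without rereading p->tag.
 */
int set_tags(struct hdr_a *p, struct hdr_b *q)
{
    p->tag = 1;
    q->tag = 2;
    return p->tag;
}
```

Called with two distinct objects, this is well defined and returns 1. Called as `set_tags(&a, (struct hdr_b *)&a)`, it is undefined behavior, and the result can differ between optimization levels.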
/*
 * Structure used by kernel to store most
 * addresses.
 */
struct sockaddr {
    unsigned char sa_len; /* total length */
    sa_family_t sa_family; /* address family */
    char sa_data[14]; /* actually longer; address value */
};
/*
 * RFC 2553: protocol-independent placeholder for socket addresses
 */
#define _SS_MAXSIZE 128U
#define _SS_ALIGNSIZE (sizeof(__int64_t))
#define _SS_PAD1SIZE (_SS_ALIGNSIZE - sizeof(unsigned char) - \
    sizeof(sa_family_t))
#define _SS_PAD2SIZE (_SS_MAXSIZE - sizeof(unsigned char) - \
    sizeof(sa_family_t) - _SS_PAD1SIZE - _SS_ALIGNSIZE)
struct sockaddr_storage {
    unsigned char ss_len; /* address length */
    sa_family_t ss_family; /* address family */
    char __ss_pad1[_SS_PAD1SIZE];
    __int64_t __ss_align; /* force desired struct alignment */
    char __ss_pad2[_SS_PAD2SIZE];
};
And in particular, what about something like this?
struct Foo {
#ifdef __cplusplus
    int bar() const { return bar_; }
private:
#endif
    int bar_;
};
Or, taking this a step further:
struct _Foo;
typedef struct _Foo Foo;
// In C "struct _Foo" is never defined.
int Foo_bar(const Foo* foo) { return *(int*)foo; }
void Foo_setbar(Foo* foo, int bar) { *(int*)foo = bar; }
Foo* Foo_new() { return malloc(sizeof(int)); }
#ifdef __cplusplus
struct _Foo {
    void set_bar(int bar) { bar_ = bar; }
    int bar() const { return bar_; }
private:
    int bar_;
};
#endif
The above isn't ideal but it does provide encapsulation in a way that doesn't seem to violate strict aliasing (the memory location is consistently read/written as "int").
This is different from pretending that the address of a struct s { int a; double b; } is the address of a struct t { int a; long long c; } and accessing it through a pointer to that. If you do that, C compilers will (given the opportunity) assume that a write through a pointer to struct t does not modify any object of type “struct s”. This is what the example st1 in the article illustrates.
The latter is what I suspect plenty of socket implementations still do (because there are several types of sockets, represented by different struct types with a common prefix). It is possible to revise them carefully so that they do not break the rules, but I doubt this work has been done.
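For what it is worth, here is a minimal sketch of what such a revision could look like. It assumes hypothetical address types (`addr_any`, `addr_ipv4`, `addr_ipv6`) and made-up family constants, none of which come from a real socket header. The idea is to read the common prefix by copying its bytes out with `memcpy`, and only then to access the object through the struct type it really has.

```c
#include <string.h>

/* Hypothetical address types sharing a common prefix (illustration only). */
struct addr_any  { unsigned char len; unsigned char family; };
struct addr_ipv4 { unsigned char len; unsigned char family;
                   unsigned short port; unsigned char host[4]; };
struct addr_ipv6 { unsigned char len; unsigned char family;
                   unsigned short port; unsigned char host[16]; };

/* Made-up family constants, not the real AF_* values. */
enum { FAMILY_IPV4 = 4, FAMILY_IPV6 = 6 };

/*
 * Read the family field by copying the common prefix out of the object's
 * bytes, rather than dereferencing a pointer to a different struct type.
 */
unsigned char addr_family(const void *addr)
{
    struct addr_any prefix;
    memcpy(&prefix, addr, sizeof prefix);
    return prefix.family;
}

/* Dispatch on the family, then access the object through its real type. */
unsigned short addr_port(const void *addr)
{
    switch (addr_family(addr)) {
    case FAMILY_IPV4: return ((const struct addr_ipv4 *)addr)->port;
    case FAMILY_IPV6: return ((const struct addr_ipv6 *)addr)->port;
    default:          return 0;
    }
}
```

The price is that every generic entry point has to go through a helper like `addr_family` instead of casting to a generic struct pointer and reading the field directly, which is presumably why existing code has not been revised this way.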