In other words,
bool overflowed = (x+1)<x;
should be meaningful. It may or may not do what you want on any given architecture, but it shouldn’t just be assumed false. bool overflowed = (x+1)<x;
I was convinced that some warning, probably -Wtautological-compare, already handled this and I plugged it into godbolt to see which one, but got no warnings with either gcc or clang. Frankly, I'm stunned and even a bit annoyed. Clearly this code deserves a warning, unless there's some good reason I'm just blind to right now.Nevertheless, I don't know what to do about the suggestion "it may or may not do what you want on any given architecture, but it shouldn’t just be assumed false." The compiler needs to know what it can and can't do. The common advice of "just do what the CPU does" doesn't work, we need to know what to do when cross-compiling, when constant folding, and we need to know which instructions are valid to select. If I selected PADDSW for this add (a saturating addition operation) but then later when you use x+1 I select a non-saturating addition, would you be happy with that? Probably not. The compiler needs actual rules to follow.
I don't know how to apply the suggestion about warnings when we end up deleting dead code. The compiler deletes dead code all the time, consider a case like "vector<t> v; v.push_back(a); v.push_back(b);", each push_back begins with an if-statement on whether reallocation is required, and that becomes constant with inlining. Tracking "this code became dead because", well, because which situations exactly?
Deleting dead code because the code demonstrably wouldn’t do anything (that is, it has defined behavior that is not observable) makes sense and is of course hugely useful. Deleting code that isn’t dead, it just doesn’t have a universally defined behavior, is the issue.
Can I delete "if (x & 3 == 16)" without a warning? There is no 'x' which makes that expression true, so I can safely fold it to false without a warning?
Can I delete "if (x + 1 < x)" without a warning? There is no signed 'x' which makes that expression true, so I can safely fold it to false without a warning?
How about this:
int x = 7;
call_function_outside_this_file();
if (x != 7) { /* dead */ }
Does deleting the code require a warning or no?Or this:
void f(int *x, float *y) {
*x = 1;
*y = 2;
if (*x != 1) { /* dead */ }
A float cannot alias an int, so '*x' can not have changed. Warning or no?The problem with UB is that you can use it to set up impossible situations, like create an 'x' where x & 3 == 16 is true or a variable whose address was never taken being modified through a pointer, and so on. If you account for UB then "code that doesn't have a universally defined behaviour" becomes all code.
Ideally I think the first two examples should have warnings, though not because we delete the code, and the last two shouldn't? The warning should be because it's a tautology so the human likely didn't mean to write that (for instance if the human wrote it indirectly through macros, then we shouldn't warn on it).
void f(int *px, float *py) {
int x = *px;
x = 1;
*y = 2;
if (x != 1) { /* dead */ }
*px = x;
If I put in a dereference, I expect a dereference to happen. Not dereferencing the pointer when I wrote a dereference operator seems like going too far. If they aren’t supposed to alias, but they did anyway, the code should do the wrong thing, in a way that makes sense based on the code I wrote.I’m obviously just a holdover from the 90s, but it does seem we’ve leaned too far into hidden assumptions that the compiler thinks I share, rather than doing what the code says, or a simplification of what the code says.
vector<t> v;
v.push_back(a);
v.push_back(b);
v.push_back(c);
v.push_back(d);
Let's suppose the definition of our vector here looks something like this: template <typename T> struct vector<t> {
size_t len = 0;
size_t storage_len = 0;
T *storage = nullptr;
void push_back(T t) {
if (len + 1 > storage_len) {
storage_len = storage_len ? storage_len * 2 : 1;
storage = realloc(storage, storage_len);
}
storage[len] = t;
++len;
}
};
So here's what happens. vector<t> v;
v.push_back(a);
v.push_back(b);
v.push_back(c);
v.push_back(d);
becomes len = 0;
storage_len = 0;
storage = nullptr;
if (len + 1 > storage_len) {
storage_len = storage_len ? storage_len * 2 : 1;
storage = realloc(storage, storage_len);
}
storage[len] = a;
++len;
if (len + 1 > storage_len) {
storage_len = storage_len ? storage_len * 2 : 1;
storage = realloc(storage, storage_len);
}
storage[len] = b;
++len;
if (len + 1 > storage_len) {
storage_len = storage_len ? storage_len * 2 : 1;
storage = realloc(storage, storage_len);
}
storage[len] = c;
++len;
if (len + 1 > storage_len) {
storage_len = storage_len ? storage_len * 2 : 1;
storage = realloc(storage, storage_len);
}
storage[len] = d;
++len;
becomes len = 0;
storage_len = 0;
storage = nullptr;
if (0 + 1 > 0) {
storage_len = 1;
storage = realloc(storage, 1);
}
storage[0] = a;
len = 1;
if (1 + 1 > 1) {
storage_len = 2;
storage = realloc(storage, 2);
}
storage[1] = b;
len = 2;
if (2 + 1 > 2) {
storage_len = 4;
storage = realloc(storage, 4);
}
storage[2] = c;
len = 3;
if (3 + 1 > 4) {
storage_len = 8;
storage = realloc(storage, 8);
}
storage[3] = d;
len = 4;
In that last one, you can see that the if-expression is false and the body becomes dead code. If I understand the rule you're proposing, you want to get a warning or error for that? // Source by nlewycky
template <typename T> struct vector<t> {
size_t len = 0;
size_t storage_len = 0;
T *storage = nullptr;
void push_back(T t) {
if (len + 1 > storage_len) {
// FIXME: what if storage_len > (size_t_max >> 1) ??
// ? Define maximum expected growth unit? 1K? 4K? 64K?
// Assignment from trinary op visually confusing, add ( )
storage_len = ( storage_len ? storage_len * 2 : 1 );
storage = realloc(storage, storage_len);
}
storage[len] = t;
++len;
}
};
//
vector<t> v;
v.push_back(a);
v.push_back(b);
v.push_back(c);
v.push_back(d);
After inlining all 4 times, the compiler might notice that in this code section vector<t> v always reaches the final size of storage = realloc(storage, 8); and optimize for that. // Pseudo code
template <typename T> vector<T> v = {
// final state at end of code section
size_t len = 4;
size_t storage_len = 8;
T *storage = realloc(NULL, storage_len);
};
// Compiled and included, forgot template function syntax.
void template <typename T> vector<T>.push_back(T t);
(v[0], v[1], v[2], v[3]) = (a, b, c, d);
Even in this case, all the code the programmer wrote was evaluated and had side effects at least once at compile time. No code was eliminated as unreachable / impossible to reach. Instead it was optimized into operations that would always occur, barring realloc failure (which wasn't specified in the toy source, but would probably be a fatal error).In all the cases that the hypothetical -Wubelim would cover the written code would be evaluated by the compiler and 'as it's undefined it can't ever be true, silent elimination'. That's the case where the human that wrote the code and the compiler that's interpreting a specification to claim that code can't ever do anything. Rather than transliterate the code as it was written to the machine instructions the programmer probably expected to see were this native machine assembly they'd written, the compiler behaved differently, silently.
In a 7-character sequence it may seem pretty obvious that it's probably intended to mean something, but when joining together multiple independent things doing redundant operations it's much less trivial.
int game_logic(int prev_score, int bonus) {
if (bonus < 0) { // check for bad argument
return -1;
}
int new_score = prev_score + bonus;
// ...
if (new_score < prev_score) { // sanity check
abort();
}
return new_score;
}
// in the above the abort path can (and is) already optimized out, but an invocation of game_logic(x,1) becomes exactly x+1<x(I commonly use -fwrapv when writing programs in C.)