Damn. This is how you do enterprise.
I might be the only person who likes SQL nulls. If you learn how they work up front, they're useful and not really that confusing. But if I ran into weird behaviors like this, I might hate them too.
The problem with Null is that it can be either a value or show the lack of presence, and in JS because both undefined and null are values you can't depend on them being used for their semantics either (you can try to enforce their use but it that's no guarantee).
SQL suffers this exact same problem. I wonder what SQL would look like without Null.
Select id, Option (key)
From table
Insert... Some(5), None::Int
From this perspective it still seems that empty string should not be None, but I think you're right that many people do use a code like "N/A", but you can just add a supplemental Boolean field which is cleaner for set / not set.Interesting
Emtpy string vs null is not that useful if you have null and undefined.
Take the middle name example: "What was U.S President Theodore Roosevelt’s middle name?" when you know he didn't have a middle name.
Are you suggesting that a blank is the correct choice here?
I don't think it's accurate to say they have a blank middle name. I think it's better to say they don't have a middle name.
That Oracle behavior annoys me each time, though.
It's never been a problem with strings in my many decades of experience, unless somebody does something which I consider poor system engineering. Nearby I invited a solid use-case illustrating a real string need.
Null should equal null like in every other programming language even SQL group by do null equals null which is even more inconsistent.
Think of SQL's NULL like it's a NaN.
(Microsoft's SQL Server still defaults to the non-ANSI NULL behavior where a = b when both are NULL, and that's something that still pings on checklists of SQL Server if it follows ANSI standards. SQL Server is kind enough to let you enable/disable the behavior, and likely that would persist even after the default switches to meet the standard as the docs assure will happen "in some future version".)
NULL = Undefined or No Data. Whereas a blank field can, in and of itself, be data. It may indicated something is intentionally left blank.
But for those times where you want to consider them the same, it would be nice to have a setting.
(Note that I admit the possibility that this may exist already, like most my great ideas.)
Suppose we have an "age" field, but don't actually know the age of the person, null makes perfect sense.
Otherwise we'd have do do something like using a "magic number" like 0, -1, or a separate field altogether to indicate an unknown value.
Granted, they do need some handling in queries.
I'm writing in Go, so my structs all have empty values unless specifically initialised. This does sometimes mean that I get null uuid's (0000-000000-00000-0000) inserted into tables, which Postgres doesn't understand as null and cheerfully returns as a valid uuid. This has been my only real pain with using nulls.
I've contemplated modding the database driver to interpret nil-value uuid's as null, but that seems a little drastic. Anyone got any better ideas?
1. NULLIF(the_uuid, '0000-000000-00000-0000')
https://www.postgresql.org/docs/current/functions-conditiona...
2. on insert/update rules to rewrite the uuid (probably using NULLIF)
`database/sql` does have an interface which lets you define your own marshaling code though. Using it is very simple and would let you marshal an all-zero UUID into NULL easily enough.
One interesting problem that arises when you use nulls is that it can be difficult to ensure people actually use them when it's appropriate to do so. Case in point I have a field that is essentially an optional positive integer, so obviously I made it a nullable unsigned int. A few years later there's a pretty even spread of nulls and 0s in there to indicate the same thing -- to a lot of consumers, they behave the same because their business logic basically says "if (foo->field) do stuff" which works either way. In the end I changed it to a non-nullable field using 0 as the null stand-in, which is semantically worse, but ended up making interesting searches over this data set a lot easier.
On the one hand, perhaps the more correct answer would have been to yell at people putting 0s in when they should have been putting nulls in. On the other hand, we put constraints on fields for a reason...
All other relational databases differentiate between empty strings and NULL.
SQL*Plus: Release 11.2.0.4.0 Production on Mon Feb 3 15:56:30 2020
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit Production
SQL> create table foo(fook varchar2(10) not null);
Table created.
SQL> insert into foo values ('');
insert into foo values ('')
*
ERROR at line 1:
ORA-01400: cannot insert NULL into ("SPORTSBOOK_DOCK"."FOO"."FOOK")
SQL> create table bar(bark varchar2(10));
Table created.
SQL> insert into bar values ('');
1 row created.
SQL> insert into bar values ('a');
1 row created.
SQL> insert into bar values (null);
1 row created.
SQL> select * from bar;
BARK
----------
a
SQL> select * from bar where bark = '';
no rows selectedYes, because it will convert the empty string into a NULL, and fail to insert that NULL into a non-NULLable column.
It's also true for AWS' DynamoDB offering.
Further suppose that you have an external function in your codebase to estimate how many paychecks you've paid to someone, but the author doesn't know about any "0000-01-01" conventions your office uses. Without that, you'd see that Joe New Guy has worked here about 2,020 years, so we've probably issued him about 48,000 checks. If only you'd used null, then that function would have calculated "today() - null", which in any sane language would raise a type exception and alert you to the problem.
Nulls are beautiful. They have meaning. Lots of people misuse them, but that doesn't mean they're not valid and useful.
2. NULLs are used by SQL functions and operators as an "unknown" value. So, for example, "NULL AND TRUE" is NULL, because we could substitute NULL with TRUE or with FALSE to get different results, but "NULL AND FALSE" is FALSE, because no matter what we substitute NULL with, the result will always be FALSE.
3. Clearly all these valid uses of NULL do not indicate "flaws" in the database design.
4. Database normalization isn't always a good thing, and beyond a certain level it's almost always a bad thing, so using normalization methods as a standard for whether something is "flawed" is probablyn ot the best idea.
5. No database normalization method, as far as I know, actually tries to eliminate NULLs, so I don't know what "(See database normalizations)" refers to. Can you clarify?
SELECT * FROM my_table WHERE my_column != 5
You would expect it to return rows that have a null value for my_column, since null is not 5. However that is not the case. ... WHERE my_column IS DISTINCT FROM 5 SELECT * FROM my_table WHERE my_column NOT IN (5, 6)A NULL value technically means it's unknown. An unknown value might be 5, hence why it's not in the result set. Some abuse NULL to mean "value doesn't exist". But a value that doesn't exist can't be 3, or 42, or any other value that's different from 5, so in that regard shouldn't be part of the result set either.
Others again abuse NULL to mean "doesn't apply". And in that case I think it makes sense to include the row in the result set. For example, if I write a query to get all people who's middle name is not "William", I'd most likely want people without middle names included.
Maybe we should have introduced NEX (non-existing) and NAP (non-applicable) as possible values in addition to NULL?
Codd (the inventor of relational algebra) actually suggested this. I think the primary source is a book that may not be on the web. There's some discussion here (mostly saying why they think it didn't happen and wouldn't work out): https://arxiv.org/html/1606.00740v1/
No, that would muddy things in my opinion, like it did to JavaScript. Instead, have more operations/functions for dealing with them in a more "normal" way, so that we can say "WHERE x <> 5" and get results one expects. I'm not sure the syntax, and my drafts would take a lot of time to explain. To give a taste, maybe have something like "WHERE ~x <> 5" in which the tilde converts x's value to the type's default, such as a blank in the case of strings.
If the different reasons for "emptiness" matter, then usually it suggests the need for a "status" column of some kind so that queries can be done on the reasons. I'd need to study domain specifics to recommend something specific.
If a value is unknown, you don't know if it is different from 5, so it would be incorrect to return in the query.
Also a null is no equal to anything.. not even another null
This will print false in SQL Server
if null = null print 'true' else print 'false'
Wrong. It is equal to UNKNOWN:
https://docs.microsoft.com/en-us/sql/t-sql/queries/is-null-t...
shortcuts "OR is null", works within functions.
When it's the result of a relational operation (such as a LEFT JOIN) however, the default makes sense while considering NULLs as equal to each other is typically not useful.
Also, at a certain point, knowing that nulls are present gives you yet another measure of dq: not knowing if they are present and hidden vs. visible and countable can be the difference between a wrong answer vs. just an uncertain one.
There is a lot of discussion in this thread about whether this implementation of null checking in Oracle is appropriate, analysing it, but the current implementation is just fine, it has been tested by time.
The internet does tend to rehash the same arguments over and over!!! The internet forgets. I remember these arguments 20 years ago.
> the current implementation is just fine, it has been tested by time.
No, it isn't "just fine". It is broken. Just because something has been broken for a very long time and has spawned an entire industry devoted to dealing with the fact that it is broken does not change the fact that it is broken.
There is no mention of outer joins in this thread, no mentions of the ability to minus results of one query from another which are basic constructs which handle many of the issues that are discussed here. It says that the people here are inexperienced with Oracle. Everyone here trying to resolve issues using inner joins. Inexperience.
If people here had experience, not only would these topics have been discussed, but the real issues with NULL would have been discussed, one of which I mention in my previous post.
I also swear I have seen a gotcha involving UPDATE WHERE IN and not throwing an error where it should have, which is why I always quadruple check my update statements, but I wasn't able to reproduce it and couldn't find any information online. I haven't seen the issue in so long I forgot what it was, but it would update all rows in your table even if your WHERE clause was proper.
(NULL AND 0) gives 0
(0 AND NULL) gives 0
(NULL AND 1) gives NULL
(1 AND NULL) gives NULL
(NULL AND NULL) gives NULL
(NULL OR 0) gives NULL
(0 OR NULL) gives NULL
(NULL OR 1) gives 1
(1 OR NULL) gives 1
(NULL OR NULL) gives NULL“Why isn’t it consistent??” - well a lot of systems have a lot of bat shit crazy inconsistencies, some times there for good reason. You learn to keep them straight and get your shit done.
If you want to learn the “why” every time you encounter a system design quirk, be my guest but you may be going down a time intensive rabbit hole with little pay off for yourself.
These same applications do have requirements to deal with empty values. Sometimes an empty value means "I haven't yet entered this value in the to the UI". But in that case the UI won't let you submit the form until you have supplied a valid value.
In other cases an empty value is a valid value. For example, "who is your spouse?" and the answer is "I'm not married".
Sometimes NULL represents "irrelevant", like for "who is your spouse?", where some of the records in the table represent people who can have spouses, and some of the records represent other person-like entities that aren't actually people and therefore they can't have spouses.
Given that NULL is _not_ being used to represent "unknown" values, and there is a requirement to represent empty values, and you don't want to have a whole extra column just to represent "emptiness", the most straightforward way to implement empty values is to use NULL. So that is what happens.
And you have to remember to use "is" instead of "=" when you want to test your empty NULL values for equality with other empty NULL values - because your SQL database is pretending that NULL really means "unknown", and it doesn't want to say that one unknown value is equal to another unknown value, because that would be theoretically incorrect.
SELECT ... FROM ... WHERE blah NOT IN (SELECT foo FROM bar);
getting no hits until I slap my forehead and add WHERE foo IS NOT NULL to the subselect.