Null Is Not Empty

chime · on May 15, 2009

> The answer to that question is "I don't know -- there is data missing", so the database returns Null.

There is no way I'd give that answer to my boss (via dashboard reports or in person) Even though he asked me what the Sales in August were, my answer would have to be $123456 + Missing Sales Staff (which has 12-month rolling of $789) so estimated Sales in August is $123456 + $789. Null does not work in real-world and a computer that says "I don't know" is not a useful computer.

I'm definitely in the minority here but personally I think databases would be a lot easier if Null=empty string/0/0D/False or any init value set for a field. I've been coding for 15 years now and have not once used Null in a positive, useful way. The only time I care about Nulls is in WHERE (field='' OR field IS NULL). I would rather it treat nulls like empty strings etc. because then I only have to check for field=''. Nulls in every real-world app I've used mean the same as empty string/false etc.

Customer name isn't entered in the system yet? That's not allowed so system will not create the record. Has customer been set as type 'Active'? We don't know? Then they're not active. Do they have multiple shipping addresses? 0 rows in the customer-ship table? That means no, they don't have multiple shipping addresses as of right now. The worst is having a null in a boolean field. That's not boolean, that's a ternary! I either want a TRUE or a FALSE in a boolean field. Null doesn't make sense. Is employee salaried or not? Null is not allowed in that field. The user-interface will not accept a blank or null. It will be a radio button 'Yes' or 'No'. There is no way a Null can be set for that field.

I can totally understand nulls in C/C++ with pointers etc. That makes complete sense and I'd be a fool to argue against the need for nulls there. But for databases where a field value is missing, it's just extra work instead of any benefits. I don't see why I'd ever need a Null. It's not like I'm using weird hacks to do whatever null is supposed to do. I'm just saying all I ever get from null is extra code on my end yet I get no benefits of detecting that something is null.

ars · on May 15, 2009

There are plenty of uses for null. For example: last failed login date. What do you put if it never happened? You put NULL.

As a general rule, there is no good "empty" value for things that are not strings, so you use NULL for those.

For your boolean: you have a new user and want to ask if he wants the widget enabled. They could say yes or no - but until you ask, you set it to NULL, and it's an excellent flag for your application that means: ask the user.

If you have never used NULLs in your databases, well, that's more a failing on your part than a failing on the concept of NULL.

Sometimes you want them, and sometimes you don't - that's why the database asks you.

NULL is used as the answer to x/0.

NULL is used to indicate lack of records on an outer join.

If you ask for the average value of something - and there are no rows, you get NULL - what else should it return? 0 is not the correct answer.

> a computer that says "I don't know" is not a useful computer

And a computer that returns incorrect information is worse. You asked it "what are the total sales for august". You did not ask "what are the sales so far for august".

Without the NULL as the answer, how does your front end application know to collect the running average to display to the user? You most certainly do have to detect NULL.

I will admit it's very rare to use NULL instead of an empty string (in fact in some versions of oracle NULL and the empty string are the same). But NULL is used frequently for other datatypes. And it's used frequently in other parts of databases.

chime · on May 15, 2009

Thanks for the reply. Very good points. NULL is perfect for DIV/0 or no record in join errors. I have absolutely nothing against that. It's the NULL instead of empty string that really bothers me and causes problems. Setting the field to NOT NULL has its own set of problems when used over ODBC. I know it's not the fault of the DB but the problem still remains.

amalcon · on May 15, 2009

You can always put NOT NULL DEFAULT='' everywhere in your schema. If you do use nulls (I use them mostly in foreign keys), it will be good to have the capability, but nobody is required to have them.

That said, NOT NULL probably should have been the default. It's easy to update the schema to allow nulls if you decide you need them later. You need to make up some data to replace the nulls if you decide to take them out.

stcredzero · on May 15, 2009

In OO, isn't this territory covered by the "Missing Object" pattern? Also, aren't there new statically typed languages that define the missing object class for you?

A pernicious thing you find in Smalltalk at certain shops is finding lots of methods defined on UndefinedObject (nil) so that it can act as a Missing Object. (Like comparison operations.) The problem, is that this often results in not knowing what the heck to expect in a variable.

This is very bad. You can't change your mind about this! It can also break the system in insidiously subtle ways. (Sometimes things work because an exception is thrown!)

raganwald · on May 15, 2009

I'm not a huge fan of trying to make One UndefinedObject To Rule Them All:

http://github.com/raganwald/homoiconic/blob/master/2009-02-0...

stcredzero · on May 15, 2009

Yes. If possible, if a variable is for a Floogle, then make it always have a Floogle in it. If the thing is not there, represent it with a MissingFloogle.

I think dynamic languages would be wise to support this explicitly. (There's always a Missing subclass of every type.)

dkarl · on May 15, 2009

Funny how a little care in naming can save countless hours of confusion and explanation. In the original case of VBScript, if the value had been called "Unknown" instead of "Null," it would have saved a lot of confusion.

Now when it comes to languages like Java with only one special non-value value, "null" or "Null" or "Nothing," you have to know the context to know what it means. It might mean "this value is unknown," "this value has not been calculated yet," "an invalid value was calculated," or "there's a bug in the program." Or it might be overloaded to mean that one or more of those might be true.

If exactly one of those special cases is possible in your program, then everything is peachy: use null to represent that case. If you need to represent zero or more than one of those cases, then you'll end up with some kind of ugliness or imprecision.

teilo · on May 15, 2009

I think Lao Tzu said it best: "The Null that can be named is not the true Null". Or something like that.

biohacker42 · on May 15, 2009

Ah yes, but a lot of people think Null == 0. I have seen databases where 99% of the entries were Null, because they didn't know of the one to many relationship and thought Null was a good "number".

DrJokepu · on May 15, 2009

This reminds me to Oracle's behaviour of treating empty VARCHAR fields as NULL. I think it's a bit confusing since an empty jar is not the same as a non-existing jar.

frankus · on May 16, 2009

N.b. In Oracle, empty strings are null. In T-SQL (MS), empty strings are distinct from nulls.

speek · on May 15, 2009

null is icky... In java's arrayList, you can't really tell if a cell hasn't been initialized or has the value "null."

dkarl · on May 15, 2009

<pedantry>There are no uninitialized values in Java. Reference values are initialized to null by default.</pedantry>

I'm not sure how null makes your problem worse...? If you want a value meaning "this value hasn't been set by the program yet" then you'll have to specify a value for that purpose. The easiest thing is to just use null... unless null corresponds to a valid value for the type, in which case it seems fortunate that it's available to represent that value. I find null annoying when it doesn't correspond to any valid value for my type and I'm forced to check for it everywhere anyway.