开发者

Why doesn't SQL support "= null" instead of "is null"?

开发者 https://www.devze.com 2023-03-28 16:13 出处:网络
I\'m not asking if it does.I know that it doesn\'t. I\'m curious as to the reason.I\'ve read support docs such as as this one on Working With Nulls in MySQL but they don\'t really give any reason.The

I'm not asking if it does. I know that it doesn't.

I'm curious as to the reason. I've read support docs such as as this one on Working With Nulls in MySQL but they don't really give any reason. They only repeat the mantra that you have to use "is null" instead.

This has always bothered me. When doing dynamic SQL (those rare times when it has to be done) it would be so much easier to pass "null" into where clause like this:

@where = "where GroupId = null"

Which would be a simple replacement for a regular variable. Instead we have to use if/else blocks to do stuff like:

if @groupId is null then
     @where = "where GroupId is null"
else
     @where = "where GroupId = @groupId"
end

In larger more-complicated queries, this is a huge pain in the neck. Is there a specific reason that SQL and all the major RDBMS vendors don't allow this? Some kind of keyword conflict or value conflict that it would create?

Edit:

The problem with a lot of the answers (in my opinion) is that everyone is setting up an equivalency between null and "I don't know what the value is". There's a huge difference between those two things. If null meant "there's a value but it's unknown" I would 100% agree that nulls couldn't be equal. But SQL null doesn't mean that. It means that there is no value. Any two SQL results that are null both have no value. No value does not equal unknown value. Two different things. That's an important distinction.

Edit 2:

The other problem I have is that other HLLs allow n开发者_开发技巧ull=null perfectly fine and resolve it appropriately. In C# for instance, null=null returns true.


The reason why it's off by default is that null is really not equal to null in a business sense. For example, if you were joining orders and customers:

select * from orders o join customers c on c.name = o.customer_name

It wouldn't make a lot of sense to match orders with an unknown customer with customers with an unknown name.

Most databases allow you to customize this behaviour. For example, in SQL Server:

set ansi_nulls on
if null = null  
    print 'this will not print' 
set ansi_nulls off
if null = null  
    print 'this should print'


Equality is something that can be absolutely determined. The trouble with null is that it's inherently unknown. If you follow the truth table for three-value logic, null combined with any other value is null - unknown. Asking SQL "Is my value equal to null?" would be unknown every single time, even if the input is null. I think the implementation of IS NULL makes it clear.


It's a language semantic.

Null is the lack of a value.

is null makes sense to me. It says, "is lacking a value" or "is unknown". Personally I've never asked somebody if something is, "equal to lacking a value".


I can't help but feel that you're still not satisfied with the answers that have been given so far, so I thought I'd try another tack. Let's have an example (no, I've no idea why this specific example has come into my head).

We have a table for employees, EMP:

EMP
---
EMPNO           GIVENNAME
E0001           Boris
E0002           Chris
E0003           Dave
E0004           Steve
E0005           Tony

And, for whatever bizarre reason, we're tracking what colour trousers each employee chooses to wear on a particular day (TROUS):

TROUS
-----
EMPNO       DATE        COLOUR
E0001       20110806    Brown
E0002       20110806    Blue
E0003       20110806    Black
E0004       20110806    Brown
E0005       20110806    Black
E0001       20110807    Black
E0003       20110807    Black
E0004       20110807    Grey

I could go on. We write a query, where we want to know the name of every employee, and what colour trousers they had on on the 7th August:

SELECT e.GIVENNAME,t.COLOUR
FROM
    EMP e
        LEFT JOIN
    TROUS t
        ON
             e.EMPNO = t.EMPNO and
             t.DATE = '20110807'

And we get the result set:

GIVENNAME       COLOUR
Chris           NULL
Steve           Grey
Dave            Black
Boris           Black
Tony            NULL

Now, this result set could be in a view, or CTE, or whatever, and we might want to continue asking questions about these results, using SQL. What might some of these questions be?

  1. Were Dave and Boris wearing the same colour trousers on that day? (Yes, Black==Black)

  2. Were Dave and Steve wearing the same colour trousers on that day? (No, Black!=Grey)

  3. Were Boris and Tony wearing the same colour trousers on that day? (Unknown - we're trying to compare with NULL, and we're following the SQL rules)

  4. Were Boris and Tony not wearing the same colour trousers on that day? (Unknown - we're again comparing to NULL, and we're following SQL rules)

  5. Were Chris and Tony wearing the same colour trousers on that day? (Unknown)

Note, that you're already aware of specific mechanisms (e.g. IS NULL) to force the outcomes you want, if you've designed your database to never use NULL as a marker for missing information.

But in SQL, NULL has been given two roles (at least) - to mark inapplicable information (maybe we have complete information in the database, and Chris and Tony didn't turn up for work that day, or did but weren't wearing trousers), and to mark missing information (Chris did turn up that day, we just don't have the information recorded in the database at this time)

If you're using NULL purely as a marker of inapplicable information, I assume you're avoiding such constructs as outer joins.


I find it interesting that you've brought up NaN in comments to other answers, without seeing that NaN and (SQL) NULL have a lot in common. The biggest difference between them is that NULL is intended for use across the system, no matter what data type is involved.

You're biggest issue seems to be that you've decided that NULL has a single meaning across all programming languages, and you seem to feel that SQL has broken that meaning. In fact, null in different languages frequently has subtly different meanings. In some languages, it's a synonym for 0. In others, not, so the comparison 0==null will succeed in some, and fail in others. You mentioned VB, but VB (assuming you're talking .NET versions) does not have null. It has Nothing, which again is subtly different (it's the equivalent in most respects of the C# construct default(T)).


The concept is that NULL is not an equitable value. It denotes the absence of a value.

Therefore, a variable or a column can only be checked if it IS NULL, but not if it IS EQUAL TO NULL.

Once you open up arithmetic comparisions, you may have to contend with IS GREATER THAN NULL, or IS LESS THAN OR EQUAL TO NULL


NULL is unknown. It is neither true nor false so when you are comparing anything to unknown, the only answer is "unknown" Much better article on wikipedia http://en.wikipedia.org/wiki/Null_(SQL)


Because in ANSI SQL, null means "unknown", which is not a value. As such, it doesn't equal anything; you can just evaluate the value's state (known or unknown).


a. Null is not the "lack of a value"

b. Null is not "empty"

c. Null is not an "unset value"

It's all of the above and none of the above.

By technical rights, NULL is an "unknown value". However, like uninitialized pointers in C/C++, you don't really know what your pointing at. With databases, they allocate the space but do not initialize the value in that space.

So, it is an "empty" space in the sense that it's not initialized. If you set a value to NULL, the original value stays in that storage location. If it was originally an empty string (for example), it will remain that.

It's a "lack of a value" in the fact that it hasn't been set to what the database deems a valid value.

It's an "unset value" in that if the space was just allocated, the value that is there has never been set.

"Unknown" is the closest that we can truly come to knowing what to expect when we examine a NULL.


Because of that, if we try to compare this "unknown" value, we will get a comparison that

a) may or may not be valid

b) may or may not have the result we expect

c) may or may not crash the database.

So, the DBMS systems (long ago) decided that it doesn't even make sense to use equality when it comes to NULL.

Therefore, "= null" makes no sense.


In addition to all that has already been said, I wish to stress that what you write in your first line is wrong. SQL does support the “= NULL” syntax, but it has a different semantic than “IS NULL” – as can be seen in the very piece of documentation you linked to.


I agree with the OP that

where column_name = null

should be syntactic sugar for

where column_name is null

However, I do understand why the creators of SQL wanted to make the distinction. In three-valued logic (IMO this is a misnomer), a predicate can return two values (true or false) OR unknown which is technically not a value but just a way to say "we don't know which of the two values this is". Think about the following predicate in terms of three-valued logic:

A == B

This predicate tests whether A is equal to B. Here's what the truth table looks like:

    T U F
    -----
T | T U F
U | U U U
F | F U T

If either A or B is unknown, the predicate itself always returns unknown, regardless of whether the other one is true or false or unknown.

In SQL, null is a synonym for unknown. So, the SQL predicate

column_name = null

tests whether the value of column_name is equal to something whose value is unknown, and returns unknown regardless of whether column_name is true or false or unknown or anything else, just like in three-valued logic above. SQL DML operations are restricted to operating on rows for which the predicate in the where clause returns true, ignoring rows for which the predicate returns false or unknown. That's why "where column_name = null" doesn't operate on any rows.


NULL doesn't equal NULL. It can't equal NULL. It doesn't make sense for them to be equal.

A few ways to think about it:

  1. Imagine a contacts database, containing fields like FirstName, LastName, DateOfBirth and HairColor. If I looked for records WHERE DateOfBirth = HairColor, should it ever match anything? What if someone's DateOfBirth was NULL, and their HairColor was too? An unknown hair color isn't equal to an unknown anything else.

  2. Let's join the contacts table with purchases and product tables. Let's say I want to find all the instances where a customer bought a wig that was the same color as their own hair. So I query WHERE contacts.HairColor = product.WigColor. Should I get matches between every customer I don't know the hair color of and products that don't have a WigColor? No, they're a different thing.

  3. Let's consider that NULL is another word for unknown. What's the result of ('Smith' = NULL)? The answer is not false, it's unknown. Unknown is not true, therefore it behaves like false. What's the result of (NULL = NULL)? The answer is also unknown, therefore also effectively false. (This is also why concatenating a string with a NULL value makes the whole string become NULL -- the result really is unknown.)


Why don't you use the isnull function?

@where = "where GroupId = "+ isnull(@groupId,"null")
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号