I had to debug a process this week which was displaying unexpected duplicate results. The SQL statement went something like this:
Select distinct customer,order,product,description,instructions from orders
Instead of getting one row for some orders we were seeing two. It turns out that one of the fields contained a NULL value. When SQL compares two fields which are both NULL you might think that the result would be equal or true but it isn’t. When you consider this more deeply it makes some sense, Null meaning unknown or unspecified. In that case, why would you expect two unknown items to be equal?
The trick in my case was to use the coalesce function to wrap the fields which could contain null. This way you can have them treated as empty strings rather than ‘unknown’.
Select distinct customer,order,product,description, coalesce(instructions,”) as instructions from orders
The coalesce now wraps the instructions field converting nulls to empty strings which can then be compared and equal values dropped by the distinct clause on the select.