Part of your initial query is as follows:
FROM [dbo].[calendar] a
LEFT JOIN [dbo].[colleagueList] b
ON b.[Date] = a.d
WHERE DAY(a.[d]) = 1
AND a.[d] BETWEEN @dateStart AND COALESCE(@dateEnd,@dateStart)
That section of the plan is shown below:

[execution plan screenshot]

Your revised query, with BETWEEN @dateStart AND ISNULL(@dateEnd,@dateStart), has this for the same join:

[execution plan screenshot]

The difference seems to be that ISNULL simplifies further, and as a result you get more accurate cardinality estimates going into the next join. This is an inline table-valued function and you are calling it with literal values, so the optimizer can do something like:
a.[d] BETWEEN @dateStart AND ISNULL(@dateEnd,@dateStart)
a.[d] BETWEEN '2013-06-01' AND ISNULL(NULL,'2013-06-01')
a.[d] BETWEEN '2013-06-01' AND '2013-06-01'
a.[d] = '2013-06-01'
And as there is an equi-join predicate b.[Date] = a.d, the plan also shows an equality predicate b.[Date] = '2013-06-01'. As a result the cardinality estimate of 28,393 rows is likely to be pretty accurate.
For the CASE/COALESCE version, when @dateStart and @dateEnd are the same value it simplifies to the same equality expression and gives the same plan, but when @dateStart = '2013-06-01' and @dateEnd IS NULL it only gets as far as
a.[d] >= '2013-06-01' AND a.[d] <= CASE WHEN (1) THEN '2013-06-01' ELSE NULL END
which it also applies as an implied predicate on colleagueList. The estimated number of rows this time is 79.8 rows.
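This partial simplification follows from how COALESCE is defined. Per the SQL standard (and SQL Server's documentation) COALESCE is shorthand for a CASE expression, which is why a residual CASE survives in the predicate above, whereas ISNULL is a built-in function that the optimizer can fold straight down to a constant:

-- COALESCE(@dateEnd, @dateStart) is expanded by the parser to:
CASE WHEN @dateEnd IS NOT NULL THEN @dateEnd ELSE @dateStart END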
The next join along is
LEFT JOIN colleagueTime
ON colleagueTime.TC_DATE = colleagueList.Date
AND colleagueTime.ASSOC_ID = CAST(colleagueList.ID AS VARCHAR(10))
colleagueTime is a 3,249,590-row table which is (again) apparently a heap with no useful indexes.
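If adding an index is an option, something along these lines might help either plan shape (a sketch only; the table definition is not shown in the question, so the column types are assumptions):

CREATE INDEX IX_colleagueTime_TC_DATE_ASSOC_ID
    ON [dbo].[colleagueTime] (TC_DATE, ASSOC_ID);

Note that the CAST in the join predicate is applied to colleagueList.ID rather than to colleagueTime.ASSOC_ID, so the cast should not by itself prevent a seek on this index.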
This discrepancy in estimates affects the join choice. The ISNULL plan chooses a hash join that just scans the table once. The COALESCE plan chooses a nested loops join, and estimates that it will still only need to scan the table once, spool the result, and replay it 78 times; i.e. it estimates that the correlated parameters will not change. Given that the nested loops plan was still running after two hours, this assumption of a single scan against colleagueTime seems likely to be highly inaccurate.
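One way to test that theory without rewriting the expression would be to force the join strategy with a hint (a diagnostic sketch only; OPTION (HASH JOIN) constrains every join in the query, so it is a blunt instrument rather than a permanent fix):

SELECT ...
FROM [dbo].[calendar] a
LEFT JOIN [dbo].[colleagueList] b
    ON b.[Date] = a.d
LEFT JOIN colleagueTime
    ON colleagueTime.TC_DATE = colleagueList.Date
    AND colleagueTime.ASSOC_ID = CAST(colleagueList.ID AS VARCHAR(10))
WHERE DAY(a.[d]) = 1
    AND a.[d] BETWEEN @dateStart AND COALESCE(@dateEnd, @dateStart)
OPTION (HASH JOIN);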
As for why the estimated numbers of rows between the two joins are so much lower, I'm not sure without being able to see the statistics on the tables. The only way I managed to skew the estimated row counts that much in my testing was by adding a load of NULL rows (this reduced the estimated row count even though the actual number of rows returned remained the same).
The estimated row count in the COALESCE plan with my test data was on the order of
number of rows matching >= condition * 30% * (proportion of rows in the table not null)
Or in SQL:

SELECT 1E0 * COUNT([Date]) / COUNT(*)
       * (COUNT(CASE WHEN [Date] >= '2013-06-01' THEN 1 END) * 0.30)
FROM [dbo].[colleagueList]
but this does not square with your comment that the column has no NULL values.
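That is easy to double-check, since COUNT(*) counts all rows while COUNT([Date]) skips NULLs (a quick diagnostic sketch):

SELECT COUNT(*) - COUNT([Date]) AS null_date_rows
FROM [dbo].[colleagueList];

If this returns 0 the column really has no NULLs and the skew in the estimate must have another cause.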
Comments:

"…CASE with COALESCE(@dateEnd,@dateStart), does the issue still appear?" – ypercube Jul 17 at 0:52
"…ISNULL()?" – ypercube Jul 17 at 0:58
"What does SELECT task_state FROM sys.dm_os_tasks WHERE session_id = x show? If it spends a lot of time not in the RUNNING state, what wait types is that session getting in sys.dm_os_waiting_tasks?" – Martin Smith Jul 17 at 6:50
"…COALESCE. ISNULL fixed it." – FreshPrinceOfSO Jul 17 at 12:57