Improve performance with the WHERE NOT IN sub-select clause

Question

In the following query, I have to count transactions for each customer. [EDIT]However, I have to exclude from the result set entirely, customers that have a transaction older than one year.

Shouldn't the query optimizer be smart enough to only evaluate existence once for each customer?

--Count transactions on customers that are less than 1 year old

  SELECT t1.CUSTID,COUNT(*)
  FROM CUST_TRX t1
  WHERE NOT EXISTS ( 
    SELECT FIRST 1 1 
    FROM CUST_TRX t2 
    WHERE 
      t2.CUSTID=t1.CUSTID AND
      t2.DATED<CURRENT_DATE-365
    GROUP BY t2.CUSTID
  )
  GROUP BY t1.CUSTID

There are no naturals in my query plan. This query is performing as if the database is running the existence clause for every transaction instead of running it for every customer. Performance is the same if I remove the GROUP BY in the sub-query.

Is there a better way to do this so that I may get better performance out of the database? Hopefully a simple select query will work avoiding a CTE if possible (that would introduce other challenges). Due to other group by criteria (not shown here) I'm not able to simply check MIN(DATED), I really need to perform another query.

What indexes are there on the table? What is the actual query and what is the execution plan of the actual query? — ypercube, Aug 27 at 19:55

RLF · Answer 1 · 2013-08-28 12:19:54Z

When you say "count transactions on customers that are less than 1 year old" do you mean:

Count all customer transactions that are less than one year old?
Count all transactions for new customers less than one year old?

From the sample code I understand that #1 is what you want. In that case, do you really need a WHERE NOT EXISTS? Could you just do something like:

SELECT t1.CUSTID, COUNT(*)
FROM CUST_TRX t1
WHERE t1.DATED>=CURRENT_DATE-365
GROUP BY t1.CUSTID
HAVING COUNT(*) > 0

I am not a Firebird user, but I looked up the GROUP BY / HAVING syntax.

[EDIT]Exclude from the result set customers that have a transaction older than one year.

OK, the here is other take on aggregating the rows to eliminate customer from the select.

SELECT A.CUSTID, A.HowMany
FROM (SELECT t1.CUSTID, COUNT(*) HowMany, MIN(t1.DATED) OldestTran
    FROM CUST_TRX t1
    GROUP BY t1.CUSTID
    HAVING COUNT(*) > 0 AND MIN(t1.DATED) >=CURRENT_DATE-365) AS A

[EDIT] OK, so the query is more complex that can be embodied in a single query.

This means that you will likely need to use a pattern much like the one you first posted. Note that EXISTS implies a DISTINCT and is often faster than a JOIN from a SELECT DISTINCT. But you can try different approaches and compare the behavior, timing, etc. Then choose the one that you like the best.

I have to exclude from the result set entirely, customers that have a transaction older than one year. — jcalfee314, Aug 27 at 19:04

asked	3 months ago
viewed	138 times
active	yesterday

Improve performance with the WHERE NOT IN sub-select clause

1 Answer

Your Answer

Not the answer you're looking for? Browse other questions tagged sql query-performance subquery firebird query-plan or ask your own question.

Improve performance with the WHERE NOT IN sub-select clause

1 Answer

Your Answer

Sign up or login

Post as a guest

Not the answer you're looking for? Browse other questions tagged sql query-performance subquery firebird query-plan or ask your own question.

Related