simplifying query that has multiple WITH and multiple subqueries

Question

It is bothering me that for a simple query, I have to write out so many sub-selects and WITH statements.

The question is: are there basic guidelines on how to simplify queries that have subqueries?

Here's my query:

WITH cte_min
     AS (SELECT a.client_id,
                a.specimen_source,
                a.received_date
         FROM   f_accession_daily a
                JOIN (SELECT DISTINCT f.client_id,
                                      f.received_date,
                                      f.accession_daily_key
                      FROM   F_ACCESSION_DAILY f
                             JOIN (SELECT CLIENT_ID,
                                          Min(received_date) MinRecDate
                                   FROM   F_ACCESSION_DAILY
                                   GROUP  BY CLIENT_ID) i
                               ON f.CLIENT_ID = i.CLIENT_ID
                                  AND f.RECEIVED_DATE = i.MinRecDate) b
                  ON a.ACCESSION_DAILY_KEY = b.ACCESSION_DAILY_KEY),
     cte_max
     AS (SELECT a.client_id,
                a.specimen_source,
                a.received_date
         FROM   f_accession_daily a
                JOIN (SELECT DISTINCT f.client_id,
                                      f.received_date,
                                      f.accession_daily_key
                      FROM   F_ACCESSION_DAILY f
                             JOIN (SELECT CLIENT_ID,
                                          Max(received_date) MaxRecDate
                                   FROM   F_ACCESSION_DAILY
                                   GROUP  BY CLIENT_ID) i
                               ON f.CLIENT_ID = i.CLIENT_ID
                                  AND f.RECEIVED_DATE = i.MaxRecDate) b
                  ON a.ACCESSION_DAILY_KEY = b.ACCESSION_DAILY_KEY),
     cte_est
     AS (SELECT DISTINCT client_id,
                         MLIS_DATE_ESTABLISHED
         FROM   D_CLIENT
         WHERE  REC_ACTIVE_FLG = 1
                AND MLIS_DATE_ESTABLISHED IS NOT NULL)
SELECT DISTINCT f.client_id,
                cmin.specimen_source,
                cmin.received_date,
                cmax.specimen_source,
                cmax.received_date,
                cest.MLIS_DATE_ESTABLISHED
FROM   F_ACCESSION_DAILY f
       LEFT JOIN cte_max cmax
         ON cmax.CLIENT_ID = f.CLIENT_ID
       LEFT JOIN cte_min cmin
         ON cmin.CLIENT_ID = f.CLIENT_ID
       LEFT JOIN cte_est cest
         ON cest.CLIENT_ID = f.CLIENT_ID

I am not asking necessarily for you to do the simplification yourself (although I would be very grateful for this), rather I am asking for general guidelines/directions on re-writing this query to be more elegant.

I wouldn't necesssarily rewrite it just for aesthetics if it already works. Are you having any kind of performance problems with it?
I definitely wouldn't rewrite it because you don't like subqueries. If it works and is fast enough, why on earth touch it? CTEs are used to simplify readability, so removing them would likely make the query a lot harder to follow.

RichardTheKiwi · Accepted Answer · 2012-09-25 21:33:10Z

Does this look any better?

;WITH minmax AS (
SELECT client_id, specimen_source, received_date,
       RMin = row_number() over (partition by Client_id
                                 order by received_date, accession_daily_key),
       RMax = row_number() over (partition by Client_id
                                 order by received_date desc, accession_daily_key desc)
FROM F_ACCESSION_DAILY
)
SELECT f.client_id,
       max(case when rmin=1 then f.specimen_source end),
       max(case when rmin=1 then f.received_date end),
       max(case when rmax=1 then f.specimen_source end),
       max(case when rmax=1 then f.received_date end),
       D.MLIS_DATE_ESTABLISHED
FROM   minmax f
LEFT JOIN D_CLIENT D ON D.REC_ACTIVE_FLG = 1 AND D.MLIS_DATE_ESTABLISHED IS NOT NULL
WHERE 1 in (f.rmin, f.rmax)
GROUP BY f.client_id, D.MLIS_DATE_ESTABLISHED

Where this might be different than OP is more than one ACCESSION_DAILY_KEY on the min/max (received_date). Still +1
@Blam - I had considered that, but given they have the same received_date which was part of the question's DISTINCT, that's a moot point.
I was going to argue as it is not filtered in the first distinct but realized it is filtered in the final distinct. Man that is a messy query. Already gave you a +1.

Blam · Answer 2 · 2012-09-25 20:30:42Z

50 rows reporting 5 values and in all of that only two tables are referenced.

In the first CTE you have 4 joins (or virtual joins) to the same table and no other table involved reporting 3 columns. Don't know the key so cannot conclude it can be reduced.

If a cte is not reference more than once then it does not result in less lines of code.

For one this cte can be replaced with less code.

cte_est
     AS (SELECT DISTINCT client_id,
                         MLIS_DATE_ESTABLISHED
         FROM   D_CLIENT
         WHERE  REC_ACTIVE_FLG = 1
                AND MLIS_DATE_ESTABLISHED IS NOT NULL)
...
cest.MLIS_DATE_ESTABLISHED
...
LEFT JOIN cte_est cest
         ON cest.CLIENT_ID = f.CLIENT_ID

reduces to

D_CLIENT.MLIS_DATE_ESTABLISHED
...
LEFT JOIN  D_CLIENT 
       ON  D_CLIENT.CLIENT_ID = f.CLIENT_ID 
       AND D_CLIENT.REC_ACTIVE_FLG = 1
       AND D_CLIENT.MLIS_DATE_ESTABLISHED IS NOT NULL

RBarryYoung · Answer 3 · 2012-09-25 21:17:48Z

While I am not sure if everyone would consider this simpler and/or easier to read, this is how I would do it:

WITH 
  cte_MaxMinRecvd As
(
    SELECT      CLIENT_ID,   
                Min(received_date) MinRecDate,
                Max(received_date) MaxRecDate
    FROM        F_ACCESSION_DAILY
    GROUP BY    CLIENT_ID
)
, cte_MaxMinDaily As
(
    SELECT      *
    FROM        F_ACCESSION_DAILY f
    JOIN        cte_MaxMinRecvd   i     ON  f.CLIENT_ID = i.CLIENT_ID
)
, cte_min AS 
(
    SELECT  a.client_id,
            a.specimen_source,
            a.received_date
    FROM    F_ACCESSION_DAILY a
    WHERE   EXISTS(
                    SELECT  *
                    FROM    cte_MaxMinDaily f
                    WHERE   f.RECEIVED_DATE       = f.MinRecDate
                      AND   a.ACCESSION_DAILY_KEY = f.ACCESSION_DAILY_KEY
                   )
)
, cte_max AS 
(
    SELECT  a.client_id,
            a.specimen_source,
            a.received_date
    FROM    f_accession_daily a
    WHERE   EXISTS(
                    SELECT  *
                    FROM    cte_MaxMinDaily f
                    WHERE   f.RECEIVED_DATE       = f.MinRecDate
                      AND   a.ACCESSION_DAILY_KEY = f.ACCESSION_DAILY_KEY
                   )
)
SELECT DISTINCT 
            f.client_id,
            cmin.specimen_source,
            cmin.received_date,
            cmax.specimen_source,
            cmax.received_date,
            cest.MLIS_DATE_ESTABLISHED
FROM        F_ACCESSION_DAILY f
LEFT JOIN   cte_max  cmax        ON  cmax.CLIENT_ID = f.CLIENT_ID
LEFT JOIN   cte_min  cmin        ON  cmin.CLIENT_ID = f.CLIENT_ID
LEFT JOIN   D_CLIENT cest        ON  cest.CLIENT_ID = f.CLIENT_ID 
                                 AND cest.REC_ACTIVE_FLG = 1
                                 AND cest.MLIS_DATE_ESTABLISHED IS NOT NULL

Mainly what I did was to

Turn most of the subqueries into CTEs, where applicable,
Merge the Min and Max subqueries together, and
Change the DISTINCT subqueries into EXISTS subqueries, which can be simpler (and usually perform better)

Ooops, I also got rid of the cte_est CTE as Blam suggested..

Sort of works as a general pattern, but you haven't really unravelled the group by/min/max pattern which really was a row_number() solution in legacy form.
Yes, I considered that approach, but it appeared to me that I would have to make assumptions about which columns were keys/unique ...

asked	9 months ago
viewed	107 times
active	9 months ago

simplifying query that has multiple WITH and multiple subqueries

3 Answers

Your Answer

Not the answer you're looking for? Browse other questions tagged sql sql-server sql-server-2008 tsql or ask your own question.

simplifying query that has multiple WITH and multiple subqueries

3 Answers

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged sql sql-server sql-server-2008 tsql or ask your own question.

Related