2

Ok, so this particular use case is giving me quite a bit of headache.

This is a edit of another post, where i forgot to add an important aspect to the usecase.

What is need to do is match a Candidates WorkWish with the CustomerJobQueries in the database.

This means match if the DateTimes in the WorkWish is overlapping with any of the CustomerJobQueries. then i need to match the GeographicalPreference with the location of the Customer.

The logic we had when we build the database was this:

  • if the Country is set but the region and municipality is NULL - the candidate wishes to work in the entire country.
  • if the country and region is set the candidate wishes to work in all municipalities in the specific region in the specific country.
  • if the country, region and municipality is set he wishes to only work in that municipality.

It is possible for multiple rows to exists such that 2 specific municipalities is set, the only rule is the there will never exist a row that takes precedence over another.

meaning

GeographicalPreferenceId Country_ShortName    Region_Id   Municipality_Id WorkWish_Id
------------------------ -------------------- ----------- --------------- -----------
1                        DK                   NULL        NULL            1
2                        DK                   3           NULL            1

Is NOT possible because one row already specifies he wishes to work in the entire country.

This is how a snippet of the database looks like

DateRange Table -- contains all DateRanges

DateRangeId StartDate                   EndDate                     CustomerJobQuery_Id WorkWish_Id
----------- --------------------------- --------------------------- ------------------- -----------
1           2014-03-31 00:00:00.0000000 2014-08-18 00:00:00.0000000 NULL                1
2           2014-08-25 00:00:00.0000000 2014-09-15 00:00:00.0000000 NULL                1
3           2013-08-24 00:00:00.0000000 2014-09-25 00:00:00.0000000 1                   NULL

Workwish Table

Id          Candidate_Id
----------- ------------
1           5

CustomerJobQuery Table

Id          Customer_CustomerId Country_ShortName  Region_Id  Municipality_Id
----------- ------------------- -----------------  ---------  -----------------
1           2                   DK                 2          3

GeographicalPreference Table

GeographicalPreferenceId Country_ShortName    Region_Id   Municipality_Id WorkWish_Id
------------------------ -------------------- ----------- --------------- -----------
1                        DK                   2           NULL            1
2                        DK                   3           NULL            1

What is have so far is; - Returns the CustomerJobQueries that overlap with the WorkWish dates.

SELECT c.*
 FROM CustomerJobQuery c
  WHERE EXISTS ( 
   SELECT *
     FROM Workwish w
     INNER JOIN DateRange d1 ON d1.CustomerJobQuery_Id = c.Id
     INNER JOIN DateRange d2 ON d2.WorkWish_Id = w.Id
     WHERE w.Candidate_Id = @CandidateId -- input from my **StoredProcedure**
     -- overlap check:
     AND d1.StartDate <= d2.EndDate
     AND d2.StartDate <= d1.EndDate

)

I am uncertain how i would do the geographicalpreference logic in a setbased manner. - currently our solution is done in C# loading everything into memory and looping over everything with nested foreaches which is horribly ineffective. And is an absolute worst case scenario.

Any help is much appreciated!

1
  • Why is the DateRange table joining on the two id columns? This looks ver suspicious. Very strange data structure. Commented Oct 3, 2014 at 14:03

2 Answers 2

1

I have created some tables, since you haven't provided any DDL.

declare @CandidateId int;

declare @WorkWish table (
    Id int primary key,
    CandidateId int
);

declare @Preference table (
    Id int primary key,
    WishId int,
    Country char(2),
    RegionId int null,
    MunicipalityId int null
);

declare @DateRange table (
    Id int primary  key,
    StartDate datetime,
    EndDate datetime,
    JobQueryId int null,
    WishId int null
);

declare @JobQuery table (
    Id int primary key,
    CustomerId int,
    Country char(2),
    RegionId int,
    MunicipalityId int
);


select *,
    case
        when p.MunicipalityId is not null then 0
        when p.RegionId is not null then 1
        else 2
    end as [LocationMetric]
from @WorkWish w
    inner join @Preference p on w.Id = p.WishId
    inner join @DateRange dw on dw.WishId = w.Id
    inner join @JobQuery j on j.Country = p.Country
    inner join @DateRange dj on dj.JobQueryId = j.Id
where w.CandidateId = @CandidateId
    and dw.StartDate <= dj.EndDate
    and dj.StartDate <= dw.EndDate
    and nullif(p.RegionId, j.RegionId) is null
    and nullif(p.MunicipalityId, j.MunicipalityId) is null
order by LocationMetric;

I have to warn you about the performance of the NULLIF() function, however - it's far from perfect. You can try and expand it into a corresponding case construct, but usually it isn't worth the effort.

Still, it should be better than what you have now.

1
  • Thank you. Your answer really helped me on the way. I've rewritten most of it, but you helped me on how to approach the problem. Commented Oct 7, 2014 at 12:31
0

I think the following will get you all matches:

WITH ww as (
      SELECT ww.*, drw.StartDate, drw.EndDate
      FROM WorkWish ww JOIN
           DateRange drw
           ON drw.WorkWish_Id = w.id
     )
SELECT c.*
FROM CustomerJobQuery c JOIN
     DateRange drc
     ON drc.CustomerJobQuery_Id = c.id JOIN
     ww
     ON drc.StartDate <= ww.EndDate AND ww.StartDate <= drw.EndDate AND
        (drc.Country_ShortName = ww.Country_ShortName and ww.Region_id is NULL or
         drc.Country_ShortName = ww.Country_ShortName and drc.Region_id = ww.Region_id and ww.municipality_id is null or
         drc.Country_ShortName = ww.Country_ShortName and drc.Region_id = ww.Region_id and drc.Municipality_Id = ww.Munipality_Id
        )
)

If you want to prioritize the matches, then use row_number() and a subquery:

WITH ww as (
      SELECT ww.*, drw.StartDate, drw.EndDate
      FROM WorkWish ww JOIN
           DateRange drw
           ON drw.WorkWish_Id = w.id
     )
SELECT c.*
FROM (SELECT c.*, ww.id as WorkId,
             ROW_NUMBER() OVER (PARTITION BY c.id
                                ORDER BY (case when ww.Munipality_Id is not null then 1
                                               when ww.Region_id is not null then 2
                                               else 3
                                          end)
                               ) as seqnum
      FROM CustomerJobQuery c JOIN
           DateRange drc
           ON drc.CustomerJobQuery_Id = c.id JOIN
           ww
           ON drc.StartDate <= ww.EndDate AND ww.StartDate <= drw.EndDate AND
              (drc.Country_ShortName = ww.Country_ShortName and ww.Region_id is NULL or
               drc.Country_ShortName = ww.Country_ShortName and drc.Region_id = ww.Region_id and ww.municipality_id is null or
               drc.Country_ShortName = ww.Country_ShortName and drc.Region_id = ww.Region_id and drc.Municipality_Id = ww.Munipality_Id
              )
     ) c
WHERE seqnum = 1;
1
  • I appreciate the attempt. But the GeographicalPreference table is the one the has the columns Country_ShortName Region and Municipality.. not DateRange nor WorkWish... Commented Oct 3, 2014 at 18:50

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.