Tell me more ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free, no registration required.

I'm having some performance issues (speed of execution) with a very simple task, explained below. My code is to dumb to show here, so i'll explain the problem in detail:

Consider the following sql table, which stores daily rainfall data over ~12 years:

historic data

year, month, day, rainfall_observed
2001, 1, 1, 4
2001, 1, 2, 0
...
2012, 12, 31, 10

Now, consider another table with identical structure, but holding ~1 year of daily forecast data:

forecast

year, month, day, rainfall_forec
2013, 1, 1, 0
2013, 1, 2, 3
...
2013, 12, 31, 15

The problem: for each day of forecast data, count the same (month, day) in historic data, such that rainfall_observed > rainfall_forec.

Example

Consider the forecast entry

year = 2013, month = 3, day = 15, rainfall_forec = 10

Then i need to count in historic data every entry with month = 3 AND day = 15 WHERE rainfall observed > 10 (easy with SQL). So, my output is a list with a count for each forecast day.

There's a lot way of doing this:

1 - Use a SQL query using inside a for loop (1 query per forecast day = 365 queries in a loop - i didn't test it yet)

2 - Query all the observed data and use nested loops for comparison (~ 365 * 12 * 365 comparisons...)

3 - Do some mix of the above or some math with Dates (but here missing data is a issue)

I'm currently using option 2, but i need it to be faster. I'll try option 1, but maybe i'm missing a simpler solution.

So, i appreciate any advice on how to solve this. Thanks!

share|improve this question
What exactly is the output you require? Just a list of 365 counts? – Mike W Sep 20 at 4:24
yes, the output is an array of counts, with 365 items. – Fernando Sep 20 at 4:31

1 Answer

You can try a query like this:

SELECT r_for.year, r_for.month, r_for.day, count(r_for.rainfall_forec) 
FROM forecast AS r_for 
RIGHT JOIN historic_data AS r_obs 
ON r_for.month = r_obs.month 
  AND r_for.day = r_obs.day 
  AND r_obs.rainfall_observed > r_for.rainfall_forec 
GROUP BY r_for.year, r_for.month, r_for.day
share|improve this answer
Well, that looks nice, but these 2 table are not in the same DB... – Fernando Sep 20 at 19:09
There is dblink which may help you. But it's not usually installed by default so you need access to the database server. – pupkinsen Sep 21 at 17:46

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.