1

I'm having some performance issues (speed of execution) with a very simple task, explained below. My code is to dumb to show here, so i'll explain the problem in detail:

Consider the following sql table, which stores daily rainfall data over ~12 years:

historic data

year, month, day, rainfall_observed
2001, 1, 1, 4
2001, 1, 2, 0
...
2012, 12, 31, 10

Now, consider another table with identical structure, but holding ~1 year of daily forecast data:

forecast

year, month, day, rainfall_forec
2013, 1, 1, 0
2013, 1, 2, 3
...
2013, 12, 31, 15

The problem: for each day of forecast data, count the same (month, day) in historic data, such that rainfall_observed > rainfall_forec.

Example

Consider the forecast entry

year = 2013, month = 3, day = 15, rainfall_forec = 10

Then i need to count in historic data every entry with month = 3 AND day = 15 WHERE rainfall observed > 10 (easy with SQL). So, my output is a list with a count for each forecast day.

There's a lot way of doing this:

1 - Use a SQL query using inside a for loop (1 query per forecast day = 365 queries in a loop - i didn't test it yet)

2 - Query all the observed data and use nested loops for comparison (~ 365 * 12 * 365 comparisons...)

3 - Do some mix of the above or some math with Dates (but here missing data is a issue)

I'm currently using option 2, but i need it to be faster. I'll try option 1, but maybe i'm missing a simpler solution.

So, i appreciate any advice on how to solve this. Thanks!

2
  • What exactly is the output you require? Just a list of 365 counts? Commented Sep 20, 2013 at 4:24
  • yes, the output is an array of counts, with 365 items. Commented Sep 20, 2013 at 4:31

1 Answer 1

1

You can try a query like this:

SELECT r_for.year, r_for.month, r_for.day, count(r_for.rainfall_forec) 
FROM forecast AS r_for 
RIGHT JOIN historic_data AS r_obs 
ON r_for.month = r_obs.month 
  AND r_for.day = r_obs.day 
  AND r_obs.rainfall_observed > r_for.rainfall_forec 
GROUP BY r_for.year, r_for.month, r_for.day
Sign up to request clarification or add additional context in comments.

1 Comment

There is dblink which may help you. But it's not usually installed by default so you need access to the database server.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.