I've been having a lot of trouble with the following question:
List the yearly abnormal rate for each type of medical tests for 2000 - 2004 inclusive. An abnormal rate for a test type in a given year is the number of tests of the type with the result that is not "normal" over the number of the tests of the type, all conducted in the year. Note that the result set shall contain three columns, i.e., the year, the type of test, and the abnormal rate.
The table that's is used is test_record:
test_record(test_id,type_id,patient_no,employee_no,medical_lab,result,prescribe_date,test_date)
I've tried to restart it but I am stuck, I'll show my thought process below.
I started off by splitting the question into two pieces: finding the count of the abnormal results for each type in each year, and finding the total number of tests per type in each year.
The following query the count of abnormal tests per type per year (if there are any):
SELECT EXTRACT(YEAR FROM t.test_date) AS year, t.type_id, COUNT(t.test_id)
FROM test_record t
WHERE t.result <> 'normal'
AND EXTRACT(YEAR FROM t.test_date) >= 2000
AND EXTRACT(YEAR FROM t.test_date) <= 2004
GROUP BY EXTRACT(YEAR FROM t.test_date), t.type_id;
outputs:
YEAR TYPE_ID COUNT(T.TEST_ID)
---------- ---------- ----------------
2003 5 1
2003 4 3
2004 1 1
2003 1 2
which for my inputs is the correct count of tests with abnormal results.
I then proceeded with the second part by simply removing the line comparing t.result
to 'normal':
SELECT EXTRACT(YEAR FROM t.test_date) AS year, t.type_id, COUNT(t.test_id)
FROM test_record t
WHERE
AND EXTRACT(YEAR FROM t.test_date) >= 2000
AND EXTRACT(YEAR FROM t.test_date) <= 2004
GROUP BY EXTRACT(YEAR FROM t.test_date), t.type_id;
Output:
YEAR TYPE_ID COUNT(T.TEST_ID)
---------- ---------- ----------------
2003 5 2
2003 4 7
2004 3 1
2003 3 2
2004 1 2
2004 4 2
2003 1 6
2004 5 1
Which once again precisely fits my input.
But now when I try and combine them to get the rate, it all goes haywire. I've tried many different techniques but all end up erroneous. I'm really not sure how else to approach this last part. The most common error in my attempts had to do with grouping (I think).
SELECT EXTRACT(YEAR FROM t.test_date) AS year, t.type_id, COUNT(t.test_id)/abnormal.t_count AS rate
FROM test_record t,
(SELECT EXTRACT(YEAR FROM t1.test_date) AS year, t1.type_id, COUNT(t1.test_id) AS t_count
FROM test_record t1
WHERE t1.result <> 'normal'
AND EXTRACT(YEAR FROM t1.test_date) >= 2000
AND EXTRACT(YEAR FROM t1.test_date) <= 2004
GROUP BY EXTRACT(YEAR FROM t1.test_date), t1.type_id) abnormal
WHERE t.type_id = abnormal.type_id
AND EXTRACT(YEAR FROM t.test_date) >= 2000
AND EXTRACT(YEAR FROM t.test_date) <= 2004
GROUP BY EXTRACT(YEAR FROM t.test_date), t.type_id;
Output:
YEAR TYPE_ID RATE
---------- ---------- ----------
2003 4 2.33333333
2004 1 2
2003 1 6
2004 5 1
2003 5 2
2004 4 .666666667
2004 1 1
2003 1 3
As you can see, this output is a mess and is wrong. I'm not looking for an answer but a point in the right direction - or at least an explanation of where I've screwed up.