Sign up ×
Stack Overflow is a community of 4.7 million programmers, just like you, helping each other. Join them, it only takes a minute:

I have a slight problem with one of my query. The goal of this query is to get all the table1 items of a user and their information. As you can see, the data model is quite complex (for good reasons), and this requires an big query (my goal is to gather everything with one query only).

Here is the data model :

model

What I want :

  • All T1 info
  • All T2 info for one T1 item (it is a 1 to n relations, so I'll use array_agg)
  • All T3 info for one T1 item
  • All T4 info for one T1 item
  • All T6 info for one T1 item
  • i18n info for the T1 itemp

Here are the table1_table2 and table4_table6 SELECT * :

   table1_id | table2_id
-------------+---------------
 item2id     | table2item1
 item4id     | table2item2
 item4id     | table2item1
 item5id     | table2item3
 item5id     | table2item2

   table4_id      | table6_id
------------------+--------------------
 table4item1      | table6item1
 table4item1      | table6item2
 table4item2      | table6item2
 table4item3      | table6item3
 table4item1      | table6item3
 table4item2      | table6item3

Here are the Table1 SELECT with id and its foreign key.

 table1_id | table3_id
------------------------
 item1id   | table3item1
 item2id   | table3item1
 item6id   | table3item4
 item3id   | table3item2
 item4id   | table3item2
 item5id   | table3item3

Same for table3 :

  table3_id  |  table4_id
------------+--------------
 table3item1 | table4item1
 table3item4 | table4item1
 table3item2 | table4item2
 table3item3 | table4item3

Finally, here is my query :

SELECT t1.id,
na.name,
array_to_json(array_agg(row_to_json(t2))) AS table2items,
array_to_json(array_agg(row_to_json(t6))) AS table6items
FROM table1 t1
INNER JOIN table1_i18n na ON na.table1_id = t1.id
INNER JOIN table3 t3 ON t3.id = t1.table3_id
INNER JOIN table4 t4 ON t4.id = t3.table4_id
LEFT JOIN table1_table2 t1t2 ON t1t2.table1_id = t1.id
LEFT JOIN table2 t2 ON t2.id = t1t2.table2_id
LEFT JOIN table4_table6 t5_t6 ON t5_t6.table5_id = t3.table4_id
LEFT JOIN table6 t6 ON t6.id = t5_t6.table6_id
WHERE t1.user_id = 'myuserid' AND na.lang = 'en_US'
GROUP BY t1.id, na.name, t4.id
ORDER BY t1.id;

Here is the result :

     id      |     name     |   table3_id  |  table4_id  |           table2items                                                                                                                             |  table6items
-------------+------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 item1id     | MyFirstItem  |  table3item1 | table4item1 | [null,null,null]                                                                                                                                  | [{"id":"table6item1"},{"id":"table6item2"},{"id":"table6item3"}]
 item2id     | MySecondItem |  table3item1 | table4item1 | [{"table2item1","data1":"damage","data2":10},{"id":"table2item1","data1":"damage","data2":10},{"id":"table2item1","data1":"damage","data2":10}]   | [{"id":"table6item1"},{"id":"table6item2"},{"id":"table6item3"}]
 item3id     | MyThirdItem  |  table3item2 | table4item2 | [null,null]                                                                                                                                       | [{"id":"table6item2"},{"id":"table6item3"}]
 item4id     | MyFourthItem |  table3item2 | table4item2 | [{"id":"table2item2","data1":"range","data2":20},{"id":"table2item1","data1":"damage","data2":10},{"id":"table2item2","data1":"range","data2":20},{"id":"table2item1","data1":"damage","data2":10}] | [{"id":"table6item2"},{"id":"table6item3"},{"id":"table6item3"},{"id":"table6item2"}]
 item5id     | MyFifthItem  |  table3item3 | table4item3 | [{"id":"table2item3","data1":"range","data2":20},{"id":"table2item2","data1":"range","data2":20}]                                                 | [{"id":"table6item3"},{"id":"table6item3"}]
 item6id     | MySixthItem  |  table3item4 | table4item1 | [null,null,null]                                                                                                                                  | [{"id":"table6item2"},{"id":"table6item1"},{"id":"table6item3"}]

Well, I've got a problem here. As you can see, my table2_items and table6_items arrays have the same size. I don't know the reason for this, but it seems that I'm missing something. Worse, instead of filling this array with null value, this query creates duplicates which should not appear.

Details :

  • item1 and item6 have the same problem : no links to table2, and 3 items in table6. I end up with an array [null, null, null] for table2_items
  • item2 has 3 links to table 6, and 1 to table2. I end up with 3 times the same table2 object in the array
  • item4... I don't know what's happening here. Should have 2 things in each array, and I've got 4 (duplicates)
  • item5 : you can clearly see the duplication.

I have tried to group by table6.id, or table2.id. It doesn't work (I have got a line for each of them, so several line for each item).

Note : If I do

SELECT t1.id,
na.name,
array_to_json(array_agg(row_to_json(t2))) AS table2items,
FROM table1 t1
INNER JOIN table1_i18n na ON na.table1_id = t1.id
INNER JOIN table3 t3 ON t3.id = t1.table3_id
INNER JOIN table4 t4 ON t4.id = t3.table4_id
LEFT JOIN table1_table2 t1t2 ON t1t2.table1_id = t1.id
LEFT JOIN table2 t2 ON t2.id = t1t2.table2_id
WHERE t1.user_id = 'myuserid' AND na.lang = 'en_US'
GROUP BY t1.id, na.name, t4.id
ORDER BY t1.id;

alone, it works perfectly. Same for t6. It's only when I try to gather everything at the same time that I got some problems.

If it is not clear enough, ask for details. It's really not easy to explain such a problem :).

share|improve this question
3  
Don't anonymize and obfuscate your schema like this. It makes things difficult to read — in your case, downright impossible. – Denis de Bernardy Jan 14 '14 at 7:10
    
Sorry… I've abandoned the idea of even reading this impossibly dry question. I would advise, though, to actually learn what a join, a group by and aggregate functions do, as none of these concepts seem clear in your mind. Also, I'd advise against running "pull everything in one go" types of queries like you're trying to do; fetch t2 and t6 data in separate queries — and without the json/array_agg lunacy. – Denis de Bernardy Jan 14 '14 at 7:24
    
Thanks @Denis . I will change my question to make it clearer. I know a thing or two about join , group by and aggregate, enough to make this query working if I only want one of those arrays. But I'm missing something, and that is exactly why I'm posting here. I can get the data via 2 queries without any problem, I would rather have just one for performances and code readability. As for the array_agg and json, that's necessary to get my info and process them. I don't see any problem with this. – Pierrick Bignet Jan 14 '14 at 12:43
1  
From the little I understood of your question, your problem seems to be that your joins are multiplying rows (as they should), and you're aggregating on the entire set. Whereas what you possibly want instead is to run your aggregates independently from one another and then join. If so (and even if not, in fact), you'll be better off finding the rows you want and then, in separate queries, fetching a handful of aggregates based on the latter. – Denis de Bernardy Jan 14 '14 at 13:07
    
>>Here is my query... shows 4 columns (id, name, bonuses, skills) but the results show 5 columns (id, name, table3, table4, table2items). Then the "working" query shows 3. I can't figure out what you're comparing. – Kirk Roybal Jan 14 '14 at 23:55

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Browse other questions tagged or ask your own question.