I have a slight problem with one of my query. The goal of this query is to get all the table1 items of a user and their information. As you can see, the data model is quite complex (for good reasons), and this requires an big query (my goal is to gather everything with one query only).
Here is the data model :
What I want :
- All T1 info
- All T2 info for one T1 item (it is a 1 to n relations, so I'll use array_agg)
- All T3 info for one T1 item
- All T4 info for one T1 item
- All T6 info for one T1 item
- i18n info for the T1 itemp
Here are the table1_table2 and table4_table6 SELECT * :
table1_id | table2_id
-------------+---------------
item2id | table2item1
item4id | table2item2
item4id | table2item1
item5id | table2item3
item5id | table2item2
table4_id | table6_id
------------------+--------------------
table4item1 | table6item1
table4item1 | table6item2
table4item2 | table6item2
table4item3 | table6item3
table4item1 | table6item3
table4item2 | table6item3
Here are the Table1 SELECT with id and its foreign key.
table1_id | table3_id
------------------------
item1id | table3item1
item2id | table3item1
item6id | table3item4
item3id | table3item2
item4id | table3item2
item5id | table3item3
Same for table3 :
table3_id | table4_id
------------+--------------
table3item1 | table4item1
table3item4 | table4item1
table3item2 | table4item2
table3item3 | table4item3
Finally, here is my query :
SELECT t1.id,
na.name,
array_to_json(array_agg(row_to_json(t2))) AS table2items,
array_to_json(array_agg(row_to_json(t6))) AS table6items
FROM table1 t1
INNER JOIN table1_i18n na ON na.table1_id = t1.id
INNER JOIN table3 t3 ON t3.id = t1.table3_id
INNER JOIN table4 t4 ON t4.id = t3.table4_id
LEFT JOIN table1_table2 t1t2 ON t1t2.table1_id = t1.id
LEFT JOIN table2 t2 ON t2.id = t1t2.table2_id
LEFT JOIN table4_table6 t5_t6 ON t5_t6.table5_id = t3.table4_id
LEFT JOIN table6 t6 ON t6.id = t5_t6.table6_id
WHERE t1.user_id = 'myuserid' AND na.lang = 'en_US'
GROUP BY t1.id, na.name, t4.id
ORDER BY t1.id;
Here is the result :
id | name | table3_id | table4_id | table2items | table6items
-------------+------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
item1id | MyFirstItem | table3item1 | table4item1 | [null,null,null] | [{"id":"table6item1"},{"id":"table6item2"},{"id":"table6item3"}]
item2id | MySecondItem | table3item1 | table4item1 | [{"table2item1","data1":"damage","data2":10},{"id":"table2item1","data1":"damage","data2":10},{"id":"table2item1","data1":"damage","data2":10}] | [{"id":"table6item1"},{"id":"table6item2"},{"id":"table6item3"}]
item3id | MyThirdItem | table3item2 | table4item2 | [null,null] | [{"id":"table6item2"},{"id":"table6item3"}]
item4id | MyFourthItem | table3item2 | table4item2 | [{"id":"table2item2","data1":"range","data2":20},{"id":"table2item1","data1":"damage","data2":10},{"id":"table2item2","data1":"range","data2":20},{"id":"table2item1","data1":"damage","data2":10}] | [{"id":"table6item2"},{"id":"table6item3"},{"id":"table6item3"},{"id":"table6item2"}]
item5id | MyFifthItem | table3item3 | table4item3 | [{"id":"table2item3","data1":"range","data2":20},{"id":"table2item2","data1":"range","data2":20}] | [{"id":"table6item3"},{"id":"table6item3"}]
item6id | MySixthItem | table3item4 | table4item1 | [null,null,null] | [{"id":"table6item2"},{"id":"table6item1"},{"id":"table6item3"}]
Well, I've got a problem here. As you can see, my table2_items and table6_items arrays have the same size. I don't know the reason for this, but it seems that I'm missing something. Worse, instead of filling this array with null value, this query creates duplicates which should not appear.
Details :
- item1 and item6 have the same problem : no links to table2, and 3 items in table6. I end up with an array [null, null, null] for table2_items
- item2 has 3 links to table 6, and 1 to table2. I end up with 3 times the same table2 object in the array
- item4... I don't know what's happening here. Should have 2 things in each array, and I've got 4 (duplicates)
- item5 : you can clearly see the duplication.
I have tried to group by table6.id, or table2.id. It doesn't work (I have got a line for each of them, so several line for each item).
Note : If I do
SELECT t1.id,
na.name,
array_to_json(array_agg(row_to_json(t2))) AS table2items,
FROM table1 t1
INNER JOIN table1_i18n na ON na.table1_id = t1.id
INNER JOIN table3 t3 ON t3.id = t1.table3_id
INNER JOIN table4 t4 ON t4.id = t3.table4_id
LEFT JOIN table1_table2 t1t2 ON t1t2.table1_id = t1.id
LEFT JOIN table2 t2 ON t2.id = t1t2.table2_id
WHERE t1.user_id = 'myuserid' AND na.lang = 'en_US'
GROUP BY t1.id, na.name, t4.id
ORDER BY t1.id;
alone, it works perfectly. Same for t6. It's only when I try to gather everything at the same time that I got some problems.
If it is not clear enough, ask for details. It's really not easy to explain such a problem :).