Join the Stack Overflow Community
Stack Overflow is a community of 6.8 million programmers, just like you, helping each other.
Join them; it only takes a minute:
Sign up

I want to use array_agg in a subquery, then make use of the aggregated data by it's array index in my main query, however, after trying many different ways, I'm really at a loss as to how it should be done; could someone explain why in the example below I get a series of None values instead of the first category in the array?

I understand that the following, simplified example can be done without doing a SELECT on array[i], but it will explain the nature of the problem:

from sqlalchemy import Integer
from sqlalchemy.dialects.postgres import ARRAY

prods = (
    session.query(
        Product.id.label('id'),
        func.array_agg(ProductCategory.id, type_=ARRAY(Integer)).label('cats'))
    .outerjoin(
        ProductCategory,
        ProductCategory.product_id == Product.id)
    .group_by(Product.id).subquery()
)

# Confirm that there's categories:
[x for x in session.query(prods) if len(x[1]) > 1][:10]
"""
Out[48]: 
[(2428, [1633667, 1633665, 1633666]),
 (2462, [1162046, 1162043, 2543783, 1162045]),
 (2573, [1633697, 1633696]),
 (2598, [2546824, 922288, 922289]),
 (2645, [2544843, 338411]),
 (2660, [1633713, 1633714, 1633712, 1633711]),
 (2686, [2547480, 466995, 466996]),
 (2748, [2546706, 2879]),
 (2785, [467074, 467073, 2545804]),
 (2806, [2545326, 686295, 686298, 686297])]
"""

# Ok now try to query to get the first category of each array:
[x for x in session.query(prods.c.cats[0].label('first_cat'))]

"""
 (None),
 (None),
 (None),
 (None),
 (None),
 (None),
 (None),
 (None),
 (None),
 (None),
 (None),
"""
share|improve this question
    
You are using 0 as index for the first array element. In Postgres, the index starts with 1 by default. Does Python translate this to 0 or is it a simple off-by-one issue? – Erwin Brandstetter Apr 24 '14 at 8:13
    
This was it, thanks! – benwah Apr 26 '14 at 2:25
up vote 7 down vote accepted

You are probably doing everything right, but getting empty arrays first. In your previous query you used in-python filtering (len(x[1]) > 1). You can print Query expression before executing it to make sure.

You should probably add a having clause to you base query:

cats_agg = func.array_agg(ProductCategory.id, type_=ARRAY(Integer)).label('cats')
prods = (
    session.query(
        Product.id.label('id'),
        cats_agg,
    .outerjoin(
        ProductCategory,
        ProductCategory.product_id == Product.id)
    .group_by(Product.id)
    .having(func.array_length(cats_agg, 1) > 1)
    .subquery()
)

Then you won't need in-python filtering either.

share|improve this answer
    
The problem was just that.. wrong index used for the array, [1] instead of [0]. Thanks! – benwah Apr 24 '14 at 20:24
    
There's that, too) – letitbee Apr 25 '14 at 10:33

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.