Take the 2-minute tour ×
Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.890

I'm relatively new to SQL and am learning a lot of about writing queries as in my current work role. I encountered this problem that puzzles me. If I do a count of subscriptions that are paid with a null churn date, I get a count of 1279. This is the correct number. But once I add the count of customers that installed shopify, this count jumps to 1835. Why would this happen? Can anybody help me out? Thanks!

select 
    count (s.account_id)
  , count (case when c.type ilike 'shopifychannel' and s.status = 'paid' 
                then c.created_at else null end) as installed_shopify
from subscriptions s
join channels c on s.account_id = c.account_id
where s.status = 'paid' 
  and s.churned_at is null

Count s.id will count the total number of customers who have s.status = 'paid' Count case when 'shopifychannel' will count the total number of customers who have paid and installed a shopify channel. These are the outputs I am attempting to get.

share|improve this question
2  
Please show the original query as well as the new one. If possible, explain analyze output for both too. What's your PostgreSQL version? SELECT version(). Edit the question to add this info, then comment here when done. –  Craig Ringer Apr 16 at 2:21
    
If you have PostgreSQL 9.4 (or higher), then you should look into the filter clause on your aggregate expressions. –  Colin 't Hart Apr 16 at 8:43
    
Alright. I made the edits. From my understanding, using case when will only count the cases when they have installed shopify. I am using "case when" because the rest of my query will count cases when they installed "amazon", etc. (other sales channels) –  Lucas Neo Apr 17 at 1:34
2  
@LucasNeo no, you did not answer Craig's questions at all. –  dezso Apr 17 at 8:01

1 Answer 1

up vote 0 down vote accepted

I suppose you have a problem with your channels table that may return more than one row per s.account_id. I propose this query :

SELECT 
    COUNT (DISTINCT s.account_id),
    COUNT (DISTINCT CONCAT(s.account_id, c.type)) AS installed_shopify
FROM subscriptions s
LEFT JOIN channels c ON s.account_id = c.account_id AND c.type ILIKE 'shopifychannel'
WHERE s.status = 'paid'
  AND s.churned_at IS NULL

Again, you must be sure that there is only (zero or one) row in channels for each account_id having the condition c.type ilike 'shopifychannel'.

The LEFT JOIN assure you don't miss a row in your primary table subscriptions. The condition c.type ILIKE 'shopifychannel' is in the LEFT JOIN because if you put it in the WHERE clause, you'll also lose subscriptions lines that don't have installed Shopify.

share|improve this answer
    
If there is more than 1 row, for some account_id having c.type ilike 'shopifychannel' then this would increase the count of s.account_id correct? Running this query gave me a s.account_id count of 1335. This number is higher but from my understanding it is because some s.account_id do have more than 1 row for shopifychannel. –  Lucas Neo Apr 17 at 1:40
    
You're right. I was supposing that there would be only zero/one line for shopifychannel. Let me think about that. (; –  DeadEye Apr 17 at 4:49
    
I modified the query so it takes in account the fact that a paid user can install the application multiple times. –  DeadEye Apr 17 at 5:52
    
it worked, thanks! I like your use of the CONCAT() function and I think its a good method for counting the distincts from 2 variables. –  Lucas Neo Apr 18 at 9:18
    
Glad it works ! (; –  DeadEye Apr 18 at 9:26

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.