1

I have 2 tables; one is users and the other records user actions. I want to count the number of actions per user and record this in the users table. There are ~100k user and the following code takes 6 hours! There must be a better way!

def calculate_invites():

sql_db.execute("SELECT id, uid FROM users")

for row in sql_db:
    id = row['id']
    uid = row['uid']

    sql1 = "SELECT COUNT(1) FROM actions WHERE uid = %s"
    sql_db.execute(sql1, uid)
    count_actions = sql_db.fetchone()["COUNT(1)"]

    sql = "UPDATE users SET count_actions=%s WHERE uid=%s"
    sql_db.execute(sql, (count_actions, uid))

2 Answers 2

5

You can do this all as one statement:

update users
    set count_actons = (select count(*) from actions a where a.uid = users.uid)

No for loop. No multiple queries. Do in SQL what you can do in SQL. Generally looping over rows is something you want to do in the database rather than in the application.

1
  • update users set count_actons = (select count(1) from actions a where a.uid = users.uid) - I would change Count(*) to Count(1), will save even more. Commented Jun 19, 2013 at 13:19
1

Offered only as an alternative since Gordon's answer is probably faster:

update users
from (
   select uid, count(*) as num_actions
   from actions
   group by uid
   ) x
set count_actions = x.num_actions
where users.uid=x.uid
1
  • 1
    . . This is worth an upvote because this version will be faster if there is no index on actions(uid). It would scan the table once, doing an aggregation (with an external sort). My version would scan the table and do lots of small sorts. With the index, the two should be comparable in speed, although I think my version would be a bit faster. Commented May 19, 2013 at 16:45

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.