\$\begingroup\$

My objective is to find out which other subreddits the users of r/(subreddit) post in; my code is below. It works pretty well, but I am curious to know whether I could improve it in two ways:

First, by restricting my code so that it considers each user only once (i.e. it does not collect the posting history twice for the same user) and, second, by requiring a minimum of 5 posts per user before extracting their info (i.e. if a user has written fewer than 5 posts in their Reddit life, my code would not consider them).

import praw
import pandas as data
import datetime as time


reddit = praw.Reddit(client_id = 'XXXX',
                     client_secret = 'XXXX',
                     username = 'XXXX',
                     password = 'XXXX',
                     user_agent = 'XXXX')

collumns = { "User":[], "Subreddit":[], "Title":[], "Description":[], "Timestamp":[]}


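for submission in reddit.subreddit("ENTER SUBREDDIT").new(limit=100):
    # Submissions from deleted accounts have no author, so skip them.
    if submission.author is None:
        continue
    user = reddit.redditor(submission.author.name)

    # Collect up to 100 of this user's most recent submissions.
    for sub in user.submissions.new(limit=100):
        # Store plain strings rather than PRAW objects.
        collumns["User"].append(str(sub.author))
        collumns["Subreddit"].append(str(sub.subreddit))
        collumns["Title"].append(sub.title)
        collumns["Description"].append(sub.selftext)
        collumns["Timestamp"].append(sub.created)
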

collumns_data = data.DataFrame(collumns)

def get_date(Timestamp):
    return time.datetime.fromtimestamp(Timestamp)
_timestamp = collumns_data["Timestamp"].apply(get_date)
collumns_data = collumns_data.assign(Timestamp = _timestamp)

collumns_data.to_csv('DataExport.csv')
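
To make the two refinements concrete, here is a minimal sketch of the logic I have in mind, kept independent of PRAW so it runs without credentials. The names `collect_histories`, `fetch_posts`, `min_posts` and the `fake_history` data are hypothetical and only for illustration; they are not part of my code above or of any library.

```python
from typing import Callable, Iterable, List, Optional


def collect_histories(
    authors: Iterable[Optional[str]],
    fetch_posts: Callable[[str], List[dict]],
    min_posts: int = 5,
) -> List[dict]:
    """Gather each distinct author's posts once, skipping low-activity users."""
    seen = set()  # user names that have already been processed
    rows = []
    for name in authors:
        # Skip deleted accounts (no author) and users handled earlier.
        if name is None or name in seen:
            continue
        seen.add(name)
        posts = fetch_posts(name)
        # Ignore users with fewer than `min_posts` submissions.
        if len(posts) < min_posts:
            continue
        rows.extend(posts)
    return rows


# Stand-in data (an assumption) so the sketch runs without Reddit access.
fake_history = {
    "alice": [{"User": "alice", "Subreddit": "python"}] * 6,
    "bob": [{"User": "bob", "Subreddit": "aww"}] * 2,
}
rows = collect_histories(
    ["alice", "bob", "alice", None],
    lambda name: fake_history[name],
)
print(len(rows))
```

With PRAW, `fetch_posts` would presumably wrap something like `list(reddit.redditor(name).submissions.new(limit=100))` and turn each submission into a row. Note that with `limit=100` the length check only sees the fetched posts, which is enough for a threshold of 5.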

\$\endgroup\$