
I am scraping usernames from a webpage that loads more users as you scroll down.

URL of the page: "http://www.quora.com/Kevin-Rose/followers"

I know the number of users on the page (in this case it is 43812). How can I scroll the page until all the users are loaded? I searched online, and almost everywhere I found the same line of code for doing it:

driver.execute_script("window.scrollTo(0, )")

How can I determine the vertical position that ensures all the users are loaded? Is there another way to achieve the same thing without actually scrolling?

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
import urllib

driver = webdriver.Firefox()
driver.get('http://www.quora.com/')
time.sleep(10)

wait = WebDriverWait(driver, 10)

form = driver.find_element(By.CLASS_NAME, 'regular_login')
time.sleep(10)
#add explicit wait

username = form.find_element(By.NAME, 'email')
time.sleep(10)
#add explicit wait

username.send_keys('[email protected]')
time.sleep(30)
#add explicit wait

password = form.find_element(By.NAME, 'password')
time.sleep(30)
#add explicit wait

password.send_keys('def')
#add explicit wait

password.send_keys(Keys.RETURN)
time.sleep(30)

#search = driver.find_element_by_name('search_input')
search = wait.until(EC.presence_of_element_located((By.XPATH, "//form[@name='search_form']//input[@name='search_input']")))

search.clear()
search.send_keys('Kevin Rose')
search.send_keys(Keys.RETURN)

link = wait.until(EC.presence_of_element_located((By.LINK_TEXT, "Kevin Rose")))
link.click()
# Wait until the element is loaded (asynchronously loaded webpage)

handle = driver.window_handles
driver.switch_to.window(handle[1])
#switch to new window 

element = WebDriverWait(driver, 2).until(EC.presence_of_element_located((By.PARTIAL_LINK_TEXT, "Followers")))
element.click()
    
There are certainly options. Please show the complete code you have now (including scrolling part). Thanks. – alecxe Sep 16 '14 at 14:05
    
I don't think it's of any use, but I have added the code. This is just the code to log into the site and navigate to the particular page. I don't know what to put for the Y coordinate. – Siddhesh Sep 16 '14 at 14:11

Since nothing special appears after the last batch of followers is loaded, I would rely on the fact that you know how many followers the user has and how many are loaded on each scroll down (I've inspected the page: it is 18 per scroll). Hence, you can calculate how many times you need to scroll the page down.
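
As a rough sketch of that calculation (the function name `scrolls_needed` is made up for illustration, and it assumes every scroll loads a full batch of 18), rounding up with integer arithmetic gives the number of scrolls:

```python
def scrolls_needed(followers_count, followers_per_scroll=18):
    """Return how many batches of followers_per_scroll cover followers_count.

    Hypothetical helper: assumes each scroll loads one full batch.
    """
    # Integer ceiling division: round up without going through floats.
    return (followers_count + followers_per_scroll - 1) // followers_per_scroll


print(scrolls_needed(43812))  # 2434
print(scrolls_needed(53))     # 3
```

Scrolling once or twice more than this is harmless, so adding a small safety margin on top is a reasonable choice.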

Here's the implementation (I've used a different user with only 53 followers to demonstrate the solution):

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

followers_per_page = 18

driver = webdriver.Chrome()  # webdriver.Firefox() in your case
driver.get("http://www.quora.com/Andrew-Delikat/followers")

# get the followers count
element = WebDriverWait(driver, 2).until(EC.presence_of_element_located((By.XPATH, '//li[contains(@class, "FollowersNavItem")]//span[@class="profile_count"]')))
followers_count = int(element.text.replace(',', ''))
print(followers_count)

# scroll down the page iteratively with a delay
for _ in range(followers_count // followers_per_page + 1):
    driver.execute_script("window.scrollTo(0, 10000);")
    time.sleep(2)

Also, if there is a large number of followers, you may need to increase the 10000 Y coordinate value based on the loop variable, since a fixed value stops reaching the bottom once the page grows taller than that.
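
One possible way to tie the Y coordinate to the loop variable is sketched below. The helper name `scroll_in_steps` and the `step` and `pause` parameters are assumptions for illustration, and whether 10000 pixels per step is enough for this particular page is untested. The Y value is passed to the script as an argument rather than formatted into the string:

```python
import time


def scroll_in_steps(driver, scroll_count, step=10000, pause=2.0):
    """Scroll down scroll_count times, targeting a larger Y each time.

    Hypothetical helper: `driver` is any object with Selenium's
    execute_script(script, *args) method.
    """
    for i in range(1, scroll_count + 1):
        # Inside the script, arguments[0] is the Y value passed from Python.
        driver.execute_script("window.scrollTo(0, arguments[0]);", i * step)
        # Give the page time to load the next batch of followers.
        time.sleep(pause)
```

With the code above, it could be called as `scroll_in_steps(driver, followers_count // followers_per_page + 1)` in place of the `for` loop.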

    
Thanks a lot! Right now I am trying the following script, which appears to work perfectly: driver.execute_script("window.scrollTo(0, document.body.scrollHeight);") – Siddhesh Sep 16 '14 at 14:33
    
^Nope. The code I mentioned above didn't load all the users. – Siddhesh Sep 16 '14 at 14:35
    
@Siddhesh thank you for another interesting challenge. Sorry, I didn't quite get it: does it work for you? – alecxe Sep 16 '14 at 14:37
    
Yes, it worked. Thanks again for putting in so much effort. – Siddhesh Sep 16 '14 at 14:45
