
I'm trying to get all the images from a certain URL using Python.

Using Beautiful Soup is straightforward, but not all img tags are printed to the console. A closer look at the page's HTML shows that the missing images come from Angular: they have a data-ng-src attribute.

Is there any way to tell Beautiful Soup to wait until all scripts have finished? Or is there another way to detect all img tags?

My code so far:

import urllib2
from BeautifulSoup import BeautifulSoup

url = 'http://example.com/'  # placeholder; substitute the page being scraped
page = BeautifulSoup(urllib2.urlopen(url))
allImgs = page.findAll('img')
print allImgs
2 Answers

Images are not embedded in the HTML page; they are linked from it. Beautiful Soup reads the page all at once and does not execute scripts, so anything Angular fills in after loading never shows up. Think of it as a wrapper around the daunting chore of parsing files, not as a tool for interacting with a page. For anything that needs to wait for the page to finish rendering, I would rather use Selenium WebDriver.
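To illustrate why the Angular images go missing, here is a minimal sketch that parses static markup and reports both the src and the data-ng-src attribute of every img tag. It uses the standard library's html.parser instead of Beautiful Soup so that it runs without extra packages; the class name, function name, and sample markup are made up for illustration. Note that data-ng-src often holds an unresolved template expression rather than a usable URL, so this only shows what the raw HTML contains.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collect the src and data-ng-src attributes of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased from HTMLParser.
        if tag == "img":
            attr_map = dict(attrs)
            self.images.append(
                (attr_map.get("src"), attr_map.get("data-ng-src"))
            )


def collect_img_attributes(html):
    """Return a list of (src, data_ng_src) pairs, one per <img> tag."""
    parser = ImgCollector()
    parser.feed(html)
    parser.close()
    return parser.images


# Hypothetical sample: one ordinary image and one Angular-bound image.
sample = (
    '<img src="logo.png">'
    '<img data-ng-src="{{item.thumbnail}}">'
)
for src, ng_src in collect_img_attributes(sample):
    print(src, ng_src)
```

The second image has no src at all until Angular runs in a browser, which is why a static parser cannot give you its final URL.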


You can try Selenium. Although the library is meant for browser automation testing, it drives a real browser, so it can see content that scripts add after the page loads, which Beautiful Soup cannot.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

url = 'http://example.com/'
driver = webdriver.Firefox()
driver.get(url)

delay = 5  # seconds

try:
    # presence_of_element_located expects a (By, value) locator tuple, not a list of elements.
    # '//elementid' is a placeholder; use an element that only exists once the page has rendered.
    WebDriverWait(driver, delay).until(EC.presence_of_element_located((By.XPATH, '//elementid')))
    print("Page is ready!")
    for image in driver.find_elements(By.XPATH, '//img[@src]'):
        print(image.get_attribute('src'))
except TimeoutException:
    print("Couldn't load page")
finally:
    driver.quit()  # close the browser even if the wait timed out

Also read the following answer, which discusses pages whose content is loaded dynamically with JavaScript:
https://stackoverflow.com/a/11460633/6626530
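If there is no single element that signals "the page is ready", one possible alternative is a custom wait condition. WebDriverWait.until accepts any callable that takes the driver and returns a truthy value when done. The sketch below (the function name is hypothetical) succeeds only once at least one img exists and every img reports a non-empty src. It assumes Angular eventually resolves every data-ng-src; if some images never get a src, the wait will time out.

```python
def rendered_image_sources(driver):
    """Wait condition: return all img src values once every img has one.

    Returns False while the page has no images yet or while any image
    still lacks a src, so WebDriverWait keeps polling.
    """
    # "xpath" is the string value of selenium's By.XPATH constant.
    images = driver.find_elements("xpath", "//img")
    sources = [image.get_attribute("src") for image in images]
    if sources and all(sources):
        return sources
    return False
```

With a live driver this would be used as WebDriverWait(driver, delay).until(rendered_image_sources), which returns the list of sources or raises TimeoutException.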
