
Is there any way to find JavaScript links on a webpage with Python? I use mechanize and I can't find all the links I want. I want the URLs of the pictures on this site: http://500px.com/popular

Can you post a use case? – Pankaj Sharma

A sample page with expected output would be helpful. – Martijn Pieters

I want the URLs of the pictures on this site: 500px.com/popular – user3465589

1 Answer

With just BeautifulSoup this is quite easy:

js_links = soup.select('a[href^="javascript:"]')

This selects all <a> elements that have an href attribute whose value starts with javascript::

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('''\
... <html><body>
... <a href="http://stackoverflow.com">Not a javascript link</a>
... <a name="target">Not a link, no href</a>
... <a href="javascript:alert('P4wned');">Javascript link (with scary message)</a>
... <a href="javascript:return False">Another javascript link</a>
... </body></html>
... ''')
>>> for link in soup.select('a[href^="javascript:"]'):
...     print link['href'], link.get_text()
... 
javascript:alert('P4wned'); Javascript link (with scary message)
javascript:return False Another javascript link
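Since the asker wants image URLs, the same selector can be combined with a descendant selector to pull the src of any <img> nested inside a javascript: link. This is a minimal sketch using inline sample HTML, not the real 500px markup: 500px builds its gallery with JavaScript, so the static HTML you download may not contain the photos at all, and a browser driver such as Selenium (or the site's API) may be needed instead.

```python
# Sketch: collect image URLs found inside javascript: links.
# The HTML below is a stand-in; the real 500px page is rendered by JS.
from bs4 import BeautifulSoup

html = '''<html><body>
<a href="javascript:void(0)"><img src="http://example.com/photo1.jpg"></a>
<a href="javascript:void(0)"><img src="http://example.com/photo2.jpg"></a>
<a href="http://example.com/about">Plain link, no image</a>
</body></html>'''

soup = BeautifulSoup(html, 'html.parser')
# Descendant selector: <img> elements inside <a href="javascript:...">
urls = [img['src'] for img in soup.select('a[href^="javascript:"] img')]
print(urls)
```

Running this prints the two photo URLs and skips the plain link, because only <img> tags under matching anchors are selected.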
