
I am working on a web-scraping project. One of the websites I am working with gets its data from JavaScript.

There was a suggestion in one of my earlier questions that I can call the JavaScript directly from Python.

Any idea how to do it? I was not able to figure out how to call a JavaScript function from Python.

For example, if the JS function is defined as add_2(var, var2):

How do we call the same function from Python? Any useful reference would be highly appreciated.

Thanks

If it's something you know and can easily simulate, it may be easiest to parse and interpret it yourself. If not, you could end up needing to tie into a JavaScript engine. –  Chris Morgan Nov 27 '11 at 10:07
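
A minimal sketch of the "parse and interpret it yourself" route, assuming the data you need is assigned to a JavaScript variable as a JSON-compatible literal somewhere in the page source. The page snippet, the variable name `prices`, and the helper `extract_js_object` are all hypothetical and only serve as illustration:

```python
import json
import re

# Hypothetical page source: the data we want is assigned to a JS variable.
PAGE_SOURCE = """
<script type="text/javascript">
    var prices = {"apple": 1.5, "pear": 2.25};
    render(prices);
</script>
"""


def extract_js_object(source, var_name):
    """Decode the JSON-compatible literal assigned to `var_name` in `source`."""
    pattern = re.compile(r"var\s+" + re.escape(var_name) + r"\s*=\s*")
    match = pattern.search(source)
    if match is None:
        raise ValueError("variable %r not found" % var_name)
    # raw_decode parses exactly one JSON value starting at the given index
    # and ignores whatever follows it (here the trailing semicolon).
    value, _end = json.JSONDecoder().raw_decode(source, match.end())
    return value


print(extract_js_object(PAGE_SOURCE, "prices"))
```

This only works when the literal happens to be valid JSON (double-quoted keys, no function calls or comments inside it). Anything more dynamic probably needs a real JavaScript engine.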

4 Answers

Accepted answer (score: 6)

Find a JavaScript interpreter that has Python bindings (try Rhino, V8, or SpiderMonkey). Once you have found one, it should come with examples of how to use it from Python.

Python itself, however, does not include a JavaScript interpreter.
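
If installing bindings is not an option, one possible workaround is to hand the JavaScript to an external interpreter process. The sketch below assumes a Node.js executable named `node` is on the PATH. The body of `add_2` is a hypothetical stand-in (the question does not show it), and `build_node_script` and `call_js` are illustrative helper names, not part of any library:

```python
import json
import shutil
import subprocess

# Hypothetical stand-in for the function from the question; its real body is unknown.
JS_SOURCE = "function add_2(a, b) { return a + b; }"


def build_node_script(js_source, func_name, args):
    """Build a script that defines the functions, calls one, and prints JSON."""
    call = "%s.apply(null, %s)" % (func_name, json.dumps(list(args)))
    return "%s\nconsole.log(JSON.stringify(%s));\n" % (js_source, call)


def call_js(js_source, func_name, *args, node="node"):
    """Run func_name(*args) in an external Node.js process and decode the result."""
    if shutil.which(node) is None:
        raise RuntimeError("Node.js executable %r not found on PATH" % node)
    script = build_node_script(js_source, func_name, args)
    completed = subprocess.run(
        [node, "-e", script],
        capture_output=True, text=True, check=True,
    )
    # The script prints the return value as JSON, so decode it back into Python.
    return json.loads(completed.stdout)
```

With Node.js installed, `call_js(JS_SOURCE, "add_2", 1, 2)` would be expected to return 3. Arguments and return values have to be JSON-serializable, and every call pays the cost of starting a new process, so this is only reasonable for occasional calls.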

You should definitely check out pyv8, which offers a Python wrapper for Google's V8 engine. This contains information about using Python with SpiderMonkey. Hope this helps. –  Codahk Nov 27 '11 at 11:07
If you want to support more than one JavaScript engine, you should have a look at PyExecJS –  Wienczny Aug 19 '13 at 2:36

You can get the JavaScript from the page and execute it through an interpreter (such as V8 or Rhino). However, you can get good results much more easily with a functional testing tool such as Selenium or Splinter. These tools launch a browser and actually load the page. This can be slow, but it ensures that the content a browser would display is available.

For example, consider the HTML document below:

<html>
    <head>
        <title>Test</title>
        <script type="text/javascript">
            function addContent(divId) {
                var div = document.getElementById(divId);
                div.innerHTML = '<em>My content!</em>';
            }
        </script>
    </head>
    <body>
        <p>The element below will receive content</p>
        <div id="mydiv"></div>
        <script type="text/javascript">addContent('mydiv')</script>
    </body>
</html>

The script below uses Splinter. Splinter launches Firefox and, once the page has fully loaded, retrieves the content that JavaScript added to the div:

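import os.path

from splinter import Browser

# Browser() with no arguments launches Firefox.
browser = Browser()
# Load the local test page through a file:// URL.
browser.visit('file://' + os.path.realpath('test.html'))
# find_by_css returns a list of matching elements; take the first one.
elements = browser.find_by_css("#mydiv")
div = elements[0]
print(div.value)

browser.quit()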

The result is the div's content printed to stdout.
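
For contrast, here is a small sketch of what a scraper sees without a browser. It uses only the standard library's html.parser on the same test page and is meant to illustrate why the JavaScript has to run first. The class `DivTextCollector` and the function `static_text` are illustrative names, and the depth counting is deliberately naive (it would miscount void elements such as `<br>`):

```python
from html.parser import HTMLParser

PAGE = """<html>
    <head>
        <title>Test</title>
        <script type="text/javascript">
            function addContent(divId) {
                var div = document.getElementById(divId);
                div.innerHTML = '<em>My content!</em>';
            }
        </script>
    </head>
    <body>
        <p>The element below will receive content</p>
        <div id="mydiv"></div>
        <script type="text/javascript">addContent('mydiv')</script>
    </body>
</html>"""


class DivTextCollector(HTMLParser):
    """Collects the text found inside the element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # > 0 while we are inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def static_text(html, element_id):
    """Return the text inside `element_id` as found in the raw markup."""
    parser = DivTextCollector(element_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


# The script is never executed, so the div is still empty.
print(repr(static_text(PAGE, "mydiv")))
```

The static markup contains no text inside the div, because the text only exists after a browser has run addContent().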


To interact with JavaScript from Python I use WebKit, the rendering engine behind Safari and (at the time of writing) Chrome. There are Python bindings to WebKit through Qt. In particular, there is a function for executing JavaScript called evaluateJavaScript().

Here is a full example to execute JavaScript and extract the final HTML.


An interesting alternative I discovered recently is the Python bond module, which can be used to communicate with a Node.js process (V8 engine).

Usage is very similar to the pyv8 bindings, but you can directly use any Node.js library without modification, which is a major selling point for me.

Your Python code would look like this (where js is a bond connected to a Node.js process in which add2 is defined):

val = js.call('add2', var1, var2)

or even:

add2 = js.callable('add2')
val = add2(var1, var2)

Calling functions is definitely slower than with pyv8, though, so it greatly depends on your needs. If you need to use an npm package that does a lot of heavy lifting, bond is great. You can even have several Node.js processes running in parallel.

But if you just need to call a bunch of JS functions (for instance, to share the same validation functions between the browser and the backend), pyv8 will definitely be a lot faster.
