Basically, a page generates some dynamic content, and I want to get that dynamic content, not just the static HTML. I can't do this with cURL. Help please.
You could try Selenium at http://seleniumhq.org, which drives a real browser, so the page's JavaScript runs.
You can't with just cURL. cURL will grab the raw (static) files from the site, but to get JavaScript-generated content you would have to load that content into a browser-like environment that supports JavaScript and all the host objects the script uses, so that the script can run. Once the script has run, you would then have to access the DOM to grab whatever content you wanted from it. This is why most search engines don't index JavaScript-generated content. It's not easy. If you're trying to gather info from one specific site, you may want to look at exactly how the site gets the data itself and see whether you can get the data directly from that source. For example, is the data embedded in JS in the page (in which case you can just parse it out of that JS), is it fetched by an Ajax call (in which case you can maybe just make that Ajax call directly), or does it arrive by some other method?
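As a rough illustration of those two shortcuts, here is a minimal Python sketch. Everything specific in it is an assumption made for the example: the variable name `pageData`, the sample HTML, and the helper names `extract_embedded_json` and `fetch_json` are hypothetical, and a real site will differ. The first helper only works when the embedded literal is valid JSON; a JavaScript object literal with unquoted keys would need a real JS parser.

```python
import json
import re
import urllib.request


def extract_embedded_json(html, var_name):
    """Return the JSON value assigned to `var_name` in an inline script, or None."""
    # Find "<var_name> = " and remember where the assigned literal starts.
    match = re.search(r"\b" + re.escape(var_name) + r"\s*=\s*", html)
    if match is None:
        return None
    try:
        # raw_decode parses one JSON value and ignores whatever follows it.
        value, _ = json.JSONDecoder().raw_decode(html, match.end())
    except json.JSONDecodeError:
        return None
    return value


def fetch_json(url):
    """Request the endpoint the page's own script calls and decode its JSON reply."""
    # Some endpoints check this header; whether a given site does is an assumption.
    request = urllib.request.Request(
        url, headers={"X-Requested-With": "XMLHttpRequest"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical page whose data is embedded in an inline script.
SAMPLE_HTML = """
<html><body>
<div id="items"></div>
<script>
var pageData = {"items": ["a", "b"], "total": 2};
render(pageData);
</script>
</body></html>
"""

if __name__ == "__main__":
    print(extract_embedded_json(SAMPLE_HTML, "pageData"))
```

To find the URL to pass to `fetch_json`, open the browser's network tab while the page loads and look for the request that returns the data. If neither shortcut applies, a browser-driving tool such as Selenium is the fallback.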