I am a Python beginner trying to write a script in Python 2.7 that retrieves the betting odds from the following web sites.
English Version web site: http://bet.hkjc.com/racing/pages/odds_wp.aspx?date=24-09-2015&venue=hv&raceno=1&lang=en
Chinese Version web site: bet.hkjc.com/racing/pages/odds_wp.aspx?date=24-09-2015&venue=hv&raceno=1
The data that I want to retrieve is marked in the following image file: https://na.cx/i/Bz873x.jpg
The script works well on other web sites (e.g. reddit or lxml.de/parsing.html), but on the betting site it retrieves HTML that is different from what I see in Chrome, and I don't know why.
from urllib2 import urlopen
from lxml import etree
# Print the source code of the web page.
# This works properly on other web sites (e.g. reddit.com or lxml.de/parsing.html),
# but not on the betting web site.
url = 'http://bet.hkjc.com/racing/pages/odds_wp.aspx?date=24-09-2015&venue=hv&raceno=1'
tree = etree.HTML(urlopen(url).read())
print(etree.tostring(tree, pretty_print=True))
# Print the first horse name from the Chinese version of the web site (doesn't work)
horse_name = tree.xpath('//*[@id="detailWPTable"]/table/tbody/tr[2]/td[3]/a/span/text()')
print horse_name
After running the above script, I found that the HTML retrieved by Python is different from the HTML that Chrome shows under [View Source] or [Open Developer Tools].
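To pin down where the two versions diverge, a comparison along the following lines might help. This is only a minimal sketch: it assumes the Chrome source has been saved by hand to a file, and the names `diff_html`, `contains_marker` and `chrome_source.html` are hypothetical, not part of my script above. It uses only `difflib` from the standard library.

```python
import difflib


def diff_html(fetched_html, browser_html, context=1):
    """Return unified-diff lines between two HTML strings.

    Blank lines and surrounding whitespace are dropped first so that
    pure indentation differences do not show up in the diff.
    """
    fetched_lines = [l.strip() for l in fetched_html.splitlines() if l.strip()]
    browser_lines = [l.strip() for l in browser_html.splitlines() if l.strip()]
    return list(difflib.unified_diff(
        fetched_lines, browser_lines,
        fromfile='python', tofile='chrome',
        n=context, lineterm=''))


def contains_marker(html, marker='detailWPTable'):
    """Check whether the id used in the XPath appears in the raw HTML at all."""
    return marker in html


# Hypothetical usage (file name is an assumption):
# fetched = urlopen(url).read()
# saved = open('chrome_source.html').read()
# print('\n'.join(diff_html(fetched, saved)))
# print(contains_marker(fetched))
```

If `contains_marker` returns False for the HTML fetched by Python, the table is not in the response at all, so the XPath cannot match no matter how it is written.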
My question is:
- How can I get the correct HTML (the same as Chrome's View Source) using Python?
Thanks :)