Parsing an RSS XML file in Python

Question

I am trying to extract data from RSS documents. I am using the code below to parse the XML document.

It does not work for this document, though: http://www.mediafire.com/?hptptj8847awnn1 Please help!

# Import the easy-to-use XML parser called minidom:
import xml.dom.minidom as minidom


def getTags(xml):
    """
    Print out all titles found in xml
    """
    doc = minidom.parse(xml)
    items = doc.getElementsByTagName("item")

    # Collect the first <title> element of each <item>
    titles = []
    for item in items:
        titleObj = item.getElementsByTagName("title")[0]
        titles.append(titleObj)

    print(len(titles))

    for title in titles:
        for node in title.childNodes:
            # A title may be stored as plain text or wrapped in a CDATA section
            if node.nodeType in (node.CDATA_SECTION_NODE, node.TEXT_NODE):
                print(node.data)


if __name__ == "__main__":
    document = 'D2B0918.xml'
    getTags(document)

Getting this error:

line 10, in getTags
    doc = minidom.parse(xml)
  File "C:\Python26\lib\xml\dom\minidom.py", line 1918, in parse
    return expatbuilder.parse(file)
  File "C:\Python26\lib\xml\dom\expatbuilder.py", line 924, in parse
    result = builder.parseFile(fp)
  File "C:\Python26\lib\xml\dom\expatbuilder.py", line 207, in parseFile
    parser.Parse(buffer, 0)
ExpatError: not well-formed (invalid token): line 2, column 573

Asked by ISGAL, Nov 8 '11 at 4:49
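The ExpatError above names a line and column (line 2, column 573), so one way to see what the parser is choking on is to print the bytes around that position. The following is only a sketch: `find_parse_error` and its `context` parameter are hypothetical names introduced here, and it assumes the whole document fits in memory. It relies on the `lineno` and `offset` attributes that `ExpatError` carries.

```python
import xml.dom.minidom as minidom
from xml.parsers.expat import ExpatError


def find_parse_error(data, context=20):
    """Return None if data parses, else (lineno, offset, nearby bytes).

    find_parse_error is a hypothetical helper, not part of minidom.
    data is expected to be the raw bytes of the XML document.
    """
    try:
        minidom.parseString(data)
    except ExpatError as err:
        lines = data.splitlines()
        # err.lineno is 1-based; err.offset is the 0-based column in that line
        bad_line = lines[err.lineno - 1] if err.lineno <= len(lines) else b""
        start = max(err.offset - context, 0)
        return err.lineno, err.offset, bad_line[start:err.offset + context]
    return None
```

Calling it on the contents of D2B0918.xml (read with `open('D2B0918.xml', 'rb').read()`) should return the text surrounding the reported position. Typical culprits for an "invalid token" are an unescaped `&` or a stray control character, but which one applies to this file cannot be told from the traceback alone.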

AKX · Answer 1 · 2011-11-08 06:57:54Z

If you want to parse RSS in particular, I'll just humbly point you towards the excellent feedparser library, which probably does what you want and then some.

http://code.google.com/p/feedparser/
