Tuesday, November 27, 2012

How to get the source code of a URL?

I'm trying to parse XML with minidom. The XML data is served from a URL.

This is my code:

    import urllib2
    from xml.dom.minidom import parse

    url = "http://myurl.com/wsname.asp"
    datasource = urllib2.urlopen(url)
    dom = parse(datasource)
    handleElements(dom)

My handleElements function, which parses the XML:

    def handleElements(dom):
        Elements = dom.getElementsByTagName("book")
        for item in Elements:
            getText(item.getElementsByTagName("id")[0].childNodes)
            ....
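The getText helper is not shown in the post. Here is a minimal sketch, assuming it resembles the helper of the same name in the minidom documentation example (it joins the data of the text nodes it is given). The sketch also assumes handleElements should collect the ids and return them, which the post does not show. The sample string is made up for illustration.

```python
from xml.dom.minidom import parseString


def getText(nodelist):
    # Join the data of every text node in the list (assumed behaviour).
    return "".join(
        node.data for node in nodelist if node.nodeType == node.TEXT_NODE
    )


def handleElements(dom):
    # Collect the text of the first <id> child of every <book> element.
    ids = []
    for item in dom.getElementsByTagName("book"):
        ids.append(getText(item.getElementsByTagName("id")[0].childNodes))
    return ids


# Illustrative data only, not the real response from the URL.
sample = "<bibliothque><book><id>747</id></book></bibliothque>"
ids = handleElements(parseString(sample))
```

Returning the list makes it easy to check whether any book elements were found at all, instead of silently skipping the loop.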

My XML:

    <html><head><style type="text/css"></style></head>
    <body>
    <bibliothque>
      <book>
        <id>747</id>
        <title>L'alchimiste</title>
        <author>Paulo Cohelo </author>
      </book>
      ...
    </bibliothque>
    </body>
    </html>

I get no error, but no result!

My handleElements() works fine: when I copy the same data from my URL, put it in a string, and use parseString instead of parse, everything works and I get my results.

But when I try to open the URL, Elements is empty and the loop never even starts.


It seems that I need to get the source code of the URL (not its content), like view-source in Chrome. How can I do that?
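One way to see exactly what the server sends, roughly what view-source shows, is to read the response body as raw bytes before parsing it. Here is a minimal sketch, assuming Python 3, where urllib.request.urlopen takes the place of urllib2.urlopen. The names fetch_source and count_books are hypothetical helpers, and the literal bytes stand in for a real response.

```python
from urllib.request import urlopen
from xml.dom.minidom import parseString


def fetch_source(url):
    # Return the raw response body as bytes, before any parsing.
    with urlopen(url) as response:
        return response.read()


def count_books(raw):
    # Parse raw XML bytes and count the <book> elements found.
    dom = parseString(raw)
    return len(dom.getElementsByTagName("book"))


# Stand-in for fetch_source("http://myurl.com/wsname.asp"); not a real response.
raw = b"<bibliothque><book><id>747</id></book></bibliothque>"
print(raw.decode("utf-8"))
book_count = count_books(raw)
```

Printing the bytes returned by fetch_source for the real URL and comparing them with the text copied from the browser would show whether the two actually match.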

Thanks

