Saturday, June 30, 2012

Re: Use regular expression to retrieve all image tags from a given content

You can try the following two suggestions:

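1. Try removing the "^" from the pattern so that it is just r"<img", and call prog.search(content) (or prog.findall(content) to get every occurrence) instead of prog.match(content). match() only checks the beginning of the string even without the "^", and I believe the image tag is not at the start of the string.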
2. Try printing the value of "content" to check that the "<img" pattern exists in it. The match is case sensitive, so <IMG will not be matched unless you compile the pattern with re.IGNORECASE.
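
A minimal sketch of both suggestions, written for Python 3 (the original post uses the Python 2 print statement). The sample string assigned to content is made up for illustration and stands in for i.content[0].value; the img_tags name is also just an assumption for the example.

```python
import re

# Hypothetical sample standing in for i.content[0].value.
content = '<p>Hello</p><img src="a.png"><IMG SRC="b.png" alt="B">'

# match() is anchored at the start of the string, so it returns None here
# even though the pattern has no "^".
print(re.match(r'<img', content))

# findall() scans the whole string; re.IGNORECASE also catches <IMG ...>.
img_tags = re.findall(r'<img\b[^>]*>', content, re.IGNORECASE)
print(img_tags)
```

With this sample, the first print shows None and the second shows both tags, which is the same symptom and fix as in the question.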

On a side note, you should not be using regular expressions if you are doing anything more complex than what you are doing right now.
HTML is not a regular language, so you will be better off using an XML parser (like lxml or ElementTree) or an HTML parser (like BeautifulSoup).
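
As a rough illustration of the parser approach, here is a sketch that uses the standard library's html.parser module in place of BeautifulSoup or lxml, only so that it runs without installing anything. It is written for Python 3; the ImgCollector class and the sample content string are hypothetical names made up for this example.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collects the attributes of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <IMG> is caught too.
        if tag == 'img':
            self.images.append(dict(attrs))


# Hypothetical sample standing in for i.content[0].value.
content = '<p>Hello</p><img src="a.png"><IMG SRC="b.png" alt="B">'

collector = ImgCollector()
collector.feed(content)
collector.close()

sources = [img.get('src') for img in collector.images]
print(sources)
```

Unlike the regular expression, the parser hands back each attribute separately, so pulling out src or alt does not need a second pattern.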


On Saturday, June 30, 2012 6:07:13 PM UTC+5:30, mo.mughrabi wrote:

I am really a noob with regular expressions. I tried to do this on my own, but I couldn't understand from the manuals how to approach it. I am trying to find all img tags in a given content. I wrote the code below, but it's returning None:

    content = i.content[0].value
    prog = re.compile(r'^<img')
    result = prog.match(content)
    print result

Any suggestions?
