Re: Use regular expression to retrieve all image tags from a given content

You can try the following two suggestions:

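1. Try removing the "^" from the pattern and matching only r"<img". Also use prog.search() (or prog.findall() to get every occurrence) instead of prog.match(): match() only ever matches at the start of the string, and I believe the image tag is not at the start of the string.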
2. Try printing the value of "content" to check that the "<img" pattern exists in it. The match is case sensitive by default, so even <IMG will not be matched unless you compile the pattern with re.IGNORECASE.
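
A minimal sketch of the two suggestions, in Python 3 syntax. The sample "content" string is made up for illustration (in the original code it comes from i.content[0].value), and the pattern is only a rough approximation of an img tag. It shows that match() returns None when the tag is not at the start of the string, while findall() with re.IGNORECASE picks up every tag:

```python
import re

# Hypothetical sample content, invented for this example
content = '<p>Hello</p><img src="a.png"><IMG SRC="b.png" alt="b">'

# match() only looks at the very start of the string, so this is None
# even without "^" in the pattern
at_start = re.match(r'<img', content)
print(at_start)

# findall() scans the whole string; IGNORECASE also catches <IMG ...>
img_pattern = re.compile(r'<img\b[^>]*>', re.IGNORECASE)
tags = img_pattern.findall(content)
print(tags)
```

Note that a pattern like this will still break on unusual markup, for example a ">" inside an attribute value.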

On a side note, you should not be using regular expressions if you are doing anything more complex than what you are doing right now.
HTML is not a regular language, so you will be better off using an XML parser (like lxml or ElementTree) or an HTML parser (BeautifulSoup).
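
As a rough sketch of the parser approach, the example below uses Python 3's standard-library html.parser instead of the libraries named above, only to keep it free of third-party dependencies. The class name ImgCollector and the sample "content" string are hypothetical, chosen for illustration:

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collects the attributes of every <img> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names,
        # so <IMG SRC=...> is reported as 'img' with a 'src' key
        if tag == 'img':
            self.images.append(dict(attrs))


# Hypothetical sample content, invented for this example
content = '<p>Hello</p><img src="a.png"><IMG SRC="b.png" alt="b">'

collector = ImgCollector()
collector.feed(content)
collector.close()

sources = [img.get('src') for img in collector.images]
print(sources)
```

Unlike the regular expression, this does not depend on letter case or on the order of attributes inside the tag.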


On Saturday, June 30, 2012 6:07:13 PM UTC+5:30, mo.mughrabi wrote:

I am really a noob with regular expressions. I tried to do this on my own, but I couldn't understand from the manuals how to approach it. I am trying to find all img tags in a given content. I wrote the code below, but it's returning None:

    content = i.content[0].value
    prog = re.compile(r'^<img')
    result = prog.match(content)
    print result

any suggestions?

