Regular expression to match closing HTML tags

Question


I am working on a small Python script to clean HTML documents. It accepts a list of tags to KEEP, then parses the HTML and discards any tags that are not in the list. I have been using regular expressions for this, and I can match opening tags and self-closing tags, but not closing tags. The pattern I have been experimenting with to match closing tags is </(?!a)>. It seems logical to me, so why doesn't it work? The (?!a) should match anything that is NOT an anchor tag (the "a" could be any tag name; it's just an example).

Edit: Argh! I think the regex did not show up at first!
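For reference, here is a minimal sketch of how the pattern above behaves and one possible adjustment. A lookahead such as (?!a) is zero-width: it checks the next character but consumes nothing, so </(?!a)> can only ever match the literal text "</>". The sample `html` string, the helper name `closing_tag_pattern`, and the `keep` list are assumptions made up for illustration, and regular expressions are not a robust way to parse arbitrary HTML.

```python
import re

# Hypothetical sample input, for illustration only.
html = '<p>Text <a href="#">link</a> and <b>bold</b></p>'

# The pattern from the question. (?!a) consumes no characters, so after
# "</" the very next character must be ">"; only "</>" can match.
original = re.compile(r'</(?!a)>')

# Sketch of a fix: keep the lookahead, then actually consume the tag name.
# The \b stops "a" from also excluding longer names such as </abbr>.
closing_not_a = re.compile(r'</(?!a\b)[A-Za-z][A-Za-z0-9]*\s*>')


def closing_tag_pattern(keep):
    """Build a pattern matching closing tags whose name is NOT in `keep`.

    `keep` is a hypothetical list of tag names to preserve.
    """
    names = '|'.join(re.escape(name) for name in keep)
    return re.compile(
        r'</(?!(?:%s)\b)[A-Za-z][A-Za-z0-9]*\s*>' % names,
        re.IGNORECASE,
    )


print(original.findall(html))                         # []
print(closing_not_a.findall(html))                    # ['</b>', '</p>']
print(closing_tag_pattern(['a', 'p']).findall(html))  # ['</b>']
```

Passing the compiled pattern to `re.sub` with an empty replacement would then strip the unwanted closing tags while leaving the kept ones in place.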


python html regex

kevin628 Aug 19 '10 at 17:15


3 answers