How to combine these two regular expression patterns?

I feel a bit silly asking this, but I can't get it to work to save my life ...

What works

preg_replace( '/(<[^>]+) onmouseout=".*?"/i', '$1', preg_replace( '/(<[^>]+) onmouseover=".*?"/i', '$1', $strHtml ) ) 

How can I combine these two preg_replace() calls into one (combining the two regex patterns)?

My cleanup attempt (not working)

 preg_replace( '/(<[^>]+) (onmouseover|onmouseout)=".*?"/i', '$1', $strHtml ) 

I want this preg_replace() function to remove all onmouseover AND onmouseout from my HTML string. It seems to remove only one of the two attributes ... What am I doing wrong?

UPDATE: Example String

 <p><img src="http://www.bestlinknetware.com/products/204233spc.jpg" width="680" height="365"><br> <a href="http://www.bestlinknetware.com/products/204233INST.pdf" target="_blank" onmouseover="MM_swapImage('Image2','','/Content/bimages/ins2.gif',1)" onmouseout="MM_swapImgRestore()"><img name="Image2" border="0" src="http://www.bestlinknetware.com/Content/bimages/ins1.gif"></a> </p> <p><strong>No contract / No subscription / No monthy fee<br> 1080p HDTV reception<br> 32db high gain reception<br> Rotor let you change direction of the antenna to find best reception</strong></p> <a href=http://transition.fcc.gov/mb/engineering/dtvmaps/ target="blank"><strong>CLICK HERE</strong></a><br>to see HDTV channels available in your area.<br> <br/> ** TV signal reception is immensely affected by the conditions such as antenna height, terrain, distance from broadcasting transmission antenna and output power of transmitter. Channels you can watch may vary depending on these conditions. <br> <br/> <br/> <p>* Reception: VHF/UHF/FM<br/> * Reception range: 120miles<br/> * Built-in 360 degree motor rotor<br> * Wireless remote controller for rotor (included)<br/> * Dual TV Outputs<br> * Easy Installation<br> * High Sensitivity Reception<br> * Built-in Super Low Noise Amplifier<br> * Power : AC15V 300mA<br> <br/> Kit contents<br/> * One - HDTV Yagi antenna with built-in roter & amplifier<br/> * One - Roter control box<br/> * One - Remote for roter control box<br/> * One - 40Ft coax cable<br/> * One - 4Ft coax cable<br/> * One - power supply for roter control box</p> 

UPDATE: a tool for future viewers of this topic

https://regex101.com/

I couldn't figure out how to use http://regexr.com/ , so I tried regex101.com and have loved it ever since. Highly recommended for anyone who runs into similar problems (for example, after copying and pasting a regex pattern, as I did).

1 answer

The problem with your combined expression is that the initial group captures too much, so only the attribute that appears last in the tag gets replaced. This happens because of the greedy repetition [^>]+ , which consumes more of the search string than you expected: it captures everything from the opening bracket of the tag up to the start of the second attribute you wanted to remove. In addition, anchoring the pattern to the opening bracket of the HTML tag prevents multiple matches within the same element, even once the greediness is addressed.
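The behaviour described above can be checked with a short script. This is only a sketch: the sample tag is a shortened, hypothetical stand-in for the example string in the question, and the variable names $pattern, $m, $count and $result are introduced purely for illustration.

```php
<?php
// Shortened, hypothetical sample tag carrying both attributes.
$strHtml = '<a href="x.pdf" onmouseover="swap()" onmouseout="restore()">link</a>';

// The combined pattern from the question.
$pattern = '/(<[^>]+) (onmouseover|onmouseout)=".*?"/i';

// Look at what the match captures.
preg_match($pattern, $strHtml, $m);
// $m[1] (group 1) is: <a href="x.pdf" onmouseover="swap()"
// $m[2] is: onmouseout
// The greedy [^>]+ ran to the end of the tag and backtracked only as far as
// the LAST attribute, so onmouseover ended up inside the kept group.

// Run the replacement and count how many matches were replaced.
$result = preg_replace($pattern, '$1', $strHtml, -1, $count);

echo $count, "\n";   // 1
echo $result, "\n";  // <a href="x.pdf" onmouseover="swap()">link</a>
```

Only one replacement happens per tag: after the first match the engine resumes after the matched text, and there is no further `<` in the same tag for the pattern to anchor on.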

If you want to do this in one call to preg_replace() , then instead of trying to capture the text you want to keep, it makes more sense to match only the text you want to delete and replace it with an empty string:

 preg_replace( '/(onmouseover|onmouseout)=".*?"/i', '', $strHtml ) 

You already had a non-greedy match for the attribute value (with .*? ), and based on your previous code it seems to be working well for you. Please note that this particular expression does not cover every variation that can occur in an HTML / XML document (for example, whitespace around the = sign or single-quoted values). I hope you can make a judgment call as to whether it is enough for your needs.
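To illustrate the limitation just mentioned, here is a rough comparison of the simple pattern with a more tolerant variant. The tolerant pattern is my own assumption and not something the original code used: it also accepts single-quoted values and whitespace around the = sign, and it removes the whitespace in front of the attribute. The sample input and the variable names are hypothetical.

```php
<?php
// Hypothetical sample input that mixes quote styles.
$strHtml = '<a href="x.pdf" onmouseover="swap()" onmouseout=\'restore()\' target="_blank">link</a>';

// Simple pattern: double-quoted values only, no whitespace around "=".
$simple = '/(onmouseover|onmouseout)=".*?"/i';

// More tolerant sketch (an assumption, not part of the original answer):
//   \s+                  whitespace before the attribute, removed with it
//   onmouse(?:over|out)  either attribute name
//   \s*=\s*              optional whitespace around "="
//   (?:"[^"]*"|'[^']*')  a double-quoted or a single-quoted value
$tolerant = '/\s+onmouse(?:over|out)\s*=\s*(?:"[^"]*"|\'[^\']*\')/i';

$simpleResult   = preg_replace($simple, '', $strHtml);
$tolerantResult = preg_replace($tolerant, '', $strHtml);

echo $simpleResult, "\n";    // the single-quoted onmouseout attribute is still there
echo $tolerantResult, "\n";  // <a href="x.pdf" target="_blank">link</a>
```

Neither pattern handles unquoted attribute values, and both will also match the same text if it happens to appear outside a tag, so this is still a judgment call rather than a complete HTML sanitizer.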


Source: https://habr.com/ru/post/1240393/

