How to filter in textarea with regex?

btw: I know using regex is not the best idea in the world ...

For example, I have these input options:

<p>&nbsp; &nbsp; &nbsp;</p> 

or

 <p>&nbsp; &nbsp;</p> 

or

 <p>&nbsp;</p> 

and I want to detect everything except a <p> that contains only &nbsp; entities, in any quantity (0, 1 or 50) ...

I wrote this expression:

 /[^<p>(\s*&nbsp;\s*)*<\/p>]/ig 

and it seems to work, but!

For example, I have this input:

 <p>&nbsp; &nbsp;t&nbsp;</p> 

or

 <p>&nbsp; &nbsp;tttt&nbsp;tttt</p> 

and it is still treated as if it satisfied my regular expression ...

which is not what I want ...

What am I doing wrong in my regex? Or maybe there is a better way to solve this?

3 answers

Assuming you want to remove every <p> that contains nothing but &nbsp; (one or more of them):

Say you have this:

 var a='a<p>&nbsp; &nbsp;&nbsp;</p>c<p>&nbsp; &nbsp;&nbsp;</p>d<p>&nbsp;aa &nbsp;&nbsp;</p>e'; 

The first two paragraphs should go, because they contain only &nbsp;. The third one should stay, because it contains aa.

You will be left with everything except the problematic paragraphs made up purely of &nbsp;.

This code:

 a = a.replace(/<p>.*?<\/p>/g, function (match) {
   // Drop the paragraph when it holds nothing but &nbsp; and whitespace.
   return /^<p>(\s*&nbsp;\s*)*<\/p>$/i.test(match) ? '' : match;
 });

Will yield:

 acd<p>&nbsp;aa &nbsp;&nbsp;</p>e 

As you can see, the last <p> was not removed because it contains aa.
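
As for what goes wrong in the pattern from the question: the square brackets turn it into a negated character class, so it matches any single character that is not in the listed set, rather than the sequence <p>...</p>. Here is a minimal sketch of the difference. The variable names and the sample strings are made up for illustration, and the anchored pattern is the one from the code above without the g flag.

```javascript
// The pattern from the question. [^...] is a negated character class:
// it matches ONE character that is not <, p, >, (, ), *, &, n, b, s, ;, /
// or whitespace. It does not describe a <p>...</p> sequence at all.
const charClass = /[^<p>(\s*&nbsp;\s*)*<\/p>]/i;

// Anchored group: the whole string must be a <p> holding only &nbsp;/whitespace.
const onlyNbsp = /^<p>(\s*&nbsp;\s*)*<\/p>$/i;

// Illustrative inputs (assumptions, not taken from the question verbatim).
const samples = ['<p>&nbsp; &nbsp;</p>', '<p>&nbsp;t&nbsp;</p>', '<p>bbs</p>'];

const results = samples.map(function (s) {
  return { input: s, charClass: charClass.test(s), onlyNbsp: onlyNbsp.test(s) };
});

console.log(results);
```

The last sample is the telling one: "bbs" is real text, but every one of its letters happens to be in the class, so the character-class pattern finds nothing to object to, while the anchored pattern correctly rejects it.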

http://jsbin.com/cizidayeru/3/edit


Your expression is pretty close, you want:

 .replace(/<p>(\s*&nbsp;\s*)+<\/p>/ig,'<p>&nbsp;</p>'); 

This will match <p> , followed by one or more occurrences of \s*&nbsp;\s* , followed by </p> , and replace them with <p>&nbsp;</p> .

Or do you want a paragraph with a single &nbsp; to remain, and only paragraphs with several of them to be removed completely? In that case, you need:

 .replace(/<p>\s*&nbsp;\s*(\s*&nbsp;\s*)+<\/p>/ig,'') 

Note that you should not really use a regular expression to process HTML. ;-)

Edit

If you only need to test it, use:

 /<p>(\s*&nbsp;\s*)+<\/p>/.test(string); 

for one or more and:

 /<p>\s*&nbsp;\s*(\s*&nbsp;\s*)+<\/p>/.test(string); 

for two or more.
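
Keep in mind that these tests are unanchored, so they return true when such a paragraph occurs anywhere in the string. If the goal is to check that the whole input consists of nothing but empty paragraphs, an anchored variant is needed. The following is only a sketch under that assumption: the names hasEmptyParagraph and onlyEmptyParagraphs are hypothetical, and the inner group is written as (?:&nbsp;|\s)* instead of (\s*&nbsp;\s*)*, which accepts the same content but also allows a completely empty <p></p>.

```javascript
// Unanchored: true when an &nbsp;-only paragraph occurs ANYWHERE in the string.
const hasEmptyParagraph = /<p>(\s*&nbsp;\s*)+<\/p>/i;

// Anchored: true only when the WHOLE string is a run of paragraphs that
// contain nothing but &nbsp; entities and whitespace.
const onlyEmptyParagraphs = /^(?:\s*<p>(?:&nbsp;|\s)*<\/p>)+\s*$/i;

const mixed = 'text<p>&nbsp;</p>';
const blank = '<p>&nbsp; &nbsp;</p>\n<p>&nbsp;</p>';

console.log(hasEmptyParagraph.test(mixed));   // true: one empty paragraph is present
console.log(onlyEmptyParagraphs.test(mixed)); // false: "text" is real content
console.log(onlyEmptyParagraphs.test(blank)); // true: nothing but empty paragraphs
```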


You can parse the HTML into DOM nodes before displaying it on the page. This has the benefit that you do not need to include the tag in your regex. An added benefit is that your paragraph elements can carry attributes, such as class names, data-* attributes, or inline styles, all of which would make a regex written for a bare <p> fail.

Since the markup is parsed into DOM nodes before being added to the body, there is another advantage: you do not need to look for &nbsp; in your regex. You can just look for whitespace with \s (or, conversely, for any non-whitespace character).
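
Both points can be checked without a DOM. The sketch below is illustrative only: isBlank is a hypothetical helper, and the strings with \u00a0 stand in for what a parsed paragraph's text would contain, on the assumption that the parser has decoded each &nbsp; entity into a U+00A0 character (which JavaScript's \s matches).

```javascript
// A pattern written for a bare <p> misses paragraphs that carry attributes.
const bareP = /<p>(\s*&nbsp;\s*)+<\/p>/i;
const matchesBare = bareP.test('<p>&nbsp;</p>');
const matchesWithClass = bareP.test('<p class="x">&nbsp;</p>');

// Hypothetical helper: true when the text is empty or whitespace only.
function isBlank(text) {
  return /^\s*$/.test(text);
}

// Stand-ins for decoded paragraph text (assumption: &nbsp; became U+00A0).
const decodedBlank = isBlank('\u00a0 \u00a0');
const decodedWithText = isBlank('\u00a0aa \u00a0');
// The raw, undecoded entity is ordinary text, not whitespace.
const rawEntity = isBlank('&nbsp;');

console.log(matchesBare, matchesWithClass, decodedBlank, decodedWithText, rawEntity);
```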

jQuery

 var strText = 'a<p>&nbsp; &nbsp;&nbsp;</p>c<p>&nbsp; &nbsp;&nbsp;</p>d<p>&nbsp;aa &nbsp;&nbsp;</p>e',
     $div = $('<div/>').html(strText),
     $p = $div.find('p');
 var empty_paragraph_count = 0;

 $p.each(function () {
   var $this = $(this);
   if (/^\s*$/.test($this.text())) {
     empty_paragraph_count++;
     // uncomment this line if you want to remove the paragraph:
     // $this.remove();
   }
 });

Then you can do whatever you want with $div.html() and with empty_paragraph_count, which tells you how many paragraphs were empty or contained only whitespace.


Vanilla

If you are looking for a VanillaJS solution, you can use the same approach:

 var strText = 'a<p>&nbsp; &nbsp;&nbsp;</p>c<p>&nbsp; &nbsp;&nbsp;</p>d<p>&nbsp;aa &nbsp;&nbsp;</p>e',
     div = document.createElement('div'),
     empty_paragraph_count = 0;

 // Parse the markup inside a detached element; the assignment has to be
 // its own statement, it cannot sit inside the var list.
 div.innerHTML = strText;
 var p = div.getElementsByTagName('p');

 for (var i = 0, n = p.length; i < n; i++) {
   if (/^\s*$/.test(p[i].textContent)) {
     empty_paragraph_count++;
   }
 }

Source: https://habr.com/ru/post/988146/

