How to search text between HTML tags

I am using mongoJS to run my database queries. I use a regular expression to find a string in a collection, but I ran into a problem with values that contain HTML tags. How can I search the text while ignoring the HTML tags?

var userInput = $scope.userInput; // value from user input
db.collections.find({'obj': {$regex: new RegExp(userInput) } }).toArray(function(err, result){
  return res.json(result);
});

Collections

[{_id:"34aw34d343s4", obj:"How are you?"},
{_id:"34asdfwer343s4", obj:"Are you okay?"},
{_id:"3sDaweqr43s4", obj:"Goodbye, my friend!"},
{_id:"34aw3sdfgds3s4", obj:"Do you know these are <strong>important</strong> items"}]

User input

these are
these
these are important

Output

[{_id:"34aw3sdfgds3s4", obj:"Do you know these are <strong>important</strong> items"}]
[{_id:"34aw3sdfgds3s4", obj:"Do you know these are <strong>important</strong> items"}]
[]

Expected

[{_id:"34aw3sdfgds3s4", obj:"Do you know these are <strong>important</strong> items"}]
[{_id:"34aw3sdfgds3s4", obj:"Do you know these are <strong>important</strong> items"}]
[{_id:"34aw3sdfgds3s4", obj:"Do you know these are <strong>important</strong> items"}]
3 answers

You must sanitize user input before it goes into the database. From what I understand of your system, there is a good chance that user input is not sanitized before it is inserted into the database, which leaves your site vulnerable to XSS attacks.

A library such as sanitize-html can do this.
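To make the idea concrete, here is a minimal sketch of a simpler, related technique: escaping HTML special characters in user input before storing it. This is escaping, not full sanitizing, and it is not how sanitize-html works internally; the helper name escapeHtml is made up for this illustration.

```javascript
// Hypothetical helper: replace the characters that have special meaning
// in HTML with their entity equivalents, so the input is stored and later
// rendered as plain text rather than as markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')   // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

var userInput = '<script>alert("xss")</script>';
console.log(escapeHtml(userInput));
// logs: &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```

If some tags such as <strong> must remain usable, plain escaping is too blunt and an allow-list based sanitizer is the better fit.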


Use RegExp test, which returns true when the pattern matches: /these|are/.test(stringToCheckAgainst);

var testCases = ["these are", "these", "these are <strong>item</strong>"];

testCases.forEach(function(value) {
  document.write(/these|are/.test(value) + "\n");
});
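Note that /these|are/ matches either word on its own. For the multi-word input from the question, one possible approach is to build the pattern from the user input so that the gaps between words may contain whitespace or HTML tags. The sketch below assumes this behaviour is what is wanted; escapeRegExp and buildTagTolerantRegExp are hypothetical helper names, and the sample strings are the ones from the question.

```javascript
// Escape regex metacharacters so the user's words are matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a pattern in which the gap between two words may consist of
// whitespace and/or HTML tags such as <strong> or </strong>.
function buildTagTolerantRegExp(userInput) {
  var words = userInput.trim().split(/\s+/).map(escapeRegExp);
  return new RegExp(words.join('(?:\\s|<[^>]+>)+'));
}

var docs = [
  'How are you?',
  'Are you okay?',
  'Goodbye, my friend!',
  'Do you know these are <strong>important</strong> items'
];

var pattern = buildTagTolerantRegExp('these are important');
var matches = docs.filter(function (text) {
  return pattern.test(text);
});

console.log(matches); // logs an array holding only the fourth string
```

The resulting RegExp object could be passed to $regex in place of new RegExp(userInput). It does not handle a tag that splits a single word, such as th<b>ese</b>.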

If you want to remove the HTML tags, use one of the following methods:

  • jQuery(html).text();
  • yourStr.replace(/<(?:.|\n)*?>/gm, '');
  • yourStr.replace(/<[^>]+>/g, '');
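As a rough illustration of the last pattern, the sketch below strips the tags from one of the question's strings and then searches the plain text. The helper name stripTags is made up for this example, and a regex like this is only a simple approximation, not a full HTML parser.

```javascript
// Hypothetical helper: remove anything that looks like an HTML tag.
function stripTags(html) {
  return html.replace(/<[^>]+>/g, '');
}

var stored = 'Do you know these are <strong>important</strong> items';
var plain = stripTags(stored);

console.log(plain);
// logs: Do you know these are important items

console.log(plain.indexOf('these are important') !== -1);
// logs: true
```

Because this runs in JavaScript rather than inside the database query, it suits filtering results after they are fetched, or preparing a tag-free copy of the text to search against.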

See also: Strip HTML from Text JavaScript


Source: https://habr.com/ru/post/1686258/

