NSRegularExpression to remove HTML tags

I am developing an e-book reader application. The whole book is a single .ePUB file, which contains one HTML file per topic. I want to implement a search function in the application, and I am using the NSRegularExpression class for the search. Please consider the following HTML:

<temp> I am temp in tempo with temptation </temp>

For example, in the HTML above I want to find only the word temp. The string temp appears five times: in <temp>, in </temp>, as the standalone word temp, and inside tempo and temptation. I am looking for a regular expression that matches only the whole word "temp". Occurrences inside the HTML tags <temp> and </temp> should be ignored, and so should the words tempo and temptation.

Thanks in advance

+3

[^<\/?\w*>]+(temp\s)

http://rubular.com/r/3PkdvNZSbr

NSString *evaluate_string = @"<temp> I am temp in tempo with temptation </temp>";
NSString *word = @"temp";
NSError *outError = nil;
// Build the pattern from the search word; backslashes are doubled for the string literal.
NSString *pattern = [NSString stringWithFormat:@"[^<\\/?\\w*>]+(%@\\s)", word];
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&outError];

// firstMatchInString returns nil when there is no match.
NSTextCheckingResult *result = [regex firstMatchInString:evaluate_string options:0 range:NSMakeRange(0, [evaluate_string length])];

if (result) {
    NSLog(@"Found");
}
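
Note that the pattern above needs whitespace right after the word, so it would miss "temp." or a match at the very end of the text. One possible alternative, sketched here and not part of the original answer, uses word boundaries (\b) plus a negative lookahead that rejects any match followed by a ">" with no "<" in between, i.e. a match sitting inside a tag. The helper name WholeWordMatchesOutsideTags is hypothetical, and the sketch assumes that a literal ">" in the text content is escaped as &gt;, as it should be in well-formed EPUB XHTML.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): returns every match of
// `word` as a whole word that is not located inside an HTML tag.
// Assumption: ">" never appears unescaped in text content.
static NSArray<NSTextCheckingResult *> *WholeWordMatchesOutsideTags(NSString *html, NSString *word) {
    // Escape the search word so regex metacharacters in it are taken literally.
    NSString *escaped = [NSRegularExpression escapedPatternForString:word];

    // \b...\b        -> whole word only ("tempo" and "temptation" are rejected)
    // (?![^<>]*>)    -> reject if a ">" follows before any "<", i.e. inside a tag
    NSString *pattern = [NSString stringWithFormat:@"\\b%@\\b(?![^<>]*>)", escaped];

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    return [regex matchesInString:html
                          options:0
                            range:NSMakeRange(0, html.length)];
}
```

Calling WholeWordMatchesOutsideTags(@"<temp> I am temp in tempo with temptation </temp>", @"temp") should yield a single match whose range covers the standalone word. Because the ranges refer to the original HTML string, they can be used directly for highlighting. If the search word itself begins or ends with a non-word character, \b will not behave as a whole-word test and the pattern would need adjusting.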
+2

A pattern that matches the HTML tags themselves:

</?[a-z][a-z0-9]*[^<>]*>

RegexBuddy :)
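
As a rough sketch of how this tag pattern might be used for the search in the question: strip the tags first, then look for the whole word in the remaining text. The helper name CountWholeWordInStrippedText is hypothetical, and the case-insensitive option is an assumption added so that uppercase tag names are matched too.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes tags with the pattern above, then counts
// whole-word occurrences of `word` in the remaining text.
static NSUInteger CountWholeWordInStrippedText(NSString *html, NSString *word) {
    NSError *error = nil;

    // The tag pattern from this answer; case-insensitive is an added assumption.
    NSRegularExpression *tagRegex =
        [NSRegularExpression regularExpressionWithPattern:@"</?[a-z][a-z0-9]*[^<>]*>"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (tagRegex == nil) {
        NSLog(@"Invalid tag pattern: %@", error);
        return 0;
    }

    // Replace each tag with a space so words on either side of a tag stay separate.
    NSString *text = [tagRegex stringByReplacingMatchesInString:html
                                                        options:0
                                                          range:NSMakeRange(0, html.length)
                                                   withTemplate:@" "];

    // \b...\b matches the word only when it stands alone.
    NSString *wordPattern =
        [NSString stringWithFormat:@"\\b%@\\b",
                                   [NSRegularExpression escapedPatternForString:word]];
    NSRegularExpression *wordRegex =
        [NSRegularExpression regularExpressionWithPattern:wordPattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (wordRegex == nil) {
        NSLog(@"Invalid word pattern: %@", error);
        return 0;
    }

    return [wordRegex numberOfMatchesInString:text
                                      options:0
                                        range:NSMakeRange(0, text.length)];
}
```

One trade-off of this approach is that match positions refer to the stripped text, not to the original HTML, so it suits counting or yes/no searches better than highlighting in place.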

+1

Source: https://habr.com/ru/post/1790888/

