How to detect email addresses in arbitrary strings

I am using the following code to detect email addresses in a string. It works well, except for addresses whose local part is purely numeric, such as " 536264846@gmail.com ". Is it possible to work around this Apple bug? Any help would be appreciated!

    NSString *string = @" 536264846@gmail.com ";
    NSError *error = NULL;
    NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink error:&error];
    NSArray *matches = [detector matchesInString:string options:0 range:NSMakeRange(0, [string length])];
    for (NSTextCheckingResult *match in matches) {
        if ([match.URL.scheme isEqualToString:@"mailto"]) {
            NSString *email = [match.URL.absoluteString substringFromIndex:match.URL.scheme.length + 1];
            NSLog(@"email :%@", email);
        } else {
            NSLog(@"[match URL] :%@", [match URL]);
        }
    }

Edit: the logged result is: [match URL] :http://gmail.com

1 answer

What I have done in the past:

  • tokenize the input, for example by splitting on spaces (most other common delimiters can legally appear inside an email address). This may not be necessary if the regular expression is not anchored, but I am not sure how it would behave without the "^" and "$" anchors (which I added to what was shown on the website).

  • remember that addresses can take the form "display name" <address> as well as just a bare address

  • in each token, look for "@", as this is probably the best indicator you have that it is an email address

  • run the token through the regular expression shown on this email comparison site (I found that the one tagged #1 as of 3/21/2013 worked best)
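The first three steps above can be sketched roughly as follows. This is only an illustration, not the answer's actual code: the function name `EmailCandidatesInString` is hypothetical, and it assumes that splitting on whitespace is good enough, so a quoted local part that contains a space would be split incorrectly, and trailing punctuation such as a comma is left attached to the token.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): splits `input` on
// whitespace and returns the tokens that contain "@", with the angle
// brackets of the "display name" <address> form stripped.
static NSArray<NSString *> *EmailCandidatesInString(NSString *input) {
    NSMutableArray<NSString *> *candidates = [NSMutableArray array];
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSCharacterSet *brackets = [NSCharacterSet characterSetWithCharactersInString:@"<>"];

    for (NSString *token in [input componentsSeparatedByCharactersInSet:whitespace]) {
        // Consecutive separators yield empty tokens; the "@" check skips them too.
        if ([token rangeOfString:@"@"].location == NSNotFound) {
            continue;
        }
        // Remove leading and trailing "<" or ">" only; inner characters are untouched.
        NSString *trimmed = [token stringByTrimmingCharactersInSet:brackets];
        if (trimmed.length > 0) {
            [candidates addObject:trimmed];
        }
    }
    return candidates;
}
```

Each returned candidate still has to go through the regular expression; containing "@" only makes a token worth checking.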

What I did was put the regular expression in a text file, so I did not need to escape it:

^(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){255,})(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){65,}@)(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22))(?:\.(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+)*)|(?:\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\]))$

Defined an ivar:

    NSRegularExpression *reg;

Created the regex:

    // Load the pattern from EMailRegExp.txt in the app bundle.
    NSString *fullPath = [[NSBundle mainBundle] pathForResource:@"EMailRegExp" ofType:@"txt"];
    NSString *pattern = [NSString stringWithContentsOfFile:fullPath encoding:NSUTF8StringEncoding error:NULL];
    NSError *error = nil;
    // The pattern is written in lowercase, so match case-insensitively.
    reg = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:&error];
    assert(reg && !error);

Then wrote a method to do the comparison:

    - (BOOL)isValidEmail:(NSString *)string
    {
        // The pattern is anchored with "^" and "$", so pass a single token, not a whole sentence.
        NSTextCheckingResult *match = [reg firstMatchInString:string options:0 range:NSMakeRange(0, [string length])];
        return match != nil;
    }
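To apply the same check to a whole list of tokens, something along these lines might work. It is a sketch under assumptions rather than part of the original answer: `TokensMatchingRegex` is a hypothetical name, and the regex is passed in as a parameter so the function does not depend on the `reg` ivar.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the tokens that `regex` matches over their
// full length, preserving the input order.
static NSArray<NSString *> *TokensMatchingRegex(NSArray<NSString *> *tokens,
                                                NSRegularExpression *regex) {
    NSMutableArray<NSString *> *valid = [NSMutableArray array];
    for (NSString *token in tokens) {
        NSRange whole = NSMakeRange(0, token.length);
        NSTextCheckingResult *match = [regex firstMatchInString:token options:0 range:whole];
        // Defensive check: with a pattern anchored by "^" and "$", any match
        // already spans the whole token.
        if (match != nil && NSEqualRanges(match.range, whole)) {
            [valid addObject:token];
        }
    }
    return valid;
}
```

Called with the regex loaded from EMailRegExp.txt, this would keep only the tokens that `isValidEmail:` accepts.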

EDIT: I included this in a GitHub project

EDIT2: for a variation that is less rigorous but faster, see the comments section of this question


Source: https://habr.com/ru/post/1480584/

