Allow + in email regex check

Regex is doing my head in. How can I change this check to accept plus addresses, so that I can subscribe with test+spam@gmail.com?

if(!preg_match("/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*$/i", $_GET['em'])) { 
5 answers

It looks like you are not completely familiar with what your regular expression is doing at the moment; understanding that would be a good first step before modifying it. Let me walk through your regular expression using the email address john.robert.smith@mail.com (the note in parentheses after each item shows which part of the address that item matches):

  • ^ is the start-of-string anchor. It indicates that any match must begin at the start of the string. If the pattern is not anchored, the regex engine may match a substring, which is often undesirable.

    Anchors are zero-width, which means they do not consume any characters.

  • [_a-z0-9-]+ consists of two elements, a character class and a repetition modifier:

    • [...] defines a character class, which tells the regex engine that any one of the listed characters is a valid match. In this case, the class contains the letters a-z, the digits 0-9, dashes and underscores (in general, a dash in a character class defines a range, so you can write a-z instead of abcdefghijklmnopqrstuvwxyz; when given as the last character in a class, it acts as a literal dash).
    • + is a repetition modifier indicating that the previous token (in this case, the character class) can be repeated one or more times. There are two more repetition operators: * matches zero or more times; ? matches zero or one time (i.e. it makes something optional).

    (matches "john" in john.robert.smith@mail.com)

  • (\.[_a-z0-9-]+)* again contains the repeated character class. It also contains a group and an escaped character:

    • (...) defines a group, which lets you treat several tokens as one (in this case, the group is repeated as a whole). Say we wanted to match 'abc' zero or more times (i.e. abcabcabc should match, abcccc should not). If we try the pattern abc*, the repetition modifier applies only to c, because c is the last token before the modifier. To get around this, we can group abc, giving (abc)*, in which case the modifier applies to the whole group as if it were a single token.
    • \. indicates a literal dot character. This is necessary because . is a special character in regular expressions that means "any character". Since we want to match an actual dot, we need to escape it.

    (matches ".robert.smith" in john.robert.smith@mail.com)

  • @ is not a special character in regex, so like all other non-special characters, it matches literally.
    (matches "@" in john.robert.smith@mail.com)

  • [a-z0-9-]+ again defines a repeated character class, like the second item above (this time without the underscore).
    (matches "mail" in john.robert.smith@mail.com)

  • (\.[a-z0-9-]+)* is almost exactly the same pattern as the third item above.
    (matches ".com" in john.robert.smith@mail.com)

  • $ is the end-of-string anchor. It works the same as ^ above, except that it matches at the end of the string.
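
As a rough illustration of the walkthrough above, the following PHP sketch runs the pattern from the question against the sample address and shows the effect of the anchors and of grouping. The variable names and the test strings other than the two addresses from this thread are made up for the example.

```php
<?php
// The pattern from the question, unchanged.
$pattern = "/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*$/i";

// The sample address matches; the plus address does not (yet).
var_dump(preg_match($pattern, 'john.robert.smith@mail.com')); // int(1)
var_dump(preg_match($pattern, 'test+spam@gmail.com'));        // int(0)

// Without ^ and $, a matching substring anywhere in the input is enough.
// $unanchored is a simplified pattern invented for this illustration.
$unanchored = "/[_a-z0-9-]+@[a-z0-9-]+/i";
var_dump(preg_match($unanchored, '<<< bob@mail >>>'));        // int(1)
var_dump(preg_match($pattern, '<<< bob@mail >>>'));           // int(0)

// Quantifier scope: abc* repeats only c, (abc)* repeats the whole group.
var_dump(preg_match('/^abc*$/', 'abccc'));    // int(1)
var_dump(preg_match('/^abc*$/', 'abcabc'));   // int(0)
var_dump(preg_match('/^(abc)*$/', 'abcabc')); // int(1)
var_dump(preg_match('/^(abc)*$/', 'abccc'));  // int(0)
```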


With this in mind, it should be a little clearer how to add a part that matches the plus segment. As we saw above, + is a special character, so it must be escaped. Then, since some characters must follow the +, we can define a character class with the characters we want to allow and give it a repetition modifier. Finally, we need to make the whole group optional, since email addresses do not have to contain a + segment:

 (\+[a-z0-9-]+)? 

When pasted into your regex, it will look like this:

 /^[_a-z0-9-]+(\.[_a-z0-9-]+)*(\+[a-z0-9-]+)?@[a-z0-9-]+(\.[a-z0-9-]+)*$/i 
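
A quick way to check the modified pattern is to run it against a few addresses. This is only a sketch: the constant and the helper function name are hypothetical, and the test addresses other than the two from this thread are invented to show where the optional group stops matching.

```php
<?php
// The question's pattern with the optional plus segment added.
const EMAIL_PATTERN = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*(\+[a-z0-9-]+)?@[a-z0-9-]+(\.[a-z0-9-]+)*$/i';

// Hypothetical helper; the original code calls preg_match() inline on $_GET['em'].
function isAcceptedEmail(string $email): bool
{
    return preg_match(EMAIL_PATTERN, $email) === 1;
}

var_dump(isAcceptedEmail('test+spam@gmail.com'));        // bool(true)
var_dump(isAcceptedEmail('john.robert.smith@mail.com')); // bool(true): the plus segment is optional
var_dump(isAcceptedEmail('test+@gmail.com'));            // bool(false): + needs at least one character after it
var_dump(isAcceptedEmail('a+b+c@gmail.com'));            // bool(false): the group is allowed at most once
```

If several plus segments should be accepted, the ? after the group could be changed to *, at the cost of a looser check.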

Keep your sanity. Get a ready-made PHP RFC 822 email address parser.


I used this regex to validate email addresses, and it works well with addresses that contain +:

 /^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/ 

\+ will match a literal +, but keep in mind: you will still be nowhere near matching all possible email addresses according to the RFC specification, because the actual regular expression for that is insane. It is almost certainly not worth it; you should use a real email parser for this.
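
In PHP, one readily available alternative to a hand-written pattern is the built-in filter_var() with FILTER_VALIDATE_EMAIL. It is not a full RFC parser either (the PHP manual describes it as checking against the addr-spec syntax with some exceptions), but it does accept plus addresses. A minimal sketch, with a hypothetical helper name:

```php
<?php
// Hypothetical helper wrapping PHP's built-in email filter.
function looksLikeEmail(string $email): bool
{
    // filter_var() returns the filtered string on success and false on failure.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(looksLikeEmail('test+spam@gmail.com')); // bool(true)
var_dump(looksLikeEmail('not an email'));        // bool(false)
```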


This is another solution (similar to the solution found by David):

 //Escaped for .Net
 ^[_a-zA-Z0-9-]+((\\.[_a-zA-Z0-9-]+)*|(\\+[_a-zA-Z0-9-]+)*)*@[a-zA-Z0-9-]+(\\.[a-zA-Z0-9-]+)*(\\.[a-zA-Z]{2,4})$
 //Native
 ^[_a-zA-Z0-9-]+((\.[_a-zA-Z0-9-]+)*|(\+[_a-zA-Z0-9-]+)*)*@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*(\.[a-zA-Z]{2,4})$
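
For comparison with the PHP code in the question, here is a small sketch that wraps the native form of this pattern in / delimiters and tries it with preg_match(). The variable name and the second test address are made up for the example; the second check shows that the {2,4} at the end limits the final domain label to four letters.

```php
<?php
// The "native" pattern from this answer, wrapped in / delimiters for PCRE.
$native = '/^[_a-zA-Z0-9-]+((\.[_a-zA-Z0-9-]+)*|(\+[_a-zA-Z0-9-]+)*)*@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*(\.[a-zA-Z]{2,4})$/';

// The plus address from the question is accepted.
var_dump(preg_match($native, 'test+spam@gmail.com'));  // int(1)

// A final label longer than four letters is rejected by {2,4}.
var_dump(preg_match($native, 'user@example.museum'));  // int(0)
```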

Source: https://habr.com/ru/post/1346088/

