It looks like you are not completely familiar with what your regular expression currently does; understanding that is a good first step before modifying it. Let me walk through your regular expression using the email address john.robert.smith@mail.com (the note in parentheses after each section shows which part of the address that section matches):
^ - the beginning-of-line anchor. It indicates that any match must begin at the beginning of the line. If the pattern is not anchored, the regex engine may match a substring, which is often undesirable.
Anchors are zero-width, which means they do not consume any characters.
[_a-z0-9-]+ consists of two elements, a character class and a repetition modifier:
[...] defines a character class, which tells the regex engine that any of these characters is a valid match. In this case, the class contains the characters a-z, the digits 0-9, the dash, and the underscore (in general, a dash in a character class defines a range, so you can write a-z instead of abcdefghijklmnopqrstuvwxyz; when given as the last character in a class, it acts as a literal dash).
+ is a repetition modifier that indicates that the preceding token (in this case, the character class) can be repeated one or more times. There are two other repetition operators: * matches zero or more times; ? matches zero times or once (i.e. it makes something optional).
(matches "john" in john.robert.smith@mail.com)
(\.[_a-z0-9-]+)* again contains the repeated character class. It also contains a group and an escaped character:
(...) defines a group, which lets you treat several tokens as one (in this case, the group is repeated as a whole). Say we wanted to match 'abc' zero or more times (i.e. match abcabcabc, but not abcccc). If we try the pattern abc*, the repetition modifier applies only to c, because c is the last token before the modifier. To get around this, we can group abc, giving (abc)*, in which case the modifier applies to the whole group as if it were a single token.
\. indicates a literal dot character. This is necessary because . is a special character in regular expressions that matches any character. Since we want to match an actual dot, we need to escape it.
(matches ".robert.smith" in john.robert.smith@mail.com)
@ is not a special character in regex, so like all other non-special characters, it matches literally.
(matches "@" in john.robert.smith@mail.com)
[a-z0-9-]+ again defines a repeated character class, like item #2 ([_a-z0-9-]+) above.
(matches "mail" in john.robert.smith@mail.com)
(\.[a-z0-9-]+)* - almost exactly the same pattern as item #3 above.
(matches ".com" in john.robert.smith@mail.com)
$ is the end-of-line anchor. It works the same way as ^ above, except that it matches at the end of the line.
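The points above about anchors, quantifiers, and groups can be checked with a short script. This is only a minimal sketch in Python (an assumption, since the question does not say which language the regex runs in), using the standard re module; the sample strings other than the walkthrough address are made up for illustration.

```python
import re

# Without anchors, the engine is free to match just a substring.
unanchored = re.compile(r"[_a-z0-9-]+", re.IGNORECASE)
anchored = re.compile(r"^[_a-z0-9-]+$", re.IGNORECASE)

sample = "john smith"  # contains a space, which is not in the class
unanchored_hit = unanchored.search(sample)  # matches the substring "john"
anchored_hit = anchored.search(sample)      # None: the whole line must match

# A quantifier applies to the preceding token only, unless tokens are grouped.
last_token_only = re.compile(r"^abc*$")  # "ab" followed by zero or more "c"
whole_group = re.compile(r"^(abc)*$")    # "abc" repeated zero or more times

# The original pattern, applied to the address used in the walkthrough.
original = re.compile(
    r"^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*$",
    re.IGNORECASE,
)
full_match = original.search("john.robert.smith@mail.com")

print(unanchored_hit.group(0))
print(anchored_hit)
print(bool(last_token_only.search("abcccc")), bool(whole_group.search("abcccc")))
print(bool(last_token_only.search("abcabcabc")), bool(whole_group.search("abcabcabc")))
print(full_match.group(0))
```

The anchored pattern rejects "john smith" outright, while the unanchored one quietly accepts the "john" part, which is why the anchors matter for validation.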
With this in mind, it should be a little clearer how to add a section that captures the plus segment. As we saw above, + is a special character, so it must be escaped. Then, since some characters must follow the + character, we can define a character class with the characters we want to match and specify its repetition. Finally, we need to make the whole group optional, since email addresses do not need to have a + segment:
(\+[a-z0-9-]+)?
When inserted into your regex, it will look like this:
/^[_a-z0-9-]+(\.[_a-z0-9-]+)*(\+[a-z0-9-]+)?@[a-z0-9-]+(\.[a-z0-9-]+)*$/i
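To sanity-check the combined pattern (written with no whitespace inside it), here is a small sketch in Python. The language is an assumption, and the /.../i delimiters and flag are expressed as re.IGNORECASE. The helper names is_valid and plus_segment and the test addresses are hypothetical, chosen only for illustration.

```python
import re

# The combined pattern; the trailing /i flag becomes re.IGNORECASE.
EMAIL = re.compile(
    r"^[_a-z0-9-]+(\.[_a-z0-9-]+)*(\+[a-z0-9-]+)?@[a-z0-9-]+(\.[a-z0-9-]+)*$",
    re.IGNORECASE,
)


def is_valid(address: str) -> bool:
    """Return True if the address matches the pattern as a whole."""
    return EMAIL.search(address) is not None


def plus_segment(address: str):
    """Return the optional plus segment (group 2), or None if absent."""
    match = EMAIL.search(address)
    return match.group(2) if match else None


for candidate in (
    "john.robert.smith@mail.com",  # no plus segment
    "john.smith+news@mail.com",    # with plus segment
    "john+@mail.com",              # "+" with nothing after it
):
    print(candidate, is_valid(candidate), plus_segment(candidate))
```

Because the group is optional, addresses without a plus segment still match, while a bare + with nothing after it is rejected: the character class inside the group requires at least one character.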