The second regex has a problem:
^(IN:[AE][0-9]{7}Q)|([AE][0-9]{7})$
| has lower precedence than concatenation, so the regex is parsed as:
^(IN:[AE][0-9]{7}Q)   # Starts with (IN:[AE][0-9]{7}Q)
|                     # OR
([AE][0-9]{7})$       # Ends with ([AE][0-9]{7})
To fix this, wrap the alternation in a non-capturing group:
^(?:(IN:[AE][0-9]{7}Q)|([AE][0-9]{7}))$
This ensures that the whole input string matches one of the two formats, rather than merely starting with the first format or ending with the second (which is clearly wrong).
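To see the difference in practice, here is a minimal Ruby sketch that runs both patterns against a few inputs. The last two sample strings are made up for illustration; they are not from the question.

```ruby
# Original pattern: each anchor binds to only one of the alternatives.
BROKEN = /^(IN:[AE][0-9]{7}Q)|([AE][0-9]{7})$/
# Fixed pattern: the non-capturing group puts both alternatives between the anchors.
FIXED = /^(?:(IN:[AE][0-9]{7}Q)|([AE][0-9]{7}))$/

# The last two samples are hypothetical invalid inputs.
SAMPLES = ["IN:E0123463Q", "E0123463", "IN:E0123463Qjunk", "junkE0123463"]

SAMPLES.each do |s|
  puts "#{s.ljust(18)} broken=#{BROKEN.match?(s)} fixed=#{FIXED.match?(s)}"
end
```

Both patterns accept the two valid inputs, but only the original one also accepts the inputs with trailing or leading junk.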
As for shortening the regex, you can replace [0-9] with \d if you want, but it is fine as it is.
I don't think there is another way to shorten the regex using only what Ruby supports by default.
Subroutine call
Just for your information, in Perl/PCRE you can shorten it with a subroutine call:
^(?:([AE][0-9]{7})|(IN:(?1)Q))$
(?1) refers to the pattern defined by the first capture group, that is, [AE][0-9]{7}. The regex is effectively the same; it just looks shorter. This demo with the input IN:E0123463Q shows the whole text being captured by group 2 (and no text being captured by group 1).
Ruby has a similar concept, called a subexpression call, with slightly different syntax. Ruby uses \g<name> or \g<number> to refer to the capture group whose pattern we want to reuse:
^(?:([AE][0-9]{7})|(IN:\g<1>Q))$
The test case here on rubular (Ruby 1.9.7), for the input IN:E0123463Q, returns E0123463 as the match for group 1 and IN:E0123463Q as the match for group 2.
The Ruby implementation (1.9.7) seems to record the captured text for group 1 even though group 1 is not directly involved in the match. In PCRE, subroutine calls do not capture text.
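If you want to check how your own Ruby version fills the groups, a small sketch like the following prints both captures for each form of the input. The constant name SUBEXP is only an illustrative choice.

```ruby
# Pattern from above, using Ruby's \g<1> subexpression call.
SUBEXP = /^(?:([AE][0-9]{7})|(IN:\g<1>Q))$/

["E0123463", "IN:E0123463Q"].each do |input|
  m = SUBEXP.match(input)
  # inspect makes an unset (nil) group visible in the output
  puts "#{input}: group 1 = #{m[1].inspect}, group 2 = #{m[2].inspect}"
end
```

If your code depends on which group is set to tell the two formats apart, test group 2 rather than group 1, since group 1 may be filled in both cases.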
Conditional Regular Expression
There is also the conditional regex, which lets you check whether a capture group has matched. You can check matt's answer for more information.
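As a rough illustration of the idea (my own sketch, not necessarily the pattern from that answer), the following assumes Ruby 2.0 or later, where the regex engine accepts the (?(1)...) conditional syntax. Note that the group numbering differs from the patterns above: here group 1 is the optional IN: prefix.

```ruby
# Hypothetical conditional variant: the trailing Q is required
# only when the optional IN: prefix (group 1) has matched.
CONDITIONAL = /^(IN:)?[AE][0-9]{7}(?(1)Q)$/

["IN:E0123463Q", "E0123463", "IN:E0123463", "E0123463Q"].each do |s|
  puts "#{s.ljust(14)} #{CONDITIONAL.match?(s)}"
end
```

The first two inputs are accepted, while a prefix without Q or a Q without the prefix is rejected.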