Parsing almost always involves regular expressions. However, a regular expression by itself is not a parser. In the simplest sense, a parser consists of:
text input stream -> tokenizer
Usually there is an extra step:
text input stream -> tokenizer -> parser
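Concretely, the data at each stage might look like this (the token names and the tiny arithmetic example are my own illustration, not something from the question):

    # Stage 1: the tokenizer turns the text "3 + 4 * 2" into a flat token stream.
    tokens = [('NUMBER', 3), ('PLUS', '+'), ('NUMBER', 4), ('TIMES', '*'), ('NUMBER', 2)]

    # Stage 2: the parser turns that flat stream into structure,
    # here a nested tuple meaning 3 + (4 * 2).
    tree = ('PLUS', ('NUMBER', 3), ('TIMES', ('NUMBER', 4), ('NUMBER', 2)))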
The tokenizer processes the input stream and chunks the text accordingly, so the programmer does not have to think about it. It consumes characters until only one match is possible, then runs the code associated with that "token". If you do not have a tokenizer, you have to roll one yourself (in pseudocode):
    while stuffInStream:
        currChars += getNextCharFromStream()
        if regex('firstCase'):
            do stuff
        elif regex('other stuff'):
            do more stuff
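A runnable version of that hand-rolled loop might look like the sketch below; the patterns and the handler actions are placeholders I made up, so substitute your own 'firstCase' and 'other stuff':

    import re

    # Each rule pairs a pattern with the action to run when it matches.
    RULES = [
        (re.compile(r'[A-Za-z]+'), lambda text: print('word:', text)),    # do stuff
        (re.compile(r'\d+'),       lambda text: print('number:', text)),  # do more stuff
        (re.compile(r'\s+'),       lambda text: None),                    # skip whitespace
    ]

    def hand_rolled_scan(text):
        pos = 0
        while pos < len(text):                    # while stuffInStream
            for pattern, action in RULES:
                match = pattern.match(text, pos)  # try each rule at the current position
                if match:
                    action(match.group())
                    pos = match.end()
                    break
            else:
                raise SyntaxError('unexpected character %r at position %d' % (text[pos], pos))

    hand_rolled_scan('hello 42 world')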
This kind of loop code is full of gotchas unless you write it all the time. It is also easy to have a computer generate one from a set of rules, which is exactly how lex/flex works: you give it rules associated with tokens, and it can pass each token on to yacc/bison as your parser, which adds structure.
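You can see that "rules in, tokenizer out" idea even without lex/flex. The sketch below (the token names are made up for illustration) builds one master regex from a table of rules, which is roughly the kind of thing those generators automate, except that they also build a proper state machine and error handling for you:

    import re

    # Declarative rules, like a tiny lex specification: (token name, pattern).
    TOKEN_SPEC = [
        ('NUMBER', r'\d+'),
        ('WORD',   r'[A-Za-z]+'),
        ('SKIP',   r'\s+'),   # matched but not emitted
    ]
    MASTER_RE = re.compile('|'.join('(?P<%s>%s)' % pair for pair in TOKEN_SPEC))

    def tokenize(text):
        for m in MASTER_RE.finditer(text):
            if m.lastgroup != 'SKIP':
                yield (m.lastgroup, m.group())

    print(list(tokenize('add 2 and 40')))
    # -> [('WORD', 'add'), ('NUMBER', '2'), ('WORD', 'and'), ('NUMBER', '40')]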
Note that a lexer is just a state machine. It can do anything as it moves from state to state. I have written lexers that stripped characters from the input stream, opened files, printed text, sent email, and so on.
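For example, here is a tiny hand-written state machine in that spirit (the states and the comment syntax are made up for illustration): it strips everything from a '#' to the end of the line while passing the rest of the input through:

    def strip_comments(stream):
        """Two-state machine: TEXT copies characters, COMMENT discards them."""
        state = 'TEXT'
        out = []
        for ch in stream:
            if state == 'TEXT':
                if ch == '#':
                    state = 'COMMENT'    # transition: start discarding
                else:
                    out.append(ch)       # action in this state: keep the character
            elif state == 'COMMENT':
                if ch == '\n':
                    state = 'TEXT'       # transition: back to copying
                    out.append(ch)       # keep the newline itself
        return ''.join(out)

    print(strip_comments('keep this  # drop this\nkeep this too\n'))
    # -> 'keep this  \nkeep this too\n'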
So, if you want to collect the text after the fourth capital letter, a regular expression is not only suitable, it is the right solution. BUT, if you want to parse text input with different rules for what to do and an unknown amount of input, then you need a lexer/parser. I suggest PLY since you are using Python.
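Here is a minimal PLY sketch to show the shape of it (assuming the ply package is installed; the token set and the toy "sum the numbers" grammar are just an illustration, not the grammar you actually need):

    import ply.lex as lex
    import ply.yacc as yacc

    # --- lexer: token rules in, tokenizer out ---
    tokens = ('NUMBER', 'PLUS')

    t_PLUS = r'\+'
    t_ignore = ' \t'

    def t_NUMBER(t):
        r'\d+'
        t.value = int(t.value)
        return t

    def t_error(t):
        print('Illegal character %r' % t.value[0])
        t.lexer.skip(1)

    # --- parser: adds structure on top of the token stream ---
    def p_expr_plus(p):
        'expr : expr PLUS NUMBER'
        p[0] = p[1] + p[3]

    def p_expr_number(p):
        'expr : NUMBER'
        p[0] = p[1]

    def p_error(p):
        print('Syntax error at %r' % (p,))

    lexer = lex.lex()
    parser = yacc.yacc()

    print(parser.parse('1 + 2 + 3'))   # -> 6

The point is the division of labor: the lexer rules describe the tokens, the grammar rules describe the structure, and that is exactly the tokenizer -> parser pipeline above.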