This is a much harder problem than it might seem at first glance, since you need to consider comment tokens inside strings, comment tokens that are themselves commented out, etc.
I wrote a string and comment parser for C#. Let me see if I can dig out something that will help ... I will update if I find anything.
EDIT: ... OK, so I found my old project "codemasker". It turns out that I did it in stages, not with a single regular expression. Basically, I step through the source file looking for start tokens; when I find one, I look for the matching end token and mask everything in between. This takes the context of the start token into account: if you find a "string start" token, you can safely ignore comment tokens until you reach the end of the string, and vice versa. Once the code is masked (I used GUIDs as masks and a hash table to keep track of them), you can safely do your search and replace, and then finally restore the masked code.
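I don't have the original code to hand, so here is a rough sketch of the mask / replace / restore idea in Python rather than C#. It is not the "codemasker" code itself. The function names `mask_code` and `restore` and the `__MASK_...__` placeholder format are made up for illustration, and it assumes only `//` comments, `/* */` comments and ordinary double-quoted strings with backslash escapes. Verbatim strings (`@"..."`) and char literals such as `'"'` are not handled.

```python
import uuid


def mask_code(source):
    """Replace every comment and string literal with a unique placeholder.

    Returns the masked text and a dict mapping placeholder -> original text.
    """
    table = {}
    out = []
    i, n = 0, len(source)
    while i < n:
        if source.startswith("//", i):
            # Line comment: runs up to (not including) the newline.
            end = source.find("\n", i)
            end = n if end == -1 else end
        elif source.startswith("/*", i):
            # Block comment: runs through the closing */ (or to end of file).
            close = source.find("*/", i + 2)
            end = n if close == -1 else close + 2
        elif source[i] == '"':
            # String literal: comment tokens inside it are ignored.
            end = i + 1
            while end < n and source[end] != '"':
                end += 2 if source[end] == "\\" else 1  # skip escaped chars
            end = min(end + 1, n)
        else:
            # Ordinary code: copy through unchanged.
            out.append(source[i])
            i += 1
            continue
        key = "__MASK_" + uuid.uuid4().hex + "__"
        table[key] = source[i:end]
        out.append(key)
        i = end
    return "".join(out), table


def restore(masked, table):
    """Put the original comments and strings back in place of the placeholders."""
    for key, original in table.items():
        masked = masked.replace(key, original)
    return masked
```

Usage would look something like this: mask, do the search and replace on the masked text, then restore. Only the `foo` in real code is renamed; the ones inside the string and the comments are left alone.

```python
src = 'string s = "foo // not a comment"; // foo here\nint foo = 1; /* foo */'
masked, table = mask_code(src)
edited = masked.replace("foo", "bar")
print(restore(edited, table))
```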
Hope this helps.