The naive version changes each rule of the original grammar:
LHS = RHS1 RHS2 ... RHSN ;
into
LHS = RHS1 COMMENTS RHS2 COMMENTS ... COMMENTS RHSN ;
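For instance, with a hypothetical COMMENTS nonterminal matching zero or more comment tokens:
COMMENTS = ;
COMMENTS = COMMENTS COMMENT ;
a toy rule such as
assignment = IDENTIFIER EQUALS expression SEMI ;
would become
assignment = IDENTIFIER COMMENTS EQUALS COMMENTS expression COMMENTS SEMI ;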
Although this works in the abstract, it will most likely break your parser if it is built with an LL or LALR parser generator: with the (possibly empty) COMMENTS nonterminal now appearing between every pair of symbols, the parser can no longer decide what to do by looking only at the next token. You would therefore have to switch to a more powerful parsing technology, such as GLR.
A smarter version replaces (only and exactly) each terminal T with a nonterminal:
T = COMMENTS t ;
and modifies the original lexer to trivially emit t instead of T. You still have the same lookahead problems.
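A sketch of that lexer-side change in Python (names are hypothetical; assume the base lexer yields (kind, text) pairs):

    def demote_terminals(base_lexer):
        # Emit t instead of T: rename every non-comment token kind so the
        # new T = COMMENTS t ; grammar rules apply to it.
        for kind, text in base_lexer:
            yield (kind if kind == "COMMENT" else kind.lower(), text)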
But this gives us the basis for a real solution.
A more sophisticated version forces the lexer to collect the comments it sees in front of a token and attach them to the next token it emits; in essence, we implement the modified terminal rule above inside the lexer itself.
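A minimal sketch of this in Python, under the same assumptions (hypothetical names; the base lexer yields (kind, text) pairs and tokenizes comments as kind "COMMENT"):

    from dataclasses import dataclass, field

    @dataclass
    class Token:
        kind: str                         # e.g. "IDENTIFIER", "NUMBER"
        text: str
        comments: list = field(default_factory=list)

    def attach_comments(base_lexer):
        # Hold comments until a real token arrives, then attach them to
        # that token; the parser never sees a COMMENT token at all.
        pending = []
        for kind, text in base_lexer:
            if kind == "COMMENT":
                pending.append(text)
            else:
                yield Token(kind, text, comments=pending)
                pending = []
        # Comments after the last real token ride on an EOF marker.
        yield Token("EOF", "", comments=pending)

The parser's view of the token stream is exactly what it was before; the comments are just extra payload.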
Now the parser (you don't need to switch technologies) sees exactly the token stream it originally saw; the tokens simply carry the comments along as annotations. You will find it useful to divide the comments into those attached to the previous token and those attached to the next one, but any such division is only a heuristic, because there is no practical way to decide which token a comment really belongs to.
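One workable heuristic (my assumption, not something the lexer can verify): a comment starting on the same line as the preceding token trails that token; everything else leads the following token. A sketch, again with hypothetical names:

    from dataclasses import dataclass, field

    @dataclass
    class Comment:
        text: str
        line: int

    @dataclass
    class Tok:
        kind: str
        text: str
        line: int
        leading: list = field(default_factory=list)   # comments before the token
        trailing: list = field(default_factory=list)  # same-line comments after it

    def assign_comments(stream):
        # stream holds Tok and Comment objects in source order. Note that
        # a trailing comment mutates a token already yielded, so collect
        # the results into a list before consuming them.
        prev, pending = None, []
        for item in stream:
            if isinstance(item, Comment):
                if prev is not None and item.line == prev.line:
                    prev.trailing.append(item)   # e.g.  x = 1;  // init x
                else:
                    pending.append(item)         # comment on its own line
            else:
                item.leading = pending
                pending = []
                prev = item
                yield item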
You will want to capture position information on the tokens and the comments, to enable regeneration of the source text ("comments in their proper places"). You will find it even more fun to actually regenerate the text with the appropriate radix on numeric literals, the original character-string escapes, and so on, so as not to violate the language's syntax rules.
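A sketch of the regeneration side, continuing the hypothetical Tok and Comment shapes above; a real tool would also use the recorded columns to restore layout, and would re-render every literal exactly as written (hex stays hex, escapes stay escaped):

    def render(tokens):
        # Naive re-emission: leading comments get their own lines,
        # trailing comments stay on the token's line.
        out = []
        for tok in tokens:
            for c in tok.leading:
                out.append(c.text + "\n")
            out.append(tok.text)
            for c in tok.trailing:
                out.append("  " + c.text)
            out.append("\n" if tok.kind == "SEMI" else " ")
        return "".join(out)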
We do this with our general-purpose language processing tools, and it works quite well. It is amazing how much work it takes to get all of this straight so that you can focus on your conversion task. People greatly underestimate this.