Detecting regular expressions in content while parsing

I am writing a simple parser for C. I just ran it on files in some other languages (for fun, to see how C-like they are, and out of laziness: I would rather not write a separate parser for each language if I can avoid it).

However, the parser seems to break on JavaScript if the parsed code contains regular expressions ...

Case 1: For example, when parsing this piece of JavaScript code,

var phone="(304)434-5454"
phone=phone.replace(/[\(\)-]/g, "") 
//Returns "3044345454" (removes "(", ")", and "-")

the "(", "[", etc. get matched as openers of new scopes, which may never be closed.
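To make the failure mode concrete, here is a minimal sketch (in Python, not the asker's parser) of a bracket matcher with and without a regex-literal heuristic. The heuristic is an assumption, not a complete rule: a "/" is taken to start a regex literal when the previous significant character cannot end a value. All names (`REGEX_PRECEDERS`, `brackets_balanced`, and so on) are hypothetical, and cases such as `return /re/` or block comments are not handled.

```python
# Characters after which a "/" is assumed to start a regex literal rather
# than a division. This is a heuristic, not the full JavaScript rule.
REGEX_PRECEDERS = set("(,=:[!&|?{};+-*%<>~^")
PAIRS = {")": "(", "]": "[", "}": "{"}


def skip_string(src, i):
    """Return the index just past the quoted string starting at src[i]."""
    quote = src[i]
    i += 1
    while i < len(src) and src[i] != quote:
        i += 2 if src[i] == "\\" else 1  # step over escaped characters
    return i + 1


def skip_regex(src, i):
    """Return the index just past the regex literal starting at src[i] == '/'."""
    i += 1
    in_class = False  # inside [...], where "/" does not end the literal
    while i < len(src) and src[i] != "\n":
        c = src[i]
        if c == "\\":
            i += 2
            continue
        if c == "[":
            in_class = True
        elif c == "]":
            in_class = False
        elif c == "/" and not in_class:
            return i + 1
        i += 1
    return i


def brackets_balanced(src, regex_aware=True):
    """Check that (), [] and {} nest properly outside strings and // comments."""
    stack = []
    prev = ""  # last significant (non-whitespace) character seen
    i = 0
    while i < len(src):
        c = src[i]
        if c in "\"'":
            i = skip_string(src, i)
            prev = c
            continue
        if src.startswith("//", i):
            nl = src.find("\n", i)
            i = len(src) if nl == -1 else nl
            continue
        if c == "/" and regex_aware and (prev == "" or prev in REGEX_PRECEDERS):
            i = skip_regex(src, i)
            prev = "/"
            continue
        if c in "([{":
            stack.append(c)
        elif c in PAIRS:
            if not stack or stack.pop() != PAIRS[c]:
                return False
        if not c.isspace():
            prev = c
        i += 1
    return not stack
```

With a hypothetical input such as `x = s.replace(/[(]/g, "")`, the naive mode (`regex_aware=False`) reports a mismatch because it sees the "(" inside the character class, while the regex-aware mode skips the whole literal.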

Case 2: And, for the Perl code snippet,

 # Replace backslashes with two forward slashes
 # Any character can be used to delimit the regex
 $FILE_PATH =~ s@\\@//@g; 

the "//" gets matched as the start of a comment ...
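For the Perl case, one possible workaround is to mask substitution operators before the C-like parser sees them. The sketch below (Python, with hypothetical names `find_perl_substitutions` and `mask_spans`) assumes the operator follows `=~`, uses a single non-bracketing delimiter, and is not itself inside a string or comment; real Perl allows considerably more than this.

```python
import re


def find_perl_substitutions(src):
    """Return (start, end) spans of `=~ s<d>pattern<d>replacement<d>flags`.

    Assumption: <d> is one non-word, non-space, non-bracketing character.
    """
    spans = []
    for m in re.finditer(r"=~\s*s(?=[^\w\s])", src):
        i = m.end()
        delim = src[i]
        if delim in "([{<":
            continue  # bracketing delimiters are out of scope for this sketch
        i += 1
        seen = 0  # delimiters seen after the opening one; two are needed
        while i < len(src) and seen < 2:
            if src[i] == "\\":
                i += 2  # step over an escaped character
                continue
            if src[i] == delim:
                seen += 1
            i += 1
        if seen < 2:
            continue  # unterminated: leave it for the main parser to report
        while i < len(src) and src[i].isalpha():
            i += 1  # trailing flags such as "g"
        spans.append((m.start(), i))
    return spans


def mask_spans(src, spans):
    """Replace each span with spaces so that offsets stay unchanged."""
    out = list(src)
    for start, end in spans:
        out[start:end] = " " * (end - start)
    return "".join(out)
```

Applied to the line above, the masked text no longer contains "//", so a comment rule that runs afterwards has nothing to trip over. Masking with spaces rather than deleting keeps line and column positions valid for error messages.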

How can I detect a regular expression in the text content of a "C-like" program file?

+3
2

.

, :

m =~ s/a/b/g;

C, perl.

, perl, syntactically C, .

:

m+foo *bar[index]+i

, , - . , , .

. "", .

+4

, regex. : -, , - . , , . , , , ORed . .

, . (, PEG)

— : , Javascript Perl, .

+1

Source: https://habr.com/ru/post/1747245/

