How to parse actual code like Stack Overflow / IntelliSense / etc.?

I was wondering how Stack Overflow parses all kinds of different code and identifies keywords, special characters, whitespace formatting, etc. It does this for most of the languages I have looked at, and I noticed that it is even sophisticated enough to understand how these elements interact, for example:

String mystring1 = "inquotes"; //incomment
String mystring2 = "inquotes//incomment";
String mystring3 = //incomment"inquotes";

Many IDEs do this too. How is it done?

Edit: Further clarification. I am not asking about parsing the text itself; my question is about what comes once I am past that part. Is there something like a universal XML schema or a cross-language format hierarchy that describes which strings are keywords, which characters denote comments, text strings, logical operators, etc.? Or do I have to become a syntax guru for every language I want to parse correctly?

+3
2 answers

IDE// "" , . , ": , ".

i+++++i; 

list<list<hash<list<int>,hash<int,<list>>>>>;
//or just matching parens 

hard , , java, , , C ++ ( ) ruby ​​( ). , , 80% - . , SO , .

80% 100% - , IDE ++, Visual ++ - ++. , , . , .
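To make the three lines from the question concrete, here is a minimal sketch of a regex-driven tokenizer, written in Python purely for illustration. It is not how Stack Overflow or any particular IDE is implemented. The rule table `TOKEN_RULES`, the function `tokenize`, and the tiny keyword list are all hypothetical names and choices. The only point is that a single left-to-right scan settles the string/comment question: whichever construct starts first wins, and everything inside it is consumed as part of that token.

```python
import re

# Hypothetical rule table for a Java-like language. At each position the
# alternatives are tried in this order, so a keyword beats the catch-all rule.
TOKEN_RULES = [
    ("space", r"\s+"),
    ("comment", r"//[^\n]*"),                 # line comment runs to end of line
    ("string", r'"(?:\\.|[^"\\\n])*"'),       # double-quoted, with escapes
    ("keyword", r"\b(?:String|int|return)\b"),
    ("other", r'[^\s"/]+|/|"'),               # anything else, incl. stray / or "
]

MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_RULES))


def tokenize(source):
    """Return (kind, text) pairs for source, skipping whitespace."""
    return [
        (match.lastgroup, match.group())
        for match in MASTER.finditer(source)
        if match.lastgroup != "space"
    ]


if __name__ == "__main__":
    samples = [
        'String mystring1 = "inquotes"; //incomment',
        'String mystring2 = "inquotes//incomment";',
        'String mystring3 = //incomment"inquotes";',
    ]
    for line in samples:
        print(tokenize(line))
```

In the second sample the `//` is swallowed by the string rule because the quote opens first; in the third, the quote is swallowed by the comment rule for the same reason. Supporting another language under this approach would mean supplying a different rule table, so some per-language knowledge is still needed.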

+3

, . , . - , .

, / . - , .

+2

Source: https://habr.com/ru/post/1760434/

