How to parse actual code like stackoverflow / intellisense / etc?

Question

I was wondering how Stack Overflow parses so many different kinds of code and identifies keywords, special characters, whitespace formatting, etc. It does this for most of the languages I have looked at, and I noticed that it is even sophisticated enough to understand how these elements interact, for example:

String mystring1 = "inquotes"; //incomment
String mystring2 = "inquotes//incomment";
String mystring3 = //incomment"inquotes";

Many IDEs do this too. How is it done?

Edit: Further explanation. I am not asking about parsing the text itself; my question is about what comes once I am past that part. Is there something like a universal XML schema or a cross-language format hierarchy that describes which strings are keywords, which characters denote comments, string literals, logical operators, etc.? Or do I have to become a syntax guru for every language that I want to parse correctly?

+3

parsing xsd code-structure

stupidkid Aug 18 '10 at 23:54

2 answers

, . , . - , .

, / . - , .

+2

Borealid Aug 18 '10 at 23:56

Paul Rubel · Accepted Answer · 2010-08-19T00:48:14+0000

IDE// "" , . , ": , ".

i+++++i;

list<list<hash<list<int>,hash<int,<list>>>>>;
//or just matching parens
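
For the simpler cases, such as the three lines in the question, a table of token patterns tried in a fixed order is often enough. The following is only a minimal sketch under stated assumptions: the language is Java-like, and the names `TOKEN_SPEC`, `MASTER` and `tokenize` as well as the tiny keyword list are made up for this illustration, not taken from Stack Overflow's or any IDE's highlighter.

```python
import re

# Hypothetical token table for a Java-like language (an assumption for this
# sketch). Order matters: at each position the first alternative that matches
# wins, so "//" is tried as a comment before "/" is tried as an operator.
TOKEN_SPEC = [
    ("COMMENT",    r"//[^\n]*"),
    ("STRING",     r'"(?:\\.|[^"\\\n])*"'),
    ("KEYWORD",    r"\b(?:String|int|if|else|return)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("NUMBER",     r"\d+"),
    ("OPERATOR",   r"\+\+|--|==|[-+*/=<>;,(){}]"),  # two-char operators first
    ("SPACE",      r"\s+"),
    ("UNKNOWN",    r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return (kind, text) pairs for the source text, skipping whitespace."""
    return [
        (match.lastgroup, match.group())
        for match in MASTER.finditer(source)
        if match.lastgroup != "SPACE"
    ]


if __name__ == "__main__":
    samples = [
        'String mystring1 = "inquotes"; //incomment',
        'String mystring2 = "inquotes//incomment";',
        'String mystring3 = //incomment"inquotes";',
        "i+++++i;",
    ]
    for line in samples:
        print(tokenize(line))
```

Because the scanner consumes a whole string or a whole comment as one token, a `//` inside quotes never starts a comment, and a quote inside a comment never starts a string. Trying `++` before `+` splits `i+++++i;` into `i ++ ++ + i ;`. Checking that nested angle brackets or parentheses balance is beyond what a flat pattern table like this can do; that needs a parser that tracks nesting.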

hard , , java, , , C ++ ( ) ruby ( ). , , 80% - . , SO , .

80% 100% - , IDE ++, Visual ++ - ++. , , . , .
