Boost Regex - Where are the matched strings stored?

I am writing a web spider and want to use the Boost regex library instead of writing complex parsing functions.

I looked at this example:

#include <string> 
#include <map> 
#include <boost/regex.hpp> 

// purpose: 
// takes the contents of a file in the form of a string 
// and searches for all the C++ class definitions, storing 
// their locations in a map of strings/int 
typedef std::map<std::string, int, std::less<std::string> > map_type; 

boost::regex expression(
   "^(template[[:space:]]*<[^;:{]+>[[:space:]]*)?"
   "(class|struct)[[:space:]]*"
   "(\\<\\w+\\>([[:blank:]]*\\([^)]*\\))?"
   "[[:space:]]*)*(\\<\\w*\\>)[[:space:]]*"
   "(<[^;:{]+>[[:space:]]*)?(\\{|:[^;\\{()]*\\{)"); 

void IndexClasses(map_type& m, const std::string& file) 
{ 
   std::string::const_iterator start, end; 
   start = file.begin(); 
   end = file.end(); 
   boost::match_results<std::string::const_iterator> what; 
   boost::match_flag_type flags = boost::match_default; 
   while(regex_search(start, end, what, expression, flags)) 
   { 
      // what[0] contains the whole string 
      // what[5] contains the class name. 
      // what[6] contains the template specialisation if any. 
      // add class name and position to map: 
      m[std::string(what[5].first, what[5].second) 
            + std::string(what[6].first, what[6].second)] 
         = what[5].first - file.begin(); 
      // update search position: 
      start = what[0].second; 
      // update flags: 
      flags |= boost::match_prev_avail; 
      flags |= boost::match_not_bob; 
   } 
}

But I find it somewhat confusing (this is my first attempt with Boost ;)), and I cannot find where the matched strings are actually stored.

So my question is: how do I get the location of all matches?

1 answer

As the comments in the code show, what[0] contains the entire match, so what[0].first points to the beginning of the match on each iteration of the loop. In general, to get the i-th group you can use:

std::string s(what[i].first, what[i].second);

See the documentation for match_results.
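To make the loop concrete, here is a minimal sketch that collects the offset and text of every match. It is written against std::regex from the C++11 standard library rather than Boost, so that it compiles without extra dependencies; boost::regex exposes the same match_results and sub_match members used here (first, second, str(), size()), so swapping the std:: names for boost:: ones should work much the same way. The names Hit and FindAll are hypothetical and are not part of either library.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper type: one entry per match found in the input.
struct Hit {
    std::size_t offset;               // what[0].first - input.begin()
    std::string text;                 // the whole match, i.e. what[0]
    std::vector<std::string> groups;  // what[1] .. what[n]; "" if a group did not match
};

// Collect every non-overlapping match of `re` in `input`.
std::vector<Hit> FindAll(const std::string& input, const std::regex& re)
{
    std::vector<Hit> hits;
    std::string::const_iterator start = input.begin();
    const std::string::const_iterator end = input.end();
    std::match_results<std::string::const_iterator> what;
    std::regex_constants::match_flag_type flags = std::regex_constants::match_default;

    while (std::regex_search(start, end, what, re, flags)) {
        Hit hit;
        // The location of the match is just iterator arithmetic on what[0].
        hit.offset = static_cast<std::size_t>(what[0].first - input.begin());
        hit.text.assign(what[0].first, what[0].second);
        for (std::size_t i = 1; i < what.size(); ++i)
            hit.groups.push_back(what[i].str());
        hits.push_back(hit);

        // Continue searching right after this match.
        start = what[0].second;
        if (what[0].first == what[0].second) {
            // Empty match: step forward one character to avoid looping forever.
            if (start == end) break;
            ++start;
        }
        // Tell the engine that characters before `start` exist.
        flags |= std::regex_constants::match_prev_avail;
    }
    return hits;
}
```

For a spider, calling FindAll(page, std::regex("href=\"([^\"]*)\"")) would give one Hit per link, with the URL in groups[0] and its position in offset. The pattern is only an illustration; a regex is a rough tool for real-world HTML.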


Source: https://habr.com/ru/post/1707075/

