Boost Spirit memory leak

I am writing a small program that processes a large text file and performs some replacements. The problem is that it never stops allocating new memory, so it eventually runs out of memory. I have reduced it to a simple program that only counts the number of lines (see the code below), and it still allocates more and more memory. I must admit that I know little about Boost, and Boost Spirit in particular. Could you tell me what I am doing wrong? Thanks a million!

    #include <cstddef>
    #include <iostream>
    #include <string>
    #include <boost/spirit/include/lex_lexertl.hpp>
    #include <boost/bind.hpp>
    #include <boost/ref.hpp>
    #include <boost/spirit/include/support_istream_iterator.hpp>

    // Token ids
    enum token_ids
    {
        ID_EOL = 100
    };

    // Token definition
    template <typename Lexer>
    struct var_replace_tokens : boost::spirit::lex::lexer<Lexer>
    {
        var_replace_tokens()
        {
            this->self.add("\n", ID_EOL); // newline characters
        }
    };

    // Functor called once per token; counts the newline tokens
    struct replacer
    {
        typedef bool result_type;

        template <typename Token>
        bool operator()(Token const& t, std::size_t& lines) const
        {
            switch (t.id())
            {
            case ID_EOL:
                lines++;
                break;
            }
            return true; // keep tokenizing
        }
    };

    int main(int argc, char **argv)
    {
        std::size_t lines = 0;

        var_replace_tokens<
            boost::spirit::lex::lexertl::lexer<
                boost::spirit::lex::lexertl::token<
                    boost::spirit::istream_iterator> > > var_replace_functor;

        // Do not skip whitespace, otherwise the newlines never reach the lexer
        std::cin.unsetf(std::ios::skipws);

        boost::spirit::istream_iterator first(std::cin);
        boost::spirit::istream_iterator last;

        bool r = boost::spirit::lex::tokenize(first, last, var_replace_functor,
            boost::bind(replacer(), _1, boost::ref(lines)));

        if (r)
        {
            std::cerr << "Lines processed: " << lines << std::endl;
        }
        else
        {
            std::string rest(first, last);
            std::cerr << "Processing failed at: " << rest
                      << " (line " << lines << ")" << std::endl;
        }
    }
1 answer

This behavior is by design.

  • Me: it has to be the multi_pass iterator. Since there is no grammar, Spirit does not know when it can flush. [...]

  • You: as far as I know, istream_iterator takes care of reading the input stream without having to store the entire stream in memory

Yes. But you are not using std::istream_iterator. You are using Boost Spirit, which is a parser generator. Parsers need random access for backtracking.

Spirit supports input iterators by adapting the input sequence to a random-access sequence with the multi_pass adapter. This iterator adapter stores a variable-size buffer¹ for backtracking purposes. Certain constructs (expectation points, always-greedy operators such as the Kleene star *, etc.) tell the parser framework when it is safe to flush the buffer.

Problem:

You do not parse, you only tokenize. Nothing ever tells the iterator to flush its buffer.

The buffer is unbounded, so memory usage keeps growing. Strictly speaking this is not a leak, because as soon as the last copy of the multi_pass-adapted iterator goes out of scope, the shared backtracking buffer is freed.
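
The effect can be reproduced without Boost. The following is only a toy model under stated assumptions, not the multi_pass implementation: BufferedReader, count_lines and peak_for are hypothetical names, and flush() merely stands in for the moment a parser framework decides that backtracking is no longer possible.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <istream>
#include <sstream>
#include <string>

// Toy stand-in for a buffering input adapter (hypothetical, NOT Boost's multi_pass).
class BufferedReader {
public:
    explicit BufferedReader(std::istream& in) : in_(in) {}

    // Read one character and remember it in case a caller wants to backtrack.
    bool next(char& c) {
        if (!in_.get(c)) return false;
        buffer_.push_back(c);
        return true;
    }

    // Promise that nothing read so far will be revisited, so drop it.
    void flush() { buffer_.clear(); }

    std::size_t buffered() const { return buffer_.size(); }

private:
    std::istream& in_;
    std::deque<char> buffer_;
};

// Count '\n' characters. The buffer is flushed after each line only when
// flush_each_line is true; the largest buffer size seen is reported
// through peak_buffered.
std::size_t count_lines(std::istream& in, bool flush_each_line,
                        std::size_t& peak_buffered) {
    BufferedReader reader(in);
    std::size_t lines = 0;
    std::size_t chars = 0;
    peak_buffered = 0;
    char c;
    while (reader.next(c)) {
        ++chars;
        if (reader.buffered() > peak_buffered) peak_buffered = reader.buffered();
        if (c == '\n') {
            ++lines;
            if (flush_each_line) reader.flush();
        }
    }
    assert(peak_buffered <= chars); // the buffer never holds more than was read
    return lines;
}

// Convenience wrapper for experiments: peak buffer size for an in-memory text.
std::size_t peak_for(const std::string& text, bool flush_each_line) {
    std::istringstream in(text);
    std::size_t peak = 0;
    count_lines(in, flush_each_line, peak);
    return peak;
}
```

Without flushing, the peak buffer size equals the total input size. With a flush after every line, it is bounded by the longest line, which is the kind of bound a grammar gives the real adapter.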

Solution:

The easiest solution is to use a random-access source. If you can, use a memory-mapped file.
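
A minimal sketch of the memory-mapped approach, assuming a POSIX system (open, fstat, mmap). MappedFile, count_newlines and count_lines_in_file are hypothetical helpers, not part of Boost, and the line counting is done with std::count rather than with the lexer so that the sketch stays self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// RAII wrapper around a read-only POSIX file mapping (hypothetical helper).
class MappedFile {
public:
    explicit MappedFile(const char* path) {
        int fd = ::open(path, O_RDONLY);
        if (fd < 0) throw std::runtime_error("open failed");
        struct stat st;
        if (::fstat(fd, &st) != 0) {
            ::close(fd);
            throw std::runtime_error("fstat failed");
        }
        size_ = static_cast<std::size_t>(st.st_size);
        if (size_ > 0) { // mmap rejects zero-length mappings
            void* p = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p == MAP_FAILED) {
                ::close(fd);
                throw std::runtime_error("mmap failed");
            }
            data_ = static_cast<const char*>(p);
        }
        ::close(fd); // the mapping stays valid after the descriptor is closed
    }
    ~MappedFile() {
        if (data_) ::munmap(const_cast<char*>(data_), size_);
    }
    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    // Plain pointers into the mapping act as random-access iterators.
    const char* begin() const { return data_; }
    const char* end() const { return data_ + size_; }

private:
    const char* data_ = nullptr;
    std::size_t size_ = 0;
};

// Count newline characters in the range [first, last).
std::size_t count_newlines(const char* first, const char* last) {
    assert(first <= last);
    return static_cast<std::size_t>(std::count(first, last, '\n'));
}

// Map the file and count its lines; no backtracking buffer is involved.
std::size_t count_lines_in_file(const char* path) {
    MappedFile file(path);
    return count_newlines(file.begin(), file.end());
}
```

Because the pointers returned by begin() and end() are random-access iterators, it should be possible to hand them to the lexer in place of boost::spirit::istream_iterator, with the token type instantiated on char const* instead. The operating system then pages the file in and out on demand.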

Other solutions involve telling the multi_pass adapter when it may flush. The simplest way to achieve this is to use tokenize_and_parse. Even with a dummy grammar such as *(any_token), this should be enough to convince the parser framework that you will not ask it to backtrack.



¹ By default, multi_pass (http://www.boost.org/doc/libs/1_62_0/libs/spirit/doc/html/spirit/support/multi_pass.html) stores a shared deque. You can watch it grow by running your test for a while with dd if=/dev/zero bs=1M | valgrind --tool=massif ./sotest:


The massif output clearly shows that all of the memory is allocated in

    100.00% (805,385,576B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
    ->99.99% (805,306,368B) 0x4187D5: void boost::spirit::iterator_policies::split_std_deque::unique<char>::increment<boost::spirit::multi_pass<std::istream, boost::spirit::iterator_policies::default_policy<boost::spirit::iterator_policies::ref_counted, boost::spirit::iterator_policies::no_check, boost::spirit::iterator_policies::istream, boost::spirit::iterator_policies::split_std_deque> > >(boost::spirit::multi_pass<std::istream, boost::spirit::iterator_policies::default_policy<boost::spirit::iterator_policies::ref_counted, boost::spirit::iterator_policies::no_check, boost::spirit::iterator_policies::istream, boost::spirit::iterator_policies::split_std_deque> >&) (in /home/sehe/Projects/stackoverflow/sotest)
    | ->99.99% (805,306,368B) 0x404BC3: main (in /home/sehe/Projects/stackoverflow/sotest)

Source: https://habr.com/ru/post/1387470/

