C++: find the largest container in a program

I am trying to analyze a large C++ program. The program makes heavy use of STL containers such as set, map, unordered_set, unordered_map, vector, etc. Sometimes they are nested, for example a map of sets.

For a particular run of the program, I want to find out which containers hold the largest number of elements (i.e. the largest size() value). I can make minor changes to the program.

If there were a way to iterate over all containers, or to intercept the container APIs that modify size, that would help. But neither seems to be possible.

How would you approach this?

Addition: the platform is Linux, and the compiler is either g++ or clang++.

+5
3 answers

This method is useful when your project is really large and has a lot of instances of different containers. Its advantage is that you do not need to change a large amount of code. It lets you narrow down which type of container to look for, because it gives statistics per container type and per element type.

You can redefine template< class T > struct allocator. Either rename the original allocator in the std headers and supply your own, or modify the original. Make it collect statistics on allocation and deallocation. You will then know the number and size of the elements of each type. But you cannot know which container instance holds the elements.

The template< class T > struct allocator is located in the library header files. It is always there, and you do not need to rebuild the library of your development environment, because a template cannot be compiled into a static library (apart from specializations and explicit instantiations). Templates are always compiled together with your sources. There may be a problem with precompiled headers, though. For your own project you can regenerate them or stop using them, but for the library you need to check. This may be the weak point of the method, but it is easy to check whether the problem exists.
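A less invasive variant of the same idea can be sketched without editing the std headers: pass a counting allocator as the container's allocator template argument. This is an assumption about how one might apply the idea, not the exact method described above, and it does require touching the container declarations. The names CountingAllocator, AllocStats and alloc_stats are hypothetical, and the sketch assumes C++11.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <typeinfo>
#include <vector>

// Hypothetical per-type counters, keyed by the (mangled) name of the allocated type.
struct AllocStats {
    std::size_t live = 0;  // objects currently allocated
    std::size_t peak = 0;  // largest value `live` has reached
};

inline std::map<std::string, AllocStats>& alloc_stats() {
    // This map uses the default std::allocator, so counting does not recurse.
    static std::map<std::string, AllocStats> stats;
    return stats;
}

// Minimal C++11 allocator that records how many objects of type T are allocated.
template <class T>
struct CountingAllocator {
    typedef T value_type;

    CountingAllocator() {}
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        AllocStats& s = alloc_stats()[typeid(T).name()];
        s.live += n;
        if (s.live > s.peak) s.peak = s.live;
        return std::allocator<T>().allocate(n);
    }

    void deallocate(T* p, std::size_t n) {
        alloc_stats()[typeid(T).name()].live -= n;
        std::allocator<T>().deallocate(p, n);
    }
};

// All CountingAllocator instances are interchangeable.
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }
```

For example, after declaring std::vector<int, CountingAllocator<int> > v and calling v.reserve(8), the entry for int shows 8 live objects. The counters reflect allocated capacity rather than size(), and node-based containers such as std::map allocate an internal node type, so their entries are keyed by that node type rather than by the element type.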

There is one heuristic that does not guarantee accuracy. When your application shuts down, each container is released after its elements have been released. So for each parent container type you can record how many inner elements of which type it held.

For example, suppose:

    vector<A>({1,2,3}) and map<string,A>({1,2}) and map<string,B>({1,2})

This will produce a list of deallocation events like this:

    B, B, map<string,B>, A, A, map<string,A>, A, A, A, vector<A>

So you can tell that there were 3 elements A in vector<A>, 2 elements A in map<string,A> and 2 elements B in map<string,B>.

+2

If you can make small changes, can you add each container to a large list of them?
For instance:

    std::set<......> my_set;        // existing code
    all_containers.add( &my_set );  // minor edit IMHO

Then you can call all_containers.analyse() , which will call size() for each of them and print the results.

You can use something like this:

    #include <list>

    // Type-erased interface so containers of different types fit in one list.
    struct ContainerStatsI
    {
        virtual int getSize() = 0;
        virtual ~ContainerStatsI() {}
    };

    template<class T> struct ContainerStats : ContainerStatsI
    {
        T* p_;
        ContainerStats( T* p ) : p_(p) {}
        int getSize() { return static_cast<int>( p_->size() ); }
    };

    struct ContainerStatsList
    {
        std::list<ContainerStatsI*> list_; // or some other container....

        template<class T> void add( T* p )
        {
            list_.push_back( new ContainerStats<T>(p) );
        }

        // you should probably add a remove(T* p) as well

        void analyse()
        {
            for( ContainerStatsI* p : list_ )
            {
                p->getSize(); // do something with what is returned here
            }
        }
    };
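To show how such a registry could answer the original question directly, here is a minimal self-contained sketch that keeps a label next to each registered container and reports the largest one. ContainerRegistry, its add and largest members, and the labels are hypothetical names, and the sketch assumes C++11 (std::function and lambdas).

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical registry: stores a label and a callable that reads size() on demand.
class ContainerRegistry {
    std::vector<std::pair<std::string, std::function<std::size_t()> > > entries_;

public:
    // Only a pointer is kept, so the container must outlive the registry.
    template <class C>
    void add(const std::string& label, const C* c) {
        entries_.emplace_back(label, [c] { return c->size(); });
    }

    // Returns (label, size) of the largest registered container,
    // or ("", 0) if nothing is registered or all containers are empty.
    std::pair<std::string, std::size_t> largest() const {
        std::pair<std::string, std::size_t> best("", 0);
        for (const auto& e : entries_) {
            const std::size_t n = e.second();
            if (n > best.second) best = std::make_pair(e.first, n);
        }
        return best;
    }
};
```

With a std::set<int> of two elements registered as "s" and a std::map<int, std::set<int> > of three entries registered as "m", largest() returns ("m", 3). Sizes are read when largest() is called, so a peak that occurs between calls is missed, and the inner sets of a nested container would each need to be registered separately.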
+2

Add statistics code to the container destructors in the std header files. This does not require modifying a large amount of code in a large project. But it only shows the type of the container, not the instance (see my other answer here). The method does not require C++0x, C++11 or later.

The first and mandatory step is to put your std library under source control, git for example, so that you can quickly see what has actually been changed and quickly switch between the modified and the original version.

Put this declaration of the Stat class in the sources folder of the std library:

    #include <cstddef>
    #include <iostream>
    #include <map>
    #include <string>
    #include <typeinfo>

    class Stat {
        std::map<std::string,int> total;    // sum of sizes per container/element type
        std::map<std::string,int> maximum;  // largest single size seen per type
    public:
        // Called from a container destructor with the container's final size.
        template<class T>
        void log( std::string cont, std::size_t size ) {
            std::string key = cont + ": " + typeid(T).name();
            int n = static_cast<int>( size );
            if( maximum[key] < n )
                maximum[key] = n;
            total[key] += n;
        }
        void show_result() {
            std::cout << "container type total maximum" << std::endl;
            std::map<std::string,int>::const_iterator it;
            for( it = total.begin(); it != total.end(); ++it ) {
                std::cout << it->first << " " << it->second
                          << " " << maximum[it->first] << std::endl;
            }
        }
        static Stat& instance();
        ~Stat(){ show_result(); }
    };

Define the singleton instance of the Stat class in a cpp file of the project:

    Stat& Stat::instance() {
        static Stat stat;
        return stat;
    }

Edit the std library container templates. Add statistics in destructors.

    // modify these standard template library sources:
    template< class T, class Allocator = std::allocator<T> >
    class vector {
        ...
        // kept non-virtual: a virtual destructor would add a vtable pointer
        // and change the object layout
        ~vector() {
            Stat::instance().log<value_type>( "std::vector", this->size() );
        }
    };
    template< class Key, class T, class Compare = std::less<Key>,
              class Allocator = std::allocator<std::pair<const Key, T> > >
    class map {
        ...
        ~map() {
            Stat::instance().log<value_type>( "std::map", this->size() );
        }
    };

Consider a program, for example:

    int main()
    {
        {
            // deliberately avoids C++0x; the project does not need that dependency
            std_vector<int> v1;
            for(int i=0;i<10;++i) v1.push_back( i );
            std_vector<int> v2;
            for(int i=0;i<10;++i) v2.push_back( i );
            std_map<int,std::string> m1;
            for(int i=0;i<10;++i) m1[i]="";
            std_map<int,std::string> m2;
            for(int i=0;i<20;++i) m2[i]="";
        }
        Stat::instance().show_result();
        return 0;
    }

Result for gcc:

    container type total maximum
    std::map: St4pairIiSsE 30 20
    std::vector: i 20 10

If you need a more readable type description, look up how type names are demangled in your development environment. For gcc the conversion is described here: https://lists.gnu.org/archive/html/help-gplusplus/2009-02/msg00006.html

The output could then look like this:

    container type total maximum
    std::map: std::pair<int, std::string> 30 20
    std::vector: int 20 10
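The demangling step can be sketched with abi::__cxa_demangle from <cxxabi.h>, which is available with both g++ and clang++ on Linux. The helper name demangle is hypothetical. The exact text produced for library types (for example how std::string is spelled) depends on the standard library in use, so the readable output above should be taken as illustrative.

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>
#include <typeinfo>

// Returns a readable form of a mangled type name such as typeid(T).name(),
// or the input unchanged if demangling fails.
inline std::string demangle(const char* mangled) {
    int status = 0;
    char* readable = abi::__cxa_demangle(mangled, 0, 0, &status);
    if (status != 0 || readable == 0) return mangled;
    std::string result(readable);
    std::free(readable);  // the buffer returned by __cxa_demangle is malloc-allocated
    return result;
}
```

In Stat::log the key could then be built as cont + ": " + demangle(typeid(T).name()), assuming the helper is visible from the header that declares Stat.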
+1

Source: https://habr.com/ru/post/1243146/

