Optimizing C++ std::stringstream operations

My case is as follows:

  • I have a binary file that I read into a (char *) buffer using the read operation of std::fstream
  • My goal is to take every byte from the file, format it as hexadecimal, and append it to a string variable
  • The string variable must end up holding the entire contents of the file, formatted as described in item 2.

For example, let's say I have the following binary file:

D0 46 98 57 A0 24 99 56 A3

The formatting method for each byte is as follows:

    stringstream fin;
    for (size_t i = 0; i < fileb_size; ++i)
    {
        fin << hex << setfill('0') << setw(2) << static_cast<uint16_t>(fileb[i]);
    }
    // this would yield the output "D0469857A0249956A3"
    return fin.str();

The above approach works as expected; however, it is very slow for large files, which I understand: stringstream is meant for formatting input!

My question is: are there any ways to optimize this code, or should I change the approach altogether? My only constraint is that the output must be a string, as shown above.

Thanks.

1 answer

std::stringstream is pretty slow. It does not preallocate, and retrieving the result always involves copying the string at least once. In addition, the conversion to hex can be hand-coded to be faster.

I think something like this might perform better:

    #include <cassert>
    #include <fstream>
    #include <string>

    // Quick and dirty: map a 4-bit value to its uppercase hex digit
    char to_hex(unsigned char nibble)
    {
        assert(nibble < 16);
        if(nibble < 10)
            return char('0' + nibble);
        return char('A' + nibble - 10);
    }

    std::string to_hex(std::string const& filename)
    {
        // open file at end
        std::ifstream ifs(filename, std::ios::binary|std::ios::ate);

        // calculate file size and move to beginning
        auto end = ifs.tellg();
        ifs.seekg(0, std::ios::beg);
        auto beg = ifs.tellg();

        // preallocate the string (two hex characters per input byte)
        std::string out;
        out.reserve((end - beg) * 2);

        char buf[2048]; // larger = faster (within limits)

        // keep going while a full or partial chunk was read
        while(ifs.read(buf, sizeof(buf)) || ifs.gcount())
        {
            for(std::streamsize i = 0; i < ifs.gcount(); ++i)
            {
                out += to_hex(static_cast<unsigned char>(buf[i]) >> 4); // top nibble
                out += to_hex(static_cast<unsigned char>(buf[i]) & 0b1111); // bottom nibble
            }
        }

        return out;
    }

It appends to a preallocated string to minimize copying and avoid reallocations.
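
If the bytes are already in memory, as with fileb and fileb_size in the question, the same idea can be applied without re-reading the file. The following is only a sketch under that assumption: bytes_to_hex is a hypothetical helper name, not something from the original post, and it swaps the branch in to_hex for a small lookup table while sizing the output string once up front.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): hex-encode a buffer that
// is already in memory, like the question's fileb / fileb_size.
std::string bytes_to_hex(const char* data, std::size_t size)
{
    // Lookup table: index is the nibble value, entry is its uppercase hex digit.
    static const char digits[] = "0123456789ABCDEF";

    // Size the string once; each input byte produces exactly two characters.
    std::string out(size * 2, '\0');

    for (std::size_t i = 0; i < size; ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned char byte = static_cast<unsigned char>(data[i]);
        out[2 * i]     = digits[byte >> 4];   // top nibble
        out[2 * i + 1] = digits[byte & 0x0F]; // bottom nibble
    }
    return out;
}
```

With this, the loop in the question could be replaced by something like return bytes_to_hex(fileb, fileb_size); assuming fileb is a char buffer. Whether the lookup table is measurably faster than the branching to_hex above would have to be checked on the target machine.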


Source: https://habr.com/ru/post/1274390/

