LZMA SDK unpacking on iOS (Xcode) uses too much RAM

I am trying to use the LZMA SDK in an iPhone/iPad application. My starting point was an example of LZMA for iPhone provided by Mo Dejong, available here: https://github.com/jk/lzmaSDK The original was here: http://www.modejong.com/iOS/lzmaSDK.zip (I tried both and get the same result from each).

The problem is that extraction uses as much RAM as the .7z contains uncompressed. In other words, say I have a 40 MB compressed file whose uncompressed content is a roughly 250 MB binary SQLite DB: as it decompresses the file, it slowly uses more and more memory, up to 250 MB. This will crash an iPad 1 or anything before the iPhone 4 (256 MB RAM). I have a feeling many people will eventually run into the same problem, so resolving it now could help a lot of developers.

I originally created the .7z file on a PC using the Windows version of 7-Zip (latest version) and a dictionary size of 16 MB. Extraction there requires only about 18 MB of RAM (verified by watching Task Manager while testing on the PC). I also tried creating the archive with Keka (an open-source Mac archiver), but that changed nothing, although I can confirm that Keka itself uses only about 19 MB of RAM while extracting the file on the Mac, which is what I would expect. I guess the next step is to compare the Keka source code with the LZMA SDK source code.

I played with various dictionary sizes and other settings when creating the .7z file, but nothing helped. I also tried splitting my single binary into 24 smaller fragments before compressing, but that did not help either (it still uses more than 250 MB of RAM to extract the 24 pieces).

Note that the ONLY change I made to the source code was to point it at a larger .7z file. Also note that it frees the RAM immediately after extraction completes, but that doesn't help. My feeling is that either it isn't freeing RAM as it extracts the way it should, or it keeps the entire contents in RAM until the very end and only then releases it. In addition, if I extract the same exact file with Mac applications while watching the profiling tools, I do not see the same behavior (for example, StuffIt Expander peaks at about 60 MB of RAM while extracting the file, and Keka, the open-source Mac archiver, peaked at about 19 MB).

I am not a Mac/Xcode/Objective-C developer (yet), so any help with this would be greatly appreciated. I could fall back to zip or rar instead, but I get much better compression with LZMA, so if at all possible I want to stick with this solution; obviously, though, I need it to work reliably.

Thanks!

Screenshot of Instruments.app profiling the example app

+4
3 answers

Igor Pavlov, the author of 7-Zip, emailed me and basically said that the behavior I observed in the original question is a known limitation of the C version of the SDK. The C++ version does not have this limitation. Actual quote:

"7-Zip uses another multi-threaded decoder written in C++. That C++ .7z decoder does not need to allocate one RAM block for a whole solid block. Also read this thread:

http://sourceforge.net/projects/sevenzip/forums/forum/45797/topic/5655623 "

So, until someone fixes the C SDK for iOS, the workaround is:

1) Decide on the RAM limit you want to allow for file decompression operations.

2) Any SINGLE file in your archive that exceeds the limit from step 1 must be split. You can do this with any binary splitter application, such as split: http://www.fourmilab.ch/splits/
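Step 2 can also be done with the standard Unix `split` tool. A minimal sketch, with illustrative names and sizes (a 5 MB dummy file cut into 2 MB parts; in practice you would cut your real file at the 24 MB limit from step 1, and GNU `split` is assumed for the `-d` numeric-suffix option):

```shell
# Create a dummy 5 MB file standing in for a large database.
dd if=/dev/zero of=big.bin bs=1048576 count=5 2>/dev/null

# Split it into 2 MB parts named big.bin.000, big.bin.001, ...
split -b 2097152 -d -a 3 big.bin big.bin.

# splits.exe numbers parts from .001, so shift the 0-based suffixes
# up by one, renaming in reverse order so nothing is overwritten.
for part in $(ls -r big.bin.[0-9][0-9][0-9]); do
    n=$(printf '%s' "${part##*.}" | sed 's/^0*//')
    mv "$part" "big.bin.$(printf '%03d' $(( ${n:-0} + 1 )))"
done

ls big.bin.[0-9][0-9][0-9]
```

The reverse-order rename matters: going forward, moving `.000` to `.001` would clobber the real second part before it was renamed.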

3) Once your files are ready, create the 7z file using the dictionary/block size parameters described by MoDJ in his answer, e.g. with a 24 MB limit: 7za a -mx=9 -md=24m -ms=24m CompressedFile.7z SourceFiles*

4) In the iOS application, after unpacking the files, determine which files were split and merge them back together. The code for this is not complicated (I assume the naming convention that splits.exe uses, i.e. file.001, file.002, etc.):

    if (iParts > 1) {
        // If this is a multipart binary split file, we must combine
        // all of the parts before we can use it
        NSString *finalfilePath = whateveryourfinaldestinationfilenameis;
        NSString *splitfilePath = [finalfilePath stringByAppendingString:@".001"];
        NSFileHandle *myHandle;
        NSFileManager *fileManager = [NSFileManager defaultManager];
        NSError *error;

        // If the target combined file exists already, remove it
        if ([fileManager fileExistsAtPath:finalfilePath]) {
            BOOL success = [fileManager removeItemAtPath:finalfilePath error:&error];
            if (!success) NSLog(@"Error: %@", [error localizedDescription]);
        }

        myHandle = [NSFileHandle fileHandleForUpdatingAtPath:splitfilePath];
        NSString *nextPart;

        // Concatenate each piece in order
        for (int i = 2; i <= iParts; i++) {
            // Assumes fewer than 100 pieces
            if (i < 10)
                nextPart = [splitfilePath stringByReplacingOccurrencesOfString:@".001"
                                                                    withString:[NSString stringWithFormat:@".00%d", i]];
            else
                nextPart = [splitfilePath stringByReplacingOccurrencesOfString:@".001"
                                                                    withString:[NSString stringWithFormat:@".0%d", i]];
            NSData *datapart = [[NSData alloc] initWithContentsOfFile:nextPart];
            [myHandle seekToEndOfFile];
            [myHandle writeData:datapart];
        }
        [myHandle closeFile];

        // Rename the concatenated file
        [fileManager moveItemAtPath:splitfilePath toPath:finalfilePath error:&error];
    }
+1

OK, this is a tricky one. The reason you run into problems is that iOS does not have the kind of virtual memory your desktop system has. The lzmaSDK library is written in a way that assumes the system has plenty of virtual memory to decompress into. On the desktop there is no problem; only when allocating large amounts of memory for unpacking on iOS do you run into trouble. The best fix would be to rewrite the lzma SDK so that it uses memory-mapped files directly, but that is not a trivial task. Here is how to work around the problem instead.

Using 7za

There are actually two command-line options you want to pass to the 7zip archive program in order to segment files into smaller pieces. I am going to assume you just use the 24 megabyte size I used, as it is a good space/memory tradeoff. Here is the command line that should do the trick; note that in this example I have large movie files named XYZ.flat, and I want to compress them together into the archive .7z file:

 7za a -mx=9 -md=24m -ms=24m Animations_9_24m_NOTSOLID.7z *.flat 

If you compare this segmented archive with a version that does not split the data into segments, you will see that segmenting makes the file a little larger:

 $ ls -la Animations_9_24m.7z Animations_9_24m_NOTSOLID.7z
 -rw-r--r--  1 mo  staff  8743171 Sep 30 03:01 Animations_9_24m.7z
 -rw-r--r--  1 mo  staff  9515686 Sep 30 03:21 Animations_9_24m_NOTSOLID.7z

So, segmentation costs about 800 KB of compression, but that doesn't really hurt, because now the decompression routines will not try to allocate a huge amount of memory. Decompression memory usage is limited to a 24 megabyte block, which iOS can handle.

Double-check the results by printing the compressed file's header information:

 $ 7za l -slt Animations_9_24m_NOTSOLID.7z
 Path = Animations_9_24m_NOTSOLID.7z
 Type = 7z
 Method = LZMA
 Solid = +
 Blocks = 7
 Physical Size = 9515686
 Headers Size = 1714

Note the "Blocks" element in the output above; it indicates that the data has been segmented into separate 24 megabyte blocks.

If you compare the segmented file's info above with the output for an archive created without the -ms=24m argument, you will see:

 $ 7za l -slt Animations_9_24m.7z
 Path = Animations_9_24m.7z
 Type = 7z
 Method = LZMA
 Solid = +
 Blocks = 1
 Physical Size = 8743171
 Headers Size = 1683

Note the "Blocks" value: you do not want just one huge block, since the decoder will try to allocate a huge amount of memory when unpacking it on iOS.
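If your packaging is scripted, the "Blocks" field can be checked automatically. A minimal sketch; the sample listing from above is inlined as a stand-in, and in a real build script you would replace the here-document with the output of `7za l -slt archive.7z`:

```shell
# Parse the Blocks count from a `7za l -slt` listing and fail if the
# archive is a single solid block. The listing is inlined here for
# illustration; pipe real `7za l -slt` output in instead.
blocks=$(awk -F' = ' '$1 == "Blocks" { print $2 }' <<'EOF'
Path = Animations_9_24m_NOTSOLID.7z
Type = 7z
Method = LZMA
Solid = +
Blocks = 7
Physical Size = 9515686
Headers Size = 1714
EOF
)
if [ "${blocks:-1}" -le 1 ]; then
    echo "single solid block: re-create the archive with -ms=24m" >&2
    exit 1
fi
echo "OK: $blocks blocks"
```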

0

I ran into the same problem but found a much more practical way:

  • use the C++ interface of the LZMA SDK. It uses only very little memory and does not suffer from the memory-consumption problem the C interface has (as tradergordo rightly said).

  • look at LZMAAlone.cpp, strip it of everything you don't need (for example the encoding side and the 7-zip file container material, which would also require a lot of memory) and create a tiny header file for your C++ LZMA decompressor, for example:

extern "C" int extractLZMAFile(const char *filePath, const char *outPath);

  • for very large files (for example, 100 MB+ db files) I then use LZMA decompression to unpack the file. Of course, since plain LZMA has no file container, you need to specify the name of the unpacked file yourself.

  • because I do not have full 7z support, I use tar as a container together with the LZMA-compressed files. There is a tiny iOS untar at https://github.com/mhausherr/Light-Untar-for-iOS
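The tar-plus-LZMA packaging described above can be produced on the desktop with stock tools. A minimal sketch, assuming the `xz` utility is installed (its `--format=lzma` switch emits the container-less "LZMA alone" stream); file names are examples:

```shell
# Stage some example content and wrap it in a tar container.
mkdir -p payload
printf 'hello' > payload/data.txt
tar -cf payload.tar payload

# Compress the tar as a raw .lzma stream (no 7z container), which is
# the kind of input a stripped-down LZMA decoder plus Light-Untar
# would consume on the device.
xz --format=lzma --keep payload.tar

ls -l payload.tar.lzma
```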

Unfortunately, I cannot provide any sources, much as I would like to.

0

Source: https://habr.com/ru/post/1436054/
