One of the main steps in file compression schemes such as ZIP is to use previously decoded text as a source of references. For example, the encoded stream might say "the next 219 output characters are the same as the characters in the decoded stream starting 5161 bytes back." This lets you represent 219 characters in just 3 bytes. (There is more to ZIP than this, such as Huffman coding, but I'm only asking about reference matching.)
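To make the "219 characters from 5161 bytes back" idea concrete, here is a minimal sketch of how a decoder could expand such references. The token layout (an int for a literal byte, a `(length, distance)` tuple for a reference) and the name `decode` are assumptions chosen for illustration, not the actual ZIP bit format.

```python
def decode(tokens):
    """Expand a list of tokens into bytes.

    Each token is either an int (a literal byte) or a (length, distance)
    tuple meaning "copy `length` bytes starting `distance` bytes back".
    This token layout is hypothetical, chosen only for illustration.
    """
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, int):
            out.append(tok)
            continue
        length, distance = tok
        if not 0 < distance <= len(out):
            raise ValueError("distance points before the start of the output")
        start = len(out) - distance
        # Copy one byte at a time so that a reference may overlap the
        # bytes it is producing (length greater than distance).
        for i in range(length):
            out.append(out[start + i])
    return bytes(out)
```

For instance, `decode(list(b"abc") + [(6, 3)])` yields `b"abcabcabc"`: the reference is longer than its distance, so it keeps re-reading bytes it has just written.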
My question is what the strategy for the string matching algorithm is. Even looking at the source code of zlib and similar libraries does not seem to give a good description of the matching algorithm.
The problem can be stated as follows: given a block of text, say 30K of it, and an input string, find the longest reference in the 30K of text that exactly matches the front of the input string. The algorithm must be efficient when iterated, i.e. the 30K block of text will be updated by deleting some bytes from the front and appending new bytes to the rear, and then a new match is performed.
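As a baseline, the problem statement can be written down as a brute-force search. This is only a sketch to pin down what "longest match" means here; the name `longest_match_naive` and the `(length, distance)` return convention are assumptions, and the quadratic scan is exactly what a real matcher tries to avoid.

```python
def longest_match_naive(window: bytes, lookahead: bytes):
    """Return (length, distance) of the longest match in `window` for a
    prefix of `lookahead`, or (0, 0) if no byte matches.

    `distance` counts back from the end of the window. Matches are not
    allowed to run past the end of the window in this simplified version.
    """
    best_len, best_dist = 0, 0
    for start in range(len(window)):
        n = 0
        while (n < len(lookahead)
               and start + n < len(window)
               and window[start + n] == lookahead[n]):
            n += 1
        # ">=" keeps the most recent (closest) candidate on ties.
        if n > 0 and n >= best_len:
            best_len, best_dist = n, len(window) - start
    return best_len, best_dist


def slide(window: bytes, consumed: bytes, size: int = 30 * 1024) -> bytes:
    """Append the consumed input and drop old bytes from the front."""
    return (window + consumed)[-size:]
```

Each call costs on the order of window size times match length, which is why the interesting part of the question is the data structure that replaces the outer loop.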
I'm much more interested in discussions of the algorithm(s) for doing this, not source code or libraries. (zlib has very good source!) I suspect there may be several approaches with different tradeoffs.
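You could look at the details of LZMA, the algorithm used by 7-zip. The 7-zip author claims to have improved on the algorithm used by zlib and similar libraries.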
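You go into some detail about the problem, but you don't mention the information in section 4 of RFC 1951 (the specification of the DEFLATE compressed data format, which is the format used in ZIP), so you may have missed that resource.
The basic approach described there is a chained hash table keyed on three-byte sequences. As long as the chain is not empty, every entry along it is scanned in order to: a) eliminate false collisions, b) discard matches that are too old, and c) pick the longest of the remaining matches.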
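A rough sketch of the chained hash table described in section 4 of RFC 1951 might look like the following. The class name `HashChainMatcher`, the `MAX_CHAIN` cap, and the use of a Python dict keyed on the exact three bytes are assumptions made for readability; real implementations such as zlib hash into fixed-size arrays instead.

```python
MIN_MATCH = 3            # chains are keyed on three-byte sequences
WINDOW_SIZE = 32 * 1024  # DEFLATE's maximum match distance
MAX_CHAIN = 128          # hypothetical cap on candidates examined


class HashChainMatcher:
    def __init__(self, data: bytes):
        self.data = data
        self.chains = {}  # three-byte key -> list of positions, oldest first

    def insert(self, pos: int) -> None:
        """Record that the three bytes at `pos` can be matched later."""
        if pos + MIN_MATCH <= len(self.data):
            key = self.data[pos:pos + MIN_MATCH]
            self.chains.setdefault(key, []).append(pos)

    def longest_match(self, pos: int):
        """Return (length, distance) of the best match at `pos`, or (0, 0)."""
        data = self.data
        key = data[pos:pos + MIN_MATCH]
        if len(key) < MIN_MATCH:
            return (0, 0)
        best_len, best_dist = 0, 0
        chain = self.chains.get(key, [])
        # Walk the chain from the most recent candidate to the oldest.
        for cand in reversed(chain[-MAX_CHAIN:]):
            dist = pos - cand
            if dist <= 0:
                continue  # only earlier positions are valid sources
            if dist > WINDOW_SIZE:
                break     # b) this and all older entries are too old
            # a) compare the actual bytes (with a real hash function the
            #    key alone would not prove the bytes are equal)
            n = 0
            while pos + n < len(data) and data[cand + n] == data[pos + n]:
                n += 1
            # c) keep the longest; strict ">" prefers the closer candidate
            if n > best_len:
                best_len, best_dist = n, dist
        if best_len < MIN_MATCH:
            return (0, 0)
        return (best_len, best_dist)
```

As a usage example, with `m = HashChainMatcher(b"abcabcabc")` and positions 0, 1 and 2 inserted, `m.longest_match(3)` returns `(6, 3)`. Old chain entries are never deleted in this sketch; they are simply skipped once they fall outside the window, so a long-running compressor would need to bound that memory somehow.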
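(Note that RFC 1951 presents this approach as a recommendation, not a requirement. One cost of the scheme is that three-byte sequences have to be looked up starting at each successive byte of the incoming data, and every entry in a chain has to be examined. For example, if the incoming text is "ABCDEFG..." and there are matches for "ABC" at offsets 100, 302 and 416, but the only match for "BCD" is at offset 301, then only the "ABC" match at offset 302 can extend to "ABCD", so it is the only one worth checking further.)
Note also the RFC's recommendation of optional "lazy matching" (which, ironically, does more work): instead of automatically taking the longest match that starts at the first three bytes of the incoming data, the compressor checks whether an even longer match starts at the next byte. If the incoming data is "ABCDE..." and the only matches in the window are for "ABC" and "BCDE", it is better to encode "A" as a literal byte and "BCDE" as a match.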
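The "lazy matching" idea from RFC 1951 can be sketched as below. To keep the example self-contained, `find_match` is a brute-force stand-in for a real match finder, and the function names and the `(length, distance)` token layout are assumptions for illustration only.

```python
MIN_MATCH = 3


def find_match(data: bytes, pos: int):
    """Brute-force stand-in for a match finder: (length, distance) or (0, 0)."""
    best_len, best_dist = 0, 0
    for cand in range(pos):
        n = 0
        while pos + n < len(data) and data[cand + n] == data[pos + n]:
            n += 1
        # ">=" keeps the closest candidate among equally long matches.
        if n >= MIN_MATCH and n >= best_len:
            best_len, best_dist = n, pos - cand
    return best_len, best_dist


def tokenize(data: bytes, lazy: bool = True):
    """Turn data into literals (ints) and (length, distance) references."""
    tokens, pos = [], 0
    while pos < len(data):
        length, dist = find_match(data, pos)
        if length == 0:
            tokens.append(data[pos])
            pos += 1
            continue
        if lazy and pos + 1 < len(data):
            next_len, _ = find_match(data, pos + 1)
            if next_len > length:
                # Defer: emit a single literal now; the longer match is
                # found again and emitted on the next loop iteration.
                tokens.append(data[pos])
                pos += 1
                continue
        tokens.append((length, dist))
        pos += length
    return tokens
```

With the input `b"ABCxBCDEyABCDE"`, the greedy variant (`lazy=False`) ends with the reference `(3, 9)` followed by two literals for "DE", while the lazy variant ends with a literal "A" followed by the single reference `(4, 6)`. That is the "ABC" versus "BCDE" situation described above, at the cost of one extra search per match.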
Source: https://habr.com/ru/post/1704716/