What is the optimal algorithm for filling DVDs/Blu-ray discs for recording?

What is the optimal algorithm for populating a set of Blu-ray discs with many hundreds of gigabytes of assets of various sizes?

I am trying to consolidate a large number of old CD-ROMs, DVDs, and small hard drives and put everything into a database indexed by MD5 signature. A daunting challenge, for sure.

What I'm doing now is to sort the asset sizes (usually directory sizes) in descending order and start inserting the largest assets into the fill list, skipping anything that doesn't fit, until I run out of assets. It runs almost instantly, but I wouldn't mind it taking all night if it only had to run once.

This usually gives me 95% utilization or better, but I'm sure some other combination would pack the discs more tightly. With huge items such as disk images, this primitive method can leave a lot of space unused.
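
Roughly, the greedy pass looks like this. This is only a simplified sketch, not my real script, and the asset hash and sample sizes are made up:

    use strict;
    use warnings;

    my $CAPACITY = 25_025_314_816;    # bytes on one disc, as above

    # Greedy fill for a single disc: largest first, skip whatever doesn't fit.
    sub greedy_fill {
        my %size_of = @_;
        my @by_size = sort { $size_of{$b} <=> $size_of{$a} } keys %size_of;

        my @fill_list;
        my $used = 0;
        for my $asset (@by_size) {
            next if $used + $size_of{$asset} > $CAPACITY;
            push @fill_list, $asset;
            $used += $size_of{$asset};
        }
        return ( \@fill_list, $used );
    }

    # Illustrative sizes only.
    my ( $list, $bytes ) = greedy_fill(
        'photos/2009' => 9_000_000_000,
        'video/hdr'   => 12_000_000_000,
        'disk_images' => 22_000_000_000,
    );
    printf "%s\n%.1f%% utilization\n", join( ', ', @$list ), 100 * $bytes / $CAPACITY;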

My thought is to try all combinations of the remaining assets, taken 1, then 2, then 3, ... at a time, keeping track of the highest byte total so far that is under 25,025,314,816 bytes, along with a pointer to the array of assets that sums to it. Once the combinations include so many assets that none of them can fit any more, stop and use the array pointed to by the best total.

Is this the best possible algorithm?

There are two Perl modules that seem suited to the task, Algorithm::Combinatorics and Math::Combinatorics. Any tips on which is faster or more stable, or on a better approach altogether?

My plan is to write a script that computes the sizes of a large number of directories and shows me the optimal contents of each of the dozens of discs needed to burn them.

And I don't want to just fill discs on a file-by-file basis, because I want whole directories to stay together on a single disc.

+6
4 answers

This is an NP-complete problem known as bin packing. There is no known polynomial-time algorithm that solves it optimally; in other words, you cannot find the optimal solution without essentially trying every possibility.

On the plus side, a very simple heuristic such as "put the largest remaining folder onto the first disc where it fits" guarantees that you will use no more than twice as many discs as the optimum. (You can read more in the Wikipedia article on the problem.)
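
For illustration, a sketch of that heuristic (first-fit decreasing) in Perl, assuming you already have folder sizes in a hash; it is only meant to show the idea:

    use strict;
    use warnings;

    my $CAPACITY = 25_025_314_816;    # bytes per disc, from the question

    # First-fit decreasing: largest remaining folder onto the first disc it fits.
    sub first_fit_decreasing {
        my %size_of = @_;             # folder name => size in bytes
        my @discs;                    # each: { free => bytes left, dirs => [...] }

        for my $dir ( sort { $size_of{$b} <=> $size_of{$a} } keys %size_of ) {
            my $placed = 0;
            for my $disc (@discs) {
                next if $size_of{$dir} > $disc->{free};
                push @{ $disc->{dirs} }, $dir;
                $disc->{free} -= $size_of{$dir};
                $placed = 1;
                last;
            }
            push @discs, { free => $CAPACITY - $size_of{$dir}, dirs => [$dir] }
                unless $placed;
        }
        return @discs;                # one entry per disc to burn
    }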

+4

The problem is called 1-D bin packing. The greedy algorithm is very fast but not optimal. You could also use a brute-force algorithm, but the search space is very large. Here is a greedy implementation: http://www.phpclasses.org/package/2027-PHP-Pack-files-without-exceeding-a-given-size-limit.html

+2

This is the most practical method I have found so far for efficiently filling my Blu-ray discs.

I make a list of fully qualified paths for all the files available for writing.

Then I (arbitrarily) decide how many directory levels to group by, or accept a command-line parameter for it. This is needed so that directories full of related items end up together on one Blu-ray. There is also a STUFF option that inserts the largest files first and, when a file would cause an overflow, tries the next smaller one, until the files or the space run out.
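
The grouping step can be as simple as keeping the first N path components as the key. A sketch only (the --depth option name and the example path are made up):

    use strict;
    use warnings;
    use Getopt::Long;

    # Group paths by a fixed number of leading directory levels.
    my $depth = 2;
    GetOptions( 'depth=i' => \$depth );

    sub group_key {
        my ($path) = @_;
        my @parts = grep { length } split m{/}, $path;
        my $take  = $depth < @parts ? $depth : scalar @parts;
        return '/' . join '/', @parts[ 0 .. $take - 1 ];
    }

    # e.g. group_key('/archive/photos/2009/iceland/img_001.nef') with --depth 3
    # returns '/archive/photos/2009', so everything under it stays together.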

Make a hash with each directory as the key and the total size of the files it contains as the data. Also keep a parallel hash with the number of files in each directory, since slack space and directory overhead add up and should be taken into account.
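
A sketch of building those two hashes with File::Find (the example paths are hypothetical):

    use strict;
    use warnings;
    use File::Find;

    # Total bytes per group directory, and file count per group directory
    # (so slack space and overhead can be estimated later).
    my ( %bytes_in, %files_in );

    sub scan_group {
        my ($group) = @_;
        find(
            sub {
                return unless -f $_;
                $bytes_in{$group} += -s _;    # reuse the stat from -f
                $files_in{$group}++;
            },
            $group
        );
    }

    scan_group($_) for '/archive/photos/2009', '/archive/video/hdr';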

I chose 22 as a magic number. If there are <= 22 directories left, try all combinations to find the one closest to, but not over, 25.025 GB. If there are more than 22, just use the 22 largest. I use the Perl Algorithm::Combinatorics module to generate all the combinations. Through trial and (mostly) error, I determined that combinations of 21 items take just a few seconds, 23 items take many minutes (which is longer than my attention span), and 22 takes about 35 seconds.
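
Since my original function got mangled when I tried to post it (see below), here is only a rough sketch of that combination search, not the real code. It reuses the %bytes_in hash from the earlier sketch:

    use strict;
    use warnings;
    use Algorithm::Combinatorics qw(combinations);

    my $CAPACITY = 25_025_314_816;    # bytes per disc
    my $MAGIC    = 22;                # largest set that still searches quickly

    # %bytes_in maps group directory => total bytes (built earlier).
    sub best_combination {
        my %bytes_in = @_;
        my @dirs = sort { $bytes_in{$b} <=> $bytes_in{$a} } keys %bytes_in;
        @dirs = @dirs[ 0 .. $MAGIC - 1 ] if @dirs > $MAGIC;

        my ( $best_bytes, $best_set ) = ( 0, [] );
        for my $k ( 1 .. scalar @dirs ) {
            my $iter = combinations( \@dirs, $k );
            while ( my $combo = $iter->next ) {
                my $total = 0;
                $total += $bytes_in{$_} for @$combo;
                next if $total > $CAPACITY;
                ( $best_bytes, $best_set ) = ( $total, [@$combo] )
                    if $total > $best_bytes;
            }
        }
        return ( $best_bytes, $best_set );    # bytes used, directories to burn
    }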

An output directory is also accepted and checked for existing data. There is an option to move the files (copy, verify the size, then unlink).
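
The move itself is just copy, verify the size, unlink. A bare-bones sketch (destination-directory creation is left out):

    use strict;
    use warnings;
    use File::Copy qw(copy);
    use File::Basename qw(basename);

    sub move_checked {
        my ( $src, $dst_dir ) = @_;
        my $dst = "$dst_dir/" . basename($src);

        copy( $src, $dst ) or die "copy $src -> $dst failed: $!";
        die "size mismatch for $dst\n" unless -s $dst == -s $src;
        unlink $src or warn "could not remove $src: $!";
        return $dst;
    }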

Every time I bought a new hard drive it was usually twice the size of the previous one, so I would just copy everything over. With a Nikon D800E (Extreme!), HDR, and panoramas, I finally ran out of free space.

My project was to de-duplicate, weed out, and consolidate 15 years' worth of [mostly unwanted] photos, videos, movies, music, etc. I took an inventory of about a dozen storage devices, computed MD5 signatures, and put them into a database. I chose one drive as the master for photos, one for video, and so on. I found 8 copies of some things!

Now I have about 10 TB of free disk space!!!

Below is a function that does all the real work in case anyone is interested.

Unfortunately, the site rejected my post with:

    Your post appears to contain code that is not properly formatted as code

The stupid website mangled my original code. Sorry :(

0

Use an algorithm for the "knapsack" optimization problem.

http://en.wikipedia.org/wiki/Knapsack_problem

  • Set the weight equal to the file size
  • Set the value equal to the weight
  • Run the algorithm for each successive disc to be packed

This may not be the best choice (it maximizes how full the next disc is rather than minimizing the total number of discs required), but it is well documented, and it is easy to find explanations and working code for the programming language of your choice (even spreadsheets) online.
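
For concreteness, a sketch in Perl: with value equal to weight the knapsack reduces to subset-sum, so a simple dynamic program over the capacity works. Sizes are rounded up to whole MiB here to keep the table small, and the directory names and sizes are made up:

    use strict;
    use warnings;
    use POSIX qw(ceil floor);

    my $CAPACITY_MB = floor( 25_025_314_816 / 2**20 );    # round capacity down

    # Pick the set of directories whose total size is as large as possible
    # without exceeding one disc.
    sub pack_one_disc {
        my %size_bytes = @_;
        my @names = keys %size_bytes;
        my @mb    = map { ceil( $size_bytes{$_} / 2**20 ) } @names;  # round items up

        my @best = (0) x ( $CAPACITY_MB + 1 );  # best[$c] = max MiB packable in $c
        my @take;                               # take[$i]{$c} = item $i used at $c
        for my $i ( 0 .. $#names ) {
            for ( my $c = $CAPACITY_MB; $c >= $mb[$i]; $c-- ) {
                my $with = $best[ $c - $mb[$i] ] + $mb[$i];
                if ( $with > $best[$c] ) {
                    $best[$c] = $with;
                    $take[$i]{$c} = 1;
                }
            }
        }

        # Walk backwards to recover which directories were chosen.
        my @chosen;
        my $c = $CAPACITY_MB;
        for my $i ( reverse 0 .. $#names ) {
            next unless $take[$i]{$c};
            push @chosen, $names[$i];
            $c -= $mb[$i];
        }
        return @chosen;
    }

    my @disc = pack_one_disc(
        'photos/2009' => 9_000_000_000,
        'video/hdr'   => 12_000_000_000,
        'music'       => 6_000_000_000,
    );
    print "burn to this disc: @disc\n";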

-2

Source: https://habr.com/ru/post/921443/

