I know this is historical data, but you might prefer a naming scheme that helps with this problem. It may be much easier to solve in two passes: first rename the directories based on their dates, then select which directories to keep going forward.
You can build a quick approximation if all the dates that ls -l shows for the directory are well-behaved (the column numbers below assume ISO-style dates, so that $6 is the date and $8 is the name):
ls -l --time-style=long-iso | awk 'NR > 1 {print "mv " $8 " " $6}' > /tmp/runme
Take a look at /tmp/runme, and if it looks good, you can run it with sh /tmp/runme. You may need to prune or fix some entries first, for example names that contain spaces.
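If parsing ls -l proves fragile, one possible alternative is to ask date for each directory's modification time directly. This is only a sketch: it assumes GNU date (where -r FILE means "use this file's modification time"), and gen_renames is a hypothetical helper name, not something from the original approach. Like the one-liner above, it only prints the commands so they can be reviewed first.

```shell
# Sketch: print (not run) one mv command per subdirectory, using the
# directory's modification date as the new name.
# Assumes GNU date; gen_renames is a hypothetical helper name.
gen_renames() {
    parent=$1
    for d in "$parent"/*/; do
        [ -d "$d" ] || continue            # glob matched nothing
        d=${d%/}                           # drop the trailing slash
        stamp=$(date -r "$d" "+%Y-%m-%d")  # modification date of the directory
        [ "${d##*/}" = "$stamp" ] && continue  # already named by its date
        printf 'mv -- "%s" "%s/%s"\n' "$d" "$parent" "$stamp"
    done
}
```

Usage would be something like gen_renames /path/to/backups > /tmp/runme, followed by the same review step. Two directories modified on the same day would map to the same name, so duplicates in the output need to be resolved by hand, and names containing double quotes or dollar signs would still need manual fixing.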
If all your backups are stored in directories named by date, for example:
2011-01-01/
2011-01-02/
2011-01-03/
...
2011-02-01/
2011-02-02/
...
2011-03-07/
then your problem is reduced to computing which names to keep and which to delete. That is much easier than searching through all your files and trying to decide what to keep and delete based on when they were made. (See the output of date "+%Y-%m-%d" for a quick way to generate this sort of name.)
Once the directories are named conveniently, you can keep the first backup of each month with a script like the following, which prints a removal command for every day of the month except the 1st:
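As a small illustration of generating such a name when a new backup is made, here is a minimal sketch. BACKUP_ROOT and today_dir are hypothetical names chosen for the example, and the default path is a placeholder.

```shell
# Sketch: compute today's date-named backup directory.
# BACKUP_ROOT and today_dir are hypothetical; adjust to your own layout.
BACKUP_ROOT=${BACKUP_ROOT:-/tmp/backups-example}

today_dir() {
    # Prints something like /tmp/backups-example/2011-03-07
    printf '%s/%s\n' "$BACKUP_ROOT" "$(date "+%Y-%m-%d")"
}

# Print the command rather than running it, so it can be checked first.
echo "mkdir -p \"$(today_dir)\""
```

Because the names are zero-padded and ordered year, month, day, a plain sort of the directory listing is also a chronological sort.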
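# Print (do not run) a removal command for days 02..31 of every month.
for y in $(seq 2008 2010); do
    for m in $(seq -w 1 12); do
        for d in $(seq -w 2 31); do
            # -r because the backups are directories
            echo "rm -r $y-$m-$d"
        done
    done
done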
Save its output, check it :) and then run the output, just as with the rename script.
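The loop above assumes a backup exists for the 1st of every month. If that is not guaranteed, a variation is to keep whichever backup comes first in each month. The following is a sketch under that assumption; prune_commands is a hypothetical helper name, and it reads directory names on standard input and only prints the removal commands.

```shell
# Sketch: keep the earliest backup of each month, print rm for the rest.
# Reads directory names (YYYY-MM-DD, one per line) on stdin.
# prune_commands is a hypothetical helper name.
prune_commands() {
    sort | awk '
        # Ignore anything that is not named like a date.
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ {
            month = substr($0, 1, 7)      # YYYY-MM
            if (month in seen)
                print "rm -r " $0         # a later backup in the same month
            else
                seen[month] = 1           # first backup of this month: keep
        }'
}
```

It could be fed with something like ls | prune_commands > /tmp/runme, followed by the same check-then-run step.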
Once you have the old backups under control, you can generate the 2010 from date --date="last year" "+%Y" and make other improvements, so that the script handles "once a week" for the current month and maintains itself indefinitely.
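One way those improvements might look is sketched below. It assumes GNU date (for --date), and it interprets "once a week" as "the first backup of each ISO week", which is only one possible reading. retention_commands is a hypothetical helper name; as before, it only prints commands.

```shell
# Sketch: monthly retention for older months, weekly for the current month.
# Assumes GNU date; retention_commands is a hypothetical helper name.

# The upper bound for the year loop, computed instead of hard-coded.
last_year=$(date --date="last year" "+%Y")

retention_commands() {
    # $1: current month as YYYY-MM (defaults to the real current month)
    current=${1:-$(date "+%Y-%m")}
    prev=""
    sort | while IFS= read -r name; do
        case $name in
            [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
            *) continue ;;                 # not a date-named directory
        esac
        if [ "${name%-*}" = "$current" ]; then
            # Current month: group by ISO week, e.g. 2011-W10
            key=$(date --date="$name" "+%G-W%V")
        else
            # Older months: group by month, e.g. 2011-01
            key=${name%-*}
        fi
        # Input is sorted, so the first name in each group is kept
        # and every later one with the same key is removed.
        if [ "$key" = "$prev" ]; then
            echo "rm -r $name"
        fi
        prev=$key
    done
}
```

For example, ls | retention_commands > /tmp/runme would produce a list to review before running it.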