How to sort row groups?

Question

In the following example, there are three groups that need to be sorted:

"[aaa]" and the 4 lines (always 4) below it form a single unit.
"[kkk]" and the 4 lines (always 4) below it form a single unit.
"[zzz]" and the 4 lines (always 4) below it form a single unit.

Only the groups of lines following this pattern should be sorted; everything before "[aaa]" and after the 4th line of "[zzz]" should be left untouched.

from

    This sentence and everything above it should not be sorted.
    [zzz]
    some
    random
    text
    here
    [aaa]
    bla
    blo
    blu
    bli
    [kkk]
    1
    44
    2
    88
    And neither should this one and everything below it.

to

    This sentence and everything above it should not be sorted.
    [aaa]
    bla
    blo
    blu
    bli
    [kkk]
    1
    44
    2
    88
    [zzz]
    some
    random
    text
    here
    And neither should this one and everything below it.

+3

sorting bash

octosquidopus Nov 23 '12 at 0:36

3 answers

Assuming that none of the other lines contain a "[" character:

    header=`grep -n 'This sentence and everything above it should not be sorted.' sortme.txt | cut -d: -f1`
    footer=`grep -n 'And neither should this one and everything below it.' sortme.txt | cut -d: -f1`
    head -n $header sortme.txt #print header
    head -n $(( footer - 1 )) sortme.txt | tail -n +$(( header + 1 )) | tr '\n[' '[\n' | sort | tr '\n[' '[\n' | grep -v '^\[$' #sort lines between header & footer
    #cat sortme.txt | head -n $(( footer - 1 )) | tail -n +$(( header + 1 )) | tr '\n[' '[\n' | sort | tr '\n[' '[\n' | grep -v '^\[$' #sort lines between header & footer
    tail -n +$footer sortme.txt #print footer

This serves the purpose.

Please note that the actual sorting is done by the 4th command alone. The other lines are there to preserve the header and footer.

I also assume that there are no other lines between the header and the first "[section]".
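To see what the tr/sort/tr pipeline does on its own, here is a minimal sketch applied to a small inline sample. The function name sort_groups and the sample data are made up for illustration, and it assumes (as above) that "[" appears only at the start of a group header.

```shell
# Hypothetical helper: sorts "[name]" groups read from stdin.
# Assumes "[" occurs only as the first character of each group header.
sort_groups() {
  tr '\n[' '[\n' |   # swap newlines and "[" so each group becomes one line
    sort |           # sort the one-line groups
    tr '\n[' '[\n' | # swap back to restore the original layout
    grep -v '^\[$'   # drop the stray "[" produced by the trailing newline
}

# Two groups in the wrong order; the [aaa] group should come out first.
printf '%s\n' '[zzz]' some random text here '[aaa]' bla blo blu bli | sort_groups
```

Because every newline inside a group is temporarily turned into "[", sort sees each group as a single line and never reorders the lines within it.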

+1

anishsane Nov 23 '12 at 5:13

This may work for you (GNU sed and sort):

    sed -i.bak '/^\[/!b;N;N;N;N;s/\n/UnIqUeStRiNg/g;w sort_file' file
    sort -o sort_file sort_file
    sed -i -e '/^\[/!b;R sort_file' -e 'd' file
    sed -i 's/UnIqUeStRiNg/\n/g' file

The sorted result will be in file and the original in file.bak.

Every line starting with "[", together with the 4 lines that follow it, will appear in sorted order.

UnIqUeStRiNg can be any unique string that does not contain a newline, e.g. \x00.

0

potong Nov 23 '12 at 9:54

rici · Accepted Answer · 2012-11-23T01:13:13+0000

Perhaps not the fastest :) [1], but it will do what you want, I believe:

    for line in $(grep -n '^\[.*\]$' sections.txt | sort -k2 -t: | cut -f1 -d:); do
      tail -n +$line sections.txt | head -n 5
    done

Here is a better one:

    for pos in $(grep -b '^\[.*\]$' sections.txt | sort -k2 -t: | cut -f1 -d:); do
      tail -c +$((pos+1)) sections.txt | head -n 5
    done

[1] The first is something like O(N^2) in the number of lines in the file, since for each section it has to read all the way through the file up to that section. The second, which can seek straight to the correct byte position, should be closer to O(N log N).

[2] This takes you at your word that each section always has exactly five lines (the header plus the four that follow), hence head -n 5. However, it would be easy to replace that with something that reads up to, but does not include, the next line starting with "[", should that ever be necessary.
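The variable-length variant mentioned in note [2] could look roughly like this. It is only a sketch: the function name sorted_sections is hypothetical, it swaps head -n 5 for a short awk filter, it relies on grep -b (available in GNU and BSD grep), and it assumes nothing follows the last section in the file.

```shell
# Hypothetical helper: prints the sections of file "$1" sorted by header.
# A section runs from its "[header]" line up to, but not including,
# the next line starting with "[" (or the end of the file).
sorted_sections() {
  for pos in $(grep -b '^\[.*\]$' "$1" | sort -k2 -t: | cut -f1 -d:); do
    # Start at the header's byte offset; stop at the next header line.
    tail -c +$((pos+1)) "$1" | awk 'NR > 1 && /^\[/ { exit } { print }'
  done
}

# Demo on a temporary file whose sections have different lengths.
tmp=$(mktemp)
printf '%s\n' '[zzz]' one two '[aaa]' three '[kkk]' four five six > "$tmp"
sorted_sections "$tmp"
rm -f "$tmp"
```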

Preserving the beginning and the end requires a bit more work:

    # Find all the sections
    mapfile indices < <(grep -b '^\[.*\]$' sections.txt)
    # Output the prefix
    head -c+${indices[0]%%:*} sections.txt
    # Output sections, as above
    for pos in $(printf %s "${indices[@]}" | sort -k2 -t: | cut -f1 -d:); do
      tail -c +$((pos+1)) sections.txt | head -n 5
    done
    # Output the suffix
    tail -c+$((1+${indices[-1]%%:*})) sections.txt | tail -n+6

You might want to make a function or script file out of this by changing sections.txt to $1 everywhere.
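One possible shape for such a function is sketched below. The name sort_sections is hypothetical, and the sketch deliberately replaces mapfile with plain command substitution so that it also runs in shells without bash arrays; it still assumes grep -b, exactly five lines per section, and at least one section in the file.

```shell
# Hypothetical wrapper: sort_sections FILE
# The body runs in a subshell, so its variables do not leak to the caller.
sort_sections() (
  file=$1
  # Byte offset and header of every section, one per line, e.g. "60:[zzz]"
  indices=$(grep -b '^\[.*\]$' "$file")
  first=$(printf '%s\n' "$indices" | head -n 1 | cut -d: -f1)
  last=$(printf '%s\n' "$indices" | tail -n 1 | cut -d: -f1)
  # Output the prefix (everything before the first section)
  head -c "$first" "$file"
  # Output the sections, sorted by header
  for pos in $(printf '%s\n' "$indices" | sort -k2 -t: | cut -f1 -d:); do
    tail -c +$((pos+1)) "$file" | head -n 5
  done
  # Output the suffix (everything after the last section in file order)
  tail -c +$((last+1)) "$file" | tail -n +6
)
```

It could then be called as sort_sections sections.txt > sorted.txt, writing to a different file than the one being read.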
