Permutation Columns Without Repeat

Question

Can someone give me a piece of code or an algorithm or something else to solve the following problem? I have several files, each of which has a different number of columns, for example:

    $> cat file-1
    1 2
    $> cat file-2
    1 2 3
    $> cat file-3
    1 2 3 4

For every pair of columns, I would like to take the absolute value of the difference between the two columns and divide it by the sum of all the values in the row. Each pair should be used only once (combinations, without repeated pairs of columns):

    in file-1 case I need to get:
    0.3333                   # because |1-2|/(1+2)

    in file-2 case I need to get:
    0.1666 0.1666 0.3333     # because |1-2|/(1+2+3) and |2-3|/(1+2+3) and |1-3|/(1+2+3)

    in file-3 case I need to get:
    0.1 0.2 0.3 0.1 0.2 0.1  # because |1-2|/(1+2+3+4) and |1-3|/(1+2+3+4) and |1-4|/(1+2+3+4) and |2-3|/(1+2+3+4) and |2-4|/(1+2+3+4) and |3-4|/(1+2+3+4)
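Read this way, the request can be written as a single formula. This is only an interpretation of the three examples; the symbols x_i (the value in column i) and n (the number of columns) are introduced here and do not appear in the question:

```latex
\[
d_{ij} = \frac{\lvert x_i - x_j \rvert}{\sum_{k=1}^{n} x_k},
\qquad 1 \le i < j \le n
\]
```

Under that reading a row with n columns yields n(n-1)/2 values, which matches the 1, 3 and 6 values expected for file-1, file-2 and file-3.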

+4

linux bash shell awk sed

user1116360 Jan 23 '12 at 10:38

3 answers

jaypal singh · Answer 1 · 2012-01-23T23:58:16+0000

This should work, although I assume you made a small mistake in your expected output. Based on the ordering used in your third example, instead of:

    in file-2 case I need to get:
    0.1666 0.1666 0.3333  # because |1-2|/(1+2+3) and |2-3|/(1+2+3) and |1-3|/(1+2+3)

it should be:

    in file-2 case I need to get:
    0.1666 0.3333 0.1666  # because |1-2|/(1+2+3) and |1-3|/(1+2+3) and |2-3|/(1+2+3)
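The corrected ordering is what you get when the column pairs are visited by two nested loops, with the outer loop running over the first column of each pair. As a minimal sketch, the helper below only prints the index pairs in that order; the name `list_pairs` is made up for illustration and is not part of the answer:

```shell
# Hypothetical helper: print the column index pairs (i,j) with i < j
# for a row of n columns, in nested-loop order.
list_pairs() {
    awk -v n="$1" 'BEGIN {
        n = n + 0                      # force a numeric comparison
        sep = ""
        for (i = 1; i < n; i++)
            for (j = i + 1; j <= n; j++) {
                printf "%s%d-%d", sep, i, j
                sep = " "
            }
        print ""
    }'
}
```

For example, `list_pairs 3` prints `1-2 1-3 2-3`, which is the order used for file-2 in the correction above, and `list_pairs 4` prints `1-2 1-3 1-4 2-3 2-4 3-4`, the order the question already uses for file-3.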

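Here is an awk one-liner. Note that it negates the difference rather than taking a true absolute value, so it relies on the columns of each row being in ascending order, as they are in the sample: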

 awk ' NF{ a=0; for(i=1;i<=NF;i++) a+=$i; for(j=1;j<=NF;j++) { for(k=j;k<NF;k++) printf("%s ",-($j-$(k+1))/a) } print ""; next; }1' file

Short version:

 awk ' NF{for (i=1;i<=NF;i++) a+=$i; for (j=1;j<=NF;j++){for (k=j;k<NF;k++) printf("%2.4f ",-($j-$(k+1))/a)} print "";a=0;next;}1' file

Input file:

    [jaypal:~/Temp] cat file
    1 2
    1 2 3
    1 2 3 4

Test:

    [jaypal:~/Temp] awk ' NF{ a=0; for(i=1;i<=NF;i++) a+=$i; for(j=1;j<=NF;j++) { for(k=j;k<NF;k++) printf("%s ",-($j-$(k+1))/a) } print ""; next; }1' file
    0.333333
    0.166667 0.333333 0.166667
    0.1 0.2 0.3 0.1 0.2 0.1

Test of the shorter version:

    [jaypal:~/Temp] awk ' NF{for (i=1;i<=NF;i++) a+=$i; for (j=1;j<=NF;j++){for (k=j;k<NF;k++) printf("%2.4f ",-($j-$(k+1))/a)} print "";a=0;next;}1' file
    0.3333
    0.1667 0.3333 0.1667
    0.1000 0.2000 0.3000 0.1000 0.2000 0.1000

Steve · Answer 2 · 2012-01-24T00:17:24+0000

@Jaypal just beat me to it! Here is what I had:

 awk '{for (x=1;x<=NF;x++) sum += $x; for (i=1;i<=NF;i++) for (j=2;j<=NF;j++) if (i < j) printf ("%.1f ",-($i-$j)/sum)} END {print ""}' file.txt

Output:

 0.1 0.2 0.3 0.1 0.2 0.1

This prints each value to one decimal place.

@Jaypal, is there a quick way to print the absolute value? Perhaps for example: abs(value) ?

EDIT:

@Jaypal, yes, I also searched and could not find anything simple :-( It seems that if ($i < 0) $i = -$i is the way to go. Alternatively, I think you could use sed to remove any minus signs:

 awk '{for (x=1;x<=NF;x++) sum += $x; for (i=1;i<=NF;i++) for (j=2;j<=NF;j++) if (i < j) printf ("%.1f ", ($i-$j)/sum)} {print ""}' file.txt | sed "s%-%%g"

Hooray!
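On the absolute-value question: awk has no built-in abs(), so one option is a small user-defined function instead of post-processing with sed. The following is a sketch under a few assumptions, not a replacement for the commands above. The name `pair_diffs` is hypothetical, the four-decimal format is an arbitrary choice, and the sum is reset for every line so that multi-line input works too:

```shell
# Hypothetical helper: for each non-empty input line, print |x_i - x_j| / sum
# for every column pair i < j. Reads the named files, or stdin if none given.
pair_diffs() {
    awk '
    # awk has no built-in abs(), so define one
    function abs(v) { return v < 0 ? -v : v }
    NF {
        sum = 0                            # reset for every input line
        for (x = 1; x <= NF; x++) sum += $x
        if (sum == 0) { print ""; next }   # skip rows that would divide by zero
        sep = ""
        for (i = 1; i < NF; i++)
            for (j = i + 1; j <= NF; j++) {
                printf "%s%.4f", sep, abs($i - $j) / sum
                sep = " "
            }
        print ""
    }' "$@"
}
```

Because the difference goes through abs(), the column order no longer matters: `printf '3 1 2\n' | pair_diffs` prints `0.3333 0.1667 0.1667`.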

Abhijeet rastogi · Answer 3 · 2012-01-23T22:47:32+0000

Since this looks like homework, I will act accordingly.

To find how many numbers there are in a file, you can use:

 cat filename | wc -w

Find first_number:

 cat filename | cut -d " " -f 1

To find the sum of the numbers in a file:

 cat filename | tr " " "+" | bc

Now you have total_nos, use something like:

    for i in $(seq 1 1 "$total_nos")
    do
        # Find the numerator by first_number - $i
        # Use the sum you got from above to get the desired value.
        :   # no-op so that the loop body is valid shell
    done
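Putting those pieces together, a complete version of the loop might look like the sketch below. It rests on several assumptions that are not in the answer: the file holds a single line of integers separated by single spaces (as in file-1 to file-3), the helper name `pair_diffs_sh` is made up, shell arithmetic is used in place of bc for the sum, and awk does the division because the shell only has integer arithmetic:

```shell
# Hypothetical helper following the outline above.
# Assumes $1 is a file with one line of single-space-separated integers.
pair_diffs_sh() {
    file=$1
    total_nos=$(( $(wc -w < "$file") ))
    # Same tr trick as above, evaluated by shell arithmetic instead of bc.
    sum=$(( $(tr ' ' '+' < "$file") ))
    out=""
    i=1
    while [ "$i" -lt "$total_nos" ]; do
        a=$(cut -d ' ' -f "$i" "$file")
        j=$((i + 1))
        while [ "$j" -le "$total_nos" ]; do
            b=$(cut -d ' ' -f "$j" "$file")
            # The shell only does integer math, so awk handles the division.
            v=$(awk -v a="$a" -v b="$b" -v s="$sum" \
                'BEGIN { d = a - b; if (d < 0) d = -d; printf "%.4f", d / s }')
            out="$out${out:+ }$v"
            j=$((j + 1))
        done
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}
```

Called as `pair_diffs_sh file-2` on the file-2 from the question, this prints `0.1667 0.3333 0.1667`. It starts several processes per pair, so it is only reasonable for short rows.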
