Code Golf: The shortest code to find a weighted median?

Question

Code Golf: The shortest code to find a weighted median?

My attempt at golf.

The problem of finding the minimum value ∑W_i*|X-X_i| comes down to finding the weighted median of the list x[i] with weights w[i] (see below for a definition). How do you do it with the shortest, easiest and most beautiful program?

Here's how my code looked initially (explanation in answer to the question , and the short version is published as one of the answers below).

  #define zero(x) ( abs(x) < 1e-10 ) /* because == doesn't work for floats */ float sum = 0; int i; for (i = 0; i < n; i++) sum += w[i]; while (sum > 0) sum -= 2*w[--i]; right = x[i] // the rightmost minimum point left = ( zero(sum) && zero(w[i]-w[i-1]) ) ? x[i-1] : right; answer = (left + right) / 2;

(In fact, it is already highly optimized, since you see the variables i and sum repeatedly)

rules

Floats and integers: different languages have different floating point arithmetic standards, so I reformulate the problem so that x[i] and w[i] are integers , and you can, if you prefer, return the value of the answer twice (which is always an integer) . You can return, print, or assign a response to a variable.

Definition of a weighted median and explanations:

The median of the sorted array x[i] length n is either x[n/2] or (x[n/2-1/2]+x[n/2+1/2])/2 depending on whether whether n odd or even
~~The median of an unsorted array is the median of the array after sorting~~ (true, but our array is sorted)
The weighted median x[i] with integer positive weights w[i] is defined as the median of a larger array, where each occurrence x[i] been changed to w[i] occurrence x[i] .

What i hope to see

One of the reasons for asking is that I assume that the most suitable language would have a trivial summation of the array and iteration with lambdas. I thought that a functional language might be reasonable, but I'm not so sure about this - this is part of the question. I hope to see something like

  // standard function add := (a,b) :-> a + b myreduce := w.reduce with: add until: (value) :-> 2*value >= (w.reduce with:add) answer = x [myreduce from:Begin] + x [myreduce from:End]

I don’t know if any language exists where possible, and is actually shorter.

Test data

 static int n = 10; for (int j = 0; j < n; j++) { w[j] = j + 1; x[j] = j; }

Answer: 6 or 12.

 static int n = 9; int w[n], x[n] ; for (int j = 0; j < n; j++) { w[j] = j + ((j<6) ? 1 : 0); x[j] = j + 1; }

Answer: 6.5 or 13.

+4

language-agnostic code-golf

ilya n. Jun 08 '09 at 20:39

source share

7 answers

So here is how I could squeeze my solution: still leaving spaces:

  int s = 0, i = 0; for (; i < n; s += w[i++]) ; while ( (s -= 2*w[--i] ) > 0) ; a = x[i] + x[ !s && (w[i]==w[i-1]) ? i-1 : i ];

+3

ilya n. Jun 08 '09 at 21:26

source share

Haskell code, ungolfed: attempting a smart functional solution.

 import Data.List (zip4) import Data.Maybe (listToMaybe) mid :: (Num a, Ord a) => [a] -> (Int, Bool) mid w = (i, total == part && maybe False (l ==) r) where (i, l, r, part):_ = dropWhile less . zip4 [0..] wv $ map (2*) sums _:sums = scanl (+) 0 w; total = last sums; less (_,_,_,x) = x < total v = map Just w ++ repeat Nothing wmedian :: (Num a, Ord a) => [a] -> [a] -> (a, Maybe a) wmedian wx = (left, if rem then listToMaybe rest else Nothing) where (i, rem) = mid w; left:rest = drop ix

  > wmedian [1,1,1,1] [1,2,3,4]
 (2, Just 3)
 > wmedian [1,1,2,1] [1,2,3,4]
 (3, Nothing)
 > wmedian [1,2,2,5] [1,2,3,4]
 (3, Just 4)
 > wmedian [1,2,2,6] [1,2,3,4]
 (4, Nothing)

  > wmedian [1..10] [0..9]
 (6, Nothing)
 > wmedian ([1. 1..6] ++ [6..8]) [1..9]
 (6, Just 7)

My original J solution was a simple translation of the above Haskell code.

Here's the Haskell porting of the current J code:

 {-# LANGUAGE ParallelListComp #-} import Data.List (find); import Data.Maybe (fromJust) w&x=foldr((+).fst.fromJust.find((>=sum w).snd))0[fg(+)0$map (2*)w|f<-[zip x.tail,reverse.zip x]|g<-[scanl,scanr]]/2

Yes & hellip; please do not write such code.

  > [1,1,1,1,1] & [1,2,3,4]
 2.5
 > [1,1,2,1] & [1,2,3,4]
 3
 > [1,2,2,5] & [1,2,3,4]
 3.5
 > [1,2,2,6] & [1,2,3,4]
 4
 > [1..10] & [0..9]
 6
 > ([1..6] ++ [6..8]) & [1..9]
 6.5

+2

ephemient Jun 08 '09 at 10:58

source share

short, and does what you expect. Not particularly economical.

 def f(l,i): x,y=[],sum(i) map(x.extend,([m]*n for m,n in zip(l,i))) return (x[y/2]+x[(y-1)/2])/2.

here is the constant space version using itertools. it still has to iterate sum (i) / 2 times, so it won’t beat index calculation algorithms.

 from itertools import * def f(l,i): y=sum(i)-1 return sum(islice( chain(*([m]*n for m,n in zip(l,i))), y/2, (y+1)/2+1 ))/(y%2+1.)

+2

Jimmy Jun 19 '09 at 19:03

source share

Python:

 a=sum([[X]*W for X,W in zip(x,w)],[]);l=len(a);a[l/2]+a[(l-1)/2]

+1

cobbal Jun 28 '09 at 4:12

source share

Something like that? O (n).

 for(int i = 0; i < x.length; i++) { sum += x[i] * w[i]; sums.push(sum); } median = sum/2; for(int i = 0; i < array.length - 1; i++) { if(median > sums[element] and median < sums[element+1] return x[i]; if(median == sums[element]) return (x[i] + x[i+1])/2 }

Not sure how you can get two answers for the median, you mean, if the sum / 2 is exactly equal to the border?

EDIT: By looking at your formatted code, my code does pretty much the same thing, do you need a more efficient method?

EDIT2: Part of the search can be done using a modified binary search, which will make it a little faster.

 index = sums.length /2; finalIndex = binarySearch(index); int binarySearch(i) { if(median > sums[i+1]) { i += i/2 return binarySearch(i); } else if(median < sums[i]) { i -= i/2 return binarySearch(i); } return i; }

You will need to do some checking to make sure that it does not continue indefinitely at the edges.

0

CookieOfFortune Jun 08 '09 at 20:54

source share

Just a comment about your code: I really hope that I don’t have to support it unless you also wrote all the unit tests that are required here :-)

Of course, this is not related to your question, but, as a rule, the "shortest encoding method" is also the "most difficult support method." For scientific applications, this is probably not a stopper. But for IT applications this is.

I think it needs to be said. All the best.

-5

Sro Jun 08 '09 at 20:49

source share

ephemient · Accepted Answer · 2009-06-09T02:36:14+0000

J

Go ahead and enter it directly into the interpreter. The prompt consists of three spaces, so the inserted lines are entered by the user.

  m=:-:@+/@(((2*+/\)I.+/)" 1@ (,:(\: i.@ #))@[{"0 1(,:(\: i.@ #))@])

Test data I used in another answer:

  1 1 1 1 m 1 2 3 4 2.5 1 1 2 1 m 1 2 3 4 3 1 2 2 5 m 1 2 3 4 3.5 1 2 2 6 m 1 2 3 4 4

Test data added to the question:

  (>:,:[)i.10 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 (>:m[)i.10 6 (([+<&6),:>:)i.9 1 2 3 4 5 6 6 7 8 1 2 3 4 5 6 7 8 9 (([+<&6)m>:)i.9 6.5

  i =: (2 * +/\) I. +/

The first index is such that the total amount is greater than or equal to twice the total amount.

  j =: ,: (\: i.@ #)

List and its inverse.

  k =: i"1 @ j @ [

The first indices are such that are higher than the left argument and its inverse.

  l =: k {"(0 1) j @ ]

Those indices extracted from the correct argument and its inverse, respectively.

  m =: -: @ +/ @ l

Half the sum of the resulting list.

Code Golf: The shortest code to find a weighted median?

rules

What i hope to see

Test data

J

More articles: