Functional style removal

Question


I am struggling with something that looks like a simple algorithm, but so far I cannot find a clean way to express it in a functional style. Here is an outline of the problem: suppose I have two arrays, X and Y:

    X = [| 1; 2; 2; 3; 3 |]
    Y = [| 5; 4; 4; 3; 2; 2 |]

I want to get the elements that match and the elements that are unmatched, for example:

    matched = [| 2; 2; 3 |]
    unmatched = [| 1; 3 |], [| 4; 4; 5 |]

In pseudocode, this is how I think of approaching the problem:

    let rec match matches x y =
        let m = find first match from x in y
        if no match, (matches, x, y)
        else
            let x' = remove m from x
            let y' = remove m from y
            let matches' = add m to matches
            match matches' x' y'

The problem I am facing is the "remove m from x" step: I cannot find a clean way to do it (I have working code, but it is ugly as hell). Is there a good, idiomatic functional way to approach this problem, either for the removal part or by writing the algorithm itself differently?

+4

algorithm f#

Mathias Jun 07 '13 at 22:44


2 answers

It seems you are describing a multiset (bag) and its operations.

If you use appropriate data structures, the operations are very simple to implement:

    // Assume that X, Y are initialized bags
    let matches = X.Intersection(Y)
    let x = X.Difference(Y)
    let y = Y.Difference(X)

There is no built-in bag collection in the .NET Framework. You can use the PowerCollections library, which includes a Bag class; the method signatures above are taken from it.
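If you would rather not take a library dependency, the same three operations can be sketched with an immutable F# Map from element to count. The names toBag, ofBag, intersect and difference below are hypothetical helpers written for this illustration; they are not part of PowerCollections or the F# core library.

```fsharp
// A bag is modelled here as a Map from element to its number of occurrences.
// All helper names are hypothetical; this is a sketch, not a library API.
let toBag (xs: seq<'T>) : Map<'T, int> =
    xs |> Seq.countBy id |> Map.ofSeq

// Expand a bag back into a list sorted by key, repeating each element by its count.
let ofBag (bag: Map<'T, int>) : 'T list =
    bag |> Map.toList |> List.collect (fun (x, n) -> List.replicate n x)

// Multiset intersection: keep the smaller of the two counts.
let intersect (a: Map<'T, int>) (b: Map<'T, int>) : Map<'T, int> =
    a
    |> Map.toSeq
    |> Seq.choose (fun (x, n) ->
        match Map.tryFind x b with
        | Some m -> Some (x, min n m)
        | None -> None)
    |> Map.ofSeq

// Multiset difference: subtract counts and drop elements that reach zero.
let difference (a: Map<'T, int>) (b: Map<'T, int>) : Map<'T, int> =
    a
    |> Map.toSeq
    |> Seq.choose (fun (x, n) ->
        let remaining = n - (Map.tryFind x b |> Option.defaultValue 0)
        if remaining > 0 then Some (x, remaining) else None)
    |> Map.ofSeq

let bagX = toBag [| 1; 2; 2; 3; 3 |]
let bagY = toBag [| 5; 4; 4; 3; 2; 2 |]

let matched = ofBag (intersect bagX bagY)      // [2; 2; 3]
let unmatchedX = ofBag (difference bagX bagY)  // [1; 3]
let unmatchedY = ofBag (difference bagY bagX)  // [4; 4; 5]
```

Because the counts live in a map, nothing is ever "removed" element by element; matching reduces to arithmetic on the counts.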

UPDATE:

You can represent a bag as a list sorted in ascending order. Here is an improved version of @kqr's answer in F# syntax:

    let overlap xs ys =
        let rec loop (matches, ins, outs) xs ys =
            match xs, ys with
            // found a match
            | x::xs', y::ys' when x = y -> loop (x::matches, ins, outs) xs' ys'
            // `x` is smaller than every element in `ys`, put `x` into `ins`
            | x::xs', y::ys' when x < y -> loop (matches, x::ins, outs) xs' ys
            // `y` is smaller than every element in `xs`, put `y` into `outs`
            | x::xs', y::ys' -> loop (matches, ins, y::outs) xs ys'
            // copy remaining elements in `xs` to `ins`
            | x::xs', [] -> loop (matches, x::ins, outs) xs' ys
            // copy remaining elements in `ys` to `outs`
            | [], y::ys' -> loop (matches, ins, y::outs) xs ys'
            | [], [] -> (List.rev matches, List.rev ins, List.rev outs)
        loop ([], [], []) (List.sort xs) (List.sort ys)

After the two calls to List.sort, which are O(n log n), the search for matches is linear in the sum of the lengths of the two lists.
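For comparison, the "remove m from x" step from the question can also be written directly, without sorting. The sketch below is a variation on the question's pseudocode rather than a literal translation: it folds over xs and removes one matching element from ys at a time. The names tryRemoveFirst and matchUp are hypothetical. Each removal scans ys, so the whole thing is quadratic in the worst case.

```fsharp
// Remove the first occurrence of `m` from `xs`.
// Returns None when `m` does not occur, so the caller can tell "no match" apart.
let rec tryRemoveFirst m xs =
    match xs with
    | [] -> None
    | x :: rest when x = m -> Some rest
    | x :: rest -> tryRemoveFirst m rest |> Option.map (fun rest' -> x :: rest')

// For each element of `xs`, try to consume one equal element from `ys`.
// Returns (matched, unmatched from xs, unmatched from ys).
let matchUp xs ys =
    let step (matched, unmatchedX, remainingY) x =
        match tryRemoveFirst x remainingY with
        | Some ys' -> (x :: matched, unmatchedX, ys')
        | None -> (matched, x :: unmatchedX, remainingY)
    let matched, unmatchedX, unmatchedY = List.fold step ([], [], ys) xs
    (List.rev matched, List.rev unmatchedX, unmatchedY)

let result = matchUp [1; 2; 2; 3; 3] [5; 4; 4; 3; 2; 2]
// ([2; 2; 3], [1; 3], [5; 4; 4])
```

Unlike the sorted version, this keeps the leftover elements of ys in their original order, which is why the last component is [5; 4; 4] rather than [4; 4; 5].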

If you need a quick and dirty bag module, I would suggest the following module signature:

    type Bag<'T> = Bag of 'T list

    module Bag =
        val count : 'T -> Bag<'T> -> int
        val insert : 'T -> Bag<'T> -> Bag<'T>
        val intersect : Bag<'T> -> Bag<'T> -> Bag<'T>
        val union : Bag<'T> -> Bag<'T> -> Bag<'T>
        val difference : Bag<'T> -> Bag<'T> -> Bag<'T>
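One possible implementation of that signature is sketched below. It rests on two assumptions that the signature itself does not state: the wrapped list is always kept sorted in ascending order, and union means the multiset union that keeps the larger of the two counts (not the sum). The ofSeq function is an extra helper added for illustration.

```fsharp
// Assumed invariant: the wrapped list is sorted in ascending order.
type Bag<'T> = Bag of 'T list

module Bag =
    // Extra helper (not in the suggested signature): build a bag from any sequence.
    let ofSeq (xs: seq<'T>) = Bag (xs |> Seq.sort |> List.ofSeq)

    // Number of occurrences of `x`.
    let count x (Bag xs) =
        xs |> List.filter (fun y -> y = x) |> List.length

    // Insert `x` while preserving the sort order.
    let insert x (Bag xs) =
        let smaller, rest = List.partition (fun y -> y < x) xs
        Bag (smaller @ (x :: rest))

    // Elements common to both bags, with the smaller multiplicity.
    let intersect (Bag xs) (Bag ys) =
        let rec loop acc xs ys =
            match xs, ys with
            | x :: xs', y :: ys' when x = y -> loop (x :: acc) xs' ys'
            | x :: xs', y :: _ when x < y -> loop acc xs' ys
            | _ :: _, _ :: ys' -> loop acc xs ys'
            | _ -> List.rev acc
        Bag (loop [] xs ys)

    // Assumption: union keeps the larger multiplicity of each element.
    let union (Bag xs) (Bag ys) =
        let rec loop acc xs ys =
            match xs, ys with
            | x :: xs', y :: ys' when x = y -> loop (x :: acc) xs' ys'
            | x :: xs', y :: _ when x < y -> loop (x :: acc) xs' ys
            | _ :: _, y :: ys' -> loop (y :: acc) xs ys'
            | rest, [] -> List.rev acc @ rest
            | [], rest -> List.rev acc @ rest
        Bag (loop [] xs ys)

    // Elements of the first bag that are not cancelled by one in the second.
    let difference (Bag xs) (Bag ys) =
        let rec loop acc xs ys =
            match xs, ys with
            | x :: xs', y :: ys' when x = y -> loop acc xs' ys'
            | x :: xs', y :: _ when x < y -> loop (x :: acc) xs' ys
            | _ :: _, _ :: ys' -> loop acc xs ys'
            | rest, _ -> List.rev acc @ rest
        Bag (loop [] xs ys)

let bx = Bag.ofSeq [1; 2; 2; 3; 3]
let by = Bag.ofSeq [5; 4; 4; 3; 2; 2]
let matchedBag = Bag.intersect bx by   // Bag [2; 2; 3]
let onlyX = Bag.difference bx by       // Bag [1; 3]
let onlyY = Bag.difference by bx       // Bag [4; 4; 5]
```

With the sorted-list invariant, intersect, union and difference are each a single linear merge pass; insert and count are linear as well, which is acceptable for a quick and dirty module.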

+3

pad Jun 07 '13 at 23:31


kqr · Accepted Answer · 2013-06-07T23:40:11+0000

This can be solved easily with the right data structures, but in case you want to do it manually, here is how I would do it in Haskell. I don't know F# well enough to translate this, but I hope it is similar enough. So, here it is in (semi-)literate Haskell.

 import Data.List (sort)

 overlap xs ys =

I start by sorting the two sequences, which removes the need to keep track of previously seen values.

  go (sort xs) (sort ys) where

The two base cases of the recursion are simple to handle: if either list is empty, the whole of the other list goes into the elements that do not overlap.

  go xs [] = ([], (xs, []))
  go [] ys = ([], ([], ys))

Then I compare the first items of the two lists. If they are equal, I know the lists overlap at this element, so I add it to the included elements and leave the excluded elements unchanged. I then continue with the rest, recursing on the tails of both lists.

  go (x:xs) (y:ys)
    | x == y = let (included, excluded) = go xs ys
               in (x:included, excluded)

Then comes the interesting part! What I essentially want to know is whether the first element of one list is missing from the other list. Because both lists are sorted, the smaller of the two heads cannot appear anywhere in the other list, so I add it to the corresponding excluded list and continue the search.

    | x < y = let (included, (xex, yex)) = go xs (y:ys)
              in (included, (x:xex, yex))
    | y < x = let (included, (xex, yex)) = go (x:xs) ys
              in (included, (xex, y:yex))

And that is really all there is to it. This seems to work, at least for the example you gave.

 > let (matched, unmatched) = overlap x y
 > matched
 [2,2,3]
 > unmatched
 ([1,3],[4,4,5])
