Use or not use Data.Map

Question

Use or not use Data.Map

I am currently working on the Haskell API. The latter provides some functions that currently take a list of lists as input, i.e. [(String,[(String, Double)])] .

For visualization purposes, here is an example of the list of lists mentioned above:

 [ ("A", [ ("I1", 1), ("I2", 2), ] ), ("B", [ ("I1", 3), ] ) ]

I have defined some private helper functions. One helper function will look for specific records in this list ( Data.List.find = O(n) ); another will make intersections; and another function converts the list above to the following:

 [ ("I1", [ ("A", 1), ("B", 3), ] ), ("I2", [ ("A", 2), ] ) ]

The conversion function uses Data.Map because it offers some functions that simplify the process, such as Data.Map.unionWith and Data.Map.insertWith . Well, since the conversion functions had to be called Data.Map.fromList and Data.Map.toList , I thought it would be nice to have a map map instead of a list of lists from the very beginning. And so I changed my input pattern to fit the card requirements card.

Again, for visualization purposes, here is a list at the top like a map of cards:

 Map.fromList [ ("A", Map.fromList [ ("I1", 1), ("I2", 2), ] ), ("B", Map.fromList [ ("I1", 3), ] ) ]

Thanks to this step, my code lost several lines, and thanks to Data.Map.lookup , finding what I want now only requires O(log n) .

However, I am now asking myself, is this really a good solution? Is map card a way to go? Or should the conversion function work with Data.Map.fromList and Data.Map.toList , and the rest should work with a list of lists? Or better yet, is there a data structure that is more suitable for this kind of work?

I look forward to your answers.

+4

data-structures haskell

Giuseppe accaputo Jan 03 '11 at 19:02

source share

3 answers

It smells like graphs and edges. One slightly different approach, which may or may not work, is to rework your problem, so instead of [(String,[(String,Double)])] you are just working with two lines of strings. Then you have [((String, String), Double)] , and the resulting map is of type Data.Map.Map (String, String) Double .

Alternatively, if the space for string keys is limited and thus can be effectively displayed in machine ints, consider using IntMap. The same semantics as the card, except that the keys MUST be machine ints (Int32 or Int64). Will have much better performance.

+4

mxc Jan 4 '11 at 12:26

source share

Of course, it depends on your actual data, but maybe you could use Multimap? There are implementations floating around (e.g. http://hackage.haskell.org/packages/archive/Holumbus-Distribution/0.0.1.1/doc/html/Holumbus-Data-MultiMap.html ), but I have not tried them.

+2

Landei Jan 03 '11 at 21:42

source share

rampion · Accepted Answer · 2011-01-03T20:57:08+0000

Initializing map cards still only takes O(n) .

Consider the list of lists first.

Let's say the external list is [a ₁ , a ₂ , ..., a _p ], and each inner element is a _j = (l _j , [b ₀ , b ₁ , ..., b _{q _j} ]). Then building a list of lists takes O (n = & sum; _{j = 1} ^p q _j ).

The initialization of the internal map takes m _j . = O (q _j ). Initializing map cards takes O (& _j ; _{= 1} ^p m _j ) = O (n).

Use or not use Data.Map

More articles: