Haskell data structure with optimized access methods

I want to create data types for accessing and modifying a simple music library consisting of albums consisting of tracks. For a basic idea, consider the following:

data MusicCollection = MC { albums :: Seq Album } data Album = Album { albumTitle :: String, tracks :: Seq Track } data Track = Track { trackTitle :: String, tempo :: Tempo } data Tempo = Unknown | BPM Int 

In addition to the pace, there may be other attributes such as style or rating.

The above solution gives me quick access to random albums. In addition, I would like to have quick access to all tracks faster than a certain pace. And again, it would be nice to have quick random access to the returned tracks:

 fasterThan :: Int -> MusicCollection -> SomeRandomAccessCollection Track 

Updating a track in a collection should also not be too slow.

Question:

I do not know whether it is better to add a Map Tempo (Seq Track) to the MusicCollection or whether relational databases can be placed in any way. Perhaps there are completely different solutions?

I currently do not want to use the database, but it would be interesting to know when to use them in desktop applications.

+6
source share
3 answers

Yes, Map Tempo (Seq Track) is a good choice here, especially since its Ord based structure allows you to efficiently make queries, such as "all tracks with a tempo greater than n"; Wed Data.Map.split .

This is mainly an imitation of relational databases using an index.

You may also be interested in IxSet and HiggsSet (the alleged successor to IxSet by another author), which are designed to increase the set of structures with different indices, especially in tandem with a transactional serialization system, such as acid-state . They support convenient comparative queries such as more, less, etc. If you only have one or two indexes, chances are you just need to collapse your own.

+3
source

If you don't want to use databases, you pretty much need to make a Map for each relationship that you want to have quick access to, as you describe yourself.

Please note that Data.Map.Map in the containers package uses ordered keys, so you can, for example, use Data.Map.split to get sections of the map that are fast enough, which is useful for finding “tracks faster than a given tempo” (provided that you get Ord for Tempo ). Therefore, if you do not need ordered keys for the relationship, you should use Data.HashMap.HashMap from unordered-containers , which is faster to implement.

You can also use relational databases in memory without additional dependencies. For example, you can use persistent and Sqlite as a backend. How to do this is described in a permanent textbook .

+6
source

It's good that you mention databases, because relational databases are designed specifically to solve such problems. A classic example from database theory is the problem with facts / objects, the problem of which is almost certain here: the account has separate elements, but sometimes we want to write reports that examine the elements without linking them to their accounts. The database or data structure that makes us go through invoices to get items makes us do too much work in these cases.

So one of your alternatives here will only be to use a relational database (there are lightweight built-in ones like SQLite, which might be the most suitable). Another alternative is to create a more relational-like design and model the problem as multiple relationships that can be related to each other in several ways - possibly using an index that speeds up the connection.

So, you can start with something like this, where the tracks are available at the top level of the collection (and therefore it’s easy to just search the tracks):

 data MusicCollection = MC { tracks :: Set Track , albums :: Set Album , albumToTracks :: Album -> [Track] } data Track = Track { album :: Album, trackTitle :: String, tempo :: Maybe Int } deriving (Eq, Ord) data Album = Album { albumTitle :: String } deriving (Eq, Ord) 

When you built the MusicCollection value, you will need to make sure that you are creating the albumToTracks function, which allows you to quickly get album tracks. (Or you can use Map Album [Track] instead of a function, the function is a bit more general.)

Another alternative that I have not developed, but perhaps worth paying attention to: a circular data structure in which tracks and albums refer to each other. You would have these data declarations:

 data MusicCollection = MC { tracks :: Set Track, albums :: Set Album } data Track = Track { album :: Album, trackTitle :: String, tempo :: Maybe Int } deriving (Eq, Ord) data Album = Album { albumTitle :: String, tracks :: Set Track } deriving (Eq, Ord) 

The question here, of course, is whether it is possible to “link the node” when you are building this data structure so that it can be finally presented. If you can do it, well, that would be great ...

+3
source

Source: https://habr.com/ru/post/904189/


All Articles