How to group nested collections based on specified criteria?

How can I group nested collections by column values that are supplied dynamically? For example, given the following nested collection, how can I group it by the values in the first and second columns?

[ ["A" 2011 "Dan"] ["A" 2011 "Jon"] ["A" 2010 "Tim"] ["B" 2009 "Tom"] ]

Desired result:

    {A {2011 [['A', 2011, 'Dan'] ['A', 2011, 'Jon']]
        2010 [['A', 2010, 'Tim']]}
     B {2009 [['B', 2009, 'Tom']]}}

The following is my solution, which almost works:

    (defn nest [data criteria]
      (if (empty? criteria)
        data
        (for [[k v] (group-by #(nth % (-> criteria vals first)) data)]
          (hash-map k (nest v (rest criteria))))))
5 answers

Here is the solution I came up with. It works, but I'm sure it can be improved.

    (defn nest [data criteria]
      (if (empty? criteria)
        data
        ;; into {} merges the single-entry maps produced by for
        (into {}
              (for [[k v] (group-by #(nth % (-> criteria vals first)) data)]
                (hash-map k (nest v (rest criteria)))))))
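For illustration, a call might look like the sketch below. It assumes `criteria` is a map whose vals are column indices, which is what `(-> criteria vals first)` suggests; the keys `:letter` and `:year` are made-up labels, and `array-map` is used only to make the order of the criteria explicit. The `nest` function is repeated so the snippet runs on its own.

```clojure
;; Copy of the nest function above, repeated so this snippet is self-contained.
(defn nest [data criteria]
  (if (empty? criteria)
    data
    (into {}
          (for [[k v] (group-by #(nth % (-> criteria vals first)) data)]
            (hash-map k (nest v (rest criteria)))))))

(def rows [["A" 2011 "Dan"] ["A" 2011 "Jon"] ["A" 2010 "Tim"] ["B" 2009 "Tom"]])

;; Assumption: hypothetical labels mapped to column indices, in grouping order.
(def criteria (array-map :letter 0 :year 1))

(nest rows criteria)
;; => {"A" {2011 [["A" 2011 "Dan"] ["A" 2011 "Jon"]], 2010 [["A" 2010 "Tim"]]},
;;     "B" {2009 [["B" 2009 "Tom"]]}}
```

Only the vals of `criteria` are used, so a plain vector of indices would work just as well if the function were adapted accordingly.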

I came up with the following:

    user=> (def a [["A" 2011 "Dan"]
                   ["A" 2011 "Jon"]
                   ["A" 2010 "Tim"]
                   ["B" 2009 "Tom"]])
    user=> (into {} (for [[k v] (group-by first a)] [k (group-by second v)]))
    {"A" {2011 [["A" 2011 "Dan"] ["A" 2011 "Jon"]], 2010 [["A" 2010 "Tim"]]}, "B" {2009 [["B" 2009 "Tom"]]}}

group-by Generalization

I needed a generalization of group-by that produces maps of maps nested more than two levels deep. I wanted to be able to give such a function a list of arbitrary functions to run recursively through group-by. Here is what I came up with:

    (defn map-function-on-map-vals
      "Take a map and apply a function on its values. From [1].
       [1] http://stackoverflow.com/a/1677069/500207"
      [m f]
      (zipmap (keys m) (map f (vals m))))

    (defn nested-group-by
      "Like group-by but instead of a single function, this is given a list or vec
       of functions to apply recursively via group-by. An optional `final-fn`
       argument (defaults to identity) may be given to run on the vector result
       of the final group-by."
      [fs coll & [final-fn]]
      (if (empty? fs)
        ((or final-fn identity) coll)
        (map-function-on-map-vals (group-by (first fs) coll)
                                  #(nested-group-by (rest fs) % final-fn))))

Your example

Applied to your dataset:

    cljs.user=> (def foo [ ["A" 2011 "Dan"]
           #_=>            ["A" 2011 "Jon"]
           #_=>            ["A" 2010 "Tim"]
           #_=>            ["B" 2009 "Tom"] ])
    cljs.user=> (require '[cljs.pprint :refer [pprint]])
    nil
    cljs.user=> (pprint (nested-group-by [first second] foo))
    {"A"
     {2011 [["A" 2011 "Dan"] ["A" 2011 "Jon"]],
      2010 [["A" 2010 "Tim"]]},
     "B" {2009 [["B" 2009 "Tom"]]}}

This produces exactly the desired output. nested-group-by can take three, four, or more functions and creates that many levels of nested hash maps. It may be useful to others.
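As a sketch of the deeper nesting mentioned above, the following groups by three functions. The two definitions are repeated (without docstrings) so the snippet runs on its own; the third key function, `name-initial`, is a made-up criterion that is not part of the original answer.

```clojure
(defn map-function-on-map-vals [m f]
  (zipmap (keys m) (map f (vals m))))

(defn nested-group-by [fs coll & [final-fn]]
  (if (empty? fs)
    ((or final-fn identity) coll)
    (map-function-on-map-vals (group-by (first fs) coll)
                              #(nested-group-by (rest fs) % final-fn))))

(def foo [["A" 2011 "Dan"] ["A" 2011 "Jon"] ["A" 2010 "Tim"] ["B" 2009 "Tom"]])

;; Hypothetical third criterion: the first character of the name column.
(defn name-initial [row]
  (first (nth row 2)))

(nested-group-by [first second name-initial] foo)
;; => {"A" {2011 {\D [["A" 2011 "Dan"]], \J [["A" 2011 "Jon"]]},
;;          2010 {\T [["A" 2010 "Tim"]]}},
;;     "B" {2009 {\T [["B" 2009 "Tom"]]}}}
```

Each extra function in the vector adds one more level of maps, and the rows themselves only appear at the innermost level.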

Handy extra

nested-group-by also has a handy extra argument, final-fn, which defaults to identity. If you do not specify it, the deepest level of nesting holds a vector of rows; if you do provide final-fn, it is run on each of those innermost vectors. To illustrate, suppose you just want to know how many rows of the original dataset fall under each category and year:

    cljs.user=> (nested-group-by [first second] foo count)
                                                   #^^^^^ this is final-fn
    {"A" {2011 2, 2010 1}, "B" {2009 1}}

Caveat

This function does not use recur, so deeply recursive calls could blow the stack. The recursion is only as deep as the number of functions supplied, though, so for the intended use case with a handful of functions this should not be a problem.


I suspect the most idiomatic version of this is:

    (defn nest-by [ks coll]
      (let [keyfn (apply juxt ks)]
        (reduce (fn [m x] (update-in m (keyfn x) (fnil conj []) x)) {} coll)))

This exploits the fact that update-in already does most of what you want. In your particular case, you would just go:

    (nest-by [first second] [["A" 2011 "Dan"]
                             ["A" 2011 "Jon"]
                             ["A" 2010 "Tim"]
                             ["B" 2009 "Tom"]])

    {"A" {2011 [["A" 2011 "Dan"] ["A" 2011 "Jon"]], 2010 [["A" 2010 "Tim"]]}, "B" {2009 [["B" 2009 "Tom"]]}}
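To see why update-in does most of the work, here is a sketch of two reduction steps done by hand. `juxt` turns the key functions into one function that returns the key path for a row, and `update-in` with `(fnil conj [])` creates any missing intermediate maps and starts a fresh vector when nothing is stored at that path yet. The names `row`, `step1`, and `step2` are only for illustration.

```clojure
(def row ["A" 2011 "Dan"])

;; juxt builds the key path for one row.
((juxt first second) row)
;; => ["A" 2011]

;; update-in creates the nested maps along the path; fnil substitutes []
;; for the nil that is found there the first time.
(def step1 (update-in {} ["A" 2011] (fnil conj []) row))
;; step1 => {"A" {2011 [["A" 2011 "Dan"]]}}

;; The second row with the same path is conj'ed onto the existing vector.
(def step2 (update-in step1 ["A" 2011] (fnil conj []) ["A" 2011 "Jon"]))
;; step2 => {"A" {2011 [["A" 2011 "Dan"] ["A" 2011 "Jon"]]}}
```

The reduce in nest-by simply repeats this step for every row, starting from an empty map.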

This is pretty close.

    (defn my-group [coll]
      (let [m (group-by #(-> % val first first)
                        (group-by #(second %) coll))]
        (into {} (for [[k v] m] [k (#(into {} %) v)]))))

    (my-group [["A" 2011 "Dan"] ["A" 2011 "Jon"] ["A" 2010 "Tim"] ["B" 2009 "Tom"]])

    {"A" {2011 [["A" 2011 "Dan"] ["A" 2011 "Jon"]],
          2010 [["A" 2010 "Tim"]]},
     "B" {2009 [["B" 2009 "Tom"]]}}

As usual with Clojure, you can probably find something a little less verbose.


Source: https://habr.com/ru/post/898712/

