Is it possible to group by several dimensions in crossfilter?

For example, suppose we have data for books, authors, and dates. Can we use crossfilter to count how many books each author has per month?

3 answers

In pseudo-sql terms, you are trying to do the following:

SELECT COUNT(book) GROUP BY author, month 

The way I approach this problem is to combine the fields into a single dimension. In your case, I would combine the month and the author into one dimension key.

Let this be our test data:

    var cf = crossfilter([
        { date: "1 jan 2014", author: "Mr X", book: "Book 1" },
        { date: "2 jan 2014", author: "Mr X", book: "Book 2" },
        { date: "3 feb 2014", author: "Mr X", book: "Book 3" },
        { date: "1 mar 2014", author: "Mr X", book: "Book 4" },
        { date: "2 apr 2014", author: "Mr X", book: "Book 5" },
        { date: "3 apr 2014", author: "Mr X", book: "Book 6" },
        { date: "1 jan 2014", author: "Ms Y", book: "Book 7" },
        { date: "2 jan 2014", author: "Ms Y", book: "Book 8" },
        { date: "3 jan 2014", author: "Ms Y", book: "Book 9" },
        { date: "1 mar 2014", author: "Ms Y", book: "Book 10" },
        { date: "2 mar 2014", author: "Ms Y", book: "Book 11" },
        { date: "3 mar 2014", author: "Ms Y", book: "Book 12" },
        { date: "4 apr 2014", author: "Ms Y", book: "Book 13" }
    ]);

The dimension is defined as follows:

    var dimensionMonthAuthor = cf.dimension(function (d) {
        var thisDate = new Date(d.date);
        // Combine month and author into one string key.
        return 'month=' + thisDate.getMonth() + ';author=' + d.author;
    });

Now we can simply use reduceCount to count how many books there are per author, per month (i.e. per key of the combined dimension):

    // reduceCount() takes no arguments; it counts the records in each group.
    var monthAuthorCount = dimensionMonthAuthor.group().reduceCount().all();

And the results are as follows:

    {"key":"month=0;author=Mr X","value":2}
    {"key":"month=0;author=Ms Y","value":3}
    {"key":"month=1;author=Mr X","value":1}
    {"key":"month=2;author=Mr X","value":1}
    {"key":"month=2;author=Ms Y","value":3}
    {"key":"month=3;author=Mr X","value":2}
    {"key":"month=3;author=Ms Y","value":1}
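Because the combined key is a plain string, downstream code usually has to split it back into its parts. The following is a minimal sketch of one way to do that, assuming author names never contain `;`. The helper name `parseCompositeKey` is hypothetical and not part of crossfilter; the sample rows are copied from the results above.

```javascript
// Two rows in the shape returned by group.all() above.
var rows = [
  { key: "month=0;author=Mr X", value: 2 },
  { key: "month=0;author=Ms Y", value: 3 }
];

// Hypothetical helper: split "month=0;author=Mr X" into its fields.
// Assumes ';' separates pairs and the first '=' separates name from value.
function parseCompositeKey(key) {
  var parts = {};
  key.split(";").forEach(function (pair) {
    var idx = pair.indexOf("=");
    parts[pair.slice(0, idx)] = pair.slice(idx + 1);
  });
  return { month: Number(parts.month), author: parts.author };
}

// Flatten each row into { month, author, count }.
var parsed = rows.map(function (row) {
  var k = parseCompositeKey(row.key);
  return { month: k.month, author: k.author, count: row.value };
});
```

If author names could contain the separator, a structured encoding of the key would be the safer choice.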

I did not find the accepted answer all that useful.

Instead, I used the following.

First I made a dimension keyed on one field (in your case, the month):

    var authors = cf.dimension(function (d) { return +d['month']; });

Then I used the group's reduce method to aggregate the values for each key.

Grouping call:

 var monthsAvg = authors.group().reduce(reduceAddbooks, reduceRemovebooks, reduceInitialbooks).all(); 

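Reduce functions:

    // Called when a record enters the group: accumulate its books.
    function reduceAddbooks(p, v) {
        p.author = v['author'];
        p.books += +v['books'];
        return p;
    }
    // Called when a record is filtered out: undo its contribution.
    function reduceRemovebooks(p, v) {
        p.author = v['author'];
        p.books -= +v['books'];
        return p;
    }
    // Starting value for every group.
    function reduceInitialbooks() {
        return { author: 0, books: 0 };
    }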
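The add/remove/initial contract can be illustrated without crossfilter itself. This is a minimal sketch, assuming records carry a numeric `month` and `books` field as in the answer above. The sample records and the names `addBooks`, `removeBooks`, and `initBooks` are made up for illustration, and the loop only imitates what a group does internally.

```javascript
// Hypothetical reducers following crossfilter's (p, v) convention:
// p is the running aggregate for one group, v is the record entering or leaving.
function addBooks(p, v) { p.count += 1; p.books += +v.books; return p; }
function removeBooks(p, v) { p.count -= 1; p.books -= +v.books; return p; }
function initBooks() { return { count: 0, books: 0 }; }

// Made-up sample records.
var records = [
  { month: 0, books: 2 },
  { month: 0, books: 3 },
  { month: 1, books: 5 }
];

// Build one aggregate per month by hand, the way a group would.
var totals = {};
records.forEach(function (r) {
  if (!totals[r.month]) totals[r.month] = initBooks();
  addBooks(totals[r.month], r);
});

// Simulate a filter that excludes the second record.
removeBooks(totals[0], records[1]);
```

The remove function has to reverse exactly what the add function did; otherwise the aggregates drift as filters change.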

I want to update the old answer with a newer workaround described in: https://github.com/dc-js/dc.js/pull/91

The performance of this approach has not been tested on large datasets.

    var cf = crossfilter([
        { date: "1 jan 2014", author: "Mr X", book: "Book 1" },
        { date: "2 jan 2014", author: "Mr X", book: "Book 2" },
        { date: "3 feb 2014", author: "Mr X", book: "Book 3" },
        { date: "1 mar 2014", author: "Mr X", book: "Book 4" },
        { date: "2 apr 2014", author: "Mr X", book: "Book 5" },
        { date: "3 apr 2014", author: "Mr X", book: "Book 6" },
        { date: "1 jan 2014", author: "Ms Y", book: "Book 7" },
        { date: "2 jan 2014", author: "Ms Y", book: "Book 8" },
        { date: "3 jan 2014", author: "Ms Y", book: "Book 9" },
        { date: "1 mar 2014", author: "Ms Y", book: "Book 10" },
        { date: "2 mar 2014", author: "Ms Y", book: "Book 11" },
        { date: "3 mar 2014", author: "Ms Y", book: "Book 12" },
        { date: "4 apr 2014", author: "Ms Y", book: "Book 13" }
    ]);

    var dimensionMonthAuthor = cf.dimension(function (d) {
        var thisDate = new Date(d.date);
        // stringify() here and parse() later to get keyed objects
        return JSON.stringify({ date: thisDate.getMonth(), author: d.author });
    });

    var group = dimensionMonthAuthor.group();

    // this forEach method could be very expensive on write.
    group.all().forEach(function (d) {
        // parse the JSON string created above
        d.key = JSON.parse(d.key);
    });

    // group.all() now returns entries whose keys are objects
    group.all();

Results in:

    [ { key: { date: 0, author: 'Mr X' }, value: 2 },
      { key: { date: 0, author: 'Ms Y' }, value: 3 },
      { key: { date: 1, author: 'Mr X' }, value: 1 },
      { key: { date: 2, author: 'Mr X' }, value: 1 },
      { key: { date: 2, author: 'Ms Y' }, value: 3 },
      { key: { date: 3, author: 'Mr X' }, value: 2 },
      { key: { date: 3, author: 'Ms Y' }, value: 1 } ]
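With object keys, the flat result can be pivoted into a nested lookup, which is often closer to what "books per author per month" means in practice. This is only a sketch over the result array shown above; the variable name `byAuthor` is an assumption, not something from the linked workaround.

```javascript
// The result array from above, copied verbatim.
var results = [
  { key: { date: 0, author: 'Mr X' }, value: 2 },
  { key: { date: 0, author: 'Ms Y' }, value: 3 },
  { key: { date: 1, author: 'Mr X' }, value: 1 },
  { key: { date: 2, author: 'Mr X' }, value: 1 },
  { key: { date: 2, author: 'Ms Y' }, value: 3 },
  { key: { date: 3, author: 'Mr X' }, value: 2 },
  { key: { date: 3, author: 'Ms Y' }, value: 1 }
];

// Pivot into byAuthor[author][month] = count.
var byAuthor = {};
results.forEach(function (row) {
  var author = row.key.author;
  if (!byAuthor[author]) byAuthor[author] = {};
  byAuthor[author][row.key.date] = row.value;
});
```

Months with no books for an author (for example month 1 for Ms Y) are simply absent from the lookup, so callers may want to default missing entries to 0.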

Source: https://habr.com/ru/post/945911/

