RavenDB: querying on a field that is aggregated in the Reduce function

I have a set of documents that represent work items:

    public class WorkItem
    {
        public string Id { get; set; }
        public string DocumentId { get; set; }
        public string FieldId { get; set; }
        public bool IsValidated { get; set; }
    }

    public class ExtractionUser
    {
        public string Id { get; set; }
        public string Name { get; set; }
        public string[] AssignedFields { get; set; }
    }

Each user has access to a set of FieldIds (AssignedFields). I need to query WorkItems based on this set of fields and get the status for each document:

    public class UserWorkItems
    {
        public string DocumentId { get; set; }
        public int Validated { get; set; }
        public int Total { get; set; }
    }

The query I want to express is the following:

    using (var session = RavenDb.OpenSession())
    {
        string[] userFields = session.Load<ExtractionUser>("users/1").AssignedFields;

        session.Query<WorkItem>()
            .Where(w => w.FieldId.In(userFields))
            .GroupBy(w => w.DocumentId)
            .Select(g => new
            {
                DocumentId = g.Key,
                Validated = g.Where(w => w.IsValidated).Count(),
                Total = g.Count()
            })
            .Skip(page * perPage)
            .Take(perPage)
            .ToArray();
    }

I tried to create a Map/Reduce index, but the main problem was that I need to filter on FieldId, which is not part of the Reduce output, since it is aggregated away.
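To make the obstacle concrete, here is a small in-memory sketch using plain LINQ to Objects, not a RavenDB index. `DocumentTotals` and `NaiveReduce` are hypothetical names introduced for illustration. Grouping by DocumentId alone, the way such a Map/Reduce would, produces rows that no longer carry a FieldId to filter on.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class WorkItem
{
    public string Id { get; set; }
    public string DocumentId { get; set; }
    public string FieldId { get; set; }
    public bool IsValidated { get; set; }
}

// Hypothetical shape of a reduce output grouped by DocumentId only.
public class DocumentTotals
{
    public string DocumentId { get; set; }
    public int Validated { get; set; }
    public int Total { get; set; }
}

public static class NaiveReduce
{
    // In-memory stand-in for a reduce keyed on DocumentId.
    // Note that FieldId does not survive the grouping.
    public static List<DocumentTotals> Run(IEnumerable<WorkItem> items)
    {
        return items
            .GroupBy(w => w.DocumentId)
            .Select(g => new DocumentTotals
            {
                DocumentId = g.Key,
                Validated = g.Count(w => w.IsValidated),
                Total = g.Count()
            })
            .ToList();
    }
}
```

Once the rows are collapsed like this, the counts for a user who may only see some of a document's fields cannot be recovered from the output.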

I also tried a simple Map index on FieldId for the query part, with TransformResults to perform the GroupBy. But since paging is applied before TransformResults, the pages and totals reflect the documents before grouping, which is not what I need.

Then I tried a Multi Map index, which maps users with their collection of fields and also maps work items with their field, and then tries to reduce the result to what I want. I created a gist with the index definition. The Reduce part includes a group by field, followed by several SelectMany calls and a final GroupBy and Select. The index was accepted by Raven, but it does not return any results. I'm a bit stuck with the Multi Map index, because I don't know how I could debug it.

I guess my problem ultimately reduces (pun intended) to querying on a "reduced" field?

Any ideas how I could achieve such functionality? Are there any other features I could explore besides Map / MultiMap / Reduce / TransformResults?

UPDATE: while reading Ayende's post on Map/Reduce, I realized that I am looking at this the wrong way. Still looking for a solution...

UPDATE 2: after a bit more research, I ended up with this index, which looks the way I want but returns no data (the index was defined directly in the studio):

Map:

    from user in docs
    where user["@metadata"]["Raven-Entity-Name"] == "ExtractionUsers"
    from field in user.AssignedFields
    from item in docs
    where item["@metadata"]["Raven-Entity-Name"] == "WorkItems" && item.FieldId == field
    select new
    {
        UserId = user.Id,
        DocumentId = item.DocumentId,
        Validated = item.Status == "Validated" ? 1 : 0,
        Count = 1
    }

Reduce:

    from r in results
    group r by new { r.UserId, r.DocumentId } into g
    select new
    {
        UserId = g.Key.UserId,
        DocumentId = g.Key.DocumentId,
        Validated = g.Sum(d => d.Validated),
        Count = g.Sum(d => d.Count),
    }

The idea is to map over all documents and, inside the index, link users to WorkItems through their assigned fields.

2 answers

A week later, I managed to solve the problem. I took a slightly different (less relational) approach, which is simple and seems to work fine. Here are the details if anyone else has such problems:

I group WorkItems by DocumentId and collect the validated and non-validated field ids into two collections. The Map/Reduce result looks like this:

    public class Result
    {
        public string DocumentId { get; set; }
        public string[] ValidatedFields { get; set; }
        public string[] ReadyFields { get; set; }
    }

The Map function is as follows:

    Map = items => items.Select(i => new
    {
        DocumentId = i.DocumentId,
        ValidatedFields = i.IsValidated ? new string[] { i.FieldId } : new string[0],
        ReadyFields = !i.IsValidated ? new string[] { i.FieldId } : new string[0]
    });

And Reduce:

    Reduce = result => result
        .GroupBy(i => i.DocumentId)
        .Select(g => new
        {
            DocumentId = g.Key,
            ValidatedFields = g.SelectMany(i => i.ValidatedFields),
            ReadyFields = g.SelectMany(i => i.ReadyFields)
        });

To query the index, I now use the following expression:

    ExtractionUser user = session.Load<ExtractionUser>("users/1");

    var result = session.Query<WorkItem, UserWorkItemIndex>()
        .As<UserWorkItemIndex.Result>()
        .Where(d => d.ValidatedFields.Any(f => f.In(user.AssignedFields)))
        .ToArray();

The only thing left to do on the client side is to count only the fields that belong to the user.
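A minimal sketch of that client-side step, assuming the index results have already been loaded into memory. `Result` mirrors the reduce result above; `UserStatus`, `StatusCounter`, and `CountForUser` are hypothetical names introduced for illustration, and the code uses plain LINQ to Objects rather than any RavenDB API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Mirrors the reduce result of the index.
public class Result
{
    public string DocumentId { get; set; }
    public string[] ValidatedFields { get; set; }
    public string[] ReadyFields { get; set; }
}

// Hypothetical per-document status for a single user.
public class UserStatus
{
    public string DocumentId { get; set; }
    public int Validated { get; set; }
    public int Total { get; set; }
}

public static class StatusCounter
{
    // Counts only the fields that are assigned to the user.
    public static List<UserStatus> CountForUser(
        IEnumerable<Result> results, IEnumerable<string> assignedFields)
    {
        var assigned = new HashSet<string>(assignedFields);

        return results
            .Select(r =>
            {
                int validated = r.ValidatedFields.Count(f => assigned.Contains(f));
                int ready = r.ReadyFields.Count(f => assigned.Contains(f));
                return new UserStatus
                {
                    DocumentId = r.DocumentId,
                    Validated = validated,
                    Total = validated + ready
                };
            })
            // Assumption: documents with no fields for this user are dropped.
            .Where(s => s.Total > 0)
            .ToList();
    }
}
```

Calling `StatusCounter.CountForUser(result, user.AssignedFields)` on the array returned by the query would then give one `UserStatus` per document, restricted to that user's fields.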

There is also a gist with a solution.


First of all, a disclaimer: I have never worked with RavenDB on a real system, but I have read some articles, watched a few videos, and really like the idea behind it. I treated this problem as an interesting exercise, so this approach may not be ideal; comments and improvements are welcome.

My idea is to create an index on the WorkItems collection that includes these fields:

  • DocumentId (due to the fact that we will group at the end)
  • FieldId (because we will filter this field)
  • ValidatedCount (the number of records that have IsValidated = true)
  • TotalCount

After creating this index, we can query it using the .Where(x => x.FieldId.In(userFields)) filter and get back a set of results with the structure described above.

To get the final result, we need to perform one more grouping, by DocumentId, over these results.

The code I came up with is this:

Index definition:

    public class WorkItems_ValidationStatistics
        : AbstractIndexCreationTask<WorkItem, WorkItems_ValidationStatistics.ReduceResult>
    {
        public class ReduceResult
        {
            public string DocumentId { get; set; }
            public string FieldId { get; set; }
            public int ValidatedCount { get; set; }
            public int TotalCount { get; set; }
        }

        public WorkItems_ValidationStatistics()
        {
            Map = workItems =>
                from workItem in workItems
                select new
                {
                    workItem.DocumentId,
                    workItem.FieldId,
                    ValidatedCount = workItem.IsValidated ? 1 : 0,
                    TotalCount = 1
                };

            Reduce = results =>
                from result in results
                group result by new { result.FieldId, result.DocumentId } into g
                select new
                {
                    g.Key.DocumentId,
                    g.Key.FieldId,
                    ValidatedCount = g.Sum(x => x.ValidatedCount),
                    TotalCount = g.Sum(x => x.TotalCount)
                };
        }
    }

Code for creating an index in the database:

    public void CreateIndex()
    {
        using (var store = CreateDocumentStore())
        {
            IndexCreation.CreateIndexes(
                typeof(WorkItems_ValidationStatistics).Assembly, store);
        }
    }

Note: alternatively, you can create the index directly in RavenDB Management Studio.

The code that queries the index and performs the final aggregation:

    public void GetWorkItemStatisticsGroupedByDocumentId()
    {
        using (var store = CreateDocumentStore())
        using (var documentSession = store.OpenSession())
        {
            var userFields = new[] { "fields/11", "fields/13" };

            // Filter on FieldId on the server, using the reduced index entries.
            var items = documentSession
                .Query<WorkItems_ValidationStatistics.ReduceResult, WorkItems_ValidationStatistics>()
                .Where(x => x.FieldId.In(userFields))
                .ToList();

            // Final aggregation by DocumentId on the client.
            var results = items
                .GroupBy(x => x.DocumentId)
                .Select(g => new
                {
                    DocumentId = g.Key,
                    ValidatedCount = g.Sum(x => x.ValidatedCount),
                    TotalCount = g.Sum(x => x.TotalCount)
                });

            foreach (var r in results)
            {
                Console.WriteLine("DocId={0}: validated: {1}/{2}",
                    r.DocumentId, r.ValidatedCount, r.TotalCount);
            }
        }
    }

Source: https://habr.com/ru/post/1444322/

