Aggregating Values in a Data Tree Using R

I am trying to roll up hours in a data.tree structure. I can sum the hours of the nodes directly under a parent, but I cannot get the totals to include the hours assigned to the parent nodes themselves. Any suggestions would be great.

This is what I get:

           levelName hours totalhours
1 Ned                   NA          1
2  °--John               1          3
3      °--Kate           1          3
4          ¦--Dan        1          1
5          ¦--Ron        1          1
6          °--Sienna     1          1

This is what I am looking for:

           levelName hours totalHours
1 Ned                   NA          5
2  °--John               1          5
3      °--Kate           1          4
4          ¦--Dan        1          1
5          ¦--Ron        1          1
6          °--Sienna     1          1

Here is my code:

# Install package
install.packages('data.tree')
library(data.tree)

# Create data frame
to <- c("Ned", "John", "Kate", "Kate", "Kate")
from <- c("John", "Kate", "Dan", "Ron", "Sienna")
hours <- c(1,1,1,1,1)
df <- data.frame(from,to,hours)

# Create data tree
tree <- FromDataFrameNetwork(df)
print(tree, "hours")

# Get running total of hours that includes all nodes and children values.
tree$Do(function(x) x$total <- Aggregate(x, "hours", sum), traversal = "post-order")
print(tree, "hours", runningtotal = tree$Get(Aggregate, "total", sum))
3 answers

You can simply use a recursive function:

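myApply <- function(node) {
  # Own hours plus the totals returned by the recursive calls on the children.
  # The assignment is the last expression, so its value is also what the function returns.
  node$totalHours <- 
    sum(c(node$hours, purrr::map_dbl(node$children, myApply)), na.rm = TRUE)
}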
myApply(tree)
print(tree, "hours", "totalHours")

Result:

           levelName hours totalHours
1 Ned                   NA          5
2  °--John               1          5
3      °--Kate           1          4
4          ¦--Dan        1          1
5          ¦--Ron        1          1
6          °--Sienna     1          1
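
If you would rather not depend on purrr, the same recursion can be sketched with base R's vapply. The helper name rollUp is made up for this example; the data and the totalHours field are the ones from the question. Treat it as a minimal sketch, not a tested replacement.

```r
library(data.tree)

# Same example data as in the question
df <- data.frame(
  from  = c("John", "Kate", "Dan", "Ron", "Sienna"),
  to    = c("Ned", "John", "Kate", "Kate", "Kate"),
  hours = c(1, 1, 1, 1, 1)
)
tree <- FromDataFrameNetwork(df)

# rollUp is a hypothetical helper name, not part of data.tree
rollUp <- function(node) {
  # Totals of the child subtrees; numeric(0) for a leaf
  childTotals <- vapply(node$children, rollUp, numeric(1))
  # Own hours (NULL for the root, which sum() ignores) plus the child totals
  node$totalHours <- sum(node$hours, childTotals, na.rm = TRUE)
  node$totalHours
}

rollUp(tree)
print(tree, "hours", "totalHours")
```

The explicit last line makes the return value visible instead of relying on the value of the assignment.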

Edit: Filling two fields at once:

# Create data frame
to <- c("Ned", "John", "Kate", "Kate", "Kate")
from <- c("John", "Kate", "Dan", "Ron", "Sienna")
hours <- c(1,1,1,1,1)
hours2 <- 5:1
df <- data.frame(from,to,hours, hours2)

# Create data tree
tree <- FromDataFrameNetwork(df)
print(tree, "hours", "hours2")

myApply <- function(node) {
  res.ch <- purrr::map(node$children, myApply)
  a <- node$totalHours <- 
    sum(c(node$hours,  purrr::map_dbl(res.ch, 1)), na.rm = TRUE)
  b <- node$totalHours2 <- 
    sum(c(node$hours2, purrr::map_dbl(res.ch, 2)), na.rm = TRUE)
  list(a, b)
}
myApply(tree)
print(tree, "hours", "totalHours", "hours2", "totalHours2")

Result:

           levelName hours totalHours hours2 totalHours2
1 Ned                   NA          5     NA          15
2  °--John               1          5      5          15
3      °--Kate           1          4      4          10
4          ¦--Dan        1          1      3           3
5          ¦--Ron        1          1      2           2
6          °--Sienna     1          1      1           1
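
The two-field version could be generalized to any number of fields by passing the field names in. This is only a sketch: rollUpFields and the "total_" prefix are names invented here, and it assumes every listed field is numeric.

```r
library(data.tree)

df <- data.frame(
  from   = c("John", "Kate", "Dan", "Ron", "Sienna"),
  to     = c("Ned", "John", "Kate", "Kate", "Kate"),
  hours  = c(1, 1, 1, 1, 1),
  hours2 = 5:1
)
tree <- FromDataFrameNetwork(df)

# rollUpFields is a hypothetical helper; it returns a named numeric vector
# with one subtree total per field and stores each total on the node.
rollUpFields <- function(node, fields) {
  # One named vector of totals per child; an empty list for a leaf
  childTotals <- lapply(node$children, rollUpFields, fields = fields)
  totals <- vapply(fields, function(f) {
    fromChildren <- vapply(childTotals, function(ct) ct[[f]], numeric(1))
    # node[[f]] is NULL where the field is missing (the root), which sum() ignores
    sum(node[[f]], fromChildren, na.rm = TRUE)
  }, numeric(1))
  # Store the totals under an assumed naming scheme: total_<field>
  for (f in fields) node[[paste0("total_", f)]] <- totals[[f]]
  totals
}

rollUpFields(tree, c("hours", "hours2"))
print(tree, "hours", "total_hours", "hours2", "total_hours2")
```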

Caching Aggregate over Do seems to work for only one field:

tree$Do(function(node) node$totalHours = node$hours)

tree$Do(function(node) node$totalHours = sum(if(!node$isLeaf) node$totalHours else 0,
                                             Aggregate(node, "totalHours", sum)),
        traversal = "post-order")
print(tree, "hours", "totalHours")
#           levelName hours totalHours
#1 Ned                   NA          5
#2  °--John               1          5
#3      °--Kate           1          4
#4          ¦--Dan        1          1
#5          ¦--Ron        1          1
#6          °--Sienna     1          1

The Aggregate function of the data.tree package is especially useful if you want to summarize data recursively. In your case, there are two things you want to do:

  • Summarize children plus your own value
  • Save the amount in a separate variable

The way to do this is:

library(data.tree)

# Create data frame
to <- c("Ned", "John", "Kate", "Kate", "Kate")
from <- c("John", "Kate", "Dan", "Ron", "Sienna")
hours <- c(1,1,1,1,1)
df <- data.frame(from,to,hours)

# Create data tree
tree <- FromDataFrameNetwork(df)
print(tree, "hours")

# Get running total of hours that includes all nodes and children values.
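tree$Do(function(x) {
  # The root has no hours of its own, so count it as 0
  own <- if (is.null(x$hours)) 0 else x$hours
  # Post-order traversal guarantees the children's totals are already set
  x$total <- own + sum(Get(x$children, "total"))
}, traversal = "post-order")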
print(tree, "hours", "total")
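
As a quick cross-check, each node's total can also be computed directly as the sum of hours over its own subtree, since calling Get on a node collects the field from that node and all its descendants. This sketch visits every subtree separately, so it does more work than the post-order roll-up and is meant for verification only; the field name subtreeTotal is made up here.

```r
library(data.tree)

df <- data.frame(
  from  = c("John", "Kate", "Dan", "Ron", "Sienna"),
  to    = c("Ned", "John", "Kate", "Kate", "Kate"),
  hours = c(1, 1, 1, 1, 1)
)
tree <- FromDataFrameNetwork(df)

# For every node, sum "hours" over the node itself and all its descendants.
# Nodes without hours (the root) come back as NA and are dropped by na.rm.
tree$Do(function(x) x$subtreeTotal <- sum(x$Get("hours"), na.rm = TRUE))
print(tree, "hours", "subtreeTotal")
```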

Source: https://habr.com/ru/post/1681914/

