plyr's summarize only finds globally defined functions

I am trying to pass a function (weight.func) to another function (wrapper) that calls ddply. I want ddply to use weight.func as part of its calculations. I get the output I want when weight.func is defined globally, but not when it is passed to the wrapper as an anonymous function.

Can I get ddply to do what I want? Here is some sample code:

> print(sampleData)
   studentId   problem  part       workerId rating
1       8001 problem26 partA A127R5QI5OGBIK    0.0
2       8001 problem26 partA A1FCLYRBAB430F    0.0
3       8001 problem26 partA A25FZQY34C6RVO    0.0
4       8001 problem26 partA A3G0MO562MHMZ3    0.5
5       8001 problem26 partA A3RB9ZOIUC3NWG    2.0
6       8001 problem26 partB A1FCLYRBAB430F    0.5
7       8001 problem26 partB A1XRDZKSJBWY8Q    0.5
8       8001 problem26 partB A22CRWMZUX7FFR    0.5
9       8001 problem26 partB A25FZQY34C6RVO    1.0
10      8001 problem26 partB A3G0MO562MHMZ3    0.5
11      8001 problem27 partA A1ET309DW6M2XA    2.0
12      8001 problem27 partA A1FCLYRBAB430F    0.0
13      8001 problem27 partA A22CRWMZUX7FFR    0.0
14      8001 problem27 partA A25FZQY34C6RVO    0.0
15      8001 problem27 partA A3G0MO562MHMZ3    0.0
16      8001 problem27 partB A1FCLYRBAB430F    1.0
17      8001 problem27 partB A22CRWMZUX7FFR    0.0
18      8001 problem27 partB A25FZQY34C6RVO    0.0
19      8001 problem27 partB A2U9676210WST5    0.0
20      8001 problem27 partB A3G0MO562MHMZ3    0.0
21      8002 problem26 partA A127R5QI5OGBIK    0.0
22      8002 problem26 partA A1FCLYRBAB430F    0.5
23      8002 problem26 partA A22CRWMZUX7FFR    0.0
24      8002 problem26 partA A25FZQY34C6RVO    2.0
25      8002 problem26 partA A3G0MO562MHMZ3    0.5
26      8002 problem26 partB A17EHJZNJGNRAN    2.0
27      8002 problem26 partB A1FCLYRBAB430F    0.0
28      8002 problem26 partB A2IPRDTE6B4TAB    0.0
29      8002 problem26 partB A3G0MO562MHMZ3    0.0
30      8002 problem26 partB  A6SON3OS15XKA    0.0
31      8002 problem27 partA A1FCLYRBAB430F    0.0
32      8002 problem27 partA A25FZQY34C6RVO    0.0
33      8002 problem27 partA A2IPRDTE6B4TAB    0.0
34      8002 problem27 partA A2U9676210WST5    0.0
35      8002 problem27 partA A3G0MO562MHMZ3    0.0
36      8002 problem27 partB A1FCLYRBAB430F    0.0
37      8002 problem27 partB A1V52SSKROBV8E    2.0
38      8002 problem27 partB A25FZQY34C6RVO    2.0
39      8002 problem27 partB A2IPRDTE6B4TAB    0.0
40      8002 problem27 partB A3G0MO562MHMZ3    0.0
> 
> #Make a wrapper
> wrapper <- function ( ratingData, weight.func ) {
+   print(weight.func) #prove that the function is being passed
+   ddply(ratingData, c('studentId','problem','part'), summarize, 
+           sum.weights = sum ( weight.func(rating)  ))
+ }
> wrapper( sampleData, weight.func=function(x) (x+.001)^-1  )
function(x) (x+.001)^-1
Error in data.frame(sum.weights = sum(weight.func(rating))) : 
  could not find function "weight.func"
> 
> #'globally' declare weight.func
> weight.func <- function(x) (x+.001)^-1
> wrapper( sampleData, weight.func=NULL  )
NULL
  studentId   problem  part sum.weights
1      8001 problem26 partA 3002.495758
2      8001 problem26 partB    8.983033
3      8001 problem27 partA 4000.499750
4      8001 problem27 partB 4000.999001
5      8002 problem26 partA 2004.491766
6      8002 problem26 partB 4000.499750
7      8002 problem27 partA 5000.000000
8      8002 problem27 partB 3000.999500

The second result is the goal. Any help appreciated! (Including a non-plyr-based method for performing the same task.)

The example above is a toy example; it is the simplest case in which I could reproduce the problem.

4 answers

Here is a non-plyr approach using base R's aggregate:

w2 <- function(d, f){
  aggregate(rating~studentId+problem+part, function(x)sum(f(x)), data=d)
}

w2( sampleData, function(x) (x+.001)^-1  )
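To try the aggregate approach without the full data set, here is a minimal sketch. It assumes only the first ten rows of the question's sampleData; the name firstTen is made up for this illustration. The two sums should come out close to the first two rows of the desired output in the question (about 3002.495758 and 8.983033). Note that aggregate names the result column rating rather than sum.weights.

```r
# First ten rows of the question's sampleData (student 8001, problem26).
# "firstTen" is a hypothetical name used only for this sketch.
firstTen <- data.frame(
  studentId = 8001,
  problem   = "problem26",
  part      = rep(c("partA", "partB"), each = 5),
  rating    = c(0, 0, 0, 0.5, 2,  0.5, 0.5, 0.5, 1, 0.5)
)

# Same shape as w2 above: f is an ordinary argument, so the anonymous
# function passed as FUN finds it through normal lexical scoping.
w2 <- function(d, f) {
  aggregate(rating ~ studentId + problem + part, data = d,
            FUN = function(x) sum(f(x)))
}

res <- w2(firstTen, function(x) (x + .001)^-1)
print(res)
```

Because no non-standard evaluation is involved, this works whether or not a global weight.func exists.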

If you would rather keep ddply, pass an anonymous function instead of summarize:

wrapper <- function ( ratingData, weight.func ) {
   ddply(ratingData, c('studentId','problem','part'), function(x)c(sum.weights=sum(weight.func(x$rating))))
 }

wrapper( sampleData, weight.func=function(x) (x+.001)^-1  )

The named vector c(sum.weights = ...) gives the result column its name.


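This looks like a scoping issue. Rather than passing NULL, pass the function explicitly (this still assumes weight.func has been defined globally, as in the question):

wrapper <- function(ratingData, weight.func) {
  ddply(ratingData, .variables = c('studentId','problem','part'),
        .fun = summarise, sum.weights = sum(weight.func(rating)))
}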

wrapper( sampleData, weight.func=weight.func  )
  studentId   problem  part sum.weights
1      8001 problem26 partA 3002.495758
2      8001 problem26 partB    8.983033
3      8001 problem27 partA 4000.499750
4      8001 problem27 partB 4000.999001
5      8002 problem26 partA 2004.491766
6      8002 problem26 partB 4000.499750
7      8002 problem27 partA 5000.000000
8      8002 problem27 partB 3000.999500

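From the plyr issue tracker (https://github.com/hadley/plyr/issues/3):

In newer versions of plyr, replace summarize with here(summarize) so that the expressions are evaluated in the environment from which ddply is called.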

wrapper <- function(ratingData, weight.func) {
  ddply(ratingData, c('studentId','problem','part'),
        here(summarize),  # here(summarize) instead of summarize
        sum.weights = sum(weight.func(rating)))
}
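To see why wrapping the function helps, here is a base R sketch of the scoping problem. The functions toy_summarize, toy_apply and toy_here are simplified stand-ins written for this illustration; they are assumptions about the general mechanism and are not plyr's actual code. The point is that a function using non-standard evaluation looks up names in the frame that called it, and when a helper calls it internally, that frame is the helper's, not your wrapper's.

```r
# Toy stand-ins for illustration only; NOT plyr's implementation.

# Evaluates each named expression against the data frame's columns,
# falling back to the frame that called toy_summarize().
toy_summarize <- function(.data, ...) {
  env <- parent.frame()
  exprs <- eval(substitute(alist(...)))
  as.data.frame(lapply(exprs, function(e) eval(e, .data, env)))
}

# Calls .fun from inside its own frame, as a split-apply helper would.
toy_apply <- function(.data, .fun, ...) {
  .fun(.data, ...)
}

# Wraps f in a function whose enclosing environment is the caller's frame,
# so names visible to the caller become visible to f's expressions.
toy_here <- function(f) {
  caller <- parent.frame()
  eval(substitute(function(...) (f)(...), list(f = f)), caller)
}

d <- data.frame(rating = c(0, 0.5, 2))

# weight.func lives only in broken()'s frame, which toy_summarize never sees.
broken <- function(data, weight.func) {
  toy_apply(data, toy_summarize, sum.weights = sum(weight.func(rating)))
}

# toy_here() re-roots the lookup at fixed()'s frame.
fixed <- function(data, weight.func) {
  toy_apply(data, toy_here(toy_summarize),
            sum.weights = sum(weight.func(rating)))
}

res_broken <- tryCatch(broken(d, function(x) x * 2),
                       error = function(e) conditionMessage(e))
res_fixed <- fixed(d, function(x) x * 2)

print(res_broken)  # an error message mentioning weight.func
print(res_fixed)
```

In this sketch, broken() fails with the same kind of "could not find function" error as in the question, while fixed() finds the function that was passed in.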

Source: https://habr.com/ru/post/1777137/

