Creating a cohort-style data frame from an observation set

I am new to R and am asking a simple question because I am still learning R's style of data manipulation.

I have a data set of observations of basic clinical measurements (blood pressure, cholesterol, etc.) over a period of time. Each observation has a patient identifier and a date, but each is entered as a separate row. Something like this:

Patient ID    Date  Blood Pressure
         1 21/1/14             120
         1 19/3/14             134
         1  3/5/14             127

I want to convert the data so that, for a given variable (for example, blood pressure), I have a data frame with one row per patient and all blood pressure values observed over the whole period in chronological order. Something like this:

Patient ID BP1 BP2 BP3 
         1 120 134 127

I want to do this because I want to be able to write code to select the average of the first three observed blood pressures, for example.

Any advice or reading recommendations would be greatly appreciated.

4 answers

You can get the desired format by reshaping your data in several ways, including the reshape() function in base R or dcast() in the reshape2 package, but it may be easier to get your answer directly with some form of aggregation. Here is one method using ddply() from the plyr package:

library(plyr)

df <- read.table(text="id  date  bp
1 21/1/14             120
1 19/3/14             134
1  3/5/14             127",header=TRUE)

df1 <- ddply(df, .(id), summarize, mean.bp = mean(bp[1:3]))

df1
#   id mean.bp
# 1  1     127
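
Note that bp[1:3] takes the first three rows as they appear in the data and returns NA when a patient has fewer than three readings. The following is a minimal base R sketch, not part of the original answer, that first sorts each patient's readings chronologically and then averages up to the first three. The sample rows, the day/month/two-digit-year date format, and the name first3 are assumptions made for illustration.

```r
# Sample data in the same shape as above, deliberately out of date order
# (patient 2 has only one reading; these rows are made up for illustration)
df <- read.table(text = "id date bp
1 19/3/14 134
1 21/1/14 120
1 3/5/14 127
2 21/1/14 110", header = TRUE, stringsAsFactors = FALSE)

# Parse the day/month/two-digit-year strings into Date objects
df$date <- as.Date(df$date, format = "%d/%m/%y")

# Order rows by patient, then chronologically within each patient
df <- df[order(df$id, df$date), ]

# Mean of up to the first three readings per patient;
# head() returns fewer values when a patient has fewer than three
first3 <- aggregate(bp ~ id, data = df, FUN = function(x) mean(head(x, 3)))
names(first3)[2] <- "mean.bp"
first3
```

Using head(x, 3) rather than x[1:3] is a design choice: it avoids NA results for patients with short histories, at the cost of averaging over fewer readings for them.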

Of course, if you really want to do what you ask, you can do the following:

library(reshape2)

df$bp.id <- ave(df$id,df$id,FUN=function(x) paste0("BP",seq(along=x)))
# keep only the rows labelled BP1..BP3 (note the comma: this filters rows, not columns)
df2 <- dcast(df[df$bp.id %in% paste0("BP",1:3), ], id~bp.id, value.var="bp")

df2
#   id BP1 BP2 BP3
# 1  1 120 134 127
# example data frame
id <- rep(1:4, 25)
date <- c(rep("21/01/14", 30), rep("21/01/14", 30), rep("22/01/14", 30), rep("23/01/14", 10))
bp <- rnorm(100, 100)
df <- data.frame(id, date, bp)

# reorder the data frame
library(dplyr)
df2 <- group_by(df, id) # group by id
# order chronologically; the dates are d/m/y strings, so parse them before sorting
df2 <- arrange(df2, as.Date(date, format = "%d/%m/%y"))
df3 <- mutate(df2, # add a column with an ascending number within each group
              c = seq_along(date))

# use dcast
library(reshape2)
dcast(df3[, c("id", "c", "bp")], id ~ c, value.var = "bp")
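
The same number-then-spread idea can also be sketched with tidyr's pivot_wider() in place of reshape2's dcast(). This is an alternative that is not part of the original answer; the small data frame, the column name obs, and the BP prefix are assumptions chosen to match the layout asked for in the question.

```r
library(dplyr)
library(tidyr)

# Small made-up data set; patient 2 has only two readings
df <- data.frame(
  id   = c(1, 1, 1, 2, 2),
  date = as.Date(c("2014-01-21", "2014-03-19", "2014-05-03",
                   "2014-01-21", "2014-03-19")),
  bp   = c(120, 134, 127, 110, 124)
)

wide <- df %>%
  arrange(id, date) %>%          # chronological order within each patient
  group_by(id) %>%
  mutate(obs = row_number()) %>% # 1, 2, 3, ... per patient
  ungroup() %>%
  select(id, obs, bp) %>%        # drop date so each patient collapses to one row
  pivot_wider(names_from = obs, values_from = bp, names_prefix = "BP")

wide
```

Patients with fewer readings than the longest history get NA in the trailing columns.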

With the data.table package (which has its own melt and dcast functions, similar to those in reshape2), you can do this as follows:

newdf <- dcast(setDT(df)[, idx := 1:.N, by = id], id ~ paste0("bp",idx), value.var = "bp")

Or, using the rowid function:

newdf <- dcast(setDT(df), id ~ rowid(prefix="bp",id), value.var = "bp")

Both give:

> newdf
   id bp1 bp2 bp3
1:  1 120 134 129
2:  2 110 124 119

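Or, as suggested by @SamDickson, if you want to calculate (for example) the mean of the first two measurements, you can add it as a new variable to df: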

# using base R
df$first2mn <- ave(df$bp, df$id, FUN = function(x) mean(x[1:2])) 
# using data.table
setDT(df)[, first2mn := mean(bp[1:2]), id] 

which gives:

> df
   id    date  bp first2mn
1:  1 21/1/14 120      127
2:  1 19/3/14 134      127
3:  1  3/5/14 129      127
4:  2 21/1/14 110      117
5:  2 19/3/14 124      117
6:  2  3/5/14 119      117

Or, if you only want one summary row per id:

# using base R
aggregate(bp ~ id, df, function(x) mean(x[1:2])) 
# using data.table
setDT(df)[, .(bp = mean(bp[1:2])), id] 

which gives:

  id  bp
1  1 127
2  2 117

Used data:

df <- read.table(text="id  date  bp
1 21/1/14             120
1 19/3/14             134
1  3/5/14             129
2 21/1/14             110
2 19/3/14             124
2  3/5/14             119", header=TRUE)

Other answers have suggested a number of methods for calculating the group means. A linked post provides a number of methods for calculating group-level maxima. You would need to replace max with mean in those answers.
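
As a small illustration of swapping max for mean in a group-wise calculation, here is a base R sketch using tapply(). It is not taken from the linked post; it simply reuses the two-patient data from @jaap's answer, and the names group_max and group_mean are made up for this example.

```r
# Two-patient data set, as used in @jaap's answer
df <- read.table(text = "id date bp
1 21/1/14 120
1 19/3/14 134
1 3/5/14 129
2 21/1/14 110
2 19/3/14 124
2 3/5/14 119", header = TRUE)

# Group-level maximum: one value per id
group_max <- tapply(df$bp, df$id, max)

# The same call with mean instead of max gives the group-level mean
group_mean <- tapply(df$bp, df$id, mean)

group_max
group_mean
```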

The following is an alternative method for reshaping to wide format using the base R function reshape.

Using the data.frame provided by @jaap, add a variable that counts the observations within each ID:

df$times <- ave(df$bp, df$id, FUN=seq_along)

Now reshape to wide format, dropping the unneeded date variable:

reshape(df, direction="wide", drop="date", timevar="times")
  id bp.1 bp.2 bp.3
1  1  120  134  129
4  2  110  124  119
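
Once the data is in wide format, the average of the first few readings asked about in the question can be computed across columns. The following is a minimal sketch under the assumption that the columns are named bp.1, bp.2, ... as in the output above; the column name mean.first2 is made up for this example, and idvar is spelled out explicitly even though "id" is its default.

```r
# Two-patient data set, as used in @jaap's answer
df <- read.table(text = "id date bp
1 21/1/14 120
1 19/3/14 134
1 3/5/14 129
2 21/1/14 110
2 19/3/14 124
2 3/5/14 119", header = TRUE)

# Number the observations within each id, then reshape to wide format
df$times <- ave(df$bp, df$id, FUN = seq_along)
wide <- reshape(df, direction = "wide", drop = "date",
                timevar = "times", idvar = "id")

# Row-wise mean of the first two readings for each patient
wide$mean.first2 <- rowMeans(wide[, c("bp.1", "bp.2")])
wide
```

If some patients have fewer readings, the missing cells are NA; passing na.rm = TRUE to rowMeans() would average over the readings that exist.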

Source: https://habr.com/ru/post/1621529/