I have been using R for a long time, so I can’t say "hello, I'm new, explain this to me." But that is what I feel like asking, because I run into this problem from time to time, and every time I fail to solve it and move on to something else. Today I'm curious enough to ask.
I think of a data frame as a collection of columns that all have the same length. However, I know this is wrong. It is wrong because matrices, which are multi-column things, can be inserted into a data frame. When I do this by accident, I get an object that does not print on the screen in a consistent way. There are
1. apparent inconsistencies in column names between what head() reports and what the data frame actually has, and
2. I can’t find a definite way to ask a data frame: “Are you an ordinary, one-column-per-variable data frame?” or “Do you have some of these frustrating internal structures that make life difficult?”
You can see what I mean if you run
example(predict.lm)
This runs the examples for predict.lm and creates a matrix of output called pt.
Then change the last step of the example: instead of leaving the matrix output on its own, add it to the data frame named npk:
npk$predict <- predict(npk.aov, type = "terms")
After that, what is npk? Is it still a data frame? Yes:
> is.data.frame(npk)
[1] TRUE
Hmm, notice how head() reports the column names:
> head(npk)
  block N P K yield predict.block  predict.N  predict.P
1     1 0 1 1  49.5    -0.8500000 -4.9250000  0.2083333
2     1 1 1 0  62.8    -0.8500000  4.9250000  0.2083333
3     1 0 0 0  46.8    -0.8500000 -4.9250000 -0.2083333
4     1 1 0 1  57.0    -0.8500000  4.9250000 -0.2083333
5     2 1 0 0  59.8     2.5750000  4.9250000 -0.2083333
6     2 1 1 1  58.5     2.5750000  4.9250000  0.2083333
   predict.K predict.N:P predict.N:K predict.P:K
1 -0.9583333   0.9416667   1.1750000   0.4250000
2  0.9583333  -2.8250000   1.1750000  -0.1416667
3  0.9583333   0.9416667   1.1750000  -0.1416667
4 -0.9583333   0.9416667  -3.5250000  -0.1416667
5  0.9583333   0.9416667   1.1750000  -0.1416667
6 -0.9583333  -2.8250000  -3.5250000   0.4250000
  predict.N:P:K
1     0.0000000
2     0.0000000
3     0.0000000
4     0.0000000
5     0.0000000
6     0.0000000
This makes it look as if there are columns named "predict.block" or "predict.P", but there are not:
> colnames(npk)
[1] "block" "N" "P" "K" "yield"
[6] "predict"
The colnames function would be more appropriately named column_or_whatever_else_we_find_here.
> npk$predict.P
NULL
To get at the "P" values, I have to index into the matrix stored in predict:
> npk$predict[ , "P"]
         1          2          3          4          5
 0.2083333  0.2083333 -0.2083333 -0.2083333 -0.2083333
         6          7          8          9         10
 0.2083333 -0.2083333  0.2083333  0.2083333  0.2083333
        11         12         13         14         15
-0.2083333 -0.2083333 -0.2083333  0.2083333 -0.2083333
        16         17         18         19         20
 0.2083333  0.2083333 -0.2083333 -0.2083333  0.2083333
        21         22         23         24
-0.2083333  0.2083333  0.2083333 -0.2083333
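For anyone who wants to poke at this, here is a minimal sketch that compares what the data frame counts as columns with what sits inside the predict column. The names d and fit are made up for this illustration, and the model formula is assumed to match the one used in example(predict.lm). My understanding is that the "predict.P"-style names are only built when the object is printed, but treat that as an assumption.

```r
# Work on a local copy (hypothetical name `d`) so the built-in npk is untouched.
d <- datasets::npk

# Assumed to match the model fitted in example(predict.lm).
fit <- aov(yield ~ block + N * P * K, data = d)

# Assigning a matrix with `$<-` stores it as ONE column of the data frame.
d$predict <- predict(fit, type = "terms")

ncol(d)               # counts top-level columns only
names(d)              # "predict" appears once
dim(d$predict)        # rows and columns of the matrix inside that column
colnames(d$predict)   # the per-term names live here, not in names(d)

is.null(d$predict.P)  # no top-level column of that name exists
head(d$predict[, "P"])  # the values are reached through the matrix
```

The point of the comparison is that names(d) and colnames(d$predict) are two separate sets of names, and head() shows a combination of both.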
Merging pt into npk instead gives ordinary columns, one per term:
> npk.new <- merge(npk, pt, by = "row.names",
suffixes = c("", ".predict"))
> colnames(npk.new)
[1] "Row.names" "block" "N"
[4] "P" "K" "yield"
[7] "block.predict" "N.predict" "P.predict"
[10] "K.predict" "N:P" "N:K"
[13] "P:K" "N:P:K"
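If a matrix column has already ended up inside a data frame, one idiom I have seen for flattening it is to pass the data frame back through data.frame(). This is only a sketch on a toy object; df, m, and flat are hypothetical names, and I have not checked it against every kind of nested column.

```r
# Toy data frame (hypothetical) with a two-column matrix stored as one column.
df <- data.frame(id = 1:3)
df$m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("a", "b")))

names(df)   # the matrix counts as a single column here

# data.frame() splits matrix arguments into separate columns, so calling it
# with the existing columns as arguments flattens the structure.
flat <- do.call(data.frame, df)

names(flat)                  # one name per matrix column, prefixed with "m."
sapply(flat, is.matrix)      # no matrix columns remain
```

Note that data.frame() runs with check.names = TRUE by default, so names containing characters such as ":" will be altered to make them syntactically valid.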
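What I cannot find is a direct way to ask a data frame whether it contains a column like this. Some checks do not distinguish the matrix column from the others: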
> sapply(npk, is.atomic)
block N P K yield predict
TRUE TRUE TRUE TRUE TRUE TRUE
> sapply(npk, is.vector)
block N P K yield predict
FALSE FALSE FALSE FALSE TRUE FALSE
Checking for matrices, however, does single it out:
> sapply(npk, is.matrix)
block N P K yield predict
FALSE FALSE FALSE FALSE FALSE TRUE
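The is.matrix check above can be wrapped into a single yes/no question. The sketch below tests for a dim attribute rather than is.matrix, on the assumption that arrays and nested data frames are just as unwelcome as matrices. has_nested_cols is a hypothetical helper written for this post, not a base R function.

```r
# Hypothetical helper: TRUE if any column carries a dim attribute,
# which covers matrices, arrays, and data frames stored as columns.
has_nested_cols <- function(df) {
  stopifnot(is.data.frame(df))
  any(vapply(df, function(col) !is.null(dim(col)), logical(1)))
}

# An ordinary data frame: one atomic vector per column.
plain <- data.frame(x = 1:3, y = c("a", "b", "c"))

# The same data frame with a matrix tucked into one column.
nested <- plain
nested$m <- matrix(1:6, nrow = 3)

has_nested_cols(plain)
has_nested_cols(nested)
```

Using vapply instead of sapply keeps the result a logical vector even when the data frame has no columns.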