How to add loaded values ​​of cluster nodes (tree) in NEWICK format in R

I want to create a tree (cluster) using the Interactive Tree of Life tool (iTOL). As an input file (or line), this tool uses Newick format , which is a way of representing graphically theoretical trees with edge lengths using parentheses and commas. In addition, additional information, such as loaded values of cluster nodes , can be supported .

For example, here I created a data set for cluster analysis using a package clusterGeneration:

library(clusterGeneration)
set.seed(1)    
tmp1 <- genRandomClust(numClust=3, sepVal=0.3, numNonNoisy=5,
        numNoisy=3, numOutlier=5, numReplicate=2, fileName="chk1")
data <- tmp1$datList[[2]]

Then I performed a cluster analysis and evaluated the support of the cluster nodes using bootstrap using the package pvclust:

set.seed(2)    
y <- pvclust(data=data,method.hclust="average",method.dist="correlation",nboot=100)
plot(y)  

Here is the cluster and the loaded values: cluster and bootstrapped values

To create a Newick file, I used apepackage:

library(ape)
yy<-as.phylo(y$hclust)
write.tree(yy,digits=2)

write.tree the function will print the tree in Newick format:

((x2: 0.45, x6: 0.45): 0.043, ((x7: 0.26, (x4: 0.14, (x1: 0.14, x3: 0.14): 0.0064) : 0.12): 0.22, (x5: ​​0.28, x8: 0.28): 0.2): 0.011);

These numbers represent the lengths of the branches (cluster edge lengths). Following the instructions on the iTOL help page (section "Downloading and working with your trees"), I manually added boot values ​​to my Newick file (bold values ​​below):

((2: 0,45, x6: 0,45) 74: 0,043, ((7: 0,26, (4: 0,14, (x1: 0,14, 3: 0,14) 55: 0,0064) 68: 0,12) 100: 0,22, (5: 0,28, 8: 0,28) 100: 0,2) 63: 0,011);

, iTOL. , ...

: , ?

:

(round(y$edges,2)*100)[,1:2]

, Newick, :

yy$edge.length

, write.tree . , .write.tree2, , Newick.

.

+4
1

: phylo node.label, node. . Newick , .write.tree2:

> .write.tree2
function (phy, digits = 10, tree.prefix = "") 
{
    brl <- !is.null(phy$edge.length)
    nodelab <- !is.null(phy$node.label)

...

    if (is.null(phy$root.edge)) {
        cp(")")
        if (nodelab) 
            cp(phy$node.label[1])
        cp(";")
    }
    else {
        cp(")")
        if (nodelab) 
            cp(phy$node.label[1])
        cp(":")
        cp(sprintf(f.d, phy$root.edge))
        cp(";")
    }

...

, . , posteriori.... , hclust phylo.

, , as.phylo.hclust, , hclust:

> as.phylo.hclust
function (x, ...) 
{
    N <- dim(x$merge)[1]
    edge <- matrix(0L, 2 * N, 2)
    edge.length <- numeric(2 * N)
    node <- integer(N)              #<-This one
...

, as.phylo.hclust nodenames, , hclust ( pvclust) , .. hclust , , ):

# NB: in the following function definition I only modified the commented lines
as.phylo.hclust.with.nodenames <- function (x, nodenames, ...) #We add a nodenames argument
{
    N <- dim(x$merge)[1]
    edge <- matrix(0L, 2 * N, 2)
    edge.length <- numeric(2 * N)
    node <- integer(N)
    node[N] <- N + 2L
    cur.nod <- N + 3L
    j <- 1L
    for (i in N:1) {
        edge[j:(j + 1), 1] <- node[i]
        for (l in 1:2) {
            k <- j + l - 1L
            y <- x$merge[i, l]
            if (y > 0) {
                edge[k, 2] <- node[y] <- cur.nod
                cur.nod <- cur.nod + 1L
                edge.length[k] <- x$height[i] - x$height[y]
            }
            else {
                edge[k, 2] <- -y
                edge.length[k] <- x$height[i]
            }
        }
        j <- j + 2L
    }
    if (is.null(x$labels)) 
        x$labels <- as.character(1:(N + 1))
    node.lab <- nodenames[order(node)] #Here we define our node labels
    obj <- list(edge = edge, edge.length = edge.length/2, tip.label = x$labels, 
        Nnode = N, node.label = node.lab) #And you put them in the final object
    class(obj) <- "phylo"
    reorder(obj)
}

, :

bootstraps <- (round(y$edges,2)*100)[,1:2]
yy<-as.phylo.hclust.with.nodenames(y$hclust, nodenames=bootstraps[,2])
write.tree(yy,tree.names=TRUE,digits=2)
[1] "((x5:0.27,x8:0.27)100:0.24,((x7:0.25,(x4:0.14,(x1:0.13,x3:0.13)61:0.014)99:0.11)100:0.23,(x2:0.46,x6:0.46)56:0.022)61:0.027)100;"
#See the bootstraps    ^^^ here for instance
plot(yy,show.node.label=TRUE) #To show that the order is correct
plot(y) #To compare with (here I used the yellow value)

enter image description hereenter image description here

+4

Source: https://habr.com/ru/post/1534163/


All Articles