Walking a hierarchical tree

et_id posted on 2019-04-15, tag: hierarchy, last updated 2019-04-15 22:59, 25 views

I would like to be able to "walk" (iterate over) a hierarchical clustering (see the figure and code below). What I want is:

  1. A function that takes a matrix and a minimum height, say 10 in this example.
    splitme <- function(matrix, minH){
        ##Some code
    }
    
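  2. Start at the top and go down to minH, cutting whenever there is a new split. This is the first problem: how do I detect a new split so that I get its height h?
  3. At this particular h, how many groups are there? Retrieve the clusters: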
    mycl <- cutree(hr, h=x);#x is that found h
    count <- count(mycl)# Bad code
    
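  4. Save each new matrix in a variable (or variables). This is the other hard part: dynamically creating x new matrices. So perhaps a function that takes these clusters, does whatever is needed (comparisons), and returns a variable?
  5. Repeat steps 3 and 4 until minH is reached.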

[Figure: dendrogram of the example data, as plotted by the code below]

Code
# Generate data
set.seed(12345)
desc.1 <- c(rnorm(10, 0, 1), rnorm(20, 10, 4))
desc.2 <- c(rnorm(5, 20, .5), rnorm(5, 5, 1.5), rnorm(20, 10, 2))
desc.3 <- c(rnorm(10, 3, .1), rnorm(15, 6, .2), rnorm(5, 5, .3))
data <- cbind(desc.1, desc.2, desc.3)
# Create dendrogram
d <- dist(data) 
hc <- as.dendrogram(hclust(d))
# Function to color branches
colbranches <- function(n, col) {
  a <- attributes(n) # Find the attributes of current node
  # Color edges with requested color
  attr(n, "edgePar") <- c(a$edgePar, list(col = col, lwd = 2))
  n # Don't forget to return the node!
}
# Color the first sub-branch of the first branch in red,
# the second sub-branch in orange and the second branch in blue
hc[[1]][[1]] <- dendrapply(hc[[1]][[1]], colbranches, "red")
hc[[1]][[2]] <- dendrapply(hc[[1]][[2]], colbranches, "orange")
hc[[2]] <- dendrapply(hc[[2]], colbranches, "blue")
# Plot
plot(hc)

wnatus

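I think what you need are the dendrogram's cophenetic distances, which is what cophenetic() returns. They give you the heights of all splitting points, and from there you can easily walk down the tree.

My attempt below stores all submatrices in a list called submatrices. It is a nested list: the first level has one element per splitting point, and the second level holds the submatrices produced at that splitting point. For example, all submatrices from the first splitting point (the gray and blue clusters) are in submatrices[[1]]. The first submatrix within submatrices[[1]] (the red cluster) is submatrices[[1]][1], or submatrices[[1]][[1]] if you want the distance matrix itself rather than a one-element list.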
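Before the function itself, here is a quick check of the claim that cophenetic distances expose the split heights. It is a minimal sketch on made-up data (the matrix x is an assumption for illustration, not the question's data): the unique cophenetic distances should match the merge heights that hclust() stores in its height component.

```r
# Illustrative data only (not from the question): 8 random points in 2 dimensions
set.seed(1)
x <- matrix(rnorm(16), ncol = 2)

cl <- hclust(dist(x)) # complete linkage is the default

# Unique cophenetic distances, largest first
coph_heights <- sort(unique(as.vector(cophenetic(cl))), decreasing = TRUE)

# Merge heights stored on the hclust object, largest first
merge_heights <- sort(unique(cl$height), decreasing = TRUE)

# Both vectors should hold the same values
print(all.equal(coph_heights, merge_heights))
```

If the two agree, cl$height can be read directly as the list of candidate cut heights, without building the full cophenetic matrix.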

splitme <- function(data, minH) {
  ## Compute the distance matrix and the hierarchical clustering
  d <- dist(data)
  cl <- hclust(d)
  ## Cophenetic distances: for each pair of points, the height at which they merge
  coph <- cophenetic(cl)
  ## Heights of the splitting points (sps), top of the tree first.
  ## The heights are left unrounded so that every cut lands on an actual merge.
  sps <- sort(unique(as.vector(coph)), decreasing = TRUE)
  ## Nested list holding the submatrices:
  ## element n holds the submatrices from the nth splitting point
  ## (the top splitting point is the 1st, the bottom one is the last)
  submatrices <- list()
  ## Walk down the dendrogram
  i <- 2 # Start at 2: cutting at the 1st height returns the whole tree as one cluster
  while (i <= length(sps) && sps[i] > minH) {
    membership <- cutree(cl, h = sps[i]) # Cut the tree at this splitting point
    lst <- list() # Submatrices extracted at this splitting point
    for (j in seq_len(max(membership))) {
      member <- which(membership == j) # Rows of data that belong to cluster j
      dm <- dist(data[member, , drop = FALSE]) # Distance matrix of that cluster
      lst <- append(lst, list(dm)) # Append this cluster's submatrix to lst
    }
    submatrices <- append(submatrices, list(lst)) # Append lst to the submatrices list
    i <- i + 1
  }
  return(submatrices)
}
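
A more compact variant of the same walk can be sketched as follows. It is not the function above: walk_clusters is a hypothetical helper name, it reads the split heights from cl$height instead of cophenetic(), and it stores the data rows of each cluster rather than their distance matrices, which is one possible reading of "each new matrix" in the question. It also records the number of groups at each cut, which covers step 3.

```r
# Same synthetic data as in the question
set.seed(12345)
desc.1 <- c(rnorm(10, 0, 1), rnorm(20, 10, 4))
desc.2 <- c(rnorm(5, 20, .5), rnorm(5, 5, 1.5), rnorm(20, 10, 2))
desc.3 <- c(rnorm(10, 3, .1), rnorm(15, 6, .2), rnorm(5, 5, .3))
data <- cbind(desc.1, desc.2, desc.3)

# walk_clusters is a hypothetical helper name, not part of any package
walk_clusters <- function(data, minH) {
  cl <- hclust(dist(data))
  # Merge heights, top of the tree first; the topmost one is dropped
  # because cutting there returns a single cluster
  heights <- sort(unique(cl$height), decreasing = TRUE)[-1]
  heights <- heights[heights > minH]
  lapply(heights, function(h) {
    membership <- cutree(cl, h = h)
    # One data submatrix per cluster at this height
    groups <- lapply(split(seq_len(nrow(data)), membership),
                     function(rows) data[rows, , drop = FALSE])
    list(height = h, n_groups = length(groups), groups = groups)
  })
}

steps <- walk_clusters(data, minH = 10)

# Number of groups found at each cut, from the top of the tree downwards
print(vapply(steps, function(s) s$n_groups, integer(1)))
```

Each element of steps describes one cut, so steps[[1]]$groups[[1]] would be the rows of the first cluster at the first split below the root. Returning a list avoids having to create a variable number of separately named matrices.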