A Very Detailed Guide to Heatmaps in R: The ComplexHeatmap Series (1)
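For more on R and bioinformatics, follow the WeChat official account 醫學和生信筆記.
Reply "R語言" in the account's chat window to get a large collection of learning materials!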
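Contents
Chapter 1 Introduction
1.1 Design
1.2 Chapter overview
Chapter 2 A single heatmap
2.1 Colors
2.2 Row/column titles
2.3 Clustering
2.3.1 Distance methods
2.3.2 Clustering methods
2.3.3 Customizing dendrogram colors
2.3.4 Reordering dendrograms
2.4 Changing row/column order
2.5 Ordering with the seriation package
2.6 Row/column names
2.7 Splitting the heatmap
2.7.1 Splitting by k-means
2.7.2 Splitting by categorical variables
2.7.3 Splitting by dendrogram
2.7.4 Slice order
2.7.5 Slice titles
2.7.6 Graphical parameters for slices
2.7.7 Gaps between slices
2.7.8 Splitting annotations
2.8 Raster images (omitted)
2.9 Customizing the heatmap body
2.9.1 cell_fun
2.9.2 layer_fun
2.10 Heatmap size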
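This series is a set of study notes on the ComplexHeatmap package. Some parts have been adapted according to my own understanding, but the overall content stays faithful to the original. Where anything is unclear, the original takes precedence. Original: ComplexHeatmap Complete Reference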
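Chapter 1 Introduction
Complex heatmaps can show relationships within one dataset or between several datasets and reveal underlying patterns. The ComplexHeatmap package provides flexible heatmap layouts and highly customizable annotation graphics.
1.1 Design
A complete heatmap consists of a heatmap body and heatmap components. The body can be split by rows and by columns. The components are the row/column titles, dendrograms, row/column names, and row/column annotations.
A heatmap list consists of several heatmap bodies and heatmap annotations, arranged in a consistent order so that they can be compared directly with each other.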
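The ComplexHeatmap package is object-oriented. Its main classes are:
- Heatmap class: a single heatmap, including the heatmap body, row/column names, titles, dendrograms, and row/column annotations;
- HeatmapList class: a list of heatmap bodies and heatmap annotations;
- HeatmapAnnotation class: a set of row or column annotations, which can either be components of a heatmap or stand on their own.
There are also some other classes:
- SingleAnnotation class: a single row or column annotation, contained in the HeatmapAnnotation class;
- ColorMapping class: color mappings, for both the heatmap body and the annotations;
- AnnotationFunction class: user-defined annotations.
ComplexHeatmap is built on grid, so making full use of the package requires some knowledge of the grid graphics system.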
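As a minimal sketch of how these classes show up in practice, the class of each object can be checked with is(). The toy matrix `m` and the annotation values below are made up for illustration and are not part of the original notes.

```r
library(ComplexHeatmap)

set.seed(1)
# toy 4 x 5 matrix, for illustration only
m = matrix(rnorm(20), nrow = 4)

ht1 = Heatmap(m, name = "ht1")   # a single heatmap
ht2 = Heatmap(m, name = "ht2")   # a second heatmap with its own name
# a column annotation: one value per column of m
ha = HeatmapAnnotation(group = rep(c("a", "b"), c(2, 3)))
# "+" concatenates heatmaps horizontally into a heatmap list
ht_list = ht1 + ht2

is(ht1, "Heatmap")
is(ha, "HeatmapAnnotation")
is(ht_list, "HeatmapList")
```

Nothing is drawn until the object is printed or passed to draw(), so the objects can be built and inspected first.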
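1.2 Chapter overview
2. A single heatmap
The components of a single heatmap.
3. Heatmap annotations
The concept of heatmap annotations, how to draw simple and complex annotations, and how the two differ.
4. Heatmap lists
How to draw several heatmaps and annotations together, and how their positions are arranged.
5. Legends
How to draw legends for the heatmap body and the annotations, and how to customize them.
6. Heatmap decoration
How to add user-defined graphics.
7. OncoPrint (waterfall plot)
8. UpSet plot
9. Other high-level plots
10. Integration with other R packages
11. Interactive heatmaps
12. More examples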
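Chapter 2 A single heatmap
A single heatmap is the most common form of this visualization. Although the highlight of ComplexHeatmap is drawing several heatmaps together, the single heatmap is the basic building block, so drawing it well matters.
First, generate a random matrix: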
set.seed(123)
nr1 = 4; nr2 = 8; nr3 = 6; nr = nr1 + nr2 + nr3
nc1 = 6; nc2 = 8; nc3 = 10; nc = nc1 + nc2 + nc3
mat = cbind(rbind(matrix(rnorm(nr1*nc1, mean = 1,   sd = 0.5), nrow = nr1),
                  matrix(rnorm(nr2*nc1, mean = 0,   sd = 0.5), nrow = nr2),
                  matrix(rnorm(nr3*nc1, mean = 0,   sd = 0.5), nrow = nr3)),
            rbind(matrix(rnorm(nr1*nc2, mean = 0,   sd = 0.5), nrow = nr1),
                  matrix(rnorm(nr2*nc2, mean = 1,   sd = 0.5), nrow = nr2),
                  matrix(rnorm(nr3*nc2, mean = 0,   sd = 0.5), nrow = nr3)),
            rbind(matrix(rnorm(nr1*nc3, mean = 0.5, sd = 0.5), nrow = nr1),
                  matrix(rnorm(nr2*nc3, mean = 0.5, sd = 0.5), nrow = nr2),
                  matrix(rnorm(nr3*nc3, mean = 1,   sd = 0.5), nrow = nr3)))
mat = mat[sample(nr, nr), sample(nc, nc)] # random shuffle rows and columns
rownames(mat) = paste0("row", seq_len(nr))
colnames(mat) = paste0("column", seq_len(nc))
dim(mat)
## [1] 18 24

Heatmap() is the basic function for drawing a heatmap. It draws a heatmap body, row names, column names, dendrograms and annotations. For a continuous matrix the default color scheme is blue-white-red. Note that the function name is capitalized: lowercase heatmap() is the base R function, which uses a yellow-red palette.

library(ComplexHeatmap)
## Loading required package: grid
## ========================================
## ComplexHeatmap version 2.8.0
## Bioconductor page: http://bioconductor.org/packages/ComplexHeatmap/
## Github page: https://github.com/jokergoo/ComplexHeatmap
## Documentation: http://jokergoo.github.io/ComplexHeatmap-reference
##
## If you use it in published research, please cite:
## Gu, Z. Complex heatmaps reveal patterns and correlations in multidimensional
##   genomic data. Bioinformatics 2016.
##
## The new InteractiveComplexHeatmap package can directly export static
## complex heatmaps into an interactive Shiny app with zero effort. Have a try!
##
## This message can be suppressed by:
##   suppressPackageStartupMessages(library(ComplexHeatmap))
## ========================================
Heatmap(mat)

2.1 Colors
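For heatmap visualization, color is the main representation of the data matrix. In most cases a heatmap visualizes a continuous numeric matrix, and the user should then provide a color mapping function. A color mapping function accepts a numeric vector and returns the corresponding vector of colors. Users should always use circlize::colorRamp2() to generate the color mapping function for Heatmap(). The two arguments of colorRamp2() are a vector of break values and a vector of the corresponding colors. colorRamp2() linearly interpolates colors within each interval, in the LAB color space. Using colorRamp2() also helps to generate a legend with proper tick marks.
In the following example, values between -2 and 2 are linearly interpolated to get the corresponding colors. Values greater than 2 are all mapped to red, and values less than -2 are all mapped to green.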
library(circlize)
## ========================================
## circlize version 0.4.13
## CRAN page: https://cran.r-project.org/package=circlize
## Github page: https://github.com/jokergoo/circlize
## Documentation: https://jokergoo.github.io/circlize_book/book/
##
## If you use it in published research, please cite:
## Gu, Z. circlize implements and enhances circular visualization
##   in R. Bioinformatics 2014.
##
## This message can be suppressed by:
##   suppressPackageStartupMessages(library(circlize))
## ========================================
col_fun = colorRamp2(c(-2, 0, 2), c("green", "white", "red"))
col_fun(seq(-3, 3))
## [1] "#00FF00FF" "#00FF00FF" "#B1FF9AFF" "#FFFFFFFF" "#FF9E81FF" "#FF0000FF"
## [7] "#FF0000FF"
Heatmap(mat, name = "mat", col = col_fun)

With colorRamp2() the mapping range is controlled exactly and is not affected by extreme values.
mat2 = mat
mat2[1, 1] = 100000
Heatmap(mat2, name = "mat", col = col_fun,
    column_title = "a matrix with outliers")

In addition, colorRamp2() makes colors comparable across several heatmaps. In the three heatmaps below, the same color always corresponds to the same value:
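p1 <- Heatmap(mat, name = "mat", col = col_fun, column_title = "mat")
p2 <- Heatmap(mat/4, name = "mat/4", col = col_fun, column_title = "mat/4")
p3 <- Heatmap(abs(mat), name = "abs(mat)", col = col_fun, column_title = "abs(mat)")
p1 + p2 + p3

If the matrix is continuous, you can also simply provide a vector of colors, and the colors will be linearly interpolated. This method is not robust to outliers, however, because the mapping always starts at the minimum of the matrix and ends at the maximum.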
Heatmap(mat, name = "mat", col = rev(rainbow(10)),
    column_title = "set a color vector for continuous matrix")

NA values can also be visualized. Use the na_col argument to set their color:
mat_with_na = mat
na_index = sample(c(TRUE, FALSE), nrow(mat)*ncol(mat), replace = TRUE, prob = c(1, 9))
mat_with_na[na_index] = NA
Heatmap(mat_with_na, name = "mat with na", na_col = "black",
    column_title = "a matrix with na")

The color space in which colorRamp2() interpolates can be changed with the space argument:
f1 = colorRamp2(seq(min(mat), max(mat), length = 3), c("blue", "#EEEEEE", "red"))
f2 = colorRamp2(seq(min(mat), max(mat), length = 3), c("blue", "#EEEEEE", "red"), space = "RGB")
p1 <- Heatmap(mat, name = "mat1", col = f1, column_title = "color space in LAB")
p2 <- Heatmap(mat, name = "mat2", col = f2, column_title = "color space in RGB")
p1 + p2

The style of the heatmap border is controlled by the border_gp argument, and the style of each cell by the rect_gp argument. Both take a gpar() object.
Heatmap(mat, name = "mat1", border_gp = gpar(lty = 2, col = "red"),
    column_title = "set heatmap border")
Heatmap(mat, name = "mat2", column_title = "set cell border",
    rect_gp = gpar(col = "white", lty = 1, lwd = 2))

If rect_gp = gpar(type = "none") is set, nothing is drawn in the heatmap body. The body can then be customized with cell_fun and layer_fun, which are covered later.
Heatmap(mat, name = "lalala", rect_gp = gpar(type = "none"),
    column_title = "no heatmap body")

2.2 Row/column titles
Adding a row title and a column title:

Heatmap(mat, name = "color",
    column_title = "i am column title",
    row_title = "i am row title")

Changing the title position:

Heatmap(mat, name = "color",
    column_title = "i am column title",
    row_title = "i am row title",
    column_title_side = "bottom",
    row_title_side = "right")

Rotating the row/column titles:

Heatmap(mat, name = "color",
    column_title = "i am title",
    column_title_rot = 90,
    row_title = "i am row title",
    row_title_rot = 0)

Changing the style of the row/column titles:

Heatmap(mat, name = "color",
    column_title = "i am column title",
    row_title = "i am row title",
    column_title_gp = gpar(fontsize = 20, fontface = "bold"),
    row_title_gp = gpar(col = "steelblue", fontsize = 16, fill = "red", border = "green"))

A title can also be a mathematical expression:

Heatmap(mat, name = "mat",
    column_title = expression(hat(beta) == (X^t * X)^{-1} * X^t * y))

2.3 Clustering
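Clustering can be customized in many ways.
Turning clustering off:

p1 <- Heatmap(mat)
p2 <- Heatmap(mat, cluster_rows = FALSE, cluster_columns = FALSE)
p1 + p2

Clustering without showing the dendrogram (here the column dendrogram is hidden while the row dendrogram stays visible):

Heatmap(mat, show_row_dend = TRUE, show_column_dend = FALSE)

Changing the position of the dendrograms:

Heatmap(mat, row_dend_side = "right", column_dend_side = "bottom")

Changing the height and width of the dendrograms:

Heatmap(mat, row_dend_width = unit(4, "cm"), column_dend_height = unit(3, "cm"))

2.3.1 Distance methods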
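Supported options:
- a pre-defined method given by name: the methods accepted by dist(), plus "pearson", "spearman" and "kendall";
- a user-defined distance function.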
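The two options above can be sketched as follows. The toy matrix `m` is made up for illustration; the distance is set through the clustering_distance_rows argument of Heatmap() (clustering_distance_columns works the same way for columns). As far as I understand the package documentation, a custom function may either take the whole matrix and return a dist object, or take two vectors and return a single distance.

```r
library(ComplexHeatmap)

set.seed(123)
# toy 6 x 10 matrix, for illustration only
m = matrix(rnorm(60), nrow = 6,
    dimnames = list(paste0("r", 1:6), paste0("c", 1:10)))

# 1. a pre-defined method given by name
ht1 = Heatmap(m, name = "pearson", clustering_distance_rows = "pearson")

# 2. a function that takes the whole matrix and returns a dist object
ht2 = Heatmap(m, name = "manhattan",
    clustering_distance_rows = function(mat) dist(mat, method = "manhattan"))

# 3. a function that takes two vectors and returns one distance value
ht3 = Heatmap(m, name = "pairwise",
    clustering_distance_rows = function(x, y) 1 - cor(x, y))
```

The two-vector form is convenient when the distance between two rows needs special handling, for example ignoring extreme values before computing a correlation.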
2.3.2 Clustering methods
The methods provided by hclust() are supported:

Heatmap(mat, name = "mat", clustering_method_rows = "single")

2.3.3 Customizing dendrogram colors
The dendextend package can be used to customize the colors of a dendrogram, as follows:

library(dendextend)
##
## ---------------------
## Welcome to dendextend version 1.15.1
## Type citation('dendextend') for how to cite the package.
##
## Type browseVignettes(package = 'dendextend') for the package vignette.
## The github page is: https://github.com/talgalili/dendextend/
##
## Suggestions and bug-reports can be submitted at: https://github.com/talgalili/dendextend/issues
## Or contact: <tal.galili@gmail.com>
##
##  To suppress this message use:  suppressPackageStartupMessages(library(dendextend))
## ---------------------
##
## Attaching package: 'dendextend'
## The following object is masked from 'package:stats':
##
##     cutree
row_dend = as.dendrogram(hclust(dist(mat)))
row_dend = color_branches(row_dend, k = 2) # `color_branches()` returns a dendrogram object
Heatmap(mat, name = "mat", cluster_rows = row_dend)

The row_dend_gp and column_dend_gp arguments control the style of the dendrograms. When they are set, they override the graphical settings already stored in the dendrogram objects (here, the colors set in row_dend):
Heatmap(mat, name = "mat", cluster_rows = row_dend, row_dend_gp = gpar(col = "red"))

Since version 2.5.6, different shapes can be used for the nodes of the dendrogram by setting a suitable nodePar attribute:

row_dend = dendrapply(row_dend, function(d) {
    attr(d, "nodePar") = list(cex = 0.8, pch = sample(20, 1), col = rand_color(1))
    return(d)
})
Heatmap(mat, name = "mat", cluster_rows = row_dend, row_dend_width = unit(2, "cm"))

2.3.4 Reordering dendrograms
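In Heatmap(), dendrograms are reordered so that rows/columns with larger differences are pushed further apart (see the documentation of reorder.dendrogram()). The difference (the weight) is computed from the row/column means. When row_dend_reorder and column_dend_reorder are set to logical values, they control whether reordering is applied. When they are set to numeric vectors, they also give the weights for the reordering (passed to the wts argument of reorder.dendrogram()). Reordering can be turned off with row_dend_reorder = FALSE.
By default, reordering is turned on if cluster_rows/cluster_columns is a logical value or a clustering function, and turned off if cluster_rows/cluster_columns is a clustering object.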
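What reordering does to a tree can be sketched with base R alone. The toy matrix `m` is made up for illustration, and using the plain row means as weights is an assumption made here to match the description above; the exact weights used inside Heatmap() may differ.

```r
set.seed(123)
# toy 8 x 5 matrix, for illustration only
m = matrix(rnorm(40), nrow = 8,
    dimnames = list(paste0("row", 1:8), NULL))

dend = as.dendrogram(hclust(dist(m)))

# reorder the branches using the row means as weights;
# the tree itself is unchanged, only the left/right order of branches may flip
dend2 = reorder(dend, wts = rowMeans(m), agglo.FUN = mean)

labels(dend)    # leaf order before reordering
labels(dend2)   # leaf order after reordering
```

Both trees contain the same leaves and the same merge heights, so reordering changes how the heatmap rows are laid out but not which rows are clustered together.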
m2 = matrix(1:100, nrow = 10, byrow = TRUE)
Heatmap(m2, name = "mat1", row_dend_reorder = FALSE, column_title = "no reordering")
Heatmap(m2, name = "mat2", row_dend_reorder = TRUE, column_title = "apply reordering")

There are many other ways to reorder a dendrogram, for example the dendsort package. All of these methods return a reordered dendrogram object, so the reordered row/column dendrograms can be built first and then passed to the cluster_rows and cluster_columns arguments.

Heatmap(mat, name = "mat", column_title = "default reordering")
library(dendsort)
row_dend = dendsort(hclust(dist(mat)))
col_dend = dendsort(hclust(dist(t(mat))))
Heatmap(mat, name = "mat", cluster_rows = row_dend, cluster_columns = col_dend,
    column_title = "reorder by dendsort")

2.4 Changing row/column order
Clustering changes the row/column order. The order can also be set manually with row_order and column_order:

Heatmap(mat, name = "mat",
    row_order = order(as.numeric(gsub("row", "", rownames(mat)))),
    column_order = order(as.numeric(gsub("column", "", colnames(mat)))),
    column_title = "reorder matrix")
Heatmap(mat, name = "mat",
    row_order = sort(rownames(mat)),
    column_order = sort(colnames(mat)),
    column_title = "reorder matrix by row/column names")

2.5 Ordering with the seriation package
The seriation package is dedicated to ordering problems (see: Make Patterns Pop Out of Heatmaps with Seriation). Some usage examples:

library(seriation)
o = seriate(max(mat) - mat, method = "BEA_TSP")
Heatmap(max(mat) - mat, name = "mat",
    row_order = get_order(o, 1), column_order = get_order(o, 2),
    column_title = "seriation by BEA_TSP method")
o1 = seriate(dist(mat), method = "TSP")
o2 = seriate(dist(t(mat)), method = "TSP")
Heatmap(mat, name = "mat", row_order = get_order(o1), column_order = get_order(o2),
    column_title = "seriation from the distance matrix")
o1 = seriate(dist(mat), method = "GW")
## Registered S3 method overwritten by 'gclus':
##   method         from
##   reorder.hclust seriation
o2 = seriate(dist(t(mat)), method = "GW")
Heatmap(mat, name = "mat", cluster_rows = as.dendrogram(o1[[1]]),
    cluster_columns = as.dendrogram(o2[[1]]))

2.6 Row/column names
Row and column names are shown by default. To hide them, use the show_row_names and show_column_names arguments:

Heatmap(mat, name = "mat", show_row_names = FALSE, show_column_names = FALSE)

To change their position, use row_names_side and column_names_side:

Heatmap(mat, name = "mat", row_names_side = "left", row_dend_side = "right",
    column_names_side = "top", column_dend_side = "bottom")

To change the style of the row/column names, use row_names_gp and column_names_gp. Vector-valued settings should have one value per row or column (mat has 24 columns):

Heatmap(mat, name = "mat", row_names_gp = gpar(fontsize = 20),
    column_names_gp = gpar(col = c(rep("red", 10), rep("blue", 14))))

Centering the names:

Heatmap(mat, name = "mat", row_names_centered = TRUE, column_names_centered = TRUE)

Rotating the names:

Heatmap(mat, name = "mat", column_names_rot = 45)
Heatmap(mat, name = "mat", column_names_rot = 45, column_names_side = "top",
    column_dend_side = "bottom")

What if the row/column names are too long? That can be adjusted too:
mat2 = mat
rownames(mat2)[1] = paste(c(letters, LETTERS), collapse = "")
Heatmap(mat2, name = "mat", row_title = "default row_names_max_width")
Heatmap(mat2, name = "mat", row_title = "row_names_max_width as length of a*",
    row_names_max_width = max_text_width(
        rownames(mat2),
        gp = gpar(fontsize = 12)
    ))

Row/column labels can be set separately from the row/column names of the matrix. This works around the restriction that the names of the original matrix should not contain duplicates, and it allows special symbols:

# use a named vector to make sure the correspondence between row names and row labels is correct
row_labels = structure(paste0(letters[1:24], 1:24), names = paste0("row", 1:24))
column_labels = structure(paste0(LETTERS[1:24], 1:24), names = paste0("column", 1:24))
row_labels
##  row1  row2  row3  row4  row5  row6  row7  row8  row9 row10 row11 row12 row13
##  "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9" "j10" "k11" "l12" "m13"
## row14 row15 row16 row17 row18 row19 row20 row21 row22 row23 row24
## "n14" "o15" "p16" "q17" "r18" "s19" "t20" "u21" "v22" "w23" "x24"
Heatmap(mat, name = "mat", row_labels = row_labels[rownames(mat)],
    column_labels = column_labels[colnames(mat)])
Heatmap(mat, name = "mat", row_labels = expression(alpha, beta, gamma, delta, epsilon,
    zeta, eta, theta, iota, kappa, lambda, mu, nu, xi, omicron, pi, rho, sigma))

2.7 Splitting the heatmap
Splitting is mainly controlled by four arguments:
- row_km
- row_split
- column_km
- column_split
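To give a rough idea of how the `_km` and `_split` arguments relate, here is a base R sketch of what splitting rows by k-means amounts to conceptually. The toy matrix `m` and the shift applied to half of its rows are made up for illustration, and the way ComplexHeatmap runs k-means internally may differ in its details.

```r
set.seed(123)
# toy 18 x 4 matrix, for illustration only
m = matrix(rnorm(18 * 4), nrow = 18,
    dimnames = list(paste0("row", 1:18), NULL))
m[1:9, ] = m[1:9, ] + 3   # shift half of the rows to create two groups

# k-means on the rows assigns every row to one of two groups
km = kmeans(m, centers = 2)
groups = km$cluster

table(groups)                # number of rows in each group
split(rownames(m), groups)   # row names in each group
```

A grouping vector like `groups` is exactly the kind of categorical variable that row_split accepts, while row_km = 2 asks Heatmap() to compute such a grouping itself.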
2.7.1 Splitting by k-means

Heatmap(mat, name = "mat", row_km = 2)
Heatmap(mat, name = "mat", column_km = 3)
Heatmap(mat, name = "mat", row_km = 2, column_km = 3)

2.7.2 Splitting by categorical variables

# split by a vector
Heatmap(mat, name = "mat",
    row_split = rep(c("A", "B"), 9), column_split = rep(c("C", "D"), 12))
# split by a data frame
Heatmap(mat, name = "mat",
    row_split = data.frame(rep(c("A", "B"), 9), rep(c("C", "D"), each = 9)))
# split on both dimensions
Heatmap(mat, name = "mat", row_split = factor(rep(c("A", "B"), 9)),
    column_split = factor(rep(c("C", "D"), 12)))

2.7.3 Splitting by dendrogram

Heatmap(mat, name = "mat", row_split = 2, column_split = 3)
dend = hclust(dist(mat))
dend = color_branches(dend, k = 2)
Heatmap(mat, name = "mat", cluster_rows = dend, row_split = 2)
split = data.frame(cutree(hclust(dist(mat)), k = 2), rep(c("A", "B"), 9))
Heatmap(mat, name = "mat", row_split = split)

2.7.4 Slice order
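By default, when row_split/column_split is set to categorical variables (a vector or a data frame), or when row_km/column_km is set, an additional clustering is applied to the slice means to show the hierarchy at the slice level. In that case the order of the slices cannot be controlled precisely, because it is determined by the clustering of the slices. Setting cluster_row_slices or cluster_column_slices to FALSE turns off slice clustering, and the slice order can then be controlled exactly.
With slice clustering turned off, the order of the slices is controlled by the levels of each variable in row_split/column_split (each variable should then be a factor). If all variables are character vectors, the default order is unique(row_split) or unique(column_split).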
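The difference between the two default orders mentioned above can be seen with base R alone. The grouping vector `grp` is made up for illustration.

```r
grp = rep(c("B", "A", "C"), times = 2)

# character vector: order of first appearance
unique(grp)
## [1] "B" "A" "C"

# factor with default levels: sorted order
levels(factor(grp))
## [1] "A" "B" "C"

# factor with explicit levels: exactly the order that was given
f = factor(grp, levels = c("C", "B", "A"))
levels(f)
## [1] "C" "B" "A"
```

Passing a factor such as `f` as row_split together with cluster_row_slices = FALSE should therefore give the slices in the order C, B, A.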
Heatmap(mat, name = "mat",
    row_split = rep(LETTERS[1:3], 6),
    column_split = rep(letters[1:6], 4))
# clustering is similar as previous heatmap with branches in some nodes in the dendrogram flipped
Heatmap(mat, name = "mat",
    row_split = factor(rep(LETTERS[1:3], 6), levels = LETTERS[3:1]),
    column_split = factor(rep(letters[1:6], 4), levels = letters[6:1]))
# now the order is exactly what we set
Heatmap(mat, name = "mat",
    row_split = factor(rep(LETTERS[1:3], 6), levels = LETTERS[3:1]),
    column_split = factor(rep(letters[1:6], 4), levels = letters[6:1]),
    cluster_row_slices = FALSE,
    cluster_column_slices = FALSE)

2.7.5 Slice titles
split = data.frame(rep(c("A", "B"), 9), rep(c("C", "D"), each = 9))
Heatmap(mat, name = "mat", row_split = split, row_title = "%s|%s")
map = c("A" = "aaa", "B" = "bbb", "C" = "333", "D" = "444")
Heatmap(mat, name = "mat", row_split = split, row_title = "@{map[ x[1] ]}|@{map[ x[2] ]}")
Heatmap(mat, name = "mat", row_split = split, row_title = "{map[ x[1] ]}|{map[ x[2] ]}")
Heatmap(mat, name = "mat", row_split = split, row_title = "%s|%s", row_title_rot = 0)
Heatmap(mat, name = "mat", row_split = 2, row_title = "cluster_%s")
Heatmap(mat, name = "mat", row_split = split,
    row_title = c("top_slice", "middle_top_slice", "middle_bottom_slice", "bottom_slice"),
    row_title_rot = 0)
Heatmap(mat, name = "mat", row_split = split, row_title = "there are four slices")
ht = Heatmap(mat, name = "mat", row_split = split, row_title = "%s|%s")
# This row_title is actually a heatmap-list-level row title
draw(ht, row_title = "I am a row title")
Heatmap(mat, name = "mat", row_split = split, row_title = NULL)

2.7.6 Graphical parameters for slices
# By default there is no space on top of the title; here we add 4pt of padding
ht_opt$TITLE_PADDING = unit(c(4, 4), "points")
Heatmap(mat, name = "mat",
    row_km = 2, row_title_gp = gpar(col = c("red", "blue"), font = 1:2),
    row_names_gp = gpar(col = c("green", "orange"), fontsize = c(10, 14)),
    column_km = 3, column_title_gp = gpar(fill = c("red", "blue", "green"), font = 1:3),
    column_names_gp = gpar(col = c("green", "orange", "purple"), fontsize = c(10, 14, 8)))

2.7.7 Gaps between slices

Heatmap(mat, name = "mat", row_km = 3, row_gap = unit(5, "mm"))
Heatmap(mat, name = "mat", row_km = 3, row_gap = unit(c(2, 4), "mm"))
Heatmap(mat, name = "mat", row_km = 2, column_km = 3, border = TRUE)
Heatmap(mat, name = "mat", row_km = 2, column_km = 3,
    row_gap = unit(0, "mm"), column_gap = unit(0, "mm"), border = TRUE)

2.7.8 Splitting annotations

Heatmap(mat, name = "mat", row_km = 2, column_km = 3,
    top_annotation = HeatmapAnnotation(foo1 = 1:24, bar1 = anno_points(runif(24))),
    right_annotation = rowAnnotation(foo2 = 18:1, bar2 = anno_barplot(runif(18)))
)

2.8 Raster images (omitted)
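2.9 Customizing the heatmap body
2.9.1 cell_fun
cell_fun draws on each individual cell. It takes a function with 7 arguments (the argument names can differ, but the order must be as listed, with the column index first):
- j: column index
- i: row index
- x: x coordinate of the center of the cell
- y: y coordinate of the center of the cell
- width: width of the cell
- height: height of the cell
- fill: fill color of the cell
The most common use is adding numbers to the heatmap: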
small_mat = mat[1:9, 1:9]
col_fun = colorRamp2(c(-2, 0, 2), c("green", "white", "red"))
Heatmap(small_mat, name = "mat", col = col_fun,
    cell_fun = function(j, i, x, y, width, height, fill) {
        grid.text(sprintf("%.1f", small_mat[i, j]), x, y, gp = gpar(fontsize = 10))
})

You can also add only the numbers that are greater than 0:

Heatmap(small_mat, name = "mat", col = col_fun,
    cell_fun = function(j, i, x, y, width, height, fill) {
        if(small_mat[i, j] > 0)
            grid.text(sprintf("%.1f", small_mat[i, j]), x, y, gp = gpar(fontsize = 10))
})

The heatmap can be split directly, with no change needed in cell_fun:

Heatmap(small_mat, name = "mat", col = col_fun,
    row_km = 2, column_km = 2,
    cell_fun = function(j, i, x, y, width, height, fill) {
        grid.text(sprintf("%.1f", small_mat[i, j]), x, y, gp = gpar(fontsize = 10))
})

Visualizing a correlation matrix:
cor_mat = cor(small_mat)
od = hclust(dist(cor_mat))$order
cor_mat = cor_mat[od, od]
nm = rownames(cor_mat)
col_fun = circlize::colorRamp2(c(-1, 0, 1), c("green", "white", "red"))
# `col = col_fun` here is used to generate the legend
Heatmap(cor_mat, name = "correlation", col = col_fun, rect_gp = gpar(type = "none"),
    cell_fun = function(j, i, x, y, width, height, fill) {
        grid.rect(x = x, y = y, width = width, height = height,
            gp = gpar(col = "grey", fill = NA))
        if(i == j) {
            grid.text(nm[i], x = x, y = y)
        } else if(i > j) {
            grid.circle(x = x, y = y, r = abs(cor_mat[i, j])/2 * min(unit.c(width, height)),
                gp = gpar(fill = col_fun(cor_mat[i, j]), col = NA))
        } else {
            grid.text(sprintf("%.1f", cor_mat[i, j]), x, y, gp = gpar(fontsize = 10))
        }
    }, cluster_rows = FALSE, cluster_columns = FALSE,
    show_row_names = FALSE, show_column_names = FALSE)

Drawing a Go board:
str = "B[cp];W[pq];B[dc];W[qd];B[eq];W[od];B[de];W[jc];B[qk];W[qn]
;B[qh];W[ck];B[ci];W[cn];B[hc];W[je];B[jq];W[df];B[ee];W[cf]
;B[ei];W[bc];B[ce];W[be];B[bd];W[cd];B[bf];W[ad];B[bg];W[cc]
;B[eb];W[db];B[ec];W[lq];B[nq];W[jp];B[iq];W[kq];B[pp];W[op]
;B[po];W[oq];B[rp];W[ql];B[oo];W[no];B[pl];W[pm];B[np];W[qq]
;B[om];W[ol];B[pk];W[qp];B[on];W[rm];B[mo];W[nr];B[rl];W[rk]
;B[qm];W[dp];B[dq];W[ql];B[or];W[mp];B[nn];W[mq];B[qm];W[bp]
;B[co];W[ql];B[no];W[pr];B[qm];W[dd];B[pn];W[ed];B[bo];W[eg]
;B[ef];W[dg];B[ge];W[gh];B[gf];W[gg];B[ek];W[ig];B[fd];W[en]
;B[bn];W[ip];B[dm];W[ff];B[cb];W[fe];B[hp];W[ho];B[hq];W[el]
;B[dl];W[fk];B[ej];W[fp];B[go];W[hn];B[fo];W[em];B[dn];W[eo]
;B[gp];W[ib];B[gc];W[pg];B[qg];W[ng];B[qc];W[re];B[pf];W[of]
;B[rc];W[ob];B[ph];W[qo];B[rn];W[mi];B[og];W[oe];B[qe];W[rd]
;B[rf];W[pd];B[gm];W[gl];B[fm];W[fl];B[lj];W[mj];B[lk];W[ro]
;B[hl];W[hk];B[ik];W[dk];B[bi];W[di];B[dj];W[dh];B[hj];W[gj]
;B[li];W[lh];B[kh];W[lg];B[jn];W[do];B[cl];W[ij];B[gk];W[bl]
;B[cm];W[hk];B[jk];W[lo];B[hi];W[hm];B[gk];W[bm];B[cn];W[hk]
;B[il];W[cq];B[bq];W[ii];B[sm];W[jo];B[kn];W[fq];B[ep];W[cj]
;B[bk];W[er];B[cr];W[gr];B[gk];W[fj];B[ko];W[kp];B[hr];W[jr]
;B[nh];W[mh];B[mk];W[bb];B[da];W[jh];B[ic];W[id];B[hb];W[jb]
;B[oj];W[fn];B[fs];W[fr];B[gs];W[es];B[hs];W[gn];B[kr];W[is]
;B[dr];W[fi];B[bj];W[hd];B[gd];W[ln];B[lm];W[oi];B[oh];W[ni]
;B[pi];W[ki];B[kj];W[ji];B[so];W[rq];B[if];W[jf];B[hh];W[hf]
;B[he];W[ie];B[hg];W[ba];B[ca];W[sp];B[im];W[sn];B[rm];W[pe]
;B[qf];W[if];B[hk];W[nj];B[nk];W[lr];B[mn];W[af];B[ag];W[ch]
;B[bh];W[lp];B[ia];W[ja];B[ha];W[sf];B[sg];W[se];B[eh];W[fh]
;B[in];W[ih];B[ae];W[so];B[af]"
# remove the line breaks inside the string before splitting it into moves
str = gsub("\\n", "", str)
step = strsplit(str, ";")[[1]]
type = gsub("(B|W).*", "\\1", step)
row = gsub("(B|W)\\[(.).\\]", "\\2", step)
column = gsub("(B|W)\\[.(.)\\]", "\\2", step)
go_mat = matrix(nrow = 19, ncol = 19)
rownames(go_mat) = letters[1:19]
colnames(go_mat) = letters[1:19]
for(i in seq_along(row)) {
    go_mat[row[i], column[i]] = type[i]
}
go_mat[1:4, 1:4]
##   a   b   c   d
## a NA  NA  NA  "W"
## b "W" "W" "W" "B"
## c "B" "B" "W" "W"
## d "B" "W" "B" "W"
Heatmap(go_mat, name = "go", rect_gp = gpar(type = "none"),
    cell_fun = function(j, i, x, y, w, h, col) {
        # board background
        grid.rect(x, y, w, h, gp = gpar(fill = "#dcb35c", col = NA))
        # vertical grid line, shortened on the first and last rows
        if(i == 1) {
            grid.segments(x, y-h*0.5, x, y)
        } else if(i == nrow(go_mat)) {
            grid.segments(x, y, x, y+h*0.5)
        } else {
            grid.segments(x, y-h*0.5, x, y+h*0.5)
        }
        # horizontal grid line, shortened on the first and last columns
        if(j == 1) {
            grid.segments(x, y, x+w*0.5, y)
        } else if(j == ncol(go_mat)) {
            grid.segments(x-w*0.5, y, x, y)
        } else {
            grid.segments(x-w*0.5, y, x+w*0.5, y)
        }
        # star points
        if(i %in% c(4, 10, 16) & j %in% c(4, 10, 16)) {
            grid.points(x, y, pch = 16, size = unit(2, "mm"))
        }
        # stones
        r = min(unit.c(w, h))*0.45
        if(is.na(go_mat[i, j])) {
        } else if(go_mat[i, j] == "W") {
            grid.circle(x, y, r, gp = gpar(fill = "white", col = "white"))
        } else if(go_mat[i, j] == "B") {
            grid.circle(x, y, r, gp = gpar(fill = "black", col = "black"))
        }
    },
    col = c("B" = "black", "W" = "white"),
    show_row_names = FALSE, show_column_names = FALSE,
    column_title = "One famous GO game",
    heatmap_legend_param = list(title = "Player", at = c("B", "W"),
        labels = c("player1", "player2"), border = "black")
)

2.9.2 layer_fun
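Usage is much the same. layer_fun is essentially a vectorized version of cell_fun: it is faster, but it serves the same purpose. Learning one of the two is enough, although knowing both is better.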
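Because layer_fun receives whole vectors of row and column indices, values have to be looked up pairwise, which is what pindex() is used for in the code that follows. A base R sketch of the idea, with a made-up matrix `m` and index vectors (matrix indexing with cbind() is used here as a stand-in; it is my understanding that pindex() returns the same pairwise values):

```r
m = matrix(1:12, nrow = 3)
i = c(1, 2, 3)   # row indices of three cells
j = c(4, 1, 2)   # column indices of the same three cells

# ordinary indexing is NOT pairwise: it returns a 3 x 3 sub-matrix
dim(m[i, j])
## [1] 3 3

# pairwise indexing: m[1, 4], m[2, 1], m[3, 2]
paired = m[cbind(i, j)]
paired
## [1] 10  2  6
```

This is why the layer_fun examples use pindex(small_mat, i, j) where the cell_fun examples use small_mat[i, j].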
# code only for demonstration
Heatmap(..., layer_fun = function(j, i, x, y, w, h, fill) {...})
# or you can capitalize the arguments to mark they are vectors,
# the names of the arguments do not matter
Heatmap(..., layer_fun = function(J, I, X, Y, W, H, F) {...})

col_fun = colorRamp2(c(-2, 0, 2), c("green", "white", "red"))
Heatmap(small_mat, name = "mat", col = col_fun,
    layer_fun = function(j, i, x, y, width, height, fill) {
        # since grid.text can also be vectorized
        grid.text(sprintf("%.1f", pindex(small_mat, i, j)), x, y, gp = gpar(fontsize = 10))
})
Heatmap(small_mat, name = "mat", col = col_fun,
    row_km = 2, column_km = 2,
    layer_fun = function(j, i, x, y, width, height, fill) {
        v = pindex(small_mat, i, j)
        grid.text(sprintf("%.1f", v), x, y, gp = gpar(fontsize = 10))
        # draw a border around slices where more than 75% of the values are positive
        if(sum(v > 0)/length(v) > 0.75) {
            grid.rect(gp = gpar(lwd = 2, fill = "transparent"))
        }
})
Heatmap(small_mat, name = "mat", col = col_fun,
    row_km = 2, column_km = 2,
    layer_fun = function(j, i, x, y, w, h, fill) {
        # restore_matrix() arranges the indices of the current slice as a matrix
        ind_mat = restore_matrix(j, i, x, y)
        for(ir in seq_len(nrow(ind_mat))) {
            # start from the second column
            for(ic in seq_len(ncol(ind_mat))[-1]) {
                ind1 = ind_mat[ir, ic-1] # previous column
                ind2 = ind_mat[ir, ic]   # current column
                v1 = small_mat[i[ind1], j[ind1]]
                v2 = small_mat[i[ind2], j[ind2]]
                if(v1 * v2 > 0) { # if they have the same sign
                    col = ifelse(v1 > 0, "darkred", "darkgreen")
                    grid.segments(x[ind1], y[ind1], x[ind2], y[ind2],
                        gp = gpar(col = col, lwd = 2))
                    grid.points(x[c(ind1, ind2)], y[c(ind1, ind2)],
                        pch = 16, gp = gpar(col = col), size = unit(4, "mm"))
                }
            }
        }
    }
)

2.10 Heatmap size
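heatmap_width and heatmap_height control the size of the complete heatmap, including all heatmap components but excluding the legends. width and height control the size of the heatmap body only.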
Heatmap(mat, name = "mat", width = unit(8, "cm"), height = unit(8, "cm"))
Heatmap(mat, name = "mat", heatmap_width = unit(8, "cm"), heatmap_height = unit(8, "cm"))

sessionInfo()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19044)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=Chinese (Simplified)_China.936
## [2] LC_CTYPE=Chinese (Simplified)_China.936
## [3] LC_MONETARY=Chinese (Simplified)_China.936
## [4] LC_NUMERIC=C
## [5] LC_TIME=Chinese (Simplified)_China.936
##
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods
## [8] base
##
## other attached packages:
## [1] seriation_1.3.0      dendsort_0.3.4       dendextend_1.15.1
## [4] circlize_0.4.13      ComplexHeatmap_2.8.0
##
## loaded via a namespace (and not attached):
##  [1] shape_1.4.6         GetoptLong_1.0.5    tidyselect_1.1.1
##  [4] xfun_0.25           purrr_0.3.4         colorspace_2.0-2
##  [7] vctrs_0.3.8         generics_0.1.0      viridisLite_0.4.0
## [10] stats4_4.1.0        utf8_1.2.2          rlang_0.4.11
## [13] pillar_1.6.2        glue_1.4.2          DBI_1.1.1
## [16] BiocGenerics_0.38.0 RColorBrewer_1.1-2  registry_0.5-1
## [19] matrixStats_0.60.0  foreach_1.5.1       lifecycle_1.0.0
## [22] stringr_1.4.0       munsell_0.5.0       gtable_0.3.0
## [25] GlobalOptions_0.1.2 codetools_0.2-18    evaluate_0.14
## [28] knitr_1.33          IRanges_2.26.0      Cairo_1.5-12.2
## [31] doParallel_1.0.16   parallel_4.1.0      fansi_0.5.0
## [34] highr_0.9           Rcpp_1.0.7          scales_1.1.1
## [37] S4Vectors_0.30.0    magick_2.7.3        gridExtra_2.3
## [40] rjson_0.2.20        ggplot2_3.3.5       png_0.1-7
## [43] digest_0.6.27       gclus_1.3.2         stringi_1.7.3
## [46] dplyr_1.0.7         clue_0.3-59         tools_4.1.0
## [49] magrittr_2.0.1      tibble_3.1.3        cluster_2.1.2
## [52] crayon_1.4.1        pkgconfig_2.0.3     ellipsis_0.3.2
## [55] viridis_0.6.1       assertthat_0.2.1    iterators_1.0.13
## [58] TSP_1.1-10          R6_2.5.1            compiler_4.1.0