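Preface
This article introduces UpSetR, an R package dedicated to visualizing sets. It comes into its own when a Venn diagram of many sets becomes hard to read.
I. R package and data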
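# Install and load the package
# install.packages("UpSetR")
library(UpSetR)

# Load the dataset
data <- read.csv("upset.csv", header = TRUE)

# Preview the data; the table is wide, so show only the first six columns
head(data[, 1:6], 6)
# View(data)  # opens the data in a viewer window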
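upset() expects a data frame with one row per element and one 0/1 column per set. As a minimal sketch of that layout (the toy sets and member names below are made up for illustration and are not part of the original dataset), the membership table can be built by hand with base R, or with UpSetR's fromList():

```r
# Hypothetical toy sets; names and members are invented for illustration.
sets <- list(
  Action = c("m1", "m2", "m3"),
  Drama  = c("m2", "m3", "m4", "m5"),
  Comedy = c("m3", "m5", "m6")
)

# Build the 0/1 membership table by hand:
# one row per member, one column per set.
members <- sort(unique(unlist(sets)))
membership <- as.data.frame(
  sapply(sets, function(s) as.integer(members %in% s)),
  row.names = members
)
print(membership)

# UpSetR::fromList() performs a similar conversion (without row names).
# Guarded so the sketch still runs when UpSetR is not installed.
if (requireNamespace("UpSetR", quietly = TRUE)) {
  UpSetR::upset(UpSetR::fromList(sets), order.by = "freq")
}
```

A CSV such as upset.csv is assumed to already be in this shape, with extra attribute columns alongside the 0/1 set columns.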
II. The upset() function
Use the upset() function from the UpSetR package to draw the set visualization.
1) Basic parameters
upset(data,
      # Show only these sets (column names assumed to match UpSetR's bundled movies data)
      sets = c("Action", "Adventure", "Comedy", "Drama", "Fantasy", "Children", "Crime"),
      mb.ratio = c(0.55, 0.45),        # height ratio of the top bar chart to the bottom dot matrix
      order.by = "freq",               # sort intersections by size, largest first
      keep.order = TRUE,               # keep the sets in the order given in `sets`
      number.angles = 30,              # angle of the numbers above the bars
      point.size = 2, line.size = 1,   # size of the matrix points and connecting lines
      mainbar.y.label = "Genre Intersections",
      sets.x.label = "Movies Per Genre",   # axis titles
      # Six scale factors, in order: intersection size title, intersection size tick labels,
      # set size title, set size tick labels, set names, numbers above bars
      text.scale = c(1.3, 1.3, 1, 1, 1.5, 1))
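To make clear what the bars in the upper panel count, here is a rough base-R sketch, assuming each bar counts the rows that belong to exactly one combination of sets. The toy table is invented for illustration; it is not the movies data and does not use UpSetR internals.

```r
# Hypothetical 0/1 membership table (made-up data).
toy <- data.frame(
  Action = c(1, 1, 1, 0, 0, 0, 0, 1),
  Drama  = c(0, 1, 1, 1, 1, 0, 1, 1),
  Comedy = c(0, 0, 1, 0, 1, 1, 0, 0)
)

# Label each row by the exact combination of sets it belongs to.
combo <- apply(toy, 1, function(r) paste(names(toy)[r == 1], collapse = "&"))

# Count rows per combination and sort from largest to smallest,
# which mirrors the ordering requested by order.by = "freq".
sizes <- sort(table(combo), decreasing = TRUE)
print(sizes)
```

Each row falls into exactly one combination, so the counts sum to the number of rows.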
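2) The queries parameter
queries takes a list of queries. Each query is itself a list with four main fields: query, params, color, and active (plus an optional query.name for the legend).
query: the query function to apply; UpSetR ships built-in queries such as intersects, and custom functions are also accepted;
params: a list of arguments for the query; for intersects, the names of the sets whose intersection is targeted;
color: the color used for the query; if omitted, a color from the package's default palette is used;
active: how the matched bar is marked. TRUE overlays the bar with the query color; FALSE places a triangle at the top of the bar.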
upset(data,
      main.bar.color = "black",
      queries = list(
        list(query = intersects,         # built-in intersects query
             params = list("Drama"),     # the intersection the query targets
             color = "red",              # falls back to the default palette if omitted
             active = FALSE,             # TRUE: color the bar; FALSE: triangle at the top of the bar
             query.name = "Drama"),      # label shown in the query legend
        list(query = intersects,
             params = list("Action", "Drama"),
             active = TRUE,
             query.name = "Emotional action"),
        list(query = intersects,
             params = list("Drama", "Comedy", "Action"),
             color = "orange",
             active = TRUE)),
      query.legend = "top")
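Since the query field also accepts a user-defined function, the following sketch shows one possible custom query, adapted from the pattern used in UpSetR's queries vignette. The function name recent_and_good is hypothetical, and the column names ReleaseDate and AvgRating assume the movies data bundled with UpSetR.

```r
# Custom query: TRUE for rows released in one of the given years
# and rated above a threshold. The first argument is a single row of the data;
# the remaining arguments are filled from `params`, in order.
recent_and_good <- function(row, release, rating) {
  (row["ReleaseDate"] %in% release) & (row["AvgRating"] > rating)
}

# Guarded so the sketch still runs when UpSetR is not installed.
if (requireNamespace("UpSetR", quietly = TRUE)) {
  library(UpSetR)
  # Load the movies data that ships with the package.
  movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"),
                     header = TRUE, sep = ";")
  upset(movies,
        queries = list(
          list(query = recent_and_good,
               params = list(c(1970, 1980, 1990, 1999, 2000), 2.5),
               color = "blue",
               active = TRUE)))
}
```

Unlike intersects, which highlights a whole intersection, a query like this selects individual rows, so it can color part of several bars at once.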
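3) The attribute.plots parameter
attribute.plots adds attribute plots alongside the main figure. UpSetR ships built-in histogram and scatter_plot functions, and custom plot functions can also be supplied.
3.1 Adding a histogram and a scatter plot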
upset(data,
      main.bar.color = "black",
      queries = list(
        list(query = intersects, params = list("Drama"),
             color = "red", active = FALSE, query.name = "Drama"),
        list(query = intersects, params = list("Action", "Drama"),
             active = TRUE, query.name = "Emotional action"),
        list(query = intersects, params = list("Drama", "Comedy", "Action"),
             color = "orange", active = TRUE)),
      attribute.plots = list(                     # add attribute plots
        gridrows = 45,
        plots = list(
          list(plot = scatter_plot,               # scatter plot
               x = "ReleaseDate", y = "AvgRating",    # x and y variables
               queries = TRUE),                   # TRUE: show the query colors defined above
          list(plot = histogram,
               x = "ReleaseDate",
               queries = FALSE)),
        ncols = 2),                               # lay the attribute plots out in two columns
      query.legend = "top")                       # place the query legend at the top
3.2 Adding box plots
At most two attributes can be summarized as box plots in one call.
upset(data, boxplot.summary = c("AvgRating", "ReleaseDate"))
3.3 Adding a density plot
The built-in attribute plots do not include a density plot, so a custom plot function is needed.
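# Custom density plot; needs ggplot2 and plyr (for round_any)
library(ggplot2)
library(plyr)

another.plot <- function(data, x, y) {
  # Round the y variable up to the next multiple of 10 (decade)
  data$decades <- round_any(as.integer(unlist(data[y])), 10, ceiling)
  # Keep rows whose decade is 1970 or later
  data <- data[which(data$decades >= 1970), ]
  myplot <- (ggplot(data, aes_string(x = x)) +
               geom_density(aes(fill = factor(decades)), alpha = 0.4) +
               theme(plot.margin = unit(c(0, 0, 0, 0), "cm"),
                     legend.key.size = unit(0.4, "cm")))
}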
upset(data,
      main.bar.color = "black",
      mb.ratio = c(0.5, 0.5),
      queries = list(
        list(query = intersects, params = list("Drama"),
             color = "red", active = FALSE),
        list(query = intersects, params = list("Action", "Drama"),
             active = TRUE),
        list(query = intersects, params = list("Drama", "Comedy", "Action"),
             color = "orange", active = TRUE)),
      attribute.plots = list(
        gridrows = 50,
        plots = list(
          list(plot = histogram, x = "ReleaseDate", queries = FALSE),
          list(plot = scatter_plot, x = "ReleaseDate", y = "AvgRating", queries = TRUE),
          list(plot = another.plot, x = "AvgRating", y = "ReleaseDate", queries = FALSE)),
        ncols = 3))
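The decade binning inside the custom plot function relies on plyr's round_any() with ceiling, which rounds each year up to the next multiple of 10. A small base-R sketch of the same arithmetic follows; the years are hypothetical values chosen only to show the rounding behavior.

```r
# Hypothetical release years, chosen to show the rounding behavior.
years <- c(1960, 1965, 1970, 1971, 1999)

# Base-R equivalent of plyr::round_any(years, 10, ceiling):
# divide by the accuracy, round up, multiply back.
decades <- ceiling(years / 10) * 10
print(decades)

# Same filter as in the custom plot function: keep decades >= 1970.
kept <- years[decades >= 1970]
print(kept)
```

Because the years are rounded up, a film from 1965 lands in the 1970 bin and passes the filter, which may or may not be the intended cut-off.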