What are good data visualization techniques for comparing distributions?

May 13, 2012

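I am writing my PhD thesis and have realized that I rely too heavily on boxplots to compare distributions. What other alternatives do you like for this task?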

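I would also like to ask whether you know of any other resources, such as the R gallery, where I can find inspiration for different data visualization ideas.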

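I am going to elaborate on my comment, as suggested by @gung. For completeness, I will also include the violin plot suggested by @Alexander. Some of these tools can be used to compare more than two samples.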

# Required packages

library(sn)
library(aplpack)
library(vioplot)
library(moments)
library(beanplot)

# Simulate from a normal and a skew-normal distribution
x = rnorm(250,0,1)
y = rsn(250,0,1,5)

# Separated histograms
hist(x)
hist(y)

# Combined histograms
hist(x, xlim=c(-4,4),ylim=c(0,1), col="red",probability=T)
hist(y, add=T, col="blue",probability=T)

# Boxplots
boxplot(x,y)

# Separated smoothed densities
plot(density(x))
plot(density(y))

# Combined smoothed densities
plot(density(x),type="l",col="red",ylim=c(0,1),xlim=c(-4,4))
points(density(y),type="l",col="blue")

# Stem-and-leaf plots
stem(x)
stem(y)

# Back-to-back stem-and-leaf plots
stem.leaf.backback(x,y)

# Violin plot (suggested by Alexander)
vioplot(x,y)

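# QQ-plot of the two samples against each other
qqplot(x,y,xlim=c(-4,4),ylim=c(-4,4))
# Reference line y = x: points fall on it when both samples share a distribution
# (qqline() compares one sample with a theoretical distribution, so it does not apply here)
abline(0,1,col="red")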

# Kolmogorov-Smirnov test
ks.test(x,y)

# Six-number summary (min, 1st quartile, median, mean, 3rd quartile, max)
summary(x)
summary(y)

# Moment-based summary (mean, variance, skewness, kurtosis)
c(mean(x),var(x),skewness(x),kurtosis(x))
c(mean(y),var(y),skewness(y),kurtosis(y))

# Empirical ROC curve
xx = c(-Inf, sort(unique(c(x,y))), Inf)
sens = sapply(xx, function(t){mean(x >= t)})
spec = sapply(xx, function(t){mean(y < t)})

plot(0, 0, xlim = c(0, 1), ylim = c(0, 1), type = 'l')
segments(0, 0, 1, 1, col = 1)
lines(1 - spec, sens, type = 'l', col = 2, lwd = 1)

# Beanplots
beanplot(x,y)

# Empirical CDF
plot(ecdf(x))
lines(ecdf(y))
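
The empirical CDF plot and the Kolmogorov-Smirnov test above are closely related: the two-sample KS statistic is the largest vertical gap between the two empirical CDFs. The following is only a minimal sketch of that connection, using base R alone. The seed, the stand-in second sample (a shifted normal rather than the skew-normal used above), and the names `Fx`, `Fy`, `grid`, `gap`, `D`, `t_star` and `D_ks` are illustrative assumptions, not part of the original answer.

```r
# Illustrative data: seed and second sample are assumptions made for this sketch
set.seed(1)
x <- rnorm(250, 0, 1)
y <- rnorm(250, 0.5, 1.5)

# Empirical CDFs as step functions
Fx <- ecdf(x)
Fy <- ecdf(y)

# Both ECDFs only jump at observed values, so the largest
# vertical gap is attained at one of the pooled data points
grid   <- sort(c(x, y))
gap    <- abs(Fx(grid) - Fy(grid))
D      <- max(gap)
t_star <- grid[which.max(gap)]

# Overlay the two ECDFs and mark the largest gap
plot(Fx, col = "red", main = "Empirical CDFs", xlim = range(grid))
lines(Fy, col = "blue")
segments(t_star, Fx(t_star), t_star, Fy(t_star), lwd = 2)
legend("topleft", legend = c("x", "y"), col = c("red", "blue"), lty = 1)

# The hand-computed gap should agree with the statistic from ks.test()
D_ks <- unname(ks.test(x, y)$statistic)
```

Marking the gap on the plot shows where along the axis the two samples differ most, which the test's p-value alone does not tell you.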

I hope this helps.

Source: https://stats.stackexchange.com/questions/28431
