Large difference in p-values between t.test and prop.test

July 23, 2021

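To test for a difference in hits (1 versus 0) between two independent groups $ X $ and $ Y $, a t-test should be applicable, based on the following considerations: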

$ x_i\in\{0,1\} $ is the measurement for the $ i $-th item in group $ X $, and $ y_i\in\{0,1\} $ is the same for group $ Y $

The proportion in each group is the mean of the measurements, i.e. $ \mu_X=\sum_i x_i/n $ and $ \mu_Y=\sum_i y_i/n $

The difference between the means $ \mu_X $ and $ \mu_Y $ can be tested with a two-sample t-test
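The three considerations above can be sketched in R as follows. This is a minimal illustration, assuming the 10-of-100 and 20-of-100 data from the example in this question and the Welch (unequal-variance) form that t.test uses by default; the variable names are chosen only for illustration.

```r
# 0/1 measurements for the two groups (same data as the example in the question)
x <- c(rep(1, 10), rep(0, 90))
y <- c(rep(1, 20), rep(0, 80))

# The proportion of hits in each group is the mean of the 0/1 measurements
mu_x <- mean(x)
mu_y <- mean(y)

# Squared standard errors of the two means, using the sample variances
se2_x <- var(x) / length(x)
se2_y <- var(y) / length(y)

# Welch t statistic for the difference of the means
t_stat <- (mu_x - mu_y) / sqrt(se2_x + se2_y)

# Welch-Satterthwaite approximation to the degrees of freedom
df <- (se2_x + se2_y)^2 /
  (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

print(c(t = t_stat, df = df, p = p_value))
```

These values should agree with what t.test(x, y, paired=FALSE) reports for the same data.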

In this particular case, a test of proportions (the function prop.test in R) is another option. Interestingly, the results are quite different:
> x <- c(rep(1, 10), rep(0, 90))
> y <- c(rep(1, 20), rep(0, 80))
> t.test(x,y,paired=FALSE)
t = -1.99, df = 183.61, p-value = 0.04808
> prop.test(c(10,20), c(100,100))
X-squared = 3.1765, df = 1, p-value = 0.07471
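Note the higher p-value of prop.test. Does this mean that the t-test has higher power, i.e. that it can distinguish $ H_0 $ from its alternative already at smaller $ n $? And are there reasons not to use the t-test in this case?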

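Addition (edit: resolved in the comments under Thomas Lumley's answer below): The t-test result is all the more surprising because even the asymptotic ("Wald") 95% confidence intervals of the two measurements overlap (0.1587989 > 0.1216014):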
> library(binom)
> binom.confint(10, 100, method="asymptotic")
     method  x   n mean      lower     upper
1 asymptotic 10 100  0.1 0.04120108 0.1587989
> binom.confint(20, 100, method="asymptotic")
     method  x   n mean     lower     upper
1 asymptotic 20 100  0.2 0.1216014 0.2783986
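Since confidence intervals based on the t distribution should be even wider than those based on the normal distribution (i.e. on $ z_{1-\alpha/2} $), I do not understand why the t-test reports a significant difference at the 5% level.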
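The comment thread that resolved this point is not reproduced here. One standard explanation, offered as a sketch rather than as the commenters' exact argument, is that overlapping individual intervals do not imply a non-significant difference: the standard error of a difference of independent estimates is $ \sqrt{se_X^2+se_Y^2} $, which is smaller than $ se_X+se_Y $. The base-R code below recomputes the Wald intervals without the binom package; all variable names are illustrative.

```r
# Observed proportions and the common group size from the example
p_x <- 10 / 100
p_y <- 20 / 100
n   <- 100

z <- qnorm(0.975)  # normal quantile for a two-sided 95% interval

# Wald standard errors of the two proportions
se_x <- sqrt(p_x * (1 - p_x) / n)
se_y <- sqrt(p_y * (1 - p_y) / n)

# Individual Wald intervals
ci_x <- p_x + c(-1, 1) * z * se_x
ci_y <- p_y + c(-1, 1) * z * se_y

# Wald interval for the difference: the variances add, not the standard errors
se_diff <- sqrt(se_x^2 + se_y^2)
ci_diff <- (p_x - p_y) + c(-1, 1) * z * se_diff

intervals_overlap  <- ci_x[2] > ci_y[1]   # upper limit of x vs. lower limit of y
diff_excludes_zero <- ci_diff[2] < 0      # whole interval for the difference below 0

print(rbind(ci_x, ci_y, ci_diff))
print(c(overlap = intervals_overlap, excludes_zero = diff_excludes_zero))
```

For these data the two individual intervals overlap while the interval for the difference lies entirely below zero, so the two observations are not in conflict.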

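You are right that the tests should be more similar. Both are tests on means, and of light-tailed distributions, so you should expect them to agree. Moreover, the estimated variance of the mean for the binomial, $ \hat p(1-\hat p)/n $, is very close to $ s^2/n $: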
> var(x)/100
[1] 0.0009090909
> .1*(.9)/100
[1] 9e-04
> .2*(.8)/100
[1] 0.0016
> var(y)/100
[1] 0.001616162
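The small gap between the two variance estimates in the output above can be made explicit. For 0/1 data, $ x_i^2=x_i $, so with $ \hat p=\sum_i x_i/n $ the sample variance works out as follows (a short derivation, assuming the usual $ n-1 $ denominator that var() uses):

$$
s^2=\frac{1}{n-1}\Bigl(\sum_i x_i^2-n\hat p^2\Bigr)=\frac{n\hat p-n\hat p^2}{n-1}=\frac{n}{n-1}\,\hat p(1-\hat p),
\qquad
\frac{s^2}{n}=\frac{\hat p(1-\hat p)}{n-1}.
$$

With $ n=100 $ and $ \hat p=0.1 $ this gives $ 0.09/99\approx 0.000909 $, compared with $ 0.09/100=0.0009 $ for the binomial formula.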
What you are seeing is the continuity correction. Without it, the $ p $-values are almost identical:
> t.test(x,y)

   Welch Two Sample t-test

data:  x and y
t = -1.99, df = 183.61, p-value = 0.04808
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.1991454034 -0.0008545966
sample estimates:
mean of x mean of y 
     0.1       0.2 

> prop.test(c(10,20),c(100, 100),correct=FALSE)

   2-sample test for equality of proportions without continuity correction

data:  c(10, 20) out of c(100, 100)
X-squared = 3.9216, df = 1, p-value = 0.04767
alternative hypothesis: two.sided
95 percent confidence interval:
-0.197998199 -0.002001801
sample estimates:
prop 1 prop 2 
  0.1    0.2 
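The two X-squared values quoted in this thread can be reproduced by hand from the 2×2 table of hits and misses. The sketch below assumes the textbook Pearson statistic with a Yates correction of 0.5; prop.test's internal code may differ in detail (for example, it can shrink the correction when the observed difference is very small), so this is an illustration rather than a description of its implementation.

```r
# 2 x 2 table of observed counts: rows are groups, columns are hit / miss
observed <- matrix(c(10, 90,
                     20, 80),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(group = c("x", "y"),
                                   outcome = c("hit", "miss")))

# Expected counts under H0 (a common proportion in both groups)
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-squared statistic without continuity correction
chisq_plain <- sum((observed - expected)^2 / expected)

# Same statistic with the Yates correction: shrink each |O - E| by 0.5
chisq_yates <- sum((abs(observed - expected) - 0.5)^2 / expected)

# p-values from the chi-squared distribution with 1 degree of freedom
p_plain <- pchisq(chisq_plain, df = 1, lower.tail = FALSE)
p_yates <- pchisq(chisq_yates, df = 1, lower.tail = FALSE)

print(c(X2_plain = chisq_plain, p_plain = p_plain,
        X2_yates = chisq_yates, p_yates = p_yates))
```

Shrinking every deviation by 0.5 is what moves the statistic from about 3.92 down to about 3.18, and the p-value from just below 0.05 to about 0.075.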
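The continuity correction for the chi-squared test is somewhat controversial. It does greatly reduce the number of situations in which the test is anti-conservative, but at the price of making the test noticeably conservative. Not using the "correction" gives p-values that are closer to uniformly distributed under the null hypothesis. And, as you can see here, not using the correction brings you much closer to the t-test.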
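The claim about conservativeness can be checked with a small simulation under the null hypothesis. The sketch below is only an illustration: the common success probability (0.15), the number of replications, and the seed are arbitrary assumptions, not values from the original answer, and the result will vary with those choices.

```r
set.seed(1)        # arbitrary seed, for reproducibility only
n_sims <- 2000     # assumed number of simulated data sets
n      <- 100      # group size, as in the example
p0     <- 0.15     # assumed common success probability under H0

# Simulate one pair of groups under H0 and return the three p-values
sim_one <- function() {
  x <- rbinom(n, 1, p0)
  y <- rbinom(n, 1, p0)
  counts <- c(sum(x), sum(y))
  c(t_test           = t.test(x, y)$p.value,
    prop_corrected   = suppressWarnings(
      prop.test(counts, c(n, n))$p.value),
    prop_uncorrected = suppressWarnings(
      prop.test(counts, c(n, n), correct = FALSE)$p.value))
}

# 3 x n_sims matrix of p-values, one row per test
p_values <- replicate(n_sims, sim_one())

# Share of simulated data sets rejected at the 5% level; a well-calibrated
# test should reject in roughly 5% of cases when H0 is true
rejection_rate <- rowMeans(p_values < 0.05, na.rm = TRUE)
print(rejection_rate)
```

Because the corrected statistic is never larger than the uncorrected one, the corrected test can only reject the same data sets or fewer; how far each rate falls from 5% is what the simulation is meant to show.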

Source: https://stats.stackexchange.com/questions/535683
