用 R 中的 GAM 對象 {mgcv} 校正多個成對比較

November 9, 2018

我的問題涉及 gam 對像中因子水平的成對比較。我有一個數據框，df包含對形狀變化的刺激的反應時間數據（以毫秒為單位）：
subjectID     RT     shape
001           501    square
002           722    circle
003           302    square
...
gam（來自 mgcv 包）提供了一種很好的方法來對這些數據進行建模，同時處理諸如 subjectID 之類的隨機效應變量（即控制受試者之間的“隨機”可變性）：
m1 = gam(data = df, formula = lrt ~ Rp * slfreq + s(subjectID, bs = "re")
我可以用來查看因子內單個條件（和相應的p值）的summary(m1)成對對比，但是這種方法只給出未校正的p值。shape

我的問題是：在處理 R 中的 gam 對象時，是否有一種方法可以對這些 p 值進行校正（例如 Tukey 校正）？

包中的glht()廣義線性假設函數multcomp可用於使用一系列不同的 p 值調整來執行各種對比。對於所有成對比較，您正在尋找的對比也稱為“Tukey”對比。p 值調整包括單步法、Shaffer、Westfall 和所有p.adjust方法，請參閱?summary.glht。

正如@GavinSimpson 在評論中指出的那樣：對於gam()來自mgcv此的對像不能開箱即用，但需要一些手動干預。因為一切都很方便lmer()。lme4我在下面說明瞭如何使用這兩個包multcomp來獲得等效的結果。為了說明，我使用sleepstudy來自lme4但將數字回歸器折疊為Days三級因子的數據（僅用於說明目的）：
library("lme4")
data("sleepstudy", package = "lme4")
sleepstudy$Days <- cut(sleepstudy$Days, breaks = c(-Inf, 2.5, 5.5, Inf),
 labels = c("low", "med", "high"))
m1 <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)
summary(m1)
## ...
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 262.170 9.802 26.747
## Daysmed 31.217 6.365 4.905
## Dayshigh 67.433 5.954 11.326
## ...
然後glht()可用於設置Days因子的所有成對（又名 Tukey）對比。然後該summary()方法應用 p 值調整（默認為單步）。
library("multcomp")
g1 <- glht(m1, linfct = mcp(Days = "Tukey"))
summary(g1)
## Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## Fit: lmer(formula = Reaction ~ Days + (1 | Subject), data = sleepstudy)
## 
## Linear Hypotheses:
## Estimate Std. Error z value Pr(>|z|) 
## med - low == 0 31.217 6.365 4.905 2.28e-06 ***
## high - low == 0 67.433 5.954 11.326 < 1e-06 ***
## high - med == 0 36.216 5.954 6.083 < 1e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)
如問題中所述，可以安裝相同的模型gam()。
library("mgcv")
m2 <- gam(Reaction ~ Days + s(Subject, bs = "re"), data = sleepstudy)
summary(m2)
## ...
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|) 
## (Intercept) 262.170 9.802 26.747 < 2e-16 ***
## Daysmed 31.217 6.365 4.905 2.27e-06 ***
## Dayshigh 67.433 5.954 11.326 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## ...
然而，mcp(Days = "Tukey")描述 Tukey 對比的方法不與gam()輸出配合，因此失敗：
g2 <- glht(m2, linfct = mcp(Days = "Tukey"))
## Error in linfct[[nm]] %*% C : 
## requires numeric/complex matrix/vector arguments
但是，手動設置對比度矩陣並不困難（儘管有點技術性和乏味）。
contr <- matrix(0, nrow = 3, ncol = length(coef(m2)))
colnames(contr) <- names(coef(m2))
rownames(contr) <- c("med - low", "high - low", "high - med")
contr[, 2:3] <- rbind(c(1, 0), c(0, 1), c(-1, 1))
對比矩陣的第一列顯示了此處需要的內容：由於low模型中的係數被約束為零，med - low因此簡單med且類似地為high - low。最後一行顯示了的對比high - med：
contr[, 1:5]
## (Intercept) Daysmed Dayshigh s(Subject).1 s(Subject).2
## med - low 0 1 0 0 0
## high - low 0 0 1 0 0
## high - med 0 -1 1 0 0
使用這個對比矩陣，我們可以進行成對比較glht()：
g2 <- glht(m2, linfct = contr)
summary(g2)
## Simultaneous Tests for General Linear Hypotheses
## 
## Fit: gam(formula = Reaction ~ Days + s(Subject, bs = "re"), data = sleepstudy)
## 
## Linear Hypotheses:
## Estimate Std. Error z value Pr(>|z|) 
## med - low == 0 31.217 6.365 4.905 2.35e-06 ***
## high - low == 0 67.433 5.954 11.326 < 1e-06 ***
## high - med == 0 36.216 5.954 6.083 < 1e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)
另一種表示要測試的對比度的非常方便的方法是通過字符串。這可以根據來自的效果名稱設置線性函數names(coef(m2))。對於具有較少水平的因素（因此較少的 Tukey 對比），這非常有效 - 但如果比較變得更加複雜，則可能更容易像上面那樣收縮對比矩陣。
g3 <- glht(m2, linfct = c("Daysmed = 0", "Dayshigh = 0", "Dayshigh - Daysmed = 0"))
summary(g3)
## Simultaneous Tests for General Linear Hypotheses
## 
## Fit: gam(formula = Reaction ~ Days + s(Subject, bs = "re"), data = sleepstudy)
## 
## Linear Hypotheses:
## Estimate Std. Error z value Pr(>|z|) 
## Daysmed == 0 31.217 6.365 4.905 2.53e-06 ***
## Dayshigh == 0 67.433 5.954 11.326 < 1e-06 ***
## Dayshigh - Daysmed == 0 36.216 5.954 6.083 < 1e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

引用自：https://stats.stackexchange.com/questions/376237

comments powered by Disqus

用 R 中的 GAM 對象 {mgcv} 校正多個成對比較

相關問答

使用 GAM 回歸分析非線性數據，但審稿人建議改為擬合指數或對數曲線。使用哪個？

我們什麼時候“停止”使用多種校正技術？

R中GAM的模型選擇

GAM 中連續和分類預測變量之間交互作用的不同建模方式

GAM（廣義加法模型）是否存在共線性問題？

何時使用 GAM 與 GLM