R
在 R 中估計具有隨機效應的斷棒/分段線性模型中的斷點 [包括代碼和輸出]
當我還需要估計其他隨機效應時,有人可以告訴我如何讓 R 估計分段線性模型中的斷點(作為固定或隨機參數)嗎?
我在下麵包含了一個玩具示例,該示例適合曲棍球棒/斷棒回歸,具有隨機斜率方差和斷點為 4 的隨機 y 截距方差。我想估計斷點而不是指定它。它可以是隨機效應(首選)或固定效應。
library(lme4) str(sleepstudy) #Basis functions bp = 4 b1 <- function(x, bp) ifelse(x < bp, bp - x, 0) b2 <- function(x, bp) ifelse(x < bp, 0, x - bp) #Mixed effects model with break point = 4 (mod <- lmer(Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject), data = sleepstudy)) #Plot with break point = 4 xyplot( Reaction ~ Days | Subject, sleepstudy, aspect = "xy", layout = c(6,3), type = c("g", "p", "r"), xlab = "Days of sleep deprivation", ylab = "Average reaction time (ms)", panel = function(x,y) { panel.points(x,y) panel.lmline(x,y) pred <- predict(lm(y ~ b1(x, bp) + b2(x, bp)), newdata = data.frame(x = 0:9)) panel.lines(0:9, pred, lwd=1, lty=2, col="red") } )
輸出:
Linear mixed model fit by REML Formula: Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject) Data: sleepstudy AIC BIC logLik deviance REMLdev 1751 1783 -865.6 1744 1731 Random effects: Groups Name Variance Std.Dev. Corr Subject (Intercept) 1709.489 41.3460 b1(Days, bp) 90.238 9.4994 -0.797 b2(Days, bp) 59.348 7.7038 0.118 -0.008 Residual 563.030 23.7283 Number of obs: 180, groups: Subject, 18 Fixed effects: Estimate Std. Error t value (Intercept) 289.725 10.350 27.994 b1(Days, bp) -8.781 2.721 -3.227 b2(Days, bp) 11.710 2.184 5.362 Correlation of Fixed Effects: (Intr) b1(D,b b1(Days,bp) -0.761 b2(Days,bp) -0.054 0.181
另一種方法是將對 lmer 的調用包裝在一個函數中,該函數將斷點作為參數傳遞,然後根據斷點使用優化最小化擬合模型的偏差。這最大化了斷點的配置文件對數似然性,並且通常(即,不僅僅是這個問題)如果包裝器內部的函數(在這種情況下為 lmer)找到以傳遞給它的參數為條件的最大似然估計,則整個過程找到所有參數的聯合最大似然估計。
library(lme4) str(sleepstudy) #Basis functions bp = 4 b1 <- function(x, bp) ifelse(x < bp, bp - x, 0) b2 <- function(x, bp) ifelse(x < bp, 0, x - bp) #Wrapper for Mixed effects model with variable break point foo <- function(bp) { mod <- lmer(Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject), data = sleepstudy) deviance(mod) } search.range <- c(min(sleepstudy$Days)+0.5,max(sleepstudy$Days)-0.5) foo.opt <- optimize(foo, interval = search.range) bp <- foo.opt$minimum bp [1] 6.071932 mod <- lmer(Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject), data = sleepstudy)
要獲得斷點的置信區間,您可以使用配置文件可能性。例如,添加
qchisq(0.95,1)
到最小偏差(對於 95% 置信區間),然後搜索foo(x)
等於計算值的點:foo.root <- function(bp, tgt) { foo(bp) - tgt } tgt <- foo.opt$objective + qchisq(0.95,1) lb95 <- uniroot(foo.root, lower=search.range[1], upper=bp, tgt=tgt) ub95 <- uniroot(foo.root, lower=bp, upper=search.range[2], tgt=tgt) lb95$root [1] 5.754051 ub95$root [1] 6.923529
對於這個玩具問題,有些不對稱,但精度還不錯。如果您有足夠的數據使引導程序可靠,則另一種方法是引導估計程序。