R

在 R 中估計具有隨機效應的斷棒/分段線性模型中的斷點 [包括代碼和輸出]

  • December 13, 2011

當我還需要估計其他隨機效應時,有人可以告訴我如何讓 R 估計分段線性模型中的斷點(作為固定或隨機參數)嗎?

我在下麵包含了一個玩具示例,該示例適合曲棍球棒/斷棒回歸,具有隨機斜率方差和斷點為 4 的隨機 y 截距方差。我想估計斷點而不是指定它。它可以是隨機效應(首選)或固定效應。

library(lme4)
str(sleepstudy)

#Basis functions
bp = 4
b1 <- function(x, bp) ifelse(x < bp, bp - x, 0)
b2 <- function(x, bp) ifelse(x < bp, 0, x - bp)

#Mixed effects model with break point = 4
(mod <- lmer(Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject), data = sleepstudy))

#Plot with break point = 4
xyplot(
       Reaction ~ Days | Subject, sleepstudy, aspect = "xy",
       layout = c(6,3), type = c("g", "p", "r"),
       xlab = "Days of sleep deprivation",
       ylab = "Average reaction time (ms)",
       panel = function(x,y) {
       panel.points(x,y)
       panel.lmline(x,y)
       pred <- predict(lm(y ~ b1(x, bp) + b2(x, bp)), newdata = data.frame(x = 0:9))
           panel.lines(0:9, pred, lwd=1, lty=2, col="red")
       }
   )

輸出:

Linear mixed model fit by REML 
Formula: Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject) 
  Data: sleepstudy 
 AIC  BIC logLik deviance REMLdev
1751 1783 -865.6     1744    1731
Random effects:
Groups   Name         Variance Std.Dev. Corr          
Subject  (Intercept)  1709.489 41.3460                
         b1(Days, bp)   90.238  9.4994  -0.797        
         b2(Days, bp)   59.348  7.7038   0.118 -0.008 
Residual               563.030 23.7283                
Number of obs: 180, groups: Subject, 18

Fixed effects:
            Estimate Std. Error t value
(Intercept)   289.725     10.350  27.994
b1(Days, bp)   -8.781      2.721  -3.227
b2(Days, bp)   11.710      2.184   5.362

Correlation of Fixed Effects:
           (Intr) b1(D,b
b1(Days,bp) -0.761       
b2(Days,bp) -0.054  0.181

斷棒回歸適合每個人

另一種方法是將對 lmer 的調用包裝在一個函數中,該函數將斷點作為參數傳遞,然後根據斷點使用優化最小化擬合模型的偏差。這最大化了斷點的配置文件對數似然性,並且通常(即,不僅僅是這個問題)如果包裝器內部的函數(在這種情況下為 lmer)找到以傳遞給它的參數為條件的最大似然估計,則整個過程找到所有參數的聯合最大似然估計。

library(lme4)
str(sleepstudy)

#Basis functions
bp = 4
b1 <- function(x, bp) ifelse(x < bp, bp - x, 0)
b2 <- function(x, bp) ifelse(x < bp, 0, x - bp)

#Wrapper for Mixed effects model with variable break point
foo <- function(bp)
{
 mod <- lmer(Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject), data = sleepstudy)
 deviance(mod)
}

search.range <- c(min(sleepstudy$Days)+0.5,max(sleepstudy$Days)-0.5)
foo.opt <- optimize(foo, interval = search.range)
bp <- foo.opt$minimum
bp
[1] 6.071932
mod <- lmer(Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject), data = sleepstudy)

要獲得斷點的置信區間,您可以使用配置文件可能性。例如,添加qchisq(0.95,1)到最小偏差(對於 95% 置信區間),然後搜索foo(x)等於計算值的點:

foo.root <- function(bp, tgt)
{
 foo(bp) - tgt
}
tgt <- foo.opt$objective + qchisq(0.95,1)
lb95 <- uniroot(foo.root, lower=search.range[1], upper=bp, tgt=tgt)
ub95 <- uniroot(foo.root, lower=bp, upper=search.range[2], tgt=tgt)
lb95$root
[1] 5.754051
ub95$root
[1] 6.923529

對於這個玩具問題,有些不對稱,但精度還不錯。如果您有足夠的數據使引導程序可靠,則另一種方法是引導估計程序。

引用自:https://stats.stackexchange.com/questions/19772

comments powered by Disqus