Markov-Chain-Montecarlo

與 MCMC Metropolis-Hastings 變體混淆:Random-Walk、Non-Random-Walk、Independent、Metropolis

  • July 18, 2013

在過去的幾周里,我一直在嘗試理解 MCMC 和 Metropolis-Hastings 算法。每次我認為我理解它時,我意識到我錯了。我在網上找到的大多數代碼示例都實現了與描述不一致的東西。即:他們說他們實現了 Metropolis-Hastings,但他們實際上實現了隨機遊走大都會。其他人(幾乎總是)默默地跳過黑斯廷斯校正比率的實施,因為他們使用的是對稱提議分佈。實際上,到目前為止,我還沒有找到一個簡單的例子來計算比率。這讓我更加困惑。有人可以給我以下代碼示例(任何語言):

  • 具有 Hastings 校正比率計算的 Vanilla Non-Random Walk Metropolis-Hastings 算法(即使在使用對稱提議分佈時最終結果為 1)。
  • Vanilla Random Walk Metropolis-Hastings 算法。
  • Vanilla Independent Metropolis-Hastings 算法。

不需要提供 Metropolis 算法,因為如果我沒記錯的話,Metropolis 和 Metropolis-Hastings 之間的唯一區別是第一個總是從對稱分佈中採樣,因此它們沒有 Hastings 校正比率。無需對算法進行詳細解釋。我確實了解基礎知識,但我對 Metropolis-Hastings 算法的不同變體的所有不同名稱以及如何在 Vanilla 非隨機遊走 MH 上實際實現 Hastings 校正率感到困惑。請不要復制部分回答我的問題的粘貼鏈接,因為我很可能已經看過它們。這些鏈接使我感到困惑。謝謝你。

給你 - 三個例子。為了使邏輯更清晰(我希望),我已經使代碼的效率低於實際應用程序中的效率。

# We'll assume estimation of a Poisson mean as a function of x
x <- runif(100)
y <- rpois(100,5*x)  # beta = 5 where mean(y[i]) = beta*x[i]

# Prior distribution on log(beta): t(5) with mean 2 
# (Very spread out on original scale; median = 7.4, roughly)
log_prior <- function(log_beta) dt(log_beta-2, 5, log=TRUE)

# Log likelihood
log_lik <- function(log_beta, y, x) sum(dpois(y, exp(log_beta)*x, log=TRUE))

# Random Walk Metropolis-Hastings 
# Proposal is centered at the current value of the parameter

rw_proposal <- function(current) rnorm(1, current, 0.25)
rw_p_proposal_given_current <- function(proposal, current) dnorm(proposal, current, 0.25, log=TRUE)
rw_p_current_given_proposal <- function(current, proposal) dnorm(current, proposal, 0.25, log=TRUE)

rw_alpha <- function(proposal, current) {
  # Due to the structure of the rw proposal distribution, the rw_p_proposal_given_current and
  # rw_p_current_given_proposal terms cancel out, so we don't need to include them - although
  # logically they are still there:  p(prop|curr) = p(curr|prop) for all curr, prop
  exp(log_lik(proposal, y, x) + log_prior(proposal) - log_lik(current, y, x) - log_prior(current))
}

# Independent Metropolis-Hastings
# Note: the proposal is independent of the current value (hence the name), but I maintain the
# parameterization of the functions anyway.  The proposal is not ignorable any more
# when calculation the acceptance probability, as p(curr|prop) != p(prop|curr) in general.

ind_proposal <- function(current) rnorm(1, 2, 1) 
ind_p_proposal_given_current <- function(proposal, current) dnorm(proposal, 2, 1, log=TRUE)
ind_p_current_given_proposal <- function(current, proposal) dnorm(current, 2, 1, log=TRUE)

ind_alpha <- function(proposal, current) {
  exp(log_lik(proposal, y, x)  + log_prior(proposal) + ind_p_current_given_proposal(current, proposal) 
      - log_lik(current, y, x) - log_prior(current) - ind_p_proposal_given_current(proposal, current))
}

# Vanilla Metropolis-Hastings - the independence sampler would do here, but I'll add something
# else for the proposal distribution; a Normal(current, 0.1+abs(current)/5) - symmetric but with a different
# scale depending upon location, so can't ignore the proposal distribution when calculating alpha as
# p(prop|curr) != p(curr|prop) in general

van_proposal <- function(current) rnorm(1, current, 0.1+abs(current)/5)
van_p_proposal_given_current <- function(proposal, current) dnorm(proposal, current, 0.1+abs(current)/5, log=TRUE)
van_p_current_given_proposal <- function(current, proposal) dnorm(current, proposal, 0.1+abs(proposal)/5, log=TRUE)

van_alpha <- function(proposal, current) {
  exp(log_lik(proposal, y, x)  + log_prior(proposal) + ind_p_current_given_proposal(current, proposal) 
      - log_lik(current, y, x) - log_prior(current) - ind_p_proposal_given_current(proposal, current))
}


# Generate the chain
values <- rep(0, 10000) 
u <- runif(length(values))
naccept <- 0
current <- 1  # Initial value
propfunc <- van_proposal  # Substitute ind_proposal or rw_proposal here
alphafunc <- van_alpha    # Substitute ind_alpha or rw_alpha here
for (i in 1:length(values)) {
  proposal <- propfunc(current)
  alpha <- alphafunc(proposal, current)
  if (u[i] < alpha) {
     values[i] <- exp(proposal)
     current <- proposal
     naccept <- naccept + 1
  } else {
     values[i] <- exp(current)
  }
}
naccept / length(values)
summary(values)

對於香草採樣器,我們得到:

> naccept / length(values)
[1] 0.1737
> summary(values)
  Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 2.843   5.153   5.388   5.378   5.594   6.628 

這是一個較低的接受概率,但仍然……調整提案將在這裡有所幫助,或者採用不同的提案。這是隨機遊走提案的結果:

> naccept / length(values)
[1] 0.2902
> summary(values)
  Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 2.718   5.147   5.369   5.370   5.584   6.781 

類似的結果,正如人們所希望的那樣,以及更好的接受概率(一個參數的目標是~50%。)

並且,為了完整起見,獨立採樣器:

> naccept / length(values)
[1] 0.0684
> summary(values)
  Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 3.990   5.162   5.391   5.380   5.577   8.802 

因為它不“適應”後驗的形狀,所以它往往具有最差的接受概率,並且最難針對這個問題進行調整。

請注意,一般而言,我們更喜歡尾巴較粗的提案,但這是另一個話題。

引用自:https://stats.stackexchange.com/questions/64674

comments powered by Disqus