Difference between fixed effects dummies and fixed effects estimator?

September 26, 2015

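I started to read about panel regression models. However, I am a bit confused about the different model specifications in fixed effects models:

Does a fixed effects panel regression always mean that I introduce dummy variables for the cross-section (e.g., for every country in my sample) and then run, say, an OLS estimation?

What is the difference between adding fixed effects dummy variables to a regression model and a fixed effects estimator?

Thanks for your help!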

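To see the equality, let us first derive the FE estimator.

Define the residual maker matrix $$ \mathbf{Q}:=\mathbf{I}_M-\mathbf{1}_M\mathbf{1}_M'/M, $$ where $M$ denotes the number of observations per unit in the panel.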

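Premultiplication with $\mathbf{Q}$ centers $\mathbf{y}_i$ and $\mathbf{F}_i$ around their averages, $$ \begin{align*} \mathbf{Q}\mathbf{y}_i&=\mathbf{y}_i-\mathbf{1}_M\mathbf{1}_M'\mathbf{y}_i/M\\ &=\mathbf{y}_i-\mathbf{1}_M\overline{y}_{i}. \end{align*} $$ This also implies that every time-invariant variable in the set of regressors becomes a column of zeros and is therefore eliminated from the data.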

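This is a serious drawback of the FE estimator. Consider the example of a wage regression for a panel of employees. Variables such as gender or schooling are of primary interest but (typically) do not change over time (anymore).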
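The centering property, and the way a time-invariant regressor is wiped out, can be checked numerically. The following is a minimal sketch in base R; the dimension `M = 4`, the outcome vector `y_i`, and the `gender` column are made-up illustrative values, not data from the question.

```r
# Minimal sketch: Q = I_M - 1_M 1_M' / M demeans any vector of length M.
# M, y_i and gender are arbitrary illustrative values (assumptions).
M <- 4
ones <- matrix(1, nrow = M, ncol = 1)
Q <- diag(M) - ones %*% t(ones) / M

y_i <- c(2, 5, 3, 6)                 # outcomes of one unit over M periods
centered <- drop(Q %*% y_i)

# Q y_i equals y_i minus its own mean
stopifnot(isTRUE(all.equal(centered, y_i - mean(y_i))))

# A time-invariant regressor (constant within the unit) becomes a column of zeros
gender <- rep(1, M)
stopifnot(max(abs(Q %*% gender)) < 1e-12)

# Q is symmetric and idempotent, as a residual maker matrix should be
stopifnot(isTRUE(all.equal(Q, t(Q))), isTRUE(all.equal(Q %*% Q, Q)))
```

Because any column that is constant within a unit is mapped to zero, its coefficient cannot be identified from the demeaned data.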

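As $\mathbf{Q}\mathbf{1}_M=\mathbf{0}$, we have, using the error component model $ \mathbf{y}_i=\mathbf{Z}_i\mathbf{\delta}+\mathbf{1}_M\alpha_i+\mathbf{\eta}_{i} $ with $\mathbf{\eta}_i$ the vector of the $M$ error terms of unit $i$, and writing $\mathbf{F}_i$ for the time-varying columns of $\mathbf{Z}_i$ (the only ones that survive premultiplication with $\mathbf{Q}$) with coefficients $\mathbf{\beta}$, $$ \begin{align*} \mathbf{Q}\mathbf{y}_i&=\mathbf{Q}\mathbf{F}_i\mathbf{\beta}+\mathbf{Q}\mathbf{\eta}_{i}\qquad i=1,\ldots,n\\ \tilde{\mathbf{y}}_i&\equiv\tilde{\mathbf{F}}_i\mathbf{\beta}+\tilde{\mathbf{\eta}}_{i}. \end{align*} $$

Define $$ \underset{(Mn\times 1)}{\tilde{\mathbf{y}}}:=\begin{pmatrix}\tilde{\mathbf{y}}_1\\ \vdots\\ \tilde{\mathbf{y}}_n\end{pmatrix}\qquad\underset{(Mn\times L_b)}{\tilde{\mathbf{F}}}:=\begin{pmatrix}\tilde{\mathbf{F}}_1\\ \vdots\\ \tilde{\mathbf{F}}_n\end{pmatrix} $$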

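The FE estimator is simply OLS applied to these observations: $$ \widehat{\mathbf{\beta}}_{\text{FE}}=(\tilde{\mathbf{F}}'\tilde{\mathbf{F}})^{-1}\tilde{\mathbf{F}}'\tilde{\mathbf{y}} $$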

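To see the equality of FE and least squares dummy variables (LSDV), further stack the observations: $$ \underset{(Mn\times 1)}{\mathbf{y}}:=\begin{pmatrix}\mathbf{y}_1\\ \vdots\\ \mathbf{y}_n\end{pmatrix}\qquad\underset{(Mn\times L_b)}{\mathbf{F}}:=\begin{pmatrix}\mathbf{F}_1\\ \vdots\\ \mathbf{F}_n\end{pmatrix} $$
and $$ \underset{(Mn\times 1)}{\mathbf{\eta}}:=\begin{pmatrix}\mathbf{\eta}_1\\ \vdots\\ \mathbf{\eta}_n\end{pmatrix}\qquad\underset{(n\times 1)}{\mathbf{\alpha}}:=\begin{pmatrix}\alpha_1\\ \vdots\\ \alpha_n\end{pmatrix}. $$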

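Further, let $$ \underset{(Mn\times n)}{\mathbf{D}}:=\mathbf{I}_n\otimes\mathbf{1}_M. $$

Then, the linear panel data model under the error component assumption obtains in matrix notation: $$ \mathbf{y}=\mathbf{D}\mathbf{\alpha}+\mathbf{F}\mathbf{\beta}+\mathbf{\eta}, $$
a dummy variable model.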
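As a small worked illustration, assume $n=2$ units with $M=2$ observations each (arbitrary values chosen only for display) and take $\mathbf{D}=\mathbf{I}_n\otimes\mathbf{1}_M$, the form that appears in the expression for $\mathbf{M}_{\mathbf{D}}$. Then each column of $\mathbf{D}$ is the dummy variable of one unit, and $\mathbf{D}\mathbf{\alpha}$ repeats each unit's intercept once per observation:

$$ \mathbf{D}=\mathbf{I}_2\otimes\mathbf{1}_2=\begin{pmatrix}1&0\\ 1&0\\ 0&1\\ 0&1\end{pmatrix},\qquad \mathbf{D}\mathbf{\alpha}=\begin{pmatrix}\alpha_1\\ \alpha_1\\ \alpha_2\\ \alpha_2\end{pmatrix},\qquad \mathbf{D}'\mathbf{D}=\begin{pmatrix}2&0\\ 0&2\end{pmatrix}=M\,\mathbf{I}_n. $$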

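That is, we may also obtain an estimator of $\mathbf{\beta}$ from an OLS regression of $\mathbf{y}$ on the regressors $\mathbf{F}$ and the individual-specific effects dummies $\mathbf{D}$.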

Now, note that the Frisch-Waugh-Lovell theorem says that $\widehat{\mathbf{\beta}}_{\text{LSDV}}$ can be found by regressing $ \mathbf{M}_{\mathbf{D}}\mathbf{y} $ on $ \mathbf{M}_{\mathbf{D}}\mathbf{F} $, where $$ \underset{(Mn\times Mn)}{\mathbf{M}_{\mathbf{D}}}:=\mathbf{I}-\mathbf{D}(\mathbf{D}'\mathbf{D})^{-1}\mathbf{D}'. $$ Using symmetry and idempotency of $ \mathbf{M}_{\mathbf{D}} $, $$ \begin{equation} \widehat{\mathbf{\beta}}_{\text{LSDV}}=(\mathbf{F}'\mathbf{M}_{\mathbf{D}}\mathbf{F})^{-1}\mathbf{F}'\mathbf{M}_{\mathbf{D}}\mathbf{y} \end{equation} $$

Now, $$ \begin{align*} \mathbf{M}_{\mathbf{D}}&=\mathbf{I}_{Mn}-(\mathbf{I}_n\otimes\mathbf{1}_M)[(\mathbf{I}_n\otimes\mathbf{1}_M)'(\mathbf{I}_n\otimes\mathbf{1}_M)]^{-1}(\mathbf{I}_n\otimes\mathbf{1}_M)'\\ &=\mathbf{I}_{n}\otimes\mathbf{I}_{M}-(\mathbf{I}_n\otimes\mathbf{1}_M)[(\mathbf{I}_n\otimes\mathbf{1}_M')(\mathbf{I}_n\otimes\mathbf{1}_M)]^{-1}(\mathbf{I}_n\otimes\mathbf{1}_M')\\ &=\mathbf{I}_{n}\otimes\mathbf{I}_{M}-(\mathbf{I}_n\otimes\mathbf{1}_M)[\mathbf{I}_n\otimes\mathbf{1}_M'\mathbf{1}_M]^{-1}(\mathbf{I}_n\otimes\mathbf{1}_M')\\ &=\mathbf{I}_{n}\otimes\mathbf{I}_{M}-(\mathbf{I}_n\otimes\mathbf{1}_M)[\mathbf{I}_n\otimes M]^{-1}(\mathbf{I}_n\otimes\mathbf{1}_M')\\ &=\mathbf{I}_{n}\otimes\mathbf{I}_{M}-(\mathbf{I}_n\otimes\mathbf{1}_M)\left[\mathbf{I}_n\otimes \frac{1}{M}\right](\mathbf{I}_n\otimes\mathbf{1}_M')\\ &=\mathbf{I}_{n}\otimes\mathbf{I}_{M}-(\mathbf{I}_n\otimes\mathbf{1}_M)\left[\mathbf{I}_n\otimes \frac{1}{M}\mathbf{1}_M'\right]\\ &=\mathbf{I}_{n}\otimes\mathbf{I}_{M}-\mathbf{I}_n\otimes\mathbf{1}_M\frac{1}{M}\mathbf{1}_M'\\ &=\mathbf{I}_{n}\otimes\left(\mathbf{I}_{M}-\frac{1}{M}\mathbf{1}_M\mathbf{1}_M'\right)\\ &=\mathbf{I}_n\otimes\mathbf{Q} \end{align*} $$

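Hence, $$ \begin{align*} \mathbf{M}_{\mathbf{D}}\mathbf{F}&=(\mathbf{I}_n\otimes\mathbf{Q})\mathbf{F}\\ &=\begin{pmatrix}\mathbf{Q}&&\\ &\ddots&\\ &&\mathbf{Q}\end{pmatrix}\mathbf{F}\\ &=\tilde{\mathbf{F}}, \end{align*} $$ and, by the same argument, $\mathbf{M}_{\mathbf{D}}\mathbf{y}=\tilde{\mathbf{y}}$, so that $$ \widehat{\mathbf{\beta}}_{\text{LSDV}}=\widehat{\mathbf{\beta}}_{\text{FE}}. $$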
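The Kronecker identity and the resulting equality of the two slope estimates can also be confirmed numerically for a balanced panel. This is a minimal sketch in base R, not part of the original derivation; the dimensions, the seed, the single regressor `x`, and the true slope of -2 are all assumptions made for illustration.

```r
# Minimal sketch: check M_D = I_n (x) Q and LSDV = FE on a small balanced panel.
# n, M, the seed and the simulated data are arbitrary illustrative choices.
set.seed(1)
n <- 3   # units
M <- 4   # observations per unit

ones <- matrix(1, nrow = M, ncol = 1)
D <- kronecker(diag(n), ones)                  # (Mn x n) dummy matrix
Q <- diag(M) - ones %*% t(ones) / M            # within-unit demeaning matrix
M_D <- diag(n * M) - D %*% solve(crossprod(D)) %*% t(D)

# Residual maker of D equals the block-diagonal matrix with Q on the diagonal
stopifnot(isTRUE(all.equal(M_D, kronecker(diag(n), Q))))

# Simulated data with unit-specific intercepts and one time-varying regressor
alpha <- runif(n)
x <- runif(n * M)
y <- drop(D %*% alpha) - 2 * x + rnorm(n * M)

# LSDV: OLS of y on the dummies and x; the slope is the last coefficient
W <- cbind(D, x)
b_lsdv <- solve(crossprod(W), crossprod(W, y))[n + 1]

# FE: OLS of demeaned y on demeaned x
x_t <- drop(M_D %*% x)
y_t <- drop(M_D %*% y)
b_fe <- sum(x_t * y_t) / sum(x_t^2)

stopifnot(isTRUE(all.equal(b_lsdv, b_fe)))
```

The two estimates agree up to floating-point error, as the Frisch-Waugh-Lovell argument predicts.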

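By the way, while the notation is for balanced panel data, the result also holds in the unbalanced case, as can be checked either with more complicated notation or with the following numerical illustration: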
library(plm)

# panel dimensions
n <- 10
m <- sample(2:4, n, replace = TRUE) # unbalanced panel: 2 to 4 observations per unit

# some data
alpha <- runif(n) # unit-specific intercepts
beta <- -2        # true slope
y <- X <- y.d <- X.d <- c() # raw and within-unit demeaned data
D <- matrix(0, sum(m), n) # for the dummy variable matrix
row.counter <- 0
for (i in 1:n) {
 X.n <- runif(m[i], i, i + 1)
 X.d <- c(X.d, X.n - mean(X.n)) # demeaned regressor for unit i
 X <- c(X, X.n)
 y.n <- alpha[i] + X.n * beta + rnorm(m[i])
 y <- c(y, y.n)
 y.d <- c(y.d, y.n - mean(y.n)) # demeaned outcome for unit i

 # column i of D is the dummy for unit i
 D[(row.counter + 1):(row.counter + m[i]), i] <- rep(1, m[i])
 row.counter <- row.counter + m[i]
}
Output:
> # plm
> paneldata <- data.frame(rep(1:n, times=m), unlist(sapply(m, function(i) 1:i)), y, X) # first two columns are for plm to understand the panel .... [TRUNCATED] 

> FE <- plm(y~X, data = paneldata, model = "within")

> # results:
> coef(FE)  # the slope coefficient
       X 
-2.331847 

> fixef(FE) # the intercepts
     1       2       3       4       5       6       7       8       9      10 
0.99396 2.30328 1.90957 2.22670 1.09438 3.10411 2.03265 4.39759 4.42384 4.15294 

> # FWL
> lm(y.d~X.d-1) # just the slope in this formulation

Call:
lm(formula = y.d ~ X.d - 1)

Coefficients:
  X.d  
-2.332  


> # LSDV
> lm(y~D+X-1) # intercepts and slope

Call:
lm(formula = y ~ D + X - 1)

Coefficients:
   D1      D2      D3      D4      D5      D6      D7      D8      D9     D10       X  
0.994   2.303   1.910   2.227   1.094   3.104   2.033   4.398   4.424   4.153  -2.332 

Source: https://stats.stackexchange.com/questions/174243
