ZUO Weibing, LI Yingli
(College of Mathematics and Statistics, North China University of Water Resources and Electric Power, Zhengzhou 450046, Henan, China)
Mixed Maximum Likelihood Estimator in the Logistic Regression Model
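To address multicollinearity among the explanatory variables in the logistic regression model, a mixed maximum likelihood estimator for the logistic regression model is proposed by analogy with the mixed estimator in the linear model. In the sense of the mean squared error matrix, the new estimator is compared with the maximum likelihood estimator, the ridge estimator, the Liu estimator, the restricted maximum likelihood estimator, and the stochastic restricted maximum likelihood estimator. Finally, Monte Carlo simulation is used to verify its superiority.
logistic regression; multicollinearity; stochastic restricted ridge estimator; mean squared error matrix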
This paper considers the following model:
$$y_i = \pi_i + \varepsilon_i, \quad i = 1, 2, \dots, n, \tag{1}$$
Maximum likelihood is the usual method for estimating the parameter $\beta$. The maximum likelihood estimator (MLE) of $\beta$ is given by
(2)
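As a minimal sketch of how the maximum likelihood estimate is usually computed in practice, the following Python code runs iteratively reweighted least squares (IRLS). The function name `logistic_mle`, the tolerance, the iteration cap, and the simulated data are all assumptions made for illustration; none of them comes from the paper. The function also returns the weighted information matrix $X'\hat{W}X$, on the assumption that this is the matrix written $C$ later in the text.

```python
import numpy as np

def logistic_mle(X, y, tol=1e-10, max_iter=100):
    """Fit logistic regression by IRLS; X must already contain any intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        eta = X @ beta
        pi = 1.0 / (1.0 + np.exp(-eta))
        w = pi * (1.0 - pi)                   # diagonal of the weight matrix
        C = X.T @ (X * w[:, None])            # X' W X
        z = eta + (y - pi) / w                # working response
        beta_new = np.linalg.solve(C, X.T @ (w * z))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    # Recompute X' W X at the final estimate.
    pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = pi * (1.0 - pi)
    return beta, X.T @ (X * w[:, None])

# Hypothetical data: intercept plus two standard normal regressors.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(500), rng.standard_normal((500, 2))])
beta_true = np.array([0.5, 1.0, -1.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ beta_true))))

beta_hat, C = logistic_mle(X, y)
# At the MLE the score X'(y - pi) should vanish.
score = X.T @ (y - 1.0 / (1.0 + np.exp(-(X @ beta_hat))))
```

Checking that `score` is numerically zero is a simple way to confirm that the iteration has reached a stationary point of the log-likelihood.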
Reference [1] proposed the logistic ridge estimator (LRE):
(3)
Reference [2] proposed the logistic Liu estimator (LLE):
(4)
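The expressions for the ridge and Liu estimators are not reproduced in the text above. The sketch below therefore assumes the forms commonly quoted in the literature, $(C + kI)^{-1}C\hat\beta_{MLE}$ for the ridge estimator and $(C + I)^{-1}(C + dI)\hat\beta_{MLE}$ for the Liu estimator, with $C$ taken to be the weighted information matrix. The function names and the numerical values of `C` and `beta_mle` are hypothetical.

```python
import numpy as np

def ridge_estimator(C, beta_mle, k):
    """Assumed form: (C + k I)^{-1} C beta_mle."""
    I = np.eye(C.shape[0])
    return np.linalg.solve(C + k * I, C @ beta_mle)

def liu_estimator(C, beta_mle, d):
    """Assumed form: (C + I)^{-1} (C + d I) beta_mle."""
    I = np.eye(C.shape[0])
    return np.linalg.solve(C + I, (C + d * I) @ beta_mle)

# Hypothetical ill-conditioned C and MLE, for illustration only.
C = np.array([[4.0, 3.9], [3.9, 4.0]])
beta_mle = np.array([2.0, -1.0])

beta_ridge = ridge_estimator(C, beta_mle, k=0.5)
beta_liu = liu_estimator(C, beta_mle, d=0.5)
```

Under these assumed forms, both estimators shrink the MLE toward zero, and they reduce to the MLE itself at $k = 0$ and $d = 1$ respectively.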
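Consider the following restriction:
$$h = H\beta + v; \quad E(v) = 0, \quad \operatorname{Cov}(v) = \Psi, \tag{5}$$
where $h$ is a $q \times 1$ known stochastic vector and $H$ is a known $q \times (p+1)$ (q
Under linear restrictions, [3] proposed the restricted maximum likelihood estimator (RMLE):
(6)
Under stochastic linear restrictions, [4] proposed the stochastic restricted maximum likelihood estimator (SRMLE):
(7)
By analogy with the mixed estimator for the linear model in [5], this paper proposes the mixed maximum likelihood estimator (MME) for the logistic regression model, defined as follows:
(8)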
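Notation: $A > 0$ means that $A$ is a symmetric positive definite matrix; $A \ge 0$ means that $A$ is a symmetric positive semidefinite matrix; $A \ge B$ means that $A \ge 0$, $B \ge 0$, and $A - B \ge 0$.
Lemma 1 [6]. Let $A$ and $B$ be $n \times n$ matrices. If $A > 0$ and $B \ge 0$, then $A + B > 0$.
Lemma 2 [7]. Let $M$ and $N$ be $n \times n$ matrices with $M > 0$ and $N \ge 0$. Then $M > N$ if and only if $\lambda_{\max}(NM^{-1}) < 1$.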
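Lemma 2 can be checked numerically on small examples. The sketch below is only an illustration: the helper names `is_pd` and `lemma2_criterion` and the matrices `M`, `N_small`, and `N_large` are hypothetical, chosen so that one pair satisfies $M > N$ and the other does not.

```python
import numpy as np

def is_pd(A, tol=1e-12):
    """True if the symmetric matrix A is positive definite."""
    return bool(np.min(np.linalg.eigvalsh((A + A.T) / 2)) > tol)

def lemma2_criterion(M, N):
    """True if the largest eigenvalue of N M^{-1} is below 1."""
    eigvals = np.linalg.eigvals(N @ np.linalg.inv(M))
    return bool(np.max(eigvals.real) < 1.0)

M = np.array([[2.0, 0.5], [0.5, 1.0]])         # positive definite
N_small = np.array([[1.0, 0.0], [0.0, 0.0]])   # semidefinite, M - N_small is positive definite
N_large = np.array([[3.0, 0.0], [0.0, 0.0]])   # semidefinite, M - N_large is indefinite

agree_small = is_pd(M - N_small) == lemma2_criterion(M, N_small)
agree_large = is_pd(M - N_large) == lemma2_criterion(M, N_large)
```

In both cases the direct definiteness test on $M - N$ and the eigenvalue criterion give the same answer, as the lemma states.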
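Proof.
$$C^{-1} - C^{-1} + C^{-1}H'(\Psi^{-1} + HC^{-1}H')^{-1}HC^{-1} = C^{-1}H'(\Psi^{-1} + HC^{-1}H')^{-1}HC^{-1}, \tag{9}$$
Clearly, $C^{-1}H'(\Psi^{-1} + HC^{-1}H')^{-1}HC^{-1} > 0$, which proves the theorem.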
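One way to read (9) is through the Woodbury identity: the right-hand side is exactly the amount by which $C^{-1}$ shrinks when $H'\Psi H$ is added to $C$. The sketch below checks this numerically and inspects the eigenvalues of that term. The matrices `C`, `H`, and `Psi` are randomly generated and purely hypothetical, as are the dimensions `p1` and `q`.

```python
import numpy as np

rng = np.random.default_rng(1)
p1, q = 4, 2                               # p + 1 coefficients, q restrictions
A = rng.standard_normal((p1, p1))
C = A @ A.T + p1 * np.eye(p1)              # hypothetical positive definite C
H = rng.standard_normal((q, p1))           # hypothetical restriction matrix
B = rng.standard_normal((q, q))
Psi = B @ B.T + q * np.eye(q)              # hypothetical positive definite Psi

C_inv = np.linalg.inv(C)
inner = np.linalg.inv(np.linalg.inv(Psi) + H @ C_inv @ H.T)
D = C_inv @ H.T @ inner @ H @ C_inv        # right-hand side of (9)

# Woodbury identity: C^{-1} - D equals (C + H' Psi H)^{-1}.
woodbury_gap = np.max(np.abs((C_inv - D) - np.linalg.inv(C + H.T @ Psi @ H)))
eigs = np.linalg.eigvalsh((D + D.T) / 2)   # eigenvalues of the symmetrized term
rank_D = np.linalg.matrix_rank(D)
```

In this example the eigenvalues of `D` are nonnegative and its rank equals `q`, the number of rows of `H`.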
Proof.
(10)
Proof.
(11)
Proof.
(12)
Proof.
(13)
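The Monte Carlo simulation follows the approach of [8-10]. The explanatory variables are generated by
$$x_{ij} = (1 - \rho^2)^{1/2} z_{ij} + \rho z_{i,p}, \quad i = 1, 2, \dots, n, \quad j = 1, 2, \dots, p, \tag{14}$$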
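The generating scheme in (14) can be sketched as follows. Note one assumption that departs from the formula as printed: (14) writes the shared term as $z_{i,p}$, whereas the sketch draws an independent extra column $z_{i,p+1}$, as in the scheme of [10], so that every pair of columns has correlation $\rho^2$. The function name `generate_explanatory`, the seed, and the sample size are hypothetical.

```python
import numpy as np

def generate_explanatory(n, p, rho, rng):
    """Draw an n x p design whose columns share a common normal component.

    Assumption: the shared component is an extra (p+1)-th standard normal
    column, independent of the first p columns.
    """
    z = rng.standard_normal((n, p + 1))
    return np.sqrt(1.0 - rho**2) * z[:, :p] + rho * z[:, [p]]

rng = np.random.default_rng(2)
X = generate_explanatory(n=200_000, p=3, rho=0.8, rng=rng)
corr = np.corrcoef(X, rowvar=False)        # off-diagonal entries should be near rho**2
```

A large `n` is used here only to make the sample correlations close to their theoretical value; the simulation in the text uses $n = 20$, $100$, and $200$.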
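In addition, we choose the following restrictions:
(15)
The parameters $d$ and $k$ are chosen with $0 \le d < 1$ and $0 \le k < 1$. The simulation is repeated 2,000 times. The mean squared errors of the MLE, LLE, LRE, RMLE, SRMLE, and MME are estimated by (16); the results are given in Tables 1-9.
$$\widehat{\mathrm{MSE}}(\hat\beta) = \mathrm{Mean}\{\operatorname{tr}[\mathrm{MSEM}(\hat\beta, \beta)]\} = \frac{1}{2000}\sum_{r=1}^{2000} (\hat\beta_r - \beta)'(\hat\beta_r - \beta), \tag{16}$$
where $\hat\beta_r$ denotes the estimate obtained in the $r$-th replication.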
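The simulated mean squared error is the average, over replications, of the squared Euclidean distance between an estimate and the true parameter. A minimal sketch follows; the function name `estimated_mse` and the three toy replicates are hypothetical and stand in for the 2,000 replications used in the text.

```python
import numpy as np

def estimated_mse(estimates, beta):
    """Average squared Euclidean distance between each replicate estimate and beta.

    estimates: array of shape (R, p), one row per Monte Carlo replication.
    """
    diffs = np.asarray(estimates) - np.asarray(beta)
    return float(np.mean(np.sum(diffs**2, axis=1)))

beta = np.array([1.0, -1.0])
# Three toy replicates with squared errors 0, 1, and 4.
estimates = np.array([[1.0, -1.0], [2.0, -1.0], [1.0, 1.0]])
mse = estimated_mse(estimates, beta)       # (0 + 1 + 4) / 3
```

Applying the same function to the replicate estimates of each estimator gives one entry per table cell.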
Tab.2 Estimated MSE values for different $k$ and $d$ when $n = 20$ and $\rho = 0.80$
Tab.4 Estimated MSE values for different $k$ and $d$ when $n = 100$ and $\rho = 0.70$
Tab.5 Estimated MSE values for different $k$ and $d$ when $n = 100$ and $\rho = 0.80$
Tab.6 Estimated MSE values for different $k$ and $d$ when $n = 100$ and $\rho = 0.99$
Tab.7 Estimated MSE values for different $k$ and $d$ when $n = 200$ and $\rho = 0.70$
Tab.8 Estimated MSE values for different $k$ and $d$ when $n = 200$ and $\rho = 0.80$
Tab.9 Estimated MSE values for different $k$ and $d$ when $n = 200$ and $\rho = 0.99$
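Tables 1-9 show that the mean squared errors of all six estimators increase as $\rho$ increases and decrease as $n$ increases. For all values of $n$, $\rho$, and $k$ considered, however, the new estimator MME outperforms the MLE, LRE, LLE, RMLE, and SRMLE.
To address multicollinearity among the explanatory variables in the logistic regression model, this paper proposes a mixed maximum likelihood estimator. In the sense of the mean squared error matrix, sufficient or necessary and sufficient conditions are obtained under which the new estimator is superior to the maximum likelihood estimator, the ridge estimator, the Liu estimator, the restricted maximum likelihood estimator, and the stochastic restricted maximum likelihood estimator, and its superiority is verified by Monte Carlo simulation. How to reduce the bias of the new estimator without increasing its mean squared error is the focus of future work.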
[1] SCHAEFER R L, ROI L D, WOLFE R A. A ridge logistic estimator[J]. Communications in Statistics: Theory and Methods, 1984, 13: 99-113.
[2] INAN D, ERDOGAN B E. Liu-type logistic estimator[J]. Communications in Statistics: Simulation and Computation, 2013, 42: 1578-1586.
[3] DUFFY D E, SANTNER T J. On the small sample properties of norm-restricted maximum likelihood estimators for logistic regression models[J]. Communications in Statistics: Theory and Methods, 1989, 18(3): 959-980.
[4] NAGARAJAH V, WIJEKOON P. Stochastic restricted maximum likelihood estimator in logistic regression model[J]. Open Journal of Statistics, 2015, 5(7): 1-6.
[5] THEIL H. On the use of incomplete prior information in regression analysis[J]. Journal of the American Statistical Association, 1963, 58(302): 401-414.
[6] RAO C R, TOUTENBURG H. Linear Models: Least Squares and Alternatives[M]. 2nd ed. New York: Springer-Verlag Inc, 1995.
[7] RAO C R, TOUTENBURG H, SHALABH, HEUMANN C. Linear Models and Generalizations[M]. Berlin: Springer, 2008.
[8] TRENKLER G, TOUTENBURG H. Mean squared error matrix comparisons between biased estimators: An overview of recent results[J]. Statistical Papers, 1990, 31(1): 165-179.
[9] KIBRIA B M G. Performance of some new ridge regression estimators[J]. Communications in Statistics: Simulation and Computation, 2003, 32(2): 419-435.
[10] MCDONALD G C, GALARNEAU D I. A Monte Carlo evaluation of some ridge-type estimators[J]. Journal of the American Statistical Association, 1975, 70(350): 407-416.
Mixed Maximum Likelihood Estimator in Logistic Regression Model
ZUO Weibing, LI Yingli
(College of Mathematics and Statistics, North China University of Water Resources and Electric Power, Zhengzhou 450046, China)
To deal with multicollinearity among the explanatory variables in the logistic regression model, a mixed maximum likelihood estimator is proposed for the logistic regression model with stochastic linear restrictions, by analogy with the mixed estimator in the linear model. In the mean squared error matrix sense, the new estimator is compared with the maximum likelihood estimator, the ridge estimator, the Liu estimator, the restricted maximum likelihood estimator, and the stochastic restricted maximum likelihood estimator. Finally, Monte Carlo simulation is used to verify the superiority of the new estimator.
logistic regression; multicollinearity; stochastic restricted ridge estimator; mean squared error matrix
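Received: 2017-04-11
Funding: Basic and Frontier Technology Research Project of Henan Province (142300410401)
About the authors: ZUO Weibing (b. 1976), male, from Neihuang, Henan; professor and master's supervisor at the College of Mathematics and Statistics, North China University of Water Resources and Electric Power; research interest: mathematical statistics. Corresponding author: LI Yingli (b. 1991), female, from Zhengzhou, Henan; master's student at the College of Mathematics and Statistics, North China University of Water Resources and Electric Power; research interest: mathematical statistics.
DOI: 10.3969/j.issn.1007-0834.2017.03.001
CLC number: O212.1
Document code: A
Article ID: 1007-0834(2017)03-0001-06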