A Newey–West estimator is used in statistics and econometrics to provide an estimate of the covariance matrix of the parameters of a regression-type model where the standard assumptions of regression analysis do not apply.[1] It was devised by Whitney K. Newey and Kenneth D. West in 1987, although there are a number of later variants.[2][3][4][5] The estimator is used to try to overcome autocorrelation (also called serial correlation) and heteroskedasticity in the error terms of the model, often for regressions applied to time series data. The abbreviation "HAC," sometimes used for the estimator, stands for "heteroskedasticity and autocorrelation consistent."[2] There are a number of HAC estimators described in the literature,[6] and "HAC estimator" does not refer uniquely to Newey–West. One version, Newey–West Bartlett, requires the user to specify the bandwidth and uses the Bartlett kernel from kernel density estimation.[6]
Regression models estimated with time series data often exhibit autocorrelation; that is, the error terms are correlated over time. The heteroscedasticity-consistent estimator of the error covariance is constructed from a term $X^\top \Sigma X$, where $X$ is the design matrix for the regression problem and $\Sigma$ is the covariance matrix of the residuals. The least squares estimator $b$ is a consistent estimator of $\beta$. This implies that the least squares residuals $e_i$ are "point-wise" consistent estimators of their population counterparts $E_i$. The general approach, then, will be to use $X$ and $e$ to devise an estimator of $X^\top \Sigma X$.[7] The estimator rests on the assumption that, as the time between error terms increases, the correlation between the error terms decreases. The estimator thus can be used to improve the ordinary least squares (OLS) regression when the residuals are heteroscedastic and/or autocorrelated.
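For context, the covariance matrix of the OLS estimator $b$ under a general error covariance $\Sigma$ has the sandwich form

$$\operatorname{Var}(b) = (X^\top X)^{-1}\, X^\top \Sigma X \,(X^\top X)^{-1},$$

so a consistent estimator of the inner term $X^\top \Sigma X$ is what is needed to obtain HAC standard errors.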
The Newey–West estimator is computed as

$$Q^* = \frac{1}{T}\sum_{t=1}^{T} e_t^2\, x_t x_t^\top \;+\; \frac{1}{T}\sum_{\ell=1}^{L}\sum_{t=\ell+1}^{T} w_\ell\, e_t e_{t-\ell}\left(x_t x_{t-\ell}^\top + x_{t-\ell} x_t^\top\right),$$

where $T$ is the sample size, $e_t$ is the $t$-th residual, $x_t$ is the $t$-th row of the design matrix, and $w_\ell = 1 - \frac{\ell}{L+1}$ is the Bartlett kernel,[8] which can be thought of as a weight that decreases with increasing separation between samples. Disturbances that are farther apart from each other are given lower weight, while those with equal subscripts are given a weight of 1. This ensures that the second term converges (in some appropriate sense) to a finite matrix. This weighting scheme also ensures that the resulting covariance matrix is positive semi-definite.[2] Setting $L = 0$ reduces the Newey–West estimator to the Huber–White standard error.[9] $L$ specifies the "maximum lag considered for the control of autocorrelation"; a common choice for $L$ is $T^{1/4}$.[9][10]
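As an illustration of the formula, the following is a minimal NumPy sketch (the helper names newey_west_meat and newey_west_cov are hypothetical, not taken from any package mentioned below); it assumes $X$ is the $T \times k$ design matrix, e the vector of OLS residuals, and L the chosen maximum lag:

```python
import numpy as np

def newey_west_meat(X, e, L):
    """Q* from the formula above: White term plus Bartlett-weighted autocovariance terms."""
    T = X.shape[0]
    Q = (X * (e ** 2)[:, None]).T @ X / T            # (1/T) * sum_t e_t^2 x_t x_t'
    for lag in range(1, L + 1):
        w = 1.0 - lag / (L + 1.0)                    # Bartlett kernel weight w_l
        # (1/T) * sum_{t=lag+1}^{T} e_t e_{t-lag} x_t x_{t-lag}'
        Gamma = (X[lag:] * (e[lag:] * e[:-lag])[:, None]).T @ X[:-lag] / T
        Q += w * (Gamma + Gamma.T)
    return Q

def newey_west_cov(X, e, L):
    """Sandwich covariance of the OLS coefficients: T * (X'X)^{-1} Q* (X'X)^{-1}."""
    bread = np.linalg.inv(X.T @ X)
    return X.shape[0] * bread @ newey_west_meat(X, e, L) @ bread

# Illustrative use on simulated data (purely an example):
rng = np.random.default_rng(0)
T = 200
X = np.column_stack([np.ones(T), rng.normal(size=T)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=T)
b = np.linalg.lstsq(X, y, rcond=None)[0]             # OLS coefficients
e = y - X @ b                                         # OLS residuals
L = int(np.floor(T ** 0.25))                          # e.g. the T^(1/4) rule of thumb
se = np.sqrt(np.diag(newey_west_cov(X, e, L)))        # Newey–West standard errors
```

With L = 0 the loop is skipped and the result reduces to the Huber–White covariance, matching the statement above.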
Software implementations
In Julia, the CovarianceMatrices.jl package [11] supports several types of heteroskedasticity and autocorrelation consistent covariance matrix estimation including Newey–West, White, and Arellano.
In R, the packages sandwich[6] and plm[12] each include a function for the Newey–West estimator.
In Stata, the command newey produces Newey–West standard errors for coefficients estimated by OLS regression.[13]
In MATLAB, the command hac in the Econometrics toolbox produces the Newey–West estimator (among others).[14]
In Python, the statsmodels[15] module includes functions for computing the covariance matrix using Newey–West; a brief usage sketch follows this list.
In Gretl, the option --robust to several estimation commands (such as ols) in the context of a time-series dataset produces Newey–West standard errors.[16]
In SAS, Newey–West corrected standard errors can be obtained in PROC AUTOREG and PROC MODEL.[17]
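To make the Python entry above concrete, here is a minimal usage sketch with statsmodels (the data are simulated purely for illustration; the maxlags keyword plays the role of $L$ above):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 200
x = rng.normal(size=T)
y = 1.0 + 2.0 * x + rng.normal(size=T)
X = sm.add_constant(x)                 # design matrix with an intercept column

# OLS point estimates with HAC (Newey–West) standard errors.
res = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})
print(res.bse)                         # Newey–West standard errors
```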
Bierens, Herman J. (1994). Topics in Advanced Econometrics: Estimation, Testing, and Specification of Cross-section and Time Series Models. New York: Cambridge University Press. pp. 195–198. ISBN 978-0-521-41900-0.