* Homoskedasticity is an important assumption in ordinary least squares (OLS) regression.
* Although the OLS estimator of the regression parameters remains unbiased when the
  homoskedasticity assumption is violated, the estimator of the covariance matrix of the
  parameter estimates can be biased and inconsistent under heteroskedasticity, which can
  make significance tests and confidence intervals liberal or conservative.
* After a brief description of heteroskedasticity and its effects on inference in OLS
  regression, we discuss a family of heteroskedasticity-consistent standard error
  estimators for OLS regression and argue that investigators should routinely use one of
  these estimators when conducting hypothesis tests using OLS regression.
* To facilitate the adoption of this recommendation, we provide easy-to-use SPSS and SAS
  macros to implement the procedures discussed here.

* Authors: ANDREW F. HAYES and LI CAI.
* Publication: Using Heteroskedasticity-Consistent Standard Error Estimators in OLS
  Regression: An Introduction and Software Implementation, in: Behavior Research
  Methods (2007).
* Last updated: 11 August 2011.

* Full call syntax:
  HCREG dv = yvar/iv = ivlist/const = c/method = m/covmat = cv/test = q.

* Explanation of the call arguments.
* dv = dependent variable.
* iv = independent variable(s).
* const = c, where 1 requests a regression with a constant and 0 a regression without
  a constant (default = 1).
* method = m selects the standard error estimator (default = 3).
*   0 = HC0. For small sample sizes, the standard errors from HC0 are quite biased,
  usually downward, and this results in overly liberal inferences in regression models
  (see e.g., Bera, Suprayitno, & Premaratne, 2002; Chesher & Jewitt, 1987;
  Cribari-Neto, Ferrari, & Cordeiro, 2000; Cribari-Neto & Zarkos, 2001; Furno, 1996).
  But HC0 is a consistent estimator when the errors are heteroskedastic, that is, the
  bias shrinks with increasing sample size.
*   1, 2, 3 = HC1, HC2, HC3. These three alternative estimators are all asymptotically
  equivalent to HC0 but have far superior small sample properties relative to HC0
  (Long & Ervin, 2000; MacKinnon & White, 1985). It is recommended that HC3 always be
  used because it can keep the test size at the nominal level regardless of the presence
  or absence of heteroskedasticity, and there is only a slight loss of power associated
  with HC3 when the errors are indeed homoskedastic. The simulation results of
  Cribari-Neto, Ferrari, & Oliveira (2005) also suggest the superiority of HC3 over its
  predecessors.
*   4 = HC4. This newer estimator is preferred when there are cases with high leverage.
*   5 = OLS regression (homoskedasticity assumed).
* covmat = cv controls the output of the variance-covariance matrix of the coefficients
  (default = 0); a value of 1 prints the matrix.
* test = q enables tests of nested regression models with respect to the coefficients;
  q refers to the last q variables in the iv list (the maximum is the number of
  independent variables minus 1).

* Note: The macro recognizes only system-missing values; user-defined missing values
  must be recoded first.

* ******************* Procedure in 3 steps ***********************.
* Step 1) First open the data file to be analyzed in the Data Editor.
* Step 2) Then run the macro from "Step 2" onward
  (place the cursor on the line with "Step 2", then choose menu Run, To End).
* Step 3) Then run the first macro call with the variables to be defined
  (at Step 3; modify the call as needed).

* The example macro call here uses the data file "car_data.sav" shipped with SPSS
  (English version).
* mpg = dependent variable (dv); horse, weight, accel = the independent variables (iv).

* Step 3: Call the macro.
HCREG dv = mpg/iv = horse weight accel/method = 3.
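
* Further example calls, given only as a sketch: they reuse the car_data.sav variables
  from the call above and only the options documented in the header, so the variable
  names must be adapted to your own data.
* HC4 standard errors, also printing the covariance matrix of the coefficients.
HCREG dv = mpg/iv = horse weight accel/method = 4/covmat = 1.
* HC3 standard errors plus a setwise test of the last two predictors (weight, accel).
HCREG dv = mpg/iv = horse weight accel/method = 3/test = 2.
* Regression without a constant; method is omitted, so the default (HC3) applies.
HCREG dv = mpg/iv = horse weight accel/const = 0.
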
* Step 2: Install the macro (= new procedure).
PRESERVE.
set printback=off.
DEFINE hcreg (dv = !charend('/')
  /iv = !charend('/')
  /test = !charend('/') !default(0)
  /const = !charend('/') !default(1)
  /method = !charend('/') !default(3)
  /covmat = !charend('/') !default(0)).
set length = none.
SET MXLOOP = 100000000.
MATRIX.
GET x/file = */variables = !dv !iv/names = dv/missing = omit.
compute y=x(:,1).
compute x=x(:,2:ncol(x)).
compute iv5 = x.
compute pr = ncol(x).
compute n = nrow(x).
compute L = ident(pr).
compute tss=cssq(y)-(((csum(y)&**2)/n)*(!const <> 0)).
do if (!const = 0).
  compute iv = t(dv(1,2:ncol(dv))).
  compute df2 = n-pr.
else.
  compute iv = t({"Constant", dv(1,2:ncol(dv))}).
  compute con = make(n,1,1).
  compute x={con,x}.
  compute df2 = n-pr-1.
  compute L1 = make(1,pr,0).
  compute L = {L1;L}.
end if.
compute dv=dv(1,1).
compute b = inv(t(x)*x)*t(x)*y.
compute k = nrow(b).
compute invXtX = inv(t(x)*x).
compute h = x(:,1).
loop i=1 to n.
  compute h(i,1)= x(i,:)*invXtX*t(x(i,:)).
end loop.
compute resid = (y-(x*b)).
compute mse = csum(resid&**2)/(n-ncol(x)).
compute pred = x*b.
compute ess= cssq(resid).
do if (!method = 2 or !method = 3).
  loop i=1 to k.
    compute x(:,i) = (resid&/(1-h)&**(1/(4-!method)))&*x(:,i).
  end loop.
end if.
do if (!method = 0 or !method = 1).
  loop i=1 to k.
    compute x(:,i) = resid&*x(:,i).
  end loop.
end if.
do if (!method = 5).
  loop i=1 to k.
    compute x(:,i) = sqrt(mse)&*x(:,i).
  end loop.
end if.
do if (!method = 4).
  compute mn = make(n,2,4).
  compute pr3 = n-df2.
  compute mn(:,2) = (n*h)/pr3.
  compute ex=rmin(mn).
  loop i=1 to k.
    compute x(:,i) = (resid&/(1-h)&**(ex/2))&*x(:,i).
  end loop.
end if.
compute hc = invXtX*t(x)*x*invXtX.
do if (!method = 1).
  compute hc = (n/(n-k))&*hc.
end if.
compute F = (t(t(L)*b)*inv(t(L)*hc*L)*((t(L)*b)))/pr.
compute pf = 1-fcdf(f,pr,df2).
compute r2 = (tss-ess)/tss.
compute pf = {r2,f,pr,df2,pf}.
do if (!method <> 5).
  print !method/title = "HC Method"/format F1.0.
end if.
print dv/title = "Criterion Variable"/format A8.
print pf/title = "Model Fit:"/clabels = "R-sq" "F" "df1" "df2" "p"/format F10.4.
compute sebhc = sqrt(diag(hc)).
compute te = b&/sebhc.
compute p = 2*(1-tcdf(abs(te), n-nrow(b))).
compute oput = {b,sebhc, te, p}.
do if (!method <> 5).
  print oput/title = 'Heteroscedasticity-Consistent Regression Results'
    /clabels = "Coeff" "SE(HC)" "t" "P>|t|"/rnames = iv/format f10.4.
else if (!method = 5).
  print oput/title = 'OLS Regression Results Assuming Homoscedasticity'
    /clabels = "Coeff" "SE" "t" "P>|t|"/rnames = iv/format f10.4.
end if.
compute iv2 = t(iv).
do if (!covmat = 1).
  print hc/title = 'Covariance Matrix of Parameter Estimates'
    /cnames = iv/rnames = iv2/format f10.4.
end if.
do if (!test > 0 and !test < pr).
  compute L2 = make(pr-!test+!const,!test,0).
  compute L = {L2;L((pr+1-!test+!const):(pr+!const),(pr-!test+1):(pr))}.
  compute F = (t(t(L)*b)*inv(t(L)*hc*L)*((t(L)*b)))/!test.
  compute pf = 1-fcdf(f,!test,df2).
  compute pf = {f,!test,df2,pf}.
  print pf/title = "Setwise Hypothesis Test"/clabels = "F" "df1" "df2" "p"/format F10.4.
  compute iv = t(iv((pr+1-!test+!const):(pr+!const),1)).
  print iv/title = "Variables in Set:"/format A8.
end if.
END MATRIX.
!END DEFINE.
RESTORE.
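
* Optional sanity check, given only as a sketch: it assumes the macro above has been run
  and that car_data.sav is the active dataset. With method = 5 the macro assumes
  homoskedasticity, so its coefficients and standard errors should agree with those of
  the built-in REGRESSION command, provided the variables contain no user-defined
  missing values.
HCREG dv = mpg/iv = horse weight accel/method = 5.
REGRESSION
  /DEPENDENT mpg
  /METHOD = ENTER horse weight accel.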