Sampling distribution of OLS estimators

FlorenceCC

Member
Hi,

I understand that the assumption that the sampling distribution of the OLS estimators b0 and b1 is asymptotically normal is a key property. However, I'm a bit stuck as to why that is. I assume the magic of the CLT comes into play here, but I guess there are still grey areas for me.
When we apply the CLT, we apply it not to the distribution of the sample, but to the distribution of the sample mean as a random variable.
When we talk about i.i.d. samples of X and Y here, and the corresponding SRF and b0/b1 estimators, we have a sampling distribution. But how does the CLT, which concerns the sampling distribution of the sample mean, become relevant? I guess what I am trying to express here is that we are interested in the sampling distribution of the estimators themselves, not the sampling distribution of a sample mean?

What am I missing? Hope my question makes sense, thanks!

Florence
 

David Harper CFA FRM

Subscriber
Hi Florence,

Yes, that's correct. This is a topic that was better explained in Gujarati, whose econometrics text was assigned before Stock and Watson. There is an indirect sense in which the b(0) and b(1) estimates are a sophisticated sort of sample mean, but they are sample statistics; e.g., each sample produces a different sample regression function with different intercept/slope estimates. You are right, of course, that a single scatterplot (of X,Y pairs) is a single sample. But then there is a set of assumptions that informs the OLS linear regression (technically, the Gauss-Markov theorem https://en.wikipedia.org/wiki/Gauss–Markov_theorem tells us the implications of these assumptions). Arguably, the key assumptions in the CLRM concern the error term: (1) its variance is presumed constant (i.e., homoskedastic; aka, "identical" over time), (2) it is uncorrelated with the independent/explanatory variable, and (3) it is uncorrelated with itself (i.e., no autocorrelation in the regression; aka, "independent").

The CLT tells us that the average or sum (after all, the sum is merely the average * n) of i.i.d. random variables tends toward normality as the sample size increases. These assumptions about the error term allow the CLT to be applied to the error term: it is approximately normal with mean zero, by construction of the OLS. The regression coefficients are, in turn, linear functions of the error term, so they inherit the (approximate) normality of the error term. At this point, there is a sense in which they are similar to sample means of a sophisticated sort.

The single sample is the scatterplot of pairwise (X,Y) values. We can retrieve the sample mean of X and the sample mean of Y; the regression line will run through this point, by construction. The CLT might tell us about the properties of these sample means of X and Y, but this is not the regression. Rather, the CLT informs the error of the regression and, indirectly, the regression coefficients (which are estimates). They each have their own standard errors (i.e., standard deviations applicable to them as sample statistics). If the regression produces a slope estimate of 0.30 with a standard error of 0.15, then by virtue of the OLS assumptions, the CLT is governing our ability to observe that, for a large sample, this slope is (0.30 - 0)/0.15 = 2.0 standard deviations away from zero; aka, different from zero with a two-sided p value of about 5% (roughly 95% confidence), or right on the decision "bubble." So the math is non-trivial but the essence of your logic is correct. I hope that's helpful!
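As a quick illustration of this point (a minimal simulation sketch with hypothetical parameter values, not from the assigned reading): even if the errors are deliberately non-normal, the slope estimates across many samples cluster around the true slope, approximately normally, with standard error near sigma/(sqrt(n)*sigma_x).

```python
import numpy as np

# Hypothetical parameters (illustrative only)
rng = np.random.default_rng(42)
b0_true, b1_true = 1.0, 0.30   # "true" intercept and slope
n, trials = 200, 5000          # observations per sample, number of samples
sigma, sigma_x = 2.0, 1.5      # error and X standard deviations

slopes = np.empty(trials)
half_width = sigma * np.sqrt(3.0)  # uniform(-a, a) has std dev a/sqrt(3)
for t in range(trials):
    x = rng.normal(0.0, sigma_x, size=n)
    eps = rng.uniform(-half_width, half_width, size=n)  # deliberately NON-normal errors
    y = b0_true + b1_true * x + eps
    # OLS slope estimate: sample cov(x, y) / sample var(x)
    slopes[t] = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# CLT at work: estimates center on the true slope with SE near sigma/(sqrt(n)*sigma_x)
print("mean of slope estimates:", slopes.mean())   # close to 0.30
print("empirical SE of slope:", slopes.std(ddof=1))
```

A histogram of `slopes` would look bell-shaped even though each sample's errors are uniform; this is the sense in which the CLT, operating through the error term, delivers the asymptotic normality of b1.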
 
Hi David, Equation 7.11 in the 2020 GARP Book 2 says that, per the CLT, the estimate of the slope coefficient (beta_hat) follows a normal distribution with mean centered on the true slope and variance = sigma^2/[n*sigma_x^2]. If this is true, then why do we assume the variance of beta_hat asymptotically tends to sigma^2/n in everything I've done/read so far? This directly impacts Q7.12 in the GARP textbook, where you are asked to compute the standard errors of alpha_hat and beta_hat.
 

David Harper CFA FRM

Subscriber
Hi @sohinichowdhury I don't like to be responsible for the new GARP source material because it contains many errors, but equation 7.11 looks correct to me, albeit not helpful most of the time. Rather, please notice that the next page shows the substitutions that lead to equations 7.13 and 7.14, which, at a quick glance, also look correct to me; i.e., the first term in 7.13, which gives the variance of beta_hat as s^2 divided by the sum of squared deviations of X, is certainly correct. Notice that 7.14 embeds s/sqrt(n). Please note that, of course, this refers to the variance/standard error of the regression's slope coefficient. There is an accurate sense in which the slope's standard error scales with s/sqrt(n), but you might (?) be thinking of the distribution of a univariate sample mean: per the CLT, the sample mean (of a "univariate" distribution) does have variance of σ^2/n. I hope that's helpful,
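To reconcile the two forms numerically, here is a minimal sketch on simulated data (all numbers hypothetical): because the sum of squared deviations Σ(xᵢ − x̄)² equals n times the sample variance of X, the 7.13-style expression s²/Σ(xᵢ − x̄)² and the 7.11-style expression s²/(n·σ_x²) are the same quantity.

```python
import numpy as np

# Hypothetical simulated sample (illustrative only)
rng = np.random.default_rng(7)
n = 100
x = rng.normal(0.0, 2.0, size=n)
y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, size=n)

xbar, ybar = x.mean(), y.mean()
sxx = np.sum((x - xbar) ** 2)                       # sum of squared deviations of X
beta_hat = np.sum((x - xbar) * (y - ybar)) / sxx    # OLS slope estimate
alpha_hat = ybar - beta_hat * xbar                  # OLS intercept estimate

resid = y - alpha_hat - beta_hat * x
s2 = np.sum(resid ** 2) / (n - 2)                   # s^2: estimated error variance

se_beta_a = np.sqrt(s2 / sxx)                       # s^2 / sum((x - xbar)^2), 7.13-style
sigma_x2 = sxx / n                                  # sample variance of X (divide by n)
se_beta_b = np.sqrt(s2 / (n * sigma_x2))            # s^2 / (n * sigma_x^2), 7.11-style
se_alpha = np.sqrt(s2 * (1.0 / n + xbar**2 / sxx))  # standard error of the intercept

print(np.isclose(se_beta_a, se_beta_b))             # True: identical formulas
```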
 
Thank you David. You're right! I was indeed thinking about the distribution of a univariate sample mean. It makes sense that for the regression's slope coefficient, the variance would embed the sigma^2/n term but also include the variance of the explanatory variable in the denominator.
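This contrast can be checked by simulation (a minimal sketch, hypothetical values): the sample mean of the errors has variance σ²/n, while the slope estimate has variance near σ²/(n·σ_x²), which is smaller here by roughly the factor σ_x² = 9.

```python
import numpy as np

# Hypothetical parameters (illustrative only)
rng = np.random.default_rng(0)
sigma, sigma_x, n, trials = 1.0, 3.0, 50, 20000

means = np.empty(trials)   # univariate sample means of the errors
slopes = np.empty(trials)  # OLS slope estimates
for t in range(trials):
    x = rng.normal(0.0, sigma_x, size=n)
    eps = rng.normal(0.0, sigma, size=n)
    y = 0.2 * x + eps
    means[t] = eps.mean()
    slopes[t] = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

print("var of sample mean:", means.var(ddof=1))  # close to sigma^2/n = 0.02
print("var of slope:", slopes.var(ddof=1))       # close to sigma^2/(n*sigma_x^2)
```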
 