Error term in multiple regression

brian.field

Well-Known Member
Subscriber
I simply cannot get this to compute....

One of the assumptions of multiple linear regression states that the error term cannot contain any factors that both affect Y and are also correlated with any of the independent variables in the regression. To me, this seems impossible to accomplish.

How is this feasible? There is, effectively, a countably infinite number of possible variables to consider. How can you possibly assume that none of these variables affects Y while also being correlated with the independent variables explicitly included in the multiple regression equation?

Can anyone shed any light on this? It continues to bother me... if this assumption is violated, then the OLS estimators will be biased.

I suppose, then, that the OLS estimators are always biased, and the concern should not be with producing unbiased estimators but rather with limiting the degree of bias? (Since there are measures that can indicate the direction and degree of the bias.)
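
To make my concern concrete, here is a minimal simulation sketch (my own illustration, assuming Python with numpy; all numbers are made up) of how an omitted factor that both affects Y and is correlated with an included regressor biases the OLS slope:

import numpy as np

# Hypothetical sketch of omitted-variable bias.
rng = np.random.default_rng(42)
n = 100_000

z = rng.normal(size=n)                        # omitted variable
x = 0.8 * z + rng.normal(size=n)              # x is correlated with z
y = 2.0 * x + 1.5 * z + rng.normal(size=n)    # true slope on x is 2.0

# Short regression of y on x alone: z is absorbed into the error,
# which is now correlated with x, so the slope estimate is biased.
slope_short = np.polyfit(x, y, 1)[0]

# Long regression of y on both x and z: the error is again
# uncorrelated with the regressors, so the slope is unbiased.
X = np.column_stack([np.ones(n), x, z])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(f"short regression slope: {slope_short:.3f}")   # ~2.73, biased upward
print(f"long regression slope:  {beta[1]:.3f}")       # ~2.00, unbiased

In this setup the bias is exactly cov(x, z)/var(x) times the omitted variable's coefficient, which is also why the direction and degree of the bias can be signed when you know how the omitted factor relates to the included ones.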

Brian
 

ShaktiRathore

Well-Known Member
Subscriber
Yes Brian, in practice the OLS assumptions are never perfectly satisfied, much as R^2 is almost never 100%. The MRA assumption that the error term is uncorrelated with the independent variables is far from reality; as you said, there is bound to be some correlation between the error term and the independent variables, and when there is, the coefficient estimates pick up omitted-variable bias. A separate but related assumption is homoskedasticity: the variance of the error term does not depend on the level of the independent variables. When it does depend on them, we have conditional heteroskedasticity, which does not bias the coefficients but does distort their standard errors.
MRA also assumes the independent variables are not highly correlated with each other (no multicollinearity) and that, separately and together, they account for the dependent variable. In reality some degree of both phenomena is present, but we accept MRA results at some level of accuracy.
So OLS estimates do carry some bias in practice, and we accept them to a desired degree of accuracy; they need not be 100% right, since the assumptions are never 100% valid and some error is bound to creep in here or there.
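As a small illustration of the distinction (my own sketch, assuming Python with numpy), heteroskedastic errors leave the OLS slope centered on the true value; it is the precision of the estimate, not its center, that is affected:

import numpy as np

# Hypothetical sketch: error variance grows with x
# (conditional heteroskedasticity).
rng = np.random.default_rng(7)
n, true_slope = 2_000, 2.0

slopes = []
for _ in range(500):                       # repeat the experiment
    x = rng.uniform(0.5, 3.0, size=n)
    eps = rng.normal(scale=x, size=n)      # error sd proportional to x
    y = true_slope * x + eps
    slopes.append(np.polyfit(x, y, 1)[0])

print(f"mean estimated slope: {np.mean(slopes):.3f}")   # ~2.000, no bias
print(f"spread of estimates:  {np.std(slopes):.4f}")
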
Thanks
 

brian.field

Well-Known Member
Subscriber
I am not sure I understand your question. Are you asking which R^2 to use? In a simple linear regression framework, the R^2 can be used. In a multiple regression framework, i.e., when there are multiple independent variables, you should use the adjusted R^2, which effectively penalizes the fit metric for each additional variable used (you want to use the fewest variables possible without omitting any critical ones).
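
As a quick sketch of that penalty (my own illustration, assuming Python with numpy): adding a pure-noise regressor can only raise the plain R^2, while the adjusted R^2 can fall:

import numpy as np

# Hypothetical sketch: plain vs. adjusted R^2 when a useless
# regressor is added to the model.
rng = np.random.default_rng(1)
n = 60
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)

def r2_stats(regressors, y):
    X = np.column_stack([np.ones(len(y))] + list(regressors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ssr = resid @ resid                    # sum of squared residuals
    tss = ((y - y.mean()) ** 2).sum()      # total sum of squares
    k = X.shape[1] - 1                     # regressors, excluding intercept
    r2 = 1 - ssr / tss
    adj = 1 - (len(y) - 1) / (len(y) - k - 1) * ssr / tss
    return r2, adj

junk = rng.normal(size=n)                  # unrelated to y
print("x only  :", r2_stats([x], y))       # (R^2, adjusted R^2)
print("x + junk:", r2_stats([x, junk], y)) # R^2 creeps up, adjusted R^2 typically falls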
 

brian.field

Well-Known Member
Subscriber
There are a few different ways to calculate the adjusted R^2. (Some of the alternatives are confusing since the acronyms used for the various sum-of-squares values are so hard to keep straight.) I would suggest that you memorize the approach presented in the assigned text, even if you came across a different version in Schaum's or elsewhere.
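
For what it's worth, the common variants are algebraically identical. Since R^2 = 1 - SSR/TSS (taking SSR as the sum of squared residuals and TSS as the total sum of squares), the form

Adjusted R^2 = 1 - [(n - 1)/(n - k - 1)]*(SSR/TSS)

is the same as the form written in terms of R^2 itself:

Adjusted R^2 = 1 - [(n - 1)/(n - k - 1)]*(1 - R^2)

The confusion usually comes from texts that use SSR for the regression (explained) sum of squares rather than the residual sum of squares, so check the acronym definitions before comparing formulas.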

Brian
 

Dr. Jayanthi Sankaran

Well-Known Member
Yes, Debbie - it would be advisable to just stick to David's study notes:

Adjusted R^2 = 1 - [(n - 1)/(n - k - 1)]*(SSR/TSS), where SSR is the sum of squared residuals and TSS is the total sum of squares. Don't confuse yourself by reading too many different texts. It is better to remain focused on David's study notes and PQ sets!
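
To make the formula concrete with made-up numbers: with n = 50 observations, k = 3 regressors, and SSR/TSS = 0.25 (so plain R^2 = 0.75),

Adjusted R^2 = 1 - (49/46)*0.25 ≈ 1 - 0.266 ≈ 0.734

slightly below the plain R^2 of 0.75, which is the penalty for carrying three regressors.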

Jayanthi
 
Hi David,

If the error term's variance varies with the DEPENDENT variable, then the condition of homoskedasticity is also broken, right?

It makes sense to me, since the dependent variable establishes an indirect relationship between the error term and the independent variables, as the dependent variable itself depends on the independent variables.

Kindly help.

Thanks
Praveen
 