Degrees of freedom concept


Rf67

Guest
hi,

I tried searching the forum for this but no luck,

I understand the idea of degrees of freedom, but I'm still unsure why it is used. Could someone explain in layman's terms?

Like exponential becomes normal with more df?

Thanks
 

ShaktiRathore

Well-Known Member
Subscriber
Hi
Degrees of freedom (df) for an estimate is the number of observations that go into the estimate minus the number of intermediate parameters that have to be estimated first. For example, to estimate the variance of n observations we first use the sample mean (1 parameter), so the variance has n-1 df. Similarly, in a regression with one independent variable and n observations, 2 parameters (the intercept and one slope) are estimated to predict the dependent variable, so there are n-2 df in total. With two independent variables, df = n-3 (two slopes and one intercept). In general, for k independent variables, df = n-k-1 (k slope coefficients plus the intercept go into the estimate of the dependent variable). To summarise: the mean itself needs no earlier estimate, so nothing is subtracted and all n observations count; the variance needs one sample mean, so n-1 df; a regression with one independent variable needs two estimated parameters (intercept and slope), so n-2 df; and in the same way a regression with k independent variables has n-k-1 df, because k+1 parameters are estimated.
In a way, the more independent variables there are, the more coefficients have to be estimated from the same data, and, other things equal, the less accurate the estimate of the dependent variable becomes. So each independent variable we add lowers the df, while a larger number of observations increases the accuracy of the estimate and hence the df.
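To make the n-1 for the variance concrete, here is a minimal sketch using only the Python standard library. The four data values are made up for illustration. It shows that deviations from the sample mean always sum to zero, so once n-1 of them are known the last one is not free to vary, and that the usual sample variance divides by n-1.

```python
import statistics

data = [4.0, 7.0, 1.0, 8.0]  # made-up observations, for illustration only
n = len(data)

mean = statistics.fmean(data)          # the one parameter estimated first
deviations = [x - mean for x in data]  # deviations from the sample mean

# Deviations from the sample mean always sum to zero ...
total = sum(deviations)

# ... so the last deviation is fully determined by the other n-1.
last_from_others = -sum(deviations[:-1])

# Sample variance: sum of squared deviations divided by df = n - 1.
manual_var = sum(d * d for d in deviations) / (n - 1)
library_var = statistics.variance(data)  # also uses the n - 1 divisor

print(total, last_from_others, deviations[-1])
print(manual_var, library_var)
```

Only n-1 deviations carry independent information once the mean has been fixed, which is the sense in which one degree of freedom is "used up".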
 

Rf67

Guest
Hi,

Thank you for your response as always,

Can you explain more about why degrees of freedom are relevant?

I am doing FRM to know the why, rather than memorising the mechanics of formulae. Am I correct in saying that the degrees of freedom combine the number of observations and number of parameters then?
 

ShaktiRathore

Well-Known Member
Subscriber
Hi
As far as I can explain: yes, df makes sense when you want to know how close our estimate is to the actual value. As the number of observations increases, the estimates of the regression parameters associated with the independent variables become more accurate and efficient, because the standard error decreases. But introducing more independent variables also brings more error into the estimate, because now not 1 but k different coefficient errors can make the estimate less accurate. So we effectively reduce our number of observations by the number of estimated parameters (k slopes plus the intercept), and df becomes n-k-1. Overall, df indicates how efficient your estimate is: a higher df implies a more efficient estimate.
And yes, as far as regression is concerned, df combines the number of observations and the number of parameters, as is evident from the formula for a regression with k independent variables: df = n-k-1, where k+1 is the number of parameters and n is the number of observations.
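One way to see the "why" in the simplest case (the sample variance) is a small simulation. This is only a sketch under stated assumptions: the data are drawn from a normal distribution with a made-up true variance of 4, the sample size is 5, and the seed and trial count are arbitrary. Dividing the sum of squared deviations by n tends to understate the true variance, while dividing by df = n-1 corrects for the mean having been estimated from the same data.

```python
import random

random.seed(0)      # arbitrary seed so the run is repeatable
TRUE_VAR = 4.0      # assumed true variance (standard deviation 2)
n = 5               # small sample size, so the effect is visible
trials = 20000      # arbitrary number of repeated samples

sum_div_n = 0.0
sum_div_df = 0.0
for _ in range(trials):
    sample = [random.gauss(0.0, 2.0) for _ in range(n)]
    m = sum(sample) / n                      # mean estimated from the sample
    ss = sum((x - m) ** 2 for x in sample)   # sum of squared deviations
    sum_div_n += ss / n                      # ignores the lost degree of freedom
    sum_div_df += ss / (n - 1)               # divides by df = n - 1

avg_div_n = sum_div_n / trials
avg_div_df = sum_div_df / trials
print("average when dividing by n:  ", avg_div_n)
print("average when dividing by n-1:", avg_div_df)
print("true variance:               ", TRUE_VAR)
```

On average the n-divisor version comes out near (n-1)/n times the true variance, and the n-1 version comes out near the true variance. The same logic is why regression error variance is divided by n-k-1 rather than n.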
Thanks
 

Rf67

Guest
"...the standard error decreases, but introducing more independent variables also leads to more error in the estimate, because now not 1 but k different errors can lead to a less accurate estimate, so we effectively decrease our observations by the number of independent variables and df becomes n-k-1. Overall, df signifies how efficient your estimate is; a higher df implies a more efficient estimate."

So errors decrease (because more independent variables reduce the size of the error term), but increasing the number of independent variables also leads to more error in the estimate.

Because we add more independent variables, must we compensate by increasing the degrees of freedom? So, say we have 10 independent variables: will that make for a much more efficient estimate than 2 degrees of freedom?
 

ShaktiRathore

Well-Known Member
Subscriber
Hi
df = n-k-1, so if the number of independent variables (k) increases, df decreases. For a given 30 observations, i.e. n = 30: with k = 1 independent variable, df = 30-1-1 = 28; with k = 3, df = 30-3-1 = 26. So df decreases as independent variables are added. As k increases, more estimation error enters the estimate of the dependent variable, which goes hand in hand with the decrease in df. A larger df implies a more efficient estimate of the dependent variable.
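The arithmetic above can be sketched in a few lines of Python. The function name `residual_df` is made up for this illustration, and the extra values k = 2 and k = 10 are added only to cover the case asked about earlier in the thread.

```python
def residual_df(n_obs: int, k_vars: int) -> int:
    """Degrees of freedom for a regression with an intercept: n - k - 1."""
    df = n_obs - k_vars - 1
    if df <= 0:
        # Too many parameters for the data: nothing is left to estimate error.
        raise ValueError("need more observations than estimated parameters")
    return df

n = 30  # number of observations, as in the example above
for k in (1, 2, 3, 10):
    print(f"k = {k:2d} independent variables -> df = {residual_df(n, k)}")
```

With n fixed at 30, going from 2 to 10 independent variables lowers df from 27 to 19, so adding variables uses up degrees of freedom rather than adding them.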
Thanks
 