Student t distribution

noalv4

Member
Hello,

Could I have an explanation (calculation) of how to arrive at the critical t of 2.262 in the example on page 56 (P1.T1 Quantitative Analysis)?

Thanks,
Noa.
 

David Harper CFA FRM

Subscriber
Hi Noa,

The critical t is just a lookup from (a reference to) the Student's t distribution, as opposed to a computed t-statistic (aka t-ratio). In this way, a critical t is an inverse CDF (quantile function) value just like, for a normal distribution, the "critical [one-tailed] value" at 1% is -2.33 and at 5% is -1.645.

But on page 56, we want the critical t for (n-1) degrees of freedom and two-tailed 5% (= one-tailed 2.5%).
So we can find 2.262 on page 68 (the Student's t lookup table) where column = 2-tail 0.05 and d.f. = 9.
FWIW, in Excel, 2.262 = T.INV.2T(5%, 9)
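
For anyone without the table or Excel handy, the same lookup can be sketched in plain Python. This is only an illustration, not anything from the notes: the helper names (t_pdf, t_cdf, t_quantile) are made up here, the CDF is approximated with Simpson's rule, and the quantile is found by bisection. A two-tailed 5% test puts 2.5% in each tail, so the critical t is the 97.5% quantile at 9 d.f.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(p, df, lo=-50.0, hi=50.0):
    """Inverse CDF by bisection: the x for which P(T <= x) = p."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Two-tailed 5% => 2.5% in the upper tail => 97.5% quantile, d.f. = 9
critical = t_quantile(0.975, 9)
print(round(critical, 3))
```

The printed value should agree with the table entry (column 2-tail 0.05, row d.f. = 9) to three decimals.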

We are starting a "new thing" where we tag comments like yours, to inform a revision/insertion into the notes. So your question is helpful, because we can add some explanation to p. 56. Thanks,
 

tosuhn

Active Member
Hi @David Harper CFA FRM CIPM, I am revising hypothesis testing and came across this. I can't recall if I have already asked you. I'd appreciate it if you could go through with me how to obtain a t-statistic of 1 from the value of 16.36% and df = 24 using the t-tables.
Question:
4. A sample of 25 money market funds shows an average return of 3.0% with standard deviation also of 3.0%. Your colleague Peter conducted a significance test of the following alternative hypothesis: the true (population) average return of such funds is GREATER THAN the risk-free rate (Rf). He concludes that he can reject the null hypothesis with a confidence of 83.64%; i.e., there is a 16.36% chance (p value) that the true return is less than or equal to the risk-free rate. What is the risk-free rate, Rf? (note: this requires lookup-calculation)
a) 1.00%
b) 1.90%
c) 2.00%
d) 2.40%
Hope to hear from you soon :)
Regards,
Sun
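
One way to see where t = 1 comes from: with n = 25 and a sample standard deviation of 3.0%, the standard error is 3.0%/sqrt(25) = 0.6%, so each answer choice implies a t-statistic of (3.0% - Rf)/0.6%. The rough sketch below (plain Python; the helper names are illustrative and numerical integration stands in for the t table) computes the one-tailed p-value at 24 d.f. for each candidate Rf and picks the one closest to the stated 16.36%.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail_p(t_stat, df, steps=2000):
    """P(T > t_stat) for t_stat >= 0, via Simpson's rule on [0, t_stat]."""
    h = t_stat / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t_stat, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

n, sample_mean, sample_sd = 25, 0.030, 0.030   # from the question
target_p = 0.1636                              # stated one-tailed p-value
std_err = sample_sd / math.sqrt(n)             # 0.6%

candidates = [0.010, 0.019, 0.020, 0.024]      # answer choices a) to d)
p_values = {}
for rf in candidates:
    t_stat = (sample_mean - rf) / std_err
    p_values[rf] = upper_tail_p(t_stat, n - 1)
    print(f"Rf = {rf:.2%}  t = {t_stat:.3f}  p = {p_values[rf]:.4f}")

# The candidate whose p-value is nearest the stated 16.36%
best_rf = min(candidates, key=lambda rf: abs(p_values[rf] - target_p))
print(f"closest match: Rf = {best_rf:.2%}")
```

Under these assumptions, the 2.40% choice implies t = (3.0% - 2.4%)/0.6% = 1.0, whose one-tailed p-value at 24 d.f. is roughly 16.4%. Reading a t table works the same way in reverse: find the row d.f. = 24 and locate the t value whose one-tail probability is about 0.1636.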
 
Hi,

"The t-statistic tests the null hypothesis that the population mean equals a certain value:
- If the sample (n) is large (e.g., greater than 30), the t-statistic has a standard normal sampling distribution when the null hypothesis is true."

When we say the sample size is large, as mentioned above, does that mean the number of observations in each sample is large, or that the number of such samples is large?

And how does the number of observations in each sample affect the assumptions regarding the distribution or the population parameters?
E.g.: 1. 10 observations in each sample and 25 such samples.
2. 25 observations in each sample and 10 such samples.

Kindly help me understand.

Thanks,
Praveen
 

brian.field

Well-Known Member
Subscriber
I would conclude that it is a large number of samples since your question states "sampling" distribution. However, it is an interesting question. If we assume we are dealing with samples only, then what if each sample contains only one observation? Then I am not sure that "large" would be achieved by ~30, (30 is usually considered large!)

So, if the number of observations in each sample is small, then you would need more individual samples to approach normality (via CLT) whereas if each sample had many observations, then it would take fewer samples to approach normality.

I didn't really answer your question!
 

ShaktiRathore

Well-Known Member
Subscriber
Hi,
It's known that when the sample size is large (n >= 30), the t distribution is approximately normal. So the sample size is nothing but the size of the sample itself. It's equivalent to the sampling distribution of 30 samples of size 1 observation each, so it's one and the same as the distribution of 30 observations. The null hypothesis being true confirms that the t distribution is normal. Brian, n is large in the sense that there are many sampling combinations possible out of a sample of size 30: 30Cr, taking r as the sample size, so the sampling distribution has many samples. 30C5 is a large number. The CLT confirms that the sampling distribution is normal for these samples. We take the case r = 1 and assume the same.
If the number of observations is 2 instead of 1, then we have 30C2, or 435, different samples, which results in a sampling distribution closer to normal, or to the population (reduction in standard error). As you increase the observations within a sample, the number of different samples gets large, with large implications for the sampling distribution.
Finally, it's appropriate to assume the t distribution is normal at n greater than 30.
Thanks
 

David Harper CFA FRM

Subscriber
It's a good question @Praveen_India; even after years I still need to remind myself of the difference between sample size and number of trials. The magic of the CLT refers to the number of observations in a sample (i.e., not the number of samples).

In case it helps to illustrate, I'd like to share one of the projects I did for my statistical inference class at Hopkins: http://rpubs.com/bionicturtle/jh-ds-st-p1 (this is short). Please note:
  • The population distribution is the exponential distribution which has a standard deviation (and mean, too!) of 1/λ
  • Then I ran a simulation: sample size of 40 with 1,000 trials; i.e., 40 columns by 1,000 rows. Importantly, n = 40 (not 1,000)
  • CLT says the sample mean has a (theoretical) standard deviation of (1/λ)/sqrt(40); this is simulated by a single row; i.e., a single trial which is a sample of 40 random observations
  • Then 1,000 trials (i.e., 1,000 rows) are generated "simply" to see if the actual results compare to the theoretical; in technical terms, the sample mean of each row becomes an observation, and there are then 1,000 observations generated, but this is an additional step that should not confuse. I could have run 300 rows or 10,000 and the theoretical standard deviation of the sample mean would not change.
  • You can see from my chart: the magic is how approximately normal is the distribution of sample means (in blue). I hope that's interesting, it never ceases to fascinate me :)
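
The setup in the bullets above can be roughly reproduced in a few lines of plain Python. This is only a sketch and not the code from the linked project: the rate λ = 0.2 and the random seed are assumptions chosen for illustration. Each trial draws n = 40 exponential observations and records their mean; the 1,000 trials exist only to check the simulated spread of those means against the theoretical (1/λ)/sqrt(40).

```python
import math
import random
import statistics

random.seed(42)      # assumed seed, only for reproducibility
lam = 0.2            # assumed rate; population mean = sd = 1/lam = 5
n = 40               # sample size: observations per trial (this is the CLT's n)
trials = 1000        # number of trials: rows of the simulation

# One sample mean per trial (row)
sample_means = [
    statistics.fmean([random.expovariate(lam) for _ in range(n)])
    for _ in range(trials)
]

# Theoretical sd of the sample mean depends on n only, not on trials
theoretical_sd = (1 / lam) / math.sqrt(n)
simulated_sd = statistics.stdev(sample_means)

print(f"theoretical sd of sample mean: {theoretical_sd:.4f}")
print(f"simulated sd over {trials} trials: {simulated_sd:.4f}")
print(f"mean of sample means: {statistics.fmean(sample_means):.4f}")
```

Changing trials to 300 or 10,000 only changes how tightly the simulated figure matches the theoretical one; the theoretical standard deviation moves only when n changes.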
 
Hi David,

That was a very clear explanation. Thank you.

So when the sample size is > 30, the distribution of sample means is assumed to be normal, and we need not worry about how many such samples have been taken. As I understand from the graph in your project, even a thousand trials have not created a perfect normal curve; I guess it takes a very large number of samples, but ultimately it will be a perfect normal.

Thanks Shakti and Brian for your inputs.

Thanks,
Praveen
 