How to formulate hypothesis?

sridhar

New Member
I came across an old FRM exam question:
"Suppose the std dev of a normal population is known to be 10 and the mean is hypothesized to be 8. Suppose a sample size of 8 is considered. What is the range of sample means that allow the hypothesis to be accepted at a level of significance of 0.05?"

My question is: how do I formulate H0 and H1 (the alternate...)? I thought since we "want" to accept that the mean is 8, I should formulate that as H1...That is:

H0: mu <> 8

H1: mu = 8;

Is this the correct way to formulate this? <> above means "not equal to...." I am confused. Because of the way the question is phrased (".....range of sample means...."), I thought:

H0: mu = 8

H1: mu <> 8;

--sridhar
 

David Harper CFA FRM

Subscriber
Hi sridhar,

Agreed, this can be a mind-bender. A couple of "shortcut" notes:

1. If H0:mu <> 8 were the null, i *think* it would be impossible to develop a test statistic; e.g., you don't know if you should use 9 or 10, etc. What i mean is, with this null, I don't immediately see how you can calculate the test statistic.

2. You don't need to engage the null/alt issue to answer the question. The quick way to answer is to attach the range to the anchor: hypothesized mean +/- (critical Z value @ 5% two-tailed significance)(std error = population std dev/SQRT(8)). Note the language "std dev of a normal population is known" should tell you we want a normal distribution (i.e., known variance) instead of a student's t (unknown variance). (I realize this is probably not your point...) I raise this because it is a bit unusual to know the population variance; more often we don't know the pop variance, and then the student's t is appropriate.
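
As a rough sketch of that shortcut with the question's numbers (hypothesized mean 8, known population std dev 10, n = 8, 5% two-tailed significance), using only the Python standard library; the variable names are illustrative:

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 8.0     # hypothesized population mean (from the question)
sigma = 10.0   # known population standard deviation (from the question)
n = 8          # sample size (from the question)
alpha = 0.05   # two-tailed significance level

# Standard error of the sample mean when the population std dev is known
std_error = sigma / sqrt(n)

# Two-tailed critical value of the standard normal (about 1.96 at 5%)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Range of sample means for which we fail to reject H0: mu = 8
lower = mu_0 - z_crit * std_error
upper = mu_0 + z_crit * std_error
print(f"fail to reject H0 if {lower:.2f} <= sample mean <= {upper:.2f}")
```

With these inputs the range works out to roughly 1.07 to 14.93.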

To your question:

I agree it is tempting to approach from "we want the sample mean to be 8" but it can create confusion. Consider instead when we test for significance of regression parameters; e.g.,

Y = (0.1 = sample beta)*X + intercept

If we test for the slope coefficient, our null hypothesis is, H0: beta = 0. In this case, what we really "want to find" is that our 0.1 coefficient is significant. Significance means "different from zero." So, null is: beta = 0, and we "get what we want" if the null fails.

Similarly, in yours above, you might think of the question as, "I want to find if the population mean is significantly different than 8." So, i will set the null: mean = 8. I will get what i want if the null fails.

and, finally, the question is technically imprecise. We do not accept the null. We fail to reject the null. Think jury verdict where null: defendant is innocent. Jury will find defendant guilty (reject null) or not guilty (fail to reject null) but there is no finding of "innocent" (accept null). I can't tell if this helps your intuition, but the question should say either:

"...that allow the hypothesis to be rejected...", or
"...that allow the alternative hypothesis to be accepted", or even
"...that allow the null hypothesis to be accepted (i.e., fail to be rejected)"

David
 

sridhar

New Member
Thank you David. I understand what you are saying. The not guilty vs. innocent comparison is an apt analogy. The question was apparently from FRM 2006. Since the time I posed the question, I've read somewhere that the null hypothesis must always include the "=" sign -- as in:

H0: mu = 0 or

H0: mu <= 0 or

H0: mu >= 0

While on the subject -- can you elaborate on the concept of the p-value in the context of hypothesis testing? As I've read it, the p-value is the smallest level of significance for which the null hypothesis can be rejected. I'm not sure I really understand this.

1. Is p-value related to the so-called Type I error.

2. Is there a relationship between p-value and the t-statistic (for tests where this statistic is meaningful?)

--sridhar
 

David Harper CFA FRM

Subscriber
Thanks for the "=" rule, i did not know that per se, but it makes perfect sense.

Regarding the p-value, I advise you *not* to try to connect it to the other two concepts; it leads to nuanced errors like "p value = probability [Type I error]" which is not the case.

Going to your example above, we have a hypothetical mean (H0: population mean = 0) and a sample mean. The "traditional" approach is to:

1. Solve for the test statistic; e.g., how many standard deviations (std errors) away from the hypothesized mean is my sample?
2. Select a significance level; e.g., 5%
3. and now SOLVE for the decision; e.g., reject @ 5%.
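
A minimal sketch of these traditional steps, borrowing the numbers from the original question (hypothesized mean 8, known population std dev 10, n = 8). The observed sample mean of 16 is a made-up value purely for illustration, and a two-tailed z-test is assumed:

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n = 8.0, 10.0, 8   # from the original question
sample_mean = 16.0              # hypothetical observation (assumption)

# Step 1: test statistic = distance from the hypothesized mean, in standard errors
z_stat = (sample_mean - mu_0) / (sigma / sqrt(n))

# Step 2: select a significance level and look up the two-tailed critical value
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 3: solve for the decision
reject_null = abs(z_stat) > z_crit
print(f"z = {z_stat:.2f}, critical z = {z_crit:.2f}, reject H0: {reject_null}")
```

Here the sample mean sits about 2.26 standard errors from 8, beyond the 1.96 cutoff, so the null is rejected at 5%.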

The p-value approach is:
1. Solve for the test statistic; e.g., how many standard deviations (std errors) away from the hypothesized mean is my sample?
2. Ask, If I were to just barely reject the null (i.e., if i were right on the border between accepting and rejecting the null, based on the standard error)...
3. Solve for, what significance level is implied by that?

In other words, instead of "pick significance, make decision," it is "pick borderline decision, solve for significance." If the p-value is, say, 3%, and we further reduce the significance to, say, 2% (which means the confidence goes up from 97% to 98%), then we can no longer reject the null at that higher confidence level. At the p-value, we cannot ask for more confidence and still reject the null; beyond the p-value, higher confidence (lower significance) is a higher burden that forces us to accept (i.e., fail to reject) the null.
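
To make the "solve for significance" idea concrete, here is a rough sketch for a two-tailed z-test. It reuses the original question's numbers (hypothesized mean 8, known population std dev 10, n = 8); the sample mean of 16 is a hypothetical value chosen only for illustration:

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n = 8.0, 10.0, 8   # from the original question
sample_mean = 16.0              # hypothetical observation (assumption)

# Test statistic: distance from the hypothesized mean, in standard errors
z_stat = (sample_mean - mu_0) / (sigma / sqrt(n))

# Two-tailed p-value: the significance level at which |z_stat| sits
# exactly on the border of the rejection region
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
print(f"p-value = {p_value:.4f}")

# Compare the p-value against several significance levels
for alpha in (0.05, 0.03, 0.02, 0.01):
    decision = "reject" if p_value <= alpha else "fail to reject"
    print(f"alpha = {alpha:.2f}: {decision} H0")
```

With these inputs the p-value lands between 2% and 3%, so the null is rejected at 5% and 3% but not at 2% or 1%.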

On 1 and 2, I don't like trying to link them because of the misunderstandings that may be introduced. Please note the test statistic is generic to the significance test; it can use any distribution, including but not limited to the student's t distribution (which is implied by a t-test).

David
 