Basic Statistics

Deepak Chitnis

Active Member
Subscriber
Dear David,
Saw your example on page 32 of Basic Statistics (R10.P1.T2.Miller). In the real exam, are we supposed to solve the questions like that? Are we supposed to calculate the covariance, standard deviation, or correlation from given information? If yes, can you post any GARP practice questions similar to it?
Thank you,
 

Prabhakar

New Member
Subscriber
Hi @David Harper CFA FRM CIPM

I have a question regarding a sample solved problem in the chapter "Basic Statistics" of the Miller textbook. The problem is below:

Question:
Assume that the mean of daily Standard & Poor’s (S&P) 500 Index returns
is zero. You observe the following returns over the course of 10 days:

7 %
–4%
11%
8%
3%
9%
–21%
10%
–9%
–1%

Estimate the standard deviation of daily S&P 500 Index returns.

I am failing to understand why the variance was calculated using n and not n-1, since the problem states the population mean is zero. My understanding is that we use the n-1 formula only when the sample mean is known. Even in the calculation, the value of 0 is plugged in, which is the population mean, but the formula says it is the sample mean. Can someone clarify this?
 

David Harper CFA FRM

David Harper CFA FRM
Subscriber
Hi @Prabhakar

We don't use (n-1) when the sample mean is known; we always know the sample mean. His question is, in my opinion, oddly worded. In the FRM, if the goal were to use (n) rather than (n-1) as the divisor, we would expect to see "Assume this is a population" or, even better, "Calculate the standard deviation based on the maximum likelihood estimator (MLE) of the variance rather than the unbiased variance." This is better because it recognizes that either (n) or (n-1) is actually okay: they are both just estimators of an unknown population variance. The issue is the properties of the estimator (in fact, this question may also implicitly assume normal returns; see, for example, http://stats.stackexchange.com/ques...-for-variance-or-maximum-likelihood-estimator)

Miller is okay here because he is giving us the population mean, which is used in the variance estimate. That's the key to using (n), because it's a matter of degrees of freedom: the divisor is the df. If we use the sample itself to estimate the mean, we've consumed a degree of freedom, and a divisor of (n-1) is the appropriate unbiased variance estimate. If we measure a population--or, alternatively, are given the mean, as Miller has--then in either case we aren't "bootstrapping" the sample to generate our own mean estimate (if it's a population, the mean is by definition the population mean!). So he is correct: rather than estimating the mean to be used in the variance calculation (which consumes a df), we are given the mean, which does not require us to self-reference the sample.

Note he says "The sample mean is not exactly zero, but we are told to assume that the population mean is zero; therefore:" ... which is correct. If we were not given a population mean to assume, we'd use the sample mean and then divide by (n-1), per the convention of wanting the unbiased estimate (however, we could still seek the MLE estimate and validly divide by n; it's important to realize these are both just estimators with different properties). But we are given the population mean, so there is no need to reduce the df. I hope that helps, thanks,
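To make the difference concrete, here is a quick sketch in Python using the ten returns from Miller's example (values in percent). The first calculation uses the given population mean of zero and divides by n; the second shows what we would do if we instead had to estimate the mean from the sample, consuming a degree of freedom and dividing by (n-1):

```python
import math

# Daily S&P 500 returns from Miller's example (in %)
returns = [7, -4, 11, 8, 3, 9, -21, 10, -9, -1]
n = len(returns)

# Case 1: population mean is given as zero, so no df is consumed.
# Deviations are taken from the given mean and we divide by n.
var_given_mean = sum(r**2 for r in returns) / n
sd_given_mean = math.sqrt(var_given_mean)

# Case 2: mean must be estimated from the sample itself.
# Estimating the mean consumes one df, so we divide by (n - 1).
sample_mean = sum(returns) / n
var_unbiased = sum((r - sample_mean)**2 for r in returns) / (n - 1)
sd_unbiased = math.sqrt(var_unbiased)

print(f"SD with given population mean (divide by n):   {sd_given_mean:.2f}%")
print(f"SD with estimated sample mean (divide by n-1): {sd_unbiased:.2f}%")
```

The first line reproduces Miller's answer of roughly 9.8%; the second, using the sample mean of 1.3%, comes out a bit higher at roughly 10.3%.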
 