I always found this to be a bit confusing, so I spent some time today looking at it some more. I also felt it would be helpful for some others, so I have included my thoughts below.
@David Harper CFA FRM - please let me know if you have any concerns with my thoughts.
Assume we have a population with a mean of 3.5 and a variance of 2.916667.
Now, let's assume we take a sample of size 2 from this population. The actual mean of this specific sample may or may not be 3.5, but its expected value is still 3.5. Similarly, the sample mean is itself a random variable, and its variance is the variance of the population divided by n, or 2.916667/2.
Let's repeat this 1000 times. Then, the resulting 1000 samples of size 2 produce 1000 different means each of which is a point estimate of the population mean. Further, the mean of the 1000 point estimates is also a point estimate of the population mean! This mean of the 1000 individual means is the mean of the sampling distribution of the mean (which is also expected to be 3.5)!
Note that the sample size is still 2 and NOT 1000 in the above scenario! (I always found it difficult to decide if the sample size was 2 or 1000)!
The variance of this data set of 1000 sample means (each from a sample of size 2) is expected to be close to the variance of the original population divided by 2 - remember n = 2, not 1000!
Now, let's assume we take samples of size 100 and we repeat the above exercise. Now, n = 100 and we still produce 1000 point estimates, each from a sample of size 100. Again, the average of the 1000 means will be close to 3.5 - it should be closer to 3.5 than the average of the 1000 point estimates from samples of size 2, but in both cases the "expected value" is 3.5. Similarly, the variance of the distribution of means for the 1000 point estimates from samples of size 100 is expected to be 2.916667/100 - i.e., the variance is much lower than in the first example. This is intuitive, since the impact of an outlier on a sample of size 100 is significantly less than the impact an outlier would have on a sample of size 2!
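To make the "n versus number of repetitions" distinction concrete, here is a minimal Python sketch of the experiment described above. It assumes the population is a fair six-sided die (which has exactly this mean and variance); the function name simulate_sample_means and the seed are made up for illustration, and this is not the Excel approach used later in the post.

```python
import random
import statistics

POP_VAR = 35 / 12  # population variance of a fair die, about 2.916667


def simulate_sample_means(n, replications, seed=0):
    """Draw `replications` samples of size `n` (with replacement) from a
    fair die and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.randint(1, 6) for _ in range(n))
        for _ in range(replications)
    ]


# n is the sample size (2 or 100); 1000 is only the number of repetitions.
for n in (2, 100):
    means = simulate_sample_means(n, replications=1000)
    print(
        f"n={n}: mean of means={statistics.fmean(means):.4f}, "
        f"variance of means={statistics.variance(means):.4f}, "
        f"population variance / n={POP_VAR / n:.4f}"
    )
```

The printed variance of the means should land near 2.916667/n in each case, not near 2.916667/1000.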
Now is the interesting part.....and this is where I have had trouble in the past. We know via the CLT that the sampling distribution of the sample mean will be approximately normal for a large enough sample size, regardless of the original population's distribution (and exactly normal if the population itself is normal).
I often wondered if the sampling in the above scenarios was with or without replacement. My analysis suggests to me that it must be with replacement (particularly if the population is discrete and small)!
Consider a single die! Then we know that the expected value is 3.5 and the variance is 2.916667, as I elected to use above. Using a random number generator in Excel, I generated 100 samples of size 2, or in other words, 200 rolls, since they are all independent! Then I took the average of each pair to arrive at a sampling distribution of the mean. The average of the 100 sample means was 3.4250 (fairly close to the true population mean of 3.5). I also calculated the variance of the 100 averages, which turned out to be 1.3619. This is pretty close to the population variance divided by 2, or 2.916667/2 = 1.45833.
Then, I repeated the exercise with 100 samples of size 6! This is where it gets interesting. Since I am using a single die as my population, if I take a sample of size 6 without replacement, then my average for every single iteration will equal the population average AND the variance of the sampling distribution of sample means will be 0, since every sample point estimate for the population mean would be 3.5! This is not consistent with the rule that the variance of the sampling distribution of the sample mean is equal to the population variance divided by n! i.e., this rule suggests that the variance of the sampling distribution of sample means (which are all 3.5) should be 2.916667/6 = 0.486111 and not 0!
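As a quick check on the die figures quoted above, the following sketch computes the population mean and variance exactly using Python's fractions module. The variable names are only for illustration.

```python
from fractions import Fraction

faces = range(1, 7)  # the six equally likely faces of a fair die

# Population mean: (1 + 2 + ... + 6) / 6
mean = Fraction(sum(faces), 6)

# Population variance: average squared deviation from the mean
variance = sum((Fraction(f) - mean) ** 2 for f in faces) / 6

# Variance of the sample mean for samples of size 2 (with replacement)
var_of_mean_n2 = variance / 2

print(mean, variance, var_of_mean_n2)
print(float(mean), float(variance), float(var_of_mean_n2))
```

The exact values are 7/2, 35/12 and 35/24, i.e. 3.5, about 2.916667 and about 1.458333.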
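The size-6 case can be sketched in Python to compare the two sampling schemes side by side. This is only an illustration under the assumption that "the population" is the six faces of one die; the seed and variable names are arbitrary.

```python
import random
import statistics

rng = random.Random(1)
faces = [1, 2, 3, 4, 5, 6]

# Without replacement: each sample of size 6 is just a reshuffle of all six
# faces, so every sample mean is 21 / 6 = 3.5.
means_without = [statistics.fmean(rng.sample(faces, 6)) for _ in range(100)]

# With replacement: six independent rolls per sample.
means_with = [statistics.fmean(rng.choices(faces, k=6)) for _ in range(100)]

print("without replacement:", set(means_without),
      "variance =", statistics.pvariance(means_without))
print("with replacement: variance =", statistics.variance(means_with),
      "vs population variance / 6 =", (35 / 12) / 6)
```

Only the with-replacement run gives a variance in the neighborhood of 2.916667/6; the without-replacement run gives exactly 0.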
I have also been bothered by the fact that the variance of the sampling distribution of the mean approaches 0 as n approaches infinity! How can n approach infinity if the population has only 6 elements!!!!
Hence, I have concluded that the sampling must be "with replacement"!
So, I then generated 100 samples of size 500 each (obviously with replacement since we are dealing with a single die with 6 elements)! Now, the average of the 100 samples of size 500 is 3.5123 (closer to the population mean of 3.5) and the variance of this distribution of sample means is 0.0052 which is very close to the population variance divided by 500, or 2.916667/500 = 0.0058 (and incidentally, pretty close to 0 I might add)!
So, as n approaches infinity, the variance of the sampling distribution does approach 0! Or, in other words, the sampling distribution approaches a normal distribution with mean equal to the population mean and variance approaching 0!
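In symbols, the relationships used throughout this post can be written as follows, assuming independent draws with replacement from a population with mean \mu and variance \sigma^2. The notation \bar{X}_n for the mean of a sample of size n is introduced here only for illustration.

```latex
\begin{aligned}
E[\bar{X}_n] &= \mu = 3.5, \\
\operatorname{Var}(\bar{X}_n) &= \frac{\sigma^2}{n} = \frac{2.916667}{n}, \\
\lim_{n \to \infty} \operatorname{Var}(\bar{X}_n) &= 0, \\
\frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} &\xrightarrow{\;d\;} N(0, 1) \quad \text{(CLT)}.
\end{aligned}
```

For n = 2, 6 and 500 the second line gives 1.45833, 0.486111 and 0.0058, matching the figures above.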
One last question for @David Harper CFA FRM!
How can we have a normal distribution with variance 0?
Thanks for letting me put this down....
Brian