Extreme Value Theory

humheehum

New Member
Hi David - I have a couple of questions on EVT.

- In the Quant notes, page 110, under GEV, there is a sentence underneath the formula: "In this expression, a lower tail index corresponds to a fatter tail". I understand that the tail index, epsilon, is a measure of the "fat-tailedness" of the distribution. In Wilmott and in the BT notes it is stated that the Frechet distribution is the most relevant as it fits a fat-tailed distribution. In Wilmott, for a Frechet distribution epsilon is given to be greater than zero. So does this imply that epsilon must be greater than zero, but for the distribution to exhibit fatter tails epsilon should be closer to zero, but positive?

- For peaks over threshold, could you please explain the roles of epsilon and beta? What is beta here, and what role does it play in fitting a distribution to the tail?

- In the BT spreadsheet for EVT, why does the GPD cdf begin after zero? And why does the cdf formula not refer to column B (Xi's)?

Thanks for your help again.

Regards,
Ashim
 

David Harper CFA FRM

Subscriber
Hi Ashim,

1. That's observant. That sentence, while technically accurate (it carried over from last year's EVT reading), is misleading here. Typically (often but not always), the tail index = 1/shape parameter, so you can see how, under this definition, "the lower the tail index, the higher the shape parameter." But I don't explain that, so it's *awful* here. Your first instinct is, I am sure, correct (higher shape implies fatter tail), specifically:

Among the GEV, we tend to use the Frechet (shape > 0) and higher shape implies "fat/heavy tails" (heavy in the parent distribution, not in the child EVT).
Higher shape parameter implies heavier parent distribution
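
To make the tail index vs. shape relationship concrete, here is a minimal Python sketch. It assumes the standard GEV cdf in the shape (xi) parameterization and the convention tail index = 1/xi. The function name `gev_cdf`, the defaults mu = 0 and sigma = 1, and the evaluation point 10 are illustrative choices, not taken from the notes or from Wilmott.

```python
import math

def gev_cdf(x, xi, mu=0.0, sigma=1.0):
    """GEV cdf in the shape (xi) parameterization; xi > 0 is the Frechet case."""
    z = (x - mu) / sigma  # standardized value
    if xi == 0.0:
        return math.exp(-math.exp(-z))  # Gumbel limit as xi -> 0
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0) or above the upper bound (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

# Illustrative shape values: as shape rises, the tail index (1/shape) falls
# and the probability of exceeding a far-out point rises.
for xi in (0.1, 0.3, 0.5):
    tail_index = 1.0 / xi
    tail_prob = 1.0 - gev_cdf(10.0, xi)
    print(f"shape={xi:.1f}  tail_index={tail_index:.1f}  P(Z > 10)={tail_prob:.6f}")
```

Under these assumptions, the exceedance probability grows as the shape grows, which is the sense in which "lower tail index" and "higher shape" both mean a fatter tail.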


2. (That's a lowercase Greek xi, right? Just checking, I can never recall my Greeks!) These parameters (Wilmott 22.8.1) for the GPD are scale (beta) and shape (xi). Shape is the same as in the GEV above: it gives weight to the tail. Scale is much like standard deviation, as it governs the dispersion of the observations over the threshold; a higher scale means the range of the (child) GPD PDF will be wider (just as a larger standard deviation gives a wider normal PDF).
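
As a rough illustration of the two roles, here is a minimal Python sketch. It assumes the usual two-parameter GPD cdf for an excess y over the threshold, G(y) = 1 - (1 + xi*y/beta)^(-1/xi). The function name `gpd_cdf` and the parameter values are hypothetical, chosen only to show the direction of each effect.

```python
import math

def gpd_cdf(y, xi, beta):
    """GPD cdf for an excess y over the threshold; beta > 0 is scale, xi is shape."""
    if y <= 0.0:
        return 0.0  # excesses start at zero
    if xi == 0.0:
        return 1.0 - math.exp(-y / beta)  # exponential limit as xi -> 0
    t = 1.0 + xi * y / beta
    if t <= 0.0:
        return 1.0  # xi < 0: y lies beyond the upper endpoint -beta/xi
    return 1.0 - t ** (-1.0 / xi)

# Illustrative parameter sets: a baseline, a larger scale, and a larger shape.
y = 5.0
for xi, beta in [(0.2, 1.0), (0.2, 2.0), (0.5, 1.0)]:
    print(f"xi={xi}  beta={beta}  P(excess > {y}) = {1.0 - gpd_cdf(y, xi, beta):.4f}")
```

Relative to the baseline, raising beta spreads the excesses out (more dispersion), and raising xi puts more weight far out in the tail; both increase the probability of a large excess, but for different reasons.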

3. It does not need to start at zero; there can be a third parameter to lower this zero limit. But here it starts at zero by definition, because peaks over threshold is characterized by the distribution:

D(x) = P(X-threshold < x | X > threshold)

i.e., conditional on X (the loss) exceeding the threshold (this is the essence of POT), what is the probability that (X-threshold) will be less than the value x? Under this condition, X-threshold starts at zero.
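
A minimal Python sketch of that conditioning step, using a made-up loss sample and threshold (the numbers and the function names `excesses_over_threshold` and `empirical_excess_cdf` are hypothetical, for illustration only):

```python
def excesses_over_threshold(losses, threshold):
    """Keep only losses above the threshold and measure them from the threshold."""
    return [x - threshold for x in losses if x > threshold]

def empirical_excess_cdf(losses, threshold, x):
    """Empirical estimate of D(x) = P(X - threshold < x | X > threshold)."""
    excesses = excesses_over_threshold(losses, threshold)
    if not excesses:
        raise ValueError("no losses exceed the threshold")
    return sum(1 for e in excesses if e < x) / len(excesses)

# Hypothetical loss sample and threshold
losses = [0.5, 1.2, 2.0, 3.1, 4.7, 6.0, 9.5]
threshold = 3.0

excesses = excesses_over_threshold(losses, threshold)
print(excesses)                                      # every excess is positive
print(empirical_excess_cdf(losses, threshold, 2.0))  # share of excesses below 2.0
```

Because only losses above the threshold survive the filter, every excess is greater than zero, which is why the fitted GPD lives on values starting at zero.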

Re: the spreadsheet, I'm not sure I follow, as Col B is the x-axis... I will take a closer look ASAP.

David
 

humheehum

New Member
Hi David - thanks for the reply.

On the shape parameter - I had thought that since we are fitting a distribution on the tails - a higher shape parameter would have implied fatter tails in the child not the parent distribution.

On the spreadsheet - I think I have not understood what column A and Column B are.

Another question: why is it that the GEV distribution refers to the z-variable? (I know the distribution function formula uses normalized values, but what is the intuition behind using standardized values for GEV and actual(?) values for GPD? Is it something to do with GEV assuming i.i.d. variables?)

Thanks
Ashim
 

David Harper CFA FRM

Subscriber
Hi Ashim,

Strong questions (FYI, my best reference on this is Kevin Dowd's book, Market Risk, which is assigned for model risk but has a great section on EVT)

1. This is counterintuitive. Think about the parent P(X), that is, the cdf until, say, P < 95%. Now start the child distribution C(X), so C(X) applies to 95% < P < 100%. Imagine the child is ultra-dense, almost all mean and no tail, everything clustered near its (child) mean. Zoom back to the parent, and you have an abruptly fat tail for the parent. As you start to make the child tail fatter, it makes the parent skinnier... (I desperately need a metaphor)

2. A/B are just my X-axis. When I have time I'll see if I can improve the EVT XLS; it's just not mission critical right now...

3. I think GPD also assumes i.i.d. (*tentative* assertion). The difference in standardization is due to the conceptual difference between sampling (i) the maximum loss within a time interval and (ii) losses above a threshold. In the former case (block maxima/GEV), losses are standardized according to their distance from a mean value (location). In the latter case, losses are standardized according to their distance from a threshold. In a sense, both GEV and GPD have location and scale, but "location" in GEV is explicitly parametrized as the location/mean (in the sense you mean), while in GPD location is implicit in the selection of a threshold. So the trade-off for GPD is that you don't need to parametrize the "location" but you do need to select the threshold (both have scale, which is dispersion).
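
Written side by side, the two standardizations look like this. This is a sketch in the common textbook notation, which is an assumption on my part (the symbols H, G, \mu, \sigma and u are not from the BT notes): \mu and \sigma are the GEV location and scale, u is the chosen threshold, and \beta and \xi are the scale and shape discussed above.

```latex
% GEV (block maxima), for \xi \neq 0: standardize by location \mu and scale \sigma
H(x) = \exp\left\{ -\left[ 1 + \xi \, \frac{x - \mu}{\sigma} \right]^{-1/\xi} \right\}

% GPD (peaks over threshold), for \xi \neq 0: standardize by threshold u and scale \beta
G(x) = 1 - \left[ 1 + \xi \, \frac{x - u}{\beta} \right]^{-1/\xi}, \qquad x > u
```

In both cases the loss enters as a distance divided by a scale; the only difference is whether that distance is measured from a fitted location or from a threshold you pick yourself.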

David
 