Overview of Expected Shortfall Backtesting
Expected Shortfall (ES) is the expected loss on days when there is a Value-at-Risk (VaR) failure. If the VaR is 10 million and the ES is 12 million, then the expected loss tomorrow, if it happens to be a very bad day, is 20% higher than the VaR. ES is sometimes called Conditional Value-at-Risk (CVaR), Tail Value-at-Risk (TVaR), Tail Conditional Expectation (TCE), or Conditional Tail Expectation (CTE).
There are many approaches to estimating VaR and ES, and they may lead to different VaR and ES estimates. How can one determine if models are accurately estimating the risk on a daily basis? How can one evaluate which model performs better? The varbacktest tools help validate the performance of VaR models with regard to estimated VaR values. The esbacktest, esbacktestbysim, and esbacktestbyde tools extend these capabilities to evaluate VaR models with regard to estimated ES values.
For VaR backtesting, the possibilities every day are two: either there is a VaR failure or not. If the VaR confidence level is 95%, VaR failures should happen approximately 5% of the time. To backtest VaR, you only need to know whether the VaR was exceeded (VaR failure) or not on each day of the test window and the VaR confidence level. Risk Management Toolbox™ VaR backtesting tools support “frequency” (assess the proportion of failures) and “independence” (assess independence across time) tests, and these tests work with the binary sequence of "failure" or "no-failure" results over the test window.
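As a minimal sketch of the binary sequence that VaR backtests work with, the following Python snippet builds the failure indicator and the observed failure rate. It is an illustration only, not the toolbox implementation; the function names and the sample data are hypothetical, and the failure convention (outcome below the negative of the VaR) follows the definition used later on this page.

```python
def var_failures(outcomes, var_values):
    """Return the 0/1 VaR failure sequence: 1 when the outcome is below -VaR."""
    return [1 if x < -v else 0 for x, v in zip(outcomes, var_values)]


def failure_rate(failures):
    """Observed proportion of VaR failures over the test window."""
    return sum(failures) / len(failures)


# Hypothetical daily returns and a constant 95% VaR of 1.8% (illustrative only).
outcomes = [0.004, -0.021, 0.010, -0.003, -0.030, 0.007, 0.001, -0.012]
var_values = [0.018] * len(outcomes)

failures = var_failures(outcomes, var_values)
print(failures)                # binary failure / no-failure sequence
print(failure_rate(failures))  # compare against 1 - VaR level (5% here)
```

A frequency test compares the observed rate with 1 - VaR level, and an independence test looks at how the ones are spread over time; both need only this sequence and the VaR level.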
For expected shortfall (ES), the possibilities every day are infinite: the VaR may be exceeded by 1%, or by 10%, or by 150%, and so on. For example, suppose there are three VaR failures in the test window.
On failure days, the VaR is exceeded on average by 39%, but the estimated ES exceeds VaR by an average of 27%. How can you tell if 39% is significantly larger than 27%? Knowing the VaR confidence level is not enough; you must also know how likely the different exceedances over the VaR are according to the VaR model. In other words, you need some distribution information about what happens beyond the VaR according to your model assumptions. For thin-tail VaR models, 39% vs. 27% may be a large difference. However, for a heavy-tail VaR model where a severity of twice the VaR has a non-trivial probability of happening, 39% vs. 27% over the three failure dates may not be a red flag.
A key difference between VaR backtesting and ES backtesting is that most ES backtesting methods require information about the distribution of the returns on each day, or at least the distribution of the tails beyond the VaR. One exception is the “unconditional” test (see unconditionalNormal and unconditionalT), where you can get approximate test results without providing the distribution information. This is important in practice, because the “unconditional” test is much simpler to use and can be used in principle for any VaR or ES model. The trade-off is that the approximate results may be inaccurate, especially in borderline accept-or-reject cases, or for certain types of distributions.
The toolbox supports the following table-based tests for the unconditional Acerbi-Szekely test for expected shortfall backtesting using the esbacktest object:
- unconditionalNormal
- unconditionalT
ES backtests are necessarily approximated in that they are sensitive to errors in the predicted VaR. However, the minimally biased test has only a small sensitivity to VaR errors and the sensitivity is prudential, in the sense that VaR errors lead to a more punitive ES test. See Acerbi-Szekely (2017 and 2019) for details. When distribution information is available, the minimally biased test (minBiasRelative or minBiasAbsolute) is recommended.
The toolbox supports the following Acerbi-Szekely simulation-based tests for expected shortfall backtesting using the esbacktestbysim object:
- conditional
- unconditional
- quantile
- minBiasRelative
- minBiasAbsolute
For the Acerbi-Szekely simulation-based tests, you must provide the model distribution information as part of the inputs to esbacktestbysim.
The toolbox also supports the following Du and Escanciano tests for expected shortfall backtesting using the esbacktestbyde object:
- unconditionalDE
- conditionalDE
For the Du and Escanciano simulation-based tests, you must provide the model distribution information as part of the inputs to esbacktestbyde.
Conditional Test by Acerbi and Szekely
The conditional test statistic by Acerbi and Szekely is based on the conditional relationship

$$\mathrm{ES}_t = -E_t\left[X_t \mid X_t < -\mathrm{VaR}_t\right]$$

where
$X_t$ is the portfolio outcome, that is, the portfolio return or portfolio profit and loss for period $t$.
$\mathrm{VaR}_t$ is the estimated VaR for period $t$.
$\mathrm{ES}_t$ is the estimated expected shortfall for period $t$.
The number of failures is defined as

$$\mathrm{NumFailures} = \sum_{t=1}^{N} I_t$$

where
$N$ is the number of periods in the test window ($t = 1,\ldots,N$).
$I_t$ is the VaR failure indicator on period $t$, with a value of 1 if $X_t < -\mathrm{VaR}_t$, and 0 otherwise.
The conditional test statistic is defined as

$$Z_{cond} = \frac{1}{\mathrm{NumFailures}} \sum_{t=1}^{N} \frac{X_t I_t}{\mathrm{ES}_t} + 1$$

The conditional test has two parts. A VaR backtest must be run for the number of failures (NumFailures), and a standalone conditional test is performed for the conditional test statistic $Z_{cond}$. The conditional test accepts the model only when both the VaR test and the standalone conditional test accept the model. For more information, see conditional.
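The conditional statistic can be sketched in a few lines of Python. This is only an illustration of the arithmetic (one plus the average of $X_t/\mathrm{ES}_t$ over failure days), not the esbacktestbysim implementation; the function name, the sample data, and the choice to raise an error when there are no failures are assumptions of this sketch.

```python
def z_conditional(outcomes, var_values, es_values):
    """Return (Z_cond, NumFailures) for the conditional test statistic."""
    indicators = [1 if x < -v else 0 for x, v in zip(outcomes, var_values)]
    num_failures = sum(indicators)
    if num_failures == 0:
        # The statistic divides by NumFailures; this sketch does not define it then.
        raise ValueError("no VaR failures in the test window")
    total = sum(x * i / es for x, i, es in zip(outcomes, indicators, es_values))
    return total / num_failures + 1, num_failures


# Hypothetical outcomes with constant VaR (2%) and ES (2.5%) estimates.
outcomes = [0.010, -0.030, 0.002, -0.025, -0.005]
var_values = [0.020] * len(outcomes)
es_values = [0.025] * len(outcomes)

z_cond, num_failures = z_conditional(outcomes, var_values, es_values)
print(z_cond, num_failures)
```

In this made-up sample the two failure-day losses average 0.0275, which is 10% above the ES estimate of 0.025, so the statistic is about -0.1. Negative values indicate that realized tail losses were larger than the ES estimates on average.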
Unconditional Test by Acerbi and Szekely
The unconditional test statistic by Acerbi and Szekely is based on the unconditional relationship

$$\mathrm{ES}_t = -E_t\left[\frac{X_t I_t}{p_{VaR}}\right]$$

where
$X_t$ is the portfolio outcome, that is, the portfolio return or portfolio profit and loss for period $t$.
$p_{VaR}$ is the probability of VaR failure, defined as 1-VaR level.
$\mathrm{ES}_t$ is the estimated expected shortfall for period $t$.
$I_t$ is the VaR failure indicator on period $t$, with a value of 1 if $X_t < -\mathrm{VaR}_t$, and 0 otherwise.
The unconditional test statistic is defined as

$$Z_{uncond} = \frac{1}{N p_{VaR}} \sum_{t=1}^{N} \frac{X_t I_t}{\mathrm{ES}_t} + 1$$

where $N$ is the number of periods in the test window.
The critical values for the unconditional test statistic are stable across a range of distributions, which is the basis for the table-based tests. The esbacktest class runs the unconditional test against precomputed critical values under two distributional assumptions, namely, normal distribution (thin tails, see unconditionalNormal), and t distribution with 3 degrees of freedom (heavy tails, see unconditionalT).
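A minimal Python sketch of the unconditional statistic follows. It differs from the conditional statistic only in the normalization: the sum over failure days is divided by the expected number of failures, N times the failure probability, instead of the observed number. The function name and data are hypothetical, and the sketch does not reproduce the precomputed critical-value tables used by esbacktest.

```python
def z_unconditional(outcomes, var_values, es_values, p_var):
    """Unconditional test statistic: 1 + sum(X_t * I_t / ES_t) / (N * p_VaR)."""
    n = len(outcomes)
    total = sum(
        x / es
        for x, v, es in zip(outcomes, var_values, es_values)
        if x < -v  # only VaR failure days contribute
    )
    return total / (n * p_var) + 1


# Hypothetical window of 40 days with two VaR failures at the 95% level.
outcomes = [0.010] * 38 + [-0.030, -0.025]
var_values = [0.020] * len(outcomes)
es_values = [0.025] * len(outcomes)

z_uncond = z_unconditional(outcomes, var_values, es_values, p_var=0.05)
print(z_uncond)
```

Here the observed number of failures (2) equals the expected number (40 × 0.05), so the value is about -0.1, the same as the conditional statistic would give for these failure days. With more failures than expected, the unconditional statistic becomes more negative even if each individual failure is no more severe.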
Quantile Test by Acerbi and Szekely
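A sample estimator of the expected shortfall for a sample $Y_1,\ldots,Y_N$ is:

$$\widehat{ES}(Y) = -\frac{1}{\lfloor N p_{VaR} \rfloor} \sum_{i=1}^{\lfloor N p_{VaR} \rfloor} Y_{[i]}$$

where
$N$ is the number of periods in the test window ($t = 1,\ldots,N$).
$p_{VaR}$ is the probability of VaR failure, defined as 1-VaR level.
$Y_{[1]},\ldots,Y_{[N]}$ are the sorted sample values (from smallest to largest), and $\lfloor N p_{VaR} \rfloor$ is the largest integer less than or equal to $N p_{VaR}$.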
To compute the quantile test statistic, a sample of size $N$ is created at each time $t$ as follows. First, convert the portfolio outcomes $X_t$ to ranks $U_t = P_t(X_t)$ using the cumulative distribution function $P_t$. If the distribution assumptions are correct, the rank values $U_1,\ldots,U_N$ are uniformly distributed in the interval (0,1). Then at each time $t$:
- Invert the ranks $U = (U_1,\ldots,U_N)$ to get $N$ quantiles $P_t^{-1}(U) = (P_t^{-1}(U_1),\ldots,P_t^{-1}(U_N))$.
- Compute the sample estimator $\widehat{ES}(P_t^{-1}(U))$.
- Compute the expected value of the sample estimator, $E[\widehat{ES}(P_t^{-1}(V))]$, where $V = (V_1,\ldots,V_N)$ is a sample of $N$ independent uniform random variables in the interval (0,1). This can be computed analytically.
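The quantile test statistic by Acerbi and Szekely is defined as

$$Z_{quantile} = -\frac{1}{N} \sum_{t=1}^{N} \frac{\widehat{ES}(P_t^{-1}(U))}{E[\widehat{ES}(P_t^{-1}(V))]} + 1$$

The denominator inside the sum can be computed analytically as

$$E[\widehat{ES}(P_t^{-1}(V))] = -\frac{N}{\lfloor N p_{VaR} \rfloor} \int_0^1 I_{1-p}\left(N - \lfloor N p_{VaR} \rfloor, \lfloor N p_{VaR} \rfloor\right) P_t^{-1}(p)\, dp$$

where $I_x(z,w)$ is the regularized incomplete beta function. For more information, see betainc and quantile.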
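The sample estimator at the start of this section (the negative of the average of the smallest values in the sample) can be sketched as follows. This covers only the estimator, not the analytic expectation in the denominator or the full quantile test; the function name and the sample are hypothetical.

```python
import math


def es_sample_estimator(sample, p_var):
    """Negative of the mean of the floor(N * p_var) smallest sample values."""
    k = math.floor(len(sample) * p_var)
    if k < 1:
        raise ValueError("sample too small for this failure probability")
    smallest = sorted(sample)[:k]  # sorted from smallest to largest
    return -sum(smallest) / k


# Hypothetical sample of 10 outcomes. p_var = 0.25 is used because it is
# exactly representable in floating point, so floor(10 * 0.25) is reliably 2.
sample = [0.03, -0.01, -0.04, 0.02, 0.00, -0.02, 0.01, 0.05, -0.03, 0.04]
es_hat = es_sample_estimator(sample, p_var=0.25)
print(es_hat)
```

The two smallest values are -0.04 and -0.03, so the estimate is about 0.035. When the failure probability is computed as 1 minus a VaR level, floating-point rounding can push the product just below an integer, so production code would need to guard the floor operation.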
Minimally Biased Test by Acerbi and Szekely
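The minimally biased test statistic by Acerbi and Szekely is based on the following representation of the VaR and ES (see Acerbi and Szekely 2017 and 2019 for details and also Rockafellar and Uryasev 2002, and Acerbi and Tasche 2002):

$$\mathrm{VaR}_\alpha(X) = \arg\min_{v} E\left[v + \frac{1}{\alpha}(X + v)_-\right]$$

$$\mathrm{ES}_\alpha(X) = \min_{v} E\left[v + \frac{1}{\alpha}(X + v)_-\right]$$

where
$X$ is the portfolio outcome.
$(x)_-$ is the negative part function defined as $(x)_- = \max(0,-x)$.
$\alpha$ is 1-VaR level.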
The test statistic has an absolute version and a relative version. The absolute version of the minimally biased test statistic is given by

$$Z_{minbias}^{abs} = \frac{1}{N} \sum_{t=1}^{N} \left(\mathrm{ES}_t - \mathrm{VaR}_t - \frac{1}{p_{VaR}}(X_t + \mathrm{VaR}_t)_-\right)$$

where
$X_t$ is the portfolio outcome, that is, the portfolio return or portfolio profit and loss for period $t$.
$\mathrm{VaR}_t$ is the estimated VaR for period $t$.
$\mathrm{ES}_t$ is the estimated expected shortfall for period $t$.
$p_{VaR}$ is the probability of VaR failure, defined as 1-VaR level.
$N$ is the number of periods in the test window ($t = 1,\ldots,N$).
$(x)_-$ is the negative part function defined as $(x)_- = \max(0,-x)$.
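The relative version of the minimally biased test statistic is given by

$$Z_{minbias}^{rel} = \frac{1}{N} \sum_{t=1}^{N} \frac{1}{\mathrm{ES}_t} \left(\mathrm{ES}_t - \mathrm{VaR}_t - \frac{1}{p_{VaR}}(X_t + \mathrm{VaR}_t)_-\right)$$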
ES backtests are necessarily approximated in that they are sensitive to errors in the predicted VaR. However, the minimally biased test has only a small sensitivity to VaR errors and the sensitivity is prudential, in the sense that VaR errors lead to a more punitive ES test. See Acerbi-Szekely (2017 and 2019) for details. When distribution information is available, the minimally biased test is recommended. For more information, see minBiasRelative and minBiasAbsolute.
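Both versions of the minimally biased statistic can be sketched directly from the definitions in this section: each period contributes the ES estimate minus the VaR estimate minus the scaled shortfall beyond the VaR, and the relative version divides each term by the ES estimate. The Python below is an illustration with hypothetical names and data, not the minBiasAbsolute or minBiasRelative implementation, and it computes only the statistics, not the simulated critical values.

```python
def negative_part(x):
    """(x)_- = max(0, -x)."""
    return max(0.0, -x)


def minbias_terms(outcomes, var_values, es_values, p_var):
    """Per-period terms ES_t - VaR_t - (X_t + VaR_t)_- / p_VaR."""
    return [
        es - v - negative_part(x + v) / p_var
        for x, v, es in zip(outcomes, var_values, es_values)
    ]


def z_minbias_absolute(outcomes, var_values, es_values, p_var):
    terms = minbias_terms(outcomes, var_values, es_values, p_var)
    return sum(terms) / len(terms)


def z_minbias_relative(outcomes, var_values, es_values, p_var):
    terms = minbias_terms(outcomes, var_values, es_values, p_var)
    return sum(term / es for term, es in zip(terms, es_values)) / len(terms)


# Hypothetical window of 40 days with two VaR failures at the 95% level.
outcomes = [0.010] * 38 + [-0.030, -0.025]
var_values = [0.020] * len(outcomes)
es_values = [0.025] * len(outcomes)

z_abs = z_minbias_absolute(outcomes, var_values, es_values, p_var=0.05)
z_rel = z_minbias_relative(outcomes, var_values, es_values, p_var=0.05)
print(z_abs, z_rel)
```

For this made-up data the absolute statistic is about -0.0025 (in the units of the outcomes) and the relative statistic is about -0.1 (a fraction of the ES estimate). The relative version is easier to compare across portfolios of different size.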
ES Backtest Using Du-Escanciano Method
For each day, the Du-Escanciano model assumes a distribution for the returns. For example, if you have a normal distribution with a conditional variance of 1.5%, there is a corresponding cumulative distribution function Pt. By mapping the returns Xt with the distribution Pt, you get the “mapped returns” series Ut, also known as the "ranks" series, which by construction has values between 0 and 1 (see column 2 in the following table). Let α be the complement of the VaR level — for example, if the VaR level is 95%, α is 5%. If the mapped return Ut is smaller than α, then there is a VaR “violation” or VaR “failure.” This is equivalent to observing a return Xt smaller than the negative of the VaR value for that day, since, by construction, the negative of the VaR value gets mapped to α. Therefore, you can compare Ut against α without even knowing the VaR value. The series of VaR failures is denoted by ht and it is a series of 0's and 1's stored in column 3 in the following table. Finally, column 4 in the following table contains the “cumulative violations” series, denoted by Ht. This is the severity of the mapped VaR violations on days on which the VaR is violated. For example, if the mapped return Ut is 1% and α is 5%, Ht is 4%. Ht is defined as zero if there are no VaR violations.
Xt | Ut = Pt(Xt) | ht = Ut < α | Ht = (α - Ut) * ht |
---|---|---|---|
0.00208 | 0.5799 | 0 | 0 |
-0.01073 | 0.1554 | 0 | 0 |
-0.00825 | 0.2159 | 0 | 0 |
-0.02967 | 0.0073 | 1 | 0.0427 |
0.01242 | 0.8745 | 0 | 0 |
... | ... | ... | ... |
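Columns 3 and 4 of the table can be recomputed from column 2 with a few lines of Python. The first part uses the mapped returns exactly as listed in the table. The second part uses a hypothetical zero-mean normal model with a made-up standard deviation, only to illustrate that the negative of the VaR maps to α; it is not the model behind the table values.

```python
from statistics import NormalDist

alpha = 0.05  # complement of a 95% VaR level

# Mapped returns U_t copied from column 2 of the table above.
u = [0.5799, 0.1554, 0.2159, 0.0073, 0.8745]

h = [1 if ut < alpha else 0 for ut in u]          # column 3: VaR failures
H = [(alpha - ut) * ht for ut, ht in zip(u, h)]   # column 4: violation severity

for ut, ht, Ht in zip(u, h, H):
    print(f"{ut:.4f}  {ht}  {Ht:.4f}")

# Hypothetical model: zero-mean normal returns with sigma = 1.2% (assumption).
model = NormalDist(mu=0.0, sigma=0.012)
var_value = -model.inv_cdf(alpha)    # VaR reported as a positive number
mapped_var = model.cdf(-var_value)   # the negative of the VaR maps back to alpha
print(mapped_var)
```

Because the comparison is made on the mapped scale, the failure indicator needs only the ranks and α, not the VaR values themselves.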
Given the violations series ht and the cumulative violations series Ht, the Du-Escanciano (DE) tests are summarized as:
Du-Escanciano Test | VaR Test | ES Test |
---|---|---|
Unconditional | Mean of ht | Mean of Ht |
Conditional | Autocorrelation of ht | Autocorrelation of Ht |
The DE VaR tests assess the mean value and the autocorrelation of the ht series, and the resulting tests overlap with known VaR tests. For example, the mean of ht is expected to match α. In other words, the proportion of time the VaR is violated is expected to match the complement of the VaR confidence level. This test is supported in the varbacktest class with the proportion of failures (pof) test (finite sample) and the binomial (bin) test (large-sample approximation). In turn, the conditional VaR test measures if there is a time pattern in the sequence of VaR failures (back-to-back failures, and so on). The conditional coverage independence (cci) test in the varbacktest class tests for one-lag independence. The time between failures independence (tbfi) test in the varbacktest class also assesses time independence for VaR models.
The esbacktestbyde class supports the DE ES tests. The DE ES tests assess the mean value and the autocorrelation of the Ht series. For the unconditional test (unconditionalDE), the expected value is α/2; for example, the average value in the bottom 5% of a uniform (0,1) distribution is 2.5%. The conditional test (conditionalDE) assesses not only if a failure occurs but also if the failure severity is correlated to previous failure occurrences and their severities.
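The test statistic for the unconditional DE ES test is

$$U_{ES} = \frac{1}{N} \sum_{t=1}^{N} H_t$$

where $N$ is the number of observations. In this statistic, $H_t$ is the cumulative violations series scaled by $1/\alpha$, that is, $H_t = (\alpha - U_t)\,h_t/\alpha$, as in Du and Escanciano [6]. With this scaling, the expected value of $H_t$ under the model is $\alpha/2$; without it, the mean of $(\alpha - U_t)\,h_t$ would be $\alpha^2/2$.
If the number of observations is large, the test statistic is distributed as

$$U_{ES} \xrightarrow{\;dist\;} \mathcal{N}\left(\frac{\alpha}{2}, \frac{\alpha\,(1/3 - \alpha/4)}{N}\right)$$

where $\mathcal{N}(\mu,\sigma^2)$ is the normal distribution with mean $\mu$ and variance $\sigma^2$.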
The unconditional DE ES test is a two-sided test that checks if the test statistic is close to the expected value of α/2. From the limiting distribution, a large-sample confidence interval is derived. Finite-sample confidence intervals are estimated through simulation.
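A large-sample version of the unconditional DE ES test can be sketched with the Python standard library. The sketch assumes the severities are scaled by 1/α as in Du and Escanciano, so that their mean under the model is α/2 with variance α(1/3 - α/4); the function name and the evenly spaced test ranks are hypothetical, and the finite-sample simulation that esbacktestbyde also offers is not attempted here.

```python
import math
from statistics import NormalDist


def unconditional_de(u, alpha):
    """Return (U_ES, z-score, two-sided p-value) from mapped returns u."""
    n = len(u)
    # Scaled cumulative violations: (alpha - U_t) / alpha on failure days, else 0.
    scaled_h = [(alpha - ut) / alpha if ut < alpha else 0.0 for ut in u]
    u_es = sum(scaled_h) / n
    std = math.sqrt(alpha * (1 / 3 - alpha / 4) / n)  # large-sample std of U_ES
    z = (u_es - alpha / 2) / std
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_es, z, p_value


# Hypothetical ranks: an evenly spaced grid on (0, 1), i.e. a perfectly
# calibrated model, so U_ES should sit at alpha / 2.
u = [(i + 0.5) / 200 for i in range(200)]
u_es, z, p_value = unconditional_de(u, alpha=0.05)
print(u_es, z, p_value)
```

A small p-value in either direction rejects the model: a statistic well above α/2 points to ES being underestimated, and one well below points to overly conservative ES estimates.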
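The test statistic for the conditional DE ES test is derived in several steps. As in the unconditional statistic, $H_t$ here is the cumulative violations series scaled by $1/\alpha$, so that its expected value under the model is $\alpha/2$. First, define the autocovariance for lag $j$:

$$\gamma_j = \frac{1}{N - j} \sum_{t=j+1}^{N} \left(H_t - \frac{\alpha}{2}\right)\left(H_{t-j} - \frac{\alpha}{2}\right)$$

The autocorrelation for lag $j$ is then

$$\rho_j = \frac{\gamma_j}{\gamma_0}$$

The test statistic for $m$ lags is then

$$C_{ES}(m) = N \sum_{j=1}^{m} \rho_j^2$$

If the number of observations is large, the test statistic is distributed as a chi-square distribution with $m$ degrees of freedom:

$$C_{ES}(m) \xrightarrow{\;dist\;} \chi^2_m$$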
The conditional DE ES test is a one-sided test to determine if the conditional DE ES test statistic is much larger than zero. If so, there is evidence of autocorrelation. Large-sample critical values are computed from the limiting distribution. Finite-sample critical values are estimated through simulation.
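The steps of the conditional DE ES statistic (autocovariance, autocorrelation, then N times the sum of squared autocorrelations) can be sketched as follows. The sketch assumes the cumulative violations are already scaled by 1/α so that their model mean is α/2, and it compares the result only with the large-sample 95% chi-square critical value for one degree of freedom (3.841). The function names and the two synthetic series are hypothetical.

```python
def conditional_de(scaled_h, alpha, num_lags):
    """Return (C_ES(m), [rho_1, ..., rho_m]) for scaled cumulative violations."""
    n = len(scaled_h)
    centered = [ht - alpha / 2 for ht in scaled_h]

    def autocovariance(j):
        # Sum over t = j+1..N (1-based) of centered[t] * centered[t-j], over N - j.
        return sum(centered[t] * centered[t - j] for t in range(j, n)) / (n - j)

    gamma0 = autocovariance(0)
    rho = [autocovariance(j) / gamma0 for j in range(1, num_lags + 1)]
    return n * sum(r * r for r in rho), rho


def make_series(failure_days, n=100, severity=0.5):
    """Synthetic scaled H_t: a fixed severity on failure days, zero otherwise."""
    return [severity if t in failure_days else 0.0 for t in range(n)]


CHI2_95_1DOF = 3.841  # large-sample 95% critical value, one lag

clustered, _ = conditional_de(make_series({0, 1, 2, 3, 4}), 0.05, 1)
spread, _ = conditional_de(make_series({0, 20, 40, 60, 80}), 0.05, 1)

print(clustered > CHI2_95_1DOF)  # back-to-back failures: autocorrelation flagged
print(spread > CHI2_95_1DOF)     # evenly spread failures: not flagged
```

Both series have the same number and severity of failures, so an unconditional test could not tell them apart; only the time pattern differs, which is what the conditional test is designed to detect.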
Comparison of ES Backtesting Methods
The backtesting tools supported by Risk Management Toolbox have the following requirements and features.
Backtesting Tool | PortfolioData Required | VaRData Required | ESData Required | VaRLevel Required (a) | PortfolioID and VaRID Supported | Distribution Information Required | Supports Multiple Models (b) | Supports Multiple VaRLevels |
---|---|---|---|---|---|---|---|---|
varbacktest | Yes | Yes | No | Yes | Yes | No | Yes | Yes |
esbacktest | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
esbacktestbysim | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes |
esbacktestbyde | Yes | No | No | Yes | Yes | Yes | No | Yes |
a b For example, you can backtest a
Risk Management Toolbox supports the following backtesting tools and their associated tests.
Test Type | Test Name | Tests for | Risk Measure | Critical Value Computation | Use Object | Use Function |
---|---|---|---|---|---|---|
Basel | Traffic light | Frequency | VaR | Exact finite-sample (binomial) | varbacktest | tl |
Various | Binomial | Frequency | VaR | Large-sample normal approximation | varbacktest | bin |
Kupiec | Proportion of failures | Frequency | VaR | Exact finite-sample (log likelihood) | varbacktest | pof |
Kupiec | Time until first failure | Independence | VaR | Exact finite-sample (log likelihood) | varbacktest | tuff |
Christoffersen | Conditional coverage, mixed | Frequency and independence | VaR | Exact finite-sample (log likelihood) | varbacktest | cc |
Christoffersen | Conditional coverage, independence | Independence | VaR | Exact finite-sample (log likelihood) | varbacktest | cci |
Haas | Mixed Kupiec test | Frequency and independence | VaR | Exact finite-sample (log likelihood) | varbacktest | tbf |
Haas | Independence (time between failures) | Independence | VaR | Exact finite-sample (log likelihood) | varbacktest | tbfi |
Acerbi-Szekely | "Test 2" or unconditional | Severity | ES | Tables of presimulated critical values, under normal and t distribution | esbacktest | unconditionalNormal and unconditionalT |
Acerbi-Szekely | "Test 1" or conditional | Severity | ES | Finite-sample simulation | esbacktestbysim | conditional |
Acerbi-Szekely | "Test 2" or unconditional | Severity | ES | Finite-sample simulation | esbacktestbysim | unconditional |
Acerbi-Szekely | "Test 3" or ranks (quantile) | Severity | ES | Finite-sample simulation | esbacktestbysim | quantile |
Acerbi-Szekely | Minimally Biased, relative version | Severity | ES | Finite-sample simulation | esbacktestbysim | minBiasRelative |
Acerbi-Szekely | Minimally Biased, absolute version | Severity | ES | Finite-sample simulation | esbacktestbysim | minBiasAbsolute |
Du-Escanciano | Unconditional | Severity | ES | Large-sample approximation and finite-sample simulation | esbacktestbyde | unconditionalDE |
Du-Escanciano | Conditional | Independence | ES | Large-sample approximation and finite-sample simulation | esbacktestbyde | conditionalDE |
References
[1] Basel Committee on Banking Supervision. Supervisory Framework for the Use of “Backtesting” in Conjunction with the Internal Models Approach to Market Risk Capital Requirements. January 1996. https://www.bis.org/publ/bcbs22.htm.
[2] Acerbi, C., and B. Szekely. Backtesting Expected Shortfall. MSCI Inc. December 2014.
[3] Acerbi, C., and B. Szekely. "General Properties of Backtestable Statistics." SSRN Electronic Journal. January, 2017.
[4] Acerbi, C., and B. Szekely. "The Minimally Biased Backtest for ES." Risk. September, 2019.
[5] Acerbi, C. and D. Tasche. “On the Coherence of Expected Shortfall.” Journal of Banking and Finance. Vol. 26, 2002, pp. 1487-1503.
[6] Du, Z., and J. C. Escanciano. "Backtesting Expected Shortfall: Accounting for Tail Risk." Management Science. Vol. 63, Issue 4, April 2017.
[7] Rockafellar, R. T. and S. Uryasev. "Conditional Value-at-Risk for General Loss Distributions." Journal of Banking and Finance. Vol. 26, 2002, pp. 1443-1471.
See Also
esbacktestbyde | esbacktest | esbacktestbysim | varbacktest
Related Topics
- VaR Backtesting Workflow
- Value-at-Risk Estimation and Backtesting
- Expected Shortfall (ES) Backtesting Workflow with No Model Distribution Information
- Expected Shortfall (ES) Backtesting Workflow Using Simulation
- Expected Shortfall Estimation and Backtesting
- Workflow for Expected Shortfall (ES) Backtesting by Du and Escanciano
- Rolling Windows and Multiple Models for Expected Shortfall (ES) Backtesting by Du and Escanciano