False illusion of safety: is a handful of stress scenarios enough?
Abstract

The use of stress tests to assess bank solvency has developed rapidly over the past few years. In particular, the stress tests run by the U.S. authorities and by companies under the Dodd-Frank Act suggest that most large bank holding companies are resilient to shocks similar to those of the last crisis. Each crisis, however, has its particularities, and banks must be prepared for new stress scenarios. This paper confronts the severely adverse scenario with scenarios based on the framework recently proposed by De Genaro (2015) for generating maximum entropy Monte Carlo simulations, and thereby seeks to shed some light on the false illusion of safety embedded in regulatory stress tests.

Keywords: Stress Testing; Stress Scenarios; Maximum Entropy Monte Carlo Simulation

JEL: G01; G28; G21; C15

1 Introduction

In the midst of the 2008 financial panic caused by the collapse of the subprime housing market, the U.S. government responded with unprecedented measures, including liquidity provision through various funding programs, debt and deposit guarantees, and large-scale asset purchases. In February 2009, the U.S. banking supervisors conducted the first-ever system-wide stress test on 19 of the largest U.S. bank holding companies (BHCs), known as the Supervisory Capital Assessment Program (SCAP). Building on this success, the Federal Reserve institutionalized the use of supervisory stress tests for establishing minimum capital standards in 2010 through its now annual Comprehensive Capital Analysis and Review (CCAR) for the same large banking organizations. Shortly thereafter, the Dodd-Frank Act mandated stress testing for all banking organizations with more than USD 50 billion in total assets, as well as "systemically important non-bank financial institutions" designated by a newly established Financial Stability Oversight Council.
The results of the quantitative assessment figure importantly in Federal Reserve policy decisions regarding whether to object to firms' plans for future capital distribution and retention, for example planned dividend payouts. More generally, stress tests provide valuable information about the safety and soundness of individual firms and, significantly, allow comparisons and aggregation across a range of firms. For example, the stress tests conducted by the Federal Reserve under the terms of the Dodd-Frank Act Stress Tests (DFAST) provide a very useful perspective on both individual firms and the banking system. While stress testing is an important risk management tool, it is inherently challenging for a number of reasons. First, it requires specifying one or more scenarios that are stressful but not implausibly disastrous. Scenarios must have certain elements of realism, but are certainly ahistorical, and may represent structural breaks in the processes that are being stressed. Second, stress testing requires forecasting earnings and capital conditional on the nature of the scenario; not only is this more difficult than unconditional forecasting, but the objects of interest are the tails of the distribution, which, almost by definition, are infrequently observed in historical data. The Federal Reserve Board's rules implementing DFAST require the Board to provide at least three different sets of scenarios, including baseline, adverse, and severely adverse scenarios, for both supervisory and company-run stress tests. The adverse and severely adverse scenarios are not forecasts, but rather hypothetical scenarios designed to assess the strength of banking organizations and their resilience to adverse economic environments. In addition, the baseline scenario follows a contour very similar to the average projections from surveys of economic forecasters and does not represent the forecast of the Federal Reserve.
As stated above, stress testing became prominent after the regulatory initiatives headed by the Fed. However, the one prior U.S. experience of using stress testing to determine capital requirements was a spectacular failure: the Office of Federal Housing Enterprise Oversight's (OFHEO) risk-based capital stress test for Fannie Mae and Freddie Mac. Frame, Gerardi, and Willen (2015) study the sources of failure of the OFHEO risk-based capital stress test for Fannie Mae and Freddie Mac. Their analysis uncovers two key problems with the implementation of the OFHEO stress test. The first pertains to model estimation frequency and specification: OFHEO left the model specification and associated parameters static for the entire time the rule was in force. Second, the house price stress scenario was insufficiently dire. As the recent failures of stress tests have shown, stress scenarios are vital for stress testing, and a badly chosen set of scenarios can compromise its quality and credibility in two ways. First, the really dangerous scenarios might not have been considered. This results in a false illusion of safety. Consequently, it may happen that banks go bankrupt although they have recently passed stress tests. According to Breuer and Csiszár (2013), a notable example is the Irish banks, which passed supposedly successful stress tests in 2010 yet had to be bailed out a few months later. Second, the scenarios considered might be too implausible. This results in a false alarm and a subsequently limited effect on risk-reducing actions by banks and regulators. In addition, regulators must bear in mind that what is a worst-case scenario for one portfolio might be a harmless scenario for another portfolio. This limitation becomes more evident in standard stress testing, when only a handful of scenarios is taken into consideration.
Stress testing provides relevant information when the quantity and quality of scenarios are rich enough to ensure that dangerous scenarios have been considered regardless of the portfolio's constituents. Christensen et al. (2015) observe that financial stress tests which consider a small number of hand-picked scenarios have, as a benefit, the fact that the model risk involved in the choice of a risk factor distribution is minimized. However, this advantage comes at a price. Without assuming any risk factor distribution, stress testers cannot judge whether the stress scenarios are really dangerous or sufficiently plausible. Therefore this paper seeks to shed some light on the false illusion of safety which arises when the stress test is performed with a handful of scenarios, by implementing a methodology capable of constructing a richer set of stress scenarios. To achieve this goal, this paper resorts to the framework proposed by De Genaro (2015) for generating multi-period trajectories via maximum entropy Monte Carlo simulations, and its values are used as inputs for determining the stress test of the four largest US banks. Finally, the stress test figures obtained with simulated scenarios are confronted with those arising when the 2015 severely adverse scenarios are used. As a policy implication, regulatory stress scenarios should be enlarged in order to avoid the false illusion of safety. Besides this introduction, this paper contains five more sections. Section 2 presents the background to the regulatory stress tests performed in the USA. Next, in Section 3 the main elements of the multi-period stress testing problem are presented. Section 4 briefly reviews the two-step methodology proposed by De Genaro (2015) for generating stress scenarios using the concept of maximum entropy Monte Carlo simulations.
Section 5 presents the outcome of a comparative study between the simulated scenarios and the 2015 severely adverse scenario when applied to determine the stress test for the four largest US banks. Section 6 presents the final remarks.

2 Background to Comprehensive Capital Analysis and Review ("CCAR") and Dodd-Frank Stress Tests ("DFAST")

The CCAR and DFAST stress tests are separate exercises that rely on similar processes, data, supervisory exercises, and requirements, although DFAST is far less intensive, as it is designed for firms that were not included in the original CCAR regime (the CCAR followed the 2009 Supervisory Capital Assessment Program (SCAP), a standardized stress test that was conducted for the 19 largest U.S. BHCs at that time, and was originally conducted for those same 19 large BHCs). Both exercises are run by the Federal Reserve, and in both cases the aim of the authorities is to ensure that financial institutions have robust capital planning processes and adequate capital. The CCAR, which is conducted annually, is closer in scope to micro stress tests: when the Federal Reserve deems an institution's capital adequacy or internal capital adequacy assessment processes unfavourable under the CCAR, it can request the institution to revise its plans to make capital distributions, such as dividend payments or stock repurchases. Closer in scope to macro stress tests, the DFA stress tests in turn are forward-looking exercises conducted by the Federal Reserve and by financial companies regulated by the Federal Reserve. The DFA stress tests aim to ensure that institutions have sufficient capital to absorb losses and support operations during adverse economic conditions. The Federal Reserve coordinates these processes to reduce duplicative requirements and to minimize burden. Accordingly, the Federal Reserve publishes annual stress test scenarios that are to be used in both the CCAR and DFAST exercises.
In general, the baseline scenario will reflect the most recently available consensus views of the macroeconomic outlook expressed by professional forecasters, government agencies, and other public-sector organizations as of the beginning of the annual stress-test cycle (October 1 of each year). The severely adverse scenario will consist of a set of economic and financial conditions that reflect the conditions of post-war U.S. recessions. The adverse scenario will consist of a set of economic and financial conditions that are more adverse than those associated with the baseline scenario but less severe than those associated with the severely adverse scenario. The Federal Reserve generally expects to use the same scenarios for all companies subject to the final rules, though it may require a subset of companies, depending on a company's financial condition, size, complexity, risk profile, scope of operations, or activities, or risks to the U.S. economy, to include additional scenario components or additional scenarios that are designed to capture different effects of adverse events on revenue, losses, and capital. An example of such an additional component is the instantaneous global market shock that applies to companies with significant trading activity. An institution may choose to project additional economic and financial variables, beyond the mandatory supervisory scenarios provided, to estimate losses or revenues for some or all of its portfolios. In the last step, banks are required to use the variables in their forecasting models to assess their effects on the firm's revenues, losses, balance sheet (including risk-weighted assets), liquidity, and capital position for each of the scenarios over a nine-quarter horizon.

3 The multi-period stress testing problem

Typically, a portfolio is defined as a collection of N financial instruments, represented by the vector $\theta(0) = [\theta_1(0), \ldots, \theta_N(0)]$, where each $\theta_i(0)$ corresponds to the total notional value of an instrument i at an initial time t = 0. The uncertainty about future states of the world is represented by a probability space $(\Omega, \mathcal{F}, P)$ and a set of financial risks $\mathcal{M}$ defined on this space, where these risks are interpreted as portfolio or position losses over some fixed time horizon. Additionally, a risk measure is defined as a mapping $\psi : \mathcal{M} \to \mathbb{R}$, with the interpretation that $\psi(L)$ gives the amount of cash needed to back a position with loss L. In turn, asset values are defined as a function of time and an I-dimensional random vector of t-measurable risk factors $Z_{t,i}$ by $f(t, Z_{t,1}, Z_{t,2}, \ldots, Z_{t,I})$, where $f : \mathbb{R}_+ \times \mathbb{R}^I \to \mathbb{R}$ is a measurable function. For any portfolio comprised of N instruments with notional values $\theta(0) = [\theta_1(0), \ldots, \theta_N(0)]$, the realized portfolio variation over the period $[t-1, t]$ is given by:

$$L_{[t-1,t]} = \sum_{i=1}^{N} \theta_i(0) \left[ f_i(t, Z_t) - f_i(t-1, Z_{t-1}) \right] \qquad (1)$$

Here $f_i(t, Z_t)$ describes the exact risk mapping function, which is not limited to linear functions. In a typical stress test exercise the examiner is interested in assessing what happens to the portfolio losses given an adverse shock to the risk factors. In other words, it pictures the same portfolio at two different moments (today and in the future) and then calculates the maximum loss over all possible future scenarios $Z_{t+m}$:

$$\sup_{Z_{t+m} \in S} L_{[t,t+m]}(Z_t) = \sum_{i=1}^{N} \theta_i(0) \left[ f_i(t+m, Z_{t+m}) - f_i(t, Z_t) \right] \qquad (2)$$

A more realistic formulation arises when the planning horizon is explicitly incorporated into the loss function. In this case it is recognized that the worst loss could occur at any time during the planning horizon, so capital requirements should be sufficient to support losses occurring at any time, not only those arriving at the terminal horizon. Formally, the loss function capturing the worst loss over the planning horizon for a given stress scenario $Z_t$ can be described as:

$$L^{\star}_{[t,t+T]}(Z_t) = \min_{s \in [t,\, t+T]} \sum_{u=t}^{s} \sum_{i=1}^{N} \theta_i(0) \left[ f_i(u+1, Z_{u+1}) - f_i(u, Z_u) \right] \qquad (3)$$

Therefore the multi-period stress test in line with the CCAR and DFAST can be formally given by:

$$\sup_{Z_{t+m} \in S} L^{\star}_{[t,t+m]}(Z_t) \qquad (4)$$

Notwithstanding, the quality of this kind of problem depends crucially on the definition of the uncertainty set, S. In stress testing, the bias toward historical experience can lead to the risk of ignoring plausible but harmful scenarios which have not yet happened in history. As a way of preventing the quality of stress testing from depending too much on individual skills rather than on models and methods, the Basel Committee on Banking Supervision (2005) issued a recommendation to construct stress scenarios observing two main dimensions: plausibility and severity. Severity tends to be the easiest component to take into account, because risk managers can ultimately look at historical data and define scenarios based on the largest movement observed for each risk factor. On the other hand, plausibility requires that, after setting individual stress scenarios $S_i$ and $S_j$ for risk factors i and j, the resulting joint scenario $(S_i, S_j)$ makes economic sense. The current stress testing literature has provided different approaches for generating plausible stress scenarios. A first attempt in this direction was made by Studer (1999) and Breuer and Krenn (1999), who developed what are called "traditional systematic stress tests". In particular, Studer considered elliptical multivariate risk factor distributions and proposed to quantify the plausibility of a realization by its Mahalanobis distance:

$$\sup_{Z :\, \mathrm{Maha}(Z) \le k} L(Z) \qquad (5)$$

This approach was extended by Breuer et al. (2012) to a dynamic context where multi-period plausible stress scenarios can be generated. While this approach introduced the systematic treatment of plausibility for generating stress scenarios, it has problems of its own. First, the maximum loss over a Mahalanobis ellipsoid depends on the choice of coordinates.
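The Mahalanobis-constrained problem of (5) can be illustrated with a small numerical sketch. For a linear portfolio loss $L(Z) = -w'Z$, the maximum over $\mathrm{Maha}(Z) \le k$ has the closed form $k\sqrt{w'\Sigma w}$; the weights, covariance matrix, and radius below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Sketch of the systematic stress test in eq. (5): maximise the loss of a
# linear portfolio, L(Z) = -w'Z, over scenarios Z with Mahalanobis distance
# at most k.  For a linear loss the maximiser lies on the ellipsoid boundary
# and the maximum loss equals k * sqrt(w' Sigma w).
rng = np.random.default_rng(0)

w = np.array([1.0, -0.5, 2.0])          # notional exposures (hypothetical)
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])  # risk-factor covariance (hypothetical)
k = 3.0                                 # plausibility radius

closed_form = k * np.sqrt(w @ Sigma @ w)

# Brute-force check: sample scenarios on the ellipsoid boundary Maha(Z) = k.
L_chol = np.linalg.cholesky(Sigma)
u = rng.normal(size=(100_000, 3))
u /= np.linalg.norm(u, axis=1, keepdims=True)   # uniform directions on sphere
Z = k * u @ L_chol.T                            # boundary scenarios
brute_force = (-(Z @ w)).max()

print(closed_form, brute_force)
```

The brute-force maximum approaches the closed-form value from below, confirming that for linear losses the worst scenario sits on the plausibility boundary.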
Second, the Mahalanobis distance as a plausibility measure reflects only the first two moments of the risk factor distribution. It is important to notice that the argument for working with elliptical approximations to the risk factors is the relative tractability of the elliptical case. However, this tractability comes at the cost of ignoring the fact that a given extreme scenario should be more plausible if the risk factor distribution has fatter tails. Recently, Breuer and Csiszár (2013) proposed a new method to overcome the shortcomings of Studer's method. The authors introduce systematic stress testing for general distributions, where the plausibility of a mixed scenario is determined by its relative entropy $D(Q\|\nu)$ with respect to some reference distribution $\nu$, which can be interpreted as a prior distribution. A second innovation of their approach is the use of mixed scenarios instead of pure scenarios. Putting all elements together, Breuer and Csiszár (2013) propose the following problem:

$$\sup_{Q :\, D(Q\|\nu) \le k} E_Q(L) =: \mathrm{MaxLoss}(L, k) \qquad (6)$$

This method represents an important contribution to the literature, as it moves the analysis of plausible scenarios beyond elliptical distributions. The authors show that under some regularity conditions the solution to (6) is obtained by means of the Maximum Loss Theorem, and concrete results are only available for linear and quadratic approximations to the loss function. It can be observed that, while elliptical distributions have been a usual assumption for describing uncertainty in stress testing problems, empirical research on financial time series indicates the presence of a number of features which are not properly captured by this class of distributions. Concretely, the empirical literature indicates that asset returns display a number of so-called stylized facts: fat tails, volatility clustering, and tail dependence.
4 Maximum entropy Monte Carlo simulation

Recently, De Genaro (2015) proposed a hybrid two-step approach for generating stress scenarios which has shown itself flexible enough to capture all three stylized facts present in financial asset returns: fat tails, volatility clustering, and tail dependence. In his approach the author describes the uncertainty on S via a discrete set:

$$S = \{Z_{1,t}(\omega), Z_{2,t}(\omega), \ldots, Z_{N,t}(\omega)\} \quad \forall\, t \in [1, T_{max}] \qquad (7)$$

where $Z_i$ represents a realization $(\omega)$ of the source of uncertainty under consideration and $T_{max}$ is the maximum holding period. According to his methodology, the first step consists in determining envelope scenarios, $S^i_{env}$, for each risk factor in $\mathcal{M}$ by using Extreme Value Theory (EVT). Once it is recognized that worst-case scenarios are indeed associated with extraordinary events, it is very straightforward to employ elements from the EVT literature to pave the ground for constructing S. Basically, envelope scenarios represent the worst-case scenario (WCS) for each risk factor and have the highest level of severity embedded:

$$S^i_{env} = [S(t)^i_l, S(t)^i_u], \quad \forall\, t \in [1, T_{max}] \text{ and } i = 1, \ldots, I \qquad (8)$$

where $T_{max}$ is the maximum holding period and I is the number of risk factors. Once the parameters of the extreme value distribution have been estimated, the upper and lower bounds of (8) can be obtained using the percentiles of (17):

$$S(t)^i_l = \inf\{x \in \mathbb{R} : F_{(\xi,\sigma)}(x) \ge \alpha_l\}, \quad \forall\, t \in [1, T_{max}] \text{ and } i = 1, \ldots, I \qquad (9)$$

$$S(t)^i_u = \inf\{x \in \mathbb{R} : F_{(\xi,\sigma)}(x) \ge \alpha_u\}, \quad \forall\, t \in [1, T_{max}] \text{ and } i = 1, \ldots, I \qquad (10)$$

A compelling argument for this approach is the capacity to define the size of the uncertainty set in probabilistic terms, instead of through risk aversion parameters, which can be hard to estimate or are attached to a specific family of utility functions.
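The envelope construction of (8)-(10) can be sketched as follows. The sketch fits a Generalized Pareto distribution to tail exceedances of a simulated return series and reads off an envelope bound as a very high percentile; the Student-t data, threshold choice, and severity level are illustrative assumptions.

```python
import numpy as np
from scipy.stats import genpareto

# Sketch of the first step (eqs. 8-10): fit a GPD to the upper tail of a
# return series and compute the "envelope" bound at severity alpha_u.
rng = np.random.default_rng(42)
returns = rng.standard_t(df=4, size=10_000) * 0.01   # stand-in daily returns

# Upper tail: exceedances over the 90% empirical threshold, mirroring the
# 10%-per-tail threshold choice used later in the paper.
u = np.quantile(returns, 0.90)
exceedances = returns[returns > u] - u
xi, _, sigma = genpareto.fit(exceedances, floc=0)    # fix GPD location at 0

# Translate the overall severity alpha_u = 0.9996 into a percentile of the
# exceedance distribution via the empirical exceedance probability.
p_exceed = (returns > u).mean()
alpha_u = 0.9996
q_tail = (alpha_u - (1 - p_exceed)) / p_exceed
S_u = u + genpareto.ppf(q_tail, xi, loc=0, scale=sigma)

print(f"threshold={u:.4f}  xi={xi:.3f}  envelope S_u={S_u:.4f}")
```

The lower bound $S(t)_l$ follows the same recipe applied to the negated returns.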
Thus, as is standard in the risk management literature, for daily data and a confidence level of 99.96%, it is expected that the actual variation will exceed $S(t)_l$ roughly once every 10 years. While these envelopes can be viewed as worst-case scenarios, protecting against uncertainty by setting all returns to their lowest (highest) possible values, namely the end points of the intervals, this approach can produce implausible scenarios. In practice, there is usually some form of dependence among future returns, and it rarely happens that all uncertain returns take their worst-case values simultaneously. It may therefore be desirable to incorporate some kind of variability and correlation of asset returns. A second limitation of this uncertainty set comes from the fact that for some non-linear instruments, such as options, the maximum loss may not occur when the underlying asset hits its envelope scenario. For instance, a stress scenario for a portfolio with an ATM long straddle position is a scenario where the underlying remains unchanged. However, for an outright position this scenario generates almost no losses. The literature calls this the dimensional dependence of maximum loss and suggests that effective stress scenarios are not only made up of those exhibiting extreme variation. As a way to overcome the limitations above, De Genaro (2015) suggested the adoption of a second step. Assuming a given dynamic data generating process, $\Gamma(\alpha, \Phi)$, which depends on a parameter vector $\alpha$ and an arbitrary distribution $\Phi$, the second step consists in generating trajectories for each risk factor along the holding period which are expected to fill as uniformly as possible the state space comprised by the upper bounds, $S(t)_u$, and lower bounds, $S(t)_l$:

$$S = \{S(t) : S(t)^i_l \le S(t) \le S(t)^i_u, \ \forall\, t \in [1, T_{max}] \text{ and } i = 1, \ldots, I\} \qquad (11)$$

where $T_{max}$ is the maximum holding period and I is the number of risk factors.
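One simple way to keep a simulated path inside the band S of (11) is to freeze it at its first exit, which the next section formalizes via an exit time. A minimal sketch, in which the band and the random-walk path are illustrative and the frozen value is clipped to the violated bound so containment holds by construction:

```python
import numpy as np

# Sketch: freeze a simulated path at its first exit from [S_l(t), S_u(t)].
def stop_path(path, lower, upper):
    """Return the path frozen (and clipped) from its first exit time onward."""
    outside = (path < lower) | (path > upper)
    if not outside.any():
        return path.copy()                 # no exit: path already inside S
    tau = int(np.argmax(outside))          # first index outside the band
    stopped = path.copy()
    stopped[tau:] = np.clip(path[tau], lower[tau], upper[tau])
    return stopped

rng = np.random.default_rng(7)
T = 40
lower = np.full(T, -0.10)                  # illustrative envelope bounds
upper = np.full(T, 0.10)
path = np.cumsum(rng.normal(0.0, 0.03, size=T))   # toy cumulative-return path
stopped = stop_path(path, lower, upper)
print(stopped.min(), stopped.max())
```

In continuous time the exit value lies on the boundary itself; the clip is the discrete-time analogue of that convention.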
De Genaro (2015) pointed out that, given a particular realization $\omega$, no sufficient condition is imposed to assure that $S(t, \omega) \in S$. Therefore an additional structure is required to meet this condition. One way of assuring that $S(t, \omega) \in S$ is by resorting to the concept of exit time:

$$\tau(\omega) := \inf\{t \ge 0 \,|\, S(t, \omega) \notin S\} \qquad (12)$$

Therefore, for every trajectory generated by Monte Carlo it is possible to construct the simulated stopped process as:

$$\bar{S}(\omega, t) = \begin{cases} S(\omega, t), & \text{if } \tau > t \\ S(\omega, \tau), & \text{if } \tau \le t \end{cases} \qquad (13)$$

In this situation the use of the stopped process is an artifice that assures that, for any $\omega$, the simulated stopped process is by construction contained in S. Avellaneda et al. (2000) adopted a similar artifice in the context of option pricing, for correcting price misspecification and finite-sample effects arising during the Monte Carlo simulation. An important aspect of quantitative stress tests is their computational tractability. A typical Monte Carlo experiment for risk measurement usually involves thousands of samples, which, combined with a 9-quarter horizon, can yield millions of scenarios for just one portfolio. As one can easily observe, the computational power required to solve the problem when actual bank portfolios are used increases as the number of scenarios grows, and therefore the number of scenarios should be carefully chosen to keep the stress test tractable. De Genaro (2015) proposed the use of Shannon n-gram (block) entropy for choosing a second plausible set $S^\star$, $S^\star \subset S$, formed by the trajectories with the highest entropy. The n-gram entropy (entropy per block of length n) is defined as:

$$E_n = -\sum_{\chi} p\!\left(A^\lambda_1, A^\lambda_2, \ldots, A^\lambda_n\right) \log_\lambda p\!\left(A^\lambda_1, A^\lambda_2, \ldots, A^\lambda_n\right) \qquad (14)$$

where $A^\lambda_1, A^\lambda_2, \ldots, A^\lambda_n$ is a sequence of states generated from real-valued observations $x_i \in \mathbb{R}$ which were discretised by mapping them onto $\lambda$ non-overlapping intervals $A^\lambda(x_i)$. In the above equation the summation is done over all possible state sequences $\chi \in A^\lambda_1, A^\lambda_2, \ldots, A^\lambda_n$.
The probabilities $p(A^\lambda_1, A^\lambda_2, \ldots, A^\lambda_n)$ are calculated based on all subtrajectories $x_1, x_2, \ldots, x_n$ contained within a given subvector of length L. In general, processes or variables with entropy equal to 0 are deterministic; in our context, trajectories with block entropy close to zero should present low price variation. After computing the n-gram entropy for every simulated trajectory, these values are sorted in ascending order: $E_1(Z_1) \le \ldots \le E_m(Z_j)$, where m denotes an ordering based on each trajectory's entropy and $E_i(Z_j)$ denotes the n-gram entropy of the j-th trajectory $Z_j$, calculated as defined by (14). Therefore, after ranking every simulated trajectory by its block entropy, our strategy consists in choosing the k-th largest values for every risk factor[1] to form the final stress scenarios, $S^\star$. To illustrate this concept, the figures below present two sets of simulated trajectories grouped according to their block entropy over a 10-quarter horizon.

[1] For those risk factors which are jointly modeled, we repeat this calculation for every factor in the group, choosing the realizations $\omega_i$ for each factor. The final set is formed by the union of all realizations of every factor. So, if we have two risk factors and we choose only the two largest values for each factor, we end up with four scenarios for each risk factor; in other words, if realization $\omega_1$ is the largest for factor one and $\omega_3$ is the largest for factor two, we keep both realizations in our final set of stress scenarios $\{\omega_1, \omega_3\}$ for each factor. In this way it is possible to preserve the dependence structure present in the data.
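The block-entropy calculation of (14) can be sketched as follows: discretise a trajectory into $\lambda$ equal-width bins, count overlapping blocks of length n, and compute the Shannon entropy (base $\lambda$) of the block frequencies. The bin count and block length below are illustrative choices, not the paper's settings.

```python
import numpy as np
from collections import Counter

# Sketch of the n-gram (block) entropy of eq. (14).
def block_entropy(x, lam=5, n=2):
    lo, hi = x.min(), x.max()
    # Map each observation onto one of `lam` non-overlapping intervals.
    states = np.minimum((lam * (x - lo) / (hi - lo + 1e-12)).astype(int),
                        lam - 1)
    # Count overlapping blocks of length n and normalise to frequencies.
    blocks = Counter(tuple(states[i:i + n]) for i in range(len(states) - n + 1))
    total = sum(blocks.values())
    p = np.array([c / total for c in blocks.values()])
    return float(-(p * np.log(p) / np.log(lam)).sum())   # log base lambda

rng = np.random.default_rng(1)
flat = np.full(50, 0.001)                 # near-deterministic path
noisy = rng.normal(0.0, 0.05, size=50)    # wide, irregular path
print(block_entropy(flat), block_entropy(noisy))
```

A constant path yields zero entropy, while a wide, irregular path yields strictly positive entropy, matching the ranking logic used to pick $S^\star$.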
[Figure 1: Low entropy trajectories. Figure 2: High entropy trajectories. Both figures plot price variation (%) against the holding period (1 to 10 quarters).]

As one can see in Figure 1, 1,000 trajectories were simulated and later classified as low entropy according to the block-entropy estimator. It can be observed that the vast majority of the trajectories are contained in the [−0.05, 0.05] interval, with just a few exceptions hitting the envelope scenarios. This lack of coverage means that a number of price paths will not be taken into consideration for determining the portfolio losses, which might lead to risk underestimation. On the other hand, as can be seen in Figure 2, 1,000 simulated trajectories were drawn and ranked as high entropy. In this second set one can observe, as expected, that the trajectories with the highest entropy indeed presented wider variations. Additionally, these trajectories performed better in making the coverage of the state space formed by the envelope scenarios more uniform.

5 Application: generating scenarios for $S^\star$

Having presented the framework proposed by De Genaro (2015) in the previous section, it is now applied to generate $S^\star$. The first step involves defining the envelope scenarios:

$$S(t)^i_l = \inf\{x \in \mathbb{R} : F_{(\xi,\sigma,l)}(x) \ge \alpha_l\}, \quad \forall\, t \in [1, T_{max}], \ i = 1, \ldots, I \qquad (15)$$

$$S(t)^i_u = \inf\{x \in \mathbb{R} : F_{(\xi,\sigma,u)}(x) \ge \alpha_u\}, \quad \forall\, t \in [1, T_{max}], \ i = 1, \ldots, I \qquad (16)$$

Here $F_{(\xi,\sigma,\cdot)}(x)$ is the CDF of a Generalized Pareto distribution parameterized as:

$$F_{(\xi,\sigma,\cdot)}(x) = 1 - \left(1 + \frac{\xi x}{\sigma}\right)^{-\frac{1}{\xi}}, \quad \sigma \ge 0, \ \xi \ge -0.5 \qquad (17)$$

where the last function argument denotes that this function is defined independently for each tail. Once the parameters of the GPD have been estimated, the upper and lower bounds of (8) are obtained using the percentiles of (17) for the severity levels $\alpha_l = 0.0004$ and $\alpha_u = 0.9996$.
Next we proceed by specifying the DGP, $\Gamma(\alpha, \Phi)$, that will be used for generating paths of the simulated stopped process, $\bar{S}(\omega, t)$:

$$S = \{\bar{S}(\omega, t) : S(t)^i_l \le \bar{S}(\omega, t) \le S(t)^i_u, \ \forall\, t \in [1, T_{max}], \ i = 1, \ldots, I\} \qquad (18)$$

In order to model dynamic volatility and fat tails together, the method implemented for generating asset returns follows a generalization of the two-step procedure of McNeil and Frey (2000), in which asset volatilities are modeled by GARCH-type models and the tail distributions of the GARCH innovations are modeled by EVT. Therefore, in the same spirit as the model employed by McNeil and Frey (2000), assume that the data generating process describing asset returns is given by:

$$r_t = \omega + \rho r_{t-1} + \sigma_t \epsilon_t \qquad (19)$$

where $\{\epsilon_t\}$ is a sequence of independent Student's t innovations with $\nu$ degrees of freedom, to incorporate the fat tails often present in asset returns. Furthermore, assume that:

$$\sigma^2_t = \alpha_0 + \sum_{j=1}^{p} \beta_j \sigma^2_{t-j} + \sum_{j=1}^{q} \alpha_j \epsilon^2_{t-j} + \sum_{j=1}^{q} \zeta_j \mathbb{1}_{\{\epsilon_{t-j} < 0\}} \epsilon^2_{t-j} \qquad (20)$$

The indicator function $\mathbb{1}_{\{\epsilon_{t-j} < 0\}}$ equals 1 if $\epsilon_{t-j} < 0$, and 0 otherwise. The specification above was first developed by Glosten, Jagannathan, and Runkle (1993). The GJR model is a GARCH variant that includes leverage terms for modeling asymmetric volatility clustering. An advantage of the GJR specification is the asymmetric nature of the impact of innovations: with $\zeta \neq 0$, a positive shock will have a different effect on volatility than a negative shock, mirroring findings in equity market research about the impact of "bad news" and "good news" on market volatility. In this paper we shall concentrate on (20) assuming p = q = 1. This is preferred for two reasons. First, the GJR(1,1) model is by far the most frequently applied GARCH variant. Second, we want to keep a parsimonious specification, since we are handling a large-scale problem.
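The DGP of (19)-(20) with p = q = 1 can be simulated with a short sketch. The parameter values below are illustrative, not the estimates reported in Table 1, and the t draws are rescaled to unit variance so that $\sigma_t$ keeps its interpretation.

```python
import numpy as np

# Sketch of eqs. (19)-(20): AR(1) mean, GJR-GARCH(1,1) variance,
# Student-t innovations.  All parameter values are hypothetical.
def simulate_ar_gjr(T, mu=0.0, rho=0.05,
                    a0=1e-5, a1=0.05, b1=0.90, zeta=0.05,
                    nu=6, seed=0):
    rng = np.random.default_rng(seed)
    eps = rng.standard_t(nu, size=T) * np.sqrt((nu - 2) / nu)  # unit variance
    r = np.zeros(T)
    sig2 = np.full(T, a0 / (1 - a1 - b1 - zeta / 2))  # unconditional variance
    for t in range(1, T):
        shock = sig2[t - 1] * eps[t - 1] ** 2
        lev = zeta * shock if eps[t - 1] < 0 else 0.0  # leverage term
        sig2[t] = a0 + a1 * shock + b1 * sig2[t - 1] + lev
        r[t] = mu + rho * r[t - 1] + np.sqrt(sig2[t]) * eps[t]
    return r, np.sqrt(sig2)

r, sigma = simulate_ar_gjr(T=1000)
print(r[:3], sigma[:3])
```

Runs of negative innovations inflate $\sigma_t$ through the leverage term, reproducing the asymmetric volatility clustering the GJR specification is chosen for.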
The second step, according to McNeil and Frey (2000), requires modeling the marginal distributions of the standardized innovations $z_t := r_t/\sigma_t$ of each risk factor. To accomplish that, a non-parametric smooth kernel density estimator is implemented for describing the center of the data, while the Generalized Pareto distribution is adopted for the upper and lower tails above the threshold. To incorporate the dependence structure among different risk factors we adopt the t-copula, which has received much attention in the context of modeling multivariate financial return data. A number of papers, such as Mashal and Zeevi (2002) and Breymann et al. (2003), have shown that the empirical fit of the t-copula is generally superior to that of the so-called Gaussian copula. One reason for this is the ability of the t-copula to better capture the phenomenon of dependent extreme values. This is a desirable feature in risk management because under extreme market conditions the co-movements of asset returns do not typically preserve the linear relationship observed under ordinary conditions.

5.1 Results

This subsection summarizes some aspects of statistical inference, as well as the outcomes of the models that we have proposed and estimated. Our dataset contains end-of-month closing prices for USD/EUR, USD/GBP and the Dow Jones Total Stock Market Index (Dow Jones for short) over the period from January 1999 to October 2015. These variables were chosen among those considered by the FED to conduct the stress tests in 2015. Although the dimension of the problem is small, the example illustrates the qualitative properties of the proposed methods. The first step requires estimating the parameters of the AR-GARCH processes as defined by equations (19) and (20). In general, the parameters of non-Gaussian GARCH models are estimated by quasi-maximum likelihood (QMLE).
Bollerslev and Wooldridge (1992) showed that the QMLE still delivers consistent and asymptotically normal parameter estimates even if the true distribution is non-normal. In our approach the efficiency of the filtering process, i.e., the construction of $z_t$, is of paramount importance. This is so because the filtered residuals serve as an input to both the EVT tail estimation and the copula estimation. This suggests that we should search for an estimator which is efficient under conditions of non-normality. Therefore, as pioneered by Bollerslev (1987) and adopted here, the model's parameters are estimated by maximizing the exact conditional t-distributed density with $\nu$ degrees of freedom rather than an approximate density. Having completed the calibration of the GARCH parameters and hence obtained sequences of filtered residuals, we now consider estimation of the tail behavior by using a Pareto distribution for the tails and a Gaussian kernel for the interior of the distribution. A crucial issue in applying EVT is the estimation of the beginning of the tail. Unfortunately, the theory does not say where the tail should begin. We know that we must be sufficiently far out in the tail for the limiting argument to hold, but we also need enough observations to reliably estimate the parameters of the GPD. There is no single correct choice of the threshold level. While McNeil and Frey (2000) use the "mean-excess plot" as a tool for choosing the optimal threshold level, some authors, such as Mendes (2005), use an arbitrary 90% threshold (i.e., the largest 10% of the positive and negative returns are considered extreme observations). In this paper we define upper and lower thresholds such that 10% of the residuals are reserved for each tail. Estimating the parameters of a copula or the spectral measure is an important part of this framework.
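The semi-parametric marginal described above, with GPD tails beyond the 10% thresholds, can be sketched as follows. For brevity the plain empirical CDF stands in for the paper's smooth kernel estimator of the interior, and the Student-t residuals are simulated stand-ins.

```python
import numpy as np
from scipy.stats import genpareto

# Sketch of a semi-parametric CDF: empirical interior, GPD beyond the
# 10% thresholds on each side (cf. McNeil and Frey, 2000).
def semiparametric_cdf(data, q=0.10):
    lo, hi = np.quantile(data, [q, 1 - q])
    xi_u, _, su = genpareto.fit(data[data > hi] - hi, floc=0)   # upper tail
    xi_l, _, sl = genpareto.fit(lo - data[data < lo], floc=0)   # lower tail
    sorted_data, n = np.sort(data), len(data)

    def cdf(x):
        x = np.asarray(x, dtype=float)
        out = np.searchsorted(sorted_data, x) / n      # empirical interior
        up = x > hi
        out[up] = 1 - q * genpareto.sf(x[up] - hi, xi_u, 0, su)
        lw = x < lo
        out[lw] = q * genpareto.sf(lo - x[lw], xi_l, 0, sl)
        return out

    return cdf

rng = np.random.default_rng(3)
z = rng.standard_t(5, size=5000)      # stand-in filtered residuals
F = semiparametric_cdf(z)
u = F(z)                              # probability-integral transform
print(u.min(), u.max())
```

The resulting uniform variates are exactly the inputs the copula estimation step consumes.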
In this paper we implement the technique proposed by Romano (2002), a semi-parametric approach for estimating the parameters of the t-copula. This method, which makes mild assumptions on the margins, starts by transforming the standardized residuals into uniform variates through the semi-parametric empirical distribution, and then plugs them into the likelihood function to obtain canonical maximum likelihood (CML) estimates of the t-copula parameters. Maximum likelihood estimates of the parameters of each model are presented in table 1, with asymptotic standard errors in parentheses.

AR-GARCH
            USD/EUR             USD/GBP             Dow Jones
ω           0.7065 (0.2914)     0.1376 (0.1736)     0.70658 (0.29149)
ρ           -0.07574 (0.0867)   -0.0694 (0.0773)    -0.07574 (0.0873)
α0          0.1754 (0.834)      0.4692 (0.5303)     7.6417E-04 (2.75E-04)
α1          0.212 (0.08)        0.0632 (0.0271)     0.228 (0.1296)
β1          0.7365 (0.1505)     0.865 (0.1049)      0.735 (0.167)
ζ           0.184 (0.01247)     0.1159 (0.0511)     0.6861 (0.32)
t(ν) DoF    8.2 (1.3)           5.95 (2.66)         9.32 (4.34)

Lower Tail
ξ           0.507 (1.55E-2)     0.5413 (5.47E-3)    0.5295 (8.36E-3)
σ           0.815 (0.31E-2)     0.736 (3.17E-2)     0.852 (4.26E-2)

Upper Tail
ξ           0.5641 (1.55E-2)    0.5996 (3.47E-2)    0.5622 (3.43E-2)
σ           0.8162 (0.21E-2)    0.9107 (2.17E-2)    0.3495 (3.26E-2)

t-copula (correlation matrix; DoF = 24.96)
            USD/EUR   USD/GBP   Dow Jones
USD/EUR     1         0.6258    0.2361
USD/GBP     0.6258    1         0.1464
Dow Jones   0.2361    0.1464    1

Table 1: Estimates of GARCH(1,1), tail distribution and t-copula (asymptotic standard errors in parentheses)

For each of our three variables, table 1 reports the results of the maximum likelihood estimation of each model's parameters. Even though the interpretation and analysis presented in table 1 are standard in the financial econometrics literature, some results are worth pointing out. First, the autoregressive parameters are not statistically significant, indicating low momentum in these variables.
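A hedged sketch of the CML idea for the t-copula: rank-based pseudo-uniforms for the margins, the correlation matrix from Kendall's tau, and the degrees of freedom chosen by profile likelihood over a grid. This is a common simplification of full CML; the grid and helper names are ours.

```python
import numpy as np
from scipy import stats, special

def t_copula_cml(z, nu_grid=range(3, 41)):
    """Canonical-ML-style sketch for a t-copula: rank pseudo-uniforms,
    correlation via Kendall's tau, nu by profile likelihood."""
    n, d = z.shape
    u = (np.argsort(np.argsort(z, axis=0), axis=0) + 1) / (n + 1)  # pseudo-obs
    R = np.eye(d)
    for i in range(d):
        for j in range(i + 1, d):
            tau, _ = stats.kendalltau(z[:, i], z[:, j])
            R[i, j] = R[j, i] = np.sin(np.pi * tau / 2)  # tau -> rho (elliptical)
    Rinv = np.linalg.inv(R)
    _, logdet = np.linalg.slogdet(R)

    def loglik(nu):
        x = stats.t.ppf(u, df=nu)
        q = np.einsum('ni,ij,nj->n', x, Rinv, x)
        # log t-copula density = log multivariate t - sum of log univariate t
        lc = (special.gammaln((nu + d) / 2) + (d - 1) * special.gammaln(nu / 2)
              - d * special.gammaln((nu + 1) / 2) - 0.5 * logdet
              - (nu + d) / 2 * np.log1p(q / nu)
              + (nu + 1) / 2 * np.sum(np.log1p(x**2 / nu), axis=1))
        return lc.sum()

    nus = np.array(list(nu_grid), dtype=float)
    return R, nus[int(np.argmax([loglik(nu) for nu in nus]))]

# Recover the parameters from a simulated multivariate t sample
rng = np.random.default_rng(1)
d, n, nu_true = 3, 3000, 6
C = np.array([[1.0, 0.6, 0.2], [0.6, 1.0, 0.15], [0.2, 0.15, 1.0]])
g = rng.standard_normal((n, d)) @ np.linalg.cholesky(C).T
w = nu_true / rng.chisquare(nu_true, size=(n, 1))
x = g * np.sqrt(w)            # multivariate t(6) sample with correlation C
R_hat, nu_hat = t_copula_cml(x)
```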
Second, for all variables, the GARCH coefficients α1 and β1 are significant at the 1% level and their sum is less than one, implying that the GARCH model is stationary, though volatility is fairly persistent since (α1 + β1) is close to one. Third, the estimated degrees of freedom of the conditional t-distribution for USD/EUR, USD/GBP and Dow Jones are all smaller than 10, suggesting that the returns on the selected variables are conditionally non-normally distributed. In the middle of table 1, the ML estimates of the GPD parameters fitted to the observations in excess of the thresholds are presented. For all variables, the estimated ξ is positive for both the lower and the upper tail of the filtered residuals' distribution. This indicates that the tails on both sides of the distribution are heavy and that all moments up to the fourth are finite. Finally, the parameters describing the dependence are presented at the bottom of the table. The number of degrees of freedom estimated for the t-copula is 24.96, a relatively high value.

Once the parameters of each model have been estimated, the next step consists of generating maximum entropy trajectories for up to 9 quarters by Monte Carlo simulation. Thus, based on the parameters of table 1, 50,000 samples of the simulated stopped process were drawn for each variable over a 9-quarter horizon, forming the set S. Finally, the uncertainty set S⋆ is formed by choosing from S the 1,000 trajectories with the highest block entropy.
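The selection of the highest-entropy trajectories could be sketched as below. Note that De Genaro (2015) defines the block-entropy criterion precisely; the up/down symbolization used here is only our simplified stand-in, and the path counts are toy-sized.

```python
import numpy as np

def block_entropy(path, k=2):
    """Simplified proxy for block entropy: symbolize quarterly moves as
    up/down and compute the Shannon entropy of overlapping length-k blocks.
    (A stand-in for the exact criterion of De Genaro (2015).)"""
    s = (np.diff(path) > 0).astype(int)                 # up/down symbols
    blocks = np.array([s[i:i + k] for i in range(len(s) - k + 1)])
    _, counts = np.unique(blocks, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def max_entropy_subset(paths, m):
    """Keep the m paths with the highest block entropy (the set S*)."""
    ents = np.array([block_entropy(p) for p in paths])
    return paths[np.argsort(ents)[::-1][:m]]

# Toy example: 5,000 random-walk paths over 9 quarters, keep the top 100
rng = np.random.default_rng(7)
paths = np.cumsum(rng.normal(0, 0.05, size=(5000, 10)), axis=1)
S_star = max_entropy_subset(paths, m=100)
```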
The results can be seen in figures 3 to 8 below.

[Figure 3: Simulated and regulatory scenarios for EUR/USD]
[Figure 4: Historical and regulatory scenarios for EUR/USD]
[Figure 5: Simulated and regulatory scenarios for GBP/USD]
[Figure 6: Historical and regulatory scenarios for GBP/USD]
[Figure 7: Simulated and regulatory scenarios for Dow Jones]
[Figure 8: Historical and regulatory scenarios for Dow Jones]
(Each figure plots accumulated variation (%) against quarters ahead, from 1 to 9.)

The red dotted lines represent the envelope scenarios, which were estimated using Extreme Value Theory with confidence levels αl = 0.04% and αu = 99.96%. The blue dotted lines represent the regulatory scenario prescribed by the Federal Reserve for the 2015 annual stress tests of bank holding companies. Figures 3, 5 and 7 present the scenarios produced by maximum entropy Monte Carlo simulation. We observe that the paths in this set cover almost all possible variation along the holding period, which is desirable from a risk viewpoint: CCAR & DFAST results would then be determined over a broad range of future outcomes for each variable, with no potentially harmful price variation left aside. For comparative purposes, figures 4, 6 and 8 present the historical scenario for each market variable. We observe that all historical scenarios lie within the envelope scenario.
In particular, for EUR/USD and Dow Jones the regulatory scenarios are in line with the historical price variation along the planning horizon; for GBP/USD, on the other hand, the regulatory scenarios are less severe than the historical behavior of this variable. Finally, we note that of the 50,000 trajectories originally generated, only 75 hit the envelope scenarios (the boundaries of our problem) and were therefore stopped. This represents a very high acceptance rate for our sampling algorithm.

5.2 Discussion of results and implications

Even though the graphical results suggest that the simulated scenarios outperform the regulatory scenarios, a concrete measure is required to confirm this claim. One way to provide conclusive evidence would be to rerun the stress test using the simulated scenarios; however, this would require the balance sheet information of each bank and its proprietary positions in each asset class, which is confidential and not available. Another alternative for assessing the impact of the simulated scenarios on real data is via the reports released by the FED when a stress test cycle is completed. In general, the FED publishes during the first quarter a report describing the final results of the stress test conducted in the previous year. For every BHC, measures such as projected stressed capital ratios, risk-weighted assets and losses are available, but with limited disclosure of assets held and derivatives positions. Due to this lack of granular data, we resort to proxy variables. In the United States, all regulated financial institutions are required to file periodic financial and other information with their respective regulators and other parties.
For banks in the U.S., one of the key required filings is the quarterly Consolidated Report of Condition and Income, generally referred to as the call report or RC report. Specifically, every national bank, state member bank and insured nonmember bank is required by the Federal Financial Institutions Examination Council (FFIEC) to file a call report as of the close of business on the last day of each calendar quarter, i.e., the report date. The specific reporting requirements depend on the size of the bank and on whether or not it has any foreign offices. Among the information provided, the call report contains, for each institution, the gross (positive and negative) fair value and the notional amount of OTC derivatives with unaffiliated financial institutions for interest rate, FX, equity, commodity and credit contracts. The notional amount is the dollar value used to calculate payments made on swaps and other risk management products. This amount generally does not change hands and is thus referred to as notional. Notional amounts are not an accurate reflection of credit exposure, as they do not reflect the market value of the underlying contracts or the benefits of close-out netting and collateral.

Measuring credit exposure in derivative contracts involves identifying those contracts where a bank would lose value if the counterparty defaulted today. The total of all contracts with positive value (i.e., derivatives receivables) to the bank is the gross positive fair value (GPFV) and represents an initial measurement of credit exposure. The total of all contracts with negative value (i.e., derivatives payables) to the bank is the gross negative fair value (GNFV) and represents a measurement of the exposure the bank poses to its counterparties. Banks usually hedge the market risk of their derivatives portfolios, so a change in GPFV is matched by a similar change in GNFV; but this hedge is not perfect, and some amount remains subject to price fluctuation.
Therefore we define the net current exposure as the gross positive fair value minus the gross negative fair value, which represents the residual exposure once all netting and hedging mechanisms are taken into consideration. According to the report published by the Office of the Comptroller of the Currency using call report data for all required institutions in the third quarter of 2015, derivatives activity in the U.S. banking system is dominated by a small group of large financial institutions: four large commercial banks (JPMorgan Chase, Citibank, Bank of America and Goldman Sachs) represent 91.3% of total banking industry notional amounts and 80.3% of the industry's net current credit exposure. Table 2 presents the exposures of these four large banks according to the Consolidated Financial Statements for Holding Companies (FR Y-9C) computed by the Federal Reserve and published on its National Information Center website2.

2 Link: http://www.ffiec.gov/nicpubweb/nicweb/HCSGreaterThan10B.aspx

Gross Positive and Negative Fair Values
                                      FX                Equities
JP Morgan & Co        GPFV (1)        $185,015,000      $46,727,000
                      GNFV (2)        $201,298,000      $47,549,000
                      Net (1)-(2)     $(16,283,000)     $(822,000)
Bank of America Corp  GPFV (1)        $103,825,000      $38,184,000
                      GNFV (2)        $104,986,000      $35,102,000
                      Net (1)-(2)     $(1,161,000)      $3,082,000
Citigroup Inc.        GPFV (1)        $146,831,000      $27,208,000
                      GNFV (2)        $150,143,000      $31,769,000
                      Net (1)-(2)     $(3,312,000)      $(4,561,000)
Goldman Sachs         GPFV (1)        $102,604,000      $56,221,000
                      GNFV (2)        $107,669,000      $54,276,000
                      Net (1)-(2)     $(5,065,000)      $1,945,000

Table 2: Reported Gross Positive Fair Value (GPFV) and Gross Negative Fair Value (GNFV) obtained from the FR Y-9C report. Values in thousands of dollars. As of September 30, 2015.

From table 2 it is possible to see that, even after all netting mechanisms are taken into consideration, banks still carry exposure to these risk factors and are therefore subject to market risk.
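As a quick arithmetic check, the net figures in table 2 can be recomputed from the reported GPFV and GNFV (FX column shown; values in thousands of dollars, as reported):

```python
# Net current exposure = GPFV - GNFV, recomputed from the FX column of table 2
gpfv_fx = {"JPMorgan": 185_015_000, "BofA": 103_825_000,
           "Citi": 146_831_000, "Goldman": 102_604_000}
gnfv_fx = {"JPMorgan": 201_298_000, "BofA": 104_986_000,
           "Citi": 150_143_000, "Goldman": 107_669_000}
net_fx = {bank: gpfv_fx[bank] - gnfv_fx[bank] for bank in gpfv_fx}
# All four FX nets are negative, i.e. payables exceed receivables
```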
While GPFV and GNFV convey valuable information about derivatives exposures, they do not reveal whether a bank is long or short, so we assume that all banks have net long exposure to these risk factors. These figures give a good sense of how the selected banks are exposed to FX and equity price risk, but no breakdown is available of, for example, which currencies or stock indexes were traded. For simplicity, let us assume that 90% of each bank's net equity exposure comes from the Dow Jones and that 90% of its net FX exposure is equally split between USD/EUR and USD/GBP.

                      USD/EUR      USD/GBP      Dow Jones
JP Morgan & Co        $7,327,350   $7,327,350   $739,800
Bank of America Corp  $522,450     $522,450     $2,773,800
Citigroup Inc.        $1,490,400   $1,490,400   $4,104,900
Goldman Sachs         $2,279,250   $2,279,250   $1,750,500

Table 3: Exposure to risk factors. Values in thousands of dollars. As of September 30, 2015.

Having defined how much each bank is exposed to FX and equities, our investigation proceeds by a comparative analysis between the regulatory and the simulated scenarios. The exercise consists in computing what the losses for each bank would be under the regulatory scenarios and comparing them with the losses arising when the simulated scenarios are adopted. Table 4 exhibits the losses calculated for each bank using the Severely Adverse Scenario published by the FED and the 1,000 simulated scenarios produced with the methodology proposed in this paper.

                      Simulated Scenarios   Severely Adverse Scenario   Ratio
JP Morgan & Co        $(5,118,961)          $(1,756,187)                291%
Bank of America Corp  $(2,131,354)          $(1,701,619)                125%
Citigroup Inc.        $(3,476,757)          $(2,648,148)                131%
Goldman Sachs         $(2,429,482)          $(1,427,089)                170%

Table 4: Stress test results obtained using the FED's Severely Adverse Scenario and the maximum entropy Monte Carlo scenarios.
The ratio is the loss under the simulated scenarios divided by the loss under the Severely Adverse Scenario. Values in thousands of dollars.

From table 4 we can confirm that the simulated scenarios are more severe than those provided by the FED, so all banks would face higher capital requirements under these scenarios. The most affected bank in our analysis is JP Morgan & Co, whose loss under the simulated scenarios was 2.9 times the loss under the Severely Adverse Scenario.

5.3 Robustness check

Even though the t-copula has desirable properties for handling high-dimensional distributions of financial variables, the dependence structure among pairs of variables can vary substantially, ranging from independence to complex forms of non-linear dependence; in the t-copula, all dependence is captured by only two kinds of parameters, the correlation coefficients and the number of degrees of freedom. Because of this potential limitation, we follow De Genaro (2015) and perform a robustness check of the t-copula results, comparing its outcomes with the more flexible structure given by the fast-growing pair-copula technique originally proposed by Joe (1996). In general, as stated in Kurowicka and Cooke (2004), a multivariate density can, under appropriate regularity conditions, be expressed as a product of pair-copulae acting on several different conditional probability distributions. The construction is iterative in nature and, for a given factorization, many viable re-parameterizations remain. Here we concentrate on the D-vine, a special case of regular vines in which the specification takes the form of a nested set of trees. This section presents a comparative study of pair-copulae and the t-copula using the dataset of section 5.1.
While the pair-copula is a very flexible approach for handling complex dependence among variables, it requires specifying a significant number of pair combinations. To overcome this, we implemented an automated pair selection based on the Akaike information criterion (AIC): for every candidate pair-copula we estimate its parameters by maximum likelihood and choose the family with the smallest AIC. We tested a total of 18 different pair-copula families: independent; Gaussian; t-copula; Clayton; Gumbel; Frank; Joe; BB1; BB6; BB7; BB8; rotated Clayton; rotated Gumbel; rotated Joe; rotated BB1; rotated BB6; rotated BB7; and rotated BB8. We also estimated the standard t-copula on the same dataset. The results are reported in the tables below.

                           Log-Likelihood   AIC
Pair copula (AIC-selected) 68.20            -128.40
t-copula                   65.27            -122.54

Table 5: Fitting results

The configuration with the smallest AIC was:

Pair Copula                         Family            Parameters
USD/EUR - Dow Jones                 t-copula          (0.71; 4.70)
USD/GBP - Dow Jones                 survival Gumbel   1.15
USD/EUR - USD/GBP given Dow Jones   Gaussian          0.23

Table 6: Best pair-copula configuration

From table 5 we observe the superior performance of the pair-copula under the statistical criteria. As a reality check of our models, we simulated 10,000 samples under each configuration, computed the 0.01 and 0.99 percentiles for each risk factor, and compared them with the empirical distribution.

            0.01      0.99
USD/EUR     -3.37%    3.31%
USD/GBP     -2.66%    2.10%
Dow Jones   -5.89%    4.44%

Table 7: Empirical percentiles for each risk factor

            0.01      0.99
USD/EUR     -3.40%    3.48%
USD/GBP     -2.82%    2.41%
Dow Jones   -5.92%    4.65%

Table 8: Loss percentiles for each risk factor using pair-copulae

            0.01      0.99
USD/EUR     -3.38%    3.45%
USD/GBP     -2.80%    2.38%
Dow Jones   -5.87%    4.55%

Table 9: Loss percentiles for each risk factor using the t-copula

We conclude from tables 8 and 9 that both methodologies are capable of producing results in line with those observed in practice (table 7).
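The AIC-based family selection can be illustrated with a two-family version of the search (Gaussian vs. Clayton, fitted by maximum likelihood on copula data); the full exercise runs over all 18 families, but the mechanics are the same. Function names and the sampled example are ours.

```python
import numpy as np
from scipy import stats, optimize

def aic_select(u, v):
    """Fit Gaussian and Clayton bivariate copulas by ML on uniforms (u, v)
    and pick the family with the smallest AIC (two-family illustration)."""
    def nll_gauss(rho):
        x, y = stats.norm.ppf(u), stats.norm.ppf(v)
        r2 = rho**2
        lc = -0.5 * np.log(1 - r2) - (r2 * (x**2 + y**2) - 2 * rho * x * y) / (2 * (1 - r2))
        return -lc.sum()
    def nll_clayton(th):
        lc = (np.log1p(th) - (1 + th) * (np.log(u) + np.log(v))
              - (2 + 1 / th) * np.log(u**(-th) + v**(-th) - 1))
        return -lc.sum()
    fits = {}
    for name, nll, bnd in [("Gaussian", nll_gauss, (-0.99, 0.99)),
                           ("Clayton", nll_clayton, (1e-3, 20.0))]:
        res = optimize.minimize_scalar(nll, bounds=bnd, method="bounded")
        fits[name] = 2 * 1 + 2 * res.fun      # AIC = 2k - 2 logL, k = 1 here
    return min(fits, key=fits.get), fits

# Example: sample from a Clayton copula (theta = 2) by the conditional method
rng = np.random.default_rng(3)
n, th = 2000, 2.0
u = rng.uniform(size=n)
w = rng.uniform(size=n)
v = ((w**(-th / (1 + th)) - 1) * u**(-th) + 1)**(-1 / th)
best, fits = aic_select(u, v)
```

On this lower-tail-dependent sample the Clayton family should be selected, mirroring how the paper's 18-family search picks whichever family minimizes the AIC for each pair.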
We performed a comparative study between the t-copula and 18 candidate pair-copulae and found that, even though a pair-copula presents a better fit in our example, its results are not substantially different from those of a t-copula when applied to measuring tail risk in a portfolio. Moreover, despite its appealing flexibility in describing dependence among risk factors, the pair-copula requires an additional step in which the dependence structure itself must be specified; in a real-world application this may be a high-dimensional problem, and there are not enough grounds to replace the parsimonious t-copula.

6 Final remarks

Stress testing has become an increasingly popular tool for assessing the resilience of financial institutions to adverse macro-financial developments. The 2008 financial turmoil and the euro area sovereign debt crisis, which exposed the financial sector to unprecedented adverse shocks, reinforced this trend. Regulators in different jurisdictions have implemented stricter rules and reforms so as to increase banks' resilience. Stress tests held by the Federal Reserve and the European Banking Authority have led banks to meet strict capital requirements, ensuring that they are allegedly well-cushioned against future crises.

Despite the substantial analytical advances in stress testing techniques made in recent years by regulators and academics, several challenges remain. The first concerns the fact that, while scenarios should by definition be stressful, they should not be implausible: when designing scenarios, due consideration must be given to a level of severity that is appropriate yet reflects a material risk, one that pushes institutions toward risk-reduction actions. Another important point is that capital requirements based on a handful of stress scenarios can create a false illusion of safety.
In general, bank holding companies (BHCs) have complex portfolios with countless non-linear instruments, so it is important that stress testing cover a wide spectrum of price variations as a way to unveil the potential losses resulting from non-linear payoffs. While some stress testing frameworks, including the one presented here, have already made steps towards overcoming some of these analytical challenges, further efforts are clearly needed in the coming years to improve the overall reliability and accuracy of stress test exercises.

References

[1] Avellaneda, M., Buff, R., Friedman, C. A., Grandechamp, N., Kruk, L. and Newman, J. (2001). Weighted Monte Carlo: A New Technique for Calibrating Asset-Pricing Models. International Journal of Theoretical and Applied Finance, 4, 1-29.

[2] Bertsimas, D. and Pachamanova, D. (2008). Robust multiperiod portfolio management in the presence of transaction costs. Computers & Operations Research, 35, 3-17.

[3] Bollerslev, T. (1987). A Conditionally Heteroskedastic Time Series Model for Speculative Prices and Rates of Return. Review of Economics and Statistics, 69, 542-547.

[4] Basel Committee on Banking Supervision (2005). International Convergence of Capital Measurement and Capital Standards: A Revised Framework. Technical Report, Bank for International Settlements.

[5] Bedford, T. and Cooke, R. M. (2001). Probability density decomposition for conditionally dependent random variables modeled by vines. Annals of Mathematics and Artificial Intelligence, 32, 245-268.

[6] Bedford, T. and Cooke, R. M. (2002). Vines - a new graphical model for dependent random variables. Annals of Statistics, 30(4), 1031-1068.

[7] Bollerslev, T. and Wooldridge, J. M. (1992). Quasi-Maximum Likelihood Estimation and Inference in Dynamic Models with Time Varying Covariances. Econometric Reviews, 11, 143-172.

[8] Breuer, T. and Csiszár, I. (2013). Systematic stress tests with entropic plausibility constraints.
Journal of Banking & Finance, 37, 1552-1559.

[9] Breuer, T. and Krenn, G. (1999). Stress Testing. Guidelines on Market Risk 5, Oesterreichische Nationalbank, Vienna.

[10] Breuer, T., Jandacka, M., Mencía, J. and Summer, M. (2012). A systematic approach to multi-period stress testing of portfolio credit risk. Journal of Banking and Finance, 36(2), 332-340.

[11] Breymann, W., Dias, A. and Embrechts, P. (2003). Dependence structures for multivariate high-frequency data in finance. Quantitative Finance, 3, 1-14.

[12] Christensen, J. H. E., Lopez, J. A. and Rudebusch, G. D. (2015). A probability-based stress test of Federal Reserve assets and income. Journal of Monetary Economics, 73, 26-43.

[13] Committee on Payment and Settlement Systems and Technical Committee of the International Organization of Securities Commissions (2012). Principles for financial market infrastructures.

[14] De Genaro (2015). Systematic multi-period stress scenarios with an application to CCP risk management. Journal of Banking and Finance, forthcoming.

[15] Embrechts, P., McNeil, A. and Straumann, D. (2001). Correlation and dependency in risk management: properties and pitfalls. In M. Dempster and H. Moffatt (eds.), Risk Management: Value at Risk and Beyond. Cambridge University Press, 176-223.

[16] Embrechts, P., Resnick, S. and Samorodnitsky, G. (1999). Extreme Value Theory as a Risk Management Tool. North American Actuarial Journal, 3(2), 30-41.

[17] Glosten, L. R., Jagannathan, R. and Runkle, D. (1993). On the Relation Between the Expected Value and the Volatility of the Nominal Excess Return on Stocks. Journal of Finance, 48, 1779-1801.

[18] Hansen, B. E. (1994). Autoregressive conditional density estimation. International Economic Review, 35(3), 705-730.

[19] Joe, H. (1996). Families of m-variate distributions with given margins and m(m-1)/2 bivariate dependence parameters. In L. Rüschendorf, B. Schweizer and M. D. Taylor (eds.), Distributions with Fixed Marginals and Related Topics.
[20] Kurowicka, D. and Cooke, R. M. (2004). Distribution-free continuous Bayesian belief nets. In Fourth International Conference on Mathematical Methods in Reliability: Methodology and Practice, Santa Fe, New Mexico.

[21] Mashal, R. and Zeevi, A. (2002). Beyond correlation: extreme co-movements between financial assets. Unpublished, Columbia University.

[22] McNeil, A. (1999). Extreme Value Theory for Risk Managers. Unpublished, ETH Zurich.

[23] McNeil, A. J. and Frey, R. (2000). Estimation of tail-related risk measures for heteroscedastic financial time series: an extreme value approach. Journal of Empirical Finance, 7, 271-300.

[24] Mendes, B. V. (2005). Asymmetric extreme interdependence in emerging equity markets. Applied Stochastic Models in Business and Industry, 21, 483-498.

[25] Nelsen, R. B. (1999). An Introduction to Copulas. Springer, New York.

[26] Nelson, D. B. and Cao, C. (1992). Inequality Constraints in the Univariate GARCH Model. Journal of Business and Economic Statistics, 10, 229-235.

[27] Ou, J. (2005). Theory of portfolio and risk based on incremental entropy. Journal of Risk Finance, 6, 31-39.

[28] Philippatos, G. and Wilson, C. (1972). Entropy, market risk, and the selection of efficient portfolios. Applied Economics, 4, 209-220.

[29] Romano, C. (2002). Calibrating and simulating copula functions: an application to the Italian stock market. Working paper n. 12, CIDEM.

[30] Studer, G. (1999). Market risk computation for nonlinear portfolios. Journal of Risk, 1(4), 33-53.

[31] Zhou, R., Cai, R. and Tong, G. (2013). Applications of Entropy in Finance: A Review. Entropy, 15, 4909-4931.