Risk management: market risk

Overview

  • get logarithmic return data
  • check assumption of normality
  • VaR and ES estimation
  • backtesting
  • simulation study: estimation risk

Basic setup

  • loading required packages
  • optional: install missing packages with Pkg.add()
In [1]:
# Pkg.add("EconDatasets")
# Pkg.add("TimeData")

using EconDatasets
using TimeData
using Dates
using Gadfly
using Distributions
using JuMP
using NLopt
using Base.Test
(Loading TimeData under Julia 0.3 prints a series of near-identical method-ambiguity warnings: the new definition get(Any, Colon) at TimeData/src/abstractFuncs.jl:77 is ambiguous with the get methods for Nullable in Compat and for OrderedDict, DefaultDictBase, DefaultDict and DefaultOrderedDict in DataStructures. The warnings are condensed here; the notebook runs unaffected.)

Get data

  • download data from Yahoo Finance: NASDAQ-100, S&P 500 and EURO STOXX 50
In [2]:
tickerSymbs = ["^NDX"
               "^GSPC"
               "^STOXX50E"]


dates = Date(1960,1,1):Date(2015,3,30)

indexData = [readYahooFinance(dates, symb) for symb in tickerSymbs];
  • take a peek at data
In [3]:
## display first five observations of the S&P 500
indexData[2][1:5, :]
Out[3]:

Timematr{Date}

Dimensions: (5, 6)

From: 1950-01-03, To: 1950-01-09

     idx         Open    High    Low     Close   Volume   Adj_Close
1    1950-01-03  16.66   16.66   16.66   16.66   1.26e6   16.66
2    1950-01-04  16.85   16.85   16.85   16.85   1.89e6   16.85
3    1950-01-05  16.93   16.93   16.93   16.93   2.55e6   16.93
4    1950-01-06  16.98   16.98   16.98   16.98   2.01e6   16.98
5    1950-01-09  17.08   17.08   17.08   17.08   2.52e6   17.08

Select single index

  • create lookup table to be able to select index data by name:
In [4]:
lookUpIndices = {:nasdaq => 1, :sp500 => 2, :stoxx => 3}
Out[4]:
Dict{Any,Any} with 3 entries:
  :nasdaq => 1
  :sp500  => 2
  :stoxx  => 3
  • select index of interest
In [5]:
chosenIndex = :nasdaq
Out[5]:
:nasdaq
  • extract adjusted closing prices for a single index
  • rename column to index name
In [6]:
indexPriceData = indexData[lookUpIndices[chosenIndex]][:Adj_Close]
rename!(indexPriceData.vals, :Adj_Close, chosenIndex)

indexPriceData[1:5, :]
Out[6]:

Timematr{Date}

Dimensions: (5, 1)

From: 1985-10-01, To: 1985-10-07

     idx         nasdaq
1    1985-10-01  112.14
2    1985-10-02  110.825
3    1985-10-03  110.87
4    1985-10-04  110.075
5    1985-10-07  108.2

Plot price data

  • enable plotting for TimeData instances:
In [7]:
loadPlotting()
Out[7]:
wstHist (generic function with 4 methods)
  • plot historic price data
In [8]:
gdfPlot(indexPriceData)
Out[8]:
[Figure: time series plot of NASDAQ adjusted closing prices]
  • same plot with logarithmic prices
In [9]:
gdfPlot(log(indexPriceData))
Out[9]:
[Figure: time series plot of logarithmic NASDAQ prices]

Plot returns

  • calculate discrete and logarithmic returns
In [10]:
function calcDiscRet(tm::Timematr)
    ## discrete (simple) returns in percent
    return 100*(tm[2:end, :] .- tm[1:(end-1), :])./tm[1:(end-1), :]
end

function calcLogRet(tm::Timematr)
    ## logarithmic returns in percent
    return 100*(log(tm[2:end, :]) .- log(tm[1:(end-1), :]))
end
Out[10]:
calcLogRet (generic function with 1 method)
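  • as a quick sanity check (a sketch, not part of the original notebook; variable names are illustrative): both definitions are linked by the identity $r^{log} = 100\ln(1 + r^{disc}/100)$, so the two return series should agree up to floating point error

logRetsCheck = calcLogRet(indexPriceData)
discRetsCheck = calcDiscRet(indexPriceData)

## maximum deviation from the identity linking log and discrete returns
maximum(abs(asArr(logRetsCheck, Float64) - 100*log(1 + asArr(discRetsCheck, Float64)/100)))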
In [11]:
indexRetData = calcLogRet(indexPriceData)
indexRetData[1:5, :]
Out[11]:

Timematr{Date}

Dimensions: (5, 1)

From: 1985-10-02, To: 1985-10-08

     idx         nasdaq
1    1985-10-02  -1.18
2    1985-10-03  0.041
3    1985-10-04  -0.72
4    1985-10-07  -1.718
5    1985-10-08  -0.966
  • plotting logarithmic return data: returns exhibit volatility clusters
In [12]:
gdfPlot(indexRetData)
Out[12]:
[Figure: time series plot of NASDAQ logarithmic returns; volatility clusters are clearly visible]

Logarithmic return distribution

  • extract return data without metadata
In [13]:
logRets = asArr(indexRetData, Float64)[:];
  • display histogram
In [14]:
plot(x=logRets, Geom.histogram(bincount = 100))
Out[14]:
[Figure: histogram of logarithmic returns with 100 bins]

Fit normal distribution

pdf:

$$ \begin{equation*} f(x_{i};\mu,\sigma)=\frac{1}{\sqrt{2 \pi \sigma^{2}}}\exp\left(-\frac{(x_{i}-\mu)^{2}}{2\sigma^{2}}\right) \end{equation*} $$
  • log-likelihood function
$$ \begin{align*} \ln(f(x_{i};\mu,\sigma))&=\ln\left((2\pi \sigma^{2})^{-\frac{1}{2}}\exp\left(-\frac{(x_{i}-\mu)^{2}}{2\sigma^{2}}\right)\right)\\ &=\ln\left((2\pi \sigma^{2})^{-\frac{1}{2}}\right) + \ln\left(\exp\left(-\frac{(x_{i}-\mu)^{2}}{2\sigma^{2}}\right)\right)\\ &=-\frac{1}{2}\ln\left(2\pi \sigma^{2}\right) - \left(\frac{(x_{i}-\mu)^{2}}{2\sigma^{2}}\right) \end{align*} $$
  • implementation
In [15]:
function norm_negLlh(x::Array{Float64, 1}, mu::Float64, sigma::Float64)
    ## negative log-likelihood of an iid normal sample
    llhs = -0.5*log(sigma^2*2*pi) - 0.5*(x-mu).^2/sigma^2
    return -sum(llhs)
end
Out[15]:
norm_negLlh (generic function with 1 method)
  • test negative log-likelihood with built-in function
In [16]:
normDist = Normal(0., 1.0)

@test_approx_eq -sum(log(pdf(normDist, logRets))) norm_negLlh(logRets, 0., 1.)
  • define the objective function for the optimizer
In [17]:
count = 0 # keep track of number of function evaluations

function objFunc(x::Vector, grad::Vector)
    if length(grad) > 0
        ## COBYLA is derivative-free, so the gradient is never requested
    end

    global count
    count::Int += 1
    # println("f_$count($x)")
    
    mu = x[1]
    sigma = x[2]

    return norm_negLlh(logRets, mu, sigma)
end
Out[17]:
objFunc (generic function with 1 method)
  • fit normal distribution using NLopt
In [18]:
opt = Opt(:LN_COBYLA, 2)
lower_bounds!(opt, [-Inf, 0.])
xtol_rel!(opt,1e-8)

min_objective!(opt, objFunc)

(minf,minx,ret) = optimize(opt, [0.0, 1.0])
println("got $minf at $minx after $count iterations (returned $ret)")
got 14526.376576784487 at [0.04915571949192298,1.7076314260696241] after 67 iterations (returned XTOL_REACHED)
  • compare result to built-in function
In [19]:
normFit = fit(Normal, logRets)
Out[19]:
Normal(μ=0.0491557033633773, σ=1.7076314274539048)
  • unit test
In [20]:
muHat, sigmaHat = minx

muHat_builtin, sigmaHat_builtin = params(normFit)

@test_approx_eq_eps muHat muHat_builtin 0.0000001
@test_approx_eq_eps sigmaHat sigmaHat_builtin 0.0000001

Comparison with empirical distribution

  • calculate values of empirical cdf:
In [21]:
function ecdfVals(x::Array{Float64, 1})
    ## empirical cdf values at the sorted sample points,
    ## using plotting positions i/(nObs + 1)
    xVals = sort(x)
    nObs = size(x, 1)
    yVals = [1:nObs]./(1+nObs)
    return (xVals, yVals)
end

xVals, yVals = ecdfVals(logRets);
  • plot ecdf together with cdf of fitted normal distribution:
In [22]:
# calculate values of normal cdf
nCdfVals = cdf(normFit, xVals)

plot(layer(x=xVals, y=nCdfVals, Geom.step),
layer(x=xVals, y=yVals, Geom.line, Theme(default_color=color("red"))))
Out[22]:
[Figure: cdf of the fitted normal distribution (step function) together with the empirical cdf (red line)]

QQ-plot

  • compare quantiles of fitted normal distribution with empirical quantiles
In [23]:
normVals = quantile(normFit, yVals)

plot(layer(x=xVals, y=normVals, Geom.point,Theme(default_point_size=0.6mm)),
layer(x=[minimum(xVals), maximum(xVals)], y=[minimum(xVals), maximum(xVals)], Geom.line))
Out[23]:
[Figure: QQ-plot of empirical quantiles against quantiles of the fitted normal distribution, with 45-degree reference line]
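  • beyond these visual checks, sample skewness and excess kurtosis give a quick numeric indication of non-normality (a sketch, not part of the original notebook; both statistics should be close to zero under normality)

using StatsBase

## excess kurtosis far above zero signals fat tails relative to the normal
(skewness(logRets), kurtosis(logRets))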

VaR and ES

Historical simulation

  • VaR:
In [24]:
alpha = 0.98
logLosses = -logRets

varHS = quantile(logLosses, alpha)
Out[24]:
3.8419575995586728
  • expected shortfall:
In [25]:
function empES(x::Array{Float64, 1}, alphas::Array{Float64, 1})
    ## empirical expected shortfall: mean of all losses strictly
    ## exceeding the historical simulation VaR
    nVars = length(alphas)
    varHS = quantile(x, alphas)
    esHS = Array(Float64, nVars)

    for ii=1:nVars
        exceedances = x[x .> varHS[ii]]
        esHS[ii] = mean(exceedances)
    end
    return esHS
end

empES(logLosses, [alpha])
Out[25]:
1-element Array{Float64,1}:
 5.42294

Under normality

  • VaR: using
$$ \begin{equation*} \text{VaR}_{\alpha}=\mu_{L}+\sigma\Phi^{-1}(\alpha) \end{equation*} $$
In [26]:
muL = -muHat
sigma = sigmaHat
varNorm = muL + sigma*quantile(Normal(), alpha)
Out[26]:
3.4578904615592325
  • expected shortfall: using
$$ \begin{equation*} \text{ES}_{\alpha}=\mu_{L}+\sigma \frac{\phi\left( \Phi^{-1}(\alpha) \right) }{1-\alpha} \end{equation*} $$
In [27]:
esNorm = muL + sigma*(pdf(Normal(), quantile(Normal(), alpha))/(1-alpha))
Out[27]:
4.084860801591298
In [28]:
varNorm = quantile(normFit, alpha)

(varHS, varNorm)
Out[28]:
(3.8419575995586728,3.5562018872574974)
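  • the analogous comparison for expected shortfall (a sketch using the quantities computed above; given the fat tails seen earlier, the normal model can be expected to understate the empirical tail loss):

(empES(logLosses, [alpha])[1], esNorm)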

Model risk

  • if real-world returns were indeed generated by an underlying normal distribution, we could determine the risk inherent in the investment up to a small error arising from estimation
  • however, real-world returns are not normally distributed
  • hence, beyond the risk deduced from the model, the specified model itself may deviate significantly from the real-world processes under consideration
  • the risk arising from such deviations of the model from the real world is called model risk

Backtesting

  • calculate hit rates for VaR under normality:
In [29]:
hitsNorm = sum(logLosses .> varNorm)./size(logLosses, 1)

println("Instead of $(1-alpha) exceedances, we get:")
println("   using normality:       $hitsNorm")
Instead of 0.020000000000000018 exceedances, we get:
   using normality:       0.026365348399246705
  • visualize backtesting
In [30]:
exceedInd = asArr(-indexRetData .> varNorm, Bool, false)[:]
exceedances = indexRetData[exceedInd, :]
noExceedances = indexRetData[!exceedInd, :]

plot(layer(x=idx(noExceedances), y=asArr(noExceedances, Float64), 
Geom.point, Theme(default_point_size=0.5mm)),
layer(x=idx(exceedances), y=asArr(exceedances, Float64), 
Geom.point, Theme(default_color=color("red"), default_point_size=0.5mm)))
Out[30]:
[Figure: NASDAQ returns over time; VaR-exceedances highlighted in red]
  • backtesting VaR calculations based on the assumption of independent, normally distributed losses generally reveals two patterns:

    • percentage frequencies of VaR-exceedances are higher than the specified confidence levels imply: the normal distribution assigns too little probability to large losses
    • VaR-exceedances occur in clusters: given an exceedance of VaR today, an additional exceedance on the following days is more likely than on average
  • these backtesting results indicate substantial model risk in the framework of normally distributed losses

  • clustered exceedances indicate a violation of the independence of losses over time $\Rightarrow$ clusters have to be captured through time series models; a quick numeric check of the clustering is sketched below
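  • a minimal sketch (not part of the original notebook; variable names are illustrative): compare the exceedance frequency conditional on a previous-day exceedance with the unconditional frequency, which should coincide under independence

hits = logLosses .> varNorm

hitYesterday = hits[1:(end-1)]
hitToday = hits[2:end]

## under independent losses both frequencies should be roughly equal
uncondFreq = mean(hits)
condFreq = sum(hitYesterday & hitToday)/sum(hitYesterday)

println("unconditional exceedance frequency:      $uncondFreq")
println("conditional on previous-day exceedance:  $condFreq")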
  • backtesting historical simulation VaR:
In [31]:
hitsHS = sum(logLosses .> varHS)./size(logLosses, 1)

println("Instead of $(1-alpha) exceedances, we get:")
println("   historical simulation: $hitsHS")
Instead of 0.020000000000000018 exceedances, we get:
   historical simulation: 0.02004304546677428
  • Note: in-sample exceedance frequencies for historical simulation equal the predefined level $1-\alpha$ by construction, since the VaR estimate is simply the empirical $\alpha$-quantile of the very same sample

$\Rightarrow$ overfitting: a perfect in-sample backtest says nothing about out-of-sample performance (see the sketch below)
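
  • a minimal out-of-sample sketch (not part of the original notebook; winSize is an illustrative assumption): estimate VaR on a rolling window of past losses and test it on the following day

winSize = 1000
nOut = length(logLosses) - winSize

nHits = 0
for tt = 1:nOut
    ## VaR estimated from the window ending at day tt+winSize-1,
    ## evaluated against the loss on day tt+winSize
    varWindow = quantile(logLosses[tt:(tt+winSize-1)], alpha)
    nHits += logLosses[tt+winSize] > varWindow
end

println("out-of-sample exceedance frequency: $(nHits/nOut)")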

Simulation study: estimation error

  • historical simulation does not entail any model risk
  • backtesting historical simulation always shows perfectly accurate in-sample frequencies

$\Rightarrow$ is historical simulation the best VaR estimator?

  • although historical simulation does not entail model risk, the associated estimation risk can be huge
  • for example, with 2500 observations, on average only 2.5 observations lie above $VaR_{0.999}$ (see the arithmetic sketched below)
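
  • the arithmetic behind this claim (a sketch, not part of the original notebook; names are illustrative): the expected number of sample points beyond $VaR_{\alpha}$ is $n(1-\alpha)$

nObsExample = 2500
alphasExample = [0.99, 0.995, 0.999]

## expected tail observations: 25.0, 12.5 and 2.5
nObsExample*(1 - alphasExample)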

Simulation study:

  • take some return distribution as given $\Rightarrow$ VaR is known
  • simulate sample of given size
  • estimate VaR for given sample with historical simulation
  • repeat procedure to get distribution of historical simulation estimator
In [32]:
distr = TDist(3)
alphas = [0.99, 0.995, 0.999]

nReps = 100000
nObs = 2500

# preallocation
simVals = Array(Float64, nObs)

simVar = Array(Float64, nReps, length(alphas))

@time begin
    for ii=1:nReps
        simVals = rand(distr, nObs)
        simVar[ii, :] = quantile(simVals, alphas)
    end
end
elapsed time: 56.254995155 seconds (5627688420 bytes allocated, 14.34% gc time)
In [33]:
ps = [plot(x=simVar[:, ii][:], Geom.histogram(bincount = 100)) for ii=1:length(alphas)]

vstack(ps...)
Out[33]:
[Figure: histograms (100 bins) of the simulated historical simulation VaR estimates, one panel per confidence level (0.99, 0.995, 0.999)]
  • given the simulated distribution of the historical simulation estimator, we can evaluate the mean squared error as a measure of deviation from the true values
In [34]:
trueVars = quantile(distr, alphas)

mserrors = Float64[mean((simVar[:, ii] - trueVars[ii]).^2) for ii=1:length(alphas)]
Out[34]:
3-element Array{Float64,1}:
 0.112282
 0.334593
 4.19842 
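
  • to compare the three confidence levels on a common scale (a sketch, not part of the original notebook), the root mean squared errors can be related to the true quantiles:

## RMSE relative to the true quantile at each confidence level
sqrt(mserrors)./trueVars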

Reproduction info

In [35]:
versioninfo()
Julia Version 0.3.6
Commit a05f87b* (2015-01-08 22:33 UTC)
Platform Info:
  System: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i5-4210U CPU @ 1.70GHz
  WORD_SIZE: 64
  BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3
In [36]:
Pkg.status()
19 required packages:
 - DataArrays                    0.2.13
 - DataFrames                    0.6.3
 - Dates                         0.3.2
 - Debug                         0.1.2
 - Distributions                 0.7.0
 - EconDatasets                  0.0.2
 - GLM                           0.4.5
 - Gadfly                        0.3.11
 - IJulia                        0.2.3
 - JuMP                          0.9.0
 - MAT                           0.2.11
 - NLopt                         0.2.0
 - Plotly                        0.0.3+             master
 - Quandl                        0.4.1
 - RDatasets                     0.1.2
 - Taro                          0.1.4
 - TimeData                      0.5.1
 - TimeSeries                    0.5.7
 - Winston                       0.11.9
57 additional packages:
 - ArrayViews                    0.6.1
 - BinDeps                       0.3.11
 - Blosc                         0.1.2
 - Cairo                         0.2.26
 - Calculus                      0.1.6
 - Codecs                        0.1.3
 - Color                         0.4.4
 - Compat                        0.4.1
 - Compose                       0.3.11
 - Contour                       0.0.6
 - Copulas                       0.0.0-             master (unregistered)
 - DataStructures                0.3.8
 - Distances                     0.2.0
 - Docile                        0.4.10
 - DualNumbers                   0.1.3
 - Econometrics                  0.0.0-             master (unregistered)
 - FixedPointNumbers             0.0.7
 - GZip                          0.2.15
 - GnuTLS                        0.0.4
 - Graphics                      0.1.0
 - Grid                          0.3.8
 - Gumbo                         0.1.2
 - HDF5                          0.4.13
 - HTTPClient                    0.1.4
 - Hexagons                      0.0.2
 - HttpCommon                    0.0.12
 - HttpParser                    0.0.11
 - ImmutableArrays               0.0.6
 - IniFile                       0.2.4
 - Iterators                     0.1.7
 - JSON                          0.4.2
 - JavaCall                      0.2.2
 - KernelDensity                 0.1.0              b2c9f7d6 (dirty)
 - LibCURL                       0.1.4
 - Loess                         0.0.3
 - MathProgBase                  0.3.11
 - Memoize                       0.0.0
 - NaNMath                       0.0.2
 - Nettle                        0.1.8
 - NumericFuns                   0.2.3
 - Optim                         0.4.1
 - PDMats                        0.3.3
 - REPLCompletions               0.0.3
 - Reexport                      0.0.2
 - Requests                      0.0.8
 - Requires                      0.1.1
 - ReverseDiffSparse             0.2.7
 - SHA                           0.0.4
 - Showoff                       0.0.4
 - SortingAlgorithms             0.0.4
 - StatsBase                     0.6.14
 - Tk                            0.3.1
 - URIParser                     0.0.5
 - WoodburyMatrices              0.0.1
 - WorldBankDataTd               0.0.0-             master (unregistered)
 - ZMQ                           0.1.17
 - Zlib                          0.1.8