\documentclass[article, nojss]{jss}

\author{Ostap Okhrin \\Technische Universit\"{a}t Dresden \And Alexander Ristig}
\title{Hierarchical Archimedean Copulae: The \pkg{HAC} Package}
\Plainauthor{Ostap Okhrin, Alexander Ristig}
\Plaintitle{Hierarchical Archimedean Copulae: The HAC Package}

\Abstract{ This paper presents the \proglang{R} package \pkg{HAC}, which provides user-friendly methods for dealing with hierarchical Archimedean copulae (HAC). Computationally efficient estimation procedures allow the structure and the parameters of HAC to be recovered from data. In addition, arbitrary HAC can be constructed to sample random vectors and to compute the values of the corresponding cumulative distribution and density functions. Accurate graphics of the HAC structure can be produced by the \code{plot} method implemented for these objects. }

\Keywords{copula, \proglang{R}, hierarchical Archimedean copula, HAC}
\Plainkeywords{copula, R, hierarchical Archimedean copula, HAC}

\Address{ Ostap Okhrin\\ Chair of Econometrics and Statistics esp.
Transportation\\ Institute of Economics and Transport\\ Faculty of Transportation\\ Technische Universit\"{a}t Dresden\\ 01069 Dresden, Germany\\ E-mail: \email{ostap.okhrin@tu-dresden.de}\\ URL: \url{https://tu-dresden.de/bu/verkehr/ivw/osv/die-professur/inhaber-in} }

\usepackage{enumerate}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{multirow}
\usepackage{amsthm}
\usepackage{rotating}
\newtheorem{algorithm}{Algorithm}
\usepackage[english, ngerman]{babel}
\selectlanguage{english}
\newsavebox\verbboxone
\newsavebox\verbboxtwo

%\VignetteIndexEntry{HAC}
%\VignetteDepends{HAC}
%\VignetteDepends{copula}

\begin{document}

<<>>=
options(prompt = "R> ", continue = "+ ", width = 70, useFancyQuotes = FALSE)
library("HAC")
@

\section[Introduction]{Introduction}\label{sec:Intro}

The use of copulae in applied statistics began at the end of the 1990s, when \citet{embrechts_mcneil_straumann_1999} introduced copulae to empirical finance in the context of risk management. Nowadays, quantitatively oriented sciences such as biostatistics and hydrology use copulae to measure the dependence between random variables, e.g., \citet{lakhal_2010, acar_craiu_yao_2011, bardossy_2006, genest_favre_2007, bardossy_li_2008}. In finance, copulae have become a standard tool, in particular for value-at-risk ($\operatorname{VaR}$) measurement and for the valuation of structured credit portfolios, see \citet{mendes_souza_2004, junker_may_2005} and \citet{li_2000}. This paper aims to provide academics and practitioners with the tools necessary for a simple and effective use of hierarchical Archimedean copulae (HAC) in their statistical analysis.

A copula is the function splitting a multivariate distribution into its margins and a pure dependency component.
Formally, copulae are introduced in \citet{sklar_1959}, who states that if $F$ is an arbitrary $d$-dimensional continuous distribution function of the random vector $X=(X_1,\dots,X_d)^{\top}$, then the associated copula is unique and defined as the continuous mapping $C:[0,1]^d\rightarrow[0,1]$ which satisfies the equality
\begin{align*}
C(u_1,\dots,u_d)&=F\{F_1^{-1}(u_1),\dots,F_d^{-1}(u_d)\},\quad u_1,\ldots,u_d\in[0,1],
\end{align*}
where $F_1^{-1}(\cdot),\ldots,F_d^{-1}(\cdot)$ are the quantile functions of the corresponding continuous marginal distribution functions $F_1(x_1),\ldots,F_d(x_d)$. Accordingly, a $d$-dimensional density $f(\cdot)$ can be split into the copula density $c(\cdot)$ and the product of the marginal densities. For an overview of copulae and recent developments, we refer to \citet{nelsen_2006}, \citet{cherubini_luciano_vecchiato_2004}, \citet{joe_1997} and \citet{jaworski_durante_haerdle_2013}.

If $F(\cdot)$ belongs to the class of elliptical distributions, then $C(\cdot)$ is an elliptical copula, which in most cases cannot be given explicitly because the distribution function $F(\cdot)$ and the inverse marginal distributions $F_j^{-1}(\cdot)$ usually have integral representations. One of the classes that overcomes this drawback of elliptical copulae is the class of Archimedean copulae, which, however, is very restrictive even for moderate dimensions. Among other \proglang{R} \citep{R} packages dealing with Archimedean copulae \citep[see for example][]{task}, we would like to mention the \pkg{copula} and the \pkg{fCopulae} packages, cf.~\cite{yan_2007, kojadinovic_yan_2010, hofert_maechler_2011, copula} and \cite{fCopulae}.

HAC generalize the concept of simple Archimedean copulae by replacing one or more marginal distributions with a further HAC. This class is thoroughly analyzed in \citet{embrechts_lindskog_mcneil_2003, whelan_2004, savu_trede_2010, hofert_11, okhrin_okhrin_schmid_2013b}.
The first sampling algorithms for special HAC structures were provided by the \pkg{QRMlib} package of \citet{QRMlib_2011}, which is no longer maintained, although several of its functions were ported to the \pkg{QRM} package \citep[see][]{QRM_2012}. \citet{nacopula} presented the comprehensive \pkg{nacopula} package which, among other features, allows sampling from arbitrary HAC and was integrated into the package \pkg{copula} from version 0.8-1 on.

The central contribution of the \pkg{HAC} package \citep{hac} is the estimation of the parameters and the structure for this class of copulae, as discussed in \citet{okhrin_okhrin_schmid_2013a}, including a simple and intuitive representation of HAC as \proglang{R} objects of the class `\code{hac}'. We provide several maximum likelihood (ML) based estimation procedures, which determine the parameters and the structure simultaneously, as well as procedures that estimate only the parameters for a predetermined structure. The asymptotic properties of the procedures are rigorously studied in the literature, e.g., see \citet{okhrin_okhrin_schmid_2013a} and \citet{gorecki_hofert_holena_2014}. In addition, the package offers functions for producing graphics of the copula's structure, for sampling random vectors from a given copula and for computing values of the corresponding distribution and density. An earlier version of this paper, for \pkg{HAC} 1.0-0, was published as \citet{okhrin_ristig_2014}.

The paper is organized as follows. Section~\ref{sec:PHAC} briefly describes the theoretical aspects of HAC and their estimation. Section~\ref{sec:AHAC} presents the functions of the \pkg{HAC} package and Section~\ref{sec:sim} a simulation study. Section~\ref{sec:Con} concludes.

\section[Hierarchical Archimedean copulae]{Hierarchical Archimedean copulae}\label{sec:PHAC}

As mentioned above, the large class of copulae which can describe tail dependency and non-ellipticity and, most importantly, has a closed-form representation
\begin{align}\label{eqn:m_arch}
C(u_1,\ldots,u_d; \theta)&=\phi_{\theta}\left\{\phi_{\theta}^{-1}(u_1)+\dots+\phi_{\theta}^{-1}(u_d)\right\},\quad u_1,\ldots,u_d\in[0,1],
\end{align}
where $\phi_{\theta}(\cdot)\in\mathfrak{L} = \{\phi_{\theta}:[0,\infty)\rightarrow [0,1]\,|\, \phi_{\theta}(0)=1,\,\phi_{\theta}(\infty)=0;\,(-1)^j\phi_{\theta}^{(j)}\geq0;\,j \in \mathbb{N} \}$ and $(-1)^{j}\phi_{\theta}^{(j)}(x)$ being non-increasing and convex on $[0,\infty)$, for $x>0$, is the class of Archimedean copulae. The function $\phi_{\theta}(\cdot)$ is called the generator of the copula and commonly depends on a single parameter $\theta$. For example, the Gumbel generator is given by $\phi_{\theta}(x) = \exp(-x^{1/\theta})$ for $0\leq x<\infty,\ 1\leq \theta< \infty$. Detailed reviews of the properties of Archimedean copulae can be found in \citet{mcneil_neslehova_2008} and in \citet{joe_1997}.

A disadvantage of Archimedean copulae is that the multivariate dependency structure is very restricted, since it typically depends on a single parameter of the generator function $\phi_{\theta}(\cdot)$. Moreover, the rendered dependency is symmetric with respect to the permutation of variables, i.e., the distribution is exchangeable. HAC (also called nested Archimedean copulae) overcome this problem by considering compositions of simple Archimedean copulae.
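
As a concrete instance of Equation~\ref{eqn:m_arch}, consider the Gumbel generator given above in dimension $d=2$; this is a standard textbook illustration and not output of the \pkg{HAC} package. Inverting $\phi_{\theta}(x)=\exp(-x^{1/\theta})$ gives $\phi_{\theta}^{-1}(u)=(-\log u)^{\theta}$, so that
\begin{align*}
C(u_1,u_2;\theta)&=\exp\left[-\left\{(-\log u_1)^{\theta}+(-\log u_2)^{\theta}\right\}^{1/\theta}\right],\quad u_1,u_2\in(0,1].
\end{align*}
For $\theta=1$ this reduces to the independence copula $C(u_1,u_2;1)=u_1u_2$, while for $\theta=2$ one obtains, e.g., $C(0.5,0.5;2)=\exp(-\sqrt{2}\log 2)=2^{-\sqrt{2}}\approx 0.375$, which lies between the independence value $0.25$ and the upper bound $\min(u_1,u_2)=0.5$. The strength of the dependence between all pairs of variables is thus governed by the single scalar $\theta$.
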
For example, the special case of a four-dimensional fully nested HAC is given by
\begin{align}\label{eqn:fully_nested_a}
C(u_1, u_2, u_3, u_4) &= C_3\{C_2(u_1, u_2, u_3), u_4\}\\
&= \phi_3\{\phi^{-1}_3\circ C_2(u_1,u_2,u_3)+\phi^{-1}_3(u_4)\},\nonumber
\end{align}
where $C_j(u_1, \ldots, u_{j+1})=\phi_j[\phi^{-1}_j \{C_{j-1}(u_1,\ldots,u_{j})\}+\phi^{-1}_j(u_{j+1})]$, $j=2,\ldots,d-1$, and $C_1(u_1,u_2) = \phi_1\{\phi^{-1}_1(u_1)+\phi^{-1}_1(u_2)\}$. The functional form of $C_j(\cdot)$ indicates that the composition can be applied recursively. A different segmentation of the variables leads naturally to more complex HAC.

In the following, let a $d$-dimensional HAC be denoted by $C(u_1,\dots,u_d; s,\pmb{\theta})$, where $\pmb{\theta}$ denotes the vector of feasible dependency parameters and $s=(\ldots(i_g i_k)i_{\ell}\ldots)$ the structure of the entire HAC, where $i_m\in\{1,\ldots,d\}$ is a reordering of the indices of the variables with $m=1,\ldots,d$, and $g,k,\ell\in\{1,\ldots,d: g\neq k\neq\ell\}$. Structures of subcopulae are denoted by $s_j$ with $s=s_{d-1}$. For instance, the structure according to Equation~\ref{eqn:fully_nested_a} is $s=(s_{2}4)$ with $s_j=(s_{j-1}(j+1))$, $j=2,3$, for the subcopulae and $s_1=(12)$. A clear definition of the structure is essential, as $s$ is in fact a parameter to be estimated. Thus, Equation~\ref{eqn:fully_nested_a} can be rewritten as
\begin{align*}
C(u_1, u_2, u_3, u_4; s=(((12)3)4), \pmb{\theta}) &= C\{u_1, u_2, u_3, u_4; (s_{2}4),(\theta_1, \theta_2, \theta_3)^\top\}\\
&= \phi_{\theta_{3}}(\phi^{-1}_{\theta_{3}}\circ C_2\{u_1, u_2, u_3; (s_{1}(3)), (\theta_1,\theta_{2})^\top\}+\phi^{-1}_{\theta_{3}}(u_4)).
\end{align*}
Figure~\ref{fig:4dimHAC} presents four-dimensional fully and partially nested Archimedean copulae.
<