Wednesday, April 13, 2016

copula

I've heard of copulas in bivariate statistics many times, but I rarely understood what people were actually calculating when they talked about "fitting a copula". Hopefully I had already picked up a basic understanding of empirical distributions before I finished the copula homework. I've now done the "quantiles" part (see the last log), which I assume will be enough, maybe... My goal is to understand how copulas work, by which I mean the principles, and then apply them to the precipitation-temperature case, which I've read some articles about.

First things first: what do we use a copula for, and what is a copula function?
A copula describes how two marginal distributions are linked together to form the joint distribution, i.e. the dependence structure between the variables. The idea is to separate fitting the copula from fitting the univariate (marginal) distributions. In other words, a copula is used to construct a multivariate probability distribution from univariate ones.
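According to Sklar's theorem, there exists a "copula" function C(u, v) such that
Fxy(x, y) = C[Fx(x), Fy(y)] = C(u, v), where u = Fx(x) and v = Fy(y)
1. C(u, v) is a 2D function defined on [0, 1]^2, because the marginal CDFs Fx and Fy take values in [0, 1].
2. C(u, v) doesn't depend on the marginal distributions: it captures only the dependence between X and Y (and it is unique when the marginals are continuous).
3. C(u, v) is itself a joint CDF, with the marginal distributions rescaled to uniform distributions on [0, 1]: u = Fx(X), v = Fy(Y).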
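
To convince myself of points 2 and 3, here is a minimal sketch using NumPy and SciPy. The data are synthetic, and the helper names pseudo_obs and empirical_copula are my own, not from any library. It rescales each variable to (0, 1) with its empirical CDF (ranks), then checks that applying strictly increasing transforms to the marginals leaves u and v unchanged, so the copula is untouched.

```python
import numpy as np
from scipy.stats import rankdata

def pseudo_obs(x):
    """Rescale a sample to (0, 1) with its empirical CDF: rank / (n + 1)."""
    x = np.asarray(x, dtype=float)
    return rankdata(x) / (len(x) + 1)

def empirical_copula(u, v, a, b):
    """Fraction of points with u <= a and v <= b, an estimate of C(a, b)."""
    return np.mean((u <= a) & (v <= b))

rng = np.random.default_rng(0)

# Synthetic correlated sample (illustrative data only)
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.7], [0.7, 1.0]], size=2000)
x, y = z[:, 0], z[:, 1]

u, v = pseudo_obs(x), pseudo_obs(y)

# Strictly increasing transforms change the marginals but not the ranks
u2, v2 = pseudo_obs(np.exp(x)), pseudo_obs(y ** 3)

print("same u, v after transforming the marginals:",
      np.allclose(u, u2), np.allclose(v, v2))
print("empirical C(0.5, 0.5):", empirical_copula(u, v, 0.5, 0.5))
```

If X and Y were independent, C(0.5, 0.5) would be 0.5 * 0.5 = 0.25; a larger value here reflects the positive dependence built into the synthetic sample.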

What is the procedure for fitting a copula distribution? (Gao et al., 2007)
1. Fit a marginal distribution to each variable using some type of parametric univariate distribution: Fx(X), Fy(Y)
2. Fit the copula: C(Fx(X), Fy(Y))
3. Sample from the copula using Monte Carlo simulation
3.1 unconditional:
      (1) generate random samples of u uniformly from [0, 1]
      (2) generate v|u using the inverse conditional CDF Cv|u^(-1), evaluated at a second, independent uniform random number w
      (3) generate x, y by transforming back with the inverse marginal CDFs: x = Fx^(-1)(u), y = Fy^(-1)(v)
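
The unconditional sampling in 3.1 can be sketched as follows. Everything specific here is my own assumption for illustration: I picked a Clayton copula because its inverse conditional CDF has a closed form, and the gamma and normal marginals with their parameters are hypothetical stand-ins, not fitted to any data. Step (2) needs a second independent uniform draw, called w here, which is plugged into the inverse conditional CDF.

```python
import numpy as np
from scipy import stats

def sample_clayton(n, theta, rng):
    """Draw n pairs (u, v) from a Clayton copula with parameter theta > 0."""
    u = rng.uniform(size=n)  # step (1): u ~ Uniform(0, 1)
    w = rng.uniform(size=n)  # independent uniform fed to the inverse conditional CDF
    # step (2): v = C_{v|u}^(-1)(w | u), closed form for the Clayton copula
    v = (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u, v

rng = np.random.default_rng(42)
theta = 2.0  # assumed dependence parameter
u, v = sample_clayton(5000, theta, rng)

# step (3): back-transform with the inverse marginal CDFs (hypothetical marginals)
fx = stats.gamma(a=2.0, scale=3.0)     # stand-in marginal for x
fy = stats.norm(loc=15.0, scale=5.0)   # stand-in marginal for y
x = fx.ppf(u)
y = fy.ppf(v)

# For a Clayton copula, Kendall's tau should be close to theta / (theta + 2)
tau, _ = stats.kendalltau(x, y)
print("sample tau:", tau, "theoretical tau:", theta / (theta + 2.0))
```

Kendall's tau depends only on ranks, so it is the same for (x, y) as for (u, v); comparing it with theta / (theta + 2) is a quick sanity check that the sampler reproduces the intended dependence.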
3.2 conditional: