Skip to contents

Gradient descent based on the Clusterpath algorithm adapted to the likelihood of a graphical Hüsler-Reiss model. Usefull when the precision matrix \(\Theta\) (or the variogram \(\Gamma\)) has a block matrix structure.

Usage

get_cluster(gamma, weights, eps_f, ...)

HR_Clusterpath(
  data,
  zeta,
  lambda,
  p = NULL,
  eps_g = 0.001,
  eps_f = 0.01,
  it_max = 1000
)

Arguments

gamma

For get_clusters(), a \(d \times d\) matrix : the variogram matrix \(\Gamma\).

weights

For get_clusters(), the \(d \times d\) symmetric weightsmatrix with a zero diagonal.

eps_f

A positive number : tolerance threshold for merging clusters.

data

For HR_Clusterapth(), a \(n \times d\) matrix : the matrix of data.

zeta

For HR_Clusterapth(), a positive number : tuned parameter for the exponential weights.

lambda

For get_clusters(), a positive number : the weight of the penalty.

For HR_Clusterpath(), a numerix vector of positive number : the grid line for \(\lambda\).

p

For HR_Clusterapth(), a numeric between 0 and 1, or NULL. If NULL (default), it is assumed that the data are already on multivariate Pareto scale. Else, p is used as the probability in the function data2mpareto() to standardize the data (see graphicalExtremes documentation).

eps_g

A positive number : tolerance threshold for the convergence of the gradient descent.

it_max

An integer : maximal number of iteration of the gradient descent algorithm.

Value

HR_Clusterpath() is a gradient descent from the data with default values and method for the optimization : the variogram matrix \(\Gamma\) is the empirical variogram and the weights are set as the exponential weights, which depends of only one tuning parameter \(\zeta\) and where we have some theoretical results.

get_clusters() is a freer version of the previous function. You can use your own estimation for the variogram \(\Gamma\) and customizable weights with the matrix \(W\).

Both of them produce a list of results :

  • $R : the \(R\) matrix of the clusters.

  • $clusters : a list of the variable indices, clustered.

  • $nllh : the value of the negative penalised negative loglikelihood.

  • $lambda : the value of lambda.

  • $message : a message about the optimization results.

In the case of HR_Clusterpath() it is a list of the previous list.

Details

The block matrix models is defined from two elements : the cluster's partition \(\{C_1, \dots, C_K\}\), included in \(V\) and the \(R\) matrix which belongs to \(\mathcal S_K(\mathbb R)\), the set of symmetric \(K \times K\) matrix (See also build_theta() and extract_R_matrix()).

The Clusterpath aims to find optimum of some penalised negative loglikelihood defined in neg_likelihood_pen().

Some results for HR_Clusterpath()

When we get replications \(X_1, \dots, X_n\) of a random vector \(X\) which belongs to the attraction domain of a \(K\)-block Hüsler-Reiss graphical model, the minimum of the penalised negative loglikelihood conveges almost surely to the true precision matrix of the model \(\Theta^*\), for all \(\lambda\), provided that we choose a sequence \((\zeta_n)_{n\in \mathbb N^*}\) which grows slower than the \(log(n)\) sequence. In general, the \(\zeta \in [1,2]\) is a good choice.

Examples

############################################################################
#                            With get_clusters
############################################################################
# Customizable weights
W <- matrix(c(0, 1, 1, 1,
              1, 0, 1, 1,
              1, 1, 0, 1,
              1, 1, 1, 0), nc = 4)

# Free choice of variogram
gamma <- graphicalExtremes::generate_random_Gamma(d = 4)

# Choice of initial condition for the optimization
R <- matrix(c(1, 0, 0, -1,
              0, 1, 1, -2,
              0, 1, 1, -1,
              -1, -2, -1, 1), nc = 4)
lambda <- 2.2

Cluster_HR <- get_cluster(gamma, W, 100)

Cluster_HR(R, 2.2)
#> Error in U %*% R: non-conformable arguments

############################################################################
#                            With HR_Clusterpath
############################################################################
# Construction of clusters and R matrix for simulation
R <- matrix(c(1, -3, 0,
              -3, 2, -2,
              0, -2, 1), nc = 3)
clusters <- list(1:5, 6:10, 11:15)

# Construction of induced theta and corresponding variogram gamma
Theta <- build_theta(R, clusters)
Gamma <- graphicalExtremes::Theta2Gamma(Theta)

gr3_bal_sim_param_cluster <-
  list(
    R = R,
    clusters = clusters,
    Theta = Theta,
    Gamma = Gamma,
    chi = 1,
    n = 1e3,
    d = 15
  )

# Simulation of the data
set.seed(804)
data <- graphicalExtremes::rmpareto(n = gr3_bal_sim_param_cluster$n,
                                    model = "HR",
                                    par = gr3_bal_sim_param_cluster$Gamma)

# Optimization with Clusterpath algorithm with empirical variogram and exponential weights
lambda <- c(0, 0.1, 0.5, 1, 2)

HR_Clusterpath(data = data,
               zeta = gr3_bal_sim_param_cluster$chi,
               lambda = lambda,
               eps_f = 1e-1)
#> [[1]]
#> [[1]]$R
#>              [,1]        [,2]        [,3]        [,4]        [,5]      [,6]
#>  [1,] 11.62569700  0.67369437  0.14917425  0.79782350  0.81549832 -2.792292
#>  [2,]  0.67369437 10.53787951  1.05168936  1.42552262  1.30727908 -2.422037
#>  [3,]  0.14917425  1.05168936 11.22827891  1.43595090  0.79434028 -3.261501
#>  [4,]  0.79782350  1.42552262  1.43595090 11.32735995  0.88472654 -2.812041
#>  [5,]  0.81549832  1.30727908  0.79434028  0.88472654 10.20111555 -3.095132
#>  [6,] -2.79229153 -2.42203669 -3.26150083 -2.81204104 -3.09513159 17.386299
#>  [7,] -3.46255518 -2.91051426 -3.18341446 -3.42025788 -2.18772269  1.519077
#>  [8,] -2.06212105 -3.43386045 -2.28495826 -1.95434999 -3.35961874  1.361253
#>  [9,] -3.58274945 -2.74227664 -2.63255032 -2.93335439 -2.43851909  1.914010
#> [10,] -1.89746317 -2.32967878 -3.03570771 -3.88542699 -2.68035986  1.838097
#> [11,] -0.23128024 -0.62256783 -0.10789576 -0.61237958  0.02556595 -2.255061
#> [12,] -0.04118799 -0.28309068 -0.03209155  0.02202898 -0.08742993 -1.676326
#> [13,] -0.27531206  0.07202651  0.43777054 -0.11127959 -0.06601219 -1.981196
#> [14,] -0.15728786 -0.04916659 -0.42175823  0.18720272  0.20268767 -1.799423
#> [15,]  0.44036110 -0.27489952 -0.13732712 -0.35152576 -0.31641930 -1.923729
#>             [,7]       [,8]      [,9]     [,10]       [,11]       [,12]
#>  [1,] -3.4625552 -2.0621211 -3.582749 -1.897463 -0.23128024 -0.04118799
#>  [2,] -2.9105143 -3.4338605 -2.742277 -2.329679 -0.62256783 -0.28309068
#>  [3,] -3.1834145 -2.2849583 -2.632550 -3.035708 -0.10789576 -0.03209155
#>  [4,] -3.4202579 -1.9543500 -2.933354 -3.885427 -0.61237958  0.02202898
#>  [5,] -2.1877227 -3.3596187 -2.438519 -2.680360  0.02556595 -0.08742993
#>  [6,]  1.5190774  1.3612527  1.914010  1.838097 -2.25506097 -1.67632609
#>  [7,] 17.8097863  0.9736359  2.177892  1.908403 -1.23845305 -2.33417631
#>  [8,]  0.9736359 16.8615867  1.378395  1.442956 -1.47371448 -2.21745767
#>  [9,]  2.1778924  1.3783949 17.162554  1.861684 -1.88155578 -2.11439936
#> [10,]  1.9084033  1.4429559  1.861684 16.661355 -1.85037578 -2.46393406
#> [11,] -1.2384531 -1.4737145 -1.881556 -1.850376  6.12434446  1.25874961
#> [12,] -2.3341763 -2.2174577 -2.114399 -2.463934  1.25874961  6.55382058
#> [13,] -1.5962204 -1.8417352 -2.249246 -1.825178  0.84281588  0.83285285
#> [14,] -2.0294652 -1.7612351 -2.047525 -1.703477  0.78110014  1.22367932
#> [15,] -2.0260159 -1.6287752 -1.872359 -2.040894  1.24070742  1.35896229
#>             [,13]       [,14]      [,15]
#>  [1,] -0.27531206 -0.15728786  0.4403611
#>  [2,]  0.07202651 -0.04916659 -0.2748995
#>  [3,]  0.43777054 -0.42175823 -0.1373271
#>  [4,] -0.11127959  0.18720272 -0.3515258
#>  [5,] -0.06601219  0.20268767 -0.3164193
#>  [6,] -1.98119607 -1.79942282 -1.9237286
#>  [7,] -1.59622037 -2.02946515 -2.0260159
#>  [8,] -1.84173523 -1.76123511 -1.6287752
#>  [9,] -2.24924619 -2.04752523 -1.8723595
#> [10,] -1.82517818 -1.70347677 -2.0408942
#> [11,]  0.84281588  0.78110014  1.2407074
#> [12,]  0.83285285  1.22367932  1.3589623
#> [13,]  6.13662851  0.91971361  0.7043720
#> [14,]  0.91971361  5.90209604  0.7528583
#> [15,]  0.70437198  0.75285827  6.0746840
#> 
#> [[1]]$clusters
#> [[1]]$clusters[[1]]
#> [1] 1
#> 
#> [[1]]$clusters[[2]]
#> [1] 2
#> 
#> [[1]]$clusters[[3]]
#> [1] 3
#> 
#> [[1]]$clusters[[4]]
#> [1] 4
#> 
#> [[1]]$clusters[[5]]
#> [1] 5
#> 
#> [[1]]$clusters[[6]]
#> [1] 6
#> 
#> [[1]]$clusters[[7]]
#> [1] 7
#> 
#> [[1]]$clusters[[8]]
#> [1] 8
#> 
#> [[1]]$clusters[[9]]
#> [1] 9
#> 
#> [[1]]$clusters[[10]]
#> [1] 10
#> 
#> [[1]]$clusters[[11]]
#> [1] 11
#> 
#> [[1]]$clusters[[12]]
#> [1] 12
#> 
#> [[1]]$clusters[[13]]
#> [1] 13
#> 
#> [[1]]$clusters[[14]]
#> [1] 14
#> 
#> [[1]]$clusters[[15]]
#> [1] 15
#> 
#> 
#> [[1]]$nllh
#> [1] -18.74698
#> 
#> [[1]]$lambda
#> [1] 0
#> 
#> [[1]]$message
#> [1] "Initial guess for null lambda."
#> 
#> 
#> [[2]]
#> [[2]]$R
#>              [,1]       [,2]        [,3]       [,4]      [,5]      [,6]
#>  [1,] 11.62569700  0.7457596  0.40203639  0.7394198 -2.891927 -3.196674
#>  [2,]  0.74575960  1.2600307  0.99832528  1.1216766 -2.726927 -2.667609
#>  [3,]  0.40203639  0.9983253 11.22827891  1.3281153 -3.012906 -2.963637
#>  [4,]  0.73941985  1.1216766  1.32811532 11.3273600 -3.016338 -3.059981
#>  [5,] -2.89192673 -2.7269266 -3.01290630 -3.0163376 17.386299  1.824530
#>  [6,] -3.19667448 -2.6676091 -2.96363725 -3.0599811  1.824530  1.988305
#>  [7,] -2.14802383 -3.2150179 -2.39972033 -2.1145902  1.334903  1.263783
#>  [8,] -2.27285771 -2.6083130 -3.01611967 -3.4560025  1.844892  1.872678
#>  [9,] -0.06893046 -0.1067248 -0.09171188 -0.1404846 -1.931374 -1.931758
#>            [,7]      [,8]        [,9]
#>  [1,] -2.148024 -2.272858 -0.06893046
#>  [2,] -3.215018 -2.608313 -0.10672483
#>  [3,] -2.399720 -3.016120 -0.09171188
#>  [4,] -2.114590 -3.456002 -0.14048459
#>  [5,]  1.334903  1.844892 -1.93137369
#>  [6,]  1.263783  1.872678 -1.93175793
#>  [7,] 16.861587  1.426461 -1.78302008
#>  [8,]  1.426461 16.661355 -1.92776632
#>  [9,] -1.783020 -1.927766  0.99906912
#> 
#> [[2]]$clusters
#> [[2]]$clusters[[1]]
#> [1] 1
#> 
#> [[2]]$clusters[[2]]
#> [1] 2 5
#> 
#> [[2]]$clusters[[3]]
#> [1] 3
#> 
#> [[2]]$clusters[[4]]
#> [1] 4
#> 
#> [[2]]$clusters[[5]]
#> [1] 6
#> 
#> [[2]]$clusters[[6]]
#> [1] 7 9
#> 
#> [[2]]$clusters[[7]]
#> [1] 8
#> 
#> [[2]]$clusters[[8]]
#> [1] 10
#> 
#> [[2]]$clusters[[9]]
#> [1] 11 12 13 14 15
#> 
#> 
#> [[2]]$nllh
#> [1] -18.6258
#> 
#> [[2]]$lambda
#> [1] 0.1
#> 
#> [[2]]$message
#> NULL
#> 
#> 
#> [[3]]
#> [[3]]$R
#>            [,1]       [,2]       [,3]      [,4]      [,5]       [,6]
#> [1,] 11.6256970  0.8385883  0.7209617 -2.870505 -2.379425 -0.1184871
#> [2,]  0.8385883  1.2068774  1.0353637 -2.778825 -2.825443 -0.1034719
#> [3,]  0.7209617  1.0353637  1.0749056 -2.880245 -2.564407 -0.1325878
#> [4,] -2.8705052 -2.7788255 -2.8802455  1.762949  1.445556 -1.9056921
#> [5,] -2.3794245 -2.8254432 -2.5644066  1.445556 16.861587 -1.8442872
#> [6,] -0.1184871 -0.1034719 -0.1325878 -1.905692 -1.844287  0.9999403
#> 
#> [[3]]$clusters
#> [[3]]$clusters[[1]]
#> [1] 1
#> 
#> [[3]]$clusters[[2]]
#> [1] 2 5
#> 
#> [[3]]$clusters[[3]]
#> [1] 3 4
#> 
#> [[3]]$clusters[[4]]
#> [1]  6  7  9 10
#> 
#> [[3]]$clusters[[5]]
#> [1] 8
#> 
#> [[3]]$clusters[[6]]
#> [1] 11 12 13 14 15
#> 
#> 
#> [[3]]$nllh
#> [1] -18.62144
#> 
#> [[3]]$lambda
#> [1] 0.5
#> 
#> [[3]]$message
#> NULL
#> 
#> 
#> [[4]]
#> [[4]]$R
#>            [,1]       [,2]      [,3]      [,4]       [,5]
#> [1,] 11.6256970  0.8376860 -2.846009 -2.471630 -0.1356053
#> [2,]  0.8376860  1.0158239 -2.801814 -2.689447 -0.1224851
#> [3,] -2.8460095 -2.8018145  1.706345  1.483118 -1.9023560
#> [4,] -2.4716297 -2.6894469  1.483118 16.861587 -1.8596616
#> [5,] -0.1356053 -0.1224851 -1.902356 -1.859662  0.9894070
#> 
#> [[4]]$clusters
#> [[4]]$clusters[[1]]
#> [1] 1
#> 
#> [[4]]$clusters[[2]]
#> [1] 2 5 3 4
#> 
#> [[4]]$clusters[[3]]
#> [1]  6  7  9 10
#> 
#> [[4]]$clusters[[4]]
#> [1] 8
#> 
#> [[4]]$clusters[[5]]
#> [1] 11 12 13 14 15
#> 
#> 
#> [[4]]$nllh
#> [1] -18.61705
#> 
#> [[4]]$lambda
#> [1] 1
#> 
#> [[4]]$message
#> NULL
#> 
#> 
#> [[5]]
#> [[5]]$R
#>            [,1]      [,2]       [,3]
#> [1,]  0.8905508 -2.739169 -0.1168082
#> [2,] -2.7391691  1.553834 -1.8858920
#> [3,] -0.1168082 -1.885892  0.9843714
#> 
#> [[5]]$clusters
#> [[5]]$clusters[[1]]
#> [1] 1 2 5 3 4
#> 
#> [[5]]$clusters[[2]]
#> [1]  6  7  9 10  8
#> 
#> [[5]]$clusters[[3]]
#> [1] 11 12 13 14 15
#> 
#> 
#> [[5]]$nllh
#> [1] -18.62088
#> 
#> [[5]]$lambda
#> [1] 2
#> 
#> [[5]]$message
#> NULL
#> 
#>