Use this function to set the penalty function to $$\lambda \sum w_{ij} | x_{i} - x_{j} |,$$ where \(\lambda\) is set by the lambda argument below.
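As a concrete illustration of the penalty above, here is a minimal base-R sketch that evaluates it for a given x, w and lambda. The helper `fusion_penalty` is hypothetical and not part of MoMA, and the sketch assumes the sum runs over pairs with i < j, so each pair is counted once. The chain-shaped weight matrix is only one possible choice of a symmetric square w.

```r
# Hypothetical helper (not part of MoMA): evaluates
# lambda * sum_{i < j} w[i, j] * |x[i] - x[j]|.
# Assumption: each unordered pair (i, j) is counted once.
fusion_penalty <- function(x, w, lambda) {
  stopifnot(is.matrix(w), isSymmetric(w), nrow(w) == length(x))
  abs_diff <- abs(outer(x, x, "-"))           # |x_i - x_j| for all pairs
  lambda * sum((w * abs_diff)[upper.tri(w)])  # keep only pairs with i < j
}

# Chain weights: only neighbouring coordinates are linked.
p <- 4
w <- matrix(0, p, p)
w[abs(row(w) - col(w)) == 1] <- 1

x <- c(1, 1, 3, 6)
fusion_penalty(x, w, lambda = 0.5)  # 0.5 * (0 + 2 + 3) = 2.5
```

A matrix built this way satisfies the "symmetric square matrix" requirement on the w argument described below.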

moma_cluster(..., w = NULL, ADMM = FALSE, acc = FALSE, eps = 1e-10,
  ..., lambda = 0, select_scheme = "g")

Arguments

...

Forces users to specify all arguments by name.

w

A symmetric square matrix. w[i, j] is the \(w_{ij}\) described above.

ADMM

A Boolean value. Set to TRUE to use ADMM (alternating direction method of multipliers), or to FALSE to use AMA (alternating minimization algorithm). Defaults to FALSE.

acc

A Boolean value. Set to TRUE to use the accelerated version of the algorithm. Currently we support accelerated AMA only.

eps

A small numeric value. The precision used when solving the proximal operator.

lambda

A numeric vector of penalty values, i.e. candidate values of \(\lambda\) in the penalty above.

select_scheme

A character string, either "b" (nested BIC search) or "g" (grid search).

MoMA provides a flexible framework for regularized multivariate analysis with several tuning parameters for different forms of regularization. To assist the user in selecting these parameters (alpha_u, alpha_v, lambda_u, lambda_v), we provide two selection modes: grid search ("g") and nested BIC search ("b"). Grid search means we solve the problem for all combinations of parameter values provided by the user.
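To make "all combinations" concrete, the following base-R sketch enumerates a small grid. It only illustrates the size of the search space and is not MoMA's internal code; the candidate values are made up for the example.

```r
# Illustrative candidate values (assumed, not package defaults).
lambda_u <- c(0, 0.5, 1)
lambda_v <- c(0, 2)

# Every (lambda_u, lambda_v) pair that a grid search would have to solve.
grid <- expand.grid(lambda_u = lambda_u, lambda_v = lambda_v)
nrow(grid)  # 3 * 2 = 6 combinations
```

The number of problems to solve grows multiplicatively with each additional parameter, which is the main cost of grid search.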

To explain nested BIC search, we need to look at how the algorithm runs. Finding an (approximate) solution to a penalized SVD (Singular Value Decomposition) problem amounts to solving two penalized regression problems iteratively. Let's call them problem u and problem v, which give improving estimates of the left singular vector, u, and the right singular vector, v, respectively. For each regression problem, we can select the optimal parameters based on BIC (Bayesian information criterion).

The nested BIC search is essentially two 2-D searches. We start from the SVD solution and find the optimal parameters for problem u, given the current estimate of v. Using the result of that step, we update the current estimate of u, and then do the same for problem v, that is, find the optimal parameters for problem v given the current estimate of u. These two steps are repeated until convergence or until the maximum number of iterations is reached.
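The alternating scheme described above can be sketched in base R as follows. This is a simplified illustration under several assumptions, not MoMA's implementation: it uses a lasso penalty (soft-thresholding) in place of the clustering penalty, searches only over lambda for each problem (a 1-D rather than 2-D search), and assumes the BIC form log(RSS / n) + log(n) / n * df with df taken as the number of nonzero entries. All function names are hypothetical. The grids should contain only positive values, since lambda = 0 gives a zero residual and a degenerate BIC.

```r
# Soft-thresholding: proximal operator of the lasso penalty
# (an assumed stand-in for the penalty actually configured).
soft <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

# Assumed BIC form: log(RSS / n) + log(n) / n * (number of nonzeros).
bic <- function(y, est) {
  n <- length(y)
  log(sum((y - est)^2) / n) + log(n) / n * sum(est != 0)
}

# Choose the lambda on the grid that minimises BIC for one problem.
best_by_bic <- function(y, lambdas) {
  scores <- vapply(lambdas, function(l) bic(y, soft(y, l)), numeric(1))
  lambdas[which.min(scores)]
}

# Scale to unit length; leave an all-zero vector unchanged.
normalise <- function(z) {
  s <- sqrt(sum(z^2))
  if (s > 0) z / s else z
}

nested_bic <- function(X, lambdas_u, lambdas_v, max_iter = 50, tol = 1e-8) {
  s <- svd(X, nu = 1, nv = 1)          # start from the SVD solution
  u <- s$u[, 1]
  v <- s$v[, 1]
  for (iter in seq_len(max_iter)) {
    u_old <- u
    v_old <- v
    yu <- drop(X %*% v)                # problem u, given current v
    lu <- best_by_bic(yu, lambdas_u)
    u <- normalise(soft(yu, lu))
    yv <- drop(crossprod(X, u))        # problem v, given updated u
    lv <- best_by_bic(yv, lambdas_v)
    v <- normalise(soft(yv, lv))
    if (sum((u - u_old)^2) + sum((v - v_old)^2) < tol) break
  }
  list(u = u, v = v, lambda_u = lu, lambda_v = lv, iterations = iter)
}

# Usage on a noisy rank-one matrix with sparse factors.
set.seed(1)
u0 <- c(rep(1, 5), rep(0, 15))
v0 <- c(rep(1, 4), rep(0, 6))
X <- 5 * tcrossprod(u0, v0) + matrix(rnorm(200, sd = 0.1), 20, 10)
fit <- nested_bic(X, lambdas_u = c(0.5, 1, 2), lambdas_v = c(0.5, 1, 2))
fit$lambda_u
fit$lambda_v
```

Unlike grid search, each iteration here evaluates only length(lambdas_u) + length(lambdas_v) candidates rather than their product.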

Users are welcome to refer to section 3.1: Selection of Regularization Parameters in the paper cited below.

Value

A moma_sparsity_type object, which is a list containing the values of w, ADMM, acc and eps.

References

Chi, Eric C., and Kenneth Lange. "Splitting Methods for Convex Clustering." Journal of Computational and Graphical Statistics 24.4 (2015): 994-1013. doi: 10.1080/10618600.2014.948181.