CBASS returns a fast approximation to the Convex BiClustering solution path along with visualizations such as dendrograms and heatmaps. CBASS solves the Convex Biclustering problem using an efficient Algorithmic Regularization scheme.

  row_weights = sparse_rbf_kernel_weights(k = "auto", phi = "auto", dist.method =
    "euclidean", p = 2),
  col_weights = sparse_rbf_kernel_weights(k = "auto", phi = "auto", dist.method =
    "euclidean", p = 2),
  row_labels = rownames(X),
  col_labels = colnames(X), = TRUE,
  t = 1.01,
  back_track = FALSE,
  exact = FALSE,
  norm = 2,
  npcs = min(4L, NCOL(X), NROW(X)),
  dendrogram.scale = NULL,
  status = (interactive() && (clustRviz_logger_level() %in% c("MESSAGE", "WARNING",



The data matrix (\(X \in \mathbb{R}^{n \times p}\)). If X has missing values (NA or NaN), they will be imputed automatically.


Unused arguments. An error will be thrown if any unrecognized arguments are given.


One of the following:

  • A function which, when called with argument X, returns an n-by-n matrix of fusion weights.

  • A matrix of size n-by-n containing fusion weights.

Note that the weights will be renormalized to sum to \(1/\sqrt{n}\) internally.
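
As a minimal sketch of the function form described above, the following base R code builds an n-by-n weight matrix from a data matrix. The helper `gaussian_weights` and its bandwidth `phi` are hypothetical names chosen for illustration and are not part of the package; any function that maps X to a non-negative n-by-n matrix would presumably play the same role.

```r
# Hypothetical helper (not part of the package): dense Gaussian kernel weights.
# `phi` is an assumed bandwidth parameter, chosen here for illustration only.
gaussian_weights <- function(X, phi = 0.5) {
  D <- as.matrix(dist(X, method = "euclidean"))  # n-by-n pairwise row distances
  W <- exp(-phi * D^2)                           # closer rows get larger weights
  diag(W) <- 0                                   # no weight between a row and itself
  W
}

set.seed(1)
X <- matrix(rnorm(20), nrow = 5, ncol = 4)  # n = 5 observations, p = 4 variables
W_row <- gaussian_weights(X)
dim(W_row)  # 5 5
```

Either the function itself or the precomputed matrix `W_row` has the shape that the description above asks for.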


One of the following:

  • A function which, when called with argument t(X), returns a p-by-p matrix of fusion weights. (Note the transpose.)

  • A matrix of size p-by-p containing fusion weights.

Note that the weights will be renormalized to sum to \(1/\sqrt{p}\) internally.
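
The transpose matters because a distance-based weight function works on rows. The sketch below uses a hypothetical helper, `inverse_distance_weights`, which is not part of the package, to show that calling it on t(X) gives a p-by-p matrix while calling it on X gives an n-by-n one.

```r
set.seed(2)
X <- matrix(rnorm(12), nrow = 3, ncol = 4)  # n = 3 observations, p = 4 variables

# Hypothetical weight function, for illustration only
inverse_distance_weights <- function(M) {
  D <- as.matrix(dist(M))  # pairwise distances between the rows of M
  W <- 1 / (1 + D)         # larger weight for closer rows
  diag(W) <- 0
  W
}

W_row <- inverse_distance_weights(X)     # rows of X are compared
W_col <- inverse_distance_weights(t(X))  # columns of X become rows of t(X)
dim(W_row)  # 3 3
dim(W_col)  # 4 4
```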


A character vector of length \(n\): row (observation) labels


A character vector of length \(p\): column (variable) labels

A logical: Should X be centered globally? I.e., should the global mean of X be subtracted?
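
To make the distinction from column-wise centering concrete, here is a small base R illustration of subtracting the global mean. It is only a sketch of the operation described above, not the package's internal code.

```r
X <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)  # 2-by-3 matrix, grand mean 3.5
X_centered <- X - mean(X)                   # subtract one number from every entry

mean(X_centered)      # 0
colMeans(X_centered)  # -2  0  2
```

The overall mean becomes zero, but individual row and column means generally do not.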


A number greater than 1: the size of the multiplicative update to the cluster fusion regularization parameter (not used by back-tracking variants). Typically on the scale of 1.005 to 1.1.
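
Because the update is multiplicative, the regularization parameter grows geometrically, so the step size trades path resolution against the number of iterations. The sketch below counts the steps needed to grow a parameter by a factor of 1000. The start and end values are hypothetical, since the initial value and stopping rule are not described here.

```r
# Number of multiplicative updates needed to go from gamma_start to gamma_end.
# Both endpoints are assumed values used only for this illustration.
steps_to_reach <- function(gamma_start, gamma_end, t) {
  ceiling(log(gamma_end / gamma_start) / log(t))
}

steps_small_t <- steps_to_reach(1e-3, 1, t = 1.01)  # 695
steps_large_t <- steps_to_reach(1e-3, 1, t = 1.1)   # 73
```

Under these assumptions, a value near the lower end of the typical range takes roughly ten times as many steps as one near the upper end.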


A logical: Should back-tracking be used to exactly identify fusions? By default, back-tracking is not used.


A logical: Should the exact solution be computed using an iterative algorithm? By default, algorithmic regularization is applied and the exact solution is not computed. Setting exact = TRUE often significantly increases computation time.


Which norm to use in the fusion penalty? Currently only 1 and 2 (default) are supported.


An integer >= 2. The number of principal components to compute for path visualization.
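
The default shown in the usage, `min(4L, NCOL(X), NROW(X))`, caps the number of components at four and at the dimensions of X. A short base R check of that expression, with `prcomp` used only as an illustrative stand-in for however the components are computed internally:

```r
set.seed(3)
X_wide   <- matrix(rnorm(60), nrow = 10, ncol = 6)
X_narrow <- matrix(rnorm(30), nrow = 10, ncol = 3)

npcs_wide   <- min(4L, NCOL(X_wide), NROW(X_wide))      # 4
npcs_narrow <- min(4L, NCOL(X_narrow), NROW(X_narrow))  # 3

# Illustrative only: keep the first npcs principal component scores
scores <- prcomp(X_wide)$x[, seq_len(npcs_wide), drop = FALSE]
dim(scores)  # 10 4
```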


A character string denoting how the scale of dendrogram regularization proportions should be visualized. Choices are 'original' or 'log'; if not provided, a data-driven heuristic choice is used.


Should a status message be printed to the console?


An object of class CBASS containing the following elements (among others):

  • X: the original data matrix

  • n: the number of observations (rows of X)

  • p: the number of variables (columns of X)

  • alg.type: the CBASS variant used

  • row_fusions: A record of row fusions - see the documentation of CARP for details of what this may include.

  • col_fusions: A record of column fusions - see the documentation of CARP for details of what this may include.


if (FALSE) {
  cbass_fit <- CBASS(presidential_speech)
  print(cbass_fit)
  plot(cbass_fit)
}