MoMA Package
The unified SFPCA (Sparse and Functional Principal Component Analysis) method enjoys many advantages over existing approaches to regularized PCA, because it
allows for arbitrary degrees and forms of regularization;
unifies many existing methods;
admits a tractable, efficient, and theoretically well-grounded algorithm.
The problem is formulated as follows.
\[\max_{u,\,v} \, {u}^{T} {X} {v}-\lambda_{{u}} P_{{u}}({u})-\lambda_{{v}} P_{{v}}({v})\]
\[\text{s.t. } \| u \| _ {S_u} \leq 1, \, \| v \| _ {S_v} \leq 1.\]
Here \(\| u \|_{S_u}^2 = u^T S_u u\). Typically, we take \({S}_{{u}}={I}+\alpha_{{u}} {\Omega}_{{u}}\), where \(\Omega_u\) is the second- or fourth-difference matrix, so that the constraint on \(\|u \|_{S_u}\) encourages smoothness in the estimated singular vectors. \(P_u\) and \(P_v\) are sparsity-inducing penalties that satisfy the following conditions:
\(P \geq 0\), \(P\) defined on \([0,+\infty)\);
\(P(cx) = c P (x), \forall \, c > 0\).
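The two ingredients above can be illustrated with a small base-R sketch. It does not call MoMA: `second_diff_penalty`, `s_norm`, and `lasso_pen` are hypothetical helpers, and the penalty matrix is built as \(D^T D\) with \(D\) the second-difference operator, which is one common construction and may differ from what the package uses internally. The sketch shows that a rough vector has a much larger \(\|u\|_{S_u}\) than a smooth one of the same Euclidean length, and that the lasso penalty satisfies the homogeneity condition.

```r
# Hypothetical helper (not part of MoMA): Omega = D'D, where D takes
# second differences of a length-p vector. Assumed form of the penalty matrix.
second_diff_penalty <- function(p) {
  D <- diff(diag(p), differences = 2)  # (p - 2) x p difference operator
  crossprod(D)                         # p x p, symmetric positive semidefinite
}

# ||u||_S = sqrt(u' S u) with S = I + alpha * Omega
s_norm <- function(u, alpha, Omega) {
  S <- diag(length(u)) + alpha * Omega
  sqrt(drop(crossprod(u, S %*% u)))
}

p <- 50
Omega <- second_diff_penalty(p)
grid <- seq(0, 1, length.out = p)

# Two vectors with unit Euclidean norm: one smooth, one oscillating
u_sine <- sin(2 * pi * grid)
u_sine <- u_sine / sqrt(sum(u_sine^2))
u_zigzag <- rep(c(1, -1), length.out = p)
u_zigzag <- u_zigzag / sqrt(sum(u_zigzag^2))

norm_sine <- s_norm(u_sine, alpha = 3, Omega)
norm_zigzag <- s_norm(u_zigzag, alpha = 3, Omega)

# The lasso penalty P(x) = sum(|x|) is nonnegative and satisfies
# P(c * x) = c * P(x) for c > 0
lasso_pen <- function(x) sum(abs(x))
x <- c(-2, 0, 0.5)
homogeneous <- isTRUE(all.equal(lasso_pen(3 * x), 3 * lasso_pen(x)))
```

Because the constraint is \(\|u\|_{S_u} \leq 1\), the oscillating vector would have to be shrunk far more than the smooth one to become feasible, which is how the constraint favours smooth solutions.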
Currently, the package supports arbitrary combinations of the following.
So far, we have incorporated the following penalties in MoMA. The code shown under a penalty is only an example specification; it should be carefully tailored to your particular data set.
moma_lasso;
SCAD (smoothly clipped absolute deviation), see moma_scad;
moma_mcp;
moma_slope;
moma_grplasso;
moma_fusedlasso;
# `algo` indicates which algorithm is used to solve the proximal operator
# "dp": dynamic programming, "path": path-based algorithm
moma_fusedlasso(algo = "dp")
moma_l1tf;
moma_spfusedlasso;
moma_cluster.
For example, consider the following problem, where \(\Delta\) is the second-difference matrix:
\[\max_{u,\, v} \, {u}^{T} {X} {v} - 4 \sum_i | v_i - v_{i-1}|\]
\[\text{s.t. } u^T(I + 3 \Delta) u \leq 1,\, v^Tv \leq 1.\]
# `p` is the length of `u`
moma_sfpca(X,
    center = FALSE,
    v_sparse = moma_fusedlasso(lambda = 4),
    u_smooth = moma_smoothness(second_diff_mat(p), alpha = 3)
)
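To make the example problem concrete, the following base-R sketch evaluates its objective and checks its constraints for a given pair of candidate vectors. It does not call MoMA or reproduce its algorithm: `sfpca_objective` and `is_feasible` are hypothetical helpers, the data are simulated, and the form of the difference matrix is an assumption that may differ from the package's `second_diff_mat`.

```r
# Hypothetical helper (not part of MoMA): value of
# u'Xv - lambda * sum_i |v_i - v_{i-1}| for candidate vectors u and v.
sfpca_objective <- function(X, u, v, lambda = 4) {
  drop(crossprod(u, X %*% v)) - lambda * sum(abs(diff(v)))
}

# Hypothetical helper: checks u'(I + alpha * Delta)u <= 1 and v'v <= 1,
# up to a small numerical tolerance.
is_feasible <- function(u, v, Delta, alpha = 3, tol = 1e-8) {
  S_u <- diag(length(u)) + alpha * Delta
  u_norm_sq <- drop(crossprod(u, S_u %*% u))
  u_norm_sq <= 1 + tol && sum(v^2) <= 1 + tol
}

set.seed(1)
nr <- 20  # length of u (rows of X)
nc <- 10  # length of v (columns of X)
X <- matrix(rnorm(nr * nc), nr, nc)

# Assumed form of the difference matrix: D'D with D the second-difference operator
Delta <- crossprod(diff(diag(nr), differences = 2))

# Constant unit vectors: zero roughness in u and no jumps in v,
# so both penalty terms vanish and the objective reduces to u'Xv.
u <- rep(1 / sqrt(nr), nr)
v <- rep(1 / sqrt(nc), nc)

feasible <- is_feasible(u, v, Delta)
obj <- sfpca_objective(X, u, v)
```

A solver searches over all feasible pairs for the largest objective value; the constant vectors here are only a convenient feasible starting point.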
R6 methods support access to results.
Shiny supports interaction with MoMA.
Fast. MoMA uses the Rcpp and RcppArmadillo libraries for speed (Eddelbuettel and François 2011; Eddelbuettel and Sanderson 2014; Sanderson and Curtin 2016).
Eddelbuettel, Dirk, and Romain François. 2011. “Rcpp: Seamless R and C++ Integration.” Journal of Statistical Software 40 (8): 1–18. https://doi.org/10.18637/jss.v040.i08.
Eddelbuettel, Dirk, and Conrad Sanderson. 2014. “RcppArmadillo: Accelerating R with High-Performance C++ Linear Algebra.” Computational Statistics and Data Analysis 71: 1054–63. https://doi.org/10.1016/j.csda.2013.02.005.
Sanderson, Conrad, and Ryan Curtin. 2016. “Armadillo: A Template-Based C++ Library for Linear Algebra.” Journal of Open Source Software 1 (2): 26. https://doi.org/10.21105/joss.00026.