This section is based on the Simulation Study section in the paper Sparse and Functional Principal Component Analysis. Note this is not a complete replication of the Simulation Study section. Please refer to the paper for further details.

After introducing the simulated data set and the model, we give a quick demonstration of the moma_sfpca function. Then we showcase the parameter selection in MoMA.

Data Set

We simulate data according to the low-rank model \[X = \sum_{k=1}^{K} d_{k} \boldsymbol{u}_{k} \boldsymbol{v}_{k}^{T}+{E}, \]

where \(E_{i j} \stackrel{\mathrm{IID}}{\sim} \mathcal{N}(0,1)\), \(K = 3\), \(p = 2\), \(d_1 = n / 4, d_{2}=n / 5, d_{3}=n / 6\). Left singular vectors \(u_1,u_2, u_3\) are sampled uniformly from the space of orthogonal matrices. The signal in the right singular vectors \(v_1,v_2,v_3\), each of which have a combination of sparsity and smoothness, takes the form of a sinusoidal pulse.

Now we obtain the data.

## norm(X) / norm(noise) =  1.26480989172193

Model Description

Recall the SFPCA framework:

\[\max_{u,\,,v}{u}^{T} {X} {v}-\lambda_{{u}} P_{{u}}({u})-\lambda_{{v}} P_{{v}}({v})\] \[\text{s.t. } \| u \| _ {S_u} \leq 1, \, \| v \| _ {S_v} \leq 1.\] Typically, we take \({S}_{{u}}={I}+\alpha_{{u}} {\Omega}_{{u}}\) where \(\Omega_u\) is the second- or fourth-difference matrix, so that the \(\|u \|_{S_u}\) penalty term encourages smoothness in the estimated singular vectors. \(P_u\) and \(P_v\) are sparsity inducing penalties.

Here we impose no penalty on the u side, and set \(\Omega_v\) to be the second-difference matric, \(P_u(x) = \|x\|_1\), i.e., the LASSO.

Appetizer

Before we delve into various modeling choices provided by MoMA, we give a one-line solution to the above problem. Suppose we are interested in what the solution looks like when \(\alpha_v = 2, \lambda_v = 1\). We run the following code to compare the model and its unpenalized version (simple SVD).

MoMA attains almost perfect recovery on this data set!

Parameter Selection

Tuning parameters are of concern in the model. Currently we provide two parameter selection methods: grid search and nested BIC selection.

Nested BIC Selection

We can run a greedy selection procedure to pick the best parameter based on BIC. Please see ?select_scheme for details.

Note the selection procedure is pretty fast, we can select \(\lambda_v\) out of a much finer grid (seq(0,3,length.out = 40)).

## Start a final run on the chosen parameters.[av, au, lu, lv] = [2, 0, 0, 2.46154]
plot(res$V,
    ylab = "v",
    type = "l",
    main = bquote(lambda[v] == .(res$chosen_lambda_v))
)

The selection procudure agrees the bumps located on [70,200] should be eliminated.

Multiple PCs and Deflation Schemes

To recover the other two signals, set rank = 3.

To experiment with different deflation schemes to achieve a certain type of orthogonality, specify the deflation_scheme argument.