mplearn.feature_selection.base_selector.ThresholdedOLS
- class mplearn.feature_selection.base_selector.ThresholdedOLS(*, num_features_to_select=None, screening_thresh=None)
Feature selection with the Thresholded OLS selector.
This class is designed to be used as a base feature selector on minipatches with the mplearn.feature_selection.AdaSTAMPS class.
- Parameters
- num_features_to_select : int or float, default=None
The number of features to select from the m features in a minipatch.
If None, the Bonferroni procedure described in [1] is used to automatically decide the number of features to select on the minipatch.
If a positive integer, it is the absolute number of features to select on a minipatch.
If a float in the interval (0.0, 1.0], it is the fraction of the m features in a minipatch to select.
- screening_thresh : float, default=None
This is ignored if the minipatch has more observations n than features m. For high-dimensional minipatches (n<m), screening_thresh should be a float in the interval (0.0, 1.0); an efficient screening rule from [1] is then applied first to reduce the number of features in the minipatch to round(screening_thresh * n).
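The parameter rules above can be illustrated with a short sketch. Both helper functions are hypothetical and are not part of mplearn. The rounding of a float fraction to a count, the cap at m, and the treatment of n == m are assumptions, since the documentation does not specify them.

```python
def resolve_num_features(num_features_to_select, m):
    """Hypothetical helper: turn num_features_to_select into a count for m features."""
    if num_features_to_select is None:
        # The count is decided from the data by the Bonferroni procedure in [1].
        return None
    if isinstance(num_features_to_select, float):
        if not 0.0 < num_features_to_select <= 1.0:
            raise ValueError("a float must lie in the interval (0.0, 1.0]")
        # Assumption: round to the nearest integer and keep at least one feature.
        return max(1, round(num_features_to_select * m))
    if isinstance(num_features_to_select, int) and num_features_to_select > 0:
        # Assumption: an absolute count cannot exceed the m features available.
        return min(num_features_to_select, m)
    raise ValueError("expected None, a positive int, or a float in (0.0, 1.0]")


def screened_size(screening_thresh, n, m):
    """Hypothetical helper: number of features left after the screening step."""
    if n > m:
        # Low-dimensional minipatch: screening_thresh is ignored.
        return m
    # Assumption: every minipatch that is not n > m is screened.
    if screening_thresh is None or not 0.0 < screening_thresh < 1.0:
        raise ValueError("screening_thresh must lie in (0.0, 1.0) when n < m")
    return round(screening_thresh * n)


print(resolve_num_features(0.25, m=20))
print(screened_size(0.5, n=50, m=200))
```

Note that Python's built-in round rounds halves to the nearest even integer, so round(2.5) is 2.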
- Attributes
- selection_indicator_ : ndarray of shape (m,) or (round(screening_thresh * n),)
A binary selection indicator for the features in the minipatch (1 for selected features and 0 for unselected features). For a low-dimensional minipatch (n>m), the shape is (m,). Otherwise, the shape is (round(screening_thresh * n),).
- Fk_ : ndarray of shape (m,) or (round(screening_thresh * n),)
The integer indices of the features corresponding to the entries of selection_indicator_. Note that these indices are the features’ column indices in the full data X_full (N observations and M features).
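As a minimal sketch of how the two attributes fit together, the selected features can be mapped back to column indices of X_full with a boolean mask. The attribute values and the value of M below are made up for illustration and do not come from a fitted selector.

```python
import numpy as np

# Hypothetical fitted attributes for a minipatch with m = 5 features.
Fk_ = np.array([12, 3, 47, 8, 30])                # column indices in X_full
selection_indicator_ = np.array([0, 1, 1, 0, 1])  # 1 = selected, 0 = unselected

# Column indices (in X_full) of the features selected on this minipatch.
selected_in_full = Fk_[selection_indicator_ == 1]

# Optional: a length-M indicator over all features of the full data.
M = 50  # assumed number of features in X_full
full_indicator = np.zeros(M, dtype=int)
full_indicator[selected_in_full] = 1

print(selected_in_full)
```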
References
[1] Giurcanu, M. “Thresholding least-squares inference in high-dimensional regression models.” Electron. J. Statist. 10 (2) 2124-2156, 2016.
- fit(X, y, Fk)
Fit the thresholded OLS base selector to a minipatch.
- Parameters
- X : ndarray of shape (n, m)
The data matrix corresponding to the minipatch (n observations and m features).
- y : ndarray of shape (n,)
The target values corresponding to the minipatch.
- Fk : ndarray of shape (m,)
The integer indices of the features in the minipatch. Note that these indices correspond to these features’ column indices in the full data X_full. For example, X = X_full[:, Fk] (restricted to the n observations in the minipatch).
- Returns
- self : object
Fitted estimator.
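The sketch below shows one way the inputs to fit could be assembled from the full data. The synthetic data, the row-index name Ik, and the uniform random sampling of rows and columns are assumptions made for illustration. The call to fit is left as a comment so that the snippet runs without mplearn installed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic full data: N observations and M features (illustrative only).
N, M = 200, 50
X_full = rng.standard_normal((N, M))
y_full = X_full[:, 0] - 2.0 * X_full[:, 1] + rng.standard_normal(N)

# Draw a minipatch of n observations and m features without replacement.
n, m = 40, 10
Ik = rng.choice(N, size=n, replace=False)  # hypothetical name for the row indices
Fk = rng.choice(M, size=m, replace=False)  # feature indices, as passed to fit

X = X_full[np.ix_(Ik, Fk)]  # shape (n, m)
y = y_full[Ik]              # shape (n,)

# With mplearn installed, the selector would then be fitted as documented:
# from mplearn.feature_selection.base_selector import ThresholdedOLS
# selector = ThresholdedOLS(num_features_to_select=3).fit(X, y, Fk)
```

Here n > m, so the minipatch is low-dimensional and screening_thresh would be ignored.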
- get_params(deep=True)
Get parameters for this estimator.
- Parameters
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
- params : dict
Parameter names mapped to their values.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects. The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
- **params : dict
Estimator parameters.
- Returns
- self : estimator instance
Estimator instance.
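The <component>__<parameter> naming convention can be illustrated with a small stand-alone sketch. The function split_nested_params and the component name "selector" are hypothetical. This shows only how such names separate into own and nested parameters and is not the library's implementation.

```python
def split_nested_params(params):
    """Hypothetical helper: separate plain names from <component>__<parameter> names."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first double underscore only.
        name, delim, sub_key = key.partition("__")
        if delim:
            # The parameter belongs to the nested component called `name`.
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


own, nested = split_nested_params(
    {"screening_thresh": 0.5, "selector__num_features_to_select": 10}
)
print(own)
print(nested)
```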