mplearn.feature_selection.base_selector.DecisionTreeSelector

class mplearn.feature_selection.base_selector.DecisionTreeSelector(*, mode='classifier', max_depth=5, criterion='gini', num_features_to_select=0.1, random_state=0)

Feature selection with the decision tree selector.

This class is designed to serve as the base feature selector that the mplearn.feature_selection.AdaSTAMPS class applies to each minipatch. It is a wrapper around DecisionTreeClassifier and DecisionTreeRegressor from the sklearn package.

Parameters
mode : {‘classifier’, ‘regressor’}, default=’classifier’

Controls the type of the decision tree model to use.

max_depth : int, default=5

The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_split samples.

criterion : {‘gini’, ‘entropy’, ‘squared_error’, ‘friedman_mse’, ‘absolute_error’, ‘poisson’}, default=’gini’

The criterion used to measure the quality of a split. If mode='classifier', this must be one of {‘gini’, ‘entropy’}. If mode='regressor', this must be one of {‘squared_error’, ‘friedman_mse’, ‘absolute_error’, ‘poisson’}.

num_features_to_select : int or float, default=0.1

The number of features to select from the m features in a minipatch.

  • If positive integer, it is the absolute number of features to select on a minipatch.

  • If float in the interval (0.0, 1.0], it is the fraction of the m features in a minipatch to select.
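The two cases above can be sketched as a small helper. This is only an illustration, not the library's code: the function name `resolve_num_features` is hypothetical, and the rounding rule (truncate, but keep at least one feature) and the cap at m are assumptions, since the text above does not specify them.

```python
def resolve_num_features(num_features_to_select, m):
    """Convert num_features_to_select into an absolute count for m features."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(num_features_to_select, bool):
        raise TypeError("num_features_to_select must be an int or a float")
    if isinstance(num_features_to_select, int):
        if num_features_to_select < 1:
            raise ValueError("an integer value must be positive")
        # Assumption: never ask for more features than the minipatch has.
        return min(num_features_to_select, m)
    if isinstance(num_features_to_select, float):
        if not 0.0 < num_features_to_select <= 1.0:
            raise ValueError("a float value must lie in (0.0, 1.0]")
        # Assumption: truncate the product, but select at least one feature.
        return max(1, int(num_features_to_select * m))
    raise TypeError("num_features_to_select must be an int or a float")


print(resolve_num_features(3, 20))     # 3
print(resolve_num_features(0.25, 40))  # 10
```

Under these assumptions, a float that would round down to zero features (for example 0.05 with m = 10) still yields one selected feature.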

random_state : int, default=0

Controls the randomness of the decision tree model.
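The pairing between the mode and criterion parameters described above can be checked before any model is built. The following is a minimal sketch of such a check; the helper `check_criterion` and the `VALID_CRITERIA` table are hypothetical names that only encode the rule stated in the criterion description, and the actual class may validate its arguments differently or not at all.

```python
# Hypothetical lookup table encoding the mode/criterion rule from the docs.
VALID_CRITERIA = {
    "classifier": {"gini", "entropy"},
    "regressor": {"squared_error", "friedman_mse", "absolute_error", "poisson"},
}


def check_criterion(mode, criterion):
    """Return criterion if it is allowed for mode, otherwise raise ValueError."""
    if mode not in VALID_CRITERIA:
        raise ValueError(
            f"mode must be one of {sorted(VALID_CRITERIA)}, got {mode!r}"
        )
    if criterion not in VALID_CRITERIA[mode]:
        raise ValueError(
            f"criterion={criterion!r} is not valid for mode={mode!r}; "
            f"expected one of {sorted(VALID_CRITERIA[mode])}"
        )
    return criterion


check_criterion("classifier", "gini")          # accepted
check_criterion("regressor", "friedman_mse")   # accepted
# check_criterion("regressor", "gini") would raise ValueError
```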

Attributes
selection_indicator_ : ndarray of shape (m,)

A binary selection indicator for the features in the minipatch (1 for selected features and 0 for unselected features).

Fk_ : ndarray of shape (m,)

The corresponding integer indices of the features in selection_indicator_. Note that these indices correspond to these features’ column indices in the full data X_full (N observations and M features).
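To make the relationship between the two attributes concrete, the sketch below maps a binary indicator back to column indices of the full data. The values are invented for illustration and plain Python lists stand in for the ndarrays; nothing here calls the library itself.

```python
# Illustrative values only: column indices (in X_full) of the m = 5 features
# in one minipatch, playing the role of Fk_.
Fk = [12, 3, 40, 7, 25]
# Binary indicator in the role of selection_indicator_:
# 1 = selected, 0 = not selected, aligned position by position with Fk.
selection_indicator = [0, 1, 1, 0, 1]

# Positions within the minipatch that were selected.
selected_positions = [j for j, flag in enumerate(selection_indicator) if flag == 1]
# The same features, expressed as column indices of the full data matrix.
selected_full_indices = [Fk[j] for j in selected_positions]

print(selected_positions)     # [1, 2, 4]
print(selected_full_indices)  # [3, 40, 25]
```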

fit(X, y, Fk)

Fit the decision tree base selector to a minipatch.

Parameters
X : ndarray of shape (n, m)

The data matrix corresponding to the minipatch (n observations and m features).

y : ndarray of shape (n,)

The target values corresponding to the minipatch.

Fk : ndarray of shape (m,)

The integer indices of the features in the minipatch. Note that these indices correspond to these features’ column indices in the full data X_full. For example, X = X_full[:, Fk].

Returns
self : object

Fitted estimator.
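The three arguments of fit can be pictured as one minipatch cut out of the full data. The sketch below builds such a minipatch with the standard library only. It assumes uniform random sampling of rows and columns, which may differ from how AdaSTAMPS actually draws minipatches, and the name `Ik` for the sampled row indices is hypothetical.

```python
import random

rng = random.Random(0)

N, M = 6, 8   # full data: N observations, M features
n, m = 3, 4   # minipatch: n observations, m features

# Toy full data; entry (i, j) is 10 * i + j so that slices are easy to check.
X_full = [[10 * i + j for j in range(M)] for i in range(N)]
y_full = [i % 2 for i in range(N)]

# Assumption: rows and columns are drawn uniformly without replacement.
Ik = rng.sample(range(N), n)   # hypothetical name for the row indices
Fk = rng.sample(range(M), m)   # column indices in X_full, as passed to fit

# Pure-Python equivalent of indexing X_full by the rows Ik and the columns Fk.
X = [[X_full[i][j] for j in Fk] for i in Ik]
y = [y_full[i] for i in Ik]

# X, y and Fk now have the shapes that fit(X, y, Fk) expects:
# (n, m), (n,) and (m,) respectively.
```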

get_params(deep=True)

Get parameters for this estimator.

Parameters
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
params : dict

Parameter names mapped to their values.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects. The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters
**params : dict

Estimator parameters.

Returns
self : estimator instance

Estimator instance.
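The <component>__<parameter> naming accepted by set_params can be illustrated by splitting each key on its first double underscore. The following is a simplified sketch of that idea, not the estimator's actual implementation; `group_params` is a hypothetical helper and `selector` is a made-up component name.

```python
def group_params(params):
    """Separate top-level parameters from '<component>__<parameter>' ones."""
    top_level, nested = {}, {}
    for name, value in params.items():
        # Split on the first "__" only; anything after it belongs to the component.
        component, delim, sub_name = name.partition("__")
        if delim:
            nested.setdefault(component, {})[sub_name] = value
        else:
            top_level[name] = value
    return top_level, nested


top_level, nested = group_params(
    {"max_depth": 3, "selector__criterion": "entropy"}
)
print(top_level)  # {'max_depth': 3}
print(nested)     # {'selector': {'criterion': 'entropy'}}
```

For a simple estimator such as this selector, only top-level names like max_depth or criterion would normally be passed; the nested form matters when the selector is itself a component of a larger object.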