DyRFE
galaxy_ml.feature_selectors.DyRFE(estimator, n_features_to_select=None, step=1, verbose=0)
Mainly used with DyRFECV
Parameters
- estimator: object
A supervised learning estimator with afit
method that provides information about feature importance either through acoef_
attribute or through afeature_importances_
attribute. - n_features_to_select: int or None (default=None)
The number of features to select. IfNone
, half of the features are selected. - step: int, float or list, optional (default=1)
If greater than or equal to 1, thenstep
corresponds to the (integer) number of features to remove at each iteration. If within (0.0, 1.0), thenstep
corresponds to the percentage (rounded down) of features to remove at each iteration. If list, a series of steps of features to remove at each iteration. Iterations stops when steps finish - verbose: int, (default=0)
Controls verbosity of output.
DyRFECV
galaxy_ml.feature_selectors.DyRFECV(estimator, step=1, min_features_to_select=1, cv='warn', scoring=None, verbose=0, n_jobs=None)
Compared with RFECV, DyRFECV offers flexiable step
to eleminate
features, in the format of list, while RFECV supports only fixed number
of step
.
Parameters
- estimator: object
A supervised learning estimator with afit
method that provides information about feature importance either through acoef_
attribute or through afeature_importances_
attribute. - step: int or float, optional (default=1)
If greater than or equal to 1, thenstep
corresponds to the (integer) number of features to remove at each iteration. If within (0.0, 1.0), thenstep
corresponds to the percentage (rounded down) of features to remove at each iteration. If list, a series of step to remove at each iteration. iteration stopes when finishing all steps Note that the last iteration may remove fewer thanstep
features in order to reachmin_features_to_select
. - min_features_to_select: int, (default=1)
The minimum number of features to be selected. This number of features will always be scored, even if the difference between the original feature count andmin_features_to_select
isn't divisible bystep
. -
cv: int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy. Possible inputs for cv are: - None, to use the default 3-fold cross-validation, - integer, to specify the number of folds. - :term:CV splitter
, - An iterable yielding (train, test) splits as arrays of indices.For integer/None inputs, if ``y`` is binary or multiclass, :class:`sklearn.model_selection.StratifiedKFold` is used. If the estimator is a classifier or if ``y`` is neither binary nor multiclass, :class:`sklearn.model_selection.KFold` is used. Refer :ref:`User Guide <cross_validation>` for the various cross-validation strategies that can be used here. .. versionchanged:: 0.20 ``cv`` default value of None will change from 3-fold to 5-fold in v0.22.
-
scoring: string, callable or None, optional, (default=None)
A string (see model evaluation documentation) or a scorer callable object / function with signaturescorer(estimator, X, y)
. - verbose: int, (default=0)
Controls verbosity of output. - n_jobs: int or None, optional (default=None)
Number of cores to run in parallel while fitting across folds.None
means 1 unless in a :obj:joblib.parallel_backend
context.-1
means using all processors. See :term:Glossary <n_jobs>
for more details.