DyRFE
galaxy_ml.feature_selectors.DyRFE(estimator, n_features_to_select=None, step=1, verbose=0)
Mainly used with DyRFECV
Parameters
- estimator: object
A supervised learning estimator with a ``fit`` method that provides information about feature importance either through a ``coef_`` attribute or through a ``feature_importances_`` attribute.
- n_features_to_select: int or None (default=None)
The number of features to select. If ``None``, half of the features are selected.
- step: int, float or list, optional (default=1)
If greater than or equal to 1, then ``step`` corresponds to the (integer) number of features to remove at each iteration. If within (0.0, 1.0), then ``step`` corresponds to the fraction (rounded down) of features to remove at each iteration. If a list, each entry gives the number of features to remove at the corresponding iteration, and iteration stops when the list is exhausted.
- verbose: int, (default=0)
Controls verbosity of output.
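The three forms of ``step`` described above can be illustrated with a small dependency-free sketch. ``elimination_schedule`` is a hypothetical helper written for this page, not part of galaxy_ml. It assumes that a fractional ``step`` is taken of the original feature count (as scikit-learn's RFE does), that "half of the features" means integer division by 2, and that no iteration removes more features than needed to reach ``n_features_to_select``.

```python
from itertools import repeat


def elimination_schedule(n_features, step=1, n_features_to_select=None):
    """Return the feature count before elimination and after each iteration.

    Hypothetical helper: it mirrors the documented semantics of ``step``
    under the assumptions stated above, not the library's actual code.
    """
    if n_features_to_select is None:
        # Assumption: "half of the features" means integer division.
        n_features_to_select = n_features // 2

    if isinstance(step, list):
        # One entry per iteration; iteration stops when the list is exhausted.
        per_iteration = iter(step)
    elif step <= 0:
        raise ValueError("step must be > 0")
    elif step < 1.0:
        # Assumption: fraction of the original feature count, rounded down,
        # but never less than one feature per iteration.
        per_iteration = repeat(max(1, int(step * n_features)))
    else:
        per_iteration = repeat(int(step))

    remaining = n_features
    counts = [remaining]
    for n_remove in per_iteration:
        if remaining <= n_features_to_select:
            break
        # The last iteration may remove fewer features than requested.
        remaining -= min(n_remove, remaining - n_features_to_select)
        counts.append(remaining)
    return counts


print(elimination_schedule(10, step=1))                                   # [10, 9, 8, 7, 6, 5]
print(elimination_schedule(10, step=0.25, n_features_to_select=3))        # [10, 8, 6, 4, 3]
print(elimination_schedule(20, step=[5, 3, 2], n_features_to_select=1))   # [20, 15, 12, 10]
```

In the last call the list runs out before the target of 1 feature is reached, so elimination stops at 10 features.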
DyRFECV
galaxy_ml.feature_selectors.DyRFECV(estimator, step=1, min_features_to_select=1, cv='warn', scoring=None, verbose=0, n_jobs=None)
Compared with RFECV, DyRFECV offers a flexible step for eliminating
features: ``step`` may be given as a list, whereas RFECV supports only a
fixed step size.
Parameters
- estimator: object
A supervised learning estimator with a ``fit`` method that provides information about feature importance either through a ``coef_`` attribute or through a ``feature_importances_`` attribute.
- step: int, float or list, optional (default=1)
If greater than or equal to 1, then ``step`` corresponds to the (integer) number of features to remove at each iteration. If within (0.0, 1.0), then ``step`` corresponds to the fraction (rounded down) of features to remove at each iteration. If a list, each entry gives the number of features to remove at the corresponding iteration, and iteration stops when all steps have been used. Note that the last iteration may remove fewer than ``step`` features in order to reach ``min_features_to_select``.
- min_features_to_select: int, (default=1)
The minimum number of features to be selected. This number of features will always be scored, even if the difference between the original feature count and ``min_features_to_select`` isn't divisible by ``step``.
- cv: int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy. Possible inputs for cv are:

  - None, to use the default 3-fold cross-validation,
  - integer, to specify the number of folds,
  - :term:`CV splitter`,
  - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if the estimator is a classifier and ``y`` is binary or multiclass, :class:`sklearn.model_selection.StratifiedKFold` is used. If the estimator is not a classifier or if ``y`` is neither binary nor multiclass, :class:`sklearn.model_selection.KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various cross-validation strategies that can be used here.

.. versionchanged:: 0.20
   ``cv`` default value of None will change from 3-fold to 5-fold in v0.22.

- scoring: string, callable or None, optional, (default=None)
A string (see model evaluation documentation) or a scorer callable object / function with signature ``scorer(estimator, X, y)``.
- verbose: int, (default=0)
Controls verbosity of output.
- n_jobs: int or None, optional (default=None)
Number of cores to run in parallel while fitting across folds. ``None`` means 1 unless in a :obj:`joblib.parallel_backend` context. ``-1`` means using all processors. See :term:`Glossary <n_jobs>` for more details.
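Of the accepted ``cv`` inputs, the "iterable yielding (train, test) splits" form is the least self-explanatory, so a minimal sketch may help. ``contiguous_kfold`` is a hypothetical generator written for this page, not a galaxy_ml or scikit-learn function. It splits sample indices into contiguous, unshuffled folds and uses plain lists to stay dependency-free; the documented form is arrays of indices, so each list can be wrapped with ``numpy.asarray`` if needed.

```python
def contiguous_kfold(n_samples, n_splits=3):
    """Yield (train, test) index lists for contiguous, unshuffled folds.

    Hypothetical helper illustrating the shape of a custom ``cv`` iterable.
    """
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")

    # The first `extra` folds receive one additional sample each.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


# Materialize the generator so the splits can be iterated more than once.
splits = list(contiguous_kfold(7, n_splits=3))
for train, test in splits:
    print(train, test)
```

A list such as ``splits`` is the kind of object that could be passed as ``cv``. Because the folds are contiguous and not stratified, this sketch only suits data that is already shuffled.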