@@ -131,33 +131,52 @@ number of features.
131131 elimination example with automatic tuning of the number of features
132132 selected with cross-validation.
133133
134+ .. _select_from_model:
134135
135- .. _l1_feature_selection:
136+ Feature selection using SelectFromModel
137+ =======================================
138+
139+ :class:`SelectFromModel` is a meta-transformer that can be used along with any
140+ estimator that has a ``coef_`` or ``feature_importances_`` attribute after fitting.
141+ The features are considered unimportant and removed if the corresponding
142+ ``coef_`` or ``feature_importances_`` values are below the provided
143+ ``threshold`` parameter. Apart from specifying the threshold numerically,
144+ there are built-in heuristics for finding a threshold using a string argument.
145+ Available heuristics are "mean", "median" and float multiples of these like
146+ "0.1*mean".
147+
148+ For examples of how to use it, refer to the sections below.
149+
150+ .. topic:: Examples
151+
152+     * :ref:`example_feature_selection_plot_select_from_model_boston.py`: Selecting the two
153+       most important features from the Boston dataset without knowing the
154+       threshold beforehand.
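
For instance, a minimal sketch of the threshold heuristics (the synthetic data from
``make_regression`` and the ``"1.25*mean"`` threshold below are illustrative choices,
not part of the example referenced above)::

    >>> from sklearn.datasets import make_regression
    >>> from sklearn.feature_selection import SelectFromModel
    >>> from sklearn.linear_model import LinearRegression
    >>> # 10 features, of which only 3 actually carry signal
    >>> X, y = make_regression(n_features=10, n_informative=3, random_state=0)
    >>> # keep features whose absolute coefficient exceeds 1.25 times the mean
    >>> selector = SelectFromModel(LinearRegression(), threshold="1.25*mean")
    >>> X_reduced = selector.fit(X, y).transform(X)
    >>> X_reduced.shape   # doctest: +SKIP

How many columns survive depends on the fitted coefficients; passing ``"median"``,
``"0.1*mean"`` or a plain float as ``threshold`` works in exactly the same way.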
136155
137156L1-based feature selection
138- ==========================
157+ --------------------------
139158
140159.. currentmodule:: sklearn
141160
142- Selecting non-zero coefficients
143- ---------------------------------
144-
145161:ref:`Linear models <linear_model>` penalized with the L1 norm have
146162sparse solutions: many of their estimated coefficients are zero. When the goal
147163is to reduce the dimensionality of the data to use with another classifier,
148- they expose a ``transform`` method to select the non-zero coefficient. In
149- particular, sparse estimators useful for this purpose are the
150- :class:`linear_model.Lasso` for regression, and
164+ they can be used along with :class:`feature_selection.SelectFromModel`
165+ to select the non-zero coefficients. In particular, sparse estimators useful for
166+ this purpose are the :class:`linear_model.Lasso` for regression, and
151167of :class:`linear_model.LogisticRegression` and :class:`svm.LinearSVC`
152168for classification::
153169
154170 >>> from sklearn.svm import LinearSVC
155171 >>> from sklearn.datasets import load_iris
172+ >>> from sklearn.feature_selection import SelectFromModel
156173 >>> iris = load_iris()
157174 >>> X, y = iris.data, iris.target
158175 >>> X.shape
159176 (150, 4)
160- >>> X_new = LinearSVC(C=0.01, penalty="l1", dual=False).fit_transform(X, y)
177+ >>> lsvc = LinearSVC(C=0.01, penalty="l1", dual=False).fit(X, y)
178+ >>> model = SelectFromModel(lsvc, prefit=True)
179+ >>> X_new = model.transform(X)
161180 >>> X_new.shape
162181 (150, 3)
163182
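The same pattern applies to regression. A minimal sketch, assuming the Boston housing
data used by the example referenced earlier (the ``alpha`` value is an illustrative
choice; larger values zero out more coefficients and hence keep fewer features)::

    >>> from sklearn.datasets import load_boston
    >>> from sklearn.feature_selection import SelectFromModel
    >>> from sklearn.linear_model import Lasso
    >>> boston = load_boston()
    >>> X, y = boston.data, boston.target
    >>> # L1 regularization drives some coefficients exactly to zero
    >>> lasso = Lasso(alpha=1.0).fit(X, y)
    >>> # with prefit=True the already fitted estimator is used as-is
    >>> model = SelectFromModel(lasso, prefit=True)
    >>> X_new = model.transform(X)
    >>> X_new.shape   # doctest: +SKIP
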
@@ -241,23 +260,27 @@ of features non zero.
241260 http://hal.inria.fr/hal-00354771/
242261
243262Tree-based feature selection
244- ============================
263+ ----------------------------
245264
246265Tree-based estimators (see the :mod:`sklearn.tree` module and forest
247266of trees in the :mod:`sklearn.ensemble` module) can be used to compute
248267feature importances, which in turn can be used to discard irrelevant
249- features::
268+ features (when coupled with the :class:`sklearn.feature_selection.SelectFromModel`
269+ meta-transformer)::
250270
251271 >>> from sklearn.ensemble import ExtraTreesClassifier
252272 >>> from sklearn.datasets import load_iris
273+ >>> from sklearn.feature_selection import SelectFromModel
253274 >>> iris = load_iris()
254275 >>> X, y = iris.data, iris.target
255276 >>> X.shape
256277 (150, 4)
257278 >>> clf = ExtraTreesClassifier()
258- >>> X_new = clf.fit(X, y).transform(X)
279+ >>> clf = clf.fit(X, y)
259280 >>> clf.feature_importances_ # doctest: +SKIP
260281 array([ 0.04..., 0.05..., 0.4..., 0.4...])
282+ >>> model = SelectFromModel(clf, prefit=True)
283+ >>> X_new = model.transform(X)
261284 >>> X_new.shape # doctest: +SKIP
262285 (150, 2)
263286
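To see which features were kept, not just how many, the fitted meta-transformer
exposes the usual selector API. Continuing the sketch above (output skipped because
the tree importances are randomized; with the default ``"mean"`` threshold this
typically retains the two petal measurements)::

    >>> # boolean mask over the original columns
    >>> model.get_support()   # doctest: +SKIP
    array([False, False,  True,  True], dtype=bool)

``get_support(indices=True)`` returns the column indices instead of a mask.
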
@@ -278,12 +301,13 @@ the actual learning. The recommended way to do this in scikit-learn is
278301to use a :class:`sklearn.pipeline.Pipeline`::
279302
280303 clf = Pipeline([
281- ('feature_selection', LinearSVC(penalty="l1")),
304+ ('feature_selection', SelectFromModel(LinearSVC(penalty="l1"))),
282305 ('classification', RandomForestClassifier())
283306 ])
284307 clf.fit(X, y)
285308
286309In this snippet we make use of a :class:`sklearn.svm.LinearSVC`
310+ coupled with :class:`sklearn.feature_selection.SelectFromModel`
287311to evaluate feature importances and select the most relevant features.
288312Then, a :class:`sklearn.ensemble.RandomForestClassifier` is trained on the
289313transformed output, i.e. using only relevant features. You can perform