Utilities for scikit-learn.
Install this via pip (or your favourite package manager):
pip install sklearn-utilities
See Docs for more information.
EstimatorWrapperBase: base class for wrappers. Redirects all attributes that are not defined on the wrapper to the wrapped estimator.
DataFrameWrapper: tries to convert every estimator output to a pandas DataFrame or Series.
FeatureUnionPandas: a FeatureUnion that works with pandas DataFrames.
IncludedColumnTransformerPandas, ExcludedColumnTransformerPandas: select columns by name.
AppendPredictionToX: appends the prediction of y to X.
AppendXPredictionToX: appends the prediction of X to X.
DropByNoisePrediction: drops columns that have high importance in predicting noise.
DropMissingColumns: drops columns with missing values above a threshold.
DropMissingRowsY: drops rows with missing values in y. Use feature_engine.DropMissingData for X.
IntersectXY: drops rows where the index of X and y do not intersect. Use with feature_engine.DropMissingData.
ReindexMissingColumns: reindexes columns of X in transform() to match the columns of X in fit().
ReportNonFinite: reports non-finite values in X and/or y.
IdTransformer: a transformer that does nothing.
RecursiveFitSubtractRegressor: a regressor that recursively fits a regressor and subtracts the prediction from the target.
SmartMultioutputEstimator: a MultiOutputEstimator that supports a tuple of arrays in predict() and supports pandas Series and DataFrame.
until_event(), since_event(): calculate the time until or since events (Series[bool]).
ComposeVarEstimator: composes mean and std/var estimators.
DummyRegressorVar: DummyRegressor that returns 1.0 for std/var.
TransformedTargetRegressorVar: TransformedTargetRegressor with std/var support.
StandardScalerVar: StandardScaler with std/var support.
EvalSetWrapper, CatBoostProgressBarWrapper: wrappers that pass eval_set to fit() using train_test_split(), mainly for CatBoost. The latter also shows a progress bar (using tqdm). Useful for early stopping. For LightGBM, see lightgbm-callbacks.
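The attribute redirection described for EstimatorWrapperBase can be pictured with Python's `__getattr__` hook. The sketch below is not the library's implementation: the class names `DelegatingWrapper` and `TinyEstimator` are hypothetical, and the only assumption is that lookups that fail on the wrapper fall through to the wrapped object.

```python
class TinyEstimator:
    """Hypothetical stand-in for a scikit-learn estimator."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        # Learn the mean of y; the trailing underscore mimics fitted attributes.
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class DelegatingWrapper:
    """Hypothetical wrapper: anything it does not define is looked up on `estimator`."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y):
        # A real wrapper could add behaviour here before delegating.
        self.estimator.fit(X, y)
        return self

    def __getattr__(self, name):
        # __getattr__ runs only when normal lookup on the wrapper fails.
        # Guard against infinite recursion if `estimator` is not set yet.
        if name == "estimator":
            raise AttributeError(name)
        return getattr(self.estimator, name)


wrapped = DelegatingWrapper(TinyEstimator(alpha=0.5)).fit([[0], [1], [2]], [1.0, 2.0, 3.0])
print(wrapped.alpha)           # hyperparameter found on the wrapped estimator
print(wrapped.mean_)           # fitted attribute, also redirected
print(wrapped.predict([[5]]))  # method resolved through the wrapped estimator
```

Because `__getattr__` is consulted only after normal lookup fails, methods the wrapper defines itself (here `fit`) take precedence over those of the wrapped estimator.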
add_missing_values(): adds missing values to a dataset.
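The idea of injecting missing values into a dataset can be sketched in plain Python. This is an illustration only: the helper `add_nan_randomly` and its `fraction` and `seed` parameters are hypothetical and do not reflect the actual signature of add_missing_values(), which the text above does not specify.

```python
import math
import random


def add_nan_randomly(rows, fraction, seed=0):
    """Return a copy of `rows` with about `fraction` of the cells replaced by NaN.

    Hypothetical helper for illustration; not the library's add_missing_values().
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # local RNG so the result is reproducible
    cells = [(i, j) for i, row in enumerate(rows) for j in range(len(row))]
    n_missing = round(fraction * len(cells))
    out = [list(row) for row in rows]  # copy so the input is left untouched
    for i, j in rng.sample(cells, n_missing):
        out[i][j] = math.nan
    return out


data = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]
noisy = add_nan_randomly(data, fraction=0.25, seed=42)
n_nan = sum(math.isnan(v) for row in noisy for v in row)
print(n_nan)  # 3 of the 12 cells are now NaN
```

A dataset corrupted this way is handy for checking that a pipeline containing, for example, DropMissingColumns or ReportNonFinite reacts to missing data as expected.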
PCATorch: faster PCA using PyTorch with GPU support.
SkorchReshaper, SkorchCNNReshaper: reshape X and y for nn.Linear and nn.Conv1d/2d respectively. (For nn.Conv2d, uses np.lib.stride_tricks.sliding_window_view().)
AllowNaN: wraps a loss module and assigns 0 to y and y_hat for indices where y contains NaN in forward().
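The masking idea behind AllowNaN can be illustrated without PyTorch. The sketch below uses plain lists and a hand-written mean squared error; `allow_nan` and `mse` are hypothetical names, and the real class wraps a loss module rather than a function.

```python
import math


def mse(y_hat, y):
    """Mean squared error over two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_hat, y)) / len(y)


def allow_nan(loss_fn):
    """Hypothetical wrapper: zero out y and y_hat wherever y is NaN, then call loss_fn."""

    def wrapped(y_hat, y):
        mask = [math.isnan(v) for v in y]
        y_clean = [0.0 if m else v for v, m in zip(y, mask)]
        y_hat_clean = [0.0 if m else v for v, m in zip(y_hat, mask)]
        return loss_fn(y_hat_clean, y_clean)

    return wrapped


y = [1.0, math.nan, 3.0, math.nan]
y_hat = [1.5, 9.0, 2.0, -4.0]

print(math.isnan(mse(y_hat, y)))  # True: one NaN target poisons the plain loss
print(allow_nan(mse)(y_hat, y))   # 0.3125
```

In this sketch the masked positions contribute zero error but still count in the denominator of the mean, so the loss shrinks as the share of NaN targets grows. Whether the library compensates for this is not stated above.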
This project follows the all-contributors specification. Contributions of any kind welcome!