Add gradient calculation in _huber_loss_and_gradient
Add tests to check the correctness of the loss and gradient
Fix for old scipy
Add parameter sigma for robust linear regression
Add gradient formula to robust _huber_loss_and_gradient
Add fit_intercept option and fix tests
Add docs to HuberRegressor and the helper functions
Add example demonstrating ridge_regression vs huber_regression
Add sample_weight implementation
Add scaling invariant huber test
Remove exp and add bounds to fmin_l_bfgs_b
Add sparse data support
Add more tests and refactoring of code
Add narrative docs
review huber regressor
Minor additions to docs and tests
Minor fixes that deal with NaN values in targets and old versions of SciPy and NumPy
Add HuberRegressor to robust estimator
Refactored computation of gradient and make docs render properly
Temp
Remove float64 dtype conversion
trivial optimizations and add a note about R
Remove sample_weights special_casing
address @amueller comments
- * :ref:`RANSAC <ransac_regression>` is faster, and scales much better
-   with the number of samples
+ * :ref:`HuberRegressor <huber_regression>` should be faster than
+   :ref:`RANSAC <ransac_regression>` and :ref:`Theil Sen <theil_sen_regression>`
+   unless the number of samples are very large, i.e ``n_samples`` >> ``n_features``.
+   This is because :ref:`RANSAC <ransac_regression>` and :ref:`Theil Sen <theil_sen_regression>`
+   fit on smaller subsets of the data. However, both :ref:`Theil Sen <theil_sen_regression>`
+   and :ref:`RANSAC <ransac_regression>` are unlikely to be as robust as
+   :ref:`HuberRegressor <huber_regression>` for the default parameters.

- * :ref:`RANSAC <ransac_regression>` will deal better with large
-   outliers in the y direction (most common situation)
+ * :ref:`RANSAC <ransac_regression>` is faster than :ref:`Theil Sen <theil_sen_regression>`
+   and scales much better with the number of samples
+
+ * :ref:`RANSAC <ransac_regression>` will deal better with large
+   outliers in the y direction (most common situation)

  * :ref:`Theil Sen <theil_sen_regression>` will cope better with
    medium-size outliers in the X direction, but this property will
@@ -1050,6 +1059,67 @@ considering only a random subset of all possible combinations.
  .. [#f2] T. Kärkkäinen and S. Äyrämö: `On Computation of Spatial Median for Robust Data Mining. <http://users.jyu.fi/~samiayr/pdf/ayramo_eurogen05.pdf>`_

+ .. _huber_regression:
+
+ Huber Regression
+ ----------------
+
+ The :class:`HuberRegressor` is different to :class:`Ridge` because it applies a
+ linear loss to samples that are classified as outliers.
+ A sample is classified as an inlier if the absolute error of that sample is
+ lesser than a certain threshold. It differs from :class:`TheilSenRegressor`
+ and :class:`RANSACRegressor` because it does not ignore the effect of the outliers
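The inlier/outlier rule described in the added Huber Regression paragraph can be illustrated with a short sketch. It assumes the textbook Huber form: quadratic loss for samples whose absolute error is below the threshold, linear loss beyond it. The function name `huber_loss` and the parameter name `threshold` are illustrative and do not come from this patch, and the exact scaling used inside `HuberRegressor` may differ.

```python
import numpy as np


def huber_loss(residuals, threshold=1.35):
    """Per-sample Huber-style loss (textbook form, illustrative names).

    Samples with |residual| < threshold are treated as inliers and get a
    quadratic loss; the rest are treated as outliers and get a linear loss.
    """
    abs_r = np.abs(np.asarray(residuals, dtype=float))
    is_inlier = abs_r < threshold
    quadratic = 0.5 * abs_r ** 2
    # Linear branch, offset so both branches meet at |residual| == threshold.
    linear = threshold * (abs_r - 0.5 * threshold)
    return np.where(is_inlier, quadratic, linear)


# Two inliers and one large outlier.
residuals = np.array([0.5, -1.0, 10.0])
losses = huber_loss(residuals, threshold=1.35)
squared = 0.5 * residuals ** 2  # plain squared loss, for comparison
```

For the residual of 10.0 the linear branch yields a much smaller penalty than the squared loss would, so the outlier still contributes to the fit but with reduced influence.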