StandardScaler in Sklearn
29 Aug 2024

When and How to Use StandardScaler?

StandardScaler is useful when the features of a dataset span very different ranges or are recorded in different units of measurement. It shifts each feature so that its mean is 0 and rescales it so that its variance is 1. Because the mean and standard deviation are estimated from the data itself, outliers have a significant influence on both: a few extreme values inflate the standard deviation and compress the remaining scaled values into a narrow range.

Differences in feature scale can cause problems for many machine learning algorithms. In algorithms that calculate distances, for instance, a feature whose values are much larger than those of the other features will dominate the distance calculation.

StandardScaler rests on the idea that features lying in different ranges do not contribute equally to the model's fitted parameters and training function, and may even bias the predictions made with that model. Therefore, before passing the features to a machine learning model, we standardize the data (µ = 0, σ = 1). Standardization is a common feature engineering step for addressing this issue.

Standardizing using Sklearn

StandardScaler standardizes features by removing the mean and scaling to unit variance. The standard score of a sample x is

z = (x - u) / s

where u is the mean of the training samples (or zero if with_mean=False) and s is the standard deviation of the training samples (or one if with_std=False).

Centring and scaling are applied independently to each feature, using statistics computed on the training set. The fit() method stores the mean and standard deviation so that they can be applied to later samples with transform().

Parameters:
copy (bool, default True): if False, the scaler tries to work in place instead of on a copy of the data.
with_mean (bool, default True): if True, centre the data before scaling.
with_std (bool, default True): if True, scale the data to unit variance.
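The effect of with_mean and with_std on the formula z = (x - u) / s can be sketched in plain Python. This is only an illustration, not scikit-learn's implementation: the helper name standard_score and the sample data are hypothetical, and the sketch assumes the population standard deviation (dividing by n), which is the convention scikit-learn uses.

```python
from statistics import fmean, pstdev


def standard_score(values, with_mean=True, with_std=True):
    """Hypothetical helper: apply z = (x - u) / s to a single feature."""
    # u is the feature mean, or 0 when centring is switched off.
    u = fmean(values) if with_mean else 0.0
    # s is the population standard deviation, or 1 when scaling is switched off.
    s = pstdev(values) if with_std else 1.0
    # Assumption: leave a constant feature unscaled rather than divide by zero.
    if s == 0:
        s = 1.0
    return [(x - u) / s for x in values]


# Made-up sample data for one feature.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]

z = standard_score(heights_cm)                               # centred and scaled
centered_only = standard_score(heights_cm, with_std=False)   # centred, not scaled

print(z)
print(centered_only)
```

With both flags left at their defaults, the result has mean 0 and standard deviation 1; with with_std=False the values are only shifted by the mean.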
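Attributes:
scale_: per-feature scaling factor applied to the data.
mean_: per-feature mean of the training set.
var_: per-feature variance of the training set.
n_features_in_: number of features seen during fit().
n_samples_seen_: number of samples processed by the estimator.

Methods of the StandardScaler Class
fit(X): compute the mean and standard deviation to be used for later scaling.
fit_transform(X): fit to the data, then transform it.
transform(X): standardize the data by centring and scaling.
inverse_transform(X): scale the data back to its original representation.
partial_fit(X): update the mean and standard deviation incrementally, one batch at a time.
get_params() and set_params(): read and set the parameters of the scaler.
get_feature_names_out(): return the output feature names.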
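How fit(), transform() and inverse_transform() work together can be sketched as follows. The arrays X_train and X_new are made-up illustration data, not part of the original tutorial; the point is that the statistics are learned from the training data only and then reused for later samples.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data: two features on very different scales.
X_train = np.array([[1.0, 100.0],
                    [2.0, 200.0],
                    [3.0, 300.0]])
X_new = np.array([[4.0, 400.0]])  # a later sample, not seen during fit()

scaler = StandardScaler()
scaler.fit(X_train)                     # stores mean_ and scale_ per feature

X_train_scaled = scaler.transform(X_train)
X_new_scaled = scaler.transform(X_new)  # reuses the training statistics

# inverse_transform() maps scaled values back to the original units.
X_restored = scaler.inverse_transform(X_train_scaled)

print(scaler.mean_)
print(scaler.scale_)
print(X_new_scaled)
```

Fitting on the training data and calling only transform() on later samples keeps both sets on the same scale and avoids leaking information from the later samples into the stored statistics.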
Example of StandardScaler

First, we import the required libraries. StandardScaler is provided by the sklearn.preprocessing module, and the iris dataset can be loaded from sklearn.datasets.

After loading the iris dataset, we separate the independent features from the target feature. We then create an object of the StandardScaler class and standardize the data by calling the fit_transform() method on that object.

Output

The output shows the first three rows of the original features, the same three rows after standardization, and the per-feature means.

[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]]
[[-0.90068117  1.01900435 -1.34022653 -1.3154443 ]
 [-1.14301691 -0.13197948 -1.34022653 -1.3154443 ]
 [-1.38535265  0.32841405 -1.39706395 -1.3154443 ]]
[5.84333333 3.05733333 3.758      1.19933333]
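The steps described above can be sketched as follows. This is a reconstruction under assumptions rather than the tutorial's original listing: the variable names are illustrative, and the three print statements are chosen to match the three blocks of output shown (first rows before scaling, first rows after scaling, and the stored means).

```python
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

# Load the iris dataset and separate the independent and target features.
iris = load_iris()
X = iris.data    # four numeric features per flower
y = iris.target  # class labels, not scaled

# Create the scaler object and standardize the features in one step.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

print(X[:3, :])         # first three rows before scaling
print(X_scaled[:3, :])  # the same rows after scaling
print(scaler.mean_)     # per-feature means learned by fit
```

Only the features are scaled; the target y is left untouched because it holds class labels rather than measurements.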