Finding the Nearest Number in a DataFrame Using Pandas

Last Updated : 06 Jan, 2025

When working with data, pandas offers several ways to find the number closest to a given target value in a dataset, including argsort(), idxmin(), and slicing.

Method 1: Using argsort() to Find the Nearest Number

Python

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'values': [10, 20, 30, 40, 50]
})

# Target number
target = 33

# Absolute distance of each value from the target
differences = np.abs(df['values'] - target)

# Position of the smallest distance; to_numpy() keeps this positional for any index
nearest_index = differences.to_numpy().argsort()[0]
nearest_value = df['values'].iloc[nearest_index]

print(f"Nearest value to {target} is {nearest_value}")

Output:

Nearest value to 33 is 30

Here we compute the absolute difference between the target and each value in the dataset using np.abs(). Calling argsort() on those differences returns the positions that would sort them in ascending order.

This approach is helpful when we need the position of the closest number in the dataset. Once the positions are ordered by distance, selecting the nearest value is simple: argsort()[0] is the position of the smallest difference, and therefore of the closest number.
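To make the intermediate steps concrete, the following minimal sketch prints the differences and the ordering that argsort produces for the same data. The variable names `values` and `order` are illustrative only.

```python
import numpy as np
import pandas as pd

values = pd.Series([10, 20, 30, 40, 50])
target = 33

# Absolute distance of every value from the target
differences = (values - target).abs()
print(differences.tolist())   # [23, 13, 3, 7, 17]

# Positions that would sort the distances in ascending order
order = np.argsort(differences.to_numpy())
print(order.tolist())         # [2, 3, 1, 4, 0]

# The first entry of that ordering is the position of the closest value
print(values.iloc[order[0]])  # 30
```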

Method 2: Using idxmin() to Find the Nearest Number

Python

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'values': [10, 20, 30, 40, 50]
})

# Target number
target = 33

# Absolute distance of each value from the target
differences = np.abs(df['values'] - target)

# idxmin() returns the index label of the smallest distance, so look it up with .loc
nearest_index = differences.idxmin()
nearest_value = df['values'].loc[nearest_index]

print(f"Nearest value to {target} is {nearest_value}")

Output:

Nearest value to 33 is 30

Here we again compute the absolute difference between the target and each value in the dataset, but instead of sorting we call idxmin() directly on the absolute differences to get the index label of the smallest difference.

Because idxmin() returns the index of the smallest value directly, it is useful when we only need the single nearest value. It makes one pass over the data instead of sorting it, so it is typically faster than argsort(), especially on large datasets where sorting costs noticeably more time and computing power.
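One detail worth illustrating is that idxmin() returns an index label rather than a position. The sketch below assumes a hypothetical Series named `prices` with string labels, and also shows that when two values are equally close, idxmin() returns the first one.

```python
import pandas as pd

# Hypothetical data: the index holds string labels rather than 0..n-1
prices = pd.Series([10, 20, 30, 40, 50], index=["a", "b", "c", "d", "e"])
target = 33

differences = (prices - target).abs()

nearest_label = differences.idxmin()        # index label of the smallest distance
nearest_value = prices.loc[nearest_label]   # look up by label, not by position
print(nearest_label, nearest_value)         # c 30

# 35 is equally far from 30 and 40; idxmin() keeps the first occurrence
tie_label = (prices - 35).abs().idxmin()
print(tie_label, prices.loc[tie_label])     # c 30
```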

Method 3: Finding N Nearest Numbers Using argsort() with Slicing

Python

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'values': [10, 20, 30, 40, 50]
})

# Target number
target = 33
N = 3  # Number of nearest values you want

# Absolute distance of each value from the target
differences = np.abs(df['values'] - target)

# Positions of the N smallest distances, closest first
nearest_indices = differences.to_numpy().argsort()[:N]
nearest_values = df['values'].iloc[nearest_indices]

print(f"The {N} nearest values to {target} are {nearest_values.tolist()}")

Output:

The 3 nearest values to 33 are [30, 40, 20]

Sometimes we need to find the N nearest values to a given target. To achieve this we can use argsort() with slicing to extract the N closest values. It is the same as Method 1, but here we use argsort()[:N], which gives the first N positions of the sorted order.
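As a possible alternative to slicing, and not one of the methods described above, pandas also provides Series.nsmallest(), which returns the N smallest values with their index labels. A minimal sketch on the same data, with `nearest_labels` as an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({"values": [10, 20, 30, 40, 50]})
target = 33
N = 3

differences = (df["values"] - target).abs()

# nsmallest() returns the N smallest distances, closest first, keeping index labels
nearest_labels = differences.nsmallest(N).index
nearest_values = df.loc[nearest_labels, "values"]

print(nearest_values.tolist())  # [30, 40, 20]
```

Because the result keeps the original index labels, the lookup uses .loc instead of .iloc.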

Conclusion

When working with numerical data in Pandas, finding the nearest number to a target is a common task. Depending on our needs, we can use argsort() or idxmin().

Use idxmin() for a simple, direct approach when we want the single nearest number. It avoids sorting, so it is comparatively fast.
Use argsort() when we need the sorted positions and want to extract more than one nearest number.
To find multiple nearest numbers, use argsort() with slicing to extract the closest N values.

These methods provide efficient and flexible ways to handle nearest number searches in our datasets.

sahilgupta03