FuzzyWuzzy Python Library

17 Mar 2025 | 6 min read

In this tutorial, we will learn how to match strings using the third-party Python library FuzzyWuzzy and determine how similar they are, using various examples.

Introduction

Python provides several ways to compare two strings. The main ones are given below.

Using Regex
Simple Compare
Using difflib

But there is another approach to comparison, known as fuzzywuzzy. It is quite effective at recognizing two strings that refer to the same thing but are written slightly differently. Sometimes we also need a program that can automatically identify misspellings.

Fuzzy string matching is the process of finding strings that approximately match a given pattern. FuzzyWuzzy uses the Levenshtein Distance to calculate the difference between sequences.

This library can help map databases that lack a common key, such as joining two tables by company name when the names appear differently in each table.

Example

Let's see the following example.
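A minimal sketch of an exact comparison with the == operator; the two strings are illustrative assumptions.

```python
# Two illustrative strings with identical content
str1 = "Welcome to Javatpoint"
str2 = "Welcome to Javatpoint"

# == compares the strings character by character
result = str1 == str2
print(result)
```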

Output:

True

The above code returns True because the strings match exactly (100%). What if we make a change in str2?
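One possible change, assumed here for illustration, is to alter only the letter case of str2:

```python
str1 = "Welcome to Javatpoint"
# Same words, but every letter is lower case
str2 = "welcome to javatpoint"

# The comparison is case-sensitive, so "W" and "w" count as different characters
result = str1 == str2
print(result)
```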

Output:

False

Here the code returns False. The strings look nearly identical to the human eye, but not to the interpreter. However, we can solve this problem by converting both strings to lower case.
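A sketch of the lower-case fix, again with illustrative strings that differ only in case:

```python
str1 = "Welcome to Javatpoint"
str2 = "welcome to javatpoint"

# Normalize both sides to lower case before comparing
result = str1.lower() == str2.lower()
print(result)
```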

Output:

True

But if the strings differ by a character, for example a trailing period, lowercasing no longer helps and we run into another problem.
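A sketch of this case, assuming the two strings differ only by a trailing period:

```python
# str1 ends with a period, str2 does not
str1 = "Welcome to javatpoint."
str2 = "Welcome to javatpoint"

# Lowercasing cannot remove the extra character, so the strings still differ
result = str1.lower() == str2.lower()
print(result)
```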

Output:

False

To resolve such problems, we need more effective tools to compare strings, and fuzzywuzzy is one such tool for scoring how similar two strings are.

The Levenshtein Distance

The Levenshtein distance measures the distance between two sequences of characters. It is the minimum number of single-character edits needed to change one string into the other. These edits can be insertions, deletions or substitutions.

Example -
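A minimal pure-Python sketch of the Levenshtein distance using dynamic programming. The function name levenshtein_distance is an assumption made for illustration and is not part of the fuzzywuzzy API.

```python
def levenshtein_distance(s1, s2):
    """Return the minimum number of single-character edits turning s1 into s2."""
    # previous[j] is the distance between the prefix of s1 processed so far
    # and the first j characters of s2
    previous = list(range(len(s2) + 1))
    for i, ch1 in enumerate(s1, start=1):
        current = [i]  # turning i characters into an empty string takes i deletions
        for j, ch2 in enumerate(s2, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            # Substitution is free when the characters already match
            substitute_cost = previous[j - 1] + (ch1 != ch2)
            current.append(min(insert_cost, delete_cost, substitute_cost))
        previous = current
    return previous[-1]


print(levenshtein_distance("Welcome to javatpoint.", "Welcome to javatpoint"))  # 1
print(levenshtein_distance("kitten", "sitting"))  # 3
```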

We use the above function on the earlier example, where we were trying to compare "Welcome to javatpoint." to "Welcome to javatpoint". The two strings are very likely the same because the Levenshtein distance between them is small: a single deletion.

The FuzzyWuzzy Package

The name of this library sounds weird and funny, but the library is very useful. It compares two strings and returns a score out of 100 that indicates how closely they match. To work with this library, we need to install it in our Python environment.

Installation

We can install this library using the pip command, pip install fuzzywuzzy.

Collecting fuzzywuzzy
  Downloading fuzzywuzzy-0.18.0-py2.py3-none-any.whl (18 kB)
Installing collected packages: fuzzywuzzy
Successfully installed fuzzywuzzy-0.18.0

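Optionally, we can also install the python-Levenshtein package (pip install python-Levenshtein). fuzzywuzzy uses it to speed up matching when it is available and otherwise falls back to a slower pure-Python implementation.

Let's understand the following methods of the fuzzywuzzy library.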

Fuzz Module

The fuzz module is used to compare two given strings at a time. It returns a score out of 100 after comparing them with one of several methods.

Fuzz.ratio()

It is one of the most important methods of the fuzz module. It compares two strings and scores them on the basis of how closely they match. Let's understand the following example.

Example -

Output:

As we can see in the above code, the fuzz.ratio() method returns a high score, which means there is only a very slight difference between the strings.

Fuzz.partial_ratio()

The fuzzywuzzy library provides another powerful method, partial_ratio(). It is used to handle more complex string comparisons such as substring matching. Let's see the following example.

Example -

Output:

44
100

Explanation:

The partial_ratio() method can detect the substring, so it yields a 100% similarity. It follows the optimal partial logic: given a shorter string of length k and a longer string of length m, the algorithm finds the best-matching substring of length k in the longer string.

Fuzz.token_sort_ratio

The partial_ratio() method does not guarantee an accurate result when the words of the two strings appear in a different order.

But the fuzzywuzzy module provides a solution. Let's understand the following example.

Example -

Output:

59
74
100

Explanation:

In the above code, we have used the token_sort_ratio() method, which has an advantage over partial_ratio(). In this method, the string tokens are sorted alphabetically and joined together before the comparison, so word order no longer matters. But what if the strings differ widely in length?

Let's understand the following example.

Example -

Output:

40
64
61
95

In the above code, we have used another method called fuzz.token_set_ratio(), which performs a set operation that separates the common tokens from the remaining ones and then makes pairwise ratio() comparisons.

The sorted intersection of tokens is always the same for both strings, so the score rises when that intersection makes up a larger share of the full string or when the remaining tokens are closer to each other.

The fuzzywuzzy package also provides the process module, which lets us find the string with the highest similarity in a list of choices. Let's understand the following example.

Example -

Output:

[('hello', 90), ('Hello Good', 90), ('Morning', 90), ('Good Evenining', 59)]
('hello', 90)

The above code returns the match score of each string in the given list, followed by the best match.

Fuzz.WRatio

The fuzz module also provides WRatio(), the weighted ratio that the process module uses as its default scorer. It often gives a better result than the simple ratio because it handles lower and upper case, punctuation and some other factors too. Let's understand the following example.

Example -

Output:

Conclusion

In this tutorial, we have discussed how to match strings and determine how close they are. The examples are simple, but they are enough to show how a computer treats mismatched strings. Many real-life applications, such as spell checking and matching DNA sequences in bioinformatics, rely on fuzzy string matching.
