Python | Remove duplicates in Matrix

Let's understand how to remove duplicate rows in a matrix (2D list) in Python.

1. Understanding the Problem

Given a matrix:

matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [1, 2, 3],
    [7, 8, 9],
    [4, 5, 6],
]

The goal is to remove duplicate rows while keeping the first occurrence of each row in its original position, so that the result is:

cleaned_matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

2. Basic Approach Using Lists

The most straightforward way is to iterate over each row and check for its existence in the new matrix. If it doesn't exist, append it:

cleaned_matrix = []
for row in matrix:
    if row not in cleaned_matrix:
        cleaned_matrix.append(row)
print(cleaned_matrix)
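The membership check above works because Python compares list rows by value (==) rather than by identity. A minimal sketch illustrating this, where remove_duplicate_rows_basic is a hypothetical helper name introduced only for this example:

```python
def remove_duplicate_rows_basic(matrix):
    """Return the rows of matrix with later duplicates dropped, keeping first-seen order.

    Hypothetical helper for illustration; it wraps the loop shown above.
    """
    cleaned = []
    for row in matrix:
        # `in` compares `row` to each stored row with ==, so equal contents match
        if row not in cleaned:
            cleaned.append(row)
    return cleaned


first = [1, 2, 3]
second = [1, 2, 3]  # equal contents, but a separate list object

print(first == second)  # True
print(first is second)  # False
print(remove_duplicate_rows_basic([first, second]))  # [[1, 2, 3]]
```

Because the comparison is by value, two rows built independently still count as duplicates of each other.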

3. Using Set for Tracking with Loop

Tracking the rows already seen in a set makes the duplicate check more efficient. Lists (the rows of our matrix) are mutable and therefore unhashable, so they can't be added to a set; each row has to be converted to a tuple first:

seen = set()
cleaned_matrix = []
for row in matrix:
    row_tuple = tuple(row)
    if row_tuple not in seen:
        seen.add(row_tuple)
        cleaned_matrix.append(row)
print(cleaned_matrix)
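The need for the tuple conversion can be checked directly. The short sketch below, which uses illustrative variable names not taken from the code above, shows that hashing a list fails while the equivalent tuple can be stored in a set:

```python
row = [1, 2, 3]

# Set members must be hashable; hashing a list raises TypeError
try:
    hash(row)
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

print(list_is_hashable)  # False

# The tuple built from the same values is hashable and can go into a set
seen = {tuple(row)}
print((1, 2, 3) in seen)  # True
```

Note that tuple(row) is hashable only if every element of the row is itself hashable, which holds for numbers and strings but not for rows that contain nested lists.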

4. Using List Comprehension

The procedure can be condensed into a list comprehension:

seen = set()
cleaned_matrix = [
    seen.add(tuple(row)) or row
    for row in matrix
    if tuple(row) not in seen
]
print(cleaned_matrix)

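The or operator is what makes this work. seen.add(...) always returns None, which is falsy, so or goes on to evaluate its right-hand operand and returns it, which is the row itself. The if clause is evaluated before the expression, so a row is recorded in seen only after it has passed the membership check. Because the comprehension relies on a side effect, the explicit loop from the previous section is usually easier to read.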
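To see why the expression seen.add(tuple(row)) or row yields the row, the two halves can be evaluated separately. A small sketch, with result as an illustrative variable name:

```python
seen = set()
row = [1, 2, 3]

# set.add mutates the set in place and returns None
result = seen.add(tuple(row))
print(result)  # None

# None is falsy, so `or` returns its right-hand operand
print(result or row)  # [1, 2, 3]

# The side effect has still happened: the tuple is now recorded
print(seen)  # {(1, 2, 3)}
```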

5. Conclusion

All methods provided will produce the cleaned matrix:

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

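All three methods remove duplicates and keep the first occurrence of each row in its original order. For larger matrices, tracking seen rows in a set is usually more efficient: a membership check in a set is O(1) on average, whereas in a list it is O(n) in the number of rows collected so far, which makes the basic approach quadratic in the number of rows in the worst case.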
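The difference can be measured with the standard timeit module. The following is a rough sketch rather than a rigorous benchmark: dedupe_with_list, dedupe_with_set, and the randomly generated big_matrix are hypothetical names and data made up for this comparison, and the actual timings depend on the machine and the input.

```python
import random
import timeit


def dedupe_with_list(matrix):
    """Basic approach: membership is checked against the result list."""
    cleaned = []
    for row in matrix:
        if row not in cleaned:  # linear scan over the rows kept so far
            cleaned.append(row)
    return cleaned


def dedupe_with_set(matrix):
    """Set-based approach: membership is checked against a set of tuples."""
    seen = set()
    cleaned = []
    for row in matrix:
        key = tuple(row)
        if key not in seen:  # hash lookup, constant time on average
            seen.add(key)
            cleaned.append(row)
    return cleaned


random.seed(0)  # fixed seed so repeated runs use the same data
# Hypothetical test data: 2000 rows of 3 small integers, so many rows repeat
big_matrix = [[random.randint(0, 9) for _ in range(3)] for _ in range(2000)]

# Both approaches should agree on the result
assert dedupe_with_list(big_matrix) == dedupe_with_set(big_matrix)

list_time = timeit.timeit(lambda: dedupe_with_list(big_matrix), number=3)
set_time = timeit.timeit(lambda: dedupe_with_set(big_matrix), number=3)
print(f"list-based: {list_time:.4f} s, set-based: {set_time:.4f} s")
```

Increasing the number of rows in big_matrix should widen the gap between the two timings, since the list-based scan grows with the number of unique rows already collected.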

