Java Program to Remove Duplicate Words from a String

Introduction

Removing duplicate words from a string is a common task in text processing, especially when cleaning up or normalizing text data. This exercise helps you understand how to split strings, use sets to filter out duplicates, and reassemble the string in Java. This guide will walk you through writing a Java program that removes duplicate words from a given string.

Problem Statement

Create a Java program that:

  • Prompts the user to enter a string.
  • Removes every repeated word from the string, keeping only the first occurrence of each.
  • Displays the string without duplicates.

Example:

  • Input: "Java is a programming language. Java is also an island. Java is popular."
  • Output: "Java is a programming language. also an island. popular."

Solution Steps

  1. Read the String: Use the Scanner class to take the string as input from the user.
  2. Split the String into Words: Use the split() method to break the string into words.
  3. Use a Set to Remove Duplicates: Use a LinkedHashSet to retain the order of words while removing duplicates.
  4. Reassemble the String: Combine the unique words back into a single string.
  5. Display the Result: Print the string without duplicate words.

Java Program

// Java Program to Remove Duplicate Words from a String
// Author: https://www.rameshfadatare.com/
import java.util.LinkedHashSet;
import java.util.Scanner;
import java.util.Set;

public class RemoveDuplicateWords {
    public static void main(String[] args) {
        // Step 1: Read the string from the user
        try (Scanner scanner = new Scanner(System.in)) {
            System.out.print("Enter a string: ");
            String input = scanner.nextLine();

            // Step 2: Split the string into words
            // (trim() avoids an empty first token when the input starts with whitespace)
            String[] words = input.trim().split("\\s+");

            // Step 3: Use a LinkedHashSet to remove duplicates and retain order
            Set<String> uniqueWords = new LinkedHashSet<>();
            for (String word : words) {
                uniqueWords.add(word);
            }

            // Step 4: Reassemble the string without duplicates
            String result = String.join(" ", uniqueWords);

            // Step 5: Display the result
            System.out.println("String after removing duplicate words: " + result);
        }
    }
}

Explanation

Step 1: Read the String

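  • The program creates a Scanner on System.in inside a try-with-resources block, so the scanner is closed automatically when the block ends. The nextLine() method returns the entire line the user typed, without the trailing line break.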
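
The difference between nextLine() and next() is easy to overlook, so here is a minimal sketch of it. To keep it runnable without keyboard input, it builds the Scanner on a String instead of System.in; the class name ScannerDemo and both helper methods are illustrative names, not part of the program above.

```java
import java.util.Scanner;

public class ScannerDemo {
    // Returns the first full line of the source, or "" if there is none.
    static String readFirstLine(String source) {
        try (Scanner scanner = new Scanner(source)) {
            return scanner.hasNextLine() ? scanner.nextLine() : "";
        }
    }

    // Returns only the first whitespace-delimited token, or "" if there is none.
    static String readFirstToken(String source) {
        try (Scanner scanner = new Scanner(source)) {
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) {
        String typed = "Java is fun\nsecond line";
        System.out.println(readFirstLine(typed));  // prints "Java is fun"
        System.out.println(readFirstToken(typed)); // prints "Java"
    }
}
```

Because the task is about whole sentences, nextLine() is the natural choice here; next() would stop at the first space.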

Step 2: Split the String into Words

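  • The split() method divides the string into words wherever whitespace occurs. The regular expression \s+ (written as "\\s+" in a Java string literal) matches one or more whitespace characters, so a run of several spaces or tabs between two words counts as a single separator.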
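
As a small sketch of how split("\\s+") treats repeated and leading whitespace, consider the following. The class name SplitDemo and the helper splitWords are illustrative names chosen for this example.

```java
import java.util.Arrays;

public class SplitDemo {
    // Splits on runs of whitespace after trimming both ends.
    static String[] splitWords(String input) {
        String trimmed = input.trim();
        // Splitting an empty string would yield one empty token, so return no words instead.
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // Leading whitespace produces an empty first token when split() is used alone.
        System.out.println(Arrays.toString("  Java   is fun".split("\\s+"))); // [, Java, is, fun]
        // Trimming first leaves only the three words.
        System.out.println(Arrays.toString(splitWords("  Java   is fun")));   // [Java, is, fun]
    }
}
```

Trailing whitespace is not a problem, because split() discards trailing empty tokens by default.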

Step 3: Use a Set to Remove Duplicates

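  • A LinkedHashSet stores the words. Its add() method ignores any element that is already present, so only the first occurrence of each word is kept, and the set iterates in insertion order, which preserves the original word order. Words are compared with equals(), so the comparison is case-sensitive and punctuation counts: "Java" and "java" are different words, as are "language" and "language.".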
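
The behavior of add() can be seen in isolation with a short sketch like the one below. LinkedHashSetDemo, uniqueInOrder, and rejectedDuplicates are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LinkedHashSetDemo {
    // Keeps the first occurrence of each word, in the order the words were seen.
    static Set<String> uniqueInOrder(String[] words) {
        return new LinkedHashSet<>(Arrays.asList(words));
    }

    // Collects the words that add() rejected because they were already in the set.
    static List<String> rejectedDuplicates(String[] words) {
        Set<String> seen = new LinkedHashSet<>();
        List<String> rejected = new ArrayList<>();
        for (String word : words) {
            if (!seen.add(word)) { // add() returns false for an element already present
                rejected.add(word);
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        String[] words = {"Java", "is", "java", "is", "Java"};
        System.out.println(uniqueInOrder(words));      // [Java, is, java]
        System.out.println(rejectedDuplicates(words)); // [is, Java]
    }
}
```

Note that "java" survives next to "Java" because the comparison is case-sensitive.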

Step 4: Reassemble the String

  • The String.join() method is used to concatenate the unique words back into a single string, with spaces separating the words.
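
For comparison, the split, de-duplicate, and join steps can also be written as a single stream pipeline. This is an alternative sketch, not the approach used in the program above; the class and method names are made up for the example. It relies on Stream.distinct(), which for an ordered stream keeps the first occurrence of each element.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class JoinDemo {
    // Splits on whitespace, drops repeated words, and joins the rest with single spaces.
    static String removeDuplicatesWithStreams(String input) {
        return Arrays.stream(input.trim().split("\\s+"))
                .distinct()                        // keeps the first occurrence, in order
                .collect(Collectors.joining(" ")); // plays the role of String.join(" ", ...)
    }

    public static void main(String[] args) {
        // String.join accepts any Iterable of strings, including a List or a Set.
        System.out.println(String.join(" ", Arrays.asList("Java", "is", "fun"))); // Java is fun
        System.out.println(removeDuplicatesWithStreams("a  b a c b"));            // a b c
    }
}
```

The loop with a LinkedHashSet and the stream version should give the same result; which one to use is largely a matter of taste.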

Step 5: Display the Result

  • The program prints the string after removing duplicate words.

Output Example

Example:

Enter a string: Java is a programming language. Java is also an island. Java is popular.
String after removing duplicate words: Java is a programming language. also an island. popular.
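
In this output, "language.", "island." and "popular." keep their periods because each token is compared exactly as typed. If duplicates should instead be detected regardless of case and surrounding punctuation, one possible variant is sketched below. The normalization rule (lowercase, strip leading and trailing punctuation) and all names in the block are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class NormalizedDuplicateRemover {
    // Hypothetical comparison key: lowercased token without leading or trailing punctuation.
    static String key(String token) {
        return token.replaceAll("^\\p{Punct}+|\\p{Punct}+$", "").toLowerCase(Locale.ROOT);
    }

    // Keeps the first token for each key, preserving the token's original spelling.
    static String removeDuplicates(String input) {
        Set<String> seenKeys = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String token : input.trim().split("\\s+")) {
            if (seenKeys.add(key(token))) {
                kept.add(token);
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        System.out.println(removeDuplicates("Java is fun. java IS fun!")); // Java is fun.
    }
}
```

The order is tracked in a separate list here because the set holds normalized keys rather than the original tokens.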

Conclusion

This Java program demonstrates how to remove duplicate words from a user-input string. It covers essential concepts such as string manipulation, using sets to filter duplicates, and reassembling strings, making it a valuable exercise for beginners learning Java programming.
