 
Java program to delete duplicate lines in a text file
The Set interface does not allow duplicate elements. Its add() method adds the given element to the Set object and returns true if the element was not already present. If you try to add an element that already exists, the set is left unchanged and add() returns false.
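As a quick illustration of that return value, here is a minimal sketch. The class name SetAddDemo and the helper addTwice are hypothetical names chosen for this example, and HashSet is just one possible Set implementation.

```java
import java.util.HashSet;
import java.util.Set;

public class SetAddDemo {

    // Hypothetical helper: adds the same line twice to a fresh set
    // and returns what add() reported each time.
    static boolean[] addTwice(String line) {
        Set<String> seen = new HashSet<>();
        boolean first = seen.add(line);   // line is new, so it is stored
        boolean second = seen.add(line);  // line is already present, so it is rejected
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        boolean[] results = addTwice("Hello how are you");
        System.out.println(results[0]); // prints true
        System.out.println(results[1]); // prints false
    }
}
```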
Problem Statement
Given a file which contains duplicate lines, write a program in Java to read the file, remove duplicate lines, and write the unique lines to a new file.
Input
Hello how are you
Hello how are you
welcome to Tutorialspoint
Output
Hello how are you
welcome to Tutorialspoint
Basic Approach
The basic approach to removing duplicate lines from a file:
- Step 1. Instantiate the Scanner class (or any class that reads data from a file).
- Step 2. Instantiate the FileWriter class (or any class that writes data into a file).
- Step 3. Create a Set object (for example, a HashSet).
- Step 4. Read each line of the file and store it in a String, say input.
- Step 5. Try to add this String to the Set object.
- Step 6. If the addition is successful, append that particular line to the file writer.
- Step 7. Finally, flush the contents of the FileWriter to the output file.
If a file contains a particular line more than once, the first occurrence is added to the Set object and is therefore appended to the file writer.
When the same line is encountered again while reading the rest of the file, it already exists in the Set object, so the add() method rejects it and the line is not written a second time.
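The filtering logic described above can be sketched without any file handling. This is a minimal in-memory version, assuming the lines have already been read into a list; the class name UniqueLinesSketch and the method uniqueLines are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UniqueLinesSketch {

    // Hypothetical helper: keeps only the first occurrence of each line,
    // in the order the lines were read.
    static List<String> uniqueLines(List<String> lines) {
        Set<String> seen = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            // add() returns true only the first time a given line is seen
            if (seen.add(line)) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "Hello how are you",
                "Hello how are you",
                "welcome to Tutorialspoint");
        // prints the two distinct lines, one per line
        uniqueLines(input).forEach(System.out::println);
    }
}
```

Because each line is emitted as soon as it is first seen, the output keeps the original order even though HashSet itself has no defined iteration order.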
Example
The following Java program reads the above file (saved as sample.txt), skips the duplicate lines, and writes the unique lines to a file named output.txt.
import java.io.File;
import java.io.FileWriter;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

public class DeletingDuplcateLines {
   public static void main(String args[]) throws Exception {
      String filePath = "D://sample.txt";
      // Set that remembers every line seen so far
      Set<String> set = new HashSet<>();
      // try-with-resources closes the Scanner and the FileWriter automatically
      try (Scanner sc = new Scanner(new File(filePath));
           FileWriter writer = new FileWriter("D://output.txt")) {
         while (sc.hasNextLine()) {
            String input = sc.nextLine();
            // add() returns false for a line that is already in the set
            if (set.add(input)) {
               writer.append(input + "\n");
            }
         }
         writer.flush();
      }
      System.out.println("Contents added............");
   }
}
Output
Contents added............
The contents of output.txt will be:
Hello how are you
welcome to Tutorialspoint
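The example above depends on files under D://. As a hedged alternative, the sketch below swaps in java.nio.file.Files and a LinkedHashSet instead of Scanner and FileWriter, and runs against temporary files so it does not need those paths. The class name DedupeWithNio and both helper methods are hypothetical names introduced for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DedupeWithNio {

    // Hypothetical helper: copies the lines of 'source' to 'target', dropping repeats.
    static void removeDuplicateLines(Path source, Path target) throws IOException {
        List<String> lines = Files.readAllLines(source);
        // LinkedHashSet discards duplicates and keeps first-seen order
        Set<String> unique = new LinkedHashSet<>(lines);
        Files.write(target, unique);
    }

    // Runs the helper on temporary files and returns the lines that were written.
    static List<String> runOnTempFiles(List<String> inputLines) {
        try {
            Path source = Files.createTempFile("sample", ".txt");
            Path target = Files.createTempFile("output", ".txt");
            try {
                Files.write(source, inputLines);
                removeDuplicateLines(source, target);
                return new ArrayList<>(Files.readAllLines(target));
            } finally {
                // clean up the temporary files
                Files.deleteIfExists(source);
                Files.deleteIfExists(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> result = runOnTempFiles(Arrays.asList(
                "Hello how are you",
                "Hello how are you",
                "welcome to Tutorialspoint"));
        result.forEach(System.out::println);
    }
}
```

Note that Files.readAllLines loads the whole file into memory, so for very large files the line-by-line Scanner loop shown earlier may be the better fit.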
