windows - Batch file to split .csv file

To split a .csv file using a batch script on Windows, you can use a for /f loop together with delayed variable expansion. Below is an example that splits a CSV file into smaller files based on a specified number of lines per file and repeats the header row in each one.

Example Batch Script to Split CSV File

    @echo off
    setlocal enabledelayedexpansion

    REM Source CSV file and output directory
    set "sourceFile=YourSourceFile.csv"
    set "outputDir=OutputDirectory"

    REM Number of data lines per split file, not counting the header
    set "linesPerFile=1000"

    REM Ensure output directory exists
    if not exist "%outputDir%" mkdir "%outputDir%"

    REM Initialize variables
    set "splitCount=0"
    set "lineCount=0"
    set "header="

    REM Read source file line by line
    for /f "usebackq delims=" %%A in ("%sourceFile%") do (
        set "line=%%A"
        if not defined header (
            REM The first line is the header, so store it and write nothing yet
            set "header=!line!"
        ) else (
            REM Start a new split file when the counter is at zero
            if !lineCount! equ 0 (
                set /a "splitCount+=1"
                set "splitFile=%outputDir%\split_!splitCount!.csv"
                >"!splitFile!" echo(!header!
            )
            REM Append the data line to the current split file
            >>"!splitFile!" echo(!line!
            set /a "lineCount+=1"
            REM Reset the counter once linesPerFile is reached
            if !lineCount! geq %linesPerFile% set "lineCount=0"
        )
    )

    echo Splitting complete.
    pause

Explanation:

  1. Variables Setup:

    • sourceFile: Specifies the path of the CSV file you want to split.
    • outputDir: Specifies the directory where the split CSV files will be saved.
    • linesPerFile: Specifies the maximum number of data lines per split file. The header line is written in addition to these.
  2. Create Output Directory:

    • Checks if the outputDir exists. If not, creates it using mkdir.
  3. Read and Split the CSV File:

    • Uses a for /f loop to read each line from sourceFile. The "delims=" option keeps each line intact instead of splitting it into tokens.
    • Treats the first line as the header and stores it in header instead of writing it as data.
  4. Split Logic:

    • When lineCount is 0, starts a new split file (split_1.csv first) and writes the header to it with >, which overwrites any file of the same name left over from an earlier run.
    • Appends each data line to the current split file with >> and increments lineCount.
    • When linesPerFile is reached, resets lineCount to 0 so that the next data line starts a new file (split_2.csv, split_3.csv, and so on). No file is created unless there is data to put in it.
  5. Completion Message:

    • Displays "Splitting complete." and pauses the script (pause), allowing you to review any messages.

Usage:

  • Replace YourSourceFile.csv with the actual path of your CSV file to split.
  • Adjust linesPerFile as needed based on how many lines you want in each split file.
  • Ensure your CSV file has a header line if you want to preserve it in each split file.

Notes:

  • This script assumes each line in the CSV file is one record. It does not handle quoted fields that contain line breaks, and for /f silently skips empty lines and lines that begin with a semicolon. Because delayed expansion is enabled, exclamation marks (!) in the data are also lost.
  • It's recommended to test this script with a small subset of your data to ensure it behaves as expected before running it on larger files.
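
One way to run such a test is to compare line counts before and after the split. The following is a minimal sketch, assuming the placeholder names from the script above (YourSourceFile.csv, OutputDirectory, split_N.csv); it uses find /c /v "" to print the number of lines in each file.

```batch
@echo off
REM Sketch: print the line count of the source file and of every split file.
REM The names below are the placeholders used in the script above.
set "sourceFile=YourSourceFile.csv"
set "outputDir=OutputDirectory"

REM find /c /v "" counts every line, because no line matches the empty string
find /c /v "" "%sourceFile%"
for %%F in ("%outputDir%\split_*.csv") do find /c /v "" "%%F"
```

Because every split file repeats the header, the split counts should add up to the source count plus one extra line for each split file after the first, provided the source contains no lines that for /f skips.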

By following these steps, you can create a batch script to split a CSV file into smaller files based on a specified number of lines per file on Windows. Adjustments can be made for different splitting criteria or additional file handling requirements as needed.
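
One such adjustment concerns exclamation marks in the data, which are dropped whenever a line is read while delayed expansion is enabled. A common workaround, sketched below under the assumption that the source file is YourSourceFile.csv, is to assign each line while delayed expansion is off and switch it on only for the write. The output name copy.csv is hypothetical and used only for illustration.

```batch
@echo off
REM Sketch: copy a CSV line by line while keeping exclamation marks intact.
REM copy.csv is a hypothetical output name chosen for this illustration.
setlocal disabledelayedexpansion
set "sourceFile=YourSourceFile.csv"

for /f "usebackq delims=" %%A in ("%sourceFile%") do (
    REM Assign while delayed expansion is off so exclamation marks survive
    set "line=%%A"
    setlocal enabledelayedexpansion
    >>"copy.csv" echo(!line!
    endlocal
)
endlocal
```

Variables changed between the inner setlocal and endlocal are discarded, so any counters would need to be updated with set /a outside that pair.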

Examples

  1. How to split a large CSV file into smaller files using a batch script?

    • Description: This batch script reads a large CSV file and splits it into smaller files with a specified number of lines. Output is appended with >>, so delete any part files from an earlier run before running it again.
    • Code:
      @echo off
      setlocal enabledelayedexpansion
      set "inputFile=largefile.csv"
      set "linesPerFile=100"
      set "fileCount=1"
      set "lineCount=0"
      for /f "tokens=*" %%A in ('type "%inputFile%"') do (
          if !lineCount! lss %linesPerFile% (
              REM Current part still has room
              >>"part!fileCount!.csv" echo(%%A
              set /a lineCount+=1
          ) else (
              REM Current part is full, so start the next one
              set /a fileCount+=1
              set "lineCount=1"
              >>"part!fileCount!.csv" echo(%%A
          )
      )
  2. How to include headers in each split CSV file?

    • Description: Modify the batch script to include the header row in each split file.
    • Code:
      @echo off
      setlocal enabledelayedexpansion
      set "inputFile=largefile.csv"
      set "linesPerFile=100"
      set "fileCount=1"
      set "lineCount=0"
      set "header="
      for /f "tokens=*" %%A in ('type "%inputFile%"') do (
          if not defined header (
              REM The first line is the header and opens the first part
              set "header=%%A"
              >>"part!fileCount!.csv" echo(%%A
          ) else (
              if !lineCount! geq %linesPerFile% (
                  REM Current part is full, so start the next one with the header
                  set /a fileCount+=1
                  set "lineCount=0"
                  >>"part!fileCount!.csv" echo(!header!
              )
              >>"part!fileCount!.csv" echo(%%A
              set /a lineCount+=1
          )
      )
  3. How to split CSV file based on a specific number of records?

    • Description: Adjust the batch file to split the CSV file based on a specified number of records. Each line is treated as one record; this variant writes the line first and then resets the counter once recordsPerFile is reached.
    • Code:
      @echo off
      setlocal enabledelayedexpansion
      set "inputFile=largefile.csv"
      set "recordsPerFile=50"
      set "fileCount=1"
      set "recordCount=0"
      for /f "tokens=*" %%A in ('type "%inputFile%"') do (
          >>"part!fileCount!.csv" echo(%%A
          set /a recordCount+=1
          if !recordCount! geq %recordsPerFile% (
              REM Move on to the next part after recordsPerFile records
              set /a fileCount+=1
              set "recordCount=0"
          )
      )
  4. How to use a specific delimiter when splitting a CSV file?

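    • Description: Because the script copies whole lines, it works with any field delimiter (such as a semicolon). The "delims=" option turns off token splitting in for /f, so each line is kept intact, including leading spaces. Note that for /f skips lines that begin with a semicolon by default, which affects semicolon-delimited rows whose first field is empty.
    • Code:
      @echo off
      setlocal enabledelayedexpansion
      set "inputFile=largefile.csv"
      set "linesPerFile=100"
      set "fileCount=1"
      set "lineCount=0"
      for /f "delims=" %%A in ('type "%inputFile%"') do (
          if !lineCount! lss %linesPerFile% (
              >>"part!fileCount!.csv" echo(%%A
              set /a lineCount+=1
          ) else (
              set /a fileCount+=1
              set "lineCount=1"
              >>"part!fileCount!.csv" echo(%%A
          )
      )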
  5. How to handle empty lines when splitting a CSV file?

    • Description: This script skips empty lines while processing the CSV file. The for /f loop already skips them on its own; the if defined check is an explicit safeguard that, unlike comparing "%%A" with "", does not break on lines containing quoted fields.
    • Code:
      @echo off
      setlocal enabledelayedexpansion
      set "inputFile=largefile.csv"
      set "linesPerFile=100"
      set "fileCount=1"
      set "lineCount=0"
      for /f "tokens=*" %%A in ('type "%inputFile%"') do (
          set "line=%%A"
          if defined line (
              if !lineCount! lss %linesPerFile% (
                  >>"part!fileCount!.csv" echo(%%A
                  set /a lineCount+=1
              ) else (
                  set /a fileCount+=1
                  set "lineCount=1"
                  >>"part!fileCount!.csv" echo(%%A
              )
          )
      )
  6. How to include a file naming convention for split CSV files?

    • Description: Modify the batch file to include a specific naming format for the output files.
    • Code:
      @echo off
      setlocal enabledelayedexpansion
      set "inputFile=largefile.csv"
      set "linesPerFile=100"
      set "fileCount=1"
      set "lineCount=0"
      for /f "tokens=*" %%A in ('type "%inputFile%"') do (
          REM Output files are named output_part_1.csv, output_part_2.csv, and so on
          >>"output_part_!fileCount!.csv" echo(%%A
          set /a lineCount+=1
          if !lineCount! geq %linesPerFile% (
              set /a fileCount+=1
              set "lineCount=0"
          )
      )
  7. How to log progress while splitting a CSV file?

    • Description: This script logs the number of lines processed to the console.
    • Code:
      @echo off
      setlocal enabledelayedexpansion
      set "inputFile=largefile.csv"
      set "linesPerFile=100"
      set "fileCount=1"
      set "lineCount=0"
      set "totalCount=0"
      for /f "tokens=*" %%A in ('type "%inputFile%"') do (
          >>"part!fileCount!.csv" echo(%%A
          set /a lineCount+=1
          REM totalCount keeps running across parts while lineCount resets
          set /a totalCount+=1
          echo Processed line !totalCount!
          if !lineCount! geq %linesPerFile% (
              set /a fileCount+=1
              set "lineCount=0"
          )
      )
  8. How to set a specific output directory for split CSV files?

    • Description: Specify an output directory where the split files will be saved.
    • Code:
      @echo off
      setlocal enabledelayedexpansion
      set "inputFile=largefile.csv"
      set "outputDir=output"
      set "linesPerFile=100"
      set "fileCount=1"
      set "lineCount=0"
      REM Create the output directory only if it does not exist yet
      if not exist "%outputDir%" mkdir "%outputDir%"
      for /f "tokens=*" %%A in ('type "%inputFile%"') do (
          >>"%outputDir%\part!fileCount!.csv" echo(%%A
          set /a lineCount+=1
          if !lineCount! geq %linesPerFile% (
              set /a fileCount+=1
              set "lineCount=0"
          )
      )
  9. How to split a CSV file by specific column values?

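    • Description: This batch file starts a new output file whenever the value in the first column differs from the previous row, and writes the header to each file. To get one file per unique value, sort the input by that column first. It assumes the first column is never empty and is not a quoted field containing a comma.
    • Code:
      @echo off
      setlocal enabledelayedexpansion
      set "inputFile=largefile.csv"
      set "fileCount=0"
      set "header="
      set "prevValue="
      for /f "tokens=1,* delims=," %%A in ('type "%inputFile%"') do (
          if not defined header (
              REM The first line is the header, kept for reuse in every part
              set "header=%%A,%%B"
          ) else (
              set "value=%%A"
              if not "!value!"=="!prevValue!" (
                  REM First-column value changed, so start a new part
                  set /a fileCount+=1
                  >>"part!fileCount!.csv" echo(!header!
              )
              >>"part!fileCount!.csv" echo(%%A,%%B
              set "prevValue=!value!"
          )
      )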
