The to_csv()
method in Pandas is used to write to a CSV file.
Example
import pandas as pd data = { 'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'City': ['New York', 'San Francisco', 'Los Angeles'] } df = pd.DataFrame(data) # use to_csv() to write df to a CSV file df.to_csv('sample_data.csv') '' Output sample_data.csv: Name,Age,City Alice,25,New York Bob,30,San Francisco Charlie,35,Los Angeles ''
to_csv() Syntax
The syntax of the to_csv()
method in Pandas is:
df.to_csv(path_or_buf, sep=',', header=True, index=False, mode='w', encoding=None, quoting=None, line_terminator='\n')
to_csv() Arguments
The to_csv()
method takes following common arguments:
path_or_buf
(optional) - represents the path or buffer object where the DataFrame will be saved as a CSV filesep
(optional) - specifies the delimiter to be used in the output CSV fileheader
(optional) - indicates whether to include the header row in the output CSV fileindex
(optional) - determines whether to include the index column in the output CSV filemode
(optional) - specifies the mode in which the output file will be openedencoding
(optional) - sets the character encoding to be used when writing the CSV filequoting
(optional) - determines the quoting behavior for fields that contain special charactersline_terminator
(optional) - specifies the character sequence used to terminate lines in the output CSV file
to_csv() Return Value
The return value of to_csv()
is either none, when writing to a file or file-like object, or the CSV-formatted string representation of the DataFrame, when no file destination is specified.
Example1: Write to a CSV File
import pandas as pd # create dataframe data = {'Name': ['Tom', 'Nick', 'John', 'Tom'], 'Age': [20, 21, 19, 18], 'City': ['New York', 'London', 'Paris', 'Berlin']} df = pd.DataFrame(data) # use to_csv() to write to csv file df.to_csv(path_or_buf='output_path.csv')
In this example, we wrote a DataFrame to the CSV file output.csv
using the path_or_buf
argument to specify the file name to write to inside the to_csv()
method.
Our output.csv
would look like this:
,Name,Age,City 0,Tom,20,New York 1,Nick,21,London 2,John,19,Paris 3,Tom,18,Berlin
Example 2: Using Different Delimiters in CSV Files
import pandas as pd # create dataframe data = {'Name': ['Tom', 'Nick', 'John', 'Tom'], 'Age': [20, 21, 19, 18], 'City': ['New York', 'London', 'Paris', 'Berlin']} df = pd.DataFrame(data) # use to_csv() to write to csv file with semicolon as the delimiter # also exclude indices df.to_csv('output_with_semicolon.csv', sep=';', index=False)
Here, we have used the sep=';'
argument inside to_csv()
to write to a CSV file with semicolon as the delimiter.
Also, we have used index=False
to exclude indices while writing to a CSV file.
Our output_with_semicolon.csv
would look like this:
Name;Age;City Tom;20;New York Nick;21;London John;19;Paris Tom;18;Berlin
Example 3: Controlling Column Headers With header Argument
import pandas as pd # create dataframe data = {'Name': ['Tom', 'Nick', 'John', 'Tom'], 'Age': [20, 21, 19, 18], 'City': ['New York', 'London', 'Paris', 'Berlin']} df = pd.DataFrame(data) # use to_csv() to write to csv file and exclude headers df.to_csv('output_without_headers.csv', header=False)
Here, since we are using header=False
inside to_csv()
, the column names Name
, Age
, City
are not present in the output file.
Hence, our output_without_headers.csv
would look like this:
0,Tom,20,New York 1,Nick,21,London 2,John,19,Paris 3,Tom,18,Berlin
Example 4: Writing and Appending to CSVs with Pandas
In Pandas, the mode
parameter in the to_csv()
method is used to specify the mode in which the file is opened. When
mode='w'
(default) - write mode. It will open the file for writing, and any existing file with the same name will be overwritten. If the file does not exist, it creates a new file.mode='a'
- append mode. It will open the file for writing, but data will be appended to the end of the file if it already exists. If the file doesn't exist, it creates a new one.mode='x'
- exclusive creation mode. The operation will fail if the file already exists.
Let's look at an example.
import pandas as pd # create the first DataFrame data1 = { 'Name': ['Tom', 'Nick', 'John'], 'Age': [20, 21, 19], 'City': ['New York', 'London', 'Paris'] } df1 = pd.DataFrame(data1) # write the first DataFrame to the file with headers df1.to_csv('output.csv', mode='w', header=True, index=False) # create the second DataFrame data2 = { 'Name': ['Anna', 'Elsa', 'Olaf'], 'Age': [25, 28, 5], 'City': ['Arendelle', 'Arendelle', 'Arendelle'] } df2 = pd.DataFrame(data2) # append the second DataFrame to the same file without headers df2.to_csv('output.csv', mode='a', header=False, index=False)
In the above example, we created two DataFrames df1 and df2 each containing data for three individuals with Name
, Age
, and City
columns.
First, we wrote df1 to output.csv
with column headers and without row indices.
Then, we appended df2 to output.csv
without adding the headers again, ensuring a continuous list of six individuals in the final file.
Hence, our output.csv
would look like this:
Name,Age,City Tom,20,New York Nick,21,London John,19,Paris Anna,25,Arendelle Elsa,28,Arendelle Olaf,5,Arendelle
Example 5: Controlling Quotation Marks While Writing to CSV Files
The quoting
parameter in the to_csv()
method controls the quoting behavior of values within the CSV file.
The quoting
parameter can take one of the following values:
csv.QUOTE_MINIMAL
(default) - quote only when needed (e.g. for special characters like commas or quotes)csv.QUOTE_ALL
- quote everything, alwayscsv.QUOTE_NONNUMERIC
- quote non-numeric values, leave numeric values unquotedcsv.QUOTE_NONE
- never quote anything, ensure special characters are handled manually
Let's look at an example.
import pandas as pd import csv data = { 'Name': ['Tom', 'Anna,Smith', 'John"Doe'], 'Age': [25, 30, 40] } df = pd.DataFrame(data) # save with different quoting options df.to_csv('output_minimal.csv', quoting=csv.QUOTE_MINIMAL, index=False) df.to_csv('output_all.csv', quoting=csv.QUOTE_ALL, index=False) df.to_csv('output_nonnumeric.csv', quoting=csv.QUOTE_NONNUMERIC, index=False) df.to_csv('output_none.csv', quoting=csv.QUOTE_NONE, index=False, escapechar='\\')
Here, when we use quoting=csv.QUOTE_MINIMAL
, only 'Anna,Smith'
is quoted since it contains a comma, which is the default delimiter.
So our output_minimal.csv
file would look like this:
Name,Age Tom,25 "Anna,Smith",30 John"Doe",40
And, when we use quoting=csv.QUOTE_ALL
, every value, including headers, is quoted. Note that the double quote inside 'John"Doe'
is escaped with another double quote.
Hence, our output_all.csv
file would look like this:
"Name","Age" "Tom","25" "Anna,Smith","30" "John""Doe","40"
Likewise, when we use quoting=csv.QUOTE_NONNUMERIC
, all non-numeric fields are quoted. Again, the double quote in 'John"Doe'
is escaped with another double quote, but the ages, being numeric, are not quoted.
So our output_nonnumeric.csv
file would look like this:
"Name",Age "Tom",25 "Anna,Smith",30 "John""Doe",40
Finally, when we use quoting=csv.QUOTE_NONE
, no values are quoted, but the special characters are escaped using the escape character, which we've set as the backslash \\
.
Hence, our output_none.csv
file would look like this:
Name,Age Tom,25 Anna\,Smith,30 John\"Doe,40
Example 6: Customize CSV Line Endings With Pandas
import pandas as pd data = { 'Name': ['Tom', 'Alice', 'Bob'], 'Age': [25, 30, 35], 'City': ['New York', 'San Francisco', 'Los Angeles'] } df = pd.DataFrame(data) # save with different line terminators df.to_csv('output_unix.csv', line_terminator='\n', index=False) df.to_csv('output_tilde.csv', line_terminator='~', index=False)
Here, when we use line_terminator='\n'
, the CSV file would look like this:
Name,Age,City Tom,25,New York Alice,30,San Francisco Bob,35,Los Angeles
And when we use line_terminator='~'
, the CSV file would look like this:
Name,Age,City~Tom,25,New York~Alice,30,San Francisco~Bob,35,Los Angeles~