A while ago I wrote an article about exporting data to different spreadsheet formats. As recently I was reimplementing export to Excel for the 1st things 1st project, I noticed that the API changed a little, so it's time to blog about that again.
For Excel export I am using the XLSX file format which is a zipped XML-based format for spreadsheets with formatting support. XLSX files can be opened with Microsoft Excel, Apache OpenOffice, Apple Numbers, LibreOffice, Google Drive, and a handful of other applications. For building the XLSX file I am using openpyxl library.
Installing openpyxl
You can install openpyxl to your virtual environment the usual way with pip:
(venv) pip install openpyxl==2.6.0
Simplest Export View
To create a function exporting data from a QuerySet to XLSX file, you would need to create a view that returns a response with a special content type and file content as an attachment. Plug that view to URL rules and then link it from an export button in a template.
Probably the simplest view that generates XLSX file out of Django QuerySet would be this:
# movies/views.py from datetime import datetime from datetime import timedelta from openpyxl import Workbook from django.http import HttpResponse from .models import MovieCategory, Movie def export_movies_to_xlsx(request): """ Downloads all movies as Excel file with a single worksheet """ movie_queryset = Movie.objects.all() response = HttpResponse( content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet', ) response['Content-Disposition'] = 'attachment; filename={date}-movies.xlsx'.format( date=datetime.now().strftime('%Y-%m-%d'), ) workbook = Workbook() # Get active worksheet/tab worksheet = workbook.active worksheet.title = 'Movies' # Define the titles for columns columns = [ 'ID', 'Title', 'Description', 'Length', 'Rating', 'Price', ] row_num = 1 # Assign the titles for each cell of the header for col_num, column_title in enumerate(columns, 1): cell = worksheet.cell(row=row_num, column=col_num) cell.value = column_title # Iterate through all movies for movie in movie_queryset: row_num += 1 # Define the data for each cell in the row row = [ movie.pk, movie.title, movie.description, movie.length_in_minutes, movie.rating, movie.price, ] # Assign the data for each cell of the row for col_num, cell_value in enumerate(row, 1): cell = worksheet.cell(row=row_num, column=col_num) cell.value = cell_value workbook.save(response) return response
If you try this, you will notice, that there is no special formatting in it, all columns are of the same width, the value types are barely recognized, the header is displayed the same as the content. This is enough for further data export to CSV or manipulation with pandas. But if you want to present the data for the user in a friendly way, you need to add some magic.
Creating More Worksheets
By default, each Excel file has one worksheet represented as a tab. You can access it with:
worksheet = workbook.active worksheet.title = 'The New Tab Title'
If you want to create tabs dynamically with data from the database of Python structures, you can at first delete the current tab and add the others with:
workbook.remove(workbook.active) for index, category in enumerate(category_queryset): worksheet = workbook.create_sheet( title=category.title, index=index, )
Although not all spreadsheet applications support this, you can set the background color of the worksheet tab with:
worksheet.sheet_properties.tabColor = 'f7f7f9'
Working with Cells
Each cell can be accessed by its 1-based indexes for the rows and for the columns:
top_left_cell = worksheet.cell(row=1, column=1) top_left_cell.value = "This is good!"
Styles and formatting are applied to individual cells instead of rows or columns. There are several styling categories with multiple configurations for each of them. You can find some available options from the documentation, but even more by exploring the source code.
from openpyxl.styles import Font, Alignment, Border, Side, PatternFill top_left_cell.font = Font(name='Calibri', bold=True) top_left_cell.alignment = Alignment(horizontal='center') top_left_cell.border = Border( bottom=Side(border_style='medium', color='FF000000'), ) top_left_cell.fill = PatternFill( start_color='f7f7f9', end_color='f7f7f9', fill_type='solid', )
If you are planning to have multiple styled elements, instantiate the font, alignment, border, fill options upfront and then assign the instances to the cell attributes. Otherwise, you can get into memory issues when you have a lot of data entries.
Setting Column Widths
If you want to have some wider or narrower width for some of your columns, you can do this by modifying column dimensions. They are accessed by column letter which can be retrieved using a utility function:
from openpyxl.utils import get_column_letter column_letter = get_column_letter(col_num) column_dimensions = worksheet.column_dimensions[column_letter] column_dimensions.width = 40
The units here are some relative points depending on the width of the letters in the specified font. I would suggest playing around with the width value until you find what works for you.
When defining column width is not enough, you might want to wrap text into multiple lines so that everything can be read by people without problems. This can be done with the alignment setting for the cell as follows:
from openpyxl.styles import Alignment wrapped_alignment = Alignment(vertical='top', wrap_text=True) cell.alignment = wrapped_alignment
Data Formatting
Excel automatically detects text or number types and aligns text to the left and numbers to the right. If necessary that can be overwritten.
There are some gotchas on how to format cells when you need a percentage, prices, or time durations.
Percentage
For percentage, you have to pass the number in float format from 0.0 till 1.0 and style should be 'Percent' as follows:
cell.value = 0.75 cell.style = 'Percent'
Currency
For currency, you need values of Decimal
format, the style should be 'Currency', and you will need a special number format for currency other than American dollars, for example:
from decimal import Decimal cell.value = Decimal('14.99') cell.style = 'Currency' cell.number_format = '#,##0.00 €'
Durations
For time duration, you have to pass timedelta as the value and define special number format:
from datetime import timedelta cell.value = timedelta(minutes=90) cell.number_format = '[h]:mm;@'
This number format ensures that your duration can be greater than '23:59', for example, '140:00'.
Freezing Rows and Columns
In Excel, you can freeze rows and columns so that they stay fixed when you scroll the content vertically or horizontally. That's similar to position: fixed
in CSS.
To freeze the rows and columns, locate the top-left cell that is below the row that you want to freeze and is on the right from the column that you want to freeze. For example, if you want to freeze one row and one column, the cell would be 'B2'. Then run this:
worksheet.freeze_panes = worksheet['B2']
Fully Customized Export View
So having the knowledge of this article now we can build a view that creates separate sheets. for each movie category. Each sheet would list movies of the category with titles, descriptions, length in hours and minutes, rating in percent, and price in Euros. The tabs, as well as the headers, can have different background colors for each movie category. Cells would be well formatted. Titles and descriptions would use multiple lines to fully fit into the cells.
# movies/views.py from datetime import datetime from datetime import timedelta from openpyxl import Workbook from openpyxl.styles import Font, Alignment, Border, Side, PatternFill from openpyxl.utils import get_column_letter from django.http import HttpResponse from .models import MovieCategory, Movie def export_movies_to_xlsx(request): """ Downloads all movies as Excel file with a worksheet for each movie category """ category_queryset = MovieCategory.objects.all() response = HttpResponse( content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet', ) response['Content-Disposition'] = 'attachment; filename={date}-movies.xlsx'.format( date=datetime.now().strftime('%Y-%m-%d'), ) workbook = Workbook() # Delete the default worksheet workbook.remove(workbook.active) # Define some styles and formatting that will be later used for cells header_font = Font(name='Calibri', bold=True) centered_alignment = Alignment(horizontal='center') border_bottom = Border( bottom=Side(border_style='medium', color='FF000000'), ) wrapped_alignment = Alignment( vertical='top', wrap_text=True ) # Define the column titles and widths columns = [ ('ID', 8), ('Title', 40), ('Description', 80), ('Length', 15), ('Rating', 15), ('Price', 15), ] # Iterate through movie categories for category_index, category in enumerate(category_queryset): # Create a worksheet/tab with the title of the category worksheet = workbook.create_sheet( title=category.title, index=category_index, ) # Define the background color of the header cells fill = PatternFill( start_color=category.html_color, end_color=category.html_color, fill_type='solid', ) row_num = 1 # Assign values, styles, and formatting for each cell in the header for col_num, (column_title, column_width) in enumerate(columns, 1): cell = worksheet.cell(row=row_num, column=col_num) cell.value = column_title cell.font = header_font cell.border = border_bottom cell.alignment = centered_alignment cell.fill = fill # set column width column_letter = get_column_letter(col_num) column_dimensions = worksheet.column_dimensions[column_letter] column_dimensions.width = column_width # Iterate through all movies of a category for movie in category.movie_set.all(): row_num += 1 # Define data and formats for each cell in the row row = [ (movie.pk, 'Normal'), (movie.title, 'Normal'), (movie.description, 'Normal'), (timedelta(minutes=movie.length_in_minutes), 'Normal'), (movie.rating / 100, 'Percent'), (movie.price, 'Currency'), ] # Assign values, styles, and formatting for each cell in the row for col_num, (cell_value, cell_format) in enumerate(row, 1): cell = worksheet.cell(row=row_num, column=col_num) cell.value = cell_value cell.style = cell_format if cell_format == 'Currency': cell.number_format = '#,##0.00 €' if col_num == 4: cell.number_format = '[h]:mm;@' cell.alignment = wrapped_alignment # freeze the first row worksheet.freeze_panes = worksheet['A2'] # set tab color worksheet.sheet_properties.tabColor = category.html_color workbook.save(response) return response
The Takeaways
- Spreadsheet data can be used for further mathematical processing with pandas.
- XLSX file format allows quite a bunch of formatting options that can make your spreadsheet data more presentable and user-friendly.
- To see Excel export in action, go to 1st things 1st, log in as a demo user, and navigate to project results where you can export them as XLSX. Feedback is always welcome.
Cover photo by Tim Evans.
Top comments (0)