Installation
Install openpyxl using pip. It is advisable to do this in a Python virtualenv without
system packages:
Note
There is support for the popular lxml library which will be used if it is installed.
This is particularly useful when creating large files.
Warning
To be able to include images (jpeg, png, bmp,…) into an openpyxl file, you will
also need the “pillow” library that can be installed with:
Sometimes you might want to work with the checkout of a particular version.
This may be the case if bugs have been fixed but a release has not yet been
made.
Create a workbook
There is no need to create a file on the filesystem to get started with openpyxl.
Just import the Workbook class and start work:
A workbook is always created with at least one worksheet. You can get it by
using the Workbook.active property:
>>> ws = wb.active
Note
This is set to 0 by default. Unless you modify its value, you will always get the
first worksheet by using this method.
You can create new worksheets using the Workbook.create_sheet() method:
Sheets are given a name automatically when they are created. They are
numbered in sequence (Sheet, Sheet1, Sheet2, …). You can change this name at
any time with the Worksheet.title property:
Once you have given a worksheet a name, you can get it as a key of the workbook:
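The steps above (creating sheets, renaming them and looking them up by name) can be sketched as follows. The sheet names "First" and "Last" are illustrative assumptions, not names used elsewhere in this tutorial.

```python
from openpyxl import Workbook

wb = Workbook()                      # a new workbook starts with one worksheet
ws = wb.active

last = wb.create_sheet("Last")       # appended at the end by default
first = wb.create_sheet("First", 0)  # inserted at the first position

ws.title = "New Title"               # rename the original worksheet

print(wb.sheetnames)                 # ['First', 'New Title', 'Last']
print(wb["New Title"] is ws)         # True
```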
You can review the names of all worksheets of the workbook with
the Workbook.sheetnames attribute:
>>> print(wb.sheetnames)
['Sheet2', 'New Title', 'Sheet1']
You can create copies of worksheets within a single workbook with the
Workbook.copy_worksheet() method:
Note
Only cells (including values, styles, hyperlinks and comments) and certain
worksheet attributes (including dimensions, format and properties) are copied.
All other workbook / worksheet attributes are not copied - e.g. Images, Charts.
You also cannot copy worksheets between workbooks. You cannot copy a
worksheet if the workbook is open in read-only or write-only mode.
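A minimal sketch of copying a worksheet inside one workbook, assuming a freshly created workbook; the cell value is only there to show that cell contents are carried over.

```python
from openpyxl import Workbook

wb = Workbook()
source = wb.active
source["A1"] = "hello"

# The copy is a new worksheet in the same workbook
target = wb.copy_worksheet(source)

print(target["A1"].value)   # hello
print(len(wb.sheetnames))   # 2
```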
Playing with data
Accessing one cell
Now that we know how to get a worksheet, we can start modifying cell content.
Cells can be accessed directly as keys of the worksheet:
>>> c = ws['A4']
This will return the cell at A4, or create one if it does not exist yet. Values can be
directly assigned:
>>> ws['A4'] = 4
Note
Because of this feature, scrolling through cells instead of accessing them directly
will create them all in memory, even if you don’t assign them a value.
Something like
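A minimal sketch of such a loop, assuming a fresh workbook (the 100 x 100 size is only illustrative). Merely visiting the cells with Worksheet.cell() is enough to create them, even though no value is assigned.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# Each call creates the cell in memory if it does not exist yet
for row in range(1, 101):
    for col in range(1, 101):
        ws.cell(row=row, column=col)

print(ws.max_row, ws.max_column)   # 100 100
```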
Note
If you need to iterate through all the rows or columns of a file, you can instead
use the Worksheet.rows property:
>>> ws = wb.active
>>> ws['C9'] = 'hello world'
>>> tuple(ws.rows)
((<Cell Sheet.A1>, <Cell Sheet.B1>, <Cell Sheet.C1>),
(<Cell Sheet.A2>, <Cell Sheet.B2>, <Cell Sheet.C2>),
(<Cell Sheet.A3>, <Cell Sheet.B3>, <Cell Sheet.C3>),
(<Cell Sheet.A4>, <Cell Sheet.B4>, <Cell Sheet.C4>),
(<Cell Sheet.A5>, <Cell Sheet.B5>, <Cell Sheet.C5>),
(<Cell Sheet.A6>, <Cell Sheet.B6>, <Cell Sheet.C6>),
(<Cell Sheet.A7>, <Cell Sheet.B7>, <Cell Sheet.C7>),
(<Cell Sheet.A8>, <Cell Sheet.B8>, <Cell Sheet.C8>),
(<Cell Sheet.A9>, <Cell Sheet.B9>, <Cell Sheet.C9>))
>>> tuple(ws.columns)
((<Cell Sheet.A1>,
<Cell Sheet.A2>,
<Cell Sheet.A3>,
<Cell Sheet.A4>,
<Cell Sheet.A5>,
<Cell Sheet.A6>,
...
<Cell Sheet.B7>,
<Cell Sheet.B8>,
<Cell Sheet.B9>),
(<Cell Sheet.C1>,
<Cell Sheet.C2>,
<Cell Sheet.C3>,
<Cell Sheet.C4>,
<Cell Sheet.C5>,
<Cell Sheet.C6>,
<Cell Sheet.C7>,
<Cell Sheet.C8>,
<Cell Sheet.C9>))
Values only
If you just want the values from a worksheet you can use
the Worksheet.values property. This iterates over all the rows in a worksheet but
returns just the cell values:
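As a small sketch of the property described above, assuming a worksheet with two short rows of sample numbers:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append([1, 2, 3])
ws.append([4, 5, 6])

# Each row comes back as a tuple of plain values, not Cell objects
rows = list(ws.values)
print(rows)   # [(1, 2, 3), (4, 5, 6)]
```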
Data storage
Once we have a Cell , we can assign it a value:
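A minimal sketch, assuming a new workbook; the coordinates and values are arbitrary examples.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

c = ws["A1"]
c.value = "hello, world"

# Worksheet.cell() can also set a value while fetching the cell
d = ws.cell(row=4, column=2, value=10)
d.value = 3.14                      # values can be reassigned at any time

print(c.value)                      # hello, world
print(ws["B4"].value)               # 3.14
```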
Saving to a file
Warning
The filename extension is not forced to be xlsx or xlsm, although you might have
some trouble opening it directly with another application if you don’t use an
official extension.
As OOXML files are basically ZIP files, you can also open it with your favourite
ZIP archive manager.
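The following sketch saves a workbook with Workbook.save() and then lists the archive members with the standard zipfile module, which illustrates the point about OOXML files being ZIP files. The file name "balances.xlsx" and the temporary folder are assumptions made for the example.

```python
import os
import tempfile
import zipfile

from openpyxl import Workbook

wb = Workbook()
wb.active["A1"] = 42

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "balances.xlsx")   # hypothetical file name
    wb.save(path)                                  # overwrites an existing file

    # The saved workbook is an ordinary ZIP archive
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()

print("xl/workbook.xml" in names)   # True
```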
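You can set the template attribute to True to save a workbook as a template:
>>> from openpyxl import load_workbook
>>> wb = load_workbook('document.xlsx')
>>> wb.template = True
>>> wb.save('document_template.xltx')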
Saving as a stream
If you want to save the file to a stream, e.g. when using a web application such
as Pyramid, Flask or Django then you can simply provide a NamedTemporaryFile() :
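A minimal sketch of saving to a stream. It passes the open temporary file object to Workbook.save() rather than its name, which is an assumption made here so the example also behaves on platforms that do not allow reopening a temporary file by name.

```python
from tempfile import NamedTemporaryFile

from openpyxl import Workbook

wb = Workbook()
wb.active["A1"] = "streamed"

with NamedTemporaryFile(suffix=".xlsx") as tmp:
    wb.save(tmp)          # write the workbook into the open file object
    tmp.seek(0)           # rewind before reading the bytes back
    stream = tmp.read()

# 'stream' now holds the raw bytes, e.g. for an HTTP response body
print(stream[:2])         # b'PK'
```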
Warning
Make sure the template attribute and the file extension agree when you save a
document as a template, and vice versa; otherwise the spreadsheet application
may not be able to open the resulting document.
Note
Each of the following produces a file that MS Excel cannot open:
>>> wb = load_workbook('document.xlsx')
>>> # Need to save with the extension *.xlsx
>>> wb.save('new_document.xlsm')
>>> # MS Excel can't open the document
>>>
>>> # or
>>>
>>> # Need specify attribute keep_vba=True
>>> wb = load_workbook('document.xlsm')
>>> wb.save('new_document.xlsm')
>>> # MS Excel will not open the document
>>>
>>> # or
>>>
>>> wb = load_workbook('document.xltm', keep_vba=True)
>>> # If we need a template document, then we must specify extension as *.xltm.
>>> wb.save('new_document.xlsm')
>>> # MS Excel will not open the document
Warning
openpyxl does not currently read all possible items in an Excel file, so shapes will
be lost from existing files if they are opened and saved with the same name.
You can find the spec by searching for ECMA-376; most of the implementation
specifics are in Part 4.
This ends the tutorial for now; you can proceed to the Simple usage section.
Simple usage
Example: Creating a simple spreadsheet and bar chart
In this example we’re going to create a sheet from scratch and add some data
and then plot it. We’ll also explore some limited cell style and formatting.
To start, let’s load in openpyxl, create a new workbook and get the active
sheet. We’ll also enter our tree data.
>>> from openpyxl import Workbook
>>> wb = Workbook()
>>> ws = wb.active
>>> treeData = [["Type", "Leaf Color", "Height"], ["Maple", "Red", 549],
...             ["Oak", "Green", 783], ["Pine", "Green", 1204]]
Next we’ll enter this data onto the worksheet. As this is a list of lists, we can
simply use the Worksheet.append() function.
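A short sketch of that step, repeating the tree data so that the block runs on its own:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

treeData = [["Type", "Leaf Color", "Height"], ["Maple", "Red", 549],
            ["Oak", "Green", 783], ["Pine", "Green", 1204]]

# append() writes each inner list as the next row of the worksheet
for row in treeData:
    ws.append(row)

print(ws["A1"].value, ws["C4"].value)   # Type 1204
```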
Now we should make our heading bold to make it stand out a bit more. To do
that we’ll need to create a styles.Font and apply it to all the cells in our header
row.
>>> from openpyxl.styles import Font
>>> ft = Font(bold=True)
>>> for row in ws["A1:C1"]:
...     for cell in row:
...         cell.font = ft
It’s time to make some charts. First, we’ll start by importing the appropriate
packages from openpyxl.chart then define some basic attributes
That’s created the skeleton of what will be our bar chart. Now we need to add
references to where the data is and pass that to the chart object
>>> chart.add_data(data)
>>> chart.set_categories(categories)
And there you have it. If you open that doc now it should look something like
this
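Putting the chart steps together, one possible sketch looks like this. The chart title, axis titles, anchor cell "E1" and output file name are illustrative assumptions; the references simply point at the height column and the tree type column of the data entered earlier.

```python
import os
import tempfile

from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

wb = Workbook()
ws = wb.active
for row in [["Type", "Leaf Color", "Height"], ["Maple", "Red", 549],
            ["Oak", "Green", 783], ["Pine", "Green", 1204]]:
    ws.append(row)

# Skeleton of the bar chart (titles are assumed for illustration)
chart = BarChart()
chart.type = "col"
chart.title = "Tree Height"
chart.y_axis.title = "Height (cm)"
chart.x_axis.title = "Tree Type"
chart.legend = None

# Heights are in column C, tree types in column A (rows 2 to 4)
data = Reference(ws, min_col=3, min_row=2, max_row=4, max_col=3)
categories = Reference(ws, min_col=1, min_row=2, max_row=4, max_col=1)
chart.add_data(data)
chart.set_categories(categories)

# Place the chart on the sheet and save to a temporary folder
ws.add_chart(chart, "E1")
path = os.path.join(tempfile.mkdtemp(), "TreeData.xlsx")
wb.save(path)
```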
Cell Styles
Cell styles are shared between objects and once they have been assigned they
cannot be changed. This stops unwanted side-effects such as changing the style
for lots of cells when only one changes.
Copying styles
Styles can also be copied
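A minimal sketch using the standard copy module, assuming a Font as the style object; the font names are arbitrary.

```python
from copy import copy

from openpyxl.styles import Font

ft1 = Font(name="Arial", size=14)
ft2 = copy(ft1)          # independent copy of the style object
ft2.name = "Tahoma"      # changing the copy leaves the original untouched

print(ft1.name)          # Arial
print(ft2.name)          # Tahoma
```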
Colours
Colours for fonts, backgrounds, borders, etc. can be set in three ways: indexed,
aRGB or theme. Indexed colours are the legacy implementation and the colours
themselves depend upon the index provided with the workbook or with the
application default. Theme colours are useful for complementary shades of
colours but also depend upon the theme being present in the workbook. It is,
therefore, advisable to use aRGB colours.
aRGB colours
RGB colours are set using hexadecimal values for red, green and blue.
>>> from openpyxl.styles import Font
>>> font = Font(color="FF0000")
The alpha value refers in theory to the transparency of the colour but this is not
relevant for cell styles. The default of 00 will be prepended to any simple RGB
value:
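For instance, a six-digit value is stored with the alpha prefix added (a small sketch; the colour itself is arbitrary):

```python
from openpyxl.styles import Font

font = Font(color="00FF00")   # plain RGB, no alpha given

print(font.color.rgb)         # 0000FF00
```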
There is also support for legacy indexed colours as well as themes and tints.
Indexed Colours
Standard Colours
Index
0-4 00000000 00FFFFFF 00FF0000 0000FF00 000000FF
Applying Styles
Styles are applied directly to cells
Styles can also be applied to columns and rows, but note that this applies only to
cells created (in Excel) after the file is closed. If you want to apply styles to entire
rows and columns then you must apply the style to each cell yourself. This is a
restriction of the file format:
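A hedged sketch of both approaches: styling each existing cell of a row yourself, and setting a style on a column dimension, which does not change cells that already exist. The header values and font choices are assumptions for illustration.

```python
from openpyxl import Workbook
from openpyxl.styles import Font

wb = Workbook()
ws = wb.active
ws.append(["Type", "Leaf Color", "Height"])

# Apply the style to every existing cell in row 1 yourself
bold = Font(bold=True)
for cell in ws[1]:
    cell.font = bold

# A column-level style only affects cells created later in Excel;
# the existing cell A1 keeps the font assigned above
ws.column_dimensions["A"].font = Font(italic=True)
```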
Named Styles
In contrast to Cell Styles, Named Styles are mutable. They make sense when you
want to apply formatting to lots of different cells at once. NB. once you have
assigned a named style to a cell, additional changes to the style will not affect
the cell.
Once a named style has been created, it can be registered with the workbook:
>>> wb.add_named_style(highlight)
But named styles will also be registered automatically the first time they are
assigned to a cell.
Once a named style has been registered with a workbook, it can be referred to
simply by name.
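The named style workflow can be sketched end to end as follows. The font and border settings are assumptions chosen for the example; only the style name "highlight" comes from the text above.

```python
from openpyxl import Workbook
from openpyxl.styles import Border, Font, NamedStyle, Side

# Build the named style (formatting values are illustrative)
highlight = NamedStyle(name="highlight")
highlight.font = Font(bold=True, size=20)
bd = Side(style="thick", color="000000")
highlight.border = Border(left=bd, top=bd, right=bd, bottom=bd)

wb = Workbook()
ws = wb.active

wb.add_named_style(highlight)    # explicit registration
ws["A1"].style = highlight       # assign the style object
ws["D5"].style = "highlight"     # or refer to it by name once registered

print(ws["D5"].style)            # highlight
```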
Number formats
• ‘Comma’
• ‘Comma [0]’
• ‘Currency’
• ‘Currency [0]’
• ‘Percent’
Informative
• ‘Calculation’
• ‘Total’
• ‘Note’
• ‘Warning Text’
• ‘Explanatory Text’
Text styles
• ‘Title’
• ‘Headline 1’
• ‘Headline 2’
• ‘Headline 3’
• ‘Headline 4’
• ‘Hyperlink’
• ‘Followed Hyperlink’
• ‘Linked Cell’
Comparisons
• ‘Input’
• ‘Output’
• ‘Check Cell’
• ‘Good’
• ‘Bad’
• ‘Neutral’
Highlights
• ‘Accent1’
• ‘20 % - Accent1’
• ‘40 % - Accent1’
• ‘60 % - Accent1’
• ‘Accent2’
• ‘20 % - Accent2’
• ‘40 % - Accent2’
• ‘60 % - Accent2’
• ‘Accent3’
• ‘20 % - Accent3’
• ‘40 % - Accent3’
• ‘60 % - Accent3’
• ‘Accent4’
• ‘20 % - Accent4’
• ‘40 % - Accent4’
• ‘60 % - Accent4’
• ‘Accent5’
• ‘20 % - Accent5’
• ‘40 % - Accent5’
• ‘60 % - Accent5’
• ‘Accent6’
• ‘20 % - Accent6’
• ‘40 % - Accent6’
• ‘60 % - Accent6’
• ‘Pandas’
Rich Text objects can contain a mix of unformatted text and TextBlock objects,
each of which contains an InlineFont style and the text to be formatted.
The result is a CellRichText object.
InlineFont objects are virtually identical to the Font objects, but use a different
attribute name, rFont, for the name of the font. Unfortunately, this is required by
OOXML and cannot be avoided.
Fortunately, if you already have a Font object, you can simply initialize
an InlineFont object with an existing Font object:
>>> from openpyxl.cell.text import InlineFont
>>> from openpyxl.styles import Font
>>> font = Font(name='Calibri',
... size=11,
... bold=False,
... italic=False,
... vertAlign=None,
... underline='none',
... strike=False,
... color='00FF0000')
>>> inline_font = InlineFont(font)
You can create InlineFont objects on their own, and use them later. This makes
working with Rich Text cleaner and easier:
For example:
The CellRichText object is derived from list, and can be used as such.
Whitespace
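>>> from openpyxl.cell.rich_text import CellRichText, TextBlock
>>> from openpyxl.cell.text import InlineFont
>>> red = InlineFont(color='00FF0000')
>>> t = CellRichText()
>>> t.append('xx')
>>> t.append(TextBlock(red, "red"))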
You can also cast it to a str to get only the text, without formatting.
>>> str(t)
'xxred'
Conditional Formatting
Excel supports three different types of conditional formatting: builtins, standard
and custom. Builtins combine specific rules with predefined styles. Standard
conditional formats combine specific rules with custom formatting. In addition,
it is possible to define custom formulae for applying custom formats using
differential styles.
Note
The syntax for the different rules varies so much that it is not possible for
openpyxl to know whether a rule makes sense or not.
Because the signatures for some rules can be quite verbose there are also some
convenience factories for creating them.
Builtin formats
The builtin conditional formats are:
• ColorScale
• IconSet
• DataBar
ColorScale
You can have color scales with 2 or 3 colors. 2 color scales produce a gradient
from one color to another; 3 color scales use an additional color for 2 gradients.
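A sketch of both variants using the ColorScaleRule convenience factory. The cell ranges, colours and the percentile midpoint are assumptions made for the example.

```python
from openpyxl import Workbook
from openpyxl.formatting.rule import ColorScaleRule

wb = Workbook()
ws = wb.active

# Two-colour scale: gradient from the minimum to the maximum value
two = ColorScaleRule(start_type="min", start_color="AA0000",
                     end_type="max", end_color="00AA00")
ws.conditional_formatting.add("A1:A10", two)

# Three-colour scale: an extra midpoint colour gives two gradients
three = ColorScaleRule(start_type="min", start_color="AA0000",
                       mid_type="percentile", mid_value=50, mid_color="0000AA",
                       end_type="max", end_color="00AA00")
ws.conditional_formatting.add("B1:B10", three)
```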
IconSet
DataBar
• Average
• Percent
• Unique or duplicate
• Value
• Rank
Note
The formula uses an absolute reference to the column referred to, B in this
case; but a relative row number, in this case 1 to the range over which the
format is applied. It can be tricky to get this right but the rule can be adjusted
even after it has been added to the worksheet’s conditional format collection.
• openpyxl.worksheet.worksheet.Worksheet.insert_rows()
• openpyxl.worksheet.worksheet.Worksheet.insert_cols()
• openpyxl.worksheet.worksheet.Worksheet.delete_rows()
• openpyxl.worksheet.worksheet.Worksheet.delete_cols()
The default is one row or column. For example, to insert a row at 7 (before the
existing row 7):
>>> ws.insert_rows(7)
To delete the columns F:H:
>>> ws.delete_cols(6, 3)
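The effect of these calls can be checked with a small sketch, assuming ten rows of sample numbers in columns A and B:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
for i in range(1, 11):
    ws.append([i, i * 10])      # rows 1..10 hold (i, 10 * i)

ws.insert_rows(7)               # old row 7 and everything below move down
ws.delete_cols(2)               # remove column B

print(ws["A7"].value)           # None (the newly inserted empty row)
print(ws["A8"].value)           # 7
```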
Note
Openpyxl does not manage dependencies, such as formulae, tables, charts, etc.,
when rows or columns are inserted or deleted. This is considered to be out of
scope for a library that focuses on managing the file format. As a result, client
code must implement the functionality required in any particular use case.
This will move the cells in the range D4:F10 up one row, and right two columns.
The cells will overwrite any existing cells.
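The move described above can be sketched with Worksheet.move_range(); the sample value is an assumption used to make the shift visible.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws["D4"] = "corner"

# Negative rows move up, positive cols move right
ws.move_range("D4:F10", rows=-1, cols=2)

print(ws["F3"].value)   # corner
print(ws["D4"].value)   # None
```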
If cells contain formulae you can let openpyxl translate these for you, but as this
is not always what you want it is disabled by default. Also only the formulae in
the cells themselves will be translated. References to the cells from other cells or
defined names will not be updated; you can use the Parsing Formulas translator
to do this:
This will move the relative references in formulae in the range by one row and
one column.
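A small sketch of a translated move, assuming a single formula cell in the moved range; the formula and range are illustrative.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws["G4"] = "=SUM(A1:A3)"

# translate=True shifts relative references by the same offset as the cells
ws.move_range("G4:H10", rows=1, cols=1, translate=True)

print(ws["H5"].value)   # =SUM(B2:B4)
```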
Note
By default, outline properties are initialized so you can directly modify each of
their 4 attributes, while page setup properties are not. If you want to modify the
latter, you should first initialize
an openpyxl.worksheet.properties.PageSetupProperties object with the required
parameters. Once done, they can be directly modified by the routine later if
needed.
>>> from openpyxl.workbook import Workbook
>>> from openpyxl.worksheet.properties import WorksheetProperties, PageSetupProperties
>>>
>>> wb = Workbook()
>>> ws = wb.active
>>>
>>> wsprops = ws.sheet_properties
>>> wsprops.tabColor = "1072BA"
>>> wsprops.filterMode = False
>>> wsprops.pageSetUpPr = PageSetupProperties(fitToPage=True, autoPageBreaks=False)
>>> wsprops.outlinePr.summaryBelow = False
>>> wsprops.outlinePr.applyStyles = True
>>> wsprops.pageSetUpPr.autoPageBreaks = True
Worksheet Views
There are also several convenient properties defined as worksheet views. You
can use ws.sheet_view to set sheet attributes such as zoom, show formulas or if
the tab is selected.
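For example, a sketch that sets the three attributes mentioned above (the zoom value is arbitrary):

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

view = ws.sheet_view
view.zoomScale = 85          # zoom level in percent
view.showFormulas = True     # display formulas instead of results
view.tabSelected = True      # mark the sheet tab as selected
```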
Fold (outline)
>>> import openpyxl
>>> wb = openpyxl.Workbook()
>>> ws = wb.create_sheet()
>>> ws.column_dimensions.group('A','D', hidden=True)
>>> ws.row_dimensions.group(1,10, hidden=True)
>>> wb.save('group.xlsx')
Validating cells
Data validators can be applied to ranges of cells but are not enforced or
evaluated. Ranges do not have to be contiguous: e.g. “A1 B2:B5” contains A1
and the cells B2 to B5 but not A2 or B1.
Examples
>>> from openpyxl import Workbook
>>> from openpyxl.worksheet.datavalidation import DataValidation
>>>
>>> # Create the workbook and worksheet we'll be working with
>>> wb = Workbook()
>>> ws = wb.active
>>>
>>> # Create a data-validation object with list validation
>>> dv = DataValidation(type="list", formula1='"Dog,Cat,Bat"', allow_blank=True)
>>>
>>> # Optionally set a custom error message
>>> dv.error ='Your entry is not in the list'
>>> dv.errorTitle = 'Invalid Entry'
>>>
>>> # Optionally set a custom prompt message
>>> dv.prompt = 'Please select from the list'
>>> dv.promptTitle = 'List Selection'
>>>
>>> # Add the data-validation object to the worksheet
>>> ws.add_data_validation(dv)
>>> # Create some cells, and add them to the data-validation object
>>> c1 = ws["A1"]
>>> c1.value = "Dog"
>>> dv.add(c1)
>>> c2 = ws["A2"]
>>> c2.value = "An invalid value"
>>> dv.add(c2)
>>>
>>> # Or, apply the validation to a range of cells
>>> dv.add('B1:B1048576') # This is the same as for the whole of column B
>>>
>>> # Check whether a cell is in the validator
>>> "B4" in dv
True
Note
Validations without any cell ranges will be ignored when saving a workbook.
Any whole number:
dv = DataValidation(type="whole")
Any whole number above 100:
dv = DataValidation(type="whole",
operator="greaterThan",
formula1=100)
Any decimal number:
dv = DataValidation(type="decimal")
Any decimal number between 0 and 1:
dv = DataValidation(type="decimal",
operator="between",
formula1=0,
formula2=1)
Any date:
dv = DataValidation(type="date")
or time:
dv = DataValidation(type="time")
Any string at most 15 characters:
dv = DataValidation(type="textLength",
operator="lessThanOrEqual",
formula1=15)
Custom rule:
dv = DataValidation(type="custom",
formula1="=SOMEFORMULA")
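Validation against a range of cells can be sketched with a list validator whose formula points at that range. The sheet title, the allowed values in B1:B3 and the validated range A1:A10 are all assumptions for this example; quote_sheetname() is used because the title contains a space.

```python
from openpyxl import Workbook
from openpyxl.utils import quote_sheetname
from openpyxl.worksheet.datavalidation import DataValidation

wb = Workbook()
ws = wb.active
ws.title = "Allowed values"

# Allowed entries live in B1:B3 of the same sheet
for row, animal in enumerate(["Dog", "Cat", "Bat"], start=1):
    ws.cell(row=row, column=2, value=animal)

# The list formula refers to the cell range instead of a literal list
dv = DataValidation(
    type="list",
    formula1="{0}!$B$1:$B$3".format(quote_sheetname(ws.title)),
)
ws.add_data_validation(dv)
dv.add("A1:A10")

print(dv.formula1)   # 'Allowed values'!$B$1:$B$3
```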
Worksheet Tables
Worksheet tables are references to groups of cells. This makes certain
operations such as styling the cells in a table easier.
Creating a table
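from openpyxl import Workbook
from openpyxl.worksheet.table import Table, TableStyleInfo
wb = Workbook()
ws = wb.active
data = [
    ['Apples', 10000, 5000, 8000, 6000],
    ['Pears', 2000, 3000, 4000, 5000],
    ['Bananas', 6000, 6000, 6500, 6000],
    ['Oranges', 500, 300, 200, 700],
]
# Add column headings. NB. these must be strings
ws.append(["Fruit", "2011", "2012", "2013", "2014"])
for row in data:
    ws.append(row)
# The table covers the heading row plus the four data rows
tab = Table(displayName="Table1", ref="A1:E5")
# Striped rows and banded columns; the style name is only an example
style = TableStyleInfo(name="TableStyleMedium9", showFirstColumn=False,
                       showLastColumn=False, showRowStripes=True,
                       showColumnStripes=True)
tab.tableStyleInfo = style
'''
Table must be added using ws.add_table() method to avoid duplicate names.
Using this method ensures the table name is unique among all defined names and
all other table names.
'''
ws.add_table(tab)
wb.save("table.xlsx")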
Table names must be unique within a workbook. By default tables are created
with a header from the first row and filters for all the columns. Table headers
and column headings must always contain strings.
Warning
In write-only mode you must add column headings to tables manually and the
values must always be the same as the values of the corresponding cells (see
below for an example of how to do this), otherwise Excel may consider the file
invalid and remove the table.
Styles are managed using the TableStyleInfo object. This allows you to stripe
rows or columns and apply the different colour schemes.
>>> ws.tables
{"Table1": <openpyxl.worksheet.table.Table object>}
>>> ws.tables.items()
[("Table1", "A1:D10")]
Delete a table
>>> del ws.tables["Table1"]
The number of tables in a worksheet
>>> len(ws.tables)
1
>>> headings = ["Fruit", "2011", "2012", "2013", "2014"] # all values must be strings
>>> table._initialise_columns()
>>> for column, value in zip(table.tableColumns, headings):
...     column.name = value
Filters
If you need to handle this you can extract the range of the table and define the
print area as the appropriate cell range.
Note
Filters and sorts can only be configured by openpyxl but will need to be applied
in applications like Excel. This is because they actually rearrange, format and
hide rows in the range.
To add a filter you define a range and then add columns. You set the range over
which the filter applies by setting the ref attribute. Filters are then applied to columns in
the range using a zero-based index, e.g. in a range from A1:H10, colId 1 refers to
column B. Openpyxl does not check the validity of such assignments.
from openpyxl import Workbook
from openpyxl.worksheet.filters import FilterColumn, Filters
wb = Workbook()
ws = wb.active
data = [
    ["Fruit", "Quantity"],
    ["Kiwi", 3],
    ["Grape", 15],
    ["Apple", 3],
    ["Peach", 3],
    ["Pomegranate", 3],
    ["Pear", 3],
    ["Tangerine", 3],
    ["Blueberry", 3],
    ["Mango", 3],
    ["Watermelon", 3],
    ["Blackberry", 3],
    ["Orange", 3],
    ["Raspberry", 3],
    ["Banana", 3]
]
for r in data:
    ws.append(r)
filters = ws.auto_filter
filters.ref = "A1:B15"
col = FilterColumn(colId=0) # for column A
col.filters = Filters(filter=["Kiwi", "Apple", "Mango"]) # add selected values
filters.filterColumn.append(col) # add filter to the worksheet
ws.auto_filter.add_sort_condition("B2:B15")
wb.save("filtered.xlsx")
This will add the relevant instructions to the file but will neither actually filter
nor sort.
Advanced filters
The following predefined filters can be
used: CustomFilter, DateGroupItem, DynamicFilter, ColorFilter, IconFilter and Top10
The signature and structure of the different kinds of filter varies significantly. As
such it makes sense to familiarise yourself with either the openpyxl source code
or the OOXML specification.
CustomFilter
CustomFilters can have one or two conditions which will operate either
independently (the default), or combined by setting the and_ attribute. Filter can
use the following
operators: 'equal', 'lessThan', 'lessThanOrEqual', 'notEqual', 'greaterThanOrEqual',
'greaterThan' .
cfs.and_ = True
In addition, Excel has non-standardised functionality for pattern matching with
strings. The options in Excel: begins with, ends with, contains and their negatives
are all implemented using the equal (or for negatives notEqual ) operator and
wildcard in the value.
For example: for “begins with a”, use a* ; for “ends with a”, use *a ; and for
“contains a”, use *a* .
DateGroupItem
Date filters can be set to allow filtering by different datetime criteria such as
year, month or hour. As they are similar to lists of values you can have multiple
items.
Print Settings
openpyxl provides reasonably full support for print settings.
>>> from openpyxl.workbook import Workbook
>>>
>>> wb = Workbook()
>>> ws = wb.active
>>>
>>> ws.page_setup.orientation = ws.ORIENTATION_LANDSCAPE
>>> ws.page_setup.paperSize = ws.PAPERSIZE_A5
The paper size is stored internally as an integer. A number of alias variables are
also available for common sizes (refer to PAPERSIZE_*
in openpyxl.worksheet.worksheet ). If you need a non-standard size, a full list can be
found by searching ECMA-376 pageSetup; set that value as the paperSize.
Pivot Tables
openpyxl provides read-support for pivot tables so that they will be preserved in
existing files. The specification for pivot tables, while extensive, is not very clear
and it is not intended that client code should be able to create pivot tables.
However, it should be possible to edit and manipulate existing pivot tables, e.g.
change their ranges or whether they should update automatically.
As is the case for charts, images and tables there is currently no management
API for pivot tables so that client code will have to loop over the _pivots list of a
worksheet.
Example
from openpyxl import load_workbook
wb = load_workbook("campaign.xlsx")
ws = wb["Results"]
pivot = ws._pivots[0] # any will do as they share the same cache
pivot.cache.refreshOnLoad = True