You are on page 1of 14

Python Style Guide | How to Write Neat and Impressive Python

Code
BE G I NNE R PRO G RA M M I NG PYT HO N T E C HNI Q UE

Overview

The Python Style Guide will enable you to write neat and beautiful Python code
Learn the different Python conventions and other nuances of Python programming in this Style Guide

Introduction

Have you ever come across a really poorly written piece of Python code? I’m talking about a tangled mess
where you had to spend hours just trying to understand what piece of code goes where. I know a lot of you
will nod your head at this.

Writing code is one part of a data scientist’s or analyst’s role. Writing beautiful and neat Python code, on
the other hand, is a different ball game altogether. This could well make or break your image as a proficient
programmer in the analytics or data science space (or even in software development).

Remember – our Python code is written once, but read a billion times over, potentially by viewers who are
not accustomed to our style of programming. This takes on even more importance in data science. So how
do we write this so-called beautiful Python code?

Welcome to the Python Style Guide!


A lot of folks in the data science and analytics domains come from a non-programming background. We
start off by learning the basics of programming, move on to comprehend the theory behind machine
learning, and then get cracking on the dataset. In this process, we often do not practice hardcore
programming or pay attention to programming conventions.

That’s the gap this Python Style Guide will aim to address. We will go over the programming conventions
for Python described by the PEP-8 document and you’ll emerge as a better programmer on the other side!

Are you completely new to Python programming? Then I’d suggest first taking the free Python course before
understanding this style guide.

Python Style Guide Contents

Why this Python Style Guide is Important for Data Science?


What is PEP8?
Understanding Python Naming Conventions
Python Style Guide’s Code Layout
Getting Familiar with using Comments
Whitespaces in your Python Code
General Programming Recommendations for Python
Autoformatting your Python code

Why this Python Style Guide is Important for Data Science?

There are a couple of reasons that make formatting such an important aspect of programming, especially
for data science projects:

Readability

Good code formatting will inevitably improve the readability of your code. This will not only present your
code as more organized but will also make it easier for the reader to easily understand what is going on in
the program. This will especially be helpful if your program runs into thousands of lines. With so many
dataframe, lists, functions, plots, etc. you can easily lose track of even your own code if you don’t follow
the correct formatting guidelines!

Collaboration

If you are collaborating on a team project, which most data scientists will be, good formatting becomes an
essential task. This makes sure that the code is understood correctly without any hassle. Also, following a
common formatting pattern maintains consistency in the program throughout the project lifecycle.

Bug fixes

Having a well-formatted code will also help you when you need to fix bugs in your program. Wrong
indentation, improper naming, etc. can easily make debugging a nightmare! Therefore, it is always better to
start off your program on the right note!

With that in mind, let’s have a quick overview of the PEP-8 style guide we will cover in this article!
 

What is PEP-8?

PEP-8, or Python Enhancement Proposal, is the style guide for Python programming. It was written by
Guido van Rossum, Barry Warsaw, and Nick Coghlan. It describes the rules for writing a beautiful and
readable Python code.

Following the PEP-8 style of coding will make sure there is consistency in your Python code, making it
easier for other readers, contributors, or yourself, to comprehend it.

This article covers the most important aspects of the PEP-8 guidelines, like how to name Python objects,
how to structure your code, when to include comments and whitespaces, and finally, some general
programming recommendations that are important but easily overlooked by most Python programmers.

Let’s learn to write better code!

The official PEP-8 documentation can be found here.

Understanding Python Naming Convention

Shakespeare famously said – “What’s in a name?”. If he had encountered a programmer back then, he
would have had a swift reply – “A lot!”.

Yes, when you write a piece of code, the name you choose for the variables, functions, and so on, has a
great impact on the comprehensibility of the code. Just have a look at the following piece of code:

# Function 1 def func(x): a = x.split()[0] b = x.split()[1] return a, b print(func('Analytics

Vidhya')) # Function 2 def name_split(full_name): first_name = full_name.split()[0] last_name =

full_name.split()[1] return first_name, last_name print(name_split('Analytics Vidhya'))

# Outputs ('Analytics', 'Vidhya') ('Analytics', 'Vidhya')

Both the functions do the same job, but the latter one gives a better intuition as to what it is happening
under the hood, even without any comments! That is why choosing the right names and following the right
naming convention can make a huge difference while writing your program. That being said, let’s look at
how you should name your objects in Python!

General Tips to Begin With

These tips can be applied to name any entity and should be followed religiously.

Try to follow the same pattern – consistency is the key!

thisVariable, ThatVariable, some_other_variable, BIG_NO

Avoid using long names while not being too frugal with the name either
this_could_be_a_bad_name = “Avoid this!” t = “This isn\’t good either”

Use sensible and descriptive names. This will help later on when you try to remember the purpose of
the code

X = “My Name” # Avoid this full_name = “My Name” # This is much better

Avoid using names that begin with numbers

1_name = “This is bad!”

Avoid using special characters like @, !, #, $, etc. in names

phone_ # Bad name

Naming Variables

Variable names should always be in lowercase

blog = "Analytics Vidhya"

For longer variable names, include underscores to separate_words. This improves readability

awesome_blog = "Analytics Vidhya"

Try not to use single-character variable names like ‘I’ (uppercase i letter), ‘O’ (uppercase o letter), ‘l’
(lowercase L letter). They can be indistinguishable from numerical 1 and 0. Have a look:

O = 0 + l + I + 1

Follow the same naming convention for Global variables

Naming Functions

Follow the lowercase with underscores naming convention


Use expressive names

# Avoid def con(): ... # This is better. def connect(): ...


If a function argument name clashes with a keyword, use a trailing underscore instead of using an
abbreviation. For example, turning break into break_ instead of brk

# Avoiding name clashes. def break_time(break_): print(“Your break time is”, break_,”long”)

Class names

Follow CapWord (or camelCase or StudlyCaps) naming convention. Just start each word with a capital
letter and do not include underscores between words

# Follow CapWord convention class MySampleClass: pass

If a class contains a subclass with the same attribute name, consider adding double underscores to
the class attribute

This will make sure the attribute __age in class Person is accessed as _Person__age. This is Python’s name
mangling and it makes sure there is no name collision

class Person: def __init__(self): self.__age = 18 obj = Person() obj.__age # Error obj._Person__age # Correct

Use the suffix “Error” for exception classes

class CustomError(Exception): “””Custom exception class“””

Naming Class Methods

The first argument of an instance method (the basic class method with no strings attached) should
always be self. This points to the calling object
The first argument of a class method should always be cls. This points to the class, not the object
instance

class SampleClass: def instance_method(self, del_): print(“Instance method”) @classmethod def

class_method(cls): print(“Class method”)

Package and Module names

Try to keep the name short and simple


The lowercase naming convention should be followed
Prefer underscores for long module names
Avoid underscores for package names

testpackage # package name sample_module.py # module name

Constant names

Constants are usually declared and assigned values within a module


The naming convention for constants is an aberration. Constant names should be all CAPITAL letters
Use underscores for longer names

# Following constant variables in global.py module PI = 3.14 GRAVITY = 9.8 SPEED_OF_Light = 3*10**8

Python Style Guide’s Code Layout

Now that you know how to name entities in Python, the next question that should pop up in your mind is
how to structure your code in Python!  Honestly, this is very important, because without proper structure,
your code could go haywire and can be the biggest turn off for any reviewer.

So, without further ado, let’s get to the basics of code layout in Python!

Indentation

It is the single most important aspect of code layout and plays a vital role in Python. Indentation tells
which lines of code are to be included in the block for execution. Missing an indentation could turn out to
be a critical mistake.

Indentations determine which code block a code statement belongs to. Imagine trying to write up a nested
for-loop code. Writing a single line of code outside its respective loop may not give you a syntax error, but
you will definitely end up with a logical error that can be potentially time-consuming in terms of debugging.

Follow the below mentioned key points on indentation for a consistent structure for your Python scripts.

Always follow the 4-space indentation rule

# Example if value<0: print(“negative value”) # Another example for i in range(5): print(“Follow this rule

religiously!”)

Prefer to use spaces over tabs

It is recommended to use Spaces over Tabs. But Tabs can be used when the code is already indented with
tabs.

if True: print('4 spaces of indentation used!')


Break large expressions into several lines

There are several ways of handling such a situation. One way is to align the succeeding statements with
the opening delimiter.

# Aligning with opening delimiter. def name_split(first_name, middle_name, last_name) # Another example. ans

= solution(value_one, value_two, value_three, value_four)

A second way is to make use of the 4-space indentation rule. This will require an extra level of indentation
to distinguish the arguments from the rest of the code inside the block.

# Making use of extra indentation. def name_split( first_name, middle_name, last_name): print(first_name,

middle_name, last_name)

Finally, you can even make use of “hanging indents”. Hanging indentation, in the context of Python, refers
to the text style where the line containing a parenthesis ends with an opening parenthesis. The
subsequent lines are indented until the closing parenthesis.

# Hanging indentation. ans = solution( value_one, value_two, value_three, value_four)

Indenting if-statements can be an issue

if-statements with multiple conditions naturally contain 4 spaces – if, space, and the opening parenthesis.
As you can see, this can be an issue. Subsequent lines will also be indented and there is no way of
differentiating the if-statement from the block of code it executes. Now, what do we do?

Well, we have a couple of ways to get our way around it:

# This is a problem. if (condition_one and condition_two): print(“Implement this”)

One way is to use an extra level of indentation of course!

# Use extra indentation. if (condition_one and condition_two): print(“Implement this”)

Another way is to add a comment between the if-statement conditions and the code block to distinguish
between the two:

# Add a comment. if (condition_one and condition_two): # this condition is valid print(“Implement this”)

Brackets need to be closed

Let’s say you have a long dictionary of values. You put all the key-value pairs in separate lines but where do
you put the closing bracket? Does it come in the last line? The line following it? And if so, do you just put it
at the beginning or after indentation?

There are a couple of ways around this problem as well.

One way is to align the closing bracket with the first non-whitespace character of the previous line.

# learning_path = { ‘Step 1’ : ’Learn programming’, ‘Step 2’ : ‘Learn machine learning’, ‘Step 3’ : ‘Crack on

the hackathons’ }
The second way is to just put it as the first character of the new line.

learning_path = { ‘Step 1’ : ’Learn programming’, ‘Step 2’ : ‘Learn machine learning’, ‘Step 3’ : ‘Crack on

the hackathons’ }

Break line before binary operators

If you are trying to fit too many operators and operands into a single line, it is bound to get cumbersome.
Instead, break it into several lines for better readability.

Now the obvious question – break before or after operators? The convention is to break before operators.
This helps to easily make out the operator and the operand it is acting upon.

# Break lines before operator. gdp = (consumption + government_spending + investment + net_exports )

Using Blank Lines

Bunching up lines of code will only make it harder for the reader to comprehend your code. One nice way to
make your code look neater and pleasing to the eyes is to introduce a relevant amount of blank lines in
your code.

Top-level functions and classes should be separated by two blank lines

# Separating classes and top level functions. class SampleClass(): pass def sample_function(): print("Top
level function")

Methods inside a class should be separated by a single blank line

# Separating methods within class. class MyClass(): def method_one(self): print("First method") def

method_two(self): print("Second method")

Try not to include blank lines between pieces of code that have related logic and function

def remove_stopwords(text): stop_words = stopwords.words("english") tokens = word_tokenize(text) clean_text =

[word for word in tokens if word not in stop_words] return clean_text

Blank lines can be used sparingly within functions to separate logical sections. This makes it easier to
comprehend the code

def remove_stopwords(text): stop_words = stopwords.words("english") tokens = word_tokenize(text) clean_text =


[word for word in tokens if word not in stop_words] clean_text = ' '.join(clean_text) clean_text =

clean_text.lower() return clean_text

Maximum line length


No more than 79 characters in a line

When you are writing code in Python, you cannot squeeze more than 79 characters into a single line. That’s
the limit and should be the guiding rule to keep the statement short.

You can break the statement into multiple lines and turn them into shorter lines of code

# Breaking into multiple lines. num_list = [y for y in range(100) if y % 2 == 0 if y % 5 == 0]


print(num_list)

Imports

Part of the reason why a lot of data scientists love to work with Python is because of the plethora of
libraries that make working with data a lot easier. Therefore, it is given that you will end up importing a
bunch of libraries and modules to accomplish any task in data science.

Should always come at the top of the Python script


Separate libraries should be imported on separate lines

import numpy as np import pandas as pd df = pd.read_csv(r'/sample.csv')

Imports should be grouped in the following order:


Standard library imports
Related third party imports
Local application/library specific imports

Include a blank line after each group of imports

import numpy as np import pandas as pd import matplotlib from glob import glob import spaCy import mypackage

Can import multiple classes from the same module in a single line

from math import ceil, floor

Getting Familiar with Proper Python Comments

Understanding an uncommented piece of code can be a strenuous activity. Even for the original writer of
the code, it can be difficult to remember what exactly is happening in a code line after a period of time.

Therefore, it is best to comment on your code then and there so that the reader can have a proper
understanding of what you tried to achieve with that particular piece of code.

 
General Tips for Including Comments

Always begin the comment with a capital letter


Comments should be complete sentences
Update the comment as and when you update your code
Avoid comments that state the obvious

Block Comments

Describe the piece of code that follows them


Have the same indentation as the piece of code
Start with a # and a single space

# Remove non-alphanumeric characters from user input string. import re raw_text = input(‘Enter string:‘) text
= re.sub(r'\W+', ' ', raw_text)

Inline comments

These are comments on the same line as the code statement


Should be separated by at least two spaces from the code statement
Starts with the usual # followed by a whitespace
Do not use them to state the obvious
Try to use them sparingly as they can be distracting

info_dict = {} # Dictionary to store the extracted information

Documentation String

Used to describe public modules, classes, functions, and methods


Also known as “docstrings”
What makes them stand out from the rest of the comments is that they are enclosed within triple
quotes ”””
If docstring ends in a single line, include the closing “”” on the same line
If docstring runs into multiple lines, put the closing “”” on a new line

def square_num(x): """Returns the square of a number.""" return x**2 def power(x, y): """Multiline comments.
Returns x raised to y. """ return x**y
 

Whitespaces in your Python Code

Whitespaces are often ignored as a trivial aspect when writing beautiful code. But using whitespaces
correctly can increase the readability of the code by leaps and bounds. They help prevent the code
statement and expressions from getting too crowded. This inevitably helps the readers to go over the code
with ease.

Key points

Avoid putting whitespaces immediately within brackets

# Correct way df[‘clean_text’] = df[‘text’].apply(preprocess)

Never put whitespace before a comma, semicolon, or colon

# Correct name_split = lambda x: x.split() # Correct

Don’t include whitespaces between a character and an opening bracket

# Correct print(‘This is the right way’) # Correct for i in range(5): name_dict[i] = input_list[i]

When using multiple operators, include whitespaces only around the lowest priority operator

# Correct ans = x**2 + b*x + c

In slices, colons act as binary operators

They should be treated as the lowest priority operators. Equal spaces must be included around each colon

# Correct df_valid = df_train[lower_bound+5 : upper_bound-5]

Trailing whitespaces should be avoided


Don’t surround = sign with whitespaces when indicating function parameters

def exp(base, power=2): return base**power

Always surround the following binary operators with single whitespace on either side:
Assignment operators (=, +=, -=, etc.)
Comparisons (==, <, >, !=, <>, <=, >=, in, not in, is, is not)
Booleans (and, or, not)
# Correct brooklyn = [‘Amy’, ‘Terry’, ‘Gina’, 'Jake'] count = 0 for name in brooklyn: if name == ‘Jake’:
print(‘Cool’) count += 1

General Programming Recommendations for Python

Often, there are a number of ways to write a piece of code. And while they achieve the same task, it is
better to use the recommended way of writing cleaner code and maintain consistency. I’ve covered some
of these in this section.

For comparison to singletons like None, always use is or is not. Do not use equality operators

# Wrong if name != None: print("Not null") # Correct if name is not None: print("Not null")

Don’t compare boolean values to TRUE or FALSE using the comparison operator. While it might be
intuitive to use the comparison operator, it is unnecessary to use it. Simply write the boolean
expression

# Correct if valid: print("Correct") # Wrong if valid == True: print("Wrong")

Instead of binding a lambda function to an identifier, use a generic function.  Because assigning
lambda function to an identifier defeats its purpose. And it will be easier for tracebacks

# Prefer this def func(x): return None # Over this func = lambda x: x**2

When you are catching exceptions, name the exception you are catching. Do not just use a bare except.
This will make sure that the exception block does not disguise other exceptions by KeyboardInterrupt
exception when you are trying to interrupt the execution

try: x = 1/0 except ZeroDivisionError: print('Cannot divide by zero')

Be consistent with your return statements. That is to say, either all return statements in a function
should return an expression or none of them should. Also, if a return statement is not returning any
value, return None instead of nothing at all

# Wrong def sample(x): if x > 0: return x+1 elif x == 0: return else: return x-1 # Correct def sample(x): if
x > 0: return x+1 elif x == 0: return None else: return x-1

If you are trying to check prefixes or suffixes in a string, use ”.startswith() and ”.endswith() instead of
string slicing. These are much cleaner and less prone to errors

# Correct if name.endswith('and'): print('Great!')


Autoformatting your Python code

Formatting won’t be a problem when you are working with small programs. But just imagine having to
follow the correct formatting rules for a complex program running into thousands of lines! This will
definitely be a difficult task to achieve. Also, most of the time, you won’t even remember all of the
formatting rules. So, how do we fix this problem? Well, we could use some autoformatters to do the job for
us!

Autoformatter is a program that identifies formatting errors and fixes them in place. Black is one such
autoformatter that takes the burden off your shoulders by automatically formatting your Python code to
one that conforms to the PEP8 style of coding.

You can easily install it using pip by typing the following command in the terminal:

pip install black

But let’s see how helpful black actually is in the real world scenario. Let’s use it to formats the following
poorly typed program:

Now, all we have to do is, head over to the terminal and type the following command:

black style_script.py

Once you have done that, if there are any formatting changes to be made, black would have already done
that in place and you will get the following message:

These changes will be reflected in your program once you try to open it again:
As you can see, it has correctly formatted the code and will definitely be helpful in case you miss out on
any of the formatting rules.

Black can also be integrated with editors like Atom, Sublime Text, Visual Studio Code, and even Jupyter
notebooks! This will surely be one extension you can never miss to add-on to your editors.

Besides black, there are other autoformatters like autopep8 and yapf which you can try out as well!

End Notes

We have covered quite a lot of key points under the Python Style Guide. If you follow these consistently
throughout your code, you are bound to end up with a cleaner and readable code.

Also, following a common standard is beneficial when you are working as a team on a project. It makes it
easier for other collaborators to understand. Go ahead and start incorporating these style tips in your
Python code!

Article Url - https://www.analyticsvidhya.com/blog/2020/07/python-style-guide/

Aniruddha Bhandari
I am on a journey to becoming a data scientist. I love to unravel trends in data, visualize it and predict
the future with ML algorithms! But the most satisfying part of this journey is sharing my learnings, from
the challenges that I face, with the community to make the world a better place!

You might also like