When it comes to data visualization, there are many possible tools: Matplotlib, Plotly,
Bokeh… Which one fits my short-term goals within a notebook, and is also a good
choice for the longer term, in production? What does production mean?
Now that you have a nice machine learning model, or you have completed some data
mining or analysis, you need to present and promote this amazing work. You may
initially reuse some notebooks to produce a few charts… but soon colleagues or clients
are requesting access to the data or are asking for other views or parameters. What
should you do? Which tools and libraries should you use? Is there a one-size-fits-all
solution for all stages of your work?
Data-visualization has a very wide scope, ranging from presenting data with simple
charts to be included in a report, to complex interactive dashboards. The former is
within reach of anybody who knows Excel, whereas the latter is more of a software
product that may require the full software development cycle and methodology.
https://towardsdatascience.com/which-library-should-i-use-for-my-dashboard-c432726a52bf 1/16
4/29/2021 Which library should I use for my Python dashboard? | by Antoine Hue | Towards Data Science
In between these two extreme cases, data scientists face many choices that are not
trivial. This post raises some of the questions that come up along the way, and offers
some tips and answers. The chosen starting point is Python within a Jupyter notebook;
the target is a Web dashboard in production.
You have a great new idea for the data visualization, your boss is in love with Sunburst
graphs, but is this doable with the charting libraries you are using?
Here is a sample of drawing a word cloud from the Word Cloud Python library
documentation¹:
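import numpy as np
import matplotlib.pyplot as plt
from wordcloud import WordCloud

text = "square"

# Circular mask: pixels outside the disc are 255 (masked out), inside are 0
x, y = np.ogrid[:300, :300]
mask = (x - 150) ** 2 + (y - 150) ** 2 > 130 ** 2
mask = 255 * mask.astype(int)

# Repeat the single word until the disc is filled
wc = WordCloud(background_color="white", repeat=True, mask=mask)
wc.generate(text)

plt.axis("off")
plt.imshow(wc, interpolation="bilinear")
plt.show()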
Network graphs
Network graphs are a specific category that is not natively handled by the above-listed
libraries. The main Python library for networks is NetworkX. By default, NetworkX
uses Matplotlib as a backend for drawing². Graphviz (Pygraphviz) is the de facto
standard graph drawing library and can be coupled with NetworkX³. With quite a few
lines of code, you may also use Plotly to draw the graph⁴. The integration between
NetworkX and Bokeh is also documented⁵.
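As a rough illustration of the default NetworkX and Matplotlib pairing, here is a minimal sketch. The graph data and node names are made up for the example, and the headless "Agg" backend is only there so the snippet also runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import networkx as nx

# Toy graph (hypothetical data): who talks to whom.
edges = [("ana", "bob"), ("bob", "cid"), ("cid", "ana"), ("cid", "dan")]
graph = nx.Graph(edges)

# Compute the node positions once, with a seed, so the layout is reproducible.
positions = nx.spring_layout(graph, seed=42)

# NetworkX draws onto a regular Matplotlib Axes object.
fig, ax = plt.subplots(figsize=(4, 4))
nx.draw_networkx(graph, pos=positions, ax=ax, node_color="lightblue")
ax.set_axis_off()
# In a notebook the figure renders inline; in a script, call fig.savefig(...).
```

Since the result is an ordinary Matplotlib figure, the usual Matplotlib calls for titles, sizes and export still apply.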
Geographic plots
Geographically located information and maps are also a specific subfield of
data-visualization. Plotting maps poses many challenges, among them:
Showing more or less information depending on the zoom, which is also known as the
level of detail. It affects both the readability of the plot and its complexity,
which turns into loading latency and memory footprint
Depending on the chosen plotting library, you may have to do little or a lot of
pre-processing. The common open-source library to deal with geographical coordinates is
Proj, while GDAL deals with file formats and with translation from formats or
coordinates to other contexts.
Matplotlib has no direct support for plotting maps; it relies on pixel matrices (a raster
image) as explained in the gallery⁶. But this is a last resort: you do not want to do that
unless your only target is a static image.
Plotly demonstrates map plots⁷ based on Mapbox. Many features are available, but at
least one is missing: the management of the level of detail in scatter plots.
Bokeh has some support for maps, including the integration of Google Maps⁸, but it
seems quite crude.
Map with areas (contours), clusters of markers, hover information created with Folium © the author
Folium is a Python library wrapping the Leaflet Javascript library. Leaflet is used in
many collaborative and commercial websites, for example OpenStreetMap. Folium is
very effective at drawing maps from open data.
The open-source reference for geographic data manipulation is QGIS from OSGeo. It is a
desktop application, but it includes a Python console and it is also possible to directly
use the PyQGIS API⁹.
Dataframes
Pandas and its Dataframe are a must for data science in Python. What is the impact on
chart and dashboard creation?
On the one hand, Pandas Dataframes and Series have a plot API. By default, it relies
on Matplotlib as the backend. However, Plotly is also available as the graphical backend
for Pandas¹⁰. Support for Bokeh is available through a GitHub project¹¹.
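To make the Pandas plot API concrete, here is a small sketch with made-up monthly figures. It uses the default Matplotlib backend; the commented line shows the Pandas option that switches to Plotly, assuming Plotly 4.8 or later is installed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import pandas as pd

# Hypothetical data for the illustration.
df = pd.DataFrame({
    "month": [1, 2, 3, 4],
    "sales": [10, 12, 9, 15],
    "costs": [7, 8, 8, 9],
})

# With the default backend, DataFrame.plot returns a Matplotlib Axes.
ax = df.plot(x="month", y=["sales", "costs"], title="Monthly figures")

# To use Plotly instead, set the backend before plotting:
# pd.options.plotting.backend = "plotly"
```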
On the other hand, the Pandas plots might not fit your requirements, and you may
wonder how to use dataframes in plots other than by passing columns and series as
vectors. Plotly Express has this capability with support for column-oriented data¹².
Enhancements to layouts
Matplotlib allows for uneven widths and heights through calls to the Figure.add_subplot
method, like “fig.add_subplot(3, 1, (1, 2))”, which makes a subplot that spans the upper
2/3 of the figure¹⁴. Seaborn introduces one enhancement, the scatter matrix¹⁵.
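For illustration, the uneven Matplotlib layout mentioned above can be sketched like this, with placeholder data: a main chart over the upper two thirds and a detail chart in the last third.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(6, 6))

# On a 3-row, 1-column grid, the tuple (1, 2) makes the subplot span rows 1 and 2.
ax_main = fig.add_subplot(3, 1, (1, 2))
# The third row holds a smaller subplot.
ax_detail = fig.add_subplot(3, 1, 3)

ax_main.plot([0, 1, 2, 3], [0, 1, 4, 9])
ax_detail.bar([0, 1, 2, 3], [3, 1, 2, 4])
```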
Plotly allows for similar capabilities including the uneven sub-plots. However, I find the
API rather limited. For example, it is not possible to set the font size of sub-plots title or
Plotly, with its Express API, goes further with marginal probability plots as histograms
or rugs¹⁷, and also a synchronized overview-detail chart that is called a “range slider”¹⁸.
This leads us to the interactivity of graphs, which is detailed in the next section.
A way forward
But what if these layout helpers are not enough for my purpose? Possible answers are
many, ranging from the Plotly Dash solution to full HTML layout or SVG custom design
with d3.js.
Plotly Dash offers an intermediate solution in which you stick to Python but can
generate more complex and more interactive dashboards than with the plotting
libraries alone. Still, it requires some basic HTML knowledge, and sooner or later
you will dive into Cascading Style Sheets (CSS).
Dashboard layout with several plots and HTML elements using Dash, Plotly and Dash-Bootstrap-Components
© the author
Interactivity covers many different things. It starts with common operations like zoom
and pan. The next step is synchronized graphs: zoom and pan are applied simultaneously
to several charts that share an axis. You might also be interested in synchronous
selection on two graphs, also called brushing (example in Bokeh¹⁹). Matplotlib has such
interactivity for all render engines except within a notebook²⁰. There is, however, a
solution based on Matplotlib: mpld3 handles this and might provide all you need. Still,
the trend is to use newer libraries like Plotly or Bokeh that have zoom and pan in
notebooks “out of the box”.
Then come dynamic annotations. They range from hover information when the mouse is
located on a marker to line plot highlights. Regarding hover, whichever library is used
(Matplotlib with mpld3 and plugins, Plotly, Bokeh), it means attaching an HTML
document div to each marker, and probably also some CSS.
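As a sketch of these three levels in Bokeh (synchronized pan and zoom, brushing, hover), the example below uses invented data. Sharing the `x_range` links the axes, sharing the `ColumnDataSource` links the selections, and `HoverTool` adds the tooltip.

```python
from bokeh.layouts import row
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# One shared data source: a selection made in one plot shows up in the other.
source = ColumnDataSource(data={
    "x": [1, 2, 3, 4],
    "y": [4, 3, 2, 1],
    "z": [1, 4, 9, 16],
    "label": ["a", "b", "c", "d"],
})

tools = "pan,wheel_zoom,box_select,reset"
left = figure(title="y vs x", tools=tools)
left.scatter("x", "y", size=10, source=source)

# Reusing left.x_range synchronizes pan and zoom along x.
right = figure(title="z vs x", tools=tools, x_range=left.x_range)
right.scatter("x", "z", size=10, source=source)

# Hover information built from the columns of the data source.
left.add_tools(HoverTool(tooltips=[("label", "@label"), ("x", "@x")]))

layout = row(left, right)
# Display with bokeh.io.show(layout), after output_notebook() in a notebook.
```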
Bokeh plot of a UMAP projection of the MNIST digits (784 pixels) in the 2D feature plane with hover information
containing the original image © the author
More complex interactions are related to filtering or querying the data. This can be close
to the zoom function when the filter modifies a range (e.g., a daily / weekly / monthly
selector for a time series), or it can be a selector on the series or facets, or even more
complex associations. Selectors are available in Plotly as Custom Controls²¹ and in Bokeh
as widgets²².
The common plotting libraries provide basic capabilities for interactivity, up to the
creation of some widgets, but, as for advanced layouts, I would suggest switching directly
to Plotly Dash, which is more versatile.
If fluidity is gone, you have four main solutions, with increasing complexity:
Simplify the dashboard, with fewer plots and fewer controls. That may be OK, but then
you should ask yourself why such complexity was needed in the first place.
Simplify the data, that is, show less data or data with less granularity. That may
provide a good tradeoff between features and accuracy. It leads to the next solution…
issues linked to data management. In the end, you will do a lot more data
engineering and will probably reach a dead-end with too much data, too many
tables. The solution to the dead-end is even more data engineering with…
A server with an API is not the first thing you were thinking of, but you end up dealing
with this software project sooner than you think. It is also better to anticipate it than
to delay until there is no other solution and project deadlines are coming fast.
Defining an API often involves several teams and skills: data scientists to define the why,
data engineers to define the what, and infrastructure engineers to define the how,
including performance, security, persistence, and integrity.
Plotly Dash allows for an initial API since it is based on the Flask framework. See the
Dash documentation on integration with a Flask app, which defines a custom Flask
route that could serve data, i.e. an API²³. There is still no proper API access
management, including authentication. On that latter aspect, Dash is very limited²⁴.
Some tools are effective (they deliver the expected result) but not efficient (getting to
the result takes a large amount of time). For example, d3.js is known as a very powerful
and complete data-visualization and charting library, but at the same time it requires
dealing with many things that are available by default in libraries with a higher level of
abstraction.
Productivity is coming not only with using the right level of abstraction, but also an API
that is easy to use and well documented. I would say that none of the surveyed Python
charting libraries are easy to master.
Matplotlib’s API is quite complex when dealing with axes and formats (of labels); it is
not always consistent and is quite redundant. As an example, see the above comment on
“`plt.subplot()`”. It is not the only case: there is also a sub-routine
“`plt.xlabel()`” that is equivalent to the method “`ax.set_xlabel()`” on the Axes object.
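The redundancy is easy to see side by side. In this sketch the left Axes is labelled through the pyplot interface and the right one through the object interface, with the same result.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt

fig, (ax_a, ax_b) = plt.subplots(1, 2)

# pyplot style: functions act on the "current" Axes.
plt.sca(ax_a)  # make ax_a the current Axes
plt.plot([0, 1], [0, 1])
plt.xlabel("time (s)")

# Object style: methods are called on the Axes itself.
ax_b.plot([0, 1], [0, 1])
ax_b.set_xlabel("time (s)")
```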
Plotly’s API is not better. First, you must choose between two API sets: the Express set,
which is quite simple but limited and mostly targeted at dataframes, and the Graph
Objects set, which is more complete and complements the Express set but lacks
some nice high-level features that are in Express. Then you will have to deal with the
Plotly documentation, which is, to me, really difficult to get through. Searching with
either the Plotly website's internal search or a Web search engine seldom leads you to the
API you are looking for. And you may have to deal with the documentation of the
underlying data model in JSON.
Bokeh's API is probably leaner and better documented, but it has some oddities, like
needing two separate instructions to plot a line chart and its associated markers.
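For instance, a line with markers takes two glyph calls on the same figure, as in this sketch with placeholder data:

```python
from bokeh.plotting import figure

x = [1, 2, 3, 4]
y = [3, 1, 4, 2]

p = figure(title="Line with markers")
line_renderer = p.line(x, y, line_width=2)   # first call: the line
marker_renderer = p.scatter(x, y, size=8)    # second call: the markers
```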
I really need a nice and slick web app, should I be afraid of it?
Your dashboard is successful and will be deployed as a product internally to your
organization, available to clients, or even directly exposed on the Internet.
As a data scientist, you lack some of the skills to handle that, and you will get the
help of software specialists. However, you are asked about the effort and scope of this
development. This highly depends on the path to production, on which framework is used
there, and on the framework that you have been using until now.
Plotly Dash may in some cases go up to production as the main Web app or as an embedded
widget²³. As said earlier, in such a setup you will need to inspect the security of your
system before jumping to online production. Regarding security, as a data designer, you
mainly need to check that you are exposing only the wanted data and nothing more.
Network and geographical interactive graph showing the incoming subway traffic in Paris © Data-Publica
A single-page application (SPA) has two separate components: the browser-side
application in Javascript, with a framework like Angular or React.js, and the server-side
application or service, which can be written with many frameworks and languages: Java,
C#, PHP, Javascript… and even Python.
Dash is already doing part of it. In fact, Dash uses one of the leading browser-side
frameworks, React.js, and on the server side it is based on Flask and Python. But as said
above, you may reach some limits of Dash.
Besides the transition through Dash, Plotly and Bokeh have another advantage: they are
also available in Javascript, as Plotly.js (with a React.js wrapper²⁶) and BokehJS. In
fact, the Python version of Plotly is a wrapper around the Javascript library. This
implies that, given some plots or dashboards in Python based on Plotly, Dash or Bokeh,
most of the concepts and chart properties can be reused in the equivalent Javascript
implementation.
Conclusion
In this post, we have sketched the path of a data-visualization dashboard from
experiments within notebooks up to production. We have seen that the traditional
plotting library, Matplotlib, still has strong features and is usually the default backend
for specialized libraries like NetworkX and the Pandas Dataframe. But Matplotlib is also
lagging on some aspects like integration and interactivity. Learning another framework
may be a good investment and will help you go forward up to production.
Two alternative frameworks are presented: Plotly and Bokeh. Both bring value as they
are more modern than Matplotlib. Both of them have a leading advantage when it comes
to bringing the dashboard to production: they are based on Javascript plotting
frameworks, and most of the Python plotting code can be translated directly into its
Javascript equivalent.
Plotly has another advantage on the path to production: it is integrated with Dash, a
framework to develop simple dashboards as single-page applications while sticking to
Python. The required Javascript, including React components, and the server API are
generated smoothly by Dash.
We have also seen that, as a data scientist or data-visualization designer, you should
anticipate requirements like interactivity, and their implications that may lead to the
development of an API to serve data.
References