Table of Contents
Motivation
Background
Executive Summary
Tableau Server Powers Tableau Public
Dogfooding at Cloud Scale
New Architecture Updates
New Minimum Hardware Requirements
Performance Improvements
Parallel Queries
Query Fusion
Cache Server External Query Cache
Horizontal Scale for Data Engine
Other Improvements
Scalability Testing Goals
Testing Approach & Methodology
Virtual Machines
Physical Machines
System Saturation and Think Time
Little's Law
Think Time
Workload Mix Changes
New Methodology
Test Workbook Examples
Extract Characteristics
Standardized Isolated Environment
Deployment Topology
Measurement & Reporting
Transaction
Throughput
Saturation Throughput
Response Time
Concurrent Users
Results
Comparing Scalability of Tableau Server 9.0 with 8.3
Linearly Scaling Throughput
Overall Hardware Observations
Memory
Disk Throughput
Network Usage
8-Core Single Machine Comparison
Increased Memory Requirements
High Availability Impact
Applying Results
Backgrounder Considerations
Best Practices DIY Scale Testing
TabJolt - Tooling for Scalability Testing
Best Practices for Optimization In The Real World
Summary
Motivation
Many of our customers are making a strategic choice to deliver self-service
analytics at scale. It's natural for our customers (IT and business alike) to want
to understand how Tableau Server scales to support all their users globally.
In addition, customers want to plan ahead for capacity and hardware budget
allocations to accommodate increased adoption of Tableau.
As part of our Tableau 9.0 release process, we set a goal to understand how
Tableau Server 9.0 compares in scalability characteristics with Tableau Server
8.3. We also wanted to understand whether Tableau Server 9.0 scaled linearly
and how increased loads affected its availability.
Background
If you are used to traditional BI or are new to Tableau, it may help to understand
some core differences with how Tableau works.
Unlike traditional BI reports that are designed and developed for a limited set of
requirements, Tableau visualizations are built for interactivity. Users can ask any
number of questions about their data, without having to go through a traditional
software development life cycle to create new visualizations.
To provide self-service analytics at scale, and to help keep users in the flow
of analysis, we have built on top of existing innovative technologies for Tableau
Server 9.0.
With Tableau, the age-old idea of "query first, visualize next" is completely
changed. Patented technologies, including VizQL, seamlessly combine query
and visualization into one process.
Users focus on their business problems and on asking questions of their data,
instead of the old way of selecting data and picking from pre-built chart types.
They iteratively drag and drop dimensions, blend datasets, and create
calculations on various measures. During this process, Tableau creates clear
visualizations and seamlessly runs needed queries at the same time. This is a
different paradigm that you should factor in as you try to understand the scalability
of Tableau Server.
If you come from a traditional BI world, you are probably used to load-testing
static reports that meet a specific service level agreement (SLA). A static report
has a fixed scope, fixed set of queries and is often optimized by a developer, one
at a time, over many weeks.
Executive Summary
Tableau 9.0 is the biggest release in the history of our company. Since November
2014, very early in the 9.0 release cycle, we started performance and scalability
testing of new features as they were still being developed. We iteratively
incorporated design feedback for new features into the performance and load
testing for Tableau Server 9.0.
There are a number of factors that can impact performance and scalability,
including workbook design, server configuration, infrastructure tuning,
and networking.
Based on our goals and testing methodology we demonstrated that:
1. Tableau Server 9.0 is nearly linearly scalable across all scenarios tested.
2. Tableau Server 9.0 showed a 200+% improvement in throughput and a significant
reduction in response times compared to 8.3.
3. Tableau Server 9.0 showed increased memory and network usage compared
to 8.3.
With many new architectural updates in Tableau Server 9.0, we chose cluster
topologies based on iterative testing for new server design and common customer
scenarios. In the table below (Figure 1), each row represents a Tableau Server
9.0 cluster configuration of 1 Node - 16 Cores, 2 Node - 32 Cores, and 3 Node - 48 Cores.
We observed that in various configurations Tableau Server 9.0 could support
the following count of users when the system was at saturation. The table of
concurrent users included below represents the number of end users accessing
visualizations and interacting with them concurrently, at server saturation
using Little's Law.
In our test scenarios, we assume that roughly 10% of the total end users in
an organization or department are concurrently accessing and interacting
with visualizations.
Based on our testing and workloads, we observed that Tableau Server 9.0 can
support up to 927 total users on a 16-core single machine deployment, and scales
up to 2809 total users on a 48-core, 3-node cluster setup as shown in the table.
Deployment Configuration | Concurrent Users | Total Users
1 Node - 16 Cores        | 92.75            | 927
2 Node - 32 Cores        | 138.04           | 1380
3 Node - 48 Cores        | 280.93           | 2809
With over 100,000 authors, over 450 million views, and 500,000 visualizations,
Tableau Public plays a key role in allowing us to use our own products.
Dogfooding at Cloud-Scale
Using our own products to do our work on a daily basis is a core Tableau
cultural value.
Tableau Public gives us a cloud-scale test environment to test new versions of
Tableau Server. As part of the product release process, we deploy Tableau Server
pre-release software to Tableau Public. This enables us not only to deploy our
products at large scale in a production, mission-critical environment, but also to
understand, find, and fix issues related to scalability.
We deployed Tableau Server 9.0 to Tableau Public in the 9.0 Beta cycle.
This gave us ample opportunity not only to learn how the new architecture
scales in a real production situation, but also to find and fix issues
before we released the product to corporate customers.
Tableau Public has served more than 450 million impressions in its lifetime with
over 27 million in just the last month. It also supports more than 100,000 authors
who are creating and publishing over 500,000 visualizations to Tableau Public.
[Figure: the Tableau Public architecture, organized into a User Tier, a Storage
Tier, and a Management Tier. These tiers comprise Content Management Services,
Visualization Services, Data Provider Services, API Services, the Repository
(Postgres), File Store, Cluster Controller, Coordination Service,
and Backgrounder.]
Parallel Queries
This means that Tableau Server can have multiple connections open to your
back-end database and leverage more database resources where possible.
This allows compatible databases to work on queries in parallel instead of
sequentially, resulting in significantly faster query results. Whether this capability
benefits you specifically depends on how your back-end databases handle
parallel work presented to them.
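As a rough illustration of why this helps, the sketch below issues several independent dashboard queries concurrently instead of one after another. The function and query names are invented for the example, not Tableau APIs, and a `time.sleep` stands in for a database round trip.

```python
# Hypothetical sketch: issuing independent dashboard queries in parallel
# instead of sequentially. `run_query` is a stand-in, not a Tableau API.
from concurrent.futures import ThreadPoolExecutor
import time

def run_query(sql):
    # Simulate a 0.1 s round trip to the back-end database.
    time.sleep(0.1)
    return f"rows for: {sql}"

queries = [
    "SELECT region, SUM(sales) FROM orders GROUP BY region",
    "SELECT category, AVG(profit) FROM orders GROUP BY category",
    "SELECT ship_mode, COUNT(*) FROM orders GROUP BY ship_mode",
]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(queries)) as pool:
    results = list(pool.map(run_query, queries))
parallel_elapsed = time.perf_counter() - start

# Sequentially, three 0.1 s queries would take ~0.3 s; in parallel they
# finish in roughly the time of the slowest single query, provided the
# database can service the connections concurrently.
print(f"{len(results)} results in {parallel_elapsed:.2f}s")
```

The benefit depends entirely on the back-end database having spare capacity to service the extra connections, which is the point made above.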
Query Fusion
As the name suggests, we take multiple separate queries from a dashboard and
fuse them together where possible, reducing the number of queries sent to the
back-end database. This is particularly beneficial for live connections.
However, if your dashboard is not generating any queries that are combinable,
this optimization will not help you.
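The fusion idea can be sketched in a few lines: queries that are identical except for the output columns collapse into one query returning the union of those columns. This toy model (the tuple layout and `fuse` helper are invented for illustration) is not Tableau's implementation.

```python
# Toy model of query fusion: group queries by everything except their
# output columns, then merge the column lists within each group.
from collections import defaultdict

def fuse(queries):
    """Each query is a tuple: (table, filters, group_by, columns)."""
    groups = defaultdict(list)
    for table, filters, group_by, columns in queries:
        groups[(table, filters, group_by)].append(columns)
    fused = []
    for (table, filters, group_by), column_sets in groups.items():
        merged = []
        for cols in column_sets:
            for c in cols:
                if c not in merged:
                    merged.append(c)   # union of all requested columns
        fused.append((table, filters, group_by, tuple(merged)))
    return fused

dashboard = [
    ("orders", "year=2015", "region", ("SUM(sales)",)),
    ("orders", "year=2015", "region", ("AVG(profit)",)),   # fusable
    ("orders", "year=2015", "category", ("COUNT(*)",)),    # different group-by
]
print(fuse(dashboard))  # three dashboard queries reduced to two
```

If no two queries in a dashboard share the same table, filters, and grouping, nothing fuses, which is the caveat noted above.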
[Figure: query fusion. Multiple queries that are identical except for the
columns they return are fused into a single query containing all the necessary
columns, even when the output columns differ only by aggregations
or calculations.]
Cache Server External Query Cache
With the external query cache, we save the results from previous queries for
fast access by future users. If this is characteristic of your data freshness
and usage scenarios, then loading workbooks a second time will be significantly
faster for your end users.
The Cache Server process is powered by Redis, a highly scalable key-value
cache used by many large internet-scale providers.
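A minimal sketch of the external-cache idea is below, with a plain dict standing in for Redis so the example is self-contained. Keys are a hash of the normalized query text and values are serialized result sets; this key scheme is an assumption for illustration, not Tableau's actual cache design.

```python
# Sketch of an external query cache. A dict stands in for Redis here;
# the hashing/serialization scheme is illustrative only.
import hashlib
import json

cache = {}     # in production this would be a shared Redis instance
db_hits = 0

def run_on_database(sql):
    global db_hits
    db_hits += 1
    return [{"region": "West", "sales": 42}]   # pretend result set

def cached_query(sql):
    key = hashlib.sha256(sql.strip().lower().encode()).hexdigest()
    if key in cache:
        return json.loads(cache[key])          # fast path: cache hit
    result = run_on_database(sql)              # slow path: hit the database
    cache[key] = json.dumps(result)
    return result

first = cached_query("SELECT region, SUM(sales) FROM orders GROUP BY region")
second = cached_query("select region, sum(sales) from orders group by region")
print(db_hits)  # 1: the equivalent second query was served from cache
```

Because the cache lives outside any single server process, every node in the cluster can reuse results produced by any other node.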
[Figure: the Cache Server, powered by Redis, serves the Application Server,
Data Server, API Server, Backgrounder, and VizQL Server processes as a shared
external query cache in front of the back-end databases.]
Virtual Machines
Many customers deploy Tableau Server on virtual machines and run successful
scalable deployments. It is not the goal of this whitepaper to exhaustively
distinguish between physical and virtual infrastructure environments or across
the various virtualization platforms that are available. The level of performance
and scalability you can get on a virtualization platform also depends on the
configuration and tuning of the virtualization parameters for a given platform.
For example, using CPU overloading on VMware ESX is not recommended for
Tableau Server, because with heavier workloads, other applications may compete
with Tableau Server resource needs. Instead, you should consider running
Tableau Server on VMs with dedicated CPU affinity. There are virtualization
platform vendor-specific whitepapers you should review for best practices for your
chosen virtualization platform. A couple of examples for VMware are listed below
for your consideration.
Performance Best Practices for vSphere 5.5 guide
Deploying Extremely Latency-Sensitive Applications in vSphere 5.5
Tableau Server 9.0 runs as a server class application on top of any virtualization
platform. It requires sufficient compute resources and should be deployed with
that in mind. We recommend you seek guidance from your virtualization platform
vendor to perform tuning for your server deployment.
Physical Machines
Each physical machine deployment will vary depending on many factors.
For the purposes of these experiments, we wanted to minimize variability with
virtualization platforms and their specific tuning. So, we deployed Tableau Server
9.0 clusters on physical machines with homogenous hardware configuration in
a network-isolated lab.
For each test pass, we ran a predefined set of workloads and load mix against
16, 32, 48 cores across various cluster topologies. Through each iteration we
recorded not only the key performance indicators, but also system metrics
and application server metrics using JMX. For each of the runs, we correlated the
data and analyzed how the system behaved under increasing user loads.
At the end of each of the iterations, given the architectural changes, we reviewed
the results with our architecture team to inform future testing and methodology
updates. We also found and fixed scalability bugs as part of our agile
development process.
We ran many experiments that informed the deployment topologies for the final
tests. These experiments included studies of how server scalability is impacted
with various server component interactions. We will share these results as part
of this whitepaper.
In all, we ran over 1000 test iterations across one topology with each of the
iterations roughly taking two hours to complete. We measured and collected
a variety of system metrics and application metrics during the load tests to
understand how the system scaled with increasing loads while adding more
workers to the cluster.
System Saturation and Think Time
Often, infrastructure teams will want to measure and monitor CPU on the various
server processes and the machine. Typically, infrastructure teams want to allow
for sufficient CPU capacity headroom for burst load.
For example, 80% utilization on CPU could be a good indicator of saturation from
an infrastructure point of view. However, Tableau Server 9.0 is a workhorse and
requires sufficient compute capacity to do its work. It is not uncommon, at times,
to see some processes in a server cluster taking up 100% of a CPU's cycles.
This is by design and something the infrastructure teams should consider as
part of their monitoring strategy.
We measured the system saturation as a point during the load test where we
attain peak throughput saturation in combination with a ceiling on average
response times not exceeding three seconds. If the average latency exceeded
three seconds, we ignored any further increase in throughput of clients because
we wanted to take a conservative view on the reported numbers.
What this means in the context of our experiments is that Tableau Server could
allow more incremental user load on the system at the expense of increased
latencies for new users coming on to the system. In addition, we set a goal of
< 1% error rates (socket, HTTP, or other) for picking the point where we
measured saturation.
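The saturation rule described above can be sketched as a small selection function: take the highest observed throughput among load steps whose average response time stays at or under three seconds and whose error rate stays under 1%. The sample data here is invented to show the mechanics, not taken from the test runs.

```python
# Sketch of the saturation rule: peak throughput subject to a 3-second
# average latency ceiling and a <1% error-rate ceiling.
def saturation_point(samples):
    """samples: list of (tps, avg_response_s, error_rate) per load step."""
    eligible = [s for s in samples if s[1] <= 3.0 and s[2] < 0.01]
    return max(eligible, key=lambda s: s[0]) if eligible else None

load_steps = [
    (120.0, 0.9, 0.001),
    (180.0, 1.8, 0.004),
    (209.7, 2.9, 0.008),   # peak eligible throughput
    (215.0, 4.2, 0.009),   # latency ceiling exceeded; ignored
    (218.0, 5.1, 0.020),   # latency and error ceilings exceeded; ignored
]
print(saturation_point(load_steps))  # (209.7, 2.9, 0.008)
```

Ignoring the higher-TPS steps past the latency ceiling is the conservative choice the text describes: the server could accept more load, but at response times we chose not to report.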
Little's Law
The goal of these tests was to determine the point at which the system reaches
capacity, including the total number of users at that point. Little's Law helps us
illustrate this point very well.
Imagine a small coffee shop that has one barista who can only help one customer
at a time. People enter, buy coffee, and leave. A basic cup of coffee is served
up quickly and more complex drinks take longer. In addition, if the barista were
to take additional time to review instructions on preparing a drink, then the total
time for servicing the user is the time taken to review instructions plus the time
required to make the drink.
The end-to-end service time drives the rate at which they serve people and send
them on their way.
However, if the number of customers arriving exceeds the number of customers
leaving, eventually the coffee shop will fill up and no one else can enter.
The coffee shop is maxed out. The variables that determine the maximum number
of customers in the shop at any one time are the length of time they spend there,
the complexity of their drink order, and the number of workers serving them.
To apply the coffee shop analogy to Tableau Server, let's imagine each barista
represents a VizQL server process. The coffee is analogous to the loading of
a visualization or an end user interacting with a dashboard. Then, the number of
end users concurrently loading and interacting with visualizations becomes the
product of the average response time and the saturated throughput.
Concurrent Users = Average Response Time x Saturated Throughput
You may be wondering, what could represent the CPU in this analogy?
We could imagine the CPU being the hardware the barista uses to actually do
the work: the espresso machine, the juicer, the mixer, the coffee dispenser, etc.
An espresso machine that can pour one shot at a time, compared to one that can
pour four shots at a time, can make a material difference in how efficiently the
barista can service customers.
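Applying the formula above to the 1-node, 16-core figures reported later in this paper shows how the concurrent-user numbers were derived:

```python
# Little's Law: concurrent users = average response time x saturated
# throughput. Inputs are the 1-node, 16-core results from this paper.
def concurrent_users(saturated_tps, avg_response_s):
    return saturated_tps * avg_response_s

users = concurrent_users(209.7, 0.44)
print(round(users, 1))  # ~92.3, matching the ~92.75 concurrent users reported
```

The small difference from the reported 92.75 comes from rounding the published response time to two decimal places.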
Think Time
Often, load and performance testing teams add something called "think time"
to their response times or load testing scenarios. While this is a realistic concept,
in the context of analytics, think time can be difficult to predict.
For example, when looking at a visualization, I may quickly find what I want:
a very short think time. However, this may lead to a lengthy, iterative exploration
of the data. This additional exploration could all be considered the end user's
think time.
Traditional approaches have used this to mimic end user delay. In our approach,
we decided to test for real concurrency and ignored adding a specific think time
delay in our tests. In effect, our think time was zero.
On user ramp, there are many possible models. We ramped up one user per
second, with zero think time between their actions, until we reached saturation
as defined above.
Workload Mix Changes
We started doing performance and scalability testing very early in the release
cycle. Along the way, we made some key decisions that informed our approach.
We wanted to use a workload mix that would ensure we exercise the new
capabilities in the server and represent a realistic usage scenario including real
customer workbooks.
In our previous whitepaper on the same topic, defining or classifying a workload
as simple, complex, or moderate proved to be challenging. It often led to
subjective interpretations of what the terms meant.
For example, a workbook can look visually simple, but may have complexity
associated with the data required for it. This would make it a compute-intensive
workbook and one that benefits significantly from the product investments we
made in Tableau Server 9.0.
In order to simplify and exercise new features, we created a credible workload mix
across (a) a mix of realistic workbooks including customer workbooks (b) a mix of
users viewing and interacting with workbooks.
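One way a load generator might realize the view/interact split described above is to sample each simulated user action independently. The percentages are from this paper; the sampling code itself is an illustration, not TabJolt's implementation.

```python
# Sketch: sample each simulated action as 65% view / 35% interact.
import random

def next_action(rng):
    return "view" if rng.random() < 0.65 else "interact"

rng = random.Random(42)   # fixed seed so the run is reproducible
actions = [next_action(rng) for _ in range(10_000)]
view_share = actions.count("view") / len(actions)
print(f"view share: {view_share:.1%}")   # close to the 65% target
```

Over many thousands of actions, the realized mix converges on the configured percentages even though each individual action is random.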
The figure below shows a visual representation of the load mix used for
our tests.
[Figure: the VizQL Server load mix. Load ramp: one user per second. Mix: 65%
view and 35% interact, with 60% browser rendering and 40% server rendering,
across multiple workbooks.]
Figure 7: Showing the load mix used to represent multiple workload types for Tableau Server 9.0
New Methodology
The workload mix departs from the simple, moderate, complex workbook notion
that we used in the past whitepapers. Instead of running the server to saturation
with a workbook of one type (simple, complex, moderate) and using a user mix of
viewers and interactors, we wanted to make it more realistic.
We introduced a pool of workbooks that range in complexity. This pool included
workbooks that exercised the brand new features of Tableau Server and also
customers' workbooks.
Depending on the workbook's design, it may use browser rendering or server
rendering. Browser rendering is a capability that existed in previous versions of
Tableau Server. It allows modern browsers to do some of the workbook rendering,
reducing the server's workload.
In cases where a workbook is very complex, for performance reasons, Tableau
clients push the heavier rendering work on to the server. In response, the server
does the heavy lifting and just sends back tiles that make up the visualization.
This is referred to as server rendering.
Below is an example of a Story Point workbook, "My Himalayan Story," used for
testing. The interactions we exercised included the 8000ers overview, filtering
by index, tab switching, filter actions, and more.
[Figure: screenshot of the Story Point workbook, showing summits, expeditions,
and deaths on the 8,000-meter peaks by peak, season, and decade.]
Figure 8: A Story Point workbook that shows climbing and accident trends in the Himalayas.
Another test workbook shown above looks visually simple but took a long time to
load in previous releases. This was due to the same query being re-run separately
for each of the 4 views. This workbook was based on a taxi rides data set and the
specific interactions we exercised were the following: Select Categorical Filter By
Index, Tab Switching, Select November on the Calendar, Pick the Date 17th, Filter
to Cash, Switch Tab.
Other test workbooks were designed to test for performance under heavy loads
with 1,000,000 marks showing various trending analyses.
All of these workbooks were built using extracts.
Extract Characteristics
We chose to test with workbooks based on extracts. This eliminates any variability
that a live back-end data source can bring to the tests.
Realistic live connection scenarios vary significantly depending on how the
databases are used and what other loads are running on the databases
themselves. The extracts we used ranged in row count from 3,000 rows to
93 million rows of data (~3.5 GB in extract size).
In addition to workloads, there are many variables that can impact system
performance and scalability. In order to manage this variability and to drive
consistency among test runs, we standardized on several aspects of the test.
Standardized Isolated Environment
First, we standardized on the hardware. We ran these scalability tests in our
performance lab on physical machines with the following specifications.
[Table: lab machine specifications, listing Server Type, Operating System,
CPU, and Memory.]
Deployment Topology
Across each of the cluster nodes (workers), except the primary, we maintained the
following configuration of server processes:
Figure 11: Showing the server deployment topology for the scalability testing.
We scaled the workload using load generators driving the workload mix described
above. During test execution, we collected system metrics, performance metrics,
and application metrics using JMX. We saved the results in a relational data store.
We then analyzed the results using Tableau Desktop. The figure below shows
a logical but simplified view of the test execution. It's simplified only in that each
cluster node does not show all the server processes running on the machine.
[Figure: TabJolt load generators drive the workload through a gateway into each
server cluster. The first cluster node runs VizQL x 2, App Server x 1, and Data
Engine x 1; each additional node runs VizQL x 2 and App Server x 1. Test results
are stored and then analyzed with Tableau Desktop.]
Figure 12: The logical and simplified view of the test environment
Each of the test iterations collected a lot of data, but before we jump into the
results, lets understand some of the metrics and the definitions.
Transaction
A transaction is the end-user experience of loading a Tableau visualization and/or
interacting with a view. For example, if you are loading a visualization, the entire
set of requests (HTTP) that load the visualization represents a single transaction.
The response time is measured and reported for a transaction from the client's
perspective (that is, from where the load is being generated).
Throughput
Throughput is the number of transactions per second (TPS); for example, 5 TPS
equals 432,000 transactions in a 24-hour period. Tableau Public
has supported a peak of 1.3M page views in a day.
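The TPS-to-daily-volume arithmetic is just transactions per second times seconds in a day:

```python
# Converting a TPS rate into a daily transaction count.
SECONDS_PER_DAY = 24 * 60 * 60          # 86,400

def daily_transactions(tps):
    return tps * SECONDS_PER_DAY

print(daily_transactions(5))  # 432000, as stated above
```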
Saturation Throughput
Saturation throughput is the number of transactions per second across all clients
hitting the system when the system is in saturation. Our approach to determining
saturation point is described earlier in this paper.
Response Time
Response time is measured as the amount of time it takes the server to respond
to the end user request.
Concurrent Users
To understand concurrency in the context of Tableau Server, we will start by
defining what concurrency is not. Many times we speak to performance teams
that assume that user concurrency is defined as the number of users logged
into Tableau Server. While that is a logical metric, it is not representative of
concurrency in this whitepaper.
The number of logged-in users only measures the scalability of the Application
Server process. A user login exercises a narrow path in the system and is not the
same critical path that loads and interacts with a visualization, which does a lot
of the compute-intensive work.
For Tableau Server, concurrency is defined as the number of end users that are
actively loading and interacting with visualizations at a specified response time
and throughput goal. This is a core metric that informs the number of users that
we can support on a given system under test, at saturation. We use Little's Law to
extrapolate the number of concurrent users based on average response
times and saturated throughput across our experimentation and test execution.
Results
With all the new features and architecture updates in Tableau Server 9.0, we ran
several experiments to inform the following scenarios. We ran the same workload
using the same methodology, on both Tableau Server 9.0 and Tableau Server 8.3,
across the same topology and hardware in the same lab so we could compare
scalability across the releases.
Comparing Scalability of Tableau Server 9.0 with 8.3
With 16 cores or more, we see an increase in performance and scalability for
Tableau Server 9.0 compared to Tableau Server 8.3. Specifically, we saw Tableau
Server 9.0 scale from 927 total users on a single 16-core machine to 2809 total
users on a 3-node, 48-core server cluster with average response time well under
the goal of three seconds and an error rate below 1%.
For a typical workload, we demonstrated that Tableau Server 9.0 saturated
throughput increased from 209 TPS on a single node 16-core machine to 475
TPS on a 3-node 48-core machine. Reminding ourselves that TPS corresponds
to the number of visualizations loaded and interacted with in a second, we see
a nearly linearly scaling system where you can scale out by adding more worker
nodes to your cluster.
Topology          | Saturated Throughput (TPS) | Avg Response Time (s) | Concurrent Users | Total Users | Error Rate
1 Node - 16 Cores | 209.7                      | 0.44                  | 92.75            | 927         | 0.78%
2 Node - 32 Cores | 303.4                      | 0.46                  | 138.04           | 1380        | 0.58%
3 Node - 48 Cores | 475.5                      | 0.59                  | 280.93           | 2809        | 0.16%
We re-ran the same tests using Tableau Server 8.3 on the same hardware,
with the same methodology. We captured the results in the table below.
Topology | Saturated Throughput (TPS) | Avg Response Time (s) | Concurrent Users | Total Users | Error Rate
1 Node, 16 Cores | 61.0 | 1.47 | 89.88 | 899 | 0.46%
2 Nodes, 32 Cores | 144.6 | 0.69 | 99.82 | 998 | 0.06%
3 Nodes, 48 Cores | 152.6 | 1.36 | 206.76 | 2068 | 0.12%
We observed that Tableau Server 8.3 supported 899 total users on a single 16-core
machine, compared to 927 on the same configuration for Tableau Server 9.0.
In addition, Tableau Server 8.3 saturated at 2068 total users on a 3-node, 48-core
cluster compared to 2809 total users on Tableau Server 9.0. However, Tableau
Server 8.3 saturated at a much lower throughput: 61 TPS on a 16-core machine
compared to 209 TPS for Tableau Server 9.0. In the larger scale tests, we found
Tableau Server 8.3 saturated at 152 TPS on a 3-node, 48-core cluster, compared
to 475 TPS for Tableau Server 9.0.
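To put these numbers side by side, the small sketch below (using the saturated-throughput figures from the two tables above) computes the version-over-version speedup and the scale factor relative to a single node:

```python
# Saturated throughput (TPS) by node count, from the tables above.
tps_90 = {1: 209.7, 2: 303.4, 3: 475.5}  # Tableau Server 9.0
tps_83 = {1: 61.0, 2: 144.6, 3: 152.6}   # Tableau Server 8.3

for nodes in (1, 2, 3):
    speedup = tps_90[nodes] / tps_83[nodes]   # 9.0 vs. 8.3 at same topology
    scale = tps_90[nodes] / tps_90[1]         # 9.0 scaling vs. one node
    print(f"{nodes} node(s): 9.0 is {speedup:.1f}x faster; scale factor {scale:.2f}x")
```

At three nodes, 9.0 delivers roughly three times the throughput of 8.3 on identical hardware, which is the gap the narrative above describes.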
Memory
Compared to Tableau Server 8.3, we observed that Tableau Server 9.0 requires
more memory: roughly 40% more on a single 16-core machine and up to 70% more
RAM on a 3-node, 48-core cluster.
Disk Throughput
For the disks we used in our experiments, Tableau Server 9.0's disk throughput
varied by topology. In the single-machine, 16-core scenario, we saw a 14% increase
in disk throughput consumed between 8.3 and 9.0. However, a 3-node, 48-core
cluster actually showed a 30% reduction in disk throughput during the load tests
between server versions. In Tableau Server 9.0, we persist cluster state to disk,
and each of the new components in Tableau Server 9.0 logs to disk.
Network Usage
In Tableau Server 9.0, we now have several components that work together with
the new distributed query cache. In addition, we also have a coordination service
that maintains the state across the cluster. In comparison to 8.3, this shows
relatively large increases in network chatter. However, we did not observe a
significant impact from the network chatter on scalability or performance.
On its own, then, Tableau Server 9.0 scales and performs well in spite of the
increased network traffic. The takeaway for a real deployment is to consider
deploying Tableau Server on 10-gigabit (10GbE) networks when available.
8-Core Single Machine Comparison
For customers running Tableau Server on 8-core machines, we wanted to inform
how Tableau Server 9.0 would behave in comparison to 8.3 after an upgrade.
We ran a battery of tests with the same methodology to compare the results
across 9.0 and 8.3.
For the single 8-core machine scenario, we observed that Tableau Server 9.0 is
significantly better when you look at saturated throughput and response times,
when compared to 8.3, as shown in the table below.
Key Indicator (1 Node, 8 Cores) | Tableau Server 8.3 | Tableau Server 9.0
Saturated Throughput (TPS) | 34.2 | 129.9
Avg Response Time (s) | 1.80 | 0.46
Concurrent Users | 61.56 | 59.45
Error Rate | 1.71% | 0.78%
Figure 18: Comparing Server 8.3 and Server 9.0 on an 8-core machine
Figure 19: RAM utilization comparison across 8.3 and 9.0 for single machine deployments
Based on the specific tests in this whitepaper, if you have a single-machine,
8-core Tableau Server 8.3 instance, an in-place upgrade to 9.0 could give your
end users a performance boost, though you may see slightly poorer scaling due
to resource contention. Compared to 8.3, we saw fewer errors with 9.0 on a
single machine. The specific performance gains you see may vary depending on
many factors. We hope this helps inform your capacity planning as you consider
upgrading from 8.x to 9.0.
We made a lot of improvements to high availability (HA) in Tableau Server 9.0.
In addition to introducing new server processes such as the File Store, Cluster
Controller, and Coordination Service, Tableau Server 9.0 can now replicate
extracts onto all of the nodes in the cluster that run a Data Engine. We wanted to test
what impact, if any, the updates to HA would have on scalability. The following
section covers our observations.
High Availability Impact
In thinking about HA and non-HA, one key thing to remember is that we added
several new components to the server to support the new architecture for HA.
In order to test HA, we wanted to ensure we included the new application server
workload into the mix. We ran the new workload on the same hardware as before.
In Tableau Server 9.0, you must have at least three machines to run an HA
configuration. For more details, please read the server administration guide.
In one test, we enabled HA by adding a passive repository for failover, along with
File Store and Data Engine processes on every node in the cluster.
Compared to the non-HA deployment, enabling HA had a very small impact on
throughput and response times. This is minor and anticipated, because the server
does more work to keep the Postgres repositories and the extracts in sync.
However, we saw about a 10% increase in memory usage across all workers when
running an HA configuration.
The increase in memory usage did not impact the TPS significantly. We observed
a <1% reduction in TPS when HA was enabled. The error rate (mostly socket-read
timeouts) increased from 0.1% to 0.3%, still under our threshold of a 1%
error rate.
In addition, when running in HA mode, each publish action to the server (in case
of extract use) will require File Store processes to synchronize the new extracts
across all the nodes in the cluster. In previous versions, you could only run Data
Engine processes on up to two nodes in a cluster. In Tableau Server 9.0, you can
run Data Engine process on any number of nodes.
Each machine running the Data Engine process also requires a File Store
process. Given this configuration possibility, you should be aware that the
more nodes you set up for extract redundancy, the more costs you will incur in
synchronizing the extracts across the nodes. This cost is primarily reflected in
network usage and should make you cautious about deploying server workers
across slow network links.
Applying Results
At this point, you are probably wondering how this applies to you and how you
can determine the capacity you need for your deployments. In this paper, we
demonstrated that Tableau Server scales nearly linearly for user concurrency.
One approach you could take is to leverage the guidance in this paper to find
the capacity you may need and use it as a baseline. Your actual results will
vary because you will not be using the same system we used for our tests in
this whitepaper.
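As one hypothetical way to use these results as a baseline, you could scale the total-user figures from this paper by a factor reflecting how your hardware compares to the lab machines. The helper below is purely illustrative (the function name and `hardware_factor` parameter are ours, not a Tableau tool); the per-topology numbers come from the 9.0 table earlier in this section, and the scaling factor is something you would have to derive from your own load tests:

```python
# Total users observed for Tableau Server 9.0 in this paper's tests
# (16 cores per node, 10% of users active).
TOTAL_USERS_9_0 = {1: 927, 2: 1380, 3: 2809}  # nodes -> total users

def estimated_capacity(nodes: int, hardware_factor: float = 1.0) -> int:
    """Rough baseline estimate of total supported users.

    hardware_factor is a hypothetical scaling knob: 1.0 means hardware
    comparable to the 16-core lab machines used in this paper; measure
    your own systems to pick a realistic value.
    """
    return int(TOTAL_USERS_9_0[nodes] * hardware_factor)

print(estimated_capacity(3, hardware_factor=0.8))  # e.g. slower disks/CPU
```

Treat such an estimate only as a starting point; the sections below discuss workloads (like backgrounders) that this simple scaling ignores.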
Backgrounder Considerations
Much of what we discussed was in the context of user-facing workloads, the most
critical of those being view loading and interaction, and portal interactions.
The Backgrounder server process does much of the work related to extract
refreshes, subscriptions, and other scheduled background jobs. These jobs don't
compete for capacity if you schedule them to run at off-peak hours. When this is
not possible, you should plan for and add the capacity needed for your backgrounders
and non-user-facing workloads to run concurrently with user-facing processes.
Backgrounders are designed to consume an entire core's capacity because they
are designed to finish the work as quickly as possible. When you run multiple
backgrounders, you should consider the fact that a background server process
may impact other services running on the same machine. A good best practice
is to ensure that for N cores available to Tableau Server on your machine, you
run between N/4 and N/2 backgrounders on that machine.
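The N/4 to N/2 rule of thumb above can be sketched as a small helper (the function is our own illustration, not a Tableau API):

```python
def backgrounder_range(cores: int) -> tuple[int, int]:
    """Suggested Backgrounder process count for a machine with `cores`
    cores available to Tableau Server: between N/4 and N/2, at least 1."""
    low = max(1, cores // 4)
    high = max(low, cores // 2)
    return (low, high)

# Example machine sizes from this paper's test topologies.
for n in (8, 16, 48):
    print(n, "cores ->", backgrounder_range(n))  # e.g. 16 cores -> (4, 8)
```

On a 16-core node this suggests four to eight Backgrounder processes; fewer if user-facing services share the machine.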
While not required, you could separate out background server processes
onto dedicated hardware, as necessary, to isolate their impact on end-user
workloads.
If you are looking to conduct your own load testing to find out how Tableau Server
scales in your environment with your workloads, here are some best practices.
Less is more - In prior releases, you may have had to run multiple VizQL Server
processes on a single machine to handle load. However, with the introduction of a
core technology called protocol groups, which allows for parallel queries, the
VizQL Server can now use multiple connections to back-end databases. You may
find that in Tableau Server 9.0 you get better scale by actually reducing the
number of VizQL processes to fewer than four. The new default configuration of
running two VizQL Server processes per machine is now a best practice.
Summary
In this whitepaper, we went into detail and provided context on our approach,
appropriate changes in the methodology, and the final results of our Tableau
Server 9.0 server scalability results. We demonstrated that Tableau Server 9.0
scales linearly and performs better when compared to Tableau Server 8.3.
We observed that Tableau Server 9.0 could support up to 927 total users on
a single 16-core machine and scales up to 2809 total users on a 3-node, 48-core
cluster, with 10% of users actively loading and interacting with visualizations.
We hope that you use this whitepaper as a source of guidance for your own
Tableau Server 9.0 deployments. Given that every environment, workload and
deployment will look different, your results may vary.
About Tableau
Tableau helps people see and understand data. Tableau helps anyone quickly analyze, visualize and
share information. More than 29,000 customer accounts get rapid results with Tableau in the office
and on-the-go. And tens of thousands of people use Tableau Public to share data in their blogs and
websites. See how Tableau can help you by downloading the free trial at tableau.com/trial.
Additional Resources
Download Free Trial
Related Whitepapers
High Availability: Mission-Critical Rapid-Fire BI with Tableau Server
Tableau Secure Software Development
See All Whitepapers
Tableau and Tableau Software are trademarks of Tableau Software, Inc. All other company and
product names may be trademarks of the respective companies with which they are associated.