Session 6

Box Plots
Box and Whisker plots

 Open a new sheet and name it Box Plot.

 Drag Category to the Columns shelf and Profit to the Rows shelf. Also
drag Ship Mode to the right of Category on the Columns shelf.
 Choose the box-and-whisker plot from Show Me.
 A chart appears showing the box plots.
 Tableau automatically reassigns Ship Mode to the Marks card.
Box plots with two dimensions

 Add the Region dimension to the Columns shelf.

Box plot – Understand it better
 Open a new sheet and name it Box Plot1.
 Drag Segment to Columns. Drag Discount to Rows. Take the average of
Discount.
 Drag Region to Columns, and drop it to the right of Segment.
 Select the box-and-whisker plot from Show Me.
 Region is automatically reassigned to Detail on the Marks card, and the
mark type is set to Circle.
 Drag Region from the Marks card back to Columns. The horizontal lines
that appear are flattened box plots, which happens when box plots are
based on a single mark.
 Box plots are intended to show a distribution of data, and that can be
difficult when data is aggregated.
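To see why an aggregated view flattens the box, the following minimal sketch computes the three box statistics for row-level values and for a single aggregated mark. The discount values and the helper name box_stats are hypothetical, and the quartile method used here is Python's "inclusive" method, which may not match the one Tableau uses.

```python
import statistics

def box_stats(values):
    """Return (q1, median, q3) for a list of numbers."""
    if len(values) == 1:
        # A single mark has no spread, so all three statistics coincide.
        v = values[0]
        return (v, v, v)
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q1, med, q3)

# Hypothetical row-level discounts for one Segment/Region cell.
row_level = [0.0, 0.0, 0.1, 0.2, 0.2, 0.3, 0.5, 0.8]
# Aggregating first leaves one AVG(Discount) mark for the same cell.
aggregated = [statistics.mean(row_level)]

print("row level :", box_stats(row_level))
print("aggregated:", box_stats(aggregated))
```

With the row-level values the three statistics differ, so the box has height. With the single averaged mark they are identical, which is the flattened line seen in the view.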
Disaggregate data
 Select Analysis > Aggregate Measures (clear the check mark).

 To make a horizontal box plot, swap the axes. The Swap button is on the
toolbar, below the Map menu.

 Right-click (control-click on Mac) the bottom axis and select Edit
Reference Line. Click Edit Reference Line in the list that opens.
 For the fill, choose a bright colour.
So what is Clustering?
 Open a new sheet and name it Clustering.
 Connect to the Titanic data set.
 Variables to use: Age on Columns and Pclass on Rows. Go to the Analytics
tab and drag Cluster onto the view.
 Survived versus not survived is our target variable, so set the number
of clusters to 2 instead of Automatic. The clusters are represented by
two colours.
 Let us see how close we are to the actual survival class. Put Survived
on Detail on the Marks card. This is not a good visual comparison.
 Let us make a copy of the sheet (use Duplicate) and name it C1.
 Remove Clusters from the Marks card. Put Survived on Colour.
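The visual comparison between the two sheets can also be expressed as a small table of counts. The sketch below is one possible way to do that outside Tableau. The labels are made-up values, not real Titanic rows, and the helper name agreement is hypothetical. Because cluster numbers are arbitrary, it tries both ways of matching the two clusters to the two classes.

```python
from collections import Counter

def agreement(clusters, actual):
    """Cross-tabulate 2 clusters against a 0/1 class and score the best match."""
    table = Counter(zip(clusters, actual))
    n = len(clusters)
    # Cluster ids carry no meaning, so check both cluster -> class mappings.
    same = table[(0, 0)] + table[(1, 1)]
    flipped = table[(0, 1)] + table[(1, 0)]
    return table, max(same, flipped) / n

# Hypothetical labels for eight passengers.
clusters = [0, 0, 1, 1, 0, 1, 1, 0]
survived = [1, 1, 0, 0, 1, 0, 1, 0]

table, score = agreement(clusters, survived)
for (c, s), count in sorted(table.items()):
    print(f"cluster={c} survived={s} count={count}")
print(f"agreement = {score:.2f}")
```

A score near 0.5 would mean the clusters say little about survival. A score near 1 would mean they line up closely with the target.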
Let us make it better





Some points to remember
 Tableau uses the k-means algorithm for clustering.
 Tableau uses Lloyd’s algorithm with squared Euclidean distances.
 Tableau uses the Calinski-Harabasz criterion to assess cluster
quality. The higher this value, the better the clusters are:

$$\frac{SSB}{SSW} \times \frac{N - k}{k - 1}$$

 where SSB is the overall between-cluster variance, SSW the overall
within-cluster variance, k the number of clusters, and N the number of
observations.
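The points above can be illustrated with a short sketch of Lloyd's algorithm using squared Euclidean distances, followed by the Calinski-Harabasz score computed as (SSB / SSW) × (N − k) / (k − 1). This is a plain-Python illustration under stated assumptions, not Tableau's implementation: the toy points, the fixed starting centers, and all function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def lloyd_kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm: alternate nearest-center assignment and center update."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = mean_point(members)
    return labels, centers

def calinski_harabasz(points, labels, centers):
    """(SSB / SSW) * (N - k) / (k - 1); higher means better-separated clusters."""
    n, k = len(points), len(centers)
    overall = mean_point(points)
    ssb = sum(labels.count(j) * sq_dist(centers[j], overall) for j in range(k))
    ssw = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return (ssb / ssw) * (n - k) / (k - 1)

# Toy 2-D points forming two obvious groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centers = lloyd_kmeans(points, centers=[points[0], points[3]])
ch = calinski_harabasz(points, labels, centers)
print(labels, round(ch, 1))
```

Running the same data with different values of k and comparing the scores is the idea behind letting the number of clusters be chosen automatically.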

How to forecast accurately?

 Data: Global Superstore.
 Put Ship Date on Columns and Shipping Cost on Rows.
 Drag Forecast onto the view.
 The forecast indicator automatically goes to Colour on the Marks card.
 Look at the forecast interval at the end of the series.
 If you change the mark type from Automatic to Circle, the CI changes to box and
 Change the year in the shipping year.
Adjusting options
 Right-click in the view -> Forecast -> Forecast Options. Look at the
different options.
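Tableau's forecasts are based on exponential smoothing models, and the options dialog controls which variant is used. As a rough sketch of the simplest member of that family (simple exponential smoothing, with no trend or season), the example below uses made-up shipping costs and a hypothetical smoothing weight alpha. Tableau fits its own parameters, so this only illustrates the idea.

```python
def simple_exp_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    alpha close to 1 follows recent values closely;
    alpha close to 0 changes slowly.
    """
    level = series[0]
    levels = [level]
    for y in series[1:]:
        # New level is a weighted blend of the new value and the old level.
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels

# Hypothetical yearly shipping costs.
shipping_cost = [100.0, 120.0, 110.0, 130.0]
levels = simple_exp_smoothing(shipping_cost, alpha=0.5)

# With no trend or season, every future period gets the last level.
forecast = levels[-1]
print(levels, forecast)
```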

 Open a new sheet and name it F1.

 Put Order Date on Columns, and Order Priority and Sales on Rows. Change
the date to a green (continuous) pill at the month level.
 Put Order Priority on Colour on the Marks card.
 Drag Forecast onto the view.
 Right-click in the view -> Forecast -> Describe Forecast.
 Look at both tabs in the window: Summary and Models.
 Make a copy of the forecast sheet and name it F2.
 Click the drop-down menu on SUM(Shipping Cost) (the pill on the Rows or
Columns shelf with an arrow beside it, indicating a forecasted field) and
select Forecast Result.
 Add Profit to Tooltip on the Marks card. Open the drop-down menu, go to
Forecast Result, and click Precision. Add this precision to the tooltip
using the Insert option in the dialogue box.
 Make a copy of F2, this time as a crosstab, and name it F3.
 In the right-hand corner there is a Forecast Indicator legend showing
two colours.
 Swap the axes (Swap button below the Map menu).
 Move Forecast Indicator from Columns to Colour.
 Edit the colours: set Estimate to orange and click OK.
 From the Marks card, remove unnecessary fields such as Precision %.
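As a rough illustration of the precision fields used in the steps above, the sketch below assumes that Precision is the distance from the forecast value to the edge of the prediction interval, and that Precision % is that distance expressed as a percentage of the forecast value. The function name and the numbers are hypothetical.

```python
def precision_fields(forecast, upper):
    """Return (precision, precision_pct) for one forecast value.

    Assumption: 'upper' is the upper edge of the prediction interval,
    and the interval is symmetric around the forecast.
    """
    precision = upper - forecast
    precision_pct = 100.0 * precision / forecast
    return precision, precision_pct

# Hypothetical forecast of 200 with an upper prediction bound of 250.
precision, precision_pct = precision_fields(forecast=200.0, upper=250.0)
print(precision, precision_pct)
```

A large Precision % means the interval is wide relative to the forecast, which is a reason to treat that estimate with caution.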