Categories
How-To's Tableau Techniques

Robinson Projection in Tableau

When it comes to visualizing data, it’s no secret that the Mercator Projection has it’s issues. Certain countries, especially in the northern hemisphere, appear much larger than they are in reality, and it gets worse the farther you move from the equator. Just look at the difference in the size of Greenland between the Mercator and Robinson projections.

That’s because this particular projection was created for one specific purpose…to aid in navigation. And it is perfect for that purpose, but not so perfect for visualizing data. But when it comes to Tableau, it’s our only option…or is it?

I’d like to use this opportunity to point out that I did not come up with the methods discussed in this blog post, but they are techniques that I use pretty frequently, and something I get asked about a lot. So the goal of this post is to combine information from various sources, provide you with all of the files needed to create your own Robinson projections, and walk through, step by step, multiple approaches to building these maps in Tableau. For more information, I recommend checking out this blog post by Ben Jones, and this post by John Emery

I also want to use this opportunity to point out that these methods are more complicated, and less performant than using Tableau’s standard mapping, so I would recommend using them with caution.

Different Methods

This post is going to cover three different methods for creating Robinsons Projection Maps in Tableau.

  • Using a Shapefile
  • Using a CSV File (for countries)
  • Using a CSV File (for cities)

You can find all of the files needed for these methods here, as well as a sample workbook with all of the maps built.

Method 1: Using a Shapefile

This is by far the easiest method, but it’s a little less flexible than using a CSV File. To get started, download the following 4 files from the link above, and place them all in the same folder somewhere on your machine.

  • Country_ShapeFile_Robinson.DBF
  • Country_ShapeFile_Robinson.PRJ
  • Country_ShapeFile_Robinson.shp
  • Country_ShapeFile_Robinson.SHX

Now, let’s set up our data source

Building the Data Source

One quick note before we start. This particular Shapefile only has the 2-digit codes for countries. In order to relate this file to your data, you’ll need to make sure that your source also contains these codes (would not recommend trying to relate on country name). You can find a full list of codes here. Now let’s build our data source.

  • Connect to your Data
    • If you do not have data and just want to practice these techniques, you can use the Happiness_Scores.xlsx (from the World Happiness 2022 Report) file in the Google Drive
  • Connect to the Shapefile
    • From the Data Source Page, click “Add” at the top left to add a new connection
    • In the “To a File” options, select “Spatial File”
    • Navigate to the location where you stored the files above and select the Country_ShapeFile_Robinson.shp file
    • Drag the Shapefile into the data source pane
  • Update the relationship
    • Click on the noodle connecting the two sources
    • In the lower section, select the country code in each file (called ISO in Shapefile)

When complete, your data source should look like this

Building the View

  • Create a new worksheet
  • Double-click on the [Geometry] field
  • Add the 2-digit country code from your primary source onto Detail (in the sample data it’s called ISO A2)
  • Add whatever measure you would like to Color and assign your desired palette (in this example, I placed [Happiness Score] on color and assigned a Viridis palette)

When complete, your worksheet should look something like this.

Well, that’s a bit strange, isn’t it? Here’s an extremely quick explanation of what’s happening. Tableau only supports the Mercator projection. Also, in Tableau, you are limited to just the Map Mark Type for Shapefiles. When you add the [Geometry] field, Tableau generates the latitude and longitude for each polygon in your Shapefile and plots them on the map. But the coordinates that make up those polygons in our Shapefile are for a Robinson projection. So here you can see our Robinson Projection overlaid on Tableau’s Mercator Projection. There’s some magic happening behind the scenes in those Shapefiles that allows this to happen. You can read more about it in John’s post.

So now, let’s get rid of the Mercator Projection (or at least hide it).

First, you’ll want to remove the “Search” Map option. This search is based on the positions of countries in the Mercator Projection, not our new Robinson Projection, so if you try to search for a country, it’s going to bring you to the wrong place. I’d recommend disabling all of the Map options, but that’s up to you

  • In the upper toolbar, click on Map
  • Click on Map Options
  • Deselect the box labelled “Show Map Search”
  • Optional – Deselect all other options

Next we’ll want to remove all of the Map Layers

  • In the upper toolbar, click on Map
  • Click on Map Layers
  • Deselect all of the boxes in the Map Layers Window along the left side of the screen

The Map Layers Window should look like this

Finally, let’s remove our Worksheet Shading, just in case we want to put this on a colored dashboard, or lay it over a background image

  • In the upper toolbar, click on Format
  • Click on Shading
  • Change the first option (on the Sheet tab, under Default, labelled “Worksheet”) from White to None

And that should do it! At this point, your map should look something like this.

Wait a second…something’s still not right. Antarctica has no permanent population, so how did they respond to the World Happiness Survey? Well, they didn’t. In fact, there are several countries that are showing up on this map as yellow that aren’t in the World Happiness data. But because we’re using relationships, and because these countries exist in the Shapefile, they are being shown on the map and colored the same as the lowest value in the data source. This can be misleading, so let’s get rid of those.

  • Drag the measure (the same one currently on color) to the filter shelf
  • Click on “Special”
  • Select “Non-Null” Values

Ok, now we should be all set.

I got to be honest, something about this still bothers me. I would love to be able to see the countries that are missing data, but unfortunately, Tableau does not have the capability to ignore null values in the color application. But we have a few options.

Option 1: Create bins off of your measure and assign a color to each bin. Null will have its own bin

Option 2: The same as option 1 but use a calculated field to set thresholds instead of creating bins

Option 3: Duplicate our map worksheet, remove the measure from filter, remove the measure from color, set the color to how we want the “missing” values to appear, and then stack these two maps on top of each other on our dashboard (set to floating and set x, y, w, and h the same). The result would look something like this

Much better! Any of these methods will work, but I’m partial to the 3rd. This is mainly because you can use a diverging or continuous color palette instead of having to figure out which colors to assign to each “bin” with Option 1 and 2.

Now let’s move on to the CSV File method

Method 2: Using a CSV File (for countries)

This method has a lot of similarities to the first one, so I won’t go in to too much detail. The two main differences are; we are going to use a csv instead of a shapefile, and we are going to use the polygon mark type instead of a map.

To get started, download the CountryShapes.csv file from the Google Drive.

Building the Data Source

We’re going to set up our data source the same way as we did in Method 1. Connect to your data, add a connection for the csv file, and set up the relationship. This csv file, unlike the shapefile, has a ton of different country identifier fields. Do some prep work in advance to figure out which of these identifiers match what is in your data source, and set up the relationship using that field. When you’re finished, it should look something like this.

Building the View

Building the view is a little more complicated with this method. This csv file has the coordinates to “draw” all of the countries. But take a look at what happens if we try to use those coordinates as-is.

Now that looks an awful lot like a Mercator Projection. Well, that’s because it is. This file has the all of the coordinates to “draw” each country, but they are based on the Mercator Projection. So we need to re-calculate those latitude and longitude values. We’re going to build two new calculated fields, R_Lat (for our new latitude), and R_Lon (for our new longitude).

R_Lat

IF [Latitude]=0 THEN [Latitude]

ELSEIF ABS([Latitude])<5 THEN 0.5072 * (0+(0.0620-0) * ((ABS([Latitude])-0)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<10 THEN 0.5072 * (0.0620+(0.1240-0.0620) * ((ABS([Latitude])-5)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<15 THEN 0.5072 * (0.1240+(0.1860-0.1240) * ((ABS([Latitude])-10)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<20 THEN 0.5072 * (0.1860+(0.2480-0.1860) * ((ABS([Latitude])-15)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<25 THEN 0.5072 * (0.2480+(0.3100-0.2480) * ((ABS([Latitude])-20)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<30 THEN 0.5072 * (0.3100+(0.3720-0.3100) * ((ABS([Latitude])-25)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<35 THEN 0.5072 * (0.3720+(0.4340-0.3720) * ((ABS([Latitude])-30)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<40 THEN 0.5072 * (0.4340+(0.4958-0.4340) * ((ABS([Latitude])-35)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<45 THEN 0.5072 * (0.4958+(0.5571-0.4958) * ((ABS([Latitude])-40)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<50 THEN 0.5072 * (0.5571+(0.6176-0.5571) * ((ABS([Latitude])-45)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<55 THEN 0.5072 * (0.6176+(0.6769-0.6176) * ((ABS([Latitude])-50)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<60 THEN 0.5072 * (0.6769+(0.7346-0.6769) * ((ABS([Latitude])-55)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<65 THEN 0.5072 * (0.7346+(0.7903-0.7346) * ((ABS([Latitude])-60)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<70 THEN 0.5072 * (0.7903+(0.8435-0.7903) * ((ABS([Latitude])-65)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<75 THEN 0.5072 * (0.8435+(0.8936-0.8435) * ((ABS([Latitude])-70)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<80 THEN 0.5072 * (0.8936+(0.9394-0.8936) * ((ABS([Latitude])-75)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<85 THEN 0.5072 * (0.9394+(0.9761-0.9394) * ((ABS([Latitude])-80)/5)) * SIGN([Latitude])

ELSEIF ABS([Latitude])<90 THEN 0.5072 * (0.9761+(1-0.9761) * ((ABS([Latitude])-85)/5)) * SIGN([Latitude])

END

R_Lon

IF [Latitude]=0 THEN [Longitude]

ELSEIF ABS([Latitude])<5 THEN [Longitude] * (1-((ABS([Latitude])-0)/5) * (1-0.9986))

ELSEIF ABS([Latitude])<10 THEN [Longitude] * (0.9986-((ABS([Latitude])-5)/5) * (0.9986-0.9954))

ELSEIF ABS([Latitude])<15 THEN [Longitude] * (0.9954-((ABS([Latitude])-10)/5) * (0.9954-0.9900))

ELSEIF ABS([Latitude])<20 THEN [Longitude] * (0.9900-((ABS([Latitude])-15)/5) * (0.9900-0.9822))

ELSEIF ABS([Latitude])<25 THEN [Longitude] * (0.9822-((ABS([Latitude])-20)/5) * (0.9822-0.9730))

ELSEIF ABS([Latitude])<30 THEN [Longitude] * (0.9730-((ABS([Latitude])-25)/5) * (0.9730-0.9600))

ELSEIF ABS([Latitude])<35 THEN [Longitude] * (0.9600-((ABS([Latitude])-30)/5) * (0.9600-0.9427))

ELSEIF ABS([Latitude])<40 THEN [Longitude] * (0.9427-((ABS([Latitude])-35)/5) * (0.9427-0.9216))

ELSEIF ABS([Latitude])<45 THEN [Longitude] * (0.9216-((ABS([Latitude])-40)/5) * (0.9216-0.8962))

ELSEIF ABS([Latitude])<50 THEN [Longitude] * (0.8962-((ABS([Latitude])-45)/5) * (0.8962-0.8679))

ELSEIF ABS([Latitude])<55 THEN [Longitude] * (0.8679-((ABS([Latitude])-50)/5) * (0.8679-0.8350))

ELSEIF ABS([Latitude])<60 THEN [Longitude] * (0.8350-((ABS([Latitude])-55)/5) * (0.8350-0.7986))

ELSEIF ABS([Latitude])<65 THEN [Longitude] * (0.7986-((ABS([Latitude])-60)/5) * (0.7986-0.7597))

ELSEIF ABS([Latitude])<70 THEN [Longitude] * (0.7597-((ABS([Latitude])-65)/5) * (0.7597-0.7186))

ELSEIF ABS([Latitude])<75 THEN [Longitude] * (0.7186-((ABS([Latitude])-70)/5) * (0.7186-0.6732))

ELSEIF ABS([Latitude])<80 THEN [Longitude] * (0.6732-((ABS([Latitude])-75)/5) * (0.6732-0.6213))

ELSEIF ABS([Latitude])<85 THEN [Longitude] * (0.6213-((ABS([Latitude])-80)/5) * (0.6213-0.5722))

ELSEIF ABS([Latitude])<90 THEN [Longitude] * (0.5722-((ABS([Latitude])-85)/5) * (0.5722-0.5322))

END

Ok, so those calculations are a bit intense. But luckily, they are the only calculations that we need. Now let’s build our Robinson Projection.

  • Right click on [R_Lon] and drag it to Columns. When prompted, choose R_Lon without any aggregation
  • Right click on [R_Lat] and drag it to Rows. When prompted, choose R_Lat without any aggregation
  • Right click on your measure (Happiness score in the sample data) and drag it to Color. Choose MIN() when prompted
  • Drag [Polygon ID] to Detail
  • Drag [Sub Polygon ID] to Detail
  • Right click on [Point ID] and drag it to Path. When prompted, choose Point ID without any aggregation
  • Right click on the R_Lat axis, select “Edit Axis” and fix the axis to Start at -.6 and end at .6
  • Right click on the R_Lon axis, select “Edit Axis” and fix the axis to Start at -200 and end at 200

It should look something like this

You’ve probably noticed that we have the same problem as we did before. Countries that were not in our data are showing up in the map and colored the same as the lowest value. To remove these, you can follow the same process as we did with the first method (drag measure to filter, click on All Values, click on “Special”, select “Non-null values”). Another way to remove these is to just drag the country field from your data file to Detail.

Now, if we want to view those missing countries, but not apply the color, we could do exactly what we did with the first method (duplicate and stack). Or…we can do something a bit different. Let’s add a background image.

From the Google Drive, download the MapBackground.png file.

Now, let’s add that as a background image to our map

  • In the upper toolbar, click on Map
  • Click on Background Images
  • Select your data source (the same one used for the map)
  • Click “Add Image”
  • Give your background image a name like “Background Map”
  • Click “Browse” and select the MapBackground.png file
  • Under “X Field” make sure R_Lon is selected and set the Left = -180 and the Right = 180
  • Under the “Y Field” make sure R_Lat is selected and set the Left = -.51 and the Right = .52
  • Adjust the “Washout” for a lighter or darker map

The Background Image options should look like this when you’re done

You can also find or create your own background images, you’ll just have to play with them a bit to get them lined up correctly.

At this point, you’re pretty much done, but I would recommend a few finishing touches (similar to the first method)

  • Disable all Map Options
  • Hide the X and Y axes
  • Remove all Lines (Grid lines/Zero lines/etc.)
  • Remove Worksheet Shading (change White to Null)

And now, you’re map should look something like this

Alright, we’re going to cover one more method, but I promise, this one will be short. These two methods are great if you’re mapping countries, but what if you want to map cities?

Method 3: Using a CSV File (for cities)

In order to plot cities on a Robinson Projection, your data source will need to have the Latitude and Longitude values. Don’t have those? Well, let Tableau do the work for you. Here’s a link to a quick tutorial on how to export the generated Latitude and Longitude values from Tableau

How to get latitude and longitude values from Tableau

Once you have those in your data source, connect to your file, and let’s start building. If you don’t have a file, you can download the “OlympicGames_Cities.xlsx” file from our Google Drive.

Building the View

So we have the coordinates for each of our cities, but similar to the last example, if we try to plot these in Tableau, it’s going to plot them using the Mercator Projection. So we are going to use the same R_Lon and R_Lat calculations that we used in the previous method. They’re crazy long, so I’m not going to paste them here, but you can find them earlier in the post.

Once you have those calculations built, the process is nearly identical to Method 2.

  • Right click on [R_Lon] and drag it to Columns. When prompted, choose R_Lon without any aggregation
  • Right click on [R_Lat] and drag it to Rows. When prompted, choose R_Lat without any aggregation
  • Drag City to Detail
  • Right click on the R_Lat axis, select “Edit Axis” and fix the axis to Start at -.6 and end at .6
  • Right click on the R_Lon axis, select “Edit Axis” and fix the axis to Start at -200 and end at 200
  • Make any other design adjustments desired (change mark type to circle, add field to color, adjust size, adjust transparency, etc)

Now, just follow the same exact process from Method 2 to add your Background Map and to put on the finishing touches. And you should have something like this

Well that’s it for this post. I hope you enjoyed it, and as always, please reach out with any questions or to share your creations. We would love to see them. Thanks for reading, and see you next time!

Categories
How-To's Totally Useless Charts

Totally Useless Charts & How to Build Them – “Hand-drawn” Bar Charts

Welcome to our 2nd installment of Totally Useless Charts & How to Build Them, where we do…exactly what the name implies. Look at some totally useless charts and walk through, step by step, how to build them. If you missed the first installment, the goal of this series isn’t necessarily to teach you how to build these specific useless charts, but more to talk through the techniques, the approach, and the thought process behind each chart, so you can apply those concepts to your own custom charts.

In this installment we’re going to learn how to build “hand-drawn” bar charts in Tableau. These of course aren’t actually hand-drawn, but using some interesting techniques, and a lot of random numbers, we can kind of make them look that way. If you would like to follow along, you can download the workbook here, and the data here.

“Hand-Drawn” Bar Charts

First, let’s look at an example of what we’re talking about. Here is a viz that I published recently about relationships on the show “The Office”. You can check out the interactive viz here.

My goal was to make the entire viz look like an office desk belonging to everyone’s favorite receptionist/office manager, Pam Halpert. To do that, I had to make all of the visualizations appear to be “hand-drawn”, including the bar charts. Let’s zoom in one those.

Here we have two different bar charts, one for the Longest Relationship, and one for the Most Time in Relationships (by number of episodes). Today we’re going to be using different data, but the goal is still the same…build some bar charts that look “hand-drawn”.

Building Your Data Source

Let’s start with our data. For this example we’re going to look at the top 10 highest grossing films of all time. If you downloaded the sample data, you can find these in the “Data” tab.

Next, we need to do some densification. The first thing we need to do is to create a record for every line needed in each bar. If you look at one of these bars closely, you’ll see that it’s actually made up of a bunch of lines…one outer line (orange), and a number of cross lines (blue).

So we are going to densify our data with our first densification table, called “Lines” in the sample data. In this table, we have 1 record for our Outer Line, and 50 records for our Cross Lines

Then we’re going to join our “Data” table, and our “Lines” table using a join calculation with a value of 1 on each side.

But we’re not quite done yet. Now we have a record for each of our lines, but each of those lines is made up of multiple points. Our “Outer Line’ is made up of 4 points, and each of our “Cross Lines” is made up of two points

So we’re going to use one more densification table to create additional records for each of these points, for each of the lines. This table is called “Points” in the sample data

And we’re going to join this to our “Lines” table on the [LineType] Field.

Now for each of our 10 films, we have 4 records for our “Outer Line” and 100 records for our “Cross Lines”, 2 for each of the 50 lines in the “Lines” table.

Drawing the “Outer Lines”

Now we have our data, let’s start building our Totally Useless Chart. We’re going to start with the Outer Lines. To do this, and to make it a bit dynamic so you can play around with how the chart looks, we’re going to build 4 Parameters. Each of these is going to have a Data Type of “Float” and the Default values are below

  • Bar_Width = .6 (used for the height of each bar and the spacing between the bars)
  • Scale_Bar_Outer_Height = .03 (used along with a random number to jitter points vertically)
  • Scale_Bar_Outer_Length = .1 (used along with a random number to jitter points horizontally)
  • Cross Lines = 50 (used to limit the number of cross-lines in each bar. This is optional)

Next, we’ll start building our calculations. The first, and arguably most important of these calcs is going to be our [Jitter] calculation. We want a random number between -1 and 1. The Random() function will give us a random number between 0 and 1, so we can modify that by multiplying the random number by 2 and then subtracting 1 (so a random number of .6 would become .2, and a random number .4 would become -.2)

Jitter = Random()*2-1

Next, we need to calculate the length of each of our “bars”. We’re going to do this by comparing each value to the maximum value and then multiplying it by the Max Length, which in our case will be the number of “Cross” lines we have. So the highest grossing film, Avatar, will have a length of 50, since we have 50 “Cross” lines. ((2.847B/2.847B)*50). Number 10 on the list will have a length of around 26.6 ((1.515B/2.847B(*50). So first, let’s calculate our Max length.

Max Length = {MAX([Line ID]}

Next, we’ll want to divide the Box Office Gross for each movie by the value for the highest grossing movie. The result of this will be a percentage which we’ll then multiply by our [Max Length] field to get our [Outer Bar Length]

Outer Bar Length = ([Box Office Gross]/{MAX([Box Office Gross])})*[Max Length]

Now, we need to calculate the X and Y coordinates for the 4 points of each “Outer Line”. So under normal circumstances, point 1 and point 4 would start at 0, and point 2 and point 3 would just be the [Outer Bar Length]. So if you connected those points, it would start at 0 for point 1, go to the end of the line for point 2, stay at the end of the line for point 3, and then return to 0 for point 4. But we want this to look “hand-drawn”, and if I was drawing bar charts by hand, there is no way they would align that neatly. That’s where our [Jitter] and “Scale” parameters come in.

Outer_Bar_X

CASE [Points]

WHEN 1 then 0+([Jitter]*[Scale_Bar_Outer_Length])

WHEN 2 then [Outer Bar Length]+([Jitter]*[Scale_Bar_Outer_Length])

WHEN 3 then [Outer Bar Length]+([Jitter]*[Scale_Bar_Outer_Length])

WHEN 4 then 0+([Jitter]*[Scale_Bar_Outer_Length])

END

We just want to move these points slightly to get that “hand-drawn” effect, which is why we are using the “Scale” parameters. For that first point, if we just did 0+[Jitter], that value could fall anywhere between -1 and 1, which is a pretty significant shift. But using the [Scale_Bar_Outer_Length] parameter, we can increase that value to get more jitter, or decrease the value to get less jitter. Using a value of .1 in the parameter, means that the value for that first point would now fall somewhere between -.1 and .1.

Next, we need to calculate our Y coordinates for those same 4 points. Again, under normal circumstances, for point 1 we would add half of the bar width to our starting point (the middle of the bar), same for point 2, and then for points 3 and 4, we would subtract half of the bar width from the starting point. So, along with the X coordinates, it would look something like this.

This is where the [Bar_Width] parameter comes into play. We need to know how thick these bars should be. We’re using the [Rank] field as our starting point, so the first bar will start 0,1, the second bar will start at 0,2, and so on. But we don’t want the bars to overlap, or be right up against each other, so we can control that with the parameter. A larger value in this parameter will result in wider bars and less spacing, a smaller value will result in skinnier bars, and more spacing. A value of 1 will result in no spacing between the bars.

Also, similar to the calculation for the X coordinates, we are using that [Jitter] field along with a “Scale” parameter to control how much jitter there will be. So a larger number in the [Scale_Bar_Outer_Height] parameter will result in more vertical jitter, and a lower number will result in less. Here is the calculation for the Y coordinates.

Outer_Bar_Y

CASE [Points]

WHEN 1 then [Rank]+([Bar_Width]/2)+([Jitter]*[Scale_Bar_Outer_Height])

WHEN 2 then [Rank]+([Bar_Width]/2)+(([Jitter]*[Scale_Bar_Outer_Height])2)

WHEN 3 then [Rank]-([Bar_Width]/2)+(([Jitter]*[Scale_Bar_Outer_Height])2)

WHEN 4 then [Rank]-([Bar_Width]/2)+([Jitter]*[Scale_Bar_Outer_Height])

END

Now we have all of the calculations needed to draw our “Outer” lines. So let’s do that

  • Drag [Line Type] to the filter shelf and filter on “Outer Lines”
  • Right click on [Outer_Bar_X], drag it to Columns, and when prompted, choose [Outer_Bar_X] without aggregation
  • Right click on [Outer_Bar_Y], drag it to Rows, and when prompted, choose [Outer_Bar_Y] without aggregation
  • Change the Mark Type to “Line”
  • Right click on [Rank], select “Convert to Dimension” and then drag [Rank] to Detail
  • Right click on [Points], drag it to Path, and when prompted, choose [Points] without aggregation
  • Right click on the Y-axis, select “Edit Axis”, and check the box labelled “Reverse” under Scale

When that’s finished, you should have something that looks like this. There are 10 “bars”, with all 4 points in each bar slightly jittered to give it that “hand-drawn” look.

Next, we need to add the “Cross Lines”

Drawing the “Cross Lines”

To help understand the approach we’re going to take, think about taking each of these bars and breaking them into individual segments. So, for example, our first bar has a length of 50 (think back to the Max Length calculation). So we want to break that into 50 individual segments and draw a diagonal line from the top left of the segment to the bottom right of the segment.

The image above is roughly what it would look like if we draw perfect lines across those segments. But we don’t want perfect lines. We want “hand-drawn” lines. So we’re going to leverage our [Jitter] field and our “Scale” parameters once again.

So let’s build our X and Y calculations. Remember when we built our data source, for our “Cross Lines”, we needed two points, 1 and 2. So Point 1 is going to start the line at the top left of our segment, and Point 2 is going to end the line at the bottom right of our segment. Here is the calculation

Cross_X = if [Points]=1 then [Line ID]-1 + ([Jitter]*[Scale_Bar_Outer_Length]*2) else [Line ID]+([Jitter]*[Scale_Bar_Outer_Length]*2) END

Here we are calculating the position for both points on the X axis. For the first point, when [Points]=1, we want to use our [Line ID] value and subtract 1, so we’re starting at the beginning of our segment (ex. line 1 will start at 0, line 2 will start at 1, line 3 will start at 2, and so on). When [Points]=2, we are going to use just the [Line ID] value (ex. line 1 will end at 1, line 2 will end at 2, line 3 will end at 3). And then we’re just using our [Jitter] field and our “Scale” parameter to jitter these points a little bit, similar to what we did with the “Outer” lines. You may notice that there is a “*2” in these calculations. I added these so I could re-use my same parameters, but could add a little extra jitter to the Cross Lines. I figured if these were actually being done by hand there would be a lot more variation in these lines, compared to the “Outer” lines.

Now let’s calculate our Y coordinates. Similar to how we calculated the Y coordinates for the “Outer” lines, we want one of our points to be half the width of the bar above our starting point, and the other one, half the width of the bar below the starting point. And then we want to jitter them. Here’s the calculation for the Y coordinates.

Cross_Y = if [Points]=1 then [Rank]-([Bar_Width]/2)+([Jitter]*[Scale_Bar_Outer_Height]*2) else [Rank]+([Bar_Width]/2)+([Jitter]*[Scale_Bar_Outer_Height]*2) END

Now this is a little bit confusing because we reversed our axis in an earlier step. So, for Point 1, instead of adding half of the width of the bar to our starting point, the [Rank] field, we need to subtract it from the starting point, to get it to appear above the bar (because the axis is reversed). So when [Points]=1 we’ll subtract half of the width of the bar ([Bar_Width]/2) from the starting point, [Rank]. When [Points]=2, we’ll add half of the width of the bar to the starting point. And then once again we’re using the [Jitter] field, the “Scale” parameter, and then multiplying by 2 to get a little extra jitter. If you wanted to reverse the direction of these lines, so they go from top right to bottom left, just change the calc so when [Points]=1 you add, and when [Points]=2 you subtract.

Now let’s build it.

  • Drag [Line Type] to the filter shelf and filter on “Cross Lines”
  • Right click on [Cross_X], drag it to Columns, and when prompted, choose [Cross_X] without aggregation
  • Right click on [Cross_Y], drag it to Rows, and when prompted, choose [Cross_Y] without aggregation
  • Change the Mark Type to “Line”
  • Right click on [Line ID], select “Convert to Dimension” and then drag [Line ID] to Detail
  • Right click on [Rank], select “Convert to Dimension” and then drag [Rank] to Detail
  • Right click on [Points], drag it to Path, and when prompted, choose [Points] without aggregation
  • Right click on the Y-axis, select “Edit Axis”, and check the box labelled “Reverse” under Scale

Once complete, you should have 50 “Cross Lines” for each of your “bars”.

Now we just need to bring it all together

Combining the Lines

We have our “Outer” lines, and we have our “Cross” lines, and because we have separate data points for each of these (because of the way we structured our data) we can bring them together in the same view pretty easily. We just need to 2 more “Final” calculations for the X and Y coordinates.

Final_X = if [Line Type]=’Outer Lines’ then [Outer_Bar_X] else [Cross_X] END

Final_Y = if [Line Type]=’Outer Lines’ then [Outer_Bar_Y] else [Cross_Y] END

These are pretty straightforward, but basically, if the [Line Type]=”Outer Lines” use the X and Y values from the “Outer_Bar” fields. Otherwise, use the X and Y values from the “Cross” fields. Now let’s build our “bars” with these “Final” calcs.

  • Right click on [Final_X], drag it to Columns, and when prompted, choose [Final_X] without aggregation
  • Right click on [Final_Y], drag it to Rows, and when prompted, choose [Final_Y] without aggregation
  • Change the Mark Type to “Line”
  • Right click on [Line ID], select “Convert to Dimension” and then drag [Line ID] to Detail
  • Right click on [Rank], select “Convert to Dimension” and then drag [Rank] to Detail
  • Drag [Line Type] to Detail
  • Right click on [Points], drag it to Path, and when prompted, choose [Points] without aggregation
  • Right click on the Y-axis, select “Edit Axis”, and check the box labelled “Reverse” under Scale

And now you should have everything together on the same view!

Wait…that doesn’t look right. We don’t want all of those extra “Cross” lines on our shorter bars. Luckily, we can filter those out pretty easily with a calculated field. This is just a boolean calc that checks to see if the Line ID is less than the length of the bar. Remember from earlier that the Line ID corresponds to the right side, or the end of each of these “Cross” lines. So we only want to keep the lines where that value is less than the length of the bar.

Extra Lines Filter = [Line ID]<=[Outer Bar Length]

Now just drag the [Extra Lines Filter] field onto the Filter shelf, and filter on TRUE and voila!

There is one more step that’s completely optional. The way we set up this data source, we can have up to 50 “Cross” lines for the largest bar. But maybe you want less than that. I like to make my visualizations as dynamic as possible so I can play around with how it looks. Earlier we created a parameter called [Cross Lines]. We can use that parameter to determine how many lines we want to use. We’re just going to create one additional calculated field.

Max Cross Lines Filter = [Line ID]<=[Cross Lines]

Just drag that field onto the Filter shelf, filter on TRUE, and then right click on the pill and choose “Add to Context”. Now you can adjust the number of lines, and in doing so, the spacing between the lines. Here’s what it looks like with 30 lines instead of 50.

If you want to use more than 50 lines, just add some additional rows to the “Lines” densification table.

Final Touches

So now our view is built, but there is one critical piece of information missing from our chart…Row Labels. There are a few different ways you can add these, but I’m going to cover two quick options; Shapes and Labels.

For the viz that I shared earlier, I was publishing it to Tableau Public, which has pretty limited options when it comes to fonts, and I really wanted a “hand-drawn” font. What I ended up doing was creating custom shapes for each of my labels in PowerPoint. If you decide to go this route, one thing you want to make sure that you do is to set the size of each of the text boxes equal. If the text boxes are different sizes when you save them as images, it will look like the text is a different size for each value because Tableau will attempt to “normalize” them.

So first, create your shapes in PowerPoint. Here, I inserted 10 text boxes, typed my movie names, set the alignment to “Right”, and then set the Font to “Caveat”. Then I clicked and dragged to highlight all 10 text boxes, and in the top right corner of PowerPoint, in the Shape Format options, I set the “Width” so that they would all be the same size. regardless of how long the text in each box actually is.

Then I right-clicked on each image, saved them to a folder in my “Shapes” repository. Finally, I created a new sheet using the “Shapes” mark type, positioned them by Rank, and then fixed and reversed the axis so they would align with the bars. Then you can throw these two sheets in a container on your final dashboard and have something like this.

So that’s an option if your data is static and you don’t have a lot of values. This would be nearly impossible to maintain if new values were constantly being introduced, and would be way too much work if there were a lot of values in your data source. In those cases, you may need to go with more traditional labels and be limited to the available fonts. But even that can be a little bit tricky because these are lines, not bars.

For this we need one more calculated field. We only want to label 1 point for each of our bars but we can’t filter out any points. We also can’t use a calculation that results in some null values and some populated values for a given Line Type (because it can inadvertently remove sections of the lines we worked so hard to draw).

Label Name = if [Line Type]=’Outer Lines’ then [Name] END

Now just drag the [Label Name] field to Label and then set the Label options as follows.

You can choose whichever font type and size you prefer, but make sure to set the Alignment to “Top Left’, select “Line Ends” under Marks to Label, and under Options, de-select “Label end of line”. Between the calculated field and these options, only 1 Label will appear per bar, and it will appear to the top left of Point 1 of the “Outer” line (which is the bottom left point in each bar). It should end up looking something like this.

Those are a couple of ways to add Row Labels to your “hand-drawn” bar charts. If you want to add value labels as well you can follow a pretty similar process, but it’s a little trickier. This post is already long enough so I’m not going to go into that, but if you make it this far and want to add value labels, please reach out and I’d be happy to help you. Or you can take the easy way out and do what I did, and just create an image of a “hand-drawn” axis and add it to your dashboard.

Thank you so much for reading, and keep an eye on the blog for more ‘Totally Useless Charts & How to Build Them’!

Categories
How-To's It Depends Tableau Techniques

It Depends: Techniques for Filtering on Multiple Selections with Dashboard Actions in Tableau

Welcome to installment #3 of the “It Depends” blog series. If you’re not familiar with the series, each installment will cover a question where there is no clear definitive answer. We’ll talk through all of the different scenarios and approaches and give our recommendations on what approach to use and when. The question we are tackling this week is “How can dashboard actions be used to filter on multiple selections in a sheet?”. Pretty easy right. Just use a filter action…or a set action…or set controls…or a parameter action. There are clearly a lot of different ways to accomplish this, but which one should you use? And I think you know the answer…It depends!

But before we start…why does this even matter? Why not just throw a quick filter on the dashboard and call it a day? For me, it’s about user experience. When a user sees a mark on a dashboard and they want to dig into it, would they rather a) mouse over to a container filled with filters, find the right filter, click on a drop down, search for the value they want to filter, click that value, and then hit apply, or b) just click on the mark? Don’t get me wrong, quick filters are important, and often times essential, but they can also be a bit clunky, and can hinder performance. So whenever possible, I opt for dashboard actions.

Using dashboard actions to filter from a single mark is pretty easy, but the process gets a little more complicated when you want to be able to select multiple marks. And as developers, we have a choice on whether or not we want to push that burden onto our users. We have to decide what’s more important, ease of use for our users, or ease of setup for ourselves. We’ll also need to take performance into consideration. Those will be the two biggest decision points for which of these 4 methods to implement. And those methods are, as I mentioned above; Filter Actions, Set Actions, Set Controls, and Parameter Actions. There are no restrictions on these four approaches. Any one of them could be used in any scenario, but it’s up to you to weigh the importance of those two decision points and to determine which method gives you the right balance for your specific situation.

The Four Methods

Filter Action – This is the easiest of the four methods to implement and provides users with a better user experience than quick filters, both in terms of interacting with the dashboard and performance. But there are some downsides. First off, in order to select multiple marks, users would either need to hold down the CTRL key while making their selections, or those marks would need to be adjacent so that users could click and drag to select multiple marks together. Not ideal. Also, with Filter Actions, the selections cannot be used in other elements of your dashboard (calculated fields, text boxes, etc), like they can with other methods. And finally, and this one might be a bit personal, you can’t leverage any of the highlighting tips that I discussed in Installment 1 of this series, Techniques for Disabling the Default Highlighting in Tableau. Technically, you could use the “Highlight Technique” but because you can’t use the selected values in a calculated field, there would be no clear way to identify which mark(s) have been selected. Full disclosure, I never use this technique, but I’m going to include it because technically it will accomplish what we’re trying to do.

Set Action – This method provides a somewhat similar user experience to Filter Actions. Better performance than quick filters, but users still need to hold CTRL, or click and drag to select multiple marks. The main benefit of this approach over Filter Actions is it’s flexibility. You can use it to filter some views, and in other views you can compare the selected values to the rest of the population. This type of analysis isn’t possible with Filter Actions. With Filter Actions, your only option is to filter out all of the data that is not selected. With Set Actions, you can segment your data into selected and not selected. You can also use those segments in calculated fields, which is another huge benefit over Filter Actions.

Set Controls – This method provides all of the same benefits of Set Actions, but with one major improvement. Users do not need to hold CTRL or click and drag. They can click on individual marks and add them one by one to their set. This method is a little more difficult to set up than the previous two, but in my opinion, it is 100% worth it. It’s possible that it may be marginally worse for performance, but I have never had performance issues using this method. The only things that I don’t like about this approach are that you can’t easily toggle marks in and out of the set (you can do this with a second set action in the ‘menu’, but it’s a bit clunky), and you can’t leverage the Filter Technique for disabling the default highlighting (we’ll talk more about this later).

Parameter Action – In my opinion this approach provides the best user experience. Users can select marks one by one to add to their “filter”, and can also, depending on how you set it up, toggle marks in and out of that filter. The main downside here is performance. This technique relies on some somewhat complicated string calculations that can really hurt performance when you’re dealing with large or complex data sets. It’s also the most difficult to implement. But when performance isn’t a concern, I love this approach.

Method Comparison

So which method should you use? Let’s take a look at those two decision points that I mentioned earlier, User Experience and Performance.

If neither of these decision points are important then you can ignore this entire post and just use a quick filter. But that’s usually not the case. If Performance is important, but you’re less concerned with User Experience, you can use either a Set Action or a Filter Action (but I would recommend Set Actions over Filter Actions). If User Experience is important and Performance is not a concern, you can use a Parameter Action. And if both User Experience and Performance are important, then Set Controls are the way to go. But as I mentioned earlier, you are not limited to any of these methods in any scenario, and there could be other elements that influence your decision. So let’s do a closer comparison of these methods.

*Flexibility refers to how the selected values can be used in other elements of the dashboard
**Default Highlight Techniques refer to which technique (from Installment 1) can be used with each method

So now we have a pretty good idea which method we should use in any given scenario. Now let’s look at how to implement each of these.

Methods in Practice

For each of these methods we are going to use this ridiculously simple Sales dashboard. We’re going to use the bar chart on the left (Sales by Subcat) to filter the line chart on the right (Sales by Quarter). If you’d like to follow along, you can download the sample workbook here.

Filter Action

As I mentioned earlier, this is the easiest method to set up, but will require your users to either hold CTRL or click and drag to select multiple marks.

  • Go to Dashboard > Actions and click “Add Action”
  • Select “Filter” from the list of options
  • Give your Action a descriptive Name (ex. SubCat Filter)
  • Under “Source Sheets” select the worksheet that will drive the action. In this example it is our bar chart “Sales by SubCat”.
  • Under “Target Sheets” select the worksheet that will be affected by the action. In this example it is our line chart “Sales by Quarter”
  • Under “Run action on” choose “Select”
  • Under “Clearing the selection will” choose “Show All Values”
  • Click “OK”

When you’re finished, your “Add Filter Action” box should look like this

And now when we click on a mark, hold CTRL and click multiple marks, or click and drag to select multiple marks our line chart will be filtered to just the selected value(s). And when we click in the white space on that sheet, or the selected value (when only one mark is selected) the filter will clear and the line chart will revert to show all values

Set Action

Filtering with Set Actions is slightly more involved but still pretty straightforward. For this method we need to create our set, we need to add that set to the Filter Shelf on our line chart, and then we need to add the dashboard action to update that set.

  • Go to the sheet that will be affected by the action. In this case it is our line chart (Sales by Quarter).
  • Create a set for the field that will be used for the filter. In this case it is our [Sub-Category] field.
    • Right click on [Sub-Category]
    • Click on “Create”
    • Click on “Set”
    • Click the “All” option to select all values
    • Click “OK”
  • Add the filter to the sheet that will be affected by the action (Sales by Quarter).
    • Drag the [Sub-Category Set] to the filter shelf on the “Sales by Quarter” worksheet
    • It should default to “Show Members in Set”, but just in case, click on the drop-down on the [Sub-Category Set] pill on the filter shelf and make sure that is the option selected
  • Add the dashboard action to update the [Sub-Category Set]
    • Navigate back to the dashboard
    • Go to Dashboard > Actions and click “Add Action”
    • Select “Change Set Values” from the list of options
    • Give your Action a descriptive Name (ex. SubCat Set Action Filter)
    • Under “Source Sheets” select the worksheet that will drive the action. In this example it is our bar chart “Sales by SubCat”.
    • Under “Target Set” select the set that was used as the filter in the previous steps. In this case it is our [Sub-Category Set]
    • Under “Run action on” choose “Select”
    • Under “Running the action will” choose “Assign values to set”
    • Under “Clearing the selection will” choose “Add all values to set”
    • Click “OK”

When you’re finished your “Add Set Action” box should look like this.

The way this Set Action configuration works is that each time you make a new selection, the contents of the set are being completely overwritten with the newly selected values. That’s what the “Assign values to set” option does. And when you clear the selection, by clicking on white space in the sheet, or the last selected value, the contents of the set are replaced again with all of the values. That’s what the “Add all values to set” option does.

I would recommend one additional step if you’re using this method to override the default highlighting. When using Set Actions you are somewhat limited on what techniques you can use for this, but the “Highlight Technique” works great. You can read about how to use that technique here. Once you’ve added the Highlight Action, just put the [Sub-Category Set] field on color on your “Sales by Subcat” sheet and select the colors you want to display for marks that are selected (in the set) and marks that are not selected (out of the set). When you’re done, your dashboard should look and function like this. Keep in mind that similar to Filter Actions, users will need to hold CTRL and click, or click and drag to select multiple marks.

Set Controls

Setting up our filter with Set Controls is going to be very similar to Set Actions, but with one major difference. The way Set Controls work is that they allow you to select marks one by one and either add them to your set or remove them from your set. This is a great feature, but it makes filtering with them a little tricky.

If we were to start with all of our values in the set, we couldn’t just click on a value in our bar chart to add it to the set, since it’s already there (as well as all of the other values). So we need to start with the set empty and then start adding values when we click on them. But if our set is empty, and we use that set as a filter, as we did in the previous example, then our line chart will be blank until we start adding values. And we don’t want that. We want the line chart to show all of the values, until we start selecting values, and then we want it to just show those values. And we can accomplish this with a calculated field. So first, we’re going to create our set, then we’ll create our calculated field, then we’ll add that field as a filter to our line chart, and then we’ll add the action to update the set.

  • Go to the sheet that will be affected by the action. In this case it is our line chart (Sales by Quarter).
  • Create a set for the field that will be used for the filter . In this case it is our [Sub-Category] field. (skip this step if you followed along with the Set Action example)
    • Right click on [Sub-Category]
    • Click on “Create”
    • Click on “Set”
    • Click the “All” option to select all values
    • Click “OK”
  • Create a calculated field called [SubCat Filter]
    • { FIXED : COUNTD([Sub-Category Set])}=1 OR [Sub-Category Set]
  • Add the filter to the sheet that will be affected by the action (Sales by Quarter).
    • Drag the [SubCat Filter] field to the filter shelf on the “Sales by Quarter” worksheet
    • Filter on “True”
  • Add the dashboard action to update the [Sub-Category Set]
    • Navigate back to the dashboard
    • Go to Dashboard > Actions and click “Add Action”
    • Select “Change Set Values” from the list of options
    • Give your Action a descriptive Name (ex. SubCat Set Control Filter)
    • Under “Source Sheets” select the worksheet that will drive the action. In this example it is our bar chart “Sales by SubCat”.
    • Under “Target Set” select the set that was used as the filter in the previous steps. In this case it is our [Sub-Category Set]
    • Under “Run action on” choose “Select”
    • Under “Running the action will” choose “Add values to set”
    • Under “Clearing the selection will” choose “Remove all values from set”
    • Click “OK”

When you’re finished your “Add Set Action” box should look like this.

So there are a couple of things we should cover here, starting with the calculated field. Here it is again.

{ FIXED : COUNTD([Sub-Category Set])}=1 OR [Sub-Category Set]

Sets only have two possible values; IN or OUT. The set may contain hundreds or even thousands of values from the source field, but the sets themselves can only have these two values. So if we use a FIXED Level of Detail expression and count the distinct values, the result will be either 1 or 2. If the set is empty, the value for every record will be OUT, so the result of the LOD will be 1. Similarly, if the set contains all values, the value for every record will be IN, so the result of the LOD will still be 1. But if some values are in the set (IN) and other values are not in the set (OUT), then the result of the LOD will be 2 (IN and OUT).

So the first part of this calculated field ( FIXED : COUNTD([Sub-Category Set])}=1) will be true for all records when the set is empty, or if it contains all values. The second part of this calculated field (OR [Sub-Category Set]) will only be true for records in the set. So when we start with an empty set, the overall result of this calculated field will be True for every record, so everything will be included in our line chart. As soon as we add a value to our set, the first part becomes false for every record, but the second part becomes true for the values in our set. Because we are using an OR operator, the overall result, once we click on some values, will be true for the selected values and false for the rest of the values.

Next, let’s look at the Set Control options. We are starting with an empty set. Each time we click on a new mark, that value will be added to our set. That’s what the “Add values to set” option does. Unlike the “Assign values to set”, it does not override the set, it just adds new values to the existing ones. And then when we click on some white space in the sheet, or the last selected value, the set will go back to being empty. That’s what the “Remove all values from set” option does.

And just like in the previous example, I would recommend using the “Highlight Technique” covered here, and then adding the [SubCat Filter] field to color on the bar chart (Sales by SubCat).

And now your dashboard should look and function like this. Notice that you no longer need to CTRL click, or click and drag, to select multiple values. Nice!

Parameter Action

This method is by far the most complex, but if done correctly, it provides a really smooth user experience. The thing that I like most about this approach is that you can set it up so that you can “toggle” values in and out of the filter. There are a few extra steps in this approach, and some somewhat complex calculations. We need to create our parameter, create a calculated field that will be passed to that parameter, add that field to Detail on our bar chart, create a calculated field for our filter, add that filter to our line chart, and add the dashboard action that will update the parameter.

  • Create a string parameter called [SubCat Select] and set the “Current value” to the pipe character “|”
  • Create a calculated field called [SubCat String]
    • IF CONTAINS([SubCat Select], ‘|’ + [Sub-Category] + ‘|’) THEN REPLACE([SubCat Select],’|’ + [Sub-Category] + ‘|’,’|’)
    • ELSE [SubCat Select] + [Sub-Category] + ‘|’
    • END
  • Go to the sheet that will drive the parameter action and drag the [SubCat String] field to Detail. In this example, that is the “Sales by SubCat” sheet
  • Create a calculated field called [SubCat String Filter]
    • [SubCat Select]=’|’ OR CONTAINS([SubCat Select],’|’+[Sub-Category]+’|’)
  • Add the filter to the sheet that will be affected by the action (Sales by Quarter).
    • Drag the [SubCat String Filter] field to the filter shelf on the “Sales by Quarter” worksheet
    • Filter on “True”
  • Add the dashboard action to update the [Sub-Category Set]
    • Navigate back to the dashboard
    • Go to Dashboard > Actions and click “Add Action”
    • Select “Change Parameter” from the list of options
    • Give your Action a descriptive Name (ex. SubCat Parameter Filter)
    • Under “Source Sheets” select the worksheet that will drive the action. In this example it is our bar chart “Sales by SubCat”.
    • Under “Target Parameter” select the parameter that was set up in the previous steps. In this example it is [SubCat Select]
    • Under “Source Field” select the [SubCat String] Field
    • Under “Aggregation”, leave set to “None”
    • Under “Run action on” choose “Select”
    • Under “Clearing the selection will” choose “Keep Current Value”
    • Click “OK”

When you’re finished, your “Add Parameter Action” box should look like this

Alright, let’s talk through some of those calculations, starting with the [SubCat String] Field. Basically, what this calculation is doing is building and modifying a concatenated string of values. Here’s that calculation again.

IF CONTAINS([SubCat Select],’|’ + [Sub-Category] + ‘|’) THEN REPLACE([SubCat Select],’|’ + [Sub-Category] + ‘|’,’|’)
ELSE [SubCat Select] + [Sub-Category] + ‘|’
END

We set up our parameter to have a default value of just the pipe character (“|”). I’ll explain the use of the pipes a little later on. The first line of the calculation looks to see if the selected value is already contained in our string. So it’s searching our entire concatenated string for the pipe + selected value + pipe (ex. |phones|). If it finds that value in our string, it will replace it with just a pipe. So for our example, if our parameter value was |phones|binders|storage| and we click on phones, it will replace “|phones|” with a pipe, leaving “|binders|storage|”

The second line of this calculation will add the selected value. The calc has already tested to see if it’s there in the previous step, and if it’s not, this line will add it along with a pipe. Now let’s look at our parameter action…in action.

Take a look at the parameter at the top. As we click on each Sub-Category, its added to our string. Now look what happens when we click on those selected marks again.

As we click on each previously selected mark, that mark is removed from our string, until we’re left with just our starting value, the pipe.

The reason I use pipes on both sides, instead of just a comma separated list is to avoid situations where one potential value could be equal to a part of another value (ex. Springfield and West Springfield). The pipes ensure we are matching exact values in the next step, our filter calculation. Here’s that filter calculation again.

[SubCat Select]=’|’ OR CONTAINS([SubCat Select],’|’+[Sub-Category]+’|’)

This is pretty similar to the filter that we created in our Set Controls example. The first part is checking to see if our concatenated string is “empty”, meaning nothing has been selected yet and it’s just a pipe. If that’s the case, the result of this calculated field will be true for every record. The second part of the calculated field is checking the [Sub-Category] field in each record and seeing if that value is contained within our parameter string (along with the starting and ending pipe). If it is, the result of this calculation will be True for those records and False for all of the other records.

When using this method, I would highly, highly, highly recommend using the “Filter Action” technique for disabling the default highlighting. You can find it here. This technique not only masks the selection, but it also automatically de-selects a mark after you click on it. That is really important when using this method. Once you add that Filter Action, just add the [SubCat String Filter] field to color, and you should be good to go. Here’s what it looks like in action.

And there you have it. Four different methods for filtering your dashboards on multiple selections. As I mentioned before, some of these are much more complex than others, but they provide a much better user experience. In my opinion, it’s worth it to put in the extra time and effort to save your users frustration later on. I hope you enjoyed this post, and keep an eye out for more installments of “It Depends”.

Categories
How-To's Tableau Techniques Totally Useless Charts

Totally Useless Charts & How to Build Them – Lotus Flowers

Welcome to our new series, Totally Useless Charts & How to Build Them. In each installment of this series we’ll look at one very custom chart, something with almost no real use cases, and we’ll walk through, step by step, how to build it. The purpose of this series isn’t necessarily to teach you how to build these specific useless charts, it’s more about talking through the techniques, the approach, and the thought process behind each chart. Our hope is that seeing how we went about building these will help you with your own custom charts. But if you do somehow find a great use case for one of these charts, by all means, please download the workbook and use it as your own.

In this first installment we’re going to learn how to build Lotus Flowers in Tableau. It’s not a requirement, but it may be a good idea to review Part 1 and Part 2 of the Fun With Curves Series before proceeding. To follow along, you can download the workbook here, and the data here.

Lotus Flower

First, let’s take a look at what we’re trying to build. Below is a lotus flower with 10 petals, which means we have a total of 11 polygons; 1 circle and 10 petals. The circle is fairly easy to build using the techniques in Part 1 mentioned above. The petals are a little more complicated. But first thing’s first…we need some data.

Building Your Data Source

Let’s start with our data. For this example we’re going to build 12 lotus flowers and we’re going to use the value from our data source to size the flowers appropriately. We’ll start with the tab titled ‘Source Data’.

Next, we’re going to do some densification to get the number of polygons needed for each of the flowers. Below I have 1 record for the Circle and 24 records for the petals. We’re going to build this in a way that will let you choose how many petals you want to display (up to 24). This data can be found in the ‘Polygons’ tab in the sample data.

Now we’re going to join these two sources together using a join calculation (value of 1 on each side). The result will be 25 records for each of our 12 ‘Base’ records.

Next, we need to do a little more densification, but this time it’s a little trickier. For our circle, we want at least 50 points for a relatively smooth curve. For our petals, we actually need to draw 2 lines for each petal, one for the left side of the petal (Min) and one for the right side of the petal (Max), and then join those together. Pretty confusing right? We’ll talk about this in a lot more detail. This table is a little too large to include a screenshot, but take a look at the ‘Densification’ tab in the sample data.

For our circles, we have 50 records. We have two numerical fields, [Points] and [Order], that both run from 1 to 50 and a [Type] field to identify that these points are for our circles. For our petals, we have 100 records. We still have the same two numerical fields, but the values are a little different. We have an [Order] field that runs from 1 to 100, and a [Points] field that runs from 1 to 50 and then back down from 50 to 1. We also have a [Side] field with values of Min or Max. The Min records will be used to draw the left side of our petals. The Max records will be used to draw the right side of our petals. And then we have a [Type] field to identify that these records are for our petals. Now we just need to join this table to our data source on the [Type] Field.

Building Your Circles

If you have read through Part 1 of the Fun With Curves series, then you may remember that in order to draw a circle in Tableau, we only need 2 inputs; the distance of each point from the center of the circle (the radius), and the position of each point around the circles (represented as a percentage).

Let’s start with the first input, the radius. We are going to size our circles based on the Value field in the Source Data. We want the area of our circles to represent the value in the data. So we have the area of each circle, we just need to use those values to calculate the radius of each circle. We can do this with the simple calculation below.

Radius = SQRT([Value]/PI())

Next, we need to calculate the position of each point around the circle. I’m not going to go into too much detail on this, but you can read more about it in the post mentioned above. To calculate this, we need the maximum number of points for our Circles (50), and we need the [Points] field (values 1 thru 50). For the max point calculation I am going to use an LOD because the max number of points for our circles, may not always align with the max number of points in our data source (but in this case it does).

Max_Point = {FIXED [PolygonType] : MAX([Points])}

Circle_Position = ([Points]-1)/([Max_Point]-1)

Next, we just need to plug the [Radius] and the [Circle_Position] values into our X and Y formulas for plotting points around a circle.

Circle_X = [Radius]* SIN(2*PI() * [Circle_Position])

Circle_Y = [Radius]* COS(2*PI() * [Circle_Position])

Now, let’s draw our circles

  • Right click on [Circle_X] and drag it to columns. When prompted, choose [Circle_X] without any aggregation
  • Right click on [Circle_Y] and drag it to rows. When prompted, choose [Circle_Y] without any aggregation
  • Right click on [Base_ID], change it to a Dimension, and drag it to Detail
  • Right click on [Order], change it to a Dimension, and drag it to Path
  • Drag [Type] to Filter Shelf and filter to ‘Circle’
  • Change the Mark Type to Polygon

Now you should have something that looks like this.

Although it looks like one big circle, we actually have all 12 circles in this view. They’re just stacked on top of each other. So next we need to space these out a little bit. There are a lot of different techniques to do this, but here’s one I like to use to create Trellis Charts. This technique works great when you have a sequential ID field, which we do (Base_ID).

First, we’re going to create a numeric parameter that will allow us to choose the number of columns we want to create. We’ll call the parameter [Column Count] and set the value to 3. Next, we’re goin to use the [Base_ID] field to break our circles into columns and rows, starting with row.

Row = CEILING([Base ID]/[Column Count])

Column = [Base ID]-(([Row]-1)*[Column Count])

Now right click on both of these fields, change them to Dimensions, and then drag them to the appropriate shelf (Row to Rows, Column to Columns). The result should look something like this.

Building Your Petals

Alright, so this part is a little more complicated. I’m going to start by reviewing the basics of how you build these shapes, but I’m going to skim over the calculations since those will change significantly once we try to build these petals around our circle. No need to follow along with the workbook during this section.

So here are the basics. Let’s start by drawing a Bezier Curve with 4 control points. Our line is going to start at 0,0 and end at 5,10. Wow, this is easy, we already have the coordinates for 2 of the points!

Let’s take a look at our inputs. Our line will have a height of 10 and a width of 5. I’ve also built 2 parameters that we’ll use to calculate the 2nd and 3rd set of points. You can experiment with different values here, but these seem to work pretty well. We need a total of 8 values (4 sets of X and Y).

  • Point 1 will be the start of the line. In this case, it’s 0,0
  • Point 2 will appear on the same X axis as Point 1, but will be somewhere between the start and end of the line on the Y axis. I like to place it two thirds of the way, or .67 (the value in the P2 Parameter Input above). So the coordinates for Point 2 will be 0 and 6.7 (P2 Parameter x Height of the line)
  • Point 3 will appear on the same X axis as Point 4, and will appear somewhere between the start and end of the line on the Y axis (similar to P2). I like to place it halfway, or .5 (the value in the P3 Parameter Input above). So the coordinates for Point 3 will be 5 and 5 (P3 Parameter x Height of the line).
  • Point 4 will be the end of the line. In this case, it’s 5,10

If you were to plot these 4 points, you would have a jagged line like you see in the image above. But look what happens when we plug those values into our Bezier calculations

X = (1-[T])^3*[P1_X] + 3*(1-[T])^2*[T]*[P2_X] + 3*(1-[T])*[T]^2*[P3_X] + [T]^3*[P4_X]

Y = (1-[T])^3*[P1_Y] + 3*(1-[T])^2*[T]*[P2_Y] + 3*(1-[T])*[T]^2*[P3_Y] + [T]^3*[P4_Y]

Alright, we are halfway there! Kind of. So now we have a line that will create 1/2 of one of our petals. But in order to turn this into a petal shaped polygon, we need another line that’s a mirror image of this one.

This is where the Min and Max records come in. We need to calculate our 4 sets of coordinates for both sides. Luckily, most of the values are actually the same. P3 and P4 are going to be identical for both lines. And the Y values for P1 and P2 are the same. The only differences are the X values for P1 and P2. And to calculate those we just add the width of the whole petal (width x2) to our starting point. And if we were to plug these coordinates into the same calculations, we have this.

Now this is where the [Order] field comes into play. To make this one single polygon instead of two separate lines, we can use the [Order] field on Path and change the Mark Type to polygon

On the Left (Min) side, we have points 1 thru 50, running from the bottom left up to the top middle. On the right (Max) side, we have points 1 thru 50 running from the bottom right up to the top middle. But then we have the [Order] field (on label in the image above). This field runs from the bottom left to the top middle to the bottom right, in one continuous flow, from value 1 to 100. This is what makes it a single continuous polygon.

Ok, so that’s how we would build 1 single petal shaped polygon, perfectly positioned facing directly upward. But that’s not what we’re trying to do. We’re trying to build a dynamic number of petals, evenly spread around a circle and facing outward in the appropriate directions. So let’s do that.

Building Your Petals (for real this time)

We’re going to use a similar approach to what was described above, but everything needs to be plotted around a circle. So any calculations we use to determine our 4 points are going to have to run through those X and Y calculations for plotting points around a circle. That means, for all of our points, P1 thru P4 for both sides, we need to calculate two things; the distance from the center of the circle, and the position around the circle. But before we calculate those specific points there are a few other calculations that we’ll need. We also need a parameter, called [Petal Count] that will allow us to select how many petals we want. This should be a numeric parameter, and let’s set the value to 10 (I recommend using a range, allowing values from 4 to 24)

T = ([Points]-1)/([Max_Point]-1) – this is the same as the [Position] calc used earlier. It’s used to evenly space points between 0% and 100%

Petal_Count_Filter = [Polygon ID]<=[Petal Count] – this will be used as a filter to limit the number of petals displayed to what is selected in the [Petal Count ] parameter

Petal_Width = 1/[Petal Count] – this calculates the total position around the circle that will be occupied by each petal. For example, if there were 10 petals, each one would occupy 1/10 of the space around the circle, or .10

Petal_Side_Position = IF [Side]=’MIN’ THEN ([Polygon ID]-1)*[Petal_Width] ELSE [Polygon ID]*[Petal_Width]
END
– this calculates the position of the start of each Min line and Max line. If there were 10 petals, the Min side of the 2nd petal would be at position .1 or (2-1)*.1, and the Max side of the petal would be at position .2, or 2*.1. The Min value will share the same position as the Max value of the previous petal. The max value will share the same position as the Min value of the next petal

Petal_Middle_Position = ([Polygon ID]/[Petal Count]) – ([Petal_Width]/2) – this calculates the position of the center of each petal. If there were 10 petals, the center of petal 3 would be at .25, or (3/10) – (.1/2). This is also halfway between the position of the Min line and the Max line.

Alright, now we can calculate all of our coordinates. Let’s start with P1. For the first input, we want this point to start right at the edge of our circle. So the distance from the center is going to be equal to the radius of the inner circle. So the first input is just [Radius]. For the second input, we’ll use the the [Petal_Side_Position] we calculated above.

P1_X = [Radius]* SIN(2*PI() * [Petal_Side_Position])

P1_Y = [Radius]* COS(2*PI() * [Petal_Side_Position])

If we were to plot these points for 12 petals, we would end up with 24 points, but it would appear that we only 12 because each is overlapping with another point. But this gives us the outside edges of each of our petals

Now onto P2. This one is a little more complicated. We’re going to use P1 as a starting point for this calculation, instead of the center of the inner circle. Now we need to calculate the distance from P1 where we want our next point to appear. First, we need to determine the length of the entire petal. I like to use a parameter for this so I can dynamically adjust the look of the flowers. So let’s create a parameter called [Petal_Length_Ratio]. This is going to be a number relative to the radius, so a ratio of 1 would set the length of the petal equal to the radius of the circle. A value of .8 would set the length of the petal equal to 80% of the radius, and so on. I usually go with a value somewhere between .5 and 1. We’ll use this along with the radius, so that the petals of each flower are sized appropriately based on the size of their inner circle. Next, we need to position this point somewhere between the start of the line and the end of the line. As I mentioned earlier, I like to place it two thirds of the way (P2_Parameter from the previous section). So the first input, the distance from P1, is going to be the radius x the length ratio x the P2 parameter. For the second input, we’re going to use the [Petal_Middle_Position] because we want this side of the line to follow the same path as the line with P3 and P4. If we were to use the [Petal_Side_Position] field, we would end up with really wide, strange looking petals. This will probably make more sense a little further along. For now, let’s plug those values into our X and Y calcs.

P2_X = [P1_X] + (([Radius] * [Petal_Length_Ratio] * [P2_Parameter]))* SIN(2*PI() * [Petal_Middle_Position])

P2_Y = [P1_Y] + (([Radius] * [Petal_Length_Ratio] * [P2_Parameter]))* COS(2*PI() * [Petal_Middle_Position])

P3 is a little more straight forward. For the first input, we’re going to calculate the distance from the center of the inner circle. And then we’ll use a similar approach to what we did for P2. The first input will be the radius + (the radius x the length ratio x the P3 parameter). As I mentioned in the earlier section, I like to set this parameter to .5. And once again, we’re going to use the [Petal_Middle_Position] field for the second input.

P3_X = ([Radius]+([Radius] *[Petal_Length_Ratio] * [P3_Parameter]))* SIN(2*PI() * [Petal_Middle_Position])

P3_Y = ([Radius]+([Radius] *[Petal_Length_Ratio] * [P3_Parameter]))* COS(2*PI() * [Petal_Middle_Position])

P4 is almost identical to P3, except we don’t need the length ratio. We want this point to appear at the end of the line. So we can just remove that from the calc.

P4_X = ([Radius]+([Radius] * [Petal_Length_Ratio])) SIN(2*PI() * [Petal_Middle_Position])

P4_Y = ([Radius]+([Radius] * [Petal_Length_Ratio])) COS(2*PI() * [Petal_Middle_Position])

We’re almost there! If we were to plot these points for the first petal in our lotus flower, it would look like this. It looks very similar to what we reviewed in the previous section, but with one very important difference…everything is at an angle…which is what we wanted.

All that’s left to do is to plug all of these points in our Bezier calcs and then build our polygons!

Petal_X = (1-[T])^3*[P1_X] + 3*(1-[T])^2*[T]*[P2_X] + 3*(1-[T])*[T]^2*[P3_X] + [T]^3*[P4_X]

Petal_Y = (1-[T])^3*[P1_Y] + 3*(1-[T])^2*[T]*[P2_Y] + 3*(1-[T])*[T]^2*[P3_Y] + [T]^3*[P4_Y]

Now the polygons. Let’s build this as a Trellis chart, just like we did with the Circles. So drag [Row] on to Rows and [Column] onto Columns. And then;

  • Right click on [Petal_X] and drag to Columns. When prompted, select [Petal_X] without aggregation
  • Right click on [Petal_Y] and drag to Rows. When prompted, select [Petal_Y] without aggregation
  • Drag [Type] to Filter Shelf and filter to ‘Petal’
  • Drag [Petal_Count_Filter] to Filter Shelf and filter to TRUE. Right click and ‘Add to Context’
  • Drag [Polygon_ID] to Detail
  • Drag [Order] to Path
  • Change Mark Type to Polygon

We’re so close! Your sheet should look like this

The only thing left to do is to combine the Circle polygons with the Petal Polygons. We have separate data for them, all we need to do is get them on the same sheet. So we’ll create two more simple calcs to bring it all together.

Final_X = IF [Type]=’Circle’ THEN [Circle_X] ELSE [Petal_X] END

Final_Y = IF [Type]=’Circle’ THEN [Circle_Y] ELSE [Petal_Y] END

Now just replace [Petal_X] and [Petal_Y] with [Final_X] and [Final_Y] and drag [Type] from the filter shelf on to Color and you should have your lotus flowers!

The Final Touches

The hard part is done, now to make it look pretty. Play around with some of the parameters until you get the look that you like. Adjust the [Petal Count], the [Column_Count], the [Petal_Length_Ratio], and even the [P2_Parameter] and [P3_Parameter] if you wanna get crazy.

Next, throw some color on there. You could make the color meaningful to encode some data, or you could do what I just did and color it randomly. I used the calc below and then just assigned one of the color palettes I have saved.

Color = [Type] + STR([Base ID])

And that’s it! If you made it this far, please reach out and let me know what you thought, and what you came up with. Thank you so much for reading, and keep an eye on the blog for more ‘Totally Useless Charts & How to Build Them’

Categories
How-To's Tableau Techniques

It Depends: Techniques for Disabling the Default Highlighting in Tableau

The thing that I love most about Tableau is the incredible flexibility. No matter what you are trying to do, there is a way to do it. And more often than not, there are actually several ways to do it. That’s where this series comes in. There are so many incredible hacks and techniques floating out there in the Tableau Universe, it can be difficult to figure out which ones to use and when. In each installment of this series we’ll be focusing on one specific ‘question’ and discuss the pros, cons, and use cases of various techniques. And our first question of the series is… “How do I turn off the default highlighting in Tableau when I click on a mark?”. And the answer is, of course, “It Depends”.

First off, what are we talking about when we say ‘default highlighting’? As I’m sure you have noticed, when you click on a mark in Tableau, something happens to the mark you selected, and to all of the other marks in your view. When you click on a text mark, you get a blue box on the selected mark and all of the other marks fade. When you click on any other type of mark, that mark retains it’s formatting (sometimes with an extra black box around it) and the rest of the marks fade. And then everything in that view goes back to normal when you click on something else.

This behavior makes sense. When a mark is selected, you should know which mark that is. But the result, in my opinion, does not look great. It would be really nice if we could control what the selected and non-selected marks look like.

This post is going to focus on three techniques that will allow you to do just that. We’ll call them the ‘Highlight Technique’, the ‘Filter Technique’, and the ‘Transparent Technique’. First, let’s talk a little bit about each of these techniques, and then we’ll walk through how to apply them. If you’re familiar with the techniques and are just looking for a reminder on how to do one of them, feel free to skip ahead.

The Highlight Technique – This technique leverages a highlight action and essentially highlights every mark when any mark is selected. What I love about this technique is that it’s very simple to set up, and it can be applied to multiple worksheets. With a single action, you can ‘turn off’ highlighting for your entire dashboard…as long as your dashboard doesn’t contain a specific mark type. One of the major drawbacks of this approach is that it does not work with text marks. Instead of getting rid of the blue boxes on your BANs, this technique will turn them yellow. Another drawback of this approach is that it doesn’t actually de-select the mark, it just masks the selection. You can still see a black border around the selected mark, and if you have something that could be clicked on multiple times in a row (like a scroll button), it makes for a clunky user experience. Users would have to click the mark three times to run the action twice (once to run the action, a second time to de-select the mark, and a third time to run the action again). And one last drawback is if you have the opacity turned down on a mark, when you click on it, the mark will show at full opacity.

The Filter Technique – This technique leverages a filter action, and to be completely honest, I’m not entirely sure how it works. But it works great! I was first introduced to this technique by Yuri Fal, during a Twitter discussion on this exact topic, and several others have written about it since. What I love about this technique is that it actually de-selects the mark after you click on it, unlike the Highlight Technique. It also works on any chart type. The only downsides I have found with this approach are that it’s a little tricky to set up, you have to create a separate action for every worksheet on your dashboard, and it does not work well in conjunction with some other actions, mainly other filter actions and set controls. The issue with using this technique with other actions is that you cannot leverage the ‘Clearing the selection will…’ options. So basically there is no option to undo your action. This is a major drawback if you’re trying to use set controls, or another filter action, but not so much with parameter actions since you can replicate that ‘Clearing’ function with a calculated field.

The Transparent Technique – I haven’t used this technique much in the past (mainly because I learned the other techniques first), but it’s definitely something I will use more in the future. I first came across this approach in Kevin Flerlage’s blog post, 14 Use Cases for Transparent Shapes. It leverages a transparent shape that can be built in PowerPoint or other design tools and, unlike the other techniques, it does not rely on dashboard actions. Everything can be done in your worksheet. Because the transparent shape doesn’t have a border, or any fill, there is nothing for Tableau to highlight when it’s selected. Another way that it differs from the other approaches is that the Highlight and Filter techniques can be used almost universally (with the few exceptions we discussed earlier), but this approach has pretty limited applications when it comes to avoiding highlighting (but a wide variety of other awesome applications that you can read more about in Kevin’s post). The two use cases that we’ll focus on are text marks and buttons. Personally, I wouldn’t recommend trying this approach for any chart type where the marks aren’t uniform in size and spacing (but that doesn’t mean you can’t try it). Another downside of this approach is that it does not actually de-select a mark, it just masks it, similar to the Highlight Technique. But on the upside, it’s probably the easiest of the three methods to implement, at least for text marks.

Alright, so let’s bring that all together

Now it should be clear which technique you should use right? Of course not. It depends. So let’s look at a few specific common use cases.

Which Technique to Use When

Text Marks

As I mentioned earlier, the Highlight Technique does not work with text marks, so that leaves two options. Up until recently I would have said with 100% confidence that you should use the Filter Technique. The main difference here is that if you have more than one mark in your worksheet, the Transparent Technique will fade the other marks, and the Filter Technique will retain all formatting (so you’ll want to use other indicators, like color, to show which mark is selected). With most mark types you’re trying to avoid that fading, but it actually works pretty well with text marks. What we really want to do here is get rid of that blue box, which both options will do. Here are the two methods, the Transparent Technique as is, and the Filter Technique using color to indicate the selected mark

Transparent Technique

Filter Technique

Both look great, but I would give the edge to the Transparent Technique because of how easy it is to set up. But I encourage you to try both methods and see which one you prefer

Buttons

There aren’t really any limitations on this use case. All three methods will work, but what works best might depend on how your buttons are built and how they will be used. If you built your button in Tableau using the Circle or Square mark type, or if you built them in another tool and brought them in as custom shapes, I wouldn’t recommend using the Transparent Technique. For this technique to work you would have to create two worksheets, one with buttons and one with transparent shapes, and then layer the sheets so that the transparent shapes in your top sheet are aligned perfectly with the buttons on your bottom sheet. Not terribly difficult, but the other methods are easier in this case. The only time I would recommend using the Transparent Technique for buttons over the other techniques, is if your buttons are not actually in Tableau at all, and are instead, part of a background image. You could design and incorporate your buttons directly into your background with other design tools, and then use this technique to make them act and appear as buttons in Tableau.

The next consideration would be whether or not a button might be clicked multiple times in a row, like it would with a scroll button. In that case I would definitely go with the Filter Technique as that’s the only one of these techniques that actually de-selects the mark. As I mentioned earlier, without that de-selection, users will have to click the button three times to run the action twice in a row. If multiple clicks aren’t a concern, you can also use the Highlight Technique, but my vote goes to the Filter Technique. Overall it makes for the best user experience and can make running your dashboard actions smooth and app-like. It’s worth the little bit of extra effort to set it up.

Any Other Chart Type

If you’re running an action from any other type of chart, like a bar chart or a scatterplot, I would not recommend using the Transparent Technique. You may be able to get it to work (on a scatterplot at least), but similar to the Buttons use case, there is an easier way. I will always default to the Filter Technique, because I think it provides the best user experience. Again, it may take longer to set up, but in my experience, it’s usually worth it. However, there are times when that is not an option. As I mentioned before, with that technique, you lose the ability to ‘Clear the Selection’. That means no set controls, no filter actions, and extra work for parameter actions. If I’m running a parameter action, I will usually still put in the effort to use the Filter Technique (and use a calculated field in the action to clear the selection). If I’m running a filter action, or using set controls, I will use the Highlight Technique. So my vote here again goes to the Filter Technique (when possible).

Turn Off All Highlighting

Sometimes, I have a dashboard with no interactivity, but I still don’t want users to click on something and trigger that highlighting. If you aren’t using your tooltips you could just put a floating blank over your whole dashboard and call it a day. But in most cases I still want the users to see the tooltips. In that case, the Highlight Trick is fantastic. It’s quick to set up and it can be applied to every chart in your dashboard with a single action (except for text marks). You could set up the Filter Technique on each of your worksheets individually, or build duplicates of each and try to layer them with the Transparent Technique, but in this case, the Highlight Technique definitely gets the vote.

The Verdict

So to summarize; for text marks, like BANs, I recommend the Transparent Technique, for any other types of charts that are running a parameter action, I recommend the Filter Technique, and for charts running filter actions or using set controls, or for mass highlight removal, I recommend the Highlight Technique.

Perfect, now what exactly are these techniques?

The Three Techniques

Here’s a quick walk-through of each of the three techniques. One thing to keep in mind is that we are only going to cover how to remove the highlighting. In a lot of cases, once you remove the highlighting, it’s still important to indicate which mark is selected. This is especially true if you are using actions to filter your dashboard. I review a few ways to do this in a post I wrote a while back on the Highlight Technique, but this post is already long enough so I’m not going to repeat them here.

The Highlight Technique

  • Create a calculated field. This field can be called whatever you want and can contain any non-aggregate value. The key is that this field will be the same on every single row in your data (and every mark in your worksheet). I typically name my field ‘HL’ and use a blank value (in calc body, just enter ”). For this example, I’m going to use ‘I<3Tableau’
  • Drag your ‘HL’ field to Detail on the marks card on ALL worksheets where you want to disable highlighting
  • Go back to your dashboard and add a Highlight Action by clicking on ‘Dashboard’ in the upper section above the toolbar, and then selecting ‘Actions’. Then click ‘Add Action’ and select ‘Highlight’
  • Update your Highlight Action. In this example I want to turn off the highlighting on my bar chart and my scatterplot, so I’ve added the ‘HL’ field to detail on both of those worksheets
    • Give your action a descriptive name so it’s easy to find and edit later on
    • Under Source Sheets, select ALL of the worksheets where you want to disable highlighting
    • Under Target Sheets, also select ALL of the worksheets where you want to disable highlighting
    • Run the Action on Select
    • Under Target Highlighting choose ‘Selected Fields’
    • Choose the ‘HL’ field from the list of fields
  • Your updated action should look like this

Once that’s done, click ‘OK’ and test your action. Usually if the action is not working as expected it’s because the ‘HL’ field is not on detail, or because it’s a mark type that’s not supported by this technique (Text marks).

The Filter Technique

  • Create a calculated field called 0 and enter the number 0 in the body of the calculation
  • Repeat the step above but use 1 for the calculation name and the value in the body of the calculation
  • Right click on both new fields, 0 and 1, and change them to a dimension
  • Drag both new fields to Detail on the marks card on the worksheet where you want to disable highlighting (this technique can only support one sheet at a time)
  • Go back to your dashboard and add a Filter Action by clicking on ‘Dashboard’ in the upper section above the toolbar, and then selecting ‘Actions’. Then click ‘Add Action’ and select ‘Filter’
  • Update your Filter Action. In this example I want to turn off the highlighting on the BANs, so I’ve added the 0 and 1 fields to detail on that worksheet.
    • Give your action a descriptive name so it’s easy to find and edit later on
    • Under Source Sheets, keep the active dashboard selected in the drop-down and then select the worksheet where you want to disable highlighting from the options below (the list of sheets in your dashboard)
    • Under Target Sheets, select your worksheet from the drop-down. Make sure that you select it from the drop-down and not from the options below the drop-down
    • Run the Action on Select
    • Set the ‘Clearing the selection will’ option to ‘Show all values’
    • Under Filter, choose ‘Selected Fields’
    • In the table below, on the left side in the Source Field column, click ‘Click to add’ and choose 0 from the list of fields
    • On the right side, in the Target Field column, choose 1 from the list of fields
  • Your updated action should look like this

Once all of the options are updated, click OK and test your action. If your action is not working as expected, the most common issues are;

  • Under Target Sheets, you may have the dashboard selected in the drop-down and your worksheet selected in the options below that. You can fix this by selecting your worksheet in the drop-down instead of the dashboard
  • 0 and 1 were added to detail as measures with aggregation. If this is the cause, you’ll see a warning at the bottom of the screenshot above that says ‘Missing fields from…’. The 0 and 1 need to be converted to dimensions, or added to detail without aggregation.

The Transparent Technique

So there are actually two different techniques here, one for text marks, and one for buttons. I am going to cover the technique for text marks because I think it’s the best technique for this mark type. But definitely check out Kevin’s post here to learn more about the technique with buttons and 12 other awesome uses cases for transparent shapes.

The first step is to create your transparent shape. I usually do this in PowerPoint but you can use any design program.

  • In PowerPoint, click on ‘Insert’ in the upper toolbar
  • Click on ‘Shapes’ and select a circle
  • Click anywhere on your slide to insert the circle
  • In the ‘Shape Format’ tab, click on ‘Shape Fill’ and select ‘No Fill’
  • In the ‘Shape Format’ tab, click on ‘Shape Outline’ and select ‘No Outline’
  • Right click on your transparent shape and select ‘Save Picture As’
  • Save the transparent shape to a sub-folder in the Shapes folder in your Tableau Repository
    • The path for this is usually C:\Users\Username\Documents\My Tableau Repository\Shapes unless you changed the location
    • Within the Shapes folder you can create sub-folders. I have a folder called ‘Transparent’ with just my transparent shape, so it’s easy to find

Once your transparent shape is created and saved you can apply it to your Text marks.

  • Go to the worksheet with the text marks
  • On the marks card, change the mark type to ‘Shape’
  • Click on ‘Shape’ then ‘More Shapes…’
  • At the bottom right of the window, click ‘Reload Shapes’
  • In the ‘Select Shape Palette’ drop-down, select the sub-folder where you saved your transparent shape
  • Click ‘Assign Palette’

When you do this, Tableau automatically moves all of your text to the Labels for the transparent shape, but if you have a field on color, you may notice that the color isn’t applied to those labels. To fix this, there’s one additional step.

  • Click on ‘Label’ on the marks card
  • Click on ‘Font’
  • Click the ‘Match Mark Color’ box

Now the color should be applied to each mark and if you click on any of the text marks, the blue box should be gone. You should see the selected mark retain it’s formatting, and the non-selected marks fade slightly.

The End

So that’s it for this installment of ‘It Depends’. Please keep in mind that these recommendations are all personal opinions based on my experiences. I encourage you to learn and try all of these different techniques on your own and figure out what works best for you.

And I hope you’ll join us for the next installment of ‘It Depends’ where we’ll discuss the different methods for filtering a dashboard with an action that supports multiple selections (Filter Action vs Set Action vs Set Control vs Parameter Action). Thanks everybody!

Categories
How-To's

Fun With Curves in Tableau Part 3: Sigmoid Curves

This is the third and final part of our series on creating curved elements in Tableau. Although this post covers new and different techniques, I would recommend checking out Part 1 of the series here as some of the concepts overlap. This post will focus on sigmoid curves and a few different techniques to build them, depending on what you’re trying to do and what your data looks like. To follow along, you can download the sample data and workbook here.

Sigmoid Curves

This is a sigmoid curve. Whether you realize it or not, these curves are everywhere on Tableau Public. Sankeys? Sigmoid Curves. Curvy lines on a map? Sigmoid Curves. Curvy bump charts, or curvy slope charts, or curvy dendograms, or curvy area charts? All Sigmoid Curves. The main difference between this type of curve and the Bezier curves we discussed in Part 2 of this series, is that only 2 points are needed to draw a Sigmoid Curve. If you have the start point, the end point, and the right formulas, the math will take over and create that nice symmetric s-shaped curve for you. So what formulas should you use? The answer is, as with most things in Tableau, it depends. We’re going to cover two different models for drawing Sigmoid curves, we’ll call them the ‘Standard’ model, and the ‘Dynamic’ model.

Think about most of the chart types that use sigmoid Curves. Sankeys, Curvy slope or bump charts, dendograms…they all have something in common. The start of the lines and the end of the lines are uniform. If they are running from left to right, the start of all of the lines share an X value, and the end of all of the lines share an X value (columns). If they are running from top to bottom, the start of all of the lines share a Y value, and the end of all of the lines share a Y value (rows). When these conditions are true, which they will be in 99% of the applications in Tableau, you can use the ‘Standard’ model. If either of those conditions are not true, you can use the ‘Dynamic’ model.

Since the majority of the time you will be using the ‘Standard’ model, let’s start with that one.

Building Your Data Source

To build our data source we are going to follow the same process that we did in Part 1 and Part 2 of this series. We are going to create additional points by joining our sample data to a densification table using join calculations (value of 1 on each side of the join). For this first example, we are going to use the sheet titled ‘Slope Chart’ for our sample data, and the sheet titled ‘SigmoidModel’ for our densification table. You can download the sample files here.

Here is our sample file that we will use to draw 12 curved lines, from left to right.

You may notice that the densification table looks a little different. In previous parts of this series, the densification table was a single column with numbers 1 thru however many points you wanted to create. For this model, we have 2 columns, ‘t’ and ‘Path’. The Path value will be used to tell Tableau how to connect the points (think connect the dots). The ‘t’ value is going to be used to draw the curve and will be a value evenly spaced between -6 and 6. For our example, we are using 24 points, so we will have a value at every .5 increment (-6, -5.5, -5, …). If you want your curve to be a little smoother, you could use 48 points and have a value at every .25 increment (-6, -5.75, -5.5, …). Really, you can use any number of points, you’ll just need to do the math to get t values that are evenly spaced between -6 and 6. Why -6 and 6? I have absolutely no idea and I’m not going to pretend to. I’m just here to show you how to make cool curvy things, and to do this cool curvy thing, you need values evenly spaced between -6 and 6. Here is our densification table

Building Your Calculations

Once we join those files together, there are only two calculations needed to turn the points from our sample data into curved lines.

Sigmoid

1/(1+EXP(1)^-[T])

Curve

[Start Position] + (([End Position] – [Start Position]) * [Sigmoid])

Essentially what these calculations are doing are determining the total vertical distance that needs to be travelled (End Position – Start Position) and then the Sigmoid function spaces these points appropriately on the Y axis to create the S-shaped curve. Let’s build our view and then come back to this.

Build Your Curves

Follow the steps below to build your curves

  • Right click on the [T] field, drag it to columns, and when prompted, choose the top option ‘T’ without aggregation
  • Right click on the [Curve] field, drag it to rows, and when prompted, choose the top option ‘Curve’ without aggregation
  • Change [LineID] to a Dimension, and drag it to color
  • Change the Mark Type to ‘Line’
  • For this example, you may not need to address the ‘Path’, but I find it’s good practice when dealing with lines and polygons.
    • Right click on the [Path] field, drag it to Path, and when prompted, choose the top option ‘Path’ without aggregation

When you finish, your worksheet should look like this

With just two calculations, we were able to draw those nice curved lines. Let’s quickly re-visit those calcs. Our Sigmoid calculation is a mathematical function with values ranging between 0 and 1 depending on the ‘T’ value. The table below shows the Sigmoid Value for each T value and the second row shows the difference from the previous value. Notice that at the beginning and the end, the values don’t change much from one step to the next, but towards the middle the values change much more rapidly. And where T=0, the value is .5.

Now let’s look at that in context with our lines and with our Curve calculation. Notice that the vertical position for each of the points on the line doesn’t change much at the beginning or the end, but does change rapidly as it approaches the center. And the center of each line is exactly half way between the starting position and the ending position.

For example, Line 4 starts at position 4 and ends at position 10. At T=0, the vertical position is at 7, halfway between 4 and 10. Let’s plug this line into our calculation. Remember, from the table above, the value of Sigmoid at T=0 is .5

Curve = [Start Position] + (([End Position] – [Start Position]) * [Sigmoid])

or

Curve = 4 + ((10 – 4) * .5) = 7

And let’s do this with one other point, just for good measure. If we look at the table above, we see that the Sigmoid value at T=-3 is .047

Curve = 4 + ((10 – 4) * .047) = 4.28

If you look at the image above, you’ll notice that at T=-3, the vertical position is just above the 4, or 4.28 to be exact. That’s enough math for now, let’s look at some other fun examples.

More Examples – Dendogram

Now let’s follow the same exact process for the curvy slope chart above, but change up our data a little bit. For this example we are going to use the sheet titled ‘Dendogram’ for our data, and we’ll use the same ‘SigmoidModel’ sheet for our densification. Here is our sample data.

In this example, our starting position is the same for all of our lines and is halfway between the minimum end position (1) and the maximum end position (12). If you follow the same process your sheet should look something like this.

Now just for fun, let’s make that starting point dynamic. Create a numeric parameter called [Dynamic_Start] and then create a new field called [Dynamic_Curve]. This will be exactly the same as our [Curve] calculation but we’ll replace the [Start_Position] field with [Dynamic_Start].

Dynamic Curve

[Dynamic Start] + (([End Position] – [Dynamic Start]) * [Sigmoid])

Now in your view, just replace the [Curve] field on Rows with the [Dynamic Curve] field and you should have something like this.

More Examples – Sankey

This is not going to be an in-depth tutorial on how to create Sankey diagrams. People much smarter than me (check out flerlagetwins.com) have written many posts and shared many templates for anyone interested in building one. This section will be dedicated more to the mechanics of a Sankey diagram and how we can take what we’ve already learned in the previous examples and apply them towards building a Sankey diagram. The one main difference between what we’ve done so far and what we’re going to do now lies in the Mark Type. It’s time for some polygon fun.

Essentially what we need to do is draw 2 lines for every 1 ‘Sankey Line’ and then connect them together. Typically, you would use table calculations in Tableau to figure out the positions of these lines, but we are going to keep it simple and use some data from our sample file. What we’re actually building is probably more of a Slope Chart, since each segment will be of equal width, but the fundamentals are the same as with a true Sankey Diagram. For this example we are going to use the sheet titled ‘Sankey’ and for our densification we’re going to use the sheet titled ‘SankeyModel’. Here is our sample data

You may notice that this densification table has some additional data. It has twice as many records and a new column called ‘Side’ with values ‘Min’ and ‘Max’. This is because for each segment, or ‘Sankey Line’ we will actually be drawing two lines, a Min Line and a Max Line. Here’s out densification data.

Building Your Calculations

We are still going to use the same 2 fields, Sigmoid and Curve, but in this case, the calculation for Curve will be a little different.

Sigmoid

1/(1+EXP(1)^-[T])

We have the two points needed for one of our lines (Start_Position and End_Position), we’ll call this the Min Line. But we need the two points for our Max Line as well. As I mentioned earlier, in this example all of our segments are going to be equal width, so we’ll just add 1 to our position fields.

Start_Position_Max

[Start_Position]+1

End_Position_Max

[End_Position]+1

And now we have the points needed to draw all of our curves.

Curve

IF [Side]=’Min’ then [Start_Position] + (([End_Position] – [Start_Position]) * [Sigmoid])
ELSE [Start_Position_Max] + (([End_Position_Max] – [Start_Position_Max]) * [Sigmoid]) END

This calculation may look complicated, but it’s almost identical to the Curve calculation we were using earlier. When we joined to the SankeyModel, it essentially created two sets of points for us. We have all of the same T values as we did earlier (24 points ranging from -6 thru 6), but now we have 2 of each, one where [Side] = ‘Min’ and one where [Side] = ‘Max’. So we can use this to draw two lines. The first part of the IF statement is for the ‘Min’ line. It is exactly the same as our previous Curve calculation. The second part of the IF statement (ELSE) is for the ‘Max’ line and it is the same calculation but using the [Start_Position_Max] and [End_Position_Max] fields.

Build Your Curves

To start, let’s follow the same process we did earlier, but with a couple of minor changes

  • Right click on the [T] field, drag it to columns, and when prompted, choose the top option ‘T’ without aggregation
  • Right click on the [Curve] field, drag it to rows, and when prompted, choose the top option ‘Curve’ without aggregation
  • Change [LineID] to a Dimension, and drag it to color
  • Change the Mark Type to ‘Line’
  • Drag [Side] to Detail

Your sheet should look something like this

This is starting to look like a Sankey diagram. For each LineID we have two lines that are moving together, we just need to ‘color them in’, and we’ll do that by converting them to polygons. So change the Mark Type to polygon.

Uh-oh. What happened?

So because we have [Side] on detail, it’s treating our Min and Max lines as separate polygons. We want them treated as 1 polygon, so surely removing the [Side] field will fix it.

$@!%!!!!

Alright, we can figure this out. Remember how earlier I said it was always good practice to address the Path on the Marks Card when using lines and polygons? Let’s try right clicking on our [Path] field, dragging that to Path and adding it without any aggregation.

Ahhh, that’s better.

Now let’s figure out what happened. We know why we had to remove the [Side] field, but how did the [Path] field fix our worksheet. If you look at the densification table, you’ll see that the T values go from -6 up to 6 and then repeat in the reverse order. The Path field on the other hand is sequential from 0 thru 49. Let’s take a look at the [Path] values in relation to the [T] values

So our [T] and [Curve] fields are telling Tableau where to plot all of the points, and then our [Path] field is telling Tableau the order in which to connect those points to form our polygon. And that is basically all you need to know about using Polygons in Tableau. Just Points and Path.

Let’s take a look at one more example of something you can do with the ‘Standard’ Model.

More Examples – Proportion Plot

The process for building a Proportion Plot is going to be very similar to what we did in the previous example. For this, we are going to use the sheet titled ‘Proportion’ for our sample data, and we’ll use the ‘SankeyModel’ sheet again for our densification. Here is our sample data.

You’ll notice that this table has a couple of extra columns. The ‘Run’ columns are just running totals of the Start_Value and End_Value columns. Again, this would typically be done with table calculations in Tableau, but to keep things simple we are doing those calculations in the data source. Now let’s build our calculations.

Building Your Calculations

Once again we are going to use those same two calculations, but with some minor modifications. First, our Sigmoid calculation, which is always the same.

Sigmoid

1/(1+EXP(1)^-[T])

Now, similar to the Sankey example, we need to draw two sets of line for each section. So we’ll need the start and end positions for our Min Line and the start and end positions for our Max Line. Because this is a Proportion Plot (part of a whole), we want all of our values represented as percentages. So we’ll calculate what our position should be and then divide that number by the total of the column (or the maximum value in the running total column) to translate that start position into a percentage of the whole.

Start Position Min

([Start_Value_Run]-[Start_Value])/{MAX([Start_Value_Run])}

End Position Min

([End_Value_Run]-[End_Value])/{MAX([End_Value_Run])}

Let’s take a closer look at this before we move on. We’ll use Section #2 from the table above. For the Starting Position (Start Position Min) we are going to subtract the value of that section from the running total to get the starting position.

So section #2 would start at 35 (90-55=35). Then we’ll divide that by 115, which is the total of the column, and the maximum value of the running total column, which gives us 30.4%. So this section of our Proportion Plot is going to start 30.4% of the way up our Y Axis. Now we’ll repeat for the Max Lines. These calculations will be the same except we do not need to subtract the value of the section. We just want the running total divided by the grand total.

Start Position Max

[Start_Value_Run]/{MAX([Start_Value_Run])}

End Position Max

[End_Value_Run]/{MAX([End_Value_Run])}

Using section #2 as an example again, we would divide 90 (the running total value) by 115 (the total of the column or the max value of the running total column) and get 78.3%. So we want this section of our Proportion Plot to start 30.4% of the way up the Y axis and end 78.3% of the way up the Y axis. And now for the Curve calculation, we’ll use the same calculation that we did in the Sankey Example

Curve

IF [Side]=’Min’ then [Start_Position_Min] + (([End_Position_Min] – [Start_Position_Min]) * [Sigmoid])
ELSE [Start_Position_Max] + (([End_Position_Max] – [Start_Position_Max]) * [Sigmoid]) END

Build Your Curves

Now we’ll follow the same process to build our view

  • Right click on the [T] field, drag it to columns, and when prompted, choose the top option ‘T’ without aggregation
  • Right click on the [Curve] field, drag it to rows, and when prompted, choose the top option ‘Curve’ without aggregation
  • Change [Section] to a Dimension, and drag it to color
  • Change the Mark Type to ‘Polygon’
  • Right click on the [Path] field, drag it to Path, and when prompted, choose the top option ‘Path’ without aggregation

When you’re finished, your sheet should look something like this.

If you look at section #2, the orange section, you’ll see that on the left side (the starting positions), it ranges from 30.4% to 78.3%, which is what we had calculated in our example. So it accounts for about 48% of the total (55 / 115=.48) and it starts at about 30% (35 / 115)

That’s it for the ‘Standard Model’. Now we are going to jump into one quick example with the ‘Dynamic Model’.

Dynamic Model

As I mentioned earlier, 99% of the time you are going to want to use the ‘Standard’ model, but there are instances where that may not work. One example use case, which has been popular in recent years, is drawing curved lines on a map that connect to another chart. So let’s use that.

Building Your Data Source

For this example we are going to draw curved lines from 9 U.S. states and have them end in a uniform column on the right side of our sheet, so we can connect them to another chart. For this we’ll use the sheet titled ‘Dynamic’ for our sample data and we’ll use the ‘SigmoidModel’ sheet for our densification. Here is our sample data.

So for each of our states we have the starting points (the lat and lon for the state), but we need to figure out our end points. This can be done in a lot of different ways. You can do it manually and plug those values into your data source. Personally, I like to calculate them in Tableau. But those calcs will be dramatically different depending on your use case. For this example, we are going to draw our lines from left to right and they will all end in a uniform column to the right of the U.S. and will be spaced evenly between the lowest state and highest state in our sample data. If your use case is different, you can most likely modify these calculations to get what you need, and as always, feel free to reach out to me if you have any questions. So let’s calculate our end points. Keep in mind that this step is completely optional. If you prefer to have static points in your data source for the end points, just name them End_Lat and End_Lon and skip ahead to the next calculation.

Build Your Calculations

This first calculation will be used to find the latitude for our end points. The calculation looks complicated but we will break it down.

End_Lat

{MAX([Lat])} – (({MAX([Lat])}-{MIN([Lat])})/{COUNTD([State])-1}) * ([Rank]-1)

The first portion of this calculation, {MAX([Lat])}, is used to find our ‘reference point’, which will be the maximum latitude in our sample data. The next portion of the calculation, ({MAX([Lat])}-{MIN([Lat])}) gets our ‘range’, or the maximum latitude minus the minimum latitude. Then we divide that range by the number of states -1 to get the spacing between points, or {COUNTD([State])-1}. And then we multiply that spacing by the Rank -1 (so that the first point starts at our max Lat).

So an easier version of this formula is Ref Point – ((Range/Number of States -1) *( Rank -1)

Now we need to calculate the longitude for our end points. This is much easier. We want all of our lines to end in the same column, so they will all have the same longitude. I like to use parameters rather than hardcoding values, so you can easily adjust the position of things. So let’s create a parameter called [Line End Offset] and set that value to 10. Then our calculation is just the maximum longitude value in our data + our offset parameter

End_Lon

{MAX([Lon])} + [Line End Offset]

So now we have the two sets of points that we’ll need to draw our lines. If we were to stop here and just map those two sets of points, we would have a point on the center of each of our states (blue) and then a column of points to the right of the U.S. (orange).

Now for the rest of our calculations. Once again we’ll need the Sigmoid calculation.

Sigmoid

1/(1+EXP(1)^-[T])

The Curve calculation is exactly the same as all of the previous examples just with different field names.

Curve

[Lat] + (([End Lat] – [Lat]) * [Sigmoid])

The main difference between the ‘Dynamic’ model and the ‘Standard’ model lies in the spacing of the points. In the Dynamic model, points are spaced evenly between -6 and 6 and we can just use the [T] field on columns. That obviously won’t work on a map. We have different longitudes for all of our starting points, so we’ll need to adjust the approach a little bit. Instead of evenly spacing points between -6 and 6, we need to evenly space them between our starting and ending longitudes. So we’ll take the total horizontal distance (ending longitude – starting longitude) and divide it by the number of points in our densification table -1.

Point Spacing

([End Lon]-[Lon])/({COUNTD([Path])-1})

Now we’ll use that calculation, along with the [Path] field and our starting longitude to create points that are equally distant between our starting and ending longitude.

Lon_Adjusted

[Lon] + [Path]*[Point Spacing]

Build Your Curves

Now that we have all of our calculations, let’s build our curve.

  • Right Click on [Lon_Adjusted], go to ‘Geographic Role’ and select ‘Longitude’
  • Right Click on [Curve], go to ‘Geographic Role’ and select ‘Latitude’
  • Right click on the [Lon_Adjusted] field, drag it to columns, and when prompted, choose the top option ‘Lon_Adjusted’ without aggregation
  • Right click on the [Curve] field, drag it to rows, and when prompted, choose the top option ‘Curve’ without aggregation
  • Drag [State] to Color
  • Change the Mark Type to ‘Line’
  • Right click on the [Path] field, drag it to Path, and when prompted, choose the top option ‘Path’ without aggregation

When you’re done, your sheet should look something like this.

Now with just a few modifications to the ‘Standard’ model, we were able to build our Sigmoid curves with different starting points. This example was for building lines that run left to right, but you can just as easily build them from top to bottom. The calculations and example for that can be found in the sample workbook here. Just use the fields with the ‘Ttb’ suffix.

If you’re still reading this, I apologize for the incredibly long post. There are just so many things you can do with Sigmoid Curves and I wanted to demonstrate how, with basically the same technique, you can build a huge variety of cool curvy things in Tableau. If you’re interested in learning more about Sigmoid Curves, definitely check out flerlagetwins.com. Ken and Kevin are incredible and everything I know about Sigmoid Curves (and a lot of other things that will eventually be covered in this blog), I learned from them.

Here are a few examples of where I’ve used Sigmoid Curves in my Tableau Public Work (click thumbnail to view on Tableau Public).

Standard Model

Dynamic Model

As always, thank you so much for reading!

Categories
How-To's

Fun With Curves in Tableau Part 2: Bezier Curves

This is part two of a three-part series on creating curved elements in Tableau. Although this post covers new and different techniques, I would recommend checking out part one of the series here as some of the concepts overlap. This post will focus on one type of curved line that is used frequently in Tableau Public visualizations; Bezier Curves. To follow along, you can download the sample data and workbook here.

Bezier Curves

There are many types of Bezier curves varying in complexity from very simple to ridiculously complicated. One commonality with these types of curves is that they rely on ‘control points’. This post is going to focus on quadratic Bezier curves, which have 3 control points. An easy way to think about these points is that there is a starting point, a mid point, and an end point, creating a triangle. The starting point and end point are simply the start and end of the line. The other point, the mid point, will determine the shape of the triangle, and in turn, what that curve is going to look like. Now let’s see how the position of that mid point (creating different types of triangles) will affect the curve.

Each of the triangles above have the same starting point (1,0) and the same end point (10,0), but have significantly different curves because of the varying mid point. For most applications in Tableau we’re going to be dealing with examples like Example 1 and Example 4 in the image above, where the mid point is halfway between the other points, creating an isosceles triangle. To make things even easier for this example, we’re going to deal with just Example 1, which is an equilateral triangle, meaning all 3 sides are the same length.

Building Your Data Source

To build our data source we are going to follow the same process that we did in Part 1 of this series. We are going to create additional points by joining our sample data to a densification table using join calculations (value of 1 on each side of the join). In this case, our sample data is called Bezier_SampleData and our densification data is called BezierModel. You can download the sample data here. In our sample data we have 10 records that we’ll use to draw 10 unique curves. For simplicity sake, all of these curves will start and end at 0 on the Y axis.

Building Your Calculations

To draw our curves there are a few things we’ll need to calculate. Our sample data has 2 of the 3 points we need (starting point and end point), so we’ll need to calculate the X and Y values for the 3rd point (mid point). Let’s start there.

We discussed above that we are going to use equilateral triangles to draw our curves. In that case, the X value for the mid-point will be halfway between [X_Start] and [X_End] and the Y value will be the height of the triangle plus the starting point. Here is an example

To find the X value that falls in the middle of [X_Start] and [X_End], just add them together and divide by 2

X_Mid

([X_Start]+[X_End)/2

To find the Y Value add the Y starting point to the height of the triangle. To find the height of the triangle, use the formula (h=a*3/2) where a is the length of one of the sides of your equilateral triangle. To find that length subtract [X_Start] from [X_End]

Y_Mid

[Y_Start] + (([X End]-[X Start])*SQRT(3)/2)

Now, we have 3 points for each record in our data set. We have our starting point ([X_Start],[Y_Start]), our end point ([X_End],[Y_End]), and our mid point ([X_Mid],[Y_Mid]). If we were to stop here and plot those points, it would look something like this, 10 equilateral triangles of different sizes, all on the Y axis (because we had used 0 for all of the [Y_Start] and [Y_End] values)

Now let’s convert those points into Bezier curves. The first calculation we’ll need is [T]. T is going to be a percentage value that is equally spaced between 0 and 1 for the number of points in our densification table. Think of this as similar to the [Position] calc in Part 1 of this series, but slightly different because we need our first point to start at 0.

T

([Points]-1)/{MAX([Points])-1}

No matter how many points you add to your densification table, this calculation will spread them evenly between 0 and 1. This field is used to evenly space our points along our curved line. When complete, your T values should look like this. The value for point 1 should be 0%, the value for the last point should 100% and all of the points in the middle should be equally spaced between those

Now all that is left is to calculate the X and Y coordinates for each point along our curved lines. The calculations for X and Y are exactly the same, but in the [Bezier_X] calc you are using the 3 X values, and in the [Bezier_Y] calc you are using the 3 Y values

Bezier_X

((1-[T])^2*[X_Start] + 2*(1-[T])*[T]*[X_Mid]+[T]^2*[X_End])

Bezier_Y

((1-[T])^2*[Y_Start] + 2*(1-[T])*[T]*[Y_Mid]+[T]^2*[Y_End])

Now let’s build the curves in Tableau

Build Your Curves

Follow the steps below to build your curves

  • Right click on the [Bezier_X] field, drag it to columns, and when prompted, choose the top option ‘Bezier_X’ without aggregation
  • Right click on the [Bezier_Y] field, drag it to rows, and when prompted, choose the top option ‘Bezier_Y’ without aggregation
  • Change the Mark Type to ‘Line’
  • Right click on the [Points] field, drag it to Path, and when prompted, choose the top option ‘Points’ without aggregation
    • This tells Tableau what order to ‘connect the dots’ in.
  • Drag [Line Name] to color

When you finish, your worksheet should look something like this

Or you can change the Mark Type to ‘Polygon’ and reduce the Opacity and it will look like this

Now this is a relatively simple example. You can get really creative with how you calculate those 3 points, especially the ‘mid’ point. Let’s do one more example, combining the work we’ve done in this exercise with what we had done in Part 1 of the series. The result should look pretty familiar.

First, let’s take our X values and plot them in a circle, instead of on a straight line. From Part 1 of this series, we know that in order to plot points around a circle, we need 2 inputs for each point; the Radius (the distance from the center of the circle), and the Position (the position around the circle expressed as a percentage). Since we want all of our points to be equally distant from the center, we can use a single value for the radius of all points. Let’s create a parameter called [Radius] and set the value to 10. For Position, we’ll need to calculate the position for each start and end point. For the position calculations we’ll need to first find the maximum number of points around our circle, or in this case, the max value of the [X_Start] and [X_End] fields together. Looking at our data, we can see the maximum value of the [X_Start] field is 12 and the maximum value of the [X_End] field is 15. So our Max Point will be 15

Max_Point

{MAX(MAX([X_Start],[X_End]))}

Next, we’ll use that value to calculate the position around the circle for each start and end point

Position_Start

[X_Start]/[Max_Point]

Position_End

[X_End]/[Max_Point]

Now, we can calculate the X and Y coordinates for the starting point and end point of every line, using the same calculations we used in Part 1 of the series. Let’s begin with the starting points

Circle_X_Start

[Radius]* SIN(2*PI() * [Position_Start])

Circle_Y_Start

[Radius]* COS(2*PI() * [Position_Start])

And now we’ll calculate the X and Y coordinates for the end points using the same formulas but swapping out the [Position_Start] field, with the [Position_End] field.

Circle_X_End

[Radius]* SIN(2*PI() * [Position_End])

Circle_Y_End

[Radius]* COS(2*PI() * [Position_End])

Now for each of our lines, we have two sets of coordinates. We have the coordinates for the start of our line ([Circle_X_Start],[Circle_Y_Start]) and the coordinates for the end of our line ([Circle_X_End],[Circle_Y_End]). All we need now is that 3rd set of coordinates, the mid-point. The good news is, when plotting these around a circle, we have a very convenient mid-point…the middle of the circle. In this case, because our circle is starting at (0,0), we can use those values as our 3rd set of points. Let’s take our 3 sets of coordinates and plug them into the Bezier calculations we used earlier

Circle_Bezier_X

((1-[T])^2*[Circle_X_Start] + 2*(1-[T])*[T]*0+[T]^2*[Circle_X_End])

Circle_Bezier_Y

((1-[T])^2*[Circle_Y_Start] + 2*(1-[T])*[T]*0+[T]^2*[Circle_Y_End])

Now in our view if we replace the [Bezier_X] and [Bezier_Y] fields with the [Circle_Bezier_X] and [Circle_Bezier_Y] fields, we get something like this…the foundation of a chord chart

This chart uses the same exact calculations for the curved lines, we just used some additional logic to calculate the coordinates for the start and end points. In our first example, for Line 2, we drew a curved line between 3 and 15 on the X axis. The coordinates for our start and end point were (3,0) and (15,0) respectively. Then we calculated the coordinates for our mid point, which ended up being (9,10.4)

In this last example, we also drew a curved line between 3 and 15, but instead of those points being on the same Y axis, they were positioned around a circle. We used what we learned in Part 1 of this series to translate those X values (3 and 15) into coordinates around a circle. So the coordinates for our start and end positions ended up being (9.5,3.1) and (0,10) respectively. And instead of calculating our mid-point, we used the center of the circle (0,0).

These are just a few basic examples of what you can do with Bezier curves, but there are so many possibilities. Here are a few examples of where I’ve used Bezier Curves in my Tableau Public Profile.

Thank you so much for reading, and keep an eye out for the third and final part of this Series, focusing on Sigmoid curves.

Categories
How-To's

Fun With Curves in Tableau Part 1: Circles

In recent years there have been multiple scientific studies1 designed to confirm what many of us in the Data Visualization Community have already suspected; when it comes to art, people are drawn to curves. Think about some of your favorite pieces of Data Art. I am willing to bet that the majority of them contain some type of curved element. Not only are curves more aesthetically pleasing than straight lines and sharp corners, but they have that ‘WOW’ factor, because as we all know, curved lines do not exist in Tableau. They take effort, and when it comes to drawing curves, most people don’t know where to start. But you don’t need to be an expert in Tableau to create beautiful radial charts, or to add some impressive curves to your dashboards. You just need to know the math, you need to know how to structure your data, and you need to know how to bring those elements together in Tableau. That’s the goal of this series. To hopefully demystify some of this work and make it more approachable, and to provide some examples. This series will focus on three types of curved elements; Circles, Bezier Curves, and Sigmoid Curves.

Drawing Circles in Tableau

Tableau does have a ‘Circle’ mark type that can be used, but being able to draw your own circles opens up a world of possibilities. The calculations that are used for drawing a circle, are the same calculations that can be used to create any type of radial chart you can imagine. For this post, I am going to keep the math as simple as possible, but if you’re interested in diving deeper I would recommend checking out this post by Ken Flerlage.

Once you have your data structured there are really only 2 inputs needed to create your radial; the distance of each point from the center of the circle (radius), and the position of each point around the circle. We’ll discuss these inputs a lot more in this post, but let’s start with our data structure. To follow along with this post, you can download the sample data and workbook here.

Building Your Data Source

To create any type of curved element in Tableau, you’ll need to start by densifying your data. To create your densification table, create a table in Excel with 1 column, in the first cell name that column ‘Points’, and then add rows with numbers 1 through however many points you wish to create. There are a few things to consider when choosing that number. Choosing a number that is too high may affect performance as that many rows will be added to your data source for each record in your ‘Core’ data set (a data source with 1000 rows will turn into 100,000 if you choose 100 points). Choosing a number that is too low will result in visible straight lines and corners instead of a smooth curve. Here is an example to help visualize how the number of points can affect the shape

I will usually choose somewhere between 50 and 100 points, closer to 50 when the circles will be small, and closer to 100 when the circles will be large. For this post, we’ll go with 50 points, so your densification table should look something like this.

And here is some sample data we’ll use as our ‘Core’ data source. We have 10 records that we’ll use to create 10 distinct circles, and we’ll use the ‘Value’ column to size the circles appropriately. I use this technique frequently, instead of the ‘Circle’ mark type in Tableau, to ensure that the size (area) of each circle accurately represents the underlying value. When designing my ‘Core’ data source, I like to use a sequential ‘ID’ field, starting from 1. This can help make some of the calculations easier, but if it’s not an option, you can typically replace that ‘ID’ field with the INDEX function in Tableau.

Now that we have our ‘Core’ data set, and our densification table, let’s bring these together in Tableau. To densify our data we’ll need to join these two tables in the physical layer in Tableau.

  • First, connect to your ‘Core’ data. In this example, that table is called ‘CircleData’
  • Drag your ‘CircleData’ table onto the Data Source pane
  • Double-click on the ‘CircleData’ Logical Table to view the Physical Tables
  • Drag your ‘Densification’ data onto the Data Source pane. In this example, this table is called ‘CircleDensification’
  • Join the tables with a Join Calculations
    • On the left side of the join, click on the drop-down and select ‘Create Join Calculation’
    • In the calculation box enter the number 1
    • Repeat the steps above on the right side of the join
  • When complete, your data source should look like this

Building Your Calculations

Now that we have our data source, the next step will be to build our calculations. As I mentioned earlier, there are 2 inputs that we’ll need to draw our circles; the distance of each point from the center of the circle, and the position of each point around the circle. Let’s start with the distance calculation.

In this example, we want the area of the circle to represent the value in our data. The distance from the center of a circle to the outside of the circle is known as the radius, and we can calculate that with the formula below. This will serve as our first input.

r = √A/π or in plain English Radius = Square Root of Area/Pi

In our data we have the Area (Value) so we can calculate the radius for each of circles using the calculation below in Tableau. Create a new calculated field called ‘Radius‘ and copy the formula below

Radius

SQRT([Value]/PI())

For the next input, we’ll need to calculate the position of each point around the circle. For each circle, we have 50 points (the number of rows in our densification table) and we’ll want to evenly distribute those points around the circle. The resulting number will be a percentage and will represent how far around the circle that point appears. For example, imagine you were looking at a clock. 3 o’clock would be 25%, 6 o’clock would be 50%, 9 o’clock would be 75% and 12 o’clock would be 100%

We can calculate that percentage for each point by taking the value of the Points field divided by the Max Value of that field (which would represent the number of points or number of rows in our densification table). So the Position calculation would be as follows in Tableau

Position

[Points]/{MAX([Points])}

Note* if you plan on using the ‘Line’ mark type to draw your circles instead of the ‘Polygon’ mark type, you should modify this calculation by subtracting 1 from the divisor, [Points]/{MAX([Points])-1}. This is because polygons will automatically connect your first and last point. For lines, we need to force that connection.

Now your data should look something like this. For each record in your ‘Core’ data set, you have 50 points, with position values equally spread between 0-100%

Now we have our 2 inputs and all that is left to do is to translate these inputs into X and Y coordinates, which can be done with two simple calculations

X

[Radius]* SIN(2*PI() * [Position])

Y

[Radius]* COS(2*PI() * [Position])

Build your Circles

Follow the steps below to build your circles

  • Right click on the [X] field, drag it to columns, and when prompted, choose the top option ‘X’ without aggregation
  • Right click on the [Y] field, drag it to rows, and when prompted, choose the top option ‘Y’ without aggregation
  • Change the Mark Type to ‘Polygon’
  • Right click on the [Points] field, drag it to Path, and when prompted, choose the top option ‘Points’ without aggregation
    • This tells Tableau what order to ‘connect the dots’ in.
  • Drag [Circle Name] to color

When you finish, your worksheet should look something like this

Right away, you’ll probably notice a few things about this. First, these look like ovals, not circles. And second, there are only 2 circles when there should be 10. The reason they look like ovals is because the worksheet is wider than it is tall. It’s important when you place a radial chart on a dashboard that you set the width and height equal and that you Fix both the X and Y axis to the same range. The reason there are only two circles visible is because all 10 circles have the same starting position (0,0), so they are currently stacked on top of each other.

Arrange Your Circles

There are a number of techniques you can use to arrange these circles. You could place the [Circle ID] field on Rows to create a column of circles, or on Columns to create a row of Circles. Or, with a little more math, you could do a combination of these and create a trellis chart, or ‘small multiple’. But personally, I like to use ‘offset values’ which place everything on the same pane and give you total control of the placement of each object.

First, let’s place all of these circles in a single row. To do this, we’ll create a numeric parameter and set the value to 10. This is going to be used to set the spacing between each circle. To increase the spacing, set the parameter higher. To decrease the spacing, set the parameter lower.

Now we’ll create our ‘offset value’.

Circle Offset

[Circle ID]*[Circle Offset Parameter]

And next, we’ll add that value to our [X] calculation

X

[Radius]* SIN(2*PI() * [Position]) + [Circle Offset]

And here is the result

This works because it moves every point in each circle an equal amount. For Circle 1, the [Circle Offset] value will be 10 (Circle ID is 1 x Circle Offset Parameter is 10 = 10). Adding that to the [X] value that was previously calculated will move every point in that circle 10 to the right. For Circle 2, the [Circle Offset] value will be 20 (Circle ID is 2 x Circle Offset Parameter is 10 = 20), which will move every point in that circle 20 to the right.

If we add the [Circle Offset] value to the [Y] calculation instead of [X], the result will be a single column of circles. And if we add the [Circle Offset] field to both the [X] and the [Y] values, the result will be a diagonal line.

So those are some easy ways to plot your circles in a line…but this is a post about circles. So…let’s plot them in a circle. And to do this we’re going to use the same techniques we used to draw the circles.

First, let’s create one more parameter that will serve as the radius of our new circle. We’ll call it ‘Base Circle Radius’ and set the value to 30

That parameter will be one of our two inputs for plotting points around a circle. The second input is going to be the position. Previously, we had 50 points that we wanted to plot evenly around a circle. Now, we have 10 points (10 circles). Previously, we had values 1 thru 50 (the Points field). Now, we have values 1 thru 10 (the Circle ID field). So using the same technique we used earlier, we’ll calculate the position of each circle around our ‘Base’ circle.

Base Circle Position

[Circle ID]/{MAX([Circle ID])}

Now, we have our two inputs and we can use the same calculations to translate those into X and Y values

Base Circle X

[Base Circle Radius]* SIN(2*PI() * [Base Circle Position])

Base Circle Y

[Base Circle Radius]* COS(2*PI() * [Base Circle Position])

Now all you need to do is add these values to your [X] and [Y] calculations respectively, similar to what we did in the previous section

X

[Radius]* SIN(2*PI() * [Position]) + [Base Circle X]

Y

[Radius]* COS(2*PI() * [Position]) + [Base Circle Y]

And here is the result, a circle of circles

Now this is just one simple example of what’s possible once you know how to plot points around a circle. I use these same few calculations, with some modifications, in a ton of my Tableau Public visualizations. Here are a few examples that use this technique.

Thank you so much for reading the first of many posts on Do Mo(o)re With Data and keep an eye out for the second part of this series.

1 Here’s an article about a few studies demonstrating human’s innate affinity for curves. Link