Categories
How-To's

Fun With Curves in Tableau Part 3: Sigmoid Curves

This is the third and final part of our series on creating curved elements in Tableau. Although this post covers new and different techniques, I would recommend checking out Part 1 of the series here as some of the concepts overlap. This post will focus on sigmoid curves and a few different techniques to build them, depending on what you’re trying to do and what your data looks like. To follow along, you can download the sample data and workbook here.

Sigmoid Curves

This is a sigmoid curve. Whether you realize it or not, these curves are everywhere on Tableau Public. Sankeys? Sigmoid Curves. Curvy lines on a map? Sigmoid Curves. Curvy bump charts, or curvy slope charts, or curvy dendograms, or curvy area charts? All Sigmoid Curves. The main difference between this type of curve and the Bezier curves we discussed in Part 2 of this series, is that only 2 points are needed to draw a Sigmoid Curve. If you have the start point, the end point, and the right formulas, the math will take over and create that nice symmetric s-shaped curve for you. So what formulas should you use? The answer is, as with most things in Tableau, it depends. We’re going to cover two different models for drawing Sigmoid curves, we’ll call them the ‘Standard’ model, and the ‘Dynamic’ model.

Think about most of the chart types that use sigmoid Curves. Sankeys, Curvy slope or bump charts, dendograms…they all have something in common. The start of the lines and the end of the lines are uniform. If they are running from left to right, the start of all of the lines share an X value, and the end of all of the lines share an X value (columns). If they are running from top to bottom, the start of all of the lines share a Y value, and the end of all of the lines share a Y value (rows). When these conditions are true, which they will be in 99% of the applications in Tableau, you can use the ‘Standard’ model. If either of those conditions are not true, you can use the ‘Dynamic’ model.

Since the majority of the time you will be using the ‘Standard’ model, let’s start with that one.

Building Your Data Source

To build our data source we are going to follow the same process that we did in Part 1 and Part 2 of this series. We are going to create additional points by joining our sample data to a densification table using join calculations (value of 1 on each side of the join). For this first example, we are going to use the sheet titled ‘Slope Chart’ for our sample data, and the sheet titled ‘SigmoidModel’ for our densification table. You can download the sample files here.

Here is our sample file that we will use to draw 12 curved lines, from left to right.

You may notice that the densification table looks a little different. In previous parts of this series, the densification table was a single column with numbers 1 thru however many points you wanted to create. For this model, we have 2 columns, ‘t’ and ‘Path’. The Path value will be used to tell Tableau how to connect the points (think connect the dots). The ‘t’ value is going to be used to draw the curve and will be a value evenly spaced between -6 and 6. For our example, we are using 24 points, so we will have a value at every .5 increment (-6, -5.5, -5, …). If you want your curve to be a little smoother, you could use 48 points and have a value at every .25 increment (-6, -5.75, -5.5, …). Really, you can use any number of points, you’ll just need to do the math to get t values that are evenly spaced between -6 and 6. Why -6 and 6? I have absolutely no idea and I’m not going to pretend to. I’m just here to show you how to make cool curvy things, and to do this cool curvy thing, you need values evenly spaced between -6 and 6. Here is our densification table

Building Your Calculations

Once we join those files together, there are only two calculations needed to turn the points from our sample data into curved lines.

Sigmoid

1/(1+EXP(1)^-[T])

Curve

[Start Position] + (([End Position] – [Start Position]) * [Sigmoid])

Essentially what these calculations are doing are determining the total vertical distance that needs to be travelled (End Position – Start Position) and then the Sigmoid function spaces these points appropriately on the Y axis to create the S-shaped curve. Let’s build our view and then come back to this.

Build Your Curves

Follow the steps below to build your curves

  • Right click on the [T] field, drag it to columns, and when prompted, choose the top option ‘T’ without aggregation
  • Right click on the [Curve] field, drag it to rows, and when prompted, choose the top option ‘Curve’ without aggregation
  • Change [LineID] to a Dimension, and drag it to color
  • Change the Mark Type to ‘Line’
  • For this example, you may not need to address the ‘Path’, but I find it’s good practice when dealing with lines and polygons.
    • Right click on the [Path] field, drag it to Path, and when prompted, choose the top option ‘Path’ without aggregation

When you finish, your worksheet should look like this

With just two calculations, we were able to draw those nice curved lines. Let’s quickly re-visit those calcs. Our Sigmoid calculation is a mathematical function with values ranging between 0 and 1 depending on the ‘T’ value. The table below shows the Sigmoid Value for each T value and the second row shows the difference from the previous value. Notice that at the beginning and the end, the values don’t change much from one step to the next, but towards the middle the values change much more rapidly. And where T=0, the value is .5.

Now let’s look at that in context with our lines and with our Curve calculation. Notice that the vertical position for each of the points on the line doesn’t change much at the beginning or the end, but does change rapidly as it approaches the center. And the center of each line is exactly half way between the starting position and the ending position.

For example, Line 4 starts at position 4 and ends at position 10. At T=0, the vertical position is at 7, halfway between 4 and 10. Let’s plug this line into our calculation. Remember, from the table above, the value of Sigmoid at T=0 is .5

Curve = [Start Position] + (([End Position] – [Start Position]) * [Sigmoid])

or

Curve = 4 + ((10 – 4) * .5) = 7

And let’s do this with one other point, just for good measure. If we look at the table above, we see that the Sigmoid value at T=-3 is .047

Curve = 4 + ((10 – 4) * .047) = 4.28

If you look at the image above, you’ll notice that at T=-3, the vertical position is just above the 4, or 4.28 to be exact. That’s enough math for now, let’s look at some other fun examples.

More Examples – Dendogram

Now let’s follow the same exact process for the curvy slope chart above, but change up our data a little bit. For this example we are going to use the sheet titled ‘Dendogram’ for our data, and we’ll use the same ‘SigmoidModel’ sheet for our densification. Here is our sample data.

In this example, our starting position is the same for all of our lines and is halfway between the minimum end position (1) and the maximum end position (12). If you follow the same process your sheet should look something like this.

Now just for fun, let’s make that starting point dynamic. Create a numeric parameter called [Dynamic_Start] and then create a new field called [Dynamic_Curve]. This will be exactly the same as our [Curve] calculation but we’ll replace the [Start_Position] field with [Dynamic_Start].

Dynamic Curve

[Dynamic Start] + (([End Position] – [Dynamic Start]) * [Sigmoid])

Now in your view, just replace the [Curve] field on Rows with the [Dynamic Curve] field and you should have something like this.

More Examples – Sankey

This is not going to be an in-depth tutorial on how to create Sankey diagrams. People much smarter than me (check out flerlagetwins.com) have written many posts and shared many templates for anyone interested in building one. This section will be dedicated more to the mechanics of a Sankey diagram and how we can take what we’ve already learned in the previous examples and apply them towards building a Sankey diagram. The one main difference between what we’ve done so far and what we’re going to do now lies in the Mark Type. It’s time for some polygon fun.

Essentially what we need to do is draw 2 lines for every 1 ‘Sankey Line’ and then connect them together. Typically, you would use table calculations in Tableau to figure out the positions of these lines, but we are going to keep it simple and use some data from our sample file. What we’re actually building is probably more of a Slope Chart, since each segment will be of equal width, but the fundamentals are the same as with a true Sankey Diagram. For this example we are going to use the sheet titled ‘Sankey’ and for our densification we’re going to use the sheet titled ‘SankeyModel’. Here is our sample data

You may notice that this densification table has some additional data. It has twice as many records and a new column called ‘Side’ with values ‘Min’ and ‘Max’. This is because for each segment, or ‘Sankey Line’ we will actually be drawing two lines, a Min Line and a Max Line. Here’s out densification data.

Building Your Calculations

We are still going to use the same 2 fields, Sigmoid and Curve, but in this case, the calculation for Curve will be a little different.

Sigmoid

1/(1+EXP(1)^-[T])

We have the two points needed for one of our lines (Start_Position and End_Position), we’ll call this the Min Line. But we need the two points for our Max Line as well. As I mentioned earlier, in this example all of our segments are going to be equal width, so we’ll just add 1 to our position fields.

Start_Position_Max

[Start_Position]+1

End_Position_Max

[End_Position]+1

And now we have the points needed to draw all of our curves.

Curve

IF [Side]=’Min’ then [Start_Position] + (([End_Position] – [Start_Position]) * [Sigmoid])
ELSE [Start_Position_Max] + (([End_Position_Max] – [Start_Position_Max]) * [Sigmoid]) END

This calculation may look complicated, but it’s almost identical to the Curve calculation we were using earlier. When we joined to the SankeyModel, it essentially created two sets of points for us. We have all of the same T values as we did earlier (24 points ranging from -6 thru 6), but now we have 2 of each, one where [Side] = ‘Min’ and one where [Side] = ‘Max’. So we can use this to draw two lines. The first part of the IF statement is for the ‘Min’ line. It is exactly the same as our previous Curve calculation. The second part of the IF statement (ELSE) is for the ‘Max’ line and it is the same calculation but using the [Start_Position_Max] and [End_Position_Max] fields.

Build Your Curves

To start, let’s follow the same process we did earlier, but with a couple of minor changes

  • Right click on the [T] field, drag it to columns, and when prompted, choose the top option ‘T’ without aggregation
  • Right click on the [Curve] field, drag it to rows, and when prompted, choose the top option ‘Curve’ without aggregation
  • Change [LineID] to a Dimension, and drag it to color
  • Change the Mark Type to ‘Line’
  • Drag [Side] to Detail

Your sheet should look something like this

This is starting to look like a Sankey diagram. For each LineID we have two lines that are moving together, we just need to ‘color them in’, and we’ll do that by converting them to polygons. So change the Mark Type to polygon.

Uh-oh. What happened?

So because we have [Side] on detail, it’s treating our Min and Max lines as separate polygons. We want them treated as 1 polygon, so surely removing the [Side] field will fix it.

$@!%!!!!

Alright, we can figure this out. Remember how earlier I said it was always good practice to address the Path on the Marks Card when using lines and polygons? Let’s try right clicking on our [Path] field, dragging that to Path and adding it without any aggregation.

Ahhh, that’s better.

Now let’s figure out what happened. We know why we had to remove the [Side] field, but how did the [Path] field fix our worksheet. If you look at the densification table, you’ll see that the T values go from -6 up to 6 and then repeat in the reverse order. The Path field on the other hand is sequential from 0 thru 49. Let’s take a look at the [Path] values in relation to the [T] values

So our [T] and [Curve] fields are telling Tableau where to plot all of the points, and then our [Path] field is telling Tableau the order in which to connect those points to form our polygon. And that is basically all you need to know about using Polygons in Tableau. Just Points and Path.

Let’s take a look at one more example of something you can do with the ‘Standard’ Model.

More Examples – Proportion Plot

The process for building a Proportion Plot is going to be very similar to what we did in the previous example. For this, we are going to use the sheet titled ‘Proportion’ for our sample data, and we’ll use the ‘SankeyModel’ sheet again for our densification. Here is our sample data.

You’ll notice that this table has a couple of extra columns. The ‘Run’ columns are just running totals of the Start_Value and End_Value columns. Again, this would typically be done with table calculations in Tableau, but to keep things simple we are doing those calculations in the data source. Now let’s build our calculations.

Building Your Calculations

Once again we are going to use those same two calculations, but with some minor modifications. First, our Sigmoid calculation, which is always the same.

Sigmoid

1/(1+EXP(1)^-[T])

Now, similar to the Sankey example, we need to draw two sets of line for each section. So we’ll need the start and end positions for our Min Line and the start and end positions for our Max Line. Because this is a Proportion Plot (part of a whole), we want all of our values represented as percentages. So we’ll calculate what our position should be and then divide that number by the total of the column (or the maximum value in the running total column) to translate that start position into a percentage of the whole.

Start Position Min

([Start_Value_Run]-[Start_Value])/{MAX([Start_Value_Run])}

End Position Min

([End_Value_Run]-[End_Value])/{MAX([End_Value_Run])}

Let’s take a closer look at this before we move on. We’ll use Section #2 from the table above. For the Starting Position (Start Position Min) we are going to subtract the value of that section from the running total to get the starting position.

So section #2 would start at 35 (90-55=35). Then we’ll divide that by 115, which is the total of the column, and the maximum value of the running total column, which gives us 30.4%. So this section of our Proportion Plot is going to start 30.4% of the way up our Y Axis. Now we’ll repeat for the Max Lines. These calculations will be the same except we do not need to subtract the value of the section. We just want the running total divided by the grand total.

Start Position Max

[Start_Value_Run]/{MAX([Start_Value_Run])}

End Position Max

[End_Value_Run]/{MAX([End_Value_Run])}

Using section #2 as an example again, we would divide 90 (the running total value) by 115 (the total of the column or the max value of the running total column) and get 78.3%. So we want this section of our Proportion Plot to start 30.4% of the way up the Y axis and end 78.3% of the way up the Y axis. And now for the Curve calculation, we’ll use the same calculation that we did in the Sankey Example

Curve

IF [Side]=’Min’ then [Start_Position_Min] + (([End_Position_Min] – [Start_Position_Min]) * [Sigmoid])
ELSE [Start_Position_Max] + (([End_Position_Max] – [Start_Position_Max]) * [Sigmoid]) END

Build Your Curves

Now we’ll follow the same process to build our view

  • Right click on the [T] field, drag it to columns, and when prompted, choose the top option ‘T’ without aggregation
  • Right click on the [Curve] field, drag it to rows, and when prompted, choose the top option ‘Curve’ without aggregation
  • Change [Section] to a Dimension, and drag it to color
  • Change the Mark Type to ‘Polygon’
  • Right click on the [Path] field, drag it to Path, and when prompted, choose the top option ‘Path’ without aggregation

When you’re finished, your sheet should look something like this.

If you look at section #2, the orange section, you’ll see that on the left side (the starting positions), it ranges from 30.4% to 78.3%, which is what we had calculated in our example. So it accounts for about 48% of the total (55 / 115=.48) and it starts at about 30% (35 / 115)

That’s it for the ‘Standard Model’. Now we are going to jump into one quick example with the ‘Dynamic Model’.

Dynamic Model

As I mentioned earlier, 99% of the time you are going to want to use the ‘Standard’ model, but there are instances where that may not work. One example use case, which has been popular in recent years, is drawing curved lines on a map that connect to another chart. So let’s use that.

Building Your Data Source

For this example we are going to draw curved lines from 9 U.S. states and have them end in a uniform column on the right side of our sheet, so we can connect them to another chart. For this we’ll use the sheet titled ‘Dynamic’ for our sample data and we’ll use the ‘SigmoidModel’ sheet for our densification. Here is our sample data.

So for each of our states we have the starting points (the lat and lon for the state), but we need to figure out our end points. This can be done in a lot of different ways. You can do it manually and plug those values into your data source. Personally, I like to calculate them in Tableau. But those calcs will be dramatically different depending on your use case. For this example, we are going to draw our lines from left to right and they will all end in a uniform column to the right of the U.S. and will be spaced evenly between the lowest state and highest state in our sample data. If your use case is different, you can most likely modify these calculations to get what you need, and as always, feel free to reach out to me if you have any questions. So let’s calculate our end points. Keep in mind that this step is completely optional. If you prefer to have static points in your data source for the end points, just name them End_Lat and End_Lon and skip ahead to the next calculation.

Build Your Calculations

This first calculation will be used to find the latitude for our end points. The calculation looks complicated but we will break it down.

End_Lat

{MAX([Lat])} – (({MAX([Lat])}-{MIN([Lat])})/{COUNTD([State])-1}) * ([Rank]-1)

The first portion of this calculation, {MAX([Lat])}, is used to find our ‘reference point’, which will be the maximum latitude in our sample data. The next portion of the calculation, ({MAX([Lat])}-{MIN([Lat])}) gets our ‘range’, or the maximum latitude minus the minimum latitude. Then we divide that range by the number of states -1 to get the spacing between points, or {COUNTD([State])-1}. And then we multiply that spacing by the Rank -1 (so that the first point starts at our max Lat).

So an easier version of this formula is Ref Point – ((Range/Number of States -1) *( Rank -1)

Now we need to calculate the longitude for our end points. This is much easier. We want all of our lines to end in the same column, so they will all have the same longitude. I like to use parameters rather than hardcoding values, so you can easily adjust the position of things. So let’s create a parameter called [Line End Offset] and set that value to 10. Then our calculation is just the maximum longitude value in our data + our offset parameter

End_Lon

{MAX([Lon])} + [Line End Offset]

So now we have the two sets of points that we’ll need to draw our lines. If we were to stop here and just map those two sets of points, we would have a point on the center of each of our states (blue) and then a column of points to the right of the U.S. (orange).

Now for the rest of our calculations. Once again we’ll need the Sigmoid calculation.

Sigmoid

1/(1+EXP(1)^-[T])

The Curve calculation is exactly the same as all of the previous examples just with different field names.

Curve

[Lat] + (([End Lat] – [Lat]) * [Sigmoid])

The main difference between the ‘Dynamic’ model and the ‘Standard’ model lies in the spacing of the points. In the Dynamic model, points are spaced evenly between -6 and 6 and we can just use the [T] field on columns. That obviously won’t work on a map. We have different longitudes for all of our starting points, so we’ll need to adjust the approach a little bit. Instead of evenly spacing points between -6 and 6, we need to evenly space them between our starting and ending longitudes. So we’ll take the total horizontal distance (ending longitude – starting longitude) and divide it by the number of points in our densification table -1.

Point Spacing

([End Lon]-[Lon])/({COUNTD([Path])-1})

Now we’ll use that calculation, along with the [Path] field and our starting longitude to create points that are equally distant between our starting and ending longitude.

Lon_Adjusted

[Lon] + [Path]*[Point Spacing]

Build Your Curves

Now that we have all of our calculations, let’s build our curve.

  • Right Click on [Lon_Adjusted], go to ‘Geographic Role’ and select ‘Longitude’
  • Right Click on [Curve], go to ‘Geographic Role’ and select ‘Latitude’
  • Right click on the [Lon_Adjusted] field, drag it to columns, and when prompted, choose the top option ‘Lon_Adjusted’ without aggregation
  • Right click on the [Curve] field, drag it to rows, and when prompted, choose the top option ‘Curve’ without aggregation
  • Drag [State] to Color
  • Change the Mark Type to ‘Line’
  • Right click on the [Path] field, drag it to Path, and when prompted, choose the top option ‘Path’ without aggregation

When you’re done, your sheet should look something like this.

Now with just a few modifications to the ‘Standard’ model, we were able to build our Sigmoid curves with different starting points. This example was for building lines that run left to right, but you can just as easily build them from top to bottom. The calculations and example for that can be found in the sample workbook here. Just use the fields with the ‘Ttb’ suffix.

If you’re still reading this, I apologize for the incredibly long post. There are just so many things you can do with Sigmoid Curves and I wanted to demonstrate how, with basically the same technique, you can build a huge variety of cool curvy things in Tableau. If you’re interested in learning more about Sigmoid Curves, definitely check out flerlagetwins.com. Ken and Kevin are incredible and everything I know about Sigmoid Curves (and a lot of other things that will eventually be covered in this blog), I learned from them.

Here are a few examples of where I’ve used Sigmoid Curves in my Tableau Public Work (click thumbnail to view on Tableau Public).

Standard Model

Dynamic Model

As always, thank you so much for reading!

Categories
How-To's

Fun With Curves in Tableau Part 2: Bezier Curves

This is part two of a three-part series on creating curved elements in Tableau. Although this post covers new and different techniques, I would recommend checking out part one of the series here as some of the concepts overlap. This post will focus on one type of curved line that is used frequently in Tableau Public visualizations; Bezier Curves. To follow along, you can download the sample data and workbook here.

Bezier Curves

There are many types of Bezier curves varying in complexity from very simple to ridiculously complicated. One commonality with these types of curves is that they rely on ‘control points’. This post is going to focus on quadratic Bezier curves, which have 3 control points. An easy way to think about these points is that there is a starting point, a mid point, and an end point, creating a triangle. The starting point and end point are simply the start and end of the line. The other point, the mid point, will determine the shape of the triangle, and in turn, what that curve is going to look like. Now let’s see how the position of that mid point (creating different types of triangles) will affect the curve.

Each of the triangles above have the same starting point (1,0) and the same end point (10,0), but have significantly different curves because of the varying mid point. For most applications in Tableau we’re going to be dealing with examples like Example 1 and Example 4 in the image above, where the mid point is halfway between the other points, creating an isosceles triangle. To make things even easier for this example, we’re going to deal with just Example 1, which is an equilateral triangle, meaning all 3 sides are the same length.

Building Your Data Source

To build our data source we are going to follow the same process that we did in Part 1 of this series. We are going to create additional points by joining our sample data to a densification table using join calculations (value of 1 on each side of the join). In this case, our sample data is called Bezier_SampleData and our densification data is called BezierModel. You can download the sample data here. In our sample data we have 10 records that we’ll use to draw 10 unique curves. For simplicity sake, all of these curves will start and end at 0 on the Y axis.

Building Your Calculations

To draw our curves there are a few things we’ll need to calculate. Our sample data has 2 of the 3 points we need (starting point and end point), so we’ll need to calculate the X and Y values for the 3rd point (mid point). Let’s start there.

We discussed above that we are going to use equilateral triangles to draw our curves. In that case, the X value for the mid-point will be halfway between [X_Start] and [X_End] and the Y value will be the height of the triangle plus the starting point. Here is an example

To find the X value that falls in the middle of [X_Start] and [X_End], just add them together and divide by 2

X_Mid

([X_Start]+[X_End)/2

To find the Y Value add the Y starting point to the height of the triangle. To find the height of the triangle, use the formula (h=a*3/2) where a is the length of one of the sides of your equilateral triangle. To find that length subtract [X_Start] from [X_End]

Y_Mid

[Y_Start] + (([X End]-[X Start])*SQRT(3)/2)

Now, we have 3 points for each record in our data set. We have our starting point ([X_Start],[Y_Start]), our end point ([X_End],[Y_End]), and our mid point ([X_Mid],[Y_Mid]). If we were to stop here and plot those points, it would look something like this, 10 equilateral triangles of different sizes, all on the Y axis (because we had used 0 for all of the [Y_Start] and [Y_End] values)

Now let’s convert those points into Bezier curves. The first calculation we’ll need is [T]. T is going to be a percentage value that is equally spaced between 0 and 1 for the number of points in our densification table. Think of this as similar to the [Position] calc in Part 1 of this series, but slightly different because we need our first point to start at 0.

T

([Points]-1)/{MAX([Points])-1}

No matter how many points you add to your densification table, this calculation will spread them evenly between 0 and 1. This field is used to evenly space our points along our curved line. When complete, your T values should look like this. The value for point 1 should be 0%, the value for the last point should 100% and all of the points in the middle should be equally spaced between those

Now all that is left is to calculate the X and Y coordinates for each point along our curved lines. The calculations for X and Y are exactly the same, but in the [Bezier_X] calc you are using the 3 X values, and in the [Bezier_Y] calc you are using the 3 Y values

Bezier_X

((1-[T])^2*[X_Start] + 2*(1-[T])*[T]*[X_Mid]+[T]^2*[X_End])

Bezier_Y

((1-[T])^2*[Y_Start] + 2*(1-[T])*[T]*[Y_Mid]+[T]^2*[Y_End])

Now let’s build the curves in Tableau

Build Your Curves

Follow the steps below to build your curves

  • Right click on the [Bezier_X] field, drag it to columns, and when prompted, choose the top option ‘Bezier_X’ without aggregation
  • Right click on the [Bezier_Y] field, drag it to rows, and when prompted, choose the top option ‘Bezier_Y’ without aggregation
  • Change the Mark Type to ‘Line’
  • Right click on the [Points] field, drag it to Path, and when prompted, choose the top option ‘Points’ without aggregation
    • This tells Tableau what order to ‘connect the dots’ in.
  • Drag [Line Name] to color

When you finish, your worksheet should look something like this

Or you can change the Mark Type to ‘Polygon’ and reduce the Opacity and it will look like this

Now this is a relatively simple example. You can get really creative with how you calculate those 3 points, especially the ‘mid’ point. Let’s do one more example, combining the work we’ve done in this exercise with what we had done in Part 1 of the series. The result should look pretty familiar.

First, let’s take our X values and plot them in a circle, instead of on a straight line. From Part 1 of this series, we know that in order to plot points around a circle, we need 2 inputs for each point; the Radius (the distance from the center of the circle), and the Position (the position around the circle expressed as a percentage). Since we want all of our points to be equally distant from the center, we can use a single value for the radius of all points. Let’s create a parameter called [Radius] and set the value to 10. For Position, we’ll need to calculate the position for each start and end point. For the position calculations we’ll need to first find the maximum number of points around our circle, or in this case, the max value of the [X_Start] and [X_End] fields together. Looking at our data, we can see the maximum value of the [X_Start] field is 12 and the maximum value of the [X_End] field is 15. So our Max Point will be 15

Max_Point

{MAX(MAX([X_Start],[X_End]))}

Next, we’ll use that value to calculate the position around the circle for each start and end point

Position_Start

[X_Start]/[Max_Point]

Position_End

[X_End]/[Max_Point]

Now, we can calculate the X and Y coordinates for the starting point and end point of every line, using the same calculations we used in Part 1 of the series. Let’s begin with the starting points

Circle_X_Start

[Radius]* SIN(2*PI() * [Position_Start])

Circle_Y_Start

[Radius]* COS(2*PI() * [Position_Start])

And now we’ll calculate the X and Y coordinates for the end points using the same formulas but swapping out the [Position_Start] field, with the [Position_End] field.

Circle_X_End

[Radius]* SIN(2*PI() * [Position_End])

Circle_Y_End

[Radius]* COS(2*PI() * [Position_End])

Now for each of our lines, we have two sets of coordinates. We have the coordinates for the start of our line ([Circle_X_Start],[Circle_Y_Start]) and the coordinates for the end of our line ([Circle_X_End],[Circle_Y_End]). All we need now is that 3rd set of coordinates, the mid-point. The good news is, when plotting these around a circle, we have a very convenient mid-point…the middle of the circle. In this case, because our circle is starting at (0,0), we can use those values as our 3rd set of points. Let’s take our 3 sets of coordinates and plug them into the Bezier calculations we used earlier

Circle_Bezier_X

((1-[T])^2*[Circle_X_Start] + 2*(1-[T])*[T]*0+[T]^2*[Circle_X_End])

Circle_Bezier_Y

((1-[T])^2*[Circle_Y_Start] + 2*(1-[T])*[T]*0+[T]^2*[Circle_Y_End])

Now in our view if we replace the [Bezier_X] and [Bezier_Y] fields with the [Circle_Bezier_X] and [Circle_Bezier_Y] fields, we get something like this…the foundation of a chord chart

This chart uses the same exact calculations for the curved lines, we just used some additional logic to calculate the coordinates for the start and end points. In our first example, for Line 2, we drew a curved line between 3 and 15 on the X axis. The coordinates for our start and end point were (3,0) and (15,0) respectively. Then we calculated the coordinates for our mid point, which ended up being (9,10.4)

In this last example, we also drew a curved line between 3 and 15, but instead of those points being on the same Y axis, they were positioned around a circle. We used what we learned in Part 1 of this series to translate those X values (3 and 15) into coordinates around a circle. So the coordinates for our start and end positions ended up being (9.5,3.1) and (0,10) respectively. And instead of calculating our mid-point, we used the center of the circle (0,0).

These are just a few basic examples of what you can do with Bezier curves, but there are so many possibilities. Here are a few examples of where I’ve used Bezier Curves in my Tableau Public Profile.

Thank you so much for reading, and keep an eye out for the third and final part of this Series, focusing on Sigmoid curves.

Categories
How-To's

Fun With Curves in Tableau Part 1: Circles

In recent years there have been multiple scientific studies1 designed to confirm what many of us in the Data Visualization Community have already suspected; when it comes to art, people are drawn to curves. Think about some of your favorite pieces of Data Art. I am willing to bet that the majority of them contain some type of curved element. Not only are curves more aesthetically pleasing than straight lines and sharp corners, but they have that ‘WOW’ factor, because as we all know, curved lines do not exist in Tableau. They take effort, and when it comes to drawing curves, most people don’t know where to start. But you don’t need to be an expert in Tableau to create beautiful radial charts, or to add some impressive curves to your dashboards. You just need to know the math, you need to know how to structure your data, and you need to know how to bring those elements together in Tableau. That’s the goal of this series. To hopefully demystify some of this work and make it more approachable, and to provide some examples. This series will focus on three types of curved elements; Circles, Bezier Curves, and Sigmoid Curves.

Drawing Circles in Tableau

Tableau does have a ‘Circle’ mark type that can be used, but being able to draw your own circles opens up a world of possibilities. The calculations that are used for drawing a circle, are the same calculations that can be used to create any type of radial chart you can imagine. For this post, I am going to keep the math as simple as possible, but if you’re interested in diving deeper I would recommend checking out this post by Ken Flerlage.

Once you have your data structured there are really only 2 inputs needed to create your radial; the distance of each point from the center of the circle (radius), and the position of each point around the circle. We’ll discuss these inputs a lot more in this post, but let’s start with our data structure. To follow along with this post, you can download the sample data and workbook here.

Building Your Data Source

To create any type of curved element in Tableau, you’ll need to start by densifying your data. To create your densification table, create a table in Excel with 1 column, in the first cell name that column ‘Points’, and then add rows with numbers 1 through however many points you wish to create. There are a few things to consider when choosing that number. Choosing a number that is too high may affect performance as that many rows will be added to your data source for each record in your ‘Core’ data set (a data source with 1000 rows will turn into 100,000 if you choose 100 points). Choosing a number that is too low will result in visible straight lines and corners instead of a smooth curve. Here is an example to help visualize how the number of points can affect the shape

I will usually choose somewhere between 50 and 100 points, closer to 50 when the circles will be small, and closer to 100 when the circles will be large. For this post, we’ll go with 50 points, so your densification table should look something like this.

And here is some sample data we’ll use as our ‘Core’ data source. We have 10 records that we’ll use to create 10 distinct circles, and we’ll use the ‘Value’ column to size the circles appropriately. I use this technique frequently, instead of the ‘Circle’ mark type in Tableau, to ensure that the size (area) of each circle accurately represents the underlying value. When designing my ‘Core’ data source, I like to use a sequential ‘ID’ field, starting from 1. This can help make some of the calculations easier, but if it’s not an option, you can typically replace that ‘ID’ field with the INDEX function in Tableau.

Now that we have our ‘Core’ data set, and our densification table, let’s bring these together in Tableau. To densify our data we’ll need to join these two tables in the physical layer in Tableau.

  • First, connect to your ‘Core’ data. In this example, that table is called ‘CircleData’
  • Drag your ‘CircleData’ table onto the Data Source pane
  • Double-click on the ‘CircleData’ Logical Table to view the Physical Tables
  • Drag your ‘Densification’ data onto the Data Source pane. In this example, this table is called ‘CircleDensification’
  • Join the tables with a Join Calculations
    • On the left side of the join, click on the drop-down and select ‘Create Join Calculation’
    • In the calculation box enter the number 1
    • Repeat the steps above on the right side of the join
  • When complete, your data source should look like this

Building Your Calculations

Now that we have our data source, the next step will be to build our calculations. As I mentioned earlier, there are 2 inputs that we’ll need to draw our circles; the distance of each point from the center of the circle, and the position of each point around the circle. Let’s start with the distance calculation.

In this example, we want the area of the circle to represent the value in our data. The distance from the center of a circle to the outside of the circle is known as the radius, and we can calculate that with the formula below. This will serve as our first input.

r = √A/π or in plain English Radius = Square Root of Area/Pi

In our data we have the Area (Value) so we can calculate the radius for each of circles using the calculation below in Tableau. Create a new calculated field called ‘Radius‘ and copy the formula below

Radius

SQRT([Value]/PI())

For the next input, we’ll need to calculate the position of each point around the circle. For each circle, we have 50 points (the number of rows in our densification table) and we’ll want to evenly distribute those points around the circle. The resulting number will be a percentage and will represent how far around the circle that point appears. For example, imagine you were looking at a clock. 3 o’clock would be 25%, 6 o’clock would be 50%, 9 o’clock would be 75% and 12 o’clock would be 100%

We can calculate that percentage for each point by taking the value of the Points field divided by the Max Value of that field (which would represent the number of points or number of rows in our densification table). So the Position calculation would be as follows in Tableau

Position

[Points]/{MAX([Points])}

Note* if you plan on using the ‘Line’ mark type to draw your circles instead of the ‘Polygon’ mark type, you should modify this calculation by subtracting 1 from the divisor, [Points]/{MAX([Points])-1}. This is because polygons will automatically connect your first and last point. For lines, we need to force that connection.

Now your data should look something like this. For each record in your ‘Core’ data set, you have 50 points, with position values equally spread between 0-100%

Now we have our 2 inputs and all that is left to do is to translate these inputs into X and Y coordinates, which can be done with two simple calculations

X

[Radius]* SIN(2*PI() * [Position])

Y

[Radius]* COS(2*PI() * [Position])

Build your Circles

Follow the steps below to build your circles

  • Right click on the [X] field, drag it to columns, and when prompted, choose the top option ‘X’ without aggregation
  • Right click on the [Y] field, drag it to rows, and when prompted, choose the top option ‘Y’ without aggregation
  • Change the Mark Type to ‘Polygon’
  • Right click on the [Points] field, drag it to Path, and when prompted, choose the top option ‘Points’ without aggregation
    • This tells Tableau what order to ‘connect the dots’ in.
  • Drag [Circle Name] to color

When you finish, your worksheet should look something like this

Right away, you’ll probably notice a few things about this. First, these look like ovals, not circles. And second, there are only 2 circles when there should be 10. The reason they look like ovals is because the worksheet is wider than it is tall. It’s important when you place a radial chart on a dashboard that you set the width and height equal and that you Fix both the X and Y axis to the same range. The reason there are only two circles visible is because all 10 circles have the same starting position (0,0), so they are currently stacked on top of each other.

Arrange Your Circles

There are a number of techniques you can use to arrange these circles. You could place the [Circle ID] field on Rows to create a column of circles, or on Columns to create a row of Circles. Or, with a little more math, you could do a combination of these and create a trellis chart, or ‘small multiple’. But personally, I like to use ‘offset values’ which place everything on the same pane and give you total control of the placement of each object.

First, let’s place all of these circles in a single row. To do this, we’ll create a numeric parameter and set the value to 10. This is going to be used to set the spacing between each circle. To increase the spacing, set the parameter higher. To decrease the spacing, set the parameter lower.

Now we’ll create our ‘offset value’.

Circle Offset

[Circle ID]*[Circle Offset Parameter]

And next, we’ll add that value to our [X] calculation

X

[Radius]* SIN(2*PI() * [Position]) + [Circle Offset]

And here is the result

This works because it moves every point in each circle an equal amount. For Circle 1, the [Circle Offset] value will be 10 (Circle ID is 1 x Circle Offset Parameter is 10 = 10). Adding that to the [X] value that was previously calculated will move every point in that circle 10 to the right. For Circle 2, the [Circle Offset] value will be 20 (Circle ID is 2 x Circle Offset Parameter is 10 = 20), which will move every point in that circle 20 to the right.

If we add the [Circle Offset] value to the [Y] calculation instead of [X], the result will be a single column of circles. And if we add the [Circle Offset] field to both the [X] and the [Y] values, the result will be a diagonal line.

So those are some easy ways to plot your circles in a line…but this is a post about circles. So…let’s plot them in a circle. And to do this we’re going to use the same techniques we used to draw the circles.

First, let’s create one more parameter that will serve as the radius of our new circle. We’ll call it ‘Base Circle Radius’ and set the value to 30

That parameter will be one of our two inputs for plotting points around a circle. The second input is going to be the position. Previously, we had 50 points that we wanted to plot evenly around a circle. Now, we have 10 points (10 circles). Previously, we had values 1 thru 50 (the Points field). Now, we have values 1 thru 10 (the Circle ID field). So using the same technique we used earlier, we’ll calculate the position of each circle around our ‘Base’ circle.

Base Circle Position

[Circle ID]/{MAX([Circle ID])}

Now, we have our two inputs and we can use the same calculations to translate those into X and Y values

Base Circle X

[Base Circle Radius]* SIN(2*PI() * [Base Circle Position])

Base Circle Y

[Base Circle Radius]* COS(2*PI() * [Base Circle Position])

Now all you need to do is add these values to your [X] and [Y] calculations respectively, similar to what we did in the previous section

X

[Radius]* SIN(2*PI() * [Position]) + [Base Circle X]

Y

[Radius]* COS(2*PI() * [Position]) + [Base Circle Y]

And here is the result, a circle of circles

Now this is just one simple example of what’s possible once you know how to plot points around a circle. I use these same few calculations, with some modifications, in a ton of my Tableau Public visualizations. Here are a few examples that use this technique.

Thank you so much for reading the first of many posts on Do Mo(o)re With Data and keep an eye out for the second part of this series.

1 Here’s an article about a few studies demonstrating human’s innate affinity for curves. Link