Tableau calculations — An intro

towards-data-science

This post was originally published by Kevin Macias-Matsuura at Towards Data Science

Image for post
Photo by Oleg Magni from Pexels

This blog marks the third entry in my ongoing “Teaching Tableau” blog. In our previous installments I showed how to create a basic dashboard from start to finish and how to work with filters. This week’s tutorial will go over “Calculations”. What is a calculation? Seems simple but Tableau has three designations for what a calculation is. Calculations are written either as part of the query made on the data source or after the query. This will make more sense in a second. There are Basic Calculations, Level Of Detail Expressions (LOD), and Table Calculations. Basic calculations can be aggregate calculations like a sum or average that work on the whole dataset or can also work on the data source by calculating a single row. Basic calculations are made as part of the query. LOD Expressions are also made as part of the Tableau query. Tableau states that they allow us to “compute aggregations that are not at the level of detail of the view”. LOD expressions can work at a more detailed level through the use of the EXCLUDE, INCLUDE, and FIXED functions. Table calculations are written after the query meaning they operate on the values calculated as the results of the query.

Although we definitely have the ability to write long and complex calculations sometimes simple is sufficient. If we click on a pill and select “Quick Table Calculation” there are ready made table calculations for us to choose from.

Image for post

This batman tool belt of data analysis should always be your first choice. If these do not quite suffice then we need to break out the brain power and make a more detailed calculation. This field may not show up because the pill may only be able to use a basic calculation. Remember that basic calculations compute on the data source while table calculations are computed locally from the results of a query. When we have a visualization with various fields we refer to this as a “Field Set”. For example you may have various categories that can be broken down. We can take pianos, guitars, and basses as our “categories”. These categories can then be broken down into “segments” like acoustic and electric which contain values like total sales which we refer to as “quantities”. Table calculations can be applied but we need to keep in mind that as we add and remove fields the table calculations will also adjust accordingly.

How do we make a calculation? First off click on the dropdown menu shown below and select “Create Calculated Field”.

Image for post

This will bring you to the calculation screen. Give your calculation a name in the top field. For this example we want to see all comments with more than 100 words. For a calculation like this, the basic calculation function IF will work just fine. Its syntax goes as follows: IF <expr> THEN <then> [ELSEIF <expr2> THEN <then2> . . .] [ELSE <else>] END. Using this syntax, in the calculation field we will type: IF [Num Words] > 100 THEN “> 100” ELSE “< 100” END as shown in the screenshot below.

Image for post

Tableau will automatically fill in suggestions as soon as you type so if you forget your options that can be a great reminder. If you really have no idea what options are available you can always hit the triangle at the right of the calculations field where another window with all the aggregate functions will appear for you to browse. Taking some time to browse the options is a great way to familiarize yourself with the tools available to you as there are wonderful summaries that go along with each function. Now that we have our calculation we can hit OK which will add a pill to the data pane on the left with our title. Now all we have to do is add the appropriate pills to the row and column sections and build our calculated graph.

Image for post

Adding our calculated pill split our graph into one graph with all comments with less than 100 words. Another quick way to make calculations is to double click the field for either the columns or rows section which will allow you to write in your calculation directly. From here you can even drag pills from the data pane into the field so that you don’t have to type them out.

Let’s take a look at an example of a LOD expression. For this example I will be using an e-commerce dataset from Kaggle.com. You can find it here: link Let’s say we want the average sales for each day of the month. Our graph would look something like this:

Image for post

If we take a closer look at the data we can see that for each session a user might buy multiple instances of the same item or a few different items altogether. What if we wanted to then find the average sales per session to see how much an average user spends on a given day. For this we can perform an LOD expression. Firstly we pull up the table calculations menu. In the input field we can write in the following expression {INCLUDE [User Session]: SUM([Price])}

Image for post

This will give us a new pill which we can move to the rows section and build our graph. Now all we need to do is change the aggregation function to AVG from the pill’s dropdown menu under Measures. We can see the tables compared below.

Image for post

The syntax begins and ends with curly braces. LOD expressions can use INCLUDE, EXCLUDE, and FIXED. We used include here because we want the calculation to take into account the day dimensions in the view. The difference between FIXED and EXCLUDE is that FIXED is calculated before the dimension filters while EXCLUDE is calculated after the dimension filters. They do not take into account the dimensions in the view in the same way that INCLUDE does.

I hope this basic overview of table calculations whet your appetite for some number crunching. These calculations can get pretty complex so taking time to practice is definitely warranted. I also want to give some credit to Pris Lam’s blog for some of the information on LOD expressions. Check out some of his work here: link.

Spread the word

This post was originally published by Kevin Macias-Matsuura at Towards Data Science

Related posts