Gold Questions Flashcards

1
Q

When I add many fields to my rows shelf, some rows start going vertical. As I add more, the labels start getting concatenated. Can I fix this?

A

You can adjust Tableau from taking these actions by setting default table options. (Analysis  Table Layout  Advanced, and increase the number of row and horizontal row labels.)
ALSO: You can perform some of the following actions:
• Resize the label rows (click-drag) to provide more room
• Choose to “Rotate” the labels (right-click a label)
• Edit the aliases of the label values to make them shorter.
• Format the font for the labels to make them smaller
• Un-Choose “Show Header” for some field labels
• Choose “Hide Field Labels for Rows”
• Don’t place so many rows onto your sheet – it’s not best-practice for visual analysis!!

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

How does unchecking ‘Analysis Aggregate Measures’ differ from converting a measure to a dimension?

A

A dimension returns all distinct values of the field. “Dis-aggregate” means that you want to return each row from the underlying database.
Example: 1,1,1,1,2,3. When set to a dimension there would be 3 marks, as a disaggregate measure, 6 marks.
This becomes very important when using reference lines, trend lines or other secondary calcs. An ‘average’ ref line on the dimension would yield a value of 2, but on the disaggregate data the value would be 1.5.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is an ‘Attribute’ (ATTR)? How does it differ from dimensions and measures?

A

An attribute is an aggregation that can be applied to Dimensions. This enables the dimensions to be removed from the level of detail, but still be displayed. The functionality is useful for excluding a dimension from a table calc or allowing dimensions from secondary data sources.

Returns the value of the dimension if it only has a single value for all rows in the group, otherwise it displays an asterisk (*) character. Null values are ignored.

An attribute is equivalent to the formula:
	IF MAX([Field])=MIN([Field]) THEN  MAX([Field]) ELSE“*” END
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Can I connect to a stored procedure?

A

Sometimes, but this is not often needed. Due to Tableau’s advanced calculation, filtering and security capabilities, most customers find that they can accomplish more with tableau connecting to raw data than by executing a stored procedure.
The stored procedure or User defined function (UDF) must return a table. The reason for this is that all subsequent actions in Tableau will be sent to the database as queries in a ‘sub select’.
For Example:
Select [Region] From ([my stored procedure result]).
Most databases don’t support this. When they do, this effectively means that the stored procedure is executed every time an action is taken in Tableau. This typically defeats the benefits of a stored procedure. A much better approach is to first execute the stored procedure into a table or file and then connect Tableau to the results.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Does Tableau have any API’s or scripting?

A

“What is it that you are trying to accomplish?” We have embedding and URL actions that solve a lot of these needs. Other items are often solved with existing functionality such as table calcs or actions that will help satisfy their requirements. We also have a Javascript API.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Your website mentions the VizQL language. How do I program in this or modify the VizQL?

A

Currently the VizQL is generated only by using the Tableau Desktop interface. It can be manipulated through the Tableau Server web interface by Interactors. It is not possible to see the VizQL or create it directly.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

How do I group thousands of items together?

A

Best chance is to use a calculated field with a programmatic condition in these “thousands of items” such that the cardinality is low (less than 100). For measures, rounding using a calc field is useful. For dimensions, rounding using or LEFT or RIGHT functions, etc…

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Can I save a filter for reuse? For example I manage 12 of the 5000 products we have.

A

Yes. Create the filter, and then choose “Create Set” from the filter’s context menu

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

If I have a calculated field that includes members from two blended data sources, will the results be included when I create an extract?

A

Since extracts are specific to each data source, the results from the calculated field will not be included in the data extract (i.e. it will not be optimized). However, the calculation will still work fine.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How do I move a sheet from one Tableau workbook to another?

A

Bookmarks. Save one from the source workbook. Open the destination book, choose that bookmark, save the file. (Note: you cannot bookmark a dashboard)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Can I move dashboards from one workbook to another?

A

No.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is the polygon mark type and how do I use it?

A

Polygon mark type is advanced and used to describe geographic areas and other areas of measurement. They typically require coordinate data in the underlying database. Armed with the right set of coordinate data, polygon marks – in conjunction with the PATH shelf – can describe two-dimensional areas. Any time you have all of the coordinate data to describe ANY area (state, election regions, etc), you can accurately use the polygon mark type to show these boundaries.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What are the two main types of Joins?

A

Joins require a condition to evaluate how a row in one table matches a row or set of rows in another table. The fields used in this condition are called ‘key’ fields.
Inner join: a join of two or more tables that keeps only that rows where the key is common to both tables
Outer join: is further defined as left outer join or right outer join. These includes all rows from the table on the left (or right) side of the join condition regardless of whether a match was found on the right (or left) side.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

I created a join and most of my data is missing. How do I fix this?

A

An outer join will always include the records of the primary table (i.e. left outer join will always include data from the ‘left’ table, even if no match is found in the right table.)
An example is a transaction header table which is related to a transaction line-item table. For dimensional analysis purposes, you would want all header rows to be returned, even if there was not always a specific line-item related to it.

Select * from TransactionHeader
LEFT OUTER JOIN TransactionLineItems
On TransactionHeader.LineItemID = TransactionLineItems.LineItemID

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is Join Culling?

A

Because joins cost time and resources to process on the database server, we don’t really want to use every join that we declared in our data source at all times. Join Culling allows you to skip the join when the query does not require it and provides increased performance. To do this, you must have relationships setup in your database between the tables you’ll be joining in Tableau. Inner Joins will work best for Join Culling.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

How do I do Sparklines?

A

Create a line chart.
Edit the axis to not include zero.
Hide the axis
Make the rows really tiny

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

How do I create a chart the shows the budget compared to actual?

A

Try a bullet graph first. IF that doesn’t satisfy, then:
Color Answer: Steps: add “budgeted” to ROW, add a date measure to column. Then, add second measure (e.g. “actual”) to Y-axis (i.e. the row axis where “budgeted” was already placed). Tableau intelligently determined that COLOR is the best-practice in this case.
Side by Side Answer: Follow the steps for Color Answer, and drag Measure Names to the Column Shelf next to your Date. This allows different measures to appear side by side.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

How do I show the rank of my products?

A
  1. Create a calculated field: index()

2. Add this to your sheet and change it to discrete.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Can I create a chart that shows the trailing 12 months revenue?

A

Yes, this is easy to do in Tableau. You can use a table calculation for this, as well, you can use the built-in totals feature. If you have “month” as a dimension you can turn on grand totals against this dimension.
Answer one: use relative date filters to filter a single sales bar down to the last 12 months (don’t add date to the columns).
Answer two: Use a Table calc to use a Moving Calculation for the sum of the last 12 Months. This one is even better as this is simple for Tableau, and hard for other products.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Can I create Control Charts?

A

Yes. Control charts are often used in SPC (Statistical Process Control) and six sigma (6σ) efforts or other quality control. It is basically a time series analysis with trends and reference lines. Tableau makes them very robust with one exception: We can exclude the outliers completely, but cannot exclude them from the reference line calculations while still displaying them on the chart.
See the knowledge base for more details.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Can I do a Pareto Analysis?

A

Yes (this is the quick way, a more complete answer is in the knowledge base):
Place a dimension on the column shelf. Sort it by descending.
Place a measure on the rows shelf. Duplicate it so there are two measures.
Set the 2nd one to be a Running Total with a secondary calculation of Percent of Total.
Dual Axis these two measures.
Set the first measure to be a Bar and the second measure to be a Line.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

I have a view with MY(Order Date) on Columns, SUM(Sales) on Rows, with three different colored lines – one for each Product Category. I want to animate the lines over time, and show the history so I can get a view like the following where the lines will appear for each month:

A

Drag Date to the Pages shelf and to columns. In order to get the lines to produce over time, the Mark type needs to be changed from Line to some other option, such as Shape, Circle, even text. Then check Show History on the Pages card, access the drop-down menu, Show History for All and select Trails.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Can I use dimensions from the secondary datasource to roll-up my primary data? How can I create a primary group from a secondary data source?

A

Yes. Kind of.
An example is if I have sales and people in my primary and people and their territory assignments in the secondary. How do I roll-up my sales (primary) by territory (secondary)?
Once your data blended view is complete you can right click on your field you wish to group from your secondary source and select “Create Primary Group.” This will populate your primary data source with a grouped field based on values shared in both sources.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Setup a view that displays only a list customers whom purchased from BOTH Furniture and Office Supplies in the past.

A

Since no row of record contains information on about a customer whom purchased on multiple product categories, this is a filter based on aggregated results – this suggests using the “conditions” tab in filter on [Customer Name]

Use the formula:
(MAX(IIF([Product Category]=”Furniture”,1,0))+MAX(IIF([Product Category]=”Office Supplies”,1,0)))=2

How this works:
• MAX(IIF([Product Category]=”Furniture”,1,0))
Show me the value “1” if Customer ever had bought something from Furniture
• MAX(IIF([Product Category]=”Office Supplies”,1,0))
Show me the value “1” if Customer ever had bought something from Office Supplies
• (MAX(IIF([Product Category]=”Furniture”,1,0))+MAX(IIF([Product Category]=”Office Supplies”,1,0)))=2
Only show me if both of the above formulas returned “1” or another words 1+1=2

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

Can I use Tableau to discover which products were frequently purchased together?

A

Yes. This is commonly referred to as a Market Basket Analysis. First, identify the unique identifier (Customer name, ID number, Business, etc.) and the category in question(Product category, Business segment, Academic Course, etc.).
Self join tables approach: Go to Edit Tables on the data connection and choose to Add a Table. Select the table with the data we will be needing and choose to Join the Table. Add two join clauses: Join the identifier to itself, and join the category in question to itself where the relationship is <>(not equal).

26
Q

How about the top 5 products that I sold this year?

A

Yes. Filter on current year, then add this filter to the “context”. Reason: context is created prior to filtering to the top 5.

27
Q

How about the top 5 in each region?

A

Yes. Create an index() calculation and add it to the sheet. Edit it to restart for each region.

28
Q

What about how the top 10 products and regions are performing?

A

Yes. Make a set of product and region, place this new set on row, followed by product and region. Sort on the set. Then hide headers:

29
Q

How do I make a box plot?

A

Many customers prefer our box plots to the standard definition because we will show you all outliers and plot density throughout the range, not just the whiskers (though you can get that as well).
See the knowledge base for examples.

30
Q

Can I find the purchase history of customers that purchased this month?

A
  1. Manually filtering
  2. Dynamic sets from dimensions
  3. Manual sets from marks/members
  4. See also: cohort analysis.
31
Q

Every day, I want to open Tableau and see how yesterday’s top 3 sales people have performed over the last 6 months.

A

Similar to above, use dynamic sets. Once created, this set can be a filter and be used against any date range. You need to peg it using “add to context”.

32
Q

How do I apply weights when working with my survey data?

A

If you have a weight field in your data set you can apply this weight with a simple calculated field. [Weight]*[Measure].

33
Q

How do I find the average sale per order?

A

Calculated field. Given a dimension “Order” and a measure “line item sale amount”
Sum([line item sale amount])/Countd(Order)

34
Q

Tableau only created equal sized bins, but I want mine to be 0-10, 10-50, 50-100, 100-1000, and greater than 1000.

A

Instead of using built in binning, use a calculated field with an IF statement. Something like:
IF [Sales] > 10000 THEN “LARGE”
ELSEIF [Sales] > 5000 THEN “MEDIUM”
ELSE “SMALL” END

35
Q

Your bins are great! But I have just a couple of things that are bigger than 1000, how do just create a bin that is >1000 and not have to define all the intermediary bins?

A
  1. IF [Sales] > 5000 THEN 5000 ELSE [Sales] END
  2. Then, bin this calculated field
  3. Place on shelf
  4. Edit alias for that last group (e.g. ‘>= 5000’)
36
Q

Is there any way that we can dynamically change the size of our bins?

A

Yes, Tableau makes this very easy to do by allowing you to create parameters directly in the bin menu.

37
Q

I have a date field that looks like 20070326. Tableau doesn’t recognize it as a number. How do I make it into a date field? Bonus, I have 200703?

A

Calculated field. Examples assume the source is of type “string” (otherwise, wrap str() around these):
DATE(MID([DividendDate],5,2) + “/” + RIGHT([DividendDate], 2) + “/” + LEFT([DividendDate],4))
Bonus question:
DATE(MID([DividendDate],5,2) + “/01/” + LEFT([DividendDate],4))

38
Q

Do you have a Modulus function in excel it is Mod()?

A

Yes we do have a modulus function. Instead of MOD(), Tableau has a modulo operator, in calculated fields this is the “%” character, e.g.:
[Dividend] % INT[Divisor]
The modulo operator finds the remainder when dividing by integers: 5%2 = 1

39
Q

Can I use calculations from Excel?

A

Yes. Once connected to the excel sheet, calculated fields are represented inside of Tableau. A change to the calculation in excel will result in updated information in the Tableau worksheet.
Note: you cannot use VB or other excel functions inside of Tableau, only the output of those functions/calculations.

40
Q

Can I use functions in my database that Tableau doesn’t have?

A

Yes. You can use the various Tableau “RAWSQL” functions to return results from functions in the underlying database. Care must be taken to present the results of these functions in meaningful ways (i.e. a db function that returns a list might need to be ‘counted’ in tableau) (stored procedures may also work, as long as they return a single value for each row – aka single-valued functions)

41
Q

I have financial data. How do I calculate the profit ratio?

A

Tricky question. Their table will look like this:

Account Amount

Gross profit	100
Expenses		1049
Revenue		2302
COGS		8239
Etc.

Best way is a couple of calculated fields, e.g
Profit: IF [Account] = “Gross Profit” THEN [Amount] ELSE NULL END

Revenue: IF [Account] = “Revenue” THEN [Amount] ELSE NULL END

Etc, then, perform your math on these calc fields:
sum(Profit)/sum(Revenue)

42
Q

How do I calculate year-over-year growth without using table calcs?

A

Hint: similar to the ‘financial data’ question above.

43
Q

How do I sort on a particular dimensions’ member (i.e. sort on 2008 of “order date”)

A

I.E Do this, but without single-click sort:

44
Q

Can Tableau calculate the result of compounding interest?

A

Yes. Tableau can certainly do this with some table calculation functions:
PREVIOUS_VALUE(1)*(1+AVG([Discount]))

45
Q

Can Tableau do z-tests?

A

Yes, the calculation is:
(Sum(x) – WINDOW_AVG(Sum(x))) / STDEVP(x)
where x is a field.
Z represents the number of standard deviations between x and its population average. As a general benchmark, any Z scores above 2 are considered statistically significant.

46
Q

The Tableau Server maintenance views do not fit my needs. Can I customize them?

A

Yes. See the admin guide or knowledge base for details.

47
Q

How can I embed a view without the toolbar?

A

Add to the end of the url: ?:embed=yes&?:toolbar=no

48
Q

Can I do a multiple regression?

A

No. But Tableau does iterative regressions with ANOVA very well.

49
Q

What does the Anova table tell me?

A

analysis of variance: 1. Sources of variation (AKA “Field”) 2. Degrees of freedom 3. Sums of squares 4. Mean squares 5. F-statistic 6. P-Value
for further information, see also: the internet.

50
Q

What is a p-Value?

A

A p-value shows the probability that sample data will adequately represent the population from which it was drawn. The larger the value, the less likely that the sample data accurately represents the population. The max. value for a P-Value is 1.0, min is zero, and a value of .05 means that there’s a 5% chance that the model does not represent the data.

51
Q

Can I use a cumulative distribution function?

A

Most likely. It depends on exactly what they are asking, but start with a histogram and set the measure to be ‘running total’. If they like that, then set it to percent of total as well (secondary table calc).

52
Q

Can I find the correlation?

A

Yes – compares two sets of data against each other and contains two separate sets of measures.

53
Q

Can I do significance testing?

A

Significance is built-in to the ‘describe trend model’ command. A possible advanced approach is to calculate your significance manually, however this requires a detailed understanding of your dataset and what significance you are testing for.

54
Q

How does Tableau’s Data Engine work (at a high level)?

A

Tableau’s Fast Data Engine achieves speed through emulating some of the high-performance capabilities of column-store databases, such as Sybase IQ and Vertica, making effective use of data compression and storing in columns.

55
Q

Column-store data bases work across large machines and many servers though, how can Tableau’s Data Engine be similar?

A

The approach is tailored for use in the single PC environment by the inclusion of a virtual memory paging capability. Essentially Tableau works similar to these databases in a miniature way. The databases spread data over many disks and many servers, working in parallel to deliver the answer to a query rapidly. For a single PC user of Tableau, the Data Engine spreads the workload across processing resources, multiple cores on each CPU and even in some cases multiple CPUs, to utilize as much power as possible.

56
Q

Does it use memory or disk space?

A

Databases make extensive use of memory because it is faster (up to 100 times), with column store, because the data is in compressed form, much more can reside in memory at any given time. Tableau’s Data engine takes advantage of available memory in the same way.

57
Q

What if there is not enough memory to extract a large data set into Tableau’s data engine?

A

When it is processing more data than can be fit into memory, it writes it out to disk using a virtual memory capability. It divides the data into pages of equal size, writing out the pages to the disk when memory is full and then reading them back in when needed. This activity is algorithmically controlled, so that the pages of data least likely to be required are pushed out to disk. When they are eventually needed, they are read in and other pages are pushed out.

58
Q

Is there a memory requirement for Tableau to use extracts?

A

The less you have available will limit the data volumes you can work on. No different that if you want to download movies or music, you need to allot space on that machine. To determine the minimum amount of space necessary, we generally recommend sizing your hard drive to be at least three to five times the total data size. This is because when Tableau creates its data extracts, it uses a lot of temporary space on the hard drive to create them, so a 10 GB extract may use 30-50 GB of temp hard drive space while creating the extract (the temp file will subsequently be deleted upon successful completion of the extract).

59
Q

What if I want to create my own maintenance views for Tableau Server?

A

In addition to the pre-built administrative views available on the Maintenance page on the Server, you can use Tableau Desktop to query and build your own analyses of server activity. The Tableau Server repository has several database views set up that you can connect to and query.

To access these views you must first use the command line tool to enable external access to the Tableau Server database. Next, if you are accessing the database from a computer that is not running Tableau Server, you will need to add your computer’s IP address to Tableau Server’s pg_hba.conf.templ file. Finally, you need to connect to and query the Tableau Server database.

60
Q

What is the difference between Multiple Regression and ANOVA?

A

ANOVA allows us to see if differences in the variable of interest between groups are due to chance, or if there is a significant group effect. It tells us very little about the nature of that relationship. All we know is whether a significant difference exists between groups.

Regression gives much more information. It tells us if there is a difference, where the differences lie, and we can predict to what degree the group causes changes in the dependent variable.

61
Q

Explain how VizQL works?

A

Unlike current charting packages and like query languages, VizQL permits an unlimited number of picture expressions. Visualizations can thus be easily customized and controlled. VizQL is a declarative language. The desired picture is described; the low-level operations needed to retrieve the results, to perform analytical calculations, to map the results to a visual representation, and to render the image are generated automatically by the query analyzer. The query analyzer compiles VizQL expressions to SQL and MDX and thus VizQL can be used with relational databases and datacubes.