2A. Get data in Power BI Flashcards

1
Q

What is a “flat” file?

A

A type of file that has only one data table and every row of data is in the same structure. The file doesn’t contain hierarchies. For example csv, xlsx

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What are some examples of locations from which to import flat files into Power BI? And what are some differences between them?

A

Local - only use for data that doesn’t change, since changes to the original file will not be reflected in Power BI
OneDrive for Business - better for keeping Power BI synchronized with changes in the data
OneDrive - Personal
SharePoint - Team Sites - better for keeping Power BI synchronized with changes in the data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are three places where you can change the location of a source file?

A

Data source settings
Query settings
Advanced Editor

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What are the two main data connectivity modes when connecting to SQL databases?

A
  • Import (selected by default, recommended) and
  • DirectQuery.

Mostly, you select Import.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What are the three login options when connecting to an SQL database?

A
  • Windows - Use your Windows account (Azure Active Directory credentials).
  • Database - Use your database credentials. For instance, SQL Server has its own sign-in and authentication system that is sometimes used. If the database administrator gave you a unique sign-in to the database, you might need to enter those credentials on the Database tab.
  • Microsoft account - Use your Microsoft account credentials. This option is often used for Azure services.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What are two ways of selecting data to import from an SQL database?

A
  • Using the navigator to click and select tables.
  • Writing an SQL query (under advanced options on the SQL Server database window)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Where do you change data source settings, and why might you have to do so from time to time?

A

On the Home tab, select Transform data, and then select the Data source settings option. You can also change data source settings from inside power query and in advanced settings

This action is often required due to a security policy within the organization, for example, when the password needs to be updated every 90 days. You can change the data source, edit permissions or clear permissions.

You can change the data source, edit permissions or clear permissions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What does SQL stand for?

A

SQL stands for Structured Query Language and is a standardized programming language that is used to manage relational databases and perform various data management operations.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What are some operations you can perform on the data before loading it into power query, when you load data using an SQL statement?

A

SQL is beneficial because it allows you to load only the required set of data by specifying exact columns and rows in your SQL statement and then importing them into your semantic model. You can also join different tables, run specific calculations, create logical statements, and filter data in your SQL query.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is the structure of of an SQL statement?

A

The SQL query starts with a Select statement, which allows you to choose the specific fields (think of these as columns) that you want to pull from your database.

FROM specifies the name of the table that you want to pull the data from.

All queries should also have a WHERE clause. This clause will filter the rows to pick only filtered records that you want.

For example:
SELECT
ID
, NAME
, SALESAMOUNT
FROM
SALES
WHERE
OrderDate >= ‘1/1/2020’

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Why is it best to avoid filtering data imports with SQL statements directly in Power BI, what is the preferred alternative, and why is it preferred?

A

It is preferable to write SQL statement queries as a “view,” an object in a relational database. Power BI can then retrieve data using the view object, which allows Power Query to participate in Query Folding, which optimizes data retrieval according to how the data is being used later.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What does the wildcard character (*) do in an SQL statement, and why is it not recommended?

A

The wildcard character (*) will import all columns within a table. This method isn’t recommended because it will lead to redundant data in your semantic model, which will cause performance issues and require extra steps to normalize your data for reporting.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What are Dynamic reports, and what are their benefits?

A

They are reports in which the data can be changed by a developer according to user specifications. They are valuable because a single report can be used for multiple purposes. If you use thems, you’ll have fewer individual reports to create, which will save organizational time and resources.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What are two ways of creating dynamic reports?

A
  • Use a parameter (good if filtering tables by one value at a time)
  • Use a function (for filtering on multiple values at a time)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

How do you use a parameter to create a dynamic report?

A

On the Home tab in Power Query, select Manage parameters > New parameter.
Then right click the table query, select advanced editor, and replace the existing value in the WHERE execute statement with the parameter name (i.e. the parameter instead of the value/string)
Now you can edit the value in the parameter, and the query will show only the data matching that parameter

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

How do you use a function to create a dynamic report?

A
  • create a spreadsheet with one column. The values in this column will be the ones used as a filter
  • load that s/s into power query
  • right click the main table query (the one you want to filter) and click Create Function…
  • select the s/s you just loaded with the “filter values,” click on the Add Column tab, and then Invoke Custom Function
  • then load the individual columns you want to load (it will look like you have merged two queries)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What is a NoSQL database (also referred to as non-SQL, not only SQL or non-relational)?

A

A flexible type of database that doesn’t use tables to store data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What do you have to do when accessing an Azure end-point for the first time from Power BI?

A

Make sure that you enter your account key. You can find this key in the Primary Key box in the Read-only Keys blade of your Azure portal.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What do you need to do with data from no-SQL databases before being able to use it in Power BI?

A

You need to normalize it. After connecting to a database in the Navigator pane of the Get Data process, selected Edit instead of loading, and you’ll be able to turn the unstructured data into data in columns. It will likely first look like a table with only one “merged” column that you can then expand and then transform as usual.

20
Q

How do you connect to data in an application, for example Share Point?

A

By referencing the SharePoint site URL. You only need the top-level url, and can select the specific data to load in the navigator pane in the next step.

21
Q

What are the three standard types of storage modes you can choose from when loading data into Power BI? For which connection type are the options different?

A

For most connections:
- Import
- DirectQuery
- Dual (Composite)

For (Azure) Analysis Services
- Import
- Connect live

22
Q

Which is the most common method of storing/importing data, and why is it sometimes not used?

A

Most often Import is used.
But DirectQuery is sometimes used, for example if security concerns means data is not allowed to be stored other than in the original data-file. Some semantic models may also be too large to load into a Power BI file. DirectQuery also ensures that you will always view up-to-date data.

23
Q

What is Import storage mode, and what is its features?

A

This mode allows you to create a local Power BI copy of your semantic models from your data source. You can use all Power BI service features with this storage mode, including Q&A and Quick Insights. Data refreshes can be scheduled or on-demand. It is the default for creating new Power BI reports.

24
Q

What is DirectQuery storage mode, and what is its features?

A

This option is useful when you don’t want to save local copies of your data because your data won’t be cached. Instead, you can query the specific tables that you’ll need by using native Power BI queries, and the required data will be retrieved from the underlying data source. Essentially, you’re creating a direct connection to the data source.

Using this model ensures that you’re always viewing the most up-to-date data, and that all security requirements are satisfied.

Additionally, this mode is suited for when you have large semantic models to pull data from. Instead of slowing down performance by having to load large amounts of data into Power BI, you can use this mode to create a connection to the source, solving data latency issues as well.

25
Q

What is Dual (Composite) storage mode, and what is its features?

A

In this mode, you can identify some data to be directly imported and other data that must be queried. Any table that is brought in to your report is a product of both Import and DirectQuery modes. This mode allows Power BI to choose the most efficient form of data retrieval.

26
Q

What storage mode do you have to use if you want to use all features of Power BI services, including Q&A and Quick Insights?

A

Import
DirectQuery is not compatible with Q&A and Quick Insights

27
Q

What storage mode is best if you have strict data security requirements; need to ensure data is always up to date; or if you have a very large language model?

A

DirectQuery

28
Q

What is Azure Analysis Services?

A

A fully managed platform as a service (PaaS) that provides enterprise-grade semantic models in the cloud. You can use advanced mashup and modeling features to combine data from multiple data sources, define metrics, and secure your data in a single, trusted tabular semantic model. The semantic model provides an easier and faster way for users to perform ad hoc data analysis using tools like Power BI.

29
Q

Getting data from Azure Analysis Services is similar to getting data from another SQL Server (connecting, picking a data model, selecting tables), but with some notable differences, which are these?

A
  • Analysis Services models have calculations already created.
  • If you don’t need an entire table, you can query the data directly. Instead of using Transact-SQL (T-SQL) to query the data, like you would in SQL Server, you can use multi-dimensional expressions (MDX) or data analysis expressions (DAX).
30
Q

What are the two connectivity options for (Azure) Analysis Services?

A
  • Import
  • Connect live
31
Q

What is the Connect live storage/connection option, which which data connection can it be used with, and what are some benefits of using it?

A

An option for Azure Analysis Services.

Azure Analysis Services uses the tabular model and DAX to build calculations, similar to Power BI. These models are compatible with one another.

Using this storage option helps you keep the data and DAX calculations in their original location, without having to import them all into Power BI.

Azure Analysis Services can have a fast refresh schedule, which means that when data is refreshed in the service, Power BI reports will immediately be updated, without the need to initiate a Power BI refresh schedule.

This process can improve the timeliness of the data in your report.

32
Q

What languages do you use to directly query data in Azure Analysis Services?

A

Select Connect live when connecting to the data base, and then you can use multi-dimensional expressions (MDX) or data analysis expressions (DAX).

33
Q

What is some good first steps for diagnosing the performance of your report?

A
  • Run the Performance Analyzer tool
  • Run Query Diagnostics in Power Query (under the Tools tab)
34
Q

What are some ways to optimize query performance

A
  • Take advantage of Query Folding
  • Process as much data as possible in the original data source. Power Query and Power Query Editor allow you to process the data; however, the processing power that is required to complete this task might lower performance in other areas of your reports. Generally, a good practice is to process, as much as possible, in the native data source.
  • Use native SQL queries. When using DirectQuery for SQL databases, such as the case for our scenario, make sure that you aren’t pulling data from stored procedures or common table expressions (CTEs).
  • Separate date and time, if bound together. If any of your tables have columns that combine date and time, make sure that you separate them into distinct columns before importing them into Power BI. This approach will increase compression abilities.
35
Q

What is Query Folding?

A

It helps you increase the performance of your Power BI reports. Query folding is the process by which the transformations and edits that you make in Power Query Editor are simultaneously tracked as native queries, or simple Select SQL statements, while you’re actively making transformations. The reason for implementing this process is to ensure that these transformations can take place in the original data source server and don’t overwhelm Power BI computing resources.

36
Q

What are some benefits of Query Folding?

A
  • More efficiency in data refreshes and incremental refreshes. When you import data tables by using query folding, Power BI is better able to allocate resources and refresh the data faster because Power BI doesn’t have to run through each transformation locally.
  • Automatic compatibility with DirectQuery and Dual storage modes. All DirectQuery and Dual storage mode data sources must have the back-end server processing abilities to create a direct connection, which means that query folding is an automatic capability that you can use. If all transformations can be reduced to a single Select statement, then query folding can occur.
37
Q

Query Folding track transformations as native queries, but native queries are not possible for some transformations–which?

A
  • Adding an index column
  • Merging and appending columns of different tables with two different sources
  • Changing the data type of a column

A good guideline to remember is that if you can translate a transformation into a Select SQL statement, which includes operators and clauses such as GROUP BY, SORT BY, WHERE, UNION ALL, and JOIN, you can use query folding.

38
Q

What are some factors that can cause errors when importing data into Power BI?

A
  • Power BI imports from numerous data sources.
  • Each data source might have dozens (and sometimes hundreds) of different error messages.
  • Other components can cause errors, such as hard drives, networks, software services, and operating systems.
  • Data often can’t comply with any specific schema.
39
Q

When importing data, what might cause the error message “Query timeout expired”?

A

Relational source systems often have many people who are concurrently using the same data in the same database. Some relational systems and their administrators seek to limit a user from monopolizing all hardware resources by setting a query timeout. These timeouts can be configured for any timespan, from as little as five seconds to as much as 30 minutes or more.

40
Q

When importing data, what might cause the error message “Power BI Query Error: Timeout expired”? What are some ways of addressing it?

A

This error indicates that you’ve pulled too much data according to your organization’s policies. Administrators incorporate this policy to avoid slowing down a different application or suite of applications that might also be using that database.

You can resolve this error by pulling fewer columns or rows from a single table. While you’re writing SQL statements, it might be a common practice to include groupings and aggregations. You can also join multiple tables in a single SQL statement. Additionally, you can perform complicated subqueries and nested queries in a single statement. These complexities add to the query processing requirements of the relational system and can greatly elongate the time of implementation.

If you need the rows, columns, and complexity, consider taking small chunks of data and then bringing them back together by using Power Query. For instance, you can combine half the columns in one query and the other half in a different query. Power Query can merge those two queries back together after you’re finished.

41
Q

When importing data, what might cause the error message “We couldn’t find any data formatted as a table”?

A

Occasionally, you may encounter the “We couldn’t find any data formatted as a table” error while importing data from Microsoft Excel. Fortunately, this error is self-explanatory. Power BI expects to find data formatted as a table from Excel. The error even tells you the resolution. Perform the following steps to resolve the issue:

  • Open your Excel workbook, and highlight the data that you want to import.
  • Press the Ctrl-T keyboard shortcut. The first row will likely be your column headers.
  • Verify that the column headers reflect how you want to name your columns. Then, try to import data from Excel again. This time, it should work.
42
Q

When importing data, what might cause the error message “Couldn’t find file”?

A

Usually, this error is caused by the file moving locations or the permissions to the file changing. If the cause is the former, you need to find the file and change the source settings.

  • Open Power Query by selecting the Transform Data button in Power BI.
  • Highlight the query that is creating the error.
  • On the left, under Query Settings, select the gear icon next to Source.
  • Change the file location to the new location.
43
Q

Why might columns appear blank after importing data?

A

This situation happens because of an error in interpreting the data type in Power BI. The resolution to this error is unique to the data source. For instance, if you’re importing data from SQL Server and see blank columns, you could try to convert to the correct data type in the query. By specifying the correct type at the data source, you eliminate many of these common data source errors.

44
Q

How can you specify data type of a column in a query?

A

By using CAST

Instead of using this query:

SELECT CustomerPostalCode FROM Sales.Customers

Use this query:

SELECT CAST(CustomerPostalCode as varchar(10)) FROM Sales.Customers

45
Q

What is the query language for SQL servers?

A

T-SQL (Transactional-SQL)