Implement and manage an analytics solution Flashcards

(64 cards)

1
Q

What is the benefit of storing different layers of your lakehouse in separate workspaces?

A

It can enhance security, manage capacity use, and optimize cost-effectiveness.

2
Q

You want to use Apache Spark to explore data interactively in Microsoft Fabric. What should you create?

A

A notebook.

3
Q

You need to use Spark to analyze data in a CSV file. What’s the simplest way to accomplish this goal?

A

Load the file into a dataframe.

4
Q

Which method is used to split the data across folders when saving a dataframe?

A

partitionBy
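
In PySpark this is typically written as `df.write.partitionBy("Year").parquet(path)`, which produces one `Year=<value>` subfolder per distinct value. The sketch below only imitates that folder layout with the Python standard library so it can run without a Spark session; the `write_partitioned` helper, the file name `part-0000.csv`, and the sample rows are hypothetical.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path


def write_partitioned(rows, partition_col, out_dir):
    """Write rows as CSV files under <out_dir>/<partition_col>=<value>/ folders."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_col]].append(row)

    for value, group in groups.items():
        folder = Path(out_dir) / f"{partition_col}={value}"
        folder.mkdir(parents=True, exist_ok=True)
        # As in Spark, the partition column is encoded in the folder name,
        # so it is left out of the data file itself.
        fields = [name for name in group[0] if name != partition_col]
        with open(folder / "part-0000.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fields, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(group)

    return sorted(p.name for p in Path(out_dir).iterdir())


# Hypothetical sample data.
orders = [
    {"OrderId": 1, "Year": 2020, "Amount": 10.0},
    {"OrderId": 2, "Year": 2021, "Amount": 25.5},
    {"OrderId": 3, "Year": 2021, "Amount": 7.25},
]

with tempfile.TemporaryDirectory() as tmp:
    folders = write_partitioned(orders, "Year", tmp)

print(folders)  # ['Year=2020', 'Year=2021']
```

Queries that filter on the partition column can then skip whole folders, which is the usual reason for partitioning.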

5
Q

Which type of table should an insurance company use to store supplier attribute details for aggregating claims?

A

Dimension table.
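
A minimal sketch of why supplier attributes belong in a dimension table: the fact table holds one row per claim, and the dimension supplies the attribute used to group them. It uses Python's built-in `sqlite3` rather than a Fabric warehouse, and the table names, column names, and values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension: descriptive attributes of each supplier.
    CREATE TABLE DimSupplier (
        SupplierKey  INTEGER PRIMARY KEY,
        SupplierName TEXT,
        Region       TEXT
    );
    -- Fact: one row per claim, with a key pointing at the dimension.
    CREATE TABLE FactClaim (
        ClaimId     INTEGER PRIMARY KEY,
        SupplierKey INTEGER REFERENCES DimSupplier (SupplierKey),
        ClaimAmount REAL
    );
    INSERT INTO DimSupplier VALUES
        (1, 'Contoso Repairs', 'North'),
        (2, 'Fabrikam Glass',  'South');
    INSERT INTO FactClaim VALUES
        (100, 1, 250.0),
        (101, 1, 100.0),
        (102, 2,  75.0);
    """
)

# Aggregate the claim amounts by a supplier attribute from the dimension.
totals = conn.execute(
    """
    SELECT d.Region, SUM(f.ClaimAmount)
    FROM FactClaim AS f
    JOIN DimSupplier AS d ON d.SupplierKey = f.SupplierKey
    GROUP BY d.Region
    ORDER BY d.Region
    """
).fetchall()

print(totals)  # [('North', 350.0), ('South', 75.0)]
```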

6
Q

What is a semantic model in the data warehouse experience?

A

A semantic model is a business-oriented data model that provides a consistent and reusable representation of data across the organization.

7
Q

What is the purpose of item permissions in a workspace?

A

To grant access to individual warehouses for downstream consumption.

8
Q

Which language is optimized for querying real-time data in an eventhouse?

A

KQL

9
Q

Which Microsoft Fabric Real-Time Intelligence component is used to visualize and explore real-time data in tiles?

A

Real-Time Dashboards

10
Q

What is the primary function of Microsoft Fabric Eventstreams?

A

Ingesting and transforming real-time data.

11
Q

What is the purpose of the Fabric Activator destination in an eventstream?

A

Data sent to an Activator destination can be used to trigger an automated action based on data values.

12
Q

What is the primary language used for querying a data warehouse?

A

SQL

13
Q

Why is indexing important in a data warehouse?

A

It speeds up data retrieval times.

14
Q

What is the purpose of a fact table in a data warehouse?

A

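To store the measurable, numeric values (measures) recorded for business events, such as sales amounts or quantities, together with the keys that link each row to its dimension tables.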

15
Q

What is the primary advantage of Dynamic Data Masking (DDM)?

A

It limits data exposure by obscuring sensitive information in real time.

16
Q

What is the purpose of a security predicate function in Row-Level Security (RLS)?

A

It determines whether a row is accessible to a user based on certain conditions.
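
In a warehouse the predicate is a T-SQL function bound to a table by a security policy. The sketch below only mirrors the idea in plain Python: a function is evaluated once per row and decides whether that row is visible to the current user. The names `department_predicate` and `select_rows`, the user fields, and the sample rows are hypothetical.

```python
def department_predicate(row, user):
    """Return True when the row should be visible to the user."""
    return user["is_admin"] or row["Department"] == user["department"]


def select_rows(table, user):
    """Return only the rows for which the predicate holds."""
    return [row for row in table if department_predicate(row, user)]


# Hypothetical sample data.
sales = [
    {"OrderId": 1, "Department": "Retail", "Amount": 120.0},
    {"OrderId": 2, "Department": "Wholesale", "Amount": 900.0},
    {"OrderId": 3, "Department": "Retail", "Amount": 45.0},
]

retail_user = {"department": "Retail", "is_admin": False}
admin_user = {"department": "IT", "is_admin": True}

print([r["OrderId"] for r in select_rows(sales, retail_user)])  # [1, 3]
print([r["OrderId"] for r in select_rows(sales, admin_user)])   # [1, 2, 3]
```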

17
Q

What happens when a user is granted a permission and then denied the same permission in a warehouse?

A

The DENY always supersedes the GRANT, and the user is denied access to the specific object.
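
The precedence rule can be sketched as a small permission check. This is an illustration of the rule only, not how the warehouse engine is implemented; the `has_permission` function, the entry format, and the principal and object names are hypothetical.

```python
def has_permission(entries, principal, permission, obj):
    """Resolve effective access: any matching DENY wins over any GRANT."""
    matching = [
        e for e in entries
        if (e["principal"], e["permission"], e["object"]) == (principal, permission, obj)
    ]
    if any(e["action"] == "DENY" for e in matching):
        return False
    return any(e["action"] == "GRANT" for e in matching)


# Hypothetical permission entries, in the order they were issued.
entries = [
    {"action": "GRANT", "principal": "alice", "permission": "SELECT", "object": "dbo.Sales"},
    {"action": "DENY", "principal": "alice", "permission": "SELECT", "object": "dbo.Sales"},
    {"action": "GRANT", "principal": "bob", "permission": "SELECT", "object": "dbo.Sales"},
]

print(has_permission(entries, "alice", "SELECT", "dbo.Sales"))  # False
print(has_permission(entries, "bob", "SELECT", "dbo.Sales"))    # True
```

The result for `alice` would be the same if the two entries were issued in the opposite order, because the check does not depend on ordering.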

18
Q

What is the role of Git in the CI/CD process in Fabric?

A

Git provides version control and lets your team collaborate using branches. It helps you manage incremental code changes and review code history.

19
Q

What is the purpose of connecting a Fabric workspace to a Git repository?

A

To sync content between the workspace and Git, ensuring they have the same content.

20
Q

What is the primary function of deployment pipelines in Fabric?

A

Deployment pipelines automate the movement of content through the development, test, and production stages.

21
Q

What are the 3 security levels in Fabric’s security model and their order of evaluation? They are evaluated sequentially to determine whether a user has data access.

A
  1. Microsoft Entra ID authentication: checks if the user can authenticate to the Azure identity and access management service, Microsoft Entra ID.
  2. Fabric access: checks if the user can access Fabric.
  3. Data security: checks if the user can perform the action they’ve requested on a table or file.
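
The sequential evaluation listed above can be sketched as a chain of checks that stops at the first failure. This is a conceptual illustration only; the `evaluate_access` function and the shape of the `user` dictionary are hypothetical and do not correspond to any Fabric API.

```python
def evaluate_access(user, action, resource):
    """Run the three checks in order and stop at the first one that fails."""
    checks = [
        ("Microsoft Entra ID authentication", lambda: user.get("authenticated", False)),
        ("Fabric access", lambda: user.get("fabric_access", False)),
        ("Data security", lambda: (action, resource) in user.get("allowed", set())),
    ]
    for name, check in checks:
        if not check():
            return f"denied at: {name}"
    return "allowed"


# Hypothetical users.
analyst = {
    "authenticated": True,
    "fabric_access": True,
    "allowed": {("read", "Sales")},
}
guest = {"authenticated": True, "fabric_access": False}

print(evaluate_access(analyst, "read", "Sales"))   # allowed
print(evaluate_access(analyst, "write", "Sales"))  # denied at: Data security
print(evaluate_access(guest, "read", "Sales"))     # denied at: Fabric access
```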
22
Q

What is the order of evaluation of access in Fabric?

A

Microsoft Entra ID authentication, Fabric access, Data security

23
Q

What workspace role should be assigned to a data engineer who needs to create Fabric items and read all data in an existing lakehouse?

A

Contributor

24
Q

Which tool can be used to apply granular data access permissions in Fabric?

A

OneLake data access roles

25
What is the default storage format for Fabric's OneLake?
Delta-Parquet
26
What is a Microsoft Fabric lakehouse?
An analytical store that combines the file storage flexibility of a data lake with the SQL-based query capabilities of a data warehouse.
27
You have a managed table based on a folder that contains data files in delta format. If you drop the table, what happens?
The table metadata and data files are deleted.
28
Your organization uses Microsoft Fabric to manage data ingestion from various sources, including Microsoft Azure Event Hubs and local files. The data is used for real-time analytics and reporting. You need to implement a solution that transforms real-time event data before loading it into the Microsoft Fabric lakehouse. What should you do to achieve this?
Define real-time processing logic with the event processor.
29
Your organization uses Microsoft Fabric to manage real-time data from IoT devices sent to Azure Event Hubs. You need to design a solution to process events for storage in a Fabric lakehouse. Which two actions should you perform as part of the solution?
Set up an eventstream with Azure Event Hubs as a source. Use the event processor to filter and transform data.
30
Your company uses Dynamic Data Masking (DDM) in its Microsoft Fabric SQL database to protect sensitive customer information, including Email, PhoneNumber, and CreditCardNumber fields. You need to configure DDM so nonprivileged users see masked versions of these fields. Which masking function should you apply to the CreditCardNumber field?
partial(0,"XXXX-XXXX-XXXX-",4)
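
To see what that mask produces, here is a rough Python imitation of `partial(prefix, padding, suffix)`: keep the first `prefix` characters, insert the padding string, and keep the last `suffix` characters. The `partial_mask` function and the sample card number are hypothetical, and the sketch ignores how the real function handles values too short for the mask.

```python
def partial_mask(value, prefix, padding, suffix):
    """Expose the first `prefix` and last `suffix` characters around a padding string."""
    head = value[:prefix]
    tail = value[-suffix:] if suffix > 0 else ""
    return head + padding + tail


# Hypothetical card number, masked as in partial(0,"XXXX-XXXX-XXXX-",4).
masked = partial_mask("1234-5678-9012-3456", 0, "XXXX-XXXX-XXXX-", 4)
print(masked)  # XXXX-XXXX-XXXX-3456
```

Only the query result is masked; the stored value is unchanged, which is why users with the UNMASK permission still see the full number.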
31
You are implementing a new data analytics solution using Microsoft Fabric. Your team consists of data engineers skilled in PySpark and SQL, and data analysts who primarily use Microsoft Power BI. The data to be ingested includes structured, semi-structured, and unstructured formats from various sources. Select a data store that accommodates diverse data formats and supports both PySpark and SQL operations for data transformation and analysis. Which two data stores provide a complete solution?
Azure Synapse Analytics. Microsoft Lakehouse.
32
Your organization uses Microsoft Fabric to manage data across departments. You need to ensure each department can manage its data governance settings independently. What should you do?
Delegate tenant-level settings to domains.
33
You are implementing an analytics solution using Microsoft Fabric with a deployment pipeline: Development, Test, and Production. You want to deploy only specific content from Test to Production to maintain data integrity. You need to configure the deployment pipeline to deploy only specific content from Test to Production. What should you do?
Use selective deployment.
34
Your company uses Microsoft Fabric deployment pipelines with multiple stages. You need to determine the last deployment time for each stage. What should you do?
Review the deployment history for each stage.
35
Your organization uses Microsoft Fabric deployment pipelines with stages: Development, Test, and Production. You need to configure the Production stage to connect to the production database. What should you do?
Create a deployment rule for Production.
36
Your company implements a new analytics solution using Microsoft Fabric that requires frequent updates and testing before deployment to production. You need to set up the deployment pipeline to enable backward deployment, that is, deploying content from a later stage to an earlier stage in the pipeline. Each correct answer presents part of the solution. Which three actions should you take?
Configure rules to maintain settings across stages. Ensure the target stage is empty and has no workspace assigned to it. Review deployment history.
37
An organization uses Microsoft Fabric for data analytics workflows. A deployment pipeline with different configurations at each stage, such as database connections and query parameters, is needed. You need to ensure consistent settings across stages in a deployment pipeline. Each correct answer presents part of the solution. Which three actions should be taken?
Establish deployment rules for semantic models. Redeploy content after modifying rules. Set query parameters for each stage.
38
Your organization uses Microsoft Fabric for analytics solutions. The team must set up deployment pipelines for transitions between development, testing, and production environments. You need to maintain distinct configurations for each stage and ensure only verified changes are promoted to production. Each correct answer presents part of the solution. Which two actions should you take?
Create unique deployment rules for each stage. Use selective deployment for content control.
39
Your organization uses Microsoft Fabric for data warehousing and has implemented dynamic data masking to protect sensitive information in the EmployeeData table. The table includes columns such as EmployeeID, FirstName, LastName, SSN, and email. You need to ensure that only authorized users can view unmasked data while others see masked data according to the defined rules. What action should you take?
Grant UNMASK permission to authorized users.
40
Your company is implementing a new data governance strategy using Microsoft Fabric. The data warehouse contains various tables with sensitive information that must be protected from unauthorized access. You need to ensure only authorized medical personnel can access the 'MedicalHistory' column in the 'Patients' table. What should you use?
Column-Level Security
41
Your company uses a Microsoft Fabric data warehouse to store employee salary information. You need to configure permissions so that only specific users can view unmasked salary data while others see masked data. Which permission should you grant to the authorized users?
Grant UNMASK permission.
42
Your company is using Microsoft Fabric to manage a data warehouse that includes a table with sensitive employee information. The table contains columns such as EmployeeID, Name, Salary, and SocialSecurityNumber. You need to restrict access to the Salary and SocialSecurityNumber columns to HR personnel only. Each correct answer presents part of the solution. Which three actions should you take?
Create an HR role and assign it to HR personnel. Deny SELECT permission on the sensitive columns to other roles. Grant SELECT permission on the sensitive columns to the HR role.
43
Your organization uses Microsoft Fabric for data warehouse management. The security team is concerned about unauthorized access to sensitive customer information in a database. You need to ensure sensitive data is protected and accessible only to authorized users. Which three actions should you perform? Each correct answer presents part of the solution.
Define dynamic data masking for sensitive columns. Grant UNMASK permission to authorized users. Test masking with a non-privileged user.
44
Your organization uses a Microsoft Fabric data warehouse that contains sensitive customer information such as email addresses and credit card numbers. You need to implement a security solution that masks sensitive data in real time without altering the data structure. Which action should you take?
Implement Dynamic Data Masking on sensitive columns.
45
Your company has a data warehouse in Microsoft Fabric to store sales data. Different departments need access to this data, but each department should only see data relevant to their operations. You need to configure the data warehouse so each department can only view its own sales data. Each correct answer presents part of the solution. Which two actions should you take?
Create a security policy with a filter predicate for departments. Implement row-level security using department identifiers.
46
Your organization uses Microsoft Fabric to manage a data warehouse containing sensitive customer information, including email addresses and credit card numbers. You need to prevent nonprivileged users from viewing full email addresses and credit card numbers. Each correct answer presents part of the solution. Which two actions should you perform?
Use dynamic data masking on CreditCardNumber with partial() function. Use dynamic data masking on Email with email() function.
47
A company uses Microsoft Fabric to orchestrate data processing workflows with tasks dependent on previous task completion. You need to implement an orchestration pattern ensuring tasks execute in sequence based on dependencies. What should you do?
Use a pipeline with dependencies.
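
In a Fabric pipeline the dependencies are drawn between activities. The ordering idea behind them can be sketched with Python's standard `graphlib` module (Python 3.9 or later), which yields tasks only after everything they depend on. The task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must complete before it runs.
dependencies = {
    "ingest_sales": set(),
    "clean_sales": {"ingest_sales"},
    "build_model": {"clean_sales"},
    "refresh_report": {"build_model"},
}

# static_order() returns the tasks in an order that respects every dependency.
order = list(TopologicalSorter(dependencies).static_order())

completed = []
for task in order:
    # A task runs only once all of its upstream tasks have completed.
    assert dependencies[task] <= set(completed)
    completed.append(task)

print(order)  # ['ingest_sales', 'clean_sales', 'build_model', 'refresh_report']
```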
48
What are the 2 ways to query tables in a KQL database?
Kusto Query Language (KQL) code or use a restricted subset of Structured Query Language (SQL) statements.
49
What does this KQL query do? Automotive | project trip_id, pickup_datetime, fare_amount | sort by pickup_datetime desc | where fare_amount > 20
Projects three columns (trip_id, pickup_datetime, fare_amount) from the Automotive table, sorts the rows by pickup_datetime in descending order, and then keeps only the rows where fare_amount is greater than 20.
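
The same three steps can be traced in plain Python over a list of dictionaries, applied in the order the query states them (project, sort, where). This only illustrates the query's logic; the sample rows are hypothetical.

```python
from datetime import datetime

# Hypothetical rows standing in for the Automotive table.
automotive = [
    {"trip_id": 1, "vendor_id": "A", "pickup_datetime": datetime(2024, 1, 1, 8, 0), "fare_amount": 12.5},
    {"trip_id": 2, "vendor_id": "B", "pickup_datetime": datetime(2024, 1, 1, 9, 30), "fare_amount": 31.0},
    {"trip_id": 3, "vendor_id": "A", "pickup_datetime": datetime(2024, 1, 2, 7, 15), "fare_amount": 45.0},
]

# | project trip_id, pickup_datetime, fare_amount
columns = ("trip_id", "pickup_datetime", "fare_amount")
projected = [{name: row[name] for name in columns} for row in automotive]

# | sort by pickup_datetime desc
ordered = sorted(projected, key=lambda row: row["pickup_datetime"], reverse=True)

# | where fare_amount > 20
result = [row for row in ordered if row["fare_amount"] > 20]

print([row["trip_id"] for row in result])  # [3, 2]
```

Filtering before sorting would return the same rows, and it is usually the cheaper order because fewer rows need to be sorted.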
50
Besides KQL databases in Eventhouses, what other products support KQL?
Azure Monitor and Microsoft Sentinel (formerly Azure Sentinel)
51
In KQL databases, what is a disadvantage of using SQL for querying?
One major disadvantage of using SQL over KQL is that it's not the native language of the engine and has to go through a transformer. This language difference prevents it from being published to Power BI directly from the Queryset.
52
True or False: To query on ingestion_time, a datetime value needs to be included with the data.
False. When data is continuously loaded from a streaming source (for example, by an eventstream), you can use the ingestion_time() function to filter based on when the data was loaded into the table, even if the data doesn't include a time-based value on which to filter. For example, the following query retrieves the events that were loaded into the table in the past hour: Automotive | where ingestion_time() > ago(1h) | project trip_id, vendor_id, pickup_datetime, fare_amount
53
True or False: KQL supports Materialized Views and Stored Functions.
True
54
What is an eventhouse?
A data store for real-time data in KQL databases.
55
What is the purpose of using the project operator in KQL (Kusto Query Language) queries?
To specify which columns to include in your query output.
56
How can you create a reusable parameterized query for a KQL database?
Create a stored function.
57
You have access to a historical dataset that contains the monthly expenses of the marketing department. You want to generate predictions of the expenses for the coming month. What task do you need to perform to predict the expenses for the coming month?
Forecasting
58
Which feature in Microsoft Fabric should you use to review the results of MLflow's tracking through a user interface?
Experiments
59
Which feature in Microsoft Fabric should you use to accelerate data exploration and cleansing? (for data science)
Data Wrangler
60
Where is all data in Fabric stored?
OneLake, which is built on Azure Data Lake Storage (ADLS) Gen2.
61
What is a real-time dashboard?
A Real-Time Intelligence item that can display data from a real-time streaming source in Microsoft Fabric.
62
How can you enable users to filter a real-time dashboard interactively?
Add parameters to the dashboard.
63
How can you ensure users always see fresh data in a real-time dashboard?
Configure auto refresh for the dashboard.
64