1A. Discover data analysis Flashcards

1
Q

What are descriptive analytics?

A

Help answer questions about what has happened based on historical data. These techniques summarize large datasets to describe outcomes to stakeholders.

Examples include KPIs, metrics such as ROI, and reports that provide a view of an organization’s sales and financial data.
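
A minimal Python sketch of this kind of summary. The monthly figures, the investment amount, and the function names are hypothetical, and treating total sales as the gain in the ROI formula is a simplifying assumption.

```python
from statistics import mean

# Hypothetical monthly sales figures; the numbers are illustrative only.
monthly_sales = {"Jan": 12000.0, "Feb": 15000.0, "Mar": 9000.0}
investment = 20000.0  # assumed total cost over the same period


def summarize_sales(sales: dict) -> dict:
    """Summarize historical sales into a few descriptive figures."""
    values = list(sales.values())
    return {
        "total": sum(values),
        "average": mean(values),
        "best_month": max(sales, key=sales.get),
    }


def roi(gain: float, cost: float) -> float:
    """Return on investment as a fraction: (gain - cost) / cost."""
    return (gain - cost) / cost


summary = summarize_sales(monthly_sales)
sales_roi = roi(summary["total"], investment)
print(summary)
print(f"ROI: {sales_roi:.0%}")
```

Both functions only describe what already happened; neither explains why a month was weak or predicts the next one.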

2
Q

What are diagnostic analytics?

A

Help answer questions about why events happened. Generally, this process occurs in three steps:

  1. Identify anomalies in the data. These anomalies might be unexpected changes in a metric or a particular market.
  2. Collect data that’s related to these anomalies.
  3. Use statistical techniques to discover relationships and trends that explain these anomalies.
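
The three steps above can be sketched in Python as follows. This is an illustration, not a prescribed method: the sample figures, the z-score threshold of 2, and the function names `find_anomalies` and `pearson` are all assumptions introduced for the example.

```python
from statistics import mean, pstdev

# Hypothetical daily figures; all values are made up for illustration.
daily_sales = [100, 102, 98, 101, 60, 99, 103]
# Step 2: related data collected for the same days (assumed to be available).
site_outage_minutes = [0, 0, 5, 0, 240, 0, 0]


def find_anomalies(values, threshold=2.0):
    """Step 1: return the indexes whose z-score magnitude exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # a constant series has no outliers by this measure
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


def pearson(xs, ys):
    """Step 3: Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


anomalies = find_anomalies(daily_sales)
correlation = pearson(daily_sales, site_outage_minutes)
print("Anomalous day indexes:", anomalies)
print("Correlation between sales and outage minutes:", round(correlation, 3))
```

A strong negative correlation here would suggest, but not prove, that the outage explains the drop in sales.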
3
Q

What are predictive analytics?

A

Help answer questions about what will happen in the future. These techniques use historical data to identify trends and determine whether they’re likely to recur. They include a variety of statistical and machine learning methods, such as neural networks, decision trees, and regression.
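
As a small illustration of regression, the simplest of the techniques named above, the following Python sketch fits a straight line to historical values by ordinary least squares and extrapolates one period ahead. The quarterly revenue figures and the function name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: revenue for quarters 1 to 4.
quarters = [1, 2, 3, 4]
revenue = [10.0, 12.0, 14.0, 16.0]

slope, intercept = fit_line(quarters, revenue)
forecast_q5 = slope * 5 + intercept  # extrapolate the trend to quarter 5
print(f"Forecast for quarter 5: {forecast_q5:.1f}")
```

The forecast is only as good as the assumption that the historical trend continues.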

4
Q

What are prescriptive analytics?

A

Help answer questions about which actions should be taken to achieve a goal or target. These techniques rely on machine learning as one of the strategies to find patterns in large datasets. By analyzing past decisions and events, organizations can estimate the likelihood of different outcomes.

5
Q

What are cognitive analytics?

A

Attempt to draw inferences from existing data and patterns, derive conclusions based on existing knowledge bases, and then add these findings back into the knowledge base for future inferences, creating a self-learning feedback loop. They help you learn what might happen if circumstances change and determine how you might handle these situations.

Inferences aren’t structured queries based on a rules database; rather, they’re unstructured hypotheses that are gathered from several sources and expressed with varying degrees of confidence. Effective analytics of this kind depend on machine learning algorithms, and will use several natural language processing concepts to make sense of previously untapped data sources, such as call center conversation logs and product reviews.

6
Q

List different kinds of analytics

A

Descriptive
Diagnostic
Predictive
Prescriptive
Cognitive

7
Q

List data roles

A

Business analyst
Data analyst
Data engineer
Data scientist
Database administrator

8
Q

What is a business analyst?

A

Closer to the business than a data analyst, and a specialist in interpreting the data that comes from the visualization. Often, the business analyst and data analyst roles are the responsibility of a single person.

9
Q

What is a data analyst?

A

Enables businesses to maximize the value of their data assets through visualization and reporting tools such as Microsoft Power BI. Responsible for profiling, cleaning, and transforming data. Responsibilities also include designing and building scalable and effective semantic models, and enabling and implementing advanced analytics capabilities in reports for analysis. Works with the pertinent stakeholders to identify appropriate and necessary data and reporting requirements, and then turns raw data into relevant and meaningful insights.

Also responsible for the management of Power BI assets, including reports, dashboards, workspaces, and the underlying semantic models that are used in the reports. They are tasked with implementing and configuring proper security procedures, in conjunction with stakeholder requirements, to ensure the safekeeping of all Power BI assets and their data.

Works with data engineers to determine and locate appropriate data sources that meet stakeholder requirements. Works with the data engineer and database administrator to ensure proper access to the needed data sources. Also works with the data engineer to identify new processes or improve existing processes for collecting data for analysis.

10
Q

What is a data engineer?

A

Provision and set up data platform technologies that are on-premises and in the cloud. They manage and secure the flow of structured and unstructured data from multiple sources. The data platforms that they use can include relational databases, nonrelational databases, data streams, and file stores. Data engineers also ensure that data services securely and seamlessly integrate across data platforms.

Primary responsibilities include the use of on-premises and cloud data services and tools to ingest, egress, and transform data from multiple sources. They collaborate with business stakeholders to identify and meet data requirements. They design and implement solutions.

Although some of their tasks overlap with those of a database administrator, a data engineer’s scope of work goes well beyond looking after a database and the server where it’s hosted, and likely doesn’t include overall operational data management.

As a data analyst, you would work closely with them in making sure that you can access the variety of structured and unstructured data sources because they will support you in optimizing semantic models, which are typically served from a modern data warehouse or data lake.

11
Q

What is a data scientist?

A

Perform advanced analytics to extract value from data. Their work can vary from descriptive analytics to predictive analytics. Descriptive analytics evaluate data through a process known as exploratory data analysis (EDA). Predictive analytics are used in machine learning to apply modeling techniques that can detect anomalies or patterns. These analytics are important parts of forecast models. Some might work in the realm of deep learning, performing iterative experiments to solve a complex data problem by using customized algorithms.

12
Q

What is a database administrator?

A

Implements and manages the operational aspects of cloud-native and hybrid data platform solutions that are built on Microsoft Azure data services and Microsoft SQL Server. They’re responsible for the overall availability and consistent performance and optimizations of the database solutions. They work with stakeholders to identify and implement the policies, tools, and processes for data backup and recovery plans.

This role is different from the role of a data engineer. This role monitors and manages the overall health of a database and the hardware that it resides on, whereas a data engineer is involved in the process of data wrangling, in other words, ingesting, transforming, validating, and cleaning data to meet business needs and requirements.

Also responsible for managing the overall security of the data, granting and restricting user access and privileges to the data as determined by business needs and requirements.

13
Q

What are the five key areas of work for a data analyst?

A

Prepare
Model
Visualize
Analyze
Manage

14
Q

Describe data preparation

A

The process of profiling, cleaning, and transforming your data to get it ready to model and visualize. It involves, among other things, ensuring the integrity of the data, correcting wrong or inaccurate data, identifying missing data, converting data from one structure to another or from one type to another, or even a task as simple as making data more readable.

It also involves understanding how you’re going to get and connect to the data, and the performance implications of those decisions. When connecting to data, you need to make decisions that ensure models and reports meet, and perform to, acknowledged requirements and expectations.

Privacy and security assurances are also important. These assurances can include anonymizing data to avoid oversharing or preventing people from seeing personally identifiable information when it isn’t needed. Alternatively, helping to ensure privacy and security can involve removing that data completely if it doesn’t fit in with the story that you’re trying to shape.
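
A minimal Python sketch of the profiling, cleaning, transforming, and anonymizing steps described above. The rows, column names, and function names are hypothetical, and real preparation in Power BI would typically be done in Power Query rather than in code like this.

```python
# Hypothetical raw rows as they might arrive from a source system.
raw_rows = [
    {"customer_id": "C001", "email": "c001@example.com", "amount": " 120.50 "},
    {"customer_id": "C002", "email": "c002@example.com", "amount": ""},
    {"customer_id": "C003", "email": "c003@example.com", "amount": "80"},
]


def count_missing(rows, column):
    """Profiling: count rows where the column is empty or only whitespace."""
    return sum(1 for row in rows if not row[column].strip())


def prepare(rows):
    """Cleaning and transforming: trim text, convert amounts from text to
    numbers, and leave out the email column because it is personally
    identifiable information that this report doesn't need."""
    prepared = []
    for row in rows:
        text = row["amount"].strip()
        prepared.append(
            {
                "customer_id": row["customer_id"],
                # Keep missing values explicit instead of guessing a number.
                "amount": float(text) if text else None,
            }
        )
    return prepared


missing_amounts = count_missing(raw_rows, "amount")
clean_rows = prepare(raw_rows)
print("Rows with a missing amount:", missing_amounts)
print(clean_rows)
```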

15
Q

Describe data modeling

A

The process of determining how your tables are related to each other. This process is done by defining and creating relationships between the tables.
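
To make the idea of a relationship concrete, here is a small Python sketch of a one-to-many relationship between two tables. The table contents, the `product_id` key, and the function name are hypothetical; in Power BI the relationship would be defined in the model view rather than written as code.

```python
# "One" side: each product appears once, keyed by product_id.
products = {
    1: {"name": "Widget"},
    2: {"name": "Gadget"},
}

# "Many" side: each sale refers to a product through product_id.
sales = [
    {"product_id": 1, "quantity": 3},
    {"product_id": 2, "quantity": 1},
    {"product_id": 1, "quantity": 2},
]


def quantity_by_product(sales_rows, product_table):
    """Follow the relationship from each sale to its product and total the quantities."""
    totals = {}
    for row in sales_rows:
        name = product_table[row["product_id"]]["name"]
        totals[name] = totals.get(name, 0) + row["quantity"]
    return totals


totals = quantity_by_product(sales, products)
print(totals)
```

Without the shared key, there would be no way to report sales by product name.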

16
Q

Describe data visualization

A

The goal is to solve business problems. A well-designed report should tell a compelling story about that data, which will enable business decision makers to quickly gain needed insights. By using appropriate visualizations and interactions, you can provide an effective report that guides the reader through the content quickly and efficiently, therefore allowing the reader to follow a narrative into the data.

An important aspect is accessibility.

17
Q

Describe the analyze task

A

The important step of understanding and interpreting the information that is displayed on the report. In your role as a data analyst, you should understand the analytical capabilities of Power BI and use those capabilities to find insights, identify patterns and trends, predict outcomes, and then communicate those insights in a way that everyone can understand.

18
Q

Describe the data management task

A

Overseeing the sharing and distribution of items, such as reports and dashboards, and ensuring the security of Power BI assets.

Sharing and discovery of your content is important for the right people to get the answers that they need. It is also important to help ensure that items are secure. You want to make sure that the right people have access and that you are not leaking data past the correct stakeholders.

It can also help reduce data silos within your organization. Data duplication makes data harder to manage and can introduce data latency when resources are overused. Power BI helps reduce data silos through shared semantic models, which let you reuse data that you have already prepared and modeled. For key business data, endorsing a semantic model as certified can help to ensure trust in that data.