The Data Analyst's Toolbox Flashcards

1
Q

What are some common tools used for data analysis

A
  • spreadsheets
  • query Languages
  • visualization tools
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is a spreadsheet?

Remember working in Excel

A

A digital worksheet that
* stores
* organizes
* sorts
data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are some ways a spreadhseet helps a data analyst?

A

It helps them:
* see patterns
* group information
* create data visualizations

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What are formulas in spreadsheets?

A

Mathematical equations/instructions used to perform specific calculations

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What are functions in spreadsheets?

A

Preset/Predefined formulas that automatically perform a calculation/task

they cannot be internally edited / are fixed
eg SUM(argument1, argument2,…)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is a query language?

A

It is a **computer prpgramming language ** that allows one to retrieve and manipulate data from a database.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Why is Structured Query Language a widely used language

A

It is easy to understand and works well with many kinds of databases.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What are some uses of query languages?

A

They allow analysts to select, retrieve, request for, create, add, delete and update data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What are data visualizatioon tools used for?

Think of the Tableu Viz of the day

A
  • turning complex numbers into a story
  • help stakeholders come to conclusions leading to informed data driven decisions and strategies.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Give two example of data visualization tools.

A
  1. Tableu - has drag and drop features to create visuals
  2. Looker- communicates directly with database and connects data to visual tool.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Overall, what determined which tool to use in the data analysis process between databases and spreadsheets?

A

criteria based on numerous factors such as
* number of users,
* type of data
* availability
* analytical steps required

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

When is it ideal to use a database?

A

When dealing with large amounts of complex data with multiple dimensions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What are some advantages of databases?

A
  1. Unlimited amount of data
  2. Ability to build relations and link data across tables
  3. Efficiently hosting multiple users
  4. Suitable when data is over 100 rows
  5. Information can automatically update itself across all tables
  6. Summary reports - since all tables are linked
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What are some limitations of databases?

A
  1. Only give output in a standard table form (thus limited in visuals)
  2. Have limited calculation abilities.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What are some advantages of spreadhseets?

A
  1. Can perform complex calculations and analyses.
  2. Have in-built visualization tools
  3. Suitable for finance (straightforward number crunching) and data analysis

number crunching- like profit, total sales

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What are some limitations of spreadsheets?

A
  1. Limited in relating and connecting data especially across tables and different files.
  2. One has to manually enter data
  3. Becomes challenging when data exceeds 100 rows.
17
Q

Every programming language follows a predetermined set of guidelines and structure known as,

A

Syntax

18
Q

What is the basic structure of an SQL Query?

A

SELECT (columns)
FROM (dataset.table)
WHERE (conditions);

it is more efficient to select only what you need.
semicolon (;) **statement terminator **

19
Q

Write an SQL query selecting the columns that contain a customer’s first name, last name and ID from the customer name table in the customer data dataset. Furthermore, write one to retrieve customer data from a customers whose first name is Tony.

A

SELECT
first_name,
last_name,
customer_ID

FROM
customer_data.customer_name

WHERE
first_name= ‘Tony’;

20
Q

Write an SQL query selecting the columns that contain a customer’s first name, last name and ID from the customer name table in the customer data dataset. Furthermore, write one to retrieve customer data from a customers whose first name is Tony, last name Magnolia and customer id is greater than 0.

A

SELECT
first_name,
last_name,
customer_ID

FROM
customer_data.customer_name

WHERE
first_name= ‘Tony’
AND last_name= ‘Magnolia’
AND customer_ID > 0;

customer_ID > 0 is a logical condition

21
Q

True or False

Using indentation and capitalization in SQL helps one to review and troubleshoot their code more easily.

A

True

22
Q

Write a WHERE command that looks for all names that start with ch

A

SELECT *
FROM customers.customers_name
WHERE first_name LIKE ‘ch%’;

% is a **wildcard **

23
Q

To save on time when reviewing long, complex queries after a long time one should write what alongside their code?

Write an example comment on SELECT colomns

A

Comments

SELECT columns – this is customer name
SELECT columns /* */

24
Q

True or False

Comments can be written outside a query statement and within

A

True

25
Q

Write an SQL statement to assign a new alias to a column known as field 1 to first name.

(let it be known as)

A

field_1 AS first_name – Alias to assign new name

AS clause

26
Q

True or False

Aliases do not inherently change the name and they only last within the query.

A

True

27
Q

What does this command mean.
WHERE first_name <> ‘John’

A

where first name is not equal / not John.