IDSA PRELIMS Flashcards

1
Q

What are new techniques to solve problems?

A

Data Science & Analytics

2
Q

What are the different roles in analytics?

A

Collector/ Data Steward
Business Analyst
Modeler/Data Scientist
Data Engineer

3
Q

What are the APEC Analytics Competencies?

A
  • Domain Knowledge & Application
  • Data Management & Governance
  • Operational Analytics
  • Data Visualization & Presentation
  • Research Methods
  • Data Engineering Principles
  • Statistical Techniques
  • Data Analytics Methods & Algorithms
  • Computing
  • 21st Century Skills
4
Q

Who has the best domain knowledge?

A

Steward
Analyst
Manager

5
Q

Who has the best data governance?

A

Steward
Manager

6
Q

Who has the best operational analytics?

A

ALL

7
Q

Who has the best data visualization?

A

Analyst
Manager

8
Q

Who has the best research methods?

A

Scientist

9
Q

Who has the best data engineering?

A

Engineer

10
Q

Who has the best statistical techniques?

A

Scientist

11
Q

Who has the best methods and algorithms?

A

Scientist

12
Q

Who has the best computing?

A

Scientist

13
Q

Who has the best 21st century skills?

A

ALL

14
Q

What are the components of the Data Science Skillset?

A

Hacking Skills
Math and Statistics Knowledge
Substantive Expertise
Machine Learning
Traditional Research
Danger Zone

15
Q

Data science requires the intersection of what abilities?

A

Hacking skills
Math and Statistics Knowledge
Substantive Expertise

16
Q

Necessary for working with massive amounts of electronic data

A

Hacking skills

17
Q

Crucial for generating motivating questions and hypotheses and interpreting results

A

Substantive expertise

18
Q

Allows a data scientist to choose appropriate methods and tools in order to extract insight from data

A

Math & Statistics knowledge

19
Q

Stems from combining hacking skills with math and statistics knowledge, but does not require scientific motivation

A

Machine learning

20
Q

Lies at the intersection of knowledge of math and statistics with substantive expertise in a scientific field

A

Traditional Research

21
Q

Hacking skills combined with substantive scientific expertise, but without rigorous methods, which can beget incorrect analyses

A

Danger Zone

22
Q

Data Science or Data Analytics: Uses big data

A

Both

23
Q

Data Science or Data Analytics: Healthcare, gaming, travel, industries with immediate data needs

A

Analytics

24
Q

Data Science or Data Analytics: Macro

A

Science

25
Data Science or Data Analytics: To ask the right questions
Science
26
Data Science or Data Analytics: Machine learning, AI, Search engine, engineering, corporate analytics
Science
27
Data Science or Data Analytics: To find actionable data
Analytics
28
Data Science or Data Analytics: Micro
Analytics
29
What is the mother of innovation?
Necessity
30
What is the goal of report writing?
Automation
31
What are the goals of a centralized system?
ERP - Enterprise Resource Planning
MIS - Management Information System
32
Goals: Apps for everyone
Business Intelligence
33
Where is data science and analytics seen?
Education
Environment
Healthcare
34
The process of knowledge discovery, machine learning, and predictive analytics.
Data Mining
35
Data mining is NOT about?
  • Descriptive statistics
  • Exploratory visualization
  • Dimensional slicing
  • Hypothesis testing
  • Queries
36
Data mining involves extracting ____, building ____, and is a combination of ____, ____, and ____.
  • Extracting meaningful patterns
  • Building representative models
  • Combination of statistics, machine learning, and computing algorithms
37
Types of Learning Models in Data Mining?
Supervised / Directed
Unsupervised / Undirected
38
What model of data mining: generalizes the relationship between the input and output variables.
Supervised
39
What model of data mining: to find patterns in data based on the relationship between data points themselves
Unsupervised
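A rough illustration of the two model types, as a minimal sketch in plain Python rather than RapidMiner. The function names and sample numbers are made up for this example, and the clustering routine assumes the points contain at least two distinct values. The supervised function learns from paired inputs and outputs; the unsupervised one sees only the data points.

```python
def fit_line(xs, ys):
    """Supervised: learn y = slope * x + intercept from labeled (x, y) pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def two_means(points, iterations=10):
    """Unsupervised: split 1-D points into two groups using only their distances."""
    low, high = min(points), max(points)  # initial cluster centers
    for _ in range(iterations):
        group_low = [p for p in points if abs(p - low) <= abs(p - high)]
        group_high = [p for p in points if abs(p - low) > abs(p - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return sorted(group_low), sorted(group_high)


# Supervised: outputs (ys) are given, so the relationship can be generalized.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Unsupervised: no outputs, only the points themselves.
clusters = two_means([1.0, 1.2, 0.8, 9.0, 9.5, 10.1])
```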
40
DATA MINING: Groups of Learning Models?
  • Classification Models (S)
  • Regression Models (S)
  • Clustering Models (S/US)
  • Anomaly Detection (US)
  • Time Series Forecasting (US)
  • Association (US)
  • Text and Sentiment Analysis (US)
41
DATA MINING: Steps?
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Testing and Evaluation
6. Deployment
42
Data cleaning is the process of preparing data for analysis by removing or modifying?
Incorrect, incomplete, irrelevant, duplicated, or improperly formatted data.
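The cleaning categories on this card can be sketched in plain Python. This is a minimal illustration, not RapidMiner's behavior; the field names `name` and `age` and the sample rows are hypothetical.

```python
def clean_records(records, required=("name", "age")):
    """Drop incomplete and duplicated records; fix simple formatting problems."""
    seen = set()
    cleaned = []
    for record in records:
        # Improperly formatted: strip stray whitespace from string values.
        row = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
        # Incomplete: skip rows that are missing a required field.
        if any(row.get(field) in (None, "") for field in required):
            continue
        # Duplicated: skip rows that were already kept.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


raw = [
    {"name": " Ana ", "age": 31},
    {"name": "Ana", "age": 31},    # duplicate once whitespace is trimmed
    {"name": "Ben", "age": None},  # incomplete
    {"name": "Cara", "age": 27},
]
cleaned = clean_records(raw)
```

Trimming happens before the duplicate check so that entries differing only in whitespace count as the same record.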
43
Parts of the RapidMiner interface
  • Repository
  • Canvas
  • Operators/Analysis tabs
  • Parameter tabs
  • Description tabs
44
How to import data in RapidMiner?
File > Import Data, or click the Repository tab
45
Types of data when importing?
  • polynominal
  • binominal
  • real
  • integer
  • date_time
  • date
  • time
46
What type of data: many different string values (for example: red, green, blue, yellow)
polynominal
47
What type of data: (for example: 23.12.2014 17:59).
date_time
48
What type of data: a fractional number (for example: 11.23 or -0.0001).
real
49
What type of data: (for example 23.12.2014).
date
50
What type of data: (for example 17:59).
Time
51
What type of data: a whole number (for example: 23, -5, or 11,024,768).
Integer
52
What type of data: exactly two values (for example: true/false, yes/no)
binominal
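The type cards above can be tied together with a small sketch that guesses a type name for a column of raw strings. This is only an illustration in plain Python, not how RapidMiner detects types: `guess_type` is a hypothetical helper, the date formats are assumed from the card examples (day.month.year), and the names follow RapidMiner's spelling of the nominal types (`binominal`, `polynominal`).

```python
from datetime import datetime


def _matches(value, fmt):
    """Return True if the string parses completely with the given format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def guess_type(values):
    """Guess a RapidMiner-style type name for a column of strings."""

    def all_parse(cast):
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "real"
    formats = [
        ("%d.%m.%Y %H:%M", "date_time"),
        ("%d.%m.%Y", "date"),
        ("%H:%M", "time"),
    ]
    for fmt, name in formats:
        if all(_matches(v, fmt) for v in values):
            return name
    # Remaining columns are treated as nominal: two values vs. many values.
    return "binominal" if len(set(values)) == 2 else "polynominal"


column_types = {
    "color": guess_type(["red", "green", "blue", "yellow"]),
    "flag": guess_type(["yes", "no", "yes"]),
    "price": guess_type(["11.23", "-0.0001"]),
    "count": guess_type(["23", "-5"]),
    "stamp": guess_type(["23.12.2014 17:59"]),
}
```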
53
After importing data, the data will appear in the ______ tab.
Results
54
To find the basic statistics of each attribute, click _____.
Statistics
55
When filtering cases, you may add more criteria by clicking ____.
Add Entry.
56
In missing value imputation, what may you do instead of filtering?
Remove all cases with missing values by using the condition class instead of Add Filters.
57
To impute missing data, search the Operators tab for ____, then drag and drop it onto the line connecting the Filter Examples operator and the res port.
Replace Missing Values
58
In dealing with miscoded data, to remove "white spaces" in the encoding, use the ____ operator.
Trim
59
In dealing with miscoded data, connect the _____ and the ______.
The out port of the Retrieve Customer operator and the second res port of the results
60
To remove “duplicates” in the encoding, use the _____ operator.
Remove Duplicates
61
To recode miscoded values, use the _____ operator.
Replace
62
You may impute missing values in other attributes using the _______ operator.
Replace Missing Values
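What imputation does to an attribute can be sketched in a few lines of plain Python. This assumes missing values are replaced with the average of the attribute; the operator offers other replacement options, and the helper name and sample ages are hypothetical.

```python
def impute_with_average(values):
    """Replace None entries with the average of the non-missing values."""
    present = [v for v in values if v is not None]
    average = sum(present) / len(present)
    return [average if v is None else v for v in values]


ages = [20, None, 30, 40, None]
imputed = impute_with_average(ages)  # the two None entries become 30.0
```

Unlike filtering, this keeps every case in the data set, at the cost of filling in estimated values.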
63
Use the ______ operator to select the attributes that you need for analysis.
Select Attributes
64
When is the Set Role operator used?
To tag the attribute that will be used as the label (target variable) or in any other role it will play in the analysis.
65
When is the Join operator needed?
When two data sets need to be merged in order to make an analysis.
66
Connect the first data set or its result to the (right/left) port of the Join operator and the other data set to the (right/left) port.
Left; right
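A minimal sketch of what merging a left and a right data set on a shared key looks like, written in plain Python. It assumes an inner join (only keys present on both sides are kept), which is one of several join types; the `id` key and the customer and order rows are hypothetical.

```python
def inner_join(left, right, key):
    """Merge rows from two data sets that share the same value for `key`."""
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)
    joined = []
    for left_row in left:
        # Each matching right row produces one merged row.
        for right_row in right_by_key.get(left_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


customers = [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Ben"}]           # left
orders = [{"id": 1, "total": 50}, {"id": 1, "total": 20}, {"id": 3, "total": 99}]  # right
merged = inner_join(customers, orders, "id")
```

Ben (no orders) and order 3 (no customer) drop out because neither key appears in both data sets.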
67
What are the steps of data preparation in RapidMiner?
1. Importing Data
2. Data Preparation
3. Data Filtering
4. Missing Value Imputation
5. Dealing with Miscoded Entries
6. Selecting and Setting Roles of Attributes
7. Combining Data Sets
8. Data Cleaning
68
What is data visualization?
The graphical representation of data: techniques used to communicate insights from data through visual representation.
69
What are the objectives of data visualization?
  • To distill large datasets into visual graphics to allow for easy understanding of complex relationships within the data
  • To analyze massive amounts of information and make data-driven decisions
70
What are the common visualization techniques?
  • Bar Graph
  • Line Graph
  • Pie Graph
  • Histogram
  • Scatterplot
  • Boxplot
  • Heatmap
71
What Common Visualization Technique: to compare counts, percentages, or other measures (such as averages) for different discrete categories of data
Bar Graph
72
T or F: Bar graphs in RapidMiner use aggregated data
T
73
In creating bar graphs in RapidMiner, Set the ________ and use the _____ function.
Group by Stage; Average aggregate
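The aggregation behind such a bar graph can be sketched in plain Python: group the rows by a category, then average a numeric attribute within each group. `Stage` comes from the card; the `Amount` attribute and the sample rows are hypothetical.

```python
from collections import defaultdict


def average_by_group(rows, group_key, value_key):
    """Return {group: average of value_key} for the rows in each group."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [running sum, row count]
    for row in rows:
        totals[row[group_key]][0] += row[value_key]
        totals[row[group_key]][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


rows = [
    {"Stage": "A", "Amount": 10},
    {"Stage": "A", "Amount": 20},
    {"Stage": "B", "Amount": 5},
]
bar_heights = average_by_group(rows, "Stage", "Amount")  # one bar per Stage
```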
74
What Common Visualization Technique: to observe trend
Line Graph
75
What Common Visualization Technique: shows the relative contribution that different categories contribute to an overall total
Pie Graph
76
What Common Visualization Technique: shows the frequency distribution of a continuous attribute
Histogram
77
A bar graph presents a ____ attribute, while a histogram presents a ____ attribute.
categorical; numerical
78
T or F: Histograms have spaces between the bars
F
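Why the bars touch can be seen from how the counts are produced: a continuous range is cut into adjacent intervals, so there is no gap between one bin and the next. A minimal sketch in plain Python, assuming equal-width bins and values that are not all identical; the helper name and sample values are hypothetical.

```python
def histogram_counts(values, bins):
    """Count how many values fall into each of `bins` equal-width intervals."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would land one past the end, so clamp it
        # into the last bin.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    # Bin i covers edges[i] up to edges[i + 1]; consecutive bins share an edge.
    edges = [low + i * width for i in range(bins + 1)]
    return edges, counts


edges, counts = histogram_counts([1, 2, 2, 3, 5, 8, 9, 10], bins=3)
```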
79
T or F: In creating a histogram, CHECK the reverse axis to keep the order of the values.
F; do not check
80
T or F: There can be a histogram for two or more variables
T
81
What Common Visualization Technique: plots two numerical attributes
Scatterplot
82
What Common Visualization Technique: graphical representation of the quartiles
Boxplots
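The values a boxplot draws can be computed with Python's standard library. This is a minimal sketch; quartile conventions differ between tools, and the "inclusive" method used here is an assumption, not necessarily what RapidMiner uses. The helper name and sample values are hypothetical.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

The box spans Q1 to Q3 with a line at the median; the whiskers extend toward the minimum and maximum.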
83
What Common Visualization Technique: graphical representation of data where the individual values contained in a matrix (map) are represented as colors.
Heat maps