Data Analysis Flashcards

(56 cards)

1
Q

must be based on a solid understanding of statistical analysis and epidemiological concepts.

A

Definitions used in data analysis

2
Q

The data include all positive cases, taking into account variables and decreasing the number of false-negatives.

A

Sensitivity

3
Q

The data include only those cases specific to the needs of the measurement, excluding those from a different population thereby decreasing the number of false-positives.

A

Specificity
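
A minimal Python sketch of cards 2 and 3, using hypothetical counts from a screening test. It assumes the usual definitions: sensitivity is the share of true cases that were caught (fewer false-negatives raises it), and specificity is the share of non-cases correctly excluded (fewer false-positives raises it). The function names and numbers are illustrative only.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of actual positive cases that the measure detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of actual negative cases that the measure correctly excluded."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, not taken from any real data set.
tp, fn = 90, 10    # 100 people who truly have the condition
tn, fp = 160, 40   # 200 people who truly do not

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```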

4
Q

Data are classified according to subsets, taking variables into consideration.

A

Stratification
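
One way to picture stratification is to compute the same rate separately for each subset. The sketch below assumes made-up discharge records split by a hypothetical `age_group` field; the field names and the helper `rate_by_stratum` are illustrative, not from any particular system.

```python
from collections import defaultdict

# Hypothetical records: each row is one discharged patient.
records = [
    {"age_group": "under 65", "readmitted": True},
    {"age_group": "under 65", "readmitted": False},
    {"age_group": "under 65", "readmitted": False},
    {"age_group": "under 65", "readmitted": False},
    {"age_group": "65 and over", "readmitted": True},
    {"age_group": "65 and over", "readmitted": True},
    {"age_group": "65 and over", "readmitted": False},
    {"age_group": "65 and over", "readmitted": False},
]


def rate_by_stratum(rows, stratum_key, outcome_key):
    """Return the outcome rate within each subset (stratum)."""
    totals = defaultdict(int)
    events = defaultdict(int)
    for row in rows:
        stratum = row[stratum_key]
        totals[stratum] += 1
        events[stratum] += int(row[outcome_key])
    return {stratum: events[stratum] / totals[stratum] for stratum in totals}


rates = rate_by_stratum(records, "age_group", "readmitted")
overall = sum(r["readmitted"] for r in records) / len(records)
print(rates, overall)
```

The overall rate hides the difference between the two subsets, which is the reason for stratifying.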

5
Q

The tool/indicator collects and measures the necessary data

A

Recordability

6
Q

Results should be reproducible.

A

Reliability

7
Q

The tool or indicator should be easy to use and understand.

A

Usability

8
Q

Collection measures the target adequately, so that the results have predictive value.

A

Validity

9
Q

a method by which to identify patterns and relationships in large amounts of data, such as the identification of risk factors or the effectiveness of interventions.

A

Knowledge discovery in databases (KDD)

10
Q

The steps to KDD include

A

selecting data, preprocessing (e.g., assembling target data set, cleaning data of noise), transforming data, data mining, and interpreting results.
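
The five steps can be sketched as a toy pipeline. Everything here is an assumption for illustration: the records are invented, the field names are hypothetical, and the "mining" step is only a simple threshold standing in for a real pattern-finding method.

```python
# Hypothetical raw records from a source system.
raw = [
    {"patient": "A", "temp_f": 98.6, "unit": "ward 1"},
    {"patient": "B", "temp_f": None, "unit": "ward 1"},
    {"patient": "C", "temp_f": 102.2, "unit": "ward 2"},
    {"patient": "D", "temp_f": 100.4, "unit": "ward 2"},
]

# 1. Select: keep only the fields needed for the question.
selected = [(r["patient"], r["temp_f"]) for r in raw]

# 2. Preprocess: clean the data by dropping missing readings.
cleaned = [(p, t) for p, t in selected if t is not None]

# 3. Transform: convert Fahrenheit to Celsius, rounded to one decimal.
transformed = [(p, round((t - 32) * 5 / 9, 1)) for p, t in cleaned]

# 4. Mine: a stand-in rule that flags readings at or above 38.0 C.
flagged = [p for p, celsius in transformed if celsius >= 38.0]

# 5. Interpret: report the result for a human to act on.
print(f"{len(flagged)} of {len(cleaned)} usable readings flagged: {flagged}")
```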

11
Q

the analysis (often automatic) of large amounts of data to identify underlying or hidden patterns.

A

Data Mining

12
Q

may be applied to multiple patients’ electronic health records to generate information about the need for further examination or interventions.

A

Data Mining

13
Q

The steps to data mining include

A

detecting anomalies, identifying relationships, clustering, classifying, regressing, and summarizing.

14
Q

involves electronically searching through large amounts of information to find relevant items.

A

Data Mining

15
Q

Data mining uses several tools to look for patterns:

A

Association rule mining
Classification
Clustering

16
Q

This tool looks for patterns in which a certain data object shows up repeatedly (more than randomly) and is associated with an unrelated data object.

A

Association Rule mining
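
"Shows up repeatedly (more than randomly)" is commonly quantified with support, confidence, and lift. The sketch below assumes those standard measures and a made-up list of diagnosis sets; a lift above 1 suggests two items occur together more often than chance alone would predict.

```python
# Hypothetical data: each set lists the diagnoses on one patient record.
transactions = [
    {"diabetes", "hypertension"},
    {"diabetes", "hypertension", "obesity"},
    {"hypertension"},
    {"diabetes", "obesity"},
    {"asthma"},
]


def support(itemset):
    """Fraction of records that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent):
    """Of the records containing antecedent, the fraction also containing consequent."""
    return support(antecedent | consequent) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence relative to how common the consequent is overall."""
    return confidence(antecedent, consequent) / support(consequent)


rule_support = support({"diabetes", "hypertension"})
rule_confidence = confidence({"diabetes"}, {"hypertension"})
rule_lift = lift({"diabetes"}, {"hypertension"})
print(rule_support, round(rule_confidence, 3), round(rule_lift, 3))
```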

17
Q

This tool looks for data group membership. An example would be the number of sunny days in a year.

A

Classification

18
Q

This tool organizes data objects according to their similar characteristics. This results in a natural pattern or clustering of similar data.

A

Clustering
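
A small illustration of grouping by similarity, using one-dimensional k-means as an assumed technique (the card does not name a specific algorithm). The ages and starting centers are invented, and `kmeans_1d` is a hypothetical helper.

```python
def kmeans_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then move each center to its group mean."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ended up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


ages = [22, 25, 27, 61, 64, 70]  # hypothetical patient ages
centers, groups = kmeans_1d(ages, centers=[20.0, 80.0])
print(groups, centers)
```

The two groups emerge from the data itself; no labels were supplied in advance, which is what separates clustering from classification.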

19
Q

Data mining can also be called

A

Knowledge discovery

20
Q

refers to the collection and summation of data for further use, such as for statistical analysis.

A

Data aggregation

21
Q

may be used to collect information about an individual from multiple sources, often for targeted marketing purposes.

A

Data aggregation

22
Q

show the spread or dispersion of data.

A

Measures of distribution

23
Q

is the distance from the highest to the lowest number

A

Range

24
Q

measures the distribution spread around an average value.

A

Variance

25
is the square root of the variance and shows the dispersion of data above and below the mean in equally measured distances
standard deviation
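
The measures in cards 23 to 25 can be checked on a small made-up data set with Python's standard `statistics` module. This sketch assumes the population forms (`pvariance`, `pstdev`); the sample forms (`variance`, `stdev`) divide by n - 1 instead and give slightly larger values.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations, mean = 5

data_range = max(values) - min(values)   # highest minus lowest
variance = statistics.pvariance(values)  # mean squared distance from the mean
std_dev = statistics.pstdev(values)      # square root of the variance

print(data_range, variance, std_dev)
```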
26
a method of comparing rates or ratios.
Chi-square (χ²)
27
a means by which to establish if a variance in categorical data (as opposed to numerical data) is of statistical significance.
Chi-square test
28
generally used to show whether there is a significant difference between groups or conditions being analyzed.
Chi-square testing
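
A hedged sketch of the chi-square statistic from cards 26 to 28, computed by hand for a made-up 2 x 2 table of categorical counts. It assumes the plain Pearson formula (sum of (observed - expected)² / expected) with no continuity correction; `chi_square` is an illustrative helper, not a library function.

```python
def chi_square(observed):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows = two groups, columns = improved / not improved.
table = [[30, 20], [20, 30]]
stat = chi_square(table)
print(stat)
```

For a 2 x 2 table (1 degree of freedom), the commonly tabulated critical value at the 0.05 level is about 3.84, so a statistic above that would usually be read as a significant difference between the groups.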
29
used to analyze data to determine whether there is a statistically significant difference between the means of two groups; it examines two sets of data that are similar.
The "t" test
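
To make the comparison of two means concrete, the sketch below computes a t statistic by hand for two invented groups. It assumes the Welch (unequal variance) form of the test; the card does not say which variant is meant, and `welch_t` is a hypothetical helper.

```python
import math
import statistics


def welch_t(a, b):
    """t statistic: difference in means divided by its standard error."""
    var_a = statistics.variance(a)  # sample variance (divides by n - 1)
    var_b = statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


group_a = [5, 6, 7, 8, 9]  # hypothetical scores, mean 7
group_b = [1, 2, 3, 4, 5]  # hypothetical scores, mean 3

t_stat = welch_t(group_a, group_b)
print(t_stat)
```

The larger the absolute value of t, the less likely the difference in means is due to chance; the statistic is then compared against a t distribution to get a p-value.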
30
used to evaluate the data sets found in scattergrams; it compares the relationship between the dependent variable and the independent variable to determine if the relationship correlates.
Regression analysis
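
A minimal sketch of simple linear regression on scattergram-style points, assuming ordinary least squares with one independent variable (x) and one dependent variable (y). The data are invented and `linear_fit` is an illustrative helper.

```python
def linear_fit(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1, 2, 3, 4, 5]    # independent variable (hypothetical)
ys = [3, 5, 7, 9, 11]   # dependent variable (hypothetical)

slope, intercept = linear_fit(xs, ys)
print(slope, intercept)
```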
31
attempting performance improvement and developing practice guidelines without data can be problematic.
Integrating the results of data analysis
32
should assist with case management, decision-making about individual care, improvement of critical pathways related to clinical performance, staff performance evaluations, credentialing, and privileging.
Integration of information
33
the process of changing information from a given source (such as a data entry terminal) into information that can be understood by a destination point (such as a large database)
Data transformation
34
Data transformation is performed in two steps:
Data mapping and code generation
35
This process develops a map of how information flows from one place to another and figures out which parts of the information need to be transformed.
Data Mapping
36
This is when the actual transformation occurs and the data is converted into a form compatible with its destination.
Code generation
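
The two steps in cards 34 to 36 can be sketched as a mapping table plus a function that applies it. All field names here are hypothetical, and the sketch applies the map directly rather than generating separate transformation code, so it is a simplification of what the cards call code generation.

```python
from datetime import datetime


def to_iso_date(text):
    """Convert MM/DD/YYYY (source format) to YYYY-MM-DD (destination format)."""
    return datetime.strptime(text, "%m/%d/%Y").date().isoformat()


# Data mapping: source field -> (destination field, conversion to apply).
FIELD_MAP = {
    "pt_name": ("patient_name", str.title),
    "dob": ("birth_date", to_iso_date),
    "wt_lb": ("weight_kg", lambda lb: round(float(lb) * 0.45359237, 1)),
}


def transform(source_record):
    """Apply the mapping so the record fits the destination's format."""
    return {
        dest: convert(source_record[src])
        for src, (dest, convert) in FIELD_MAP.items()
    }


entry = {"pt_name": "jane doe", "dob": "03/15/1980", "wt_lb": "150"}
result = transform(entry)
print(result)
```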
37
can be verbal (e.g., spoken/written representations), analog (e.g., television, radio, telephone, recordings), or digital (e.g., coded).
Data Representation
38
uses continuous waveform signals varying in intensity.
Analog representation
39
uses codes (usually numeric), such as the binary code (base 2), to represent values.
Computerized representation of data
40
composed of strings of 1s and 0s with 1s stored in magnetized areas of disks and 0s stored in non-magnetized areas; thus, 1 represents “on,” and 0 represents “off.”
binary code
41
Each representation (0 or 1) is referred to as a
Bit - binary digit
42
8 bits =
1 byte
43
1 byte can represent
256 different values (for example, 256 distinct characters)
44
1,000 bytes =
1 kilobyte
45
1 million bytes =
1 megabyte
46
1 billion bytes =
1 gigabyte
47
1 trillion bytes =
1 terabyte
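
The arithmetic behind cards 42 to 47 can be checked in a few lines. This sketch assumes the decimal (power of 1,000) unit sizes given on the cards; `to_unit` is an illustrative helper.

```python
BITS_PER_BYTE = 8

# Each bit has 2 states, so 8 bits give 2**8 distinct patterns.
values_per_byte = 2 ** BITS_PER_BYTE

# Decimal unit sizes in bytes, as stated on the cards.
UNITS = {
    "kilobyte": 10 ** 3,
    "megabyte": 10 ** 6,
    "gigabyte": 10 ** 9,
    "terabyte": 10 ** 12,
}


def to_unit(n_bytes, unit):
    """Express a byte count in the named unit."""
    return n_bytes / UNITS[unit]


print(values_per_byte)
print(to_unit(2_500_000, "megabyte"))
```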
48
the pattern of 0s and 1s used to represent characters.
The coding scheme
49
the most common binary coding scheme is
American Standard Code for Information Interchange (ASCII)
50
characters represent 4 binary bits; thus, 1 byte can be represented by 2 hexadecimal characters.
Hexadecimal
51
Uses a base of 16 and 16 symbols (usually the numerals 0–9, representing values 0 to 9, and the Latin letters A through F, representing values 10–15).
Hexadecimal coding
52
One hexadecimal digit (4 bits) is referred to as a
nibble
53
8 bits (1 byte) are referred to as an
octet
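
A short sketch tying cards 50 to 53 together: one byte splits into two 4-bit nibbles, and each nibble corresponds to one hexadecimal character. The example value is arbitrary.

```python
value = 0b10101111  # one byte (octet), chosen arbitrarily

high_nibble = value >> 4     # upper 4 bits: 1010 = 10 = hex A
low_nibble = value & 0x0F    # lower 4 bits: 1111 = 15 = hex F

hex_text = format(value, "02X")     # two hexadecimal characters
binary_text = format(value, "08b")  # eight binary digits

print(binary_text, hex_text, high_nibble, low_nibble)
```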
54
used with the Universal Character Set, is a standardized coding system that has a large capacity and can be used to represent text for most languages, including Asian languages.
The Unicode Standard coding scheme
55
provides a specific numeric value for each character and can be used across multiple platforms.
Unicode
56
represents the alphabets of the world's languages, ideographic sets, symbols, and more than 100 scripts, and is particularly valuable for making coding accessible internationally.
Unicode
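
To illustrate the "specific numeric value for each character," the sketch below looks up the Unicode code point of three sample characters and their byte length in UTF-8. UTF-8 is an assumption here: it is one common encoding of Unicode, not something the cards mention. For the letter A, the code point matches its ASCII value, 65.

```python
# Sample characters written as escapes: A, e with acute accent, and a CJK ideograph.
sample = ["A", "\u00e9", "\u4e2d"]

# Code point: the numeric value Unicode assigns to each character.
code_points = {ch: ord(ch) for ch in sample}

# Number of bytes each character occupies when encoded as UTF-8.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in sample}

for ch in sample:
    print(f"U+{code_points[ch]:04X}", ch.encode("utf-8").hex(), utf8_lengths[ch])
```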