Visualising Correlation Flashcards

1
Q

What does optimizing the aspect ratio mean?

A

Adjusting the x and y-axis scales to proportionally represent the data, ensuring accurate visualization of relationships between variables.

2
Q

Why remove the fill color from data points?

A

Drawing data points as outlines only reduces over-plotting, which makes overlapping points visible and patterns or trends easier to identify.

3
Q

What are reference regions?

A

Visual areas on a scatterplot used to compare data to a reference set of values, making it easier to interpret the relationship between variables.

4
Q

How can we visually distinguish data sets that are divided into groups?

A

Using different colors, symbols, or trend lines to differentiate between subsets of data within a scatterplot.

5
Q

What are trend lines?

A

Lines that trace the basic shape of data from left to right, indicating the overall direction and relationship between variables in a scatterplot.

6
Q

What is the line of best fit?

A

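The line that keeps the residuals (the vertical distances between the data points and the line) as small as possible overall, typically by minimizing the sum of squared residuals. It represents the overall trend in the data and can be used to predict values not in the dataset.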
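
As a minimal sketch of how such a line can be computed, the Python below uses ordinary least squares, one common method that picks the line minimizing the sum of squared residuals. The data values and the helper name `fit_line` are hypothetical, chosen only for illustration.

```python
# Hypothetical data, invented purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)

# Residual = observed y minus the y predicted by the line.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Predict a value that is not in the dataset.
predicted_at_6 = slope * 6.0 + intercept

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"prediction at x=6: {predicted_at_6:.2f}")
```

For this made-up data the slope is about 1.97 and the intercept about 0.09, so the predicted y at x = 6 is about 11.91.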

7
Q

What are multiple trend lines?

A

Separate trend lines drawn within one scatterplot, one per subset of the data, to show how the trend differs between subsets.
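
The sketch below illustrates why per-group trend lines can matter: it fits one least-squares line per group and one pooled line over all points. The records, group labels, and the helper name `fit_line` are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical (group, x, y) records, invented for illustration.
records = [
    ("A", 1, 2), ("A", 2, 4), ("A", 3, 6),
    ("B", 1, 9), ("B", 2, 8), ("B", 3, 7),
]


def fit_line(points):
    """Return (slope, intercept) of the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Split the records into one list of (x, y) points per group.
groups = defaultdict(list)
for group, x, y in records:
    groups[group].append((x, y))

# One trend line per group, plus a single pooled line for comparison.
trend_lines = {group: fit_line(points) for group, points in groups.items()}
pooled_line = fit_line([(x, y) for _, x, y in records])

for group, (slope, intercept) in sorted(trend_lines.items()):
    print(f"group {group}: slope={slope:.2f}, intercept={intercept:.2f}")
print(f"pooled: slope={pooled_line[0]:.2f}, intercept={pooled_line[1]:.2f}")
```

In this made-up data group A rises (slope 2) while group B falls (slope -1); the pooled line has a slope of 0.5 and hides both trends.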

8
Q

What is a crosstab display?

A

A method to separate data into individual scatterplots by categories or groups, reducing complexity and over-plotting and making it easier to compare correlation patterns.

9
Q

What are grid lines?

A

Lines that help enhance comparisons between scatterplots by providing a visual reference for values on both the x and y axes.

10
Q

What is the coefficient of determination (r²)?

A

A statistical measure that describes the strength of a correlation but not its direction. It can be expressed as a percentage indicating how much of the variation in one variable is accounted for by the other variable.
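
A minimal sketch, assuming a straight-line relationship between two variables, in which case r² is simply the square of the Pearson correlation coefficient r. The data and the helper name `pearson_r` are hypothetical.

```python
import math

# Hypothetical data with a downward trend, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [10, 8, 7, 4, 1]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(xs, ys)
r_squared = r ** 2  # squaring discards the sign, so direction is lost

print(f"r = {r:.3f}")
print(f"r^2 = {r_squared:.3f} ({r_squared:.1%} of variation accounted for)")
```

Here r is about -0.984 (a strong negative correlation) while r² is about 0.968, or roughly 97%: the strength is preserved but the direction is not.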

11
Q

What is the difference between a crosstab display and a scatterplot matrix?

A

A crosstab display separates data into individual scatterplots based on categories or groups, allowing for easier comparison of correlation patterns across categories.

A scatterplot matrix arranges scatterplots in a matrix format to compare multiple pairs of quantitative variables simultaneously, making it possible to identify relationships and correlations between these variables.

12
Q

What are the correlation analysis techniques and best practices?

A

▰ Optimizing aspect ratio and quantitative scales
▰ Removing fill color to reduce over-plotting
▰ Comparing data to reference regions
▰ Visually distinguishing data sets when divided into groups
▰ Using trend lines to enhance perception of the correlation’s shape, strength, and outliers
▰ Using multiple trend lines to see categorical differences
▰ Using trellis and crosstab displays to reduce complexity and over-plotting
▰ Using grid lines to enhance comparisons between scatterplots

13
Q

What does it mean when a correlation is curvilinear?

A

When a correlation is curvilinear, the amount by which one variable changes for a given change in the other is not constant, so the pattern follows a curve rather than a straight line.

14
Q

Give one example of an upward-curving correlation pattern.

A

The growth of Netflix can be represented by an S-curve: a slow initial adoption rate, followed by a period of rapid acceleration as more users embraced the streaming service, and finally a levelling off as the market became saturated and competition increased.
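
An S-curve of this kind is often modelled with a logistic function. The sketch below assumes that model; the parameter values are hypothetical and are not real subscriber figures. It shows growth per period rising until the midpoint and shrinking afterwards.

```python
import math


def logistic(t, capacity, rate, midpoint):
    """Logistic S-curve: slow start, rapid middle, levelling off at capacity."""
    return capacity / (1.0 + math.exp(-rate * (t - midpoint)))


# Hypothetical parameters, chosen only to make the shape visible.
periods = list(range(11))
users = [logistic(t, capacity=100.0, rate=1.0, midpoint=5.0) for t in periods]

# Growth per period: small at first, largest near the midpoint, then small again.
growth = [users[i + 1] - users[i] for i in range(len(users) - 1)]

for t, value in zip(periods, users):
    print(f"t={t:2d}  users={value:6.2f}")
```

The curve passes through half of its capacity at the midpoint, which is also where growth per period is greatest.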

15
Q

Give an example of exponential growth.

A

Compound interest paid by banks grows exponentially over time.
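
As a small worked example, annual compounding follows balance = principal × (1 + rate)^years. The principal and rate below are hypothetical. The sketch shows the hallmark of exponential growth: the balance is multiplied by the same factor each year, so the amount gained each year keeps increasing.

```python
# Hypothetical deposit and interest rate, for illustration only.
principal = 1000.0
annual_rate = 0.05

# Balance at the end of each year, compounding once per year.
balances = [principal * (1 + annual_rate) ** year for year in range(6)]

# Absolute gain per year grows, while the growth factor stays constant.
yearly_gains = [balances[i + 1] - balances[i] for i in range(len(balances) - 1)]
growth_factors = [balances[i + 1] / balances[i] for i in range(len(balances) - 1)]

for year, balance in enumerate(balances):
    print(f"year {year}: {balance:.2f}")
```

After two years the balance is 1102.50: the second year's gain (52.50) is larger than the first year's (50.00) because interest is earned on earlier interest.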

16
Q

Give an example of logarithmic growth.

A

In the early stages of an epidemic, the number of infected individuals tends to increase rapidly. However, as more people become infected and measures like vaccinations or social distancing are implemented, the rate of new infections starts to decline. Eventually, the growth curve levels off as the infection reaches a saturation point or is effectively controlled.

17
Q

Why might an apparent correlation be erroneous?

A

An apparent correlation can be erroneous because it is based on an insufficient or biased sample.
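
The sketch below is a deliberately contrived illustration of this effect: a full set of points with no correlation at all, and a biased subset of it that appears perfectly correlated. The data and the helper name `pearson_r` are hypothetical.

```python
import math


def pearson_r(points):
    """Return the Pearson correlation coefficient of a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


# Full hypothetical population: every combination of x and y in 0..2,
# so x tells us nothing about y.
population = [(x, y) for x in range(3) for y in range(3)]

# Biased sample: only the points where x happens to equal y.
biased_sample = [(x, y) for x, y in population if x == y]

r_population = pearson_r(population)
r_sample = pearson_r(biased_sample)

print(f"population r = {r_population:.2f}")
print(f"biased sample r = {r_sample:.2f}")
```

The population has r = 0, yet the three-point biased sample has r = 1: the apparent correlation comes entirely from how the sample was selected.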

18
Q

What can a correlation indicate?

A

▻ One variable causes another’s behavior
▻ Neither causes the other’s behavior
▻ Apparent correlation is erroneous because of an insufficient or biased sample

19
Q

What is correlation analysis?

A

Correlation analysis involves comparing two quantitative variables to see if values in one vary systematically with the other, and if so:
▰ in what manner,
▰ to what degree, and
▰ why.

20
Q

What are the visual characteristics of a correlation?

A

▻ Direction = Positive, Negative
▻ Strength = Strong, Weak
▻ Shape = Straight, Curved