Data Analysis Flashcards

1
Q

Distribution =

A

How frequently different values are observed in the data

2
Q

Frequency =

A

Number of times value appears in the data

3
Q

Frequency distribution =

A

Table or graph that shows values and their corresponding frequencies

4
Q

Relative Frequency =

A

Frequency of a value/Total Number of Data Entries

5
Q

Relative frequency distribution

A

Table or graph showing relative frequencies of each value
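
As a quick illustration of the frequency cards above, here is a minimal Python sketch. The data list and the variable names are invented for the example; they are not part of the original cards.

```python
from collections import Counter

# Hypothetical data set, invented for illustration
data = [2, 3, 3, 5, 5, 5, 7]

# Frequency distribution: value -> number of times it appears in the data
frequency = Counter(data)

# Relative frequency: frequency of a value / total number of data entries
total = len(data)
relative_frequency = {value: count / total for value, count in frequency.items()}

# Print a small table: value, frequency, relative frequency
for value in sorted(frequency):
    print(value, frequency[value], round(relative_frequency[value], 3))
```

The relative frequencies always add up to 1, which is a handy check on the table.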

6
Q

Make predictions with the slope of the trend line of a scatter plot

A
  1. Take or estimate 2 points on the trend line
  2. Work out the slope: change in y / change in x between the 2 points
  3. Slope = the change in y for each 1-unit increase in x
  4. Multiply the slope if needed to change the x-axis unit (for example, per hour, per week, etc.)
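
The steps above can be sketched in Python as follows. The two points, the units (days on the x axis), and the function name `predict` are assumptions made up for the example.

```python
# Two hypothetical points read off a trend line: (x in days, y in units sold)
x1, y1 = 2.0, 10.0
x2, y2 = 6.0, 30.0

# Slope = change in y for each 1-unit increase in x
slope = (y2 - y1) / (x2 - x1)  # units per day

# Changing the x-axis unit: 1 week = 7 days, so multiply the slope by 7
slope_per_week = slope * 7     # units per week


def predict(x):
    """Predict y at x by following the line through (x1, y1) with this slope."""
    return y1 + slope * (x - x1)


print(slope, slope_per_week, predict(10))
```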
7
Q

Arithmetic Mean =

A

Sum of all the values / Number of values

8
Q

Weighted Mean =

A

Sum of (each unique value × its weight) / Sum of the weights

9
Q

Weight of a value =

A

Number of times it appears in the data (its frequency)
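
A small Python sketch of a weighted mean, where each weight is the frequency of the value. The scores and names are hypothetical. It also checks that the result matches the ordinary arithmetic mean of the full list with repeats written out.

```python
# Hypothetical data: unique value -> weight (how many times it appears)
weights = {70: 2, 80: 5, 90: 3}

# Weighted mean = sum of (value * weight) / sum of weights
weighted_sum = sum(value * weight for value, weight in weights.items())
total_weight = sum(weights.values())
weighted_mean = weighted_sum / total_weight

# The same data written out in full, with every repeat listed
expanded = [value for value, weight in weights.items() for _ in range(weight)]
arithmetic_mean = sum(expanded) / len(expanded)

print(weighted_mean, arithmetic_mean)
```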

10
Q

Median =

A

‘Middle Number’

  1. Order values from smallest to biggest
  2. If the no. of values is odd, Median = the number in the middle of this list
  3. If the no. of values is even, there are 2 numbers in the middle. Median = mean of these 2 values
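
The median steps can be written as a short Python function. This is a sketch that assumes a non-empty list of numbers; the function name is chosen for the example.

```python
def median(values):
    """Return the middle value of a non-empty list of numbers."""
    ordered = sorted(values)          # order from smallest to biggest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single number in the middle
        return ordered[mid]
    # Even count: mean of the 2 numbers in the middle
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([5, 1, 3]))
print(median([4, 1, 3, 2]))
```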

11
Q

Mode

A

‘Most frequent’

Value that occurs most frequently in list

There can be more than one in a data set
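
A minimal Python sketch for finding the mode or modes of a list. The function name `modes` is made up for the example, and it assumes the list is not empty.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs most frequently, smallest first."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 2, 3, 3]))  # two values tie for most frequent
print(modes([4, 4, 1]))
```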

12
Q

Positions of data

A

(Order data from least to greatest)

L = Least

M = Median

G = Greatest

13
Q

Quartiles

A

Q1, Q2 (M), Q3 split the data into 4 groups:

L - Q1

Q1 - Q2 (M)

Q2 (M) - Q3

Q3 - G

14
Q

Percentiles

A

99 percentiles split the data into 100 groups

Group 1: L - 1st percentile

Group 100: 99th percentile - G

15
Q

How to find Q1

A

Find the median of the first half of the data (the data before the median)

16
Q

How to find Q3

A

Find the median of the second half of the data (the data after the median)
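
Here is a sketch of the "median of each half" method for Q1 and Q3 in Python. It assumes at least 2 values and, when the count is odd, leaves the median itself out of both halves. Other quartile conventions exist and can give slightly different answers; the function name is invented for the example.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                                   # data before the median
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]  # data after the median
    return median(lower), median(ordered), median(upper)


print(quartiles([1, 2, 3, 4, 5, 6, 7]))
print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))
```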

17
Q

Dispersion

A

Degree of spread of the data

Most common measures: range, interquartile range, standard deviation

18
Q

Range =

A

G - L

Greatest - Least

(Shows the maximal spread of the data but can be affected by outliers)

19
Q

Interquartile Range =

A

Q3 - Q1

Shows the spread of the middle half of the data. Is not affected by outliers
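
To see how an outlier changes the range but not the interquartile range, here is a small Python sketch. Both data sets are invented, and the quartiles use the median-of-halves method (one of several conventions).

```python
from statistics import median


def data_range(values):
    """Range = greatest - least."""
    return max(values) - min(values)


def interquartile_range(values):
    """IQR = Q3 - Q1, with Q1 and Q3 found as medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(upper) - median(lower)


typical = [1, 2, 3, 4, 5, 6, 7]         # hypothetical data
with_outlier = [1, 2, 3, 4, 5, 6, 70]   # same data, but the greatest value is an outlier

print(data_range(typical), data_range(with_outlier))
print(interquartile_range(typical), interquartile_range(with_outlier))
```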

20
Q

Standard Deviation - measure of

A

Measure of spread that depends on every number in the data set (unlike ranges).

The more data is spread away from the mean - the greater the standard deviation

Sometimes called population standard deviation (to differentiate it from sample standard deviation)

21
Q

How to calculate standard deviation =

A
  1. Find the mean
  2. Find the difference between each value and the mean and square it
  3. Find the mean of these squared differences
  4. Square root this number (take only the positive answer)
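
The four steps can be followed one by one in Python. This is a sketch with a made-up data set and function name, assuming a non-empty list.

```python
import math


def population_std_dev(values):
    """Standard deviation: square root of the mean squared difference from the mean."""
    n = len(values)
    mean = sum(values) / n                                   # step 1
    squared_diffs = [(value - mean) ** 2 for value in values]  # step 2
    mean_squared_diff = sum(squared_diffs) / n               # step 3
    return math.sqrt(mean_squared_diff)                      # step 4 (positive root)


data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data with mean 5
print(population_std_dev(data))
```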
22
Q

How to find the SAMPLE Standard Deviation

A
  1. Find the mean
  2. Find the difference between each value and the mean and square it
  3. Sum of these squared differences/ (no. of Values - 1)
  4. Square root this number (take only the positive answer)

(Sometimes preferred for a sample of data taken from a larger ‘population’ (set) of data)
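
A sketch of the sample version in Python, which differs only in dividing by (no. of values - 1). The data and function name are hypothetical, and at least 2 values are assumed. The result is compared with `statistics.stdev` from the Python standard library, which also computes the sample standard deviation.

```python
import math
import statistics


def sample_std_dev(values):
    """Sample standard deviation: divide the squared differences by n - 1."""
    n = len(values)
    mean = sum(values) / n                                   # step 1
    squared_diffs = [(value - mean) ** 2 for value in values]  # step 2
    return math.sqrt(sum(squared_diffs) / (n - 1))           # steps 3 and 4


data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
ours = sample_std_dev(data)
library = statistics.stdev(data)
print(ours, library)
```

For the same data the sample value comes out a little larger than the population value, because the divisor is smaller.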

23
Q

1 , 2 , 3 Standard deviations above the mean =

A

Mean + 1d
Mean + 2d
Mean + 3d

d = standard deviation

24
Q

1 , 2 , 3 Standard deviations below the mean =

A

Mean - 1d
Mean - 2d
Mean - 3d

d = standard deviation

25
Q

How many standard deviations from the mean is X?

A

If X > Mean
Mean + Rd = X

If X < Mean
Mean - Rd = X

Where R = no. of standard deviations

So rewritten:

If X > Mean: R = (X - Mean) / d

If X < Mean: R = (Mean - X) / d
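
One way to handle both cases at once is a signed result: (X - Mean) / d is positive when X is above the mean and negative when X is below it, and its size is R. A minimal Python sketch, with a made-up function name and numbers, assuming d is not 0:

```python
def standard_deviations_from_mean(x, mean, d):
    """Signed number of standard deviations: positive above the mean, negative below."""
    return (x - mean) / d


# Hypothetical example: mean 50, standard deviation 10
above = standard_deviations_from_mean(65, 50, 10)
below = standard_deviations_from_mean(30, 50, 10)
print(above, below)
```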

26
Q

In any group of data, most values are within about ____ standard deviations of the mean

A

In any group of data, most values are within about 3 standard deviations of the mean

27
Q

Set =

A

Collection of objects (aka members or elements)

Repetitions do not count as additional elements

Order does not matter

28
Q

Finite set

A

All elements can be completely counted

29
Q

Infinite Set

A

Elements cannot all be counted

E.g.: set of all integers

30
Q

Empty set

A

Has no elements/members

Denoted by ∅

31
Q

Non Empty Set =

A

A set with 1 or more members/elements

32
Q

Subset

A

A set whose elements are all also elements of another set.

Example: A and B are sets. All the elements in Set A are also in Set B. Therefore A is a SUBSET of B.

Set A = {2, 8}, Set B = {0, 2, 4, 6, 8}
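
Python's built-in `set` type behaves like the sets on these cards, so the example can be checked directly. The variable names are chosen for the illustration.

```python
set_a = {2, 8}
set_b = {0, 2, 4, 6, 8}

a_is_subset_of_b = set_a <= set_b      # every element of A is in B
b_is_subset_of_a = set_b <= set_a      # B has elements that are not in A
empty_is_subset = set() <= set_b       # the empty set is a subset of any set

# Repetitions do not count as additional elements, and order does not matter
repeats_ignored = {2, 2, 8} == set_a
order_ignored = {8, 2} == set_a

print(a_is_subset_of_b, b_is_subset_of_a, empty_is_subset)
print(repeats_ignored, order_ignored)
```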

33
Q

∅ is a subset of ______

A

∅ is a subset of every set

34
Q

List =

A

A collection of elements in a specific order
Can have repeating elements

(Unlike a set)

35
Q

Intersections

A

A set formed from the elements that appear in both of 2 other sets.

Example: intersection of X and Y (written as X ∩ Y) = all the elements that appear in both Set X and Set Y

36
Q

Union

A

A set that is made up of all of the elements in 2 other sets (don’t include elements twice)

Example: The union of X and Y (written X ∪ Y) = all of the elements of Set X and Set Y

If sets are mutually exclusive
|X ∪ Y| = |X| + |Y|

If sets can intersect, use the inclusion-exclusion principle

37
Q

If sets have no elements in common they are said to be ____

A

If sets have no elements in common they are said to be mutually exclusive (or disjoint).

Written as X ∩ Y = ∅

38
Q

Inclusion-exclusion principle

A

IF THE SETS CAN INTERSECT

|A ∪ B| = |A| + |B| - |A ∩ B|
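
The inclusion-exclusion principle can be checked with Python sets, where `|` is union, `&` is intersection, and `len` gives the number of elements. The sets below are made up for the example.

```python
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5}
set_c = {7, 8}   # has no elements in common with set_a

union_size = len(set_a | set_b)
intersection_size = len(set_a & set_b)

# Inclusion-exclusion: subtract the overlap so shared elements are counted once
by_principle = len(set_a) + len(set_b) - intersection_size

# Mutually exclusive sets: the overlap is empty, so the sizes simply add
disjoint = set_a.isdisjoint(set_c)
disjoint_union_size = len(set_a | set_c)

print(union_size, by_principle, disjoint, disjoint_union_size)
```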

39
Q

Multiplication principle

A

If K = different possibilities for first choice

M = different possibilities for second choice (that is independent of first choice)

KM = different possibilities for the pair of choices

Example: 5 meals and 3 desserts = 15 combos

(Note: there can be more than 2 choices)
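
The meal example can be checked by listing every pair with `itertools.product` from the Python standard library. The menu items are placeholders invented for the sketch.

```python
from itertools import product

# Hypothetical menu: 5 meals and 3 desserts
meals = ["meal 1", "meal 2", "meal 3", "meal 4", "meal 5"]
desserts = ["dessert 1", "dessert 2", "dessert 3"]

# Every (meal, dessert) pair, with the second choice independent of the first
combos = list(product(meals, desserts))

print(len(combos), len(meals) * len(desserts))
```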

40
Q

Permutation

A

An ordering of elements

Example: how many permutations of the letters A, B and C are there? 3! = 6 (ABC, ACB, BAC, BCA, CAB, CBA)

41
Q

Factorial

A

n! = n(n-1)(n-2)(n-3) … (1)

Example

3! = (3)(3-1)(3-2) = (3)(2)(1) = 6

42
Q

Solving Permutation problems

A
  1. Find number of elements (n)
  2. Calculate: n!
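
As a check on these two steps, the sketch below lists every ordering of the letters A, B and C and compares the count with 3!. It uses `itertools.permutations` and `math.factorial` from the Python standard library; the variable names are chosen for the example.

```python
import math
from itertools import permutations

letters = ["A", "B", "C"]

# Step 1: number of elements
n = len(letters)

# Step 2: n!
by_formula = math.factorial(n)

# Listing every ordering gives the same count
orderings = ["".join(p) for p in permutations(letters)]

print(by_formula, orderings)
```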

43
Q

No. of permutations (objects are placed in order) of k objects taken from a set of n objects

also written as: permutations of n objects taken k at a time

A

nPk = n!/(n-k)!

Example: how many 5-digit positive integers can be formed using the digits 1,2,3,4,5,6,7 if no digit can occur more than once?

  1. n = 7 k = 5
  2. 7!/(7-5)! = 2,520
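
The worked example can be verified in Python three ways: with the factorial formula, with `math.perm` (available in Python 3.8 and later), and by counting the arrangements directly. The variable names are chosen for the sketch.

```python
import math
from itertools import permutations

n, k = 7, 5

# nPk = n! / (n - k)!
by_formula = math.factorial(n) // math.factorial(n - k)

# Built-in helper for the same quantity
by_library = math.perm(n, k)

# Count every ordered choice of 5 different digits from 1..7
by_listing = sum(1 for _ in permutations(range(1, n + 1), k))

print(by_formula, by_library, by_listing)
```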
44
Q

No. of combinations (objects not placed in order) of k objects taken from a set of n objects

also written as: combinations of n objects taken k at a time

A

nCk = n!/(k!(n-k)!)

Example: How many ways to select a 3 person committee from group of 9?

  1. n= 9 k = 3
  2. 9!/(3!(9-3)!) = 84
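
The committee example can be verified the same way in Python, using the factorial formula, `math.comb` (Python 3.8 and later), and a direct count of the possible committees. The variable names are chosen for the sketch.

```python
import math
from itertools import combinations

n, k = 9, 3

# nCk = n! / (k! * (n - k)!)
by_formula = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# Built-in helper for the same quantity
by_library = math.comb(n, k)

# Count every unordered choice of 3 people from 9
by_listing = sum(1 for _ in combinations(range(n), k))

# Each committee of 3 can be ordered in 3! ways, so nPk = nCk * k!
ordered_count = by_library * math.factorial(k)

print(by_formula, by_library, by_listing, ordered_count)
```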
45
Q

Permutations nPk =

Combinations nCk =

A

Permutations = The number of ways to select AND ORDER k Objects from a set of n Objects

Combinations = The number of k-object subsets of a set of n objects

46
Q

Sample Space

A

Set of all possible outcomes

47
Q

Event

A

A particular set of outcomes

48
Q

Probability that event (E) occurs =

A

P(E) = Number of outcomes that satisfy E / Total number of possible outcomes (when all outcomes are equally likely)
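
A minimal Python sketch of this counting rule, assuming one roll of a fair six-sided die so that every outcome is equally likely. The event and variable names are invented for the example.

```python
from fractions import Fraction

# Sample space: all possible outcomes of one roll of a fair die (assumed example)
sample_space = {1, 2, 3, 4, 5, 6}

# Event E: the roll is even
event_e = {outcome for outcome in sample_space if outcome % 2 == 0}

# P(E) = outcomes that satisfy E / total possible outcomes
p_e = Fraction(len(event_e), len(sample_space))

# Probability that E does not happen
p_not_e = 1 - p_e

print(p_e, p_not_e)
```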

49
Q

If event E is certain to occur P(E) =

A

P(E) = 1

50
Q

If event E is certain not to occur P(E) =

A

P(E) = 0

51
Q

IF event E is possible but not certain

A

0 < P(E) < 1

52
Q

Probability Event E won’t happen =

A

1 - P(E)

53
Q

Sum of probabilities of all possible outcomes =

A

1

54
Q

The probability that both event E and F occur =

A

IF events E and F are independent: P (E and F) = P(E)P(F)

IF events E and F are mutually exclusive (cannot occur at same time) : P (E and F) = 0 - impossible

55
Q

The probability that event E or F or Both occur

A

In general: P(E or F) = P(E) + P(F) - P(E and F)

IF events E and F are independent: P(E and F) = P(E)P(F), so P(E or F) = P(E) + P(F) - P(E)P(F)

IF events E and F are mutually exclusive (cannot occur at same time): P(E or F) = P(E) + P(F)
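
These rules can be checked by counting outcomes for one roll of a fair six-sided die. This is a sketch under that assumption; the events and names below are made up for the example. For a general pair of events, P(E or F) = P(E) + P(F) - P(E and F).

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die (assumed example)


def probability(event):
    """P(event) = outcomes in the event / total outcomes (all equally likely)."""
    return Fraction(len(event & sample_space), len(sample_space))


event_e = {2, 4, 6}   # roll is even
event_f = {5, 6}      # roll is greater than 4
event_g = {1}         # roll is 1, which cannot happen together with E

p_e_and_f = probability(event_e & event_f)   # both occur
p_e_or_f = probability(event_e | event_f)    # E or F or both
p_e_and_g = probability(event_e & event_g)   # mutually exclusive events
p_e_or_g = probability(event_e | event_g)

print(p_e_and_f, p_e_or_f, p_e_and_g, p_e_or_g)
```

Here E and F happen to be independent, since P(E and F) = 1/6 = P(E)P(F), while E and G are mutually exclusive, so their probabilities simply add.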