Data Analytics Flashcards

1
Q

iloc example

A

df.iloc[1, 2] - single cell at row position 1, column position 2 (200)
df.iloc[2] - entire row at position 2 (1000, 2000, 3000, 4000)
Positions count from 0.
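
A minimal sketch of the two lookups above. The table values are hypothetical, chosen only so that the results match the card; the card's original example table is not shown.

```python
import pandas as pd

# Hypothetical table: values chosen so the lookups match the card.
df = pd.DataFrame(
    [
        [1, 2, 3, 4],
        [50, 100, 200, 400],
        [1000, 2000, 3000, 4000],
    ],
    columns=["a", "b", "c", "d"],
)

cell = df.iloc[1, 2]   # row position 1, column position 2 (both count from 0)
row = df.iloc[2]       # whole row at position 2, returned as a Series

print(cell)            # 200
print(row.tolist())    # [1000, 2000, 3000, 4000]
```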

2
Q

loc example

A

Like iloc, but selects by label (row index label, column name) instead of integer position
e.g.
df.loc[2, 'a']
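
A short sketch contrasting label-based and position-based lookup. The table and the row labels "x", "y", "z" are hypothetical.

```python
import pandas as pd

# Hypothetical table; the default row labels are 0, 1, 2.
df = pd.DataFrame({"a": [1, 50, 1000], "b": [2, 100, 2000]})

by_label = df.loc[2, "a"]     # row labelled 2, column labelled "a"
by_position = df.iloc[2, 0]   # same cell, addressed by integer position

# With custom row labels, loc uses the labels while iloc still counts from 0.
named = pd.DataFrame(
    {"a": [1, 50, 1000], "b": [2, 100, 2000]},
    index=["x", "y", "z"],
)
same_cell = named.loc["z", "a"]

print(by_label, by_position, same_cell)  # 1000 1000 1000
```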

3
Q

describe

A

Summary statistics for a single column (count, mean, std, min, quartiles, max)

df['a'].describe()
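
As an illustration, describe() on a small hypothetical numeric column returns a Series of summary statistics:

```python
import pandas as pd

# Hypothetical column of numbers.
df = pd.DataFrame({"a": [1, 2, 2, 3, 2, 4]})

summary = df["a"].describe()  # count, mean, std, min, 25%, 50%, 75%, max
print(summary)

# Individual statistics can be read by name.
print(summary["count"], summary["min"], summary["max"])
```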

4
Q

Mean

A

The total of the figures, divided by the number of individual figures

1,2,2,3,2,4
Mean: 14/6 ≈ 2.33

5
Q

Median

A

The middle value of the sorted data (with an even count, the average of the two middle values)

1,2,2,3,2,4 -> 1,2,2,2,3,4
Median: 2

6
Q

Mode

A

The most common figure

1,2,2,3,2,4
Mode: 2
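
The mean, median and mode of the sample used on these cards can be checked with Python's standard statistics module. This is only a sketch of the arithmetic:

```python
import statistics

data = [1, 2, 2, 3, 2, 4]

mean = statistics.mean(data)      # sum is 14, count is 6, so about 2.33
median = statistics.median(data)  # sorted: 1,2,2,2,3,4 -> average of the middle pair
mode = statistics.mode(data)      # 2 appears three times

print(mean, median, mode)
```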

7
Q

Interquartile range

A

The difference between the first and third quartile values
Q1: 10
Q3: 50

IQR: 40
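
A sketch of the calculation, using hypothetical data chosen so the quartiles match the card. Quartile conventions differ between tools; this example assumes the "inclusive" method of statistics.quantiles.

```python
import statistics

# Hypothetical data, chosen so the quartiles match the card.
data = [0, 5, 10, 20, 30, 40, 50, 55, 60]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q3, iqr)  # 10.0 50.0 40.0
```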

8
Q

Nominal

A

Categorisation without order e.g. the books are in: English, French, German etc.

Distinctiveness ( = and != )

9
Q

Ordinal

A

Categorisation with order e.g. the coffee was: Good, Medium, Bad

Distinctiveness ( = and != )
Order ( <,<=,>,>= )

10
Q

Interval

A

Scale with an arbitrary zero value e.g. temperature in °C or °F, shoe size, dates

Distinctiveness ( = and != )
Order ( <,<=,>,>= )
Addition ( + and - )

11
Q

Ratio

A

Scale with a non-arbitrary zero value e.g. distance, age, speed etc.

Distinctiveness ( = and != )
Order ( <,<=,>,>= )
Addition ( + and - )
Multiplication ( * and / )

12
Q

NOIR

A

Qualitative:
Nominal
Ordinal

Quantitative:
Interval
Ratio

13
Q

DOAM

A

Distinctiveness (=, !=)
Ordering (<, <=, >, >=)
Addition (+, -)
Multiplication (*, /)
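
One way to picture how NOIR and DOAM fit together is a lookup of which operations are meaningful at each level. The names OPERATIONS and supports are hypothetical, used only for this sketch.

```python
# Operations that are meaningful at each level of measurement (NOIR -> DOAM).
OPERATIONS = {
    "nominal": ["distinctiveness"],
    "ordinal": ["distinctiveness", "ordering"],
    "interval": ["distinctiveness", "ordering", "addition"],
    "ratio": ["distinctiveness", "ordering", "addition", "multiplication"],
}


def supports(level: str, operation: str) -> bool:
    """Return True if `operation` is meaningful for data measured at `level`."""
    return operation in OPERATIONS[level]


print(supports("interval", "multiplication"))  # False: 20 °C is not "twice as hot" as 10 °C
print(supports("ratio", "multiplication"))     # True: 20 km is twice as far as 10 km
```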

14
Q

Nominal : Binary

A

1/0, On/Off, Yes/No, True/False

15
Q

Normal Distribution

A

Standard Bell Curve

Mean, median and mode are all at the centre

16
Q

Left skewed

A

Tail is on the left, Hump on the right

Left: Mean
Middle: Median
Right: Mode

“You’re mean when you walk away”

17
Q

Right skewed

A

Tail on the right, hump on the left

Left: Mode
Middle: Median
Right: Mean

“You’re mean when you walk away”
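
The ordering of mean, median and mode can be checked on small hypothetical samples. This is a rule of thumb shown on hand-picked data, not a guarantee for every skewed dataset.

```python
import statistics


def averages(data):
    """Return (mean, median, mode) for a list of numbers."""
    return statistics.mean(data), statistics.median(data), statistics.mode(data)


symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 4, 10]   # long tail of large values
left_skewed = [1, 7, 8, 9, 9, 10, 10, 10]  # long tail of small values

print(averages(symmetric))     # mean, median and mode all equal 3
print(averages(right_skewed))  # mode 1 < median 2 < mean 3
print(averages(left_skewed))   # mean 8 < median 9 < mode 10
```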

18
Q

Tuple

A

Stores data but can't be changed (immutable)

myTuple = (1,2,3)

19
Q

List in relation to tuple

A

Like a tuple but can be changed

myList = [1,2,3]
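
A small sketch of the difference: in-place changes succeed on a list and raise TypeError on a tuple.

```python
my_list = [1, 2, 3]
my_list[0] = 99        # lists are mutable: in-place assignment works
my_list.append(4)

my_tuple = (1, 2, 3)
try:
    my_tuple[0] = 99   # tuples are immutable: this raises TypeError
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(my_list)         # [99, 2, 3, 4]
print(tuple_changed)   # False
```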

20
Q

List

A

ordered collection of elements supporting mixed data types

21
Q

Array

A

Similar to a list, but all elements must be of the same type
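
A sketch of the same-type rule using the standard library's array module. This is an assumption for illustration; in a data-analytics setting the card may well mean NumPy arrays, which follow the same idea.

```python
from array import array

mixed = [1, "two", 3.0]        # a list accepts mixed types

ints = array("i", [1, 2, 3])   # typecode "i": every element is a signed int
ints.append(4)                 # fine, same type

try:
    ints.append(2.5)           # a float is rejected
    accepted = True
except TypeError:
    accepted = False

print(ints.tolist())           # [1, 2, 3, 4]
print(accepted)                # False
```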

22
Q

2D array or matrix

A

a grid of elements with uniform data types

23
Q

DataFrame

A

Two-dimensional tabular data structure with labelled axes (rows and columns), allowing a different data type for each column

e.g. data loaded from a SQL table or a CSV file
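
A minimal pandas sketch of a table whose columns hold different data types. The column names and records are hypothetical.

```python
import pandas as pd

# Hypothetical records; each column holds one data type.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob"],
        "age": [30, 25],
        "height_m": [1.65, 1.80],
    }
)

print(df.dtypes)   # one dtype per column
print(df.shape)    # (2, 3): two rows, three columns
```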

24
Q

Measures of Dispersion

A

Standard deviation and variance

25
Q

Variance

A

The average of the squared differences from the mean

26
Q

Standard Deviation in relation to variance

A

The square root of the variance
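
A worked sketch of both definitions on hypothetical data. It uses the population form (divide by n), which matches "average of the squared differences"; the sample form divides by n - 1 instead.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)                               # 5.0
variance = sum((x - mean) ** 2 for x in data) / len(data)  # 32 / 8 = 4.0
std_dev = math.sqrt(variance)                              # 2.0

# The standard library's population versions agree with the manual calculation.
assert math.isclose(variance, statistics.pvariance(data))
assert math.isclose(std_dev, statistics.pstdev(data))

print(variance, std_dev)  # 4.0 2.0
```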

27
Q

Standard Deviation (small and large)

A

Smaller: data points tend to lie closer to the mean
Larger: data points have greater variability
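
To illustrate, two hypothetical samples with the same mean but different spread:

```python
import statistics

tight = [9, 10, 11]    # values cluster near the mean of 10
spread = [0, 10, 20]   # same mean, values far from it

sd_tight = statistics.pstdev(tight)    # about 0.82
sd_spread = statistics.pstdev(spread)  # about 8.16

print(sd_tight, sd_spread)
```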

28
Q

Correlation

A

Measures the strength and direction of the linear relationship between two variables

29
Q

Correlation: -1, 1

A

1 = Perfect positive correlation
-1 = Perfect negative correlation
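
A sketch of the Pearson correlation coefficient written out by hand, with hypothetical data that lies exactly on a rising and a falling straight line. The function name pearson is an assumption for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]   # y goes up exactly in step with x
falling = [-3 * x for x in xs]     # y goes down exactly in step with x

r_up = pearson(xs, rising)      # perfect positive correlation, close to 1
r_down = pearson(xs, falling)   # perfect negative correlation, close to -1

print(r_up, r_down)
```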

30
Q

Covariance

A

The degree to which two variables change together in a dataset
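
A sketch of covariance as the average product of deviations from each mean. The data are hypothetical, and this is the population form (divide by n); the sample form divides by n - 1.

```python
def covariance(xs, ys):
    """Population covariance: mean product of deviations from each mean."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Hypothetical data: hours studied and test score.
hours = [1, 2, 3, 4, 5]
score = [52, 58, 61, 70, 74]

cov = covariance(hours, score)
print(cov)  # positive: the two variables rise together
```

A positive value means the variables tend to move in the same direction, a negative value means they move in opposite directions. Unlike correlation, the size of the value depends on the units of the data.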

31
Q

Strong and Weak Correlation

A

Strong Correlation: High degree of association between the two variables.
Weak Correlation: Low degree of association between the two variables.