Data Analytics Flashcards

1
Q

iloc example

A

df.iloc[1, 2] - single cell at row position 1, column position 2, counting from 0 (200)
df.iloc[2] - entire row at position 2 (1000, 2000, 3000, 4000)
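
A minimal sketch of the two lookups above. The table is hypothetical: the card's original table is not shown, so the values are chosen only to reproduce the results it quotes.

```python
import pandas as pd

# Hypothetical table; values chosen only to match the card's results.
df = pd.DataFrame(
    [[1, 2, 3, 4],
     [50, 100, 200, 400],
     [1000, 2000, 3000, 4000]],
    columns=["a", "b", "c", "d"],
)

# iloc uses zero-based integer positions: row 1, column 2.
cell = df.iloc[1, 2]

# A single integer selects a whole row, returned as a Series.
row = df.iloc[2]

print(cell)
print(row.tolist())
```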

2
Q

loc example

A

Same idea as iloc, but selects by label (index and column names) rather than by integer position
e.g.
df.loc[2, 'a']
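
The difference between label and position only shows up when the index labels are not 0, 1, 2, ... in order. A small sketch with hypothetical data, where the index is deliberately reversed:

```python
import pandas as pd

# Hypothetical data; the index labels are deliberately not in positional order.
df = pd.DataFrame({"a": [10, 20, 30]}, index=[2, 1, 0])

by_label = df.loc[2, "a"]     # row LABELLED 2, which is the first row
by_position = df.iloc[2, 0]   # row at POSITION 2, which is the last row

print(by_label, by_position)
```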

3
Q

describe

A

Summary statistics of a single column

df['a'].describe()
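
A short sketch of what describe() returns for a numeric column. The column values are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 2, 3, 2, 4]})  # hypothetical column

# For numeric data, describe() returns a Series with the count, mean,
# standard deviation, minimum, quartiles and maximum.
summary = df["a"].describe()
print(summary)
```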

4
Q

Mean

A

The total of the figures, divided by the number of individual figures

1,2,2,3,2,4
Mean: 14/6 = 2.3333
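
The same calculation as a quick check in Python, using the figures from this card and the standard-library statistics module:

```python
from statistics import mean

figures = [1, 2, 2, 3, 2, 4]

total = sum(figures)              # 14
average = total / len(figures)    # total divided by the number of figures

print(total, average, mean(figures))
```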

5
Q

Median

A

The middle value of the sorted data (with an even count, the average of the two middle values)

1,2,2,3,2,4 -> 1,2,2,2,3,4
Median: (2 + 2)/2 = 2
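
A sketch of the same steps in Python: sort, then take the middle (here, the average of the two middle values because the count is even).

```python
from statistics import median

figures = [1, 2, 2, 3, 2, 4]

ordered = sorted(figures)
mid = len(ordered) // 2
# Even count: average the two values either side of the centre.
by_hand = (ordered[mid - 1] + ordered[mid]) / 2

print(ordered, by_hand, median(figures))
```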

6
Q

Mode

A

The most common figure

1,2,2,3,2,4
Mode: 2
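
Counting occurrences makes the mode explicit. A small sketch using the card's figures:

```python
from collections import Counter
from statistics import mode

figures = [1, 2, 2, 3, 2, 4]

counts = Counter(figures)                   # how often each figure appears
most_common_value, times = counts.most_common(1)[0]

print(most_common_value, times, mode(figures))
```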

7
Q

Interquartile range

A

The difference between the first and third quartile values
Q1: 10
Q3: 50

IQR: 40
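
A sketch of computing the IQR in Python. The data set is hypothetical, chosen so the quartiles match the card. Quartile conventions differ between tools; this example assumes the "inclusive" method of the standard-library statistics.quantiles function, so other methods may give slightly different values on other data.

```python
from statistics import quantiles

data = [0, 10, 30, 50, 60]  # hypothetical values

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q3, iqr)
```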

8
Q

Nominal

A

Categorisation without order e.g. the books are in: English, French, German etc.

Distinctiveness ( = and != )

9
Q

Ordinal

A

Categorisation with order e.g. the coffee was: Good, Medium, Bad

Distinctiveness ( = and != )
Order ( <,<=,>,>= )
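
One way to see the nominal/ordinal difference in practice is with pandas categorical data, sketched below under the assumption that pandas is the tool in use. The ratings and languages are the card examples; the order Bad < Medium < Good is declared explicitly.

```python
import pandas as pd

# Ordinal: categories with a declared order.
rating_type = pd.CategoricalDtype(categories=["Bad", "Medium", "Good"], ordered=True)
ratings = pd.Series(["Good", "Bad", "Medium", "Good"], dtype=rating_type)

better_than_bad = ratings > "Bad"   # order comparisons are allowed
worst = ratings.min()

# Nominal: categories without an order support only = and !=.
languages = pd.Series(["English", "French", "German"], dtype="category")
is_french = languages == "French"
try:
    languages < "French"
    nominal_supports_order = True
except TypeError:
    nominal_supports_order = False

print(better_than_bad.tolist(), worst, nominal_supports_order)
```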

10
Q

Interval

A

Scale with an arbitrary zero value e.g. temperature (°C or °F), shoe size, dates

Distinctiveness ( = and != )
Order ( <,<=,>,>= )
Addition ( + and - )

11
Q

Ratio

A

Scale with a non-arbitrary zero value e.g. distance, age, speed etc.

Distinctiveness ( = and != )
Order ( <,<=,>,>= )
Addition ( + and - )
Multiplication ( * and / )
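
A short worked example of why multiplication and division need a non-arbitrary zero. The temperatures are hypothetical; Celsius stands in for an interval scale and Kelvin for a ratio scale.

```python
# Interval scale: Celsius has an arbitrary zero, so ratios are not meaningful.
c_low, c_high = 10.0, 20.0
celsius_ratio = c_high / c_low     # 2.0, but 20 °C is not "twice as hot" as 10 °C

# Ratio scale: Kelvin has a true zero, so the ratio is meaningful.
k_low, k_high = c_low + 273.15, c_high + 273.15
kelvin_ratio = k_high / k_low      # roughly 1.035

# Differences (addition and subtraction) agree on both scales.
celsius_diff = c_high - c_low
kelvin_diff = k_high - k_low

print(celsius_ratio, round(kelvin_ratio, 3), celsius_diff, round(kelvin_diff, 2))
```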

12
Q

NOIR

A

Qualitative:
Nominal
Ordinal

Quantitative:
Interval
Ratio

13
Q

DOAM

A

Distinctiveness (=, !=)
Ordering (<, <=, >, >=)
Addition (+, -)
Multiplication (*, /)
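
NOIR and DOAM line up one-to-one: each measurement level allows its own operation group plus all the groups before it. A small sketch of that cumulative relationship; the function name allowed_operations is a hypothetical helper, not part of any library.

```python
# Each level adds one operation group to those of the previous level.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]                         # NOIR
OPERATIONS = ["distinctiveness", "ordering", "addition", "multiplication"]  # DOAM

def allowed_operations(level):
    """Return the DOAM operation groups that are meaningful for a NOIR level."""
    position = LEVELS.index(level.lower())
    return OPERATIONS[: position + 1]

for level in LEVELS:
    print(level, allowed_operations(level))
```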

14
Q

Nominal : Binary

A

1/0, On/Off, Yes/No, True/False

15
Q

Normal Distribution

A

Standard Bell Curve

Mean, median and mode are all at the centre

16
Q

Left skewed

A

Tail is on the left, Hump on the right

Left: Mean
Middle: Median
Right: Mode

“You’re mean when you walk away”

17
Q

Right skewed

A

Tail on the right, hump on the left

Left: Mode
Middle: Median
Right: Mean

“You’re mean when you walk away”
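
The ordering on the two skew cards can be checked on a small sample. The data below is hypothetical, and the mean/median/mode ordering is a rule of thumb that holds for typical skewed data rather than a guarantee.

```python
from statistics import mean, median, mode

right_skewed = [1, 1, 1, 2, 3, 4, 9]           # hump on the left, tail on the right
left_skewed = [10 - x for x in right_skewed]   # mirror image: tail on the left

# Right skew: mode < median < mean (the mean is pulled towards the tail).
right = (mode(right_skewed), median(right_skewed), mean(right_skewed))

# Left skew: mean < median < mode.
left = (mean(left_skewed), median(left_skewed), mode(left_skewed))

print(right, left)
```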

18
Q

Tuple

A

stores data but can't be changed

myTuple = (1,2,3)

19
Q

List in relation to tuple

A

Like a tuple but can be changed

myList = [1,2,3]
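
A minimal sketch of the difference, reusing the names from the two cards: the list accepts changes, while the same assignment on the tuple raises a TypeError.

```python
myTuple = (1, 2, 3)
myList = [1, 2, 3]

# Lists are mutable: items can be replaced and added.
myList[0] = 99
myList.append(4)

# Tuples are immutable: item assignment raises TypeError.
try:
    myTuple[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(myList, myTuple, tuple_changed)
```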

20
Q

List

A

ordered collection of elements supporting mixed data types

21
Q

Array

A

similar to a list but all elements must be of the same type

22
Q

2D array or matrix

A

a grid of elements with uniform data types
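
A sketch of both array cards, assuming NumPy arrays are what is meant (the cards do not name a library). Mixing integers and a float shows the single-type rule: NumPy converts every element to one common type.

```python
import numpy as np

values = np.array([1, 2, 3])       # all integers
mixed = np.array([1, 2, 3.5])      # the integers are converted to floats

# 2D array: a grid with rows and columns, still one data type throughout.
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(values.dtype, mixed.dtype, grid.shape, grid[1, 2])
```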

23
Q

DataFrame

A

two-dimensional, potentially heterogeneous tabular data structure with labelled axes, allowing different data types for each column

e.g. data loaded from a SQL table or a CSV file
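
A small sketch of a DataFrame built from CSV text, assuming pandas. The names and values are hypothetical; the point is that each column keeps its own data type.

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice this would usually be a file path.
csv_text = "name,age,height_m\nAda,36,1.65\nGrace,45,1.70\nAlan,41,1.78\n"
df = pd.read_csv(io.StringIO(csv_text))

# Labelled axes: column names across the top, an index down the side.
print(df)
print(df.dtypes)   # text, integer and float columns side by side
```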

24
Q

Measures of Dispersion

A

Standard Deviation, and Variance

25
Variance
The average of the squared differences from the mean
26
Standard Deviation in relation to variance
The square root of the variance
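
A worked sketch of both definitions on a hypothetical data set. It assumes the population form (dividing by n), which matches "average of the squared differences"; the sample form divides by n - 1 instead.

```python
from math import sqrt
from statistics import pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical values with mean 5

mean_value = sum(data) / len(data)
squared_diffs = [(x - mean_value) ** 2 for x in data]

variance = sum(squared_diffs) / len(data)   # average squared difference
std_dev = sqrt(variance)                    # square root of the variance

# The standard library's population versions agree with the hand calculation.
print(variance, std_dev, pvariance(data), pstdev(data))
```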
27
Standard Deviation (small and large)
Smaller: data points tend to be closer to the mean
Larger: data points have greater variability
28
Correlation
Measures the strength and direction of the linear relationship between two variables
29
Correlation: -1, 1
1 = Perfect positive correlation
-1 = Perfect negative correlation
30
Covariance
The degree to which two variables change together in a dataset
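
Covariance and correlation are closely linked: Pearson correlation is the covariance divided by the product of the two standard deviations, which rescales it to the range -1 to 1. A minimal sketch with hypothetical data, assuming population formulas; the function names are illustrative helpers.

```python
from math import sqrt

def covariance(xs, ys):
    """Population covariance: mean product of the deviations from each mean."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

x = [1, 2, 3, 4]
rising = [2, 4, 6, 8]    # moves with x
falling = [8, 6, 4, 2]   # moves against x

print(covariance(x, rising), correlation(x, rising), correlation(x, falling))
```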
31
Strong and Weak Correlation
Strong Correlation: High degree of association between the two variables.
Weak Correlation: Low degree of association between the two variables.