Week 9: Time Series, Imbalanced Data & Fairness Flashcards

1
Q

REVERSED

F1 = (2*R*P)/(R+P)

A

What is the F1 measure?

2
Q

REVERSED

  • Sampling data from a stream
  • Queries over sliding windows
  • Counting distinct elements
A

What are 3 problems with a data stream?

3
Q

REVERSED

AEO(diff) = [(P1-P2) + (P3-P4)] /2

A

What is AEO(diff)?

4
Q

REVERSED

Maintain a count of the number of distinct elements seen so far

A

What is counting distinct elements?

5
Q

REVERSED

  • A single sensitive (protected) attribute defining demographic groups
  • Find privileged and unprivileged groups based on the sensitive attributes and the decision label
  • Checking parity between demographic groups
  • Cannot always identify hidden unfairness
A

What is statistical fairness?

6
Q

REVERSED

  • Store all the first s elements of the stream to S
  • We have seen n-1 elements, now the nth element arrives
  • With probability s/n, keep the nth element, otherwise discard it
  • If we picked the nth element, then it replaces one of the s elements in sample S, picked uniformly at random
A

What is reservoir sampling?
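
The steps on this card could be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function name `reservoir_sample` and the optional `rng` parameter are assumptions made for the example.

```python
import random

def reservoir_sample(stream, s, rng=None):
    """Return a uniform random sample of up to s items from an iterable stream."""
    rng = rng or random.Random()
    sample = []
    for n, item in enumerate(stream, start=1):
        if n <= s:
            # Store the first s elements of the stream directly.
            sample.append(item)
        elif rng.random() < s / n:
            # Keep the nth element with probability s/n; it replaces
            # one of the s stored elements, chosen uniformly at random.
            sample[rng.randrange(s)] = item
    return sample
```

For example, `reservoir_sample(range(1000), 10)` returns 10 distinct values from the stream while storing only 10 items at any time.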

7
Q

REVERSED

might introduce artificial minority class examples too deeply in the majority class space

A

What is a problem with SMOTE?

8
Q

REVERSED

  • Cost is the penalty associated with an incorrect prediction, goal is to minimise the cost
  • Based on the classifier predicted probabilities
  • Binary traditional case: predict positive if probability is > 0.5
  • Probability threshold can be changed using a cost matrix
  • Classify as positive if: probability of positive > FP/(FP+FN), where FP and FN are the costs of a false positive and a false negative from the cost matrix
A

What is cost sensitive classification?
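
A small sketch of the threshold rule on this card, assuming FP and FN stand for the costs of a false positive and a false negative. The function name `cost_sensitive_predict` and its parameter names are hypothetical.

```python
def cost_sensitive_predict(p_positive, cost_fp, cost_fn):
    """Predict 1 (positive) when the probability exceeds the cost-based threshold."""
    # Threshold from the cost matrix: FP / (FP + FN).
    threshold = cost_fp / (cost_fp + cost_fn)
    return 1 if p_positive > threshold else 0
```

With equal costs the threshold is the traditional 0.5. If a false negative is assumed to cost 9 times as much as a false positive, the threshold drops to 0.1, so more examples are classified as positive.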

9
Q

REVERSED

  • Define multiple subgroups in a dataset, check parity between these subgroups
  • A statistical constraint is needed
A

What is group fairness?

10
Q

REVERSED

D = {X,S,Y} is a dataset
* X: the set of attributes that do not contain sensitive information regarding individuals
* S: the set of sensitive attributes containing sensitive information
* Y/Y*: either 0 or 1 is the original/predicted class label of individuals, which indicates the decision outcome
* G/G’: the values of the unprivileged/privileged group

A

What are the symbols used for defining fairness metrics?

11
Q

REVERSED

  • Divide the data into two equal time ranges
  • Calculate the average of the observations in each of the two time ranges. Plot the average at the mid-point of each time range.
  • Draw a straight line between the two points
A

How does the semi average method work for finding the trend?
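
The semi-average steps might look like this in Python. It is a sketch under two assumptions that the card does not state: time is indexed 0..n-1, and when n is odd the middle observation is dropped so that both ranges are equal. The name `semi_average_trend` is hypothetical.

```python
def semi_average_trend(values):
    """Return (slope, intercept) of the semi-average trend line over x = 0..n-1."""
    n = len(values)
    half = n // 2
    # Two equal time ranges; the middle value is dropped when n is odd (assumption).
    first, second = values[:half], values[n - half:]
    y1 = sum(first) / half
    y2 = sum(second) / half
    # Mid-points of the two ranges on the time axis.
    x1 = (half - 1) / 2
    x2 = (n - half) + (half - 1) / 2
    # Straight line through (x1, y1) and (x2, y2).
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept
```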

12
Q

REVERSED

PP and EO need original and model

DP, DI and consistency can be computed from either the original or the model

A

Which fairness metrics need the original dataset and the model?

13
Q

REVERSED

  • Naive forecasting
  • Simple mean
  • Moving average
  • Weighted moving average
  • Exponential smoothing
A

What are 5 methods for forecasting the trend?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

REVERSED

  • A smaller a makes the forecast more stable
  • A larger a makes the forecast more responsive
A

What do different values of a do for an exponential smoothing forecast?

15
Q

REVERSED

bias in the training datasets

A

Where does bias in algorithms come from?

16
Q

REVERSED

  • Forecasts are more accurate for aggregated data than for individual items
  • Forecasts are more accurate for shorter than longer time periods
A

What makes demand forecasts more accurate?

17
Q

REVERSED

series that measure activity over a period up to a specific date e.g. retail sales, balance of payments

A

What is a flow series?

18
Q

REVERSED

  • Sensitive attributes should not affect the outcome labels
  • Identify “proxy” attributes that are related to the protected attributes
A

What is causal fairness?

19
Q

REVERSED

  • Collect more data - difficult in many domains
  • Delete data from the majority class
  • Create synthetic data
  • Adapt your learning algorithm (cost sensitive classification)
  • Random over/under sampling
A

What are 5 options for handling imbalanced data?

20
Q

REVERSED

  1. Take the difference between a sample point and one of its nearest neighbours
  2. Multiply the difference by a random number between 0 and 1 and add it to the feature vector
A

What are the steps of creating data with SMOTE?
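
The two steps could be sketched as below. This is an illustrative, standard-library-only version rather than the implementation from any particular package: the function name `smote_point`, the default `k=5`, and the use of Euclidean distance are assumptions made for the example.

```python
import math
import random

def smote_point(sample, minority, k=5, rng=None):
    """Create one synthetic point from `sample` and one of its k nearest minority neighbours."""
    rng = rng or random.Random()
    # k nearest minority-class neighbours of the sample (excluding the sample itself).
    others = [p for p in minority if p is not sample]
    neighbours = sorted(others, key=lambda p: math.dist(sample, p))[:k]
    neighbour = rng.choice(neighbours)
    gap = rng.random()  # random number in [0, 1)
    # Step 1: difference between the neighbour and the sample.
    # Step 2: multiply by the random number and add to the sample's feature vector.
    return [x + gap * (nx - x) for x, nx in zip(sample, neighbour)]
```

The synthetic point always lies on the line segment between the sample and the chosen neighbour.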

21
Q

REVERSED

Balanced accuracy = (sensitivity + specificity)/2

A

What is the balanced accuracy measure?
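
A quick sketch of the formula from confusion-matrix counts. The function name `balanced_accuracy` and the argument order are hypothetical choices for this example.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Balanced accuracy = (sensitivity + specificity) / 2."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return (sensitivity + specificity) / 2
```

With 10 positives of which 5 are found, and 90 negatives all classified correctly, plain accuracy is 0.95 while `balanced_accuracy(5, 5, 90, 0)` is 0.75, which shows how the measure penalises poor performance on the minority class.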

22
Q

REVERSED

  • Pick a hash function h that maps each of the N elements to at least log2(N) bits
  • For each stream element a, let r(a) be the number of trailing 0s in h(a)
  • r(a) = position of the first 1 counting from the right, with the rightmost bit at position 0
  • Record R = the maximum r(a) seen
  • Estimated number of distinct elements = 2^R
A

What is the Flajolet-Martin approach?
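
A minimal single-hash sketch of the approach. Using the first 32 bits of SHA-256 as the hash function h is an assumption made for illustration, and the names `trailing_zeros` and `fm_estimate` are hypothetical.

```python
import hashlib

def trailing_zeros(x, width=32):
    """Number of trailing 0 bits in x (defined as `width` when x is 0)."""
    if x == 0:
        return width
    return (x & -x).bit_length() - 1

def fm_estimate(stream):
    """Estimate the number of distinct elements as 2^R."""
    R = 0
    for a in stream:
        digest = hashlib.sha256(str(a).encode()).digest()
        h = int.from_bytes(digest[:4], "big")  # 32-bit hash value h(a)
        R = max(R, trailing_zeros(h))          # record the maximum r(a) seen
    return 2 ** R
```

Repeated elements hash to the same value, so they never change R. The estimate is always a power of 2 and can be far off for a single hash function.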

23
Q

REVERSED

  • a is small -> more weight for the past parameters
  • a is large -> more weight for the present trend
A

What do a high and low alpha represent in exponential smoothing?

24
Q

REVERSED

  • Synthetic minority over-sampling technique (SMOTE)
  • Creates new data points from the minority class
A

What is SMOTE?

25
Q

REVERSED

EO states that instances from protected and unprotected groups should have equal true positive rate (TPR) and false positive rate (FPR)

  • P1 = P[Y*(x) = 1 | S(x) = G’, Y(x) = 1]
  • P2 = P[Y*(x) = 1 | S(x) = G, Y(x) = 1]
  • P3 = P[Y*(x) = 1 | S(x) = G’, Y(x) = 0]
  • P4 = P[Y*(x) = 1 | S(x) = G, Y(x) = 0]
  • For a classifier to be fair: P1=P2 and P3=P4
A

What is equalised odds difference?
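
The four probabilities, together with the AEO(diff) formula from the earlier card, can be computed from labels, predictions and group membership roughly as follows. The function names and the `privileged` argument are hypothetical, and the sketch assumes every group/label combination has at least one row.

```python
def predicted_positive_rate(y_pred, selector):
    """Share of positive predictions among the rows where selector is True."""
    chosen = [p for p, keep in zip(y_pred, selector) if keep]
    return sum(chosen) / len(chosen)  # fails if no rows match (assumed not to happen)

def equalised_odds(y_true, y_pred, group, privileged):
    """Return (P1, P2, P3, P4, AEO(diff)) for a binary sensitive attribute."""
    rows = list(zip(y_true, group))
    p1 = predicted_positive_rate(y_pred, [g == privileged and y == 1 for y, g in rows])  # TPR, G'
    p2 = predicted_positive_rate(y_pred, [g != privileged and y == 1 for y, g in rows])  # TPR, G
    p3 = predicted_positive_rate(y_pred, [g == privileged and y == 0 for y, g in rows])  # FPR, G'
    p4 = predicted_positive_rate(y_pred, [g != privileged and y == 0 for y, g in rows])  # FPR, G
    aeo_diff = ((p1 - p2) + (p3 - p4)) / 2
    return p1, p2, p3, p4, aeo_diff
```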

26
Q

REVERSED

measures of activity at a point in time e.g. employment

A

What is a stock series?

27
Q

REVERSED

  • Eliminate the discrimination from the final predictions
  • Change the predicted outcomes of classifiers by accessing a hold out set that was not involved in the training of the model
A

What is post-processing for mitigation?

28
Q

REVERSED

-The instances in both protected (unprivileged) and unprotected (privileged) groups should have an equal probability of being predicted as the positive outcome
DP(diff) = P[Y(x) = 1 | S(x) = G’] - P[Y(x) = 1 | S(x) = G] ≈ 0
-This metric takes values between 0 and 1 where 0 is the optimal

A

What is Demographic Parity (DP) Difference?
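
A small sketch of DP(diff) for a list of binary outcomes and group labels. The function names and arguments are hypothetical, and `y` may hold either the original labels or a model's predictions.

```python
def positive_rate(y, group, value):
    """P[Y(x) = 1 | S(x) = value] estimated from the data."""
    outcomes = [yi for yi, g in zip(y, group) if g == value]
    return sum(outcomes) / len(outcomes)

def dp_difference(y, group, privileged, unprivileged):
    """DP(diff) = positive rate of the privileged group minus that of the unprivileged group."""
    return positive_rate(y, group, privileged) - positive_rate(y, group, unprivileged)
```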

29
Q

REVERSED

The long term growth or decline of the series

A

What is trend?

30
Q

REVERSED

Inductive hypothesis: after n elements, the sample S contains each element seen so far with probability s/n
Inductive step: for an element already in S, the probability that the algorithm keeps it in S when element n+1 arrives is: … n/(n+1)

So, at time n the tuples in S were there with probability s/n, then at time n+1 each tuple stayed in S with probability n/(n+1), so the probability that a tuple is in S at time n+1 is (s/n)*(n/(n+1)) = s/(n+1)

A

How do you prove that each element is picked with equal probability in reservoir sampling using mathematical induction?
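
The card elides how n/(n+1) arises in the inductive step. One way to fill it in, assuming the algorithm exactly as described on the reservoir sampling card (the new element is kept with probability s/(n+1) and then replaces a uniformly chosen slot):

```latex
\begin{align*}
P(\text{element stays in } S)
  &= \underbrace{\left(1 - \frac{s}{n+1}\right)}_{\text{new element discarded}}
   + \underbrace{\frac{s}{n+1}\cdot\frac{s-1}{s}}_{\text{new element kept, another slot replaced}}
   = \frac{n}{n+1}, \\
P(\text{element in } S \text{ at time } n+1)
  &= \frac{s}{n}\cdot\frac{n}{n+1} = \frac{s}{n+1}.
\end{align*}
```

The newly arrived element is itself kept with probability s/(n+1), so every element seen so far is in S with the same probability.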

31
Q

REVERSED

* A classifier is fair in terms of predictive parity if the probability that an example is positive in the original dataset given that it is predicted positive from both protected and unprotected groups is the same
* P[Y(x) = 1 | Y*(x) = 1, S(x) = G] = P[Y(x) = 1 | Y*(x) = 1, S(x) = G’]

A

What is predictive parity?

32
Q

REVERSED

  • Networks are difficult to converge
  • The goal is for generator and discriminator to reach some desired equilibrium but this is rare
  • GANs are yet to converge on large problems
A

What are 3 problems with GANs?

33
Q

REVERSED

MSE = sum(yt - y*t)^2 / (T - T1 + 1), summed over t = T1, …, T

T=total number of samples in time series
T1 = index of first value to be forecasted
yt = actual value
y*t = predicted value

A

What is the formula for the MSE for testing forecast accuracy?
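
As a sketch, the formula can be evaluated over the forecast periods T1..T like this. The function name `forecast_mse` is hypothetical, and periods are assumed to be 1-indexed as on the card.

```python
def forecast_mse(actual, predicted, t1):
    """MSE over periods t1..T (1-indexed, inclusive), where T = len(actual)."""
    T = len(actual)
    squared_errors = [(actual[t - 1] - predicted[t - 1]) ** 2 for t in range(t1, T + 1)]
    return sum(squared_errors) / (T - t1 + 1)
```

Values before period `t1` are ignored, so placeholder predictions for the unforecast periods do not affect the result.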

34
Q

REVERSED

Random under-sampling: randomly delete data points from the majority class - problem with loss of information

A

What is random undersampling and its problem?

35
Q

REVERSED

Classifiers try to reduce the overall error (increase the accuracy) so they can be biased towards the majority class

A

What is the imbalanced data problem?

36
Q

REVERSED

  • Semi-average
  • Moving average
  • Least-square
  • Exponential smoothing
A

What are 4 common methods for measuring the trend?

37
Q

REVERSED

  • Data enters at a high rate
  • The system cannot store the entire stream, but only a small fraction
A

What is a data streams model?

38
Q

REVERSED

Pre-process the dataset only - try to transform the data so the underlying discrimination is removed

A

What is pre-processing for mitigation?

39
Q

REVERSED

An individual fairness metric measures how similar the labels are for the similar instances in a dataset based on the k-neighbours of the instance
-Takes values between 0 and 1 where 1 is the optimal

A

What is consistency?

40
Q

REVERSED

historical bias in the decision variable
less informative features
biased data collection
imbalanced representation of different demographic groups

A

What are reasons for biased data? (4)

41
Q

REVERSED

a pattern of change that recurs regularly over time

A

What is seasonal variation?

42
Q

REVERSED

  • Trend
  • Seasonal variation
  • Cyclical variation
  • Irregular variation
A

What are the 4 components of time series?

43
Q

REVERSED

Next period’s forecast = average of previously observed data
Yt+1 = (Y1 + Y2 + … + Yt)/t

A

What is simple mean forecasting?

44
Q

REVERSED

Next period’s forecast = previous period’s actual
Yt+1 = Yt

A

What is naive forecasting?

45
Q

REVERSED

Given a set of points (xi, yi), find the best fitting line f(xi) = a + bxi such that SSE = sum (yi - f(xi))^2 is minimised

A

How does least squares linear regression work for finding the trend?

46
Q

REVERSED

Cyclical variations have recurring patterns but with a longer and more erratic time scale compared to seasonal variations

A

What is cyclical variation?

47
Q

REVERSED

  • Huge volumes of continuous data, possibly infinite
  • Fast changing and requires fast, real-time response
  • Random access is expensive - single scan algorithms
A

What are characteristics of a data streams model?

48
Q

REVERSED

  • An irregular (or random) variation in a time series occurs over varying (usually short) periods
  • It follows no pattern and is by nature unpredictable
  • Irregular variation cannot be explained mathematically
A

What is irregular variation?

49
Q

REVERSED

Random oversampling: randomly duplicate data points from the minority class - problem with overfitting and fixed boundaries

A

What is random oversampling and its problem?

50
Q

REVERSED

Must split the data into train/test sets and perform preprocessing on just the training data

A

What do you have to do when performing SMOTE?

51
Q

REVERSED

  • Keep the most recent k items
  • Upon the arrival of a new item from the stream, discard the oldest item
A

What is the sliding window model for data streams?
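
In Python, a bounded `collections.deque` gives this behaviour directly. The generator name `sliding_windows` is a hypothetical choice for this sketch.

```python
from collections import deque

def sliding_windows(stream, k):
    """Yield the current window (the most recent k items) after each arrival."""
    window = deque(maxlen=k)
    for item in stream:
        # When the deque is full, appending discards the oldest item automatically.
        window.append(item)
        yield list(window)
```

For the stream 1, 2, 3, 4, 5 with k = 3, the final window is `[3, 4, 5]`.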

52
Q

REVERSED

  • Past history is used to flatten out short term fluctuations: Sx = ay + (1-a)Sx-1
  • Sx = the smoothed value for observation x
  • y = the actual observation at time x
  • Sx-1 = the smoothed value previously calculated for observation at time x-1
  • a = the smoothing constant where 0 <= a <= 1
A

What is the formula for exponential smoothing for finding the trend?
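
The recurrence can be applied to a whole series as sketched below. The card does not say how the first smoothed value is initialised, so setting it to the first observation is an assumption, as is the function name `exponential_smoothing`.

```python
def exponential_smoothing(observations, a):
    """Return the smoothed series Sx = a*y + (1 - a)*Sx-1 for 0 <= a <= 1."""
    smoothed = [observations[0]]  # assumption: the first smoothed value equals the first observation
    for y in observations[1:]:
        smoothed.append(a * y + (1 - a) * smoothed[-1])
    return smoothed
```

With a = 1 the smoothed series equals the observations; as a approaches 0 it stays close to its starting value.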

53
Q

REVERSED

  • Adjust the time series
  • Seasonally adjusted data = (actual values / seasonal index) * 100
A

How do you remove the seasonal effect?

54
Q

REVERSED

  • root mean squared error
  • mean absolute error (MAE)
  • tracking signal = sum(yt - y*t)/MAE
A

What are 3 other measures for testing forecast accuracy?

55
Q

REVERSED

  1. Determine the number of samples n
  2. Take the mid-point in time as x = 0 and replace each time point by its x value, decreasing by one unit per period before the mid-point and increasing by one unit per period after it
  3. The dependent variable is “y”
  4. Compute sum(xi^2) and sum(xi*yi), where sum(xi) is 0
  5. Find y = a+bx where b = sum(xi*yi)/sum(xi^2) and a = sum(yi)/n
A

What are the steps for finding the values of a and b for least squares linear regression?
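
These steps might be coded as follows. For an even number of samples this sketch uses half-unit x values (for example -1.5, -0.5, 0.5, 1.5) so that sum(xi) is still 0; that choice and the function name `least_squares_trend` are assumptions, since the card does not cover the even case.

```python
def least_squares_trend(y):
    """Return (a, b) of the trend line y = a + b*x with x centred on the mid-point."""
    n = len(y)
    # Centre the time index on the mid-point so that sum(x) = 0.
    x = [i - (n - 1) / 2 for i in range(n)]
    b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
    a = sum(y) / n
    return a, b
```

For y = 2, 4, 6, 8, 10 the x values are -2..2, giving a = 6 and b = 2.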

56
Q

REVERSED

regularly spaced peaks and troughs

A

How can you identify seasonality in a time series?

57
Q

REVERSED

Fb = (1+B^2)(R*P)/(B^2*P + R)

A

What is the Fb measure?
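
A short sketch of the formula, with R as recall and P as precision. The function name `f_beta` is hypothetical; setting B = 1 reduces it to the F1 measure from the first card.

```python
def f_beta(precision, recall, beta=1.0):
    """Fb = (1 + B^2) * R * P / (B^2 * P + R)."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

Values of B above 1 weight recall more heavily, and values below 1 weight precision more heavily.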

58
Q

REVERSED

estimate the counts in an unbiased way. Accept that the count may have a little error, but limit the probability that the error is large

A

What if you do not have the space to maintain the set of elements?

59
Q

REVERSED

  • The generator tries to mimic examples from a training dataset, which is sampled from the true data distribution. Does this by transforming a random source of noise received as input into a synthetic sample. The objective of the generative network is to increase the error rate of the discriminative network
  • The discriminator receives a sample, but it is not told where the sample comes from. Its job is to predict whether it is a data sample or a synthetic sample. The objective of the discriminative network is to decrease the binary classification loss
A

What are the roles of the generator and discriminator in GANs?

60
Q

REVERSED

  • Individuals with similar features except the sensitive (protected) attributes must have the same/similar outcomes
  • A similarity/distance measure is needed
  • Requires strong assumptions regarding the relationship between features and the decision label
A

What is individual fairness?

61
Q

REVERSED

Suffers from propagation error

A

What is a problem with exponential smoothing?

62
Q

REVERSED

  • Adjust/tune the classification algorithm
  • Applied during the model training
A

What is in-processing for mitigation?

63
Q

REVERSED

  • Maintain a sample S of exactly s items
  • Suppose at time n we have seen n items
  • Each item is in the sample S with equal probability s/n
A

What is sampling a fixed sample size?

64
Q

REVERSED

a set of observations measured at specified, usually equal time intervals

A

What is a time series?

65
Q

REVERSED

-Also called rolling window
Next period’s forecast = simple average of the last k periods
Yt+1 = (Yt-k+1 + Yt-k+2 + … + Yt) / k

A

What is moving average forecasting? What is another name for it?
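
As a sketch, the forecast is just the mean of the last k observations. The function name `moving_average_forecast` is hypothetical.

```python
def moving_average_forecast(history, k):
    """Forecast the next period as the simple average of the last k periods."""
    return sum(history[-k:]) / k
```

With k = 1 this reduces to naive forecasting, and with k equal to the length of the history it reduces to simple mean forecasting.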

66
Q

REVERSED

Generative adversarial networks (GANs)
-System of two neural networks (generator and discriminator) competing against each other in a zero-sum framework: improvement in one model comes at a cost to the performance of the other model
-Can learn to draw samples from a model that is similar to the original data

A

What are GANs?

67
Q

REVERSED

  • A smaller k makes the forecast more responsive
  • A larger k makes the forecast more stable
A

What do different values of k do for a moving average forecast?

68
Q

REVERSED

Naive solution: generate a random integer in [0..9] for each query. Store the query if the integer is 0, otherwise discard it
Problem: as the stream grows, the sample size will also grow

A

What is sampling a fixed proportion? What is the problem with it?

69
Q

REVERSED

  • Simple average method
  • Take the average for each period (period mean) over at least 3 years
  • Express each value as an index by comparing it to the average of all periods over the same period of time (divide actual value by period mean to get index)
A

How do you calculate the seasonal index?

70
Q

REVERSED

a high degree of irregularity in the original or seasonally adjusted series, or an abrupt change in the time series characteristics of the original data

A

What can cause the usefulness of trend estimates to decline?

71
Q

REVERSED

  • The discriminator becomes too strong too quickly and the generator ends up not learning anything
  • The generator only learns very specific weaknesses of the discriminator
  • The generator learns only a very small subset of the true data distribution
A

What are 3 of the ways that GANs can fail?

72
Q

REVERSED

  • Fairness through unawareness: deletes the sensitive attributes in a dataset
  • Preferential sampling (re-sampling): data objects are sampled with replacement
  • Massaging (relabeling): changes the actual class labels of some of the instances in the training set
  • Reweighing: assigns weights to each instance in the training set
A

What are 4 examples of pre-processing for mitigation?

73
Q

REVERSED

Next period’s forecast = weighted average of the last k periods
Yt+1 = c1Yt-k+1 + … + ckYt
with c1 + c2 + … + ck = 1

A

What is weighted moving average forecasting?

74
Q

REVERSED

  • Based on the premise that if values in a time series are averaged over a sufficient period, the effect of short term variations will be reduced
  • The degree of smoothing can be controlled by selecting the number of cases to be included in the average
  • A 5-year moving average: for each year, average the 2 previous years, the current year and the 2 following years; this is the moving average for that year. Compute it for each year and plot.
A

How does the moving average method work for finding the trend?

75
Q

REVERSED

majority of the data coming from one class

A

What is imbalanced data?

76
Q

REVERSED

-The ratio between the probability of protected and unprotected groups getting positive or desired outcomes
DI(D) = P[Y(x) = 1 | S(x) = G] / P[Y(x) = 1 | S(x) = G’]
-A dataset or a classifier is considered fair (by law) if its DI-ratio is between 0.8 and 1.25 (1 is the optimal)

A

What is Disparate Impact (DI) ratio?
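
A sketch of the DI ratio and the 0.8 to 1.25 check for binary outcomes. The function names are hypothetical, and `y` may hold either the original labels or a model's predictions.

```python
def positive_rate(y, group, value):
    """P[Y(x) = 1 | S(x) = value] estimated from the data."""
    outcomes = [yi for yi, g in zip(y, group) if g == value]
    return sum(outcomes) / len(outcomes)

def disparate_impact(y, group, privileged, unprivileged):
    """DI = positive rate of the unprivileged group / positive rate of the privileged group."""
    return positive_rate(y, group, unprivileged) / positive_rate(y, group, privileged)

def is_fair_by_di(di):
    """Fairness rule from the card: DI between 0.8 and 1.25."""
    return 0.8 <= di <= 1.25
```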

77
Q

REVERSED

Next period’s forecast = weighted average of the previous reading and the history
Yt+1 = aYt + (1-a)Y*t
Y*t is the forecast for period t from exponential smoothing

A

What is exponential smoothing for trend forecasting?
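
The forecasting recurrence could be sketched as below. The card does not say how the first forecast is initialised, so using the first observation is an assumption, as is the function name `exp_smoothing_forecast`.

```python
def exp_smoothing_forecast(history, a):
    """Forecast the next period with Yt+1 = a*Yt + (1 - a)*Y*t."""
    forecast = history[0]  # assumption: the first forecast Y*1 equals the first observation
    for y in history:
        # New forecast = weighted average of the latest actual and the previous forecast.
        forecast = a * y + (1 - a) * forecast
    return forecast
```

With a = 1 the forecast is the latest observation, which matches naive forecasting; smaller values of a lean more on the history.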