Retail Credit Risk Flashcards

(56 cards)

1
Q

define retail lending

A

an exposure to an individual or small business that is guaranteed by that person

2
Q

what are 4 examples of retail lending.

A

credit cards
residential mortgages
small business facilities
installment loans

3
Q

what are two characteristics of retail lending

A

low individual exposure
managed collectively rather than individually

4
Q

what is a credit risk score

A

a total number of points that predicts a borrower’s future repayment performance based on historical information

5
Q

what is a scorecard

A

a mathematical algorithm used to generate a score for rank-order risk analysis

6
Q

what are scorecards used for

A
  1. lending decisions
  2. mitigation of portfolio credit risk
7
Q

what are two benefits of using a scorecard

A

easy to interpret
easy to monitor

8
Q

what are the 6 stages in model development

A
  1. business objectives
  2. data preparation
  3. model development
  4. model approval
  5. model deployment
  6. monitoring
9
Q

what are 3 aspects of business objectives

A
  1. key issues
  2. expectations for the model
  3. structure
10
Q

define key issues

A

trends, challenges and concerns outlined by the business

11
Q

define structure

A

project team members, data and timeline

12
Q

what are the 5 C’s of data preparation

A
  1. Comprehensiveness
  2. Clean
  3. Consistent
  4. Current
  5. Caretaking
13
Q

define comprehensiveness

A

ensuring the data captures the full scope and complexity of the underlying information

14
Q

define clean

A

ensuring the accuracy of the data

15
Q

define consistent

A

ensuring the uniformity of the data across different sources.

16
Q

define current

A

ensuring the data is up to date

17
Q

define caretaking

A

the ongoing management of the data to preserve its quality

18
Q

what are 6 aspects of the data preparation in the model development lifecycle

A

the 5 C’s
exclusion criteria
timeframe
defining the target and explanatory variables
segmentation (# of models)
sampling

19
Q

what are three sources of exclusion criteria

A

scope
data errors
operational

20
Q

what two periods are involved in the timeframe of model creation

A
  1. observation period
  2. performance period
21
Q

what are two aspects of the observation periods

A
  1. for explanatory variables
  2. should be representative of the current/future environment
22
Q

what are two aspects of performance periods

A
  1. for the target variable
  2. should be long enough to have a sufficient number of defaults.
23
Q

what are the two modeling techniques

A
  1. industry standard
  2. other methodologies
24
Q

compare the advantages of the two modeling techniques

A

industry-standard:
1. few variables
2. expert judgment

other methods:
1. many variables
2. one step for variable reduction and model fitting
3. adaptive learning

25
compare the disadvantages of the two modeling techniques
industry-standard:
1. few variables
2. distributional assumptions
3. separate steps for variable reduction and model fitting

other methods:
1. many variables
2. risk of overfitting
26
what are the 5 steps of the industry standard model development technique?
  1. variable transformation
  2. variable reduction
  3. model fitting
  4. scorecard scaling
  5. scorecard assessment
27
what technique can be used in variable transformation
weight of evidence
28
define variable reduction
removing any variable that cannot be used or doesn't make sense
29
what are two techniques for variable reduction
  1. grouping
  2. variable clustering
30
what is grouping
creating bins within a variable
31
what are three benefits of grouping?
  i. Accounts for non-linear relationships between the target and explanatory variables.
  ii. Accounts for outliers.
  iii. Allows for the treatment of missing values as a separate category.
32
how should grouping be performed?
  1. Start by creating 20 equal bins.
  2. Calculate the WOE of each bin.
  3. Collapse bins with similar WOE.
  4. Remove variables with weak IV.
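Steps 1 to 3 above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: it assumes "equal bins" means equal-frequency bins, that only adjacent bins are collapsed, and that a WOE difference below a hypothetical tolerance `woe_tol` counts as "similar". The function name `group_variable` is made up for this example.

```python
import math

def group_variable(values, defaulted, n_bins=20, woe_tol=0.1):
    """Bin a variable, then merge adjacent bins with similar WOE.

    Returns a list of [non_defaults, defaults] counts per final group.
    Assumes every starting bin holds at least one default and one
    non-default; empty cells would need separate handling.
    """
    pairs = sorted(zip(values, defaulted))
    size = len(pairs) / n_bins

    # Step 1: equal-frequency bins (an assumed reading of "equal bins").
    bins = []
    for k in range(n_bins):
        chunk = pairs[round(k * size):round((k + 1) * size)]
        d = sum(flag for _, flag in chunk)
        bins.append([len(chunk) - d, d])

    total_nd = sum(b[0] for b in bins)
    total_d = sum(b[1] for b in bins)

    # Step 2: WOE of a bin, ln(share of non-defaults / share of defaults).
    def woe(b):
        return math.log((b[0] / total_nd) / (b[1] / total_d))

    # Step 3: collapse adjacent bins whose WOE differs by less than woe_tol.
    merged = [bins[0]]
    for b in bins[1:]:
        if abs(woe(merged[-1]) - woe(b)) < woe_tol:
            merged[-1] = [merged[-1][0] + b[0], merged[-1][1] + b[1]]
        else:
            merged.append(b)
    return merged
```

Step 4 would then compute the information value of the grouped variable and drop the variable if that value is weak.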
33
what is variable clustering
grouping correlated variables together such that variables within a cluster are highly correlated and variables in different clusters are uncorrelated, to reduce the multicollinearity of the model.
34
which two variables should represent the cluster when using variable clustering?
  1. the variable with the highest IV
  2. the variable with the lowest 1-R^2 ratio
35
what are two aspects of model fitting in the industry standard technique?
  1. variable selection: forward, backward, ridge, lasso
  2. assumptions: historical experience predicts future behaviour, and consumer behaviour will not change significantly
36
define scorecard scaling when using the industry standard technique
raw scores are scaled to a three-digit number
37
what is the formula for the score in scorecard scaling
score = offset + factor ⋅ ln(2 ⋅ odds) - PDO (equivalently, score = offset + factor ⋅ ln(odds), where factor = PDO / ln(2))
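A minimal sketch of the scaling formula, under the usual convention that PDO means "points to double the odds", so factor = PDO / ln(2) and the offset is pinned by choosing a base score at some base odds. The function name and the default calibration (600 points at odds of 50:1, PDO of 20) are illustrative assumptions, not values from the cards.

```python
import math

def scale_score(odds, pdo=20.0, base_score=600.0, base_odds=50.0):
    """Convert good:bad odds to a scaled score.

    The default calibration values are illustrative assumptions only.
    """
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    # Same form as the card: offset + factor * ln(2 * odds) - PDO.
    # Because factor * ln(2) == PDO, this equals offset + factor * ln(odds).
    return offset + factor * math.log(2 * odds) - pdo
```

With this calibration, `scale_score(50)` gives 600 and `scale_score(100)` gives 620 (up to floating-point rounding): doubling the odds adds exactly PDO points.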
38
what are the 3 types of scorecard assessment
  1. rank ordering
  2. population stability
  3. benchmarking
39
what are the 5 evaluation metrics used in rank ordering scorecard assessment
  1. KS statistic
  2. misclassification
  3. ROC curve
  4. accuracy ratio
  5. lift chart
40
what does population stability do
quantify population differences by measuring the shift between two sample distributions
41
what is the formula for the population stability index (PSI)
PSI = ∑ [(N_bin - B_bin) ⋅ ln(N_bin / B_bin)]
42
what values of PSI indicate: no significant shift, a minor shift, a significant shift
  <0.1: no significant shift
  0.1-0.25: minor shift
  >0.25: significant shift
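The PSI formula and its thresholds can be sketched as below. The sketch assumes that N_bin and B_bin are the shares of the new and baseline samples falling in each bin (each list sums to 1) and that no share is zero. The function names are hypothetical, and how the exact boundary values 0.1 and 0.25 are classified is an assumption.

```python
import math

def psi(new_shares, base_shares):
    """Population stability index over matching bins.

    Each argument is a list of bin proportions; zero shares are not
    handled here and would need smoothing or bin merging first.
    """
    return sum((n - b) * math.log(n / b)
               for n, b in zip(new_shares, base_shares))

def psi_shift(value):
    """Map a PSI value to the categories on the card."""
    if value < 0.1:
        return "no significant shift"
    if value <= 0.25:
        return "minor shift"
    return "significant shift"
```

For example, a baseline spread evenly over four bins compared against new shares of 0.1, 0.2, 0.3 and 0.4 gives a PSI of roughly 0.23, which falls in the minor-shift band.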
43
what is benchmarking
comparing a scorecard to an existing scorecard
44
what is the KS statistic
the maximum difference between the CDFs of the distributions of defaults and non-defaults
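A small sketch of the KS statistic computed directly from this definition, using the empirical CDFs of the two score samples. The function name is an assumption for illustration, and the brute-force loop favours clarity over speed.

```python
def ks_statistic(default_scores, non_default_scores):
    """Maximum gap between the empirical CDFs of the two score samples."""
    thresholds = sorted(set(default_scores) | set(non_default_scores))
    best = 0.0
    for t in thresholds:
        # Share of each sample scoring at or below the threshold t.
        cdf_d = sum(s <= t for s in default_scores) / len(default_scores)
        cdf_nd = sum(s <= t for s in non_default_scores) / len(non_default_scores)
        best = max(best, abs(cdf_d - cdf_nd))
    return best
```

A value of 1 means the two score distributions are completely separated; a value of 0 means they are identical.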
45
what is misclassification
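the share of accounts the model classifies incorrectly at a given cutoff, summarized in the confusion matrix (true/false positives and negatives)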
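As a rough illustration, a confusion matrix for a scorecard can be tabulated at a chosen cutoff. The sketch assumes that lower scores mean higher risk, so an account is predicted to default when its score is below the cutoff; the function names and that convention are assumptions, not taken from the cards.

```python
def confusion_matrix(scores, defaulted, cutoff):
    """Count outcomes when predicting default for score < cutoff (assumed)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for score, is_default in zip(scores, defaulted):
        predicted_default = score < cutoff
        if predicted_default and is_default:
            counts["tp"] += 1   # default correctly flagged
        elif predicted_default:
            counts["fp"] += 1   # non-default wrongly flagged
        elif is_default:
            counts["fn"] += 1   # default missed
        else:
            counts["tn"] += 1   # non-default correctly passed
    return counts

def misclassification_rate(counts):
    """Share of accounts placed in the wrong class."""
    return (counts["fp"] + counts["fn"]) / sum(counts.values())
```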
46
what is the ROC curve
a plot of the true positive rate against the false positive rate; the area under the curve (AUC) is the probability that a randomly chosen non-default will be ranked higher than a randomly chosen default
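The probability interpretation can be checked with a brute-force sketch that compares every non-default with every default. It assumes a higher score means lower risk and counts ties as half a win, a common convention that the cards do not specify. The function name is hypothetical.

```python
def auc(default_scores, non_default_scores):
    """Probability that a random non-default outscores a random default."""
    wins = 0.0
    for nd in non_default_scores:
        for d in default_scores:
            if nd > d:
                wins += 1.0
            elif nd == d:
                wins += 0.5  # ties split evenly (assumed convention)
    return wins / (len(default_scores) * len(non_default_scores))
```

A result of 1.0 means perfect rank ordering, while 0.5 is what a random score would achieve on average.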
47
what is the formula of the accuracy ratio
AR=GINI/(Perfect GINI)
48
what is the GINI index
the area between the Lorenz curve and the random (diagonal) curve
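A sketch of the accuracy ratio built from these two definitions. It assumes the Lorenz curve plots the cumulative share of defaults captured against the cumulative share of the population, with accounts ordered from lowest (riskiest) to highest score, and it uses the trapezoid rule for the area. The function name is an assumption, and tied scores are not treated specially.

```python
def accuracy_ratio(scores, defaulted):
    """AR = model Gini area / perfect-model Gini area."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # riskiest first
    n = len(scores)
    n_def = sum(defaulted)

    area = 0.0           # area under the model's Lorenz curve
    captured_prev = 0.0  # share of defaults captured so far
    cum_defaults = 0
    for i in order:
        cum_defaults += defaulted[i]
        captured = cum_defaults / n_def
        area += (captured_prev + captured) / 2 / n  # trapezoid of width 1/n
        captured_prev = captured

    gini = area - 0.5                     # area above the random diagonal
    perfect_gini = (1 - n_def / n) / 2    # same area for a perfect model
    return gini / perfect_gini
```

A perfectly ordered scorecard gives an accuracy ratio of 1, and a perfectly reversed one gives -1.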
49
what is a lift chart
the cumulative % of defaults per decile divided by the total population % of defaults.
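One way to read this definition is as a cumulative lift: the default rate among the riskiest k deciles divided by the default rate of the whole population. The sketch below follows that reading and assumes lower scores are riskier; the function name and the rounding used to form decile cutoffs are illustrative choices.

```python
def cumulative_lift(scores, defaulted, n_bins=10):
    """Cumulative default rate through each bin divided by the overall rate.

    Assumes at least n_bins accounts and at least one default.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # riskiest first
    flags = [defaulted[i] for i in order]
    overall_rate = sum(flags) / len(flags)

    lifts = []
    for k in range(1, n_bins + 1):
        cutoff = round(k * len(flags) / n_bins)  # accounts in the first k bins
        cum_rate = sum(flags[:cutoff]) / cutoff
        lifts.append(cum_rate / overall_rate)
    return lifts
```

A lift well above 1 in the first deciles indicates that defaults are concentrated among the lowest scores; by construction, the last value is always 1.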
50
what does weight of evidence do
transforms explanatory variables into a set of groups based on the similarity of the target variable distributions.
51
what does WOE measure
how strong a group is at separating defaults from non-defaults
52
what does a negative WOE signify?
the group holds a larger share of total defaults than of total non-defaults
53
what is the formula for WOE
WOE = ln[(# non-defaults / total non-defaults) / (# defaults / total defaults)]
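The formula translates directly into a short function. The name and argument names are illustrative assumptions, and counts of zero are not handled.

```python
import math

def woe(non_defaults, defaults, total_non_defaults, total_defaults):
    """Weight of evidence for one group of a variable."""
    share_non_defaults = non_defaults / total_non_defaults
    share_defaults = defaults / total_defaults
    return math.log(share_non_defaults / share_defaults)
```

For a group holding 100 of 1,000 non-defaults and 30 of 100 defaults, the WOE is ln(0.1 / 0.3), about -1.10. The sign is negative because the group's share of defaults exceeds its share of non-defaults.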
54
what is a variable's information value
the predictive power of a single variable (its ability to separate defaults from non-defaults)
55
what is the formula for information value
IV = ∑_i [(# non-defaults_i / total non-defaults - # defaults_i / total defaults) ⋅ WOE_i]
56
what IV value ranges indicate: very weak, weak, moderate, strong
  <0.02: very weak
  0.02-0.1: weak
  0.1-0.3: moderate
  0.3+: strong
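A minimal sketch that computes a variable's information value from per-group counts and maps it to the bands above. The function names are hypothetical, each group is assumed to contain at least one default and one non-default, and the treatment of the exact boundary values 0.02, 0.1 and 0.3 is an assumption.

```python
import math

def information_value(groups):
    """groups: list of (non_defaults, defaults) counts, one pair per group."""
    total_nd = sum(nd for nd, _ in groups)
    total_d = sum(d for _, d in groups)
    iv = 0.0
    for nd, d in groups:
        share_nd = nd / total_nd
        share_d = d / total_d
        group_woe = math.log(share_nd / share_d)  # WOE of this group
        iv += (share_nd - share_d) * group_woe
    return iv

def iv_strength(iv):
    """Map an information value to the bands on the card."""
    if iv < 0.02:
        return "very weak"
    if iv < 0.1:
        return "weak"
    if iv < 0.3:
        return "moderate"
    return "strong"
```

For example, three groups with counts (400, 10), (350, 30) and (250, 60) give an IV of about 0.73, which lands in the strong band.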