Adversarial Forensics/ DS Theory/ Deep Web Forensics Flashcards

1
Q

Considering an adversarial set-up between a forger (F) and a forensic analyst (FA), we can state that …

Select one or more:

a. FA never cares about false negatives
b. FA wants to detect fake data
c. F aims at maximizing the probability that the data are classified as valid by FA with minimum distortion
d. F aims at altering the data so that FA detects them as valid.
e. F aims at maximizing the probability that the data are classified as valid by FA regardless of distortion

A

b. FA wants to detect fake data
c. F aims at maximizing the probability that the data are classified as valid by FA with minimum distortion
d. F aims at altering the data so that FA detects them as valid.

2
Q

According to Dempster-Shafer theory, the belief associated with a given piece of evidence or hypothesis …

Select one or more:

a. … is always lower than the plausibility.
b. … can be higher than probability.
c. … corresponds to 1 minus the plausibility of that evidence/hypothesis
d. … characterizes how much the evidence/hypothesis is provable.

A

a. … is always lower than the plausibility.

d. … characterizes how much the evidence/hypothesis is provable.
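
The relation between belief and plausibility can be checked numerically. The following is a minimal sketch, assuming a hypothetical mass function over the frame {a, b, c} (the values are illustrative and not taken from the card). Belief sums the masses of the subsets of a hypothesis, plausibility sums the masses of the sets that intersect it, so Bel(A) ≤ Pl(A) and Pl(A) = 1 − Bel(not A).

```python
from itertools import combinations

# Hypothetical mass function over the frame {a, b, c}; values are illustrative only.
FRAME = frozenset("abc")
mass = {
    frozenset("a"): 0.5,
    frozenset("ab"): 0.2,
    frozenset("c"): 0.1,
    FRAME: 0.2,  # mass left on the whole frame (ignorance)
}

def belief(hypothesis, m):
    """Sum of the masses of all focal sets contained in the hypothesis."""
    return sum(v for s, v in m.items() if s <= hypothesis)

def plausibility(hypothesis, m):
    """Sum of the masses of all focal sets that intersect the hypothesis."""
    return sum(v for s, v in m.items() if s & hypothesis)

def all_subsets(frame):
    """Every non-empty subset of the frame."""
    items = sorted(frame)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

h = frozenset("a")
bel, pl = belief(h, mass), plausibility(h, mass)
print(f"Bel({{a}}) = {bel:.2f}, Pl({{a}}) = {pl:.2f}")
```

With these assumed masses the interval for {a} is [0.5, 0.9]: the mass on {a, b} and on the whole frame does not prove {a}, but it does not contradict it either.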

3
Q

Given a set of valid data sources that can be modeled as Bernoulli processes with probability p = 0.8, 0.85, 0.75 and a fake data source characterized by probability q = 0.9, which target probability should a forger obtain by altering the input data (assuming that the forensic analyst wants to zero the probability of rejecting valid data)?

A

0.85
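
The reasoning behind this answer can be sketched in a few lines. This is an illustrative model, not a prescribed method: it assumes the analyst accepts any estimate that matches one of the valid sources, and that the forger's distortion is measured as |p − q|, so the forger picks the valid probability closest to q. The function name `forger_target` is hypothetical.

```python
def forger_target(valid_probs, q):
    """Return the valid probability closest to q (smallest change for the forger).

    Assumption: distortion is the absolute difference between the fake
    source probability q and the probability the forger moves to.
    """
    return min(valid_probs, key=lambda p: abs(p - q))

# Values from this card.
target = forger_target([0.8, 0.85, 0.75], 0.9)
print(target)  # 0.85
```

Under the same assumptions, valid sources at 0.59, 0.71, 0.61 with q = 0.91 give a target of 0.71.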

4
Q

Which among the following adversarial ML attacks is the LEAST computationally expensive?

Select one:

a. Jacobian-based Saliency Map Attack (JSMA)
b. Fast Gradient Sign method (FGSM)
c. Deepfool
d. Broyden-Fletcher-Goldfarb-Shanno method (BFGS)
e. Carlini & Wagner

A

b. Fast Gradient Sign method (FGSM)
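
FGSM is cheap because it needs a single gradient evaluation: x_adv = x + ε · sign(∇x loss). A minimal sketch, assuming a logistic-regression classifier with hypothetical weights, input and budget ε (none of these values come from the card):

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def score(x, w, b):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(x, y, w, b, eps):
    """One-step FGSM for logistic regression.

    The gradient of the cross-entropy loss with respect to the input is
    (score - y) * w, so one forward pass and one gradient are enough.
    """
    p = score(x, w, b)
    grad = [(p - y) * wi for wi in w]
    sign = [(g > 0) - (g < 0) for g in grad]
    return [xi + eps * s for xi, s in zip(x, sign)]

# Hypothetical model and sample, chosen only for illustration.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1
x_adv = fgsm(x, y, w, b, eps=0.5)
print(x_adv, score(x, w, b), score(x_adv, w, b))
```

In this toy setting the score of the true class drops below 0.5 after one step. Iterative or optimization-based attacks such as DeepFool, JSMA, L-BFGS and Carlini & Wagner repeat gradient computations many times, which is where their extra cost comes from.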

5
Q

Given a set of valid data sources that can be modeled as Bernoulli processes with probability p = 0.57, 0.6, 0.55 and a fake data source characterized by probability q = 0.74, which reference probability should a forensic analyst choose in order to avoid discarding valid data and minimize the probability of accepting fake data?

Assume that the forensic analyst computes the difference between the estimated probability and a reference plausible value with an acceptance threshold equal to 0.2; assume also that the analyst wants to zero the probability of rejecting valid data.

A

0.40
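
One way to arrive at this value, sketched under stated assumptions: the analyst accepts an estimate when its distance from the reference r is at most the threshold t. Accepting every valid source then requires max(p) − t ≤ r ≤ min(p) + t, and within that interval the reference farthest from q is the safest choice. The function name `analyst_reference` is hypothetical.

```python
def analyst_reference(valid_probs, q, threshold):
    """Pick the reference that accepts all valid sources and is farthest from q.

    Assumption: a source with probability p is accepted when |p - r| <= threshold.
    """
    lo = max(valid_probs) - threshold  # smallest r that still accepts max(p)
    hi = min(valid_probs) + threshold  # largest r that still accepts min(p)
    if lo > hi:
        raise ValueError("no reference accepts every valid source")
    # The farthest point of an interval from q is always one of its endpoints.
    return max((lo, hi), key=lambda r: abs(r - q))

ref = analyst_reference([0.57, 0.6, 0.55], q=0.74, threshold=0.2)
print(round(ref, 2))  # 0.4
```

Here the feasible interval is [0.40, 0.75]; choosing 0.40 places q = 0.74 at distance 0.34 from the reference, well outside the acceptance threshold.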

6
Q

According to the DS theory, the value of the mass function for the empty set ∅ is 0.

Select one:
True
False

A

True

7
Q

Among the following, select the wrong statement about the Fréchet Inception Distance (FID).

Select one:

a. It is robust to noise
b. It models real and fake data as multivariate distributions.
c. It is not affected by visual artifacts
d. It can be computed using only the mean and covariance matrix of the fake and real data.

A

c. It is not affected by visual artifacts
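
FID fits a Gaussian to the features of real and generated data and computes d² = ||μ_r − μ_g||² + Tr(Σ_r + Σ_g − 2(Σ_r Σ_g)^(1/2)). The sketch below is a simplification under two loudly stated assumptions: the feature vectors are hypothetical (in practice they come from an Inception network), and the covariances are treated as diagonal, in which case the trace term reduces to the sum of squared differences of the per-feature standard deviations.

```python
import math
from statistics import fmean, pvariance

def fid_diagonal(real, fake):
    """Frechet distance between Gaussians fitted to two feature sets.

    Assumption: diagonal covariances (features treated as independent),
    so only per-feature means and standard deviations are needed.
    """
    total = 0.0
    for d in range(len(real[0])):
        r = [row[d] for row in real]
        f = [row[d] for row in fake]
        mean_term = (fmean(r) - fmean(f)) ** 2
        std_term = (math.sqrt(pvariance(r)) - math.sqrt(pvariance(f))) ** 2
        total += mean_term + std_term
    return total

# Hypothetical 2-D feature vectors, for illustration only.
real = [[0.0, 1.0], [2.0, 3.0]]
shifted = [[1.0, 2.0], [3.0, 4.0]]  # same spread, every feature shifted by 1
print(fid_diagonal(real, real), fid_diagonal(real, shifted))
```

Only first- and second-order statistics enter the computation, which matches option d; any artifact that changes those statistics changes the score.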

8
Q

Given three IDS whose evidence results can be combined using DS theory, we can state that …

Select one or more:

a. … you must have a mass entry also for null intersections
b. … order of binary combination does not matter.
c. … you must have a mass entry for each non-null intersection

A

b. … order of binary combination does not matter.

c. … you must have a mass entry for each non-null intersection

9
Q

Given a set of valid data sources that can be modeled as Bernoulli processes with probability p = 0.59, 0.71, 0.61 and a fake data source characterized by probability q = 0.91, which target probability should a forger obtain by altering the input data? Assume that the forensic analyst computes the difference between the estimated probability and a reference plausible value; assume also that the analyst wants to zero the probability of rejecting valid data.

A

0.71

10
Q

In adversarial machine learning, which of the following attack combination scenarios proves to be the most challenging?

Select one:

a. Blackbox individual attack
b. Blackbox targeted attack
c. Whitebox universal attack.
d. Whitebox targeted attack

A

b. Blackbox targeted attack

11
Q

According to Dempster-Shafer theory, the domain of the mass function is the universe set X = {a, b, c, …} corresponding to the different possible events.

Select one:
True
False

A

False

12
Q

According to the DS theory, plausibility is the probability that A is provable (supported) by the evidence.

Select one:
True
False

A

False

13
Q

Which of the following attack types most likely targets an IDS?

Select one:

a. Poisoning attack
b. Evasion attack
c. Model extraction

A

b. Evasion attack

14
Q

In clean data learning for anomaly detection, we can NOT use the training set to create …

Select one:

a. A dynamical system from Partial Differential Equations
b. ARMA models
c. Recurrent Neural Networks
d. A multiclass SVM classifier

A

d. A multiclass SVM classifier

15
Q

Which of the following loss functions can lead to a vanishing gradient problem in a GAN?

Select one:

a. L(D, G) = −0.5 E_z[log D(G(z))]
b. L(D, G) = D(x) − D(G(z))
c. L(D, G) = E_x[log D(x)] + E_z[log(1 − D(G(z)))]

A

c. L(D, G) = E_x[log D(x)] + E_z[log(1 − D(G(z)))]
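
The effect can be seen by differentiating the generator-side term with respect to the discriminator logit a, where D(G(z)) = sigmoid(a). A minimal sketch (function names are hypothetical): for log(1 − sigmoid(a)) the derivative is −sigmoid(a), which goes to 0 when the discriminator confidently rejects fakes, while for the alternative −log(sigmoid(a)) it is −(1 − sigmoid(a)), which stays close to −1.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def grad_saturating(a):
    """d/da log(1 - sigmoid(a)): generator term of the minimax loss."""
    return -sigmoid(a)

def grad_non_saturating(a):
    """d/da [-log sigmoid(a)]: alternative generator loss."""
    return -(1.0 - sigmoid(a))

# The more negative the logit, the more confidently D labels the sample as fake.
for a in (0.0, -5.0, -10.0):
    print(a, grad_saturating(a), grad_non_saturating(a))
```

Early in training the discriminator usually wins easily, so the minimax form gives the generator almost no gradient exactly when it needs it most.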

16
Q

Given a GAN where the generator function is denoted G(z) and the discriminator function D(x) (D(x) = 0 implies x is fake, D(x) = 1 implies x is real), the function to be minimized by the generator and maximized by the discriminator is

L(D, G) = E[log D(x)] + E[log(1 − D(G(z)))]

The distribution of real data is modeled by p_r(x), while the fake data follows the distribution p_g(x). It is therefore possible to state …

Select one or more:

a. If p_g → p_r then D(G(z)) → 1/2
b. If p_g → p_r then D(G(z)) → 0
c. The generator aims at having D(G(z)) = 0
d. The learning process can be assimilated to a MAP estimation problem.
e. The cost function takes into account the vanishing gradient problem at the generator.
f. The discriminator aims at having D(G(z)) = 0

A

a. If p_g → p_r then D(G(z)) → 1/2
d. The learning process can be assimilated to a MAP estimation problem.
f. The discriminator aims at having D(G(z)) = 0
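
The limit D(G(z)) → 1/2 follows from the optimal discriminator for a fixed generator, D*(x) = p_r(x) / (p_r(x) + p_g(x)). The sketch below checks this on a hypothetical discrete distribution (the probabilities are illustrative): when p_g equals p_r, D* is 1/2 everywhere and the loss equals −2 log 2.

```python
import math

def optimal_discriminator(p_r, p_g):
    """Pointwise optimal D for a fixed generator: p_r / (p_r + p_g)."""
    return [r / (r + g) for r, g in zip(p_r, p_g)]

def gan_value(p_r, p_g, d):
    """L(D, G) = E_{x~p_r}[log D(x)] + E_{x~p_g}[log(1 - D(x))] on a discrete support."""
    real_term = sum(r * math.log(di) for r, di in zip(p_r, d))
    fake_term = sum(g * math.log(1.0 - di) for g, di in zip(p_g, d))
    return real_term + fake_term

p_r = [0.2, 0.5, 0.3]  # hypothetical real-data distribution
p_g = list(p_r)        # generator has matched the real distribution
d_star = optimal_discriminator(p_r, p_g)
value = gan_value(p_r, p_g, d_star)
print(d_star, value, -2 * math.log(2))
```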

17
Q
Among the weaknesses of Tor networks we can mention
Select one or more:
a. timing analysis 
b. packet stream profiling
c. attacks on the exit node
A

a. timing analysis

c. attacks on the exit node

18
Q

The Inception Score (IS) metric yields high values for GANs that … (select the correct statement)

Select one:

a. generate realistic data x
b. aim at making p(y|x) as uniform as possible.
c. aim at making p(y) as uniform as possible.
d. generate p(y) with a low entropy.

A

c. aim at making p(y) as uniform as possible.
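
The Inception Score is IS = exp(E_x[KL(p(y|x) || p(y))]), where p(y) is the average of p(y|x) over the generated samples. A minimal sketch with hypothetical classifier outputs for two classes (in practice p(y|x) comes from an Inception network):

```python
import math

def inception_score(cond_probs):
    """exp of the mean KL divergence between p(y|x) and the marginal p(y)."""
    n = len(cond_probs)
    k = len(cond_probs[0])
    marginal = [sum(row[j] for row in cond_probs) / n for j in range(k)]
    kl_sum = 0.0
    for row in cond_probs:
        # Terms with p = 0 contribute nothing to the KL divergence.
        kl_sum += sum(p * math.log(p / marginal[j]) for j, p in enumerate(row) if p > 0)
    return math.exp(kl_sum / n)

# Hypothetical p(y|x) rows, one per generated sample.
sharp_and_diverse = [[1.0, 0.0], [0.0, 1.0]]  # confident labels, uniform p(y)
blurry = [[0.5, 0.5], [0.5, 0.5]]             # uniform p(y|x)
collapsed = [[1.0, 0.0], [1.0, 0.0]]          # confident labels, low-entropy p(y)
print(inception_score(sharp_and_diverse), inception_score(blurry), inception_score(collapsed))
```

Only the first case reaches the maximum for two classes (a score of 2); both the blurry and the mode-collapsed generators score 1, which is why a uniform p(y) matters.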

19
Q
Assign each [belief, plausibility] pair to the correct statement.
[1,1]
[0,1]
[0.3,0.7]
[0.75,0.9]

The hypothesis is not totally possible.
I cannot conclude anything about the hypothesis.
I am sure that the hypothesis is true.
Although not completely plausible, there is a strong belief that the hypothesis is true.

A

[1,1] → I am sure that the hypothesis is true.
[0,1] → I cannot conclude anything about the hypothesis.
[0.3,0.7] → The hypothesis is not totally possible.
[0.75,0.9] → Although not completely plausible, there is a strong belief that the hypothesis is true.

20
Q

In the Tor network, …

Select one or more:

a. The Onion Proxy knows all the keys
b. data are encrypted at the Onion Proxy only
c. data are encrypted at each link
d. the whole routing path can be hidden

A

c. data are encrypted at each link

d. the whole routing path can be hidden

21
Q

In the Tor network, traffic profiling can be avoided …

Select one or more:

a. including padding packets in the stream
b. rerouting packets along different paths.
c. splitting streams into chunks of 512 bytes

A

a. including padding packets in the stream

c. splitting streams into chunks of 512 bytes
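
Fixed-size cells hide the length of individual messages from an observer. A simplified sketch of the idea, assuming zero-padding of the last cell and ignoring the header fields that real cells carry (the function name `to_cells` is hypothetical):

```python
CELL_SIZE = 512  # bytes, as stated in the card

def to_cells(stream, cell_size=CELL_SIZE):
    """Split a byte stream into fixed-size cells, zero-padding the last one.

    Simplification: no header and no length field, so the padding
    cannot be stripped again on the receiving side.
    """
    cells = []
    for start in range(0, len(stream), cell_size):
        chunk = stream[start:start + cell_size]
        cells.append(chunk.ljust(cell_size, b"\x00"))
    return cells

cells = to_cells(b"x" * 1300)
print(len(cells), {len(c) for c in cells})  # 3 {512}
```

Every cell on the wire has the same size, so only the number and timing of cells remain observable; padding packets address the former, while timing analysis remains a weakness.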

22
Q

In adversarial ML, which among the following definitions is the most suitable for an attack algorithm that aims at forcing the classifier into misclassifying a single sample into a predefined class and knows the detection algorithm, as well as its parameters?

a. Individual untargeted whitebox attack
b. Individual targeted blackbox attack
c. Universal targeted whitebox attack
d. Individual targeted whitebox attack

A

d. Individual targeted whitebox attack

23
Q

Which among the following adversarial strategies perturbs one feature at a time?

Select one:

a. BFGS
b. JSMA
c. C&W

A

b. JSMA

24
Q

Provide a description of Generative Adversarial Networks (GANs): define principles, structures, cost functions and related problems.

A
25
Q

Suppose that you want to evaluate a set of different GANs using the Fréchet Inception Distance (FID) and the Inception Score (IS). How do they work and what are the related issues?

A
26
Q

Considering the Fréchet Inception Distance (FID), select the wrong statement among the following.

a. It can detect mode dropping
b. It is sensitive to some artifacts
c. It favors detectable objects rather than realistic ones
d. It assumes a Gaussian distribution

A

c. It favors detectable objects rather than realistic ones

27
Q

Which among the following technologies is not offered as an illegal on-demand service (Crime-as-a-Service or CaaS)?

a. Adversarial Machine Learning
b. Deepfake creation
c. Ransomware
d. Phishing

A

a. Adversarial Machine Learning

28
Q

Considering the Tor protocol, select the wrong statement among the following.

Select one:

a. Message m through relay nodes R1, R2, R3 is encrypted as K1(K2(K3(m)))
b. Only the first router knows the destination node
c. Padding packets can be used to keep the connection alive
d. Using 512-byte cells helps mitigate the problem of profiling

A

b. Only the first router knows the destination node
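
The layering K1(K2(K3(m))) can be illustrated with a toy model. This is a sketch under explicit assumptions: the cipher is a hash-based XOR keystream used purely for illustration (it is NOT secure and is not what Tor uses), and the relay names, keys and destination are hypothetical. Each relay removes one layer and learns only the next hop, so the destination is visible only to the last relay.

```python
import hashlib

def keystream(key, n):
    """Derive n pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def xor_cipher(key, data):
    """Toy symmetric cipher (encrypting and decrypting are the same operation)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def wrap(message, destination, route):
    """Build the onion for route = [(name, key), ...], innermost layer first."""
    hops = [name for name, _ in route[1:]] + [destination]
    payload = message
    for (_, key), next_hop in zip(reversed(route), reversed(hops)):
        payload = xor_cipher(key, next_hop + b"|" + payload)
    return payload

def peel(key, onion):
    """Remove one layer; the relay learns only the next hop."""
    next_hop, _, inner = xor_cipher(key, onion).partition(b"|")
    return next_hop, inner

# Hypothetical relays, keys and destination.
route = [(b"R1", b"key-1"), (b"R2", b"key-2"), (b"R3", b"key-3")]
onion = wrap(b"hello", b"example.org", route)
for name, key in route:
    next_hop, onion = peel(key, onion)
    print(name.decode(), "forwards to", next_hop.decode())
print(onion)
```

In this model R1 only sees R2 as the next hop and R2 only sees R3; the destination appears when R3, the exit relay, removes the innermost layer.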

29
Q

Describe the Tor protocol and how it affects network forensics investigations.

A
30
Q

Present the Dempster-Shafer theory and explain how it can be employed in sensor fusion.

A
31
Q

Given a system failure, two experts (John and Paul) are evaluating the HIDS and NIDS of different machines to identify the source of the attack. The system failure could be caused by Terminal A, Terminal B or Terminal C. John believes that the failure is due to terminal A with mass m1(A) = 0.8 or to terminal B with mass m1(B) = 0.1; uncertainty is 0.01. Paul believes that the failure is due to computer C with mass m2(C) = 0.9 or to the others with mass m2({A,B}) = 0.05; uncertainty is 0.05. Compute the Dempster-Shafer combination of John's and Paul's opinions. Describe each step and make a table of values.

A
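
The steps of Dempster's rule can be sketched as follows: multiply every pair of masses, assign each product to the intersection of the two sets, accumulate the products with empty intersection as the conflict K, and divide the remaining masses by 1 − K. One assumption is needed and should be checked against the original exercise: John's masses as stated sum to 0.91, so this sketch takes his uncertainty (the mass on {A, B, C}) to be 0.1 so that they sum to 1.

```python
from itertools import product

THETA = frozenset("ABC")  # frame of discernment: the three terminals

def combine(m1, m2):
    """Dempster's rule: multiply masses, intersect sets, renormalize by 1 - K."""
    joint = {}
    conflict = 0.0
    for (s1, v1), (s2, v2) in product(m1.items(), m2.items()):
        inter = s1 & s2
        if inter:
            joint[inter] = joint.get(inter, 0.0) + v1 * v2
        else:
            conflict += v1 * v2  # mass that would fall on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    return {s: v / (1.0 - conflict) for s, v in joint.items()}, conflict

# ASSUMPTION: John's uncertainty is taken as 0.1 (not 0.01) so the masses sum to 1.
john = {frozenset("A"): 0.8, frozenset("B"): 0.1, THETA: 0.1}
paul = {frozenset("C"): 0.9, frozenset("AB"): 0.05, THETA: 0.05}

combined, k = combine(john, paul)
for s, v in sorted(combined.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("{" + ",".join(sorted(s)) + "}", round(v, 4))
print("conflict K =", round(k, 2))
```

Under that assumption the conflict is K = 0.81 and the combined masses are approximately m(A) = 0.4211, m(B) = 0.0526, m(C) = 0.4737, m({A,B}) = 0.0263 and m({A,B,C}) = 0.0263. The high conflict is itself informative: the two experts largely disagree, so the normalized result should be read with caution.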
32
Q

Present and discuss two adversarial scenarios in Digital Forensics.

A
33
Q

Generative Adversarial Network (GAN): definition, cost function to be minimized and applications.

A