Chapter 3 - Forensic Flashcards

(24 cards)

1
Q

What is the Napier approach and how does it relate to your framework?

A
  • Napier approach - a Bayesian hierarchical model for compositional data with structural zeros
  • accounts for zeros by manually splitting the data into groups based on the presence and absence of elements before fitting the model
  • my work builds on this by automating the split of the compositional elements, making the approach more accessible for real-world use
  • this is done by clustering the data prior to fitting the model
  • I also propose an integrated clustering approach, which clusters the items within the model itself
2
Q

How does your integrated clustering model work and why is it beneficial?

A
  • combines class membership inference and parameter estimation within the MCMC
  • Unlike pre-clustering approaches, which fix cluster labels in advance, this method treats class labels as latent variables, updating them during sampling
  • accounts for classification uncertainty
  • minimises user decisions and manual input (a sketch of the label update follows)
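A minimal sketch (in Python, not taken from the thesis) of the kind of label update described above: given the current parameter draws, each item's class label is re-sampled from its conditional posterior, combining per-class likelihoods with prior class probabilities. Function and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def update_class_labels(log_lik_by_class, log_prior_probs):
    """One Gibbs-style update of the latent class labels.

    log_lik_by_class : (n_items, n_classes) log-likelihood of each item's data
                       under each class, given the current parameter draws
    log_prior_probs  : (n_classes,) log prior class probabilities
    """
    log_post = log_lik_by_class + log_prior_probs      # unnormalised log posterior
    log_post -= log_post.max(axis=1, keepdims=True)    # stabilise before exponentiating
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)
    # Sampling (rather than fixing) the labels carries classification
    # uncertainty through the rest of the MCMC iteration.
    labels = np.array([rng.choice(post.shape[1], p=p) for p in post])
    return labels, post
```

Within each iteration, the remaining model parameters would then be updated conditional on these sampled labels.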
3
Q

What were the main features of the forensic glass data?

A
  • elemental proportions of glass fragments, containing a large proportion of zero values
  • each observation belongs to a known glass use type (e.g., window, container), with repeated measurements per item and per fragment
  • the goal is to classify fragments to their original glass use type
4
Q

How did you handle high zero counts in elements like Fe?

A
  • split the data into subsets/groups based on the presence and absence of compositional elements (see the sketch below)
  • this yields subsets in which certain components are entirely absent
  • reduces the impact that the large proportion of zeros would otherwise have on the analysis
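A minimal pandas sketch of this kind of split, assuming hypothetical element names and values: rows are grouped by their presence/absence pattern, so each subset shares the same set of absent components.

```python
import pandas as pd

# Hypothetical fragment-level data; element names and values are illustrative.
df = pd.DataFrame({
    "item": [1, 1, 2, 2, 3, 3],
    "Fe":   [0.000, 0.000, 0.010, 0.012, 0.000, 0.000],
    "K":    [0.020, 0.021, 0.000, 0.000, 0.015, 0.016],
    "Mg":   [0.030, 0.031, 0.028, 0.030, 0.000, 0.000],
})

elements = ["Fe", "K", "Mg"]

# Presence/absence pattern per row, e.g. "011" = Fe absent, K and Mg present.
pattern = (df[elements] > 0).astype(int).astype(str).apply("".join, axis=1)

# Each subset contains only rows sharing the same absent components, so the
# model fitted within a subset can simply drop those components.
subsets = {pat: grp for pat, grp in df.groupby(pattern)}
```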
5
Q

Why was oxygen chosen as the denominator for the compositional ratios?

A
  • requires a component that is always non-zero to serve as the denominator
  • since oxygen is always present, it was chosen as the denominator (illustrated below)
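A small illustration with made-up proportions of why an always-present component is needed: ratios taken relative to oxygen stay well defined even when other elements are zero.

```python
import pandas as pd

# Made-up elemental proportions; oxygen ("O") is assumed always non-zero.
df = pd.DataFrame({
    "O":  [0.46, 0.45, 0.47],
    "Si": [0.33, 0.34, 0.32],
    "Na": [0.09, 0.08, 0.10],
    "Fe": [0.00, 0.01, 0.00],   # zeros in a numerator are fine
})

# Element/oxygen ratios; dividing by a column containing zeros would not be valid.
ratios = df[["Si", "Na", "Fe"]].div(df["O"], axis=0)
```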
6
Q

What is the rationale for the pre-clustering vs. integrated clustering approaches?

A
  • pre-clustering: a first step towards automating the Napier approach
  • automates the split by clustering glass items prior to fitting the model (sketched below)
  • still requires subjective decisions
  • integrated clustering: treats cluster labels as latent variables and updates them within the model
  • uses the probability of an element being present or absent as a prior
  • potentially leads to better clustering, as it is not based solely on an indicator matrix
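A rough sketch of the pre-clustering step under stated assumptions: the card does not say which clustering algorithm was used, so k-means on a hypothetical item-level presence/absence summary is shown purely for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical item-level indicator matrix: the proportion of measurements on
# each item in which each element was detected (rows = items, cols = elements).
presence = np.array([
    [0.0, 1.0, 1.0],
    [0.1, 1.0, 0.9],
    [1.0, 0.0, 1.0],
    [0.9, 0.1, 1.0],
])

# Pre-clustering fixes the groups before model fitting, based only on this
# matrix; the integrated approach instead updates group labels inside the MCMC.
pre_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(presence)
```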
7
Q

What are the advantages of your integrated clustering method?

A
  • flexible and widely applicable
  • removes manual intervention
  • no need for any pre-treatment of the data
  • avoids bias from fixed labels
8
Q

How did you measure classification performance?

A
  • correct classification rates
  • Brier Score - assesses the accuracy of the prediction model (a sketch of its computation follows)
  • Expected Calibration Error (ECE) - assesses how well predicted probabilities align with observed outcomes
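For reference, a common multiclass form of the Brier score; the exact variant used in the thesis is not stated here, so treat this as an assumption.

```python
import numpy as np

def brier_score(probs, true_labels, n_classes):
    """Mean squared error between predicted class probabilities and the
    one-hot encoded true outcomes; lower is better."""
    probs = np.asarray(probs, dtype=float)          # (n_items, n_classes)
    onehot = np.eye(n_classes)[np.asarray(true_labels)]
    return np.mean(np.sum((probs - onehot) ** 2, axis=1))
```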
9
Q

What role did cross-validation play in your analysis?

A
  • Five-fold cross-validation was used to provide robust evaluation across all models by allowing each item to be a ‘test’ item once (unknown glass use type)

Note that ‘test’ data usually refers to data for which the value of the response is treated as unknown; in this model the response is the compositions. Here, however, the compositions can be seen by the model and the item’s glass use type is the unknown quantity to be predicted.
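A minimal sketch of how such folds could be built, assuming integer item identifiers and using scikit-learn; the exact fold construction in the thesis may differ.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical item identifiers; folds are built over whole items so that every
# measurement of a held-out item is treated as 'test' together.
item_ids = np.arange(1, 21)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(item_ids):
    train_items, test_items = item_ids[train_idx], item_ids[test_idx]
    # Fit the model with the glass use type of `test_items` treated as unknown,
    # while their compositions remain visible to the model (as noted above).
```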

10
Q

How did you deal with unknown class labels in MCMC chains?

A
  • unknown class labels were treated as latent variables within the model
  • their values were sampled during MCMC, informed by the data likelihood under each class and the prior class probabilities
11
Q

What are the implications of your findings for forensic science?

A
  • offers a robust, interpretable tool for forensic glass analysis
  • the tool allows recovered glass fragments to be classified, helping to assess whether they share a glass use type with glass from a crime scene
12
Q

Can you explain how you handled the hierarchical structure in the forensic glass application?

A
  • the hierarchical structure was handled by fitting a hierarchical model with both fixed and random effects (a toy illustration is sketched below)
  • observations from the same glass use type shared parameters, accounting for within-type correlation
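A toy illustration only, not the thesis model: a hierarchical normal model for a single transformed ratio, with a fixed effect per glass use type and a random effect per item, written with PyMC. The thesis may use different software, priors, and a multivariate likelihood; all names and values here are made up.

```python
import numpy as np
import pymc as pm

# Toy data: sqrt-transformed ratio y, glass use type index per observation,
# and item index per observation (items nested within types).
y = np.array([0.21, 0.22, 0.35, 0.34, 0.28, 0.29])
type_idx = np.array([0, 0, 1, 1, 1, 1])
item_idx = np.array([0, 0, 1, 1, 2, 2])
n_types, n_items = 2, 3

with pm.Model() as model:
    # Fixed effect: mean level for each glass use type.
    mu_type = pm.Normal("mu_type", mu=0.0, sigma=1.0, shape=n_types)
    # Random effect: item-level deviation, capturing within-type correlation
    # across repeated measurements of the same item.
    sigma_item = pm.HalfNormal("sigma_item", sigma=0.5)
    item_eff = pm.Normal("item_eff", mu=0.0, sigma=sigma_item, shape=n_items)
    # Residual measurement error.
    sigma_obs = pm.HalfNormal("sigma_obs", sigma=0.5)
    pm.Normal("y", mu=mu_type[type_idx] + item_eff[item_idx],
              sigma=sigma_obs, observed=y)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=0)
```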
13
Q

Why did you choose a square-root transformation in some models?

A
  • applied to stabilise variance and approximate normality for the transformed compositional ratios
  • unlike the log transformation, the square root is defined at zero, so no imputation is needed before transforming (illustrated below)
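A two-line illustration of the point about zeros, with made-up ratio values:

```python
import numpy as np

ratios = np.array([0.000, 0.004, 0.012, 0.000])   # element/O ratios with genuine zeros

sqrt_ratios = np.sqrt(ratios)   # defined at zero, so no imputation is needed
# np.log(ratios) would give -inf at the zeros, forcing them to be imputed first
```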
14
Q

What were the key challenges in modelling the forensic glass data?

A
  • High number of zero values (e.g., Fe)
  • Small sample sizes for some glass use types (e.g., headlamp)
  • Compositional nature of the data, complicating traditional modelling techniques
  • Hierarchical nature of the data - multiple measurements per fragment and multiple fragments per item
15
Q

How did your models perform compared to standard classification approaches on the glass data?

A
  • achieved higher classification accuracy
  • lower Brier Score
  • better calibration (ECE)
16
Q

What were the key evaluation metrics used to assess model performance?

A
  • Classification accuracy: quantifies the proportion of correctly classified items
  • Brier Score: assesses the accuracy of the prediction model via the mean squared error between predicted probabilities and true outcomes
  • Expected Calibration Error (ECE): assesses how well the predicted probabilities align with observed outcomes
17
Q

How did you ensure fair comparisons between your models?

A
  • all approaches were evaluated using identical cross-validation folds
  • the transformation of the compositional ratios was identical across approaches
18
Q

What are Brier scores and Expected Calibration Error, and what do they indicate?

A
  • Brier Score: assesses the accuracy of prediction models by quantifying the mean squared error of the predicted probabilities
  • lower scores indicate better predictive accuracy and confidence calibration
  • Expected Calibration Error (ECE): measures the difference between predicted probabilities and the observed frequencies of the corresponding outcomes
  • lower ECE indicates that predicted probabilities closely match real-world outcomes
19
Q

What does a low Brier Score indicate in your models?

A
  • indicates better predictive accuracy
  • indicates well-calibrated confidence in the predicted values
20
Q

How did you use the Expected Calibration Error (ECE)?

A
  • ECE was used to assess the reliability of probabilistic predictions
  • computed by binning predicted probabilities and comparing average prediction confidence to the observed frequency in each bin
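A minimal sketch of that binning calculation; the number of bins and the exact binning scheme used in the thesis are not stated here, so both are assumptions.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin predictions by confidence, then average |accuracy - confidence|
    across bins, weighted by the share of predictions in each bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)      # 1 if the prediction was right
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece
```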
21
Q

In what scenarios did your methods outperform traditional methods the most?

A
  • overall correct classification of all glass use types
  • correct classification of car and building windows
  • lowest Brier score for car and building windows
  • lowest ECE
22
Q

How might your forensic glass model be used in real forensic investigations?

A
  • classification of evidence: assigning recovered glass fragments to potential glass use types with quantified uncertainty
  • evaluating evidential strength: offering posterior probabilities for each glass use type
  • enhancing computational efficiency: providing a more straightforward way to compute probabilistic classifications of glass use type, as done in the model
23
Q

How generalisable are your methods to other types of compositional data?

A
  • applicable to other compositional data applications which contain a large proportion of zeros and a hierarchical structure

Example: soil compositions with multiple measurements

24
Q

What are the main limitations of your approaches?

A
  • computational cost of running the integrated clustering approach
  • poorer classification of headlamp glass, which could be improved