General Flashcards

Question 1

Q

OpenSearch Service

Answer

A

Can serve as a vector database.
Stores and retrieves vectors as high-dimensional points.
Includes capabilities for fast, efficient lookup of nearest neighbors in N-dimensional space.
Suitable for storing information for retrieval-augmented generation (RAG) use cases.
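
The nearest-neighbor lookup described above can be illustrated with a brute-force search in plain Python. This is only a sketch of the idea, not the OpenSearch API: the `store` contents, the document ids, and the `nearest_neighbors` helper are all hypothetical, and a real vector database would typically use an approximate index rather than scanning every vector.

```python
import math

def nearest_neighbors(query, vectors, k=2):
    """Return the ids of the k stored vectors closest to `query` (Euclidean distance)."""
    ranked = sorted(vectors.items(), key=lambda item: math.dist(query, item[1]))
    return [doc_id for doc_id, _ in ranked[:k]]

# Hypothetical 3-dimensional vectors keyed by document id (made-up values).
store = {
    "doc-a": [0.9, 0.1, 0.0],
    "doc-b": [0.0, 1.0, 0.2],
    "doc-c": [0.8, 0.2, 0.1],
}

result = nearest_neighbors([1.0, 0.0, 0.0], store, k=2)
print(result)
```

In a RAG setup, the query vector would be the embedding of the user's question, and the returned documents would be passed to the model as context.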

Question 2

Q

K-means clustering

Answer

A

is a popular unsupervised machine learning algorithm that partitions a dataset into a predefined number of clusters (k), assigning each point to the cluster with the nearest centroid.
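
A minimal sketch of the standard iterative procedure (Lloyd's algorithm) in plain Python may help make this concrete. The sample points and the choice to initialize centroids from the first k points are assumptions made for illustration; library implementations use smarter initialization.

```python
import math

def kmeans(points, k, iterations=10):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    # Naive initialization (an assumption): use the first k points as centroids.
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: attach each point to its nearest centroid.
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    labels = [min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points]
    return centroids, labels

# Made-up 2D points forming two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
```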

Question 3

Q

Pre-training bias metrics

Answer

A

Class Imbalance (CI)
Difference in Proportions of Labels (DPL)
Kullback-Leibler Divergence (KL)
Jensen-Shannon Divergence (JS)
Lp-norm (LP)
Total Variation Distance (TVD)
Kolmogorov-Smirnov (KS)
Conditional Demographic Disparity (CDD)
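
As one worked example from this list, Class Imbalance (CI) compares the sizes of two facets: CI = (n_a - n_d) / (n_a + n_d). The sketch below computes it in plain Python; the facet values and the `class_imbalance` helper are hypothetical and are not the SageMaker Clarify API.

```python
def class_imbalance(facet_values, advantaged):
    """CI = (n_a - n_d) / (n_a + n_d), where n_a counts the advantaged facet."""
    n_a = sum(1 for v in facet_values if v == advantaged)
    n_d = len(facet_values) - n_a
    return (n_a - n_d) / (n_a + n_d)

# Made-up facet column: 8 rows from facet "a" and 2 rows from facet "d".
facet_column = ["a"] * 8 + ["d"] * 2
ci = class_imbalance(facet_column, advantaged="a")
print(ci)
```

A value near 0 indicates balanced facets; values near +1 or -1 indicate that one facet dominates the dataset.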

https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-measure-data-bia

Question 4

Q

Post-training bias metrics

Answer

A

Difference in Positive Proportions in Predicted Labels (DPPL)
Disparate Impact (DI)
Difference in Conditional Acceptance (DCAcc)
Difference in Conditional Rejection (DCR)
Specificity Difference (SD)
Recall Difference (RD)
Difference in Acceptance Rates (DAR)
Difference in Rejection Rates (DRR)
Accuracy Difference (AD)
Treatment Equality (TE)
Conditional Demographic Disparity in Predicted Labels (CDDPL)
Counterfactual Fliptest (FT)
Generalized Entropy (GE)
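
Two of these metrics are simple enough to sketch directly. With q'_a and q'_d as the proportions of positive predictions for facets a and d, DPPL = q'_a - q'_d and DI = q'_d / q'_a. The predictions, facet labels, and helper names below are made up for illustration and are not the SageMaker Clarify API.

```python
def positive_rate(predictions, facets, facet):
    """Proportion of positive (1) predictions among rows belonging to `facet`."""
    selected = [p for p, f in zip(predictions, facets) if f == facet]
    return sum(selected) / len(selected)

def dppl(predictions, facets, a="a", d="d"):
    """Difference in Positive Proportions in Predicted Labels: q'_a - q'_d."""
    return positive_rate(predictions, facets, a) - positive_rate(predictions, facets, d)

def disparate_impact(predictions, facets, a="a", d="d"):
    """Disparate Impact: q'_d / q'_a."""
    return positive_rate(predictions, facets, d) / positive_rate(predictions, facets, a)

# Hypothetical model predictions for 4 rows of facet "a" and 4 rows of facet "d".
facets = ["a"] * 4 + ["d"] * 4
predictions = [1, 1, 1, 0, 1, 0, 0, 0]

print(dppl(predictions, facets))
print(disparate_impact(predictions, facets))
```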

https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-measure-post-tra

Question 5

Q

Partial dependence plots (PDP)

Answer

A

show how the predicted target response depends on a set of input features of interest, averaging over the values of all other features.
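
The computation behind a PDP can be sketched in a few lines: for each grid value, overwrite the feature of interest in every row, run the model, and average the predictions. The `toy_model`, the rows, and the grid below are assumptions chosen so the result is easy to check by hand.

```python
def partial_dependence(model, rows, feature_index, grid):
    """Average model prediction with one feature forced to each grid value."""
    curve = []
    for value in grid:
        predictions = []
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # override the feature of interest
            predictions.append(model(modified))
        curve.append(sum(predictions) / len(predictions))
    return curve

def toy_model(row):
    # Hypothetical model: prediction = 2 * x0 + x1.
    return 2.0 * row[0] + row[1]

rows = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]
curve = partial_dependence(toy_model, rows, feature_index=0, grid=[0.0, 1.0, 2.0])
print(curve)
```

Plotting `curve` against the grid values gives the partial dependence plot for feature 0.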

Question 6

Q

Shapley values

Answer

A

determine the contribution that each feature made to a model's predictions.
Originally a method (solution concept) from cooperative game theory for fairly distributing the total gains or costs among a group of players who have collaborated.
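
For a small game, Shapley values can be computed exactly by averaging each player's marginal contribution over every possible joining order. The two-player game and its payouts below are made up; in model explanation the "players" would be features, and practical tools approximate this calculation because the number of orderings grows factorially.

```python
from itertools import permutations

def shapley_values(players, value):
    """Average each player's marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            with_player = coalition | {player}
            totals[player] += value(with_player) - value(coalition)
            coalition = with_player
    return {p: total / len(orderings) for p, total in totals.items()}

# Hypothetical payouts for every coalition of two players.
payouts = {
    frozenset(): 0.0,
    frozenset({"x"}): 10.0,
    frozenset({"y"}): 20.0,
    frozenset({"x", "y"}): 50.0,
}

result = shapley_values(["x", "y"], lambda coalition: payouts[coalition])
print(result)
```

The values sum to the payout of the full coalition, which is the "fair distribution" property the definition refers to.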

Question 7

Q

The difference in proportions of labels (DPL)

Answer

A

compares the proportion of observed outcomes with positive labels for facet d with the proportion of observed outcomes with positive labels for facet a in a training dataset.
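
Written as a formula, with n_a^{(1)} and n_d^{(1)} denoting the number of positively labeled members of each facet, the comparison is a simple difference. The numbers in the worked line are made up for illustration.

```latex
\[
\mathrm{DPL} = q_a - q_d, \qquad
q_a = \frac{n_a^{(1)}}{n_a}, \qquad
q_d = \frac{n_d^{(1)}}{n_d}
\]
% Worked example with assumed counts: n_a = 5 (4 positive), n_d = 5 (1 positive)
\[
\mathrm{DPL} = \frac{4}{5} - \frac{1}{5} = 0.6
\]
```

A positive value means facet a has the higher share of positive labels; a value near 0 means the two facets are labeled positively at similar rates.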

Question 8

Q

Weight

Answer

A

Multiplies the input value, controlling its influence on the output.

Question 9

Q

Bias

Answer

A

Adds a constant term to the weighted sum of inputs, shifting the input to the activation function so the model can fit the data better.
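
A single artificial neuron shows how weights and bias work together: each input is multiplied by its weight, the bias is added to the sum, and the result passes through an activation function. The sigmoid activation and the input values here are assumptions picked for illustration.

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of inputs plus bias, passed through a sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

inputs = [1.0, 2.0]
weights = [0.5, -0.25]  # weighted sum is 0.5 - 0.5 = 0.0

without_bias = neuron(inputs, weights, bias=0.0)
with_bias = neuron(inputs, weights, bias=2.0)  # bias shifts z from 0.0 to 2.0
print(without_bias, with_bias)
```

With the same inputs and weights, changing only the bias moves the output, which is the shift the Bias card describes.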

Question 10

Q

Text embeddings

Answer

A

are meaningful vector representations of unstructured text such as documents, paragraphs, and sentences. You input a body of text and the output is a (1 x n) vector. You can use embedding vectors for a wide variety of applications, such as semantic search and clustering.
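
One common way to use embedding vectors is to compare them with cosine similarity, where texts with similar meaning are expected to score closer to 1. The sketch below uses made-up 3-dimensional vectors in place of real model output; an actual embedding model would return vectors with far more dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings (not produced by any real model).
embedding_cat = [0.9, 0.1, 0.3]
embedding_kitten = [0.8, 0.2, 0.4]
embedding_invoice = [0.1, 0.9, 0.0]

similar = cosine_similarity(embedding_cat, embedding_kitten)
dissimilar = cosine_similarity(embedding_cat, embedding_invoice)
print(similar, dissimilar)
```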

Question 11

Q

Amazon Fraud Detector

Answer

A

is a fully managed service that you can use to detect fraudulent activities, such as fraudulent transactions or the creation of fake accounts.

(11 cards)