Machine Learning Quizzes - Nikolai Flashcards
What is the primary benefit of using Amazon SageMaker notebooks for machine learning tasks?
They provide an interactive environment with fully managed infrastructure for machine learning tasks.
Amazon SageMaker notebooks are fully managed Jupyter notebooks that provide an interactive environment for managing machine learning tasks. They handle the setup and configuration of the underlying infrastructure, including servers, environments, and software.
Which feature of SageMaker notebooks ensures that your work and data are not lost between sessions?
Persistent Storage
SageMaker notebooks come with persistent storage, so data and work are saved and remain available across sessions.
What is a key difference between a SageMaker notebook instance and a SageMaker Studio notebook?
Studio notebooks are integrated within SageMaker Studio and allow managing multiple notebooks.
SageMaker Studio notebooks are part of SageMaker Studio, offering an integrated environment where users can manage multiple notebooks, access persistent storage, and perform tasks within a unified interface.
Which of the following is a key feature of SageMaker Data Wrangler?
It provides over 300 pre-configured data transformations out of the box.
SageMaker Data Wrangler provides more than 300 pre-configured transformations to help with tasks like handling missing values and encoding categorical data.
What is one of the key features of SageMaker Data Wrangler that helps in transforming categorical features into numerical ones?
One-hot encoding
One-hot encoding is a common technique for converting categorical data into numerical data, which is essential for machine learning algorithms that require numerical input. SageMaker Data Wrangler offers this transformation out of the box.
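As a rough illustration of the idea (plain Python, not Data Wrangler's own implementation), the sketch below one-hot encodes a list of category labels. The function name `one_hot_encode` and the sample colors are assumptions made for this example.

```python
def one_hot_encode(values):
    """Return (categories, rows); each row is a 0/1 vector with one slot per category."""
    categories = sorted(set(values))  # fixed, reproducible column order
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the slot that matches this value
        rows.append(row)
    return categories, rows


categories, encoded = one_hot_encode(["red", "green", "red", "blue"])
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Each row contains exactly one 1, so no artificial ordering is implied between categories.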
Which of the following is NOT a capability of SageMaker Data Wrangler?
Creating SQL queries to fetch data.
SageMaker Data Wrangler does not create SQL queries to fetch data. It focuses on importing data, transforming it, and visualizing it within the SageMaker environment.
What is the benefit of using the “human in the loop” feature in SageMaker Ground Truth?
It allows a combination of human input and machine learning models to improve labeling accuracy.
“Human in the loop” allows SageMaker Ground Truth to combine human feedback with machine learning models, improving labeling efficiency and accuracy over time.
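One common way to combine the two, sketched here as an assumption rather than as Ground Truth's actual logic, is to accept model labels only above a confidence threshold and send everything else to human reviewers. The function `route_items`, the 0.9 threshold, and the sample predictions are all hypothetical.

```python
def route_items(predictions, threshold=0.9):
    """Split model predictions into auto-accepted labels and items needing human review."""
    auto_labeled, needs_human = [], []
    for item_id, label, confidence in predictions:
        if confidence >= threshold:
            auto_labeled.append((item_id, label))  # model is confident enough
        else:
            needs_human.append(item_id)            # a person labels this one
    return auto_labeled, needs_human


predictions = [("img-1", "cat", 0.97), ("img-2", "dog", 0.55), ("img-3", "cat", 0.91)]
auto_labeled, needs_human = route_items(predictions)
print(auto_labeled)  # [('img-1', 'cat'), ('img-3', 'cat')]
print(needs_human)   # ['img-2']
```

Human-provided labels can then be fed back as training data, which is how accuracy can improve over time.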
What is a primary function of SageMaker Ground Truth?
Labeling datasets for machine learning model training.
SageMaker Ground Truth is primarily used for labeling datasets, which is crucial for preparing data for machine learning model training.
What is one of the main purposes of SageMaker Feature Store?
To store, manage, and share features for machine learning models.
SageMaker Feature Store is used to store, manage, and share features across different machine learning models and teams, helping streamline feature reuse and collaboration.
What is a key advantage of using built-in algorithms in Amazon SageMaker?
They are optimized for specific use cases and abstract away infrastructure management.
Built-in algorithms in SageMaker are optimized for specific tasks and come pre-packaged, making them easier to use without managing infrastructure or performing complex setup.
Which type of learning is not one of the four main types of SageMaker’s built-in algorithms?
Reinforcement Learning
Reinforcement learning is not included in the four main categories of built-in algorithms in SageMaker.
What is the primary benefit of using SageMaker Jumpstart for machine learning development?
It provides access to pre-trained models and prebuilt solutions, reducing development time.
SageMaker Jumpstart allows users to quickly develop and deploy machine learning models by providing pre-trained models and prebuilt solutions, minimizing the time spent on development.
What is a hyperparameter in the context of machine learning?
A configuration that controls how a model learns and operates, set before training begins.
Hyperparameters are configurations set before the training process begins that control how a model learns and operates, such as the depth of a decision tree.
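To make the distinction concrete, here is a minimal sketch in plain Python (not tied to any SageMaker API). The learning rate and epoch count are hyperparameters chosen before training, while the slope `w` is a parameter learned during training. The function `fit_slope` and the toy data are assumptions for illustration.

```python
def fit_slope(xs, ys, learning_rate, epochs):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0  # learned parameter: updated during training
    n = len(xs)
    for _ in range(epochs):
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * gradient
    return w


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

# Hyperparameters: fixed before training starts, never updated by the training loop.
hyperparameters = {"learning_rate": 0.05, "epochs": 200}

w = fit_slope(xs, ys, **hyperparameters)
print(round(w, 3))  # 2.0
```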
Which of the following is true about Amazon SageMaker’s automatic model tuning feature?
It runs multiple training jobs with different hyperparameter combinations to find the best model.
SageMaker’s automatic model tuning runs multiple training jobs with different hyperparameter combinations to find the best-performing model based on the chosen performance metric.
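The loop below is a minimal local sketch of that idea, not the SageMaker tuning API. `train_and_evaluate` is a made-up stand-in for a training job, the score formula is invented, and random sampling is used only for simplicity.

```python
import random


def train_and_evaluate(learning_rate, max_depth):
    """Stand-in for a training job: returns an invented validation score (higher is better)."""
    return 1.0 - (learning_rate - 0.1) ** 2 - 0.01 * (max_depth - 6) ** 2


def tune(num_jobs, seed=0):
    """Run several 'jobs' with different hyperparameters and keep the best by the metric."""
    rng = random.Random(seed)
    trials = []
    for _ in range(num_jobs):
        params = {
            "learning_rate": rng.uniform(0.01, 0.3),
            "max_depth": rng.randint(2, 10),
        }
        trials.append((train_and_evaluate(**params), params))
    best_score, best_params = max(trials, key=lambda trial: trial[0])
    return best_score, best_params, trials


best_score, best_params, trials = tune(num_jobs=20)
print(best_params, round(best_score, 4))
```

The chosen performance metric plays the role of `best_score` here: whichever job scores highest on it wins.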
Which hyperparameter tuning technique ensures that every possible combination of hyperparameters is tested?
Grid search
Grid search tests every possible combination of the specified hyperparameter values by building a grid of those values and evaluating each combination.
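A short sketch of how such a grid can be enumerated, using Python's `itertools.product`. The hyperparameter names and values are assumptions chosen for the example.

```python
from itertools import product

# Hypothetical grid: 2 learning rates x 3 depths = 6 combinations.
grid = {"learning_rate": [0.01, 0.1], "max_depth": [3, 5, 7]}

names = list(grid)
combinations = [
    dict(zip(names, values))
    for values in product(*(grid[name] for name in names))
]

print(len(combinations))  # 6
for combination in combinations:
    print(combination)
```

The number of combinations is the product of the list lengths, which is why grid search becomes expensive as more hyperparameters or values are added.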
What is the primary advantage of using custom Docker containers in Amazon SageMaker?
Custom containers allow full control over the training and inference environment, including specific libraries and operating system choices.
Custom Docker containers provide full control over the environment, allowing users to include their own code, dependencies, libraries, and operating system.
When using a custom Docker container for training in SageMaker, which service is used to store and retrieve the Docker image?
Amazon Elastic Container Registry (ECR)
Amazon ECR is used to store Docker images, which are then pulled by SageMaker to run training jobs or inference.
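SageMaker is pointed at the image through its ECR image URI. The helper below sketches the usual URI layout for standard AWS regions; the function name, the placeholder account ID, and the repository name are assumptions for illustration.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build an ECR image URI of the form <account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Placeholder values, not a real account or repository.
uri = ecr_image_uri("123456789012", "us-east-1", "my-training-image")
print(uri)  # 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-training-image:latest
```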
In SageMaker, what must a custom Docker container expose during inference to function correctly?
A REST API endpoint
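During inference, a custom Docker container must expose a REST API endpoint that SageMaker calls to obtain predictions: it sends inference requests to POST /invocations and health checks to GET /ping on port 8080.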
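A minimal sketch of such a server using only Python's standard library. SageMaker hosting sends health checks to GET /ping and inference requests to POST /invocations on port 8080; the `predict` function, its JSON payload shape, and the helper `make_server` are assumptions for this example, and a real container would typically use a production web server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(payload):
    """Placeholder model: returns the sum of the input numbers."""
    return {"prediction": sum(payload["inputs"])}


class InferenceHandler(BaseHTTPRequestHandler):
    def _send_json(self, status, body):
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        # Health check route.
        if self.path == "/ping":
            self._send_json(200, {"status": "ok"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        # Inference route.
        if self.path == "/invocations":
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length))
            self._send_json(200, predict(payload))
        else:
            self._send_json(404, {"error": "not found"})

    def log_message(self, format, *args):
        pass  # silence per-request logging


def make_server(host="0.0.0.0", port=8080):
    """Create (but do not start) the HTTP server."""
    return HTTPServer((host, port), InferenceHandler)
```

Calling `make_server().serve_forever()` would start listening; it is left out of the block so the code can be imported without blocking.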
What is the main purpose of using SageMaker Experiments?
To track, organize, and compare multiple model training runs.
SageMaker Experiments helps track, organize, and compare different model training runs (called trials) to identify the best configuration. It captures all the details of hyperparameters, datasets, and performance metrics to assist in the comparison process.
In SageMaker Experiments, what is a trial component?
A step or stage within a training run, such as data preprocessing or model evaluation.
A trial component in SageMaker Experiments represents a single stage of a machine learning workflow, such as data preprocessing, model training, or evaluation. It captures the metadata for that stage.
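The relationship between trials and trial components can be pictured with a small data-structure sketch. These dataclasses are hypothetical stand-ins written for this card, not the SageMaker Experiments SDK, and the run names and metric values are invented.

```python
from dataclasses import dataclass, field


@dataclass
class TrialComponent:
    """One stage of a run, with the metadata recorded for that stage."""
    name: str
    parameters: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)


@dataclass
class Trial:
    """One training run, made up of several stages."""
    name: str
    components: list = field(default_factory=list)

    def metric(self, key):
        # Return the first recorded value for this metric, if any stage logged it.
        for component in self.components:
            if key in component.metrics:
                return component.metrics[key]
        return None


trials = [
    Trial("run-a", [
        TrialComponent("preprocessing", {"scaler": "standard"}),
        TrialComponent("training", {"max_depth": 3}, {"validation_accuracy": 0.88}),
    ]),
    Trial("run-b", [
        TrialComponent("preprocessing", {"scaler": "standard"}),
        TrialComponent("training", {"max_depth": 7}, {"validation_accuracy": 0.91}),
    ]),
]

best = max(trials, key=lambda trial: trial.metric("validation_accuracy"))
print(best.name)  # run-b
```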
What is the primary function of SageMaker Clarify?
To detect and mitigate biases in machine learning models.
SageMaker Clarify is designed to detect potential bias in datasets and machine learning models and to explain model predictions, which supports bias mitigation, fairness, and transparency.
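As one example of what a bias check measures, the sketch below computes a difference in proportions of positive labels between two groups, one of the pre-training bias metrics Clarify reports. This is a hand-written illustration, not Clarify's implementation; the group data and function names are assumptions.

```python
def positive_rate(labels):
    """Fraction of labels equal to 1."""
    return sum(labels) / len(labels)


def difference_in_proportions(labels_a, labels_d):
    """Positive-label rate of group a minus that of group d; 0 means no gap."""
    return positive_rate(labels_a) - positive_rate(labels_d)


group_a = [1, 1, 1, 0]  # 75% positive outcomes
group_d = [1, 0, 0, 0]  # 25% positive outcomes

dpl = difference_in_proportions(group_a, group_d)
print(dpl)  # 0.5
```

A value far from zero suggests the two groups receive positive labels at noticeably different rates and that the data deserves a closer look.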
Which of the following is a method used by SageMaker Data Wrangler to address class imbalance in datasets?
Random Undersampling
Random undersampling is one of the techniques used in SageMaker Data Wrangler to handle class imbalance by reducing the number of samples in the majority class.
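A minimal sketch of random undersampling in plain Python (not Data Wrangler's implementation). Here every class is reduced to the size of the smallest class; the function name, the seed, and the toy rows are assumptions for illustration.

```python
import random
from collections import Counter


def random_undersample(rows, label_of, seed=0):
    """Randomly drop rows so every class keeps as many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))  # sample without replacement
    rng.shuffle(balanced)
    return balanced


# 8 rows of class 0 (majority) and 2 rows of class 1 (minority).
rows = [(i, 0) for i in range(8)] + [(i, 1) for i in range(8, 10)]
balanced = random_undersample(rows, label_of=lambda row: row[1])

counts = Counter(label for _, label in balanced)
print(sorted(counts.items()))  # [(0, 2), (1, 2)]
```

The trade-off is that discarded majority-class rows are lost, so some information is thrown away in exchange for balance.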
What is the primary purpose of Amazon SageMaker Debugger?
To monitor and debug training jobs in real time.
SageMaker Debugger enables real-time monitoring and debugging of model training jobs, allowing users to detect issues such as vanishing gradients or overfitting during the training process.
Which type of issue can SageMaker Debugger detect during model training?
Overfitting
SageMaker Debugger can detect common training issues like overfitting, vanishing gradients, and underfitting during model training.
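To show what an overfitting check might look at, here is a simple heuristic sketch: flag a run when validation loss keeps rising while training loss keeps falling. This is not Debugger's built-in rule logic; the function `looks_overfit`, the `patience` value, and the loss curves are assumptions for illustration.

```python
def looks_overfit(train_losses, val_losses, patience=3):
    """True if validation loss rose for `patience` consecutive epochs while training loss fell."""
    streak = 0
    for i in range(1, min(len(train_losses), len(val_losses))):
        val_up = val_losses[i] > val_losses[i - 1]
        train_down = train_losses[i] < train_losses[i - 1]
        streak = streak + 1 if (val_up and train_down) else 0
        if streak >= patience:
            return True
    return False


train = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3]
val = [1.1, 0.9, 0.8, 0.85, 0.9, 0.95]
print(looks_overfit(train, val))  # True
```

Catching this pattern during training, rather than after it, lets a job be stopped early instead of wasting compute.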