Snowflake - Udemy - exam 1 Flashcards

Question

Which of these types of VIEW does Snowflake support? (Select 3)

Answer 1

Standard View Secure View Materialized View Snowflake supports three types of views: Standard View, Secure View, and Materialized View. Standard View: This is the default view type. Its underlying DDL is available to any role with access to the view. When you create a standard view, Snowflake saves the definition of the view but does not run the query. The query runs only when someone accesses the view. A standard view always executes as the owning role. Secure View: A secure view is exactly like a standard view, except that users cannot see how the view was defined. A secure view may run a little slower than a standard view because, to protect the information in it, Snowflake may bypass some optimizations. Materialized View: A materialized view is more like a table. Unlike a standard or secure view, Snowflake runs the query right away when you create a materialized view, then stores the result set as a table. Because the result is stored as a table, Snowflake creates micro-partitions and metadata about those micro-partitions. So when you query a materialized view with a filter, you get the same micro-partition pruning benefit that you would get from a table. Snowflake automatically refreshes a materialized view whenever there is a transaction against the base table, so it is always in sync. You can also create a secure materialized view, which again hides the logic from the user. Because Snowflake auto-refreshes materialized views in the background, they consume some credits, and because the result set is stored as a table, they also consume storage. Materialized views therefore use more storage and compute than standard or secure views.

Answer 2

Only if the system estimates there's enough query load to keep the cluster busy for at least 6 minutes. In the Economy scaling policy, Snowflake spins up an additional cluster only if the system estimates there’s enough query load to keep the cluster busy for at least 6 minutes.

Answer 3

SYSTEM$CLUSTERING_INFORMATION SYSTEM$CLUSTERING_DEPTH For example, if you have an EMPLOYEE table - you can run any of these queries to find the depth - SELECT SYSTEM$CLUSTERING_INFORMATION('EMPLOYEE'); SELECT SYSTEM$CLUSTERING_DEPTH('EMPLOYEE');

Answer 4

BUILD_STAGE_FILE_URL BUILD_STAGE_FILE_URL generates a Snowflake-hosted file URL to a staged file using the stage name and relative file path as inputs. A file URL permits prolonged access to a specified file. That is, the file URL does not expire. File URL: URL that identifies the database, schema, stage, and file path to a set of files. A role that has sufficient privileges on the stage can access the files.

Answer 5

Window Function A window function is any function that operates over a window of rows.

Answer 6

Brotli raw_deflate deflate gzip Zstandard bzip2 All of these are supported by Snowflake. Snowflake can automatically detect any of these compression methods except Brotli and Zstandard.

Answer 7

Truncate TRUNCATE will delete all of the data from a single table. So, once Monica truncates table t1, table t1's structure remains, but the data will be deleted. DELETE is usually used for deleting single rows of data.

Answer 8

Using the CREATE STAGE command A directory table is not a separate database object; it stores a catalog of staged files in cloud storage. Roles with sufficient privileges can query a directory table to retrieve file URLs to access the staged files and other metadata. A directory table can be added explicitly to a stage when the stage is created (using CREATE STAGE) or later (using ALTER STAGE) by supplying directoryTableParams. directoryTableParams (for internal stages) ::= [ DIRECTORY = ( ENABLE = { TRUE | FALSE } [ REFRESH_ON_CREATE = { TRUE | FALSE } ] ) ] ENABLE = TRUE | FALSE Specifies whether to add a directory table to the stage. When the value is TRUE, a directory table is created with the stage.

Answer 9

True True, only the user who generated the scoped URL can use the URL to access the referenced file. In the case of a file URL, any role that has sufficient privileges on the stage can access the file.

Answer 10

False An account-level resource monitor does not override resource monitor assignments for individual warehouses. If either the account resource monitor or the warehouse resource monitor reaches its defined threshold and a suspend action has been defined, the warehouse is suspended.

Answer 11

ORGADMIN ORGADMIN role manages operations at the organizational level. More specifically, this role: Can create accounts in the organization. Can view all accounts in the organization (using SHOW ORGANIZATION ACCOUNTS) and all regions enabled for the organization (using SHOW REGIONS). Can view usage information across the organization.

Answer 12

Blocked IP Addresses If you provide both Allowed IP Addresses and Blocked IP Addresses, Snowflake applies the Blocked List first.
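
The evaluation order on this card can be sketched in a few lines of Python. This is only an illustrative model, not Snowflake code: the function name `is_allowed` and the sample address ranges are assumptions made up for the example.

```python
import ipaddress

def is_allowed(client_ip, allowed, blocked):
    """Model of the rule on this card: the blocked list is checked first."""
    ip = ipaddress.ip_address(client_ip)
    # The blocked list wins, even if the address also falls in an allowed range.
    if any(ip in ipaddress.ip_network(cidr) for cidr in blocked):
        return False
    return any(ip in ipaddress.ip_network(cidr) for cidr in allowed)

# Hypothetical policy: allow a /24 range but block one address inside it.
allowed = ["192.168.1.0/24"]
blocked = ["192.168.1.99"]
print(is_allowed("192.168.1.10", allowed, blocked))  # True
print(is_allowed("192.168.1.99", allowed, blocked))  # False
```

Checking the blocked list first is what makes it possible to allow a whole range while carving out individual addresses from it.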

Answer 13

True Snowflake manages all aspects of how this data is stored — the organization, file size, structure, compression, metadata, statistics, and other aspects of data storage are handled by Snowflake. The data objects stored by Snowflake are not directly visible nor accessible by customers; they are only accessible through SQL query operations run using Snowflake.

Answer 14

Account Usage ACCESS_HISTORY view Access History in Snowflake refers to when the user query reads column data and when the SQL statement performs a data write operation, such as INSERT, UPDATE, and DELETE, along with variations of the COPY command, from the source data object to the target data object. The user access history can be found by querying the Account Usage ACCESS_HISTORY view.

Answer 15

False Users who are dropped or disabled in Snowflake are still able to log into their Okta accounts, but they will receive an error message when they attempt to connect to Snowflake. You must recreate or enable the user before they can log in.

Answer 16

Column omission Casts Truncation of Text Strings Column Reordering Snowflake supports transforming data while loading it into a table using the COPY command. Options include: Column reordering Column omission Casts Truncating text strings that exceed the target column length

Answer 17

True VALIDATION_MODE instructs the COPY command to validate the data files instead of loading them into the specified table; i.e., the COPY command tests the files for errors but does not load them. The command validates the data to be loaded and returns results based on the validation option specified: Syntax: VALIDATION_MODE = RETURN_n_ROWS | RETURN_ERRORS | RETURN_ALL_ERRORS RETURN_n_ROWS (e.g. RETURN_10_ROWS) - Validates the specified number of rows, if no errors are encountered; otherwise, fails at the first error encountered in the rows. RETURN_ERRORS - Returns all errors (parsing, conversion, etc.) across all files specified in the COPY statement. RETURN_ALL_ERRORS - Returns all errors across all files specified in the COPY statement, including files with errors that were partially loaded during an earlier load because the ON_ERROR copy option was set to CONTINUE during the load.

Answer 18

Setting the parameter USE_CACHED_RESULT to FALSE We can turn off the query result cache by setting the parameter USE_CACHED_RESULT to FALSE. Though the only reason we would really want to do this is if we are doing performance testing.

Answer 19

Amazon S3 Snowflake Internal Storage Google Cloud Storage Microsoft Azure Blob Storage Snowflake supports loading data from files staged in any of the following locations, regardless of the cloud platform for your Snowflake account: Internal (i.e. Snowflake) stages, Amazon S3, Google Cloud Storage, Microsoft Azure blob storage

Answer 20

Standard Enterprise Business Critical Virtual Private Snowflake Snowflake is available in four editions: Standard, Enterprise, Business Critical, and Virtual Private Snowflake (VPS). Standard comes with most of the available features. Enterprise adds on to Standard with things like: extra days of time travel, materialized view support, and data masking. Business Critical brings to the table: HIPAA support, Tri-secret Secure, and more. And Virtual Private Snowflake is everything that Business Critical has, but with the ability to have customer-dedicated metadata stores and customer-dedicated virtual service.

Answer 21

Materialized Views Analytical Expression Casts on table columns Columns defined with COLLATE clause External Tables Column Concatenation None of these are currently supported by the Search Optimization Service. Additionally, tables and views protected by row access policies cannot be used with the Search Optimization Service.

Answer 22

OAuth Key Pair Authentication The Snowflake SQL API supports OAuth and key pair authentication.

Answer 23

It is Permanent The expiration period of Scoped URL: The URL expires when the persisted query result period ends. The expiration period of the File URL: It is permanent. The expiration period of Pre-Signed URL: Length of time specified in the expiration_time argument.

Answer 24

The data will now be retained for a shorter period of 20 days. Decreasing retention reduces the amount of time data is retained in Time Travel: For active data modified after the retention period is reduced, the new shorter period applies. For data that is currently in Time Travel: If the data is still within the new shorter period, it remains in Time Travel. If the data is outside the new period, it moves into Fail-safe. For example, if you have a table with a 30-day retention period and you decrease the period to 20 days, data from days 21 to 30 will be moved into Fail-safe, leaving only the data from days 1 to 20 accessible through Time Travel. However, the process of moving the data from Time Travel into Fail-safe is performed by a background process, so the change is not immediately visible. Snowflake guarantees that the data will be moved, but does not specify when the process will complete; until the background process completes, the data is still accessible through Time Travel.
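
The 30-day to 20-day example can be worked through with a small Python sketch. It is a simplified model under stated assumptions: the function name `classify_after_decrease` is hypothetical, ages are whole days, and the background process that actually moves the data is not modeled.

```python
def classify_after_decrease(change_ages_days, new_retention_days):
    """Split historical changes by age once the retention period is reduced."""
    # Changes still inside the new, shorter period stay in Time Travel.
    time_travel = [a for a in change_ages_days if a <= new_retention_days]
    # Older changes fall outside the new period and move into Fail-safe.
    fail_safe = [a for a in change_ages_days if a > new_retention_days]
    return time_travel, fail_safe

ages = list(range(1, 31))            # one change per day, days 1 to 30
tt, fs = classify_after_decrease(ages, 20)
print(tt[0], tt[-1])                 # 1 20
print(fs[0], fs[-1])                 # 21 30
```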

Answer 25

HyperLogLog Snowflake uses HyperLogLog to estimate the approximate number of distinct values in a data set. HyperLogLog is a state-of-the-art cardinality estimation algorithm, capable of estimating distinct cardinalities of trillions of rows with an average relative error of a few percent.
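
To give a feel for how this kind of estimator works, here is a minimal textbook-style HyperLogLog sketch in Python. It is not Snowflake's implementation: the function name `hll_estimate`, the use of SHA-1 as the hash, and the register count (2**12) are all assumptions chosen for the illustration.

```python
import hashlib
import math

def hll_estimate(values, p=12):
    """Estimate the number of distinct values using 2**p registers."""
    m = 1 << p
    registers = [0] * m
    for v in values:
        # 64-bit hash taken from SHA-1 (an arbitrary choice for this sketch).
        h = int.from_bytes(hashlib.sha1(str(v).encode()).digest()[:8], "big")
        idx = h >> (64 - p)                # first p bits pick a register
        rest = h & ((1 << (64 - p)) - 1)   # remaining 64 - p bits
        # rank = position of the leftmost 1-bit in the remaining bits
        rank = (64 - p) - rest.bit_length() + 1
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)       # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        # Small-range correction: fall back to linear counting.
        return m * math.log(m / zeros)
    return raw

# Duplicates do not change the estimate: 50,000 distinct values, each seen twice.
data = list(range(50_000)) * 2
print(round(hll_estimate(data)))
```

Each register only remembers the largest rank it has seen, so memory stays fixed at 2**p small integers no matter how many rows are scanned. That is the trade-off behind the "few percent" error mentioned on the card.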

Answer 26

AUTO_REFRESH_REGISTRATION_HISTORY AUTO_REFRESH_REGISTRATION_HISTORY table function can be used to query the history of data files registered in the metadata of specified objects and the credits billed for these operations. The table function returns the billing history within a specified date range for your entire Snowflake account. This function returns billing activity within the last 14 days. Please note, STAGE_DIRECTORY_FILE_REGISTRATION_HISTORY table function can be used to query information about the metadata history for a directory table, including: Files added or removed automatically as part of a metadata refresh. Any errors found when refreshing the metadata.

Answer 27

Azure GCP AWS A Snowflake account can be hosted on any of the following cloud platforms: Amazon Web Services (AWS), Google Cloud Platform (GCP), Microsoft Azure (Azure). On each platform, Snowflake provides one or more regions where the account is provisioned.

Answer 28

UNDROP UNDROP is Snowflake's DDL (Data Definition Language) command.

Answer 29

Roles Snowflake supports Role-Based Access control. Permissions on database objects such as databases or tables are granted to Roles.

Answer 30

Consider when you can fully utilize a single warehouse by scheduling multiple concurrent tasks to take advantage of available compute resources. Consider when adherence to the schedule interval is less important. User-managed tasks are recommended when you can fully utilize a single warehouse by scheduling multiple concurrent tasks to take advantage of available compute resources. They are also recommended when adherence to the schedule interval is less critical. Serverless tasks are recommended when you cannot fully utilize a warehouse because too few tasks run concurrently or they run to completion quickly (in less than 1 minute). They are also recommended when adherence to the schedule interval is critical.

Answer 31

RECURSIVE => TRUE With RECURSIVE => TRUE, the expansion is performed for all sub-elements recursively. With RECURSIVE => FALSE, only the element referenced by PATH is expanded. The OUTER argument is used to handle the input rows that cannot be expanded, either because they cannot be accessed in the path or because they have zero fields or entries.

Answer 32

Enterprise Edition Business Critical VPS Dynamic Data Masking features require Enterprise Edition (or higher).

Answer 33

It does not affect the user's Snowflake sessions. However, to initiate any new Snowflake sessions, the user must log into the IdP again. After a specified period of time (defined by the IdP), a user’s session in the IdP automatically times out, but this does not affect their Snowflake sessions. Any Snowflake sessions that are active at the time remain open and do not require re-authentication. However, to initiate any new Snowflake sessions, the user must log into the IdP again.

Answer 34

24h Results are retained for 24 hours in the Query Result Cache. Each time the persisted result for a query is reused, Snowflake resets the 24-hour retention period for the result, up to a maximum of 31 days from the date and time that the query was first executed. After 31 days, the result is purged and the next time the query is submitted, a new result is generated and persisted.
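
The interaction between the 24-hour window and the 31-day cap can be modeled in Python. This is only a sketch of the rule as stated on the card: the function name `result_expiry` and the constants are illustrative, and a reuse that arrives after the result has already expired is treated as a cache miss.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(hours=24)    # window that restarts on every reuse
MAX_LIFETIME = timedelta(days=31)  # hard cap counted from the first execution

def result_expiry(first_run, reuse_times):
    """Return when a persisted result is purged, given the times it was reused."""
    hard_limit = first_run + MAX_LIFETIME
    expiry = first_run + RETENTION
    for t in sorted(reuse_times):
        if t > expiry:
            # The result was already purged, so this reuse is a cache miss.
            break
        # A reuse restarts the 24-hour window, but never past the 31-day cap.
        expiry = min(t + RETENTION, hard_limit)
    return expiry

start = datetime(2024, 1, 1)
print(result_expiry(start, []))                             # 2024-01-02 00:00:00
print(result_expiry(start, [start + timedelta(hours=12)]))  # 2024-01-02 12:00:00
```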

Answer 35

1. Snowflake recommends a maximum of three or four columns (or expressions) per key 2. Clustering keys are not for every table. 3. Tables in the multi-terabyte range are good candidates for clustering keys. Clustering keys are not for every table. Tables in the multi-terabyte range are good candidates for clustering keys. Both automatic clustering and reclustering consume credits. A single clustering key can contain one or more columns or expressions. Snowflake recommends a maximum of three or four columns (or expressions) per key for most tables. Adding more than 3-4 columns tends to increase costs more than benefits.

Answer 36

False Search optimization is a table-level property and applies to all columns with supported data types. The search optimization service aims to significantly improve the performance of selective point lookup queries on tables. A point lookup query returns only one or a small number of distinct rows. A user can register one or more tables to the search optimization service.

Answer 37

REPLICATION_USAGE_HISTORY This REPLICATION_USAGE_HISTORY view in the Account Usage Schema can be used to query the replication history for a specified database. The returned results include the database name, credits consumed, and bytes transferred for replication. Usage data is retained for 365 days (1 year).

Answer 38

Insert-only Insert-only is supported for streams on external tables only. An insert-only stream tracks row inserts only; it does not record delete operations that remove rows from an inserted set (i.e. no-ops).

Answer 39

By specifying SINGLE = TRUE To unload data to a single output file (at the potential cost of decreased performance), specify the SINGLE = TRUE copy option in your statement. You can optionally specify a name for the file in the path.

Answer 40

30 Days All Snowflake-managed keys are automatically rotated by Snowflake when they are more than 30 days old. Active keys are retired, and new keys are created. When Snowflake determines the retired key is no longer needed, the key is automatically destroyed. When active, a key is used to encrypt data and is available for usage by the customer. When retired, the key is used solely to decrypt data and is only available for accessing the data.

Answer 41

Monica should run the ALTER TASK command to resume the task A newly created task is suspended, so after creating the task we need to run the ALTER TASK command with RESUME before it will run.

Answer 42

Zero-copy cloning Live, ready to query data Share internally with private data exchange or externally with public data exchange Snowgrid allows you to use Secure Data Sharing features to provide access to live data, without any ETL or movement of files across environments.

Answer 43

USERADMIN USERADMIN role is dedicated to user and role management only. More specifically, this role: Is granted the CREATE USER and CREATE ROLE security privileges. Can create users and roles in the account. This role can also manage users and roles that it owns. Only the role with the OWNERSHIP privilege on an object (i.e. user or role), or a higher role, can modify the object properties.

Answer 44

TRUE True, both external (external cloud storage, such as, Amazon S3, Google Cloud Storage, Azure Blob Storage etc.) and internal (i.e. Snowflake) stages support unstructured data.

Answer 45

The connector creates the following objects for each topic: One internal stage to temporarily store data files for each topic. One pipe to ingest the data files for each topic partition. One table for each topic. If the table specified for each topic does not exist, the connector creates it; otherwise, the connector creates the RECORD_CONTENT and RECORD_METADATA columns in the existing table and verifies that the other columns are nullable (and produces an error if they are not).

Answer 46

3 to 4 A single clustering key can contain one or more columns or expressions. Snowflake recommends a maximum of 3 or 4 columns (or expressions) per key for most tables. Adding more than 3-4 columns tends to increase costs more than benefits.

Answer 47

Yes With federated authentication enabled on an account, Snowflake still allows maintaining and using Snowflake user credentials (login name and password). In other words: Account and security administrators can still create users with passwords maintained in Snowflake. Users can still log into Snowflake using their Snowflake credentials. However, if federated authentication is enabled for an account, Snowflake does not recommend maintaining user passwords in Snowflake. Instead, user passwords should be maintained solely in your IdP.

Answer 48

Grant access (Select) to the specific tables in the database Grant access (Usage) to the database and the schema containing the table to share Shares are named Snowflake objects that encapsulate all of the information required to share a database. Each share consists of: The privileges that grant access to the database(s) and the schema containing the objects to share. The privileges that grant access to the specific objects in the database. The consumer accounts with which the database and its objects are shared. Example: CREATE SHARE "SHARED_DATA" COMMENT=''; GRANT USAGE ON DATABASE "DEMO_DB" TO SHARE "SHARED_DATA"; GRANT USAGE ON SCHEMA "DEMO_DB"."TWITTER_DATA" TO SHARE "SHARED_DATA"; GRANT SELECT ON VIEW "DEMO_DB"."TWITTER_DATA"."FOLLOWERS" TO SHARE "SHARED_DATA";

Answer 49

False A stored procedure runs with either the caller’s rights or the owner’s rights. It cannot run with both at the same time. A caller’s rights stored procedure runs with the privileges of the caller. The primary advantage of a caller’s rights stored procedure is that it can access information about that caller or about the caller’s current session. For example, a caller’s rights stored procedure can read the caller’s session variables and use them in a query. An owner’s rights stored procedure runs mostly with the privileges of the stored procedure’s owner. The primary advantage of an owner’s rights stored procedure is that the owner can delegate specific administrative tasks, such as cleaning up old data, to another role without granting that role more general privileges, such as privileges to delete all data from a specific table. At the time that the stored procedure is created, the creator specifies whether the procedure runs with owner’s rights or caller’s rights. The default is owner’s rights.

Answer 50

Return an entire table, including all rows in the table. The sampling method is optional. If no method is specified after the SAMPLE keyword, the default is BERNOULLI.

Answer 51

Users with the ACCOUNTADMIN role can view the billing for Automatic Clustering using Snowsight, the classic web interface, or SQL: Snowsight: Select Admin » Usage. Classic Web Interface: Click on Account tab » Billing & Usage. The billing for Automatic Clustering shows up as a separate Snowflake-provided warehouse named AUTOMATIC_CLUSTERING. SQL: Query either of the following: AUTOMATIC_CLUSTERING_HISTORY table function (in the Snowflake Information Schema). AUTOMATIC_CLUSTERING_HISTORY view (in Account Usage).

Answer 52

Secure Data Sharing enables sharing selected objects in a database in your account with other Snowflake accounts. The following Snowflake database objects can be shared: Tables External tables Secure views Secure materialized views Secure UDFs Snowflake enables the sharing of databases through shares created by data providers and “imported” by data consumers.

Answer 53

SQL Classic Web UI Snowsight Only security administrators (i.e., users with the SECURITYADMIN role) or higher or a role with the global CREATE NETWORK POLICY privilege can create network policies using Snowsight, Classic Web Interface, and SQL.

Answer 54

SHOW PARAMETERS The SHOW PARAMETERS command determines whether a network policy is set on the account or for a specific user. For Account level: SHOW PARAMETERS LIKE 'network_policy' IN ACCOUNT; For User level : SHOW PARAMETERS LIKE 'network_policy' IN USER ; Example - SHOW PARAMETERS LIKE 'network_policy' IN USER john;

Answer 55

When the retention period ends for an object, the historical data is moved into Snowflake Fail-safe. Snowflake support needs to be contacted to get the data restored from Fail-safe.

Answer 56

If a policy is assigned to a user who is already signed in, they can't do anything else until they sign out and sign back in again to make use of the new policy.

Answer 57

All of the Snowflake Editions (Standard, Enterprise, Business Critical, Virtual Private Snowflake) automatically store data in an encrypted state.

Answer 58

False The suspended warehouse can be easily resized. Resizing a suspended warehouse does not provision any new compute resources for the warehouse. It simply instructs Snowflake to provision the additional compute resources when the warehouse is next resumed, at which time all the usage and credit rules associated with starting a warehouse apply.

Answer 59

True Roles are the entities to which privileges on securable objects can be granted and revoked. Roles are assigned to users to allow them to perform actions required for business functions in their organization. A user can be assigned multiple roles. It allows users to switch roles (i.e., choose which role is active in the current Snowflake session) to perform different actions using separate sets of privileges.

Answer 60

Java Python SQL JavaScript User-defined functions (UDFs) let you extend the system to perform operations that are not available through the built-in, system-defined functions provided by Snowflake. Snowflake currently supports the following languages for writing UDFs: Java: A Java UDF lets you use the Java programming language to manipulate data and return either scalar or tabular results. JavaScript: A JavaScript UDF lets you use the JavaScript programming language to manipulate data and return either scalar or tabular results. Python: A Python UDF lets you use the Python programming language to manipulate data and return either scalar or tabular results. SQL: A SQL UDF evaluates an arbitrary SQL expression and returns either scalar or tabular results.

Answer 61

Users with the ACCOUNTADMIN role can use the DATA_RETENTION_TIME_IN_DAYS object parameter to set the default retention period for their account.

Answer 62

Multi-cluster warehouses are best utilized for scaling resources to improve concurrency for users/queries. They are not as beneficial for improving the performance of slow-running queries or data loading. For these types of operations, resizing the warehouse provides more benefits.

Answer 63

SnowSQL is the primary tool used to load data to Snowflake from a local file system. You can run it in either interactive shell or batch mode.

Answer 64

UDF A UDF evaluates to a value and can be used in contexts in which a general expression can be used (e.g. SELECT my_function() ...). A stored procedure does not evaluate to a value, and cannot be used in all contexts in which a general expression can be used. For example, you cannot execute SELECT my_stored_procedure()....

Answer 65

Dawid is using a comparatively smaller warehouse. If a node has insufficient memory to complete its portion of a query, it will "spill" to local SSD storage. This can negatively impact performance but is sometimes acceptable. If a node has insufficient local SSD storage to complete its portion of a query, it will "spill" to remote cloud storage. This is almost always very bad for performance. The solution, in either case, is to simplify the SQL query or increase the warehouse size.

Answer 66

Nothing will be returned / the output of the input row will be omitted If you don’t specify the OUTER argument with FLATTEN, it defaults to FALSE. With OUTER => FALSE, FLATTEN omits the output of the input rows that cannot be expanded, either because they cannot be accessed in the path or because they have zero fields or entries.
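
The effect of OUTER can be illustrated with a small Python model. This is not the FLATTEN function itself: the generator name `flatten` and the `(row_id, items)` input shape are assumptions for the example, and only the value column is modeled.

```python
def flatten(rows, outer=False):
    """Yield (row_id, value) pairs, one per element of each row's list."""
    for row_id, items in rows:
        if items:
            for item in items:
                yield (row_id, item)
        elif outer:
            # OUTER => TRUE keeps the row, with NULL (None here) as the value.
            yield (row_id, None)
        # With OUTER => FALSE (the default), the row is silently omitted.

rows = [(1, ["a", "b"]), (2, []), (3, None)]
print(list(flatten(rows)))              # [(1, 'a'), (1, 'b')]
print(list(flatten(rows, outer=True)))  # [(1, 'a'), (1, 'b'), (2, None), (3, None)]
```

Rows 2 and 3 cannot be expanded, so they disappear by default and reappear as single NULL rows when the outer flag is set.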

Answer 67

Using pattern matching to identify specific files by pattern Pattern matching using a regular expression is generally the slowest of the three options for identifying/specifying data files to load from a stage; however, this option works well if you exported your files in named order from your external application and want to batch load the files in the same order.

Answer 68

A task can execute any one of the following types of SQL code: Single SQL statement Call to a stored procedure Procedural logic using Snowflake Scripting.

Answer 69

GET_ABSOLUTE_PATH returns the absolute path of a staged file using the stage name and path of the file relative to its location in the stage as inputs.

Answer 70

Users Roles Integrations Database and share replication are available in all editions, including the Standard edition. Replication of all other objects is only available for Business Critical Edition (or higher).

Answer 71

To load files whose metadata has expired, set the LOAD_UNCERTAIN_FILES copy option to true. The copy option references load metadata, if available, to avoid data duplication, but also attempts to load files with expired load metadata. Alternatively, set the FORCE option to load all files, ignoring load metadata if it exists. Note that this option reloads files, potentially duplicating data in a table.

Answer 72

Defining clustering keys for very large tables (in the multi-terabyte range) helps optimize table maintenance and query performance. Small tables are not a good candidate for clustering.

Answer 73

Snowflake supports CSV, TSV, JSON, AVRO, ORC, and PARQUET. Snowflake also supports XML, which is a Preview feature as of now. The EDI format is not supported by Snowflake.

Answer 74

Consider the trade-off between saving credits by suspending a warehouse versus maintaining the cache of data from previous queries to help with performance.

Answer 75

False A user cannot view the result set from a query that another user executed. This behavior is intentional. For security reasons, only the user who executed a query can access the query results. This behavior is not connected to the Snowflake access control model for objects. Even a user with the ACCOUNTADMIN role cannot view the results for a query run by another user.

Answer 76

False By default, Snowflake allows users to connect to the service from any computer or device IP address. A security administrator (or higher) can create a network policy to allow or deny access to a single IP address or a list of addresses.

Answer 77

When you recreate a pipe, if you do CREATE OR REPLACE PIPE, that load history is reset to empty, so Snowflake doesn't know which files we've already loaded.

Answer 78

VALIDATION_MODE instructs the COPY command to validate the data files instead of loading them into the specified table; i.e., the COPY command tests the files for errors but does not load them. The command validates the data to be loaded and returns results based on the validation option specified: Syntax : VALIDATION_MODE = RETURN_n_ROWS | RETURN_ERRORS | RETURN_ALL_ERRORS RETURN_n_ROWS (e.g. RETURN_10_ROWS) - Validates the specified number of rows, if no errors are encountered; otherwise, fails at the first error encountered in the rows. RETURN_ERRORS - Returns all errors (parsing, conversion, etc.) across all files specified in the COPY statement. RETURN_ALL_ERRORS - Returns all errors across all files specified in the COPY statement, including files with errors that were partially loaded during an earlier load because the ON_ERROR copy option was set to CONTINUE during the load.

Answer 79

Snowflake Time Travel enables accessing historical data (i.e. data that has been changed or deleted) at any point within a defined period. It serves as a powerful tool for performing the following tasks: Restoring data-related objects (tables, schemas, and databases) that might have been accidentally or intentionally deleted. Duplicating and backing up data from key points in the past. Analyzing data usage/manipulation over specified periods of time.

(103 cards)