Perf. Concepts: Query Optimization Flashcards
(26 cards)
What does the History tab display?
Query history for past 14 days
Users can view other users’ queries but cannot view their query results.
What are the components of the Db Order of Execution?
Rows - FROM, JOIN, WHERE
Groups - GROUP BY, HAVING
Result - SELECT, DISTINCT, ORDER BY, LIMIT
What is ‘Join Explosion’?
A situation where excessive joins lead to a significant increase in the size of the result set.
What does ‘Spilling to Disk’ refer to?
Bytes spilled to location / remote storage
What is a key consideration for using Group By?
Consider cardinality. If too high, maybe not a good group by.
What is a Metadata cache?
Highly available metadata store which maintains metadata object info and stats
What types of queries can be completed using metadata without requiring a VW?
- SELECT COUNT(*)
- SELECT SYSTEM$WHITELIST()
- SELECT CURRENT_DATABASE_()
- DESCRIBE TABLE
- SHOW TABLES
What is the Results Cache duration?
24 hr / 31 days
What must a new query do to reuse results from the Results Cache?
Must exactly match the previous query conditions
What are the conditions for using the Results Cache?
- Underlying table data has not changed
- Same role is used in previous query
- Time context functions like CURRENT_TIME() cannot be used
What is the Warehouse Cache also known as?
SSD cache, data cache, raw data cache
What does Local Disk Cache improve?
Performance by reading from cache instead of slower remote blob storage
What is a Materialized View (MV)?
A pre-computed & persisted dataset derived from a SELECT query
How are MVs updated?
Via a background process ensuring data is current and consistent with the base table
What is not recommended for Materialized Views?
Creating MVs on data with high churn
What are the limitations of Materialized Views?
- Single table
- JOIN
- UDF
- HAVING
- ORDER BY
- LIMIT
- WINDOW FUNCTIONS
What does Clustering improve?
Micro-partition pruning
What clustering metadata does Snowflake maintain?
- Total # of Micro-partitions
- # of Overlapping micro-partitions
- Depth of overlapping micro-partitions
What is Automatic Clustering?
Specifying one or more table columns/expressions as clustering key for a table
What does clustering aim to do?
Co-locate data of the clustering key in the same micro-partition
What is recommended for choosing a Clustering Key?
A max of 3 or 4 columns
What happens to a clustered table as DML operations are performed?
The table becomes less clustered
What is re-clustering?
A background process which reorganizes data into micro-partitions by the clustering key
What is the Search Optimization Service?
A table level property aimed at improving performance of selective point lookup queries