Which Snowflake feature allows administrators to identify unused data that may be archived or deleted?
Access history
Data classification
Dynamic Data Masking
Object tagging
The Access History feature in Snowflake allows administrators to track data access patterns and identify unused data. This information can be used to make decisions about archiving or deleting data to optimize storage and reduce costs.
What does the LATERAL modifier for the FLATTEN function do?
Casts the values of the flattened data
Extracts the path of the flattened data
Joins information outside the object with the flattened data
Retrieves a single instance of a repeating element in the flattened data
The LATERAL modifier for the FLATTEN function allows joining information outside the object (such as other columns in the source table) with the flattened data, creating a lateral view that correlates with the preceding tables in the FROM clause. References: [COF-C02] SnowPro Core Certification Exam Study Guide
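A minimal sketch, assuming a hypothetical table persons with a VARIANT column devices holding a JSON array:
SELECT p.id, f.value::STRING AS device
FROM persons p,
LATERAL FLATTEN(input => p.devices) f;
Here the id column, which lives outside the flattened object, is joined to each flattened element.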
Which Snowflake function will parse a JSON-null into a SQL-null?
TO_CHAR
TO_VARIANT
TO_VARCHAR
STRIP_NULL_VALUE
The STRIP_NULL_VALUE function in Snowflake is used to convert a JSON null value into a SQL NULL value.
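A minimal sketch, assuming a hypothetical table employees with a VARIANT column src that may contain JSON nulls:
SELECT STRIP_NULL_VALUE(src:middle_name) AS middle_name
FROM employees;
The expression returns a SQL NULL wherever src:middle_name holds a JSON null.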
What happens when a Snowflake user changes the data retention period at the schema level?
All child objects will retain data for the new retention period.
All child objects that do not have an explicit retention period will automatically inherit the new retention period.
All child objects with an explicit retention period will be overridden with the new retention period.
All explicit child object retention periods will remain unchanged.
When the data retention period is changed at the schema level, all child objects that do not have an explicit retention period set will inherit the new retention period from the schema.
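A minimal sketch, assuming a hypothetical schema my_db.my_schema:
ALTER SCHEMA my_db.my_schema SET DATA_RETENTION_TIME_IN_DAYS = 7;
Tables in the schema without their own explicit DATA_RETENTION_TIME_IN_DAYS setting will inherit the 7-day retention period.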
What is the relationship between a Query Profile and a virtual warehouse?
A Query Profile can help users right-size virtual warehouses.
A Query Profile defines the hardware specifications of the virtual warehouse.
A Query Profile can help determine the number of virtual warehouses available.
A Query Profile automatically scales the virtual warehouse based on the query complexity.
A Query Profile provides detailed execution information for a query, which can be used to analyze the performance and behavior of queries. This information can help users optimize and right-size their virtual warehouses for better efficiency. References: [COF-C02] SnowPro Core Certification Exam Study Guide
Which metadata table will store the storage utilization information even for dropped tables?
DATABASE_STORAGE_USAGE_HISTORY
TABLE_STORAGE_METRICS
STORAGE_DAILY_HISTORY
STAGE_STORAGE_USAGE_HISTORY
The TABLE_STORAGE_METRICS metadata table stores the storage utilization information, including for tables that have been dropped but are still incurring storage costs.
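As an illustrative sketch, a query against the ACCOUNT_USAGE view of this table to find dropped tables that are still incurring storage costs:
SELECT table_name, active_bytes, time_travel_bytes, failsafe_bytes
FROM snowflake.account_usage.table_storage_metrics
WHERE table_dropped IS NOT NULL;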
What type of query will benefit from the query acceleration service?
Queries without filters or aggregation
Queries with large scans and selective filters
Queries where the GROUP BY has high cardinality
Queries of tables that have search optimization service enabled
The query acceleration service in Snowflake is designed to benefit queries that involve large scans and selective filters. This service can offload portions of the query processing work to shared compute resources, which can handle these types of workloads more efficiently by performing more work in parallel and reducing the wall-clock time spent in scanning and filtering. References: [COF-C02] SnowPro Core Certification Exam Study Guide
Which ACCOUNT_USAGE schema database role provides visibility into policy-related information?
USAGE_VIEWER
GOVERNANCE_VIEWER
OBJECT_VIEWER
SECURITY_VIEWER
The GOVERNANCE_VIEWER role in the ACCOUNT_USAGE schema provides visibility into policy-related information within Snowflake. This role is specifically designed to access views that display object metadata and usage metrics related to governance.
What factors impact storage costs in Snowflake? (Select TWO).
The account type
The storage file format
The cloud region used by the account
The type of data being stored
The cloud platform being used
The factors that impact storage costs in Snowflake include the account type (Capacity or On Demand) and the cloud region used by the account. These factors determine the rate at which storage is billed, with different regions potentially having different rates.
What is the primary purpose of a directory table in Snowflake?
To store actual data from external stages
To automatically expire file URLs for security
To manage user privileges and access control
To store file-level metadata about data files in a stage
A directory table in Snowflake is used to store file-level metadata about the data files in a stage. It is conceptually similar to an external table and provides information such as file size, last modified timestamp, and file URL. References: [COF-C02] SnowPro Core Certification Exam Study Guide
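A minimal sketch, assuming a hypothetical stage my_stage created with a directory table enabled (DIRECTORY = (ENABLE = TRUE)):
SELECT relative_path, size, last_modified, file_url
FROM DIRECTORY(@my_stage);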
What metadata does Snowflake store for rows in micro-partitions? (Select TWO).
Range of values
Distinct values
Index values
Sorted values
Null values
Snowflake stores metadata for rows in micro-partitions, including the range of values for each column and the number of distinct values.
A column named "Data" contains VARIANT data and stores values as follows:
How will Snowflake extract the employee's name from the column data?
Data:employee.name
DATA:employee.name
data:Employee.name
data:employee.name
In Snowflake, to extract a specific value from a VARIANT column, you use the column name followed by a colon and then the key. The keys are case-sensitive. Therefore, to extract the employee’s name from the “Data” column, the correct syntax is data:employee.name.
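A minimal sketch, assuming the table is named my_table:
SELECT data:employee.name::STRING AS employee_name
FROM my_table;
The ::STRING cast converts the extracted VARIANT value to a SQL string.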
Which statistics can be used to identify queries that have inefficient pruning? (Select TWO).
Bytes scanned
Bytes written to result
Partitions scanned
Partitions total
Percentage scanned from cache
The statistics that can be used to identify queries with inefficient pruning are ‘Partitions scanned’ and ‘Partitions total’. These statistics indicate how much of the data was actually needed and scanned versus the total available, which can highlight inefficiencies in data pruning.
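As an illustrative sketch, the same statistics can be examined in the SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY view (the threshold below is arbitrary):
SELECT query_id, partitions_scanned, partitions_total
FROM snowflake.account_usage.query_history
WHERE partitions_total > 1000
AND partitions_scanned = partitions_total;
Queries where partitions_scanned is close to partitions_total are scanning nearly the whole table, a sign of poor pruning.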
Which commands are restricted in owner's rights stored procedures? (Select TWO).
SHOW
MERGE
INSERT
DELETE
DESCRIBE
In owner’s rights stored procedures, certain commands are restricted to maintain security and integrity. The SHOW and DESCRIBE commands are limited because they can reveal metadata and structure information that may not be intended for all roles.
Which operation can be performed on Snowflake external tables?
INSERT
JOIN
RENAME
ALTER
Snowflake external tables are read-only, which means data manipulation language (DML) operations like INSERT, RENAME, or ALTER cannot be performed on them. However, external tables can be used for query and join operations.
References: [COF-C02] SnowPro Core Certification Exam Study Guide
How can performance be optimized for a query that returns a small amount of data from a very large base table?
Use clustering keys
Create materialized views
Use the search optimization service
Use the query acceleration service
The search optimization service in Snowflake is designed to improve the performance of selective point lookup queries on large tables, which is ideal for scenarios where a query returns a small amount of data from a very large base table. References: [COF-C02] SnowPro Core Certification Exam Study Guide
Which data types can be used in Snowflake to store semi-structured data? (Select TWO)
ARRAY
BLOB
CLOB
JSON
VARIANT
Snowflake supports the storage of semi-structured data using the ARRAY and VARIANT data types. The ARRAY data type can directly contain VARIANT, and thus indirectly contain any other data type, including itself. The VARIANT data type can store a value of any other type, including OBJECT and ARRAY, and is often used to represent semi-structured data formats like JSON, Avro, ORC, Parquet, or XML.
References: [COF-C02] SnowPro Core Certification Exam Study Guide
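A minimal sketch, using hypothetical table and column names:
CREATE TABLE events (payload VARIANT, tags ARRAY);
INSERT INTO events
SELECT PARSE_JSON('{"type": "click", "user": {"id": 42}}'), ARRAY_CONSTRUCT('web', 'prod');
The SELECT form is used because PARSE_JSON cannot appear directly in a VALUES clause.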
What is the minimum Snowflake Edition that supports secure storage of Protected Health Information (PHI) data?
Standard Edition
Enterprise Edition
Business Critical Edition
Virtual Private Snowflake Edition
The minimum Snowflake Edition that supports secure storage of Protected Health Information (PHI) data is the Business Critical Edition. This edition offers enhanced security features necessary for compliance with regulations such as HIPAA and HITRUST CSF.
Which Snowflake feature allows a user to track sensitive data for compliance, discovery, protection, and resource usage?
Tags
Comments
Internal tokenization
Row access policies
Tags in Snowflake allow users to track sensitive data for compliance, discovery, protection, and resource usage. They enable the categorization and tracking of data, supporting compliance with privacy regulations. References: [COF-C02] SnowPro Core Certification Exam Study Guide
A user wants to access files stored in a stage without authenticating into Snowflake. Which type of URL should be used?
File URL
Staged URL
Scoped URL
Pre-signed URL
A Pre-signed URL should be used to access files stored in a Snowflake stage without requiring authentication into Snowflake. Pre-signed URLs are simple HTTPS URLs that provide temporary access to a file via a web browser, using a pre-signed access token. The expiration time for the access token is configurable, and this type of URL allows users or applications to directly access or download the files without needing to authenticate into Snowflake.
References: [COF-C02] SnowPro Core Certification Exam Study Guide
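Pre-signed URLs can be generated with the GET_PRESIGNED_URL function; a minimal sketch, assuming a hypothetical stage my_stage containing the file data/file1.csv and a 3600-second expiration:
SELECT GET_PRESIGNED_URL(@my_stage, 'data/file1.csv', 3600);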
A Snowflake user is writing a User-Defined Function (UDF) that includes some unqualified object names.
How will those object names be resolved during execution?
Snowflake will resolve them according to the SEARCH_PATH parameter.
Snowflake will only check the schema the UDF belongs to.
Snowflake will first check the current schema, and then the schema the previous query used
Snowflake will first check the current schema, and then the PUBLIC schema of the current database.
Object Name Resolution: When unqualified object names (e.g., table name without schema) are used in a UDF, Snowflake follows a specific hierarchy to resolve them. Here's the order:
Current Schema: Snowflake first checks if an object with the given name exists in the schema currently in use for the session.
PUBLIC Schema: If the object isn't found in the current schema, Snowflake looks in the PUBLIC schema of the current database.
Note: The SEARCH_PATH parameter influences object resolution for queries, not within UDFs.
References:
Snowflake Documentation (Object Naming Resolution): https://docs.snowflake.com/en/sql-reference/name-resolution.html
What is a characteristic of materialized views in Snowflake?
Materialized views do not allow joins.
Clones of materialized views can be created directly by the user.
Multiple tables can be joined in the underlying query of a materialized view.
Aggregate functions can be used as window functions in materialized views.
One of the characteristics of materialized views in Snowflake is that they allow multiple tables to be joined in the underlying query. This enables the pre-computation of complex queries involving joins, which can significantly improve the performance of subsequent queries that access the materialized view. References: [COF-C02] SnowPro Core Certification Exam Study Guide
Which data formats are supported by Snowflake when unloading semi-structured data? (Select TWO).
Binary file in Avro
Binary file in Parquet
Comma-separated JSON
Newline Delimited JSON
Plain text file containing XML elements
Snowflake supports a variety of file formats for unloading semi-structured data, among which Parquet and Newline Delimited JSON (NDJSON) are two widely used formats.
B. Binary file in Parquet: Parquet is a columnar storage file format optimized for large-scale data processing and analysis. It is especially suited for complex nested data structures.
D. Newline Delimited JSON (NDJSON): This format represents JSON records separated by newline characters, facilitating the storage and processing of multiple, separate JSON objects in a single file.
These formats are chosen for their efficiency and compatibility with data analytics tools and ecosystems, enabling seamless integration and processing of exported data.
References:
Snowflake Documentation: Data Unloading
By default, which role has access to the SYSTEM$GLOBAL_ACCOUNT_SET_PARAMETER function?
ACCOUNTADMIN
SECURITYADMIN
SYSADMIN
ORGADMIN
By default, the ACCOUNTADMIN role in Snowflake has access to the SYSTEM$GLOBAL_ACCOUNT_SET_PARAMETER function. This function is used to set global account parameters, impacting the entire Snowflake account's configuration and behavior. The ACCOUNTADMIN role is the highest-level administrative role in Snowflake, granting the necessary privileges to manage account settings and security features, including the use of global account parameters.
References:
Snowflake Documentation: SYSTEM$GLOBAL_ACCOUNT_SET_PARAMETER
How does a Snowflake stored procedure compare to a User-Defined Function (UDF)?
A single executable statement can call only two stored procedures. In contrast, a single SQL statement can call multiple UDFs.
A single executable statement can call only one stored procedure. In contrast, a single SQL statement can call multiple UDFs.
A single executable statement can call multiple stored procedures. In contrast, multiple SQL statements can call the same UDFs.
Multiple executable statements can call more than one stored procedure. In contrast, a single SQL statement can call multiple UDFs.
In Snowflake, stored procedures and User-Defined Functions (UDFs) have different invocation patterns within SQL:
Option B is correct: A single executable statement can call only one stored procedure due to the procedural and potentially transactional nature of stored procedures. In contrast, a single SQL statement can call multiple UDFs because UDFs are designed to operate more like functions in traditional programming, where they return a value and can be embedded within SQL queries.References: Snowflake documentation comparing the operational differences between stored procedures and UDFs.
Which privilege is required to use the search optimization service in Snowflake?
GRANT SEARCH OPTIMIZATION ON SCHEMA
GRANT SEARCH OPTIMIZATION ON DATABASE
GRANT ADD SEARCH OPTIMIZATION ON SCHEMA
GRANT ADD SEARCH OPTIMIZATION ON DATABASE
To utilize the search optimization service in Snowflake, the correct syntax for granting privileges to a role involves specific commands that include adding search optimization capabilities:
Option C: GRANT ADD SEARCH OPTIMIZATION ON SCHEMA
Options A and B do not include the correct verb "ADD," which is necessary for this specific type of grant command in Snowflake. Option D incorrectly mentions the database level, as search optimization privileges are typically configured at the schema level, not the database level.References: Snowflake documentation on the use of GRANT statements for configuring search optimization.
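As an illustrative sketch, with hypothetical schema, table, and role names:
GRANT ADD SEARCH OPTIMIZATION ON SCHEMA my_schema TO ROLE my_role;
ALTER TABLE my_schema.my_table ADD SEARCH OPTIMIZATION;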
The VALIDATE table function has which parameter as an input argument for a Snowflake user?
LAST_QUERY_ID
CURRENT_STATEMENT
UUID_STRING
JOB_ID
The VALIDATE table function in Snowflake takes the JOB_ID named argument as its input: the function accepts a table name plus JOB_ID => '<query_id>' (or '_last' for the most recent COPY INTO execution in the current session) and returns the rows that failed to load during that job.
References:
Snowflake Documentation: VALIDATE
If a virtual warehouse runs for 61 seconds, shut down, and then restart and runs for 30 seconds, for how many seconds is it billed?
60
91
120
121
Snowflake bills virtual warehouse usage per second, with a 60-second minimum each time the warehouse starts or resumes. The first run of 61 seconds exceeds the minimum, so it is billed as 61 seconds. The restart triggers a new 60-second minimum, so the 30-second run is billed as 60 seconds. The total billed is therefore 61 + 60 = 121 seconds.
References:
Snowflake Documentation: Virtual Warehouses Billing
Which statement accurately describes Snowflake's architecture?
It uses a local data repository for all compute nodes in the platform.
It is a blend of shared-disk and shared-everything database architectures.
It is a hybrid of traditional shared-disk and shared-nothing database architectures.
It reorganizes loaded data into internal optimized, compressed, and row-based format.
Snowflake's architecture is unique in that it combines elements of both traditional shared-disk and shared-nothing database architectures. This hybrid approach allows Snowflake to offer the scalability and performance benefits of a shared-nothing architecture (with compute and storage separated) while maintaining the simplicity and flexibility of a shared-disk architecture in managing data across all nodes in the system. This results in an architecture that provides on-demand scalability, both vertically and horizontally, without sacrificing performance or data cohesion.
References:
Snowflake Documentation: Snowflake Architecture
For Directory tables, what stage allows for automatic refreshing of metadata?
User stage
Table stage
Named internal stage
Named external stage
For directory tables, a named external stage allows for the automatic refreshing of metadata. This capability is particularly useful when dealing with files stored on external storage services (like Amazon S3, Google Cloud Storage, or Azure Blob Storage) and accessed through Snowflake. The external stage references these files, and the directory table's metadata can be automatically updated to reflect changes in the underlying files.
References:
Snowflake Documentation: External Stages
How can a user get the MOST detailed information about individual table storage details in Snowflake?
SHOW TABLES command
SHOW EXTERNAL TABLES command
TABLES view
TABLE_STORAGE_METRICS view
To obtain the most detailed information about individual table storage details in Snowflake, the TABLE_STORAGE_METRICS view is the recommended option. This view provides comprehensive metrics on storage usage, including data size, Time Travel size, Fail-safe size, and other relevant storage metrics for each table. This level of detail is invaluable for monitoring, managing, and optimizing storage costs and performance.
References:
Snowflake Documentation: Information Schema
What should be used when creating a CSV file format where the columns are wrapped by single quotes or double quotes?
BINARY_FORMAT
ESCAPE_UNENCLOSED_FIELD
FIELD_OPTIONALLY_ENCLOSED_BY
SKIP_BYTE_ORDER_MARK
When creating a CSV file format in Snowflake and the requirement is to wrap columns by single quotes or double quotes, the FIELD_OPTIONALLY_ENCLOSED_BY parameter should be used in the file format specification. This parameter allows you to define a character (either a single quote or a double quote) that can optionally enclose each field in the CSV file, providing flexibility in handling fields that contain special characters or delimiters as part of their data.
References:
Snowflake Documentation: CSV File Format
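A minimal sketch of such a file format, assuming double quotes as the enclosing character:
CREATE FILE FORMAT my_csv_format
TYPE = 'CSV'
FIELD_DELIMITER = ','
FIELD_OPTIONALLY_ENCLOSED_BY = '"';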
A Snowflake user wants to temporarily bypass a network policy by configuring the user object property MINS_TO_BYPASS_NETWORK_POLICY.
What should they do?
Use the SECURITYADMIN role.
Use the SYSADMIN role.
Use the USERADMIN role.
Contact Snowflake Support.
To temporarily bypass a network policy by configuring the user object property MINS_TO_BYPASS_NETWORK_POLICY, the USERADMIN role should be used. This role has the necessary privileges to modify user properties, including setting a temporary bypass for network policies, which can be crucial for enabling access under specific circumstances without permanently altering the network security configuration.
References:
Snowflake Documentation: User Management
Which Snowflake data type is used to store JSON key value pairs?
TEXT
BINARY
STRING
VARIANT
The VARIANT data type in Snowflake is used to store JSON key-value pairs along with other semi-structured data formats like AVRO, BSON, and XML. The VARIANT data type allows for flexible and dynamic data structures within a single column, accommodating complex and nested data. This data type is crucial for handling semi-structured data in Snowflake, enabling users to perform SQL operations on JSON objects and arrays directly.
References:
Snowflake Documentation: Semi-structured Data Types
The effects of query pruning can be observed by evaluating which statistics? (Select TWO).
Partitions scanned
Partitions total
Bytes scanned
Bytes read from result
Bytes written
Query pruning in Snowflake refers to the optimization technique where the system reduces the amount of data scanned by a query based on the query conditions. This typically involves skipping unnecessary data partitions that do not contribute to the query result. The effectiveness of this technique can be observed through:
Option A: Partitions scanned. This statistic indicates how many data partitions were actually scanned as a result of query pruning, showing the optimization in action.
Option C: Bytes scanned. This measures the volume of data physically read during query execution, and a reduction in this number indicates effective query pruning, as fewer bytes are read when unnecessary partitions are skipped.
Options B, D, and E do not directly relate to observing the effects of query pruning. "Partitions total" shows the total available, not the impact of pruning, while "Bytes read from result" and "Bytes written" relate to output rather than the efficiency of data scanning.References: Snowflake documentation on performance tuning and query optimization techniques, specifically how query pruning affects data access.
Which Snowflake object contains all the information required to share a database?
Private listing
Secure view
Sequence
Share
In Snowflake, a Share is the object that contains all the information required to share a database with other Snowflake accounts. Shares are used to securely share data stored in Snowflake tables and views, enabling data providers to grant data consumers access to their datasets without duplicating data. When a database is shared, it can include one or more schemas, and each schema can contain tables, views, or both.
References:
Snowflake Documentation on Shares: Shares
Which command is used to upload data files from a local directory or folder on a client machine to an internal stage, for a specified table?
GET
PUT
CREATE STREAM
COPY INTO
To upload data files from a local directory or folder on a client machine to an internal stage in Snowflake, the PUT command is used. The PUT command takes files from the local file system and uploads them to an internal Snowflake stage (or a specified stage) for the purpose of preparing the data to be loaded into Snowflake tables.
Syntax Example:
PUT file://<path_to_file>/<filename> @<stage_name>
This command is crucial for data ingestion workflows in Snowflake, especially when preparing to load data using the COPY INTO command.
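A minimal sketch, run from a client such as SnowSQL (PUT is executed from client tools rather than web worksheets), with hypothetical file and table names:
PUT file:///tmp/data/mydata.csv @%my_table AUTO_COMPRESS = TRUE;
COPY INTO my_table FROM @%my_table FILE_FORMAT = (TYPE = 'CSV');
Here @%my_table refers to the table stage for my_table, matching the question's "for a specified table".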
Which Snowflake edition offers the highest level of security for organizations that have the strictest requirements?
Standard
Enterprise
Business Critical
Virtual Private Snowflake (VPS)
The Virtual Private Snowflake (VPS) edition offers the highest level of security for organizations with the strictest security requirements. This edition provides a dedicated and isolated instance of Snowflake, including enhanced security features and compliance certifications to meet the needs of highly regulated industries or any organization requiring the utmost in data protection and privacy.
References:
Snowflake Documentation: Snowflake Editions
A user has semi-structured data to load into Snowflake but is not sure what types of operations will need to be performed on the data. Based on this situation, what type of column does Snowflake recommend be used?
ARRAY
OBJECT
TEXT
VARIANT
When dealing with semi-structured data in Snowflake, and the specific types of operations to be performed on the data are not yet determined, Snowflake recommends using the VARIANT data type. The VARIANT type is highly flexible and capable of storing data in multiple formats, including JSON, AVRO, BSON, and more, within a single column. This flexibility allows users to perform various operations on the data, including querying and manipulation of nested data structures without predefined schemas.
References:
Snowflake Documentation: Semi-structured Data Types
Which service or feature in Snowflake is used to improve the performance of certain types of lookup and analytical queries that use an extensive set of WHERE conditions?
Data classification
Query acceleration service
Search optimization service
Tagging
The Search Optimization Service in Snowflake is designed to improve the performance of specific types of queries, particularly those involving extensive sets of WHERE conditions. By maintaining a search index on tables, this service can accelerate lookup and analytical queries, making it a valuable feature for optimizing query performance and reducing execution times for complex searches.
References:
Snowflake Documentation: Search Optimization Service
Which Snowflake object does not consume any storage costs?
Secure view
Materialized view
Temporary table
Transient table
Temporary tables in Snowflake are designed for transient data that is needed only for the duration of a session. They are dropped automatically when the session ends, have no Fail-safe period, and therefore do not incur the longer-term storage charges associated with permanent tables.
References:
Snowflake Documentation: Temporary Tables
What information does the Query Profile provide?
Graphical representation of the data model
Statistics for each component of the processing plan
Detailed information about the database schema
Real-time monitoring of the database operations
The Query Profile in Snowflake provides a graphical representation and statistics for each component of the query's execution plan. This includes details such as the execution time, the number of rows processed, and the amount of data scanned for each operation within the query. The Query Profile is a crucial tool for understanding and optimizing the performance of queries, as it helps identify potential bottlenecks and inefficiencies.
References:
Snowflake Documentation: Understanding the Query Profile
What is used to denote a pre-computed data set derived from a SELECT query specification and stored for later use?
View
Secure view
Materialized view
External table
A materialized view in Snowflake denotes a pre-computed data set derived from a SELECT query specification and stored for later use. Unlike standard views, which dynamically compute the data each time the view is accessed, materialized views store the result of the query at the time it is executed, thereby speeding up access to the data, especially for expensive aggregations on large datasets.
References:
Snowflake Documentation: Materialized Views
What is the default value in the Snowflake Web Interface (UI) for auto-suspending a Virtual Warehouse?
1 minute
5 minutes
10 minutes
15 minutes
The default value for auto-suspending a Virtual Warehouse in the Snowflake Web Interface (UI) is 10 minutes. This setting helps manage compute costs by automatically suspending warehouses that are not in use, ensuring that compute resources are efficiently allocated and not wasted on idle warehouses.
References:
Snowflake Documentation: Virtual Warehouses
Which role has the ability to create a share from a shared database by default?
ACCOUNTADMIN
SECURITYADMIN
SYSADMIN
ORGADMIN
By default, the ACCOUNTADMIN role in Snowflake has the ability to create a share from a shared database. This role has the highest level of access within a Snowflake account, including the management of all aspects of the account, such as users, roles, warehouses, and databases, as well as the creation and management of shares for secure data sharing with other Snowflake accounts.
References:
Snowflake Documentation: Roles
What are the benefits of the replication feature in Snowflake? (Select TWO).
Disaster recovery
Time Travel
Fail-safe
Database failover and fallback
Data security
The replication feature in Snowflake provides several benefits, with disaster recovery and database failover and fallback being two of the primary advantages. Replication allows for the continuous copying of data from one Snowflake account to another, ensuring that a secondary copy of the data is available in case of outages or disasters. This capability supports disaster recovery strategies by allowing operations to quickly switch to the replicated data in a different account or region. Additionally, it facilitates database failover and fallback procedures, ensuring business continuity and minimizing downtime.
References:
Snowflake Documentation: Data Replication
Which Snowflake mechanism is used to limit the number of micro-partitions scanned by a query?
Caching
Cluster depth
Query pruning
Retrieval optimization
Query pruning in Snowflake is the mechanism used to limit the number of micro-partitions scanned by a query. By analyzing the filters and conditions applied in a query, Snowflake can skip over micro-partitions that do not contain relevant data, thereby reducing the amount of data processed and improving query performance. This technique is particularly effective for large datasets and is a key component of Snowflake's performance optimization features.
References:
Snowflake Documentation: Query Performance Optimization
What is the only supported character set for loading and unloading data from all supported file formats?
UTF-8
UTF-16
ISO-8859-1
WINDOWS-1253
UTF-8 is the only supported character set for loading and unloading data from all supported file formats in Snowflake. UTF-8 is a widely used encoding that supports a large range of characters from various languages, making it suitable for internationalization and ensuring data compatibility across different systems and platforms.
References:
Snowflake Documentation: Data Loading and Unloading
How does Snowflake reorganize data when it is loaded? (Select TWO).
Binary format
Columnar format
Compressed format
Raw format
Zipped format
When data is loaded into Snowflake, it undergoes a reorganization process where the data is stored in a columnar format and compressed. The columnar storage format enables efficient querying and data retrieval, as it allows for reading only the necessary columns for a query, thereby reducing IO operations. Additionally, Snowflake uses advanced compression techniques to minimize storage costs and improve performance. This combination of columnar storage and compression is key to Snowflake's data warehousing capabilities.
References:
Snowflake Documentation: Data Storage and Organization
Which data types optimally store semi-structured data? (Select TWO).
ARRAY
CHARACTER
STRING
VARCHAR
VARIANT
In Snowflake, semi-structured data is optimally stored using specific data types that are designed to handle the flexibility and complexity of such data. The VARIANT data type can store structured and semi-structured data types, including JSON, Avro, ORC, Parquet, or XML, in a single column. The ARRAY data type, on the other hand, is suitable for storing ordered sequences of elements, which can be particularly useful for semi-structured data types like JSON arrays. These data types provide the necessary flexibility to store and query semi-structured data efficiently in Snowflake.
References:
Snowflake Documentation: Semi-structured Data Types
What is it called when a customer managed key is combined with a Snowflake managed key to create a composite key for encryption?
Hierarchical key model
Client-side encryption
Tri-secret secure encryption
Key pair authentication
Tri-secret secure encryption is a security model employed by Snowflake that involves combining a customer-managed key with a Snowflake-managed key to create a composite key for encrypting data. This model enhances data security by requiring both the customer-managed key and the Snowflake-managed key to decrypt data, thus ensuring that neither party can access the data independently. It represents a balanced approach to key management, leveraging both customer control and Snowflake's managed services for robust data encryption.
References:
Snowflake Documentation: Encryption and Key Management
Why would a Snowflake user decide to use a materialized view instead of a regular view?
The base tables do not change frequently.
The results of the view change often.
The query is not resource intensive.
The query results are not used frequently.
A Snowflake user would decide to use a materialized view instead of a regular view primarily when the base tables do not change frequently. Materialized views store the result of the view query and update it as the underlying data changes, making them ideal for situations where the data is relatively static and query performance is critical. By precomputing and storing the query results, materialized views can significantly reduce query execution times for complex aggregations, joins, and calculations.
References:
Snowflake Documentation: Materialized Views
Which file function generates a Snowflake-hosted URL that must be authenticated when used?
GET_STAGE_LOCATION
GET_PRESIGNED_URL
BUILD_SCOPED_FILE_URL
BUILD_STAGE_FILE_URL
Purpose: The BUILD_STAGE_FILE_URL function generates a Snowflake-hosted file URL for a file in an internal or external stage. The URL does not expire, and anyone using it must authenticate to Snowflake and be authorized on the stage.
Key Points:
Security: Access through the URL requires authentication and stage privileges, which is what distinguishes it from a pre-signed URL.
Use Cases: Sharing staged data with external tools or applications, or downloading it directly.
Snowflake Documentation (BUILD_STAGE_FILE_URL): https://docs.snowflake.com/en/sql-reference/functions/build_stage_file_url.html
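A minimal sketch, assuming a hypothetical stage my_stage containing the file data/file1.csv:
SELECT BUILD_STAGE_FILE_URL(@my_stage, 'data/file1.csv');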
How does Snowflake describe its unique architecture?
A single-cluster shared data architecture using a central data repository and massively parallel processing (MPP)
A multi-cluster shared nothing architecture using a siloed data repository and massively parallel processing (MPP)
A single-cluster shared nothing architecture using a sliced data repository and symmetric multiprocessing (SMP)
A multi-cluster shared nothing architecture using a siloed data repository and symmetric multiprocessing (SMP)
Snowflake describes its architecture as a shared data architecture: all data is persisted in a central data repository accessible from all compute nodes, while queries are processed by massively parallel processing (MPP) compute clusters (virtual warehouses). This hybrid of the shared-disk model (central storage) and the shared-nothing model (MPP compute) allows storage and compute to scale independently without sacrificing performance or data cohesion.
References:
Snowflake Documentation: Snowflake Architecture Overview
How does Snowflake define its approach to Discretionary Access Control (DAC)?
A defined level of access to an object
An entity in which access can be granted
Each object has an owner, who can in turn grant access to that object.
Access privileges are assigned to roles, which are in turn assigned to users
Snowflake's access control combines two models. Under Discretionary Access Control (DAC), each object has an owner, who can in turn grant access to that object; this ownership-based grant model is how Snowflake defines its DAC approach. It is complemented by Role-Based Access Control (RBAC), in which access privileges are assigned to roles, which are in turn assigned to users.References: Snowflake Documentation on Access Control
Which function is used to unload a relational table into a JSON file?
PARSE_JSON
JSON_EXTRACT_PATH_TEXT
OBJECT_CONSTRUCT
TO_JSON
The TO_JSON function in Snowflake is used to convert a relational table or individual rows into JSON format. This function is helpful for exporting data in JSON format.
Using TO_JSON Function:
SELECT TO_JSON(OBJECT_CONSTRUCT(*))
FROM my_table;
Exporting Data: The TO_JSON function converts the table rows into JSON format, which can then be exported to a file.
References:
Snowflake Documentation: TO_JSON Function
Snowflake Documentation: Exporting Data
What should be considered when deciding to use a secure view? (Select TWO).
No details of the query execution plan will be available in the query profiler.
Once created there is no way to determine if a view is secure or not.
Secure views do not take advantage of the same internal optimizations as standard views.
It is not possible to create secure materialized views.
The view definition of a secure view is still visible to users by way of the information schema.
When deciding to use a secure view, several considerations come into play, especially concerning security and performance:
A. No details of the query execution plan will be available in the query profiler: Secure views are designed to prevent the exposure of the underlying data and the view definition to unauthorized users. Because of this, the detailed execution plans for queries against secure views are not available in the query profiler. This is intended to protect sensitive data from being inferred through the execution plan.
C. Secure views do not take advantage of the same internal optimizations as standard views: Secure views, by their nature, limit some of the optimizations that can be applied compared to standard views. This is because they enforce row-level security and mask data, which can introduce additional processing overhead and limit the optimizer's ability to apply certain efficiencies that are available to standard views.
B. Once created, there is no way to determine if a view is secure or not is incorrect because metadata about whether a view is secure can be retrieved from the INFORMATION_SCHEMA views or by using the SHOW VIEWS command.
D. It is not possible to create secure materialized views is incorrect because the limitation is not on the security of the view but on the fact that Snowflake currently does not support materialized views with the same dynamic data masking and row-level security features as secure views.
E. The view definition of a secure view is still visible to users by way of the information schema is incorrect because secure views specifically hide the view definition from users who do not have the privilege to view it, ensuring that sensitive information in the definition is not exposed.
When using the ALLOW_CLIENT_MFA_CACHING parameter, how long is a cached Multi-Factor Authentication (MFA) token valid for?
1 hour
2 hours
4 hours
8 hours
A cached MFA token is valid for up to four hours. https://docs.snowflake.com/en/user-guide/security-mfa#using-mfa-token-caching-to-minimize-the-number-of-prompts-during-authentication-optional
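As a sketch, an account administrator would enable the feature at the account level:
ALTER ACCOUNT SET ALLOW_CLIENT_MFA_CACHING = TRUE;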
Which Query Profile metrics will provide information that can be used to improve query performance? (Select TWO).
Synchronization
Remote disk IO
Local disk IO
Pruning
Spillage
Two key metrics in Snowflake’s Query Profile that provide insights for performance improvement are:
Remote Disk IO: This measures the time the query spends waiting on remote disk access, indicating potential performance issues related to I/O bottlenecks.
Pruning: This metric reflects how effectively Snowflake’s micro-partition pruning is reducing the data scanned. Better pruning (more partitions excluded) leads to faster query performance, as fewer micro-partitions need to be processed.
These metrics are essential for identifying and addressing inefficiencies in data retrieval and storage access, optimizing overall query performance.
At what level is the MIN_DATA_RETENTION_TIME_IN_DAYS parameter set?
Account
Database
Schema
Table
The MIN_DATA_RETENTION_TIME_IN_DAYS parameter is set at the Account level in Snowflake. This parameter specifies the minimum number of days Snowflake retains the historical data for time travel, which allows users to access and query data as it existed at previous points in time.
Here's how to understand and adjust this parameter:
Purpose of MIN_DATA_RETENTION_TIME_IN_DAYS: This parameter is crucial for managing data lifecycle and compliance requirements within Snowflake. It determines the minimum time frame for which you can perform operations like restoring deleted objects or accessing historical versions of data.
Setting the Parameter: Only account administrators can set or modify this parameter. It is done at the account level, impacting all databases and schemas within the account. The setting can be adjusted based on the organization's data retention policy.
Adjusting the Parameter:
To view the current setting, use:
SHOW PARAMETERS LIKE 'MIN_DATA_RETENTION_TIME_IN_DAYS' IN ACCOUNT;
To change the setting, an account administrator can execute:
ALTER ACCOUNT SET MIN_DATA_RETENTION_TIME_IN_DAYS = <number_of_days>;
Which MINIMUM set of privileges is required to temporarily bypass an active network policy by configuring the user object property MINS_TO_BYPASS_NETWORK_POLICY?
Only while in the ACCOUNTADMIN role
Only while in the SECURITYADMIN role
Only the role with the ownership privilege on the network policy
Only Snowflake Support can set the value for this object property
To temporarily bypass an active network policy by configuring the user object property MINS_TO_BYPASS_NETWORK_POLICY, the minimum set of privileges required is having the ACCOUNTADMIN role. This role has the necessary privileges to make such changes, including modifying user properties that affect network policies.
References:
Snowflake Documentation: Network Policy Management
Given the statement template below, which database objects can be added to a share? (Select TWO).
GRANT <privilege> ON <object_type> <object_name> TO SHARE <share_name>
Secure functions
Stored procedures
Streams
Tables
Tasks
In Snowflake, shares are used to share data across different Snowflake accounts securely. When you create a share, you can include various database objects that you want to share with consumers. According to Snowflake's documentation, the types of objects that can be shared include tables, secure views, secure materialized views, and streams. Secure functions and stored procedures are not shareable objects. Tasks also cannot be shared directly. Therefore, the correct answers are streams (C) and tables (D).
To share a stream or a table, you use the GRANT statement to grant privileges on these objects to a share. The syntax for sharing a table or stream involves specifying the type of object, the object name, and the share to which you are granting access. For example:
GRANT SELECT ON TABLE my_table TO SHARE my_share; GRANT SELECT ON STREAM my_stream TO SHARE my_share;
These commands grant the SELECT privilege on a table named my_table and a stream named my_stream to a share named my_share. This enables the consumer of the share to access these objects according to the granted privileges.
What is the Fail-safe period for a transient table in the Snowflake Enterprise edition and higher?
0 days
1 day
7 days
14 days
The Fail-safe period for a transient table in Snowflake, regardless of the edition (including Enterprise edition and higher), is 0 days. Fail-safe is a data protection feature that provides additional retention beyond the Time Travel period for recovering data in case of accidental deletion or corruption. However, transient tables are designed for temporary or short-term use and do not benefit from the Fail-safe feature, meaning that once their Time Travel period expires, data cannot be recovered.
References:
Snowflake Documentation: Understanding Fail-safe
When unloading data, which file format preserves the data values for floating-point number columns?
Avro
CSV
JSON
Parquet
When unloading data, the Parquet file format is known for its efficiency in preserving the data values for floating-point number columns. Parquet is a columnar storage file format that offers high compression ratios and efficient data encoding schemes. It is especially effective for floating-point data, as it maintains high precision and supports efficient querying and analysis.
References:
Snowflake Documentation: Using the Parquet File Format for Unloading Data
How can a user MINIMIZE Continuous Data Protection costs when using large, high-churn, dimension tables?
Create transient tables and periodically copy them to permanent tables.
Create temporary tables and periodically copy them to permanent tables
Create regular tables with extended Time Travel and Fail-safe settings.
Create regular tables with default Time Travel and Fail-safe settings
To minimize Continuous Data Protection (CDP) costs when dealing with large, high-churn dimension tables in Snowflake, using transient tables is a recommended approach.
Transient Tables: These are designed for data that does not require fail-safe protection. They provide the benefit of reducing costs associated with continuous data protection since they do not have the seven-day Fail-safe period that is mandatory for permanent tables.
Periodic Copying to Permanent Tables: By periodically copying data from transient tables to permanent tables, you can achieve a balance between data protection and cost-efficiency. Permanent tables offer the extended data protection features, including Time Travel and Fail-safe, but these features can be applied selectively rather than continuously, reducing the overall CDP costs.
References:
Snowflake Documentation on Transient Tables
Snowflake Documentation on Time Travel & Fail-safe
What causes objects in a data share to become unavailable to a consumer account?
The DATA_RETENTION_TIME_IN_DAYS parameter in the consumer account is set to 0.
The consumer account runs the GRANT IMPORTED PRIVILEGES command on the data share every 24 hours.
The objects in the data share are being deleted and the grant pattern is not re-applied systematically.
The consumer account acquires the data share through a private data exchange.
Objects in a data share become unavailable to a consumer account if the objects in the data share are deleted or if the permissions on these objects are altered without re-applying the grant permissions systematically. This is because the sharing mechanism in Snowflake relies on explicit grants of permissions on specific objects (like tables, views, or secure views) to the share. If these objects are deleted or if their permissions change without updating the share accordingly, consumers can lose access.
The DATA_RETENTION_TIME_IN_DAYS parameter does not directly affect the availability of shared objects, as it controls how long Snowflake retains historical data for time travel and does not impact data sharing permissions.
Running the GRANT IMPORTED PRIVILEGES command in the consumer account is not related to the availability of shared objects; this command is used to grant privileges on imported objects within the consumer's account and is not a routine maintenance command that would need to be run regularly.
Acquiring a data share through a private data exchange does not inherently make objects unavailable; issues would only arise if there were problems with the share configuration or if the shared objects were deleted or had their permissions altered without re-granting access to the share.
Who can create network policies within Snowflake? (Select TWO).
SYSADMIN only
ORGADMIN only
SECURITYADMIN or higher roles
A role with the CREATE NETWORK POLICY privilege
A role with the CREATE SECURITY INTEGRATION privilege
In Snowflake, network policies define the allowed IP address ranges from which users can connect to Snowflake, enhancing security by restricting access based on network location. The creation and management of network policies require sufficient privileges. Specifically, a user with the SECURITYADMIN role or any role with higher privileges, such as ACCOUNTADMIN, can create network policies. Additionally, a custom role can be granted the CREATE NETWORK POLICY privilege, enabling users assigned to that role to also create network policies. This approach allows for flexible and secure management of network access to Snowflake.References: Snowflake Documentation on Network Policies
Which command can be used to list all network policies available in an account?
DESCRIBE SESSION POLICY
DESCRIBE NETWORK POLICY
SHOW SESSION POLICIES
SHOW NETWORK POLICIES
To list all network policies available in an account, the correct command is SHOW NETWORK POLICIES. Network policies in Snowflake are used to define and enforce rules for how users can connect to Snowflake, including IP whitelisting and other connection requirements. The SHOW NETWORK POLICIES command provides a list of all network policies defined within the account, along with their details.
The DESCRIBE NETWORK POLICY command describes the properties of a single, named network policy rather than listing all policies in the account. The DESCRIBE SESSION POLICY and SHOW SESSION POLICIES commands pertain to session policies, not network policies, so none of these alternatives answers the question.
Using SHOW NETWORK POLICIES without any additional parameters will display all network policies in the account, which is useful for administrators to review and manage the security configurations pertaining to network access.
A Snowflake user wants to optimize performance for a query that queries only a small number of rows in a table. The rows require significant processing. The data in the table does not change frequently.
What should the user do?
Add a clustering key to the table.
Add the search optimization service to the table.
Create a materialized view based on the query.
Enable the query acceleration service for the virtual warehouse.
In a scenario where a Snowflake user queries only a small number of rows that require significant processing and the data in the table does not change frequently, the most effective way to optimize performance is by creating a materialized view based on the query. Materialized views store the result of the query and can significantly reduce the computation time for queries that are executed frequently over unchanged data.
Why Materialized Views: Materialized views precompute and store the result of the query. This is especially beneficial for queries that require heavy processing. Since the data does not change frequently, the materialized view will not need to be refreshed often, making it an ideal solution for this use case.
Implementation Steps:
To create a materialized view, use the following SQL command:
CREATE MATERIALIZED VIEW my_materialized_view AS SELECT ... FROM my_table WHERE ...;
When the query is run, Snowflake uses the precomputed results from the materialized view, thus skipping the need for recalculating the data and improving query performance.
Which command should be used to load data incrementally based on column values that are specified in the source table or subquery?
MERGE
COPY INTO
GET
INSERT INTO
The MERGE command in Snowflake is used for incremental loading based on column values in a source table or subquery. It enables the insertion, updating, or deletion of records in a target table depending on whether matching rows are found, making it ideal for loading data that changes incrementally, such as daily updates or modifications.
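A minimal sketch, with hypothetical target and source tables keyed on id:
MERGE INTO target t
USING source s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.amount = s.amount
WHEN NOT MATCHED THEN INSERT (id, amount) VALUES (s.id, s.amount);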
Authorization to execute CREATE <object> statements comes from which role?
Primary role
Secondary role
Application role
Database role
In Snowflake, the authorization to execute CREATE <object> statements, such as creating tables, views, databases, etc., is determined by the role currently set as the user's primary role. The primary role of a user or session specifies the set of privileges (including creation privileges) that the user has. While users can have multiple roles, only the primary role is used to determine what objects the user can create unless explicitly specified in the session.
Which command will unload data from a table into an external stage?
PUT
INSERT
COPY INTO
GET
In Snowflake, the COPY INTO <location> command is used to unload (export) data from a Snowflake table to an external stage, such as an S3 bucket, Azure Blob Storage, or Google Cloud Storage. This command allows users to specify the format, file size, and other options for the data being unloaded, making it a flexible solution for exporting data from Snowflake to external storage solutions for further use or analysis.References: Snowflake Documentation on Data Unloading
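A minimal sketch, assuming a hypothetical external stage my_external_stage:
COPY INTO @my_external_stage/unload/
FROM my_table
FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"');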
Which security models are used in Snowflake to manage access control? (Select TWO).
Discretionary Access Control (DAC)
Identity Access Management (IAM)
Mandatory Access Control (MAC)
Role-Based Access Control (RBAC)
Security Assertion Markup Language (SAML)
Snowflake uses both Discretionary Access Control (DAC) and Role-Based Access Control (RBAC) to manage access control. DAC allows object owners to grant access privileges to other users. RBAC assigns permissions to roles, and roles are then granted to users, making it easier to manage permissions based on user roles within the organization.
References:
Snowflake Documentation: Access Control in Snowflake
To use the overwrite option on insert, which privilege must be granted to the role?
TRUNCATE
DELETE
UPDATE
SELECT
To use the overwrite option on insert in Snowflake, the DELETE privilege must be granted to the role. This is because overwriting data during an insert operation implicitly involves deleting the existing data before inserting the new data.
Understanding the Overwrite Option: The overwrite option (INSERT OVERWRITE) allows you to replace existing data in a table with new data. This operation is particularly useful for batch-loading scenarios where the entire dataset needs to be refreshed.
Why DELETE Privilege is Required: Since the overwrite operation involves removing existing rows in the table, the executing role must have the DELETE privilege to carry out both the deletion of old data and the insertion of new data.
Granting DELETE Privilege:
To grant the DELETE privilege to a role, an account administrator can execute the following SQL command:
GRANT DELETE ON TABLE my_table TO ROLE my_role;
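With that privilege in place, a minimal sketch of the overwrite option (hypothetical table names with matching columns):
INSERT OVERWRITE INTO my_table
SELECT * FROM staging_table;
The existing rows in my_table are removed and replaced by the query results in a single statement.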
What does Snowflake attempt to do if any of the compute resources for a virtual warehouse fail to provision during start-up?
Repair the failed resources.
Restart failed resources.
Queue the failed resources
Provision the failed resources
If any compute resources for a virtual warehouse fail to provision during startup, Snowflake will attempt to restart those failed resources. The system is designed to automatically handle transient issues that might occur during the provisioning of compute resources. By restarting the failed resources, Snowflake aims to ensure that the virtual warehouse has the necessary compute capacity to handle the user's workloads without manual intervention.References: Snowflake Documentation on Virtual Warehouses
How can staged files be removed during data loading once the files have loaded successfully?
Use the DROP command
Use the PURGE copy option.
Use the FORCE = TRUE parameter.
Use the LOAD_UNCERTAIN_FILES copy option.
To remove staged files during data loading after they have been successfully loaded, the PURGE copy option is used in Snowflake.
PURGE Option: This option automatically deletes files from the stage after they have been successfully copied into the target table.
Usage:
COPY INTO my_table
FROM @my_stage
FILE_FORMAT = (TYPE = 'CSV')
PURGE = TRUE;
References:
Snowflake Documentation on COPY INTO
How many credits does a size 3X-Large virtual warehouse consume if it runs continuously for 2 hours?
32
64
128
256
In Snowflake, the consumption of credits by a virtual warehouse is determined by its size and the duration for which it runs. A 3X-Large virtual warehouse consumes 64 credits per hour, so running continuously for 2 hours consumes 64 × 2 = 128 credits. The rate of consumption is defined by Snowflake's pricing model, with larger warehouses providing greater computational resources and consuming proportionally more credits per hour.References: Snowflake Pricing Documentation
In the Data Exchange, who can get or request data from the listings? (Select TWO).
Users with ACCOUNTADMIN role
Users with SYSADMIN role
Users with ORGADMIN role
Users with IMPORT SHARE privilege
Users with MANAGE GRANTS privilege
In the Snowflake Data Exchange, the ability to get or request data from listings is generally controlled by specific roles and privileges:
A. Users with ACCOUNTADMIN role: This role typically has the highest level of access within a Snowflake account, including the ability to manage and access all features and functions. Users with this role can access data listings within the Data Exchange.
D. Users with import share privilege: This specific privilege is necessary for users who need to import shared data from the Data Exchange. This privilege allows them to request and access data listings explicitly shared with them.
Which table function is used to perform additional processing on the results of a previously-run query?
QUERY_HISTORY
RESULT_SCAN
DESCRIBE_RESULTS
QUERY_HISTORY_BY_SESSION
The RESULT_SCAN table function is used in Snowflake to perform additional processing on the results of a previously-run query. It allows users to reference the result set of a previous query by its query ID, enabling further analysis or transformations without re-executing the original query.
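As a quick illustration, RESULT_SCAN is often combined with LAST_QUERY_ID to post-process output that is otherwise hard to filter, such as the output of SHOW commands:
SHOW WAREHOUSES;
-- Treat the previous command's output as a table:
SELECT "name", "size"
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));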
References:
Snowflake Documentation: RESULT_SCAN
How can a 5 GB table be downloaded into a single file MOST efficiently?
Keep the default MAX_FILE_SIZE of 16 MB.
Set the default MAX_FILE_SIZE to 5 GB.
Set the SINGLE parameter to TRUE.
Use a regular expression in the stage specifications of the COPY command.
To download a 5 GB table into a single file most efficiently in Snowflake, you should set the SINGLE parameter to TRUE. This parameter ensures that the COPY INTO command outputs the result into a single file, regardless of the file size. This approach is more efficient than relying on the default MAX_FILE_SIZE setting, which would split the output into multiple files.
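A minimal sketch of such an unload (mytable and my_stage are hypothetical names; when SINGLE = TRUE, MAX_FILE_SIZE must also be raised so one file can hold the data):
COPY INTO @my_stage/mytable_export.csv.gz
FROM mytable
FILE_FORMAT = (TYPE = 'CSV' COMPRESSION = 'GZIP')
SINGLE = TRUE
MAX_FILE_SIZE = 5368709120; -- approximately 5 GB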
References:
Snowflake Documentation: COPY INTO
A Snowflake table that is loaded using a Kafka connector has a schema consisting of which two variant columns? (Select TWO).
RECORD_TIMESTAMP
RECORD_CONTENT
RECORD_KEY
RECORD_SESSION
RECORD_METADATA
When using the Snowflake Kafka connector, the target table schema consists of two VARIANT columns:
RECORD_CONTENT: This column stores the content of the Kafka message itself.
RECORD_METADATA: This column stores metadata about the message, such as the topic, partition, offset, timestamp, and key.
Together these columns preserve both the payload and the context of each Kafka message, facilitating data analysis and real-time processing tasks in Snowflake.
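A minimal sketch of querying these columns (my_kafka_table and the customer_id field are hypothetical):
SELECT
  record_metadata:topic::STRING AS topic,
  record_metadata:offset::INTEGER AS msg_offset,
  record_content:customer_id::STRING AS customer_id
FROM my_kafka_table;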
Granting a user which privileges on all virtual warehouses is equivalent to granting the user the global MANAGE WAREHOUSES privilege?
MODIFY, MONITOR and OPERATE privileges
OWNERSHIP and USAGE privileges
APPLYBUDGET and AUDIT privileges
MANAGE LISTING AUTO FULFILLMENT and RESOLVE ALL privileges
Granting a user the MODIFY, MONITOR, and OPERATE privileges on all virtual warehouses in Snowflake is equivalent to granting the global MANAGE WAREHOUSES privilege. These privileges collectively provide comprehensive control over virtual warehouses.
MODIFY Privilege:
Allows users to change the configuration of the virtual warehouse.
Includes altering properties such as the warehouse size (resizing).
MONITOR Privilege:
Allows users to view the status and usage metrics of the virtual warehouse.
Enables monitoring of performance and workload.
OPERATE Privilege:
Grants the ability to start and stop the virtual warehouse.
Includes suspending and resuming the warehouse as needed.
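A minimal sketch of both grant styles (my_wh and wh_admin are hypothetical names):
-- Per-warehouse grants, repeated for every warehouse in the account:
GRANT MODIFY, MONITOR, OPERATE ON WAREHOUSE my_wh TO ROLE wh_admin;
-- The equivalent single global grant:
GRANT MANAGE WAREHOUSES ON ACCOUNT TO ROLE wh_admin;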
References:
Snowflake Documentation: Warehouse Privileges
Which service or tool is a Command Line Interface (CLI) client used for connecting to Snowflake to execute SQL queries?
Snowsight
SnowCD
Snowpark
SnowSQL
SnowSQL is the Command Line Interface (CLI) client provided by Snowflake for executing SQL queries and performing various tasks. It allows users to connect to their Snowflake accounts and interact with the Snowflake data warehouse.
Installation: SnowSQL can be downloaded and installed on various operating systems.
Configuration: Users need to configure SnowSQL with their Snowflake account credentials.
Usage: Once configured, users can run SQL queries, manage data, and perform administrative tasks through the CLI.
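For example, a connection and one-off query might look like this (the account identifier and username are placeholders):
snowsql -a myorg-myaccount -u jdoe -q "SELECT CURRENT_VERSION();"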
References:
Snowflake Documentation: SnowSQL
Snowflake Documentation: Installing SnowSQL
When should a stored procedure be created with caller's rights?
When the caller needs to be prevented from viewing the source code of the stored procedure
When the caller needs to run a statement that could not execute outside of the stored procedure
When the stored procedure needs to run with the privileges of the role that called the stored procedure
When the stored procedure needs to operate on objects that the caller does not have privileges on
Stored procedures in Snowflake can be created with either 'owner's rights' or 'caller's rights'. A stored procedure created with caller's rights executes with the privileges of the role that calls the procedure, not the privileges of the role that owns the procedure. This is particularly useful in scenarios where the procedure needs to perform operations that depend on the caller's access permissions, ensuring that the procedure can only access objects that the caller is authorized to access.
Which types of charts are supported by Snowsight? (Select TWO)
Flowcharts
Gantt charts
Line charts
Pie charts
Scatterplots
Snowsight supports a fixed set of chart types for visualizing query results, including bar charts, line charts, scatterplots, heatgrids, and scorecards. Of the options listed, line charts and scatterplots are supported; flowcharts, Gantt charts, and pie charts are not available in Snowsight.
What is the MINIMUM size of a table for which Snowflake recommends considering adding a clustering key?
1 Kilobyte (KB)
1 Megabyte (MB)
1 Gigabyte (GB)
1 Terabyte (TB)
Snowflake recommends considering adding a clustering key to a table when its size reaches 1 Terabyte (TB) or larger. Clustering keys help optimize the storage and query performance by organizing the data in a table based on the specified columns. This is particularly beneficial for large tables where data retrieval can become inefficient without proper clustering.
Why Clustering Keys Are Important: Clustering keys ensure that data stored in Snowflake is physically ordered in a way that aligns with the most frequent access patterns, thereby reducing the amount of scanned data during queries and improving performance.
Recommendation Basis: The recommendation for tables of size 1 TB or larger is based on the observation that smaller tables generally do not benefit as much from clustering, given Snowflake's architecture. However, as tables grow in size, the benefits of clustering become more pronounced.
Implementing Clustering Keys:
To set a clustering key for a table, you can use the CLUSTER BY clause during table creation or alter an existing table to add it:
CREATE TABLE my_table (... ) CLUSTER BY (column1, column2);
Or for an existing table:
ALTER TABLE my_table CLUSTER BY (column1, column2);
Which access control entity in Snowflake can be created as part of a hierarchy within an account?
Securable object
Role
Privilege
User
In Snowflake, a role is an access control entity that can be created as part of a hierarchy within an account. Roles are used to grant and manage privileges in a structured and scalable manner.
Understanding Roles:
Roles are logical entities that group privileges together.
They are used to control access to securable objects like tables, views, warehouses, and more.
Role Hierarchy:
Roles can be organized into a hierarchy, allowing for the inheritance of privileges.
A role higher in the hierarchy (parent role) can grant its privileges to a lower role (child role), simplifying privilege management.
Creating Roles:
Roles can be created using the CREATE ROLE command.
You can define parent-child relationships by granting one role to another.
Example Usage:
CREATE ROLE role1;
CREATE ROLE role2;
GRANT ROLE role1 TO role2;
In this example, role2 inherits the privileges of role1.
Benefits:
Simplifies privilege management: Hierarchies allow for efficient privilege assignment and inheritance.
Enhances security: Roles provide a clear structure for managing access control, ensuring that privileges are granted appropriately.
References:
Snowflake Documentation: Access Control in Snowflake
Snowflake Documentation: Creating and Managing Roles
When unloading data, which combination of parameters should be used to differentiate between empty strings and NULL values? (Select TWO).
ESCAPE_UNENCLOSED_FIELD
REPLACE_INVALID_CHARACTERS
FIELD_OPTIONALLY_ENCLOSED_BY
EMPTY_FIELD_AS_NULL
SKIP_BLANK_LINES
When unloading data in Snowflake, it is essential to differentiate between empty strings and NULL values to preserve data integrity. The parameters FIELD_OPTIONALLY_ENCLOSED_BY and EMPTY_FIELD_AS_NULL are used together for this purpose:
FIELD_OPTIONALLY_ENCLOSED_BY: Specifies the character used to enclose string fields. With enclosure enabled, empty strings are written as enclosed empty fields (for example, "").
EMPTY_FIELD_AS_NULL: When unloading, this option works in combination with FIELD_OPTIONALLY_ENCLOSED_BY; SQL NULLs are written as empty, unenclosed fields, so they remain distinguishable from enclosed empty strings.
These parameters are crucial when exporting data to systems that need an explicit distinction between NULL and empty string values.
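A minimal sketch of an unload using both options (my_stage and my_table are hypothetical names):
COPY INTO @my_stage/unload/
FROM my_table
FILE_FORMAT = (
  TYPE = 'CSV'
  FIELD_OPTIONALLY_ENCLOSED_BY = '"'
  EMPTY_FIELD_AS_NULL = TRUE
);
-- Empty strings are unloaded as enclosed empty fields (""), while SQL NULLs
-- are unloaded as empty, unenclosed fields, keeping the two distinguishable.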
Which function returns an integer between 0 and 100 when used to calculate the similarity of two strings?
APPROXIMATE_SIMILARITY
JAROWINKLER_SIMILARITY
APPROXIMATE_JACCARD_INDEX
MINHASH_COMBINE
The JAROWINKLER_SIMILARITY function in Snowflake returns an integer between 0 and 100, indicating the similarity of two strings based on the Jaro-Winkler similarity algorithm. This function is useful for comparing strings and determining how closely they match each other.
Understanding JAROWINKLER_SIMILARITY: The Jaro-Winkler similarity metric is a measure of similarity between two strings. The score is a number between 0 and 100, where 100 indicates an exact match and lower scores indicate less similarity.
Usage Example: To compare two strings and get their similarity score, you can use:
SELECT JAROWINKLER_SIMILARITY('string1', 'string2') AS similarity_score;
Application Scenarios: This function is particularly useful in data cleaning, matching, and deduplication tasks where you need to identify similar but not identical strings, such as names, addresses, or product titles.
Where would a Snowflake user find information about query activity from 90 days ago?
account_usage.query_history view
account_usage.query_history_archive view
information_schema.query_history view
information_schema.query_history_by_session view
To find information about query activity from 90 days ago, a Snowflake user should use the account_usage.query_history view. The INFORMATION_SCHEMA.QUERY_HISTORY table function only covers the most recent 7 days of activity, whereas the ACCOUNT_USAGE.QUERY_HISTORY view retains query history for 365 days, which comfortably covers the 90-day period mentioned.
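A minimal sketch of such a lookup (the two-day window around the 90-day mark is illustrative):
SELECT query_id, query_text, start_time
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -91, CURRENT_TIMESTAMP())
  AND start_time < DATEADD(day, -89, CURRENT_TIMESTAMP());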
References:
[COF-C02] SnowPro Core Certification Exam Study Guide
Snowflake Documentation on Account Usage Schema1
When does a materialized view get suspended in Snowflake?
When a column is added to the base table
When a column is dropped from the base table
When a DML operation is run on the base table
When the base table is reclustered
A materialized view in Snowflake gets suspended when structural changes that could impact the view's integrity are made to the base table, such as when a column is dropped from the base table. Dropping a column on which a materialized view is defined can invalidate the view's data, as the view might rely on the column being removed. To maintain data consistency and prevent the materialized view from serving stale or incorrect data, Snowflake automatically suspends the materialized view.
Upon suspension, the materialized view does not reflect changes to the base table until it is refreshed or re-created. This ensures that only accurate and current data is presented to users querying the materialized view.
References:
Snowflake Documentation on Materialized Views: Materialized Views
True or False: A 4X-Large Warehouse may, at times, take longer to provision than an X-Small Warehouse.
True
False
Provisioning time can vary with warehouse size. A 4X-Large warehouse must acquire many more compute resources than an X-Small warehouse, so it may, at times, take longer to provision.
References: Snowflake Documentation on Virtual Warehouses
How would you determine the size of the virtual warehouse used for a task?
Since the root task may be executed concurrently (i.e., multiple instances), it is recommended to leave some margin in the execution window to avoid missing instances of execution
Querying (SELECT) the stream contents would help determine the warehouse size. For example, if querying large stream contents, use a larger warehouse size
If using a stored procedure to execute multiple SQL statements, it is best to test run the stored procedure separately to size the compute resources first
Since task infrastructure is based on running the task body on a schedule, it is recommended to configure the virtual warehouse for automatic concurrency handling using a multi-cluster warehouse (MCW) to match the task schedule
Snowflake's guidance for sizing a task's warehouse is to execute the task's workload ad hoc first: test run the SQL statements or the stored procedure in the task body on a warehouse, review the execution times and query profiles, and choose a warehouse size based on those results. A multi-cluster warehouse addresses concurrency across many queries, not the sizing of an individual task run. References: [COF-C02] SnowPro Core Certification Exam Study Guide
A sales table FCT_SALES has 100 million records.
The following query was executed:
SELECT COUNT(1) FROM FCT_SALES;
How did Snowflake fulfill this query?
Query against the result set cache
Query against a virtual warehouse cache
Query against the most-recently created micro-partition
Query against the metadata cache
Snowflake is designed to optimize query performance by utilizing metadata for certain types of queries. When executing a COUNT query, Snowflake can often fulfill the request by accessing metadata about the table’s row count, rather than scanning the entire table or micro-partitions. This is particularly efficient for large tables like FCT_SALES with a significant number of records. The metadata layer maintains statistics about the table, including the row count, which enables Snowflake to quickly return the result of a COUNT query without the need to perform a full scan.
References:
Snowflake Documentation on Metadata Management
SnowPro® Core Certification Study Guide
Which Snowflake partner specializes in data catalog solutions?
Alation
DataRobot
dbt
Tableau
Alation is known for specializing in data catalog solutions and is a partner of Snowflake. Data catalog solutions are essential for organizations to effectively manage their metadata and make it easily accessible and understandable for users, which aligns with the capabilities provided by Alation.
References:
[COF-C02] SnowPro Core Certification Exam Study Guide
Snowflake’s official documentation and partner listings
During periods of warehouse contention, which parameter controls the maximum length of time a warehouse will hold a query for processing?
STATEMENT_TIMEOUT_IN_SECONDS
STATEMENT_QUEUED_TIMEOUT_IN_SECONDS
MAX_CONCURRENCY_LEVEL
QUERY_TIMEOUT_IN_SECONDS
The STATEMENT_QUEUED_TIMEOUT_IN_SECONDS parameter sets the limit on how long a query can wait in the queue for its chance to run on the warehouse; the query is cancelled after reaching this limit. By default, the value of this parameter is 0, which means queries will wait indefinitely in the queue.
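A minimal sketch of setting the parameter on a warehouse (my_wh is a hypothetical name):
ALTER WAREHOUSE my_wh SET STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 300;
-- Queries queued on this warehouse for more than 5 minutes are cancelled.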
https://community.snowflake.com/s/article/Warehouse-Concurrency-and-Statement-Timeout-Parameters#:~:text=The%20parameter%20STATEMENT_QUEUED_TIMEOUT_IN_SECONDS%20sets%20the,indefinitely%20in%20the%20waiting%20queue
True or False: It is possible for a user to run a query against the query result cache without requiring an active Warehouse.
True
False
Snowflake’s architecture allows for the use of a query result cache that stores the results of queries for a period of time. If the same query is run again and the underlying data has not changed, Snowflake can retrieve the result from this cache without needing to re-run the query on an active warehouse, thus saving on compute resources.
Which statement about billing applies to Snowflake credits?
Credits are billed per-minute with a 60-minute minimum
Credits are used to pay for cloud data storage usage
Credits are consumed based on the number of credits billed for each hour that a warehouse runs
Credits are consumed based on the warehouse size and the time the warehouse is running
Snowflake credits are the unit of measure for the compute resources used in Snowflake. The number of credits consumed depends on the size of the virtual warehouse and the time it is running. Larger warehouses consume more credits per hour than smaller ones, and credits are billed for the time the warehouse is active, regardless of the actual usage within that time.
References: [COF-C02] SnowPro Core Certification Exam Study Guide
A developer is granted ownership of a table that has a masking policy. The developer's role is not able to see the masked data. Will the developer be able to modify the table to read the masked data?
Yes, because a table owner has full control and can unset masking policies.
Yes, because masking policies only apply to cloned tables.
No, because masking policies must always reference specific access roles.
No, because ownership of a table does not include the ability to change masking policies
Even if a developer is granted ownership of a table with a masking policy, they will not be able to modify the table to read the masked data if their role does not have the necessary permissions. Ownership of a table does not automatically confer the ability to alter masking policies, which are designed to protect sensitive data. Masking policies are applied at the schema level and require specific privileges to modify12.
References:
[COF-C02] SnowPro Core Certification Exam Study Guide
Snowflake Documentation on Masking Policies
What Snowflake features allow virtual warehouses to handle high concurrency workloads? (Select TWO)
The ability to scale up warehouses
The use of warehouse auto scaling
The ability to resize warehouses
Use of multi-clustered warehouses
The use of warehouse indexing
Snowflake’s architecture is designed to handle high concurrency workloads through several features, two of which are particularly effective:
B. The use of warehouse auto scaling: This feature allows Snowflake to automatically adjust the compute resources allocated to a virtual warehouse in response to the workload. If there is an increase in concurrent queries, Snowflake can scale up the resources to maintain performance.
D. Use of multi-clustered warehouses: Multi-clustered warehouses enable Snowflake to run multiple clusters of compute resources simultaneously. This allows for the distribution of queries across clusters, thereby reducing the load on any single cluster and improving the system’s ability to handle a high number of concurrent queries.
These features ensure that Snowflake can manage varying levels of demand without manual intervention, providing a seamless experience even during peak usage.
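A minimal sketch of creating a multi-cluster warehouse with auto-scaling (my_mcw is a hypothetical name; multi-cluster warehouses require Enterprise Edition or higher):
CREATE WAREHOUSE my_mcw
  WAREHOUSE_SIZE = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 3
  SCALING_POLICY = 'STANDARD';
-- In auto-scale mode, Snowflake starts and stops clusters between the
-- minimum and maximum counts as concurrency rises and falls.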
References:
Snowflake Documentation on Virtual Warehouses
SnowPro® Core Certification Study Guide
Which Snowflake technique can be used to improve the performance of a query?
Clustering
Indexing
Fragmenting
Using INDEX_HINTS
Clustering is a technique used in Snowflake to improve the performance of queries. It involves organizing the data in a table into micro-partitions based on the values of one or more columns. This organization allows Snowflake to efficiently prune non-relevant micro-partitions during a query, which reduces the amount of data scanned and improves query performance.
References:
[COF-C02] SnowPro Core Certification Exam Study Guide
Snowflake Documentation on Clustering
Which stage type can be altered and dropped?
Database stage
External stage
Table stage
User stage
External stages can be altered and dropped in Snowflake. An external stage points to an external location, such as an S3 bucket, where data files are stored. Users can modify the stage’s definition or drop it entirely if it’s no longer needed. This is in contrast to table stages, which are tied to specific tables and cannot be altered or dropped independently.
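A minimal sketch of the lifecycle of an external stage (the stage name and bucket URL are hypothetical):
CREATE STAGE my_ext_stage URL = 's3://my-bucket/data/';
ALTER STAGE my_ext_stage SET COMMENT = 'Raw landing area';
DROP STAGE my_ext_stage;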
References:
[COF-C02] SnowPro Core Certification Exam Study Guide
Snowflake Documentation on Stages1
Which Snowflake object enables loading data from files as soon as they are available in a cloud storage location?
Pipe
External stage
Task
Stream
In Snowflake, a Pipe is the object designed to enable the continuous, near-real-time loading of data from files as soon as they are available in a cloud storage location. Pipes use Snowflake’s COPY command to load data and can be associated with a Stage object to monitor for new files. When new data files appear in the stage, the pipe automatically loads the data into the target table.
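A minimal sketch of a pipe definition (all object names are hypothetical; AUTO_INGEST = TRUE additionally requires cloud event notifications to be configured on the external stage's storage location):
CREATE PIPE my_pipe
  AUTO_INGEST = TRUE
  AS
  COPY INTO my_table
  FROM @my_ext_stage
  FILE_FORMAT = (TYPE = 'CSV');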
References:
Snowflake Documentation on Pipes
SnowPro® Core Certification Study Guide
https://docs.snowflake.com/en/user-guide/data-load-snowpipe-intro.html
What feature can be used to reorganize a very large table on one or more columns?
Micro-partitions
Clustering keys
Key partitions
Clustered partitions
Clustering keys in Snowflake are used to reorganize large tables based on one or more columns. This feature optimizes the arrangement of data within micro-partitions to improve query performance, especially for large tables where efficient data retrieval is crucial. References: [COF-C02] SnowPro Core Certification Exam Study Guide
https://docs.snowflake.com/en/user-guide/tables-clustering-keys.html
Which of the following Snowflake features provide continuous data protection automatically? (Select TWO).
Internal stages
Incremental backups
Time Travel
Zero-copy clones
Fail-safe
Snowflake’s Continuous Data Protection (CDP) encompasses a set of features that help protect data stored in Snowflake against human error, malicious acts, and software failure. Time Travel allows users to access historical data (i.e., data that has been changed or deleted) for a defined period, enabling querying and restoring of data. Fail-safe is an additional layer of data protection that provides a recovery option in the event of significant data loss or corruption, which can only be performed by Snowflake.
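Two brief illustrations of Time Travel (my_table is a hypothetical name; both operations must occur within the table's retention period):
-- Query the table as it existed one hour ago:
SELECT * FROM my_table AT(OFFSET => -3600);
-- Restore an accidentally dropped table:
UNDROP TABLE my_table;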
References:
Continuous Data Protection | Snowflake Documentation1
Data Storage Considerations | Snowflake Documentation2
Snowflake SnowPro Core Certification Study Guide3
Snowflake Data Cloud Glossary
https://docs.snowflake.com/en/user-guide/data-availability.html
What can be used to view warehouse usage over time? (Select Two).
The LOAD_HISTORY view
The QUERY_HISTORY view
The SHOW WAREHOUSES command
The WAREHOUSE_METERING_HISTORY view
The billing and usage tab in the Snowflake web UI
To view warehouse usage over time, the WAREHOUSE_METERING_HISTORY view and the billing and usage tab in the Snowflake web UI can be used. The WAREHOUSE_METERING_HISTORY view (in the ACCOUNT_USAGE schema) records the credits consumed by each warehouse, hour by hour, for the past 365 days. The billing and usage tab presents the same per-warehouse credit consumption graphically over a selected date range. References: [COF-C02] SnowPro Core Certification Exam Study Guide
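A minimal sketch of querying credit usage per warehouse over the last 30 days (the window is illustrative):
SELECT warehouse_name,
       DATE_TRUNC('day', start_time) AS usage_day,
       SUM(credits_used) AS credits
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY 1, 2;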
A user unloaded a Snowflake table called mytable to an internal stage called mystage.
Which command can be used to view the list of files that have been uploaded to the stage?
list @mytable;
list @%mytable;
list @%mystage;
list @mystage;
The command list @mystage; is used to view the list of files that have been uploaded to an internal stage in Snowflake. The list command displays the metadata for all files in the specified stage, which in this case is mystage. This command is particularly useful for verifying that files have been successfully unloaded from a Snowflake table to the stage and for managing the files within the stage.
References:
Snowflake Documentation on Stages
SnowPro® Core Certification Study Guide
Which data types does Snowflake support when querying semi-structured data? (Select TWO)
VARIANT
ARRAY
VARCHAR
XML
BLOB
Snowflake supports querying semi-structured data using specific data types that are capable of handling the flexibility and structure of such data. The data types supported for this purpose are:
A. VARIANT: This is a universal data type that can store values of any other type, including structured and semi-structured types. It is particularly useful for handling JSON, Avro, ORC, Parquet, and XML data formats1.
B. ARRAY: An array is a list of elements that can be of any data type, including VARIANT, and is used to handle semi-structured data that is naturally represented as a list1.
These data types are part of Snowflake’s built-in support for semi-structured data, allowing for the storage, querying, and analysis of data that does not fit into the traditional row-column format.
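A minimal sketch of storing and querying JSON with these types (the table and field names are hypothetical):
CREATE TABLE my_json_table (v VARIANT);
INSERT INTO my_json_table
  SELECT PARSE_JSON('{"name": "Ada", "tags": ["vip", "eu"]}');
SELECT v:name::STRING AS name,
       v:tags[0]::STRING AS first_tag
FROM my_json_table;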
References:
Snowflake Documentation on Semi-Structured Data
[COF-C02] SnowPro Core Certification Exam Study Guide
What are the default Time Travel and Fail-safe retention periods for transient tables?
Time Travel - 1 day. Fail-safe - 1 day
Time Travel - 0 days. Fail-safe - 1 day
Time Travel - 1 day. Fail-safe - 0 days
Transient tables are retained in neither Fail-safe nor Time Travel
Transient tables in Snowflake have a default Time Travel retention period of 1 day, which allows users to access historical data within the last 24 hours. However, transient tables do not have a Fail-safe period. Fail-safe is an additional layer of data protection that retains data beyond the Time Travel period for recovery purposes in case of extreme data loss. Since transient tables are designed for temporary or intermediate workloads with no requirement for long-term durability, they do not include a Fail-safe period by default1.
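A minimal sketch (my_transient_table is a hypothetical name):
CREATE TRANSIENT TABLE my_transient_table (id INT)
  DATA_RETENTION_TIME_IN_DAYS = 1; -- up to 1 day of Time Travel; no Fail-safe applies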
References:
Snowflake Documentation on Storage Costs for Time Travel and Fail-safe
Which command is used to unload data from a Snowflake table into a file in a stage?
COPY INTO
GET
WRITE
EXTRACT INTO
The COPY INTO command is used in Snowflake to unload data from a table into a file in a stage. This command allows for the export of data from Snowflake tables into flat files, which can then be used for further analysis, processing, or storage in external systems.
References:
Snowflake Documentation on Unloading Data
Snowflake SnowPro Core: Copy Into Command to Unload Rows to Files in Named Stage
What happens when a cloned table is replicated to a secondary database? (Select TWO)
A read-only copy of the cloned tables is stored.
The replication will not be successful.
The physical data is replicated
Additional costs for storage are charged to a secondary account
Metadata pointers to cloned tables are replicated
When a cloned table is replicated to a secondary database in Snowflake, the following occurs:
C. The physical data is replicated: Although a clone initially shares micro-partitions with its source table, this sharing is not preserved across replication. The replicated clone in the secondary database holds a full, independent copy of the data.
D. Additional costs for storage are charged to a secondary account: Because the clone is materialized as a full copy in the secondary database, the secondary account incurs storage costs for that data.
It's important to note that the secondary database is read-only and cannot be used for write operations unless it is promoted to serve as the primary.
References:
SnowPro Core Exam Prep — Answers to Snowflake’s LEVEL UP: Backup and Recovery
Snowflake SnowPro Core Certification Exam Questions Set 10
What tasks can be completed using the copy command? (Select TWO)
Columns can be aggregated
Columns can be joined with an existing table
Columns can be reordered
Columns can be omitted
Data can be loaded without the need to spin up a virtual warehouse
The COPY command in Snowflake allows for the reordering of columns as they are loaded into a table, and it also permits the omission of columns from the source file during the load process. This provides flexibility in handling the schema of the data being ingested. References: [COF-C02] SnowPro Core Certification Exam Study Guide
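A minimal sketch of reordering and omitting columns during a load (all names are hypothetical; $1 and $2 refer to columns in the staged file):
COPY INTO my_table (col_b, col_a)
FROM (SELECT t.$2, t.$1 FROM @my_stage t)
FILE_FORMAT = (TYPE = 'CSV');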
Which services does the Snowflake Cloud Services layer manage? (Select TWO).
Compute resources
Query execution
Authentication
Data storage
Metadata
The Snowflake Cloud Services layer manages a variety of services that are crucial for the operation of the Snowflake platform. Among these services, Authentication and Metadata management are key components. Authentication is essential for controlling access to the Snowflake environment, ensuring that only authorized users can perform actions within the platform. Metadata management involves handling all the metadata related to objects within Snowflake, such as tables, views, and databases, which is vital for the organization and retrieval of data.
References:
[COF-C02] SnowPro Core Certification Exam Study Guide
Snowflake Documentation12
https://docs.snowflake.com/en/user-guide/intro-key-concepts.html
Which of the following can be executed/called with Snowpipe?
A User Defined Function (UDF)
A stored procedure
A single COPY INTO statement
A single INSERT INTO statement
Snowpipe is used for continuous, automated data loading into Snowflake. A pipe is defined with a single COPY INTO statement, which Snowpipe executes to load data from staged files as soon as they become available. UDFs, stored procedures, and INSERT statements cannot be executed by Snowpipe.