Is your data team’s budget under pressure? Are you aiming to free up funds for innovation and AI investment? A good starting point for reducing cloud computing costs is to optimize your data and analytics platforms.
We’ve compiled a list of the best resources on Snowflake cost optimization. Make sure you are getting the most from your credits, have the right controls in place, and have turned off any features you don’t use. Check out this advice from expert Snowflake practitioners to improve performance and maximize cost savings.
If all this sounds too complicated, or you’ve already tried these cloud cost optimizations, talk to us at Yellowbrick. Moving to Yellowbrick automatically gives you savings and better performance. Yellowbrick can be 7x cheaper and 2x faster than Snowflake – watch the benchmark. We’re so confident, we even guarantee it!
Understand the Credit-Based Model
Make sure you don’t confuse the Snowflake credit-based model for “pay for what you use.” You pay for the data warehouses you have running, not for active queries.
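To make the difference concrete, here is a minimal sketch of how billing follows warehouse running time rather than query activity. The credit rates and the 60-second minimum per resume are assumptions based on Snowflake's published rates for standard warehouses; check your own edition and contract before relying on them.

```python
# Assumed credits per hour for standard warehouses (each size doubles the rate).
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}

# Assumed minimum billing period each time a warehouse starts or resumes.
MIN_BILLED_SECONDS = 60


def billed_credits(size: str, running_seconds: float) -> float:
    """Credits billed for one continuous run of a warehouse.

    Query activity is deliberately not an input: only running time counts.
    """
    seconds = max(running_seconds, MIN_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * seconds / 3600


# A MEDIUM warehouse left running for 8 hours, versus one that runs only
# for the 20 minutes in which queries were actually executing.
left_running = billed_credits("MEDIUM", 8 * 3600)
only_when_busy = billed_credits("MEDIUM", 20 * 60)
print(f"left running: {left_running:.2f} credits")
print(f"only when busy: {only_when_busy:.2f} credits")
```

Under these assumptions the same 20 minutes of query work costs 24 times more when the warehouse is simply left running for the whole working day.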
Implement Effective Governance Controls
Ensure you benchmark, monitor credit usage, and gain a holistic picture of your Snowflake usage. Put the right governance controls in place to restrict account-level permissions for creating new warehouses and turning on costly features.
Resource Monitors and Query Timeouts
Create resource monitors and set query timeouts in Snowflake to make sure you don’t blow through your budget. Of course, be careful when doing this for business-critical processes that you can’t just terminate arbitrarily.
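As a rough illustration, the helper below builds the SQL statements for a monthly resource monitor and a statement timeout. The monitor name, warehouse name, quota, and thresholds are hypothetical placeholders; the sketch uses SUSPEND rather than SUSPEND_IMMEDIATE at 100 percent so that running statements can finish, which is the gentler choice for critical workloads.

```python
def resource_monitor_sql(monitor: str, warehouse: str,
                         monthly_credits: int, timeout_seconds: int) -> list[str]:
    """Build Snowflake SQL for a credit quota and a per-statement timeout.

    All names and numbers passed in are illustrative, not recommendations.
    """
    if monthly_credits <= 0 or timeout_seconds <= 0:
        raise ValueError("quota and timeout must be positive")
    return [
        # Notify at 80% of the quota, suspend the warehouse at 100%.
        f"CREATE OR REPLACE RESOURCE MONITOR {monitor} "
        f"WITH CREDIT_QUOTA = {monthly_credits} FREQUENCY = MONTHLY "
        "START_TIMESTAMP = IMMEDIATELY "
        "TRIGGERS ON 80 PERCENT DO NOTIFY ON 100 PERCENT DO SUSPEND;",
        # Attach the monitor to the warehouse it should govern.
        f"ALTER WAREHOUSE {warehouse} SET RESOURCE_MONITOR = {monitor};",
        # Cancel any single statement that runs longer than the timeout.
        f"ALTER WAREHOUSE {warehouse} "
        f"SET STATEMENT_TIMEOUT_IN_SECONDS = {timeout_seconds};",
    ]


for statement in resource_monitor_sql("bi_monthly_cap", "bi_wh", 500, 3600):
    print(statement)
```

The statements are only printed here; running them requires a role with the privileges to create resource monitors and alter the warehouse.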
Auto-Suspending in Snowflake: Avoiding Costly Cache Misses
Understand how auto-suspending too aggressively can actually increase cost by wiping out the cache. Use materialized views only when you really need them, because their automatic refreshes consume credits.
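The auto-suspend trade-off can be explored with a toy model. The sketch below takes a list of idle gaps between query bursts and an auto-suspend threshold, then reports how many idle seconds are billed and how often the warehouse restarts with a cold cache. It is a deliberate simplification: it ignores any minimum billing period on resume, and the gap values are made up for illustration.

```python
def idle_seconds_and_cold_starts(gaps_seconds, auto_suspend_seconds):
    """Return (idle seconds billed, cold starts) for a sequence of idle gaps."""
    idle_billed = 0
    cold_starts = 0
    for gap in gaps_seconds:
        if gap < auto_suspend_seconds:
            # The warehouse stays up: the whole gap is billed, cache stays warm.
            idle_billed += gap
        else:
            # Billed until the suspend timer fires; the next query then
            # resumes the warehouse with an empty local cache.
            idle_billed += auto_suspend_seconds
            cold_starts += 1
    return idle_billed, cold_starts


gaps = [30, 90, 400, 45, 1200]  # hypothetical seconds of idle time between bursts
for threshold in (60, 600):
    idle, cold = idle_seconds_and_cold_starts(gaps, threshold)
    print(f"AUTO_SUSPEND={threshold}s: {idle}s idle billed, {cold} cold starts")
```

A short threshold bills less idle time but causes more cold starts, and each cold start can make the following queries slower and therefore more expensive. Which side wins depends on how cache-sensitive the workload is.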
Retention Duration Options
Ensure you consciously choose your Time Travel retention duration, including setting retention to zero where appropriate, and decide which tables need Fail-safe at all. You could pay up to 90x for storage and Snowflake cloud resources that you don’t need. Use transient or temporary tables, which have no Fail-safe period, to minimize these costs.
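A back-of-the-envelope sketch shows where a multiplier of that order can come from. It assumes the worst case in which a table is fully rewritten once a day, so that every retained day holds a complete historical copy. The Fail-safe periods (7 days for permanent tables, none for transient or temporary ones) and the retention limits are assumptions based on Snowflake's documented defaults; real tables that change only partially will sit well below this bound.

```python
# Assumed Fail-safe period in days per table type.
FAIL_SAFE_DAYS = {"permanent": 7, "transient": 0, "temporary": 0}

# Assumed maximum Time Travel retention in days per table type.
MAX_TIME_TRAVEL_DAYS = {"permanent": 90, "transient": 1, "temporary": 1}


def worst_case_storage_multiple(table_type: str, time_travel_days: int) -> int:
    """Copies of the table held in storage if it is fully rewritten daily."""
    if not 0 <= time_travel_days <= MAX_TIME_TRAVEL_DAYS[table_type]:
        raise ValueError("retention out of range for this table type")
    # One active copy, plus one historical copy per retained day.
    return 1 + time_travel_days + FAIL_SAFE_DAYS[table_type]


for table_type, days in [("permanent", 90), ("permanent", 1), ("transient", 0)]:
    multiple = worst_case_storage_multiple(table_type, days)
    print(f"{table_type}, {days} days of Time Travel: up to {multiple}x storage")
```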
Effective Snowflake Clustering
Use Snowflake clustering wisely. When utilized effectively, clustering can enhance performance by decreasing the number of micro-partitions that need to be scanned for frequently used query patterns that utilize the cluster key. Avoid clustering for tables with frequently changing data – you’ll still pay the cost, but without any benefit.
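The pruning effect can be illustrated with a toy model. This is not Snowflake's implementation, only a sketch of the idea: each "micro-partition" here is a fixed-size chunk of values that records its minimum and maximum, and a range filter only needs to scan chunks whose range overlaps the filter.

```python
import random


def build_partitions(values, rows_per_partition):
    """Split values into chunks and keep only (min, max) metadata per chunk."""
    chunks = [values[i:i + rows_per_partition]
              for i in range(0, len(values), rows_per_partition)]
    return [(min(chunk), max(chunk)) for chunk in chunks]


def partitions_scanned(metadata, low, high):
    """Count chunks whose [min, max] range overlaps the filter [low, high]."""
    return sum(1 for mn, mx in metadata if mx >= low and mn <= high)


random.seed(0)
arrival_order = list(range(10_000))
random.shuffle(arrival_order)  # rows arrive in no useful order

unclustered = build_partitions(arrival_order, 100)
clustered = build_partitions(sorted(arrival_order), 100)  # ordered by the key

print("unclustered:", partitions_scanned(unclustered, 2000, 2099), "of", len(unclustered))
print("clustered:  ", partitions_scanned(clustered, 2000, 2099), "of", len(clustered))
```

When the data is ordered by the filtered column, almost every chunk can be skipped. When rows are scattered, nearly every chunk overlaps the filter. This is also why constantly changing data undermines the benefit: the ordering keeps degrading and has to be paid for again.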
Leveraging the Search Optimization Service
Another approach to optimize performance is to use the Search Optimization Service. This is particularly useful when you have a significant number of queries with patterns that don’t align with the cluster key strategy. Beware, this also incurs additional cloud spend and credit consumption. Most other databases call this approach an index.
Snowflake relies on data being in memory for performance. Queries that spill to disk run noticeably slower. Use the Activity area of the Snowsight web interface to view queries that are spilling.
The query profile will also provide valuable information on query spillage. You can fix spilling by optimizing individual queries, choosing appropriate cluster keys, reducing MAX_CONCURRENCY_LEVEL so that fewer queries compete for resources at the same time, or using a larger warehouse size, which provides more cache and memory.
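Spilling can also be tracked outside the UI. The sketch below holds a query against the ACCOUNT_USAGE.QUERY_HISTORY view, which exposes spill columns, and a small helper that ranks returned rows so that remote spilling (usually the more expensive kind) comes first. The seven-day window, the row limit, and the sample rows are arbitrary illustrative choices, and executing the SQL requires a connection and access to the SNOWFLAKE database, which is not shown.

```python
SPILLING_QUERIES_SQL = """
SELECT query_id,
       warehouse_name,
       bytes_spilled_to_local_storage,
       bytes_spilled_to_remote_storage
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
  AND (bytes_spilled_to_local_storage > 0
       OR bytes_spilled_to_remote_storage > 0)
LIMIT 1000;
"""


def rank_spilling(rows):
    """Keep rows that spilled and sort them worst-first (remote, then local)."""
    spilled = [r for r in rows
               if r["bytes_spilled_to_local_storage"] > 0
               or r["bytes_spilled_to_remote_storage"] > 0]
    return sorted(spilled,
                  key=lambda r: (-r["bytes_spilled_to_remote_storage"],
                                 -r["bytes_spilled_to_local_storage"]))


# Made-up sample rows standing in for a real result set.
sample = [
    {"query_id": "q1", "bytes_spilled_to_local_storage": 5_000,
     "bytes_spilled_to_remote_storage": 0},
    {"query_id": "q2", "bytes_spilled_to_local_storage": 0,
     "bytes_spilled_to_remote_storage": 0},
    {"query_id": "q3", "bytes_spilled_to_local_storage": 1_000,
     "bytes_spilled_to_remote_storage": 9_000},
]
print([row["query_id"] for row in rank_spilling(sample)])
```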
Multi-Cluster Warehouses and Auto-Scaling
Check for queuing in Snowflake to decide if you need to scale to more clusters. The main way to drive more concurrency in Snowflake is to use multi-cluster warehouses. Use auto-scaling to minimize the time that these additional clusters are running. Set min and max clusters to keep costs from spiraling out of control.
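A minimal sketch of both halves of that advice follows: a helper that builds the ALTER WAREHOUSE statement for cluster limits, and a calculation of the worst-case hourly spend those limits allow. The warehouse name and limits are hypothetical, the credit rates are assumed standard-warehouse rates, and multi-cluster warehouses are, as far as we know, only available on Enterprise Edition or higher.

```python
# Assumed credits per hour, per cluster, for standard warehouses.
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}


def multi_cluster_sql(warehouse: str, min_clusters: int, max_clusters: int,
                      policy: str = "STANDARD") -> str:
    """Build the statement that bounds auto-scaling for a warehouse."""
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    if policy not in ("STANDARD", "ECONOMY"):
        raise ValueError("policy must be STANDARD or ECONOMY")
    return (f"ALTER WAREHOUSE {warehouse} SET "
            f"MIN_CLUSTER_COUNT = {min_clusters} "
            f"MAX_CLUSTER_COUNT = {max_clusters} "
            f"SCALING_POLICY = '{policy}';")


def max_hourly_credits(size: str, max_clusters: int) -> int:
    """Upper bound on credits per hour when every cluster is running."""
    return CREDITS_PER_HOUR[size] * max_clusters


print(multi_cluster_sql("bi_wh", 1, 3))
print("worst case:", max_hourly_credits("LARGE", 3), "credits per hour")
```

Setting the maximum is what caps the bill: the worst case grows linearly with the cluster limit, so it is worth computing before raising it.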
Strategies for Snowflake Cost Optimization
One of Snowflake’s main challenges is managing cost, which can vary depending on the usage and configuration of the platform. Strategies for Snowflake cost optimization that can help you reduce your spending include:
– Use auto-suspend and auto-resume to avoid paying for idle time.
– Set up resource monitors and alerts to track and control your usage and spending.
– Use warehouses of different sizes and types for different workloads and scenarios.
– Use clustering keys and partition pruning to improve query performance and reduce scanning costs.
– Use materialized views and secure user-defined functions to avoid redundant computations.
– Rely on result caching and persistent result sets to reuse query results and save on compute costs.
– Use zero-copy cloning and time travel to create copies of data without duplicating storage costs.
– Use data sharing and external functions to access data across accounts and platforms without moving or copying data.
– Set data retention policies, and account for Fail-safe, so that old data that is no longer needed is deleted or archived.
– Use the Snowflake credits estimator and optimizer to estimate and optimize your costs based on your usage patterns.
Snowflake Cost Reduction: Tuning Tactics
To optimize Snowflake’s performance, apply tuning tactics that improve query speed and reduce costs:
– Choose clustering keys that match your query patterns to reduce the amount of data scanned and improve query performance.
– Use result caching, which allows Snowflake to reuse the results of previous queries without having to recompute them.
– Use warehouse scaling to adjust the size and number of warehouses (compute clusters) that execute your queries.
If you are using Snowflake at any scale, manually monitoring cloud usage will quickly become unmanageable for account administrators. Use automated tooling to understand and optimize your Snowflake spend, and to build a solid FinOps foundation.
It’s a common enough problem that dedicated tools for solving Snowflake cost challenges have started to appear.
Simplify Your Data Warehousing Performance Without the Need for Optimization Techniques
Snowflake is undoubtedly a leader in cloud data warehousing. However, consistently getting the performance you need is not always as straightforward as Snowflake makes out.
With its extreme efficiency and higher performance at lower cost, Yellowbrick eliminates the need for many of the optimization techniques discussed here.