Many SaaS businesses are perfectly happy to let customers shoot themselves in the foot if it generates more revenue. The BigQuery example (presently, by default, `select * from table limit 10` obediently scans the entire table at your expense!) is spot-on.
As the article so well puts it, every SaaS company has a vested financial interest "to leave optimization gremlins in."
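The BigQuery behavior is easy to model: on-demand pricing bills by bytes scanned, and a `LIMIT` clause caps the rows returned, not the bytes read. A toy sketch of the billing math (the $5/TB rate is illustrative, not a quote of current pricing):

```python
# Toy model of BigQuery on-demand billing (illustrative, not Google's code).
# On an unpartitioned table, LIMIT does not reduce the bytes scanned,
# so it has no effect on cost.

TB = 1024 ** 4
PRICE_PER_TB = 5.00  # USD; illustrative on-demand rate, check current pricing

def query_cost(table_bytes, limit=None):
    """Cost of `SELECT * FROM table [LIMIT n]` on an unpartitioned table."""
    bytes_billed = table_bytes  # full scan either way; LIMIT caps rows, not bytes
    return bytes_billed / TB * PRICE_PER_TB

full = query_cost(10 * TB)               # SELECT * FROM table
limited = query_cost(10 * TB, limit=10)  # SELECT * FROM table LIMIT 10
print(full, limited)  # both 50.0: the LIMIT saved nothing
```

Partition filters (or clustered tables) are what actually cut the bytes billed.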
In practice, as has been pointed out in other comments, they do improve their performance (for competitive reasons), and it does cost them money when they do. They did it a couple of quarters ago and left $97 million on the table.
https://www.fool.com/earnings/call-transcripts/2022/03/02/sn...
My own experience with Snowflake absolutely backs up the article's point. At my work we routinely encounter abysmal performance for certain types of queries, due to a flaw on Snowflake's side. We have had numerous talks with them and there is no question that they have an issue, but they have shown absolutely no urgency to fix it. Their recommendation is that we spend more money to work around the problem on their end.
And you’re right. The motivation Snowflake has to improve is survival. It’s not like their architecture is impossible to replicate. Redshift is doing a total reorganization and rewrite of the product to compete more directly with Snowflake (Redshift AQUA, etc.).
They also seem to completely discount the value of SaaS: outsourcing database and storage operations to Snowflake, whose only focus is operating the database product. Running your own clusters is an exercise that seems smart in the first few months; then, like a puppy that grows up, you’re stuck with a dog. If you love dogs and train them well, then great. But the fact is most people are terrible dog owners, and the same is true for MPP clusters. Being able to focus exclusively on query management operations is really ideal. Highly stateful distributed products are a PITA.
He also rants about Snowflake not telling him the hardware. Snowflake runs in EC2, GCP, and Azure; you can practically guess the hardware, since there just aren’t that many suitable instance types for that sort of workload. Discussing SSD vs. HDD is also an obvious sign of ignorance: the basic premise is that it does very wide, highly concurrent S3 GETs and scans of the data, using a FoundationDB metadata catalog to help prune. Being in AWS, it’s implausible they use HDDs, and realistically they could skip local SSDs entirely (I do not remember whether they use local disks for caching, but it’s stateless regardless).
The unit costing being hardware-agnostic is totally normal too: they don’t have to expose the details of their costing because they normalize it to a standard fictional unit.
I agree that if the performance of one of them fell behind the others for any prolonged period of time, the cost to the laggard in market share would be much, much worse than the short-term revenue gain of "being slow on purpose".
It benefits no one except a couple thousand people to play their customers so blatantly. In fact, it's worse, as it incentivizes the same behavior in other market actors in the space.
We have seen many, many examples of executives who are willing to sacrifice the future of the company for a personal short-term gain. Jacking up revenues (or slashing costs) in ways that alienate customers is a great strategy when you plan to jump off with your golden parachute in a couple of years when all your stock options vest.
Agreed, but the author does have one thing right: Snowflake is not transparent about product behavior, which makes it hard to reason about costs and performance.
Open source data warehouses like ClickHouse and Druid don't have this problem. If you want to know how something works, you can look at the code. Or listen to talks from the committers. This transparency is an enduring strength of open source projects.
Snowflake competes on marketing.
Plenty of people rave about Snowflake and have never heard of Databricks, BigQuery or Redshift.
I suspect most data warehouses have similar NDRs.
In many companies a data warehouse is the place where you dump all your data and let everyone run poorly written programs against it.
Add to that the poor engineering culture in data teams (often led by non-technical people), and costs are bound to skyrocket.
Everyone from consultants, SAs, sales, and support is constantly working toward getting customers to “optimize” their spend. Of course any business wants you to give them more money. But none of us are pushed to get customers to spend money on services or methods that do things inefficiently.
I specifically work in consulting, specializing in “application modernization”. That means most of my implementations are cheap, and I’m constantly spending time making sure my implementation is as cheap as possible while still meeting the requirements. I first noticed this attitude from AWS when I was working for a startup.
This isn’t just with AWS. I spent years working in enterprise shops and saw the same attitude working with Microsoft.
I can’t speak for any other large organizations - AWS and Microsoft are the only two I’ve worked with as either a customer or employee where there was huge spending on infrastructure or software.
Now I could easily get started about my opinion of Oracle from the customer standpoint. But I won’t.
Just another way that vendor lock-in occurs (intentionally or otherwise).
It depends on the time scale. A SaaS optimizing for, say, a 1-3 year financial return will see its interests through a different lens than one optimizing for a multi-decade return. Leaving optimization gremlins in isn't aligned with customers' interests in the long run, so customers will eventually find alternatives if the SaaS doesn't align itself with them.
[0] https://cloud.google.com/bigquery/docs/querying-partitioned-...
Or even better engage with a neutral third party such as Jepsen to get on an even playing field and duke it out.
It's like the cloud in general: the cost is high, but so is the hype. When all that dust settles over the coming years, businesses will start shopping on price. They will then realize they have been locked in to some extent and will need to start wriggling loose of the lock-in.
I found the Snowflake statement pretty reasonable. [0]
Vendor benchmarks are largely propaganda. What actually counts is performance on real-world workloads, starting with your own. Plus, good benchmarks are costly to do well. If vendors are going to invest in load testing, it's far better to do it as part of the QA process, which directly benefits users. The other thing for vendors to do is drop DeWitt clauses so others can run benchmarks and share the results. Snowflake announced this in the statement and also changed their acceptable use policy accordingly. [1]
[0] https://www.snowflake.com/blog/industry-benchmarks-and-compe...
[1] https://www.snowflake.com/legal/acceptable-use-policy/
Disclaimer: My company runs a cloud service for ClickHouse that competes against Snowflake.
* being easy to manage
* being able to scale compute up and down, so you can get good performance without having to keep a bunch of machines running
This bit me on BigQuery's public patent search, which I was just noodling with for fun. Each query was $4. Ow!
Standard disclaimer: I work in ProServe at AWS.
When you “consult” and are employed by the company selling the software, billable hours and utilization are not the be-all and end-all. Consulting is just the “nose of the camel in the tent”: they want you to be as efficient as possible so they can make ongoing revenue.
Trust me, AWS is not going to complain if it only took me 20 hours to do work that was estimated for 40 and brings in half as much consulting revenue if it means ongoing revenue from the customer.
There isn’t just a singular focus on utilization rates.
My billable hours do fine while I make operations more efficient and less costly.
The best way to describe Snowflake is as a brute-force method to run complex queries without creating indexes.
If you have a more traditional database, you will notice you need to set up indexes to be able to get anything from it in finite time. What if you don't know the indexes upfront? What if you want your users to be able to ask arbitrary queries and get answers before bedtime?
That's what Snowflake is for. It automates throwing an ENORMOUS amount of hardware at your query to get it executed fast, very inefficiently.
It is not free, though. That inefficiency means queries consume a lot of resources. It is meant for those few queries where your users try to get some insight into your data and you can't predict the indexes beforehand. Sometimes this is exactly what you want, like when you let your data people in to figure stuff out. Or when you have the very rare functionality that allows users to build their own queries -- which you should avoid like hell (and there are tricks to make it index pretty well) but can't always avoid.
For everything else, whenever you can predict your indexes, you always want to use a more traditional database, which can be very efficient on queries properly supported by indexes.
The issue is that a lot of people try to use Snowflake as an application database, or to support frequently executed queries of the same kind. This is bad, and it will cost you.
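The trade-off described above can be shown with a toy contrast (pure illustration, not Snowflake's architecture): a pre-built index answers one known query shape quickly, while a brute-force scan answers any predicate by touching every row.

```python
# Toy contrast of the two strategies: an index answers a known query shape
# in ~O(log n); a brute-force scan answers *any* predicate, but reads everything.
import bisect

rows = [{"id": i, "country": "US" if i % 3 else "DE", "amount": i * 10}
        for i in range(100_000)]

# "Traditional database" approach: a pre-built index on one known column.
index = sorted((r["id"], r) for r in rows)
ids = [k for k, _ in index]

def lookup_by_id(target):
    i = bisect.bisect_left(ids, target)
    return index[i][1] if i < len(ids) and ids[i] == target else None

# "Snowflake-style" approach: no index, just scan everything, for any predicate.
def brute_force(predicate):
    return [r for r in rows if predicate(r)]  # touches all 100k rows

assert lookup_by_id(42)["amount"] == 420  # fast, but only works for id lookups
de_total = sum(r["amount"] for r in brute_force(lambda r: r["country"] == "DE"))
```

Snowflake's trick is making the second strategy fast by throwing wide, parallel hardware at it; the toy version just shows why it costs so much more work per query.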
What I found, however, is that Snowflake is indeed super cheap if we look at Total Cost of Ownership (TCO). Compared with other cloud data warehouses, it is even easy to control costs (warehouse sizing with auto-suspend, plus resource monitors).
I work with many Snowflake customers, and the biggest cost they are concerned with is usually training users so they don't shoot themselves in the foot (wrong joins, external programs "pinging" the service, ...).
Snowflake is mainly expensive because of usage, not because of bad query optimization.
(Co-Founder at https://www.sled.so/)
It seems totally natural to expect these use cases to be well-supported and cost-efficient. The fact that they're not is, I think, likely to surprise a great many people, even technical folks.
1. I like Snowflake, and I think they brought several innovations to the field: instant scale out/up, time travel, unstructured data query support.

2. Snowflake obviously makes innovations and performance improvements, otherwise they would not be the market leader they are. But I also suspect that they make just enough performance improvements to stay on par, and then use vendor lock-in features to make switching hard.
My argument is that their rate of performance innovation has gone down considerably, and Databricks, Firebolt, and open source alternatives just seem more attractive on a cost/performance basis. I agree that Snowflake is still the best data warehouse to start with if you have 100k, but not if you truly plan for a multi-year horizon and your usage expands.
- Redshift also brought a lot of innovation that allowed people to execute analytical queries 100x-1000x faster than any OLTP database out there. I used Redshift for four years, and they kept ignoring performance and features until Snowflake came out. All of a sudden, because of competitive pressure, they put more effort into the product to maintain and gain market share. My hope is that Snowflake finds a solution to their innovator's dilemma, since competitors are hot on their heels.
- Some people point out that 70% usage growth just shows that Snowflake is useful. Nobody disagrees with that. The issue is that the majority of companies don't experience 70% revenue growth to keep up with the growth in costs. At some point, you have to clamp down on costs, which means you have to look for alternatives to run things more efficiently.
Re: Firebolt, I don't consider it to be in the same class as Snowflake whatsoever (even though their advertising seems to indicate otherwise). Snowflake is like a very powerful Swiss Army knife. Firebolt is good for a very specific (dare I say niche?) workload but falls all over itself for the vast majority of a data org's needs.
It runs SQL queries on structured data. Is that niche?
I think you are misunderstanding something very fundamental here. Snowflake has usage pricing, and no one is forcing companies to use Snowflake 70% more every year. In my experience, companies are typically evaluating spend on other platforms and, after some testing, moving additional workloads over to displace cost elsewhere. Say your Snowflake bill was $100k, and you were unhappy with your security data lake provider and replaced a $1M bill there with $200k of Snowflake. Your Snowflake bill has now increased 200% to $300k, but you are still $800k ahead overall. In other words, your existing workload (the original $100k) didn't get more expensive.
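Spelled out, the arithmetic in that example looks like this (the dollar figures are the hypothetical ones from the comment above):

```python
# A growing Snowflake bill can still mean a shrinking overall bill
# when it displaces spend elsewhere.

snowflake_before = 100_000   # original Snowflake bill
other_vendor = 1_000_000     # security data lake bill being replaced
snowflake_added = 200_000    # cost of that workload once moved to Snowflake

snowflake_after = snowflake_before + snowflake_added              # 300_000
growth = (snowflake_after - snowflake_before) / snowflake_before  # 2.0, i.e. +200%
net_savings = other_vendor - snowflake_added                      # 800_000 ahead
```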
I've worked in data warehousing for a lot of years now, and stepping back, I guess I don't understand what you are trying to accomplish here. I certainly think everyone should take a "trust but verify" approach with their vendors, but honestly, I don't think you've proven your case, especially since you appear to completely ignore the competitive reality these vendors live in. Beyond that, I don't think "speeds and feeds" are the most important improvements going on with these platforms at the moment. Check the monthly release notes:
BigQuery: https://cloud.google.com/bigquery/docs/release-notes
Databricks: https://docs.databricks.com/release-notes/product/index.html
Snowflake: https://docs.snowflake.com/en/release-notes.html
Performance is important but it doesn't exist in a vacuum. What percentage of features in the past two months for each of these platforms relate to performance? On the flip side, how much does your company spend on things like data governance? How much would a data breach cost? How many people maintain the platform? What do pipeline failures cost? How is connectivity to other solutions your company uses?
If you look at where innovation is happening (and this is a VERY interesting space these days), the bulk of improvements are in areas arguably more important to companies. BigQuery has added migration improvements, Databricks has added Photon and Unity Catalog improvements, Snowflake has added Java and Python stored procedures. The list is miles long for all of these vendors and I challenge anyone in the space to keep up with everything.
Another comment here said all of these vendors are within 10-20% performance of each other. If that is true, in my opinion you're focused on a problem that is an edge case at best. Something to watch, but not nearly as interesting or as impactful as the rapid pace of innovation across this space in all areas. IMHO.
Fair point, some of that net revenue increase is due to consolidation of workloads, although the majority of the cost is likely still driven by consumers expanding usage beyond what they expected. As I mention in my article, the second part of the increase in costs has to do with data governance, and my argument is that Snowflake doesn't make governance easy. Why can't they stand up an IAM-like service with a nice UI and dashboards? Why can't they make integrations with PagerDuty, Slack, and email work out of the box? Why can't I specify team-based budgets, instead of having to do it on a per-warehouse basis? Why do I have to build custom bespoke tooling on top to make governance work?
I can unequivocally say that at a certain scale you need to move on and that Snowflake and many of the SaaS providers are too expensive even at medium scale companies. This article describes this paradox better than I could: https://a16z.com/2021/05/27/cost-of-cloud-paradox-market-cap...
Moreover, Snowflake's enterprise pricing model scales even worse. Why do companies often have to pay twice the price per credit relative to the standard model? Shouldn't guarantees on security or support come at a fixed cost? Shouldn't enterprise plans offer economies of scale in pricing?
I also wish folks would read my article from end to end, because my conclusion is that you don't really have a choice but to use an enterprise solution when your scale is small. If I had to start my own company with only 2 data engineers, you betcha I would use Snowflake and Databricks.
--- Btw, it really surprises me that nobody has commented on the workload manager. Am I the only one seeing that as an issue? I have enough exposure to compare it with Redshift, and I can say that Snowflake's workload manager is just very bad at optimizing throughput.
For the exact reason that the article claims Snowflake wouldn’t innovate, I’d assert that they would. If they are expensive and slow, and a competitor is faster and cheaper, eventually they will see business move to the competitor. We see it all the time.
Alternatively there is a faster impact on new sign-ups when falling behind competitors on costs and benchmarks.
And large customers are moving to them in droves.
Snowflake lets you roll into pay-as-you-go after a contract expires.
I don't know the market at all, but Snowflake is certainly large and successful (IPOed in 2020, $50bn market cap). I could readily imagine that a company doing so well might not feel the incentive to improve very strongly. Or that they might see themselves more as a sales/marketing-led company than one where technical quality is a key driver. Whereas you folks as a challenger would have a lot more incentive to differentiate yourselves.
Snowflake is not expensive because of perverse incentives, which is the primary claim of the article. It is expensive because it is a highly differentiated and very sticky product.
As others have mentioned, competition is the ultimate incentive to work on performance. Every dollar of Snowflake revenue is a dollar of revenue that Amazon, Google, Microsoft and Databricks are fighting for.
This is true, but misses one detail...
Snowflake runs in the cloud so every dollar of Snowflake revenue is roughly $0.40^1 of Amazon/Google/Microsoft revenue anyway.
^1: Snowflake's gross margin is in the range of 50-60%: https://www.macrotrends.net/stocks/charts/SNOW/snowflake/gro...
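Back-of-the-envelope, the $0.40 figure follows from that margin range: cost of revenue, which for Snowflake is largely cloud infrastructure spend, is what's left after gross margin.

```python
# With a 50-60% gross margin, 40-50 cents of each revenue dollar
# goes to cost of revenue (mostly cloud infrastructure spend).
def infra_share(gross_margin):
    return round(1.0 - gross_margin, 2)

low, high = infra_share(0.60), infra_share(0.50)  # 0.4 to 0.5 per revenue dollar
```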
It eats/consolidates formerly-disparate costs around the org. Because it's so good.
Which makes it look expensive.
That said, I'm not sure your comment is fully accurate:

1) "lack of query level attribution of costs": Snowflake doesn't charge per query, so there can't be default query-level attribution of cost; Snowflake charges by the second of warehouse use. But you CAN easily see which queries ran on which warehouse and allocate costs back using your own criteria (by query-seconds, usually better than by number of queries).

2) "no in-built features for monitoring": Snowflake has built-in cost monitoring dashboards (https://docs.snowflake.com/en/user-guide/cost-overview.html) and resource monitors (https://docs.snowflake.com/en/user-guide/resource-monitors.h...).
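A minimal sketch of the chargeback approach in point 1, assuming you've already pulled per-query execution times from Snowflake's query history (the field names here are illustrative, not the actual schema): take a warehouse's cost over some window and allocate it to users in proportion to query-seconds.

```python
# Sketch: allocate a warehouse's cost to users in proportion to
# the execution seconds their queries consumed on it.
from collections import defaultdict

def allocate_costs(warehouse_cost, queries):
    """queries: [{"user": ..., "seconds": ...}] run on one warehouse."""
    total = sum(q["seconds"] for q in queries)
    costs = defaultdict(float)
    for q in queries:
        costs[q["user"]] += warehouse_cost * q["seconds"] / total
    return dict(costs)

queries = [
    {"user": "etl",      "seconds": 900},
    {"user": "analyst1", "seconds": 60},
    {"user": "analyst1", "seconds": 40},
]
print(allocate_costs(100.0, queries))
# etl gets 90.0, analyst1 gets 10.0 of a $100 warehouse bill
```

This is a rough model (it ignores idle warehouse time and concurrency), but it's the shape of what most teams build on top of the billing data.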
That said, I'm sure improvements could be made. Ask for them. There must be a market for this because Capital One and Acceldata and others offer similar solutions for optimization recommendations.
Snowflake/Databricks scale infinitely across cloud object stores like S3. ClickHouse runs as a single (or sharded) process that uses the local file system like any other SQL database and requires volume provisioning as your data scales. It also has a fixed running cost (EC2 or wherever it's hosted), versus an "on-demand" model where read clusters are spun up to run queries against static objects that have no fixed cost other than storage pricing.
However, it's probably not a great pick if you're already struggling with the operations side of things, which seems to be the main selling point for services like Snowflake.
I don't think there's really a right or wrong answer here, just trade-offs.
Disclaimer: I work on Altinity.Cloud, a platform for managed ClickHouse
0 - https://clickhouse.com/docs/en/sql-reference/functions/date-...
Me: My builds are really slow
CircleCI: Here are a few very low effort answers
Me: git checkout is taking literally 60 seconds, but it takes 3 seconds locally, why?
CircleCI: Mumble Mumble.
They charge per minute, so why would they care if builds are slow? It was about a year of this getting worse and worse, until I finally cancelled the service last week and built my own server in my basement.
I now get 200% faster builds, and the hardware payback time is not very long (6 months of my CircleCI bill?).
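For what it's worth, the payback math is simple; the figures below are made-up stand-ins, not the commenter's actual bills:

```python
# Hypothetical payback calculation for a self-hosted build server.
monthly_ci_bill = 500.0   # assumed CircleCI spend per month (made up)
hardware_cost = 3_000.0   # assumed cost of the basement server (made up)

payback_months = hardware_cost / monthly_ci_bill
print(payback_months)  # 6.0 months under these assumptions
```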
I think it's a huge red flag anytime the metric you care about is one where getting "worse" makes the provider more money.
Always try to find partners or counter parties who win when you do as well. I know we don’t always have that luxury but sometimes a little headache initially is better than being stuck with someone who works in opposition to you in the long run.
Thanks so much for sharing your story. We are in the process of outsourcing some of our Jenkins functionality and these stories are useful to hear.
Rule of thumb: Anyone talking about their honesty is not honest.
https://github.com/philips-labs/terraform-aws-github-runner
philips-labs has some good resources for scaling this up as well.
Not to mention the constant failures.
It's worse than just not caring: they have a direct financial incentive to make sure your builds are as slow as you'll tolerate.
I ended up doing TeamCity over Jenkins, but they do the same thing.
Amazing how fast a 32C/64T EPYC server in my basement can be.
It's usage-based pricing and customers are using more of it.
> a customer that joins a year ago and spends $1 is paying out well over $1.7 a year later
The entire article is based on this 1.7x "net dollar expansion" statement.
After integrating Snowflake, customers have found value in using Snowflake and are using more of it 1 year later.
Since Snowflake is billed on usage, that explains the net-dollar expansion.
It’s also very simple to manage and optimise so less DBA or DevOps type manpower.
Then of course you can perfectly right size your instances and pay by the second for compute and by the byte for storage.
Expensive, but lower TCO than alternate approaches I suspect.
Like any other service there are scale points where it no longer makes sense but for most smaller orgs it's still a bargain over DIY
I think people are falling into a trap of not considering costs because “it takes care of everything”.
> Then of course you can perfectly right size your instances and pay by the second for compute and by the byte for storage.
These two are connected vessels.
I know a bit about the effort involved in chucking around 100 petabyte datasets, and there are numerous niches a SaaS could fill in there, but it’s very murky from the outside.
> The best way to describe Snowflake is that it is a brute force method to run complex queries without creating indexes.
I guess I’m trying to get a read on whether their core competency / moat is distributed columnar query technology or sales/support/marketing.
However, the generally accepted wisdom there was that improving performance had always led to more builds being run, and so still came out as a net positive. This had happened a bunch of times as we upgraded CPUs, storage drivers, or the software version: there'd be a short-term drop in direct revenue, but then it would bounce back quickly as people took advantage of being able to do more in the same amount of time.
I'm told the revenue and finance people were pretty concerned the first time it happened though!
Most dev teams are underinvested in CI. That is, if you queried some random team, they'd probably have a dozen ideas for tests or processes they'd like to write/run if they had the resources, most of which would provide some real value - the ideas likely coming from some previous actual bugs that hit prod.
Most BI teams are overinvested in data. They have way more than is valuable. Large scale analysis is mostly exploratory and speculative, and rarely yields results. Any induced usage is more from fear they might throw away the magic bits than real value being unlocked by better efficiency. (And I think this is probably necessarily true. Any BI process that gets to the point the data is clear and regularly actionable also gets operationalized and right-sized through a more normal dev process.)
I think Snowflake is (still) expensive because it is a venture-backed enterprise software company and goes through a typical trajectory...
Story goes like this: founders are product-driven and first movers -> find PMF -> need VC funding -> VCs only fund enterprise software ventures with 70%+ gross margins and high retention rates -> product/service gets priced to achieve these metrics -> VCs happy to fund sales & marketing machine needed to obtain sales growth, nobody cares about profitability until after IPO -> startup is everyone’s darling until ~2 years after IPO.
Then: economic crisis hits, customers become more price sensitive, competition intensifies. Plus now management is exposed to quarterly pressure of financial markets to deliver on top-line and margin expectations.
Meanwhile a bunch of startups are building (lower priced) alternatives. Perhaps not as mature or feature-rich as Snowflake, but good enough for 80% of use cases that Snowflake covers.
Therefore the assertion that Snowflake is not optimizing their product sounds a bit crazy to me. It would be optimizing for short-term gain while jeopardizing its reputation as the leader in the space. Obtaining excessive margins through excessive pricing only works under monopolistic conditions, or with a truly distinctive product. Neither is the case, imo. Also, it's early days. I'm not exactly sure what Snowflake's market share is, but I bet it is < 5%, so they haven't locked everyone in yet...
I bet that Snowflake will be forced to compete "also on price" in the next five years because free enterprise is a powerful thing. The title of the article could be “Why Snowflake is (still) expensive but will get more affordable over the next few years”..
This is not true. Snowflake has done just that - it has continuously improved performance resulting in reduced credit consumption and revenue from customers on a unit compute/storage basis. And it has negatively impacted their revenues and stock price. Snowflake's incentive is to strengthen their competitive position and to hopefully generate more long-term revenue from their customers.
The CFO forecast a $97 million shortfall when guiding for 2022 revenue, resulting from product improvements. Snowflake stock dropped immediately after.
See Q4 transcript -- https://www.fool.com/earnings/call-transcripts/2022/03/02/sn...
"Similarly, phased throughout this year, we are rolling out platform improvements within our cloud deployments. No two customers are the same, but our initial testing has shown performance improvements ranging on average from 10% to 20%. We have assumed an approximately $97 million revenue impact in our full-year forecast, but there is still uncertainty around the full impact these improvements can have. While these efforts negatively impact our revenue in the near term, over time, they lead customers to deploy more workloads to Snowflake due to the improved economics."
Also see the Bloomberg article -- https://www.bloomberg.com/news/articles/2022-03-02/snowflake....
"Snowflake Inc., a software company that helps businesses organize data in the cloud, dropped the most ever in a single day Thursday after projecting that annual product sales growth would slow from its previous triple-digit-percentage pace.
Executives said improvements to the company’s data storage and analysis products will let customers get the same results by spending less, which will hurt revenue in the short term, but attract more clients in the future.
“The full-year impact of that next year is quite significant,” Chief Executive Officer Frank Slootman said on a conference call Wednesday after the results were released. But “when customers see their performance per credit get cheaper, they realize they can do other things cheaper in Snowflake and they move more data into us to run more queries.”"
FWIW, Keebo (https://keebo.ai/) tries to solve this problem & reduce your Snowflake bill by using Data Learning techniques. It can be configured to return exact results or approximate results.
I don't see AWS changing so dramatically that companies like Databricks are put in hot water (but I could be wrong), but I could see Snowflake improving its product due to competition, putting Keebo in a tough situation.
The bit about Snowflake not being incentivized to care about costs is trivially untrue. The rest of the article perceives trade-offs as simple feature gaps.
For example, Snowflake gives the user more latitude to distribute workloads among “warehouses” than other offerings. With poor distribution the author will experience the workload provisioning issues he describes.
Ops. Unless your core competency is running reports and Spark nodes, it's probably cheaper to outsource the management of Spark and friends than to hire people to keep it always up and running. To be fair, I haven't touched Spark in many years, but having to page someone good enough at Spark to debug why a job stopped at 3am isn't fun.
I think as an end user I would absolutely agree on this point. But many companies use Databricks as part of their automated backend systems that they resell to customers. The cost per "DBU" unit is astronomical for the amount of raw compute in use. It feels a bit like running a restaurant where you serve takeout.
What ops am I missing?
It's a tradeoff. It might cost less dollars but more time. The time and expertise to run their own clusters effectively is not something every org can or desires to do.
Would love to know the TCO trade-off between procuring, securing and deploying on your own clusters vs having them managed via SaaS.
What it can do, successfully, with three engineers was previously impossible with dozens.
What IS expensive is not being careful with it.
The trend in the data space currently is for usage to increase -- as more companies adopt dbt, they're running more and more prebuilt queries (materialized views) on a scheduled basis, rather than on demand. This is overall a good thing in that data is becoming easier to manage and use, but it does come with an increase in warehousing costs.
I think eventually the pendulum will swing back to tools that help optimize warehouse usage, as long as they allow for the same increase in productivity as dbt (disclosure - I work for one such company)
I’m also not sure I understand the dig at streamlit dashboards. If you’re running hardware and introduce new read workflows, eventually you’ll need more read replicas and you’ll pay more for it. Maybe you can argue that snowflake is doing this at a higher cost but the metric data is not available in the sources to make that claim.
EDIT: There it is: https://www.snowflake.com/
Data warehousing, basically.
I'm genuinely curious and would appreciate anyone who could show a real life example of this kind of pipeline where data is accumulated, then processed, then turned into revenue at the other end.
I've implemented systems that do this but my experience is that accumulating data is (too) easy, processing it in a meaningful way is slightly more challenging but ultimately driving positive business processes according to this data, which require a lot of friction with employees (training, procedures, maintenance, support) is the most difficult part.
Every business in every market needs to understand what is going on with its processes. How many sales did I do yesterday, last week, last month compared to last year, and in which stores? What is the average basket amount? What do customers buy together? What size t-shirt do I sell the most? Etc.
That being said, Snowflake is also pushing a marketplace model where you publish your app natively, moving your code to where the customer's environment is. If that becomes successful, performance might not be one of the incentives for companies to go with Snowflake, and the switching cost might be higher, as companies will embed more of their business logic in the system.
Vantage just launched this - https://www.vantage.sh/blog/vantage-launches-snowflake-suppo.... The problems the author describes are almost exactly what we heard from customers:
- list of users/queries that are the most expensive
- alerts and notifications for costs
- query timeout. Not something a third party can do but there is an interesting 'query tagging' feature for snowflake which Vantage supports.
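For reference, Snowflake's query tagging is just a session parameter, so any client (dbt can set it from its project config) can stamp its queries; a minimal sketch, with an arbitrary tag string:

```sql
-- Tag every query issued in this session; the tag is free-form text that
-- later shows up in the QUERY_TAG column of query history
ALTER SESSION SET QUERY_TAG = 'team:analytics;job:nightly_build';
```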
Let's consider Snowflake in this paradigm
- Problems: analytics on data that is not laid out in a way that's directly accessible for analysts.
- Resources: SQL analysts, few or no competent data engineers, spare cash
- Outcomes: run analytics at an industrial scale without requiring competent engineers or DevOps.
Since Snowflake's optimal client gets very easily locked in, it follows that saving said client's money is not something even the client would care about.
The issue with dbt models in Snowflake is that if you ever perform a full refresh and don't sort the output, you ruin any natural clustering that arises from an incremental model. I've run into this issue many times. Auto-clustering gets too expensive at scale, and Snowflake doesn't give you much guidance on alternatives.
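One mitigation is to sort explicitly in the model itself, so even a `--full-refresh` rebuild lands in roughly the same micropartition layout. A sketch of a hypothetical incremental model (the `event_id`/`event_ts` columns and source names are made up for illustration):

```sql
{{ config(
    materialized='incremental',
    unique_key='event_id'
) }}

select *
from {{ source('raw', 'events') }}
{% if is_incremental() %}
  -- only pull rows newer than what's already in the table
  where event_ts > (select max(event_ts) from {{ this }})
{% endif %}
-- keep rows physically ordered on the pruning column; without this,
-- a full refresh scatters rows across micropartitions
order by event_ts
```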
Small nit: Redshift isn't open source. I would also add Clickhouse, Citus, and TimescaleDB as majorly capable open source technologies with commercial offerings in this space.
If they improve performance they can lower the cost to customers, which will make the product more attractive to prospective customers. But if they are already swimming in cash they may not feel the need to gain more customers.
Only threats prompt companies to improve things. Threat of a competitor, threat of losing all their money, threat of bad PR, threat of regulation, threat to the stock price, etc.
I see this every day in companies that don't care about managing their cloud costs. They waste money like crazy because they literally don't care if they lose money, because some exec doesn't care, or they got enough funding until the next round, etc. A couple years later another exec asks why the CISO/CTO is spending so much money without any ROI, and then everybody has to stop everything they're doing to shave pennies off cloud costs.
Companies run by individual executives are insane. I don't understand why people allow companies to be run this way. I think a co-op where employees could be active participants in the running of the company would allow for more sane decision-making.
Certainly Snowflake wants to make it easier for people to spend money and solve all their problems on its platform; every company wants that. But it's a very competitive world out there, and Snowflake's leaders aren't complete idiots -- they have to keep lowering their prices when they can, otherwise new entrants will come along and do things cheaper.
Nonetheless I agree with the basic points of the article.
Another example of misaligned incentives is LinkedIn. LinkedIn charges $3/message. The more messages sent on their platform, the more money they make. They are not incentivized to help sales or recruiters target the right people. It can be a cash cow in the short term, but it creates a negative experience for your users.
The fact that it has worked for so long is a testament to how strong network effects are.
In the case of Snowflake, high switching costs will protect them for a while.
I work for AWS in billing, and the way we calculate bills is to try to get the customer the maximum discount.
Things like calculating savings plan coverage from smallest to largest to maximize utilization, or turning on Reserved Instance sharing by default within an org.
I would say that the seemingly gouging behavior is more often than not technical or time constraints.
I’d be very interested to hear the Snowflake side of this decision, but to the customer it’s simply unforgivable to have cosmetic constraints on a database.
And now "XxxOps" is a meaningless buzzword.
>"Snowflake has no incentive to push a code change that makes things 20% faster because that can correspond to 10–20% drop in short-term revenue" Completely untrue. There is constant optimization of the scheduler, execution process, global services, and compute fabric. The famous "we shipped AWS Graviton and it's like 10% cheaper" was something we did to ourselves. There is work underway to make FoundationDB faster/more efficient too that's totally out of this world. In short, nobody wants to burn extra CPU cycles and bill you for it.
>"Disclose Hardware Specs" This isn't hard to find if you work with Snowflake's SE and Services, but it's not going to give you anything. The whole POINT of Snowflake is to hide all this nonsense and make it "just work". You want CPU and SSD metrics, feel free to use Databricks (many do) or whatever.
Now, there IS something to be said about some sort of observability into query execution as it is going. There are constant discussions on that, and some of the new upcoming features (like programmatic access to query profiler) can open that up. But yeah, Snowflake is NOT something that will open up what's under the hood and it is super intentional
>"Not adopting benchmarks" This goes around and everyone freaks out. Just profile your own work. Whatever. Nobody cares about benchmarks.
>"Optimizer gremlins" Snowflake COULD do more to expose some of the internals. My job (and the job of 100s of my services and technical SE colleagues) is to help customers understand what's happening under the hood. Some of the company's "make it simple" ethos COULD be a bit more open. However, many of the common issues (micropartition pruning) can be solved by simple user education. I've lost count of how many customers I worked with who had zero education in Snowflake, and even a 20-30 minute intro made them open their eyes and go "woah, I get it now". On the other hand, dozens of people told me that it was amazingly easy to use without training, and it IS!
>"Improve the workload manager to increase throughput" Workload manager is considerably more complex and sophisticated than this guy tells us it is. I saw an internal presentation on its internals that I asked to convert to a confluence article which thankfully happened pretty quickly and lots of people benefitted. There is cost-based scheduling that takes expected resources of queries to schedule and also considers actual resources consumed, all very frequently and for every XP. I wish that article was public but I think it will not be made one, but still, it's definitely not FIFO.
>"Not providing observability to monitor and reduce costs" This is valid feedback now and constantly what we do in services. New manageability features are coming to help with this. See CapitalOne or bunches of companies in this ecosystem.
>"What could companies that use Snowflake do better?" I agree with the point about education. A huge portion of people using and abusing Snowflake don't have any formal education in it. The best thing you can do is hire Snowflake PS or get a partner/SI, or just take a damn class; they are REALLY good.
Source: 2 years in services at Snowflake with focus on perf, cost, and manageability.
I spent a number of months last year focused on lowering Snowflake spend. In the process I learned a ton about Snowflake and gained a fair amount of respect for the product. Respect as in "this is really great" as well as respect as in "I need to be on guard here or I'm going to get hurt."
I think my biggest misconception at the outset was thinking of Snowflake like it's a relational database. It's not. Or rather, it is with a large number of caveats. Snowflake doesn't have b-tree indexes -- rather it has "clustering keys," which are sort of like coarse-grained indexes that colocate data in micropartitions, allowing queries to do micropartition pruning. If you have a well clustered table and you're filtering on your clustering keys, things will be great. But if not, or, for example, you have to do multi-table joins on non-clustered columns, you'll suffer. So unless you have search optimization enabled (which costs more!), you have to retrain yourself away from the "oh, just add an index here or there to make things fast" type of thinking you may have had working with Postgres or whatnot.
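Concretely, a clustering key is declared on the table, and Snowflake exposes a function to check how well clustered the table actually is (table and column names here are illustrative):

```sql
-- Colocate rows with similar key values into the same micropartitions
ALTER TABLE events CLUSTER BY (event_date, customer_id);

-- Returns a JSON summary of clustering depth/overlap for those columns
SELECT SYSTEM$CLUSTERING_INFORMATION('events', '(event_date, customer_id)');
```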
Regarding the author's complaints about lack of observability, I generally found it pretty easy to analyze what was going on via the query_history table. And the built in query analyzer is quite helpful. We did add tags to our dbt runs, which was pretty easy, and I wrote a handful of queries to find like the most expensive dbt models. It wasn't really that hard.
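That kind of analysis boils down to a single query against the account usage views; a rough sketch (`execution_time` is in milliseconds, and grouping by tag assumes you've tagged your dbt runs):

```sql
-- Rank last week's workloads by total execution time, a rough cost proxy
select query_tag,
       warehouse_name,
       count(*)                  as runs,
       sum(execution_time) / 1e3 as exec_seconds
from snowflake.account_usage.query_history
where start_time >= dateadd('day', -7, current_timestamp())
group by query_tag, warehouse_name
order by exec_seconds desc
limit 20;
```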
That said, dbt in particular provides a number of foot guns wrt Snowflake. Subqueries, as the author mentions, are one. We created some custom dbt macros so that instead of `select * from foo where x in (select * from blah)` -- if blah was small -- we'd do a query on blah and write the query using a literal list, like `select * from foo where x in ('a', 'b', 'c', 'etc...')`.
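A macro along those lines can be sketched with dbt's `run_query`; this is a hypothetical reconstruction of the idea, not the commenter's actual macro, and it only makes sense when the inner relation is known to be small and string-valued:

```sql
{% macro inline_small_list(relation, column) %}
  {# Replace `x in (select ...)` with a literal list by running the inner
     query at compile time. Only safe for small, stable relations. #}
  {% if execute %}
    {% set rows = run_query("select distinct " ~ column ~ " from " ~ relation) %}
    ('{{ rows.columns[0].values() | join("', '") }}')
  {% endif %}
{% endmacro %}

-- usage in a model:
-- select * from foo where x in {{ inline_small_list(ref('blah'), 'x') }}
```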
Another issue we discovered is that in dbt it's trivial to create views. But we found that if views get too deeply nested, Snowflake can't adequately do predicate pushdown. So big stacks of views on views are suboptimal.
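The usual fix is to materialize the intermediate layer so downstream queries scan precomputed micropartitions instead of expanding a stack of view definitions; in dbt that's a one-line config change (model and column names below are made up):

```sql
-- was: materialized='view'; flattening the view stack here lets Snowflake
-- prune micropartitions instead of inlining nested view SQL
{{ config(materialized='table') }}

select customer_id,
       sum(amount) as total_spend
from {{ ref('stg_orders') }}
group by customer_id
```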
Another interesting one was tests. Dbt makes it trivial to perform null or uniqueness checks against a column. We found we were spending a lot on those tests that simply were doing something like `select * from blah where col is null`. On non-cluster key columns or complex views, these were causing full table scans. We took a number of steps to mitigate those issues. (Combining queries; changing where we did these checks in the dag). The way tests are scheduled is problematic as well. One "long pole" test will keep your warehouse up and using credits even after the other 99.9% of the tests have completed. After some analysis we separated long pole tests from the others and put them on different warehouses.
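Combining checks like that means one table scan instead of one per test; a sketch of the idea using Snowflake's `count_if` (table and column names are illustrative, not the commenter's actual schema):

```sql
-- One pass over the table covers several dbt-style tests at once
select count_if(order_id is null)          as null_order_ids,
       count_if(customer_id is null)       as null_customer_ids,
       count(*) - count(distinct order_id) as dup_order_ids
from analytics.orders;
```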
I could go on and on, actually, but I think that provides a taste of some of the complexities involved. Like almost any tool, you have to really understand it to use it effectively. But it's all too easy for, say, analysts, who may be blissfully unaware of the issues above, to write really poorly performing SQL on Snowflake.