WDYM? They were the Hadoop company. You can't just become the Spark company; the philosophies of the products are very different. This comment is pretty silly.
The "only" stumble I can identify is that they're selling a last-generation solution and most companies see Hadoop as tech debt nowadays. Which is to say, it's a systemic issue with their entire product, not a tiny mistake. This is like Mesos vs Kubernetes. One got squashed.
Spark's initial path to success was "a faster way to process your data in HDFS". Cloudera was selling users Spark before Databricks was even founded. The idea was that Hadoop was an ecosystem of tools for processing data built on commodity storage and compute hardware, for when your data was too big and expensive to transfer to the cloud.
Over time it became increasingly popular to use cloud storage instead of running HDFS. This really destroyed Cloudera's moat, because there was no operational overhead to putting your data in S3 or GCS. You just needed to run some stateless compute, and if you fucked up it didn't matter. Nowadays your "data lake" is a bunch of files in commodity storage someone else runs.
Yes, I agree. It isn't really that Spark killed Hadoop, but that S3/GCS made managing Hadoop clusters pointless. Spark plays well with the storage ecosystem, so it's thriving now. But my whole point is that it seems unlikely to me that Cloudera would just have become a compute company if they had invested more in Spark. The core thing they were selling became less and less important over time. That was the problem.
Yeah, branding themselves as the "Hadoop company" made it difficult to get on board with Spark, etc. If they had branded themselves as the "big data company," it would have been far easier to move with the market.
There is probably a business-school case study there for branding yourself with the problem area rather than a single solution, esp. in fast-moving domains such as tech.