There is no real-world site reliability benefit to declarative over imperative. They are just software design paradigms. You still have to design the software to not suck.
People spend loads of engineering effort (years, really) to get that controller to function properly under every weird state case, while the neckbeard with a shell script, hot failover master-master write nodes and an ELB gets five nines without needing a CS degree.
I think the benefit is not needing to grok, debug or re-invent every neckbeard's special shell script. Just declare your desired state and move on to other things.
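To make "declare your desired state" concrete, here is a minimal sketch of the diffing step a declarative tool does for you. It is a toy model, not how any particular tool is implemented; the `reconcile` function and the resource names are made up for illustration.

```python
def reconcile(desired, actual):
    """Return the (action, name) steps needed to make `actual` match `desired`.

    Both arguments are plain dicts mapping a resource name to its settings.
    This is a hypothetical toy model, not any real tool's API.
    """
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))   # declared but missing
        elif actual[name] != spec:
            actions.append(("update", name))   # exists but differs
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))   # exists but no longer declared
    return actions


desired = {"web": {"size": "small"}, "db": {"size": "large"}}
actual = {"web": {"size": "tiny"}, "cache": {"size": "small"}}

steps = reconcile(desired, actual)
for action, name in steps:
    print(action, name)
```

The person writing `desired` never has to write the create/update/delete logic; the flip side, as the parent says, is that someone still has to get that logic right for every weird state.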
What concerns me somewhat about Pulumi is we will lose some of the "standardization" that comes with the widespread use of Terraform.
Like anything else, it is possible to write Terraform in an incomprehensible way. However, overall I have found that Terraform "constricts" things into being done a certain way. This allows engineers to jump into a new TF codebase and grok what is happening fairly easily.
Now with Pulumi, we lose that constraint, and now have to grok not only the different languages that infra is defined in, but also the wide variety of different "styles" that a language can be written in. An engineer joining from another company or team now has to deal with a new language and a new code style/organization in order to wrap their head around the infra. Not a desirable property when it comes to managing the infrastructure for a company.
While it is appealing that a developer can use a language they are already familiar with to manage infra, that only holds true if the language being used for Pulumi is one they are already familiar with.
I could of course be totally wrong. Regardless, I am excited to see a new contender in the space.
Some of the Terraform codebases I have worked on have been wildly over-complicated and take a good hour to grok. And I literally spend 10 times more of my day writing a new "standardized" Terraform module than I do writing a shell script that does the same thing.
And Terraform doesn't even support declarative configuration management!!! The syntax is declarative, but it literally can not change the infrastructure to be the way you describe, if it has changed outside your state file (which can happen at literally any time).
> Some of the Terraform codebases I have worked on have been wildly over-complicated and take a good hour to grok. And I literally spend 10 times more of my day writing a new "standardized" Terraform module than I do writing a shell script that does the same thing.
Yep, which is why in the comment you replied to I said: "Like anything else, it is possible to write Terraform in an incomprehensible way." You can write muck in anything.
Also, an hour sounds like a pretty reasonable amount of time to grok a totally new codebase of any significant complexity, but maybe I'm just slow. I can assure you that trying to grok the equivalent mess in a language and style you have never worked in before isn't going to be any faster.
> And Terraform doesn't even support declarative configuration management!!! The syntax is declarative, but it literally can not change the infrastructure to be the way you describe, if it has changed outside your state file (which can happen at literally any time).
I'm not sure what you mean by this. The first thing Terraform does before a plan or apply is refresh the state file with the real-world status of resources.
Terraform can only manage the resources that are explicitly defined in HCL files, in the exact way that they are defined (including based on a given module hierarchy). So if anything is either not mapped in an HCL file, or something changes in the files in an unexpected way, Terraform may refuse to do anything at all. And often this is based on computed values, meaning you don't know it's going to break until you apply, and then you have half-broken infrastructure in production.
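Both points can be true at once, and a toy model may help show why. The sketch below assumes a heavily simplified picture: state, config and the "real world" are all dicts keyed by a resource address, and `refresh` and `plan` are hypothetical stand-ins, not Terraform's actual internals. Refresh catches drift on a resource that is already tracked, but a resource created out of band is never looked at.

```python
def refresh(state, real_world):
    """Re-read every *tracked* resource from the real world (toy model)."""
    return {addr: dict(real_world[addr]) for addr in state if addr in real_world}


def plan(config, state):
    """Diff the declared config against the recorded state (toy model)."""
    actions = {}
    for addr, want in config.items():
        if addr not in state:
            actions[addr] = "create"
        elif state[addr] != want:
            actions[addr] = "update"
    for addr in state:
        if addr not in config:
            actions[addr] = "destroy"
    return actions


config = {"aws_iam_role.app": {"policy": "read-only"}}
stale_state = {"aws_iam_role.app": {"policy": "read-only"}}
real_world = {
    "aws_iam_role.app": {"policy": "admin"},       # changed by hand
    "aws_iam_policy.extra": {"policy": "custom"},  # created out of band, never tracked
}

without_refresh = plan(config, stale_state)
with_refresh = plan(config, refresh(stale_state, real_world))

print(without_refresh)
print(with_refresh)
```

In this model the hand-edited role is put back after a refresh, while `aws_iam_policy.extra` never shows up in any plan because nothing in the config or state points at it.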
The simplest case is when you start renaming modules or moving resources in and out of different modules. Terraform will get confused and try to destroy everything rather than modify in place, even though it's the same resource, or it will sometimes just fail altogether, like it's trying to resolve some module that used to exist but no longer exists. Basically it doesn't comprehend that its logical resources are really real-world things, and leans so heavily on its logical mapping (based on things like module inheritance, which is a Terraform logic thing, not a real-world AWS resource thing) that it often just becomes unusable, and you have to perform heroics of moving about pieces of code and importing various things to work around it, if it works at all.
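The rename problem falls out of keying everything by logical address. Here is a rough sketch, again a toy model with a hypothetical `plan_with_moves` function rather than real Terraform code. When a resource moves into a module its address changes, so a naive diff sees one resource to destroy and a different one to create. Supplying an explicit old-to-new mapping, which is roughly the job `terraform state mv` and the `moved` block in newer Terraform releases do, makes the plan empty again.

```python
def plan_with_moves(config, state, moved=None):
    """Diff config against state, both keyed by logical address (toy model).

    `moved` is an optional {old_address: new_address} mapping applied to the
    state before diffing. All names here are illustrative assumptions.
    """
    moved = moved or {}
    # Re-key tracked resources whose address was explicitly declared as moved.
    state = {moved.get(addr, addr): attrs for addr, attrs in state.items()}

    actions = {}
    for addr, want in config.items():
        if addr not in state:
            actions[addr] = "create"
        elif state[addr] != want:
            actions[addr] = "update"
    for addr in state:
        if addr not in config:
            actions[addr] = "destroy"
    return actions


vpc = {"cidr": "10.0.0.0/16"}
state = {"aws_vpc.main": vpc}                   # address before the refactor
config = {"module.network.aws_vpc.main": vpc}   # same VPC, now inside a module

naive = plan_with_moves(config, state)
mapped = plan_with_moves(
    config, state, moved={"aws_vpc.main": "module.network.aws_vpc.main"}
)

print(naive)
print(mapped)
```

The real-world VPC is identical in both runs; only the bookkeeping differs, which is the complaint above in miniature.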
The bigger problem is when you have a resource which might be associated with another resource, like the various ways to represent IAM policies and roles in Terraform resources. You can create the resource one way and deploy it, and then maybe someone modifies the existing real-world resource in a way that now depends on some other resource... but that resource isn't in a Terraform file. Terraform doesn't know what to do, so it will either clobber the modified state, or just die because it doesn't know how to resolve the conflict.
I would need to go back and curate a list of all the times that Terraform has just failed to do anything because something changed that it didn't expect, but basically it refuses to "fix" things that are unexpected. That, and the lack of automatic importing of existing resources, is just absurd. If Terraformer can make it work, Terraform could have too.