Yeah, I think that, sadly, there is going to be a bit of an inevitable equivalent of the Unix wars of the early '80s. The sooner we can reach a standard place, the better it's going to be for the container community and developers more generally.
One of the reasons that I pushed hard to get Kubernetes open sourced is the hope that we could get out in front of this and allow the developer community to rally around Kubernetes as an open standard, independent of any provider or corporate agenda.
We've spent a lot of time working with the Kubernetes community. I can only speak to our experience, but Brendan, Craig, and the rest of the team at Google have 100% lived up to the commitment of treating the Kubernetes project as truly open and independent.
Our Kubernetes dashboard was recently merged into Kubernetes [1]. We brought our own vision of a web UI to the project, and we could have gotten bogged down defending technology decisions and philosophical nits. Instead, the response from Google, Red Hat, and others in the community was basically "Awesome! How soon can we get it in?"
All of the key players have the right approach, and that gives me confidence in the project's longevity.
I'm curious, @caniszczyk, why would it need to become independent outside of Google? It's already an Apache-licensed open-source project hosted on GitHub.
In essence, having diversity in ownership can help the project have a long life instead of being governed by one entity. There's a lot of risk that the main entity in charge will act in its own self-interest instead of the interest of the project (and its constituency) over the long term.
Independent ownership and proper governance will set up the project for long-term success, and as a small company, you should prefer it to be that way.
I'm extremely pleased that Kubernetes has been open sourced by Google. It truly seems to me that the developer community is, and will remain, able to rally around Kubernetes as an open standard without fear of any outside agendas, as Brendan so eloquently stated. I for one applaud Google's level of transparency when it comes to the future of the project and the overall product vision.
I'm wondering if it was intentional or subconsciously accidental that you went with the "I, for one" construction... which is of course usually suffixed with "welcome our new [adjective] overlords".
Each pod has its own IP address that is routable anywhere in the cluster. This makes life much easier because you don't have to do port-forwarding onto the host node.
In all current k8s setups, each Minion/Worker node has a subnet that it allocates these Pod IP addresses out of. This isn't necessarily a hard requirement, but it tends to make things much easier, since you only have O(Workers) routes to configure instead of O(Pods). Long term, though, I think we would rather do away with subnets per node and simply allocate IP addresses for each Pod individually.
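To make the route math concrete, here's a minimal sketch in Python (hypothetical CIDRs and node names, not any particular k8s default): carve one /24 per node out of a cluster-wide /16, and the cluster route table needs just one entry per node, no matter how many Pods each node runs.

    import ipaddress

    # Hypothetical cluster-wide Pod CIDR; real deployments choose their own.
    cluster_cidr = ipaddress.ip_network("10.244.0.0/16")

    # One /24 per worker node, carved out of the cluster CIDR.
    nodes = ["worker-1", "worker-2", "worker-3"]
    routes = dict(zip(nodes, cluster_cidr.subnets(new_prefix=24)))

    # O(Workers) routes, regardless of Pod count:
    for node, subnet in routes.items():
        print(f"route {subnet} via {node}")

    # Each node then hands out individual Pod IPs from its own subnet,
    # without ever touching the cluster-wide route table.
    print("first Pod IP on worker-1:", next(routes["worker-1"].hosts()))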
This is also very true. The purpose of systems like Kubernetes/Mesos is to be set up once by a company, with most developers then just interacting with the CLI tool to run containers. Tools like Google Container Engine make this into a software-as-a-service product, where the cloud provider provides the API. In that world, as a developer, you never really think about the OS, just your app. But unlike in traditional PaaS, there is no framework that restricts what you can code (language, libraries, etc).
The main difference between a PaaS like Heroku and a general-purpose cluster manager service like GKE is that the former can simplify by making certain assumptions about your workload.
For example, if a service knows your container is serving a web application, then it can sensibly provision load balancers, DNS, HTTPS, auto-scaling, static content caching, automatic QPS monitoring, etc., to support your app with little explicit configuration. And with a commercial service you get an SLA for those things.
You can get much of this with a general-purpose cluster as well, but of course you need to configure it yourself, and, more importantly, debug it when it goes awry.
Sadly, in the Cedar stack, Heroku did away with Varnish and static asset caching on their "Dynos". Instead, all assets bundled with your app are served using Ruby and dyno time. While a simple and clean architecture, this either raises the cost or forces you to place static assets on other servers (or use CDNs). And you'll still need to debug someone else's stack just as much -- when you get a service-specific error for exceeding memory limits, or requests take too long because of instance-level queues, etc.
At this point, I never pick a PaaS because it's less work; I pick it simply because I've used it before and it's easy to get started with. Production apps are never hands-off if you're the one developing them ;-)
> But unlike in traditional PaaS, there is no framework that restricts what you can code (language, libraries, etc).
I'm not sure how traditional PaaSes are limited in that sense. Heroku accepts a wide variety of software through the buildpacks mechanism. Cloud Foundry supports both buildpacks and Docker images, with clean extension points to add further mechanisms -- for example, .NET-on-Windows deployments, which are under development.
Disclaimer: I've worked on Cloud Foundry and I'm employed by Pivotal.
It really depends on how you do deployment. Containers provide deployment (and, more importantly, rollback) that is better than other deployment tools like Puppet/Chef/... because they are atomic (they either work or they fail; they don't get stuck in the middle) and they package up all of their dependencies within them, so they don't have "well, it worked on my machine" problems.
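To illustrate just the atomicity idea, here's a minimal sketch (a generic release-pointer pattern, not any specific container tool's mechanism): each release is an immutable directory, "deploy" is one atomic pointer swap, and rollback is the same swap in reverse, so you are always fully on one version or the other.

    import os

    def deploy(release_dir, current_link="current"):
        """Atomically repoint `current` at an immutable release directory."""
        tmp = current_link + ".tmp"
        if os.path.lexists(tmp):
            os.remove(tmp)
        os.symlink(release_dir, tmp)
        # rename(2) is atomic on POSIX: readers see either the old release
        # or the new one, never a half-deployed state.
        os.replace(tmp, current_link)

    deploy("releases/v2")  # ship it
    deploy("releases/v1")  # rollback is the same one-step operation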
Systems like Mesos and Kubernetes decouple applications from the individual machines (and the operating system on those machines), and are online systems with self-healing properties (so that they will fix themselves rather than waking you up in the middle of the night).
k8s and Mesos turn containers into an API that spans an entire fleet of machines, and enable you to dynamically use (and re-use) that fleet for multiple different applications. No more dedicated boxes for MySQL, Mongo, etc. This in turn enables you to have an easier ops experience, because every single machine in your fleet is homogeneous (same OS, same patches, etc.). OS management is abstracted away from application management, so that they don't interfere with each other. Since things in the API are expressed in terms of applications, it's easy to add health checks and automatic restart to the system, and provide self-healing properties as well. Both Kubernetes and Mesos also make replication a first-order primitive so that it is easy to scale in response to load.
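The self-healing part boils down to a reconciliation loop. Here's a toy sketch (with hypothetical start_replica/healthy helpers, not the real k8s or Mesos API) of converging observed state back to a declared replica count:

    import random

    desired = {"web": 3, "worker": 2}    # declared state: app -> replicas
    running = {"web": [], "worker": []}  # observed state: app -> replica ids

    def start_replica(app):
        # Stand-in for "schedule a container somewhere in the fleet".
        rid = f"{app}-{random.randrange(10000)}"
        running[app].append(rid)
        return rid

    def healthy(rid):
        # Stand-in for a real health check probing the container.
        return random.random() > 0.1  # pretend ~10% of checks fail

    def reconcile():
        for app, want in desired.items():
            # Drop replicas that failed their health check...
            running[app] = [r for r in running[app] if healthy(r)]
            # ...then start new ones until we're back at the declared count.
            while len(running[app]) < want:
                print("starting", start_replica(app))

    for _ in range(5):  # a real system runs this loop forever
        reconcile()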
Ansible is sort of orthogonal to systems like Kubernetes and Mesos. Kubernetes and Mesos are designed to be online, self-repairing systems. Ansible is a way to easily execute commands on a bunch of machines. I can see collaborative use cases, where you generally use Kubernetes for deployments, but use Ansible for querying some data while debugging, or somesuch.
Anyway, sorry for the extended response. There actually is way more that I could say about the topic ;)
Thanks for the detailed response. The point about atomic deploys is a good one. However, if we all agree that immutable deployments are a good thing, I've been wondering how that is fundamentally different from launching new instances via the AWS/Rackspace/Google APIs. Is there really a fundamental difference between shipping a Docker container to a set of servers and just relaunching/rebuilding servers?
You also mention "no dedicated MySQL, Mongo boxes", etc. I am 100% for that. However, how can you really make that work with databases/systems like MySQL, which were fundamentally designed to work on one machine, where scaling across machines is usually very painful, or at least, let's say, not very "idiomatic" (if I can use that word here lol)? I can see the auto-scaling part working with distributed databases like Riak/Cassandra, but even there the solution is not clear-cut and "out of the box". It still feels like some manual orchestration work is needed - correct me if I'm wrong.
I can totally see the "online, self-repairing" point, but only for application servers that were designed from scratch to be easily scalable by just adding servers. That's the case for most scripting languages (PHP/Ruby/Python et al.) and for well-designed JVM/CLR/native/Go applications as well. However, I would argue scaling the app boxes/containers is the EASY part. Again, you can always scale up by "just" copying machines (I know, it's always harder than that). The hard part comes with managing the database servers, or your cluster of messaging queues, or some other stateful thing that has to persist data SAFELY. Does Kubernetes/Docker REALLY make my life easier with those kinds of things? Is the answer to use DynamoDB/BigTable/RDS/managed queues and forget about the hassle of managing a database or a queueing cluster? Looking for answers :) . Thanks!
> I've been wondering how that is fundamentally different from launching new instances via the AWS/Rackspace/Google APIs. Is there really a fundamental difference between shipping a Docker container to a set of servers and just relaunching/rebuilding servers?
There are two perspectives on this:
* From a user's perspective, ignoring performance implications, containers should be the same as VMs.
* From a hardware perspective, however, containers are much more lightweight in terms of CPU, RAM, and disk space, since they all share at least the kernel.
Why should you care? Well, if something consumes fewer resources, it means that (a) you could run it for less than what you pay for VMs, or (b) it can be sold to you for less. There are some additional benefits, like fast "boots" and strict decoupling of persistent and non-persistent storage (which I find an advantage; restarting a container cleans up whatever you don't care about), but in the end I think it comes down to money.
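Back-of-the-envelope, with made-up overhead numbers purely to show the shape of the argument:

    # Toy density math; real per-VM and per-container overheads vary widely.
    host_ram_mb = 32 * 1024      # a 32 GB host
    app_ram_mb = 512             # what each workload's app actually needs

    vm_overhead_mb = 768         # assumed guest kernel + OS image per VM
    container_overhead_mb = 32   # assumed per-container overhead (shared kernel)

    print(host_ram_mb // (app_ram_mb + vm_overhead_mb))         # -> 25 VMs
    print(host_ram_mb // (app_ram_mb + container_overhead_mb))  # -> 60 containers

Same hardware, more than twice the workloads, which is where the money argument comes from.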
To add to this: sure, you can spin up EC2 instances and you don't have to worry about atomicity. However, what happens if you want to run more than one service per VM? That's where containers allow you to take full advantage of the system.
> No more dedicated boxes for MySQL, Mongo, etc. This in turn enables you to have an easier ops experience, because every single machine in your fleet is homogeneous (same OS, same patches, etc.)
Is that a good thing? Granted, the AWS account I used to manage was 8-10 machines at most, which is probably well below the point where Kubernetes makes sense, but I remember that it was useful to configure different kinds of machines for MySQL than for Apache (disk needs were vastly different, for instance) and that the MySQL instance for the web server had different needs than the MySQL instance for offline data processing.
You have to design your applications bearing in mind where they run (AWS, bare metal...), and now you have to design your applications bearing in mind the "Datacenter OS", which is fine, but adapting solutions to new ways of doing things takes time.
To me, unless you have a big fleet of machines, these systems are total overkill... but I guess time will tell.
> To me, unless you have a big fleet of machines, these systems are total overkill
I think that's an important point, and one which container vendors are not going to labour, as they want as many people as possible on their platform, even before they really need it. A lot of people who really don't need to are trying to use Docker or CoreOS, and as they're not the focus of containerisation efforts, they'll suffer as they find out these tools aren't really tailored to what they want to do: just get their small web service running reliably with a minimum of fuss, and be sure they can rebuild it or move it between providers easily.
If you have 1-10 machines which don't change much, use Ansible or similar to get predictable (re)deployments and don't worry about using containers.
If you have more than, say, 10 machines, this sort of stuff becomes more useful, because you are herding cattle and need the infrastructure necessary to keep that herd going even if a few die off from time to time. Then you can scale to hundreds easily as your business grows, manage lots of workers reliably on one VM in containers, etc.
For probably 90% of websites out there, with a sane setup that's never even going to become an issue, and they could run easily on just a few servers.
That's not true. What happens if one of your servers dies?
You either heal it back or you shoot it.
Shooting is way faster, and Docker can help you with that.
Also, 1-10 servers can be a lot; it really depends on how much stuff you need to run.
Also, Docker adds "some" security.
Docker isn't a perfect fit, but on our site we run a combination of Ansible and Docker (without CoreOS) and are very happy.
In our AWS cloud we have another system which only uses Fleet and CoreOS; the cluster upgrades itself, which is a big plus, but it doesn't work that well in our internal infrastructure with proxies, firewalls, etc.
(Also, can you update your blog to point to the hyperkube:v0.14.1 image instead of :dev? :dev is a random binary from my client, whereas v0.14.1 is an official release... Thanks!)
I have also tried Kubernetes a few times, and got stuck everywhere. I was unable to find a "this is how you build a Kubernetes system from scratch" document anywhere. All I managed to find were very specific how-tos for specific system combinations, none of which fit my needs. I tried a few times to build the systems on the documentation list, but ran into issues at each step.
My biggest worry and issue with running Kubernetes in production is the overall workflow of standing up an environment, which seems to be:
1. download some images/dockerfiles
2. [magic]
3. profit
I cannot find a document anywhere that tells me what components are used, how they interact, what settings are required, etc. I'd love to be able to give kubernetes a try and see how it would work for our service, but am having a very hard time getting the right level of detail. It appears to be either "go the [magic] route", or "read all the code" with little in between.
If I had some kind of pointer, I'd be happy to write something up about how to get it running for a prod setup...
I think it's too early for you to think about Kub in production by the sounds of it.
This whole area is still in the early stages; any documentation you see on specifics are likely to be soon out of date.
If you try the hyperkube command you'll see very many command line options, and I can only see that growing.
To an extent though, there is a certain amount of magic when downloading images that do stuff for you. For example, the scheduler is a pod that just starts up on the master. What exactly it does, I've no idea yet.
To quote the kub github page:
"Kubernetes is in pre-production beta!
While the concepts and architecture in Kubernetes represent years of experience designing and building large scale cluster management systems at Google, the Kubernetes project is still under heavy development. Expect bugs, design and API changes as we bring it to a stable, production product over the coming year."
> I think it's too early for you to think about Kub in production by the sounds of it.
Maybe this is getting lost in translation, but I find this line to be somewhat condescending. I am asking for some hint as to where I can find documentation that goes beyond "here, run the docker/vagrant/VM image, and have fun" - I want to know what is what, which pieces talk to which pieces, and how. I am specifically not asking "is this ready for production?", which is a decision I am happy to make for myself, on the basis of the research I hope to do.
Just to be clear: I am an (admittedly increasingly unfashionable) infrastructure guy. I am pretty sure that if I follow your examples, I can get something up and running that allows me to feed some DSL into some tool, and have a running set of containers. Not interesting to me. I am happy to believe that this works, and am happy to take people's word for it.
I want to know and learn about required infrastructure, about failure domains, about networking requirements, about load and overhead, and ultimately about "if I build something like this, what are going to be the issues in making sure it will never go down". I see a lot of high-level architecture, which appears to segue quite suddenly into "now do magic, and here is how you then start a pod". I am asking about this in-between stage, as I have been unable to find this.
> This whole area is still in the early stages; any documentation you see on specifics are likely to be soon out of date.
I would expect any documentation to be at least somewhat relevant to the version it is released with. It isn't important for us to be on the latest and greatest - it is important for us to know and understand how the stuff we use behaves and is to be operated, especially in failure modes. If my only recourse during a failure is "well, let's try restarting a container running something critical and see if the problem goes away", then indeed it isn't ready for anything other than being a toy.
I do see the likes of Kismatic and now Tectonic making moves to run this as a production system, so somewhere, somehow, it must be possible to stand up a system that has not had all the key decisions made for you, and that would allow you to build something ready for a particular environment.
Kismatic have actually released some packages that appear helpful in pulling the pieces apart, and I will be looking at those to get a better understanding of what does what.
> To quote the kub github page: "Kubernetes is in pre-production beta!
Yeah, I saw that, thanks... From the perspective of the Kubernetes team, "pre-production" likely means that they have not yet evaluated many probable edge cases for many different use types. This is important to them, not so much for me. What is important for me is that my workload works. It is a lot easier and faster for me to test this (and feed back the results, thus helping the project towards production status) on the basis of an infrastructure I know and understand than it is for me to try to put together some [magic] and figure out why stuff isn't working. Case in point: I followed one of the VMware install guides. At the "now run this image, and xyz will happen" step, nothing happened. There was no reasonable possibility of troubleshooting, as the image in question was/is a black box, and I could find no documentation outside of the "invoke these magic incantations" variety.
To be very honest, I am not too bothered - there are plenty of viable alternatives that do the same or similar things. We are evaluating a large list of possible environments, and kube got crossed off the list pretty quickly, which is a pity as it looks interesting.