Container based virtualization can provide an impressive amount of isolation whi...

bcantrill · on March 20, 2013

As trotsky mentions, we at Joyent are fervent believers in OS-based virtualization -- to the point that in SmartOS, we run hardware virtualization within an OS container. There are many reasons to favor OS-based virtualization over hardware-based virtualization, but first among these (in my opinion) is DRAM utilization: with OS-based virtualization, all unused DRAM is available to the system at large, and in the SmartOS case is used as adaptive replacement cache (ARC) that benefits all tenants. Given that few tenants consume every byte of their allocated DRAM, this alone leads to huge efficiencies from both the perspective of the cloud operator and the cloud user -- a higher-performing, higher-margin service. By contrast, for hardware-based virtualization, unused DRAM remains with the guest and is simply wasted (kludges like kernel samepage merging and memory ballooning notwithstanding).
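To make the DRAM argument concrete, here is a rough back-of-the-envelope sketch. The tenant names, allocations, and usage figures are invented for illustration, and the model is deliberately simplistic: it assumes every unused byte can be pooled under OS-based virtualization and none can under hardware virtualization (ignoring ballooning and samepage merging).

```python
from dataclasses import dataclass


@dataclass
class Tenant:
    name: str             # hypothetical tenant label
    allocated_gib: float  # DRAM the tenant is paying for
    used_gib: float       # DRAM the tenant is actually using


def unused_gib(tenants):
    """Total DRAM that is allocated to tenants but not currently in use."""
    return sum(t.allocated_gib - t.used_gib for t in tenants)


# Made-up numbers: few tenants consume every byte of their allocation.
tenants = [
    Tenant("a", allocated_gib=8, used_gib=3),
    Tenant("b", allocated_gib=16, used_gib=10),
    Tenant("c", allocated_gib=4, used_gib=4),
]

# OS-based virtualization: unused DRAM stays with the host and can back
# a shared cache (the ARC, in the SmartOS case) that benefits all tenants.
shared_cache_os_gib = unused_gib(tenants)

# Hardware virtualization (simplified): unused DRAM stays inside each guest,
# so nothing is available to a host-wide cache.
shared_cache_hw_gib = 0.0

print(f"OS-based: {shared_cache_os_gib} GiB available as shared cache")
print(f"Hardware: {shared_cache_hw_gib} GiB available as shared cache")
```

Real systems are messier than this (guests run their own page caches, and hosts reserve memory for themselves), so treat the numbers as a way to see the shape of the argument, not as a measurement.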

DRAM isn't the only win, of course: for every other resource in the system (CPU, network, disk), OS-based virtualization offers tremendous (and insurmountable) efficiency advantages over hardware-based virtualization -- and it's great to see others make the same realization!

For more details on the relative performance of OS-based virtualization, hardware-based virtualization and para-virtualization, see my colleague Brendan Gregg's excellent blog post on the subject[1].

[1] http://dtrace.org/blogs/brendan/2013/01/11/virtualization-pe...

zobzu · on March 20, 2013

Solaris zones use similar concepts to LXC/namespaces, but actually provide secure isolation.

Recent patches DO NOT provide "full isolation" and never did. What they add is usermode containers. Those are broken weekly since the release. Seriously. Have a look at http://blog.gmane.org/gmane.comp.security.oss.general

price · on March 20, 2013

> Those are broken weekly since the release. Seriously. Have a look at http://blog.gmane.org/gmane.comp.security.oss.general

Funny you should say that. The latest virtualization-related CVEs there are actually in KVM -- a trio including two host memory corruptions, which usually enables completely owning the host. http://permalink.gmane.org/gmane.comp.security.oss.general/9...

And on the other hand, I don't see any container-related CVEs at all from 2013 in the CVE database: http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=linux+kernel (The KVM issues I mentioned don't show up yet either, because they're from today.) What vulnerabilities are you referring to?

Maybe you mean kernel vulnerabilities in general, some of which could be usable by a user inside a container. Everyone should stay on top of kernel updates in any event. If you hate the rebooting, Ksplice is free for Ubuntu (and Fedora).