Habitat and Workload Placement

The previous articles in this series discuss why intelligent application automation is important. In this article we'll talk about how Habitat packages work with various strategies for workload placement, with a focus on how Habitat applications can run in container-based infrastructure.

The Habitat build process includes an optional post-processing step that can create an image for any runtime environment from the Habitat package. For example, Habitat can create Docker container images, Application Container Images (ACIs), or Marathon-native images for Apache Mesos. Eventually this approach could be extended to create Amazon Machine Images (AMIs), Cloud Foundry buildpacks, or any other image format. Habitat is agnostic about which image format you use, or whether you use one at all.

Let's look at the case of Habitat creating container images. After the Habitat build process has created a new container image from a Habitat package, you can deploy the resulting container with whatever mechanism you prefer, including container scheduling systems (such as Kubernetes and Mesosphere) or cloud-based container management systems like Amazon EC2 Container Service (ECS). Container images created by Habitat are smarter than other containers: they expose all of the automation intelligence of the Habitat app. For example, Habitat components can self-organize with peers into leader/follower relationships, and dynamic configuration changes can propagate to all running components of the application using robust, decentralized, peer-based protocols.

Of course, as we said earlier, containers are just one execution option with Habitat. Habitat also supports workload placement strategies that are not based on containers. Applications packaged with Habitat can be deployed on virtual machines (VMs) and in non-container PaaS environments. For example, the post-processing step could just as easily create an Amazon Machine Image (AMI). The resulting image could be launched using any orchestration system that works with Amazon Web Services (AWS). Integration with any other cloud or virtualization system would work similarly.
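To make the idea of one package and many image formats concrete, here is a minimal sketch in Python. It is not Habitat code: the Package class, the exporter functions, the format names, and the placeholder strings they return are all hypothetical, invented for illustration. It only shows the shape of the design, in which the package is built once and the image format is chosen later.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Package:
    """Hypothetical stand-in for a built package (not a Habitat API type)."""
    origin: str
    name: str
    version: str

    @property
    def ident(self) -> str:
        # Illustrative identifier format, assumed for this sketch.
        return f"{self.origin}/{self.name}/{self.version}"


# Each exporter maps the same package to one image format.
# The returned strings are placeholders; a real exporter would build an artifact.
def export_docker(pkg: Package) -> str:
    return f"docker-image:{pkg.ident}"


def export_aci(pkg: Package) -> str:
    return f"aci:{pkg.ident}"


def export_ami(pkg: Package) -> str:
    return f"ami:{pkg.ident}"


EXPORTERS: Dict[str, Callable[[Package], str]] = {
    "docker": export_docker,
    "aci": export_aci,
    "ami": export_ami,
}


def export(pkg: Package, image_format: str) -> str:
    """Pick the image format at deploy time; the package itself is unchanged."""
    if image_format not in EXPORTERS:
        raise ValueError(f"no exporter registered for {image_format!r}")
    return EXPORTERS[image_format](pkg)


pkg = Package(origin="example", name="webapp", version="1.0.0")
images = {fmt: export(pkg, fmt) for fmt in EXPORTERS}
```

In this sketch, supporting a new runtime means registering one more exporter. Nothing about the package or the application inside it has to change.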

With Habitat, there's an underlying principle at work: the concerns of data center management are separate from those of application management. This insight is powerful. First, it means you don't have to worry about a particular runtime until you're ready to deploy. Second, once you've decided on that runtime, you can take advantage of whatever resource management capabilities and tools make sense in your situation.

In other words, use the operational management system, which is part of the runtime environment, to do what it does best—allocating hardware resources in an optimal way for the desired load level. Use Habitat to manage and monitor the application.

As is often the case with software systems, a carefully chosen separation of concerns pays off in a big way. To understand this, let's go back to the case of containers and look at the main capabilities of container management systems.

Container management services
Workload placement and job scheduling (optimizing for rack affinity, power consumption, etc.)
Dynamic resource allocation (autoscaling, workload consolidation and migration, fault-tolerant restart, etc.)
Software-defined networking (SDN), including VPN, load balancing and peer-connectivity
Integration with external services (persistent disk storage, etc.)
Enterprise authentication and authorization
Management dashboards (GUIs)

You can see that the items on this list are really about the data center and not about the application. This makes sense, because operating your data center efficiently is mission-critical, but it leaves a gap that needs to be filled for application-level configuration and management. Also, when container management systems do support application management, they use a command-and-control approach that treats the workloads themselves as passive components that must be acted upon. As we explained in "Why Package the App and Its Automation Together," we Habitat folks definitely favor a distributed approach over a centralized one, and we think applications should have the intelligence to manage themselves.

Not all of the many container management solutions take the same approach. There's a spectrum from IaaS to PaaS. Many cloud vendors, such as Amazon, Microsoft, and Google, have offerings that target the IaaS space and make it easy to consume infrastructure.

Container management that has more of a PaaS flavor is arguably more application focused, but it comes at a high cost. These systems have deep hooks into the application itself, and they tie the application to that particular PaaS platform. Becoming cloud-ready then means "rewrite your application to fit a proprietary platform." You pay a cost both in the initial conversion and in the long-term risk of being tied to a particular vendor's ecosystem.

As we developed our own philosophy of managing applications in the world of microservice workloads, we in the Habitat community decided that the following precepts should be the basis for what is under the control of Habitat and what is under the control of the runtime environment.

Convergence still matters. With Chef, nodes converge to a particular state. Convergence is just as important with containers, particularly for runtime configuration changes. Typically, in container management systems, the only time you can make configuration changes is at the beginning, when you set up the container. If you change the configuration after that, you've modified the container in a completely unknown, unsafe way.

With Habitat, configuration changes are convergent. You inject them as rumors into the system, and everything will eventually converge to that new state.
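The convergence idea can be illustrated with a toy simulation in Python. This is a sketch under stated assumptions, not Habitat's actual protocol: the Member class, the integer version counter, the fanout of two peers per round, and the sample configuration string are all hypothetical. Each member pushes its current state to a few random peers, a peer accepts a rumor only if it is newer than what it holds, and the group ends up agreeing on the newest configuration.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Member:
    """Hypothetical peer holding one versioned configuration value."""
    name: str
    version: int = 0
    config: str = "initial"

    def receive(self, version: int, config: str) -> None:
        # Accept a rumor only if it is newer than what we already hold.
        if version > self.version:
            self.version, self.config = version, config


def gossip_round(members: List[Member], rng: random.Random, fanout: int = 2) -> None:
    """Each member pushes its current state to `fanout` randomly chosen peers."""
    for sender in members:
        peers = [m for m in members if m is not sender]
        for peer in rng.sample(peers, k=min(fanout, len(peers))):
            peer.receive(sender.version, sender.config)


def converge(members: List[Member], rng: random.Random, max_rounds: int = 100) -> int:
    """Gossip until every member holds the same state; return the rounds used."""
    for round_no in range(1, max_rounds + 1):
        gossip_round(members, rng)
        if len({(m.version, m.config) for m in members}) == 1:
            return round_no
    raise RuntimeError("members did not converge within max_rounds")


rng = random.Random(42)  # fixed seed so the run is repeatable
ring = [Member(f"node-{i}") for i in range(10)]

# Inject the configuration change as a rumor at a single member.
ring[0].receive(1, "max_connections=200")
rounds = converge(ring, rng)
```

No central controller tells each member what to do here. The change enters at one member and spreads peer to peer, and a stale rumor can never overwrite a newer one because of the version check.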

Automation should prevent vendor lock-in, not cause it. With Habitat, you can use whatever runtime environment suits you best, and you can make that decision when you're ready to deploy the package. Because the application is independent of the runtime, you can move your application to a new environment when it's to your advantage, and you don't have to retool your application to do it.

The workload is the unit of deployment, not the image format. With Habitat, it's the workload that's immutable. The image format is irrelevant. This means that, with Habitat, update strategies are under the control of the package's supervisor, not some external management system.

All runtime environments are good. We at Habitat love these runtime environments and we want to make sure that Habitat applications can run in all of them. Over time, we'll have more and more integrations that will make this easier.