Habitat Best Practice Guides

This chapter covers recommended best practices for runtime and buildtime.

We focus on best practices for packages that use Habitat Builder for continuous builds.

Running Habitat on Servers (Linux and Windows)

Habitat can be run on bare metal servers, as well as virtual machines. Currently, Habitat can run on Linux and Windows platforms, and in all cases, running a Supervisor boils down to running hab sup run. How that happens, of course, depends on which platform you choose to use.

Running Habitat on Linux


First, you must install Habitat itself on the machine.

Second, many packages default to running as the hab user, so you should ensure that both a hab user and group exist:

sudo groupadd hab
sudo useradd -g hab hab

Finally, you will need to wire Habitat up to your system's init system. This may be SysVinit, systemd, runit, etc. The details will differ for each system, but in the end, you must call hab sup run.

Running under systemd

A basic systemd unit file for Habitat might look like this. This assumes that you have already created the hab user and group, as instructed above, and that your hab binary is linked to /bin/hab.

[Unit]
Description=The Habitat Supervisor

[Service]
ExecStart=/bin/hab sup run

[Install]
WantedBy=default.target


Depending on your needs and deployment, you will want to modify the options passed to hab sup run. In particular, if you wish to participate in larger Supervisor networks, you will need to pass at least one --peer option.
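Putting the pieces together, installing and enabling the unit might look like the following sketch; the unit file name hab-sup.service is a hypothetical choice, and these commands are environment-dependent:

```shell
# Install the unit file, reload systemd, and start the Supervisor at boot
sudo cp hab-sup.service /etc/systemd/system/hab-sup.service
sudo systemctl daemon-reload
sudo systemctl enable --now hab-sup.service
```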

Running Habitat on Windows

As with Linux, you must first install Habitat on the machine. Unlike Linux, however, the Windows Supervisor has no requirements for any hab user.

On Windows, you can run the Supervisor as a Windows Service. You can use the windows-service Habitat package to host the Supervisor inside the Windows Service Control Manager:

PS C:\> hab pkg install core/windows-service
PS C:\> hab pkg exec core/windows-service install

Running Habitat Linux Containers

Container Setup and Assumptions

When you run hab pkg export docker, you'll get a Docker container that provides a few things. First, a minimal Linux OS filesystem is provided, with just enough configuration (e.g., /etc/passwd, /etc/resolv.conf, etc.) to run. Second, the contents of the exported Habitat package, along with its complete dependency tree, as well as a complete Habitat Supervisor installation, are provided, unpacked, in the /hab/pkgs directory. Finally, an entrypoint script that will start the Supervisor, running the exported Habitat package, is provided, allowing the container itself to behave as though it were the Supervisor.

On Linux, the Habitat Supervisor will normally run as the root user, and will start Habitat services as the pkg_svc_user specified by each service. However, in some deployment scenarios, it is not desirable (or perhaps even possible) to run the Supervisor as root. The OpenShift container platform, for instance, does not run containers as root by default, but as randomly-chosen anonymous user IDs. From version 0.53.0 onward, the Habitat Supervisor can run as an arbitrary user, providing users with more flexibility in how and where they use Habitat.

In order to support this in containers and provide maximal flexibility, the contents of the /hab directory are both readable and writable by the root group. When specifying a user to run a container process as, the user's primary group will be reported as root if no matching group can be found in /etc/group. This will allow the user to create and populate the /hab/sup directory for managing Supervisor state, as well as the /hab/svc directory, which will contain all the service's state. This is in line with recommendations from OpenShift on how to create containers that can run as a non-root user, but nothing in Habitat's implementation is specific to OpenShift; indeed, all the examples provided below use pure Docker.
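You can observe this permission layout in any exported image. This sketch assumes an exported core/redis:latest image is available locally:

```shell
# Inspect ownership and permissions of /hab inside an exported container;
# the directory should be writable by the root group
docker run --rm --entrypoint ls core/redis:latest -ld /hab
```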

Caveats To Running as a Non-Root User

"There's no such thing as a free lunch", as the saying goes, and that holds true here. If the Supervisor is running as a non-root user, any processes that it supervises will run as that same user; any values the package might specify via pkg_svc_user and pkg_svc_group are essentially ignored. Furthermore, any files written out by the service during its operation will also be owned by that same user.


Strictly speaking, the Supervisor does not care what user it is running as; rather, it uses Linux capabilities to guide its behavior. If the process has the CAP_SETUID, CAP_SETGID, and CAP_CHOWN capabilities, it will be able to run processes as the specified pkg_svc_user and pkg_svc_group (CAP_CHOWN is needed to ensure that the service processes can read and write files within the service's state directories). The Supervisor checks for the presence of these capabilities, and does not rely on having a user ID of 0 or the username root.

Container Deployment Scenarios

Running a Habitat Container as root

For completeness, we'll quickly cover the base case. If you are fine with running your container as root, you can do that directly:

docker run --rm -it core/redis:latest

Here, core/redis:latest would be the image exported from the core/redis Habitat package. The Supervisor will run as normal, with supervised processes running as the desired user.

Running a Habitat Container as a Non-Root User

If you cannot run as the root user, but you are fine with root being the container user's primary group, you can simply specify a user ID to run as. This user need not exist in the container itself, and it's better if it doesn't. Using pure Docker, it might look like this:

docker run --rm -it --user=888888 core/redis:latest

Again, we use our core/redis Habitat package container; the user ID 888888 is simply a number chosen at random (this is how platforms like OpenShift operate). No user inside the container has this ID, meaning that the user will be an anonymous user with root as its primary group. Because of how we generate Habitat containers, this fact ensures that the user has write permissions within the /hab directory.

Due to the current logic around package installation, there is an extra step needed if you would like to have your containerized Supervisors update either themselves or the services they supervise. When installing packages as a non-root user, Habitat will download keys and compressed hart files into the user's ${HOME}/.hab directory, rather than the global /hab/cache directory. You will need to ensure that a user-writable directory is mounted into the container, and specify it as the user's home directory using the HOME environment variable. Using pure Docker with a volume that is accessible by the user, that might look like this:

docker volume create --driver local \
--opt type=tmpfs \
--opt device=tmpfs \
--opt o=size=100m,uid=888888 \
test_home

docker run --rm -it \
--user=888888 \
--mount type=volume,src=test_home,dst=/myhome \
--env HOME=/myhome \
core/redis:latest --auto-update --strategy=at-once

This is merely an illustration; use whatever volume management approaches and service update strategies that are appropriate for your container scheduling system and your local deployment.

As illustrated, updates of this kind are completely optional; you may prefer to move update responsibility to your container scheduler and treat your containers as immutable in this regard.

Running a Habitat Container as a Non-Root User in a Non-Root Group

If for whatever reason you do not want your user to be in the root group inside the container, you will need to add some additional volumes in order to create the needed supervisor and service state directories. However, since you will (by definition) not have write permissions on the /hab directory as a whole, your Supervisor will not be able to update either itself or the services it supervises.

To implement this using pure Docker, you could do something like this (the group ID of 999999 was again chosen arbitrarily, as with the user ID):

docker volume create --driver local \
--opt type=tmpfs \
--opt device=tmpfs \
--opt o=size=100m,uid=888888 \
sup_state

docker volume create --driver local \
--opt type=tmpfs \
--opt device=tmpfs \
--opt o=size=100m,uid=888888 \
svc_state

docker run --rm -it \
--user=888888:999999 \
--mount type=volume,src=sup_state,dst=/hab/sup \
--mount type=volume,src=svc_state,dst=/hab/svc \
core/redis:latest

Again, this is just an illustrative example; use the appropriate strategies for your specific circumstances. The key information here is to ensure that both the /hab/sup and /hab/svc directories are writable by the user inside the container.

Running Habitat Windows Containers

Container Base Image

Exported Windows images use microsoft/windowsservercore as their base. This is the equivalent of a minimal Windows Server 2016 Core install, so you should not expect non-default features and roles, such as IIS or Active Directory, to be enabled. Consider using an init hook to install any features needed by your Habitat service.
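For example, an init hook might enable a needed Windows feature before the service starts. This is only a sketch; the hook lives at habitat/hooks/init in your plan, and the Web-Server (IIS) feature name is an assumption for illustration:

```powershell
# habitat/hooks/init -- enable IIS if it is not already installed (sketch)
if (-not (Get-WindowsFeature Web-Server).Installed) {
    Install-WindowsFeature Web-Server
}
```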

Container Pull and Startup time

The microsoft/windowsservercore image is approximately 5GB. Due to this large size, you can expect that the first time you run an exported Habitat service, pulling down the image may take several minutes. This wait should only occur on the very first docker run of any Habitat Windows service. Additionally, depending on the Windows host operating system, running the container may also take considerably longer than what one is accustomed to with Linux-based containers. This startup time is highly influenced by the container isolation mode described below.

Windows Containers and Host Kernel Isolation

There are two types of Windows containers and each runs under a different level of kernel isolation.

Windows Server Containers

These containers, like their Linux counterparts, share the host's kernel. You can expect these containers to start quickly; this is the default container type on Windows Server 2016 hosts.

Hyper-V Containers

Windows Hyper-V containers run inside of a very minimal Hyper-V VM. As a result, they do not share the host's kernel and offer a higher level of security and isolation. The cost of this isolation is that the container will take longer to start - perhaps a noticeable delay. Also be aware that the VM is provisioned with a default 1 GB memory limit. If your service requires more than a gigabyte of memory, you can use the --memory argument with docker run and pass a larger limit.

docker run --memory 2GB -it core/mysql

On a Windows 10 host, Windows containers will always run inside of Hyper-V isolation. Kernel sharing Windows Server containers are only available on Windows Server 2016 hosts. On Windows Server 2016, Windows Server containers are the default container type but you can force docker run to use Hyper-V containers by setting the --isolation argument to hyperv.

docker run --isolation hyperv -it core/mysql

Host Loopback Network

A common container pattern is to forward the container port to a local port and then access the container application by accessing localhost on the forwarded port. With Windows containers, published ports cannot be accessed using localhost. You will instead need to use the IP address of the host or the IP of the individual container to access the application's endpoint.
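For example, a running container's IP can be retrieved with docker inspect; the container name my_app here is hypothetical:

```shell
# Print the container's IP address; access the published port on that address
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' my_app
```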

Robust Supervisor Networks

Habitat Supervisors communicate amongst each other using "gossip" algorithms, which underpin the membership management, leadership election, and service discovery mechanics of Habitat. By simply being "peered" to a single existing Supervisor, a new Supervisor will gradually come to know about all the Supervisors in a Habitat network. The gossip algorithm has built-in features to counteract brief network splits, but care must be taken to set up a robust Supervisor network.

The Initial Peer

While a Habitat Supervisor does not need to connect with any other Supervisors in order to be useful, leveraging a network of Supervisors is really the key to unlocking the full potential of Habitat as a platform. In order to do this, a Supervisor must be given the address of at least one other Supervisor in this network when it starts up; this is known as the "initial peer problem". You might think of a Supervisor network as an exclusive members-only club; you must first know a member in order to become a member yourself.

This Supervisor does not know about any other Supervisors, and will (at least initially) run in complete isolation.

hab sup run

This Supervisor, on the other hand, will start up knowing about three other Supervisors, and will quickly establish contact with each of them. Thanks to the gossip mechanism, it will also find out about every other Supervisor those initial Supervisors know about. Similarly, every other Supervisor will discover the presence of this new Supervisor.

hab sup run --peer=<peer1> --peer=<peer2> --peer=<peer3>

It should be noted that peering is symmetric. Even though our first Supervisor above did not start out peered with any other Supervisors, it can still become part of a Supervisor network if some other Supervisor declares it to be a peer.

Managing Membership with SWIM

In order for Habitat's network functionality to work, the Supervisors must first know which other Supervisors they can communicate with. This is a problem of maintaining "membership lists", and is achieved using the membership protocol known as SWIM. As detailed above, we must first "seed" a Supervisor's membership list with at least one "peer"; that is, another Supervisor that it can communicate with.

Given a non-empty membership list, the Supervisor can begin probing the members of that list to see if they are still alive and running. Supervisor A sends a message to Supervisor B, effectively asking "are you still there?". If Supervisor B is available, it will reply to Supervisor A, also sending contact information for up to five Supervisors that it has in its membership lists (Supervisor A sends these introductions in its initial message, as well). In this way, Supervisors can both maintain and grow their membership lists. In short order, Supervisor A will come to know of all the other Supervisors in the network, and they, too, will come to know of Supervisor A.

If Supervisor A cannot establish contact with Supervisor B for some reason, it does not immediately consider it to be dead. This would be too strict, and could lead to unnecessary service "flapping". Instead, Supervisor A will consider Supervisor B "suspect". In this case, it will ask Supervisor C (another Supervisor in its membership list) if it can contact Supervisor B. If Supervisor C can make contact, it relays that information back to Supervisor A, which will then consider Supervisor B to be alive again, and not suspect. This scenario can arise, for example, if there is a network split between Supervisors A and B, but not between A and C, or B and C. Similarly, network congestion could delay messages such that Supervisor A's request times out before Supervisor B's reply can make it back.

If no Supervisor can make contact with Supervisor B, either directly or indirectly, the network comes to view Supervisor B as "confirmed" dead. In this case, Supervisor B is effectively removed from all membership lists across the network. As a result, no Supervisors try to contact it again. This is ultimately what happens when you shut down a Supervisor; the rest of the network realizes that it is gone and can reconfigure any services to no longer communicate with any services that were running on it.

If, on the other hand, Supervisor B is started back up again, it can rejoin the network. All the other Supervisors will (through the same SWIM mechanism described) recognize that it is back, and will mark it as alive once again. Services will be reconfigured to communicate with Supervisor B's services as appropriate.

This mechanism forms the foundation of the Habitat network, but cannot by itself provide a completely robust network. For that, we need something additional.

Permanent Peers

An important thing to keep in mind about the basic SWIM mechanism is that if two Supervisors are separated from each other for a long enough amount of time, they will each come to view the other as being dead, and will not try to reestablish contact. While this is the behavior you want when you legitimately shut a Supervisor down, it is definitely not the behavior you want if your Habitat network experiences an extended network incident. In such a case, you could end up with two (or more!) smaller Supervisor networks that are all still internally connected, but completely disconnected from each other. Supervisors in "Network A" would view those in "Network B" as dead, and vice versa. Once network connectivity had been restored, you would continue to have a fractured network, because each network would collectively consider the other to still be dead.

By starting a few Supervisors in the network with the --permanent-peer option, an additional bit of information is gossiped about those Supervisors. In effect, a permanent peer tells every other Supervisor it communicates with to always try to reestablish contact, even if that Supervisor considers the "permanent" one to be dead. This provides a mechanism by which split networks can stitch themselves back together after the split has been resolved.

The "Bastion Ring"

Defining a few Supervisors to be "permanent peers" will provide a robust network, but unless done with care, it can be less than ideal. We recommend running a small number of Supervisors as permanent peers, but running no services on those Supervisors. In modern dynamic architectures, it's common for nodes to come and go; VMs may get shut down, containers can be rescheduled, and so on. If you were to go to the extreme and have all your Supervisors be permanent peers, you would end up with unnecessary network traffic as Supervisors come and go while the infrastructure evolves over time. Each Supervisor would try to maintain contact with every Supervisor that had ever been a member of the network!

If your permanent peer Supervisors are not running any services, they will be less subject to the pressures that would cause service-running Supervisors to come and go. They can exist solely to anchor the entire Supervisor network.

Pulling It All Together: A Robust Supervisor Network

With all this, we can come up with a robust Habitat network architecture. In fact, this is the same architecture the Habitat team uses to run the public Builder service.

Create the Bastion Ring

First, set up three Supervisors as permanent peers, all mutually peered to each other (The labels "A", "B", and "C" are stand-ins for the IP addresses at which these Supervisors are reachable):

# Supervisor "A"
hab sup run --permanent-peer --peer=B --peer=C
# Supervisor "B"
hab sup run --permanent-peer --peer=A --peer=C
# Supervisor "C"
hab sup run --permanent-peer --peer=A --peer=B

These Supervisors should never be used to run services. They can, however, serve as convenient, well-known, and stable entry points to the network for doing things like injecting configurations using hab config apply, adding files using hab file upload, or departing Supervisors using hab sup depart.

Peer additional Supervisors to the Bastion Supervisors

Each additional Supervisor you add to the network should be peered to at least one of the bastion ring Supervisors. Technically speaking, only one peer is necessary, as that provides access to the rest of the network. However, a Supervisor might fail to connect to the entire network if, say, it joined during a network split. By convention, and for redundancy, we peer to all the bastion ring Supervisors, like so:

# Supervisor "D" (a "normal" Supervisor)
hab sup run --peer=A --peer=B --peer=C

This Supervisor should be used to run services, but should not be started as a permanent peer.


Hopefully, the above discussion has given you a better idea of how Habitat's networking works, and how you can best take advantage of it to provide a robust network foundation for the services you run.

For those that like to keep things succinct, the above advice can be summed up thusly:

  1. Run three mutually-peered, permanent Supervisors
  2. Never run services on those Supervisors
  3. Peer all other Supervisors to those first three

If you would like additional details, the technical journal articles on the SWIM membership protocol and epidemic gossip algorithms describe the foundations of Habitat's gossip system.

Container orchestration with Habitat

Habitat packages may be exported with the Supervisor directly into a variety of container formats, but frequently the container runs in a container orchestrator such as Kubernetes or Mesos. Container orchestrators provide scheduling and resource allocation, ensuring workloads are running and available. Containerized Habitat packages can run within these runtimes, managing the applications while the runtimes handle the environment surrounding the application (i.e., compute, networking, and security).


Kubernetes

Kubernetes is an open source container cluster manager that is available as a stand-alone platform or embedded in several distributed platforms, including Google's Container Engine, Tectonic by CoreOS, and OpenShift by Red Hat. Habitat and Kubernetes are complementary: Kubernetes focuses on providing a platform for deployment, scaling, and operations of application containers across clusters of hosts, while Habitat manages the build pipeline and lifecycle of those application containers.

Habitat Operator

The Habitat Kubernetes Operator is ongoing work to create an operator that leverages Kubernetes API services to create a native and robust integration between the two technologies.

By using the Habitat Operator, you can abstract away many of the low-level details of running a Habitat package in Kubernetes and jump straight to deploying your application, with support for Habitat features like service configuration, binding, topologies, and more.

For more details on the Habitat Operator, please refer to the introductory blog post, follow along on GitHub, and join us in our #kubernetes channel in the Habitat Slack.

Kubernetes exporter

When using the Habitat Operator, you can easily convert packages and run them on your Kubernetes cluster using the Kubernetes exporter:

$ hab pkg export kubernetes ORIGIN/NAME

Bare Kubernetes

Users are not required to use the Habitat Operator. Habitat packages exported as containers may be deployed to Kubernetes with the kubectl command. After using the Docker exporter to create a containerized application, the container may be launched like this:

$ kubectl run mytutorial --image=myorigin/mytutorial --port=8080

Assuming the Docker image is pulled from myorigin/mytutorial, we expose port 8080 on the container for access. Networking ports exposed by Habitat need to be passed to kubectl run as --port options. We can see our deployment with the kubectl get command:

$ kubectl get pods -l run=mytutorial
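To reach the application from outside the cluster, the deployment can then be exposed as a Kubernetes service. This is a sketch, assuming a LoadBalancer-capable environment:

```shell
$ kubectl expose deployment mytutorial --type=LoadBalancer --port=8080
$ kubectl get service mytutorial
```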

Docker and ACI

Habitat packages can be exported in both Docker and ACI formats (as well as others). Kubernetes currently supports the Docker runtime and integration of the rkt container runtime (an implementation of the App Container spec) is under active development.

Environment variables and Networking

Kubernetes supports passing environment variables into containers, which can be done via the Habitat Operator.

Multi-container Pods

Multi-container pod support through Habitat is still under active development as part of the Habitat Operator.

Azure Container Services (AKS)

Azure Container Services (AKS) is a fully managed Kubernetes service running on the Azure platform. It supports running Habitat packages using the Habitat Kubernetes Operator.

Azure Container Registry (ACR)

Azure Container Registry is a managed Docker container registry service used for storing private Docker container images. It’s a fully managed Azure resource and gives you local, network-close storage of your container images when deploying to AKS. Habitat Builder has a native integration with this service so you can publish your packages directly to Azure Container Registry.

In order to do this you need to create an Azure Service Principal that has Owner rights on your ACR instance. You can do this with the following script, changing the environment variable values to match your environment.

# Create Service Principal for Habitat Builder
ACR_ID=$(az acr show --name $ACR_NAME --resource-group $ACR_RESOURCE_GROUP --query "id" --output tsv)
az ad sp create-for-rbac --scopes $ACR_ID --role Owner --password "$BLDR_PRINCIPAL_PASSWORD" --name $BLDR_PRINCIPAL_NAME
BLDR_ID=$(az ad sp list --display-name $BLDR_PRINCIPAL_NAME --query "[].appId" --output tsv)
echo "Configuration details for Habitat Builder Principal:"
echo " ID : $BLDR_ID"

Note: The unique Service Principal Name (the UUID) should be provided in the Habitat Builder configuration.

Connecting ACR and AKS for Habitat Operator

Since ACR is a private Docker registry, AKS must be authorized to pull images from it. The best way is to create a role assignment on the Service Principal that is automatically created for AKS, granting it Reader access on your ACR instance.

To do this you can use the following script, changing the environment variable values to match your configuration.

# Get the id of the service principal configured for AKS
CLIENT_ID=$(az aks show --resource-group $AKS_RESOURCE_GROUP --name $AKS_CLUSTER_NAME --query "servicePrincipalProfile.clientId" --output tsv)
# Get the ACR registry resource id
ACR_ID=$(az acr show --name $ACR_NAME --resource-group $ACR_RESOURCE_GROUP --query "id" --output tsv)
# Create role assignment
az role assignment create --assignee $CLIENT_ID --role Reader --scope $ACR_ID

Habitat Updater

As for any other Kubernetes cluster, you should use the Habitat Kubernetes Updater if you want automatic updates of Habitat packages on AKS.

Amazon ECS and Habitat

Amazon Web Services provides a container management service called EC2 Container Service (ECS). ECS provides a Docker registry, container hosting and tooling to make deploying Docker-based containers fairly straightforward. ECS will schedule and deploy your Docker containers within a Task while Habitat manages the applications.

EC2 Container Registry

EC2 Container Registry (ECR) is a fully-managed Docker registry provided by Amazon Web Services. Applications exported to Docker with hab pkg export docker put the containers into namespaced repositories, so you will need to create these within ECR. For example, if you were building core/mongodb containers you would use the following command:

$ aws ecr create-repository --repository-name core/mongodb

To tag and push the images to ECR, you will use your Repository URI (substituting your aws_account_id and region).

$ docker tag core/mongodb:latest aws_account_id.dkr.ecr.ap-southeast-2.amazonaws.com/core/mongodb:latest
$ docker push aws_account_id.dkr.ecr.ap-southeast-2.amazonaws.com/core/mongodb:latest
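Note that docker push requires an authenticated session against ECR. With a recent AWS CLI that might look like this sketch (the account ID and region are placeholders, as above):

```shell
$ aws ecr get-login-password --region ap-southeast-2 | \
    docker login --username AWS --password-stdin aws_account_id.dkr.ecr.ap-southeast-2.amazonaws.com
```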

EC2 Compute Service

Once Docker images are pushed to ECR, they may be run on Amazon's ECS within a Task Definition, which may be expressed as a Docker Compose file. Here is an example of a Tomcat application using a Mongo database, demonstrating Habitat-managed containers:

version: '2'
services:
  mongo:
    image: aws_account_id.dkr.ecr.ap-southeast-2.amazonaws.com/billmeyer/mongodb:latest
    hostname: "mongodb"
  national-parks:
    image: aws_account_id.dkr.ecr.ap-southeast-2.amazonaws.com/mattray/national-parks:latest
    ports:
      - "8080:8080"
    links:
      - mongo
    command: --peer mongodb --bind database:mongodb.default

In the example, the mongo and national-parks services use the Docker images from the ECR. The links entry manages the deployment order of the containers and, according to the Docker Compose documentation, should create /etc/hosts entries. This does not currently appear to work with ECS, so we assign hostname: "mongodb" explicitly.

The command entry for the National Parks Tomcat application allows the Habitat Supervisor to --peer to the mongo gossip ring and --bind applies database entries to its Mongo configuration.

Additional Reading

Google Kubernetes Engine (GKE)

Google Kubernetes Engine is a fully managed Kubernetes service running on the Google Cloud Platform. It supports running Habitat packages using the Habitat Kubernetes Operator.

Google Container Registry (GCR)

Google Container Registry is a private Docker repository that works with popular continuous delivery systems. It runs on GCP to provide consistent uptime on an infrastructure protected by Google's security. The registry service hosts your private images in Cloud Storage under your GCP project.

Before you can push or pull images, you must configure Docker to use the gcloud command-line tool to authenticate requests to Container Registry. To do so, run the following command (you are only required to do this once):

$ gcloud auth configure-docker

Further access control information is available in the Container Registry documentation.

After a successful Habitat package build, images can be pushed to the Container Registry using the registry URI, which follows the format [HOSTNAME]/[PROJECT-ID]/[IMAGE]:[TAG]:

$ hab pkg export kubernetes ./results/habskp-hab-gcr-demo-0.1.0-20180710145742-x86_64-linux.hart
$ docker tag habskp/hab-gcr-demo:latest eu.gcr.io/spaterson-project/hab-gcr-demo:latest
$ docker push eu.gcr.io/spaterson-project/hab-gcr-demo:latest

Google Kubernetes Engine (GKE)

After images have been pushed to the Container Registry, they may be deployed to GKE in the same project without any further configuration changes. To make images available publicly or across projects, consult the Container Registry documentation.

Below is a sample manifest that deploys the Habitat managed container to GKE, pulling the image uploaded in the previous section from the Container Registry:

apiVersion: habitat.sh/v1beta1
kind: Habitat
metadata:
  name: hab-gcr-demo
customVersion: v1beta2
spec:
  v1beta2:
    image: eu.gcr.io/spaterson-project/hab-gcr-demo:latest
    count: 1
    service:
      name: hab-gcr-demo
      topology: standalone
---
apiVersion: v1
kind: Service
metadata:
  name: hab-gcr-demo-lb
spec:
  type: LoadBalancer
  ports:
  - name: web
    port: 80
    targetPort: 8080
  selector:
    habitat-name: hab-gcr-demo

This also creates a Kubernetes load balancer service to expose port 8080 from the container to a public IP address on port 80.

This example assumes Habitat Operator is running on the Kubernetes cluster but it is also possible to deploy using kubectl for Habitat packages exported as containers.

Apache Mesos and DC/OS

Apache Mesos is an open source distributed systems kernel and the distributed systems kernel for Mesosphere's DC/OS distributed platform.

Mesos Containerizers

Mesos has support for containerizers for running commands and applications within isolated containers. Mesos supports Docker containers as well as its own Mesos containerizer format, which provides lightweight containerization using cgroups/namespaces isolation. The hab pkg export mesos command creates a mostly empty base filesystem with the application and the Habitat Supervisor and packages it into a compressed tarball.

Marathon Applications

Marathon is a container orchestration platform for Mesos and DC/OS, handling the scheduling and deployment of applications. Marathon applications support the Docker and Mesos container formats, wrapping them in JSON metadata that describes the resources needed to deploy the application. Once an application has been deployed, Marathon schedules it across the Mesos cluster and ensures it is running optimally.

Export to a Mesos container and Marathon application

You can create native Mesos containers from Habitat packages by following these steps:

  1. Create an interactive studio in any directory with the hab studio enter command.

  2. Install or build the Habitat package from which you want to create a Marathon application, for example:

    $ hab pkg install yourorigin/yourpackage
  3. Run the Mesos exporter on the package.

    $ hab pkg export mesos yourorigin/yourpackage
  4. This will create a Mesos container-format tarball in the results directory, and also print the JSON needed to load the application into Marathon. Note that the tarball needs to be uploaded to a download location and the "uris" in the JSON need to be updated manually. This is an example of the output:

    {
      "id": "yourorigin/yourpackage",
      "cmd": "/bin/id -u hab &>/dev/null || /sbin/useradd hab; /bin/chown -R hab:hab *; mount -t proc proc proc/; mount -t sysfs sys sys/;mount -o bind /dev dev/; /usr/sbin/chroot . ./init.sh start yourorigin/yourpackage",
      "cpus": 0.5,
      "disk": 0,
      "mem": 256,
      "instances": 1,
      "uris": [ "https://storage.googleapis.com/mesos-habitat/yourorigin/yourpackage-0.0.1-20160611121519.tgz" ]
    }
  5. Note that the default resource allocation for the application is very small: 0.5 units of CPU, no disk, one instance, and 256MB of memory. To change these resource allocations, pass different values to the Mesos exporter as command line options (defaults are documented with --help).

  6. From the DC/OS web interface, launch the Marathon Service.

    Screen shot of DC/OS Services

  7. Select "Create Application".

    Screen shot of Marathon Applications List

  8. Click on the "JSON Mode" selector and enter the JSON output of the Mesos exporter and click "Create Application".

    Screen shot of Marathon New Application JSON Mode

  9. Marathon will then deploy the application and enter the "Running" status.

    Screen shot of Marathon Application Running


You can view the output from the running application by clicking on the "Marathon" service from the DC/OS "Services" tab. Select the application, open the "Log Viewer", and choose either "Error" or "Output" to see stderr or stdout, respectively. If you have SSH access into the nodes, the Mesos container directories are beneath /var/lib/mesos/slave/slaves.

    Screen shot of Debugging a Running Application
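On an agent node itself, a quick inspection session might look like this (exact paths vary by Mesos version and agent configuration):

```shell
# List the per-framework sandboxes on a Mesos agent node.
ls /var/lib/mesos/slave/slaves

# Task stdout/stderr files live inside each executor's sandbox directory.
find /var/lib/mesos/slave/slaves -name stdout -o -name stderr
```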

Future Enhancements

This is a basic integration; there are many improvements yet to be made. Here are a few examples:

  • Marathon environment variables are not passed into the Habitat package "cmd" yet.
  • Networking ports exposed by Habitat need to be added to the JSON.
  • The Habitat gossip protocol needs to be included as a default exposed port.
  • If Marathon is running the artifact store, support uploading the tarball directly into it.
  • Upload applications directly to the Marathon application API.
  • Marathon supports unpacking several archive formats. Native .hart support could be added directly to Marathon.

Advanced Plan Writing Guide

The following is a best-practice guide to writing a production-quality plan. These best practices are reflected in the requirements for contributing a plan to the Habitat Core Plans.

If you haven't already, a good first step is to read the Developing Packages articles.

A well-written plan consists of well-formed:

Package Metadata

Each package plan should contain a value adhering to the guidelines for each of the following elements:

  • pkg_description
  • pkg_license (in SPDX format)
  • pkg_maintainer in the format of "The Habitat Maintainers humans@habitat.sh"
  • pkg_name see the section of this document on "Package Name Conventions"
  • pkg_origin must be set to core
  • pkg_source
  • pkg_upstream_url
  • pkg_version must be the complete version number of the software
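Taken together, a plan.sh metadata block following these guidelines might look like the following sketch. Every value is for a hypothetical package; the shasum is a placeholder:

```shell
# Hypothetical metadata block for a core plan -- the values below
# illustrate the guidelines rather than describe a real package.
pkg_origin=core
pkg_name=example
pkg_version=1.2.11
pkg_description="An example library packaged for Habitat"
pkg_license=('Zlib')
pkg_maintainer="The Habitat Maintainers humans@habitat.sh"
pkg_upstream_url=https://example.com/project
pkg_source=https://example.com/project/example-${pkg_version}.tar.gz
pkg_shasum=0000000000000000000000000000000000000000000000000000000000000000
```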

Package Name Conventions

Each package is identified by a unique string containing four sub-strings separated by a forward slash (/), called a PackageIdent: origin/name/version/release.


The origin, name, and version values of this identifier are user defined by setting their corresponding variable in your plan.sh or plan.ps1 file while the value of release is generated at build-time.

The value of name should exactly match the name of the project it represents and the plan file should be located within a directory of the same name in this repository.

Example: The plan for the bison project contains the setting pkg_name=bison and resides in $root/bison/plan.sh.

There is one exception to this rule: additional plans may be defined for a project's past major versions by appending the major version number to the package name. The plan file for this new package should be located within a directory of the same name.

Example: the bison project maintains the 2.x line along with their current major version (at time of writing: 3.x). A second plan is created as bison2 and placed within a directory of the same name in this repository.

Packages meeting this exception will always have their latest major version found in the package sharing the exact name of their project. A new package will be created for the previous major version following these conventions.

Example: the bison project releases the 4.x line and is continuing to support Bison 3.x. The bison package is copied to bison3 and the bison package is updated to build Bison 4.x.

Plan syntax

You can review the entire plan syntax guide here.

Please note that the following conditions must be observed for any plan to be merged into core plans (and are important best practices for any plan):

Plan basic settings

You can read more about basic plan settings here. The minimum requirements for a core plan are:

  • pkg_name is set
  • pkg_origin is set
  • pkg_shasum is set
  • pkg_description is set


Callbacks

You can read more about callbacks here. The minimum requirements for a core plan are:


  • do_prepare() (Invoke-Prepare in a plan.ps1) is a good place to set environment variables and otherwise prepare the environment for building the software. It is also a good place to apply patches.


  • You should never call exit within a build phase. In a plan.sh, you should instead return an exit code such as return 1 for failure, and return 0 for success. In a plan.ps1 you should call Write-Exception or throw an exception upon failure.
  • If you clone a repo from git, you must override do_verify() to return 0 in a plan.sh or if you are authoring a plan.ps1 then override Invoke-Verify with an empty implementation.
  • Never use pkg_source unless you are downloading something as a third party.
  • You should never shell out to hab from within a callback. If you think you want to, you should use a utility function instead.
  • You should not call any function or helper that begins with an underscore, for example _dont_call_this_function(). These are internal-only functions that are not supported for external use and will break your plan if you call them.
  • Don't run any code or run anything outside of a build phase or a function.
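As a sketch of these callback rules in a plan.sh (the build details are hypothetical): do_prepare sets environment variables, do_verify is overridden for a git checkout, and failures use return codes rather than exit:

```shell
# Hypothetical plan.sh fragment illustrating the callback rules above.

do_prepare() {
  # Set environment variables needed by the build here.
  export CFLAGS="${CFLAGS} -O2"
  return 0
}

do_verify() {
  # Source comes from a git clone, so skip checksum verification.
  return 0
}

do_build() {
  # Return a non-zero code on failure -- never call exit.
  make || return 1
  return 0
}
```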


Hooks

The Supervisor dynamically invokes hooks at run-time, triggered by application lifecycle events. You can read more about hooks here.

  • You cannot block the thread in a hook unless it is the run hook. Never call hab or sleep in a hook other than the run hook.
  • You should never shell out to hab from within a hook. If you think you want to, you should use a runtime configuration setting instead. If none of those will solve your problem, open an issue and tell the core team why.
  • Run hooks should:
    • Redirect stderr to stdout (e.g. with exec 2>&1 at the start of the hook)
    • In a Linux targeted hook, call the command to execute with exec <command> <options> rather than running the command directly. This ensures the command is executed in the same process and that the service will restart correctly on configuration changes.
    • If you are running something with a pipe, exec won't work.
  • Executing commands as the root user or running sudo hab pkg install is not good practice.
  • Don't edit any of the Supervisor rendered templates.
    • You can only write to: /var/, /static/, /data/ directories. You should only access these with your runtime configuration setting variable.
    • No one should ever edit anything in /hab/ directly.
    • No one should write to anything in /hab/ directly.
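A minimal Linux run hook following these rules might look like this sketch (the service binary and its flag are hypothetical):

```shell
#!/bin/sh
# Redirect stderr to stdout so the Supervisor captures all output.
exec 2>&1

# Replace this process with the service via exec, so signals are delivered
# correctly and the service restarts cleanly on configuration changes.
exec my-service --port {{cfg.port}}
```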


README

All plans should have a README. Items to strongly consider including:

  • Your name as maintainer and supporter of this plan.
  • What Habitat topology it uses (and the plan should have the correct topology for the technology).
  • Clear, step by step instructions as to how to use the package successfully.
  • What is the best update strategy for different deployments?
  • What are some configuration updates a user can make, or do they always need to do a full rebuild?
  • Documentation on how to scale the service.
  • Instructions on how to monitor the health of the service at the application layer.
  • Can a user simply call the package as a dependency of their application?
  • How does the package integrate into their application?

Iterative Development

To assist in creating new packages, or modifying existing ones, the Supervisor has an option that allows you to use configuration directly from a specific directory, rather than the one included in the compiled artifact. This can significantly shorten the cycle time when working on configuration and application lifecycle hooks.

Build the plan as you normally would. When you start the Supervisor, pass the name of the directory with your plan inside it:

$ hab sup run core/redis --config-from /src

The Supervisor will now take its configuration and hooks from /src, rather than from the package you previously built. When the configuration is as you want it, do a final rebuild of the package.

Binary Wrapper Packages

While Habitat provides the best behavior for applications that can be compiled from source into the Habitat ecosystem, it can also bring the same management benefits to applications distributed in binary-only form.

You can write plans to package up these binary artifacts with minimal special handling. This article covers some tips and tricks for getting this software into Habitat.

Override The Build Phases You Don't Need

A Habitat package build proceeds in phases: download, verification, unpacking (where you would also patch source code, if you had it), build, and finally installation. Each of these phases has default behavior within the build system.

When building binary packages, you override the behavior of phases that do not apply to you. At the very minimum, you must override the do_build and do_install phases, for example:

do_build() {
  # relocate library dependencies here, if needed -- see next topic
  return 0
}

do_install() {
  mkdir -p "$pkg_prefix/bin"
  cp "$PLAN_CONTEXT/bin/hello_world" "$pkg_prefix/bin/hello_world"
  chmod +x "$pkg_prefix/bin/hello_world"
}

Relocate Hard-Coded Library Dependencies If Possible

On Linux, many binaries hardcode library dependencies to /lib or /lib64 inside their ELF symbol table. Unfortunately, this means that Habitat is unable to provide dependency isolation guarantees if packages are dependent on any operating system's libraries in those directories. These Habitat packages will also fail to run in minimal environments like containers built using hab-pkg-export-docker, because there will not be a glibc inside /lib or /lib64.

Note: On Windows, library dependency locations are not maintained in a binary file's headers. See this MSDN article for a complete explanation of how Windows binaries are located. However, it's typically sufficient to ensure that the dependent binaries are on the PATH. You should make sure to include all dependencies in the pkg_deps of a plan.ps1 to ensure all of their respective DLLs are accessible by your application.

Most binaries compiled in a full Linux environment have a hard dependency on /lib/ld-linux.so.2 or /lib64/ld-linux-x86-64.so.2. In order to relocate this dependency to the Habitat-provided variant, which is provided by core/glibc, use the patchelf(1) utility within your plan:

  1. Declare a build-time dependency on core/patchelf as part of your pkg_build_deps line.
  2. Invoke patchelf on any binaries with this problem during the do_install() phase. For example:

    patchelf --interpreter "$(pkg_path_for glibc)/lib/ld-linux-x86-64.so.2" <your_binary>

  3. The binary may have other hardcoded dependencies on its own libraries that you may need to relocate using other flags to patchelf like --rpath. For example, Oracle Java provides additional libraries in lib/amd64/jli that you will need to relocate to the Habitat location:

    export LD_RUN_PATH=$LD_RUN_PATH:$pkg_prefix/lib/amd64/jli
    patchelf --interpreter "$(pkg_path_for glibc)/lib/ld-linux-x86-64.so.2" \
      --set-rpath ${LD_RUN_PATH} <your_binary>

  4. For more information, please see the patchelf documentation.
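Putting the steps together, a plan.sh fragment might look like the following sketch (the binary name bin/hello_world is hypothetical; adjust paths for your package):

```shell
# Hypothetical fragment: patch the ELF interpreter during installation.
pkg_build_deps=(core/patchelf)

do_install() {
  mkdir -p "$pkg_prefix/bin"
  cp "$PLAN_CONTEXT/bin/hello_world" "$pkg_prefix/bin/hello_world"
  # Point the binary at the Habitat-provided dynamic linker from core/glibc.
  patchelf --interpreter "$(pkg_path_for glibc)/lib/ld-linux-x86-64.so.2" \
    "$pkg_prefix/bin/hello_world"
}
```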

If You Cannot Relocate Library Dependencies

In some situations it will be impossible for you to relocate library dependencies using patchelf as above. For example, if the version of glibc the software requires is different than that provided by an available version of glibc in a Habitat package, attempting to patchelf the program will cause execution to fail due to ABI incompatibility.

Your software vendor's support policy might also prohibit you from modifying software that they ship you.

In these situations, you will have to give up Habitat's guarantees of complete dependency isolation and continue to rely on the library dependencies provided by the host operating system. However, you can continue to use the features of the Habitat Supervisor that provide uniform manageability across your entire fleet of applications.

Fix Hardcoded Interpreters

Binary packages often come with other utility scripts that have their interpreter, or "shebang", line (first line of a script) hardcoded to a path that will not exist under Habitat. Examples are: #!/bin/sh, #!/bin/bash, #!/bin/env or #!/usr/bin/perl. It is necessary to modify these to point to the Habitat-provided versions, and also declare a runtime dependency in your plan on the corresponding Habitat package (for example, core/perl).

Use the fix_interpreter function within your plan to correct these interpreter lines during any phase, but most likely your do_build phase. For example:

fix_interpreter ${target} core/coreutils bin/env

The arguments to fix_interpreter are the file (represented here by ${target}) you are trying to fix, the origin/name pair of the Habitat package that provides that interpreter, and the interpreter pattern to search and replace in the target.
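In a plan, this call typically belongs in a build callback; a hedged sketch (the script path bin/setup.sh is hypothetical):

```shell
# Hypothetical do_build that repairs a bundled script's shebang line,
# rewriting '#!/bin/env ...' to use core/coreutils' env from the Habitat store.
do_build() {
  fix_interpreter bin/setup.sh core/coreutils bin/env
  return 0
}
```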

If you have many files you need to fix, or the binary package automatically generates scripts with hardcoded shebang lines, you may need to simply symlink Habitat's version into where the binary package expects it to go:

ln -sv $(pkg_path_for coreutils)/bin/env /usr/bin/env

This is a last resort as it breaks the dependency isolation guarantees of Habitat.