Troubleshooting

How does Lattice manage applications?

A helpful step when debugging is to have an accurate mental model of the system in question.

Lattice is founded upon the notion of eventual consistency. In particular, Lattice is constantly working to reconcile desired state with actual state.

When you issue commands via the ltc CLI (or the Lattice API) you are modifying Lattice’s desired state. Typically, you are informing Lattice about a desire to run some number of instances of an application. Lattice updates this desired state synchronously.

The actual state (i.e. the set of running instances), however, is updated asynchronously as the Lattice cluster works to reconcile the current actual state with the desired state.

Typically, when the user updates desired state Lattice immediately takes actions to perform this reconciliation. Should an action fail (perhaps a network partition occurs) or a running instance be lost (perhaps a Cell explodes) Lattice will eventually attempt to reconcile actual and desired state again (this happens every 30 seconds - though Lattice can detect a missing Cell within ~5 seconds).
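
As a rough illustration of this model, the sketch below compares a desired instance count with an actual one and prints the action that would close the gap. The function name reconcile_action and its output format are hypothetical; this is not Lattice's implementation, only a minimal picture of what a single reconciliation pass decides.

```shell
# Hypothetical sketch of one reconciliation pass; not Lattice's actual code.
# Prints the action needed to move the actual instance count toward the
# desired instance count.
reconcile_action() {
  desired=$1
  actual=$2
  if [ "$actual" -lt "$desired" ]; then
    echo "start $((desired - actual))"
  elif [ "$actual" -gt "$desired" ]; then
    echo "stop $((actual - desired))"
  else
    echo "noop"
  fi
}

# Example: three instances are desired, but one was lost along with its Cell.
reconcile_action 3 2   # prints "start 1"
```

Running such a pass periodically, rather than only when the user changes desired state, is what lets the system recover from failures that nobody reported.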

How does Lattice work with Docker images?

A Docker image consists of two things: a collection of layers to download and mount (the raw bits that form the file system) and metadata that describes what command should be launched (the ENTRYPOINT and CMD directives, among others, specified in the Dockerfile).

Lattice uses Garden-Linux to construct Linux containers. These containers are built on the same Linux kernel technologies that power all Linux containers: namespaces and cgroups. When a container is created a file system must be mounted as the root file system of the container. Garden-Linux supports mounting Docker images as root file systems for the containers it constructs. Garden-Linux takes care of fetching and caching the individual layers associated with the Docker image and combining and mounting them as the root file system - it does this using the same libraries that power Docker.

This yields a container with contents that exactly match the contents of the associated Docker image.

Once a container is created Lattice is responsible for running and monitoring processes in the container. The Lattice API allows the user to define exactly which commands to run within the container; in particular, it is possible to run, monitor, and route to multiple processes within a single container.

When launching a Docker image, ltc directs Lattice to create a container backed by the Docker image’s root fs, and to run the command encoded in the Docker image’s metadata. It does this by fetching the metadata associated with the Docker image (using the same libraries that power Docker) and making the appropriate Lattice API calls. ltc allows users to easily override the values it pulls out of the Docker image metadata. This is outlined in detail in the ltc documentation.

There are some remaining areas of Docker compatibility that we are working on:

  • Removing assumptions about container contents. Currently, Garden-Linux makes some assumptions about what is available inside the container. Some Docker images do not satisfy these assumptions, though most do (the lightweight busybox base image, for example).

ltc is giving no such host errors with *.xip.io addresses. Help!

DNS resolution for xip.io addresses can sometimes be flaky, resulting in errors such as the following:

 ltc target 55.66.77.88.xip.io
 Error verifying target: Get http://receptor.55.66.77.88.xip.io/v1/desired_lrps:
 dial tcp: lookup receptor.55.66.77.88.xip.io: no such host

Resolution Steps

  1. Follow these instructions to reset the DNS cache in OS X. There have been several reported issues with DNS resolution on OS X, specifically on Yosemite; in response, the latest beta build of OS X 10.10.4 has replaced discoveryd with mDNSResponder.

  2. Check your networking DNS settings. Local “forwarding DNS” servers provided by some home routers can have trouble resolving xip.io addresses. Try setting your DNS to point to your real upstream DNS servers, or alternatively try using Google DNS by using 8.8.8.8 and/or 8.8.4.4.

  3. If the above steps don’t work (or if you must use a DNS server that doesn’t work with xip.io), our recommended alternative is to follow the dnsmasq instructions, pass the LATTICE_DOMAIN environment variable to the vagrant up command, and target using dev.lattice instead of .xip.io to point to the cluster, as follows:

LATTICE_DOMAIN=dev.lattice vagrant up
ltc target dev.lattice

dnsmasq is currently only supported for vagrant deployments.
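
To tell a DNS problem apart from a cluster problem, it can help to compare what your resolver returns with the address the hostname should resolve to. A xip.io hostname embeds its own IP address, so the expected answer can be read from the name itself. The helper below is a minimal sketch; the function name xip_expected_ip is hypothetical, and the dig commands in the comments assume dig is installed.

```shell
# Print the IPv4 address embedded in a xip.io hostname.
xip_expected_ip() {
  # Drop the .xip.io suffix, then keep the last four dot-separated fields.
  printf '%s\n' "${1%.xip.io}" |
    awk -F. 'NF >= 4 { print $(NF-3) "." $(NF-2) "." $(NF-1) "." $NF }'
}

xip_expected_ip receptor.55.66.77.88.xip.io   # prints 55.66.77.88

# Compare against what your resolvers return (requires dig):
#   dig +short receptor.55.66.77.88.xip.io
#   dig +short receptor.55.66.77.88.xip.io @8.8.8.8
```

If the second dig command returns the expected address and the first does not, your local DNS server is the likely culprit.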

My app keeps crashing with exit code 255. Help!

Exit code 255 typically means the application ran out of memory in its container. Try increasing the amount of memory by passing --memory-mb to ltc launch-droplet or ltc create.

Vagrant IP conflict errors

Multiple Vagrant VMs

The errors below can occur when multiple Vagrant instances use the same IP address (e.g., 192.168.11.11).

$ ltc target local.lattice.cf
Error connecting to the receptor. Make sure your lattice target is set, and that lattice is up and running.
  Underlying error: Get http://receptor.local.lattice.cf/v1/desired_lrps: read tcp 192.168.11.11:80: connection reset by peer

$ ltc target local.lattice.cf
Error connecting to the receptor. Make sure your lattice target is set, and that lattice is up and running.
  Underlying error: Get http://receptor.local.lattice.cf/v1/desired_lrps: use of closed network connection  

$ ltc target local.lattice.cf
Error verifying target: Get http://receptor.local.lattice.cf/v1/desired_lrps: net/http: transport closed before response was received

To check whether multiple VMs might have an IP conflict, run the following:

$ vagrant global-status
id       name    provider   state   directory
----------------------------------------------------------------------------------------------------------------
fb69d90  default virtualbox running /Users/user/workspace/lattice
4debe83  default virtualbox running /Users/user/workspace/lattice-bundle-v0.7.0/vagrant

You can then destroy the appropriate instance with:

$ cd </path/to/vagrant-directory>
$ vagrant destroy
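
If the list of VMs is long, a small filter can pick out the running ones. The sketch below assumes the column layout shown in the vagrant global-status output above (id, name, provider, state, directory) and directories without spaces; the function name running_vagrant_dirs is hypothetical.

```shell
# Read `vagrant global-status` output on stdin and print the directory of
# every VM whose state column is "running".
running_vagrant_dirs() {
  # Only consider rows that start with a hexadecimal VM id.
  awk '$1 ~ /^[0-9a-f]+$/ && $4 == "running" { print $5 }'
}

# Usage:
#   vagrant global-status | running_vagrant_dirs
```

More than one directory in the output means more than one VM is running, which is worth checking for an IP conflict.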

Running VirtualBox and VMware

If you’re using both VirtualBox and VMware on the same machine, you may see this error:

The specified host network collides with a non-hostonly network!
This will cause your specified IP to be inaccessible. Please change
the IP or name of your host only network so that it no longer matches that of
a bridged or non-hostonly network.

In this case, one of your hypervisors has grabbed the 192.168.11.* IP range and is preventing the other from accessing it. Use ifconfig to figure out which owns the network:

$ ifconfig
...
vboxnet1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    ether 0a:00:27:00:00:01
    inet 192.168.11.1 netmask 0xffffff00 broadcast 192.168.11.255
...

In this case the VirtualBox interface vboxnet1 has the network, so you can bring it down to free up the network:

sudo ifconfig vboxnet1 down

If VMware owns the network you’ll see something like this:

$ ifconfig
...
vmnet9: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    ether 00:50:56:c0:00:09
    inet 192.168.11.1 netmask 0xffffff00 broadcast 192.168.11.255
...

You can configure Lattice to run on an alternate IP address by setting, for example, LATTICE_IP=192.168.22.22 during vagrant up.
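
Rather than scanning the ifconfig output by eye, you can filter it for the interface that holds the conflicting address. This is a minimal sketch that assumes the OS X/BSD ifconfig format shown above (an unindented interface line followed by indented "inet ADDRESS" lines); the function name iface_with_ip is hypothetical.

```shell
# Read ifconfig output on stdin and print the name of each interface that
# holds the given IPv4 address.
iface_with_ip() {
  awk -v ip="$1" '
    # An unindented line starts a new interface block; remember its name.
    /^[^ \t]/ { iface = $1; sub(/:$/, "", iface) }
    # An inet line with the wanted address identifies the owner.
    $1 == "inet" && $2 == ip { print iface }
  '
}

# Usage:
#   ifconfig | iface_with_ip 192.168.11.1
```

A name starting with vboxnet points at VirtualBox, and one starting with vmnet points at VMware.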

I can’t run my Docker image. Help!

Here are a few pointers to help you debug and fix some common issues:

Increase ltc’s Timeout

ltc create will wait up to two minutes for your application(s) to start. If this fails, it may be that your Docker image is large and has not finished downloading. You can pass the --timeout flag to instruct ltc to wait longer. Note that ltc does not remove your application when this timeout occurs, so your application may eventually start in the background.

Increase Memory and Disk Limits

By default, ltc applies a memory limit of 128MB to the container. If your process is exiting prematurely, it may be attempting to consume more than 128MB of memory. You can increase the limit using the --memory-mb flag. To turn off memory limits, set --memory-mb to 0.

Disk limits are configurable via ltc but quotas are currently disabled on the Lattice cluster.

Check the Application Logs

ltc logs APP_NAME will aggregate and stream application logs. These may point you in the right direction. In particular, if you see issues related to file permissions or a health check failing, read on…

Disable Health Monitoring

By default, ltc requests that Lattice perform a periodic health check against the running application. This health check verifies that the application is listening on a port. For applications that do not listen on ports (e.g. a worker that does not expose an endpoint) you can disable the health check via the --no-monitor flag.
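
The memory and health-check flags described above can be combined on one command line. The sketch below only prints the command so you can review it before running it. It assumes that ltc create takes the app name followed by the Docker image; the app name my-worker, the image my-org/my-worker, and the helper ltc_create_cmd are all hypothetical.

```shell
# Print (do not run) an ltc create command with a raised memory limit and,
# optionally, health monitoring disabled.
# Assumption: ltc create takes the app name followed by the Docker image.
ltc_create_cmd() {
  app=$1
  image=$2
  memory_mb=$3
  monitor=${4:-yes}   # pass "no" for apps that do not listen on a port
  cmd="ltc create $app $image --memory-mb $memory_mb"
  if [ "$monitor" = "no" ]; then
    cmd="$cmd --no-monitor"
  fi
  echo "$cmd"
}

ltc_create_cmd my-worker my-org/my-worker 256 no
# prints: ltc create my-worker my-org/my-worker --memory-mb 256 --no-monitor
```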

Watch Lattice Component Logs

If you’re still stuck, try streaming the Lattice cluster logs with ltc debug-logs while launching your application. If you want to submit a bug report, please include the relevant output from ltc debug-logs.

How do I get a shell inside a Lattice container?

Use ltc ssh to get a shell in your container:

$ ltc ssh APP-NAME

or

$ ltc ssh -i INSTANCE-INDEX APP-NAME

How do I communicate with my containers over TCP?

You can create TCP routes to an app container using the --tcp-route option of ltc create and ltc launch-droplet. TCP routes permit mapping a specified app container port to a specified exposed port.

See this section of the ltc documentation for more information.

What external ports are unavailable to bind as TCP routes?

The following ports are reserved for use by Lattice:

22 53 80 1169 1234
1700 2222 2380 4001 4222
4223 7001 7777 8070 8080
8081 8082 8090 8300 8301
8302 8400 8444 8500 8888
8889 9016 9999 17009 17014
17110 17111 17222 44445
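
Before choosing an external port for a TCP route, you can check it against this list. The helper below is a minimal sketch that hard-codes the ports listed above; the names RESERVED_PORTS and is_reserved_port are hypothetical and not part of ltc.

```shell
# Ports listed above as reserved by Lattice.
RESERVED_PORTS="22 53 80 1169 1234 1700 2222 2380 4001 4222 4223 7001 7777 8070 8080 8081 8082 8090 8300 8301 8302 8400 8444 8500 8888 8889 9016 9999 17009 17014 17110 17111 17222 44445"

# Succeeds (exit status 0) when the given port is on the reserved list.
is_reserved_port() {
  # Pad with spaces so that, for example, 808 does not match 8080.
  case " $RESERVED_PORTS " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_reserved_port 8080; then echo "8080 is reserved"; fi
```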

How do I communicate between containers?

Lattice does not apply any firewall rules between containers. Any container can freely communicate with any other container. All you need is to identify the IP and port, which are available via ltc status -d or the Receptor API.

Can I communicate directly with Lattice cells?

As of v0.7.0, access to Lattice cells is unrestricted. In the future, access to Lattice cells will be restricted to VMs within the Lattice cluster.

How do I do service discovery?

Lattice does not ship with a service discovery solution. It is relatively straightforward, however, to build a solution on top of the Receptor API. We have plans to explore this space soon after release.

How do I upgrade Lattice?

Lattice does not support rolling upgrades.

How do I use Lattice with an HTTP proxy?

Lattice supports HTTP proxies, but some setup is needed. See Proxy configuration.

I’m having trouble running vagrant up with VirtualBox

Remote connection disconnect

If you see the following error while running vagrant up --provider virtualbox:

default: Warning: Remote connection disconnect. Retrying...
default: Warning: Authentication failure. Retrying...
...

try upgrading to the latest VirtualBox.

Unsupported VirtualBox

If you see the following error while running vagrant up --provider virtualbox:

Vagrant has detected that you have a version of VirtualBox installed
that is not supported. Please install one of the supported versions
listed below to use Vagrant:

4.0, 4.1, 4.2, 4.3

you may be running an old version of Vagrant that doesn’t support VirtualBox 5+. Upgrading to Vagrant 1.7.3+ will fix the issue.
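
To check whether your installed Vagrant meets the 1.7.3 minimum, you can compare dotted version numbers field by field. The helper below is a minimal sketch; the name version_ge is hypothetical, and the usage lines assume that vagrant --version prints a line such as "Vagrant 1.7.4".

```shell
# Succeeds when dotted version $1 is greater than or equal to $2.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    max = (n > m) ? n : m
    # Compare numerically, field by field; missing fields count as 0.
    for (i = 1; i <= max; i++) {
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0
  }'
}

# Usage:
#   installed=$(vagrant --version | awk '{ print $2 }')
#   version_ge "$installed" 1.7.3 || echo "Upgrade Vagrant to 1.7.3 or later"
```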

Can I use ltc on Windows?

Most of the ltc subcommands work on Windows, although without color support.

ltc ssh requires a pseudo-terminal and will not function properly from the default Windows command shells (Cmd, PowerShell). We recommend using ltc ssh non-interactively:

ltc ssh -- ps auxw
