A Guide to Using BOSH on GCP

This tutorial aims to give a relatively simple introduction to using BOSH on Google Cloud Platform (GCP).

It is heavily based on the excellent A Guide to Using BOSH tutorial, written by Maria Shaldibina.

Prerequisites:

  • A GCP account (at the time of writing a Free Tier is available). However, EU residents are not currently able to use the Free Tier as individuals. This is discussed in Google's Cloud Platform Free Tier FAQ and on Quora.
  • A local installation of Google Cloud SDK on your environment PATH. Version 183.0.0 is known to work with the instructions in this guide.
  • A local installation of BOSH CLI v2 on your environment PATH. Version 2.0.45 is known to work with the instructions in this guide.
  • An environment on which to run BASH scripts.
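
You can confirm that both tools are installed and on your environment PATH by printing their versions:

$ gcloud version

$ bosh --version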

Prepare

We are going to create a BOSH Environment. First we will create the necessary infrastructure in GCP within a dedicated GCP Project, then use BOSH to create a BOSH Jumpbox. We will then create a BOSH Director from your host machine, tunnelling connections through the BOSH Jumpbox. Once the BOSH Director is running, we will use the BOSH CLI to send commands to it through the BOSH Jumpbox.

1. Create Google Cloud Project

Initialize a gcloud session:

$ gcloud init

Create a GCP Project (the project name must be unique) and assign it to your Billing Account (you may be prompted to install the "alpha" component):

$ gcloud projects create <YOUR_GCP_PROJECT_NAME> \
    --name=<YOUR_GCP_PROJECT_NAME> --set-as-default

$ gcloud alpha billing projects link <YOUR_GCP_PROJECT_NAME> \
    --billing-account=<YOUR_BILLING_ACCOUNT>

NOTE: Your Billing Account can be discovered by using:

$ gcloud alpha billing accounts list --format json

Enable the compute.googleapis.com, iam.googleapis.com, cloudresourcemanager.googleapis.com and dns.googleapis.com Google Cloud APIs:

$ gcloud --project <YOUR_GCP_PROJECT_NAME> \
    services enable compute.googleapis.com

$ gcloud --project <YOUR_GCP_PROJECT_NAME> \
    services enable iam.googleapis.com

$ gcloud --project <YOUR_GCP_PROJECT_NAME> \
    services enable cloudresourcemanager.googleapis.com

$ gcloud --project <YOUR_GCP_PROJECT_NAME> \
    services enable dns.googleapis.com
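
As a shortcut, recent versions of the SDK accept multiple service names in one call, so the four APIs should also be enableable together:

$ gcloud --project <YOUR_GCP_PROJECT_NAME> \
    services enable compute.googleapis.com iam.googleapis.com \
    cloudresourcemanager.googleapis.com dns.googleapis.com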

Create a Service Account, e.g. my-service-account, generate keys for that Service Account and make it a Project Owner:

$ gcloud iam --project <YOUR_GCP_PROJECT_NAME> \
    service-accounts create my-service-account \
    --display-name=my-service-account

$ gcloud iam --project <YOUR_GCP_PROJECT_NAME> \
    service-accounts keys create \
    --iam-account=my-service-account@<YOUR_GCP_PROJECT_NAME>.iam.gserviceaccount.com \
    my-service-account.key.json

$ gcloud projects add-iam-policy-binding <YOUR_GCP_PROJECT_NAME> \
    --member=serviceAccount:my-service-account@<YOUR_GCP_PROJECT_NAME>.iam.gserviceaccount.com \
    --role=roles/owner
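
To check that the binding took effect, inspect the project's IAM policy and look for the Service Account under the roles/owner binding:

$ gcloud projects get-iam-policy <YOUR_GCP_PROJECT_NAME>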

2. Create Google Cloud Network & Security

Create a VPC network and a subnetwork within the europe-west2 region:

$ gcloud compute networks create bosh --subnet-mode custom

$ gcloud compute networks subnets create bosh-europe-west2 \
    --region=europe-west2 \
    --range=10.0.0.0/24 \
    --network=bosh

Create a firewall rule allowing ssh access to the jumpbox instance:

$ gcloud compute firewall-rules create ssh-to-jumpbox \
    --network bosh \
    --allow tcp:22 \
    --target-tags jumpbox

Create a firewall rule allowing port 6868 to the jumpbox instance (this is used by the bosh client when creating the jumpbox instance):

$ gcloud compute firewall-rules create mbus-to-jumpbox \
    --network bosh \
    --allow tcp:6868 \
    --target-tags jumpbox

Create a firewall rule allowing internal traffic (TCP & UDP) on all ports:

$ gcloud compute firewall-rules create intra-subnet-subnet-open \
    --network bosh \
    --allow tcp:1-65535,udp:1-65535 \
    --source-tags internal
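
To review the three rules just created (the --filter expression here should narrow the list to the bosh network):

$ gcloud compute firewall-rules list --filter "network:bosh"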

3. Create Google Cloud NAT Instance

Create a NAT instance:

$ gcloud compute instances create nat-instance-primary \
    --machine-type n1-standard-1 \
    --zone europe-west2-a \
    --tags "nat,internal" \
    --image ubuntu-1604-xenial-v20180109 \
    --image-project ubuntu-os-cloud \
    --subnet bosh-europe-west2 \
    --can-ip-forward \
    --metadata startup-script='#!/bin/bash
      sh -c "echo 1 > /proc/sys/net/ipv4/ip_forward"
      iptables -t nat -A POSTROUTING -o ens4 -j MASQUERADE'
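
To confirm that IP forwarding was enabled on the NAT instance (this should print True):

$ gcloud compute instances describe nat-instance-primary \
    --zone europe-west2-a --format='value(canIpForward)'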

Create a route through the NAT instance using the 'no-ip' network tag. Instances tagged with 'no-ip' will route traffic through the nat-instance-primary instance.

$ gcloud compute routes create nat-primary \
    --next-hop-instance nat-instance-primary \
    --network bosh \
    --tags no-ip \
    --priority 800 \
    --next-hop-instance-zone europe-west2-a \
    --destination-range 0.0.0.0/0
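
To verify the route was created:

$ gcloud compute routes describe nat-primary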

4. Create BOSH Jumpbox

Create an External IP address for the BOSH Jumpbox.

$ gcloud compute addresses create jumpbox-ip \
    --region europe-west2

$ gcloud compute addresses list
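
Rather than copying the address by hand, you can capture it in a shell variable and substitute it wherever <JUMPBOX_EXTERNAL_IP> appears below:

$ JUMPBOX_EXTERNAL_IP=$(gcloud compute addresses describe jumpbox-ip \
    --region europe-west2 --format='value(address)')

$ echo $JUMPBOX_EXTERNAL_IP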

Clone the BOSH jumpbox deployment:

$ git clone https://github.com/cppforlife/jumpbox-deployment

BOSH deploy your jumpbox:

$ bosh create-env ./jumpbox-deployment/jumpbox.yml \
    --state ./jumpbox-state.json \
    --vars-store ./jumpbox-creds.yml \
    -o ./jumpbox-deployment/gcp/cpi.yml \
    -v zone=europe-west2-a \
    -v network=bosh \
    -v subnetwork=bosh-europe-west2 \
    -v internal_cidr=10.0.0.0/24 \
    -v internal_gw=10.0.0.1 \
    -v internal_ip=10.0.0.3 \
    -v external_ip=<JUMPBOX_EXTERNAL_IP> \
    -v tags=[jumpbox,internal] \
    -v project_id=<PROJECT_ID> \
    --var-file gcp_credentials_json=my-service-account.key.json

Extract the private key of the jumpbox instance from the credentials file created by BOSH:

$ bosh int ./jumpbox-creds.yml --path /jumpbox_ssh/private_key > jumpbox.key \
    && chmod 600 jumpbox.key
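
As a quick sanity check, you should now be able to SSH to the jumpbox directly using the extracted key:

$ ssh -i jumpbox.key jumpbox@<JUMPBOX_EXTERNAL_IP> hostname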

Open a SOCKS5 tunnel through the BOSH Jumpbox:

$ ssh -4 -D 5000 -fNC jumpbox@<JUMPBOX_EXTERNAL_IP> -i jumpbox.key

Export BOSH_ALL_PROXY so that bosh commands are sent through the tunnel via the BOSH Jumpbox:

$ export BOSH_ALL_PROXY=socks5://localhost:5000

5. Create BOSH Director

To create the BOSH Director, execute the following commands. The first (bosh int) interpolates the manifest with your variables and prints the result, a useful dry run to review; the second (bosh create-env) actually creates the BOSH Director:

$ git clone https://github.com/cloudfoundry/bosh-deployment

$ bosh int bosh-deployment/bosh.yml \
    --vars-store=./director-creds.yml \
    -o ./bosh-deployment/gcp/cpi.yml \
    -v director_name=gcpbosh \
    -v internal_cidr=10.0.0.0/24 \
    -v internal_gw=10.0.0.1 \
    -v internal_ip=10.0.0.6 \
    --var-file gcp_credentials_json=./my-service-account.key.json \
    -v project_id=<PROJECT_ID> \
    -v zone=europe-west2-a \
    -v tags=[internal,no-ip] \
    -v network=bosh \
    -v subnetwork=bosh-europe-west2

$ bosh create-env bosh-deployment/bosh.yml \
    --state=./director-state.json \
    --vars-store=./director-creds.yml \
    -o ./bosh-deployment/gcp/cpi.yml \
    -v director_name=gcpbosh \
    -v internal_cidr=10.0.0.0/24 \
    -v internal_gw=10.0.0.1 \
    -v internal_ip=10.0.0.6 \
    --var-file gcp_credentials_json=./my-service-account.key.json \
    -v project_id=<PROJECT_ID> \
    -v zone=europe-west2-a \
    -v tags=[internal,no-ip] \
    -v network=bosh \
    -v subnetwork=bosh-europe-west2

6. Log in

Create the BOSH_ENVIRONMENT, BOSH_CLIENT and BOSH_CLIENT_SECRET environment variables, which make it easier to interact with the BOSH Director:

$ bosh alias-env bosh-director -e 10.0.0.6 --ca-cert <(bosh int ./director-creds.yml --path /director_ssl/ca)

$ export BOSH_ENVIRONMENT=bosh-director

$ export BOSH_CLIENT=admin

$ export BOSH_CLIENT_SECRET=$(bosh int ./director-creds.yml --path /admin_password)

Running bosh env should return details of the BOSH Director, similar to:

$ bosh env

Using environment '10.0.0.6' as client 'admin'

Name      gcpbosh
UUID      a5c4d96f-6812-4310-a2e6-a890f8d1aaf2
Version   262.3.0 (00000000)
CPI       google_cpi
Features  compiled_package_cache: disabled
          config_server: disabled
          dns: disabled
          snapshots: disabled
User      admin

Succeeded

We are now ready to deploy!

Deploy

Before we proceed we need to understand what BOSH needs to deploy software.

What to deploy

Software that is deployed with BOSH needs to be packaged in a special format called a release. For each service that will be deployed, a release needs to contain source files, configuration files, installation scripts, etc. For example, a redis release would contain the source code for redis, redis configuration defaults and redis init scripts.

How to deploy

Each BOSH deployment needs to provide a specially structured configuration file - the deployment manifest. This file defines what resources are going to be deployed, what services are going to run on each of those resources, and what properties will be passed to the services' configuration files. For example, a redis deployment manifest has entries for how many redis VMs there should be, what size they are, and how redis should be configured.
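
For orientation, a deployment manifest broadly takes the following shape. This is a much-reduced, hypothetical sketch (it omits several required sections, such as the jobs within each instance group and the update block); the real manifest used later in this guide is the manifest.yml provided with the release:

name: learn-bosh-on-gcp       # deployment name

releases:                     # the packaged software to deploy
- name: learn-bosh-on-gcp
  version: latest

stemcells:                    # the OS image VMs are created from
- alias: default
  os: ubuntu-trusty
  version: latest

instance_groups:              # what runs, and on how many VMs
- name: app
  instances: 1
  azs: [z1]
  vm_type: g1-small
  stemcell: default
  networks:
  - name: default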

1. Create BOSH release

We are going to use a simple BOSH release that deploys an http server.

$ git clone https://github.com/finkit/learn-bosh-on-gcp-release

$ cd learn-bosh-on-gcp-release

$ bosh create-release

Upload generated release to BOSH Director:

$ bosh upload-release

Check uploaded releases:

$ bosh releases

Using environment '10.0.0.6' as client 'admin'

Name               Version  Commit Hash
learn-bosh-on-gcp  0+dev.1  [your commit hash]

1 releases

2. Upload stemcell

A Stemcell is an operating system image that BOSH uses to create VMs. Official BOSH stemcells are maintained with security updates at bosh.io.

Upload stemcell to BOSH Director:

$ bosh upload-stemcell https://s3.amazonaws.com/bosh-core-stemcells/google/bosh-stemcell-3431.10-google-kvm-ubuntu-trusty-go_agent.tgz

The attempt to upload the stemcell may fail if the Cloud Storage JSON API is not enabled for your project. In this case, the response from the upload-stemcell command should include a link to the Google Developer Console, which can be used to enable this API. Alternatively, run the command "gcloud --project <YOUR_GCP_PROJECT_NAME> services enable storage-api.googleapis.com".

Check uploaded stemcells:

$ bosh stemcells

Using environment '10.0.0.6' as client 'admin'

Name                                    Version
bosh-google-kvm-ubuntu-trusty-go_agent  [your stemcell version]

1 stemcells

3. Update cloud config

The newly created BOSH Director will not have any cloud config defined:

$ bosh cloud-config

Using environment '10.0.0.6' as client 'admin'

No cloud config

Exit code 1

Update the cloud config on the BOSH Director:

$ bosh update-cloud-config cloud-config.yml

Check cloud config:

$ bosh cloud-config

Using environment '10.0.0.6' as client 'admin'

azs:
...

Succeeded
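
For orientation, a GCP cloud config broadly maps BOSH-level names (AZs, VM types, networks) onto the GCP resources created earlier. The following is a much-reduced, hypothetical sketch (a real cloud config also needs, for example, a compilation block), not the contents of the cloud-config.yml used above:

azs:
- name: z1
  cloud_properties:
    zone: europe-west2-a

vm_types:
- name: g1-small
  cloud_properties:
    machine_type: g1-small

networks:
- name: default
  type: manual
  subnets:
  - range: 10.0.0.0/24
    gateway: 10.0.0.1
    azs: [z1]
    cloud_properties:
      network_name: bosh
      subnetwork_name: bosh-europe-west2
      tags: [internal, no-ip]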

4. And deploy

Run the deploy by providing the path to the deployment manifest. The deployment manifest specifies what services to deploy, their properties and their resource configuration.

$ bosh -d learn-bosh-on-gcp deploy manifest.yml

See the list of deployed instances, as specified in the manifest:

$ bosh instances

Using environment '10.0.0.6' as client 'admin'

...

Deployment 'learn-bosh-on-gcp'

Instance                Process State  AZ  IPs
learn-bosh-on-gcp/guid  running        z1  10.0.0.10

1 instances

Succeeded

See that our service is up and running:

$ curl http://10.0.0.10:8080 --proxy socks5://127.0.0.1:5000

Hello, Anonymous from <uuid>

Modify Deployment

Now we will update our deployment with a new version of the software, modify some properties, and scale the deployment.

1. Modify release

BOSH makes it easy to modify and deploy new versions of software. Let's modify our release source files.

In the release folder, open src/simple_server/app.rb and change the name to yours.

Create a new version of the release (the force option ignores the warning about local changes), upload the new version to the BOSH Director and deploy:

$ bosh create-release --force

$ bosh upload-release

$ bosh -d learn-bosh-on-gcp deploy manifest.yml

See that the updated version was deployed:

$ curl http://10.0.0.10:8080 --proxy socks5://127.0.0.1:5000

Hello, [your name] from <uuid>

2. Scale deployment

With BOSH it is easy to scale deployments: all you need to do is modify the number of instances in the manifest file.

Open manifest.yml and change the number of instances under instance_groups from 1 to 2, and add another IP, 10.0.0.11, to the list of static_ips, as sketched below.
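
After the edit, the relevant part of manifest.yml should look something like this (a hypothetical sketch; the instance group name and surrounding keys are illustrative, so keep whatever the file already uses):

instance_groups:
- name: learn-bosh-on-gcp
  instances: 2
  networks:
  - name: default
    static_ips: [10.0.0.10, 10.0.0.11]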

Run deploy:

$ bosh -d learn-bosh-on-gcp deploy manifest.yml

Check that 2 instances were deployed:

$ bosh instances

Using environment '10.0.0.6' as client 'admin'

Deployment 'learn-bosh-on-gcp'

Instance                  Process State  AZ  IPs
learn-bosh-on-gcp/guid-1  running        z1  10.0.0.10
learn-bosh-on-gcp/guid-2  running        z1  10.0.0.11

2 instances

Succeeded

See that we have 2 instances of our service running:

$ curl http://10.0.0.10:8080 --proxy socks5://127.0.0.1:5000

Hello, [your name] from <uuid-1>

$ curl http://10.0.0.11:8080 --proxy socks5://127.0.0.1:5000

Hello, [your name] from <uuid-2>

3. Change properties

Every release can specify a set of properties that need to be set in the deployment manifest and provided to the service. For example, these might be database credentials, the address of another service, etc.

Our release allows the port on which the server listens to be changed. You can see the list of properties that can be modified in learn-bosh-on-gcp-release/jobs/app/spec. Let's open manifest.yml and, under the properties section, set the value of port to 8888, not forgetting to remove the curly brackets after properties (which represented the formerly empty set):

...
jobs:
- name: app
  release: learn-bosh-on-gcp
  properties:
    port: 8888
...

Now we can simply re-deploy our manifest changes. Note that we don't need to build a new release version; the configuration files will be regenerated with the new properties:

$ bosh -d learn-bosh-on-gcp deploy manifest.yml

Let's see that our property was changed:

$ curl http://10.0.0.10:8888 --proxy socks5://127.0.0.1:5000

Hello, [your name] from <uuid-1>

$ curl http://10.0.0.11:8888 --proxy socks5://127.0.0.1:5000

Hello, [your name] from <uuid-2>

When something goes wrong

BOSH provides a set of recovery mechanisms. Let's break our deployment and find ways to fix it.

1. Failing service

BOSH uses monit to monitor running services. If a service goes down, monit will bring it back up. Let's watch how this works. SSH to one of the instances:

$ bosh -d learn-bosh-on-gcp ssh learn-bosh-on-gcp/0

$ sudo -i

# watch monit summary

The Monit daemon 5.2.5 uptime: 2m

Process 'app'                running
System 'system_localhost'    running

In a separate window (on the host), let's kill our running server:

$ curl http://10.0.0.10:8888/kill --proxy socks5://127.0.0.1:5000

Back in the instance window, notice that monit reports the process as 'Does not exist'; after a short period the service will be brought back up by monit.

2. Failing VM

What if there is a problem with the instance that is running our service? BOSH offers manual and automatic recovery when there are problems with infrastructure resources such as VMs or disks. In this exercise we are going to kill one of our instances and use the manual recovery option.

Let's destroy one of our instances. Delete the second instance - its name can be found under the NAME column in the output of the first command below (you may be prompted to confirm the instance's zone):

$ gcloud compute instances list

$ gcloud compute instances delete NAME_OF_SECOND_VM_INSTANCE

Let's see that one of the instances is in a bad state:

$ bosh instances

...

Instance                  Process State       IPs
learn-bosh-on-gcp/guid-1  running             10.0.0.10
learn-bosh-on-gcp/guid-2  unresponsive agent  10.0.0.11

...

One of the components in BOSH is the Health Monitor. It independently watches system health and will bring missing instances back up by instructing the infrastructure to recreate missing resources, such as VMs, with the required persistent disks. Keep running bosh instances and see that the instance is eventually brought back up and the service is running.

Now let's turn off automatic repair and manually resolve the issue.

$ bosh update-resurrection off

Delete one of the VM instances again, as described above. Then run cloud check and select the option "Recreate VM and wait for processes to start".

$ bosh -d learn-bosh-on-gcp cloud-check

The cloud check command allows you to manually resolve issues when resources (VMs and persistent disks) are in a bad state. Run bosh instances to see all instances running again.

Now let's re-enable automatic repair for completeness.

$ bosh update-resurrection on

3. Debugging failing deploy

When the deploy command fails, there could be a number of reasons:

  • Invalid network configuration in deployment manifest (e.g. IP address is in use or out of subnet range)
  • Infrastructure provider failed to create VM or disk (e.g. quota exceeded, instance type is not available)
  • Properties required by release were not provided in manifest
Let's add another job to our manifest and call it router. It will balance requests between the app servers in a round-robin fashion. Since the uploaded release already contains the router job, we don't need to update the release.

To do this, we'll create a new instance group and give it the router job. Add the following text to the bottom of manifest.yml:

- name: router
  azs:
  - z1
  templates:
  - name: router
  instances: 1
  vm_type: g1-small
  stemcell: default
  networks:
  - name: default
    static_ips: [10.0.0.12]

Re-deploy with the new job:

$ bosh -d learn-bosh-on-gcp deploy manifest.yml

...Failed: `router/0 (...)' is not running after update.

Uh-oh, looks like the deployment failed. Let's get our service logs, untar them and check the stderr log:

$ bosh -d learn-bosh-on-gcp logs router/0

We should find this error: "At least one server must be provided". The router fails to route because no servers were specified.

Let's add a property to the router job specifying our servers, pointing at their static IPs and ports:

- name: router
  azs:
  - z1
  templates:
  - name: router
  instances: 1
  vm_type: g1-small
  stemcell: default
  networks:
  - name: default
    static_ips: [10.0.0.12]
  properties:
    servers: ["http://10.0.0.10:8888", "http://10.0.0.11:8888"]

Re-deploy and see that it now succeeds.

Now running curl -L http://10.0.0.12:8080 --proxy socks5://127.0.0.1:5000 should give us responses from different servers.
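
To see the round-robin behaviour more clearly, you can issue several requests in a loop; consecutive responses should alternate between the two backend UUIDs:

$ for i in 1 2 3 4; do \
    curl -sL http://10.0.0.12:8080 --proxy socks5://127.0.0.1:5000; \
  done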

Done!

In this tutorial we created the necessary infrastructure in Google Cloud Platform and used the BOSH CLI to create a BOSH Jumpbox, through which we then created a BOSH Director. On that BOSH Director we deployed a release, updated the deployment with source changes, scaled the number of service instances and changed their properties. We also recovered from a failing service, a failing VM and a failing deploy.

The BOSH Director can work with any CPI (Cloud Provider Interface) that implements a certain API to manage IaaS resources. There are several supported CPIs for different IaaS providers: AWS, GCP, OpenStack, vSphere, vCloud and VirtualBox (a.k.a. BOSH Lite). You can read more about CPIs here: http://bosh.io/docs/cpi-api-v1.html.

To avoid consuming credits from your GCP account, don't forget to tear down all the instances in your project.
