Installing Bosun for production

June 21, 2015

Introduction

Bosun is (description from official site) an open-source, MIT licensed, monitoring and alerting system by Stack Exchange.

Bosun installation usually consists of:

Bosun server (single static binary written in Go)
Backend(s): OpenTSDB, Graphite or Elasticsearch with Logstash schema
scollector agent for gathering system metrics

We’ll focus on installing Bosun with the OpenTSDB backend, as it’s currently the best supported, although the Bosun team tries its best not to depend on it heavily.

What we’ll do

Disclaimer: this is an example installation. It’s insecure, not tuned and most likely inefficient to run continuously. It’ll also cost real money to play with (according to my calculations, the procedure takes less than an hour and costs $0.36 USD per hour). Feel free to explore the result and adapt my scripts and configurations to your environment.

I’ll guide you through installing HBase, OpenTSDB and Bosun on an example infrastructure in DigitalOcean. The instructions were prepared with a Unix-like system in mind, i.e. a recent GNU/Linux or Mac OS X system.

We sysadmins should strive for maximum automation and repeatable, consistent operations. That is why this guide uses Terraform and Ansible, tools that’ll automagically transform this task from boring and error-prone to easy and fun. You’ll need a DigitalOcean account with a positive balance (shameless plug: if you don’t already have one, here’s my referral DO link).

Installing Terraform: go to the Terraform download page, download the archive and unzip the files somewhere in your $PATH.

Installing Ansible: official guide here.

Install Git and clone this repo:

git clone git@github.com:dimamedvedev/bosun-install-example.git
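Before going further, it can help to confirm that the tools this guide relies on are actually reachable from your shell. The following is a minimal sketch, not part of the repository; the function name missing_tools is made up for illustration.

```shell
# Print every required tool that cannot be found on $PATH.
# Returns non-zero if at least one tool is missing.
# (missing_tools is a hypothetical helper, not part of the repo.)
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}
```

Calling it as missing_tools terraform ansible-playbook git ssh prints nothing and returns success when all four are installed.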

Let’s start with the most intimidating part - HBase.

Installing HBase

Hadoop is well known for its weight and complexity; no wonder the project mascot is an elephant. Installing it by hand is cumbersome, to say the least. I’ll describe the Apache Ambari method, although Cloudera’s distribution is also fine.

You’ll need 5 hosts, preferably running CentOS/RHEL 6.5.

The first host is for Apache Ambari itself; it may be slightly less powerful (let’s call it ambari).
The second is for the NameNode, SecondaryNameNode and monitoring services; it may be low on disk space (bosun-store00).
The rest are for Hadoop data nodes and HBase region servers (bosun-store01, bosun-store02, bosun-store03).

Creating droplets

First, generate a new SSH key (or use an existing one):

cd bosun-install-example
./generate-ssh-key.sh

Copy and paste the last line of the script output to export the TF_VAR_ssh_fingerprint shell environment variable, which Terraform will use to set up access to the droplets.

Generate a DigitalOcean token here and export it as a shell variable too, like this:

export TF_VAR_do_token="0000000111111112222222..."
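Terraform picks both values up from the environment, so a missing or empty variable only shows up once it runs. Here is a small sketch of a guard you could run first; require_vars is a hypothetical helper name, not something from the repository.

```shell
# Report which of the named environment variables are unset or empty.
# Returns non-zero if any of them is missing.
# (require_vars is a hypothetical helper; pass trusted variable names only,
# since the lookup goes through eval.)
require_vars() {
    rc=0
    for name in "$@"; do
        # Indirect variable lookup that works in plain POSIX sh.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

For example, require_vars TF_VAR_do_token TF_VAR_ssh_fingerprint && terraform plan only runs the plan when both variables are set.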

Everything’s ready for droplet creation. Run terraform plan to check everything, and then:

terraform apply

It’ll take around a minute. Next, run ./gen_ansible_inventory.sh to create the Ansible inventory file.

Prepare droplets for Apache Ambari

Ansible is an agentless configuration management solution that works over SSH. There are a few so-called playbooks in the playbooks directory that’ll do all the boring stuff.

First, install the Ambari repository, ambari-agent and ntpd on all droplets for Ambari to use. Installing the agent should be optional, but for some unknown reason (a bug?) provisioning fails without it.

ansible-playbook --private-key=files/id_rsa -i inventory ./playbooks/common.yml

Then install the Ambari server:

ansible-playbook --private-key=files/id_rsa -i inventory ./playbooks/ambari.yml

Connect via SSH to the ambari server; its IP is in the ./inventory and ./hosts files:

ssh -i ./files/id_rsa root@$(grep ambari ./hosts | awk '{print $1}')
ambari-server setup

Choose the default options, start the server with /etc/init.d/ambari-server start and open http://ambari-server-ip-here:8080. Log in as admin:admin; you may need to wait a bit for the server to initialize.
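If you would rather not keep refreshing the browser while the server initializes, the wait can be scripted. This is a rough sketch under two assumptions: curl is installed on your machine, and the Ambari web UI answers with a successful HTTP status once it is ready. Both function names are made up for illustration.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# usage: retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# (retry_until and url_is_up are hypothetical helpers.)
retry_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only between attempts, not after the final failure.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Succeeds once the given URL answers with a successful HTTP status.
# curl's --fail turns HTTP errors (status 400 and above) into failures.
url_is_up() {
    curl --silent --fail --output /dev/null --max-time 5 "$1"
}
```

With the placeholder address from above, retry_until 30 10 url_is_up "http://ambari-server-ip-here:8080" polls every 10 seconds for up to 30 attempts.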

Install Hadoop in Apache Ambari GUI

Apache Ambari is an easy way to manage a lot of Hadoop services. I’ll choose a rather basic setup just to get things running: no HA/DR and no security.

Press Launch Install Wizard, name your cluster, choose distribution, insert

bosun-store00
bosun-store01
bosun-store02
bosun-store03

into the Target Hosts field and select the files/id_rsa key via the Browse button. Press Next and ignore the Transparent Huge Pages and IPTables warnings for now.

Choose HDFS, HBase, Zookeeper and Ambari Metrics.

Assign masters:

bosun-store00 - NameNode, SNameNode, HBase Master
bosun-store01-03 - ZooKeeper Server

Assign all Data Nodes and Region Servers to the bosun-store01-03 droplets. Proceed to Install; this step will take around 10 minutes.

Create OpenTSDB tables in HBase

Congratulate yourself on a successful Hadoop install. Then run Ansible again to install OpenTSDB on the HBase nodes. We won’t run OpenTSDB there; it’s only needed to create the necessary tables. You should run ./create_opentsdb_tables.sh once. Make sure to read this script: GZ compression ~~is explicitly required for Bosun to work~~ will save you quite a bit on disk space and data retrieval latency (thanks, Victor Boschi!).

ansible-playbook --private-key=files/id_rsa -i inventory ./playbooks/bosun-store.yml
ssh -i ./files/id_rsa root@$(grep bosun-store01 ./hosts | awk '{print $1}')
cd /root
./create_opentsdb_tables.sh
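To double-check the result, you can compare the tables HBase reports against the ones OpenTSDB needs. The sketch below assumes the script wraps OpenTSDB's stock create_table.sh, which in OpenTSDB 2.x creates four tables (tsdb, tsdb-uid, tsdb-tree and tsdb-meta), and that the table listing puts each name on its own line. The function name is made up for illustration.

```shell
# Read table names (one per line) on stdin and print any of the four
# OpenTSDB 2.x tables that are absent. Returns non-zero if one is missing.
# (missing_tsdb_tables is a hypothetical helper, not part of the repo.)
missing_tsdb_tables() {
    present=$(cat)
    rc=0
    for table in tsdb tsdb-uid tsdb-tree tsdb-meta; do
        # -x matches whole lines only, so "tsdb" does not match "tsdb-uid".
        if ! printf '%s\n' "$present" | grep -qxF -- "$table"; then
            echo "$table"
            rc=1
        fi
    done
    return "$rc"
}
```

On an HBase node, something like echo list | hbase shell 2>/dev/null | missing_tsdb_tables should print nothing when all four tables exist.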

Install OpenTSDB and Bosun

OpenTSDB may be run on several nodes, but for the purposes of this guide it’ll be installed on the same host as Bosun. Another Ansible playbook will:

Install Java
Install OpenTSDB from an RPM hosted on GitHub
Generate an opentsdb.conf that includes the ZooKeeper servers from the zookeeper group defined in the Ansible inventory (for some reason only IP addresses work)
Install Bosun with a minimal config file
Install Runit from packagecloud.io repo
Run Bosun through Runit
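For the opentsdb.conf step, the relevant setting is tsd.storage.hbase.zk_quorum, which takes a comma-separated list of ZooKeeper hosts. The playbook generates it from the inventory; the sketch below only illustrates the resulting line. The function name and the IP addresses are placeholders, not values from the repository.

```shell
# Join the IP addresses given as arguments into the comma-separated
# ZooKeeper quorum line used in opentsdb.conf.
# (zk_quorum_line is a hypothetical helper for illustration only.)
zk_quorum_line() {
    quorum=""
    for ip in "$@"; do
        # Prepend a comma only when the list already has an entry.
        quorum="${quorum:+$quorum,}$ip"
    done
    printf 'tsd.storage.hbase.zk_quorum = %s\n' "$quorum"
}
```

Running zk_quorum_line 10.0.0.11 10.0.0.12 10.0.0.13 prints tsd.storage.hbase.zk_quorum = 10.0.0.11,10.0.0.12,10.0.0.13.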

If everything is right, you can open http://bosun-server-ip-here:8070/items and see that there’s already some data flowing. You can play around, see how it works, test how HBase behaves, etc. After that, destroying the droplets is as simple as:

terraform destroy

Conclusion

I hope this guide has shown you that installing Bosun in a very basic configuration is actually rather easy. Bosun is also bleeding-edge software, so I may very well update this post several times.

I’m still undecided on what to write next. Some ideas:

Securing Bosun installation: firewall, TLS, Auth/z
Configuring Hadoop HA
Using tsdbrelay in case Bosun is down
Bosun in a Chef environment

Feel free to vote in the comments. You can also get in touch on the Slack channel.