Deploying Couchbase in infrastructure containers
Couchbase's standout performance, built-in sharding, and cross-datacenter replication features make it ideal for cloud-scale applications serving hundreds of millions or billions of users. However, typical cloud infrastructure is ironically unsuited to high-performance Couchbase deployments. The high network and file I/O latency imposed by hardware virtual machines (the hardware hypervisor tax) has forced many users to trade cloud elasticity for bare metal performance.
Infrastructure containers running in Joyent's Triton Elastic Container Service or in private data centers powered by Triton Elastic Container Infrastructure offer an alternative that delivers the bare metal performance and elasticity demanded by sophisticated Couchbase ops teams.
Containerizing an application in infrastructure containers is easy because they offer all the services of a typical Unix host and behave similarly to hardware virtual machines. Containers enjoy their own virtual NICs, filesystems, and all the resource and security isolation that you'd expect of a hardware VM, but with the elastic performance and bursting that's only possible with containers.
Create an infrastructure container running container-optimized CentOS
Couchbase recommends installing on RHEL-based operating systems such as CentOS, so let's start with an infrastructure container running container-optimized CentOS.
You can start a container in the dashboard with just a few clicks. Find the "create instance" button, search for CentOS, and then choose the memory, CPU, and disk size.
I prefer to create containers using the command line tools, since that allows me to kickstart the installation with a script that automatically runs as the container is provisioned.
If you've got the tools installed, just copy and paste this code block:
curl -sL -o couchbase-install-triton-centos.bash https://raw.githubusercontent.com/misterbisson/couchbase-benchmark/master/bin/install-triton-centos.bash
sdc-createmachine \
    --name=couchbase-container-benchmarks-1 \
    --image=$(sdc-listimages | json -a -c "this.name === 'centos-6' && this.type === 'smartmachine'" id | tail -1) \
    --package=$(sdc-listpackages | json -a -c "this.memory === 16384 && /^t4/.test(this.name)" id | tail -1) \
    --networks=$(sdc-listnetworks | json -a -c "this.name ==='default'" id) \
    --networks=$(sdc-listnetworks | json -a -c "this.name ==='Joyent-SDC-Public'" id) \
    --script=./couchbase-install-triton-centos.bash
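Each flag in the command above is filled in by a lookup, and a filter that matches nothing leaves its flag empty. The following is a minimal sketch of a pre-flight check, not part of the original scripts: `require_id` is a hypothetical helper name, and the lookups only run where the `sdc-*` tools are installed.

```shell
#!/usr/bin/env bash
# Sketch: resolve lookups separately and stop if any comes back empty.
# require_id is a hypothetical helper, not part of the sdc tools.
require_id() {
    # $1 = label for the message, $2 = value returned by the lookup
    if [ -z "$2" ]; then
        echo "no $1 matched; check the filter" >&2
        return 1
    fi
    echo "$1=$2"
}

# Only run the lookups where the sdc tools are available.
if command -v sdc-listimages >/dev/null 2>&1; then
    image=$(sdc-listimages | json -a -c "this.name === 'centos-6' && this.type === 'smartmachine'" id | tail -1)
    package=$(sdc-listpackages | json -a -c "this.memory === 16384 && /^t4/.test(this.name)" id | tail -1)
    require_id image "$image" && require_id package "$package"
fi
```

If both lines print an ID, the same filters should resolve inside the `sdc-createmachine` call as well.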
Another advantage of using that code block is that it provisions the container in our beta data center, which features the latest equipment and software before it's deployed in our production data centers. The generation of software and hardware there will be in all our data centers worldwide soon, but it's fun to test on the latest stuff, no?
Install and configure Couchbase
Infrastructure containers look and feel a lot like virtual machines, just faster, so installing software is as simple as SSHing in. You can get the connection information in the portal or via the API.
Look up the IP address for this new instance using the API:
sdc-listmachines | json -a -c "this.name === 'couchbase-container-benchmarks-1'" ips.1
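A freshly created container may not accept SSH connections right away. As a rough sketch, a small retry loop can wait for it before opening a session. The `retry` helper, the attempt count, the delay, and the `root` login are all assumptions for illustration, not something the original scripts provide.

```shell
#!/usr/bin/env bash
# Sketch: retry a command until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay-seconds> <command...>
retry() {
    local attempts="$1" delay="$2" n=1
    shift 2
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Only attempt the connection where the sdc tools are installed.
if command -v sdc-listmachines >/dev/null 2>&1; then
    ip=$(sdc-listmachines | json -a -c "this.name === 'couchbase-container-benchmarks-1'" ips.1)
    # Assumption: root login using the SSH key registered on the account.
    retry 30 10 ssh -o ConnectTimeout=5 -o BatchMode=yes "root@${ip}" true \
        && ssh "root@${ip}"
fi
```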
Once in, installing and configuring Couchbase is easy with this script:
curl -sL https://raw.githubusercontent.com/misterbisson/couchbase-benchmark/master/bin/install-triton-centos.bash | bash
You can see the details of how that script works in the GitHub repo. There are actually three scripts that install, set environment variables, and configure Couchbase.
When done, the script will output some summary information about Couchbase and how to connect to the dashboard. Open the dashboard now, because we're about to run some benchmarks and it will be fun to watch the graphs there while the action happens.
# Couchbase is installed and configured
#
# Dashboard: http://165.225.138.202:8091
# Internal IP: 10.112.5.196
# Bucket: benchmark
# username=Administrator
# password=password
#
Run the benchmarks
The benchmarks will load a large data set and execute a handful of queries designed by decimal.io's Corbin Uselton to test relative performance. As the benchmarks execute, look at the time to load data and query it. Shorter times are better.
The benchmarking tool uses Node.js, so some of the first steps are to install Node.js and some other dependencies. Kick it all off like so:
curl -sL https://raw.githubusercontent.com/misterbisson/couchbase-benchmark/master/bin/benchmark.bash | bash
The result should look similar to the following:
series 1: load test docs [====================] 300000/300000 100% 8.3s elapsed
completed series 1
series 2: load test docs [====================] 300000/300000 100% 7.6s elapsed
completed series 2
series 3: load test docs [====================] 300000/300000 100% 7.9s elapsed
completed series 3
series 4: load test docs [====================] 300000/300000 100% 7.0s elapsed
completed series 4
series 5: load test docs [====================] 300000/300000 100% 7.3s elapsed
completed series 5
series 6: load test docs [====================] 300000/300000 100% 8.7s elapsed
completed series 6
series 7: load test docs [====================] 300000/300000 100% 7.5s elapsed
completed series 7
series 8: load test docs [====================] 300000/300000 100% 8.8s elapsed
completed series 8
query people with SUVs
people with SUVs: 342732
people with SUVs in: 9322ms
query number of convertibles
number of convertibles: 342446
number of convertibles in: 346ms
query average age
average age: 42
average age: 331ms
waiting 60 seconds to run queries again
query people with SUVs
people with SUVs: 342732
people with SUVs in: 7041ms
query number of convertibles
number of convertibles: 342446
number of convertibles in: 343ms
query average age
average age: 42
average age: 436ms
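To compare runs across environments, it can help to boil the load times down to one number. Here is a minimal sketch that averages the per-series load times from saved output. The `mean_load_seconds` name is hypothetical, and the sketch assumes each series ends with one line shaped like the sample above.

```shell
#!/usr/bin/env bash
# Sketch: average the per-series load times from saved benchmark output.
# Assumes lines shaped like:
#   series 1: load test docs [====] 300000/300000 100% 8.3s elapsed
mean_load_seconds() {
    awk '
        /^series [0-9]+: load test docs/ {
            for (i = 1; i <= NF; i++) {
                # The elapsed time is the only field that is a number followed by "s".
                if ($i ~ /^[0-9.]+s$/) {
                    t = $i
                    sub(/s$/, "", t)
                    total += t
                    count++
                }
            }
        }
        END {
            # Signal failure when no load lines were found.
            if (count == 0) exit 1
            printf "%.2f\n", total / count
        }
    '
}
```

For example, if the benchmark output was saved to a file (say, a hypothetical `benchmark.log`), `mean_load_seconds < benchmark.log` prints the mean load time in seconds.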
The benchmarks will run five times, but you can trigger them manually using the following command string:
cloud-benchmark run -d /root/cb-cloud-benchmark-data-79bd88b76cbf9cbec987d84f1ef6ad996973d526 -c couchbase://127.0.0.1
There are instructions in the repository for installing on AWS and on generic CentOS environments, so go ahead and compare the numbers. I think you'll be impressed with the performance per dollar. Say goodbye to the hypervisor tax and say hello to bare metal performance with the ease and scale of cloud infrastructure: Couchbase without Compromise.
Surprises
When installing, you might notice Couchbase detect 32 or 48 CPUs. That's no lie. Containers really do run on bare metal and can see all the CPUs the hardware has available. How many of those CPUs you can actually use (or how much memory or disk) depends on the package you selected when creating the container. For best performance, the installation script includes some additional guidance to tell Couchbase not to try using all the CPUs it can see. Part of the container performance advantage is the flexibility to individually schedule threads on any available CPU, but it does require telling the application how many threads it's limited to running simultaneously.
Post written by Casey Bisson