Wednesday, August 14, 2013

Setting up your own S3 storage on VPS servers

After using Amazon S3 for a while, and seeing that the performance in Northern Europe is basically crap (even with CloudFront), it was time to try setting up an alternative on VPS servers that were closer to "home". I chose only to look at S3 API compatible storage as I already had apps running using S3 storage as a backend, and I wanted to move the storage for the apps to servers running in Northern Europe.

The requirements


Easy to scale
Replication
S3 Compatible API
Must run as an application on top of Ubuntu 12.04 LTS
Must use the storage available on a VPS
Can limit disk usage per node. (This is not an absolute requirement, but a nice-to-have requirement)
Can work on low to medium latency connections

The contenders

Riak CS


Runs on top of Riak
Riak is a database (Key/value store) and therefore runs on top of the majority of *nix based distributions
Easy to add nodes using command line tools to a Riak cluster
Has S3 compatible API
Replication is a must for a Riak cluster
No node is master/slave, all are equal (good for HA)
Has a nice web administration tool

Eucalyptus Walrus


Moderately difficult to add nodes
Has add-ons for S3 compatible APIs
Can use a normal folder for storage, but not if replication is used
Must have a block device to replicate
Easy to limit usage on a node
Difficult to configure replication
Must have a Cloud Controller, and for HA, a secondary controller

OpenStack Swift


Moderately difficult to scale
Can have problems with medium latency connections, due to writing on a majority of nodes
S3 compatible API as an addon
Must have block devices
Easy to limit usage on a node
Syncs through rsync. (I never liked rsync...)
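
To illustrate the latency point above, here is a minimal Ruby sketch. It assumes a deliberately simplified model in which a write completes as soon as a simple majority of replicas has acknowledged it. The method names and latency figures are made up for illustration and are not taken from Swift itself.

```ruby
# Number of replicas that must acknowledge a write for a simple majority.
def majority(replicas)
  replicas / 2 + 1
end

# Simplified model (an assumption): the write returns once the
# majority-th fastest replica has answered, so one slow replica is
# tolerated but a second slow one directly slows down every write.
def write_latency_ms(replica_latencies_ms)
  needed = majority(replica_latencies_ms.size)
  replica_latencies_ms.sort[needed - 1]
end

# Hypothetical round-trip times to three replicas, in milliseconds.
latencies = [5, 40, 120]
puts majority(latencies.size)     # 2
puts write_latency_ms(latencies)  # 40
```

With three replicas spread over different data centres, every write waits for the second fastest node, which is why medium latency links hurt more than the single local node would suggest.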

Cloudian


Has a Community Edition, but documentation is sign-up only (VMware/Citrix anyone?)
Claims to be OSS, but in reality: no.
I read the docs, but the documentation is sparse and it does not feel "production-ready"

Apache CloudStack


Has S3 API
Not usable as it requires a management server and a host/hypervisor system

Ceph


Is a distributed file system.
Easy to add nodes
Replicates across nodes
Has an S3 compatible API, although with some limitations (http://ceph.com/docs/next/radosgw/s3/)
Has a nice deploy-tool
Requires block devices for storage

Wrap up


Basically this gives two different directions.
Setting up Riak CS directly on the system or choosing Ceph, Walrus or Swift and setting up a file as a block device.

After reading up on the docs, I am considering both Ceph and Riak CS, and will start by testing Riak CS.

Both provide good Chef cookbooks, so for large scale deployments, take the time to set up Chef properly. If you plan to grow, it will save you time when you need that next node.

However, this is most likely going to be more expensive than using cloud storage, so do consider if you want to use your time on this or just pay for cloud storage.

Other OpenStack options would work fine as well, since the client library I am using supports both.

Thursday, August 8, 2013

wicked_pdf and stylesheets

There are several articles showing how to use wicked_pdf in a Ruby on Rails application, so I am not going to explain that here. Setup is quite simple, and the gem is just awesome.

There are however a few things to notice:

Outputs on OS X and Ubuntu can be inconsistent

If you are using wkhtmltopdf-binary-11 on OS X in development and the same binary in production on an Ubuntu system, the PDF output is likely to be quite different. Solution: Develop on your local computer, then fine-tune using an Ubuntu machine in VirtualBox.

Stylesheets don't 'behave'

You might get mixed results when using linked stylesheets. The simple solution is to inline your pdf or print stylesheet. Put this in your pdf layout file:
<style type="text/css">
<%= Rails.application.assets.find_asset('pdf_print.css').to_s.html_safe %>
</style>
This way your stylesheet will be inline, and all linking issues with wkhtmltopdf are resolved. Full paths for your assets will sometimes work as well (e.g. if they are hosted on a CDN), but in my experience inline CSS is more consistent when using wkhtmltopdf.
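
The same idea can be sketched in plain Ruby, outside the asset pipeline. The helper name inline_stylesheet is hypothetical and not part of wicked_pdf or Rails. It also assumes the stylesheet is already plain CSS on disk, since reading the file directly skips any Sass or ERB processing the asset pipeline would normally do.

```ruby
require "tempfile"

# Hypothetical helper: read a CSS file and wrap its contents in a
# <style> tag, so the HTML handed to wkhtmltopdf carries its own styles.
def inline_stylesheet(path)
  css = File.read(path)
  %(<style type="text/css">\n#{css}\n</style>)
end

# Demo with a temporary file standing in for pdf_print.css.
Tempfile.create(["pdf_print", ".css"]) do |file|
  file.write("body { font-size: 12pt; }")
  file.flush
  puts inline_stylesheet(file.path)
end
```

In a Rails layout the result would still need .html_safe, just like the find_asset call above.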

After writing this blog post, I found this repository:
https://github.com/jwo/railsdotpdf
It is a fully working example of generating PDFs with linked assets instead of inline CSS. It is a fairly basic example, but it shows how to set up wicked_pdf with linked stylesheets.

Friday, August 2, 2013

Nginx max length of server_names / server_names_hash_bucket_size

I have deployed quite a lot of sites lately, and since some of them have more than one alias, I tend to use the server_name directive with more than one name.

e.g. in /etc/nginx/sites-enabled/mysite.conf
server_name somesite.example.com othersite.on.example2.com;

This will throw the following error:
Starting nginx: nginx: [emerg] could not build the server_names_hash, you should increase server_names_hash_bucket_size: 32

To fix this, just increase the number in the http block of nginx.conf:
server_names_hash_bucket_size 128;
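The value you need depends on the length of your longest server name rather than on how many aliases you have. The nginx documentation recommends raising it to the next power of two, so a lower value than 128 may be enough, and very long names may need more.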
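
As a rough way to pick a value, the following Ruby sketch estimates the smallest power-of-two bucket size that fits a list of server names. The formula is my reading of the size check in nginx's hash code and assumes a 64-bit build with 8-byte pointers, so treat the result as a starting point rather than a guarantee. The method names are hypothetical.

```ruby
# Round n up to the next multiple of alignment.
def align(n, alignment)
  ((n + alignment - 1) / alignment) * alignment
end

# Estimated bytes one name needs in a hash bucket (assumption: a pointer,
# the name plus a 2-byte length rounded up to pointer size, and one
# terminating pointer at the end of the bucket).
def bucket_bytes_for(name, pointer_size)
  pointer_size + align(name.bytesize + 2, pointer_size) + pointer_size
end

# Smallest power of two that fits the longest name.
def min_bucket_size(names, pointer_size: 8)
  needed = names.map { |name| bucket_bytes_for(name, pointer_size) }.max
  size = 1
  size *= 2 while size < needed
  size
end

names = ["somesite.example.com", "othersite.on.example2.com"]
puts min_bucket_size(names)  # 64
```

Under these assumptions the two example names do not fit in 32 bytes, which matches the error above, and 64 would already be enough.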