Before we went live with the new Rails site at CarePages, we decided to use Munin for monitoring. We needed to write a few custom plugins to visualize application health. Fortunately, Munin plugins are really easy to write - I'll provide the source for a couple of them. We wrote plugins for Passenger process status, Passenger memory stats, Rails response time, Rails hits, the ARMailer queue, and the BJ queue. Munin also comes with default plugins for things like CPU and memory usage that are good to use for correlation.
Passenger has a passenger-status command that outputs the status of all the Rails (or Rack) processes. It reports the maximum number of processes that can be spawned, how many processes are currently running, and how many are in use. The output looks like this:
    $ sudo passenger-status
    ----------- General information -----------
    max      = 16
    count    = 4
    active   = 2
    inactive = 3

    ----------- Applications -----------
    /redacted/releases/20081024053350:
      PID: 18984     Sessions: 0
      PID: 18980     Sessions: 0
      PID: 20126     Sessions: 1
      PID: 15735     Sessions: 1
And here's the Munin graph. Parsing the output of the passenger-status command was simple, and it gave us good visibility into application health. Here's the source for the passenger-status Munin plugin.
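For readers who want a feel for how little code this takes, here is a minimal sketch in Python. It is not the plugin linked above. It assumes only the standard Munin plugin contract (print graph metadata when called with `config`, otherwise print `field.value N` lines) and the `name = number` lines shown in the passenger-status output. The graph title and category are made up for illustration.

```python
import re
import shutil
import subprocess
import sys

# Field names taken from the "General information" block of passenger-status.
FIELDS = ("max", "count", "active", "inactive")


def parse_status(text):
    """Extract the 'name = number' pairs from passenger-status output."""
    pattern = r"\b(max|count|active|inactive)\s*=\s*(\d+)"
    return {name: int(number) for name, number in re.findall(pattern, text)}


def config_lines():
    """Lines Munin expects when the plugin is run with the 'config' argument."""
    lines = [
        "graph_title Passenger status",   # illustrative title
        "graph_category passenger",       # illustrative category
        "graph_vlabel processes",
    ]
    lines += [f"{name}.label {name}" for name in FIELDS]
    return lines


def value_lines(values):
    """Lines Munin expects on a normal (no argument) run."""
    return [f"{name}.value {values[name]}" for name in FIELDS if name in values]


def main(argv):
    if len(argv) > 1 and argv[1] == "config":
        print("\n".join(config_lines()))
        return
    if shutil.which("passenger-status") is None:
        # Nothing to report on a host without Passenger installed.
        print("passenger-status not found", file=sys.stderr)
        return
    result = subprocess.run(
        ["passenger-status"], capture_output=True, text=True, check=True
    )
    print("\n".join(value_lines(parse_status(result.stdout))))


if __name__ == "__main__":
    main(sys.argv)
```

Since passenger-status was run with sudo above, a plugin like this would probably need to be configured to run as a privileged user in Munin's plugin configuration.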
Passenger also provides memory stats via passenger-memory-stats. It shows the amount of memory being used per process. Here's what it looks like in the form of a Munin graph. You can see that as processes go idle when traffic is low overnight, Passenger kills them, freeing up memory. You can also see the application slowly leaking memory during the day.
We wanted to be able to correlate utilization with actual traffic, so we also made a plugin to graph requests that hit Rails. We set up a custom log to make this easier, but in the end we had a plugin that would chart how many requests were hitting Rails every 5 minutes per app server.
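The custom log format isn't described here, so the following is only a sketch under a stated assumption: one request per line, each line starting with an ISO-8601 timestamp. Under that hypothetical format, counting the hits in the last 5 minutes comes down to a few lines of Python. The function names are made up for this example.

```python
from datetime import datetime, timedelta

# Munin polls every 5 minutes by default, so count one window's worth.
WINDOW = timedelta(minutes=5)


def count_recent(lines, now, window=WINDOW):
    """Count log lines whose leading timestamp falls in (now - window, now]."""
    cutoff = now - window
    hits = 0
    for line in lines:
        stamp = line.split(" ", 1)[0]
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # skip lines that don't start with a timestamp
        if cutoff < when <= now:
            hits += 1
    return hits


def hits_value(lines, now):
    """The single line a Munin plugin would print on a normal run."""
    return f"hits.value {count_recent(lines, now)}"


# Hypothetical log lines: timestamp, path, runtime in seconds.
example = [
    "2008-10-24T11:59:59 /home 0.120",
    "2008-10-24T12:01:00 /visit/1 0.340",
    "2008-10-24T12:04:30 /visit/2 0.095",
]
print(hits_value(example, datetime(2008, 10, 24, 12, 5, 0)))
```

A real plugin would read the lines from the log file and pass `datetime.now()` for `now`.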
And of course, we wanted to be able to see the app performance using mean and median response times. We logged Rails' X-Runtime header in our custom log, and we used that to produce yet another Munin graph. I'll blog about the details of that custom log later - we found several good uses for it.
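As a rough illustration of the mean and median calculation, here is a Python sketch. It assumes a hypothetical log layout in which the last whitespace-separated field on each line is the X-Runtime value in seconds. The real custom log may look quite different.

```python
import statistics


def runtimes(lines):
    """Collect the trailing numeric field (assumed to be X-Runtime) per line."""
    values = []
    for line in lines:
        fields = line.split()
        try:
            values.append(float(fields[-1]))
        except (IndexError, ValueError):
            continue  # blank line or non-numeric last field
    return values


def response_time_values(lines):
    """Munin value lines for mean and median response time, in seconds."""
    samples = runtimes(lines)
    if not samples:
        # "U" is how a Munin plugin reports an unknown value.
        return ["mean.value U", "median.value U"]
    return [
        f"mean.value {statistics.mean(samples):.3f}",
        f"median.value {statistics.median(samples):.3f}",
    ]


# Hypothetical log lines: timestamp, path, runtime in seconds.
example = [
    "2008-10-24T12:01:00 /home 0.100",
    "2008-10-24T12:02:00 /visit/1 0.200",
    "2008-10-24T12:03:00 /visit/2 0.900",
]
print("\n".join(response_time_values(example)))
```

Graphing both numbers is useful because a few slow requests pull the mean up while leaving the median nearly unchanged, as the one slow request in the example data does.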
Munin comes with some default plugins that are good to correlate with the application-specific plugins that we wrote. For example, here's the CPU usage graph.
Munin also has bundled plugins for MySQL, making it easy to keep an eye on queries per second and slow queries.
Finally, it was also easy to write plugins to keep an eye on background queues. Here's one that watches the AR Mailer queue.
And another that watches the BJ queue (source).