AWS Lambda is unique among PaaS offerings. Lambda takes all the utility grid analogies we use to explain the cloud and embraces them to the extreme.

Lambda runs a function you define in a Node.js or Java 8 runtime, although you can execute a subshell to run other kinds of processes. Amazon charges by allocated memory and execution time, in increments of 128 MiB and 100 ms. The upper limit for memory is 1.5 GiB and your Lambda function cannot take more than 60 seconds to complete, although you can set lower limits for both.

There is a pretty generous free tier, and even when you exceed it, pricing is still very friendly: $0.00001667 per GiB-second of compute and $0.20 per million invocations.

To bring that down to earth, let’s say you write a Lambda function that takes on average 500ms to run and uses 256 MiB of memory. Each invocation consumes 0.125 GiB-seconds of compute (0.5 s × 0.25 GiB), so you could handle 3.2M requests before exhausting the free compute tier; your only charge would be about $0.44 for the 2.2M requests beyond the 1M-request free tier. Another 3.2M requests would cost about $7.31: roughly $6.67 in compute time plus $0.64 in request charges.
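
If you want to play with these numbers yourself, here’s a rough sketch of the arithmetic (the 400,000 GiB-second monthly compute allowance is implied by the figures above; the constants are just the prices quoted earlier):

// Back-of-the-envelope sketch of the Lambda cost arithmetic described above
var PRICE_PER_GIB_SECOND = 0.00001667;
var PRICE_PER_REQUEST    = 0.20 / 1000000;
var FREE_GIB_SECONDS     = 400000;   // monthly free compute tier
var FREE_REQUESTS        = 1000000;  // monthly free request tier

function monthlyCost(requests, secondsPerRequest, gibAllocated) {
  var gibSeconds = requests * secondsPerRequest * gibAllocated;
  var computeCharge = Math.max(gibSeconds - FREE_GIB_SECONDS, 0) * PRICE_PER_GIB_SECOND;
  var requestCharge = Math.max(requests - FREE_REQUESTS, 0) * PRICE_PER_REQUEST;
  return computeCharge + requestCharge;
}

console.log(monthlyCost(3200000, 0.5, 0.25)); // ~0.44 -- request charges only, compute still free
console.log(monthlyCost(6400000, 0.5, 0.25)); // ~7.75 -- the first 3.2M plus ~7.31 for the next 3.2M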

Since my company’s new static web page needed a contact form handler, I took the opportunity to learn about how Lambda can provide cheap, dynamic service for a static site.

In the example below, I’ll show you what I came up with. The idea is that I would present a simple, static web form to my users, and submitting the form would trigger some client-side JavaScript to validate the contents and post them to a remote endpoint. The endpoint would be provided by the AWS API Gateway service, which would in turn trigger a Lambda function. The Lambda function would perform any required server-side validation and then use the AWS SDK for Node.js to send an email via AWS Simple Email Service. Just like any other API endpoint, the Lambda function can return information about the result of its own execution in an HTTP response back to the client:

var AWS = require('aws-sdk');
var ses = new AWS.SES({apiVersion: '2010-12-01'});

function validateEmail(email) {
  var tester = /^[-!#$%&'*+\/0-9=?A-Z^_a-z{|}~](\.?[-!#$%&'*+/0-9=?A-Z^_a-z`{|}~])*@[a-zA-Z0-9](-?\.?[a-zA-Z0-9])*(\.[a-zA-Z](-?[a-zA-Z0-9])*)+$/;
  if (!email) return false;

  if(email.length>254) return false;

  var valid = tester.test(email);
  if(!valid) return false;

  // Further checking of some things regex can't handle
  var parts = email.split("@");
  if(parts[0].length>64) return false;

  var domainParts = parts[1].split(".");
  if(domainParts.some(function(part) { return part.length>63; })) return false;

  return true;
}

exports.handler = function(event, context) {
  console.log('Received event:', JSON.stringify(event, null, 2));

  if (! || === '') {'Must provide email'); return; }
  if (!event.message || event.message === '') {'Must provide message'); return; }

  var email = unescape(;
  if (!validateEmail(email)) {'Must provide valid from email'); return; }

  var messageParts = [];
  var replyTo = + " <" + email + ">";

  if ( messageParts.push("Phone: " +;
  if ( messageParts.push("Website: " +;
  messageParts.push("Message: " + event.message);

  var subject = event.message.replace(/\s+/g, " ").split(" ").slice(0,10).join(" ");

  var params = {
    Destination: { ToAddresses: [ 'Branded Crate <>' ] },
    Message: {
      Body: { Text: { Data: messageParts.join("\r\n"), Charset: 'UTF-8' } },
      Subject: { Data: subject, Charset: 'UTF-8' }
    },
    Source: "Contact Form <>",
    ReplyToAddresses: [ replyTo ]
  };

  ses.sendEmail(params, function(err, data) {
    if (err) {
      console.log(err, err.stack);;
    } else {
      context.succeed('Thanks for dropping us a line');

Not bad, right? I’ve just added an element of dynamism to my static web site. It’s highly available, costs nothing, and there are no servers to manage and no processes to monitor. AWS provides some basic monitoring, and any script output is available in CloudWatch for inspection. Now that basically all browsers support CORS, your users can make cross-origin requests from anywhere on the web. Setting this up in AWS is a bit ugly, but I’m willing to make the effort to get all the benefits that come along with it.

I’m excited about the possibilities of doing much more with Lambda, especially the work Austen Collins is doing with his new Lambda-based web framework, JAWS.

The hardest part about this whole thing was properly setting up the API Gateway. I tried in vain to get API Gateway to accept URL-encoded form parameters, but that was a losing battle. Just stick with JSON.
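
For what it’s worth, here’s roughly what the client-side submission looks like when posting JSON. This is just a sketch: the endpoint URL is a placeholder and submitContactForm is an illustrative name, but the field names are the ones the Lambda handler above expects.

// Hypothetical client-side submit; the endpoint URL below is a placeholder
function submitContactForm(fields) {
  var xhr = new XMLHttpRequest();'POST', '', true);
  xhr.setRequestHeader('Content-Type', 'application/json');
  xhr.onload = function () {
    // The Lambda function's success or failure message comes back in the response body
    console.log(xhr.status, xhr.responseText);
  };
  xhr.send(JSON.stringify(fields));
}

submitContactForm({
  name: 'Jane Doe',
  email: '',
  message: 'Hello from the static site!'
});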


A recent client of mine needed an app to help him build bite-sized CSV files from a large PostgreSQL table. The problem seemed simple enough: it takes little time to write a Rails action that queries a table, generates CSV from the objects in memory and flushes it out to the client. Writing an app to do this one thing using a traditional Rails action is a matter of just a few hours.

But our client wanted to run queries that could return 10, 20 or even 100 thousand records. With that many records, performance suffers because the application spends a lot of CPU time turning all of those in-memory records into a pile of in-memory strings for the CSV file. Doing this work in the application, entirely before sending the response, means the app consumes a lot of memory and a lot of CPU time. Eventually these responses would come through, but once you’re talking about 30+ second response times, you run into trouble on two fronts: users who don’t want to wait that long, and application environments where that kind of resource use and those response times are unacceptable or even disallowed.

Since I was querying a pretty large Postgres table (200M+ rows) with a fairly involved query type (geographic proximity), I spent a lot of time debugging the query before realizing the problem was really in my own app. After I realized what was going on, I set about looking for a better way to build the CSV and send it to the waiting client. I found two things. First, I found that Postgres can directly generate CSV from any query and stream it back on the socket. And second, I found that Rails can stream the response coming from Postgres, directly to the end-user waiting on the other end of the HTTP connection using ActionController::Live.

Here’s how it works. I’ve taken all the application-specific content out of here so you can more clearly see this technique:

class SearchController < ApplicationController
  include ActionController::Live

  def run
    response.headers['Content-Disposition'] = 'attachment; filename="filename.csv"'
    response.headers['Content-Type'] = 'text/csv'

    # Grab the raw PG connection and let Postgres generate the CSV, streaming
    # each row straight out to the client; "query" holds the application-specific SQL
    conn = ActiveRecord::Base.connection.instance_variable_get(:@connection)
    conn.copy_data "COPY ( #{query} ) TO STDOUT WITH CSV HEADER;" do
      while row = conn.get_copy_data
        response.stream.write row
      end
    end
  ensure
    response.stream.close
  end
end

To tell Rails you want to stream responses from this controller, include ActionController::Live at the top. Basically, this tells Rails you want to use chunked encoding for your HTTP responses. And in your action, your response object now has a special stream property which is an IO-like object representing the outward-facing HTTP response. Anything you write to the stream is sent immediately to the user agent.

That’s why it’s important to set any headers you need to set before writing any of the response body. In this case, I’m using the Content-Disposition header so browsers know to treat the response as a file download.

I am using a bit of a hack to grab hold of the raw Postgres connection underlying the ActiveRecord connection because I don’t know a better way. copy_data is a method the pg gem provides which invokes an SQL COPY command. It would typically copy query results to a file, but since I’ve specified “TO STDOUT”, I can read the results right here from the Postgres connection using get_copy_data. As a bonus, I can ask for the results in CSV format and not have to worry about converting them myself. Now that Postgres is generating CSV for me, all my action needs to do is read the lines from the Postgres socket and write them to the HTTP response stream.

The results shocked me. Queries for even large amounts of data came back with no perceptible delay. The download starts so quickly it’s not even worth measuring, and the data transfer bottleneck is certainly my own middle-tier cable Internet connection.


My battle with Errbit performance is over for the time being. This concludes an effort I began in June to improve throughput on error inserts and error searching as the database grows over time. If you’re interested in the effort leading up to this point, see the earlier posts in this series.

The short version of the story is that I tried all kinds of ideas but failed to notice the actual improvements because of an issue with the special-purpose test rig I had created specifically to measure them. Once I found and fixed the test rig issue, it was clear that my efforts had paid off. Unfortunately, it’s not clear which changes had the biggest impact, which is a shame because I had hoped this final post would be an evidence-based exploration into where the performance issues lived.

Instead, I only have evidence that the sum total of my effort led to a real and measurable performance improvement, and I can only speculate as to where the largest gains were had. But first, let’s look at the overall impact in the two areas where we have data, starting with error insertion:

My best guess as to why we’re seeing this improvement is twofold. First, the number of mongo queries required to insert an error is down to a minimum of five rather than a minimum of nine. Second, inserting an error no longer requires instantiating a Mongoid document for every line in the backtrace; in fact, the model representing a backtrace line no longer exists at all. There could be other explanations, but I’m satisfied with the results as they are. Although I’d like to know where the improvements came from, I’m not inclined to spend the time to figure it out at this point.
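
To illustrate the second point, a backtrace document now carries its lines as a plain array, roughly like this (an abbreviated, illustrative shape rather than the exact schema):

// Illustrative only: backtrace lines live in one array inside the backtrace
// document, so no per-line model object ever needs to be instantiated
  fingerprint: "a6e12c6962939b955c0bf325c0cff0936ae0e98f",
  lines: [
    { number: "425", file: "[GEM_ROOT]/gems/...", method: "..." },
    { number: "76", file: "[PROJECT_ROOT]/app/...", method: "..." }
  ]
})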

Next, we’ll look at error searching:

This is exactly the kind of result I was looking for. What began as seemingly linear performance degradation now looks a lot more like constant time. It isn’t actually that good, but the performance degradation between zero and 100k records is barely perceptible.

I’m convinced the meat of this improvement came from switching to mongo’s built-in full-text search mechanism. It makes sense that mongo’s own full-text search implementation would be much more performant than a string search spread across several separately indexed fields.
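
In mongo shell terms, the gist of the change looks something like the sketch below (field names are illustrative rather than Errbit’s exact schema): one text index and a single $text query replace several regex matches across separately indexed string fields.

// Illustrative sketch: build one text index over the searchable fields...
db.problems.createIndex({ message: "text", error_class: "text", app_name: "text" })

// ...and let mongo's full-text engine handle the search in a single query
db.problems.find({ $text: { $search: "undefined method" } })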

Hopefully our users will notice and appreciate these performance gains. Keep in mind, the results shown above ran with a single thread and a single process. Depending on your hardware, you should get better throughput in a real deployment by running multiple processes and hopefully multiple threads in the future.


Last week, I continued my exploration into Errbit slowness. I eliminated a few ideas and added a few more conjectures to the list of optimization targets. While testing some ideas, I began to suspect that my test rig was not working as designed. There was a flaw in the way I was using Typhoeus that was causing the test process to run slower than it should and, even worse, report inaccurate statistics.

So I reworked the test rig to manage its own threads and take its own request time measurements. The test process now runs much faster and on v0.4.0 reports response times that are a little less than half of what I had logged previously.

Testing against my latest mongoid5 branch, where I’ve been accumulating performance optimization ideas, reveals that performance has actually improved significantly. Unfortunately, because my test rig was inaccurate while I was making those changes, I can’t easily show how much of the improvement each change is responsible for.

My focus up until this point has been mainly on creating error reports, and these numbers reflect a pretty significant improvement. It’s nice to finally see some real improvement after all this research.


Round One

One of the first things I did when tracking down performance issues was motivated by a lot of experience with Rails apps. A very common issue I’ve seen with Rails applications comes down to simple misuse of active_record. Errbit doesn’t use active_record, but it does use Mongoid, which presents something rather like an active_record interface for mongo-powered applications.

After spending some time with Errbit and some cursory analysis, I suspected the performance problem was in the data layer (making unnecessary and expensive queries). So I set out to record all mongo queries made while inserting notices.

Here’s the list of queries I collected from one notice insertion:

namespace=errbit_development.apps selector={"api_key"=>"acae731a1bd8cd2b0c0946f2a952027b"} flags=[:slave_ok] limit=0 skip=0 project=nil | runtime: 31.7640ms
namespace=errbit_development.backtraces selector={"fingerprint"=>"a6e12c6962939b955c0bf325c0cff0936ae0e98f"} flags=[:slave_ok] limit=0 skip=0 project=nil | runtime: 2.1269ms
namespace=errbit_development.backtraces selector={"_id"=><BSON::ObjectId:0x70012658875760 data=55748bb163726f67e4000004>} flags=[:slave_ok] limit=0 skip=0 project=nil | runtime: 1.8985ms
namespace=errbit_development.errs selector={"fingerprint"=>"ece6d74c1056b16c50ed3ba2f03dfc0bf6e5e7d2"} flags=[:slave_ok] limit=0 skip=0 project=nil | runtime: 0.4275ms
namespace=errbit_development.$cmd selector={:insert=>"notices", :documents=>[{"_id"=><BSON::ObjectId:0x70012660942640 data=5574c1a963726f7f89000000>, "message"=>"Whatever the mind of man can conceive and believe, it can achieve", "error_class"=>"ArgumentError", "backtrace_id"=><BSON::ObjectId:... flags=[] limit=-1 skip=0 project=nil | runtime: 0.6309ms
namespace=errbit_development.problems selector={"_id"=><BSON::ObjectId:0x70012665027820 data=55748bb563726f67e400008c>} flags=[:slave_ok] limit=0 skip=0 project=nil | runtime: 0.3846ms
namespace=errbit_development.$cmd selector={:update=>"problems", :updates=>[{:q=>{"_id"=><BSON::ObjectId:0x70012665383440 data=55748bb563726f67e400008c>}, :u=>{"$inc"=>{"notices_count"=>1}}, :multi=>false, :upsert=>false}], :writeConcern=>{:w=>1}, :ordered=>true} flags=[] limit=-1 skip=0 project=nil | runtime: 0.4282ms
namespace=errbit_development.$cmd selector={:update=>"problems", :updates=>[{:q=>{"_id"=><BSON::ObjectId:0x70012665383440 data=55748bb563726f67e400008c>}, :u=>{"$set"=>{"messages"=>{"cd4c2e0346474d0b2eb7f1a630309239"=>{"value"=>"Whatever the mind of man can conceive and believe, it can achieve... flags=[] limit=-1 skip=0 project=nil | runtime: 0.5181ms
namespace=errbit_development.apps selector={"_id"=>"55748b8463726f67e4000001"} flags=[:slave_ok] limit=0 skip=0 project=nil | runtime: 0.3517ms

While working on these queries, I noticed that Durran Jordan had recently announced a new chapter of mongoid development. Upgrading to the latest version seemed like a nice, easy thing to try. So I did that hoping to see a cheap improvement in Errbit. If anything, the performance measurements were slightly worse than with Mongoid 4, but not enough to warrant rolling back. I decided to stick with Mongoid 5 while continuing the investigation.

I noticed by looking at this list that some of these queries are in fact unnecessary. By making fewer queries, it should be possible to improve performance because each query blocks the running thread from doing any other work until the response comes in from the socket.

After a bunch of work, I was able to reduce the query count from nine to five:

namespace=errbit_development.apps selector={"api_key"=>"3777b363b3c03bc4e146f1e2523bf70b"} flags=[:slave_ok] limit=0 skip=0 project=nil | runtime: 0.4294ms
namespace=errbit_development.$cmd selector={:findandmodify=>"backtraces", :query=>{"fingerprint"=>"b8ff41db0f445d80a3a9c3ec91a2ecd9a06b1b56"}, :update=>{"$setOnInsert"=>{:fingerprint=>"b8ff41db0f445d80a3a9c3ec91a2ecd9a06b1b56", :lines=>[{"number"=>"425", "file"=>"[GEM_ROOT]/gems/activesupport-... flags=[:slave_ok] limit=-1 skip=0 project=nil | runtime: 2.0866ms
namespace=errbit_development.errs selector={"fingerprint"=>"155745cfe8d1e28f7485ce3236da5438f11d49df"} flags=[:slave_ok] limit=0 skip=0 project=nil | runtime: 0.4470ms
namespace=errbit_development.$cmd selector={:insert=>"notices", :documents=>[{"_id"=><BSON::ObjectId:0x69947441683300 data=557e0c9b63726f35570004c7>, "message"=>"Life is about making an impact, not making an income", "error_class"=>"ArgumentError", "request"=>{"url"=>" flags=[] limit=-1 skip=0 project=nil | runtime: 0.3948ms
namespace=errbit_development.$cmd selector={:findandmodify=>"problems", :query=>{"_id"=><BSON::ObjectId:0x69947441715500 data=557dfdd763726f1d6d000031>}, :update=>{"$set"=>{"app_name"=>"test", "environment"=>"development", "error_class"=>"ArgumentError", "last_notice_at"=>Sun, 14 Jun 2015 23:2... flags=[:slave_ok] limit=-1 skip=0 project=nil | runtime: 0.6351ms

By making use of findandmodify, I was able to combine the operations to find and modify backtraces and problems from five queries into just two. And I was able to shave one more query from the end by caching the result of querying the app record.
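
For the curious, the combined backtrace lookup-or-insert amounts to something like this in mongo shell terms (a sketch abbreviated from the query log above; the lines array is elided):

// One round trip: find the backtrace by fingerprint, or insert it if it's new
db.backtraces.findAndModify({
  query:  { fingerprint: "b8ff41db0f445d80a3a9c3ec91a2ecd9a06b1b56" },
  update: { $setOnInsert: { fingerprint: "b8ff41db0f445d80a3a9c3ec91a2ecd9a06b1b56", lines: [ /* elided */ ] } },
  upsert: true,
  new: true
})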

Expecting this to make a big difference in performance, I took some new measurements. But in my next set of tests, I found no measurable impact.

Round Two

Not to be deterred, I kept playing with Errbit until I noticed something interesting that I hadn’t before. While running my tests, I saw that the request log seemed to flow in bursts, even though my server is running a single process and a single thread. This burst-pause-burst behavior made me suspect Ruby GC was somehow involved, so I instrumented Errbit with gc_tracer to see if I could confirm.

Here’s a look at the server log during this test where I print GC!!!!!! on every GC event:

$ bundle exec unicorn -c config/unicorn.default.rb > /dev/null
I, [2015-06-14T17:36:21.656538 #27248]  INFO -- : Refreshing Gem list
I, [2015-06-14T17:36:24.054633 #27248]  INFO -- : listening on addr= fd=10
I, [2015-06-14T17:36:24.057564 #27248]  INFO -- : master process ready
I, [2015-06-14T17:36:24.058008 #27323]  INFO -- : worker=0 spawned pid=27323
I, [2015-06-14T17:36:24.060624 #27323]  INFO -- : worker=0 ready - - [14/Jun/2015 17:36:29] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 1.0742 - - [14/Jun/2015 17:36:29] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0239
GC!!!!!! - - [14/Jun/2015 17:36:29] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0910 - - [14/Jun/2015 17:36:29] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0273 - - [14/Jun/2015 17:36:29] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0400 - - [14/Jun/2015 17:36:29] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0267 - - [14/Jun/2015 17:36:29] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0253 - - [14/Jun/2015 17:36:29] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0266 - - [14/Jun/2015 17:36:30] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.7641 - - [14/Jun/2015 17:36:30] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0310 - - [14/Jun/2015 17:36:30] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0827 - - [14/Jun/2015 17:36:30] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0264
GC!!!!!! - - [14/Jun/2015 17:36:30] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0952 - - [14/Jun/2015 17:36:30] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0313 - - [14/Jun/2015 17:36:30] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0236 - - [14/Jun/2015 17:36:30] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0262 - - [14/Jun/2015 17:36:31] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.7083 - - [14/Jun/2015 17:36:31] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0324 - - [14/Jun/2015 17:36:31] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.1000 - - [14/Jun/2015 17:36:31] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0381 - - [14/Jun/2015 17:36:31] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0689 - - [14/Jun/2015 17:36:31] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0357
GC!!!!!! - - [14/Jun/2015 17:36:31] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0375 - - [14/Jun/2015 17:36:31] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.0266 - - [14/Jun/2015 17:36:32] "POST /notifier_api/v2/notices HTTP/1.1" 200 - 0.6870

The amount of time spent performing the actual GC is not negligible, but also not immense. We’re seeing one GC event for every 10 requests. We can tune Ruby’s GC parameters to change this behavior, but it will come at the expense of memory. And increasing the time between GC events will likely increase the time used to perform GC since in theory it should take more time to sweep more memory space. Running into this many garbage collection events on a regular basis indicates the process is steadily allocating chunks of memory.

It’s worth pointing out that what we’re seeing in these logs is not inherently bad. It’s merely indicative of a process that steadily allocates more garbage collectible memory. Whether or not this is too much depends entirely on the actual work that the process is doing. In this case, the process is basically recording one error notification per request and I’m inclined to think there’s a large amount of unnecessary memory allocation which contributes in a sizable way to Errbit’s performance issues. Since allocating and garbage collecting memory takes some amount of time, figuring out where all that memory is going will be my next priority.
