mod_gridfs performance

In my previous post, I announced mod_gridfs. Now it's time for some numbers. The test setup: a 3 KiB file served over a gigabit network on modern hardware, 100 concurrent requests, with a three-machine MongoDB replica set as the backend:
  • NGINX + nginx-gridfs: 1.3krps
  • Apache + mod_gridfs: 6.6krps
  • Apache + mod_gridfs with SlaveOk and one slave: 12.2krps
I did not test with larger files, because that would benchmark OS I/O performance rather than the user-mode code.
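To put the three figures side by side, here is a small Python sketch that turns them into speedup factors relative to the nginx-gridfs result. The numbers are the ones quoted above; the `speedups` helper and the variable names are made up for illustration only.

```python
# Throughput figures quoted above, in thousands of requests per second.
results_krps = {
    "NGINX + nginx-gridfs": 1.3,
    "Apache + mod_gridfs": 6.6,
    "Apache + mod_gridfs with SlaveOk and one slave": 12.2,
}


def speedups(results, baseline):
    """Return each setup's throughput divided by the baseline setup's."""
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


relative = speedups(results_krps, "NGINX + nginx-gridfs")
for name, factor in relative.items():
    print(f"{name}: {factor:.1f}x")
```

By this arithmetic, mod_gridfs comes out at roughly 5.1x the nginx-gridfs throughput, and roughly 9.4x with SlaveOk and one slave.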



As we were planning to move our terabytes of files into MongoDB GridFS, it occurred to us that there was no readily available way to serve these files efficiently over the web, short of resorting to the ASP.NET GridFS IHttpHandler we implemented for local debugging some time ago.

After much hassle developing a GridFS handler for G-WAN (it is certainly fast, but it kept crashing even on Ubuntu 10.04.4 LTS, on both x86 and x64, even after I removed all non-boilerplate code from the module itself), and after measuring the performance of https://github.com/mdirolf/nginx-gridfs (until there is an asynchronous MongoDB driver, any GridFS module for NGINX is doomed), I decided to write an Apache 2.x module (actually tested on 2.4) to serve files from GridFS.

I decided to release it as open source here: https://bitbucket.org/onyxmaster/mod_gridfs/. It runs faster than the NGINX one, even with multiple workers, so we're going to use it as a backend (with NGINX as a caching frontend, of course).

Configuration example:
GridFSConnection rsTest/db1,db2
GridFSDatabase my_database
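
The `GridFSConnection` value above looks like the `setName/host1,host2` form used for replica sets by the legacy MongoDB C++ driver. That reading is an assumption on my part, not something the module's documentation is quoted as saying here. Under that assumption, a minimal Python sketch of how such a value splits into a replica set name and seed hosts (`parse_connection` is a hypothetical helper, not part of mod_gridfs):

```python
def parse_connection(value):
    """Split 'setName/host1,host2' into (set_name, [hosts]).

    Assumed format only; a value without '/' is treated as a plain
    comma-separated host list with no replica set name.
    """
    set_name, sep, hosts = value.partition("/")
    if not sep:
        set_name, hosts = None, value
    return set_name, [h.strip() for h in hosts.split(",") if h.strip()]


print(parse_connection("rsTest/db1,db2"))  # ('rsTest', ['db1', 'db2'])
```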