Personal technical blog of Aristarkh Zagorodnikov. Topics covered (or at least planned) include C++, C#, MongoDB, NGINX and other development technologies.
2011-12-11
Windows Azure, [not grand] finale
It appears that while Windows Azure has a lot of good points (for example, the idea of PaaS is pretty good, since Web Roles don't look that much different from the fabled Heroku deployment), the built-in limitations (like the 20-core cap per account) and the inability to run different OSes (custom Windows images via VM Role are fine, but we need Linux too) would make our migration much harder. In fact, we only need Windows for the application servers; Linux runs everything else just fine and is much easier to staff for. So, while Azure looked fine at first, it looks like Amazon Web Services will be our choice (yes, it has bad limits of its own, like the 2 Gbps EC2->EBS bandwidth cap, yet it appears to be more flexible).
2011-12-06
7 million hits and transfer billing
A week ago we hit 7 million daily hits on one of our websites, and today we hit that mark again -- another small, yet important milestone for us =)
I'm happy we aren't growing exponentially, though -- even linear growth with limited resources wasn't that easy to handle in terms of performance and (unsurprisingly) staffing.
The biggest headache so far, though, has been media storage: we currently have several terabytes of images stored on our own hardware, with over a hundred TiB served per month (yes, we aren't THAT big yet), and we're looking to move it somewhere like Windows Azure, but the estimated transfer bills are, well, shocking: ten times more than we pay now. I wonder if transfer prices will drop soon. While storage really is getting cheaper (hard disks were getting cheaper, at least until that flood), there is no visible driving force that could push transfer rates down.
2011-11-28
Windows Azure, part 2
When I first checked out Windows Azure, I was glad to find it has root containers. Unfortunately, they are almost unusable for us, since they do not allow subdirectories (see the docs for that). I learned that the semi-hard way, stumbling upon the very helpful StorageClientException: "The requested URI does not represent any resource on the server." So, although root containers technically exist, with this limitation they are of no use.
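For illustration, here is roughly how the limitation shows up with the 2011-era StorageClient library (a minimal sketch, not our code; the blob names and connection string are placeholders, while "$root" is the special root container name): a flat blob name works, but anything with a "/" in it fails.

    using System;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    class RootContainerSketch
    {
        static void Main()
        {
            // Placeholder connection string; replace with a real storage account.
            var account = CloudStorageAccount.Parse("UseDevelopmentStorage=true");
            var client = account.CreateCloudBlobClient();

            // "$root" is the root container; its blobs are addressed as
            // http://<account>.blob.core.windows.net/<blobname>.
            var root = client.GetContainerReference("$root");
            root.CreateIfNotExist();

            // Works: a flat blob name.
            root.GetBlobReference("logo.png").UploadText("fine");

            // Where we tripped up: "subdirectories" (names containing '/')
            // are not allowed in the root container, so this throws instead
            // of creating "images/logo.png".
            try
            {
                root.GetBlobReference("images/logo.png").UploadText("nope");
            }
            catch (StorageClientException e)
            {
                Console.WriteLine(e.Message);
            }
        }
    }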
Also, while the retry policy and timeout handling in the Windows Azure .NET SDK are fine, exception handling is not. Getting StorageClientException and StorageServerException is expected, but the WebException is not expected at all (I thought that one would be wrapped in StorageServerException).
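In practice this means storage calls end up wrapped in something like the following (a sketch of what has to be caught, based on the behaviour described above; the helper name is made up):

    using System;
    using System.Net;
    using Microsoft.WindowsAzure.StorageClient;

    static class BlobSafeCall
    {
        // Hypothetical helper: runs a storage operation and reports which
        // of the three exception types actually surfaced.
        public static void Run(Action storageOperation)
        {
            try
            {
                storageOperation();
            }
            catch (StorageClientException e)
            {
                // Expected: client-side errors (bad request, missing blob, ...).
                Console.WriteLine("Client error: " + e.Message);
            }
            catch (StorageServerException e)
            {
                // Expected: errors on the storage service side.
                Console.WriteLine("Server error: " + e.Message);
            }
            catch (WebException e)
            {
                // Not expected, but it leaks through anyway (connection-level
                // failures), so it has to be caught too.
                Console.WriteLine("Raw transport error: " + e.Message);
            }
        }
    }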
Other than that, though, Windows Azure .NET SDK is pretty straightforward and easy to use.
2011-11-23
Windows Azure first experience
Windows Azure looks to be a fine platform, but the toolset installation could be more streamlined. First, Azure Tools for Visual Studio complained about "Error 0x80070643", which was resolved by installing the Azure SDK, Libraries and Emulator first. Then, the emulator told me that I do not have SQL Server 2008 installed by popping up a helpful message about a "possible security problem, see here", which led me to a page that had nothing to do with the real cause of the problem.
After that, though, everything went smoothly. Since we're mostly interested in the storage side of cloud services, I'll explore Windows Azure storage and will probably write a post or two about it.
2011-11-22
Basic SSD tuning for MongoDB
We explored several options when using MongoDB on an SSD and came to the following conclusions (a rough config sketch follows the list):
- Don't turn off MongoDB journaling unless you're really 100% sure -- the degree of comfort it provides after your server restarts non-gracefully is enough to warrant its usage even within a replica set.
- Use the ext4 file system, mounted with "noatime,data=writeback,nobarrier" (nobarrier didn't give a measurable difference on our workload, but others say it's still a good thing). ext4 is fast when allocating files (see the next point), and it lets you delay file metadata updates (reliability is already covered by MongoDB journals). Correction from the year 2019: just use XFS.
- Enable the MongoDB options smallfiles and noprealloc (unless you're writing an application that is so insert-heavy it will push the SSD to its limits). SSDs still cost a lot of money, and if you're installing 120GB or 160GB ones as we do, you don't want five empty databases to occupy a gigabyte of that precious space (we also run with directoryperdb=true, it's handy for management). With smallfiles=true, noprealloc=true works just fine -- the 512MiB files that MongoDB creates are allocated in about 300-600ms even under load, saving you even more space.
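Pulled together, the setup above boils down to something like this (a sketch; the device name and paths are placeholders, the option names are the ones discussed above):

    # mongod.conf -- options mentioned above (dbpath is a placeholder)
    dbpath = /ssd/mongodb
    journal = true
    smallfiles = true
    noprealloc = true
    directoryperdb = true

    # /etc/fstab line for the data volume (device name is a placeholder)
    /dev/sdb1  /ssd  ext4  noatime,data=writeback,nobarrier  0  0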
Improving robustness for C# MongoDB clients
I wonder if I should publish a set of tools and patches that make it easier to write close-to-zero-downtime-without-users-noticing-that-half-the-servers-are-gone applications. Guess I'll put in a bit of effort to make it better suited for public release, like translating all the documentation comments from Russian to English :)
The basic idea behind the tools is to provide a side-attached layer that gracefully handles failure and retries the operation if the layer decides that it still might succeed. While the idea is simple, it really works for something as crude as pulling the plug on half of the servers, with users noticing only a slight (several seconds at most) delay in their page load times.
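The core of it is nothing more than a retry wrapper along these lines (a simplified sketch, not the actual tools; the exception types, delays and names are just plausible choices):

    using System;
    using System.IO;
    using System.Threading;
    using MongoDB.Driver;

    static class Retrying
    {
        // Hypothetical helper: repeats an operation a few times when it fails
        // in a way that looks transient (a node went away, a socket died).
        public static T Run<T>(Func<T> operation, int attempts = 5)
        {
            for (int attempt = 1; ; attempt++)
            {
                try
                {
                    return operation();
                }
                catch (MongoConnectionException)
                {
                    if (attempt >= attempts) throw;
                }
                catch (IOException)
                {
                    if (attempt >= attempts) throw;
                }
                // Give the replica set a moment to elect a new primary.
                Thread.Sleep(TimeSpan.FromSeconds(attempt));
            }
        }
    }

    // Usage (names are hypothetical): Retrying.Run(() => users.FindOneById(userId));

The interesting part, of course, is the "decides that it still might succeed" bit -- the sketch above simply treats connection-level failures as retryable.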
Replication is never a proper replacement for backups
Today I almost had that special moment that makes you glad you did backups -- I was about to drop a database and noticed that I was on the wrong server less than a second before my finger reached the "Enter" key =) I wonder if MongoDB should have some built-in measures against this, maybe some kind of database/collection setup versioning that prevents actual data loss. Then again, I think that's too much for a general-purpose database; still, having some kind of schema versioning with the ability to do a rollback would be a nice feature for something that is already complex and proven robust enough for critical applications, like PostgreSQL. In any case, the (previously learned) lesson for today is that you should never count on replication alone for protection -- software failures (both client- and server-side) can still destroy your data.
2011-11-18
MongoDB connection affinity
When using MongoDB via the C# driver (this might apply to other drivers as well), if you're queuing modifications (i.e. using SafeMode.False) and expect the first modification to be applied before the second (I was Remove-ing one item and Insert-ing another with the same key), never forget to use the same connection (via MongoDatabase.RequestStart) -- unless you would like unpleasant surprises, like unexplainable intermittent failures that only show up once your application gets enough load. In hindsight it is obvious that write ordering is not preserved by default, but it wasn't obvious enough when I wrote the initial code for our message queue.
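In code, the fix boiled down to something like this (a sketch with made-up collection and variable names; the point is that both queued writes go out over the same connection):

    using MongoDB.Bson;
    using MongoDB.Driver;
    using MongoDB.Driver.Builders;

    class SameConnectionSketch
    {
        static void Swap(MongoDatabase database, BsonDocument newItem)
        {
            var items = database.GetCollection("items"); // hypothetical collection

            // Without RequestStart, each call may use a different pooled
            // connection, so the Insert can be applied before the Remove.
            using (database.RequestStart())
            {
                items.Remove(Query.EQ("_id", newItem["_id"]), SafeMode.False);
                items.Insert(newItem, SafeMode.False);
            }
        }
    }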
2011-11-17
Homegrown replication
Recently, I was working on improving the homegrown file replication system that we use for redundant image storage. Currently it serves tens of millions of files occupying about 4TiB of storage, with about 8-10 gigabytes added per day. Strictly speaking, it's not a "replication system" at all, it's just a system that delays writes to unavailable targets until they become available. It turned out to be very efficient and resilient, even in "we just lost two disks" cases, without any centralized authority.
We considered custom filesystems for both Linux and Windows, but all of them required some kind of central management server (or servers), which we'd rather not have. We also considered storing files in MongoDB GridFS, but a short session with a calculator told us that replacing a node (taking it down, adding another one, syncing before the oplog gets exhausted) would be prohibitive for such volumes and large items; copying a virtual disk image is much simpler and faster than doing it through the database layer. So while the idea of specialized file storage in a database is very appealing, a MongoDB GridFS deployment for terabyte-scale files with an intensive write load requires a considerable amount of preplanning, which defeats the main (well, for me) feature of MongoDB -- simplicity.
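The delayed-write idea itself fits in a few lines (a toy sketch, not our actual code; the real system persists its pending queues, but the names and policy below are made up): every write goes to all targets, and writes to targets that are down are remembered and replayed once the target is reachable again.

    using System;
    using System.Collections.Concurrent;
    using System.IO;

    class DelayedReplicator
    {
        // Hypothetical: one pending-write queue per storage target.
        readonly ConcurrentDictionary<string, ConcurrentQueue<string>> _pending =
            new ConcurrentDictionary<string, ConcurrentQueue<string>>();

        public void Store(string relativePath, byte[] data, string[] targetRoots)
        {
            foreach (var root in targetRoots)
            {
                try
                {
                    var path = Path.Combine(root, relativePath);
                    Directory.CreateDirectory(Path.GetDirectoryName(path));
                    File.WriteAllBytes(path, data);
                }
                catch (IOException)
                {
                    // Target is unavailable: remember the file and replay it later.
                    _pending.GetOrAdd(root, _ => new ConcurrentQueue<string>())
                            .Enqueue(relativePath);
                }
            }
        }

        // Called periodically once a target is reachable again; the caller
        // supplies a way to read the file from a healthy copy.
        public void Replay(string root, Func<string, byte[]> readFromHealthyCopy)
        {
            ConcurrentQueue<string> queue;
            if (!_pending.TryGetValue(root, out queue)) return;

            string relativePath;
            while (queue.TryDequeue(out relativePath))
                File.WriteAllBytes(Path.Combine(root, relativePath),
                                   readFromHealthyCopy(relativePath));
        }
    }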
Horizontal scaling
Today we finished converting a medium-load (~1.2k requests for dynamic content per second) application (a frontend: C# + ASP.NET Web Forms + ASP.NET Web Pages) to run on two separate machines. While moving databases around and adding memcached memory was relatively easy, splitting an application that holds state (lots of internal caches) proved a bit hard, even though we already had a plan for proper user affinity (cookies + IP hashing via haproxy). PHP users, for example, do not have the luxury of large-scale persistent in-process state, so they don't plan for it and use external services like memcached, message queues, etc. Our application was written in ASP.NET and used about 70 internal caches when I started working on it. Well, now it runs perfectly on two machines behind NGINX + haproxy, and where there's two, there's three and more =)
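The affinity part in haproxy looks roughly like this (a sketch; the backend name, addresses and cookie name are made up): a cookie pins returning browsers to a server, and source-IP hashing is the fallback for clients that don't keep cookies.

    backend aspnet_frontends
        balance source                      # fall back to hashing the client IP
        cookie SRV insert indirect nocache  # pin browsers that accept cookies
        server web1 10.0.0.11:8080 cookie web1 check
        server web2 10.0.0.12:8080 cookie web2 check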
2011-11-16
NGINX and keepalives to backend
I wonder when NGINX will get keepalive connections to backends. The patch by Maxim Dounin floated around for about half a year, then made it into a beta, but it's still in beta now, while our media frontend server is busy creating and destroying several thousand connections to backends instead of reusing existing ones -- only about 20 connections would be needed to serve ~3.5k req/s (the current load). Well, maybe G-WAN will be better, although AFAIK it doesn't come with built-in proxy caching.
2011-11-10
MongoDB as an MQ persistence solution
After reviewing several message queues (ZeroMQ and RabbitMQ looked fine, but ZeroMQ has no durability and is in fact more of a protocol than an MQ, and RabbitMQ has that "solutionness" all around it), we decided to roll our own simple MQ on top of MongoDB (multiple publishers/multiple subscribers with implicit configuration). After I completed the implementation, a reference to a nice article about building an MQ on MongoDB popped up on the mongodb-user or mongodb-masters list, so I was reassured that the idea itself was O.K. Too bad I found another article on the same topic only later -- while I had considered capped collections (and rejected them, because write performance isn't that important for our MQ, while persistence is), using tailable cursors simply never crossed my mind, although I knew the feature existed. Many thanks to the author =)
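The consumer side of such a queue is essentially one atomic claim operation (a sketch along the lines of what we built, with made-up collection and field names, using the C# driver's FindAndModify):

    using System;
    using MongoDB.Bson;
    using MongoDB.Driver;
    using MongoDB.Driver.Builders;

    static class SimpleQueue
    {
        // Publish: just insert a document describing the message.
        public static void Publish(MongoCollection<BsonDocument> queue, BsonDocument payload)
        {
            queue.Insert(new BsonDocument
            {
                { "payload", payload },
                { "claimedBy", BsonNull.Value },
                { "createdAt", DateTime.UtcNow }
            });
        }

        // Consume: atomically claim the oldest unclaimed message, so two
        // subscribers can never grab the same one.
        public static BsonDocument TryConsume(MongoCollection<BsonDocument> queue, string subscriberId)
        {
            var result = queue.FindAndModify(
                Query.EQ("claimedBy", BsonNull.Value),
                SortBy.Ascending("createdAt"),
                Update.Set("claimedBy", subscriberId).Set("claimedAt", DateTime.UtcNow));
            return result.ModifiedDocument; // null when the queue is empty
        }
    }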
2011-11-07