Sunday, February 17, 2013

Linux guests on Hyper-V: I/O and CPU utilization caveats

This is probably known to experienced Hyper-V admins, but I recently found out that high "system" CPU utilization in a Linux guest (Ubuntu 12.04.2 x64) on Hyper-V Server 2012 can actually be an indicator that I/O capacity is exhausted.

We are running PostgreSQL on one of the VMs, and its per-core 5-minute load average was greater than 5 (7-10, in fact). When I checked monitoring, I was surprised by the CPU breakdown: ~20% softirq (expected, since this VM has about 100Mbps in/300Mbps out and is not the only VM on the host, while the NICs on this machine don't support SR-IOV), 30-40% user (expected, since we don't do sequential scans), ~4% iowait (unexpected, since the working set of the database does not completely fit in RAM) and 30-40% system (completely unexpected and unexplainable).
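For reference, a breakdown like the one above can be derived from two samples of the aggregate "cpu" line in /proc/stat. The following is a minimal sketch, not the monitoring system used here; the function names are made up for illustration, and it assumes the field order documented in proc(5).

```python
# Field order of the aggregate "cpu" line in /proc/stat (see proc(5)).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_line(line):
    """Turn an aggregate 'cpu ...' line into a {field: ticks} dict."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, map(int, parts[1:])))


def cpu_breakdown(before, after):
    """Percentage of elapsed CPU time per state between two samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    # Guest time is already counted inside user/nice, so leave it out
    # of the total to avoid counting it twice.
    keys = [k for k in a if k in b and not k.startswith("guest")]
    deltas = {k: b[k] - a[k] for k in keys}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {k: 100.0 * d / total for k, d in deltas.items()}


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate line."""
    with open(path) as f:
        return f.readline()
```

Calling read_cpu_line() twice a few seconds apart and passing both results to cpu_breakdown() gives the user/system/iowait/softirq split for that interval.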

I tried numerous changes including adjusting PostgreSQL settings, adding/removing memory and cores, turning off NUMA both at the guest and the host level, trying a newer OS kernel... until I decided to stop and think it over.

When the monitoring system told us that CPU load was too high, it also reported that time spent doing disk I/O on one of the partitions was high as well (about 85-90% disk time). I disregarded this warning at first, since iowait was low. But then I monitored PostgreSQL per-table statistics using a simple tool built around a query like 'select relid, relname, heap_blks_read + idx_blks_read + coalesce(toast_blks_read, 0) + coalesce(tidx_blks_read, 0) from pg_statio_user_tables' and shuffled tables around to balance reads between different partitions that are mapped to different disks. After that, the system CPU utilization dropped by half.
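The tool itself is not shown here, but the idea can be sketched roughly as follows: take two snapshots of the query's output some time apart and rank tables by how many blocks were read in between. The names QUERY and read_deltas are hypothetical. The query also wraps heap_blks_read and idx_blks_read in coalesce, on the assumption that idx_blks_read can be NULL for a table without indexes. Fetching the rows from the database is left out to keep the sketch self-contained.

```python
# Same sum as the query above, with every column wrapped in coalesce
# (assumption: idx_blks_read can be NULL for a table without indexes).
QUERY = """
select relid, relname,
       coalesce(heap_blks_read, 0) + coalesce(idx_blks_read, 0)
     + coalesce(toast_blks_read, 0) + coalesce(tidx_blks_read, 0) as blks_read
from pg_statio_user_tables
"""


def read_deltas(before, after):
    """Blocks read per table between two snapshots of QUERY's rows.

    Each snapshot is an iterable of (relid, relname, blks_read) tuples.
    """
    old = {relid: blks for relid, _name, blks in before}
    deltas = []
    for relid, name, blks in after:
        # Tables missing from the first snapshot count from zero.
        delta = blks - old.get(relid, 0)
        # Skip idle tables and counters that were reset in between.
        if delta > 0:
            deltas.append((name, delta))
    # Heaviest readers first: these are the candidates for moving.
    return sorted(deltas, key=lambda item: item[1], reverse=True)
```

The tables at the top of the list are the ones worth spreading across partitions first.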

This is different from running on real hardware, where iowait CPU utilization went much higher when disk capacity was about to be exhausted, while system CPU utilization stayed mostly the same. I believe this has something to do not only with virtualization, but also with the I/O scheduler used (we're using deadline on hardware and noop on VMs).
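To confirm which scheduler a given block device is actually using, the kernel exposes it in /sys/block/<device>/queue/scheduler, with the active one in square brackets (for example "noop [deadline] cfq"). A small sketch for reading it follows; the helper names are made up, and the handling of a file with a single unbracketed entry is an assumption about how some kernels report it.

```python
import re


def active_scheduler(text):
    """Return the scheduler marked active in a queue/scheduler file."""
    match = re.search(r"\[([^\]]+)\]", text)
    if match:
        return match.group(1)
    # Assumption: some kernels list a single choice without brackets.
    names = text.split()
    return names[0] if len(names) == 1 else None


def scheduler_for(device):
    """Read the active scheduler for a block device such as 'sda'."""
    with open(f"/sys/block/{device}/queue/scheduler") as f:
        return active_scheduler(f.read())
```

For example, scheduler_for("sda") on the hardware hosts described above would be expected to return "deadline", and "noop" inside the VMs.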

P.S. Microsoft's latest hypervisor is actually better at running Linux than many Linux-based ones and is very good at running Windows guests (no surprises here) -- we evaluated XenServer and KVM, and both had problems not only with Windows stability and with network and I/O performance, but also with Linux kernels newer than 2.6.x.