The rate at which my server can accept() new incoming TCP connections is really bad under Xen. The same test on bare metal hardware shows 3-5x speed ups.
- How come this is so bad under Xen?
- Can you tweak Xen to improve performance for new TCP connections?
- Are there other virtualization platforms better suited for this kind of use-case?
Background
Lately I've been researching some performance bottlenecks of an in-house developed Java server running under Xen. The server speaks HTTP and answers simple TCP connect/request/response/disconnect calls.
But even while sending boatloads of traffic to the server, it cannot accept more than ~7000 TCP connections per second (on an 8-core EC2 instance, c1.xlarge running Xen). During the test, the server also exhibit a strange behavior where one core (not necessarily cpu 0) gets very loaded >80%, while the other cores stay almost idle. This leads me to think the problem is related to the kernel/underlying virtualization.
When testing the same scenario on a bare metal, non-virtualized platform I get test results showing TCP accept() rates beyond 35 000/second. This on a Core i5 4 core machine running Ubuntu with all cores almost fully saturated. To me that kind of figure seems about right.
On the Xen instance again, I've tried enable/tweak almost every settings there is in sysctl.conf. Including enabling Receive Packet Steering and Receive Flow Steering and pinning threads/processes to CPUs but with no apparent gains.
I know degraded performance is to be expected when running virtualized. But to this degree? A slower, bare metal server outperforming virt. 8-core by a factor of 5?
- Is this really expected behavior of Xen?
- Can you tweak Xen to improve performance for new TCP connections?
- Are there other virtualization platforms better suited for this kind of use-case?
Reproducing this behavior
When further investigating this and pinpointing the problem I found out that the netperf performance testing tool could simulate the similar scenario I am experiencing. Using netperf's TCP_CRR test I have collected various reports from different servers (both virtualized and non-virt.). If you'd like to contribute with some findings or look up my current reports, please see https://gist.github.com/985475
How do I know this problem is not due to poorly written software?
- The server has been tested on bare metal hardware and it almost saturates all cores available to it.
- When using keep-alive TCP connections, the problem goes away.
Why is this important?
At ESN (my employer) I am the project lead of Beaconpush, a Comet/Web Socket server written in Java. Even though it's very performant and can saturate almost any bandwidth given to it under optimal conditions, it's still limited to how fast new TCP connections can be made. That is, if you have a big user churn where users come and go very often, many TCP connections will have to be set up/teared down. We try to mitigate this keeping connections alive as long as possible. But in the end, the accept() performance is what keeps our cores from spinning and we don't like that.
Update 1
Someone posted this question to Hacker News, there's some questions/answers there as well. But I'll try keeping this question up-to-date with information I find as I go along.
Hardware/platforms I've tested this on:
- EC2 with instance types c1.xlarge (8 cores, 7 GB RAM) and cc1.4xlarge (2x Intel Xeon X5570, 23 GB RAM). AMIs used was ami-08f40561 and ami-1cad5275 respectively. Someone also pointed out that the "Security Groups" (i.e EC2s firewall) might affect as well. But for this test scenario, I've tried only on localhost to eliminate external factors such as this. Another rumour I've heard is that EC2 instances can't push more than 100k PPS.
- Two private virtualized server running Xen. One had zero load prior to the test but didn't make a difference.
- Private dedicated, Xen-server at Rackspace. About the same results there.
I'm in the process of re-running these tests and filling out the reports at https://gist.github.com/985475 If you'd like to help, contribute your numbers. It's easy!
(The action plan has been moved to a separate, consolidated answer)
 Notice the TCP_CRR bars, compare "2.6.18-239.9.1.el5" vs "2.6.39 (with Xen 4.1.0)".
 Notice the TCP_CRR bars, compare "2.6.18-239.9.1.el5" vs "2.6.39 (with Xen 4.1.0)".