
I'm preparing a server setup for a dating website that needs to be able to process around 5'000 - 10'000 req/sec to the main site.

My idea was to do it this way:

Server for static content (css, js, img) :: Varnish cache => nginx webserver

Server for member photos :: [1] Varnish cache => [n] nginx webserver

Server for member videos and streaming :: nginx webserver with Erlyvideo or Wowza (paid members only)

Server for web app :: [1] nginx (as cache if needed) => [n] HipHop webserver or Apache mod_php with logging disabled
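For illustration, here is roughly how I picture the Varnish => nginx pairing for the static tiers (a minimal VCL sketch; the backend address, port and file extensions are placeholders, not our actual config):

    backend static_nginx {
        .host = "127.0.0.1";    # assumed: nginx listening locally on an alternate port
        .port = "8080";
    }

    sub vcl_recv {
        # strip cookies for static assets so Varnish can cache them
        if (req.url ~ "\.(css|js|png|jpg|gif)$") {
            unset req.http.Cookie;
        }
    }

    sub vcl_fetch {
        # cache static assets for a day (example TTL)
        if (req.url ~ "\.(css|js|png|jpg|gif)$") {
            set beresp.ttl = 24h;
        }
    }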

Is this OK, or is there a better way?

We developed the web app with a custom framework and optimized it as much as possible; the result is that the execution time per page doesn't exceed 0.05 sec (no cache) or 0.0009 sec (with APC or memcached) on a 3-year-old development webserver running Apache and MySQL.
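The framework follows the usual cache-first pattern (query APC/memcached before touching MySQL), roughly like this sketch; the class, key and table names here are illustrative only, not our actual code:

    <?php
    // Illustrative cache-first lookup: memcached in front of MySQL.
    // Host, port, key names and TTL are placeholders.
    $cache = new Memcache();
    $cache->connect('127.0.0.1', 11211);               // assumed memcached host/port

    $db = new PDO('mysql:host=127.0.0.1;dbname=site', 'user', 'pass');

    function get_member_profile($cache, $db, $member_id) {
        $key  = 'profile_' . (int)$member_id;
        $data = $cache->get($key);
        if ($data === false) {                          // cache miss: one MySQL query
            $stmt = $db->prepare('SELECT * FROM members WHERE id = ?');
            $stmt->execute(array($member_id));
            $data = $stmt->fetch(PDO::FETCH_ASSOC);
            $cache->set($key, $data, 0, 300);           // cache for 5 minutes
        }
        return $data;
    }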

I'm not sure how many servers we will need for the web app and for the DB to handle this amount of requests.

2 Answers


Put a load balancer in front of it all so you can pull servers in and out and spread the load, preferably for all types of content, both static and dynamic. You can use nginx or Varnish as a reverse proxy for that, and if that's not enough, put a cluster of LVS or HAProxy servers in front to spread the load across multiple reverse proxies.
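For example, a minimal HAProxy configuration for that front layer could look something like this (addresses, ports and the health-check URL are placeholders, not a tuned config):

    global
        maxconn 20000

    defaults
        mode    http
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    frontend www
        bind *:80
        default_backend app_servers

    backend app_servers
        balance roundrobin
        option httpchk GET /ping        # assumed health-check URL on the app servers
        server app1 10.0.0.11:80 check
        server app2 10.0.0.12:80 check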

Make sure your generators of dynamic content are stateless, or at least allow partitioning with sticky sessions, so you can easily increase capacity with linear investment.
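If the app tier cannot be made fully stateless, cookie-based stickiness in the same balancer is one way to partition it (a sketch; the cookie and server names are placeholders):

    backend app_servers_sticky
        balance roundrobin
        cookie SRVID insert indirect nocache
        server app1 10.0.0.11:80 check cookie app1
        server app2 10.0.0.12:80 check cookie app2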

Maybe it makes sense to offload static hosting to a CDN?

  • I have taken a look at HAProxy, but I don't know if such software will be able to handle that amount of requests; another idea is to use a managed switch (something like a Cisco Catalyst WS-C3750G-24TS-E1U) that can do simple load balancing too. I will make sure we can add web app servers if needed. Moving the static content (css, js, img) to a CDN would be a possible solution. The member photos cannot be moved, because we need to provide basic security for photos marked as private, and at around 3 TB they are too heavy to move quickly. Commented Aug 8, 2010 at 9:50
  • HAProxy will do fine with 10K req/s. I don't see how a 3750 would provide suitable L3/L7 load balancing; that's more the terrain of the Cisco ACE. That will add up fast on your budget, though. There is no real need to 'move' the content to a CDN; it can fetch it from your servers the first time around. However, I consider it unlikely that a CDN is required here, rather a good webserver setup (with proper HTTP and TCP tuning). Commented Aug 8, 2010 at 10:41
  • @Nenad - sure - a software LB is the first stage, then you can go for a hardware solution. Commented Aug 8, 2010 at 10:51
  • @Joris - regarding the CDN - it all depends on costs and on focusing on core stuff vs. trying to do everything in-house. Sometimes it just makes more economic sense to offload that to a 3rd party. Luckily you can go back and forth with that part. Commented Aug 8, 2010 at 10:52
  • @Joris - Yes, the Cisco ACE is too heavy for our budget. I will try to do some benchmarks with HAProxy and a webserver behind it to better understand how much load it can handle. When you say TCP tuning, do you mean setting proper values for things like netdev_max_backlog and tcp_max_tw_buckets, disabling TCP acknowledgements, or something else? (A sketch of the settings I mean follows below.) Commented Aug 8, 2010 at 11:16
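For reference, the kind of kernel settings mentioned in the last comment would go in /etc/sysctl.conf, roughly like this (illustrative values only, not tuned recommendations):

    # /etc/sysctl.conf - example values, to be adjusted after benchmarking
    net.core.somaxconn = 4096
    net.core.netdev_max_backlog = 4096
    net.ipv4.tcp_max_syn_backlog = 8192
    net.ipv4.tcp_max_tw_buckets = 400000
    net.ipv4.tcp_fin_timeout = 15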

It sounds like a reasonable setup, although you can probably do without Varnish for the static content.

The biggest gains are probably in what you glossed over: how many video-streaming users there are, how good or bad the framework is, and how loaded the database will be.

Also, a metric that is typically a game-changer is how many logged-in users you expect among those 10K requests.

The only way to tell what hardware you need is to run proper benchmarks against your production-ready software stack. If you can, make the uptake gradual and keep optimizing the bottlenecks.
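As a starting point, even a crude HTTP benchmark tells you a lot; for example with ApacheBench against a staging URL (the URL and numbers are placeholders, and a real test should replay realistic traffic against the full stack):

    ab -k -c 500 -n 100000 http://staging.example.com/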

Btw, if you need help scaling or performance-optimizing: I'm available for projects.

  • After I figure out how HAProxy => webserver performs, I will try to benchmark the current version of the web app. As you mention, we have already taken care of the web app, so we mostly query APC/memcached instead of the DB server and built a custom framework for our needs only (with Zend FW, CakePHP or CodeIgniter I think we would need 3-7 times more servers), and I hope we will be able to handle many requests with a small number of servers. Commented Aug 8, 2010 at 11:21
  • @Nenad HAProxy will probably not bring much gain in synthetic benchmarks; it's mostly useful for balancing multiple servers and for some of the real-life annoyances (TCP setup and spoon-feeding slow clients, freeing up your Apache). A highly tuned webapp is always good news, and it sounds like you've written it with caching in mind - so that's great. Now it's a matter of testing its real performance and finding the bottlenecks. Commented Aug 8, 2010 at 13:04
  • @Nenad btw, when you mention 5-10K req/s, is that 'pageviews' or 'total requests including assets/css/javascript'? Also, some measures for fast front-end sites (see the Yahoo performance team) often have reduced server load (and bandwidth) as a nice side effect. I'm also available for help on that ;-) Commented Aug 8, 2010 at 13:16
  • @Joris - I see I need to build a server setup for testing all possible solutions. Yes, I mean pageviews: requests to a page that contains different assets (css, js, img), so 5'000 req/sec will generate around 50'000 requests to "files" per sec. We will put all JS into around 2-3 compressed files, put all small images into one image and work with a CSS image sprite, and compress the CSS (see the nginx sketch after these comments). If we cannot handle the asset requests ourselves, we will move the assets to a CDN. At the moment I see the problem with the webserver for the PHP stuff; I'm wondering how many concurrent sessions and req/sec nginx or HipHop will handle per server. Commented Aug 8, 2010 at 17:19
  • @Nenad You definitely need to benchmark, and consider a slow-start scenario (switching part of your users to the new platform). The concurrency of the application server is not my main worry (HAProxy/nginx/Varnish will take care of the connections); you can put many of those shared-nothing (or shared-little) application boxes shoulder to shoulder. The concurrency of the backend (both database and caches) may be more of a concern; there may be hidden non-linear elements in there. Commented Aug 8, 2010 at 19:06
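As referenced in the comments above, the nginx side of serving those compressed, far-expiring assets could look roughly like this (paths, port and expiry values are placeholders):

    server {
        listen 8080;                      # assumed: nginx sitting behind Varnish or a CDN
        root /var/www/static;

        gzip on;
        gzip_types text/css application/x-javascript;

        location ~* \.(css|js|png|jpg|gif)$ {
            expires 30d;                  # example far-future expiry
            add_header Cache-Control public;
        }
    }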
