Running a scalable and reliable Symfony2 application 22nd Nov 2013 Symfony Sweden November Camp
@vjom = Ville Mattila
github.com/vmattila/ | villescorner.com | fi.linkedin.com/in/villemattila
• CTO & President of the board at Eventio Oy
  – We provide IT tools and services for event organizers
  – Office(s) in Kaarina & Turku (Åbo), Finland
• Our flagship product eventio.com runs on Symfony 2.3
Being reliable means that you accept and prepare for the fact that ANYTHING CAN FAIL, ANY TIME (scalability comes free on top)
Our Infrastructure
• Region: eu-west-1
• Multiple AZs
• Auto-scaling
Components:
• Load Balancer (ELB)
• PHP/HTTP Servers (EC2)
• PHP/CLI Background Workers (EC2)
• Database (RDS Multi-AZ)
• CDN (CloudFront)
• S3 Storage for "in-app" files
• S3 Storage for assets
Session Handling: Store Sessions in Database
• Replace NativeFileSessionHandler with an existing database handler or create your own
config.yml:
    services:
        my.distributed.session_handler:
            class: MyDistributedSessionHandler
            arguments: [ . . . ]
    framework:
        session:
            handler_id: my.distributed.session_handler
Symfony2 offers handlers for PDO, MongoDB and Memcached by default!
Session Handling: Beware of Race Conditions
• Default SessionHandler implementations (or PHP itself) do not implement any kind of locking
• Only the session snapshot from the request that finishes last is stored (other changes vanish)
Three requests running in parallel over TIME:
    Request #1:
        $session->set("email", "ville@eventio.fi");
        // Some time consuming operation that takes seconds... like email validation.
        exit;
    Request #2:
        $session->set("twitter", "vjom");
        // And other operations
        exit;
    Request #3:
        $session->set("name", "Ville");
        // And other operations
        exit;
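
The lost update can be reproduced without a real session backend. The following is a minimal sketch, assuming a hypothetical in-memory SnapshotStore class (not part of Symfony2) that behaves like a handler without locking: every request reads the whole session when it starts and writes the whole session back when it ends.

```php
<?php
// Hypothetical stand-in for a session backend without locking:
// it stores and returns the whole session as one snapshot.
class SnapshotStore
{
    private $data = array();

    public function read()
    {
        return $this->data;
    }

    public function write(array $snapshot)
    {
        $this->data = $snapshot;
    }
}

$store = new SnapshotStore();

// All three requests start at about the same time and read the same empty snapshot.
$request1 = $store->read();
$request2 = $store->read();
$request3 = $store->read();

$request1['email'] = 'ville@eventio.fi';
$request2['twitter'] = 'vjom';
$request3['name'] = 'Ville';

// Requests #2 and #3 finish quickly, request #1 finishes last.
$store->write($request2);
$store->write($request3);
$store->write($request1);

// Only the snapshot of the request that finished last is left.
$final = $store->read();
var_dump($final);
```

The "twitter" and "name" values are gone because request #1 wrote back a snapshot that never contained them.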
Session Handling in Eventio.com:
• Symfony2 ("browser") sessions in Redis Cluster, without locking
  – The "business session" identifier
  – Flash Messages
  – Other temporary and/or cached data
  – PredisSessionHandler is in our GitHub
• "Business sessions" are stored in MongoDB
  – Common identifier for the user's or customer's session throughout the system
  – Business sessions are never purged for accounting and auditing reasons
  – Updates to session parameters are done atomically (by Doctrine ODM)
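
Why per-parameter updates avoid lost writes can be shown with a small sketch. It assumes a hypothetical in-memory FieldStore class whose setField() touches one field only, comparable in spirit to a field-level update in a document database; it is not the Doctrine ODM API.

```php
<?php
// Hypothetical in-memory stand-in for a store that supports field-level updates.
class FieldStore
{
    private $fields = array();

    public function setField($key, $value)
    {
        // Only the named field is written; all other fields keep their current values.
        $this->fields[$key] = $value;
    }

    public function all()
    {
        return $this->fields;
    }
}

$store = new FieldStore();

// Three parallel requests each update their own parameter, in any order.
$store->setField('twitter', 'vjom');
$store->setField('name', 'Ville');
$store->setField('email', 'ville@eventio.fi');

$result = $store->all();
var_dump($result);
```

Because no request writes back a whole snapshot, the order in which the requests finish does not matter for parameters they do not share.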
Background Processing return new Response(); // ASAP!
Background Processing Flow
On PHP/HTTP Servers:
• POST /get_report → ReportController::generateAction()
  – Controller validates the request.
  – The actual task is passed to the queue (Queue Abstraction Library → Queue Implementation).
  – Response: {"poll": "/is_ready?handle=1234"}
• GET /is_ready?handle=1234 → ReportController::pollAction()
  – Response: {"status": "wait"}
On PHP/CLI Worker Servers:
• Queue Handler (QH), a long running CLI process, asks new jobs from the queue …
• … and forwards each job to another process that does the actual job (Task Processor, a CLI process invoked by the QH).
• Status reports go to a shared store.
EventioBBQ is in our GitHub
Background Processing: Reporting Status
• Pattern 1: Send status reports to a database (or a key-value store)
    // Set status in a background worker
    $predis->set('bg_task_status:' . $handle, 'completed');
    // Get status in the poll handling controller
    $predis->get('bg_task_status:' . $handle);
• Pattern 2: Pass the result to a (temporary) queue and poll the queue for status reports
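
Pattern 1 and the poll flow can be sketched end to end. This is a minimal sketch under stated assumptions: SplQueue and ArrayObject are in-memory stand-ins for the real queue and the shared key-value store, the three functions are hypothetical and only mirror the roles of generateAction(), pollAction() and the task processor, and the "unknown" status is invented for the example.

```php
<?php
// In-memory stand-ins (assumptions): in production the queue and the status
// store are separate services shared by the HTTP servers and the CLI workers.
$queue = new SplQueue();
$statusStore = new ArrayObject();

// HTTP side: accept the request, enqueue the job, answer immediately.
function generateAction(SplQueue $queue, ArrayObject $statusStore, $handle)
{
    $statusStore['bg_task_status:' . $handle] = 'wait';
    $queue->enqueue(array('handle' => $handle, 'task' => 'report'));

    return json_encode(array('poll' => '/is_ready?handle=' . $handle));
}

// HTTP side: report the current status of a job.
function pollAction(ArrayObject $statusStore, $handle)
{
    $key = 'bg_task_status:' . $handle;
    $status = isset($statusStore[$key]) ? $statusStore[$key] : 'unknown';

    return json_encode(array('status' => $status));
}

// Worker side: take one job from the queue, do the work, publish the status.
function processNextJob(SplQueue $queue, ArrayObject $statusStore)
{
    if ($queue->isEmpty()) {
        return false;
    }
    $job = $queue->dequeue();
    // ... the actual long running work would happen here ...
    $statusStore['bg_task_status:' . $job['handle']] = 'completed';

    return true;
}

$first = generateAction($queue, $statusStore, 1234);
$before = pollAction($statusStore, 1234);
processNextJob($queue, $statusStore);
$after = pollAction($statusStore, 1234);
```

The controller never waits for the job: it only writes the initial status and returns the poll URL, which is what keeps the HTTP response fast.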
Remember with CLI Workers
• Service Container's request scope is not available
  – Use service_container and/or abstract away the requested feature.
• Session information is not directly available in worker processes
  – We pass Session ID (and Request ID) with the job and recreate the environment for the background process
  – service_container is a great help here
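
Passing the session and request identifiers with the job could look like the sketch below. The envelope layout, the field names and both functions are hypothetical and chosen only for illustration; the slides do not show the actual job format.

```php
<?php
// HTTP side: wrap the task together with the context the worker will need.
// The field names are illustrative assumptions.
function buildJobPayload($task, array $arguments, $sessionId, $requestId)
{
    return json_encode(array(
        'task' => $task,
        'arguments' => $arguments,
        'context' => array(
            'session_id' => $sessionId,
            'request_id' => $requestId,
        ),
    ));
}

// Worker side: decode the payload and return the context so that the
// environment can be recreated before the task runs.
function restoreContext($payload)
{
    $job = json_decode($payload, true);
    if (!is_array($job) || !isset($job['context'])) {
        throw new InvalidArgumentException('Job payload has no context');
    }

    return $job['context'];
}

$payload = buildJobPayload('report', array('year' => 2013), 'sess-abc', 'req-123');
$context = restoreContext($payload);
```

In the worker, the restored identifiers would then be handed to whatever service loads the session data, instead of relying on the request scope.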
S3 as a Common File Store
• Send files to S3 with a simple HTTP call
• Files are accessible over HTTP by the application (private) or directly by external users (public)
• URL signing enables you to grant temporary access to private files, too
• Object Expiration for a bucket works as a /tmp for the cloud
composer.json:
    "require": { "aws/aws-sdk-php": "2.*" }
Upload (@see http://aws.amazon.com/sdkforphp2/):
    use Aws\S3\Enum\CannedAcl;

    // Store the file contents as a private object
    $objectParams = array(
        'Bucket' => '...',
        'Key' => 'my/files/pngfile.png',
        'Body' => $fileContents,
        'ACL' => CannedAcl::PRIVATE_ACCESS,
    );
    $s3->putObject($objectParams);
Download:
    // Fetch the object and save it to a local file
    $objectParams = array(
        'Bucket' => '...',
        'Key' => 'my/files/pngfile.png',
        'SaveAs' => '/run/shm/pngfile.png',
    );
    $s3->getObject($objectParams);
Reliable Cron
How to ensure that
1) a cron initiated process is run only once (to avoid double actions)?
2) cron processes are not our single point of failure?
Reliable Cron
instance-1, instance-2, instance-3
• eventio:cron is run on every instance, every minute
• Every eventio:cron process tries to acquire a global lock, but only one gets it
    SETNX cron:lock:2013-11-22T11:30:00Z instance-1
From Redis documentation: SETNX
"Set key to hold string value if key does not exist. In that case, it is equal to SET. When key already holds a value, no operation is performed."
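
The lock race can be sketched without a Redis server. The FakeRedis class below is a hypothetical in-memory stand-in that only mimics the SETNX semantics quoted above, and tryAcquireCronLock() is an illustrative helper, not code from CronBundle.

```php
<?php
// Hypothetical in-memory stand-in that mimics Redis SETNX:
// the key is set only if it does not exist yet.
class FakeRedis
{
    private $keys = array();

    public function setnx($key, $value)
    {
        if (array_key_exists($key, $this->keys)) {
            return false; // the key is already held, nothing is written
        }
        $this->keys[$key] = $value;

        return true;
    }

    public function get($key)
    {
        return array_key_exists($key, $this->keys) ? $this->keys[$key] : null;
    }
}

// Illustrative helper: one lock key per cron minute, value is the instance name.
function tryAcquireCronLock(FakeRedis $redis, DateTime $cronTime, $instanceId)
{
    $key = 'cron:lock:' . $cronTime->format('Y-m-d\TH:i:00\Z');

    return $redis->setnx($key, $instanceId);
}

$redis = new FakeRedis();
$tick = new DateTime('2013-11-22 11:30:00', new DateTimeZone('UTC'));

// Every instance tries to take the lock for the same minute.
$winners = array();
foreach (array('instance-1', 'instance-2', 'instance-3') as $instance) {
    if (tryAcquireCronLock($redis, $tick, $instance)) {
        $winners[] = $instance;
    }
}
```

Whichever instance reaches the store first wins, so losing any single instance does not stop the cron. With a real Redis server the lock keys would presumably also need an expiry so that they do not accumulate; that detail is an assumption and not shown on the slide.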
Reliable Cron
CronBundle is in our GitHub
instance-1, instance-2, instance-3
• The instance that receives the lock continues the cron process and triggers the cron.tick event in the application.
• The current timestamp is persisted and used in future processing
    $cronTime = new DateTime(date("Y-m-d H:i:00"));
    $cronEvent = new CronEvent($cronTime);
    $this->get('event_dispatcher')->dispatch('cron.tick', $cronEvent);
• Job classes listen for the triggered cron.tick event: decide if we should do something now. Use $cronTime as the current time.
    class CronJob
    {
        public function run(CronEvent $event)
        {
            $cronTime = $event->getCronTime();
            // Run only when the tick falls on a 5-minute boundary
            if ($cronTime->getTimestamp() % (5 * 60)) {
                return;
            }
            // Do the task but consider current time as $cronTime!
        }
    }
• Consider background workers.
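
The modulo check in the job class can be generalized to other intervals. A minimal sketch, assuming a hypothetical isDue() helper that is not part of CronBundle:

```php
<?php
// Hypothetical helper: true when $cronTime falls on a multiple of the interval,
// counted from the Unix epoch.
function isDue(DateTime $cronTime, $intervalMinutes)
{
    return $cronTime->getTimestamp() % ($intervalMinutes * 60) === 0;
}

$utc = new DateTimeZone('UTC');
$due = isDue(new DateTime('2013-11-22 11:30:00', $utc), 5);
$notDue = isDue(new DateTime('2013-11-22 11:31:00', $utc), 5);
$hourly = isDue(new DateTime('2013-11-22 12:00:00', $utc), 60);

var_dump($due, $notDue, $hourly);
```

Because the check works on the Unix timestamp, intervals longer than an hour align to UTC boundaries rather than to local midnight, which is worth keeping in mind for daily jobs.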
ASSET & APPLICATION DEPLOYMENTS Going Live!
Asset Deployment
<commithash> → Deployment master server
• The master server creates a hash of all asset files and their (original) contents with the help of assetic:list
• If the asset S3 bucket does not have a /<assethash>.txt file, the deployment master does a full assetic:dump and uploads the dumped asset files to the S3 bucket under the /<assethash>/ directory. This ensures that HTTP caches are busted.
• The asset hash is stored in a simple <commithash>.txt file in our asset S3 bucket
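
How one hash can be derived from all asset files is sketched below. The computeAssetHash() helper is hypothetical, and the choice of SHA-1 and of the path/content layout is an assumption; the slides do not say which hash function or format is used.

```php
<?php
// Hypothetical helper: derives a single hash from all asset paths and contents.
// $assets maps a path to the (original) file contents.
function computeAssetHash(array $assets)
{
    ksort($assets); // make the result independent of the listing order
    $context = hash_init('sha1');
    foreach ($assets as $path => $contents) {
        hash_update($context, $path . "\0" . sha1($contents) . "\n");
    }

    return hash_final($context);
}

$a = computeAssetHash(array('css/main.css' => 'body{}', 'js/app.js' => 'alert(1);'));
$b = computeAssetHash(array('js/app.js' => 'alert(1);', 'css/main.css' => 'body{}'));
$c = computeAssetHash(array('js/app.js' => 'alert(2);', 'css/main.css' => 'body{}'));
```

Here $a and $b are equal because only the listing order differs, while $c differs because one file changed. A commit that touches no asset therefore maps to an existing /<assethash>/ directory and needs no new dump or upload.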
Application deployments
<commithash> → Deployment master server
• Instance is unregistered from the Load Balancer
• S3 bucket is queried for the <assethash> matching the current <commithash> …
• … and added to the parameters.yml
    asset_hash: <assethash>
• cache:clear
• Quick local & connectivity tests
• (Instance registration back to ELB)
config.yml:
    framework:
        templating:
            assets_base_urls: ["https://.../%asset_hash%"]
https://eventio.com/ Thanks! https://github.com/eventio/
